Hed
Mar 31, 2004

Fun Shoe
Do you have multiple schemas installed?

You should be able to see the raw SQL the ORM is running for your simple filter. Compare that to what you get when you run the same query yourself in dbshell.


Hed
Mar 31, 2004

Fun Shoe
I thought Stripe was really easy to integrate, but they also suck to deal with if you have to deal with fraud (everyone sucks this way). I would probably get started with them, and when it's worth it to optimize fees, get a merchant account with better terms through another vendor.

Hed
Mar 31, 2004

Fun Shoe
cookiecutter installs a ton of poo poo I don't need for a new project, so do you guys just use startproject and build up?
I like the custom settings files and custom wsgi.py to make it work. Basically looking for that lightweight if it’s a thing.

Hed
Mar 31, 2004

Fun Shoe
Does anyone here run Django serverless (Lambda, Azure Functions) in production? I have a couple of Zappa sites that are basically landing pages but for anything more involved I still host Django on EC2, with an RDS database if the site is sufficiently complex.

With Zappa looking unhealthy I was curious if anyone is looking at anything else like Serverless Framework, or if it's just not a great fit for Django.

For a more involved site with lots of async task queues spinning off work I'm looking for recommendations of what people like between moving to something like ECS, or going full bore with lambdas calling lambdas, etc.

Hed
Mar 31, 2004

Fun Shoe
Sounds like you’ve got it sorted but yeah I was going to reply that _meta is at least the blessed way since they’ve declared it stable.

Hed
Mar 31, 2004

Fun Shoe
Just doing my check-in to see if anyone's got a good guide to dockerizing Django to run on Fargate or k8s ... I've used cookiecutter-django or some random script I found but they are really opinionated with packages and how to run them.

Hed
Mar 31, 2004

Fun Shoe
I have a Django site behind an AWS ALB, and the auth and redirection it provides is great.

However, my old nemesis the "Invalid HTTP_HOST header" error comes up when the ALB does its health checks. For regular requests, of course, the Host header is set in HTTP, but for the health checks it just goes for it, and I end up with "Invalid HTTP_HOST header: '172.31.29.18:5000'. You may need to add '172.31.29.18' to ALLOWED_HOSTS."

I don't see a way to customize the behavior of the ALB health check, is the only way around this to change my Django config for prod to scrape out the internal IP from the AWS HOSTNAME variable (currently HOSTNAME=ip-172-31-29-18.ec2.internal) and stuff it into the ALLOWED_HOSTS ?

I feel like there should be a lot of people with this problem but I'm clearly not encountering it in my searches.
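For what it's worth, the HOSTNAME scraping idea could look roughly like this. It is only a sketch: the helper name and the placeholder domain are made up, and it assumes HOSTNAME always has the ip-a-b-c-d.ec2.internal shape shown above.

```python
import os
import re


def private_ip_from_hostname(hostname):
    """Turn 'ip-172-31-29-18.ec2.internal' into '172.31.29.18', else None."""
    match = re.match(r"^ip-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})(?:\.|$)", hostname)
    return ".".join(match.groups()) if match else None


# settings.py fragment; "www.example.com" is a placeholder for the real site.
ALLOWED_HOSTS = ["www.example.com"]

internal_ip = private_ip_from_hostname(os.environ.get("HOSTNAME", ""))
if internal_ip:
    # Django compares the Host header without its port, so the bare IP is
    # enough to match a health check sent to '172.31.29.18:5000'.
    ALLOWED_HOSTS.append(internal_ip)
```

If HOSTNAME is missing or has some other shape, the helper returns None and ALLOWED_HOSTS is left alone.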

Hed
Mar 31, 2004

Fun Shoe

If anyone cares, I solved this by writing some Django middleware and putting it higher in the settings.MIDDLEWARE stack than the built-in SecurityMiddleware, so that it short-circuits the response before the "Host:" header in the HTTP request gets checked:

Python code:
from django.http import HttpRequest, HttpResponse


class HealthCheckMiddleware:
    def __init__(self, get_response):
        # One-time configuration and initialization.
        self.get_response = get_response

    def __call__(self, request: HttpRequest):
        # Answer the load balancer's health check before any later
        # middleware or view gets a chance to validate the Host header.
        if request.META["PATH_INFO"] == "/healthcheck/":
            # Build a fresh response per request; response objects are mutable.
            return HttpResponse("OK")
        return self.get_response(request)
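
The ordering part, as a sketch of the settings side. The dotted path myproject.middleware is hypothetical, and the rest of the list is just the stock startproject default:

```python
# settings.py fragment. "myproject.middleware" is a hypothetical module path;
# point it at wherever HealthCheckMiddleware actually lives.
MIDDLEWARE = [
    # First in the list, so /healthcheck/ is answered before anything below
    # (CommonMiddleware, for one) calls request.get_host() and validates it.
    "myproject.middleware.HealthCheckMiddleware",
    "django.middleware.security.SecurityMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.csrf.CsrfViewMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    "django.contrib.messages.middleware.MessageMiddleware",
    "django.middleware.clickjacking.XFrameOptionsMiddleware",
]
```

The ALB health check path then needs to be set to /healthcheck/ in the target group so it hits the short-circuit.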

Hed
Mar 31, 2004

Fun Shoe
Are there any good guides (outside the docs) for powering up my queries? I've never been able to "bridge" the Django and SQL worlds well, probably because my SQL skills are so crummy.


One example is if I have

Python code:
from django.db import models


class DataSource(models.Model):
    vendor = models.CharField(max_length=50)
    name = models.CharField(max_length=50)


class Entry(models.Model):
    source = models.ForeignKey(DataSource, on_delete=models.CASCADE)
    reference_date = models.DateField(blank=False)
    # DecimalField requires decimal_places; 4 is assumed here.
    settle = models.DecimalField(max_digits=19, decimal_places=4)

If I wanted to grab all the Entries between two dates but prefer "FINAL" over "INTRADAY" data, I want to do something like

entries = Entry.objects.filter(reference_date__gte=start, reference_date__lte=end)

but with only one row per day, preferring the one that .endswith("FINAL"). In pure SQL (I'm using postgres) I could do some sort of CASE and enumerate my values.

Another would be to choose the "closest" settle to a given number, so I could return the Entry that has the settle for a given date that is closest to what I enter. In postgres raw I could do some ORDER BY ABS() with preconditions to get a reasonably fast query. Turning that into a QuerySet query is a little odd to me.

Hed
Mar 31, 2004

Fun Shoe
I feel like I'm missing something really simple, but I haven't had to do this in Django before.

Here's my models:
Python code:
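from django.db import models
from djmoney.models.fields import MoneyField  # assuming django-money's MoneyField

# Commodity and ContractTypes are defined elsewhere (left out of this post).


class Contract(models.Model):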
    commodity = models.ForeignKey(Commodity, on_delete=models.CASCADE)
    strike_price = MoneyField(max_digits=19, decimal_places=4, default_currency="USD")
    type = models.CharField(max_length=1, choices=ContractTypes.choices)

    class Meta:
        constraints = [
            # only one contract per commodity/type/strike_price
            models.UniqueConstraint(
                name="%(app_label)s_%(class)s_commodity_type_strike_price_unique",
                fields=("commodity", "strike_price", "type", ),
            ),
        ]

class MarketData(models.Model):
    reference_date = models.DateField(
        blank=False, help_text="The entry date for the data."
    )
    contract = models.ForeignKey(
        Contract, on_delete=models.CASCADE
    )
    price = MoneyField(max_digits=19, decimal_places=4, default_currency="USD")


    class Meta:
        # unique_together (reference_date, contract)
        constraints = [
            models.UniqueConstraint(
                name="%(app_label)s_%(class)s_date_and_contract_unique",
                fields=("reference_date", "contract", ),
            ),
        ]
I'm trying to ingest a shitton of tabular data into a normalized database. Each row has prices for each reference_date, for each Contract. I have a Contract model which has a commodity, type and strike_price. The price data will go into a MarketData model, which refers to a Contract and provides the prices for each reference_date.

I am trying to speed up ingest by using bulk_create, but I need to bulk_create() the Contracts first, then I can bulk_create the MarketData and populate the PKs directly.

That's where I'm at: my iterator sends me a block of rows with the same contracts, and I want to create those contracts if they don't exist and then populate the MarketData using the PKs from the bulk_create() step. There's a chance I already have an entry for the Contract in my database, so I've been using ignore_conflicts=True in bulk_create(), which doesn't return PKs. So I can't just use the PKs it returns and move on to the rows.

So now I'm trying to bulk_create(ignore_conflicts=True) to guarantee they exist and then go query to get PKs that match, but is there a bulk_get() or something where I can just send a list of unique (commodity, type, strike_price) tuples and get back PKs such as with .values('pk')? Then I could do a python zip(rows, contract_ids) and bulk_create my MarketData and be off to the races.

Am I thinking about this the wrong way? Moving from the naive method to using bulk_create() is around a 30x speedup so I'd really like to get this to work.

edit: added constraints on models to make clearer

Hed fucked around with this message at 14:28 on Oct 17, 2023

Hed
Mar 31, 2004

Fun Shoe

fisting by many posted:

Er which part is meant to be unique? I don't see any unique constraint so presumably Django is happy to add duplicate data, and ignore_conflicts=True is only suppressing other errors (like trying to reference FKs that don't exist yet?)

I updated my post. I simplified the models for posting and forgot those important parts!


Hed
Mar 31, 2004

Fun Shoe
I figured it out... use filters with in_bulk() to evaluate the queryset in 1 SQL query, then your subsequent .get() shouldn't hit the DB.
