Antigravitas
Dec 8, 2019

The rescue for the farmers:
I'm an old square and design my database schema first before even thinking about the application that may use it.

In the majority of use cases it's the data you are interested in. If you can nail down what your data looks like and how to test it for correctness, you can rewrite or throw away your entire program code and reuse the DB without worry. Or have two separate programs run by different buffoons connect to the same DB without blowing up.

It's like unit testing but for data, if some young kid needs convincing. :v:
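
To make that concrete: a good chunk of the "tests" can live in the schema itself as constraints. A minimal Postgres sketch (table and column names made up for illustration):

code:
CREATE TABLE invoice (
    id            bigserial PRIMARY KEY,
    customer_id   bigint NOT NULL REFERENCES customer (id),  -- no orphaned invoices
    amount_cents  bigint NOT NULL CHECK (amount_cents >= 0), -- no negative invoices
    issued_at     timestamptz NOT NULL DEFAULT now(),
    paid_at       timestamptz,
    CHECK (paid_at IS NULL OR paid_at >= issued_at)          -- can't be paid before it exists
);

Every buffoon's program that connects gets those guarantees for free, no matter what its own code does.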


Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Antigravitas posted:

I'm an old square and design my database schema first before even thinking about the application that may use it.

In the majority of use cases it's the data you are interested in. If you can nail down what your data looks like and how to test it for correctness, you can rewrite or throw away your entire program code and reuse the DB without worry. Or have two separate programs run by different buffoons connect to the same DB without blowing up.

It's like unit testing but for data, if some young kid needs convincing. :v:

Yeah this.

And sorry, I probably misspoke in my sleep-induced stupor. When I was talking about "business logic" in my post earlier, I was referring more to the logic that determines certain properties of objects (e.g. "You are not using this feature, and because you are an international customer it isn't relevant to you, so we will show N/A instead."). I was not talking about the business rules that form decision points in the application (e.g. "You are returning this item, and because it has been 90 days since your purchase you are only going to receive store credit.").

Business logic is really a vague term that encompasses a wide array of stuff depending on who you ask (validation, calculation, process flow, decision making). I agree that a lot of that belongs on the server and front-end, not the database. The problem (in my opinion) starts when an application shows things to the end user that are derived from rows in a database and can only be reproduced by running them through some complex server-side algorithm.
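
For the simple derived properties you can at least push the derivation into the schema so every consumer sees the same answer. A minimal sketch, assuming Postgres 12+ and made-up column names:

code:
ALTER TABLE order_line
    ADD COLUMN line_total numeric
    GENERATED ALWAYS AS (quantity * unit_price) STORED;

The complex algorithmic stuff is exactly what won't fit in a generated column, which is where the problem I'm describing starts.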

Volguus
Mar 3, 2009

Ruggan posted:

Yeah this.

And sorry, I probably misspoke in my sleep-induced stupor. When I was talking about "business logic" in my post earlier, I was referring more to the logic that determines certain properties of objects (e.g. "You are not using this feature, and because you are an international customer it isn't relevant to you, so we will show N/A instead."). I was not talking about the business rules that form decision points in the application (e.g. "You are returning this item, and because it has been 90 days since your purchase you are only going to receive store credit.").

Business logic is really a vague term that encompasses a wide array of stuff depending on who you ask (validation, calculation, process flow, decision making). I agree that a lot of that belongs on the server and front-end, not the database. The problem (in my opinion) starts when an application shows things to the end user that are derived from rows in a database and can only be reproduced by running them through some complex server-side algorithm.

Sometimes the more maintainable option is to write said algorithm in that server-side language. If it's a complex algorithm SQL would more often than not be a nightmare to develop and maintain.

Data Graham
Dec 28, 2009

📈📊🍪😋



Also I feel like a lot of this is at odds with the whole separation-of-concerns concept, isn't it? Build your DB and your API in a way that's agnostic to how it's going to be used or what kind of app will be built around it; if it's a piece of data extraction logic that everybody will need, by all means implement it in the DB, but otherwise it's up to each app developer to fetch the primitive data objects and mash them together however they see fit. Right? It's not as fast, to execute or to develop or to bug-fix, but speed was never the goal of separation-of-concerns in the first place.
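
The "everybody will need it" case is where a plain view earns its keep: one definition in the DB, and every app reads it the same way. A sketch with made-up names:

code:
CREATE VIEW customer_open_orders AS
SELECT c.id, c.name, count(o.id) AS open_orders
FROM customer c
LEFT JOIN orders o ON o.customer_id = c.id AND o.closed_at IS NULL
GROUP BY c.id, c.name;

Anything beyond shared extraction like that stays in the individual apps.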

Spikes32
Jul 25, 2013

Happy trees
So I'm in a new tech role as a sys admin for LabWare (a LIMS product). The two languages I was told would be most helpful to learn are SQL and BASIC. I'll be both helping users with issues and also developing new workflows eventually. I've started a SQL course on Udacity; are there any other recommended online courses?

Just-In-Timeberlake
Aug 18, 2003
Maybe I'll just post some of the classic ASP code and dynamically created SQL queries I inherited when I took on this job a gazillion years ago and drive you all into an alcohol-induced coma.

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE
orms are bad

query builders (sql metaprogramming tools, e.g. jooq, sqlalchemy, knex) are good

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



TheFluff posted:

orms are bad

query builders (sql metaprogramming tools, e.g. jooq, sqlalchemy, knex) are good

It's been a while since I used sqlalchemy, but I sure had it lumped in with ORMs in my head?

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Munkeymon posted:

It's been a while since I used sqlalchemy, but I sure had it lumped in with ORMs in my head?

it's got both: the ORM is an optional layer on top of SQLAlchemy Core, which is the query builder part

ElehemEare
May 20, 2001
I am an omnipotent penguin.

Data Graham posted:

Also I feel like a lot of this is at odds with the whole separation-of-concerns concept, isn't it? Build your DB and your API in a way that's agnostic to how it's going to be used or what kind of app will be built around it; if it's a piece of data extraction logic that everybody will need, by all means implement it in the DB, but otherwise it's up to each app developer to fetch the primitive data objects and mash them together however they see fit. Right? It's not as fast, to execute or to develop or to bug-fix, but speed was never the goal of separation-of-concerns in the first place.

This falls apart when teamA and teamB each build their own mechanism to mash together the data: both intend to produce the same value, but do so differently, with slightly different results.

If you push this down into the database, or abstract that business logic into a single common API, then you obviate this problem. But how many people *actually* obey the Bezos ultimatum about not allowing direct access to other applications' backends?

Antigravitas
Dec 8, 2019

The rescue for the farmers:

ElehemEare posted:

This falls apart when teamA and teamB each build their own mechanism to mash together the data: both intend to produce the same value, but do so differently, with slightly different results.

This happens all the time and I always make sure to point this out because of the agony it causes.

We get a lot of invoices that are off by a cent or two from what we ordered. Typically because we ordered via some web shop front end and the back end ERP calculates correctly while the webshit monkeys do not. And I refuse to make their inability to do basic maths our problem…
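
You can reproduce that whole class of bug in one line; binary floats are almost always the culprit (a Postgres sketch; the float output shown is the 12+ default formatting):

code:
SELECT 0.1::float8 + 0.2::float8 AS webshop_math,  -- 0.30000000000000004
       0.1::numeric + 0.2::numeric AS erp_math;    -- 0.3

Money belongs in numeric or integer cents, never in floats.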

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:

People still use BASIC? I thought that was the domain of hobbyists by this point?

Spikes32
Jul 25, 2013

Happy trees

D34THROW posted:

People still use BASIC? I thought that was the domain of hobbyists by this point?

So the program is written in 'LIMS Basic' and has been around since the 90s. Why they never switched during a version upgrade, I couldn't tell you. Wait, yes I can: it's a privately owned company with one owner who gets some very strong ideas.

Antigravitas
Dec 8, 2019

The rescue for the farmers:
Once you've put a lot of institutional knowledge into code, it becomes impossible to switch away from the language the code is written in without introducing problems: quirks or bugs in the code become impossible to distinguish from intentional quirks, especially if you no longer have people around who are familiar with it.

This is especially true when the code is dealing with some real-life thing, because real life has the nasty habit of being insufficiently pure and not mapping well to code.

That's how you get BASIC in a "modern" application. It happens all the time and I haven't seen an organisation able to prevent it.

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:

Antigravitas posted:

Once you've put a lot of institutional knowledge into code, it becomes impossible to switch away from the language the code is written in without introducing problems: quirks or bugs in the code become impossible to distinguish from intentional quirks, especially if you no longer have people around who are familiar with it.

This is especially true when the code is dealing with some real-life thing, because real life has the nasty habit of being insufficiently pure and not mapping well to code.

That's how you get BASIC in a "modern" application. It happens all the time and I haven't seen an organisation able to prevent it.

It's like when Schwarzenegger laid off 10,000 employees and then California was in deep poo poo, because they had a system written in COBOL and he laid off every last employee familiar with COBOL :downsgun:

Data Graham
Dec 28, 2009

📈📊🍪😋



There’s a Terminator/Battlestar Galactica crossover joke in there somewhere but I’m not equipped to make it

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Antigravitas posted:

That's how you get BASIC in a "modern" application. It happens all the time and I haven't seen an organisation able to prevent it.
Yet every single one of them gets some VP who's read a magazine, and they decide they'll outsource a third-party replacement using modern languages and techniques (Cloud, NoSQL, Node, ...) without talking to the existing SMEs. :thunk:

They also manage to repeatedly and consistently create business contracts that have no penalties for third parties that deliver 90% over budget and 133% over schedule (if they deliver at all).

(This being the SQL thread, I'll note that the previous topic, on ORM issues, is a good example of why some companies shy away from RDBMS solutions to avoid scaling issues. None of them seem to provide any best practices around KV solutions though. I'm curious how they survive given the amount of inefficiency that comes from dropping query optimizers, entity relationships, transactional support, general ACID compliance, and software that's been built over 50+ years to provide a large range of features at scale.

I have a recent project that I did from a KV perspective for practice, and it looks like a scaling/index-migration/rebuilding mess just waiting to happen. Short of throwing more hardware at it, seems like NoSQL solutions have a "flat" optimization curve.)

Antigravitas
Dec 8, 2019

The rescue for the farmers:
Well, I listened to the agonised screams of software developers I pretend to respect, and what I could make out between sobs is that this inevitably leads to the formation of a relational model in application code on top of a non-relational database.

Data Graham
Dec 28, 2009

📈📊🍪😋



This article is always an amusing read

http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved
Obligatory https://www.youtube.com/watch?v=b2F-DItXtZs

I'm firmly in the Postgres camp. Native support for JSONB and efficient time-series storage with Timescale make life so much easier.
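
Roughly what that looks like in practice; a sketch assuming the timescaledb extension is installed, with made-up names:

code:
CREATE TABLE metrics (
    ts    timestamptz NOT NULL,
    name  text NOT NULL,
    value float8,
    attrs jsonb  -- free-form per-sample metadata
);
SELECT create_hypertable('metrics', 'ts');

-- JSONB in action: pull the samples tagged with a given host
SELECT ts, value
FROM metrics
WHERE name = 'cpu' AND attrs ->> 'host' = 'web01'
ORDER BY ts;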

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:
I wish I could get my company to host a SQL server. Our ERP is SQL-based (I think) but it sucks rear end (Dynamics NAV 2013), and it doesn't do half of what we really need it to do. That means we're stuck with a "summary" Excel workbook to track job-related data, in which one has to enter the same data in like 3 places.

I'm working on developing an Access app to manage this info, but I'd much rather use Access as a frontend for SQL.


As a side note, I have a vendor that uses Microsoft Access as their main application. I bet that's just wonderful, especially in a multi-user situation :suicide:

Paper With Lines
Aug 21, 2013

The snozzberries taste like snozzberries!

nem posted:

Obligatory https://www.youtube.com/watch?v=b2F-DItXtZs

I'm firmly in the Postgres camp. Native support for JSONB and efficient time-series storage with Timescale make life so much easier.

How do you hook up time series db stuff to a front end without making an aggregation choice?

I work a lot with SCADA systems (like factory systems) and I am basically one of five or six people that know how to effectively grab data out of the historian. All the business-oriented people want it in PowerBI or Tableau and want to do drill-down poo poo, but it is effectively impossible if I just pass through different queries with different aggregations.

ElehemEare
May 20, 2001
I am an omnipotent penguin.

Paper With Lines posted:

How do you hook up time series db stuff to a front end without making an aggregation choice?
Your UI just needs to account for the fact that a single record is in fact many potential permutations of a record.

Paper With Lines posted:

I work a lot with SCADA systems (like factory systems) and I am basically one of five or six people that know how to effectively grab data out of the historian. All the business-oriented people want it in PowerBI or Tableau and want to do drill-down poo poo, but it is effectively impossible if I just pass through different queries with different aggregations.
I haven’t touched TimescaleDB, but the MSSQL implementation of system-versioned tables just lets you pass “FOR SYSTEM_TIME ALL” against a temporal table, which dumps the full history of mutation of a record.
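
i.e., something like this, assuming a system-versioned table with the usual period columns (made-up names, T-SQL):

code:
SELECT id, price, ValidFrom, ValidTo
FROM dbo.Product
    FOR SYSTEM_TIME ALL
WHERE id = 42
ORDER BY ValidFrom;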

Paper With Lines
Aug 21, 2013

The snozzberries taste like snozzberries!

ElehemEare posted:

Your UI just needs to account for the fact that a single record is in fact many potential permutations of a record.

I haven’t touched TimescaleDB, but the MSSQL implementation of system-versioned tables just lets you pass “FOR SYSTEM_TIME ALL” against a temporal table, which dumps the full history of mutation of a record.

Pt 1: All my internal package stuff that interacts with it tries to sanitize the choices, but maybe I just let users pass through an aggregation number instead of trying to label it. Or add more labels. I had blanked on that. That is a good idea.

Pt 2: Yeah, we use Wonderware, which is an MSSQL implementation I think. This is amazing. I need to look more into this. Thanks for pointing me in this direction.

Paper With Lines fucked around with this message at 01:08 on Feb 26, 2020

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved

Paper With Lines posted:

How do you hook up time series db stuff to a front end without making an aggregation choice?

I work a lot with SCADA systems (like factory systems) and I am basically one of five or six people that know how to effectively grab data out of the historian. All the business-oriented people want it in PowerBI or Tableau and want to do drill-down poo poo, but it is effectively impossible if I just pass through different queries with different aggregations.

Build an API around the query with a few different mutations of whatever you're interested in. For example, I collect metrics which can be value or monotonic. Users can query a set of metric types if necessary, along with an optional range. The query calculates either the lag or the value depending upon the metric type fetched. In the case of bandwidth it's a different API call, but it still uses Timescale. Bonus: if I want to switch Timescale out in the future for something else (InfluxDB maybe?), I only have to change the query driver, not the API signatures.
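
The monotonic-versus-value split is just a window function inside the query driver, roughly like this (a sketch with made-up names):

code:
SELECT ts, name,
       CASE WHEN kind = 'monotonic'  -- counters: report the delta between samples
            THEN value - lag(value) OVER (PARTITION BY name ORDER BY ts)
            ELSE value               -- plain values: report as-is
       END AS reading
FROM metrics
WHERE ts >= now() - interval '1 day'
ORDER BY name, ts;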

redleader
Aug 18, 2005

Engage according to operational parameters

D34THROW posted:

I wish I could get my company to host a SQL server. Our ERP is SQL-based (I think) but it sucks rear end (Dynamics NAV 2013), and it doesn't do half of what we really need it to do.

microsoft dynamics crm? that's backed by a sql server db, almost certainly on-prem. i happen to know this because i used ms dynamics for task tracking and found it vastly easier to query the db (once i'd reverse-engineered the schema and how we used it) directly rather than using the front-end

i wouldn't suggest updating that db directly though, because there's a fair chance doing that could lead to your support contract or w/e being voided

Paper With Lines
Aug 21, 2013

The snozzberries taste like snozzberries!
The loving Navision schema setup is LOL poo poo. How did you reverse-engineer it? We basically have 5 general queries, written like six years ago, that we adapt depending on what we want, because no one ever wants to interact with it on the db side but there is high business demand to get data out.

redleader
Aug 18, 2005

Engage according to operational parameters

Paper With Lines posted:

The loving Navision schema setup is LOL poo poo. How did you reverse-engineer it? We basically have 5 general queries, written like six years ago, that we adapt depending on what we want, because no one ever wants to interact with it on the db side but there is high business demand to get data out.

nothing sophisticated - i opened up a thing in the app that showed some of the data i wanted, then went poking around in tables looking for that data. once i found where it was stored, i started joining on tables that seemed relevant using fk-sounding columns until i'd pieced together what i needed
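
the catalog views take a lot of the pain out of the poking-around step, e.g. (sql server flavour, '%invoice%' being whatever you're hunting for):

code:
-- find every column whose name smells like the data you're after
SELECT t.name AS table_name, c.name AS column_name
FROM sys.columns c
JOIN sys.tables t ON t.object_id = c.object_id
WHERE c.name LIKE '%invoice%'
ORDER BY t.name, c.name;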

Anveo
Mar 23, 2002
I'm trying to make an analytic query that counts events by month. The following query works but takes a while, and I feel like there is a better solution with a pivot or crosstab or something, but I can't seem to get my head around the usage.

code:
SELECT 
    table1.f1,
    table2.f2,
    ...
    CASE WHEN table1.created_at < '2020-02-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2020-01-01 00:00:00' AND '2020-02-01 00:00:00') ELSE NULL END as "2020-01",
    CASE WHEN table1.created_at < '2020-01-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-12-01 00:00:00' AND '2020-01-01 00:00:00') ELSE NULL END as "2019-12",
    CASE WHEN table1.created_at < '2019-12-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-11-01 00:00:00' AND '2019-12-01 00:00:00') ELSE NULL END as "2019-11",
    CASE WHEN table1.created_at < '2019-11-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-10-01 00:00:00' AND '2019-11-01 00:00:00') ELSE NULL END as "2019-10",
    CASE WHEN table1.created_at < '2019-10-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-09-01 00:00:00' AND '2019-10-01 00:00:00') ELSE NULL END as "2019-09",
    CASE WHEN table1.created_at < '2019-09-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-08-01 00:00:00' AND '2019-09-01 00:00:00') ELSE NULL END as "2019-08",
    CASE WHEN table1.created_at < '2019-07-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-06-01 00:00:00' AND '2019-07-01 00:00:00') ELSE NULL END as "2019-06",
    CASE WHEN table1.created_at < '2019-06-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-05-01 00:00:00' AND '2019-06-01 00:00:00') ELSE NULL END as "2019-05",
    CASE WHEN table1.created_at < '2019-08-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-07-01 00:00:00' AND '2019-08-01 00:00:00') ELSE NULL END as "2019-07",
    CASE WHEN table1.created_at < '2019-05-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-04-01 00:00:00' AND '2019-05-01 00:00:00') ELSE NULL END as "2019-04",
    CASE WHEN table1.created_at < '2019-04-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-03-01 00:00:00' AND '2019-04-01 00:00:00') ELSE NULL END as "2019-03",
    CASE WHEN table1.created_at < '2019-03-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-02-01 00:00:00' AND '2019-03-01 00:00:00') ELSE NULL END as "2019-02",
    CASE WHEN table1.created_at < '2019-02-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2019-01-01 00:00:00' AND '2019-02-01 00:00:00') ELSE NULL END as "2019-01"
    ,
    CASE WHEN table1.created_at < '2018-01-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-12-01 00:00:00' AND '2019-01-01 00:00:00') ELSE NULL END as "2018-12",
    CASE WHEN table1.created_at < '2018-12-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-11-01 00:00:00' AND '2018-12-01 00:00:00') ELSE NULL END as "2018-11",
    CASE WHEN table1.created_at < '2018-11-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-10-01 00:00:00' AND '2018-11-01 00:00:00') ELSE NULL END as "2018-10",
    CASE WHEN table1.created_at < '2018-10-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-09-01 00:00:00' AND '2018-10-01 00:00:00') ELSE NULL END as "2018-09",
    CASE WHEN table1.created_at < '2018-09-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-08-01 00:00:00' AND '2018-09-01 00:00:00') ELSE NULL END as "2018-08",
    CASE WHEN table1.created_at < '2018-08-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-07-01 00:00:00' AND '2018-08-01 00:00:00') ELSE NULL END as "2018-07",
    CASE WHEN table1.created_at < '2018-07-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-06-01 00:00:00' AND '2018-07-01 00:00:00') ELSE NULL END as "2018-06",
    CASE WHEN table1.created_at < '2018-06-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-05-01 00:00:00' AND '2018-06-01 00:00:00') ELSE NULL END as "2018-05",
    CASE WHEN table1.created_at < '2018-05-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-04-01 00:00:00' AND '2018-05-01 00:00:00') ELSE NULL END as "2018-04",
    CASE WHEN table1.created_at < '2018-04-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-03-01 00:00:00' AND '2018-04-01 00:00:00') ELSE NULL END as "2018-03",
    CASE WHEN table1.created_at < '2018-03-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-02-01 00:00:00' AND '2018-03-01 00:00:00') ELSE NULL END as "2018-02",
    CASE WHEN table1.created_at < '2018-02-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2018-01-01 00:00:00' AND '2018-02-01 00:00:00') ELSE NULL END as "2018-01"
    ,
    CASE WHEN table1.created_at < '2017-01-01 00:00:00' THEN COUNT(table2.*) FILTER (WHERE table2.sent_at BETWEEN '2017-12-01 00:00:00' AND '2019-01-01 00:00:00') ELSE NULL END as "2017-12",
    ... -- more months
FROM ...

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


Anveo posted:

I'm trying to make an analytic query that counts events by month. The following query works but takes a while, and I feel like there is a better solution with a pivot or crosstab or something, but I can't seem to get my head around the usage.

code:
SELECT 
    table1.f1,
    table2.f2,
    ...
    -- (snip: the two-plus years of monthly CASE WHEN ... COUNT(...) FILTER (...) columns from the post above)
FROM ...

oh my god.


Yeah you want to pivot based on the month and year of table1.created_at

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Nth Doctor posted:

oh my god.


Yeah you want to pivot based on the month and year of table1.created_at

Unless I'm misunderstanding, first he wants to group by month and year and then pivot by year.

E: I misunderstood.

Jethro fucked around with this message at 20:53 on Feb 27, 2020

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


That is a loving sexy query

Just-In-Timeberlake
Aug 18, 2003

Ruggan posted:

That is a loving sexy query

that query fucks

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
:stare: I cannot rightly apprehend what confusion of ideas would provoke such a query.


Can you describe in words what you hope to achieve? Because I doubt that does what you/I/we think it does, or is meant to do.

Is there even a WHERE condition, or does every matching row also recalculate the table2 counts for all cases greater than table1.created_at?

I strongly recommend you print all records onto punched cards so you can use https://en.m.wikipedia.org/wiki/Edge-notched_card since that would be more efficient. :histdowns:

ElehemEare
May 20, 2001
I am an omnipotent penguin.

PhantomOfTheCopier posted:

I doubt that does

Is this not just a GROUP BY YEAR(created_at), MONTH(created_at) HAVING sent_at BETWEEN DATEADD(month,-1,created_at) AND created_at?

ElehemEare fucked around with this message at 03:38 on Feb 28, 2020

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE
well, he apparently wants them as columns, which is the pivot part.

but yeah, it's basically a group-by subquery/CTE that's passed into a pivot, unless I'm misunderstanding what he's going for. I don't understand why Jethro thinks that's not what he's going for?

it looks like he's looking up some combination (f1,f2), let's say brand and product name, and then picking off counts in each time interval (let's say sales), and turning them into columns.

Paul MaudDib fucked around with this message at 05:26 on Feb 28, 2020

nielsm
Jun 1, 2009



Yeah that's totally a pivot. Either do the straight group by query and get your four columns (f1, f2, year_month, count) and format the data at the client side, or do the same query and use the database's pivot features on that year_month column. Maybe have a dimension table of date ranges and the year-month label to simplify the join condition.
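
The straight group-by version is only a few lines; a sketch in Postgres flavour, with the joins elided like in the original:

code:
SELECT table1.f1,
       table2.f2,
       to_char(table2.sent_at, 'YYYY-MM') AS year_month,
       count(*) AS events
FROM ...  -- same joins as the original query
GROUP BY table1.f1, table2.f2, year_month
ORDER BY year_month;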

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE
re-implementing our core search in HQL took literally 2 patches to Hibernate to fix the HQL -> SQL mapping and bring it up

tfw you actually legit break the ORM with a minimal reduced test case... twice.

Paul MaudDib fucked around with this message at 13:18 on Feb 28, 2020

Anveo
Mar 23, 2002
Thanks for the responses. It turns out that when I ignored the overly complex query I was given and thought more about the problem, I realized it was a fairly standard pivot query. The following is roughly what I ended up with:

code:
select
    *
from crosstab (
$$
select
    table1.id,
    table2.category,
    -- other cols (anything selected here must be in the group by too)
    format('%s-%s', date_part('year', table2.sent_at), date_part('month', table2.sent_at)) as the_date_bucket,
    count(*)
from table1
    -- joins
group by table1.id, table2.category, the_date_bucket
order by 1, 2
$$
,
$$
select format('%s-%s', date_part('year', month), date_part('month', month))
from generate_series('2017-01-01'::timestamptz, now(), '1 month') month
$$
) AS 
"ct" (
  "id" text,
  "category" text,
  -- other cols

  -- no easy way to make this part dynamic
  "2017-1" bigint,  -- count(*) returns bigint, so the month columns must be bigint
  "2017-2" bigint,
  -- ...etc etc
  "2020-2" bigint
);


Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE
Actually you probably can make the date ranges dynamic too; Postgres has real nice features for that, and I'm sure other databases do too.
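
For example, you can make the database generate the crosstab column-definition list instead of hand-writing it; a sketch matching the bucket labels above:

code:
SELECT string_agg(
         quote_ident(format('%s-%s', date_part('year', m), date_part('month', m)))
         || ' bigint', ', ' ORDER BY m) AS column_defs
FROM generate_series('2017-01-01'::timestamptz, now(), '1 month') AS m;
-- paste the result into the AS "ct" ( ... ) clause, or build and EXECUTE the
-- whole statement from a plpgsql DO block to make it fully hands-off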
