Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003

Mind_Taker posted:

A few design/best practices questions:

1. Do you name your tables using singular or plural nouns? E.g. Customer or Customers.

2. Do you use snake case, camel case, or another case to name columns? E.g. first_name or FirstName?

3. Do you prefix a table's id column name with the table name or just leave it as id? E.g. id or customer_id


I've always done plural, snake case, and id respectively but apparently the new team I have joined has used the alternatives.

If it matters the DB is the backend to an API written in C# with Entity Framework in a DB first workflow. They have said this drives some of their opinions but these opinions are all very weird to me.

I guess I'm wondering if, aside from personal preference, there is any reason to go with one vs. the other in the above items.

DELETE CASCADE posted:

in postgres it is common to use snake_case instead of camelCase, because postgres does automatic case folding of all identifiers to lowercase. you declare the column fooBar, the actual column created is foobar. feel free to call it fooBar in all your queries, postgres will just silently convert that to foobar and it works. if you really want to name the column fooBar with the caps, you can use double quotes to disable case folding: "fooBar". great, you've created a camelCase column. but now you must refer to it as "fooBar" each time, with the quotes, because plain fooBar gets case folded to foobar, and that's not the name of the column. and since in postgres we don't want to sprinkle quotes all over the place, we use snake_case. (microsofties seem to have no problem with the [mssql].[nonsense]...)


i guess i can give my views on your other questions too! don't we just love opinions

singular or plural makes absolutely no difference to me, just pick one and be consistent with it, i agree with hammerite here

id vs tablename_id does have an actual consequence if you like the "using" clause. i tend to name the primary key of a table "id", and then any foreign keys in other tables are foreigntablename_id. but this means you have to write, x join customer c on x.customer_id = c.id. if you had consistently used customer_id in both places, you could write, x join customer c using (customer_id). this can be convenient but also confusing (you only get one customer_id column from this join, and it gets moved to the leftmost position) so i stick with my convention, but it does seem like the existence of a "using" clause suggests that using the same column name is preferred by at least some people

Adbot
ADBOT LOVES YOU

duck monster
Dec 15, 2004

If your going into an existing codebase/deploy with an existing style guide, that is *always* going to be the one you want to use. snake_case and pluralized tables might be more logical, but in the scheme of things, its just a preference, and bucking the teams preference isn't a good way to remain employed.

And yeah, I prefer just using id for the index too, because it means theres always a predictable pk, but again, its a preference, and mob rules when it comes to teamwork.

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Mind_Taker posted:

A few design/best practices questions:

1. Do you name your tables using singular or plural nouns? E.g. Customer or Customers.

I agree with everyone else, it's irrelevant as long as it's consistent.

quote:

2. Do you use snake case, camel case, or another case to name columns? E.g. first_name or FirstName?

I think snake case is the safest choice. Regardless of what your DB does, there are a million different tools that will interface with it over time (reporting tools, API generators, ORM/query generators, etc. etc.).

I'm not aware of any language or tool that doesn´t like snake_case as identifiers, while a lot of tools will mangle, change, or remove casing when generating their DB code; likewise for kebab case or *shudders* spaces in names.

quote:

3. Do you prefix a table's id column name with the table name or just leave it as id? E.g. id or customer_id

Prefix. It makes things explicit by default, makes queries more readable, makes it hard to write wrong joins, and if you are writing a query where that doesn't help and you use the column a lot, it's easy to just add "as id" to the select.

One variant I've seen that I find tempting is "id_customer". Yes, it's not grammatically correct English, but it means that when you are typing a query you can just type '.id' and autocomplete will show you what's available in the table. Furthermore, all id columns will stand out in the list, and in long queries they will align vertically. I haven't adopted it yet but if I'd like to give it a try in a side project.

NihilCredo fucked around with this message at 11:25 on Oct 26, 2023

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE
Personally I don't care about plural/singular but every shop I've worked in has used plural. I prefer snake case and it's the most common style in my experience. I'm neutral on id vs tablename_id, used to be a big proponent of id but my current workplace uses tablename_id and that's OK too (and join using(tablename_id) is convenient when you're writing one-off queries by hand).

That said though, it seems OP is in a Microsoft shop and in that case the singular/Pascal case/TablenameID preference makes perfect sense because that's the standard at Microsoft and so that's the pattern used in all the example code in the MSSQL documentation, as well as in most MSSQL books and such.

Roadie
Jun 30, 2013
For me, singular vs plural and all that stuff... just use whatever style is most used in the documentation for a given server type. Unlike in the JavaScript-land I'm most used to you actually have docs for things with examples, so use it.

redleader
Aug 18, 2005

Engage according to operational parameters
i'd definitely name id columns differently if my db supported USING (columns). this database is multitenanted with tables having a clustered index on (TenantId, Id)*, so i'm real sick of writing out JOIN Foos f ON d.TenantId = a.TenantId AND f.Id = a.FooId ad nauseum

* this probably isn't optimal but it tends to work well enough that no one cares

Lib and let die
Aug 26, 2004

I'm far from a seasoned DBA but so far as working in a client onboarding/ETL capacity goes I sure appreciate it when id columns are named thing_id because it makes mapping data from legacy systems more intuitive (ask me about staring into the horrors of a raiser's edge backup file)

Just-In-Timeberlake
Aug 18, 2003
lmao the database I inherited would give you all nightmares

Lib and let die
Aug 26, 2004

Just-In-Timeberlake posted:

lmao the database I inherited would give you all nightmares

[laughs in democratic party fundraising and engagement database]

LightRailTycoon
Mar 24, 2017
Point of sale db in a product with ongoing updates for 30 years, on an SAP db server.

shame on an IGA
Apr 8, 2005

Lib and let die posted:

[laughs in democratic party fundraising and engagement database]

oh poo poo I think I'm one of your users

Lib and let die
Aug 26, 2004

shame on an IGA posted:

oh poo poo I think I'm one of your users

was.

have a backup plan ready.

duck monster
Dec 15, 2004

Y'all fools aint seen nothing till you've worked with a database designed by scientists. Horrors without end.

"I just copied the field names from excel but for some reason it wont let me use spaces"

"I just put the code in a text field and then use pythons exec command to run it"

"Ever Their praises, and abundance to the Black Goat of the Woods. Iä! Shub-Niggurath! Iä! Shub-Niggurath! The Black Goat of the Woods with a Thousand Young"

ElehemEare
May 20, 2001
I am an omnipotent penguin.

duck monster posted:

"I just copied the field names from excel but for some reason it wont let me use spaces"

SELECT [A], [B], [Adjusted Value (Fixed)-FORMULA-DO NOT EDIT] FROM dbo.[Sheet4];

Just-In-Timeberlake
Aug 18, 2003
speaking of Excel I was really excited to hear that they were fixing a decades old “feature” and thought it would be no longer automatically converting large numbers into scientific notation when you paste it in.

it wasn’t that :smith:

Hed
Mar 31, 2004

Fun Shoe
What's the correct way to convert an interval to a number of days in postgres? I'm generating an interval by doing an "age" difference between two dates:

SQL code:
SELECT
       contract.year,
       contract.ratio,
      age(contract.expiration_date, data.reference_date) as days_to_expiration
which yields rows like this:



It works, but I'd really like an integer number of days as the "days_to_expiration". Whenever I try to use the date functions I'm just truncating/extracting the date, so I end up with "0 days, 27 days, ..." in that column, which is not what I want.

nielsm
Jun 1, 2009



https://www.postgresql.org/docs/current/functions-datetime.html

quote:

date - date → integer

Subtract dates, producing the number of days elapsed

date '2001-10-01' - date '2001-09-28' → 3

You can't use the AGE function for this since it calculates a calendar difference, including months which can have varying numbers of days, and you lose that information at the age calculation. Just use the regular minus operator.

nielsm fucked around with this message at 14:45 on Oct 30, 2023

Hed
Mar 31, 2004

Fun Shoe
Thank you! I figured I was making it too complicated. Works great now.

I really appreciate this thread--coming from Django where there was quite a bit of magic going on in the ORM to now knowing when to drop down to Raw SQL has been a game changer. Now I just need to learn it better and not skim the docs so much.

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE
Also, if you have two timestamps instead, subtracting them gives you an interval. Once you have an interval you can use e.g. date_trunc to truncate it to whatever precision you want (days, seconds, whatever) or use extract to get at specific fields. Postgres' date and time stuff is really convenient, you can just do stuff like now() + interval '1 month 20 minutes' if you like.

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003
yes it is excellent, the only thing i don't like about postgres datetime handling is "timestamp with time zone" (it isn't!) but i think that's the sql standard's fault

Hed
Mar 31, 2004

Fun Shoe
Back again with a sorting question for postgres.

I have the following tables:


The options contracts have data for them every day, sometimes from multiple sources.
Here's what I'm trying to do:

  • For a given month and series_id,
  • Consider all the years and reference_dates
  • Return the row that has the smallest nonnegative moneyness for a given reference_date and year, so I can use the implied_volatility for it

What I tried so far:

SQL code:
SELECT
       options_contract.year,
       options_data.iv,
       options_data.moneyness,
       options_data.source_id,
       options_data.reference_date,
       (options_contract.expiration_date - options_data.reference_date) as days_to_expiration
FROM commodityoptionscontract options_contract
    CROSS JOIN LATERAL (
        SELECT DISTINCT ON (od.reference_date, od.moneyness) od.implied_volatility as iv, od.moneyness, od.reference_date, od.source_id
        FROM optionsdataentry od
        WHERE options_contract.id = od.options_contract_id
          AND od.moneyness >= 0
        ORDER BY od.reference_date, od.moneyness -- sort ascending so DISTINCT above takes the unique (and lowest) moneyness
    ) AS options_data
WHERE options_contract.series_id = 4
  and month = 'H';
but I end up getting



where the first two rows I would expect to only see the row with 0.75 moneyness and again between rows 3 and 4 I would expect to see only 1 of those (two different sources have the same IV).

I thought based on reading ORDER BY and SELECT DISTINCT ON() only the first row based on (reference_date, moneyness) would survive that but that's not the case. Maybe that's getting caught in the lateral cross join? In my DB both of those exist (it's not just an outer product thing), I just want the one with smallest possible moneyness returned.

Adding another SELECT DISTINCT() in the outer table just returns the higher moneyness, the 1.05, so I assume it's not doing the thing I want and I'm clearly missing something.

Hed fucked around with this message at 20:54 on Nov 1, 2023

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!
I've never seen DISTINCT ON before, but a little research leads me to believe you are selecting the first row for each combination of reference_date and moneyness with no particular order. If you want to get one row per reference_date then only put reference_date in the DISTINCT ON clause. Then do your ORDER BY as you currently are.

monochromagic
Jun 17, 2023

Hed posted:

The options contracts have data for them every day, sometimes from multiple sources.
Here's what I'm trying to do:

  • For a given month and series_id,
  • Consider all the years and reference_dates
  • Return the row that has the smallest nonnegative moneyness for a given reference_date and year, so I can use the implied_volatility for it


Window functions are almost always the answer for these types of things rather than cross joins. I have a cold rn so I'm not braining good, but something like


SQL code:
SELECT
    series_id,
    month_col,
    year_col,
    reference_date,
    MIN(moneyness) OVER (PARTITION BY year_col, reference_date)
FROM table_name
WHERE moneyness > 0

I would blow Dane Cook
Dec 26, 2008
With halloween just been and gone, what's your spookiest database experience?

Lib and let die
Aug 26, 2004

I would blow Dane Cook posted:

With halloween just been and gone, what's your spookiest database experience?

I mentioned NGP VAN, right?

e: "a client needs the disclosure report updated for this batch of thousands of donations."

"ok, where do those values live?"

"no one knows. make the change in the front-end and see what updates."

"I can't change the front end. The batch is closed."

"Yeah."

Lib and let die fucked around with this message at 14:28 on Nov 6, 2023

redleader
Aug 18, 2005

Engage according to operational parameters

I would blow Dane Cook posted:

With halloween just been and gone, what's your spookiest database experience?

"can you hop on a call? a client is having problems connecting to the api"

they were doing everything in the database - building json via string concatenation and calling our api by calling curl using xp_ cmdshell. the only reason i heard about it was because they couldn't get any error response codes back from curl because xp_cmds hell doesn't give you the exit code from the thing you're calling

(lmao at the something awful WAF blocking posts with xp_c mdshell in)

RandomBlue
Dec 30, 2012

hay guys!


Biscuit Hider

redleader posted:

"can you hop on a call? a client is having problems connecting to the api"

they were doing everything in the database - building json via string concatenation and calling our api by calling curl using xp_ cmdshell. the only reason i heard about it was because they couldn't get any error response codes back from curl because xp_cmds hell doesn't give you the exit code from the thing you're calling

(lmao at the something awful WAF blocking posts with xp_c mdshell in)

only the best code for SA

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
cmds hell sounds like an accurate description

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
xp_cmdshell

Scrapez
Feb 27, 2004

Postgresql. Is there a way to query the total data size queried for a particular user in a given day?

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003
you can do that with https://www.postgresql.org/docs/current/pgstatstatements.html

Scrapez
Feb 27, 2004


That's where I landed but still can't figure out how to do a query that returns the size in bytes of the queries that have been performed. Do you know which parameter would be leveraged for that? The parameters I see all appear to only relate to the time taken for the query to complete and number of rows returned.

nielsm
Jun 1, 2009



Scrapez posted:

That's where I landed but still can't figure out how to do a query that returns the size in bytes of the queries that have been performed. Do you know which parameter would be leveraged for that? The parameters I see all appear to only relate to the time taken for the query to complete and number of rows returned.

My best guess would be a sum of shared_blks_read + local_blks_read + temp_blks_read , multiplied by the block size, whatever that is.

E: Of course that's going to report how much storage the user has touched, not how much data the user has retrieved. Complex queries might touch a lot of indexes and thus a lot of pages, but not return a whole lot of data. For example, a single count(*) of a big join.

nielsm fucked around with this message at 17:59 on Nov 9, 2023

Scrapez
Feb 27, 2004

nielsm posted:

My best guess would be a sum of shared_blks_read + local_blks_read + temp_blks_read , multiplied by the block size, whatever that is.

E: Of course that's going to report how much storage the user has touched, not how much data the user has retrieved. Complex queries might touch a lot of indexes and thus a lot of pages, but not return a whole lot of data. For example, a single count(*) of a big join.

Thank you for the info. These are fairly simple queries and static so it may be possible.

Ihmemies
Oct 6, 2012

gently caress. I got a task to refactor this query to use NATURAL JOIN without subuqeries..

SQL code:
SELECT viini.nimi,
    (
        SELECT viinitila.nimi
        FROM viinitila
        WHERE viini.vttunnus = viinitila.vttunnus
    ) AS tila
FROM viini
ORDER BY viini.nimi;

-- What a man is going to do? This is the only answer I came up with:

ALTER TABLE viinitila
    RENAME COLUMN nimi TO tila_nimi;
SELECT viini.nimi,
    viinitila.tila_nimi AS tila
FROM viini
    NATURAL JOIN viinitila;
ALTER TABLE viinitila
    RENAME COLUMN tila_nimi TO nimi;
Does any other alternative come to mind? Tables look like this. Thanks.

SQL code:
vttunnus INT,
nimi VARCHAR(50) NOT NULL,
maa VARCHAR(50),
PRIMARY KEY (vttunnus),
UNIQUE (nimi));

CREATE TABLE viini(
vtunnus INT,
nimi VARCHAR(50) NOT NULL,
lhinta DECIMAL(7,2),                             --litrahinta
tyyppi VARCHAR(20),
vttunnus INT NOT NULL,
PRIMARY KEY (vtunnus),
UNIQUE (nimi),
FOREIGN KEY (vttunnus) REFERENCES viinitila);
Edit: after reading enough documentation, I realized I must rename the whole table's columns, like this:

SQL code:
SELECT viini.nimi,
    viinitila.tila_nimi AS tila
FROM viini
    NATURAL JOIN viinitila AS viinitila(vttunnus, tila_nimi, maa)
ORDER BY viini.nimi;
;
Maybe there's some stupid answer like that to the other question too.

Ihmemies fucked around with this message at 15:13 on Nov 11, 2023

Ihmemies
Oct 6, 2012

Actually what the gently caress these tasks are..

Course assignment posted:

The task is asking for a non-recursive SQL query that retrieves all pairs of movie IDs and their predecessor movie IDs. For each movie, the query should fetch both its immediate predecessor (if any) and any indirect predecessors. The value in the "level" column indicates the "distance" between the predecessor and the movie in the predecessor-movie hierarchy, i.e., how many levels up the hierarchy the predecessor is found. For example, an immediate predecessor would have a level value of 1.

So how I'm going to write an infinitely deep SQL query without recursion? Like I just can't use a "for" loop or anything :D Hnngh.. poo poo..

Kuule hain nussivan
Nov 27, 2008

Ihmemies posted:

gently caress. I got a task to refactor this query to use NATURAL JOIN without subuqeries..

SQL code:
ALTER TABLE viinitila
    RENAME COLUMN nimi TO tila_nimi;
SELECT viini.nimi,
    viinitila.tila_nimi AS tila
FROM viini
    NATURAL JOIN viinitila;
ALTER TABLE viinitila
    RENAME COLUMN tila_nimi TO nimi;

Why the altering? Couldn't you just do...


SQL code:
SELECT viini.nimi,
    viinitila.tila_nimi AS tila
FROM viini
    JOIN viinitila ON viinitila.vttunnus = viini.vttunnus;
Edit:

No clue what the other task description is about though.

Kuule hain nussivan fucked around with this message at 15:19 on Nov 11, 2023

Ihmemies
Oct 6, 2012

Kuule hain nussivan posted:

Why the altering? Couldn't you just do...


SQL code:
...

The joke was we must use "NATURAL JOIN". So it's problematic if the tables have a "name" column with completely different contents. Renaming all the columns of the 2nd table helped..

Kuule hain nussivan
Nov 27, 2008

Ihmemies posted:

The joke was we must use "NATURAL JOIN". So it's problematic if the tables have a "name" column with completely different contents. Renaming all the columns of the 2nd table helped..

Ah, what a piece of poo poo. Had a look at the oracle documentation for it, and it doesn't mention anything about it taking into account aliases in the SELECT statement, but can you check the query plan for it?

Adbot
ADBOT LOVES YOU

Lib and let die
Aug 26, 2004

Heh, mini-Vinnie

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply