Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE
I don't know your exact circumstances or how important keeping the storage size small is but my instinct is to try to not be overly clever here. I'd do the second one (single table with nullable user_id), and if you don't mind ordering the resultset by the key name you can avoid the subquery like so:

SQL code:
select distinct on (key) *
from codes
where
  user_id = 40
  or user_id is null
order by
  key,
  user_id nulls last; -- nulls last is default for ascending sort but it's nice to be explicit

TheFluff fucked around with this message at 15:11 on Feb 24, 2021

Adbot
ADBOT LOVES YOU

NPR Journalizard
Feb 14, 2008

Im trying to set up a server agent job to run a stored proc and I am rapidly running out of ideas as to why its not working.

If I run the stored proc manually, it works fine. The stored proc calls sp_send_dbmail, and I get the email with the required attachment, which is the results of a query.
If I run the stored proc through the server agent, the tasks complete successfully, but I dont get an email, and there are no entries made in msdb.dbo.sysmail_event_log

Things I have tried :
Given the agent service account db_reader and public on the database the proc runs on, as well as DatabaseMailUserRole in msdb,
Given the agent service execute permissions on the stored proc.
Set up an operator with an email and made sure they are on duty
Set the service account as the owner of the job
Tried using a different account to run the job.
Restarting the server agent after most steps.

Anyone else got any hints on what im screwing up?

galenanorth
May 19, 2016

I'm on the chapter of my Learning PostgreSQL textbook about security. It looks like two-way encryption with a single pair of public and private keys has the same effect as row-level security, wherein a table has a username column and the user has to have a username matching the one in the row to see or do things with the row. I don't see the use for storing hashes aside from the hashes that correspond to RDBMS usernames, though. Can anyone name an example use for having a column for hashes in a database?

galenanorth fucked around with this message at 15:13 on Mar 19, 2021

Empress Brosephine
Mar 31, 2012

by Jeffrey of YOSPOS
I dont have a answer, but what textbook are you using? Do you like it?

galenanorth
May 19, 2016

Learning PostgreSQL 11: A beginner's guide to high-performance PostgreSQL database solutions, 3rd edition, by Salahudin Juba and Andrey Volkov from Packt (2019)

It says it's for beginners, but it's not. It uses a ton of terms that I have to look up on the regular, and sometimes it explains them later in the book, which is bizarre. I'm using it with the PostgreSQL documentation, looking up the pages which match the chapter information as I go along, and Database Systems: Design, Implementation, & Management by Carlos M. Coronel and Steven Morris from Cengage Learning (2018). This is a much better book for beginners. After reading a few chapters of it, I realized I should have read the whole thing before starting on the PostgreSQL textbook. The two chapters on SQL include the basic SQL syntax for Oracle, MySQL, Microsoft SQL Server, and Microsoft Access, though the procedural code examples are for Oracle.

Using the three together works well because the PostgreSQL textbook omits a lot of good information on things like procedural SQL from the documentation, but elsewhere the documentation includes way too much information I'll probably never use. Now that I've got a system set up, I feel like I'm not having any problems.

galenanorth fucked around with this message at 15:27 on Mar 19, 2021

Empress Brosephine
Mar 31, 2012

by Jeffrey of YOSPOS
Sweet thanks for the post. Yeah I know a decent amount of Postgres but nothing advanced so that sounds what I need

NPR Journalizard
Feb 14, 2008

NPR Journalizard posted:

Im trying to set up a server agent job to run a stored proc and I am rapidly running out of ideas as to why its not working.

If I run the stored proc manually, it works fine. The stored proc calls sp_send_dbmail, and I get the email with the required attachment, which is the results of a query.
If I run the stored proc through the server agent, the tasks complete successfully, but I dont get an email, and there are no entries made in msdb.dbo.sysmail_event_log

Things I have tried :
Given the agent service account db_reader and public on the database the proc runs on, as well as DatabaseMailUserRole in msdb,
Given the agent service execute permissions on the stored proc.
Set up an operator with an email and made sure they are on duty
Set the service account as the owner of the job
Tried using a different account to run the job.
Restarting the server agent after most steps.

Anyone else got any hints on what im screwing up?

Finally figured this out. Server agent service didn't have permissions on a completely different msdb.storedproc and it wasn't kicking up an error message so it was a jerk to track down

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

galenanorth posted:

I'm on the chapter of my Learning PostgreSQL textbook about security. It looks like two-way encryption with a single pair of public and private keys has the same effect as row-level security, wherein a table has a username column and the user has to have a username matching the one in the row to see or do things with the row. I don't see the use for storing hashes aside from the hashes that correspond to RDBMS usernames, though. Can anyone name an example use for having a column for hashes in a database?
You seem to be confusing many different things?

Are you asking about about client connections. https://www.postgresql.org/docs/13/ssl-tcp.html

Or pgcrypto type encryption. https://www.postgresql.org/docs/13/encryption-options.html

Or row level permissions. https://www.postgresql.org/docs/13/ddl-rowsecurity.html

Or application password hashes. https://x-team.com/blog/storing-secure-passwords-with-postgresql/

There are plenty of applications where hashes matter, so as a field it certainly makes sense and provides for all the typical uses of a hash.


ps Sounds like the person who wrote that book needs to read the PostgreSQL documentation first. :)

galenanorth
May 19, 2016

pgcrypto type hashes

I hadn't gotten to application programming and I'd assumed that application users would be a subset of database users, so that explains it

Skyarb
Sep 20, 2018

MMMPH MMMPPHH MPPPH GLUCK GLUCK OH SORRY I DIDNT SEE YOU THERE I WAS JUST CHOKING DOWN THIS BATTLEFIELD COCK DID YOU KNOW BATTLEFIELD IS THE BEST VIDEO GAME EVER NOW IF YOULL EXCUSE ME ILL GO BACK TO THIS BATTLECOCK
I've got a weird question. I basically am designing a table whose rows contain durations. But if they are members of the same group (as defined by a foreign key) their durations are related to each other.

E.g. if I have three rows that look kinda like:
code:
| groupA | 2 days |
| groupA | 4 days |
| groupA | 6 days |
So the durations actually impact the following rows start date. So in effect, assuming the above rows are already in order, the calculated start dates of each would be (now, 2 days, and then 6 days (2+4)). I don't want to calculate and store the start dates in the database, I'd rather it be calcualted from the order of the rows.

Would a standard approach just be to add an "order" column which is just an int column that defines the asc order of the rows relative to each other? Are there other approaches to consider?

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE

Skyarb posted:

I've got a weird question. I basically am designing a table whose rows contain durations. But if they are members of the same group (as defined by a foreign key) their durations are related to each other.

E.g. if I have three rows that look kinda like:
code:
| groupA | 2 days |
| groupA | 4 days |
| groupA | 6 days |
So the durations actually impact the following rows start date. So in effect, assuming the above rows are already in order, the calculated start dates of each would be (now, 2 days, and then 6 days (2+4)). I don't want to calculate and store the start dates in the database, I'd rather it be calcualted from the order of the rows.

Would a standard approach just be to add an "order" column which is just an int column that defines the asc order of the rows relative to each other? Are there other approaches to consider?

Is the ordering immutable once the rows are inserted, or can it be changed later? Even if it's fixed, can a record be inserted in the middle of a group later? If it's fixed and you can't insert rows in the middle of the sequence later then you don't need an ordering column, you can just use insertion order (well, at least if you have a serial integer primary key). If the ordering is mutable though then you probably want to think a bit more carefully about this, because if the groups get large changing the order can start getting very expensive - for example, if you have an integer ordering column and just increase it by 1 for each row, if you want to insert a row immediately after the first one, you have to update every row in the group after that one. There are a number of solutions that you can use to optimize this, each with their own pitfalls - this blog post is a good place to start reading, I think.

Skyarb
Sep 20, 2018

MMMPH MMMPPHH MPPPH GLUCK GLUCK OH SORRY I DIDNT SEE YOU THERE I WAS JUST CHOKING DOWN THIS BATTLEFIELD COCK DID YOU KNOW BATTLEFIELD IS THE BEST VIDEO GAME EVER NOW IF YOULL EXCUSE ME ILL GO BACK TO THIS BATTLECOCK

TheFluff posted:

Is the ordering immutable once the rows are inserted, or can it be changed later? Even if it's fixed, can a record be inserted in the middle of a group later? If it's fixed and you can't insert rows in the middle of the sequence later then you don't need an ordering column, you can just use insertion order (well, at least if you have a serial integer primary key). If the ordering is mutable though then you probably want to think a bit more carefully about this, because if the groups get large changing the order can start getting very expensive - for example, if you have an integer ordering column and just increase it by 1 for each row, if you want to insert a row immediately after the first one, you have to update every row in the group after that one. There are a number of solutions that you can use to optimize this, each with their own pitfalls - this blog post is a good place to start reading, I think.

Interesting! Thanks for the read, these groups should likely never exceed 10 items so I think a simple int is fine for now and will keep it elegant (anytime something is inserted, the whole group will be redefined anyway)

galenanorth
May 19, 2016

Is what this page calls "read skew" just another name for what my textbook calls "inconsistent retrievals"? I'm also guessing they're are allowed to happen when phantom reads are allowed to happen, e.g. aggregation which is running while a new row is added.

https://vladmihalcea.com/a-beginners-guide-to-read-and-write-skew-phenomena/

I came across three pages which all put MVCC within the category "optimistic concurrency control"

https://stackoverflow.com/questions/5751271/optimistic-vs-multi-version-concurrency-control-differences
https://www.tutorialdba.com/2017/10/what-is-mvcc-multi-version-concurrency.html
https://vladmihalcea.com/optimistic-vs-pessimistic-locking/

but Andy Pavlo at Carnegie Melon University suggests that it's a larger framework of design decisions that can be used with concurrency control and brings up acronyms like MV2PL and MVOCC, so I'll go with the more authoritative source

https://15445.courses.cs.cmu.edu/fall2019/ Anyway, I couldn't help but notice "In-House DJ: DJ Drop Tables" at the end of this course page.

----

How do advisory locks work in PostgreSQL? I read https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS and https://www.netguru.com/codestories/advisory-locks, but I still don't get it. Where do you get the number provided to the function? The example uses an id attribute, so for a row, is it a surrogate key? Is it that you pass an arbitrary obscure number and using it somewhere in the application is a signal?

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS

says "key value", so I'm guessing that means surrogate key

galenanorth fucked around with this message at 22:33 on Mar 28, 2021

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
You can think of the number as being a pointer to an actual lock object in memory, if you like. If two things try to lock using the same pointer, one will succeed and the other will need to wait, while if two things try to lock different pointers they will both succeed and not interfere with each other.

If two things want to use the same lock then they need to use the same number - how exactly they agree on what number to use is up to you.

Star War Sex Parrot
Oct 2, 2003

galenanorth posted:

Is what this page calls "read skew" just another name for what my textbook calls "inconsistent retrievals"? I'm also guessing they're are allowed to happen when phantom reads are allowed to happen, e.g. aggregation which is running while a new row is added.

https://vladmihalcea.com/a-beginners-guide-to-read-and-write-skew-phenomena/
"Inconsistent retrievals" sounds like a more broad anomaly description, while read skew is more well defined. I think your intuition is correct that it could encompass phantoms as well.

galenanorth posted:

I came across three pages which all put MVCC within the category "optimistic concurrency control"

https://stackoverflow.com/questions/5751271/optimistic-vs-multi-version-concurrency-control-differences
https://www.tutorialdba.com/2017/10/what-is-mvcc-multi-version-concurrency.html
https://vladmihalcea.com/optimistic-vs-pessimistic-locking/

but Andy Pavlo at Carnegie Melon University suggests that it's a larger framework of design decisions that can be used with concurrency control and brings up acronyms like MV2PL and MVOCC, so I'll go with the more authoritative source
I'm gonna agree with Andy here that being multi-versioned doesn't really say much about what your concurrency control protocol is. They're somewhat orthogonal, and being multi-versioned is often more of a storage and query processing design decision. That said, I think I have to agree with Andy since I'm one of his PhD students.

Also if you're looking for people to chat with about the course (or learning DBMS internals), there's an unofficial 15-445/645 Discord that people are using to go through the course lectures and projects together.

galenanorth posted:

How do advisory locks work in PostgreSQL? I read https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS and https://www.netguru.com/codestories/advisory-locks, but I still don't get it. Where do you get the number provided to the function? The example uses an id attribute, so for a row, is it a surrogate key? Is it that you pass an arbitrary obscure number and using it somewhere in the application is a signal?

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS

says "key value", so I'm guessing that means surrogate key
As far as I can tell, advisory locks are basically DBMS-provided semaphores that application developers can use to enforce synchronization. The 64-bit or pair of 32-bit identifiers are user-defined, and then a client would use those identifiers to synchronize accesses. Maybe that's useful if you've got different applications that can't share mutexes?

edit: Also if you love learning about DBMS internals, the current (and previous) online seminar series have been great. Today's got Badrish from Microsoft Research talking about their awesome KV store FASTER. Previous highlights for me are 2020's DuckDB, Jepsen, SQL Server (focused on Cascades), and 2021's HarperDB talk (this one is wild and remarkably candid).

Star War Sex Parrot fucked around with this message at 16:35 on Mar 29, 2021

Gatac
Apr 22, 2008

Fifty Cent's next biopic.
Hey all, got a conceptual issue with Oracle's Advanced Queueing. I'm just getting into it for a smaller project, but it's got me thinking about an ETL process where I think it might be useful. Key points:

a) The sender needs to dump processed rows into some sort of buffer. A queue, a staging table, whatever. It can't wait on the receiver to be ready for processing.
b) The receiver needs granular control over when and which rows from that buffer it should process.
c) Processing may be manually triggered, on a schedule or (tricky) when X time has passed with no new rows matching condition Y being added to the buffer.
d) Everything in the buffer should be processed once and only once, adding new rows to the buffer must not affect any running processing.
e) Processing of sets of messages should be independent. Just because one set of messages is waiting for time X to pass before they're processed should not affect the processing of other messages in the buffer.

Fairly sure I could cover most of that and d) in particular made me think of AQ, but my worry is the tricky bit mentioned in c). I'd need to dequeue the last message in BROWSE mode to get its enqueue time, but from what I can see that's not a direct option. I may be able to hack the queue into more of a stack so it generally runs as LIFO, which I think would be okayish for this particular application, but really I'd rather have the queue run FIFO and have a different method to get the minimum age of an enqueued message matching a set of conditions. And as much as possible I'd like that solution to use the queue instead of going around it.

Ideas?

Spikes32
Jul 25, 2013

Happy trees
I've got a view I want to alter to add a column to. The view has several inner / left outer joins to it. I want my column to use some of those joins but not all and add one extra where clause just for this column but not the view as a whole. Is the best way to accomplish this just to create a subquery in the select statement and replicate the joins I want in my subquery?

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy

Spikes32 posted:

I've got a view I want to alter to add a column to. The view has several inner / left outer joins to it. I want my column to use some of those joins but not all and add one extra where clause just for this column but not the view as a whole. Is the best way to accomplish this just to create a subquery in the select statement and replicate the joins I want in my subquery?

it's extremely difficult to parse what you're asking for (for example - inner / left joins, do you mean the view itself is constructed using these joins, or other queries referencing this view are left/inner joining to it?), can you provide some code or pseudocode to give us an idea of what you're trying to accomplish?

Spikes32
Jul 25, 2013

Happy trees
Yeah sorry I'm sure what I wrote is only clear to le because I'm staring at it.

SELECT a, b, c
FROM S inner join I on S.id = I.id
Inner join CWS on I.cws_id = CWS.cws_id and CWS.completed = y
Bunch more joins
Where (I.start_date is not null) and (I.end_date is null)

I want to add column d with a few filters attached from the CWS table where I.end_date is not null, but not affect what the rest of the view pulls. The CWS table has multiple rows for each s.id. If this isn't clear enough let me know and I'll try again.

Gatac
Apr 22, 2008

Fifty Cent's next biopic.
code:
select a, b, c, d
  from s
  join l
    on s.id = l.id
   and l.start_date is not null
  join cws cws_a
    on l.cws_id = cws_a.cws_id 
   and cws_a.completed = y
   and l.end_date is null
  join cws cws_b
    on l.cws_id = cws_b.cws_id
   and l.end_date is not null
Like this? Join CWS again and put the different predicates for l in the respective ON clauses.

Spikes32
Jul 25, 2013

Happy trees
I need those joins to stay active for picking samples to show in the view, I just don't want them active specifically for the new column.

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


Spikes32 posted:

I need those joins to stay active for picking samples to show in the view, I just don't want them active specifically for the new column.

It sounds like you may need to do a whole-assed subquery (are those even legal in views?) instead of additional joins. The fact that CWS joins from I, and the WHERE has I.End_Date IS NULL means you can't really independently filter rows where I.End_Date IS NOT NULL instead.

I haven't put a ton of thought into this, but you're still running at an abstraction that makes it hard for me to determine your intent with the view or what the additional column represents.

The whole End_Date IS NULL / End_Date IS NOT NULL seems like it would be incompatible to me in the query. For a given row in the view now, what should the column you want to add even have?

Portland Sucks
Dec 21, 2004
༼ つ ◕_◕ ༽つ
I've got something going on in an aggregation query that is confusing me.

code:
SELECT
    COUNT(DISTINCT 'EMAIL') AS DISTINCT_EMAILS
FROM MEMBERS;

query result: 285,292
code:
WITH Q AS (
SELECT
  FISCAL_YEAR, FISCAL_QUARTER, REGION,  COUNT(DISTINCT 'EMAIL') AS DISTINCT_EMAILS
FROM MEMBERS
GROUP BY FISCAL_YEAR, FISCAL_QUARTER, REGION
)
SELECT SUM(DISTINCT_EMAILS) FROM Q;

query result: 285,654
Why am I increasing my aggregate count by 372 when I sum up my groups? Shouldn't the grouped sum be the total of the original count?

Spikes32
Jul 25, 2013

Happy trees

Nth Doctor posted:

It sounds like you may need to do a whole-assed subquery (are those even legal in views?) instead of additional joins. The fact that CWS joins from I, and the WHERE has I.End_Date IS NULL means you can't really independently filter rows where I.End_Date IS NOT NULL instead.

I haven't put a ton of thought into this, but you're still running at an abstraction that makes it hard for me to determine your intent with the view or what the additional column represents.

The whole End_Date IS NULL / End_Date IS NOT NULL seems like it would be incompatible to me in the query. For a given row in the view now, what should the column you want to add even have?

To try and explain what im doing, this view is used to create a report where all samples currently being incubated are listed. Samples can have more than one incubation phase, the view currently only grabs info from the current incubation phase that is ongoing (from cws/i table) .

I need to add a column that grabs the start date of the initial incubation even from samples that are no longer in that initial incubation. Thus changing the join is null clause for this column but not the view as a whole.

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy

Portland Sucks posted:

I've got something going on in an aggregation query that is confusing me.

code:
SELECT
    COUNT(DISTINCT 'EMAIL') AS DISTINCT_EMAILS
FROM MEMBERS;

query result: 285,292
code:
WITH Q AS (
SELECT
  FISCAL_YEAR, FISCAL_QUARTER, REGION,  COUNT(DISTINCT 'EMAIL') AS DISTINCT_EMAILS
FROM MEMBERS
GROUP BY FISCAL_YEAR, FISCAL_QUARTER, REGION
)
SELECT SUM(DISTINCT_EMAILS) FROM Q;

query result: 285,654
Why am I increasing my aggregate count by 372 when I sum up my groups? Shouldn't the grouped sum be the total of the original count?

if you have rows with the same email address with two different regions, because you're grouping your count distinct is going to catch the email twice: once for region A and once for region B

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy

Spikes32 posted:

To try and explain what im doing, this view is used to create a report where all samples currently being incubated are listed. Samples can have more than one incubation phase, the view currently only grabs info from the current incubation phase that is ongoing (from cws/i table) .

I need to add a column that grabs the start date of the initial incubation even from samples that are no longer in that initial incubation. Thus changing the join is null clause for this column but not the view as a whole.

if i'm reading this right, this sounds like subquery territory

SELECT a
, b
, c
, d = (select whatEver from CWS CWS2 where CWS.cws_id = CWS2.cws_id and I.end_date is not null and 1=1 and 2=2 and...)
FROM S inner join I on S.id = I.id
Inner join CWS on I.cws_id = CWS.cws_id and CWS.completed = y
Bunch more joins
Where (I.start_date is not null) and (I.end_date is null)

Spikes32
Jul 25, 2013

Happy trees
Thank you, so my initial idea of just replicating the joins again in the subquery was correct I appreciate the help.

Portland Sucks
Dec 21, 2004
༼ つ ◕_◕ ༽つ

kumba posted:

if you have rows with the same email address with two different regions, because you're grouping your count distinct is going to catch the email twice: once for region A and once for region B

Want to make sure I understand this logic...so if I have two rows

a@b.com, 2021, Q1, USA
a@b.com, 2021, Q1, EUR

when i just count distinct email i end up with 1

if I group and count distinct i end up with two groups

2021, Q1, USA 1
2021, Q1, EUR 1

then the select distinct is evaluated and summed which ends up giving me 2.

I think I get the problem, is there a more standard way to ask this question that'll prevent multiple counting or is this more of a data quality issue?

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Portland Sucks posted:

Want to make sure I understand this logic...so if I have two rows

a@b.com, 2021, Q1, USA
a@b.com, 2021, Q1, EUR

when i just count distinct email i end up with 1

if I group and count distinct i end up with two groups

2021, Q1, USA 1
2021, Q1, EUR 1

then the select distinct is evaluated and summed which ends up giving me 2.

I think I get the problem, is there a more standard way to ask this question that'll prevent multiple counting or is this more of a data quality issue?

Well if you have the same email in both USA and EUR, but you only want it to contribute to one of the two... which region do you want it to contribute to?

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE

Spikes32 posted:

To try and explain what im doing, this view is used to create a report where all samples currently being incubated are listed. Samples can have more than one incubation phase, the view currently only grabs info from the current incubation phase that is ongoing (from cws/i table) .

I need to add a column that grabs the start date of the initial incubation even from samples that are no longer in that initial incubation. Thus changing the join is null clause for this column but not the view as a whole.

I don't know what database you're using but if you have lateral joins (called cross apply/outer apply in MSSQL) this is like the textbook use case for them. Something like

SQL code:
select
  s.a,
  s.b,
  s.c,
  cws_initial.d
from
  s
  inner join I on s.id = I.id
  inner join CWS on I.cws_id = CWS.cws_id and CWS.completed = y
  -- Bunch more joins
  cross join lateral (
    select cws_2.*
    from
      CWS as cws_2
      inner join l as l_2 on l_2.cws_id = cws_2.cws_id and cws_2.completed = y
    where
      l_2.id = s.id
      and l_2.end_date is not null
    order by l_2.end_date asc
    limit 1
  ) as cws_initial on true
where
  I.start_date is not null
  and I.end_date is null
might work. Postgres syntax above. It's basically the same thing as the subquery solution already suggested but usually performs better.

e: I'm unsure what your data looks like but you may or may not want to convert the cross join to a left join.

TheFluff fucked around with this message at 22:09 on Apr 1, 2021

Spikes32
Jul 25, 2013

Happy trees
Thanks! I ended up writing a sub query that used a where clause to make the sub query Sample_id equal to the Sample_id in the main query which worked well and didn't add processing time that I could see. The dB is a Microsoft sql 2012,ill look into lateral joins to see if that would be better.

Empress Brosephine
Mar 31, 2012

by Jeffrey of YOSPOS
Hey goons simple question I hope:

Is there a way to cascade order by in Postgres? For example say I have the following table:

name | staffid | attribute_id
-----------------------------------------------------
Horse | null | 1
Bird | null | 10
Cat | 1 | 8 |
Dog | 2 | 8
Cat | 1 | 9
Dog | 2 | 9


Can I have it return so that the order is by attribute_id asc and then it sorts by staffid when it sees multiple entries in the order? Such as:
Horse | null | 1
Cat | 1 | 8
Cat | 1 | 9
Dog | 2 | 8
Dog | 2 | 9
Bird | null | 10

edit: I forgot to add, not everything has a staffid.
?

Empress Brosephine fucked around with this message at 14:48 on Apr 8, 2021

Kuule hain nussivan
Nov 27, 2008

Empress Brosephine posted:

Hey goons simple question I hope:

Is there a way to cascade order by in Postgres? For example say I have the following table:

name | staffid | attribute_id
-----------------------------------------------------
Horse | null | 1
Bird | null | 10
Cat | 1 | 8 |
Dog | 2 | 8
Cat | 1 | 9
Dog | 2 | 9


Can I have it return so that the order is by attribute_id asc and then it sorts by staffid when it sees multiple entries in the order? Such as:
Horse | null | 1
Cat | 1 | 8
Cat | 1 | 9
Dog | 2 | 8
Dog | 2 | 9
Bird | null | 10

edit: I forgot to add, not everything has a staffid.
?

Yep, you can simply add multiple columns to your ORDER BY clause. So ORDER BY name, attribute_id, staff_id

Empress Brosephine
Mar 31, 2012

by Jeffrey of YOSPOS

Kuule hain nussivan posted:

Yep, you can simply add multiple columns to your ORDER BY clause. So ORDER BY name, attribute_id, staff_id

Thats what I thought but i'm finding its returning this:

Horse | Null | 1
Cat | 1 | 8
Dog | 2 | 8
Cat | 1 | 9
Dog | 2 | 9

:(

I posted this on stackoverflow also and added a little descriptor:

quote:

Hey Cetin I just updated the list to make more sense and be more correct?

I guess ultimately I want to order by Attribute_ID until it sees multiple entries for a staff_id asc and then orders by staff_id before returning to attribute_id

Empress Brosephine
Mar 31, 2012

by Jeffrey of YOSPOS
They solved it by doing this:

quote:

order by min(attribute_id) over (partition by name),
name,
attribute_id;

which i'm not sure what a min & a over or a partition is but hey it works lol

Kuule hain nussivan
Nov 27, 2008

Which version of postgre are you using? I just tried it out on a fiddle with 9.6 and am getting the wanted result. Is your table structure just text NOT NULL, int NULL, int NOT NULL?

Empress Brosephine
Mar 31, 2012

by Jeffrey of YOSPOS
How do you tell the postgres version?

I'm actually using different data then this example, which now that I apply it to my data isn't really working as well as it should. Here's the actual data i'm using:

ignore the fake data in here:


code:
select f4.attrib_id, c.company_name, f3.attribute_name, f4.value, f4.staff_id 
            from company c 
            inner join formresponse f on c.id = f.company_id 
            inner join forms f2 on f.form_id = f2.form_id 
            inner join formquestion f3 on f2.form_id = f3.form_id 
            inner join formquestionresponse f4 on f3.attrib_id  = f4.attrib_id 
            where f.company_id = 2 AND f.form_id = 9 AND f.response_id = 60 AND f4.response_id = 60'

code:
attrib_id		company_name		attribute_name		value				staff_id
217	Philadelphia Flyers Child Care	IPDP Center Name	Philadelphia Flyers Child Care	
218	Philadelphia Flyers Child Care	IPDP Director Name	Brian Lutz
223	Philadelphia Flyers Child Care	100% IPDP Required  	no	
219	Philadelphia Flyers Child Care	IPDP Staff Person	Brian Lutz              47b77173-43a9-43e7-9c3b-5d0ceca79aec
220	Philadelphia Flyers Child Care	IPDP Completed?	no	                        47b77173-43a9-43e7-9c3b-5d0ceca79aec
221	Philadelphia Flyers Child Care	IPDP Completed Date		 			47b77173-43a9-43e7-9c3b-5d0ceca79aec
222	Philadelphia Flyers Child Care	IPDP Date of Hire	2021-04-12		47b77173-43a9-43e7-9c3b-5d0ceca79aec
219	Philadelphia Flyers Child Care	IPDP Staff Person	Pawl Maltz		542fc543-8290-4238-9788-106cea69ad1a
220	Philadelphia Flyers Child Care	IPDP Completed?	no				542fc543-8290-4238-9788-106cea69ad1a
221	Philadelphia Flyers Child Care	IPDP Completed Date					542fc543-8290-4238-9788-106cea69ad1a
222	Philadelphia Flyers Child Care	IPDP Date of Hire	2021-04-05		542fc543-8290-4238-9788-106cea69ad1a
This is what I wanted to accomplish:

code:
attrib_id		company_name				attribute_name		value							staff_id
217			Philadelphia Flyers Child Care	IPDP Center Name	Philadelphia Flyers Child Care	
218			Philadelphia Flyers Child Care	IPDP Director Name	Brian Lutz	
219			Philadelphia Flyers Child Care	IPDP Staff Person	Brian Lutz						47b77173-43a9-43e7-9c3b-5d0ceca79aec
220			Philadelphia Flyers Child Care	IPDP Completed?	no								47b77173-43a9-43e7-9c3b-5d0ceca79aec
221			Philadelphia Flyers Child Care	IPDP Completed Date									47b77173-43a9-43e7-9c3b-5d0ceca79aec
222			Philadelphia Flyers Child Care	IPDP Date of Hire	2021-04-12						47b77173-43a9-43e7-9c3b-5d0ceca79aec
219			Philadelphia Flyers Child Care	IPDP Staff Person	Pawl Maltz						542fc543-8290-4238-9788-106cea69ad1a
220			Philadelphia Flyers Child Care	IPDP Completed?	no								542fc543-8290-4238-9788-106cea69ad1a
221			Philadelphia Flyers Child Care	IPDP Completed Date									542fc543-8290-4238-9788-106cea69ad1a
222			Philadelphia Flyers Child Care	IPDP Date of Hire	2021-04-05						542fc543-8290-4238-9788-106cea69ad1a
223			Philadelphia Flyers Child Care	100% IPDP Required	no	
the closest I can get from what stackoverflow suggested is this:

attrib_id / company_name/ attribute_name / value / staff_id
code:

select f4.attrib_id, c.company_name, f3.attribute_name, f4.value, f4.staff_id 
            from company c 
            inner join formresponse f on c.id = f.company_id 
            inner join forms f2 on f.form_id = f2.form_id 
            inner join formquestion f3 on f2.form_id = f3.form_id 
            inner join formquestionresponse f4 on f3.attrib_id  = f4.attrib_id 
            where f.company_id = 2 AND f.form_id = 9 AND f.response_id = 60 AND f4.response_id = 60
            order by min(f4.attrib_id) over (partition by f4.staff_id), f4.staff_id , f4.attrib_id

code:
217	Philadelphia Flyers Child Care	IPDP Center Name	Philadelphia Flyers Child Care	
218	Philadelphia Flyers Child Care	IPDP Director Name	Brian Lutz
223	Philadelphia Flyers Child Care	100% IPDP Required	no	
219	Philadelphia Flyers Child Care	IPDP Staff Person	Brian Lutz 			47b77173-43a9-43e7-9c3b-5d0ceca79aec
220	Philadelphia Flyers Child Care	IPDP Completed?	no					47b77173-43a9-43e7-9c3b-5d0ceca79aec
221	Philadelphia Flyers Child Care	IPDP Completed Date						47b77173-43a9-43e7-9c3b-5d0ceca79aec
222	Philadelphia Flyers Child Care	IPDP Date of Hire	2021-04-12			47b77173-43a9-43e7-9c3b-5d0ceca79aec
219	Philadelphia Flyers Child Care	IPDP Staff Person	Pawl Maltz			542fc543-8290-4238-9788-106cea69ad1a
220	Philadelphia Flyers Child Care	IPDP Completed?	no					542fc543-8290-4238-9788-106cea69ad1a
221	Philadelphia Flyers Child Care	IPDP Completed Date						542fc543-8290-4238-9788-106cea69ad1a
222	Philadelphia Flyers Child Care	IPDP Date of Hire	2021-04-05			542fc543-8290-4238-9788-106cea69ad1a
My fields are set up like this:

attrib_id - int8 NOT NULL
------------------
company_name varchar NOTNULL
-----------------
attribute_name varchar NOTNULL
-----------------------
value varchar NULL
------------------------------
staff_id uuid NULL

Empress Brosephine fucked around with this message at 15:32 on Apr 8, 2021

Empress Brosephine
Mar 31, 2012

by Jeffrey of YOSPOS
which I think what I need is something like (pseudo code):

code:
order by min(f4.attrib_id) over (case when f4.staff_id > null then partition by f4.staff_id else ????? end), f4.attrib_id, f4.staff_id
if anything like that even exists or if you can do a case statement like that

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
order by case when staff_id is null then attribute_id else min(attribute_id) over (partition by staff_id) end

http://sqlfiddle.com/#!17/08d351/6

Adbot
ADBOT LOVES YOU

Empress Brosephine
Mar 31, 2012

by Jeffrey of YOSPOS

Hammerite posted:

order by case when staff_id is null then attribute_id else min(attribute_id) over (partition by staff_id) end

http://sqlfiddle.com/#!17/08d351/6

THANK YOU SO MUCH!

Here's what worked:

code:
      `select f4.attrib_id, c.company_name, f3.attribute_name, f4.value, f4.staff_id 
      from company c 
      inner join formresponse f on c.id = f.company_id 
      inner join forms f2 on f.form_id = f2.form_id 
      inner join formquestion f3 on f2.form_id = f3.form_id 
      inner join formquestionresponse f4 on f3.attrib_id  = f4.attrib_id 
      where f.company_id = $1 AND f.form_id = $2 AND f.response_id = $3 AND f4.response_id = $3
      order by case when f4.staff_id is null then f4.attrib_id else min(f4.attrib_id) over (partition by f4.staff_id) end,
   f4.staff_id,
    f4.attrib_id;

`,

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply