Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



https://discord.gg/fMXvTF

Adbot
ADBOT LOVES YOU

teen phone cutie
Jun 18, 2012

last year i rewrote something awful from scratch because i hate myself
Say I have a relational table called ingredients_in_cocktails that have 2 primary/foreign keys: Cocktail ID and Ingredient ID

so a row might look like:

code:
Ingredient ID          |           Cocktail ID                   |            action
1                                           1                                                 garnish
I want to add a new primary key just called ID (it won't be relational or anything). How do I accomplish this? My goal is to be able to add duplicate rows, such as this:

code:
ID           |        Ingredient ID          |           Cocktail ID                   |            action
1                       1                                           1                                                 add
1                       1                                           1                                                 garnish

teen phone cutie fucked around with this message at 03:19 on May 26, 2019

the talent deficit
Dec 20, 2003

self-deprecation is a very british trait, and problems can arise when the british attempt to do so with a foreign culture





Grump posted:

Say I have a relational table called ingredients_in_cocktails that have 2 primary/foreign keys: Cocktail ID and Ingredient ID

so a row might look like:

code:
Ingredient ID          |           Cocktail ID                   |            ounces
1                                           1                                                 garnish
I want to add a new primary key just called ID (it won't be relational or anything). How do I accomplish this? My goal is to be able to add duplicate rows, such as this:

code:
ID           |        Ingredient ID          |           Cocktail ID                   |            action
1                       1                                           1                                                 add
1                       1                                           1                                                 garnish

can you insert duplicates now? if not, you already have a unique index on the table and adding a new column and making it a primary key won't help you. you can drop the existing unique index (and recreate it as non-unique if you need it for query performance) and create the new column as a primary key

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

Yeah, just make ID the primary key and unique.

Then you can query like:
code:
select * from ingredients_in_cocktails
where cocktailid=x
You can probably massage the output by building some assumptions into ORDER BY IngredientID (I've seen what you're using this for in the Post Screenshots thread).

EDIT-This assumes you know what the cocktail is. Upon review of your app I'm guessing you'll also want to query by ingredient but maybe not return the extra rows.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Grump posted:

Say I have a relational table called ingredients_in_cocktails that have 2 primary/foreign keys: Cocktail ID and Ingredient ID
...

My goal is to be able to add duplicate rows, such as this:

code:
ID           |        Ingredient ID          |           Cocktail ID                   |            action
1                       1                                           1                                                 add
1                       1                                           1                                                 garnish

You don't really want to do that. Either ingredients are unique or recipes are ordered. In the first case, "lime juice" should be a different ingredient than "lime wedge".

In the latter case, much like with baking, "sugar" appears multiple times and the actual distinguishing characteristic is the 'order' of construction. Your composite key in that case will be recipe=(cocktail,step number), and just the foreign key constraint recipe.ingredient references ingredient.id.

As an aside, that table name is a tragedy. Try "cocktail", "ingredient", and "recipe".

teen phone cutie
Jun 18, 2012

last year i rewrote something awful from scratch because i hate myself
Ok so i need to create the ID column as a normal row, THEN remove the unique index, THEN give my ID column the unique index?

PhantomOfTheCopier posted:

You don't really want to do that. Either ingredients are unique or recipes are ordered. In the first case, "lime juice" should be a different ingredient than "lime wedge".

In the latter case, much like with baking, "sugar" appears multiple times and the actual distinguishing characteristic is the 'order' of construction. Your composite key in that case will be recipe=(cocktail,step number), and just the foreign key constraint recipe.ingredient references ingredient.id.

As an aside, that table name is a tragedy. Try "cocktail", "ingredient", and "recipe".

I do have a "step" column in that table - I just didn't post it.

I actually am solving for the problem where you need to wipe the rim of a cocktail glass with a lemon twist and then throw the lemon twist into the glass, similar to your sugar example.

Scaramouche posted:

Yeah, just make ID the primary key and unique.

Then you can query like:
code:
select * from ingredients_in_cocktails
where cocktailid=x
You can probably massage the output by building some assumptions into ORDER BY IngredientID (I've seen what you're using this for in the Post Screenshots thread).

EDIT-This assumes you know what the cocktail is. Upon review of your app I'm guessing you'll also want to query by ingredient but maybe not return the extra rows.

I'm assuming (and hope) my existing queries should be fine

teen phone cutie fucked around with this message at 03:23 on May 26, 2019

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Frankly speaking you don’t need to do any of that. You don’t even need a new column. Just remove your PK and recreate it as a non-clustered non-unique index.

Is this a good idea? Well... you probably want some unique way to identify your row, so probably not.

Your new ID column should probably be a the only column in a unique clustered index and it should have an auto-incrementing identity value. Basically a default primary key.

teen phone cutie
Jun 18, 2012

last year i rewrote something awful from scratch because i hate myself
hmmm I seem to be running into some errors while doing this. Mainly because both cocktail_id and ingredient_id are foreign keys.

When running something like:

code:
ALTER TABLE ings_in_cocktail DROP INDEX cocktail_id;
I get an error: "Cannot drop index 'cocktail_id': needed in a foreign key constraint"

And when trying

code:
ALTER TABLE `ings_in_cocktail` ADD `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY
I get an error: "Multiple primary key defined"

I am very much a SQL newbie, so I could use some hand-holding on this.

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Probably this:

Drop the foreign keys
Drop the indexes
Add the pk
Recreate the indexes
Recreate the foreign keys

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Grump posted:

I actually am solving for the problem where you need to wipe the rim of a cocktail glass with a lemon twist and then throw the lemon twist into the glass, similar to your sugar example.
These are not separate ingredients; it is simply part of the instructions. The sugar example from baking requires different sugar in different steps, but could just as well list the aggregate amount as the ingredients with specifics in the instructions. The ingredient here is just "1 lemon twist".

Consider: Do you really, really want to model connected steps between rows? What is the benefit? What queries do you hope to support with such complexity? (And it may not matter, but do you really think what you're trying to build is what all those recipe sites use?)

"Place 1c crushed ice into the glass obtained in step 5".
"Fill glass with water".
...
"Taking the now chilled glass from step 11, dump the ice (from step 9, not the ice you put in the shaker!!) and the water (from step 10), into the sink located in step 3".

And what happens when the step numbers are updated? I still say this approach leads to madness.

But this is all a trick. I'll bet you're really doing a homework problem to write instructions for making a peanut butter and jelly sandwich. :downsrim:

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


You don’t need the ordinal/step to have meaning or appear. The steps could just say “wipe the rim with the lemon and save the lemon for later” and “put the lemon you saved into the drink” or w/e.

teen phone cutie
Jun 18, 2012

last year i rewrote something awful from scratch because i hate myself

Ruggan posted:

You don’t need the ordinal/step to have meaning or appear. The steps could just say “wipe the rim with the lemon and save the lemon for later” and “put the lemon you saved into the drink” or w/e.

This is a good point. I'd probably be better off just adding a "step text" column or something like that. Thanks for that. I think I'm over-complicating this.


PhantomOfTheCopier posted:

But this is all a trick. I'll bet you're really doing a homework problem to write instructions for making a peanut butter and jelly sandwich. :downsrim:

Or I'm a front-end developer trying to learn SQL and Python in my spare time jesus loving christ

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

I realize this is arm-chair architecting so if it comes across as wrong-headed I apologize, but it sounds like you need a different class of "ingredients" that are more like "instructions"?

Kuule hain nussivan
Nov 27, 2008

If you want to over-engineer it, you should have Coctail ID | Ingredient ID | Step Number | Action as your table with a PK on Coctail ID | Ingredient ID | Step Number. That way you can have the same ingredient for multiple actions at different points of the recipe. If you want to really over-engineer it, have a FK to an action-table with a list of predefined actions. If you want to really really over-engineer it, add an amount column and a unit column, where the unit column is a FK to a unit-table.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Grump posted:

This is a good point. I'd probably be better off just adding a "step text" column or something like that. Thanks for that. I think I'm over-complicating this.

Scaramouche posted:

I realize this is arm-chair architecting so if it comes across as wrong-headed I apologize, but it sounds like you need a different class of "ingredients" that are more like "instructions"?
I would think the problem is that ingredients are not steps. Cocktails are the class of recipes where the correspondence is closest to 1-to-1, but it's still not exact. My suggestion would be to keep the ingredients with the unique many-to-many, but get rid of the action column. Then, have the steps be either a text field on the recipe header or be a detail table with a foreign key to the recipe header. And maybe make the steps optional since there are plenty of cocktails where all you need to know is what goes into them, but it's not all of them.

teen phone cutie
Jun 18, 2012

last year i rewrote something awful from scratch because i hate myself
I posted redacted a readacted version of my table. Here's what I actually have

Cocktails:

code:
ID   |    Name     |   Glass     |    Finish
1         Mojito       Rocks          Shaken with Ice
Ingredients:

code:
ID   |    Name    |   Type
1         Rum        Liquor
2         Mint       Garnish
CocktailIngredients

code:
Ing_ID (FK)   |   Cocktail_ID (FK)     |  Unit      |   Amount  |  Action   |    Step
1             |           1            |  Ounce     |   1.5     |  Add      |     1
2             |           1            |  Leaves     |   5      |  Garnish  |     2

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Grump posted:

I posted redacted a readacted version of my table. Here's what I actually have

What if you have a step that doesn't involve ingredients (Shake, Rest, Heat, Apply Smoke Infusion, etc.)? What if you need to prepare some of the ingredients in a separate recipient, and then add the mixture all at once?

I've modelled food production systems in relational DBs a few times. For something straightforward (read: no variations or improvisation allowed) you could use something like this:

code:
Cocktails (
   CocktailID
   Name
   <other properties>
)

Steps (
   StepID
   CocktailID -> Cocktails
   Position // Unique constraint on CocktailID + Position. Keeping StepID separate allows easier reordering and changing.
   Action 
)

Ingredients (
    IngredientID
    StepID -> Steps
    ItemID -> Items
    Quantity
    // Units may or may not be worth adding. If you aren't doing
    // any calculations, they may not appear at all (leave Quantity as a dumb 
    // text column so people can write stuff like 'to taste' or '2-3 spoons').
    // At the other extreme, they may deserve a full table hierarchy of their own
    // if you want to do statistics over cocktail compositions and the like
)
and one of the very first upgrades I would add would be an AlternativeIngredients with a FK to Ingredients, to allow for cocktail variations. Then you could add a Tools table (FK to Steps), and so forth until full enterprise®.

NihilCredo fucked around with this message at 17:43 on May 28, 2019

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!
Like I said, I would remove "step" information from CocktailIngredients and put it in a separate table. An ingredient can be used in multiple steps, and a step can involve multiple ingredients or no ingredients at all. E.G. when you stir a Martini, you're not stirring any particular ingredients (maybe you could say you're stirring all of them).

E: F,b.

E2: If you just want something simple, on the cocktail itself, instead of the "Finish" column, I'd have a varchar(4000) "Instructions" column

Jethro fucked around with this message at 19:49 on May 28, 2019

teen phone cutie
Jun 18, 2012

last year i rewrote something awful from scratch because i hate myself
yeah these are all good points. tbh, I'm not really looking to over-engineer this thing. It's just a fun little portfolio project to prove I can work with backend tech.

They're probably all stretch goals I can accomplish if I ever get bored.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
Disclaimer: When you see an emote like that, it's a joke. But like all jokes... have you ever tried writing instructions for peanut butter and jelly? :)

Disclaimer: I've done the this exercise. Cocktails, recipes, etc. Back in the days when a negation query wasn't optimised in PostgreSQL and I hanged the poor thing looking for drinks with A and B and not C.

Amazing how hard the problem can be made, eh? I still think the 'ingredient' table should be the aggregates (14 limes) even if the recipe uses them in different steps (13 lime mix, 1 for garnish).


Disclaimer: I don't run a cocktail recipe website. :eng99:


Fun exercise though.

Jose
Jul 24, 2007

Adrian Chiles is a broadcaster and writer
What are some good places to learn how to use SSIS/AS/RS? Just had a phone interview where it won't be progressing because the company don't have time to train me to use them

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy

Jose posted:

What are some good places to learn how to use SSIS/AS/RS? Just had a phone interview where it won't be progressing because the company don't have time to train me to use them

I'm only familiar with SSRS, but it's pretty basic stuff if you already know SQL and are at all familiar with visual studio. I'd suggest just playing around with it if you have access to VS and can toy with queries and presentation

The query syntax is almost identical to T-SQL, with a few notable differences, the most important one being variable names are case sensitive in SSRS

Other than that I'd say youtube some tutorials on how to play with row/column grouping and go from there

Tibalt
May 14, 2017

What, drawn, and talk of peace! I hate the word, As I hate hell, all Montagues, and thee

So I have less of an SQL question and more of a development approach question.

I've moved from a Medicare Advantage company that used SAS to one that's using SQL - SSMS, SSRS, etc. My first task is to set up the Payment Year 2020 Model in SQL. To anyone who isn't familiar with risk adjustment models (aka everyone), the way the model works is that a SAS dataset of patient information and their diagnosis codes are fed into the model, the patients are grouped into demographic categories (i.e. "Male, institutionalized, 74-79"), the diagnoses grouped together condition categories (i.e. "Diabetes with Complications") that are assigned to the patients, and then a risk score for each member is calculated based on their demographics and conditions. There's some extra elements where some co-morbidity and the number of conditions increase the score, but that's the gist.

My thought is that, well, I'll re-make the SAS program macros step-by-step using tables and queries. It'll take a bit of work to implement, but not particularly difficult to figure out, and it has the advantage that I can justify each step in my calculations by pointing to model in SAS. But is that the best way to approach this problem? It feels a bit crude to take a SAS block and force it through a SQL hole, if that makes any sense, but I also don't want to make this problem harder than it has to be.

Joda
Apr 24, 2010

When I'm off, I just like to really let go and have fun, y'know?

Fun Shoe

Tibalt posted:

So I have less of an SQL question and more of a development approach question.

I've moved from a Medicare Advantage company that used SAS to one that's using SQL - SSMS, SSRS, etc. My first task is to set up the Payment Year 2020 Model in SQL. To anyone who isn't familiar with risk adjustment models (aka everyone), the way the model works is that a SAS dataset of patient information and their diagnosis codes are fed into the model, the patients are grouped into demographic categories (i.e. "Male, institutionalized, 74-79"), the diagnoses grouped together condition categories (i.e. "Diabetes with Complications") that are assigned to the patients, and then a risk score for each member is calculated based on their demographics and conditions. There's some extra elements where some co-morbidity and the number of conditions increase the score, but that's the gist.

My thought is that, well, I'll re-make the SAS program macros step-by-step using tables and queries. It'll take a bit of work to implement, but not particularly difficult to figure out, and it has the advantage that I can justify each step in my calculations by pointing to model in SAS. But is that the best way to approach this problem? It feels a bit crude to take a SAS block and force it through a SQL hole, if that makes any sense, but I also don't want to make this problem harder than it has to be.

I always prefer a natural language business logic reference provided by or worked out with the customer/PM and then starting from scratch on the actual implementation in cases like these. Your proposed approach is how we're currently paying all our technical debt at my current workplace, and in most cases it means there wasn't any real benefit to switching

Joda fucked around with this message at 12:57 on Jun 15, 2019

Jose
Jul 24, 2007

Adrian Chiles is a broadcaster and writer

kumba posted:

I'm only familiar with SSRS, but it's pretty basic stuff if you already know SQL and are at all familiar with visual studio. I'd suggest just playing around with it if you have access to VS and can toy with queries and presentation

The query syntax is almost identical to T-SQL, with a few notable differences, the most important one being variable names are case sensitive in SSRS

Other than that I'd say youtube some tutorials on how to play with row/column grouping and go from there

I don't have VS currently. is it possible to get trial/local versions like I did with SQL server?

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy

Jose posted:

I don't have VS currently. is it possible to get trial/local versions like I did with SQL server?

I'm not familiar with the free versions but it appears so

https://visualstudio.microsoft.com/free-developer-offers/

Shy
Mar 20, 2010

VS Community is basically VS Pro but for free.

Volguus
Mar 3, 2009

Shy posted:

VS Community is basically VS Pro but for free.

For teams (or companies?) less than 5 people.

Jose
Jul 24, 2007

Adrian Chiles is a broadcaster and writer
lol poo poo i forgot that the import/export wizard is part of SSIS so I've not been talking about it in that sense with jobs despite making extremely heavy use of it in my last role

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved
What options do I have to copy a few PostgreSQL tables to Redis and keep those schemas up-to-date in Redis whenever a change occurs in PostgreSQL? It's for keeping mail lookup tables in PostgreSQL in sync with Redis for rspamd to determine handling preferences.

I see there's Apache Storm, but anything that's specific for this case?

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
Maybe the Redis foreign data wrappers.

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved

That's plan B. It'll allow me to push data from Postgres into Redis as a separate query, in which case I'd just run separate Postgres + Redis queries since it also has to work with 9.2 to 12 and beyond. Downside is it won't capture queries issued directly to the tables without a trigger. There's also no guarantee the host will be running rspamd + Redis, so it has its limitations.

I'm looking for something that'll arbitrate data between Redis (read-only) and Postgres (read-write) and monitor Postgres for changes from a selection of tables. Almost like replication but concerned with the data not the queries that result in the data. It doesn't need to be real-time either, 30 second lag is fine.

Postgres has a pub/sub facility that'll fit the bill in a flexible manner. Emit a pg_notify() whenever a record is updated then feed the changes to Redis through a separate subscriber.

Desync is still possible if the subscriber goes down, which I'd like a clean approach. :arghfist::saddowns:

Pardot
Jul 25, 2001




There’s also the logical decoding stuff in postgres as of a couple of releases ago. It’d be an efficient thing to use, but there’s probably nothing off the shelf that’ll do exactly what you want, so it’d be on you to write something that uses it and updates redis.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

nem posted:

I'm looking for something that'll arbitrate data between Redis (read-only) and Postgres (read-write) and monitor Postgres for changes from a selection of tables. Almost like replication but concerned with the data not the queries that result in the data. It doesn't need to be real-time either, 30 second lag is fine.
This jumped out at me. I don't have time to look it up, but if you can find a helper that can decode WAL, you can set up WAL replication and simply update Redis from that content (or use partial WAL data to inform the queries that perform the update).

Olly the Otter
Jul 22, 2007
So I'm completely at a loss: Sometimes my PostgreSQL tables get into a state where a simple DELETE...WHERE query gives completely unexpected results.

code:
mydb=> select * from tasks;
 tid | dsid 
-----+------
  85 |  101
  86 |  102
  87 |  103
  88 |  104
  89 |  105
(5 rows)

mydb=> begin;
BEGIN
mydb=> delete from tasks where tid in (select tid from tasks order by tid limit 1 for update) returning dsid;
 dsid 
------
  101
  102
  103
  104
  105
(5 rows)
How can it be deleting all 5 rows, when the subquery can only return 1 value of tid? If I do the SELECT from the subquery as a separate step, it only returns one tid value.

Here's the structure of the table:

code:
mydb=> \d+ tasks
                                                Table "public.tasks"
 Column |  Type   | Collation | Nullable |              Default               | Storage | Stats target | Description 
--------+---------+-----------+----------+------------------------------------+---------+--------------+-------------
 tid    | integer |           | not null | nextval('tasks_tid_seq'::regclass) | plain   |              | 
 dsid   | integer |           | not null |                                    | plain   |              | 
Indexes:
    "tasks_pkey" PRIMARY KEY, btree (dsid)

Yeah, I know, the PRIMARY KEY doesn't make any sense here, but I still don't know how that would cause the problem.

I cannot easily reproduce this by creating and populating a new table, or even by pg_dump'ing a bad table and restoring that. But it happens regularly after running some software that uses this table for a while.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Olly the Otter posted:

So I'm completely at a loss: Sometimes my PostgreSQL tables get into a state where a simple DELETE...WHERE query gives completely unexpected results.

Yeah, I know, the PRIMARY KEY doesn't make any sense here, but I still don't know how that would cause the problem.

I cannot easily reproduce this by creating and populating a new table, or even by pg_dump'ing a bad table and restoring that. But it happens regularly after running some software that uses this table for a while.

I haven't ever used PostgreSQL, but it looks as though it's evaluating the condition in the WHERE clause separately for each row of the table. If rows are evaluated for the outer query in order of increasing tid and the delete is actually carried out for a row before the WHERE clause for the next row is evaluated, that's going to result in your table being cleared. If it does this unpredictably then I guess it sometimes doesn't evaluate rows for the outer query in order of increasing tid. It sounds like what you want is for the set of rows to be deleted to be evaluated once, rather than for each row. Maybe you can make that happen either by (1) computing the tid to be deleted in a separate statement, (2) using a CTE, or (3) putting the subquery in the FROM clause rather than the WHERE clause?

redleader
Aug 18, 2005

Engage according to operational parameters
I'd be a little shocked if that was the case, since that sounds like the Halloween problem - and I wouldn't expect that in Postgres. Does EXPLAIN shed any light on it?

Olly the Otter
Jul 22, 2007
I'd be astonished if it's actually running the subquery multiple times, with row deletion happening in between. Is there a query I could try that might confirm this behavior deterministically?

I tried EXPLAIN, but I'm not very good at reading the output. I can tell that it's basically saying the query's inefficient due to the bad primary key, but not sure what else is helpful here.

code:
mydb=> explain delete from tasks where tid in (select tid from tasks order by tid limit 1 for update) returning dsid;
                                                 QUERY PLAN                                                 
------------------------------------------------------------------------------------------------------------
 Delete on tasks  (cost=43.94..82.48 rows=1 width=34)
   ->  Hash Semi Join  (cost=43.94..82.48 rows=1 width=34)
         Hash Cond: (tasks.tid = "ANY_subquery".tid)
         ->  Seq Scan on tasks  (cost=0.00..32.60 rows=2260 width=10)
         ->  Hash  (cost=43.92..43.92 rows=1 width=32)
               ->  Subquery Scan on "ANY_subquery"  (cost=43.90..43.92 rows=1 width=32)
                     ->  Limit  (cost=43.90..43.91 rows=1 width=10)
                           ->  LockRows  (cost=43.90..72.15 rows=2260 width=10)
                                 ->  Sort  (cost=43.90..49.55 rows=2260 width=10)
                                       Sort Key: tasks_1.tid
                                       ->  Seq Scan on tasks tasks_1  (cost=0.00..32.60 rows=2260 width=10)

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Olly the Otter posted:

I'd be astonished if it's actually running the subquery multiple times, with row deletion happening in between. Is there a query I could try that might confirm this behavior deterministically?

No idea.

I created 2 SQL fiddles:

http://sqlfiddle.com/#!17/7964c/1 <-- uses your form of query
http://sqlfiddle.com/#!17/7964c/3 <-- uses PostgreSQL's "using" multi-table delete syntax

Both give the same result but I suspect, given the behaviour you report, that for the first one this is purely down to the order in which the database engine evaluates the rows of the table. Have you tried something like that second delete query?

Adbot
ADBOT LOVES YOU

Olly the Otter
Jul 22, 2007
The second query form doesn't seem to make any difference:

code:
tadpole=> select * from tasks;
 tid  | dsid 
------+------
 2369 |   50
 2370 |   51
 2371 |   52
 2372 |   53
 2373 |   54
 2374 |   55
(6 rows)

tadpole=> delete from tasks using (select tid from tasks order by dsid limit 1 for update) as sq where tasks.tid = sq.tid returning dsid;
 dsid 
------
   50
   51
   52
   53
   54
   55
(6 rows)

DELETE 6
Edit: CTE seems to help. I wish I knew why, though.

code:
tadpole=> with sq as (select tid from tasks order by tid limit 1 for update) delete from tasks where tid in (select tid from sq) returning dsid;
 dsid 
------
   50
(1 row)

DELETE 1
Edit2: The EXPLAIN seems to have changed -- the previous EXPLAIN was before the table had become corrupted, so here's one from this apparently corrupted table:

code:
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Delete on tasks  (cost=1.01..2.05 rows=1 width=34)
   ->  Nested Loop Semi Join  (cost=1.01..2.05 rows=1 width=34)
         Join Filter: (tasks.tid = "ANY_subquery".tid)
         ->  Seq Scan on tasks  (cost=0.00..1.00 rows=1 width=10)
         ->  Subquery Scan on "ANY_subquery"  (cost=1.01..1.03 rows=1 width=32)
               ->  Limit  (cost=1.01..1.02 rows=1 width=10)
                     ->  LockRows  (cost=1.01..1.02 rows=1 width=10)
                           ->  Sort  (cost=1.01..1.01 rows=1 width=10)
                                 Sort Key: tasks_1.tid
                                 ->  Seq Scan on tasks tasks_1  (cost=0.00..1.00 rows=1 width=10)

Olly the Otter fucked around with this message at 22:56 on Jun 27, 2019

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply