Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
Ninja.Bob
Mar 31, 2005

Twerk from Home posted:

I feel like there must be a better way to handle filters in a HAVING statement. I can't use window functions there, and it stinks.

Can you use a subquery?

SQL code:
select
   case when fraction > 0.2 then color else 'other' end as color
   , sum(count) as count
from (
  select distinct
    color
    , count(*) over (partition by color) as count
    , cast(count(*) over (partition by color) as float) / count(*) over () as fraction
  from colors
--   where year = 1994
) color_fractions
group by case when fraction > 0.2 then color else 'other' end
order by count desc;

Ninja.Bob fucked around with this message at 05:07 on Oct 22, 2022

Adbot
ADBOT LOVES YOU

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

Ninja.Bob posted:

Can you use a subquery?

Yes, and your solution is a much better approach to the whole space, thanks!

Just-In-Timeberlake
Aug 18, 2003

DELETE CASCADE posted:

is it really best practice in sql server to write all your identifiers [like].[this]? what a pain in the rear end

The SQL intellisense tool i use does the formatting for me, I don’t type that stuff.

I’m not sure if it’s still the case, but back in the days of SQL 2005 if you were executing dynamic SQL there were performance gains to be had by fully qualifying your objects and fields.

kiwid
Sep 30, 2013

(TSQL)

I have the following table data:



I'm trying to get the max record for each bin_id with a date less than a specified date. How can I properly achieve this? I've messed around with distinct and group by but I don't think that's right.

For example, if I wanted the last record (by updated_at) that is less than 2021-01-01, I should expect IDs: 58, 56, 54, 48, 37.



I'm struggling to figure this out. It's probably simple but I'm not getting it.

abelwingnut
Dec 23, 2002


i would create a cte or subquery to reduce your dataset to one only having records from before whatever date.

then do a second cte or subquery where you create a new column defined by the ROW_NUMBER() function. call the column rn. you want:

ROW_NUMBER OVER (PARTITION BY [columns you want to partition by] ORDER BY date DESC) AS rn

this will sort the records you want by whatever partition you define and assign them a row number. because you are sorting descending, the top row will be the max date.


then in the final query, you query the second cte or subquery for where rn = 1.

sorry if this is awkwardly explained, on mobile.

abelwingnut fucked around with this message at 22:16 on Nov 8, 2022

kiwid
Sep 30, 2013

abelwingnut posted:

i would create a cte or subquery to reduce your dataset to one only having records from before whatever date, then do a group by to get the max.

Thanks, I don't know why but I always try to avoid subqueries but in this case I'm not sure if it's possible.

abelwingnut
Dec 23, 2002


i updated what i would do, so be sure to reread my post.

and yea, i will do cte and subqueries all over the place. they make sql much easier and clearer. one step at a time and all that.

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Make sure you think through how to handle two records from the same bin with the same updated date.

The ROW_NUMBER rn=1 solution at least prevents you from getting two results (unless you’d like to see both in the case of a tie), but it might not get you the result you want.

kiwid
Sep 30, 2013

Ruggan posted:

Make sure you think through how to handle two records from the same bin with the same updated date.

The ROW_NUMBER rn=1 solution at least prevents you from getting two results (unless you’d like to see both in the case of a tie), but it might not get you the result you want.

Definitely don't want a tie, I'm about to dig into this row_number() and partition stuff now. Never touched either before.

kiwid
Sep 30, 2013

Oh wow, that was so easy... Just have to grab rn = 1 now.

NPR Journalizard
Feb 14, 2008

Im having an issue with a left join and I do not know what I am doing wrong. I have tested this in snowflake and t-sql and the behaviour is the same.

I have 3 tables that I want to do left joins over.

code:
select 
dataareaid,
voucher,
invoice
from ledgerjournaltrans
where upper(voucher) = 'VI000045' and dataareaid = 'ifa'


ifa	VI000045	170077170
ifa	VI000045	

 
code:
select 
upper(dataareaid),
vtv.voucher,
upper(invoice)
from vendtrans vtv
where upper(voucher) = 'VI000045' and dataareaid = 'ifa';

IFA	VI000045	170077170

Those are the two tables I want to join to the first table.

code:
select 
t.source,
t.dataareaid as "T.Company",
t.voucher as "T.Voucher",
ljt.dataareaid as "ljt.comp",
ljt.voucher as "ljt.voucher",
vtv.dataareaid as "vtv.comp",
vtv.voucher as "vtv.voucher",
upper(ljt.invoice) as "ljt.invoice",
upper(vtv.invoice) as "tvt.invoice",
LEN( ljt.invoice)

from taxtrans t
        left join ledgerjournaltrans    ljt on upper(ljt.dataareaid) = upper(t.dataareaid)                       
                                                                            and ljt.recid = t.sourcerecid
                                                                            

        left join vendtrans             vtv on upper(ljt.voucher) = upper(vtv.voucher) 
                                                                            and upper(ljt.dataareaid) = upper(vtv.dataareaid)
                                                                            and upper(ljt.invoice) = upper(vtv.invoice)
                                                                            
                                                                            and t.source = 2 
									   and LEN( ljt.invoice) > 0    
                                                                           
                                                                            
where  upper(t.voucher) = 'VI000045' and t.dataareaid = 'ifa';


source	T.Company	T.Voucher	ljt.comp	ljt.voucher	vtv.comp	vtv.voucher	ljt.invoice	tvt.invoice	(No column name)
2		ifa	VI000045	ifa		VI000045	NULL		NULL				NULL		0

It looks like the join to bring in vtv is selecting the row with the blank ljt.invoice value, when i want it to only bring in rows with a valid invoice value. I have tried a couple of different ways to filter out the blank invoice including the LEN one shown, as well as ljt.invoice != ''" and ljt.invoice <> "", but none of them are working as I would expect them to.

Anyone have any ideas about whats going on?

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
The typical approach is left join on "blah IS NOT NULL".

Why is there an explicit condition on only-t in your join condition? Why are you joining vtv on ltj instead of on t? Rather, why is your second join condition against three tables?

The usual approach to understanding these things is to remove all your conditions so you can see the proper join results. Everything else is just a filter on that Cartesian product, but you need to validate the rows are computed first. Remember that left joins give explicit nulls by design, so it's not a surprise they would be present.

It seems likely that there's no matching row when joining against vtv. Note the length invoice is null, suggesting there's no real row meeting the second set of join conditions.

NPR Journalizard
Feb 14, 2008

PhantomOfTheCopier posted:

The typical approach is left join on "blah IS NOT NULL".
The scenario that is causing the problem is when the ljt.invoice is blank, not null. I cant work out why its trying to join on the blank invoice row, and not the one with the invoice reference 170077170

quote:

Why is there an explicit condition on only-t in your join condition? Why are you joining vtv on ltj instead of on t? Rather, why is your second join condition against three tables?
t does not have all the required information to join vtv on correctly, its missing an invoice reference, so we have to get that from ljt first. I can take the t.source = 2 condition out, and still get the same results.

quote:

The usual approach to understanding these things is to remove all your conditions so you can see the proper join results. Everything else is just a filter on that Cartesian product, but you need to validate the rows are computed first. Remember that left joins give explicit nulls by design, so it's not a surprise they would be present.

It seems likely that there's no matching row when joining against vtv. Note the length invoice is null, suggesting there's no real row meeting the second set of join conditions.

There is a matching row between ljt and vtv. In the example below, its the one with the invoice 170077170. The problem is that its using the row with the blank invoice, when im specifically trying to get it to join on the invoice 170077170

The help is appreciated.

Ninja.Bob
Mar 31, 2005

NPR Journalizard posted:

It looks like the join to bring in vtv is selecting the row with the blank ljt.invoice value, when i want it to only bring in rows with a valid invoice value.

Try moving the len(ljt.invoice) > 0 from the second on clause to the first on clause. Currently this does not filter out rows with blank invoices, it just sets vtv.* to null for them.

NPR Journalizard
Feb 14, 2008

Ninja.Bob posted:

Try moving the len(ljt.invoice) > 0 from the second on clause to the first on clause. Currently this does not filter out rows with blank invoices, it just sets vtv.* to null for them.

I tried this as below

code:
from taxtrans t
        left join ledgerjournaltrans    ljt on upper(ljt.dataareaid) = upper(t.dataareaid)                       
                                                                            and ljt.recid = t.sourcerecid
									     and LEN( ljt.invoice) > 0 
                                                                            

        left join vendtrans             vtv on upper(ljt.voucher) = upper(vtv.voucher) 
                                                                            and upper(ljt.dataareaid) = upper(vtv.dataareaid)
                                                                            and upper(ljt.invoice) = upper(vtv.invoice)
                                                                            
                                                                            and t.source = 2
and got the following results.

code:
source	T.Company	T.Voucher	ljt.comp	ljt.voucher	vtv.comp	vtv.voucher	ljt.invoice	tvt.invoice	(No column name)
2	ifa		VI000045	NULL		NULL		NULL		NULL		NULL		NULL		NULL
If it helps, this is confusing the poo poo out of the data engineer and the data architect as well.

raminasi
Jan 25, 2005

a last drink with no ice

NPR Journalizard posted:

I tried this as below

code:
from taxtrans t
        left join ledgerjournaltrans    ljt on upper(ljt.dataareaid) = upper(t.dataareaid)                       
                                                                            and ljt.recid = t.sourcerecid
									     and LEN( ljt.invoice) > 0 
                                                                            

        left join vendtrans             vtv on upper(ljt.voucher) = upper(vtv.voucher) 
                                                                            and upper(ljt.dataareaid) = upper(vtv.dataareaid)
                                                                            and upper(ljt.invoice) = upper(vtv.invoice)
                                                                            
                                                                            and t.source = 2
and got the following results.

code:
source	T.Company	T.Voucher	ljt.comp	ljt.voucher	vtv.comp	vtv.voucher	ljt.invoice	tvt.invoice	(No column name)
2	ifa		VI000045	NULL		NULL		NULL		NULL		NULL		NULL		NULL
If it helps, this is confusing the poo poo out of the data engineer and the data architect as well.

Ninja.Bob identified the underlying problem, but it's just as present on the first join as it was on the second: That kind of predicate rarely works as expected in left joins, because left joins (by design) don't produce empty result sets. If there's no matching row, you just get a "match" to a bunch of NULLs.

There may be a clever way to juggle the join predicates to do what you want, but you're at the point that I'd personally just start using CTEs.

NPR Journalizard
Feb 14, 2008

raminasi posted:

Ninja.Bob identified the underlying problem, but it's just as present on the first join as it was on the second: That kind of predicate rarely works as expected in left joins, because left joins (by design) don't produce empty result sets. If there's no matching row, you just get a "match" to a bunch of NULLs.

There may be a clever way to juggle the join predicates to do what you want, but you're at the point that I'd personally just start using CTEs.

There is a matching row though. The problem is that the matching row is not being joined on.

There are two relevant rows in the ljt table. One has a matching invoice number, and one has a blank invoice number.

quote:

left join vendtrans vtv on upper(ljt.voucher) = upper(vtv.voucher)
and upper(ljt.dataareaid) = upper(vtv.dataareaid)
and upper(ljt.invoice) = upper(vtv.invoice)

Its like that bit of the join logic is being ignored. Its joining a row that has an invoice record that doesnt match, so when i try to join vtv on using that invoice number, its a blank value and so vtv doesnt join and i get a bunch of nulls.

The len(invoice) > 0 was an attempt to force the join on a row that has a valid (non blank) invoice.

Heavy_D
Feb 16, 2002

"rararararara" contains the meaning of everything, kept in simple rectangular structures
What is ljt.recid = t.sourcerecid doing in the first join? Is it excluding the row which has a non-blank invoice value?

NPR Journalizard
Feb 14, 2008

Heavy_D posted:

What is ljt.recid = t.sourcerecid doing in the first join? Is it excluding the row which has a non-blank invoice value?

Any recid is a reference to a unique record identifier. So we have a tax transaction in the first table (taxtrans), which is linked to a credit and debit transaction in the second table (ledgerjournaltrans) through the rec ids.

Heroic Yoshimitsu
Jan 15, 2008

Ok, really really dumb question but why does

SELECT id FROM table_name WHERE value LIKE “%Between%”

returns results but

SELECT id FROM table_name WHERE CONTAINS(value, ‘Between’)

Returns nothing? CONTAINS is pretty new to me

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

Heroic Yoshimitsu posted:

Ok, really really dumb question but why does

SELECT id FROM table_name WHERE value LIKE “%Between%”

returns results but

SELECT id FROM table_name WHERE CONTAINS(value, ‘Between’)

Returns nothing? CONTAINS is pretty new to me

Does the column have a full text index? My reading of the doc is that if the column does not have a full-text index, it won't work. https://learn.microsoft.com/en-us/sql/t-sql/queries/contains-transact-sql?view=sql-server-ver16

Heroic Yoshimitsu
Jan 15, 2008

Bruegels Fuckbooks posted:

Does the column have a full text index? My reading of the doc is that if the column does not have a full-text index, it won't work. https://learn.microsoft.com/en-us/sql/t-sql/queries/contains-transact-sql?view=sql-server-ver16

I considered that, but the column IS full text enabled.

credburn
Jun 22, 2016
Probation
Can't post for 6 hours!
Hey, gang! I'm currently in a degree program to become a (((data analyst))) and am presently learning about entity-relationship (ER) diagrams. My question pertains to my current assignment which requires me to note the cardinality between two entities. What I'm confused about is this:

When using cardinality symbols, like say
code:
entity_1-|-|------|-<-| entity_2
it makes sense because it goes both ways, and they're not always just the reversal of the two. But this assignment wants a simple 1:1, M:1, 1:M or M:M, reading left-to-right, top-to-bottom. This is not an uncommon form of showing cardinality, but I can't find the distinction between the two, why I would use one over the other, and what use something like this is when the inverse does not match the reciprocal cardinality?

Confusion is:
code:
entity_1-|-|------|-<-entity_2

is not the same as:

entity_1 1...M entity_2

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
You use cardinality symbols in drawn diagrams so that it's easy to see what's up visually. You use the phrases in written or spoken text because trying to embed cardinality symbols there is cumbersome and doesn't actually help with clarity.

I'm not sure where your confusion about reciprocity is coming from.

credburn
Jun 22, 2016
Probation
Can't post for 6 hours!

Jabor posted:

You use cardinality symbols in drawn diagrams so that it's easy to see visual distinctions. You use the phrases in written or spoken text because trying to embed cardinality symbols there is cumbersome and doesn't actually help with clarity.

I'm not sure where your confusion about reciprocity is coming from.

If something is 1:1 in one direction, but 1:M the other, say, then what use is describing a relationship of being "one to one" if that's only true in one direction?

Edit for illustration:



So like in this example, the relationship between Client and Case is 1:M one direction and 1:1 the other. How would you describe this this relationship using just a single notation? 1:M, if I understand the text right, but that's only half the relationship.

I'm doing an online college that's very... there's no instruction, I'm basically paying for a degree and teaching myself through online tutorials and what not. I may be confused about something very obvious, something an instructor could have cleared up immediately, but I don't really have an instructor :mad:

credburn fucked around with this message at 04:31 on Nov 22, 2022

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

credburn posted:

If something is 1:1 in one direction, but 1:M the other, say, then what use is describing a relationship of being "one to one" if that's only true in one direction?

Can you describe a relation that works this way?

In general, if something is one-to-many in one direction, it's many-to-one in the other direction.

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Saying a relationship between A and B is one-to-many is really shorthand to say:
code:
one A for each B
to
many B for each A
And reading them in reverse still produces a true statement:
code:
many B for each A
to
one A for each B
In other words, it describes the full relationship, not a single side of the relationship looking from one end to the other.

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Or to quote Microsoft:

quote:

A one-to-many relationship is the most common kind of relationship. In this kind of relationship, a row in table A can have many matching rows in table B. But a row in table B can have only one matching row in table A.

I think you’re caught up on trying to read it as a relationship from one table’s perspective when it really describes the whole thing, bidirectionally.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
To use your specific example, there are many cases that have the same client, so it's many-to-one.

credburn
Jun 22, 2016
Probation
Can't post for 6 hours!

Jabor posted:

To use your specific example, there are many cases that have the same client, so it's many-to-one.

It's making more sense to me now, but this still stumps me:

many-to-one I get because of the way you worded it. But why would it be incorrect to say it was one-to-many, as in one client instructs many cases?

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

credburn posted:

It's making more sense to me now, but this still stumps me:

many-to-one I get because of the way you worded it. But why would it be incorrect to say it was one-to-many, as in one client instructs many cases?

It wouldn't be incorrect, you could absolutely describe it that way if you wanted to. "Clients has a one-to-many relationship with Cases" means the same thing as "Cases has a many-to-one relationship with Clients".

How do you know which way around to describe it? That's all about the context. Often one way will be a more natural way of describing the relationship, while the other one seems confusing or backwards.

raminasi
Jan 25, 2005

a last drink with no ice

NPR Journalizard posted:

There is a matching row though. The problem is that the matching row is not being joined on.

There are two relevant rows in the ljt table. One has a matching invoice number, and one has a blank invoice number.

If you remove all join predicates and WHERE clauses other than the ones that link tables together, does the result set contain the row you want? (Hopefully that query is tractable to run.)

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Rookie question:

I have a StackOverflow demo DB in SQL Server that I would like to replicate in PostgreSQL to practice on. The DB has 9 tables. What is the easiest way to get the data out of SQL Server and into PostgreSQL? Figure out a way to export the tables as CSVs and copy into Postgres?

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

Hughmoris posted:

Rookie question:

I have a StackOverflow demo DB in SQL Server that I would like to replicate in PostgreSQL to practice on. The DB has 9 tables. What is the easiest way to get the data out of SQL Server and into PostgreSQL? Figure out a way to export the tables as CSVs and copy into Postgres?

First I'd check if https://pgloader.io/ supports your version of MS SQL, and use that if it does.

Also, if anybody here is an experienced lisper, please update pgloader to work with MySQLs new auth, it's the most in-demand issue on GitHub.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Twerk from Home posted:

First I'd check if https://pgloader.io/ supports your version of MS SQL, and use that if it does.

Also, if anybody here is an experienced lisper, please update pgloader to work with MySQLs new auth, it's the most in-demand issue on GitHub.

I'll take a look, thanks!

credburn
Jun 22, 2016
Probation
Can't post for 6 hours!
Hey, gang!

One more question about ER-diagrams. I think I realize why I was so baffled over a simple concept, but -- please tell me if this is right or wrong thinking:

The relationship is determined by the cardinality; the modality is not part of it. That is to say, I only look at the outer symbols when determining a relationship. ->-|-----|-|- is many-to-one, and -|-|-----0-|- is one-to-one, right?

Secondly -- this pertains to an assignment, so I'm not expecting anyone to just give me the answer, but I'm struggling over how to properly visualize this relationship:



I am thinking, it's a 1-to-1 relationship because a bagel order is assigned to a single customer and a single customer makes a bagel order. But I also think, well, several bagel orders could be made, belonging to the same customer, though they would be separate orders. So maybe it's many-to-one. Is there a simple qualifier betwixt these entities that makes this obvious to someone more experienced with it?

nielsm
Jun 1, 2009



The Bagel Order table has a "Customer ID" FK, and that field is not unique in the orders table, meaning the database structure allows multiple bagel orders with the same "Customer ID" to exist at the same time. That makes it a one-to-many relationship, because there can be many orders for one customer.

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


The most natural way to describe the relationship between customers and orders is (IMO): “one customer can have many orders”. So I’d describe the relationship between customer and order as one-to-many.

If you really want to describe the relationship from the order side, yes, it’s many-to-one.

But again, you need to stop thinking about “an individual order” and think about the tables’ records more broadly. Yes, each order must have a single customer, but many orders may share the same customer. A one-to-one relationship indicates that there would never be any sharing, i.e. that each order has a single customer AND no other order has that customer.

credburn
Jun 22, 2016
Probation
Can't post for 6 hours!
Thanks again for your help, all. It's making more sense to me now. I'm taking my course final tomorrow so I'll find out then what the cardinality is of credburn-has-PassingGrade

Adbot
ADBOT LOVES YOU

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


credburn posted:

Thanks again for your help, all. It's making more sense to me now. I'm taking my course final tomorrow so I'll find out then what the cardinality is of credburn-has-PassingGrade

Good luck!

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply