Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

PhantomOfTheCopier posted:

The question is certainly valid. I'm merely concerned about a non-extensible solution when the issue has been identified a priori. "Why do BxC when you have A?"

...

Does it take joins? Sure, but join against the list with primary key... because you just know the next question will be "Durr can we get the A.sale-price too since that might tell us something useful?" and the answer need not be "let me rewrite the entire query".

This is really a question of what you view the source of the data as being, canonically. You see the source of the data as A, because it lists all of the ads in the system. I see the source of the data as the full join of the two aggregation queries, because that gives you all the ads in the system that have at least one comment (whereas the question being asked does not even make sense for those ads that don't have any comments).

I notice that you said in your original post about this that you'd use an approach that doesn't involve any outer joins. Based on that, I suppose your favoured approach would be to use two correlated subqueries in the SELECT clause? At least, that's the approach without outer joins that is most obvious to me. I have a dislike of correlated subqueries (and anything that resembles them, for example EXISTS(...) expressions), and I avoid them in general.

Hammerite fucked around with this message at 12:18 on Sep 13, 2022

Adbot
ADBOT LOVES YOU

Seventh Arrow
Jan 26, 2005

Is there a way to prevent an aliased column from returning nulls?

Here's the query that I've come up with so far:

code:
select state, year, tempc,
tempc - lag(tempc) over (partition by state order by year) as change_in_temp
from state_climate
where change_in_temp is not null;
https://www.db-fiddle.com/f/kgQLu7Mcg2DqEXEUxYfGAQ/11

It's not liking the 'where' clause, though. I halfway suspect that I should be using LAG's 'default value' instead, but I'm not quite sure how to use it.

Here are the requirements, if you're curious:

quote:

Let’s see how temperature has changed each year in each state.

Write a query to select the same columns but now you should write a window function that returns the change_in_temp from the previous year (no null values should be returned).

Which states and years saw the largest changes in temperature?
Is there a particular part of the United States that saw the largest yearly changes in temperature?

quote:

Hint

This one is a little tricky! In order to answer the questions you need to:

find the difference in temperature from the previous year using subtraction and LAG
you need all three parameters that LAG accepts in order to look to the previous row and never return a null
PARTITION BY state and ORDER BY year
lastly your entire results set needs to use ORDER BY change_in_temp (use ASC or DESC!)

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy
the instructions specify you must use all 3 arguments for the LAG function and you're only using the first one

you're on the right track, consult the syntax page for LAG

Seventh Arrow
Jan 26, 2005

Thanks, I will look into that!

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE

Seventh Arrow posted:

Is there a way to prevent an aliased column from returning nulls?

If you actually want a filter on a computed value from the select list (the typical use case being filtering on an aggregation, like a count or something), you want it in the HAVING clause rather than the WHERE clause. But check what kumba said first.

Seventh Arrow
Jan 26, 2005

Yes you're right and I was just looking at that in my SQL cheat sheet yesterday:

• When we want to limit the results of a query based on values of the individual rows, use WHERE.
• When we want to limit the results of a query based on an aggregate property, use HAVING.
HAVING statement always comes after GROUP BY, but before ORDER BY and LIMIT.

and of course I immediately forgot about it right when it could've been useful :cripes:

uguu
Mar 9, 2014

mortarr posted:

Not sure if you made any progress on this, but I did a similar thing at my old work - we had about 30 years worth of 3-waters related scada measurements for various sites around the city at 5min intervals. I don't work there anymore and it was a couple of years ago, but basically we had a bunch of ETL processes that collated the data from the various sources and ultimately inserted into postgres/timescaledb, then we used PostgREST to surface a proc using timescaledb's hyperfunctions that would pull the detail data plus a couple of other procs that could be used to surface summary data to help finding the devices of interest.

The data looked like: device id/location/time/value/comment, so not wide, just shitloads of it.

We surfaced PostgREST through an api gateway that let us limit access via api keys, rate limit, have nice looking url paths and all that good stuff. It took a bit of thought to put it all together, as we were new to postgres and as well as timescaledb, but once it was set up it was pretty good, I really liked the idea of PostgREST and not needing to roll a custom api/db interface.

Hadn't worked on it for a while, but this looks very useful. Thanks man!

abelwingnut
Dec 23, 2002


sql server question.

i have an agent job that runs an ssis package that runs a stored procedure and puts the results in an excel file on our network. if i run this job manually, the results are correct based on some recent changes we made to the stored procedure. however, if i run this job automatically, the results are what the old procedure would've produced, thus wrong.

what the hell is going on?

abelwingnut fucked around with this message at 19:50 on Sep 29, 2022

Condimentalist
Jun 13, 2007
Have you redeployed the Asia package to the ssis db on the server? It sounds like you need to do that.

abelwingnut
Dec 23, 2002


that doesn't really make sense to me, though. again, it's the same job running the same package. the only difference is me either running the job manually or it running the job automatically. it's the same version of the same package, no?

Away all Goats
Jul 5, 2005

Goose's rebellion

Anyone know of any other websites like SQL zoo and SQL practice?

Seventh Arrow
Jan 26, 2005

Perhaps StrataScratch?

Away all Goats
Jul 5, 2005

Goose's rebellion


Thanks! That looks like what I'm looking for

KOTEX GOD OF BLOOD
Jul 7, 2012

I have what I believe is an exceptionally easy question but I started learning SQL literally this morning and my skull is about five inches thick.

Let's say I have a table like so.
code:
Scotch   	Amt     Purpose
Laphroaig	$5,000 	Bar
Lagavulin	$6,000 	Lounge
Glenlivet	$2,000 	Home
Glenmorangie	$1,500 	Lounge
Lagavulin	$1,000 	Lounge
Lagavulin	$9,000 	Bar
Glenlivet	$8,000 	Bar
Glenlivet	$7,500 	Lounge
Glenmorangie	$6,000 	Home
Glenmorangie	$2,000 	Bar
Laphroaig	$3,500 	Lounge
Etc.

I want to have a list of "these are the Scotches where more than 65% of their sales (by dollar amount) contain 'Bar' or 'Lounge.'" I can't figure this out for the life of me.

I also have the data in Tableau so doing this with calculated fields would be fine, but I can't figure that out either.

I am really bad at this lol :smith:

KOTEX GOD OF BLOOD fucked around with this message at 21:56 on Oct 4, 2022

Just-In-Timeberlake
Aug 18, 2003
This is one way to do it, probably a better way
e: cleaned it up and made it functional

code:
DECLARE @tbl TABLE (
	[brand]				VARCHAR(50)
	, [amt]				DECIMAL(10,2)
	, [purpose]			VARCHAR(50)
)

INSERT INTO @tbl
(
    [brand],
    [amt],
    [purpose]
)
VALUES
( 'Laphroaig', 5000, 'Bar')
, ( 'Lagavulin', 6000, 'Lounge')
, ( 'Glenlivet', 2000, 'Home')
, ( 'Glenmorangie', 1500, 'Lounge')
, ( 'Lagavulin', 1000, 'Lounge')
, ( 'Lagavulin', 9000, 'Bar')
, ( 'Glenlivet', 8000, 'Bar')
, ( 'Glenlivet', 7500, 'Lounge')
, ( 'Glenmorangie', 6000, 'Home')
, ( 'Glenmorangie', 2000, 'Bar')
, ( 'Laphroaig', 5000, 'Lounge')

SELECT	[table].[brand]
        , [table].[total]
        , [table].[bar or lounge]
		, ([table].[bar or lounge] / [table].[total]) * 100 AS [percent sales]
FROM	(
			SELECT	[t].[brand]
					, (
						SELECT	SUM([@tbl].[amt])
						FROM	@tbl
						WHERE	[@tbl].[brand] = [t].[brand]
					) AS [total]
					, (
						SELECT	SUM([@tbl].[amt])
						FROM	@tbl
						WHERE	[@tbl].[brand] = [t].[brand]
								AND [@tbl].[purpose] IN ('Bar', 'Lounge')
					) AS [bar or lounge]
			FROM	@tbl [t]
		) AS [table]
WHERE	[table].[bar or lounge] / [table].[total] > .65
GROUP BY ([table].[bar or lounge] / [table].[total]) * 100,
         [table].[brand],
         [table].[total],
         [table].[bar or lounge]
this outputs:

Just-In-Timeberlake fucked around with this message at 14:27 on Oct 5, 2022

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy

KOTEX GOD OF BLOOD posted:

I have what I believe is an exceptionally easy question

fret not! this kind of analysis is surprisingly difficult to conceptualize in SQL

my immediate thought is this is the case for subtotals in CTEs, the quickest thing I can whip up which I'm sure isn't the best is

;with totals as (
select Scotch, SUM(Amount) as TotalSold
from #temp
group by Scotch
)
, ratios as (
select t.Scotch, t.Purpose, t.Amount / tot.TotalSold as Ratio
from #temp t
join totals tot on t.Scotch = tot.Scotch
)
, subtotals as (
select r.Scotch, r.Ratio + r2.Ratio as TotalRatio
from ratios r
join ratios r2 on r.Scotch = r2.Scotch and r.Purpose = 'Bar' and r2.Purpose = 'Lounge'
)
select Scotch from subtotals where TotalRatio > 0.65



i am certain you don't need all 3 of those but it's late and my brain hurts

mortarr
Apr 28, 2005

frozen meat at high speed

KOTEX GOD OF BLOOD posted:

I want to have a list of "these are the Scotches where more than 65% of their sales (by dollar amount) contain 'Bar' or 'Lounge.'"

I like header rows just in case there are null combinations in the data, it's not necessary here as you didn't ask for a breakdown by Purpose for those that made the 65%. Anyway, here we are - probably overkill for your question! I think you can do this with windowed functions in one statement/no subqueries as well.

code:
;with src AS (
  SELECT 'Laphroaig' AS Scotch, 5000.0 AS SalesAmt, 'Bar' AS Purpose
  UNION SELECT 'Lagavulin', 6000, 'Lounge'
  UNION SELECT 'Glenlivet', 2000, 'Home'
  UNION SELECT 'Glenmorangie', 1500, 'Lounge'
  UNION SELECT 'Lagavulin', 1000, 'Lounge'
  UNION SELECT 'Lagavulin', 9000, 'Bar'
  UNION SELECT 'Glenlivet', 8000, 'Bar'
  UNION SELECT 'Glenlivet', 7500, 'Lounge'
  UNION SELECT 'Glenmorangie', 6000, 'Home'
  UNION SELECT 'Glenmorangie', 2000, 'Bar'
  UNION SELECT 'Laphroaig', 3500, 'Lounge'

), detail AS (  
  SELECT
  	Scotch,
    Purpose,
    SUM(SalesAmt) AS SumOfSalesAmt
  FROM
    src
  GROUP BY
    Scotch,
    Purpose

), salesByScotch AS (
  SELECT
  	Scotch,
    SUM(SalesAmt) AS SumOfSalesAmt,
    SUM(CASE WHEN Purpose IN ('Bar','Lounge') THEN SalesAmt ELSE 0 END) AS SumOfSalesAmtBarOrLounge
  FROM
    src
  GROUP BY
    Scotch
  
), headers_Scotch AS (
  SELECT DISTINCT 
  	Scotch
  FROM 
    src
)

SELECT
  hs.Scotch,
  sbs.SumOfSalesAmt,
  COALESCE(sbs.SumOfSalesAmt, 0) * 0.65 AS [65% SumOfSalesAmt],
  sbs.SumOfSalesAmtBarOrLounge  
FROM
  headers_Scotch hs
  INNER JOIN salesByScotch sbs
  ON hs.Scotch = sbs.Scotch
WHERE
  sbs.SumOfSalesAmtBarOrLounge > COALESCE(sbs.SumOfSalesAmt, 0) * 0.65
ORDER BY
  hs.Scotch

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

KOTEX GOD OF BLOOD posted:

I have what I believe is an exceptionally easy question but I started learning SQL literally this morning and my skull is about five inches thick.

Let's say I have a table like so.
code:
Scotch   	Amt     Purpose
Laphroaig	$5,000 	Bar
Lagavulin	$6,000 	Lounge
Glenlivet	$2,000 	Home
Glenmorangie	$1,500 	Lounge
Lagavulin	$1,000 	Lounge
Lagavulin	$9,000 	Bar
Glenlivet	$8,000 	Bar
Glenlivet	$7,500 	Lounge
Glenmorangie	$6,000 	Home
Glenmorangie	$2,000 	Bar
Laphroaig	$3,500 	Lounge
Etc.

I want to have a list of "these are the Scotches where more than 65% of their sales (by dollar amount) contain 'Bar' or 'Lounge.'" I can't figure this out for the life of me.

I am really bad at this lol :smith:
Not really 'bad'; it's more that you're asking a question that a "query" isn't apt to do. People in the Excel world would, within 0.003sec of any conversation at that, spout "pivot table!", because what you really want is two separate calculated values over your table, and a calculated value based on that, and a query over those results. Normally I'm strongly biased in favor of CTEs, but this one screams Window Function Please. And honestly, I've never ever used one in a production environment, so I'd personally say this is a non-beginner SQL topic:

code:
WITH barlounge AS (
  SELECT name, 100.0*SUM(rev) FILTER (WHERE location~'Bar|Lounge') OVER (PARTITION BY name) / SUM(rev) OVER (PARTITION BY name) percent
  FROM liquor)
SELECT name,percent FROM barlounge WHERE percent>65 GROUP BY name,percent;

   name    |       percent        
-----------+----------------------
 Glenlivet |  88.5714285714285714
 Lagavulin | 100.0000000000000000
 Laphroaig | 100.0000000000000000
(3 rows)
I couldn't get the WHERE (or a HAVING) to work over 'percent' without stuffing it in a CTE first (the WITH clause), and that surprises me because it's usually fairly good about WHERE clauses over calculated fields. Perhaps someone here knows how to coerce it without the CTE.


vvv Thank you Moonwolf. I would have gotten there eventually but had work and booster recovery impeding my brain function.

PhantomOfTheCopier fucked around with this message at 22:07 on Oct 5, 2022

KOTEX GOD OF BLOOD
Jul 7, 2012

Wow, thank you all for these responses. I ultimately got it working in Tableau through a few calculated fields, but I'm looking forward to experimenting with these as a means to git good at SQL.

Moonwolf
Jun 29, 2004

Flee from th' terrifyin' evil of "NHS"!


PhantomOfTheCopier posted:

Not really 'bad'; it's more that you're asking a question that a "query" isn't apt to do. People in the Excel world would, within 0.003sec of any conversation at that, spout "pivot table!", because what you really want is two separate calculated values over your table, and a calculated value based on that, and a query over those results. Normally I'm strongly biased in favor of CTEs, but this one screams Window Function Please. And honestly, I've never ever used one in a production environment, so I'd personally say this is a non-beginner SQL topic:

code:
WITH barlounge AS (
  SELECT name, 100.0*SUM(rev) FILTER (WHERE location~'Bar|Lounge') OVER (PARTITION BY name) / SUM(rev) OVER (PARTITION BY name) percent
  FROM liquor)
SELECT name,percent FROM barlounge WHERE percent>65 GROUP BY name,percent;

   name    |       percent        
-----------+----------------------
 Glenlivet |  88.5714285714285714
 Lagavulin | 100.0000000000000000
 Laphroaig | 100.0000000000000000
(3 rows)
I couldn't get the WHERE (or a HAVING) to work over 'percent' without stuffing it in a CTE first (the WITH clause), and that surprises me because it's usually fairly good about WHERE clauses over calculated fields. Perhaps someone here knows how to coerce it without the CTE.

Window functions blow all the usual query heirarchy out of the water and go flatly last, so wrapping them in a query to filter is about all you can do.

Oysters Autobio
Mar 13, 2017
This isn't really a coding question but not sure about any good resources because of all the enterprise software click bait and I don't know if there's like a dataops / data engineering megathread here.

I'm being asked to help provide input for metadata governance/ Master Data Management stuff but the only stuff I can find online are ads for enterprise software. Any good actual resources for best practices? Checked Reddit and only advice o got out of it was to run which I'm slowly realizing is pretty good advice

Seventh Arrow
Jan 26, 2005

I'm still a newbie data engineer but what kind of metadata governance are we talking about? In the gigs I've been in, there's always been a data governance team that makes the decisions, which they hand down to the DE's. Is that not the case here? Are they looking for a software solution to handle metadata? There could be a lot of facets to data governance, like the handling of personal information, dimensional modeling, etc.

mortarr
Apr 28, 2005

frozen meat at high speed

Oysters Autobio posted:

This isn't really a coding question but not sure about any good resources because of all the enterprise software click bait and I don't know if there's like a dataops / data engineering megathread here.

I'm being asked to help provide input for metadata governance/ Master Data Management stuff but the only stuff I can find online are ads for enterprise software. Any good actual resources for best practices? Checked Reddit and only advice o got out of it was to run which I'm slowly realizing is pretty good advice

I'd be keen on this too if anyone has any info - I've looked into training courses etc but all I've found locally are post-grad papers as part of a masters in records and data management.

kiwid
Sep 30, 2013

Is there anyway to have SSMS show the full view definition in the designer editor when multiple queries are in the view?

Today I was trying to see what the underlying query was for a specific view and it was dumping out this:

SQL code:
SELECT DISTINCT 'Parent'
    ,T1.commodity_parent_id
    ,T2.name + ' Consolidated'
FROM monthend.dba.s1_commodity t1
INNER JOIN monthend.dba.s1_commodity t2 ON t1.commodity_parent_id = t2.commodity_id
WHERE t1.commodity_parent_id IN (
        'CORN'
        ,'SOYBEANS'
        ,'CRUSH-SOY'
        ,'SRW'
        ,'SWW'
        ,'HRS'
        ,'HRW'
        )

UNION ALL

SELECT DISTINCT 'Product'
    ,T1.commodity_id
    ,T1.name + ' Consolidated'
FROM monthend.dba.s1_commodity t1
WHERE t1.commodity_id IN (
        'SRW'
        ,'SWW'
        ,'HRW'
        ,'BAR FEED'
        ,'OATS-COMM'
        )

SELECT * FROM COMMODITY WHERE ...
The problem was in the third query, there is no such commodity table and so I knew the first two queries were being defined as the commodity table somehow.

I ended up using some query I found Googling:

SQL code:
SELECT DEFINITION
	,uses_ansi_nulls
	,uses_quoted_identifier
	,is_schema_bound
FROM sys.sql_modules
WHERE object_id = object_id('database.schema.table');
Which gave me the full view definition as:

SQL code:
CREATE VIEW [MTM_Inventory].[Position]
AS
WITH COMMODITY (
	TYPE
	,Commodity_ID
	,Commodity_Name
	)
AS (
	SELECT DISTINCT 'Parent'
		,T1.commodity_parent_id
		,T2.name + ' Consolidated'
	FROM monthend.dba.s1_commodity t1
	INNER JOIN monthend.dba.s1_commodity t2 ON t1.commodity_parent_id = t2.commodity_id
	WHERE t1.commodity_parent_id IN (
			'CORN'
			,'SOYBEANS'
			,'CRUSH-SOY'
			,'SRW'
			,'SWW'
			,'HRS'
			,'HRW'
			)
	
	UNION ALL
	
	SELECT DISTINCT 'Product'
		,T1.commodity_id
		,T1.name + ' Consolidated'
	FROM monthend.dba.s1_commodity t1
	WHERE t1.commodity_id IN (
			'SRW'
			,'SWW'
			,'HRW'
			,'BAR FEED'
			,'OATS-COMM'
			)
	)

SELECT * FROM COMMODITY WHERE ...

Heavy_D
Feb 16, 2002

"rararararara" contains the meaning of everything, kept in simple rectangular structures
Hi kiwid. I don't think the designer is capable of coping with any query that contains a CTE. But as long as you're comfortable with working with the SQL directly, there's a more direct command you can use. Right-click the view, select "Script view as" -> "CREATE TO" -> "New query editor", and you'll get a script that will include the proper definition of the view.

kiwid
Sep 30, 2013

Heavy_D posted:

Hi kiwid. I don't think the designer is capable of coping with any query that contains a CTE. But as long as you're comfortable with working with the SQL directly, there's a more direct command you can use. Right-click the view, select "Script view as" -> "CREATE TO" -> "New query editor", and you'll get a script that will include the proper definition of the view.

Thanks this is what I ended up doing and is exactly what I was looking for.

kiwid
Sep 30, 2013

[TSQL] - Is there a more succinct way of grouping results than using a CASE statement in the GROUP BY clause?

Here is my ungrouped query and results I'm working with:

SQL code:
SELECT [s1_commodity].[commodity_id] AS [id],
    [s1_commodity].[name],
    [s1_commodity].[group_code] AS [group],
    [s1_commodity].[commodity_parent_id] AS [parent_id],
    CASE [s1_plc_commodity].[plc_id]
        WHEN 'direct' THEN '6'
        WHEN 'mfst01' THEN '5'
        WHEN 'msfd01' THEN '4'
        WHEN 'msfl01' THEN '1'
        WHEN 'msfl02' THEN '1'
        WHEN 'msfp01' THEN '3'
        WHEN 'msfr01' THEN '2'
        WHEN 'seed' THEN '1'
        WHEN 'seed-palm' THEN '3'
        WHEN 'seed-rip' THEN '2'
        END AS [location_id]
FROM [test].[dba].[s1_commodity] WITH (NOLOCK)
INNER JOIN [test].[dba].[s1_plc_commodity] WITH (NOLOCK) ON [s1_commodity].[commodity_id] = [s1_plc_commodity].[commodity_id]
WHERE [s1_plc_commodity].[inactive_flag] = 'n'
    AND [s1_commodity].[commodity_id] = 'SRW'
code:
id	name			group	parent_id	location_id
SRW	Soft Red Winter Wheat	Wheat	WHEAT		6
SRW	Soft Red Winter Wheat	Wheat	WHEAT		5
SRW	Soft Red Winter Wheat	Wheat	WHEAT		4
SRW	Soft Red Winter Wheat	Wheat	WHEAT		1
SRW	Soft Red Winter Wheat	Wheat	WHEAT		1
SRW	Soft Red Winter Wheat	Wheat	WHEAT		3
SRW	Soft Red Winter Wheat	Wheat	WHEAT		2
SRW	Soft Red Winter Wheat	Wheat	WHEAT		1

What I'm trying to do is group the commodities by the location_id alias from the CASE statement but because you can't do that, I'm just copying the CASE statement into the GROUP BY clause. Is this the best way or is there something I'm missing?

My results:

SQL code:
SELECT [s1_commodity].[commodity_id] AS [id],
    [s1_commodity].[name],
    [s1_commodity].[group_code] AS [group],
    [s1_commodity].[commodity_parent_id] AS [parent_id],
    CASE [s1_plc_commodity].[plc_id]
        WHEN 'direct' THEN '6'
        WHEN 'mfst01' THEN '5'
        WHEN 'msfd01' THEN '4'
        WHEN 'msfl01' THEN '1'
        WHEN 'msfl02' THEN '1'
        WHEN 'msfp01' THEN '3'
        WHEN 'msfr01' THEN '2'
        WHEN 'seed' THEN '1'
        WHEN 'seed-palm' THEN '3'
        WHEN 'seed-rip' THEN '2'
    END AS [location_id]
FROM [test].[dba].[s1_commodity] WITH (NOLOCK)
INNER JOIN [test].[dba].[s1_plc_commodity] WITH (NOLOCK) ON [s1_commodity].[commodity_id] = [s1_plc_commodity].[commodity_id]
WHERE [s1_plc_commodity].[inactive_flag] = 'n'
    AND [s1_commodity].[commodity_id] = 'SRW'
GROUP BY [s1_commodity].[commodity_id],
    [s1_commodity].[name],
    [s1_commodity].[group_code],
    [s1_commodity].[commodity_parent_id],
    CASE [s1_plc_commodity].[plc_id]
        WHEN 'direct' THEN '6'
        WHEN 'mfst01' THEN '5'
        WHEN 'msfd01' THEN '4'
        WHEN 'msfl01' THEN '1'
        WHEN 'msfl02' THEN '1'
        WHEN 'msfp01' THEN '3'
        WHEN 'msfr01' THEN '2'
        WHEN 'seed' THEN '1'
        WHEN 'seed-palm' THEN '3'
        WHEN 'seed-rip' THEN '2'
    END
code:
id	name			group	parent_id	location_id
SRW	Soft Red Winter Wheat	Wheat	WHEAT		1
SRW	Soft Red Winter Wheat	Wheat	WHEAT		2
SRW	Soft Red Winter Wheat	Wheat	WHEAT		3
SRW	Soft Red Winter Wheat	Wheat	WHEAT		4
SRW	Soft Red Winter Wheat	Wheat	WHEAT		5
SRW	Soft Red Winter Wheat	Wheat	WHEAT		6

nielsm
Jun 1, 2009



You could make a CTE of the query without the grouping, and then just SELECT * in the main query, but add the GROUP BY on the main clause too.

Or use a table (temporary or permanent) you can join with to get the grouping you want.

Just-In-Timeberlake
Aug 18, 2003

kiwid posted:

[TSQL] - Is there a more succinct way of grouping results than using a CASE statement in the GROUP BY clause?

SQL code:
SELECT	[t].[id]
		, [t].[name]
		, [t].[group]
		, [t].[parent_id]
		, [t].[location_id]
FROM	(
			SELECT [s1_commodity].[commodity_id] AS [id],
				[s1_commodity].[name],
				[s1_commodity].[group_code] AS [group],
				[s1_commodity].[commodity_parent_id] AS [parent_id],
				CASE [s1_plc_commodity].[plc_id]
					WHEN 'direct' THEN '6'
					WHEN 'mfst01' THEN '5'
					WHEN 'msfd01' THEN '4'
					WHEN 'msfl01' THEN '1'
					WHEN 'msfl02' THEN '1'
					WHEN 'msfp01' THEN '3'
					WHEN 'msfr01' THEN '2'
					WHEN 'seed' THEN '1'
					WHEN 'seed-palm' THEN '3'
					WHEN 'seed-rip' THEN '2'
				END AS [location_id]
			FROM [test].[dba].[s1_commodity] WITH (NOLOCK)
			INNER JOIN [test].[dba].[s1_plc_commodity] WITH (NOLOCK) ON [s1_commodity].[commodity_id] = [s1_plc_commodity].[commodity_id]
			WHERE [s1_plc_commodity].[inactive_flag] = 'n'
				AND [s1_commodity].[commodity_id] = 'SRW'
		) AS [t]
GROUP BY [t].[id]
		, [t].[name]
		, [t].[group]
		, [t].[parent_id]
		, [t].[location_id]

kiwid
Sep 30, 2013



Thanks. Is there any performance differences between a CTE vs subquery?

Gwaihir
Dec 8, 2009
Hair Elf
Just gotta be careful about what's going in them because you might pop the memory limits on the machine you're running it on.

redleader
Aug 18, 2005

Engage according to operational parameters

kiwid posted:

Thanks. Is there any performance differences between a CTE vs subquery?

nah, optimiser treats 'em the same. think of a cte as being a way to copy/paste a subquery

primary reasons to use one over the other: defining a subquery once and using it multiple times (cte), readability. some people prefer ctes over subqueries even for single-use things. personally i only use ctes in that case if i'd end up with multiple nested subqueries (which might happen with complicated logic and/or window functions)

note that sql server doesn't evaluate a cte only once and reuse the result set. every time you refer to a cte in a query, you're creating a different subquery. i've seen this trip people up - people expect defining a thing once means it will execute once, but alas that isn't so. in postgres you can add a hint to the cte to get it to materialize the cte results, but sql server has nothing like this :(

nielsm
Jun 1, 2009



redleader posted:

note that sql server doesn't evaluate a cte only once and reuse the result set. every time you refer to a cte in a query, you're creating a different subquery. i've seen this trip people up - people expect defining a thing once means it will execute once, but alas that isn't so. in postgres you can add a hint to the cte to get it to materialize the cte results, but sql server has nothing like this :(

So if you wanted that behavior in SQL Server you'd have to be explicit and create a temp table from the query?

raminasi
Jan 25, 2005

a last drink with no ice

redleader posted:

note that sql server doesn't evaluate a cte only once and reuse the result set. every time you refer to a cte in a query, you're creating a different subquery. i've seen this trip people up - people expect defining a thing once means it will execute once, but alas that isn't so. in postgres you can add a hint to the cte to get it to materialize the cte results, but sql server has nothing like this :(

I think it's worth clarifying that SQL Server doesn't necessarily rerun the CTE every time it's invoked - it just inlines it into the query and then lets the query optimizer do its thing. In any particular case, this might end up having better performance than a materialized CTE, and the SQL Server query optimizer is generally pretty good. But if the optimizer gets a CTE optimization wrong, you don't have the tuning tools that Postgres gives you. (And if your CTE is non-deterministic due to your transaction isolation settings or just invoking a non-deterministic function, you need to pay really close attention to where it's getting used.)

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE
Note that CTE behavior changed in postgres 12. Before that CTE's were always materialized, but these days they are not materialized by default (unless they're recursive or have side effects), but you can choose to materialize them if you want.

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003
is it really best practice in sql server to write all your identifiers [like].[this]? what a pain in the rear end

raminasi
Jan 25, 2005

a last drink with no ice

DELETE CASCADE posted:

is it really best practice in sql server to write all your identifiers [like].[this]? what a pain in the rear end

SSMS generates queries like that, but in my experience nobody actually uses the brackets for hand-written T-SQL unless they actually need them because someone put a stupid identifier in the database somewhere.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DELETE CASCADE posted:

is it really best practice in sql server to write all your identifiers [like].[this]? what a pain in the rear end
When I see questions with that syntax in the thread I flee. It's usually a good indication that everything will be broken, or at least significantly different than expectations. :11tea:

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

TheFluff posted:

Note that CTE behavior changed in postgres 12. Before that CTE's were always materialized, but these days they are not materialized by default (unless they're recursive or have side effects), but you can choose to materialize them if you want.

woah whaaaat. I used PG in the 8.x days for a project I was on, it was much simpler stuff than I was writing later in oracle but the idea of CTEs being materialized by default is bizarre to me. And recursive CTEs are a really common use-case for hierarchal stuff.

Adbot
ADBOT LOVES YOU

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.
I feel like there must be a better way to handle filters in a HAVING statement. I can't use window functions there, and it stinks.

I've got a common pattern dealing with aggregates where I want to group together everything that is not a significant % of the total into an "other" category. To get concrete, I threw something together where I have colors and years, and I want to get the count of all colors that make up more than 10% of the total, and group together everything that makes up less than 20% of the total into an "other" category:

SQL code:
WITH colors AS (
SELECT 1 as id, 'blue' as color, 1993 as year
UNION SELECT 2, 'blue', 1994
UNION SELECT 3, 'yellow', 1994 
UNION SELECT 4, 'yellow', 1994
UNION SELECT 5, 'yellow', 1994 
UNION SELECT 6, 'yellow', 1994
UNION SELECT 7, 'blue', 1994
UNION SELECT 8, 'blue', 1994
UNION SELECT 9, 'blue', 1994
UNION SELECT 10, 'red', 1994
UNION SELECT 11, 'red', 1993
UNION SELECT 12, 'red', 1993
UNION SELECT 13, 'red', 1993
UNION SELECT 14, 'red', 1993
UNION SELECT 15, 'red', 1993
UNION SELECT 16, 'green', 1993
UNION SELECT 17, 'orange', 1993
),  topper AS (
  SELECT 
    color 
  FROM 
    colors 
  GROUP BY 
    color 
  HAVING 
    CAST(
      COUNT(*) AS FLOAT
    ) / (
      SELECT 
        COUNT(*) 
      FROM 
        colors
    ) > 0.2
)
SELECT 
  CASE WHEN color IN (
    SELECT 
      color 
    FROM 
      topper
  ) THEN color ELSE 'other' END as case_color, 
  count(*) 
FROM 
  colors 
group by 
  case_color
order by 
  count(*) DESC;
code:
red|6
blue|5
yellow|4
other|2
I feel like I'm bludgeoning my way through this and there's got to be several better ways to do it. The biggest problem I see is that once I want to filter it down to just a single year, I have to duplicate the WHERE both into the main query and the CTE for topper, because I can't use a window function inside of a HAVING. For example, if I add "WHERE YEAR = '1994'", then I need it in both places or otherwise I get "red" in the query result where I should have "other".

I realize there's a completely different way I could get to the "other" amount by subtracting the sum count of all the colors that passed the >=20% bar from the total of all matching rows, but I can't figure out how to use that.

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply