|
PhantomOfTheCopier posted:The question is certainly valid. I'm merely concerned about a non-extensible solution when the issue has been identified a priori. "Why do BxC when you have A?" This is really a question of what you view the source of the data as being, canonically. You see the source of the data as A, because it lists all of the ads in the system. I see the source of the data as the full join of the two aggregation queries, because that gives you all the ads in the system that have at least one comment (whereas the question being asked does not even make sense for those ads that don't have any comments). I notice that you said in your original post about this that you'd use an approach that doesn't involve any outer joins. Based on that, I suppose your favoured approach would be to use two correlated subqueries in the SELECT clause? At least, that's the approach without outer joins that is most obvious to me. I have a dislike of correlated subqueries (and anything that resembles them, for example EXISTS(...) expressions), and I avoid them in general. Hammerite fucked around with this message at 12:18 on Sep 13, 2022 |
# ? Sep 13, 2022 12:14 |
|
|
# ? May 24, 2024 17:54 |
|
Is there a way to prevent an aliased column from returning nulls? Here's the query that I've come up with so far: code:
It's not liking the 'where' clause, though. I halfway suspect that I should be using LAG's 'default value' instead, but I'm not quite sure how to use it. Here are the requirements, if you're curious: quote:Let’s see how temperature has changed each year in each state. quote:Hint
|
# ? Sep 22, 2022 20:35 |
|
the instructions specify you must use all 3 arguments for the LAG function and you're only using the first one you're on the right track, consult the syntax page for LAG
|
# ? Sep 22, 2022 21:15 |
|
Thanks, I will look into that!
|
# ? Sep 22, 2022 22:00 |
|
Seventh Arrow posted:Is there a way to prevent an aliased column from returning nulls? If you actually want a filter on a computed value from the select list (the typical use case being filtering on an aggregation, like a count or something), you want it in the HAVING clause rather than the WHERE clause. But check what kumba said first.
|
# ? Sep 22, 2022 23:20 |
|
Yes you're right and I was just looking at that in my SQL cheat sheet yesterday: • When we want to limit the results of a query based on values of the individual rows, use WHERE. • When we want to limit the results of a query based on an aggregate property, use HAVING. HAVING statement always comes after GROUP BY, but before ORDER BY and LIMIT. and of course I immediately forgot about it right when it could've been useful
|
# ? Sep 23, 2022 00:03 |
|
mortarr posted:Not sure if you made any progress on this, but I did a similar thing at my old work - we had about 30 years worth of 3-waters related scada measurements for various sites around the city at 5min intervals. I don't work there anymore and it was a couple of years ago, but basically we had a bunch of ETL processes that collated the data from the various sources and ultimately inserted into postgres/timescaledb, then we used PostgREST to surface a proc using timescaledb's hyperfunctions that would pull the detail data plus a couple of other procs that could be used to surface summary data to help finding the devices of interest. Hadn't worked on it for a while, but this looks very useful. Thanks man!
|
# ? Sep 26, 2022 21:45 |
|
sql server question. i have an agent job that runs an ssis package that runs a stored procedure and puts the results in an excel file on our network. if i run this job manually, the results are correct based on some recent changes we made to the stored procedure. however, if i run this job automatically, the results are what the old procedure would've produced, thus wrong. what the hell is going on? abelwingnut fucked around with this message at 19:50 on Sep 29, 2022 |
# ? Sep 29, 2022 19:33 |
|
Have you redeployed the Asia package to the ssis db on the server? It sounds like you need to do that.
|
# ? Sep 30, 2022 05:38 |
|
that doesn't really make sense to me, though. again, it's the same job running the same package. the only difference is me either running the job manually or it running the job automatically. it's the same version of the same package, no?
|
# ? Sep 30, 2022 14:21 |
|
Anyone know of any other websites like SQL zoo and SQL practice?
|
# ? Oct 4, 2022 02:21 |
|
Perhaps StrataScratch?
|
# ? Oct 4, 2022 03:36 |
|
Seventh Arrow posted:Perhaps StrataScratch? Thanks! That looks like what I'm looking for
|
# ? Oct 4, 2022 05:40 |
|
I have what I believe is an exceptionally easy question but I started learning SQL literally this morning and my skull is about five inches thick. Let's say I have a table like so. code:
I want to have a list of "these are the Scotches where more than 65% of their sales (by dollar amount) contain 'Bar' or 'Lounge.'" I can't figure this out for the life of me. I also have the data in Tableau so doing this with calculated fields would be fine, but I can't figure that out either. I am really bad at this lol KOTEX GOD OF BLOOD fucked around with this message at 21:56 on Oct 4, 2022 |
# ? Oct 4, 2022 21:54 |
|
This is one way to do it, probably a better way e: cleaned it up and made it functional code:
Just-In-Timeberlake fucked around with this message at 14:27 on Oct 5, 2022 |
# ? Oct 4, 2022 22:15 |
|
KOTEX GOD OF BLOOD posted:I have what I believe is an exceptionally easy question fret not! this kind of analysis is surprisingly difficult to conceptualize in SQL my immediate thought is this is the case for subtotals in CTEs, the quickest thing I can whip up which I'm sure isn't the best is ;with totals as ( select Scotch, SUM(Amount) as TotalSold from #temp group by Scotch ) , ratios as ( select t.Scotch, t.Purpose, t.Amount / tot.TotalSold as Ratio from #temp t join totals tot on t.Scotch = tot.Scotch ) , subtotals as ( select r.Scotch, r.Ratio + r2.Ratio as TotalRatio from ratios r join ratios r2 on r.Scotch = r2.Scotch and r.Purpose = 'Bar' and r2.Purpose = 'Lounge' ) select Scotch from subtotals where TotalRatio > 0.65 i am certain you don't need all 3 of those but it's late and my brain hurts
|
# ? Oct 4, 2022 22:30 |
|
KOTEX GOD OF BLOOD posted:I want to have a list of "these are the Scotches where more than 65% of their sales (by dollar amount) contain 'Bar' or 'Lounge.'" I like header rows just in case there are null combinations in the data, it's not necessary here as you didn't ask for a breakdown by Purpose for those that made the 65%. Anyway, here we are - probably overkill for your question! I think you can do this with windowed functions in one statement/no subqueries as well. code:
|
# ? Oct 5, 2022 03:56 |
|
KOTEX GOD OF BLOOD posted:I have what I believe is an exceptionally easy question but I started learning SQL literally this morning and my skull is about five inches thick. code:
vvv Thank you Moonwolf. I would have gotten there eventually but had work and booster recovery impeding my brain function. PhantomOfTheCopier fucked around with this message at 22:07 on Oct 5, 2022 |
# ? Oct 5, 2022 14:04 |
|
Wow, thank you all for these responses. I ultimately got it working in Tableau through a few calculated fields, but I'm looking forward to experimenting with these as a means to git good at SQL.
|
# ? Oct 5, 2022 15:05 |
|
PhantomOfTheCopier posted:Not really 'bad'; it's more that you're asking a question that a "query" isn't apt to do. People in the Excel world would, within 0.003sec of any conversation at that, spout "pivot table!", because what you really want is two separate calculated values over your table, and a calculated value based on that, and a query over those results. Normally I'm strongly biased in favor of CTEs, but this one screams Window Function Please. And honestly, I've never ever used one in a production environment, so I'd personally say this is a non-beginner SQL topic: Window functions blow all the usual query heirarchy out of the water and go flatly last, so wrapping them in a query to filter is about all you can do.
|
# ? Oct 5, 2022 19:44 |
|
This isn't really a coding question but not sure about any good resources because of all the enterprise software click bait and I don't know if there's like a dataops / data engineering megathread here. I'm being asked to help provide input for metadata governance/ Master Data Management stuff but the only stuff I can find online are ads for enterprise software. Any good actual resources for best practices? Checked Reddit and only advice o got out of it was to run which I'm slowly realizing is pretty good advice
|
# ? Oct 6, 2022 01:48 |
|
I'm still a newbie data engineer but what kind of metadata governance are we talking about? In the gigs I've been in, there's always been a data governance team that makes the decisions, which they hand down to the DE's. Is that not the case here? Are they looking for a software solution to handle metadata? There could be a lot of facets to data governance, like the handling of personal information, dimensional modeling, etc.
|
# ? Oct 6, 2022 03:14 |
|
Oysters Autobio posted:This isn't really a coding question but not sure about any good resources because of all the enterprise software click bait and I don't know if there's like a dataops / data engineering megathread here. I'd be keen on this too if anyone has any info - I've looked into training courses etc but all I've found locally are post-grad papers as part of a masters in records and data management.
|
# ? Oct 6, 2022 20:38 |
|
Is there anyway to have SSMS show the full view definition in the designer editor when multiple queries are in the view? Today I was trying to see what the underlying query was for a specific view and it was dumping out this: SQL code:
I ended up using some query I found Googling: SQL code:
SQL code:
|
# ? Oct 11, 2022 15:35 |
|
Hi kiwid. I don't think the designer is capable of coping with any query that contains a CTE. But as long as you're comfortable with working with the SQL directly, there's a more direct command you can use. Right-click the view, select "Script view as" -> "CREATE TO" -> "New query editor", and you'll get a script that will include the proper definition of the view.
|
# ? Oct 13, 2022 08:13 |
|
Heavy_D posted:Hi kiwid. I don't think the designer is capable of coping with any query that contains a CTE. But as long as you're comfortable with working with the SQL directly, there's a more direct command you can use. Right-click the view, select "Script view as" -> "CREATE TO" -> "New query editor", and you'll get a script that will include the proper definition of the view. Thanks this is what I ended up doing and is exactly what I was looking for.
|
# ? Oct 13, 2022 17:45 |
|
[TSQL] - Is there a more succinct way of grouping results than using a CASE statement in the GROUP BY clause? Here is my ungrouped query and results I'm working with: SQL code:
code:
My results: SQL code:
code:
|
# ? Oct 20, 2022 15:53 |
You could make a CTE of the query without the grouping, and then just SELECT * in the main query, but add the GROUP BY on the main clause too. Or use a table (temporary or permanent) you can join with to get the grouping you want.
|
|
# ? Oct 20, 2022 16:01 |
|
kiwid posted:[TSQL] - Is there a more succinct way of grouping results than using a CASE statement in the GROUP BY clause? SQL code:
|
# ? Oct 20, 2022 17:24 |
|
nielsm posted:cte Just-In-Timeberlake posted:subquery Thanks. Is there any performance differences between a CTE vs subquery?
|
# ? Oct 20, 2022 18:37 |
|
Just gotta be careful about what's going in them because you might pop the memory limits on the machine you're running it on.
|
# ? Oct 20, 2022 19:02 |
|
kiwid posted:Thanks. Is there any performance differences between a CTE vs subquery? nah, optimiser treats 'em the same. think of a cte as being a way to copy/paste a subquery primary reasons to use one over the other: defining a subquery once and using it multiple times (cte), readability. some people prefer ctes over subqueries even for single-use things. personally i only use ctes in that case if i'd end up with multiple nested subqueries (which might happen with complicated logic and/or window functions) note that sql server doesn't evaluate a cte only once and reuse the result set. every time you refer to a cte in a query, you're creating a different subquery. i've seen this trip people up - people expect defining a thing once means it will execute once, but alas that isn't so. in postgres you can add a hint to the cte to get it to materialize the cte results, but sql server has nothing like this
|
# ? Oct 20, 2022 21:00 |
redleader posted:note that sql server doesn't evaluate a cte only once and reuse the result set. every time you refer to a cte in a query, you're creating a different subquery. i've seen this trip people up - people expect defining a thing once means it will execute once, but alas that isn't so. in postgres you can add a hint to the cte to get it to materialize the cte results, but sql server has nothing like this So if you wanted that behavior in SQL Server you'd have to be explicit and create a temp table from the query?
|
|
# ? Oct 20, 2022 21:07 |
|
redleader posted:note that sql server doesn't evaluate a cte only once and reuse the result set. every time you refer to a cte in a query, you're creating a different subquery. i've seen this trip people up - people expect defining a thing once means it will execute once, but alas that isn't so. in postgres you can add a hint to the cte to get it to materialize the cte results, but sql server has nothing like this I think it's worth clarifying that SQL Server doesn't necessarily rerun the CTE every time it's invoked - it just inlines it into the query and then lets the query optimizer do its thing. In any particular case, this might end up having better performance than a materialized CTE, and the SQL Server query optimizer is generally pretty good. But if the optimizer gets a CTE optimization wrong, you don't have the tuning tools that Postgres gives you. (And if your CTE is non-deterministic due to your transaction isolation settings or just invoking a non-deterministic function, you need to pay really close attention to where it's getting used.)
|
# ? Oct 20, 2022 21:23 |
|
Note that CTE behavior changed in postgres 12. Before that CTE's were always materialized, but these days they are not materialized by default (unless they're recursive or have side effects), but you can choose to materialize them if you want.
|
# ? Oct 21, 2022 12:17 |
|
is it really best practice in sql server to write all your identifiers [like].[this]? what a pain in the rear end
|
# ? Oct 21, 2022 21:10 |
|
DELETE CASCADE posted:is it really best practice in sql server to write all your identifiers [like].[this]? what a pain in the rear end SSMS generates queries like that, but in my experience nobody actually uses the brackets for hand-written T-SQL unless they actually need them because someone put a stupid identifier in the database somewhere.
|
# ? Oct 21, 2022 21:25 |
|
DELETE CASCADE posted:is it really best practice in sql server to write all your identifiers [like].[this]? what a pain in the rear end
|
# ? Oct 21, 2022 22:58 |
|
TheFluff posted:Note that CTE behavior changed in postgres 12. Before that CTE's were always materialized, but these days they are not materialized by default (unless they're recursive or have side effects), but you can choose to materialize them if you want. woah whaaaat. I used PG in the 8.x days for a project I was on, it was much simpler stuff than I was writing later in oracle but the idea of CTEs being materialized by default is bizarre to me. And recursive CTEs are a really common use-case for hierarchal stuff.
|
# ? Oct 21, 2022 23:55 |
|
|
# ? May 24, 2024 17:54 |
|
I feel like there must be a better way to handle filters in a HAVING statement. I can't use window functions there, and it stinks. I've got a common pattern dealing with aggregates where I want to group together everything that is not a significant % of the total into an "other" category. To get concrete, I threw something together where I have colors and years, and I want to get the count of all colors that make up more than 10% of the total, and group together everything that makes up less than 20% of the total into an "other" category: SQL code:
code:
I realize there's a completely different way I could get to the "other" amount by subtracting the sum count of all the colors that passed the >=20% bar from the total of all matching rows, but I can't figure out how to use that.
|
# ? Oct 22, 2022 02:24 |