|
anthonypants posted:Dunno what flavor of SQL you're in but you're probably looking at a Common Table Expression here. Something like this: This won't quite do it; it still produces all the rows. You'd want something like: code:
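The original code block was lost in archiving; a minimal sketch of the kind of CTE being suggested, assuming a hypothetical kill_log table with killer_id and victim_id columns:

```sql
-- Hypothetical schema: kill_log(killer_id, victim_id), one row per kill.
WITH kills AS (
    SELECT killer_id AS player_id, COUNT(*) AS kills
    FROM kill_log
    GROUP BY killer_id
),
deaths AS (
    SELECT victim_id AS player_id, COUNT(*) AS deaths
    FROM kill_log
    GROUP BY victim_id
)
SELECT k.player_id, k.kills, d.deaths
FROM kills k
JOIN deaths d ON d.player_id = k.player_id;
```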
|
# ¿ Jun 19, 2018 08:00 |
|
This should do it, aggregate your two subtotals separately and then join them on afterwards. code:
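The code block didn't survive; a sketch of the shape being described, with hypothetical players and kill_log tables:

```sql
-- Aggregate each subtotal on its own, then join both onto players,
-- defaulting to 0 for players with no kills or no deaths.
SELECT p.player_id,
       COALESCE(k.kills, 0)  AS kills,
       COALESCE(d.deaths, 0) AS deaths
FROM players p
LEFT JOIN (SELECT killer_id, COUNT(*) AS kills
           FROM kill_log GROUP BY killer_id) k ON k.killer_id = p.player_id
LEFT JOIN (SELECT victim_id, COUNT(*) AS deaths
           FROM kill_log GROUP BY victim_id) d ON d.victim_id = p.player_id;
```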
|
# ¿ Jun 19, 2018 22:58 |
|
The reason your other way doesn't work is hard to show because the data explodes out, but having tried it on the actually super useful Oracle Live SQL (https://livesql.oracle.com): with 5 players who each kill each other once, you end up with 150 rows when joining for kills and deaths at the same time. You get, for each kill row, every possible death row, because they're independent rows joined in. Link to worksheet so you can play with it: https://livesql.oracle.com/apex/livesql/s/guwqntt2shpfia6s7u1fbb5d3 Oracle's accounts are free for this. Edit: They're expensive for everything else though. Also, on the topic of your query, if you have it, COALESCE or NVL would save you needing the players table; you can join the two subqueries like so: code:
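The posted query is gone; the shape being described is a full outer join of the two aggregates with the keys coalesced, since either side can be missing (hypothetical kill_log schema again):

```sql
-- Without the players table: FULL OUTER JOIN the two aggregates and
-- COALESCE (or NVL in Oracle) the keys, since either side can be NULL.
SELECT COALESCE(k.player_id, d.player_id) AS player_id,
       COALESCE(k.kills, 0)  AS kills,
       COALESCE(d.deaths, 0) AS deaths
FROM (SELECT killer_id AS player_id, COUNT(*) AS kills
      FROM kill_log GROUP BY killer_id) k
FULL OUTER JOIN
     (SELECT victim_id AS player_id, COUNT(*) AS deaths
      FROM kill_log GROUP BY victim_id) d
  ON d.player_id = k.player_id;
```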
Moonwolf fucked around with this message at 00:03 on Jun 20, 2018 |
# ¿ Jun 19, 2018 23:52 |
|
anthonypants posted:Let's say I have the following tables: You don't need T-SQL: code:
|
# ¿ Jun 20, 2018 19:13 |
|
Yeah, if you don't have it then just left join the subqueries onto the core table. On the cursors thing, I'd say only use them if you're going to do more than 1 discrete thing per output row. Anything where you're only generating one set of output can be done in a single statement and doesn't need them.
|
# ¿ Jun 20, 2018 23:30 |
|
Bobcats posted:Is there a sql white space/tab standard that doesn’t cause suicidal ideation? 2 space tabs. Any serious reporting query will otherwise end up so indented it linewraps.
|
# ¿ Jul 7, 2018 01:06 |
|
Munkeymon posted:You mean like https://cloud.google.com/dataprep/docs/html/COALESCE-Function_57344756 ? Nah, he wants https://cloud.google.com/dataprep/docs/html/IFNULL-Function_66194064 IFNULL on each column, which mirrors the more conventional SQL NVL. So IFNULL(Unix1, 'Empty') - IFNULL(Unix2, 'Empty'), or however the actual transform works in BigQuery. COALESCE appears to be purely for columns, and doesn't look like it lets you fake it with a string literal.
|
# ¿ Jul 23, 2018 23:59 |
|
You're right, I'm not sure why I picked that; going to blame the heat. Either way, from the looks of it the BigQuery COALESCE doesn't allow non-column values, where IFNULL does. Might just be a gap in my reading of the docs though; BigQuery does a bunch of stuff quite differently from classic SQL style, going by their own examples.
|
# ¿ Jul 25, 2018 01:08 |
|
For bulk loading, in almost everything you're best off stripping all the indexing and rebuilding it afterwards; building 2 giant indices from scratch is vastly faster than keeping 2 b-trees balanced while you keep writing. The cost of constantly rebalancing them will be what's causing that fall-off in speed; early on they're still simple. What problem is your second index meant to solve? An index that holds a subset of columns that completely fulfils a known query can be useful to stop you having to read the row after the index lookup, but will be murder on your writes. B-tree indices are normally there to let you get to the set of rows you want efficiently, so what are your expected access paths onto the data once it's all done? Most recent row for a customer? Today's data for all customers? All data for a customer? That would set what your leading keys should be.
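A sketch of the drop-and-rebuild pattern with made-up table and index names; the exact bulk-load statement varies by engine:

```sql
-- 1. Drop the secondary indexes before the load.
DROP INDEX idx_readings_customer;
DROP INDEX idx_readings_date;

-- 2. Bulk load (engine-specific: COPY, LOAD DATA, BULK INSERT, ...).
INSERT INTO readings SELECT * FROM staging_readings;

-- 3. Rebuild each index once, in a single pass over the finished data.
CREATE INDEX idx_readings_customer ON readings (customer_id, reading_date);
CREATE INDEX idx_readings_date     ON readings (reading_date);
```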
|
# ¿ Oct 6, 2018 21:58 |
|
Hunt11 posted:I am right now working with the sakila dataset and when I am trying to join two tables I keep on getting error message 1046. Error 1046 means you don't have a database selected, possibly because you haven't specified where staff lives. Also, that doesn't look like the join key you want; staff should have an address_id for that.
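For example, in MySQL, selecting the sakila database first and then joining staff to address through its address_id:

```sql
-- Fixes error 1046 by picking a database, then joins on the real key.
USE sakila;

SELECT s.first_name, s.last_name, a.address
FROM staff s
JOIN address a ON a.address_id = s.address_id;
```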
|
# ¿ Oct 29, 2018 00:22 |
|
Hughmoris posted:In TSQL, is there a difference between putting a clause in the JOIN statement versus in the WHERE statement? Well, that would entirely depend on which table Event comes from. If it's in table 1 then it doesn't matter; if it's in table 2 then the second one stops it being a left join by forcing a value to be required. Version 1 gets T1 with all columns from T2 where the keys match and Event = AAA, or NULLs. Version 2 gets T1 with all columns from T2 where the keys match, or NULLs, and then demands that Event = AAA, which is impossible if it's not set.
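The two versions being described, sketched with placeholder tables t1/t2 and a placeholder join key:

```sql
-- Version 1: the filter is part of the join, so the join stays outer.
-- t1 rows with no matching t2 row (or none with Event = 'AAA') still
-- come back, with NULLs in the t2 columns.
SELECT t1.*, t2.*
FROM t1
LEFT JOIN t2
  ON t2.id = t1.id
 AND t2.event = 'AAA';

-- Version 2: the WHERE runs after the join. A NULL t2.event can never
-- equal 'AAA', so the unmatched rows are thrown away and the LEFT JOIN
-- quietly behaves like an inner join.
SELECT t1.*, t2.*
FROM t1
LEFT JOIN t2 ON t2.id = t1.id
WHERE t2.event = 'AAA';
```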
|
# ¿ Dec 1, 2018 21:42 |
|
If you reinsert with it ordered properly then that'll help with reducing reads. If you're doing a ton of whole-day fuckery then you might be best served with partitioning if it's available in your db, since that'll make throwing away the old data fast, unless you're actually truncating the whole table. Although if 10 days ago hasn't changed then how does binning it and reinserting help anything other than your storage vendor?
|
# ¿ Dec 5, 2018 00:53 |
|
In SQL Server it defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW if you put an ORDER BY clause on the window. If you hadn't ordered it then range boundaries wouldn't make any sense, so it defaults to the entire dataset, because you haven't put a partitioning clause on. Window functions work by effectively joining the data on again, so that it can be sorted and aggregated there. The ordering on the window applies only within the window, which matters for choosing the bounds of the rows to aggregate, whereas a final ORDER BY clause orders the returned results.
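In other words, with illustrative column names, these two are equivalent in SQL Server once the window has an ORDER BY:

```sql
-- Implicit default frame when ORDER BY is present:
SELECT order_date,
       SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;

-- Exactly the same thing with the frame spelled out:
SELECT order_date,
       SUM(amount) OVER (ORDER BY order_date
                         RANGE BETWEEN UNBOUNDED PRECEDING
                                   AND CURRENT ROW) AS running_total
FROM orders;
```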
|
# ¿ Dec 31, 2018 21:09 |
|
Grump posted:I need help creating a SQL query that will return a cocktail from a database, given that I have supplied all the ingredients that go into that cocktail Ruggan's question's a good one. This is a really hard problem with a dynamic number of parameters, though. What you need to get is: a) any cocktail that you have at least one ingredient for; b) all of their ingredient lists; c) a comparison of your input list against those lists. Auto-generated SQL doesn't stand a chance.
|
# ¿ Jan 13, 2019 23:25 |
|
Grump posted:Yes - Gin and Tonic should be returned if you included Whiskey Forums ate my query, will edit it in once I work out how to get it to let me post it. How do people get Cloudflare to let you post SQL? It just claims it's an attack. Moonwolf fucked around with this message at 00:14 on Jan 14, 2019 |
# ¿ Jan 14, 2019 00:04 |
|
Bruegels Fuckbooks posted:what do you mean by "posting" sql- you mean in the post of a web request, perchance? Yeah, just replying with it in a code block, like everyone else is. So that I can reply with the example query to Grump's post. Although now it works, for no obvious reason. This one should be db neutral. code:
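The posted query didn't survive the archiving; a db-neutral sketch of the usual shape for this problem (relational division), assuming hypothetical cocktails and cocktail_ingredients tables and a supplied list of ingredient ids:

```sql
-- Return every cocktail whose full ingredient list is contained in the
-- supplied set (here, ingredient ids 1, 2 and 3).
SELECT c.cocktail_id, c.name
FROM cocktails c
JOIN cocktail_ingredients ci ON ci.cocktail_id = c.cocktail_id
WHERE ci.ingredient_id IN (1, 2, 3)
GROUP BY c.cocktail_id, c.name
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM cocktail_ingredients x
                   WHERE x.cocktail_id = c.cocktail_id);
```

The HAVING compares how many of the cocktail's ingredients you matched against how many it has in total; equality means you can make it.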
|
# ¿ Jan 14, 2019 00:49 |
|
TheFluff posted:If you have a halfway decent database it will have set operators (INTERSECT/EXCEPT), which can make this kind of thing far easier to read: You can make that even simpler, you don't need the EXCEPT at all: SQL code:
|
# ¿ Jan 14, 2019 21:01 |
|
I've only ever found them to be useful for reconciling data coming from two different sources that's meant to be the same, or checking that a new query/ETL flow spits out the same data as the old one when you're refactoring stuff for performance.
|
# ¿ Jan 14, 2019 21:17 |
|
Those should plan to the same anyway. What could be faster, depending on the size of the data, would be using an IN clause: code:
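The original code blocks were lost; the general shape of the IN-clause alternative, with placeholder table and column names:

```sql
-- Instead of joining just to filter, push the lookup into an IN clause;
-- the planner can often turn this into a semi-join.
SELECT t.*
FROM orders t
WHERE t.customer_id IN (SELECT c.customer_id
                        FROM customers c
                        WHERE c.region = 'EU');
```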
|
# ¿ Jul 30, 2019 20:11 |
|
Nth Doctor posted:It's this and never making indices Or covering everything in them 'for speed'
|
# ¿ Oct 16, 2019 22:08 |
|
abelwingnut posted:general sql question: So there was some mention of join types, but pretty much any db is going to keep the result of your udfBlah(a.x) per row for evaluation, because anything else costs more CPU. Even in a nested loop it will evaluate it once, and then examine b.y for equality a row at a time. Merge join I'm not certain about, because of how it'd need to sort b.y to match the ordering of udfBlah(a.x); a hash join would just build the hashes of udfBlah(a.x) for every a row and then join to the matching ones on b. The only place where this wouldn't hold is if udfBlah is non-deterministic: if you've got a random or time function in there so the output can change, then you'd need to really examine what it was doing, since it could evaluate it per incoming row or per join.
|
# ¿ Oct 24, 2019 23:34 |
|
Having just looked up that on Wikipedia, that looks hideous. Although it does more than SQL, it's still actually more unreadable than any of the DB specific procedural languages I've seen.
|
# ¿ Jan 22, 2020 00:36 |
|
TheFluff posted:Speaking of which, if you have MSSQL, system-versioned temporal tables are great and I wish I had them in Postgres. You can implement something similar with triggers, but you can't get the query syntax and that's half the fun. Oh, someone other than Oracle does native audited tables, that's good; we might see it spread, since it's actually really useful. Oracle's version is called Flashback Data Archive for proper long-term history, and they've let you use undo to wind back queries in the near term for a while. I believe Postgres is moving to an undo-area system, at least under EDB's work on it, which would let them do this stuff much more reasonably than the orphaned-rows-and-vacuuming system they have now.
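A minimal sketch of an MSSQL system-versioned temporal table; the table and column names here are illustrative:

```sql
-- Every UPDATE/DELETE moves the old row version into the history table
-- automatically; FOR SYSTEM_TIME lets you query as of a point in time.
CREATE TABLE dbo.Account (
    AccountId INT NOT NULL PRIMARY KEY,
    Balance   DECIMAL(18,2) NOT NULL,
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.AccountHistory));

-- The query syntax that's "half the fun":
SELECT *
FROM dbo.Account
FOR SYSTEM_TIME AS OF '2020-01-01T00:00:00';
```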
|
# ¿ Jan 29, 2020 01:26 |
|
Vegetable posted:Quick question: That should work; HAVING is just like WHERE except it can also interrogate the output of aggregates, like you're doing. You could equally write it as: SELECT * FROM (SELECT buyer_id, seller_id, COUNT(booking_id)/total_buyer_orders AS perc FROM bookings GROUP BY buyer_id, seller_id, total_buyer_orders) WHERE perc >= 0.8, which is exactly what HAVING lets you avoid, in cleaner syntax.
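Guessing at the column names from the post (the quoted question was truncated), the HAVING form would look like:

```sql
-- HAVING filters on the aggregate directly, no wrapping subquery needed.
SELECT buyer_id, seller_id,
       COUNT(booking_id) / total_buyer_orders AS perc
FROM bookings
GROUP BY buyer_id, seller_id, total_buyer_orders
HAVING COUNT(booking_id) / total_buyer_orders >= 0.8;
```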
|
# ¿ Feb 29, 2020 12:04 |
|
As far as I know there is no dialect of SQL which can do that in a single operation. The better way to solve it, though, would be to have a table of: code:
|
# ¿ Apr 17, 2020 20:47 |
|
That's fair, this sort of thing is often handed to us, but yeah, you're going to just have to join a bunch of times I'm afraid. It should be pretty cheap though, just ugly in the SQL.
|
# ¿ Apr 17, 2020 21:38 |
|
That'll probably be quite slow, but depending on which db Agrikk is using there might be an UNPIVOT to do the same thing with fewer reads. A lot of that depends on whether you're doing this for one transaction id or a huge group of them, though, Agrikk.
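For reference, the SQL Server flavour of UNPIVOT looks like this; the table and column names are hypothetical, and Oracle's syntax is close but not identical:

```sql
-- Turns the amount1..amount3 columns into (transaction_id, bucket, amount)
-- rows in one scan, instead of UNION ALL-ing three separate selects.
SELECT transaction_id, bucket, amount
FROM transactions
UNPIVOT (amount FOR bucket IN (amount1, amount2, amount3)) AS u;
```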
|
# ¿ Apr 18, 2020 00:15 |
|
It's also a guard against dynamic/constructed SQL with empty values in, which is generally a sign something's gone terribly wrong and terminating rather than continuing is the best bet.
|
# ¿ May 7, 2020 23:07 |
|
NPR Journalizard posted:Is there an easy way to backup a DB, and then restore it to the same SQL server under a different name, but with all the DB objects like stored procs and views pointing at the new name? Most engines will let you do a rename during restore and recovery, which database are you using here for this?
|
# ¿ Jul 1, 2020 09:10 |
|
Combat Pretzel posted:Just a quick check, because I have been writing a bunch of terrible SQL queries by hand, using subpar tools (MSQuery *cough*), against DB2 tables with expected lovely table and column naming schemes from the 80's, after ages of no SQL at all, while waiting to get proper gear at my new workplace (I'm currently using a terrible thin client with a restricted environment), so my head spinning and need a check whether I'm thinking right (not even sure why I'm doing it on a weekend). That join will probably help, because it gives the db the potential to do a better join type than endlessly looping. Because you're doing a LEFT join you should put the condition in the ON, because otherwise you're removing the LEFT-ness by demanding that an o date matches a punch date. BETWEEN is normally just syntactic sugar for x >= foo AND x <= bar; it shouldn't make any difference to your results at all.
|
# ¿ Jul 4, 2020 11:38 |
|
Hammerite posted:I became aware recently that in SQL Server you can use a VALUES table-expression ("Table Value Constructor") to specify a collection of values without having to create and populate a temporary table. There isn't, as far as I know, anything that nice in Oracle yet. If the set of pseudo-table data is fixed, but you can't store it, you could put it in a WITH clause/CTE by doing UNION ALL and selecting them from Dual. Handily stolen example: code:
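The stolen example didn't survive the archiving; the pattern being described is (column names illustrative):

```sql
-- Fake a fixed pseudo-table in Oracle with UNION ALL selects from dual.
WITH colours AS (
    SELECT 1 AS id, 'red' AS colour FROM dual
    UNION ALL
    SELECT 2, 'green' FROM dual
    UNION ALL
    SELECT 3, 'blue' FROM dual
)
SELECT *
FROM colours;
```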
|
# ¿ Aug 11, 2020 13:35 |
|
abelwingnut posted:i'm not really sure how to say this, but hopefully this works. This only makes sense within the context of your database's procedural language extensions (T-SQL, PL/SQL, PL/pgSQL, etc). Within those, there should be a debugger like Sir Fishbone says, probably integrated into whatever IDE you use, if it's one of the better supported ones. SQL proper is declarative; you can't step through it because the entire statement is parsed and applied as one thing. There you're best off commenting out joins and reading the explain plans while you work out why it's not doing what you expect.
|
# ¿ Aug 26, 2020 21:47 |
|
Knot My President! posted:Crossposting: If this is re: your post in poo poo that pisses me off, the hirers were insane. Doubly so if you were on new Postgres, because they've taken the pattern matching feature that Oracle put in the spec and implemented it, so there's a really performant way to do trend finding now rather than hideous joins. Edit: Also table aliases are free, they're just letting you shorthand a name, just like renaming columns in a result set.
|
# ¿ Oct 21, 2020 08:06 |
|
Malcolm XML posted:Wait they implemented match recognize??? Holy poo poo Hmm, I thought they had, it might have just been EDB or 2nd Quadrant talking about it in the roadmap, since I can't find it in the docs. It is row regex though, including recursion and greedy/non-greedy in the Oracle implementation.
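For anyone curious what the row regex looks like, the classic Oracle MATCH_RECOGNIZE example finds V-shapes (a fall then a rise) in price data; this is sketched from the general shape of that example, so treat the details as approximate:

```sql
SELECT *
FROM ticker
MATCH_RECOGNIZE (
    PARTITION BY symbol
    ORDER BY tstamp
    MEASURES FIRST (tstamp) AS start_day,
             LAST  (tstamp) AS end_day
    ONE ROW PER MATCH
    PATTERN (strt down+ up+)          -- row regex: a start, 1+ falls, 1+ rises
    DEFINE
        down AS down.price < PREV(down.price),
        up   AS up.price   > PREV(up.price)
);
```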
|
# ¿ Oct 21, 2020 19:11 |
|
Nth Doctor posted:As both a SQL and Regex guy, I love this. Well just badger Microsoft to implement it, it's in the SQL standard now (as of 2016), I think Oracle got it put in when they did it in 12.1.
|
# ¿ Oct 23, 2020 13:53 |
|
LostMy2010Accnt posted:So following your advice I ran it in SQLlite and the output is definitely showing my tables are not merging properly. My lat and long columns are showing up as Null for all 1200 of my entries. I'm definitely getting two entries that are merging correctly, but the rest are Null. So clearly my issue is joining the two tables which is happening before I get to the dataframe. Yeah, as above, the size of the tables doesn't matter for your results, as long as it's not generating results so large that the query is crashing on resources. The other thing I'd flag is that your query is explicitly allowing the results to be NULL, because you're left joining. In your end output, where you need a lat/long value, you should inner join (just JOIN; INNER is optional everywhere) so that the join must complete to give results, otherwise you're going to get all the rows where it doesn't match. Fixing those in development or with another data quality query is probably going to be useful; a way to find all the rows where it's not giving you what you expect can be done by doing code:
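The code block was lost; it was presumably an anti-join along these lines, with placeholder table and key names:

```sql
-- All rows in the main table that fail to find a lat/long match:
SELECT m.*
FROM main_table m
LEFT JOIN locations l ON l.place_id = m.place_id
WHERE l.place_id IS NULL;
```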
|
# ¿ Nov 2, 2020 01:35 |
|
Knot My President! posted:Thanks man. Was driving me nuts. Very nice, well done. quote:gently caress Twitch for getting verbally upset and having one of the interviewers leave the Zoom because "CTEs aren't performant". Also gently caress Autodesk for making me do a six hour take-home SQL test and then ghosting me after I turned it in. (really???) So glad I'm done with this job hunting shitshow. That's hilariously unprofessional interviewing, wtf? Also CTEs are at worst identical to inline subqueries, but you give the db more of a chance to temp-table them and not have to run the query again if you're reusing it. What the hell was that person on?
|
# ¿ Nov 7, 2020 18:51 |
|
NihilCredo posted:This one isn't wrong but, since it's logically equivalent to a plain JOIN, if I saw that in code review I'd flag the line and check if they intended the LEFT to be there and forgot to add "OR a.binary IS NULL" to the filter. What? No it isn't: a is the left table, b is the side that can fail to match and return NULLs. LEFT/RIGHT OUTER means the named side must exist and the other side is optional. If it were a FULL OUTER it'd collapse into a LEFT because of the non-NULL filter on a. That's a correct join and filter on the driving table.
|
# ¿ Dec 15, 2020 00:17 |
|
abelwingnut posted:so the query planner sees a difference between: Those are different, because your WHERE clause is separate from your joining clauses and filters after the join. In most cases the result would be the same, and putting it in a CTE wouldn't change anything either way. In all cases, check your explain plans on as live-like a system as you can, because behaviour can change based on what the db thinks is most effective with CTEs.
|
# ¿ Dec 16, 2020 01:48 |
|
prom candy posted:I have some list views in my application that are just absolutely balls slow. In order to fulfill the UI requirements as well as the advanced filtering requirements we have to do a lot of joins with a lot of different tables and for customers whose base data set includes hundreds of thousands of items in a table that contains 1M+ items, loading a single page of 30 items can take about 15-20 seconds. I've mucked around with EXPLAIN quite a bit and although I've found a couple places where I can improve indices to shave a couple seconds here and there I can't seem to get the base queries down into the range where I'd like them to be. Is this about the time that I should be looking at caching this data in an elasticsearch (or other) store or should MySQL be able to do what I need it to do and it's worth continuing to try to press on and improve my queries? On top of what the others have said, MySQL can only do nested loop joins, which are simple and the fastest way to do small result sets, but will always suck no matter what for 10k+ result sets, and be terrible for 100k+. Trimming unused joins will certainly help though.
|
# ¿ Jan 11, 2021 20:36 |