Is that handwritten as that, or generated from some ORM or such?
|
|
# ? Jan 31, 2024 20:58 |
|
kiwid posted:Does this entire mess just reduce down to v.contract_nbr >= 0? yes
|
# ? Jan 31, 2024 21:11 |
|
kiwid posted:Can someone help me understand what is happening in this WHERE clause? Yes, though my guess is the left side of all those NULL comparisons is supposed to be a column name, maybe some parameter that was supposed to be passed in and it's being passed in as null instead of a string and you get this. Either that or ORM generated nonsense, though there aren't any mangled aliases in there that you usually get with ORM generated queries. RandomBlue fucked around with this message at 22:41 on Jan 31, 2024 |
# ? Jan 31, 2024 22:07 |
|
nielsm posted:Is that handwritten as that, or generated from some ORM or such? It's from an SQL trace generated by software. So, probably an ORM. RandomBlue posted:Yes, though my guess is the left side of all those NULL comparisons is supposed to be a column name, maybe some parameter that was supposed to be passed in and it's being passed in as null instead of a string and you get this. Either that or ORM generated nonsense, though there aren't any mangled aliases in there that you usually get with ORM generated queries. You might be right, perhaps those nulls should have been column names. This is some really lovely legacy software we're talking about.
|
# ? Jan 31, 2024 22:23 |
|
this is really silly but I'm stumped with a null join clause I have 2 tables. table 1 has client# and their subscription (1 line per subscription), with the relevant start and end date. table 2 has products bought. I need an SQL query that returns basically everything from table 2 (so a left join), but ONLY for people who are subscribed to subscription 24 AND do NOT have subscription 01, both conditions true on the day of purchase. I'm not even at the date filter yet, just that conditional filtering is stumping me. eg code:
|
# ? Feb 2, 2024 12:14 |
|
double nine posted:this is really silly but I'm stumped with a null join clause as a starting point, exists might be useful? code:
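The code block didn't survive here, but an EXISTS-based version of what was described might look like this. All table and column names (purchases, subscriptions, client_nbr, sub_code, purchase_date, start_date, end_date) are invented for illustration, not from the original post:

```sql
-- Sketch only: names are assumptions.
SELECT p.*
FROM purchases p
WHERE EXISTS (
        SELECT 1 FROM subscriptions s
        WHERE s.client_nbr = p.client_nbr
          AND s.sub_code   = '24'
          AND p.purchase_date BETWEEN s.start_date AND s.end_date)
  AND NOT EXISTS (
        SELECT 1 FROM subscriptions s
        WHERE s.client_nbr = p.client_nbr
          AND s.sub_code   = '01'
          AND p.purchase_date BETWEEN s.start_date AND s.end_date);
```

The NOT EXISTS half handles the "must not have subscription 01" condition, and putting the purchase date inside both subqueries covers the "true on the day of purchase" part in the same place.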
|
# ? Feb 2, 2024 12:34 |
|
kiwid posted:It's from an SQL trace generated by software. So, probably an ORM. If it turned out to be handwritten, I might have guessed it was to make it easier to comment out any combination of filters without leaving an invalid WHERE clause pre:NULL IS NULL --AND v.contract_nbr = 0
|
# ? Feb 2, 2024 14:52 |
|
Heavy_D posted:the system has a variable length list of filters which can be turned on or off I mostly use MySQL so I see this often with code:
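The MySQL example here is lost, but the pattern being described is almost certainly the always-true starter predicate, so every real filter starts with AND and can be commented out independently. A sketch (the filter columns are invented):

```sql
SELECT *
FROM v
WHERE 1=1
  -- AND v.contract_nbr = 0       -- any filter line can be toggled off
  AND v.region = 'EU'             -- illustrative filter, not from the post
```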
|
# ? Feb 5, 2024 21:00 |
|
out of curiosity, is there a way to have an excel sql query refer to the content of a sheet? for example, i have the following query: code:
is this possible?
|
# ? Feb 6, 2024 12:32 |
|
double nine posted:out of curiosity, is there a way to have an excel sql query refer to the content of a sheet? Edit: misunderstood the question acidx fucked around with this message at 07:09 on Feb 7, 2024 |
# ? Feb 6, 2024 22:22 |
|
so is there a quick and easy way to produce this kind of pivot? i have all of these columns that will be the same for a subgroup of rows. then, the rest of the columns for that subgroup will be different and need to be pivoted. and yes, this'll need to happen 50 times. plus, each subgroup will have different questions, so q1 for one subgroup will be different than another. i see a very hellish way of doing this with cases, and i also see a very hellish way of doing this with joins, but i'm reeeeeeeeeeeeally hoping there's a way to pivot this. and this is in redshift.
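Redshift does have a native PIVOT clause, though it needs a literal list of pivot values, so with different questions per subgroup you would still be generating 50 variants of it. A sketch, with table and column names (survey, subgroup_id, question, answer) assumed:

```sql
SELECT *
FROM (SELECT subgroup_id, question, answer FROM survey)
PIVOT (MAX(answer) FOR question IN ('q1', 'q2', 'q3'));
```

If the fixed value list is the sticking point, MAX(CASE WHEN question = 'q1' THEN answer END) per output column is the portable fallback, i.e. the "hellish" CASE route.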
|
# ? Feb 16, 2024 01:49 |
|
If I have a table containing an ordered list:code:
So if the item originally selected was the row with item=17, the query returns the rows for 10, 15, 17, 25, 35 edit: It's MySQL 8 fwiw Agrikk fucked around with this message at 18:18 on Feb 28, 2024 |
# ? Feb 28, 2024 17:58 |
I'm not sure if this works, but possibly a triple UNION? SQL code:
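Something like the following, assuming the table is called items and the numeric item column defines the sort order (MySQL 8 allows ORDER BY/LIMIT inside parenthesized UNION branches):

```sql
-- Sketch of the triple-UNION idea: the target row, the two rows before it,
-- and the two rows after it. Table/column names are assumptions.
(SELECT * FROM items WHERE item = 17)
UNION ALL
(SELECT * FROM items WHERE item < 17 ORDER BY item DESC LIMIT 2)
UNION ALL
(SELECT * FROM items WHERE item > 17 ORDER BY item ASC LIMIT 2)
ORDER BY item;
```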
|
|
# ? Feb 28, 2024 18:08 |
|
SQL code:
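Since it's MySQL 8, a window-function version is also possible. Table and column names (items, item) are assumptions:

```sql
-- Number the rows in sort order, then keep rows within 2 positions
-- of the target row's position.
WITH numbered AS (
    SELECT i.*, ROW_NUMBER() OVER (ORDER BY item) AS rn
    FROM items i
)
SELECT n.*
FROM numbered n
JOIN numbered t ON t.item = 17
WHERE n.rn BETWEEN t.rn - 2 AND t.rn + 2
ORDER BY n.rn;
```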
|
# ? Feb 28, 2024 18:34 |
|
Can somebody give me a crash course in effective MySQL bulk loading? I could also use a recommendation for a logical backup tool that'll work well with tables in the 50GB-300GB range, are people happier with mydumper or mysqlpump? I've never touched MySQL before, have the time in the trenches with Postgres and Spark SQL, and have some big tables that are desperately in need of some optimization. Right now they're entirely denormalized and have no indexes of any kind, I'm hoping to both normalize to save some space and add indexes to make querying it sane. This is genomics research data, and the researchers have previously been suffering through the fact that any query takes 3min to come back and running an application that makes a lot of queries is going to take 5 days to run. I've got a dozen questions and nobody to talk to, so if anybody sees obvious problems with any aspect of this I'd appreciate advice. This is an example of what I'm dealing with: SQL code:
My plan is to create new, normalized tables from this on a replica of the DB that I have exclusively to myself, so I can do full table locks, turn off indexing while inserting, and do any number of deferred processing tricks while I'm loading the data. I've been reading and threw a bunch of different stuff at it in hopes of faster inserts, but I'm not sure which of these are useful and which are just cargo cult bullshit: SQL code:
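For reference, the knobs that usually appear in InnoDB bulk-load advice are session-level checks like these; which of them actually help is workload-dependent, so benchmark each one rather than trusting the cargo cult. A sketch:

```sql
-- Session settings commonly suggested for a dedicated bulk-load replica.
SET unique_checks = 0;        -- skip per-row unique checks during the load
SET foreign_key_checks = 0;   -- skip FK validation during the load
SET sql_log_bin = 0;          -- keep the load out of the binlog (needs privilege)
SET autocommit = 0;           -- batch many inserts per COMMIT
-- ... load data, committing every N thousand rows ...
COMMIT;
SET unique_checks = 1;
SET foreign_key_checks = 1;
```

For raw ingest speed, LOAD DATA INFILE from a dump file is also generally much faster than row-by-row INSERTs.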
In the past these huge tables have been made somewhat usable by adding indexes (to the big duplicated text fields) and also by making more copies of the table that have been somewhat filtered, for example: SQL code:
|
# ? Mar 2, 2024 06:09 |
|
instead of spending brainpower worrying about clustering and how to optimally set up a primary key, i would just do the normalization and then spec out a server that fits the resulting dataset in ram. it's likely to be both cheaper and give better outcomes than spending a bunch of your expensive time on optimization.
|
# ? Mar 2, 2024 07:27 |
|
Twerk from Home posted:This sucks. I'd much rather have indexes work for us. If I were in Postgres I'd partition by pval at the common cutoffs people use, but in MySQL I guess I'd have to make pval the first column in the primary key to affect how InnoDB lays it out on the disk? The natural key here is (tissue, gene, phecode) right now, which will become (tissue_id, gene_id, phecode_id) in my normalized schema. Would it be horrible to throw pval on the front of the primary key, and will that cause MySQL to cluster by pval such that queries with a specific pval range will be almost as fast as the smaller copy of the table? It's been a while since I did anything with MySQL but I believe this is true. What is a pval exactly? Could you just leave the primary key as it is and create an index on pval? (which would then actually be (pval, tissue, gene, phecode) because iirc indexes implicitly get filled out with any primary key columns you didn't declare to be in them, in order)
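The suggestion sketched out, with an assumed table name (results) and the normalized column names from the post:

```sql
-- Leave the natural primary key (tissue_id, gene_id, phecode_id) alone and
-- add a secondary index on pval. InnoDB secondary indexes implicitly carry
-- the PK columns, so this behaves like an index on
-- (pval, tissue_id, gene_id, phecode_id).
CREATE INDEX idx_pval ON results (pval);
```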
|
# ? Mar 2, 2024 09:48 |
|
I'm very familiar with Python, and I know the "basics" of SQL enough that I can look at the posts here ITT and generally understand, but I want to learn SQL more. How would you guys recommend learning SQL in 2024?
|
# ? Mar 2, 2024 21:07 |
|
Boris Galerkin posted:I'm very familiar with Python, and I know the "basics" of SQL enough that I can look at the posts here ITT and generally understand, but I want to learn SQL more. How would you guys recommend learning SQL in 2024? I think you might get better advice if you can specify if you're more interested in SQL as an application developer wanting fast transactions or for data analysis and transformation, because those are pretty different worlds with different skillsets once you get past the basics. I'm clueless in the OLAP world and doing my best to learn quickly. There's also so many kinds of tools at this point, everything uses SQL.
|
# ? Mar 2, 2024 21:18 |
|
I track a lot of stuff (finances, mostly, but not limited to) in Google Sheets and I just found out that you can run SQL queries and it just seems so much easier/better than not using SQL. I guess I mostly just wanted some guidance on how to work with spreadsheets/datasets using SQL, not developing applications.
|
# ? Mar 3, 2024 16:35 |
One book I have gotten a lot of use out of is SQL Antipatterns, which is unfortunately out of print and was never updated to reflect newer DBMS functionality either. But I really like its approach of showing common problems, describing bad solutions, and then going into better solutions and why they are better.
|
|
# ? Mar 3, 2024 19:03 |
|
nielsm posted:One book I have gotten a lot of use out of is SQL Antipatterns, which is unfortunately out of print and was never updated to reflect newer DBMS functionality either. But I really like its approach of showing common problems, describing bad solutions, and then going into better solutions and why they are better. Appears to be available as an ebook at https://pragprog.com/titles/bksap1/sql-antipatterns-volume-1/. Not sure if it's been updated or if it's the same text.
|
# ? Mar 3, 2024 23:00 |
It sounds like the ebook version is actually updated, so I would recommend that.
|
|
# ? Mar 3, 2024 23:30 |
|
I'm seconding SQL Antipatterns, it surprisingly became one of my favorite programming books ever, I keep going back to it.
|
# ? Mar 3, 2024 23:50 |
|
Boris Galerkin posted:I'm very familiar with Python, and I know the "basics" of SQL enough that I can look at the posts here ITT and generally understand, but I want to learn SQL more. How would you guys recommend learning SQL in 2024? I am also interested in this. I got a book "sql in 10 minutes" as it was highly rated, but it doesn't teach the way I think or the way most sql is written. Would prefer something with just a bit more theory.
|
# ? Mar 4, 2024 05:05 |
|
I like Codecademy a lot for learning. They have an interactive query window and a course that will take you from basic select queries to calculating churn. Of course, once you're past the free period you have to fork over to continue. I think it's worth it though, given the amount of stuff you can learn on their platform. YMMV though. If you need free stuff, then on youtube there's Ankit Bansal and Alex Freberg. There's also learnsql.com, and sql-practice.com for when you have enough under your belt to start practicing with some challenges.
|
# ? Mar 4, 2024 06:02 |
|
Jabor posted:instead of spending brainpower worrying about clustering and how to optimally set up a primary key, i would just do the normalization and then spec out a server that fits the resulting dataset in ram. This is the advice that I'm here for, but I am still seeking a faster way to do the normalization / transformation in the first place. MySQL is pretty slow chewing on 1.5 billion rows in a single transaction, any suggestions on good ways to batch this down (or parallelize it, or anything)? I've got lots of databases to do this to and have run into some edge case failures on the data that I didn't encounter when I tested this with a tiny subset of rows to make sure everything worked. SQL code:
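One common shape for batching this: walk the source primary key in ranges so each transaction stays bounded, driving the loop from a script. Everything here (source_table, normalized_results, the lookup tables, the chunk size) is invented for illustration, and it assumes the source has some indexed, roughly monotonic key to range over; without one, dumping to files and re-importing with LOAD DATA INFILE is the usual alternative.

```sql
-- A driving script repeats this, advancing the id range each pass,
-- until the INSERT affects zero rows.
INSERT INTO normalized_results (tissue_id, gene_id, phecode_id, pval)
SELECT t.id, g.id, p.id, s.pval
FROM source_table s
JOIN tissues  t ON t.name = s.tissue
JOIN genes    g ON g.name = s.gene
JOIN phecodes p ON p.code = s.phecode
WHERE s.id >  0          -- lower bound of this chunk
  AND s.id <= 1000000;   -- upper bound of this chunk
COMMIT;
```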
|
# ? Mar 4, 2024 19:04 |
Consider whether the database needs to be "online" and ready for multi-user during that operation. If you can disable or remove the indexes and constraints while loading data, and then only reinstate them afterwards, that might be significantly faster.
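In MySQL terms that can be as simple as the following (index and table names invented):

```sql
-- Drop secondary indexes before the bulk load...
ALTER TABLE normalized_results DROP INDEX idx_pval;
-- ... run the load ...
-- ...then rebuild them once at the end, which is much cheaper than
-- maintaining them row-by-row during the inserts.
ALTER TABLE normalized_results ADD INDEX idx_pval (pval);
```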
|
|
# ? Mar 4, 2024 21:09 |
|
I'm building a data warehouse and running into deadlocks with SQL Server. One suggestion to me was to use the WITH (NOLOCK) table hint but it seems like this is frowned upon? I don't really want to copy uncommitted data into the data warehouse. Does anyone have any suggestions for me?
|
# ? Mar 6, 2024 15:42 |
|
Another question, I'm trying to use the MERGE statement (TSQL) to sync a table to a data warehouse. SQL code:
SQL code:
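The MERGE itself didn't survive the archive, but a typical T-SQL sync MERGE has this shape. All table and column names here are assumptions, not kiwid's actual schema:

```sql
MERGE [dw].[orders] AS dst
USING [stg].[orders] AS src
    ON dst.[id] = src.[id]
WHEN MATCHED AND src.[change_date] > dst.[last_synced_at] THEN
    UPDATE SET dst.[amount]         = src.[amount],
               dst.[last_synced_at] = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET THEN
    INSERT ([id], [amount], [last_synced_at])
    VALUES (src.[id], src.[amount], SYSUTCDATETIME())
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```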
|
# ? Mar 8, 2024 01:18 |
|
what data type is src.change_date?
|
# ? Mar 8, 2024 01:28 |
|
redleader posted:what data type is src.change_date? datetime edit: just to clarify my problem, if the source table row is updated and change date is updated, the merge won't update that row in the destination until at minimum 4-5 hours later because the last_synced_at is in UTC. edit2: One option I can do is to do this: SQL code:
edit3: Never mind! I figured it out. Apparently I should be putting the AT TIME ZONE inside the cast brackets instead. This works: SQL code:
OMG I feel stupid. This first query is actually correct and works, it was just my IDE that was giving me a false error: Wrapping it in parentheses caused the error to go away. kiwid fucked around with this message at 01:51 on Mar 8, 2024 |
# ? Mar 8, 2024 01:29 |
|
whoops, should have asked about last_synced_at too. assuming last_synced_at is also a datetime, does this work? datetimeoffsets are stored and manipulated internally as their UTC values (per docs), so this should compare the two in the same TZ SQL code:
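The code block here is gone, but based on the description it presumably converted both sides to datetimeoffset before comparing, something like this (column names following the thread's discussion):

```sql
-- AT TIME ZONE on a plain datetime attaches that zone's offset, producing a
-- datetimeoffset; SQL Server compares datetimeoffsets by their UTC instants.
WHEN MATCHED AND ([src].[change_date]    AT TIME ZONE 'Eastern Standard Time')
               > ([dst].[last_synced_at] AT TIME ZONE 'UTC')
THEN UPDATE SET ...  -- rest of the MERGE unchanged
```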
|
# ? Mar 8, 2024 01:48 |
|
redleader posted:whoops, should have asked about last_synced_at too. assuming last_synced_at is also a datetime, does this work? datetimeoffsets are stored and manipulated internally as their UTC values (per docs), so this should compare the two in the same TZ It's datetime too but yes you fixed my issue by adding parentheses. Thanks so much
|
# ? Mar 8, 2024 01:51 |
|
redleader posted:
Just looking at this again this morning. Do I not have to convert to the same time zone before comparing? Your example shows you compare two different time zones. edit: nvm, looks like you can compare datetimeoffset but not datetime. kiwid fucked around with this message at 14:39 on Mar 8, 2024 |
# ? Mar 8, 2024 13:43 |
|
kiwid posted:Just looking at this again this morning. Do I not have to convert to the same time zone before comparing? Your example shows you compare two different time zones. yeah, when comparing datetimeoffsets, sql server normalises them all into utc before comparing them. in fact, casting [dst].[last_synced_at] to datetimeoffset with AT TIME ZONE 'UTC' might actually not be needed at all. i think that implicitly (or explicitly) casting a datetime to datetimeoffset will set the offset to +00:00, so just converting the EST datetime to datetimeoffset should work. but i'd definitely do some googling and comment the heck out of that if i went that route
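A tiny illustration of the implicit-cast behavior being described (worth verifying on your own server, per the "comment the heck out of it" advice):

```sql
-- A plain datetime has no offset, so casting it to datetimeoffset
-- pins it at +00:00 rather than assuming the server's local zone.
SELECT CAST(CAST('2024-03-08 12:00' AS datetime) AS datetimeoffset);
-- the offset in the result is +00:00
```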
|
# ? Mar 9, 2024 01:07 |
|
How come a MERGE is so fast? I'm not working with a lot of data, but copying 5000 rows is about 300ms.
|
# ? Mar 10, 2024 22:54 |
|
redleader posted:yeah, when comparing datetimeoffsets, sql server normalises them all into utc before comparing them. in fact, casting [dst].[last_synced_at] to datetimeoffset with AT TIME ZONE 'UTC' might actually not be needed at all. i think that implicitly (or explicitly) casting a datetime to datetimeoffset will set the offset to +00:00, so just converting the EST datetime to datetimeoffset should work. but i'd definitely do some googling and comment the heck out of that if i went that route I just want to say I have done all this converting from datetime to datetimeoffset and feel your pain. I was the timezone "expert" for a couple of companies in the past, and I hope I never have to dick about with timezones ever again, they are by far the most complex and error-prone kind of calculations I've ever had to work with.
|
# ? Mar 10, 2024 22:58 |
|
mortarr posted:I just want to say I have done all this converting from datetime to datetimeoffset and feel your pain. I was the timezone "expert" for a couple of companies in the past, and I hope I never have to dick about with timezones ever again, they are by far the most complex and error-prone kind of calculations I've ever had to work with. I loving hate working with dates and times because there's always some new way poo poo gets hosed up. There are a shitload of formats even if timezones aren't an issue, but they always are. Storing it in UTC is the easiest part. If you control the input and can limit things there it's not so bad, but I work on a product that works with user-provided CSVs, spreadsheets, DB sources, CRMs, etc., and of course we work with international customers, so all that poo poo has to work together seamlessly.
|
# ? Mar 10, 2024 23:34 |
|
kiwid posted:How come a MERGE is so fast? MERGE is just handling inserts, updates, and deletes in one statement. If you make it do all of them, how fast can you get your system to do a 5k insert or update?
|
# ? Mar 11, 2024 00:59 |