I can't think straight today so help me out. T-SQL. I have three tables: Cases, CasesResumed, CasesClosed. Cases have a CaseID, OpenDate, CloseDate (nullable). The OpenDate is always filled, CloseDate is filled if the current status is closed, null if it is open. CasesResumed have CaseID and a ResumeDate. This indicates the case was re-opened on that date. CasesClosed have a CaseID and CloseDate. This indicates the case was closed on that date. I need to get historical data for number of total open and total closed cases on a series of month starts. My current strategy would involve selecting all cases with an OpenDate on or before each month start. For each of these cases, find the last resume and last close date on or before the month start. Use the case open, resume, close to find the status of the case on that month start date. With that, count number of open and closed cases on each month start. The major roadblock I'm having is getting the last resume/close date for each case and coalescing that with the case open date, to determine the case status. Any hints?
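One possible shape for the "last resume/close before the month start" lookup is OUTER APPLY. Untested sketch: @MonthStarts(MonthStart) is an assumed table variable holding the report dates, and a case counts as closed when its latest close date is on or after its latest resume date (falling back to the open date when there's no resume).

```sql
-- Untested sketch; @MonthStarts(MonthStart) is an assumed table variable.
SELECT s.MonthStart,
       SUM(CASE WHEN s.Status = 'Open'   THEN 1 ELSE 0 END) AS OpenCases,
       SUM(CASE WHEN s.Status = 'Closed' THEN 1 ELSE 0 END) AS ClosedCases
FROM (
    SELECT m.MonthStart, c.CaseID,
           CASE WHEN lc.LastClose IS NOT NULL
                 AND lc.LastClose >= COALESCE(lr.LastResume, c.OpenDate)
                THEN 'Closed' ELSE 'Open' END AS Status
    FROM @MonthStarts m
    JOIN Cases c ON c.OpenDate < m.MonthStart
    -- last resume strictly before the month start, if any
    OUTER APPLY (SELECT MAX(r.ResumeDate) AS LastResume
                 FROM CasesResumed r
                 WHERE r.CaseID = c.CaseID AND r.ResumeDate < m.MonthStart) lr
    -- last close strictly before the month start, if any
    OUTER APPLY (SELECT MAX(cc.CloseDate) AS LastClose
                 FROM CasesClosed cc
                 WHERE cc.CaseID = c.CaseID AND cc.CloseDate < m.MonthStart) lc
) s
GROUP BY s.MonthStart
ORDER BY s.MonthStart;
```

If a resume and a close can land on the same date you'd need a tiebreaker (e.g. an event sequence number), which this sketch doesn't handle.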
|
|
# ¿ Aug 9, 2019 10:50 |
|
|
# ¿ May 6, 2024 11:14 |
Managed to build a solution. I'm not sure if this is the best approach, but at least it works. SQL code:
|
|
# ¿ Aug 9, 2019 13:15 |
A SQL script that drops the existing procedure and creates the new one? One of those migration systems often found in ORMs?
|
|
# ¿ Aug 16, 2019 23:43 |
Mycroft Holmes posted:new question. I've got three problems i can't get to work. Keep in mind you need to use the LIKE operator to do wildcard matches; the = operator does a straight equality comparison.
|
|
# ¿ Oct 15, 2019 11:23 |
Nth Doctor posted:I mean, lots of people fail to understand normalization and how to apply it in an RDBMS. That isn't newsworthy. Some people don't understand that computers are actually very good at keeping track of thousands or millions of very similar-looking data rows. They can't imagine themselves doing it/doing it in an Excel sheet, so it can't possibly be a good idea to do it in an RDBMS either.
|
|
# ¿ Oct 15, 2019 20:05 |
SQL code:
Msg 195, Level 15, State 10, Line 1: 'CONCAT' is not a recognized built-in function name. It even autocompletes CONCAT in the SSMS editor. MS SQL Server 10.50.6000. Edit: Apparently it was only added in SQL Server 2012 and this is 2008 R2. I thought CONCAT was supposed to be ancient ANSI SQL.
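For what it's worth, on 2008 R2 the + operator covers the same ground; the table and column names below are made up for illustration. The catches are that + returns NULL if any operand is NULL (CONCAT treats NULL as empty string) and non-string types need an explicit cast:

```sql
-- CONCAT-free concatenation for SQL Server 2008 R2.
-- Employees / FirstName / LastName / EmployeeID are placeholder names.
SELECT ISNULL(FirstName, '') + ' ' + ISNULL(LastName, '')
     + ' (' + CAST(EmployeeID AS varchar(10)) + ')'
FROM Employees;
```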
|
|
# ¿ Oct 16, 2019 12:12 |
Agrikk posted:All my tables are set to latin1_swedish_ci so for whatever reason @externalstring is being recognized as cp850_general_ci during the updates. I tried setting the table to be cp850 but either I did it wrong (likely) or this wasn't the problem. Try the CONVERT(a USING b) function to convert your input string to the database's encoding.
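Note that CONVERT ... USING takes a character set name (latin1), not a collation name (latin1_swedish_ci). A hypothetical illustration, with the table and column names as placeholders:

```sql
-- Convert the input string to the table's character set before comparing
-- or updating; mytable / name / id are placeholder names.
UPDATE mytable
SET name = CONVERT(@externalstring USING latin1)
WHERE id = 42;
```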
|
|
# ¿ Oct 18, 2019 08:11 |
If you can modify the schema and get everyone to agree on it, handle the addresses as 32 bit integers and just store the first and last address of each range, and do a simple between check.
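A minimal sketch of that schema, with made-up names; in MySQL the INET_ATON() function does the address-to-integer conversion for you:

```sql
-- Each network stored as an inclusive integer range.
CREATE TABLE ip_ranges (
    first_addr BIGINT NOT NULL,   -- e.g. 15.56.204.0   -> 255380480
    last_addr  BIGINT NOT NULL,   -- e.g. 15.56.207.255 -> 255381503
    label      VARCHAR(100)
);

-- Find the range(s) containing 15.56.204.85 (= 255380565):
SELECT label
FROM ip_ranges
WHERE 255380565 BETWEEN first_addr AND last_addr;
```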
|
|
# ¿ Oct 23, 2019 22:18 |
PhantomOfTheCopier posted:Next thought, the issue is there speed will suck because the join condition is "inside out". IE, you're trying to find a value between two calculated values, and that'll be slow, maybe even O(mn) depending on the database. So, flip the problem around. What if you wrote a big florping case statement that checked all possible netmasks for a given ip? Then you could do a direct equality match with B. This is not too bad an idea, if you're stuck with strings: on the client performing the query, generate all possible networks the IP could belong to; there are at most 33 of them, one per prefix length from /0 to /32. Then query for an exact string match for each of those. E.g. if your address to search for is 15.56.204.85, then the possible netmasks would be: 15.56.204.85/32, 15.56.204.84/31, 15.56.204.84/30, 15.56.204.80/29, 15.56.204.80/28, 15.56.204.64/27, ... 15.56.204.0/22, 15.56.200.0/21, 15.56.192.0/20, ..., 15.0.0.0/8, 14.0.0.0/7 If any of those exist in the database, you have a match.
|
|
# ¿ Oct 24, 2019 07:12 |
Agrikk posted:Yeah. I’m loading 2200 CSV files [each with 1.5 million rows of data consisting of five columns with a total character width of no more than 86] into a single table with an auto-incrementing rowID and an index on three of the five columns. Just to get the obvious questions out of the way: You are doing the inserts inside a transaction, right? Wrapping each batch of inserts in a single transaction means you pay for one commit (and its log flush) per batch instead of one per row, which can speed up bulk loads enormously.
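A minimal sketch of the batching pattern in T-SQL; the table and values are placeholders standing in for one CSV file's worth of rows:

```sql
-- One transaction per file (or per few thousand rows) instead of
-- implicit autocommit per insert. ImportTable / c1..c5 are placeholders.
BEGIN TRANSACTION;

INSERT INTO ImportTable (c1, c2, c3, c4, c5)
VALUES ('a', 'b', 'c', 'd', 'e'),
       ('f', 'g', 'h', 'i', 'j');
-- ...repeat for the rest of the file's rows...

COMMIT TRANSACTION;
```

For files this size, BULK INSERT or bcp would likely beat hand-rolled INSERTs anyway.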
|
|
# ¿ Oct 24, 2019 07:15 |
Agrikk posted:How do I add columns dynamically to a table based on incoming data? Normalize that poo poo. code:
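One way that normalization usually plays out for "dynamic columns" is a key/value (attribute) table; this is a hypothetical sketch, all names made up, with the usual EAV caveats about type safety and query ergonomics:

```sql
-- Instead of ALTER TABLE per new field, store one row per field value.
CREATE TABLE Measurements (
    RowID      INT IDENTITY PRIMARY KEY,
    SourceID   INT NOT NULL,
    Attribute  VARCHAR(100) NOT NULL,   -- the would-be column name
    Value      VARCHAR(200) NULL,       -- the would-be column value
    RecordedAt DATETIME2 NOT NULL
);
-- A new field in the incoming data becomes a new Attribute value;
-- no schema change required.
```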
|
|
# ¿ Nov 1, 2019 22:36 |
Paul MaudDib posted:I did stack machines and regular languages and all that stuff back in college, but we never really got into some of the more esoteric features of regexes. I’m under the impression that some of those actually take regexes past the power of a regular language into context sensitive languages, if I’m remembering the terms right. Roughly, yes. Backreferences are the classic feature that takes a "regex" beyond the power of regular languages; lookahead/lookbehind assertions that don't consume input don't add formal power, but they do push most engines off the fast path. Some regex implementations degrade from a finite-state-machine approach with O(n) time to match, where n is the length of the haystack, to a much slower backtracking algorithm that can go superlinear, even exponential, on pathological combinations of pattern and input. All of this is from memory, might have some details wrong.
|
|
# ¿ Nov 8, 2019 07:42 |
I'm getting a T-SQL task dumped on me that's (probably) too complex/weird to be solved by a straight sequence of insert/update/delete statements, and I have almost zero knowledge of the procedural language. Can anyone recommend some books/other resources for learning procedural T-SQL?
|
|
# ¿ Dec 19, 2019 08:34 |
Yeah that's totally a pivot. Either do the straight group by query and get your four columns (f1, f2, year_month, count) and format the data at the client side, or do the same query and use the database's pivot features on that year_month column. Maybe have a dimension table of date ranges and the year-month label to simplify the join condition.
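A hedged sketch of the PIVOT variant in T-SQL, with placeholder names; note the month columns have to be written out explicitly (or built with dynamic SQL):

```sql
-- source_table / f1 / f2 / year_month / id are placeholder names.
SELECT f1, f2, [2019-01], [2019-02], [2019-03]
FROM (
    SELECT f1, f2, year_month, id
    FROM source_table
) AS src
PIVOT (
    COUNT(id) FOR year_month IN ([2019-01], [2019-02], [2019-03])
) AS p;
```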
|
|
# ¿ Feb 28, 2020 11:53 |
Use a TABLESAMPLE query but repeat it until you don't get zero results.
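Something like this untested T-SQL sketch, with placeholder names; TABLESAMPLE samples whole pages, so it can legitimately come back empty:

```sql
-- BigTable / id / payload are placeholders.
DECLARE @result TABLE (id INT, payload VARCHAR(100));

WHILE NOT EXISTS (SELECT 1 FROM @result)
BEGIN
    INSERT INTO @result (id, payload)
    SELECT TOP (1) id, payload
    FROM BigTable TABLESAMPLE (1 PERCENT);
END

SELECT id, payload FROM @result;
```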
|
|
# ¿ Mar 20, 2020 11:30 |
Check the collation set for database and tables. Perhaps also default collation for connection? I don't remember if that's a thing. Collations with _ci suffix are case insensitive, suffixes _cs and _bin should be case sensitive. Edit: beaten and lol holy poo poo mysql how insane can you be
|
|
# ¿ Mar 24, 2020 16:23 |
I have a large-ish table (8 million records) in SQL Server and I need to do a simple update via a translation table. I currently have: SQL code:
If I was implementing this in a flat file store in a C program, I'd scan LargeTable, look up the matching mapping row, then update. That performs only a single scan of the large table and doesn't really need an index at all. Is there any way to force SQL Server to do the update that way instead? I really only need to run this update once ever, except that the process I'm working on has a bunch of development cycles, so an extra 40 minutes of execution time isn't that much of an issue in the big picture. Mostly asking for curiosity's sake.
|
|
# ¿ Apr 1, 2020 11:32 |
Thanks, I'll try out those methods. Right now I get a Nested Loops (Inner Join) using Hash Match (Aggregate), with a table scan of the mapping table at top and an index seek on the nonclustered index of the large table at the bottom. This has an actual cost of 48% on the index seek, 19% on the nested loops, 31% on the hash match, and 1% on the final clustered index update.
|
|
# ¿ Apr 1, 2020 13:25 |
Okay, I experimented a bit, and two things made a difference: First, using a tempdb table ##mapping instead of a table variable @mapping. Second, creating the clustered index on ##mapping.A allowed the execution plan to prefer a nested loops join with a table scan of LargeTable at the top, doing a stream aggregate match on the mapping table. I didn't need any index hints or changes to the join type on the update statement after those two changes. The actual cost moved to 85% table update, 9% table scan on LargeTable, 6% index seek on ##mapping, zero on the join. Execution time improved by a factor of 16, from about 12 minutes down to 45 seconds with a LargeTable dataset of 800k rows. Edit: And yes, those numbers are much closer to what I had expected the costs for the operation to be. nielsm fucked around with this message at 14:22 on Apr 1, 2020 |
|
# ¿ Apr 1, 2020 13:49 |
Agrikk posted:Another question: Uh, untested stupid idea: SQL code:
|
|
# ¿ Apr 17, 2020 23:27 |
When IdY is unique by itself, why does your Z table need to reference more fields than that? I don't understand what that achieves. All I see is denormalized data with the Value field duplicated needlessly. Are you sure Z should contain Value at all, rather than having a view that joins Y and Z?
|
|
# ¿ May 1, 2020 10:39 |
mik posted:Here's a WHERE Type = 'Stupid' question: Yes, there is a difference when you make a left join. (If you did an inner join they would be equivalent.) The first will match every row in A with all rows in B where Field1 matches and B.Field2 has the value. For rows in A where Field1 matches but no matching B row has that value in Field2, you still get the A row, with NULLs in B's columns. The second will match every row in A with all rows in B where Field1 matches, and then filter out joined rows where B.Field2 does not have the value. This means rows from A that had no matching rows from B are left out. (It will effectively behave as a plain inner join.)
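Side by side, with placeholder table names, the two forms look like this:

```sql
-- Condition in ON: every row of A survives; B's columns are NULL
-- where no B row has Field2 = 'x'.
SELECT a.*, b.*
FROM A a
LEFT JOIN B b ON b.Field1 = a.Field1 AND b.Field2 = 'x';

-- Condition in WHERE: filters after the join, discarding the
-- NULL-extended rows, so it behaves like an INNER JOIN.
SELECT a.*, b.*
FROM A a
LEFT JOIN B b ON b.Field1 = a.Field1
WHERE b.Field2 = 'x';
```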
|
|
# ¿ May 7, 2020 14:17 |
Actual max concurrent, or number of distinct users seen within each interval? The answer is probably going to be different, with the latter probably being easier.
|
|
# ¿ Jun 18, 2020 22:27 |
kumba posted:if I'm reading this right, I could imagine a setup where you populate a table with each datetime interval you want (e.g. 1/1/2020 00:00:00, 1/1/2020 00:30:00, etc), then left join that table to your user login tracking table on that using BETWEEN and get a count Yeah something like that will probably work, but I think you should change to count(distinct userid) to not get the same user logging in/out repeatedly.
|
|
# ¿ Jun 18, 2020 22:31 |
Assign +1 and -1 values to the log in and log out events. Make sure you have one row per event, not per session, you may have to union the table with itself. Use a window function to keep a running sum of log in/out events. Join that with your time dimension to take a max of every 30 minute interval. Your RDBMS probably has a way to make temporary tables, use that. Temporary tables are private to the connection and disappear when you disconnect.
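An untested sketch of those steps; Sessions(UserID, LoginTime, LogoutTime) and the TimeIntervals dimension table are assumed names:

```sql
WITH events AS (
    -- one row per event, +1 for login, -1 for logout
    SELECT LoginTime  AS EventTime, +1 AS Delta FROM Sessions
    UNION ALL
    SELECT LogoutTime AS EventTime, -1 AS Delta FROM Sessions
), running AS (
    -- running count of concurrent sessions at each event
    SELECT EventTime,
           SUM(Delta) OVER (ORDER BY EventTime
                            ROWS UNBOUNDED PRECEDING) AS Concurrent
    FROM events
)
SELECT ti.IntervalStart, MAX(r.Concurrent) AS MaxConcurrent
FROM TimeIntervals ti          -- assumed 30-minute time dimension table
JOIN running r
  ON r.EventTime >= ti.IntervalStart
 AND r.EventTime <  ti.IntervalEnd
GROUP BY ti.IntervalStart
ORDER BY ti.IntervalStart;
```

Sessions still open (NULL LogoutTime) would need special handling that this sketch skips.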
|
|
# ¿ Jun 18, 2020 23:16 |
Can anyone confirm if this way of securing an application sounds correct? MS SQL Server 2019.
The application has a couple of data tables containing sensitive data, and a table for logging access to the application and the data accessed. A selection of domain users needs access to the application; other users have no access.
Write a set of stored procedures to do all data access required by users; these sprocs validate parameters and write to the access log table. The sprocs are marked EXECUTE AS OWNER (and owned by dbo).
Create an AD group (CONTOSO\AppUsers) and make the users members of this group. The AD group is added as a login to the SQL database, with no permissions of its own.
Create a database-level role "app_user" and add the login "CONTOSO\AppUsers" as a member of that role. Grant the "app_user" role EXECUTE on the sprocs.
In the sprocs, the ORIGINAL_LOGIN() function can then be used to obtain the user's own account name (CONTOSO\jsmith) for access logging.
|
|
# ¿ Nov 25, 2020 14:01 |
Ruggan posted:Why EXECUTE AS OWNER? I'm not entirely sure I understand what you're arguing here, but I also don't entirely understand the authorization model. Do you mean that an SP's permissions on tables (etc.) are checked at creation time, and that after creation no checks are made at execution, other than the user having permission to execute the SP itself? So in theory, I could create an SP that accesses a table I have permissions on, such that I'm the owner of that SP, then have my permissions on that table revoked, and still be able to execute the SP to access the table that way? Are there any disadvantages or risks involved in using EXECUTE AS OWNER?
|
|
# ¿ Nov 30, 2020 09:18 |
Try this: SQL code:
Actually, the other solution that may be simpler is to just write two separate queries and use UNION ALL between them.
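Since the original query isn't shown, here's the general shape with placeholder tables and columns; both branches must return the same number of columns with compatible types:

```sql
-- current_items / archived_items and their columns are placeholders.
SELECT id, name, 'current' AS source
FROM current_items
WHERE active = 1
UNION ALL
SELECT id, name, 'archived' AS source
FROM archived_items
WHERE active = 1;
```

UNION ALL keeps duplicates and skips the dedup sort that plain UNION pays for, which is usually what you want when the two sources are disjoint.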
|
|
# ¿ Dec 10, 2020 11:42 |
Hammerite posted:If you need to represent "data is missing" or "column is not applicable", then a nullable column is indicated. If you don't need to represent either of these situations, then a nullable column is not indicated. I've never been in a situation where it's a matter of personal stylistic choice whether to make a column nullable or not, it's always been clear from the context whether it would or wouldn't be correct to make the column nullable. I suppose that to the extent that your observation boils down to "if the column shouldn't be nullable, then don't make it nullable", I agree with it. I believe the idea is that if you have data that can be absent, then those data should live in a separate table that joins with the one you'd otherwise put the nullable column in.
|
|
# ¿ Dec 11, 2020 11:53 |
Generate the number of IDs you need from the sequence, assign them manually to the data you insert into the first table; then you know which IDs correspond to which data when inserting into the second table.
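In T-SQL that could look like this untested sketch, with the sequence and table names as placeholders:

```sql
-- Reserve IDs up front, then reuse them in both inserts.
-- dbo.MySequence, Parent, Child are placeholder names.
DECLARE @id1 BIGINT = NEXT VALUE FOR dbo.MySequence;
DECLARE @id2 BIGINT = NEXT VALUE FOR dbo.MySequence;

INSERT INTO Parent (Id, Name)
VALUES (@id1, 'first'), (@id2, 'second');

INSERT INTO Child (ParentId, Detail)
VALUES (@id1, 'detail for first'),
       (@id2, 'detail for second');
```

For bulk reservations, SQL Server also has sp_sequence_get_range to grab a contiguous block in one call.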
|
|
# ¿ Aug 13, 2021 07:04 |
The data will go through the machine the import/export process is run on. In general, database servers from different vendors can't connect to each other for direct transfers; they use different protocols and so on. All that wizard is doing for you is hooking up a SELECT query on the source to fetch all the data, and then sending INSERT queries to the destination to load it back in.
|
|
# ¿ Sep 26, 2021 07:59 |
I can't write the SQL off the top of my head, but you can probably write a CTE that ranks the items for each transaction with ordinal 1, 2, 3, 4, 5,... and then select a five-way self join on the CTE to construct rows with up to five ranked items from each.
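An untested attempt at that sketch; TransactionItems(TransactionID, ItemName) is an assumed source table and the ordering within a transaction is arbitrary here:

```sql
WITH ranked AS (
    SELECT TransactionID, ItemName,
           ROW_NUMBER() OVER (PARTITION BY TransactionID
                              ORDER BY ItemName) AS rn
    FROM TransactionItems
)
SELECT r1.TransactionID,
       r1.ItemName AS Item1, r2.ItemName AS Item2, r3.ItemName AS Item3,
       r4.ItemName AS Item4, r5.ItemName AS Item5
FROM ranked r1
LEFT JOIN ranked r2 ON r2.TransactionID = r1.TransactionID AND r2.rn = 2
LEFT JOIN ranked r3 ON r3.TransactionID = r1.TransactionID AND r3.rn = 3
LEFT JOIN ranked r4 ON r4.TransactionID = r1.TransactionID AND r4.rn = 4
LEFT JOIN ranked r5 ON r5.TransactionID = r1.TransactionID AND r5.rn = 5
WHERE r1.rn = 1;
```

The LEFT JOINs keep transactions with fewer than five items, padding the missing slots with NULL.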
|
|
# ¿ Oct 14, 2021 22:20 |
Consider if you should normalize your schema. I think it would look something like Host (id , name) Pop (id, name) Measurement (id, time) ResponseTime (measurement_id, host_id, pop_id, response_time)
|
|
# ¿ Nov 13, 2021 00:02 |
You obviously need to have ingredients that have a recipe of their own too. Like you have a recipe for mayonnaise and then a recipe for a mayo-based burger dressing, and then a recipe for a hamburger using that dressing.
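One hypothetical way to model that (all names made up): each recipe line points at either a base ingredient or another recipe, never both.

```sql
CREATE TABLE Recipe (
    RecipeID INT PRIMARY KEY,
    Name     VARCHAR(100) NOT NULL
);
CREATE TABLE Ingredient (
    IngredientID INT PRIMARY KEY,
    Name         VARCHAR(100) NOT NULL
);
CREATE TABLE RecipeLine (
    RecipeID     INT NOT NULL REFERENCES Recipe(RecipeID),
    IngredientID INT NULL REFERENCES Ingredient(IngredientID),
    SubRecipeID  INT NULL REFERENCES Recipe(RecipeID),
    Quantity     DECIMAL(10,2) NOT NULL,
    -- exactly one of the two references must be set
    CHECK ((IngredientID IS NULL AND SubRecipeID IS NOT NULL)
        OR (IngredientID IS NOT NULL AND SubRecipeID IS NULL))
);
```

Expanding a recipe to base ingredients then becomes a recursive query, and you'd want some guard against cyclic sub-recipes.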
|
|
# ¿ Dec 16, 2021 22:54 |
You can maybe sign up for a virtual desktop on Azure. Not sure if you can get those as an individual or only as an enterprise customer. Or go install Windows on the Mac; I don't think you can run new macOS versions on it anyway, and you'll get locked out of more and more software as developers update their SDKs.
|
|
# ¿ Jan 6, 2022 18:56 |
You make a JOIN in your FROM clause, joining the related tables to the report table, on the condition of each table's key being equal to the matching foreign key of the report table. That gives you rows containing the data from the report table, the job_card table, the user table, and the report_type table, which you can then select as appropriate. The report.job_id etc. foreign key columns will still be present in the joined result, but if you don't include them in the list of fields to select, you won't see them in the result set.
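Based on the table names mentioned, something like this sketch (the key and display column names are assumptions):

```sql
SELECT r.report_id, r.created_at,
       j.job_number,        -- from job_card
       u.user_name,         -- from the user table
       t.type_name          -- from report_type
FROM report r
JOIN job_card    j ON j.id = r.job_id
JOIN [user]      u ON u.id = r.user_id
JOIN report_type t ON t.id = r.report_type_id;
```

The foreign key columns (r.job_id etc.) are still available in the joined rows; they just stay out of the result set because they aren't in the SELECT list.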
|
|
# ¿ Feb 24, 2022 14:30 |
Nth Doctor posted:I'm a big advocate for identity columns even when there's already a uniquely valued column in the table, if the column isn't comparable as a single CPU step. There's no reason you should spend multiple cycles comparing J, J, 4, 2, 1, 2, 6, 3 to J, J, 4, 2, 1, 2, 6, 0 one by one when you can just as easily see 34920 vs. 88382381 isn't a match with a single step. On the other hand, 8 bytes is the size of an int64, and your CPU can already compare two int64 values in a single instruction. Compilers for C++ and other languages can translate a comparison of an array of 8 bytes into a comparison of a single 64-bit value, depending on the exact context, so a database engine might be able to do the same.
|
|
# ¿ Mar 7, 2022 18:17 |
You'll have to generate dynamic SQL anyway; the language doesn't allow parameterizing table names, as far as I know. If you want a "nice" result, create a temporary table, generate INSERT ... SELECT statements for all the source tables to collect their rows into the temporary table, and then run the report on the temp table.
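An untested T-SQL sketch of that pattern; the table-name filter, the #results columns, and the per-table query are all placeholders:

```sql
-- Collect the same query's results from many tables into one temp table.
CREATE TABLE #results (source_table SYSNAME, id INT, amount DECIMAL(10,2));

DECLARE @sql NVARCHAR(MAX);
DECLARE @tbl SYSNAME;

DECLARE tbl_cursor CURSOR FOR
    SELECT name FROM sys.tables
    WHERE name LIKE 'Sales[_]%';      -- placeholder naming convention
OPEN tbl_cursor;
FETCH NEXT FROM tbl_cursor INTO @tbl;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'INSERT INTO #results SELECT ' + QUOTENAME(@tbl, '''')
             + N', id, amount FROM ' + QUOTENAME(@tbl) + N';';
    EXEC sp_executesql @sql;
    FETCH NEXT FROM tbl_cursor INTO @tbl;
END
CLOSE tbl_cursor;
DEALLOCATE tbl_cursor;

SELECT * FROM #results;   -- run the report against the combined data
```

QUOTENAME on the table name keeps the dynamic SQL safe against odd table names.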
|
|
# ¿ Apr 7, 2022 21:16 |
https://www.postgresql.org/docs/current/datatype-datetime.html says the timestamp type has 1 microsecond resolution, but your textual representation only has 1 millisecond resolution. The stored values probably have more precision than gets displayed, so they don't produce an exact match with the copy-pasted textual value.
|
|
# ¿ Jun 15, 2022 20:49 |
prom candy posted:How do I write a query to return all users that have a Honda AND a Toyota? Use an EXISTS in the WHERE clause instead of joining the tables twice. That's also easier to chain when you need to check for three or four makes.
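With assumed tables users(id) and cars(user_id, make), the EXISTS version looks like this:

```sql
SELECT u.*
FROM users u
WHERE EXISTS (SELECT 1 FROM cars c
              WHERE c.user_id = u.id AND c.make = 'Honda')
  AND EXISTS (SELECT 1 FROM cars c
              WHERE c.user_id = u.id AND c.make = 'Toyota');
```

Adding a third make is just one more AND EXISTS clause, and each user row comes back once regardless of how many matching cars they own.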
|
|
# ¿ Jun 17, 2022 17:39 |