|
At the risk of sounding prophetic, are you sure you really want the timestamps in the results? That is, are you really just looking for the final state for the user (a much easier problem), or do you truly need the alert and acknowledgement matched in that fashion? And the obvious question: what happens if the sequence is (alert1, alert2, ack2, ack1)? Or (alert, alert, ack)?
|
# ? Dec 12, 2018 00:25 |
|
|
|
Nth Doctor posted:Query... PhantomOfTheCopier posted:At the risk of sounding prophetic, are you sure you really want the timestamps in the results? That is, are you really just looking for the final state for the user (a much easier problem), or do you truly need the alert and acknowledgement matched in that fashion? My understanding of the data is that an acknowledgement will acknowledge all of the alerts that have fired since the last acknowledgement. The timestamps are needed because the person requesting the report wants to see stats around when during the day they typically fire, and how long on average it takes until an alarm is acknowledged.
|
# ? Dec 12, 2018 01:15 |
|
Is there a quick way to convert a string in the format '2013.06.11.PM.09.48' to datetime in SQL Server? For clarity, that is yyyy.mm.dd.AM/PM.hh.mm I could break it apart into components and do various conversions and then put them back together as datetime, but I'm hoping there is something obvious that I have missed.
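Not aware of a built-in format code for that layout, but one compact way is to splice the pieces into a string CONVERT already understands. A sketch, untested, assuming the input is always fixed-width in exactly that yyyy.mm.dd.AM/PM.hh.mm shape:

```sql
DECLARE @s VARCHAR(19) = '2013.06.11.PM.09.48';

-- rebuild as 'yyyy-mm-dd hh:mm PM' and let CONVERT parse it
SELECT CONVERT(DATETIME,
       SUBSTRING(@s, 1, 4) + '-' + SUBSTRING(@s, 6, 2) + '-' + SUBSTRING(@s, 9, 2)
       + ' ' + SUBSTRING(@s, 15, 2) + ':' + SUBSTRING(@s, 18, 2)
       + ' ' + SUBSTRING(@s, 12, 2)) AS parsed;  -- should yield 2013-06-11 21:48
```

If the width ever varies you'd want PARSENAME or a split on '.' instead of hard-coded offsets.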
|
# ? Dec 12, 2018 03:46 |
|
Hughmoris posted:I appreciate you taking the time to type all that up. I need to work my way through understanding how it works. That type of performance is awesome, because my correlated subquery takes 8+ minutes for a month of data. Because I can't leave well enough alone, I looked into the edge cases I mentioned before and fixed them: code:
As you can see with Patient 2, if for some reason there is an acknowledgement before any alarms were found in the data set, those acknowledgements are ignored. pre:

```
Patient_No  Alarm  Alarm_Date               Acknowledgement  Acknowledgement_Date
----------- ------ ------------------------ ---------------- ------------------------
1           1      2010-01-01 00:00:00.000  2                2010-01-04 00:00:00.000
1           1      2010-01-02 00:00:00.000  2                2010-01-04 00:00:00.000
1           1      2010-01-03 00:00:00.000  2                2010-01-04 00:00:00.000
1           1      2010-01-05 00:00:00.000  2                2010-01-06 00:00:00.000
1           1      2010-01-07 00:00:00.000  NULL             NULL
2           1      2010-01-01 00:00:00.000  2                2010-01-04 00:00:00.000
2           1      2010-01-02 00:00:00.000  2                2010-01-04 00:00:00.000
2           1      2010-01-03 00:00:00.000  2                2010-01-04 00:00:00.000
2           1      2010-01-05 00:00:00.000  2                2010-01-06 00:00:00.000
2           1      2010-01-07 00:00:00.000  NULL             NULL
```
|
# ? Dec 12, 2018 16:07 |
|
I think I may have a simpler query. Using your same base result set:code:
code:
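The actual code blocks didn't survive the quote, but a window-function approach of the kind being discussed is roughly this shape (a sketch, not necessarily the original query — table and column names are made up, with event type 1 = alarm and 2 = acknowledgement to match the result set above). For every event, take the earliest acknowledgement date from the current row forward, then keep only the alarm rows:

```sql
WITH matched AS (
    SELECT Patient_No,
           Event_Type,
           Event_Date,
           -- earliest acknowledgement at or after this row
           MIN(CASE WHEN Event_Type = 2 THEN Event_Date END) OVER (
               PARTITION BY Patient_No
               ORDER BY Event_Date
               ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
           ) AS Acknowledgement_Date
    FROM Events
)
SELECT Patient_No,
       Event_Date AS Alarm_Date,
       Acknowledgement_Date
FROM matched
WHERE Event_Type = 1
ORDER BY Patient_No, Alarm_Date;
```

One pass over the data, no correlated subquery, which is where the speedup comes from.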
|
# ? Dec 12, 2018 20:46 |
|
Ruggan posted:I think I may have a simpler query. Using your same base result set: Ooh I like that! I don't use windowed functions terribly often in my day to day work, and that's a really nice application of them!
|
# ? Dec 12, 2018 21:00 |
|
Ruggan posted:
You are a wizard. Utilizing this method allowed me to query 11 months' worth of data in about 10 seconds. My correlated subquery for 1 month of data took 8+ minutes. Nth Doctor, thanks for taking another crack at it. I need to rewrite the query using your approach to see how it compares. I also need to start working on understanding execution plans. It's really great to see how other people approach a given problem, and to store those concepts for future problems. Hughmoris fucked around with this message at 03:59 on Dec 13, 2018 |
# ? Dec 13, 2018 03:50 |
Are there any good guides on when to make something a new column on a table, and when to compose with auxiliary tables? I get that a one-to-many relationship needs to be composed either directly or with mapping/indexing tables. But even using that as a guideline, if you just take everything that is 1-to-1 and not repeated across other tables and add those as new columns, you can end up with some absolute monoliths of tables in any application of scale. On the other hand, maintaining an application where every field of an object is added through composition sounds like hell on earth.
|
|
# ? Dec 14, 2018 22:09 |
|
You basically want a "when to normalize, and how much" guide, which can have different approaches depending on the goal. This article (it was the first result; I have no idea how good the site is) goes over the basics: https://www.essentialsql.com/get-ready-to-learn-sql-database-normalization-explained-in-simple-english/
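For a concrete flavor of what normalization buys you, here's the usual before/after in sketch form (hypothetical schema; the two versions are alternatives, not meant to run in the same script):

```sql
-- before: customer attributes repeated on every order row
CREATE TABLE Orders (
    OrderID       INT PRIMARY KEY,
    OrderDate     DATE,
    CustomerName  VARCHAR(100),   -- duplicated per order
    CustomerEmail VARCHAR(100)    -- duplicated per order
);

-- after: customer facts stored once, orders reference them
CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    Name       VARCHAR(100),
    Email      VARCHAR(100)
);

CREATE TABLE Orders (
    OrderID    INT PRIMARY KEY,
    OrderDate  DATE,
    CustomerID INT REFERENCES Customers (CustomerID)
);
```

The monolith-table problem mostly goes away once repeated facts get their own table; truly 1-to-1 attributes usually stay put unless they're optional or rarely queried.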
|
# ? Dec 17, 2018 22:56 |
That article looks very relevant. Just having the terminology is very useful as well.
|
|
# ? Dec 18, 2018 20:31 |
|
SQL code so bad that Cloudflare refused to allow me to post it unedited. Developer wondering why it wasn't working with any speed. code:
|
# ? Dec 19, 2018 21:22 |
|
lol, that hurts my brain
|
# ? Dec 20, 2018 06:12 |
|
unknown posted:SQL code so bad that Cloudflare refused to allow me to post it unedited. I should have scrolled to the bottom in the hope of keeping my sanity, and then I saw that.
|
# ? Dec 20, 2018 06:46 |
poo poo like that is normal at my current workplace. My old boss, who started the company, and his early devs were all ideas guys who fancied themselves programmers. You know when entering an SP or function that was made before 2015 that you're about to see some funky stuff. The only difference is that aliases are either missing or, more often than not, just a single letter, and table and column names are rife with typos and typically mix more than one language IN THE SAME NAME. My favorite is finding an old view that is actively being used, that joins to a massive table to get a value that is never used, without a clustered index to join them on.
|
|
# ? Dec 20, 2018 10:35 |
|
Nth Doctor posted:So does your dev realize indenting ORs doesn't make them logically nest? Or anything else about query design? Not my dev, I just got called in to investigate wtf was going on (as other people are already on holidays) and discovered the monstrosity. Indenting is from the query clean up/formatting tool linked in the first post. I did have to change the variable names for obvious reasons - hence the junk character names, but the logic (ha!) is all the dev's own work.
|
# ? Dec 21, 2018 18:47 |
|
Vague question here: When writing queries, do you typically familiarize yourself with the indexes before you start, or only when the performance needs improvement? I just learned about SP_HELPINDEX on SQL Server and am thinking I need to start using it.
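For anyone following along, sp_helpindex is a one-liner, and the catalog views give the same information without the legacy proc (table name here is a made-up example):

```sql
EXEC sp_helpindex 'dbo.Orders';   -- quick and dirty

-- same idea via catalog views, with key column order
SELECT i.name AS index_name,
       i.type_desc,
       c.name AS column_name,
       ic.key_ordinal
FROM sys.indexes i
JOIN sys.index_columns ic
  ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.Orders')
ORDER BY i.name, ic.key_ordinal;
```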
|
# ? Dec 23, 2018 23:25 |
|
Hughmoris posted:Vague question here: When writing queries, do you typically familiarize yourself with the indexes before you start, or only when the performance needs improvement? Only when the performance is a problem. Premature performance tweaks can have detrimental effects if you lock the query optimizer out of performing its magic. Also, make a good hygiene habit of joining on the PK / clustered index wherever possible.
|
# ? Dec 24, 2018 00:27 |
|
Hughmoris posted:Vague question here: When writing queries, do you typically familiarize yourself with the indexes before you start, or only when the performance needs improvement?

I never start by looking at the indexes. I write the query with the cleanest and most sensible syntax I can come up with, and 95% of the time that gives me a query that is good enough. At that point I'm done.

For the other 5% I'll do some optimization work, which usually starts with looking at the query plan for things that are costing a lot: scans, key lookups, that sort of thing. Rarely am I presented with a situation where query refactoring could lead to better performance via different index utilization, especially when you've written a clean query with simple joins. When it does happen, a lot of it involves nudging the optimizer to select from the right index with little tricks like explicitly filtering the same column in multiple joined tables (like a date range - only pulling data from that date range in each table, aka two WHERE ... BETWEEN or ON ... BETWEEN clauses).

Sometimes I find that indexes that are necessary or important (like an index on an important foreign key join column) are missing, and I'll either add them if I'm dbo or bug the right person to do it. In this case, having some clout is really helpful, as I don't often have to have the "yes, I know indexes have a cost" conversation and rarely request useless / overly specific indexes. If you're making this sort of request, you can bolster your argument by looking at query stats in the query store - if you find a query is running hundreds of times a day and would largely benefit from the index, that helps you convince the dbo to make it.
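The "filter the same range in each joined table" nudge looks like this (a sketch with hypothetical tables; the second predicate is logically redundant when the join guarantees the dates line up, but it can let the optimizer seek an index on each table instead of scanning one of them):

```sql
SELECT o.OrderID, ol.LineTotal
FROM Orders o
JOIN OrderLines ol
  ON ol.OrderID = o.OrderID
WHERE o.OrderDate  >= @start AND o.OrderDate  < @end
  AND ol.OrderDate >= @start AND ol.OrderDate < @end;  -- redundant-looking, steers the index choice
```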
|
# ? Dec 24, 2018 07:53 |
|
Nth Doctor posted:Only when the performance is a problem. Premature performance tweaks can have detrimental effects if you lock the query optimizer out of performing its magic. Ruggan posted:I never start by looking at the indexes. I write the query with the cleanest and most sensible syntax I can come up with, and 95% of the time that gives me a query that is good enough. At that point I'm done. Thanks for the insights. I recently took a SQL assessment exam that my software vendor requires. Thanks to this thread, I aced it. In 2018 I've gone from barely being able to spell SQL to creating executive-facing dashboards. In 2019, I look forward to deepening my knowledge of SQL beyond simple data retrieval. Thanks for the help everyone!
|
# ? Dec 24, 2018 22:11 |
|
Ok, I managed to cobble together this query using a window function to calculate a running total, but I'm not 100% understanding how it works.code:
2. I don't understand why the ORDER BY is a part of the OVER instead of at the end... aka FROM "transaction" ORDER BY "transaction"."date". It seems like if you order the results, then your window will automatically be over an ordered set of transactions. Obviously I'm wrong; I just don't understand this well enough to understand why I'm wrong. edit: The next post made me realize I didn't mention that I'm using Postgres 10.1. Thermopyle fucked around with this message at 21:19 on Dec 31, 2018 |
# ? Dec 31, 2018 20:06 |
|
In SQL Server it defaults to UNBOUNDED PRECEDING to CURRENT ROW if you put an ORDER BY clause on. If you hadn't ordered it, then range boundaries wouldn't make any sense, so it defaults to the entire dataset, because you haven't put a partitioning clause on either. Window functions work by effectively joining the data on again, to allow it to be sorted and aggregated there. Your ordering on the window applies only to the results of the window, which matters for the choice of bounding the results to aggregate, whereas a final ORDER BY clause orders the returned results.
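Concretely, for a running total like the one above (Postgres syntax; the comments spell out the frame defaults being described — table and column names follow the quoted question):

```sql
SELECT "date",
       amount,
       -- ORDER BY inside OVER => default frame is
       -- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: a running total
       SUM(amount) OVER (ORDER BY "date") AS running_total,
       -- no ORDER BY (and no PARTITION BY) => the frame is the whole set
       SUM(amount) OVER ()                AS grand_total
FROM "transaction"
ORDER BY "date";   -- this final ORDER BY only sorts the rows returned
```

Same aggregate, three different meanings, all controlled by what's inside the OVER.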
|
# ? Dec 31, 2018 21:09 |
|
Moonwolf posted:Window functions work by effectively joining the data on again, to allow it to be sorted and aggregated there. Your ordering on the window applies only to the results of the window, which matters for the choice of bounding the results to aggregate, whereas a final ORDER BY clause orders the returned results. I'm reasonably sure the implementation of windowed functions like this tosses out set-based operations and takes a more algorithmic (à la read head) approach. They're beneficial precisely because they do not need to join the full set to itself. I think.
|
# ? Jan 1, 2019 03:45 |
|
Is it possible to have a database (or the server hosting the database maybe) execute some simple code when a write is performed? I started looking at using Firebase for an Android app I'm making. I'm trying to make a database to store high scores, and I'd like it to just keep the top 10 results in the database but I don't want the clients touching other people's records. However, I'm not sure if it's possible. If not, oh well, I can just have the client compare and then send a result if it beats another score and then just store everything forever. I guess the other thing that can be done is the bottom record could be re-written by the new record.
|
# ? Jan 2, 2019 02:06 |
There are things called triggers in SQL which might be what you're looking for. I'm on mobile at the moment but can follow up later if you're still stuck.
|
|
# ? Jan 2, 2019 02:14 |
|
Are you saying you’re only storing high scores in the database or you’re storing all scores in one table and high scores in another? Or what? As mentioned, you can do triggers. However triggers suck as they’re synchronous operations. There are other options that are more technically complex as well. But I’m not convinced you need them in the first place.
|
# ? Jan 2, 2019 02:38 |
|
Bundy posted:There are things called triggers in SQL which might be what you're looking for. I'm on mobile at the moment but can follow up later if you're still stuck. Oh, that's a good starting point, thank you! Ruggan posted:Are you saying you’re only storing high scores in the database or you’re storing all scores in one table and high scores in another? Or what? Yeah, I don't really need to do anything fancy, I don't think the database will ever exceed the free-tier limits or anything. Of course I can just sort the results and return the top 5 or 10 or whatever. I'm just trying to learn a little bit more about using databases and whatnot since I know almost nothing at this point. I'm trying to do it "right" and learn a little bit, if that makes sense. The idea was that the database just holds the top 10 scores. Then the client gets the top 5 or so and shows them on the high score screen; and any time a client scores a new high score, it gets sent off to the database, which then theoretically will update to reflect the new set of high scores, deleting the lowest result.
|
# ? Jan 2, 2019 03:51 |
|
You definitely don't want any situations where the client has the ability to modify someone else's score data. Many years ago Super Meat Boy did that and it didn't take long before that bit them in the rear end. https://forums.somethingawful.com/showthread.php?noseen=0&threadid=2803713&pagenumber=258#post398884189 I'm not familiar with Firebase so I'm not sure what parts of the equation it may handle for you, but the way I'd do this would be to throw together a quick web API that sat between the database and the clients, with an endpoint for submitting high scores and another for retrieving the leaderboard. That app could then do whatever other housekeeping work you wanted it to do. Rule #1 of internet-facing services: Never trust the client. Assume any safeguards you put in to client code will be defeated. Validate everything they send you and don't let them interact with anything they don't need to.
|
# ? Jan 2, 2019 06:45 |
|
Triggers aren’t for what you’re trying to do. A trigger is code that fires when a database table is updated in some way (record added, deleted, updated). They are typically used for dependent updates - for example, if you had a Scores table, you might have a trigger that updates the HighScores table whenever the Scores table changes. Like I said before, I strongly dislike triggers as they are synchronous operations - the transaction does not complete until the trigger does and so the user must wait for the triggered operation to complete before regaining control. When people misuse triggers for expensive operations (like doing some aggregate across an entire table) it can really suck. Once again, you don’t need a trigger. Triggers are for performing a dependent operation. All you need is a stored procedure. This stored procedure should accept two parameters: user and score. It should take the score and update your HighScores table directly when appropriate. You can expose this to your application through a direct SQL call, or some sort of API as wolrah mentioned.
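A sketch of what that stored procedure might look like in T-SQL (the table, columns, and proc name are all made up for illustration — Firebase won't take this as-is, but the shape carries over to a cloud function):

```sql
CREATE PROCEDURE dbo.SubmitScore
    @UserName NVARCHAR(50),
    @Score    INT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.HighScores (UserName, Score)
    VALUES (@UserName, @Score);

    -- trim the table back down to the top 10 scores
    DELETE hs
    FROM dbo.HighScores AS hs
    WHERE hs.HighScoreID NOT IN (
        SELECT TOP (10) HighScoreID
        FROM dbo.HighScores
        ORDER BY Score DESC
    );
END
```

The client only ever gets permission to call the proc, never to write the table directly — which is the whole point.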
|
# ? Jan 2, 2019 07:38 |
|
wolrah posted:You definitely don't want any situations where the client has the ability to modify someone else's score data. Many years ago Super Meat Boy did that and it didn't take long before that bit them in the rear end. Web is even more terrifying to me than the database, I'm not sure I'm up to that task. wolrah posted:Rule #1 of internet-facing services: Never trust the client. Assume any safeguards you put in to client code will be defeated. Validate everything they send you and don't let them interact with anything they don't need to. Yeah, I definitely assumed that! I don't know how I would go about making sure that scores sent over were valid (I know nothing about security [actually, it seems like I might know nothing about anything]). Maybe whenever the value is increased naturally, I store the answer as a hash, then send that with the score and have the server hash the score and compare them? Does that sound right? I'm pretty confident approximately 0 people will download my app, but I'm trying to get in good practice. Ruggan posted:Triggers aren't for what you're trying to do. A trigger is code that fires when a database table is updated in some way (record added, deleted, updated). They are typically used for dependent updates - for example, if you had a Scores table, you might have a trigger that updates the HighScores table whenever the Scores table changes. Like I said before, I strongly dislike triggers as they are synchronous operations - the transaction does not complete until the trigger does, and so the user must wait for the triggered operation to complete before regaining control. When people misuse triggers for expensive operations (like doing some aggregate across an entire table) it can really suck. Stored procedures look interesting, thanks! I think that might work well. Firestore/Firebase doesn't seem to have exactly that, but it does have something called "Cloud Functions".
It seems pretty similar. I guess I can expose a method like "sendScoreToServer", and then it fires off and does its thing on the server. Seems like it'll be perfect. Then I'll just have to figure out how the security mess works and limit the database to only accept that call and queries. Thanks for the help, guys! Shadow0 fucked around with this message at 15:14 on Jan 2, 2019 |
# ? Jan 2, 2019 14:57 |
|
Shadow0 posted:Web is even more terrifying to me than the database, I'm not sure I'm up to that task. quote:Yeah, I definitely assumed that! If the game is running entirely on the client there's no way to actually ensure the scores are legitimate. Depending on your game's design there may be ways to sanity check it (like if a good high score should be ~30,000 then a client submitting 4.2 billion is probably doing something shady) but unless the server is involved in running the game you pretty much have to trust the client as far as the score is concerned. Shadow0 posted:Stored procedures look interesting, thanks! I think that might work well. Firestore/Firebase doesn't seem to have exactly that, but it does have something called "Cloud Functions". It seems pretty similar Cloud functions seem to be Google's form of the "serverless" idea I mentioned above. Use HTTP triggers for your "submit score" and "get leaderboard" functions that the client connects to, then the trusted code you've set up on the Google platform does the database interaction.
|
# ? Jan 2, 2019 16:46 |
|
Moonwolf posted:In SQL Server it defaults to unbound preceding to current row if you put an order by clause on. If you hadn't ordered it then range boundaries wouldn't make any sense so it'd default to the entire dataset, because you haven't put a partitioning clause on. After thinking about it for a couple days I think I get this. Thanks!
|
# ? Jan 2, 2019 21:01 |
|
One of our internal data warehouses has a shitton of truncate/reloads that occur nightly. I've engineered many of them myself because they're very fast, and performance is an ETL bottleneck. However, this data warehouse does not enforce foreign key constraints at all, which I have always felt was bad design. Many of our thousands of queries rely on inner joins actually finding the matching record, so there's a definite risk to not enforcing these foreign keys. So with that said, do any of you have strong opinions between the two?
- Enforce FK constraints, at the cost of ETL speed and necessitating MERGE
- Stick with truncate / bulk inserts and say "gently caress it" to the foreign key constraints, as they come from other systems anyway
Also, does anyone know off hand whether a MERGE operation beats a truncate / bulk insert at a certain rowcount threshold (obviously variable based on width/datatypes, but even a real ballpark number)? I imagine that MERGE's overhead (comparing records) costs more at a low rowcount, but it might beat truncate / bulk insert when a low-cost PK comparison can avoid inserting or updating millions of records. But I don't know that for sure.
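For reference, the MERGE alternative to truncate/reload looks roughly like this (staging and target table names are hypothetical). Only changed rows generate writes, which is where it can win at high rowcounts when the deltas are small:

```sql
MERGE dbo.DimCustomer AS tgt
USING stage.Customer AS src
    ON tgt.CustomerID = src.CustomerID
WHEN MATCHED AND (tgt.Name <> src.Name OR tgt.Email <> src.Email) THEN
    UPDATE SET Name  = src.Name,
               Email = src.Email
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, Name, Email)
    VALUES (src.CustomerID, src.Name, src.Email)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```

Note the comparison predicate in the WHEN MATCHED clause — without it, MERGE rewrites every matched row and you lose most of the advantage over a full reload.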
|
# ? Jan 8, 2019 00:48 |
|
Ugh. Five minutes before I left for Christmas, one of my customers started complaining of timeouts when running arbitrary SQL against a read-only server via ODBC. We give their poo poo up to five minutes to select their data. Their guy supposedly attached their SQL but actually attached their result set. "Look man, you're getting 60k rows of data with over 900k instances of NULL values in this 9.5MB text result set. Maybe you should select less at once? The whole reason we made this tool available was to let you do these queries yourself without involving a dev like me. I'm not paid to performance tune your lovely report."
|
# ? Jan 8, 2019 01:06 |
|
Hey, finally I get a chance to ask a question WHERE Type = 'Stupid'! I'm sure there's some really easy way of doing this, but I've been pondering it for a while now and can't come up with anything foolproof. Using T-SQL (SQL Server 14 to be exact), how can I extract all values from an NVARCHAR column that adhere to the pattern 'Firstname Surname'? It doesn't even have to be that restrictive, since I really only want all values that have exactly two strings separated by a space. No other details matter.
|
# ? Jan 10, 2019 16:45 |
|
Kuule hain nussivan posted:Hey, finally I get a chance to ask a question WHERE Type = 'Stupid'! Probably horrible perf but w/e code:
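The quoted code block got eaten, but the LIKE-based answer was presumably something along these lines (a sketch with a hypothetical table and column, not necessarily the original): exactly one space, and not at either end:

```sql
SELECT Name
FROM dbo.People
WHERE Name LIKE '% %'          -- at least one space somewhere
  AND Name NOT LIKE '% % %'    -- but not two or more spaces
  AND Name NOT LIKE ' %'       -- no leading space
  AND Name NOT LIKE '% ';      -- no trailing space
```

Scans the whole column, so "horrible perf" is about right — fine for 5000 rows of ad-hoc querying.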
|
# ? Jan 10, 2019 17:51 |
|
NihilCredo posted:Probably horrible perf but w/e This'll do me just fine. It's only 5000 rows and it's meant to be an ad-hoc query so performance can gently caress right off. Thanks for the help, knew it would be something nice and simple that I've just missed.
|
# ? Jan 10, 2019 18:30 |
|
I wanna learn about databases for web/app backends, so what is a good and cheap/free place to run one? It'll just be for playing around with stuff, so free is preferable. The AWS stuff is confusing as hell because there are like 500 services with descriptions reading like they do the same things. (I've also made an AWS account ages ago, so I don't think the free micro server tier is available to me anymore.) Also, what database system should I learn? It probably doesn't matter, but if everybody uses, like, MySQL then I guess I would prefer that.
|
# ? Jan 11, 2019 07:19 |
|
Postgres, Heroku. There's a free hobby tier IIRC. Or possibly MSSQL on Azure if you really want to end up in a Microsoft shop. MySQL is bad.
|
# ? Jan 11, 2019 07:56 |
Boris Galerkin posted:I wanna learn about databases for web/app backends so what is a good and cheap/free place to run one? Your computer? Also, for least-surprise and avoiding-stupid-habits purposes, learn SQL/relational theory with Postgres. MySQL is a lot better than it used to be but still isn't truly ACID and has some insane defaults. Also, its replication sucks. Source: my experience wrangling the drat thing on big (800k+ tables) and small setups for the last two decades. If your own computer isn't feasible, any free-tier AWS EC2 instance works, or Digital Ocean and OVH do decently cheap VMs, etc. NinpoEspiritoSanto fucked around with this message at 16:53 on Jan 11, 2019
|
# ? Jan 11, 2019 11:29 |
|
|
|
MS SQL is king imo, and Express is free, capped at 10 GB per database last I checked. https://www.microsoft.com/en-us/sql-server/sql-server-editions-express Perfect for learning or ad hoc analysis.
|
# ? Jan 11, 2019 15:58 |