|
What's special about lot-1 and lot-2? Why not lot-0 and lot-2? Is this really the problem you're trying to solve, or is this how you've tried to simplify the actual situation? (My magic8 ball is twitching near "solving the wrong problem".)
|
# ? Jan 29, 2021 00:06 |
|
PhantomOfTheCopier posted:What's special about lot-1 and lot-2? Why not lot-0 and lot-2? Is this really the problem you're trying to solve, or is this how you've tried to simplify the actual situation? (My magic8 ball is twitching near "solving the wrong problem".) This is actually the problem I'm trying to solve. -0 is the start of manufacturing, -1 designates one middle part of the manufacturing process, and -2 is the final product. The date associated with -1 is determined to be the production date, which needs to match the date on the final product. Some of the final product dates have been set to the -0 date, and I need to get a list of those. X, y, and z will be a six-character lot number starting with the year (21-1234 as an example). And you might ask why it was done this way instead of just using another column, which the program is set up to do easily and natively. As always, the response is: $@#$# previous admin hacked it together. Spikes32 fucked around with this message at 00:33 on Jan 29, 2021
# ? Jan 29, 2021 00:22 |
|
Spikes32 posted:This is actually the problem I'm trying to solve. -0 is the start of manufacturing, -1 designates one middle part of the manufacturing process, and -2 is the final product. The date associated with -1 is determined to be the production date, which needs to match the date on the final product. Some of the final product dates have been set to the -0 date, and I need to get a list of those. X, y, and z will be a six-character lot number starting with the year (21-1234 as an example). There's no way to "do" this in SQL in the sense of just filtering rows on basic conditions; you have to write a query that first reshapes the data. It ends up being a "self join", kinda, and you're basically using the database as a big clunky calculator, so it's mostly equivalent to dumping all the data and doing it in an external programming language. I'd start by finding all the tag-2 dates specifically: SELECT left(lot,-2) l2,d d2 FROM table WHERE right(lot,2)='-2'; Next find all the tag-1 dates that differ. (Replace BSELECT; the forum won't let me post the SELECT statements directly.) code:
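For what it's worth, a sketch of how the full comparison might look, building on the query quoted above. The table name (`the_table`, since `table` is reserved) and the date column `d` are stand-ins from the post, and `left()` with a negative length (strip the last two characters) is PostgreSQL behavior, so adjust for your actual database:

```sql
-- pair each lot's -1 row with its -2 row and keep the mismatched dates
SELECT t1.lot AS lot_1, t1.d AS date_1,
       t2.lot AS lot_2, t2.d AS date_2
FROM   the_table t1
JOIN   the_table t2
  ON   left(t1.lot, -2) = left(t2.lot, -2)
WHERE  right(t1.lot, 2) = '-1'
  AND  right(t2.lot, 2) = '-2'
  AND  t1.d <> t2.d;
```

The self-join matches rows on the lot prefix (everything before the `-1`/`-2` suffix), then keeps only pairs whose dates disagree.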
|
# ? Jan 29, 2021 02:40 |
|
Appreciate it, I'll finagle this into working for me tomorrow, or do something similar via Excel.
|
# ? Jan 29, 2021 04:13 |
|
Spikes32 posted:I have to figure out a query for something and I'm hoping you can help. I'm pretty sure I need to do a join but am having trouble visualizing it. I have the following table and I need a way to find the lots where the -1 and -2 dates do not match. So in this example I would want the query to return y-1 and/or y-2, and z-1 and/or z-2. X, y, and z are variable, but each lot will be replicated two or three times with -0, -1, or -2 after them. You could also use subquery factoring if you have it: code:
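The factored (WITH-clause) version might look something like this. Same placeholder names as the earlier reply (`the_table`, date column `d`); a sketch, not tested against the real schema:

```sql
-- factor out the -1 and -2 rows, then compare them
WITH d1 AS (
  SELECT left(lot, -2) AS l, d FROM the_table WHERE right(lot, 2) = '-1'
), d2 AS (
  SELECT left(lot, -2) AS l, d FROM the_table WHERE right(lot, 2) = '-2'
)
SELECT d1.l AS lot, d1.d AS date_1, d2.d AS date_2
FROM d1
JOIN d2 ON d1.l = d2.l
WHERE d1.d <> d2.d;
```

Functionally the same as the self-join, but the two subqueries make it easier to read and to tweak independently.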
|
# ? Jan 29, 2021 12:09 |
|
I know this should be pretty simple, but I don't have a way of testing my query at the moment... Assume I'm working in Oracle SQL Developer and I need to count the number of cases where the number of days between an application date and a start date are within 7 days. Would I be able to do something as simple as: count (CASE WHEN start_date - app_date <= 7 THEN 1 END)
|
# ? Feb 1, 2021 19:45 |
|
Scarf posted:I know this should be pretty simple, but I don't have a way of testing my query at the moment... Assume I'm working in Oracle SQL Developer and I need to count the number of cases where the number of days between an application date and a start date are within 7 days. Yes, as long as they're actually DATE type then doing straight subtraction comes out with a decimal of the days between them.
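In context, the full query might look like this (the table name is a guess from the question). One caveat worth knowing: if the columns are TIMESTAMPs rather than DATEs, subtraction yields an INTERVAL instead of a number, and the comparison won't work as written.

```sql
-- Oracle: subtracting two DATEs yields the difference in days as a number
SELECT COUNT(CASE WHEN start_date - app_date <= 7 THEN 1 END) AS within_7_days
FROM   applications;
```

COUNT ignores NULLs, so rows where the CASE falls through (more than 7 days apart) simply aren't counted.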
|
# ? Feb 1, 2021 23:59 |
|
Moonwolf posted:Yes, as long as they're actually DATE type then doing straight subtraction comes out with a decimal of the days between them. Cool, much appreciated!
|
# ? Feb 2, 2021 02:59 |
|
So here I am with my stupid question of the day. I have 3 tables, a user table, a contest entry table, and a votes table: user, entries, votes. The contest runs on a monthly basis and requires a new contest entry each month to be counted. Each user only gets to vote once per gender in any given month, so they can vote for one male and one female. I have the following query: code:
How can I have it only display the top X results (based on votes) for each unique year/month/gender combination? ModeSix fucked around with this message at 21:56 on Feb 8, 2021 |
# ? Feb 8, 2021 20:59 |
|
ModeSix posted:So here I am with my stupid question of the day. This sounds like a job for the ROW_NUMBER() window function, ordered by COUNT() DESC and partitioned by year, month, and gender, then filtered in an outer query on ROW_NUMBER() <= the number of top entries you want (window functions can't sit directly in WHERE or HAVING)
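A sketch of that shape, with assumed table and column names since the original query got eaten; requires MySQL 8+ for window function support:

```sql
-- rank entries within each year/month/gender bucket by vote count,
-- then keep the top N in an outer query
SELECT entry_id, vote_year, vote_month, gender, vote_count
FROM (
    SELECT e.entry_id, e.vote_year, e.vote_month, e.gender,
           COUNT(v.vote_id) AS vote_count,
           ROW_NUMBER() OVER (
               PARTITION BY e.vote_year, e.vote_month, e.gender
               ORDER BY COUNT(v.vote_id) DESC
           ) AS rn
    FROM entries e
    JOIN votes v ON v.entry_id = e.entry_id
    GROUP BY e.entry_id, e.vote_year, e.vote_month, e.gender
) ranked
WHERE rn <= 3;   -- top 3 per year/month/gender
```

The window function runs after grouping, which is why COUNT() can appear inside the OVER clause; the subquery is needed because the WHERE of the same query level can't see `rn`.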
|
# ? Feb 9, 2021 02:47 |
|
Nth Doctor posted:What's your rdbms? This is on a MySQL database. So, ORDER BY count DESC GROUP BY year, month, gender? I am not sure how to partition it. If we can do that then just grab the first row of each grouping, that's exactly what I want to do. edit: Wow there's a PARTITION BY sql function. Thanks for pointing me in the direction I need to go on this!
|
# ? Feb 9, 2021 20:34 |
|
I'm a lazy person and have my JavaScript program doing an insert statement and then running a separate DELETE statement right after that deletes all null entries to clean up the database… is there any reason NOT to do this?
|
# ? Feb 10, 2021 15:55 |
|
Empress Brosephine posted:I'm a lazy person and have my JavaScript program doing an insert statement and then running a separate DELETE statement right after that deletes all null entries to clean up the database… is there any reason NOT to do this? I mean... it seems really goofy to me to go through the hassle of having the DB do the insert only to purge a portion of it right afterward.
|
# ? Feb 10, 2021 18:18 |
|
Yeah it's pretty goofy. I don't want to write a JavaScript case to check if a field is null; I wish SQL had a way to be like "INSERT INTO table(value) WHERE value IS NOT NULL", but that last part doesn't work on insert statements.
|
# ? Feb 10, 2021 18:33 |
|
you could always just make that column explicitly NOT NULL so that any inserts with a null in it get rejected?
|
# ? Feb 10, 2021 18:34 |
|
Wow you guys are so much more helpful than stack overflow, they told me to get hosed. That's a great idea Bioware. Is there a way, in SQL, to force it to skip the null values and insert the rest? Here's my original question from stackoverflow: quote:I feel like I'm missing something obvious, but I'm trying to run this command in pg-promise for node/express:
|
# ? Feb 10, 2021 18:51 |
|
Your application should not be inserting null values if you don't want them in the first place; it's quite literally one line to check that the string isn't empty, so I'm not sure why we're heading down this route. But it's a column type modifier, not per-query: ALTER TABLE formquestionresponse MODIFY value TEXT NOT NULL; This will fail if you still have null values, so you'll have to delete those first. Any attempt to insert a null there afterward should hardfail. You'll need to try/catch it.
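One caveat: `MODIFY` is MySQL syntax. Since the question mentions pg-promise, the PostgreSQL spelling would presumably be (table and column names taken from the quoted statement above):

```sql
-- clear out existing nulls first, or the ALTER will fail
DELETE FROM formquestionresponse WHERE value IS NULL;

-- then forbid nulls going forward
ALTER TABLE formquestionresponse ALTER COLUMN value SET NOT NULL;
```

After that, any insert with a null `value` raises a not-null constraint violation that the application has to catch.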
|
# ? Feb 10, 2021 19:06 |
|
Ill do that, thank you for the help BioWare
|
# ? Feb 10, 2021 19:09 |
|
I'm mostly with Biowarfare on this. The exception I'd give is if you're inserting multiple rows with an INSERT SELECT and the values are coming from another database object, it might make sense to filter it in the SELECT clause and not bother your application, but for single-row inserts, yeah, do your logic in the application. Or write a stored procedure in the database that takes the parameters your application provides and does the check for NULL and building the INSERT internally. Stored procedures + bind variables help with keeping the fiddly SQL details out of your application and ensure that you have a standard interface to make grants for, rather than letting your application access every table and having to touch application code whenever you modify your data model.
|
# ? Feb 10, 2021 21:05 |
|
Yeah I'm inserting about 100 rows at once so that's why I asked. I just need to not be lazy and write the code in the app lol.
|
# ? Feb 10, 2021 21:16 |
|
in mssql you could do something like the following. not sure if postgres or whatever supports this syntax though, but iirc this is standard sql syntax code:
basically just putting the VALUES clause into a subquery and filtering that i agree with the previous posters that value should be set as non-nullable (in addition to any code/query changes), if having nulls in value doesn't make logical sense in your schema redleader fucked around with this message at 22:44 on Feb 10, 2021 |
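A sketch of what that looks like, with made-up column names; the derived-table-from-VALUES syntax below works in both T-SQL (2008+) and Postgres:

```sql
-- filter the VALUES list in a subquery before inserting
INSERT INTO formquestionresponse (question_id, value)
SELECT v.question_id, v.value
FROM (VALUES (1, 'yes'),
             (2, NULL),        -- this row gets filtered out
             (3, 'no')) AS v (question_id, value)
WHERE v.value IS NOT NULL;
```

Only the non-null rows reach the table, so no follow-up DELETE is needed.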
# ? Feb 10, 2021 22:42 |
|
Nullable values in columns should really be an opt-in rather than an opt-out, but it's about 50 years late for that.
|
# ? Feb 10, 2021 23:48 |
|
What is the advantage of using a dynamic query in a stored procedure or function, compared to writing it some other way? I'm looking at the EXECUTE statement in PostgreSQL's PL/pgSQL, while the Open Web Application Security Project recommends against using dynamic SQL in a stored procedure. For a dynamic query where the only variables are for data (literals), I'm thinking you don't need a dynamic query, because you can pass the procedure's arguments straight into a query written normally. If so, that'd leave dynamic queries involving variables for tables and columns. Is there any equivalent way to write a procedure without a dynamic query, using a variable for a table's or column's identifier? PL/pgSQL doesn't have table variables, so I'm guessing there'd be a workaround involving transforming a table into an array of rows, passing the array as an argument, and then turning it back into a table. galenanorth fucked around with this message at 23:06 on Feb 12, 2021
# ? Feb 12, 2021 21:36 |
|
there is no advantage to using a dynamic query, it's just sometimes you have to do it that way, specifically in your case where you want to pass in a variable for a table/column name. the disadvantages are, the query planner can't cache the plan for your dynamic query, and sql injection like owasp is worried about. postgres has functions to quote your variables safely to avoid sql injection, toward the end of the execute section in the documentation. the easiest way is to use the format function as they describe. as long as you're careful, it's safe. i do not suggest doing anything with row arrays to simulate polymorphism. i have attempted this in the past, just to explore how generic i could write things, and that way lies madness. postgres has only a few specialized polymorphic types (anyelement, anyarray, anynonarray, anyenum), and while occasionally you can get away with "ehh here's an anonymous row type, figure it out" in queries, with functions you will eventually have to write down the table and column names. like, if you try to return something generic, it looks like you got away with it, but in fact the caller has to specify the type it expects.
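A minimal sketch of the format() approach being described: %I quotes an identifier safely, while USING binds data values without any string interpolation at all. The table argument and the `created_at` column are hypothetical.

```sql
-- PL/pgSQL: dynamic table name via %I, data value via USING
CREATE OR REPLACE FUNCTION count_recent(tbl text, since timestamptz)
RETURNS bigint
LANGUAGE plpgsql AS $$
DECLARE
  n bigint;
BEGIN
  EXECUTE format('SELECT count(*) FROM %I WHERE created_at >= $1', tbl)
  INTO n
  USING since;
  RETURN n;
END;
$$;
```

The identifier goes through format() because it can't be a bind parameter; the timestamp goes through USING because data values should never be spliced into the string.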
|
# ? Feb 13, 2021 02:40 |
|
Pretty much anything can be done with dynamic SQL, the question is always "What do you need it for?" Heed well my tale of woe: I do some ETL stuff in Oracle where, depending on various factors, I want to use one of several objects as a source for the data entry and decide which I'm using at runtime. Dynamic SQL seemed like it would save me a lot of "If this object then this bigass SELECT, if this object then this bigass SELECT with all the same columns and a slightly different FROM-clause"-ness. That might have been ugly but it would've been far faster to actually code than getting the dynamic SQL working correctly, chiefly made difficult by not being able to get useful errors at compile time and only erroring out at runtime, where I then needed good logging to figure things out. In the end, I hacked it together with some ref cursor jank. It works and performs reasonably well, I just wouldn't recommend anyone else go this way. Unless your data model is bad (realtalk, it probably is in one way or another but let's assume there's no wanton abuse of the common table), a stored function that does the same sorts of transformation on an object tbd with a parameter is probably overly broad and prone to all sorts of subtle errors. I recommend a look at this Steven Feuerstein post. It's brief, the comments have some further discussion on the topic and if you still come away from it thinking your use case requires it, then by all means have at it, just ask yourself first if it really is the best way to solve your concrete issue.
|
# ? Feb 13, 2021 13:49 |
|
D-SQL has its place, and while OWASP has legitimate concerns about injection, in SQL Server's implementation, at least, it can even be parameterized, and even have plans be stored and reused. My company has used D-SQL to let extremely complex queries chop off entire sets of expensive joins that weren't needed based on the tenant's configuration. I also rolled my own replication using Change Tracking and triply nested D-SQL which SUCKED to write but was flexible enough to support keeping > 100 tables in over a dozen databases synced across 8 shards. I handled the new insert case and my brain basically leaked out of my ears.
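For reference, the SQL Server mechanism being described is sp_executesql, which keeps data values parameterized (and plans cacheable) even though the statement text is assembled at runtime. All the table names and flags here are made up for illustration:

```sql
-- T-SQL sketch: build the statement conditionally, bind values separately
DECLARE @sql nvarchar(max) = N'SELECT o.order_id, o.total FROM orders o';

IF @include_details = 1   -- hypothetical tenant-configuration flag
    SET @sql += N' JOIN order_details d ON d.order_id = o.order_id';

SET @sql += N' WHERE o.tenant_id = @tid';

EXEC sp_executesql @sql, N'@tid int', @tid = @tenant_id;
```

The expensive join is only concatenated in when the tenant's configuration needs it, which is exactly the "chop off entire sets of joins" trick; @tid stays a proper parameter throughout, so there's no injection surface in the data values.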
|
# ? Feb 13, 2021 17:04 |
|
Yeah I wouldn’t completely denounce dynamic SQL. It has its place, but typically if you see someone jumping to it and they aren’t experienced it’s probably not a good use case. In my personal experience it has been extremely valuable for reducing the complexity of kitchen sink queries, which I believe I’ve posted about in this thread a few times.
|
# ? Feb 14, 2021 19:26 |
|
Ooh, I've hit for once a classic SQL quiz-style problem. We've got a Players table, a Seasons table, and a Players-Seasons table with each player's score. We need to show a join of all three tables, but the catch is that the currently active Season(s) must always be included, even if the player did not play in them (and so the Players-Seasons record is missing and the Score column is null). Here's the SQL Fiddle if you want to try.
|
# ? Feb 15, 2021 14:26 |
|
NihilCredo posted:Ooh, I've hit for once a classic SQL quiz-style problem. http://sqlfiddle.com/#!18/b0e95/10
|
# ? Feb 15, 2021 14:56 |
|
i made all your columns not null btw because i assumed that was an oversight
|
# ? Feb 15, 2021 14:58 |
|
code:
Alternatively you can union two subqueries (one for the active season, one for older seasons) if you need ~blazing fast performance~ and your players_seasons are super sparse or something.
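For anyone skimming later, a sketch of the cross-join shape being discussed; the column names are guesses from the problem description, not the actual fiddle schema:

```sql
-- every player × every season, keeping rows that either exist in
-- players_seasons or belong to an active season (score NULL if absent)
SELECT p.name AS player, s.name AS season, ps.score
FROM players p
CROSS JOIN seasons s
LEFT JOIN players_seasons ps
       ON ps.player_id = p.player_id
      AND ps.season_id = s.season_id
WHERE s.is_active = 1
   OR ps.player_id IS NOT NULL;
```

The cross join manufactures the missing player/active-season combinations, and the WHERE then discards the manufactured rows for inactive seasons nobody played in.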
|
# ? Feb 15, 2021 15:08 |
|
Thanks y'all. Cross joins are something I only reach for when running diagnostics and need to do some gnarly one-time aggregation; I forgot that they have "legitimate" uses as well. I'll check if it still uses indexes this way though, and if it doesn't I'll probably refactor to the UNION approach. Hammerite posted:i made all your columns not null btw because i assumed that was an oversight Don't worry, it was just a minimal repro, actual tables are totally different. NihilCredo fucked around with this message at 15:39 on Feb 15, 2021
# ? Feb 15, 2021 15:35 |
|
NihilCredo posted:Don't worry, it was just a minimal repro, actual tables are totally different. I meant more that I wasn't willing to go to the trouble of considering whether and how things might be different if all those columns really were nullable. I was offering a solution to the problem I thought you actually wanted a solution to, rather than a solution to the original problem per se.
|
# ? Feb 15, 2021 16:10 |
|
edit: nevermind
galenanorth fucked around with this message at 04:15 on Feb 16, 2021 |
# ? Feb 16, 2021 02:58 |
|
I have MySQL server 8.0.20 installed from the oracle installer on my Mac (Catalina). Intermittently after a reboot MySQL won't start and gets caught up in a start-up loop. I have to edit /etc/my.cnf and add: code:
code:
If I switch back to innodb_force_recovery=0 here then the problem persists. To fix it I have to back everything up, then drop the offending table (since mysqlcheck doesn't show up anything, I have to work out which table is the bad one through trial and error); then I can go back to innodb_force_recovery=0, start up MySQL, and import my backup. It's not after every restart, but about 50% of the time. Does anyone have any idea what's causing this?
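For context, the my.cnf stanza being toggled is presumably something like the following. Levels above 1 get progressively more destructive, so it should be kept as low as still lets the server start, and removed once the data is dumped and restored:

```ini
# /etc/my.cnf - temporary crash-recovery mode; InnoDB refuses most
# writes at higher levels, so dump, fix, then set this back to 0
[mysqld]
innodb_force_recovery=1
```

Intermittent corruption on reboot like this often points at the datadir not being flushed cleanly at shutdown (or hardware/filesystem issues) rather than anything in the schema itself.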
|
# ? Feb 22, 2021 15:33 |
|
I have two questions about PostgreSQL. The first is about jsonpath. https://blog.sql-workbench.eu/post/json-path/ https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE This blog implies that jsonb_path_exists() and jsonb_path_match() will have slower performance than @? and @@ because they cannot be indexed, so the index on the whole jsonb column won't be used. The documentation on jsonb_path_exists_tz() etc., on the other hand, says "Their counterparts are immutable, and so can be used in indexes". Is the first source correct? There appears to be a contradiction, but maybe there is some resolution to it, such as a function being used in an index being different from a function being indexed. The second is about tsvector and tsquery. What is the difference between an anchored text search and a non-anchored text search? My PostgreSQL textbook mentions "PostgreSQL provides a full-text search capability, which is used to overcome SQL pattern matching operators, including LIKE and ILIKE, boosting the performance of the text search. For example, even though an index on text using the text_pattern_op class is supported, this index can't be used to match a non-anchored text search." I don't really get what they mean, and this information is hard to find in a search. galenanorth fucked around with this message at 20:34 on Feb 23, 2021
# ? Feb 23, 2021 16:58 |
|
galenanorth posted:I have a question about PostgreSQL's jsonpath. I haven't dug deep into JSONB indexing in postgres but just looking at section 8.14.4 of the official docs I think the first link is just using confusing terminology and talking about something entirely different. When he says "the biggest drawback of the new JSON path functions is, that they can’t be indexed" I think what he actually means is that the GIN index does not support queries using those operators (or rather, not-operators). Postgres doesn't try to parse the path expression to determine if it's going to result in a condition that could be converted into one of the operators the GIN index does support. It would be less confusing to say that these functions "cannot use the index we just created", IMO. On the other hand, when the postgres docs you linked say "their counterparts are immutable, and so can be used in indexes", what they're referring to is that you can create an index on an expression involving one of those functions. I don't know off the top of my head why or when you'd actually want to do that, but what the note is trying to tell you is that the return value of the _tz variants depend on a server setting that could change at any time and as such you can't create an index on an expression involving them (because if you did and then changed the server timezone the index would contain different results). TheFluff fucked around with this message at 20:23 on Feb 23, 2021 |
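To make that concrete, a sketch against a hypothetical `docs` table with a jsonb `payload` column (the jsonpath operators need Postgres 12+):

```sql
-- a GIN index on the whole jsonb column (default jsonb_ops opclass)
CREATE INDEX docs_payload_gin ON docs USING GIN (payload);

-- @? is in the opclass's operator list, so this query can use the index
SELECT * FROM docs
WHERE payload @? '$.items[*] ? (@.price > 100)';

-- same predicate via the function form: the planner won't use the GIN index
SELECT * FROM docs
WHERE jsonb_path_exists(payload, '$.items[*] ? (@.price > 100)');
```

Both queries return the same rows; only the operator form is matched against the index's supported operators, which is the distinction the blog was (confusingly) making.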
# ? Feb 23, 2021 20:14 |
|
Double posting because you edited to add a second question:galenanorth posted:The second is about tsvector and tsquery. What is the difference between an anchored text search and a non-anchored text search? My PostgreSQL textbook mentions "PostgreSQL provides a full-text search capability, which is used to overcome SQL pattern matching operators, including LIKE and ILIKE, boosting the performance of the text search. For example, even though an index on text using the text_pattern_op class is supported, this index can't be used to match a non-anchored text search." I don't really get what they mean, and this information is hard to find in a search. Anchoring here is used in the regular expression sense, as in anchoring a pattern to the start or end of a string. If you write WHERE someCol LIKE '%something', the pattern is anchored at the end of the string (there's no wildcard there; the pattern is fixed - anchored - to the end of the string and can only match there), but not at the start of the string (anything can go there and it can be arbitrarily long). Since b-tree indexes rely on sorting to work and strings are sorted based on what's at the start of the string, postgres can't use b-tree indexes for LIKE conditions or regular expressions that start with a wildcard - that is, patterns that are not anchored at the start of the string - because it wouldn't know where in the index to start looking. There's some additional caveats as well though, and in general LIKE and regular expressions are very crude tools, so fulltext indexes are usually what you actually want. For the full details see the manual, section 11.2, about index types and what operators they support. e: just to be clear, b-tree indexes are the default in postgres, what you get if you just say CREATE INDEX without specifying an index type. TheFluff fucked around with this message at 00:18 on Feb 24, 2021 |
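Illustrating the anchoring point with a hypothetical `users` table and `email` column:

```sql
-- b-tree with text_pattern_ops: usable for left-anchored LIKE patterns
CREATE INDEX users_email_pattern ON users (email text_pattern_ops);

-- left-anchored: the planner can range-scan the index
SELECT * FROM users WHERE email LIKE 'alice%';

-- not left-anchored: the leading wildcard forces a sequential scan
SELECT * FROM users WHERE email LIKE '%@example.com';
```

The second pattern could match a string of any length with anything at the start, so there's no place in the sorted index to begin looking, which is exactly the textbook's "non-anchored" case.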
# ? Feb 23, 2021 23:54 |
|
Anybody want to hear about a schema design I'm mulling over? Context: Postgres 11 (soon 12, hopefully). The data is, for the purposes of discussion, a simple key-value store. In 99% of cases the same mapping applies to all users; however, users _may_: 1) have no value for a particular key 2) have a different value for a particular key 3) have an additional key-value that only exists for them There are hundreds of thousands of keys, but only a few hundred users. In the first and second case, the different value will usually apply to a significant subset of the users. In the third case, it will usually apply to individual users only. We need to efficiently fetch the key-values for the current user. For the sake of readability, assume all columns are NOT NULL unless I explicitly wrote NULL. Classic design: SQL code:
code:
The former doesn't require a subquery, but the latter has a much simpler SELECT part ("value" is actually multiple columns that would all need coalescing). Alternative design using Postgres array columns: SQL code:
However, enforcing uniqueness becomes pretty tricky. It's also not very extensible - I will not be able to add any user-specific metadata beyond the user_id. To avoid duplicating rows without resorting to array columns, I'd need to keep the values in one column and specify overrides by referring to the record's primary key code:
|
# ? Feb 24, 2021 13:04 |
|
My immediate thought would be to use a separate lookup table for the keys with a NULLable column for a default value. I'd probably have a lookup table anyway foreign keyed to CODES.KEY and use a numeric value for that, rather than straight text. That way you have a definitive list of all possible keys and can rename them down the line without loving with existing rows in CODES.
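As a sketch of that suggestion, with all names hypothetical:

```sql
-- keys live in their own lookup table, with an optional default value
CREATE TABLE code_keys (
  key_id        int  PRIMARY KEY,
  key_name      text NOT NULL UNIQUE,  -- renameable without touching CODES
  default_value text NULL              -- NULL = no default for this key
);

-- per-user rows only where a user deviates from the default
CREATE TABLE codes (
  key_id  int NOT NULL REFERENCES code_keys (key_id),
  user_id int NOT NULL,
  value   text NULL,
  PRIMARY KEY (key_id, user_id)
);
```

With the numeric `key_id` as the foreign key, renaming a key is a one-row update in the lookup table, and the definitive list of all possible keys exists independently of which users happen to have rows.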
|
# ? Feb 24, 2021 14:40 |