|
Is there only one order? Can users customize the order? Is the order actually entirely arbitrary, or are you not telling us about a property of the data that induces the order? Are the items arranged into some manner of subgroups? I ask because it seems questionable to require the display, in random/chosen order, of a list that's 20k elements long. It's not clear to me that those numeric sorting IDs should be a field in that table (as opposed to some other). It's also not clear --- sacrilege here --- why, if this order is so important, it isn't part of the primary key for the thingies (create sequence... increment 100, for example). I can see it going either direction on both questions, depending.
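The "create sequence... increment 100" idea is to leave gaps in the ordering key so a new item can be slotted between two neighbors without renumbering everything else. A minimal sketch (table and column names are invented for illustration, here in SQLite form):

```python
import sqlite3

# Gapped sort keys: positions go 100, 200, 300, ... so an insert between
# two neighbors just takes the midpoint of the gap.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, pos INTEGER UNIQUE)")
conn.executemany("INSERT INTO items (name, pos) VALUES (?, ?)",
                 [("box A", 100), ("box B", 200), ("box C", 300)])

# Insert a new item between A (pos 100) and B (pos 200): no other row moves.
lo, hi = 100, 200
conn.execute("INSERT INTO items (name, pos) VALUES (?, ?)", ("box A2", (lo + hi) // 2))

rows = [name for (name,) in conn.execute("SELECT name FROM items ORDER BY pos")]
print(rows)  # ['box A', 'box A2', 'box B', 'box C']
```

The catch, of course, is that repeated inserts into the same gap eventually exhaust it and force a renumbering pass.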
|
# ? Apr 30, 2018 13:54 |
|
Kind of repeating the poster above me, and only saying this because you said you were new to this stuff: typically the "why" of the ordering is manifest in "the data" itself. Maybe it's who created the record, or modified_date, or store_id, or whatever else, or some combo thereof. I guess I can see the speed/simplicity advantage of just having an abstract "sort_order" field if those criteria are complicated, but unless you have a reason NOT to, I would focus on having the "why" in "the data" and just putting "order by" at the end of your queries.
pangstrom fucked around with this message at 14:23 on Apr 30, 2018 |
# ? Apr 30, 2018 14:21 |
|
PhantomOfTheCopier posted:It's also not clear --- sacrilege here --- why, if this order is so important, that this isn't part of the primary key for the thingies (create sequence... increment 100, for example). So if you change the sort order, you're going to go through and update all the other tables that relate to this with new keys? Surrogate keys are the way to go, for this and many other reasons.
|
# ? Apr 30, 2018 22:17 |
|
PhantomOfTheCopier posted:Is there only one order? Can users customize the order? Is the order actually entirely arbitrary, or are you not telling us about a property of the data that induces the order? Are the items arranged into some manner of subgroups? I ask because it seems questionable to require the display, in random/chosen order, of a list that's 20k elements long. I'm sorry I have to be vague; I hope this makes sense. Each row refers to a physical object in storage, and the order is the order they're in on the shelves. The order is basically arbitrary; there's nothing in the data that could be used to determine it. I can't change the order because there are so many of the things and various documents refer to this order. That said, I occasionally need to change it when objects are rearranged.
|
# ? May 1, 2018 00:57 |
|
Pekinduck posted:I'm sorry I have to be vague I hope this makes sense. Each row refers to a physical object in storage and the order is the order they're in on the shelves. The order is basically arbitrary; there's nothing in the data that could be used to determine it. I can't change the order because there's so many of the things and various documents refer to this order. I need to occasionally change it if objects are rearranged. Are these spaces on shelves well defined? Like every foot of shelf space is a separate "object spot"? If that's the case, make a location table and FK it to your items table with a unique index such that only one item could be in any spot on the shelves, and then enforce a rigorous data-defined sort order on your locations and order by that.
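The location-table suggestion can be sketched like this (schema and names invented for illustration, in SQLite form): locations own the data-defined sort order, and a unique FK guarantees at most one item per spot.

```python
import sqlite3

# Hypothetical schema: a locations table that owns the ordering, and items
# that each claim exactly one location via a UNIQUE foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations (
    id INTEGER PRIMARY KEY,
    shelf INTEGER NOT NULL,
    slot  INTEGER NOT NULL,
    UNIQUE (shelf, slot)                -- rigorous, data-defined sort order
);
CREATE TABLE items (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    location_id INTEGER UNIQUE          -- at most one item per spot
        REFERENCES locations(id)
);
""")
conn.execute("PRAGMA foreign_keys = ON")
conn.executemany("INSERT INTO locations (id, shelf, slot) VALUES (?, ?, ?)",
                 [(1, 1, 1), (2, 1, 2), (3, 2, 1)])
conn.executemany("INSERT INTO items (name, location_id) VALUES (?, ?)",
                 [("vase", 2), ("urn", 1)])

# Ordering comes from the location data, not an abstract sort column.
rows = conn.execute("""
    SELECT i.name FROM items i
    JOIN locations l ON l.id = i.location_id
    ORDER BY l.shelf, l.slot
""").fetchall()
print(rows)  # [('urn',), ('vase',)]
```

Rearranging objects then becomes an UPDATE of `location_id` values rather than a renumbering of the whole table.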
|
# ? May 1, 2018 01:16 |
|
I need a sanity check. Can anyone think of a practical reason to write a query like this: code:
code:
The code that generates the query looks like this: code:
|
# ? May 1, 2018 01:35 |
|
Inner joins nested on other inner joins don’t do anything except potentially confuse the optimizer depending on your dbms. Probably better off doing the latter. Nesting an inner join on another join is typically used when you want to compute that join first. For example, let’s say you have three tables: employees, awards, awardtypes. If you want to get every employee and the awards they’ve received whose frequency is annual, you’d do something like this: code:
Maybe they generalized and built it this way to be accommodating. But for your example you could swap in your new query just fine.
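A sketch of that employees/awards/awardtypes query (hypothetical schema and names; a LEFT JOIN is used on the outside so that "every employee" actually appears, which is the case where computing the nested join first matters):

```python
import sqlite3

# Hypothetical three-table setup for the annual-awards example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE awardtypes (id INTEGER PRIMARY KEY, title TEXT, frequency TEXT);
CREATE TABLE awards     (employee_id INTEGER, awardtype_id INTEGER);
INSERT INTO employees  VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO awardtypes VALUES (1, 'MVP', 'annual'), (2, 'Spot Bonus', 'monthly');
INSERT INTO awards     VALUES (1, 1), (1, 2), (2, 2);
""")

# The awards/awardtypes join (filtered to annual) is computed first, then
# every employee is LEFT JOINed against that result -- employees with no
# annual award still appear, with NULL for the title.
rows = conn.execute("""
    SELECT e.name, at.title
    FROM employees e
    LEFT JOIN (awards a
               INNER JOIN awardtypes at
                       ON at.id = a.awardtype_id AND at.frequency = 'annual')
           ON a.employee_id = e.id
    ORDER BY e.name
""").fetchall()
print(rows)  # [('Ann', 'MVP'), ('Bob', None)]
```

With plain inner joins throughout, the nesting is usually cosmetic and the optimizer is free to reorder, which matches the advice above.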
|
# ? May 1, 2018 07:07 |
|
Re: user-editable ordering in SQL, begriffs recently wrote a good blog post on the subject covering various approaches https://begriffs.com/posts/2018-03-20-user-defined-order.html I also recommend the HN discussion of that post: https://news.ycombinator.com/item?id=16635440
|
# ? May 1, 2018 10:06 |
|
Nth Doctor posted:Are these spaces on shelves well defined? Like every foot of shelf space is a separate "object spot"? If that's the case, make a location table and FK it to your items table with a unique index such that only one item could be in any spot on the shelves, and then enforce a rigorous data-defined sort order on your locations and order by that. Alas no, the objects vary widely in size and are just packed in. The current system is a huge card catalog with the cards in the order the objects are in on the shelves. It's categorized a bit, but only into broad categories. Pekinduck fucked around with this message at 17:58 on May 1, 2018 |
# ? May 1, 2018 17:55 |
|
I suggest renaming the field to something like "shelf_position" then. You aren't sorting the data, you're storing the relative position on the shelf. It's actual data, not metadata about the row ordering.
|
# ? May 1, 2018 18:29 |
|
Am I crazy for toying with the idea of a linked list implementation to define adjacency over the table?
|
# ? May 1, 2018 18:31 |
|
Nth Doctor posted:Am I crazy for toying with the idea of a linked list implementation to define adjacency over the table? Now then, what happens when something is removed? Is it necessary to enforce a pigeonhole principle on the range of values?
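For what it's worth, the linked-list idea can be sketched like this (invented names): removal is a cheap pointer splice, which answers the removal question, but reading the list back in order takes a recursive CTE.

```python
import sqlite3

# Linked-list ordering: each row points at its successor, so "order" is
# adjacency rather than a sortable number.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, next_id INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)",
                 [(1, 'A', 2), (2, 'B', 3), (3, 'C', None)])

# Removing B just splices the pointers -- no renumbering, no pigeonholes...
conn.execute("UPDATE items SET next_id = 3 WHERE id = 1")
conn.execute("DELETE FROM items WHERE id = 2")

# ...but reading the list back in order requires walking the chain.
rows = conn.execute("""
    WITH RECURSIVE chain(id, name, next_id) AS (
        SELECT id, name, next_id FROM items
         WHERE id NOT IN (SELECT next_id FROM items WHERE next_id IS NOT NULL)
        UNION ALL
        SELECT i.id, i.name, i.next_id
          FROM items i JOIN chain c ON i.id = c.next_id
    )
    SELECT name FROM chain
""").fetchall()
print(rows)  # [('A',), ('C',)]
```

The trade-off is that every read of the full order is O(n) sequential pointer-chasing, which is part of why gapped position numbers usually win for display lists.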
|
# ? May 1, 2018 20:13 |
|
add a few columns for width, height, length to measure the items, and also a few columns for xpos, ypos, zpos to represent relative location from a geotagged spot in your store (your choice, just be consistent), and also an item_orientation column storing a vector representation so you know which way it's facing. After you've set that all up, load your items and run some collision validation using a combination of spatial SQL and 3D modeling techniques to make sure all your data makes sense. Then write a machine learning algorithm that chooses the most likely route a person will choose for any given pick list... implement it as a CLR function in SQL. Finally, pass your list of items to the function and sort by pick order. Boom, done!!! Just make sure you work out a process that keeps your SQL tables up to date.
|
# ? May 1, 2018 21:00 |
|
Ruggan posted:add a few columns for width, height, length to measure the items and also a few columns of xpos, ypos, zpos, to represent relative location from a geotagged spot in your store (your choice just be consistent) and also an item_orientation column storing a vector representation so you know what way it’s facing I like this post! Particularly the part where it glosses over the hardest part in the last sentence, lol.
|
# ? May 1, 2018 21:43 |
|
McGlockenshire posted:I suggest renaming the field to something like "shelf_position" then. You aren't sorting the data, you're storing the relative position on the shelf. It's actual data, not metadata about the row ordering. Alright, thanks. Luckily the collection of objects is fairly static; it's the sort of deal where you take one, use it, then put it back where it was. Ruggan posted:add a few columns for width, height, length to measure the items and also a few columns of xpos, ypos, zpos, to represent relative location from a geotagged spot in your store (your choice just be consistent) and also an item_orientation column storing a vector representation so you know what way it’s facing
|
# ? May 2, 2018 00:18 |
|
I have a SQL index that uses the INCLUDE clause. My layman's understanding is that this increases performance, as the columns in the INCLUDE clause are available for read in the index itself, removing the need for a lookup in the main table after finding the records I want. I have to reproduce this index in a SQLite database, but SQLite doesn't seem to support the INCLUDE clause. What are my options here?
|
# ? May 2, 2018 16:44 |
|
LargeHadron posted:I have a SQL index that uses the INCLUDE clause. My layman's understanding is that this increases performance, as the columns in the INCLUDE clause are available for read in the index itself, removing the need for a lookup in the main table after finding the records I want. I have to reproduce this index in a SQLite database, but SQLite doesn't seem to support the INCLUDE clause. What are my options here? Either just make an index with the indexed columns and deal with the probably pretty negligible performance impact (since if you cared that much you probably wouldn't be using SQLite), or add the included columns to the index in some order that makes sense for how they're used.
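The second suggestion can be checked directly in SQLite: a composite index whose trailing columns are the "included" ones behaves as a covering index, and `EXPLAIN QUERY PLAN` will say so. Table and column names here are invented for illustration.

```python
import sqlite3

# SQLite has no INCLUDE clause, but if every column a query touches is in
# the index, SQLite never visits the table at all ("covering index").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INT, total REAL, note TEXT)")
# Instead of: CREATE INDEX ... ON orders (customer_id) INCLUDE (total)
conn.execute("CREATE INDEX idx_orders_cust ON orders (customer_id, total)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT total FROM orders WHERE customer_id = ?", (42,)
).fetchall()
print(plan[0][-1])  # detail text mentions "USING COVERING INDEX idx_orders_cust"
```

One difference from a true INCLUDE: the trailing columns are part of the index key, so they participate in ordering and key comparisons, making the index slightly larger and writes slightly costlier than SQL Server's included-column version.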
|
# ? May 2, 2018 17:24 |
|
Our system has a query, which I've been trying to optimize, that runs a lot to drive a part of our UI that's central to the daily work of a good chunk of the business. Mostly, it's gone pretty well, but the other morning, SQL Server poo poo out a Bad Plan. Here's the diagram of the relevant section of that Bad Plan: That's an inner join on two tables with a one-to-many relationship that just aggregates the count of the many side. It's not a real foreign key relationship, which is probably why this is happening, but actually fixing this schema is Out of Scope, whereas, say, adding the right query hint to keep it limping along is In Scope. Sorry, I know that's triggering and, hell, it bugs the poo poo out of me, but that's life ATM. Here's the diagram of the same part of the good plan: GoodPlan.Object4 == BadPlan.Object3 is the Many side of the relationship GoodPlan.Object3 == BadPlan.Object2 is the One side (sorry about that - it's SentryOne Plan Explorer's anonymized output) The Bad Plan does use a different index that's probably useless because it's basically 'not deleted', which is most of the table, but hinting it to the index used in the Good Plan did not change the merge strategy downstream. That is, it still read a bunch of the table and did the aggregate on that rather than merging and then doing the aggregate on the much smaller intermediate result set. Is there anything I can do to get it to consistently pick the merge strategy from the Good Plan? e: OPTION (LOOP JOIN) seems to be helping a lot 🤞 Munkeymon fucked around with this message at 20:32 on May 2, 2018 |
# ? May 2, 2018 17:34 |
|
What are your predicates on the table that is a scan in the bad diagram? You basically need to find out why the optimizer is preferring a table scan to a seek. If you can figure that out and fix it your performance should return.
|
# ? May 2, 2018 23:19 |
|
Ruggan posted:What are your predicates on the table that is a scan in the bad diagram? The only predicate was that IsDeleted column, for some reason. LOOP JOIN actually improved other parts of the plan I was going to work on later, where it was doing similar over-reads but with better indexes on smaller tables, too. I'm also going to drop that index, because it just can't be that useful to index on a binary that includes most of the table in the overwhelmingly common case. Well, of course it's loving nullable, so trinary, I guess
|
# ? May 3, 2018 14:02 |
|
Is it a filtered index (index with a where clause)? Those are actually super useful because they let the optimizer avoid needing to analyze the table for that filter criteria. The classic example is yours exactly - some stupid flag that is always filtered in the same way for 90% of use cases (like employee.isActive = 1). Is your query basically: code:
FROM A INNER JOIN B ON A.id = B.id WHERE A.IsDeleted = 0
If so, a filtered index on A with the same where clause is probably appropriate. Even better if it is an index on whatever that join key is. Then it will almost certainly do an efficient operation.
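The filtered-index idea can be demonstrated with SQLite's equivalent, partial indexes (table and index names invented; in SQL Server the DDL would be along the lines of `CREATE INDEX idx_a_live ON A (b_id) WHERE IsDeleted = 0`):

```python
import sqlite3

# Partial index: only rows with IsDeleted = 0 are in the index at all, so
# the 'live rows' case never has to consider deleted rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, b_id INT, IsDeleted INT)")
conn.execute("CREATE INDEX idx_a_live ON A (b_id) WHERE IsDeleted = 0")

# Because the query repeats the index's WHERE clause, the optimizer can
# prove the partial index applies and use it for the seek.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT id FROM A WHERE IsDeleted = 0 AND b_id = ?
""", (7,)).fetchall()
print(plan[0][-1])  # detail text mentions idx_a_live
```

A nice side effect for a mostly-not-deleted table: the partial index stays small, since the deleted majority (or minority, depending on the table) that doesn't match the predicate is simply absent from it.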
|
# ? May 3, 2018 23:03 |
|
One last thing. You absolutely need an index on the many side of that one-to-many on whatever your join key is. That is what allows the optimizer to do a seek to find all matches in the many side from the one side. What I can gather from your original pics of the query plan is that the optimizer could not effectively seek the appropriate records so instead it calculated the aggregate for the ENTIRE table (which involves an implicit sort on the grouper) and then joined that to your other table by picking out the then sorted records.
|
# ? May 3, 2018 23:09 |
|
There is an index on the join key - it'd be something like B.aId in your example, but the optimizer used a different, unrelated one on B.isDeleted instead*. A.id is a real primary key, but I made a covering index for A on the parts of the where clause that apply to it, so it's good to go. (B is Object4 in the Good Plan and Object3 in the Bad Plan; A is Object3 in the Good and Object2 in the Bad - the anonymization wasn't consistent, again: sorry for the confusion!) Filtering the index on isDeleted is a good idea. I'll try that and see how it shakes out. * I'm not sure why it's even there - I think a lot of indexes have been haphazardly applied over the years because "it made the plan look good in SSMS one time" Munkeymon fucked around with this message at 15:01 on May 4, 2018 |
# ? May 4, 2018 13:50 |
|
I love to concatenate location, date, and transaction ID columns to make unique identifiers, because the $30k Verifone POS reuses transaction IDs from day to day E: I even more love to be doing it because the software lacks a specific functionality that sales assured us it had and was the main reason we bought it Shugojin fucked around with this message at 13:54 on May 11, 2018 |
# ? May 11, 2018 13:51 |
|
I'm an SQL moron. I'm using MySQL and used AES_ENCRYPT just fine to store passwords in my database. I'm trying to retrieve them. I got the following to work, but that returns the blob rather than the password. SQL code:
SQL code:
edit : Got it. SQL code:
Sad Panda fucked around with this message at 14:51 on May 14, 2018 |
# ? May 14, 2018 14:45 |
|
Sad Panda posted:I'm an SQL moron. I'm using MySQL and used AES_ENCRYPT just fine to store passwords in my database. I'm trying to retreive them. You... you do know this is a thing that you should never, ever, ever do under ordinary circumstances, right? I mean, I assume there are real life scenarios where there are good reasons for doing this, but normally when you store passwords you always store them in non-retrievable form, and you didn't specifically say "I know what I'm doing, don't get on my case"...
|
# ? May 14, 2018 16:02 |
|
On top of "don't do that," MySQL AES_ENCRYPT uses a non-standard padding method that will cause problems if you can't use AES_DECRYPT later (read: need to decrypt the data in your own code). Also, I hope you aren't using replication, because the plaintext and encryption key are gonna be in your replication logs now. So yes, please don't do that. It's bad and wrong and a giant loving security threat. Using two-way encryption on passwords is never the right choice. Ever. Even if the specification says the password must be retrievable, the specification is wrong. Do not do it that way, ever. Use a one-way hash. The current industry standard is bcrypt, with implementations available for every single language and platform. Another adequate solution is Argon2, the winner of the Password Hashing Competition, but implementations are slightly more rare. If you feel like being weird, you can use PBKDF2, which hasn't actually been recommended for password storage, but implementations are easy and everywhere. McGlockenshire fucked around with this message at 18:50 on May 14, 2018 |
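Of the options named above, PBKDF2 is the only one in Python's standard library, so a minimal sketch of one-way storage might look like this (iteration count and helper names are my own choices, not from the thread):

```python
import hashlib
import hmac
import os

# One-way password storage: store (salt, digest), never the password.
# 600,000 iterations of PBKDF2-HMAC-SHA256 is a commonly cited ballpark.
def hash_password(password):
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password, salt, digest):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return hmac.compare_digest(candidate, digest)  # constant-time compare

salt, digest = hash_password("hunter2")
print(verify_password("hunter2", salt, digest))   # True
print(verify_password("hunter3", salt, digest))   # False
```

Note this only fits the normal case of verifying your own users' logins; it cannot help when you genuinely need the plaintext back, as in the scraper use case that comes up below.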
# ? May 14, 2018 18:48 |
|
What company do you work for I would like to know what other insane security holes you guys must have
|
# ? May 14, 2018 19:39 |
|
It's a personal project. I'm creating a Python application that stores my login information for a bunch of websites in a database. It retrieves the requested login information from the database, and logs into the website to scrape some information from it. The MySQL database is stored locally, and the Python is too. What's the security threat other than someone getting my Mac? If there's a better option then I'm all ears.
|
# ? May 14, 2018 21:44 |
|
Sad Panda posted:It's a personal project. I'm creating a Python application that stores my login information for a bunch of websites in a database. It retrives the requested login information from the database, and logs into the website to scrape some information from it. If you are only going to access the MySQL server via programs running on the machine itself, then make sure it is configured to only accept connections from localhost. That might be the default. maybe relevant: https://stackoverflow.com/questions...=google_rich_qa
|
# ? May 14, 2018 21:59 |
|
also, lol
|
# ? May 14, 2018 22:02 |
|
Hammerite posted:If you are only going to access the MySQL server via programs running on the machine itself, then make sure it is configured to only accept connections from localhost. That might be the default. Thanks. Googling makes it sound like it's the default. I edited /etc/my.cnf to include code:
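For reference, the localhost-only setting usually looks something like this in my.cnf (section name and option spelling can vary by MySQL version and packaging, so treat this as a sketch):

```ini
[mysqld]
# Listen only on the loopback interface; remote hosts can't connect at all.
bind-address = 127.0.0.1
# Or go further and disable TCP entirely (local socket connections only):
# skip-networking
```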
|
# ? May 14, 2018 23:48 |
|
It doesn't matter. Do it the right way. Bad routines breed future failures: https://it.slashdot.org/story/18/05/02/0613205/github-accidentally-exposes-some-plaintext-passwords-in-its-internal-logs
|
# ? May 15, 2018 01:08 |
|
Sad Panda posted:It's a personal project. I'm creating a Python application that stores my login information for a bunch of websites in a database. It retrives the requested login information from the database, and logs into the website to scrape some information from it. OAuth, if the websites support it, which I assume they don't. So you get to store passwords and watch people freak out when you ask for help with the only way to solve your problem.
|
# ? May 15, 2018 03:49 |
|
Freaking out is the correct default response to storing the passwords like that.
|
# ? May 15, 2018 06:14 |
|
Sad Panda posted:If there's a better option then I'm all ears. In addition to what the others have said, at least push down the encryption stuff to the application layer instead of doing it in the database, and decrypt to the plaintext at the last practical moment. As I mentioned above, the builtin "AES" stuff is not actually AES the way other things understand AES.
|
# ? May 15, 2018 17:17 |
|
Just get your passwords out of the Keychain, the secure password storage database that comes with your OS? https://gist.github.com/gcollazo/9434580
|
# ? May 15, 2018 19:00 |
|
Unrelated: we recently migrated from on-prem to an AWS instance for our production database. It probably could have been handled better, but I can't say that for sure because I wasn't involved much until D-day, when I spent 12 hours helping patch up holes an incomplete replication left and adding indexes because the cloud server's performance was so different for some reason... *looks pointedly at 'max server memory' setting*. On second thought, maybe I can say that for sure after all. (<- figured I'd spare people my griping if they don't want it) Today I found a bunch of records in a VARBINARY(MAX) column were truncated to exactly 32768 in length, and all of them were last updated before the migration. Weird! It hasn't caused any issues because the data in those columns had already been consumed and represented elsewhere - they're just temporary backup data. Still, I'd like to know what did that. I suspect it was replication, but I can't find any confirmation that replication can do that. The logs from when we migrated have rotated out, and the old production server is... IDK - in some infrastructure guy's basement now? The backups I can get to with my access level have the same truncated old data, but whole newer data. I just wanna know if this is A Thing and my google-fu is failing me. Or I'm wrong about the cause.
|
# ? May 15, 2018 19:41 |
Munkeymon posted:Just get your passwords out of the Keychain, the secure password storage database that comes with your OS? https://gist.github.com/gcollazo/9434580 Bet the specification for Keychain that said passwords must be retrievable wasn't quite wrong
|
|
# ? May 17, 2018 22:45 |
|
Does anyone have a good link for quick-and-dirty, Cliffs Notes-style tips for formatting a spool in Oracle DB?
|
# ? May 22, 2018 15:47 |