|
Aww... I just got smacked by this:

SQL code:

https://docs.microsoft.com/en-us/sql/t-sql/functions/substring-transact-sql?view=sql-server-ver16 posted: SUBSTRING ( expression , start , length )

Unexpected that substring('blah', 0, 3) gives you a shorter result than substring('blah', 1, 3), and the doc for the start arg is pretty numbing.
|
# ? Jun 23, 2022 14:29 |
|
|
epswing posted: Aww... I just got smacked by this:

One of those cases where a group of illustrative examples would be a lot more helpful to most readers than explaining in technically correct detail what the behaviour is.

e: hello Cloudflare
|
# ? Jun 23, 2022 15:00 |
|
I knew that SUBSTRING was 1-indexed, and I thought it a little unfortunate but pretty common for old languages. Since it's most commonly used with fixed values, no big deal, really. I did not know about this:

quote: In this case, the number of characters that are returned is [..] start + length - 1

and it seems utterly loving insane to me. Why on Earth "length - 1" with 1-indexed positions? What sort of possible example can you come up with where it makes sense?
|
# ? Jun 23, 2022 15:39 |
When the index is less than 1, imagine the starting position moving further back, behind the start of the string, and then consuming one unit of length for each virtual character before the start of the string. It just turns SUBSTRING into a window over the string:
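That window model is easy to sketch in Python. This mimics the documented T-SQL behavior for start values below 1 (the helper name is mine, and it ignores SQL Server's error cases like negative length):

```python
def sql_substring(s: str, start: int, length: int) -> str:
    """Mimic T-SQL SUBSTRING(s, start, length), which is 1-indexed.

    The window covers positions start .. start + length - 1; positions
    before 1 exist only "virtually" and consume length without
    producing characters.
    """
    begin = max(start, 1)          # clamp the window to the real string
    end = start + length           # one past the last position in the window
    if end <= begin:
        return ""
    return s[begin - 1 : end - 1]  # convert 1-indexed positions to a slice

print(sql_substring("blah", 1, 3))   # bla
print(sql_substring("blah", 0, 3))   # bl  (only 0 + 3 - 1 = 2 chars survive)
print(sql_substring("blah", -2, 3))  # ""  (the window ends before the string starts)
```

So substring('blah', 0, 3) really does return fewer characters than substring('blah', 1, 3): one unit of the length budget is spent on the virtual position 0.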
|
|
# ? Jun 23, 2022 15:47 |
|
Hey all, it's been a while since I messed around with Postgres and I seem to have forgotten some basic logic. Hoping for some help. I need to make a simple database that does the following:

a table for videos that has: id (serial, PK), video-title, video-link
a table for users that has: id (serial, PK), first-name, last-name, associated-videos

Now, to make an association between these I believe it's good theory to create another table that hosts all the associations, right? So:

table video-to-user: id (serial, PK), linked-video, linked-users

I guess my issue is how do I properly set up that third table? I know "id" would be the PK, but how would I make linked-users "work"? That's the part I've forgotten, lol. I think it'd be a foreign key, right, but I can't remember how to have an array in Postgres be a foreign key for a single serial field? Thanks for the help
|
# ? Jun 23, 2022 18:05 |
|
worms butthole guy posted: Hey all, it's been a while since I messed around with Postgres and I seem to have forgotten some basic logic. Hoping for some help.

Is "linked-users" in the plural a typo, or...? Each row of the association/junction table should relate to exactly one user and exactly one video. linked-video is a foreign key to the videos table and linked-user is a foreign key to the users table. Not sure what the "associated-videos" column in the users table is for. The junction table is responsible for associating users and videos; there shouldn't be anything in the users table about that.
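A minimal sketch of that layout, run through SQLite from Python for illustration (Postgres would use SERIAL/IDENTITY for the ids, and the underscored names are my stand-ins for the hyphenated ones in the post):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE video (
    id          INTEGER PRIMARY KEY,  -- SERIAL / IDENTITY in Postgres
    video_title TEXT,
    video_link  TEXT
);
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name  TEXT
);
-- junction table: one row per (user, video) pair; each FK column holds
-- a single id, so no array-typed foreign keys are needed
CREATE TABLE video_to_user (
    id           INTEGER PRIMARY KEY,
    linked_video INTEGER NOT NULL REFERENCES video(id),
    linked_user  INTEGER NOT NULL REFERENCES users(id)
);
""")
con.execute("INSERT INTO video VALUES (1, 'intro', 'http://example.com/v/1')")
con.execute("INSERT INTO users VALUES (1, 'Some', 'User')")
con.execute("INSERT INTO video_to_user (linked_video, linked_user) VALUES (1, 1)")

# all videos associated with user 1, via the junction table
rows = con.execute("""
    SELECT v.video_title
    FROM video v
    JOIN video_to_user vu ON vu.linked_video = v.id
    WHERE vu.linked_user = 1
""").fetchall()
print(rows)  # [('intro',)]
```

A user with three videos is just three rows in video_to_user, which is why the plural "linked-users" column (and the array idea) isn't needed.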
|
# ? Jun 23, 2022 18:49 |
|
Hammerite posted: Is "linked-users" in the plural a typo or...? Each row of the association/junction table should relate to exactly one user and exactly one video. linked-video is a foreign key to the videos table and linked-user is a foreign key to the users table.

Yeah, I forgot about junction tables, to be honest. I ended up turning it into this: Does this seem pretty "normal" as a setup for this simple problem? Thanks
|
# ? Jun 23, 2022 18:55 |
|
worms butthole guy posted: Yeah I forgot about junction tables to be honest. I ended up turning it into this:

There's nothing obviously wrong with it that I can see
|
# ? Jun 23, 2022 19:00 |
|
Wooo, I remembered a bunch then, thank you
|
# ? Jun 23, 2022 19:03 |
The couple of int8 types in that diagram seem to be an error. You'd run into trouble if you don't fix those to match the referenced key's type.
|
|
# ? Jun 23, 2022 19:16 |
|
Yeah, noticed that too; DBeaver keeps making them int8s instead of the bigints I assign, oddly. Don't know why.
|
# ? Jun 23, 2022 19:26 |
|
So you can have a topic associated with a video associated with a user without that topic being "associated with the user"?
|
# ? Jun 24, 2022 02:32 |
|
I guess my idea behind it was that you could query a topic by user and then see videos that are associated with that topic by inner joining the video library? This way videos can have multiple topics also
|
# ? Jun 24, 2022 02:41 |
|
I made a more traditional version of it and styled it off the actual data formats we'll use (so we're limited to int, string, date instead of the usual db ones) and reduced some of the tables. I haven't done one of these in about 1.5 years, so I'm wondering if my crow's feet are correct in this diagram. Again, thank you all very very much for any help. Here's what I put together.

So the idea is:

Video Library:
- a video can have one or many genres attached to it
- a video can have one and only one age rating attached to it
- a video can have zero or many contacts associated with it
- a video can have zero or many 5-point ratings associated with it

Genres:
- a genre can have zero or many videos associated with it

Ratings:
- an age rating can have zero or many videos associated with it

Video_Ratings:
- a video rating can have one and only one video associated with it
- a video rating can have one and only one contact ID associated with it

Contact:
- a contact can have zero or many video ratings associated with it
- a contact can have zero or one video associated with it via last watched (I need to fix this on the diagram)

Is this correct? I'm going to cram this weekend on crow's feet notation, never personally used it.

edit: Uploaded wrong image, fixed

worms butthole guy fucked around with this message at 14:45 on Jun 24, 2022 |
# ? Jun 24, 2022 14:18 |
|
For your many-to-many relationship between video and genre you need a junction table, not a FK to the genre table, as that would only allow one genre per video. Also, you're combining singular and plural table names; pick either singular or plural. Personally I prefer singular, because every table can contain multiple rows/entities and saying "video table" already indicates plurality, so making the table name plural is just unnecessary IMO. "Video_Library" is also just another form of pluralizing the table name. I'd name them:

video
genre
video_genre
age_rating
rating (or review_score, review_rating, contact_rating)
contact

e: also, to avoid age_rating.age_rating I'd change that column name to `name` or `description`.

RandomBlue fucked around with this message at 16:27 on Jun 24, 2022 |
# ? Jun 24, 2022 16:23 |
|
Sweet, thank you very much. I ended up tweaking it a bit and I think I understood what you meant by removing the FK so it can have multiples. So:

Video:
- can have one and only one rating
- can have zero or many watch records
- can have zero or many contacts associated with it via last_video_watched
- can have one or many genres associated with it

Rating:
- can have zero or many videos assigned to it

Genre:
- can have zero or many genre assignments to it (does that make sense?)

Contact:
- can have zero or one movie assigned to it (last_video_watched)
- can have zero or many watch records

Video Genre:
- can have one and only one video assigned to a record
- can have one and only one genre assigned to it

I think that covers it! Again, thanks for the help; this has been pretty fun also, tbh.
|
# ? Jun 24, 2022 16:47 |
|
Junction tables are a case where using a composite PK is generally acceptable. Typically you wouldn't have a separate surrogate PK there; you'd have a composite PK made up of `genre_id` and `video_id`, which are also FKs to the other tables.
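A sketch of that shape (SQLite syntax via Python for illustration, using the table names suggested earlier in the thread). A side benefit of the composite PK is that duplicate associations are rejected for free:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE video (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE genre (id INTEGER PRIMARY KEY, name TEXT);
-- no surrogate id: the (video_id, genre_id) pair IS the key
CREATE TABLE video_genre (
    video_id INTEGER NOT NULL REFERENCES video(id),
    genre_id INTEGER NOT NULL REFERENCES genre(id),
    PRIMARY KEY (video_id, genre_id)
);
""")
con.execute("INSERT INTO video VALUES (1, 'some video')")
con.execute("INSERT INTO genre VALUES (1, 'comedy')")
con.execute("INSERT INTO video_genre VALUES (1, 1)")

try:
    con.execute("INSERT INTO video_genre VALUES (1, 1)")  # duplicate pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print("duplicate association rejected:", duplicate_rejected)  # True
```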
|
# ? Jun 24, 2022 17:59 |
|
Awesome, thank you. I figured that would be an option but played it safe!
|
# ? Jun 24, 2022 18:08 |
|
worms butthole guy posted: Sweet, thank you very much. I ended up tweaking it a bit and I think I understood what you meant by removing the FK so it can have multiples.

Don't store the last_video_watched column on the Contact record. You're denormalizing data that's already in the Video_Watched table. Other than that, from a schema standpoint it looks good to me.
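The "last watched" fact can be derived from the watch-record table on demand instead. A sketch with SQLite and assumed table/column names based on the discussion (note the bare video_id alongside MAX() picking the max row is a documented SQLite shortcut; in Postgres you'd use DISTINCT ON or a window function):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE video_watched (
    contact_id INTEGER,
    video_id   INTEGER,
    watched_at TEXT              -- ISO-8601 timestamp
);
INSERT INTO video_watched VALUES
    (1, 10, '2022-06-24 10:00:00'),
    (1, 11, '2022-06-25 09:30:00'),
    (2, 10, '2022-06-23 08:00:00');
""")
# latest watch per contact, derived on demand -- nothing to keep in sync
rows = con.execute("""
    SELECT contact_id, video_id, MAX(watched_at)
    FROM video_watched
    GROUP BY contact_id
    ORDER BY contact_id
""").fetchall()
print(rows)  # [(1, 11, '2022-06-25 09:30:00'), (2, 10, '2022-06-23 08:00:00')]
```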
|
# ? Jun 25, 2022 06:04 |
|
This seems like a simple thing but I'm having a brain fart right now. I have the following tables: Commodities Locations Commodity_Location (many to many) The Commodity_Location table has a boolean column "inactive". I want to get only commodities that do not have any active locations. If even one active location exists, don't select it. Is this possible without a subquery?
|
# ? Jun 29, 2022 17:25 |
|
If you don't wanna do a NOT EXISTS subquery, you can LEFT JOIN the active many-to-many entries, COUNT them, and filter for entries where that count is 0, kinda like this: code:
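A reconstruction of that pattern (SQLite via Python; table and column names taken from the question, sample data made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE commodities (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE commodity_location (
    commodity_id INTEGER,
    location_id  INTEGER,
    inactive     INTEGER          -- boolean flag
);
INSERT INTO commodities VALUES (1, 'gold'), (2, 'wheat'), (3, 'oil');
INSERT INTO commodity_location VALUES
    (1, 100, 0),   -- gold has an active location
    (2, 100, 1),   -- wheat: inactive locations only
    (2, 200, 1);
    -- oil has no locations at all
""")
# keep commodities with zero *active* locations: the join condition
# filters to active rows, so commodities with none survive as NULLs
rows = con.execute("""
    SELECT c.id, c.name
    FROM commodities c
    LEFT JOIN commodity_location cl
           ON cl.commodity_id = c.id AND cl.inactive = 0
    GROUP BY c.id, c.name
    HAVING COUNT(cl.commodity_id) = 0
    ORDER BY c.id
""").fetchall()
print(rows)  # [(2, 'wheat'), (3, 'oil')]
```

The `inactive = 0` test belongs in the ON clause, not the WHERE clause; in WHERE it would discard the NULL-extended rows and defeat the LEFT JOIN.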
|
# ? Jun 29, 2022 17:32 |
|
Could do a CTE to find all of the commodities with an active location, then use that to exclude those commodities.
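The CTE version of that filter might look like this (SQLite via Python, same assumed schema as the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE commodities (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE commodity_location (commodity_id INTEGER, inactive INTEGER);
INSERT INTO commodities VALUES (1, 'gold'), (2, 'wheat');
INSERT INTO commodity_location VALUES (1, 0), (2, 1);
""")
rows = con.execute("""
    WITH has_active AS (          -- commodities with >= 1 active location
        SELECT DISTINCT commodity_id
        FROM commodity_location
        WHERE inactive = 0
    )
    SELECT c.id, c.name
    FROM commodities c
    WHERE c.id NOT IN (SELECT commodity_id FROM has_active)
""").fetchall()
print(rows)  # [(2, 'wheat')]
```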
abelwingnut fucked around with this message at 17:50 on Jun 29, 2022 |
# ? Jun 29, 2022 17:47 |
|
Thanks, I ended up using a NOT EXISTS subquery. I'd also never used CTEs before.
|
# ? Jun 29, 2022 17:57 |
|
Does anyone have a good resource for learning indexes: how they work, what they do, how to effectively use them? I'm working on a personal project and am about done with the backend and DB, but I'm still a beginner with DBs, so I just learned that indexes are a thing and I'd love to read over a dummy's how-to guide if there's a good one out there. Specifically for MySQL, if it matters.
|
# ? Jul 7, 2022 21:49 |
|
teen phone cutie posted: Does anyone have a good resource for learning indexes, how they work, what they do, how to effectively use them? I'm working a personal project and am about done with the backend and DB, but I'm still a beginner with DBs so I just learned that indexes are a thing and I'd love to read over a dummy's how-to guide if there's a good one out there.

https://use-the-index-luke.com/. It's not laid out like it, but it's a full-rear end book.
|
# ? Jul 7, 2022 22:22 |
|
raminasi posted: https://use-the-index-luke.com/. It's not laid out like it, but it's a full-rear end book.

As a quick overview of features supported, something like https://en.m.wikipedia.org/wiki/Comparison_of_relational_database_management_systems has a section on indexes. (I only observe that it provides a concise list, and leave the "A is clearly superior to B" for another day.)
|
# ? Jul 7, 2022 22:49 |
|
Looking at migrating a fairly large production database from MySQL to Postgres, any pitfalls to be aware of? 95% of the code using it is with an ORM.
|
# ? Jul 10, 2022 05:30 |
|
prom candy posted: Looking at migrating a fairly large production database from MySQL to Postgres, any pitfalls to be aware of? 95% of the code using it is with an ORM.

Some things you'll find:

Oracle and PG have similar command line / client support. MySQL is different. I've worked with a ton of people over the decades who can't do anything without their GUI MySQL client, but that's not really a tool that translates well.

ORMs are likely to screw up. E.g. they'll likely stuff network addresses into a text field "for portability". PG has a network type that supports inet4 and inet6; not using it could degrade performance. Same with uuid fields, json, geometric data, etc. Amusingly, I've seen many more custom queries in MySQL because ORMs yield queries that can be optimized automatically in the PG planner but that miss indexes in MySQL without straight joins, hints, and other workarounds.

PG heavily leverages foreign key relationships, and table joins are much more efficient. By contrast, software engineers tend toward "flatness" in MySQL, so you may find tables aren't adequately normalized after you migrate.

The process model is completely different. MySQL is designed for highly threaded, read-heavy access. PG is optimized for process-based read/write access. Your workload may demand more focus on read replicas in PG. A misconfigured ORM library can cause a mess (since for MySQL it'll bumble forward until it yields lock contention and data loss, whereas for PostgreSQL it may just look like the db is nonresponsive because the ORM has created the lock contention).

MySQL heavily leverages "business logic must be in the application" approaches. PostgreSQL has much better support for "business logic should be independent of application" approaches.

If your plan is "stuff it in PostgreSQL, find it's slower, go back to MySQL", then yeap, that's just calendar time. "Migrate to PostgreSQL, fix our data model, data security issues, and application limitations, and improve our understanding of database design" is much more likely to be successful. There are articles online of major companies switching to PostgreSQL (and some that went back to MySQL). But I've only been using both for 15 yr or so in various production settings and am still biased, haha.
|
# ? Jul 10, 2022 08:25 |
|
PhantomOfTheCopier posted: They're... rather different. Of course they both support "queries and relational data" but that's where the similarity ends. (Forgive me if this sounds biased but) MySQL has spent decades optimizing toward ORM-heavy applications where little to no thought is given to data structure or optimization except when there's a problem (operational or cost), and for which the only solution is "maybe add an index". It mostly succeeds because hardware is fast enough and there's sufficient memory for query/file buffers to mask design issues. PostgreSQL has spent decades developing ironclad features and functionality of a fully production capable object relational database, which means both that it supports the basics but is highly optimized for proper design.

Thanks for the writeup! I've actually used Postgres a fair bit but I still didn't know most of this stuff. 95% of my interaction with MySQL and Postgres has been through Rails' ActiveRecord ORM, and Postgres has been the Rails default for years and years now, so I'm hoping the ORM will mostly be doing what I want it to do after the switch. Our database is heavily denormalized and our queries use a lot of joins (and a lot of WHERE EXISTS queries now that this thread pointed out how good those can be for filtering).

One of the major reasons I want to switch is that I'm finding migrations are really slow. Adding a column to tables with a couple million records takes too long. I've also struggled with a couple of slow queries, and from my reading/time on StackOverflow I think that Postgres would mostly do a much better job of executing those queries. I think Rails has always really obfuscated the database, and if I hadn't spent a few years working with PHP before starting on Rails I probably would barely know how to hand-write queries at all. The idea of business logic in the database is completely foreign in the Rails world, so that's something I'll have to check out.

I'd like to get rid of Rails completely eventually, but I think that might be a pipe dream at this company. I occasionally use a MySQL GUI client but mainly just for importing/exporting faster and for a nicer experience when I'm writing complex queries. I definitely don't need it. From what you've said it sounds like our data model is probably already a good candidate to go to Postgres, so the bigger questions are going to be around how I can cut it over in production with no data loss and minimal downtime... which is probably a question for the DevOps thread.
|
# ? Jul 10, 2022 14:53 |
All that is true, but why is the PG CLI client like some kind of emacs-style relic from the 70s where you have to memorize an esoteric list of backslash commands?
|
|
# ? Jul 10, 2022 23:17 |
|
That's interesting. I find it rather consistent that \d works for everything, and in the rare case that I need something atypical, usually the command is the first letter (index, function, type), albeit with some case sensitivity fun at times. Honestly I don't have much complaint about having to look up \a or weird formatting flags when I need them, because psql actually offers them.

In the mysql client I average two or three attempts per command because I can never remember if it's show/tables/like or show/index/from or show/indexes/for or some other plural+preposition combination out of six or eight. (ps, and when someone shares their screen it's always a different command, eg show create table...). Output I find either unreadable because it's 2k characters wide wrapped with center/alignment, or unreadable because it's 2k lines tall with no pager, so then you need a script to turn query results into something manageable.

I've also never, ever seen a production mysql environment where the creator decided to retain the client assist features. "Oh we can't have -A because it takes forever to start and/or impacts the database" (and queries against information_schema might block while they rebuild) (which maybe they've fixed, but not in one of the production systems I've been using recently). In psql tab completion just works, versus mysql where I usually have to write queries somewhere else and individually look up every. single. field. I use savepoints for lots of ddl and functional testing as well, so that's nice to have.

Shrug, biased or something. As you suggest, it may be as idiosyncratic as vim versus emacs.

PhantomOfTheCopier fucked around with this message at 08:35 on Jul 11, 2022 |
# ? Jul 11, 2022 08:32 |
|
Data Graham posted: All that is true but why is the PG CLI client like some kind of emacs-style relic from the 70s where you have to memorize an esoteric list of backslash commands.

That idiom is pretty common in the SQL world; just wait until you encounter SQL*Plus.
|
# ? Jul 11, 2022 16:09 |
And yeah, I'm sure a lot of it is as simple as "whatever you learned first is what seems natural". But I always find myself lost digging through the \? listing and trying to figure out the distinction between \dS and \dS+, when all I want for troubleshooting is a command like "describe foo" or "show tables".

e: but long ago I learned how dangerous it is to use the CLI client for anything, when you can fat-finger a query and drop a table in prod or whatever. GUI clients, which I pooh-poohed as toys at one time, are now just basic best practice

Data Graham fucked around with this message at 16:29 on Jul 11, 2022 |
|
# ? Jul 11, 2022 16:27 |
|
Data Graham posted: GUI clients which I pooh-poohed as toys at one time are now just basic best practice

Not knowing what GUI support for databases looks like, I can only imagine the horrors.
|
# ? Jul 11, 2022 20:01 |
|
PhantomOfTheCopier posted: Let me count the number of systems I've had to repair where someone accidentally dragged directories inside each other. "Oops windows/ is now inside my documents!"

I think someone using a database GUI might be a different type of user than someone who drags Windows into My Documents
|
# ? Jul 11, 2022 21:12 |
|
working from the basic precept that combinatorial blowout with 2+ collections is worse than some JSON_AGG() string work... why is JDBC still a thing? Why would you not flatten one-to-many tables into a myChildren : [123, 456, ...] JSON representation with uuids or consts, and then remap them in client hardware? Combinatorial blowout ain't free in terms of memory either. The jodbp protocol should be:

Why are collections and maps still such a loving antipattern in JDBC? Not even SQL itself, but actually JDBC just doesn't work right when every permutation of every child object is a row. Free the uuid_arr, basically (oh god, postgres is standing right behind me, isn't it)
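For what it's worth, the aggregate-the-children idea works in plain SQL today. A sketch using SQLite's json_group_array (Postgres would use json_agg; this needs a SQLite build with the JSON functions enabled, which is standard in recent Python builds, and all names here are made up):

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE child  (id INTEGER PRIMARY KEY, parent_id INTEGER, val TEXT);
INSERT INTO parent VALUES (1, 'a'), (2, 'b');
INSERT INTO child  VALUES (1, 1, 'x'), (2, 1, 'y'), (3, 2, 'z');
""")
# one row per parent with the children folded into a JSON array --
# no row-count blowout from the one-to-many join
rows = con.execute("""
    SELECT p.id, p.name, json_group_array(c.val) AS children
    FROM parent p
    JOIN child c ON c.parent_id = p.id
    GROUP BY p.id, p.name
    ORDER BY p.id
""").fetchall()
children_by_parent = {pid: sorted(json.loads(kids)) for pid, _, kids in rows}
print(children_by_parent)  # {1: ['x', 'y'], 2: ['z']}
```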
|
# ? Jul 12, 2022 06:08 |
|
I'm struggling to write a query and wondering if anyone has any advice. We have samples which have data sets attached to them; each data set has some number of data items. I'm trying to identify samples where two particular data item values in datetime format are further than 45 minutes apart. The DB is SQL Server. There are a few extra complications: some samples have more than one of the dataset in question, and one of the data items moved from one dataset to another at some point. For now I just want to consider the set of samples with only one dataset. Here's what I have so far: code:
pre:
sample id  name       value
123        STARTTIME  <time>
123        ENDTIME    <time>
124        STARTTIME  <time>
124        ENDTIME    <time>
code:
This won't be run directly against production and won't be run very often, so I'm not super concerned about performance (though I'd love to know if I've done something very slow and/or stupid)
|
# ? Jul 20, 2022 16:13 |
|
Asleep Style posted: I'm struggling to write a query and wondering if anyone has any advice. we have samples which have data sets attached to them, each data set has some number of data items. I'm trying to identify samples where two particular data item values in datetime format are further than 45 minutes apart. the DB is SQL Server

would this work? code:
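A sketch of the pivot-then-compare approach (SQLite via Python; on SQL Server you'd write DATEDIFF(MINUTE, start, end) > 45 instead of the julianday arithmetic, and the table/column names here mirror the sample data above rather than the real schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE data_item (sample_id INTEGER, name TEXT, value TEXT);
INSERT INTO data_item VALUES
    (123, 'STARTTIME', '2022-07-20 10:00:00'),
    (123, 'ENDTIME',   '2022-07-20 11:00:00'),   -- 60 minutes apart
    (124, 'STARTTIME', '2022-07-20 10:00:00'),
    (124, 'ENDTIME',   '2022-07-20 10:30:00');   -- 30 minutes apart
""")
# pivot the two named items onto one row per sample, then compare:
# julianday() returns days, so * 24 * 60 converts the gap to minutes
rows = con.execute("""
    SELECT sample_id
    FROM data_item
    GROUP BY sample_id
    HAVING (julianday(MAX(CASE WHEN name = 'ENDTIME'   THEN value END))
          - julianday(MAX(CASE WHEN name = 'STARTTIME' THEN value END)))
          * 24 * 60 > 45
""").fetchall()
print(rows)  # [(123,)]
```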
|
# ? Jul 20, 2022 16:39 |
|
that totally works, thank you
|
# ? Jul 20, 2022 18:24 |
|
|
Rookie question here: I have a small sample database on SQL Server. I want to import those tables into Postgres for self-learning. In SQL Server, I exported each table as a CSV. What is standard practice for importing data into a database, with regard to keys and constraints? Should I create new tables in PG, import the CSVs, then go back and ALTER TABLE to add constraints and the like?

*Also, is it generally better to dump a CSV copy into temp tables and then insert into the final tables, or does it not really matter in practice?

Hughmoris fucked around with this message at 19:39 on Aug 1, 2022 |
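The staging-table pattern mentioned in the question, sketched generically with SQLite from Python (Postgres itself would do the bulk load with COPY or psql's \copy; the file contents and table/column names here are made up for illustration):

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")

# 1. load the raw CSV into a loose, constraint-free staging table
con.execute("CREATE TABLE staging_person (id TEXT, name TEXT)")
csv_data = io.StringIO("id,name\n1,Alice\n2,Bob\n")
reader = csv.reader(csv_data)
next(reader)  # skip the header row
con.executemany("INSERT INTO staging_person VALUES (?, ?)", reader)

# 2. create the real table with types and constraints up front
con.execute("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    )
""")

# 3. cast/move rows out of staging; bad rows fail here,
#    not halfway through the bulk load
con.execute("""
    INSERT INTO person (id, name)
    SELECT CAST(id AS INTEGER), name FROM staging_person
""")

rows = con.execute("SELECT * FROM person ORDER BY id").fetchall()
print(rows)  # [(1, 'Alice'), (2, 'Bob')]
```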
# ? Aug 1, 2022 19:35 |