|
Alright thanks for the feedback again. Is this better now?
|
# ? Nov 14, 2011 09:36 |
|
|
# ? Jun 7, 2024 19:17 |
|
Yeah that looks way better and is quite easy to understand.
|
# ? Nov 14, 2011 16:22 |
|
In MovieActor and MovieDirector you do not need an additional column to serve as the primary key. The primary key of MovieActor should be (MovieID, ActorID); the primary key of MovieDirector should be (MovieID, DirectorID).
|
# ? Nov 14, 2011 22:58 |
|
I'd be tempted to make the Actors just have a Name field rather than First Name, Last Name. Consider "Samual L. Jackson" for example. EDIT: also have a People table instead of an Actors and Directors table, sometimes they are the same person. You have tables to establish the type of relation. Zombywuf fucked around with this message at 13:13 on Nov 15, 2011 |
# ? Nov 15, 2011 13:09 |
|
I'm looking to create hot (and cold) backups of a production MySQL database. Are there any tools out there that are proven/standard?
|
# ? Nov 15, 2011 16:21 |
|
Zombywuf posted:EDIT: also have a People table instead of an Actors and Directors table, sometimes they are the same person. You have tables to establish the type of relation. I agree with this, personally I view a superior design as something like 'MoviePerson' with the addition of 'MoviePersonRole' to store roles like actor, director, producer, etc. If you just want actors or just want directors, it's still a simple query and it would allow for an individual to be both an Actor and Director without duplicating the person instance, for example.
|
# ? Nov 15, 2011 16:59 |
|
Keep in mind that you can still have duplicates depending on how you determine if a person is the same. First and Last name will obviously reject a lot of legitimate people from having their own record in the table. So then you try and insert their date of birth, but you realize that you can't always get it and you end up declaring the column as nullable and you're back at where you started. Table normalization is important but I'd argue that normalizing your data is more important the database. And then of course is the problem of some movies will have "Samuel L. Jackson" listed as the actor and other will just have "Samuel Jackson". Are they two different people? Will they have separate records in your table? I'd like to hear what some of the other more experienced developers have to say with regard to designing a database schema that takes into account the practical problems that come from imperfect data.
|
# ? Nov 15, 2011 17:45 |
|
MEAT TREAT posted:Keep in mind that you can still have duplicates depending on how you determine if a person is the same. First and Last name will obviously reject a lot of legitimate people from having their own record in the table. So then you try and insert their date of birth, but you realize that you can't always get it and you end up declaring the column as nullable and you're back at where you started. Table normalization is important but I'd argue that normalizing your data is more important the database. And then of course is the problem of some movies will have "Samuel L. Jackson" listed as the actor and other will just have "Samuel Jackson". Are they two different people? Will they have separate records in your table? A lot of it depends on where you are getting the data. If you can look at things like if the actor's name links to a page that can clear up ambiguity between two people. There's obviously a lot of other ways, that's just an example. Even then some of the best designed tables suffer from user error and/or imperfect data. We had a huge events table at my last job, and when I imported data from the old system the locations for the events were all sorts of hosed. It's amazing how many ways someone can mistype or misspell the city they loving live in?!?! subx fucked around with this message at 18:22 on Nov 15, 2011 |
# ? Nov 15, 2011 18:17 |
|
subx posted:Even then some of the best designed tables suffer from user error and/or imperfect data. We had a huge events table at my last job, and when I imported data from the old system the locations for the events were all sorts of hosed. It's amazing how many ways someone can mistype or misspell the city they loving live in?!?! Freeform fields were invented by Satan.
|
# ? Nov 15, 2011 18:32 |
|
mister_gosh posted:I'm looking to create hot (and cold) backups of a production MySQL database. Are there any tools out there that are proven/standard? Percona's Xtrabackup is pretty much the gold standard here and works both with MyISAM and InnoDB tables.
|
# ? Nov 16, 2011 05:35 |
|
mister_gosh posted:I'm looking to create hot (and cold) backups of a production MySQL database. Are there any tools out there that are proven/standard? The standard mysqladmin dumps work fine for all backup purposes i find.
|
# ? Nov 16, 2011 09:44 |
|
Thanks for the recommendations!
|
# ? Nov 16, 2011 16:09 |
|
Factor Mystic posted:I agree with this, personally I view a superior design as something like 'MoviePerson' with the addition of 'MoviePersonRole' to store roles like actor, director, producer, etc. If you just want actors or just want directors, it's still a simple query and it would allow for an individual to be both an Actor and Director without duplicating the person instance, for example. N-thing this, you never know when you'll need to store this movie and the related "talent" http://www.imdb.com/title/tt0366242/ subx posted:A lot of it depends on where you are getting the data. If you can look at things like if the actor's name links to a page that can clear up ambiguity between two people. There's obviously a lot of other ways, that's just an example. That's my experience as well, often you just don't have enough data to uniquely, and correctly, identify an object, and then you can either save it as it is and deal with lovely data quality, or don't, and deal with missing data. Say, john.smith@megacorp.com emailed you some questions about your products. Then a month later John Smith from Megacorp calls you about another product. Is this the same guy? What about Johnathan Smith who shows up at your event? Of course, you can use similarity metrics to determine that John and Johnathan are possibly the same thing, but that still doesn't tell you if it's all the same physical person without any additional information. To deal with this, I'd probably reference the original record in the suspected duplicates, along with a confidence level, perhaps, and collapse them into one once there's enough information to match them. I haven't actually done this IRL, so maybe that's a terrible idea. I suspect that IMDB just has people manually making sure that their poo poo is correct when adding new movies/people. I do have a question of my own, though hopefully not as complicated. Is there a neat way to split a long text by length, and into a table, in Postgres (or T-SQL)? code:
|
# ? Nov 16, 2011 22:21 |
|
I'm wondering if there's a way to do this easily. My google-fu is coming up with nothing... I've got a table of addresses that has duplicate entries I want to clean up: code:
So, my first idea is to make a php script that takes the duplicates and iterates over them and updates any table that references the address table. That's fine, but I keep feeling I should be able to do this without php.. ? It's MySQL and InnoDB. All the foreign key relationships are set up. I can set everything as 'ON UPDATE CASCADE' (or whatever might be needed).... Is there no way to say to MySQL: "Hey, follow the relationships I've set up and switch address 155 and 256 to 187 everywhere." ?
|
# ? Nov 17, 2011 18:44 |
|
Just wondering if any of you know of a decent solution to monitor table sizes in a database and that can output reports and graphs so we can track growth. I just want to check to make sure we aren't reinventing the wheel by coding something.
|
# ? Nov 18, 2011 00:32 |
|
JasonV posted:I'm wondering if there's a way to do this easily. My google-fu is coming up with nothing... The only way that I can think of to do this is to drop the primary key on the referenced table (but keep that column indexed); update the table to unify the PK column values for the rows that are duplicates of one another; then delete the duplicate rows and reinstate the primary key. This may not be acceptable to you. Otherwise I think you have to run a whole bunch of update statements against all the tables referring to this one.
|
# ? Nov 18, 2011 00:55 |
|
demonachizer posted:Just wondering if any of you know of a decent solution to monitor table sizes in a database and that can output reports and graphs so we can track growth. I just want to check to make sure we aren't reinventing the wheel by coding something. Solutions to this are likely going to be vendor-specific, so you should tell us which database software you are using in order to get helpful suggestions.
|
# ? Nov 18, 2011 00:56 |
|
Suran37 posted:Alright thanks for the feedback again. Maybe I am misunderstanding what you are doing, but this is how I would lay out the Movie table: MOVIE MovieID (PK) Title RatingID (FK1) GenreID (FK2) ReleaseYear StudioId (FK3) Length I replaced Rating, Genre, and Studio with RatingId, GenreId, and StudioId since storing those INT values should take up less space than storing the full Rating, Genre, or Studio name every time you create a movie record. I also got rid of the Actor(s) and Director fields, since the Movie Actors and Movie Director are already linked to the movie by the MovieID.
|
# ? Nov 18, 2011 00:56 |
|
Hammerite posted:Solutions to this are likely going to be vendor-specific, so you should tell us which database software you are using in order to get helpful suggestions. poo poo sorry. I started a thread with that question before and included that it is MSSQL in the subject.
|
# ? Nov 18, 2011 06:15 |
|
I ended up reinventing the wheel for the same problem by using a cursor to loop over all tables fitting a criteria (name/user/date) and EXECing sp_spaceusd for each one, then joining the results. It works fine, except that it's not possible to have nested execs in table-valued functions, making this sp a bit awkward to use. If there's a better way, I'm all ears.
|
# ? Nov 18, 2011 08:22 |
|
Anyone know what Postgres means beyond "Your transaction failed, what the gently caress are you doing using triggers to update a table in a highly concurrent serializable query" when it says:quote:Canceled on identification as a pivot, during write. I am very close to using manual table locks at this point, but I could really do with knowing what that error message means otherwise I'm probably just going to make the problem worse.
|
# ? Nov 18, 2011 16:16 |
|
Zombywuf posted:Anyone know what Postgres means beyond "Your transaction failed, what the gently caress are you doing using triggers to update a table in a highly concurrent serializable query" when it says: http://wiki.postgresql.org/wiki/SSI It looks like multiple transactions are running in parallel, and if they were run serially the results would be different depending upon the order.
|
# ? Nov 18, 2011 17:35 |
|
Hammerite posted:The only way that I can think of to do this is to drop the primary key on the referenced table (but keep that column indexed); update the table to unify the PK column values for the rows that are duplicates of one another; then delete the duplicate rows and reinstate the primary key. This may not be acceptable to you. Otherwise I think you have to run a whole bunch of update statements against all the tables referring to this one. Yeah, my php scripts run a whole bunch of update statements, and they work fine. I was just hoping this was a common problem with a simple solution I didn't know about. Thanks, tho.
|
# ? Nov 18, 2011 18:50 |
|
Jethro posted:http://wiki.postgresql.org/wiki/SSI Seems I've developed a blind spot at the top of google as I'm expecting ads to be there.... I suspect I know why this is happening in my db, have had to wrap everything in ugly retry loops. Any idea what the actual message means though? What the hell it means by identification as a pivot specifically?
|
# ? Nov 18, 2011 18:59 |
|
Zombywuf posted:Seems I've developed a blind spot at the top of google as I'm expecting ads to be there.... See this thing here
|
# ? Nov 18, 2011 21:31 |
|
I'm not sure if this is the right thread for this query, but I'm looking for a newbie-level resource for getting into MS SQL. I've read the general SQL tutorial at w3schools.com and I think I have a pretty good handle on it, but I'd like a book/website that specifically teaches how to do things using the MS SQL environment. I've found https://fianga.com, but it seems to be a bit too newbie-friendly (there are instructions for turning on your computer and starting up MS SQL in each section I've read so far), and the writing isn't exactly the clearest.
|
# ? Nov 20, 2011 17:43 |
|
Microsoft has their Books Online thing for SQL Server, have a look at it. I'm usually looking for something specific on MSDN, but this seems to have a good overview of what you might find useful.
|
# ? Nov 20, 2011 20:07 |
|
Ok, so I am setting up my database(from the pictures above) in SQL Developer, and I am at the point where I need to add Foreign Keys. Does this look right? If so, do I need to put the RatingID in the Movie Table for each movie?
|
# ? Nov 20, 2011 22:04 |
|
Suran37 posted:Ok, so I am setting up my database(from the pictures above) in SQL Developer, and I am at the point where I need to add Foreign Keys. No, you should have a RATINGID in both tables that make up the key. I'm not familiar with this particular GUI but it looks like you may be setting this up backwards in addition to mismatching columns. Intuitively it seems like you should be setting this up on the Movie table with Rating as the referenced table and RATING_PK as the referenced constraint. In any event, you definitely don't want the association to be between two different ID columns (unless they are actually the same and you've just got hosed up naming conventions, but I'm guessing this isn't the case)
|
# ? Nov 21, 2011 00:01 |
|
Kekekela posted:No, you should have a RATINGID in both tables that make up the key. I'm not familiar with this particular GUI but it looks like you may be setting this up backwards in addition to mismatching columns. Intuitively it seems like you should be setting this up on the Movie table with Rating as the referenced table and RATING_PK as the referenced constraint. In any event, you definitely don't want the association to be between two different ID columns (unless they are actually the same and you've just got hosed up naming conventions, but I'm guessing this isn't the case) Yeah, I realize I had them all backwards, and I changed some of the column names to make it easier, but I think I got it all figured out now. Also I need to query all the movies that Clint Eastwood is involved in. One of the movies he is an Actor and the Director. Therefore, the movie gets listed twice. Is it possible to alter my query so it is only listed once? I am using the following query: code:
Suran37 fucked around with this message at 00:50 on Nov 21, 2011 |
# ? Nov 21, 2011 00:17 |
|
Simplest way would be just to throw a 'DISTINCT' in, which will remove duplicates - viz.:code:
|
# ? Nov 21, 2011 11:11 |
|
Thel posted:Simplest way would be just to throw a 'DISTINCT' in, which will remove duplicates - viz.: Pros use FOR XML PATH to pull all the roles out as a comma separated string I am not pro so cannot remember the syntax for this except that it's a pain in the arse to use .
|
# ? Nov 21, 2011 17:24 |
|
I'm currently writing a mock filesystem to teach myself SQL, and I'm trying to write a stored procedure that will delete a folder and all sub-objects. DB Schema is like so: Owner (OwnerID, OwnerName, CreateDate) Directory (DirectoryID, OwnerID, DirectoryName, ParentDirectoryID, CreateDate) File (FileID, ParentDirectoryID, FileName, FileData, CreateDate) Here's the procedure: code:
|
# ? Nov 22, 2011 03:56 |
|
herakles posted:I'm currently writing a mock filesystem to teach myself SQL, and I'm trying to write a stored procedure that will delete a folder and all sub-objects. Well one thing is that it'll only work for one level of directories as you're joining child directory to parent and not checking if there are any child directories under the child itself, you could add a while loop to go through all sub directories and add them to the delete list. Also, the DELETE FROM statement will probably error because you've got multiple values going into the WHERE IN()(note: unless this is allowed in the version of SQL you're using), it should probably be something like: DELETE FROM dbo.[File] WHERE ParentDirectoryId in (select DirectoryID from @ChildObjects) And similar for the second delete. Edit: You could cut your table variable to one column and insert the directory ID specified as an input into it as well, that way you'd just have a list of directory IDs to be deleted.
|
# ? Nov 22, 2011 20:57 |
|
Why doesn't using count(expr) work in MySQL? select count(name = 'bob') from employees; It returns the total amount of rows. This works: select count(name) from employees where name = 'bob'; I tried to do something like: select count(name = 'bob'), count(name = 'jeff') from employees; And I noticed they were equal numbers.
|
# ? Nov 22, 2011 21:48 |
|
Bob Morales posted:Why doesn't using count(expr) work in MySQL? In Postgres, at least, count(expr) returns the number of rows for which expr is not null. The expression name = 'bob' is only null if name is null, which I assume doesn't ever happen in your table.
|
# ? Nov 22, 2011 21:53 |
|
What ShoulderDaemon said is correct. You can do the following though:code:
|
# ? Nov 22, 2011 22:02 |
|
Or justcode:
|
# ? Nov 22, 2011 22:11 |
|
I've got a query code:
Hope that makes sense.
|
# ? Nov 22, 2011 23:20 |
|
|
# ? Jun 7, 2024 19:17 |
|
Using Oracle 10g Express (10.2.0.1.0) I need to pull some data from CLOBs that are XML. I have it working using code:
Is there anyway I can default to NULL in these cases rather than have Oracle throw an error?
|
# ? Nov 23, 2011 00:43 |