|
I understand the arguments for keeping video comments and photo comments in separate tables, but consider the benefit you gain when doing things like searching all comments or more generally selecting all comments that meet a certain criteria (e.g., all comments posted by some user). In these cases I think it makes more sense to have all comments in one table. Although, then there is still the problem of selecting the video or photo that matches up with the comment - and the only way I can think of is performing a left join on both tables (photos and videos) and just having one be null for every result row. This isn't a big deal when selecting between only videos and photos, but imagine an extreme case where there are not just 2 types of items with comments, but 30 - that means 30 joins where 29 are null on every row. There must be a better solution that I am missing.
supster fucked around with this message at 15:07 on May 10, 2009 |
# ? May 10, 2009 15:02 |
|
|
|
The solution is to stop treating them the same. It isn't one pool of comments with two types. It's two separate concepts which at some philosophical level happen to both be comments. All your headaches go away the more you treat them as two totally separate things. There's nothing super compelling here regarding your need to make them subclasses, other than your desire to fetch them all in one go, which isn't worth it. Two queries, done.
|
# ? May 10, 2009 16:36 |
|
quote:Although, then there is still the problem of selecting the video or photo that matches up with the comment - and the only way I can think of is performing a left join on both tables (photos and videos) and just having one be null for every result row. This isn't a big deal when selecting between only videos and photos, but imagine an extreme case where there are not just 2 types of items with comments, but 30 - that means 30 joins where 29 are null on every row. There must be a better solution that I am missing. I think it all boils down to the requirements questions: 1. How many types of attachments are there? Do you anticipate adding more types? 2. What are their attributes? Say there are 30 very distinct types stored in 30 tables. Then writing a query to get all the threads, comments, and attachments will involve joining at least 30 tables regardless of how you model Comments. edit: having a 30+ table join is obviously to be avoided... I'm guessing there won't be 30 distinct types though. It may make sense when the types are similar to store more than one type in the same table, perhaps with a column for MIME type/Content-type. If the number of types is reasonably small then using a single table has significant advantages. Bad Titty Puker fucked around with this message at 00:39 on May 11, 2009 |
# ? May 11, 2009 00:13 |
|
camels posted:I'm guessing there won't be 30 distinct types though. It may make sense when the types are similar to store more than one type in the same table, perhaps with a column for MIME type/Content-type. I think you're right and that this is the best solution. I could easily combine the photo and video tables into one media table with a type column and another two tables for video-specific or photo-specific attributes that would only be joined when retrieving only videos or only photos. And in reality there will be 5 different types of commentable items and this paradigm should work well with those as well. Thanks.
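A rough sketch of that design - all table and column names here are illustrative guesses, not the actual schema. The point is that comments reference one media table, so "all comments matching some criteria" needs a single join no matter how many media types exist:

```sql
-- One media table holds what all commentable items share,
-- with a type column to discriminate.
CREATE TABLE media (
    media_id   INT PRIMARY KEY,
    media_type VARCHAR(10) NOT NULL,  -- 'photo', 'video', ...
    title      VARCHAR(255),
    user_id    INT NOT NULL
);

-- Type-specific attributes live in optional one-to-one tables,
-- joined only when retrieving just photos or just videos.
CREATE TABLE photo_details (
    media_id INT PRIMARY KEY REFERENCES media (media_id),
    width    INT,
    height   INT
);

CREATE TABLE video_details (
    media_id INT PRIMARY KEY REFERENCES media (media_id),
    duration_seconds INT
);

-- Comments reference media directly.
CREATE TABLE comments (
    comment_id INT PRIMARY KEY,
    media_id   INT NOT NULL REFERENCES media (media_id),
    user_id    INT NOT NULL,
    body       TEXT
);

-- "All comments by user 42" is one join, whether there are
-- 2 media types or 30.
SELECT c.comment_id, c.body, m.media_type, m.title
FROM comments c
JOIN media m ON m.media_id = c.media_id
WHERE c.user_id = 42;
```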
|
# ? May 11, 2009 00:49 |
|
Is there ever a reason to have an index on a column that does not provide a high level of uniqueness? Like let's say we have an employees table keyed and clustered by id, since single user lookups are common. But in another scenario we want to select a group of users by their department_id. If this were joined to the departments table, would the index be A) employees.department_id B) employees.department_id, employees.id The second index is much more unique than the first (actually, totally unique). I ask because I'm currently using scenario A and wondering if that's the source of my current bottleneck (and I'm testing right now as well).
|
# ? May 11, 2009 21:55 |
|
I'm in the market for a new job. Mostly I do PHP/MySQL stuff but I see ads for a DBA. What all does being a DBA entail? I know how to manage a MySQL server, write queries, optimize them, etc., but I'm sure there's more involved in being a DBA. What kind of things should I know?
|
# ? May 11, 2009 22:37 |
|
Golbez posted:What kind of things should I know? Everything there is to know about a database. The needs depend on the size of the company. If it's a smaller company, you pretty much need to know everything (but their needs will be less specific). If it's a larger company, you might have a head DBA, a lot of minion DBAs, and a domain expert (a guy that actually knows how the model relates to its implementation in the database). You probably won't be doing any PHP development or even writing developer/user-level queries. Usually you're rewriting other people's sloppy queries or optimizing the server. You control how things like indexes and databases are implemented. You dictate what kind of hardware the database sits on. So... Yeah, it's not a development job. It's 100% dedicated to the tuning and administration of a high performance database. Tracking statistics, keeping backups, running disaster recovery scenarios, etc...
|
# ? May 11, 2009 23:08 |
|
Triple Tech posted:Is there ever a reason to have an index on a column that does not provide a high level of uniqueness? Like let's say we have an employees table keyed and clustered by id, since single user lookups are common. But in another scenario we want to select a group of users by their department_id. If this were joined to the departments table, would the index be For DBMSes such as Oracle and SQL Server, the rule of thumb is that if an index isn't selective enough - roughly, if a typical key value matches more than about 5% of the rows - the optimizer will avoid using it. What you might want to do is to make sure the statistics are up to date, and then run a trace to get the query execution plan that was actually used when the database engine processed the queries. That will tell you if the optimizer chose to use your index. You might also want to take a look at the distribution of department_id in employees & figure out how selective the index is. Bad Titty Puker fucked around with this message at 01:39 on May 12, 2009 |
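If you want to sanity-check the selectivity yourself, something along these lines works (assuming the employees table from the question; exact column names are guesses):

```sql
-- Rough selectivity of department_id: the average fraction of the
-- table one department matches. Much above ~5% and the optimizer
-- will likely prefer a full scan over the index.
SELECT COUNT(DISTINCT department_id)            AS distinct_departments,
       COUNT(*)                                 AS total_rows,
       1.0 / COUNT(DISTINCT department_id)      AS avg_fraction_per_department
FROM employees;

-- In MySQL, ask the optimizer directly whether it would use the index:
EXPLAIN
SELECT id, department_id
FROM employees
WHERE department_id = 7;
```

In Oracle or SQL Server the equivalent is looking at the actual execution plan, as suggested above.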
# ? May 12, 2009 01:18 |
|
Triple Tech posted:Everything there is to know about a database. The needs depend on the size of the company. If it's a smaller company, you pretty much need to know everything (but their needs will be less specific). If it's a larger company, you might have a head DBA, a lot of minion DBAs, and a domain expert (a guy that actually knows how the model relates to its implementation in the database). Data warehousing & OLAP is also a very valuable skill set to have. Another one to look into is experience with ETL, importing and exporting data, scrubbing data, etc. You don't have to know data warehousing to land a DBA job, but even if a company doesn't do warehousing now, if you know what you're doing you might be able to get one started. ETL is pretty much a must-have skill; at least in my experience, I've used it at every company I've worked for. IMO data modeling and database design is also a key skill. Depending on the job, you may be responsible for reviewing all the DDL code; designing new database objects (tables, stored procedures, views, etc. - or even whole schemas); compiling a data dictionary; diagramming and documenting; etc.
|
# ? May 12, 2009 01:28 |
|
camels posted:Data warehousing & OLAP is also a very valuable skill set to have. One more to look for is experience with ETL, importing and exporting data, scrubbing data, etc. You don't have to know data warehousing to land a DBA job, but even if a company doesn't do warehousing now, if you know what you're doing you might be able to get one started. ETL is pretty much a must-have skill; at least from my experience, I've used it in every company I've worked for. Since I have no clue what any of these terms mean, I guess I'll pass on DBA jobs.
|
# ? May 12, 2009 02:07 |
|
Golbez posted:Since I have no clue what any of these terms mean, I guess I'll pass on DBA jobs. Since I want to add DBA experience to my CV I had better get some basic answers about JOINs. This is going to be a 'help me with my homework' post. I still cannot figure out joins. They are absolutely essential to doing anything with SQL, and I'm stuck. All I understand is that it is about matching the foreign key from one table to the primary key of another table. I managed to fluke my way through a basic one, but anything more than two tables is causing me problems. I would really appreciate any advice offered. Using Oracle 11g: List the TourID and Staff Member's Name for all tours that occurred in February of 2003. code:
Find the total dollar amount of daily-equipment-rentals for each tour. Now is where I get stuck. I can't even get the joins to work, let alone extract any information to answer the question. code:
OK so I'll try a different approach: code:
The slides with my course are not very helpful, and the best tutorial I can find is at w3schools but it doesn't go into joins of more than two tables. I've been banging my head for the last few hours. Can anyone help?
|
# ? May 12, 2009 05:29 |
|
Gap In The Tooth posted:
You're aliasing the table names in the FROM clause, so you have to use the aliases in the SELECT clause. code:
1. Use meaningful abbreviations for table aliases instead of a, b, c, d... 2. In this query: code:
code:
edit: Also, qualify all the referenced columns with the table name. Preserving case of data element names is a good practice for readability as well. code:
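For what it's worth, here is a hedged reconstruction of what the February query might look like with meaningful aliases - I'm guessing at the table and column names, since the original schema isn't shown:

```sql
-- Alias once in FROM, then use the alias everywhere else.
SELECT t.TourID,
       s.Name
FROM Tour t
JOIN Staff s
  ON s.StaffID = t.StaffID
WHERE t.TourDate >= DATE '2003-02-01'
  AND t.TourDate <  DATE '2003-03-01';
```

The half-open date range avoids off-by-one trouble at the month boundary that BETWEEN with timestamps can cause.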
Bad Titty Puker fucked around with this message at 06:27 on May 12, 2009 |
# ? May 12, 2009 06:20 |
|
Gap In The Tooth posted:All I understand is that it is about matching the foreign key from one table to the primary key of another table. This isn't really true. It might be true like 95% of the time for "natural" joins, but it's not true for what joins actually are. When I get to work I'll write up a post summarizing joins.
|
# ? May 12, 2009 12:12 |
|
camels posted:Awesome advice Love for you right now. The incorrect aliasing was indeed causing the error. If I add d.rentalpriceperday under SELECT and add ORDER BY a.tourid; to the end, I get close to what the question requires (total dollar amount of equipment rentals for each tour): They want the total of RentalPricePerDay for each instance of TourID. I'm thinking that SUM(), or maybe COUNT() needs to come into play here. I'm going to poke around my noob brain for a while. Many thanks again for the pointers.
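You're on the right track with SUM(). The general shape (column names guessed, since the schema isn't posted) is:

```sql
-- SUM() collapses the joined rental rows down to one row per tour.
-- Every non-aggregated column in SELECT must appear in GROUP BY.
SELECT t.TourID,
       SUM(r.RentalPricePerDay) AS TotalRentals
FROM Tour t
JOIN EquipmentRental r
  ON r.TourID = t.TourID
GROUP BY t.TourID
ORDER BY t.TourID;
```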
|
# ? May 12, 2009 12:22 |
|
Find the total dollar amount of daily-equipment-rentals for each tour. Make sure you also display those tour(s) that did not involve any equipment. Sort by ascending TourID.code:
Basically I just learned that you don't need to select so much stuff when you're joining.
|
# ? May 12, 2009 13:01 |
|
I was initially writing up something about joins and then quickly realized how it was getting really complex... That and my example was pretty terrible. Do you guys still need help on joins or are you okay?
|
# ? May 13, 2009 21:34 |
|
I need to replace all backslashes with forward-slashes in a character field. I'm using the following command but I keep getting an error:code:
|
# ? May 14, 2009 00:11 |
|
ABombInABowl posted:I need to replace all backslashes to forward-slashes in a character field. I'm using the following command but I keep getting an error: code:
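In MySQL the usual stumbling block here is that the backslash is the escape character inside string literals, so it has to be doubled. A sketch, with a hypothetical table and column:

```sql
-- '\\' is a single literal backslash once MySQL un-escapes it.
UPDATE documents
SET file_path = REPLACE(file_path, '\\', '/');
```

If the server runs with the NO_BACKSLASH_ESCAPES SQL mode enabled, a single '\' in the literal would be correct instead.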
|
# ? May 14, 2009 00:28 |
|
Triple Tech posted:I was initially writing up something about joins and then quickly realized how it was getting really complex... That and my example was pretty terrible. Do you guys still need help on joins or are you okay? I think the Wikipedia article about SQL joins is actually pretty good (but the article on relational databases is terrible). In addition to the JOINs in SQL statements are the joins used by the query engine. Understanding these is helpful to optimizing SQL statements. I think Wikipedia does a decent job of explaining them as well, but I like Microsoft's explanations (which I believe is not SQL Server specific): edit: This info is somewhat advanced; it probably is useful only if you are already optimizing queries, analyzing query execution plans, and so on. Nested Loop Join (Microsoft) Hash Join (Microsoft) Merge Join (Microsoft) Nested Loop Join (Wikipedia) Hash Join (Wikipedia) Merge Join (Wikipedia) Bad Titty Puker fucked around with this message at 02:02 on May 14, 2009 |
# ? May 14, 2009 00:38 |
|
Is there a way to get the date/time of a record insertion in MySQL?
|
# ? May 17, 2009 00:06 |
|
SHODAN posted:Is there a way to get the date/time of a record insertion in MySQL? Create a field with the type "timestamp". Whenever that row is inserted or updated, the column will be updated.
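A minimal MySQL sketch (table name made up):

```sql
CREATE TABLE page_hits (
    hit_id     INT AUTO_INCREMENT PRIMARY KEY,
    url        VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

One caveat: in MySQL of this vintage, the first TIMESTAMP column declared with no attributes gets DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, i.e. it changes on every update. Declaring only DEFAULT CURRENT_TIMESTAMP, as above, keeps it as the insertion time.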
|
# ? May 18, 2009 05:38 |
|
Gap In The Tooth posted:Find the total dollar amount of daily-equipment-rentals for each tour. Make sure you also display those tour(s) that did not involve any equipment. Sort by ascending TourID. I'm not sure, but I think you wanted an outer join on some of that data, rather than an inner join, since the question states that you need to include tours that did not involve any equipment. Those tours may not appear in the rental tables, so an inner join wouldn't return anything for them. It seems like all the data you got out was for tours which did have equipment hires. This may be a fluke. Flamadiddle fucked around with this message at 12:19 on May 18, 2009 |
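Concretely, the fix is usually shaped like this (table and column names guessed): a LEFT JOIN keeps the equipment-free tours, and COALESCE turns their NULL sum into a zero.

```sql
SELECT t.TourID,
       COALESCE(SUM(r.RentalPricePerDay), 0) AS TotalRentals
FROM Tour t
LEFT JOIN EquipmentRental r
       ON r.TourID = t.TourID
GROUP BY t.TourID
ORDER BY t.TourID;
```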
# ? May 18, 2009 12:12 |
|
Need help on this MySQL query... Here is my table: key_table code:
Any ideas on how to setup a query to do this?
|
# ? May 18, 2009 18:13 |
|
philihp posted:Create a field with the type "timestamp". Whenever that row is inserted or updated, the column will be updated. I'd recommend against using a timestamp in MySQL, and instead use Date/Time, check out this article: http://www.dbtuna.com/article.asp?id=36 (Summarizes that date/time is about 50% faster to query, even with a larger disk space footprint)
|
# ? May 18, 2009 18:17 |
|
VerySolidSnake posted:Need help on this MySQL query... I'd like to know more about your data model, it sounds a little suspect. Otherwise, it's just a join, where you happen to be joining on the same table you're looking up.
|
# ? May 18, 2009 18:26 |
|
Triple Tech posted:I'd like to know more about your data model, it sounds a little suspect. Otherwise, it's just a join, where you happen to be joining on the same table you're looking up. I could split this one table into two, but I figured I could keep it running faster using this query since it's just hitting the same table (.....maybe?). Here is the furthest I've got with the query: code:
indulgenthipster fucked around with this message at 18:38 on May 18, 2009 |
# ? May 18, 2009 18:35 |
|
Triple Tech posted:I'd like to know more about your data model, it sounds a little suspect. Otherwise, it's just a join, where you happen to be joining on the same table you're looking up. Doesn't joining a table on itself mean there's a normal form violation of some sort or is there a legit reason to do this?
|
# ? May 18, 2009 18:37 |
|
For such a thin table, you are almost certainly optimizing prematurely. Speed isn't really a concern with such a low/thin amount of data. And no, self joining doesn't automatically signal a problem. It's just wonky at times. Every time I've seen self joining done, it was for some sort of hierarchical data, which I personally feel relational models are horrible for ("parent_id" would be the type of signal you're looking for). I can't think of any great examples right now but a self join isn't inherently evil. It's mostly obtuse and awkward and something that should be looked at with great care.
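The classic hierarchical case looks like this (hypothetical employee/manager table):

```sql
-- Each row points at its parent in the same table.
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name        VARCHAR(100) NOT NULL,
    manager_id  INT NULL REFERENCES employees (employee_id)
);

-- Self-join: the same table appears twice under different aliases.
-- LEFT JOIN so the top of the hierarchy (no manager) still shows up.
SELECT e.name AS employee,
       m.name AS manager
FROM employees e
LEFT JOIN employees m
       ON m.employee_id = e.manager_id;
```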
|
# ? May 18, 2009 19:16 |
|
VerySolidSnake posted:Need help on this MySQL query... I can't tell if userid = 0 means non-existent or something, so try this and remove the >0 bit if it's not necessary code:
|
# ? May 18, 2009 20:56 |
|
VerySolidSnake posted:I could split this one table into two, but I figured I could keep it running faster using this query since its just hitting the same table (.....maybe?). It will be accessing a second logical copy of the table, which for most intents and purposes is just like accessing a second table (or rather, accessing the first table a second time). 99% of the time you're better off normalizing.
|
# ? May 18, 2009 21:05 |
|
MoNsTeR posted:It will be accessing a second logical copy of the table, which for most intents and purposes is just like accessing a second table (or rather, accessing the first table a second time). 99% of the time you're better off normalizing. Uhh, this is an implementation detail, so possibly false (if you meant each pass incurs a disk read). If the engine knows that you're accessing all the columns (and its page reader probably will, since the fields are adjacent to one another), it will have all the data available in one pass. Nota bene across the board: The way you write a query and what it actually does behind the scenes are two totally different things.
|
# ? May 18, 2009 21:15 |
|
Triple Tech posted:Uhh, this is an implementation detail, so possibly false (if you meant each pass incurs a disk read). If the engine knows that you're accessing all the columns (and it's page reader probably will since the fields are adjacent to one another), it will have all the data available in one pass. Well sure, your particular DBMS may be able to make that optimization. Oracle doesn't though, so I doubt it. Either way the second read will be cached and thus faster, but that's beside the point. The point is that modeling your data in a goofy way in pursuit of phantom performance gains is not a good idea. The way you write a query and how it gets executed may be different but they are not unrelated. IMHO you have to prove the performance gains are both Real and Significant before abandoning normalization. FWIW you can often eliminate self-joins in queries such as this using analytic/window functions but those are Oracle specific. And still not a good reason to denormalize.
|
# ? May 18, 2009 23:42 |
|
Speaking of analytic and windowing functions, did anyone else catch that PostgreSQL's current beta (8.4) added support for them? http://www.postgresql.org/docs/8.4/static/tutorial-window.html I always single out Oracle and SQL Server for supporting them, so it's very exciting to see them added to a free DBMS.
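For anyone who hasn't used them, this is the flavor of it (close to the example in the PostgreSQL tutorial linked above; the empsalary table is from that tutorial):

```sql
-- Each row keeps its identity but also sees an aggregate computed
-- over its partition -- no GROUP BY and no self-join required.
SELECT depname,
       empno,
       salary,
       AVG(salary) OVER (PARTITION BY depname) AS dept_avg
FROM empsalary;
```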
|
# ? May 19, 2009 00:37 |
|
VerySolidSnake posted:Need help on this MySQL query... As other people have pointed out, you may have more of a DDL problem than anything else. What entities and relationships are you modeling? What is 'key_table'? 'keyid'? 'papid'? 'userid'? Are you modeling hierarchical data? Please post DDL, including keys and indexes -- that makes it easier to give useful advice.
|
# ? May 19, 2009 02:05 |
|
var1ety posted:Speaking of analytic and windowing functions, did anyone else catch that PostgreSQL's current beta (8.4) added support for them? Yeah, I saw that too. I went from Oracle projects to MySQL and am still suffering withdrawal of not being able to use MERGE statements and analytic functions.
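For the MySQL folks who haven't seen it, Oracle's MERGE is a single-statement upsert. A sketch with made-up table names:

```sql
MERGE INTO customers c
USING staged_customers s
   ON (c.customer_id = s.customer_id)
WHEN MATCHED THEN
    UPDATE SET c.email = s.email
WHEN NOT MATCHED THEN
    INSERT (customer_id, email)
    VALUES (s.customer_id, s.email);
```

MySQL's closest equivalent is INSERT ... ON DUPLICATE KEY UPDATE, which is considerably less flexible.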
|
# ? May 19, 2009 02:17 |
|
VerySolidSnake posted:I'd recommend against using a timestamp in MySQL, and instead use Date/Time, check out this article: http://www.dbtuna.com/article.asp?id=36 Will datetime update automatically?
|
# ? May 19, 2009 04:05 |
|
Golbez posted:Will datetime update automatically? Not automatically, but it can be done easily. As for my table, I ended up breaking it up into two different tables. I altered the name of the table for my original post; it is a keyword_assign table, where the keyid links to a keyword_name table (which has keyid and keyword columns). Planning for large amounts of scalability and many instances of the same keywords, I went this route (also, it's much faster to search the table via a number for matches instead of a string, which may also be UTF-8). I split the keyword_assign table into keyword_paper and keyword_userid, and the join ended up being MUCH more straightforward. Thanks for the input!
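For example, either set the DATETIME explicitly in the INSERT or let a trigger do it (table and column names hypothetical; triggers need MySQL 5.0+):

```sql
-- Explicitly on insert:
INSERT INTO papers (title, created_at)
VALUES ('Some paper', NOW());

-- Or automatically, via a BEFORE INSERT trigger:
CREATE TRIGGER papers_set_created
BEFORE INSERT ON papers
FOR EACH ROW
SET NEW.created_at = NOW();
```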
|
# ? May 19, 2009 20:48 |
|
MoNsTeR posted:FWIW you can often eliminate self-joins in queries such as this using analytic/window functions but those are Oracle specific. And still not a good reason to denormalize. Ah, so THAT'S why I've never had to use a self-join! Anyway, DB2 has window functions, and they have saved my bacon repeatedly. Adding another layer to the SQL is SO MUCH faster than doing a separate query. I love them so much that I'm gonna do a clinic for the reporting team on how to use them (because the documentation sucks).
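A typical self-join killer is the latest-row-per-group problem. Sketch with a hypothetical orders table:

```sql
-- Rank each customer's orders newest-first, then keep rank 1 --
-- no join of orders back against itself.
SELECT customer_id, order_id, order_date
FROM (
    SELECT customer_id,
           order_id,
           order_date,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY order_date DESC) AS rn
    FROM orders
) ranked
WHERE rn = 1;
```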
|
# ? May 19, 2009 20:59 |
|
Good to see window functions are more commonly available than I thought. Along the same lines, does anyone other than Oracle support syntax of the form: some_aggregate_function(foo) keep(dense_rank last order by bar) ...which executes some_aggregate_function over the set of rows that are "last" (or "first") in a dense_rank over the given order by clause?
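For reference, against the classic Oracle scott/emp schema it looks like this:

```sql
-- For each department: the salary of the most recently hired
-- employee (MAX breaks ties among rows sharing the last hiredate).
SELECT deptno,
       MAX(sal) KEEP (DENSE_RANK LAST ORDER BY hiredate) AS last_hire_sal
FROM emp
GROUP BY deptno;
```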
|
# ? May 19, 2009 23:09 |
|
|
|
This is driving me nuts, and I know *why* it's happening, but not a way around it. Consider Table A code:
code:
|
# ? May 20, 2009 19:06 |