|
Order doesn't matter. Just reference the proper table name or alias (and you must use the alias if the table appears multiple times). You can even update CTE names, probably as long as the corresponding view would be updatable.
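A quick sketch of the alias rule with hypothetical table names — the self-join is what makes the alias mandatory (SQL Server's UPDATE ... FROM flavor shown):

```sql
-- 'folders' appears twice, so the alias after UPDATE picks which copy
-- is modified; the table and column names here are made up.
UPDATE child
SET    parent_name = p.name
FROM   folders AS child
JOIN   folders AS p ON p.id = child.parent_id;
```

MySQL spells the same thing as `UPDATE folders child JOIN folders p ON p.id = child.parent_id SET child.parent_name = p.name`.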
|
# ? Jan 27, 2009 20:08 |
|
|
|
Instead of using a shedload of individual queries, how would I form a query to pull back the category name and how many posts are in each category as such code:
|
# ? Jan 28, 2009 06:37 |
|
Safety Shaun posted:Instead of using a shedload of individual queries, how would I form a query to pull back the category name and how many posts are in each category as such code:
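The reply's code block didn't survive, but the usual shape of such a query is a grouped count; table and column names below are assumed from the question:

```sql
-- One row per category with its post count
SELECT c.category_name,
       COUNT(p.post_id) AS post_count
FROM   categories c
LEFT   JOIN posts p ON p.category_id = c.category_id
GROUP  BY c.category_name;
```

The LEFT JOIN keeps categories with zero posts in the result; an INNER JOIN would silently drop them.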
|
# ? Jan 28, 2009 06:58 |
|
Safety Shaun posted:Instead of using a shedload of individual queries, how would I form a query to pull back the category name and how many posts are in each category as such code:
|
# ? Jan 28, 2009 07:08 |
|
Markoff Chaney posted:Should work if I understand your question, Oracle syntax. Thank you, that worked a treat. Page load has increased dramatically.
|
# ? Jan 28, 2009 09:08 |
|
Is there TSQL for finding the Volume the db file is on? Without using xp_cmdshell. Bonus points for also being able to determine the disk usage of said volume.
|
# ? Jan 28, 2009 16:24 |
|
code:
code:
Edit: This is based on 2005 - not sure how it'll migrate up or down from there.
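The code blocks above were lost, but on SQL Server 2005 the likely ingredients are `sys.master_files` for file paths and the undocumented `xp_fixeddrives` for free space; the queries below are an illustrative sketch using those real objects:

```sql
-- Drive letter for each file of the current database
SELECT name,
       LEFT(physical_name, 1) AS drive_letter,
       physical_name
FROM   sys.master_files
WHERE  database_id = DB_ID();

-- Free MB per fixed drive (undocumented but widely used)
EXEC master..xp_fixeddrives;
```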
|
# ? Jan 28, 2009 16:49 |
|
mister_gosh posted:I'm creating some tables to track the number of times users log in to my database and to track how often certain tools are executed. Ok, Triple Tech, et al, this is what I came up with. I'm not completely sure if these are technically foreign keys or not (I'm self taught and sort of just beginning in RDBMS stuff). Anyways, based on some analysis, I think I will average about 12,000 events a month. Does this model seem to support that? I'm guessing by everything being a char or int type that this won't fill much data, but I would like to avoid creating any bottlenecks or other problems. Any guess as to how big the initial allotment of space should be? How does a table with eventually 500,000 rows with simple types sound? Normal? The user table will max out at about 100 rows, the location table will be about 20 rows, the product table about 20 rows and the tool table about 30 rows, so it's just the events I'm concerned with. As each event occurs, I will create a new row in the usage_event table, fill out the date and get the user id, product id and tool id based on who executed what. The usage_event_id will be an auto-incrementing unique id - not sure how to do that yet, but I'm still at the planning stage. mister_gosh fucked around with this message at 21:42 on Jan 28, 2009 |
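For the auto-incrementing id question, a hedged sketch — SQL Server's IDENTITY shown (MySQL uses AUTO_INCREMENT instead), and all table/column names are guesses from the post:

```sql
CREATE TABLE usage_event (
    usage_event_id INT IDENTITY(1,1) PRIMARY KEY,  -- auto-incrementing id
    event_date     DATETIME NOT NULL,
    user_id        INT NOT NULL REFERENCES app_user (user_id),
    product_id     INT NOT NULL REFERENCES product (product_id),
    tool_id        INT NOT NULL REFERENCES tool (tool_id)
);
```

The REFERENCES clauses are what make those columns proper foreign keys. Half a million rows of ints and a datetime is tiny by RDBMS standards, so 12,000 events a month shouldn't cause any trouble.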
# ? Jan 28, 2009 19:17 |
|
More of a database design question than SQL strictly speaking but here it is: I have a database (MySQL, but I mostly interact with it using an ORM) that holds biological data. Several of the objects realized from the db have fields that are similar in meaning, and currently identical in implementation. For example: There is a table of countries. Each country can have several synonyms. There is a table of organism names (taxa). Each organism can have several synonyms. There is a table of samples. Each sample can have several external references (URNs). There is a table of molecular sequences. Each sequence can have several external references. So there will be a table of synonyms and a table for external references. The big question is: should these secondary tables be shared by the parent primary tables? That is, should the country and organism synonyms be in one table ("synonyms") or two ("country_synonyms", "organism_synonyms")? Arguments for the one table solution: * Why the hell not? * It's simpler * It's fewer tables (and fewer classes derived from the tables) * The implementation for the two fields is identical. Arguments for the two table solution: * "synonym" and "external reference" can mean slightly different things for the different parent objects. (The external reference for a molecular sequence points at its id or location in another database system, the external reference for a sample means the number scribbled on a physical tag hung on a sample when collected.) * Is the number of tables really a problem? * The implementation for the two fields is identical _currently_ but what if it changes? Any ideas, or design principles to consider in these cases?
|
# ? Jan 29, 2009 11:41 |
|
outlier posted:More of a database design question than SQL strictly speaking but here it is: The single synonym solution is simpler until you end up with an organism and country that have the same id, and then your DB tells you that 'Blue breasted tit warbler' is a synonym for Tunisia. And as you point out, they aren't really the same thing. You're thinking of maybe keeping them in the same table because you happened to give them the same name when you set up the tables. If you had called them organism_common_names and country_alternate_designations or sequence_external_reference and sample_tag_number you probably wouldn't be asking the question in the first place. Jethro fucked around with this message at 18:38 on Jan 29, 2009 |
# ? Jan 29, 2009 18:34 |
|
Jethro posted:The single synonym solution is simpler until you end up with an organism and country that have the same id, and then your DB tells you that 'Blue breasted tit warbler' is a synonym for Tunisia. Bam. Hadn't thought of that. By and large, for tables that represent fields I use autogenerated IDs, but it's not a decision I'd thought deeply about. Thanks. quote:And as you point out, they aren't really the same thing. You're thinking of maybe keeping them in the same table because you happened to give them the same name when you set up the tables. If you had called them organism_common_names and country_alternate_designations or sequence_external_reference and sample_tag_number you probably wouldn't be asking the question in the first place. For the sake of argument (and better understanding of design), if they were the same and would always be implemented the same, would they belong in the same table together? To cook up an artificial example, say I have two tables that are countries and ecosystems. The boundaries of a country are represented by a polygon that is stored in another table. The boundaries of an ecosystem are also represented by a polygon. Leaving aside the valid point about colliding identifiers as above, do the two sets of polygons belong in the same table?
|
# ? Jan 30, 2009 12:02 |
|
outlier posted:For the sake of argument (and better understanding of design), if they were the same and would always be implemented the same, would they belong in the same table together? To cook up an artificial example, say I have two tables that are countries and ecosystems. The boundaries of a country are represented by a polygon that is stored in another table. The boundaries of an ecosystem are also represented by a polygon. Leaving aside the valid point about colliding identifiers as above, do the two sets of polygons belong in the same table? In my experience the query convenience and the time saved by putting two pieces of data that are only marginally similar into the same table is almost always outweighed by the grief it causes you later in the application's life. The biggest downside of the model is the inability to effectively use database constraints to police your data, which forces you to do constraint checking in your application, which is almost certainly a mistake.
|
# ? Jan 30, 2009 16:17 |
|
Is it possible to SEO-style normalise a URI in a MySQL stored function? I'm thinking of a function that converts "Lingeries sets" into "lingerie-sets", and an appropriate method of querying the database for reverse lookup. Normalising is easy on the command line: code:
code:
code:
MrMoo fucked around with this message at 20:05 on Jan 30, 2009 |
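A rough sketch of such a stored function in MySQL; note it only lowercases, trims, and hyphenates — no stemming (so "Lingeries" would not become "lingerie") and no general punctuation stripping:

```sql
DELIMITER //

CREATE FUNCTION slugify(s VARCHAR(255)) RETURNS VARCHAR(255)
DETERMINISTIC
BEGIN
    -- lowercase, trim, and replace spaces with hyphens
    RETURN LOWER(REPLACE(TRIM(s), ' ', '-'));
END //

DELIMITER ;
```

For the reverse lookup it's usually easier to store the slug in its own indexed column at insert time and query that directly, rather than trying to un-normalise on the fly.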
# ? Jan 30, 2009 18:37 |
|
var1ety posted:In my experience the query convenience and the time saved by putting two pieces of data that are only marginally similar into the same table is almost always outweighed by the grief it causes you later in the application's life. That's the sort of advice I was looking for. Thanks.
|
# ? Jan 30, 2009 19:07 |
|
outlier posted:Bam. Hadn't thought of that. By and large, for tables that represent fields I use autogenerated IDs, but it's not a decision I'd thought deeply about. Thanks. You could do something like: Polygons (PolygonID, shape parameters), CountryShape (CountryID, PolygonID), and EcosystemShape (EcosystemID, PolygonID). Whether that's more convenient than just having two separate polygon tables would depend on the application.
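That layout as DDL, for concreteness (column types and the referenced Countries/Ecosystems tables are assumed):

```sql
CREATE TABLE Polygons (
    PolygonID INT PRIMARY KEY,
    shape     GEOMETRY NOT NULL       -- or whatever shape encoding you use
);

CREATE TABLE CountryShape (
    CountryID INT NOT NULL REFERENCES Countries (CountryID),
    PolygonID INT NOT NULL REFERENCES Polygons (PolygonID),
    PRIMARY KEY (CountryID, PolygonID)
);

CREATE TABLE EcosystemShape (
    EcosystemID INT NOT NULL REFERENCES Ecosystems (EcosystemID),
    PolygonID   INT NOT NULL REFERENCES Polygons (PolygonID),
    PRIMARY KEY (EcosystemID, PolygonID)
);
```

Polygons stays ignorant of what owns it, while each link table carries a real foreign key — which preserves exactly the constraint policing var1ety warned about losing.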
|
# ? Jan 30, 2009 23:50 |
camels posted:something like this? Yes! How come the portion after UNION won't properly GROUP BY the Product though?
|
|
# ? Jan 31, 2009 02:13 |
|
fletcher posted:Yes! How come the portion after UNION won't properly GROUP BY the Product though? The result set in effect only has one Product, so the SUM() acts on the whole result set, and doesn't need a GROUP BY statement, if that makes sense.
|
# ? Jan 31, 2009 02:31 |
camels posted:The result set in effect only has one Product, so the SUM() acts on the whole result set, and doesn't need a GROUP BY statement, if that makes sense. Oh I was trying to be lazy and reuse most of the SQL statement already built since it's huge. I don't know why I didn't realize that. I need some sleep I think. Thanks!
|
|
# ? Jan 31, 2009 04:19 |
|
fletcher posted:Oh I was trying to be lazy and reuse most of the SQL statement already built since it's huge. I don't know why I didn't realize that. I need some sleep I think. Thanks! Unless there's some compelling reason not to calculate "d" on the client side, I think doing so would be simpler, clearer, easier to maintain, and likely to perform better. Or you could use a temp table (untested): code:
|
# ? Jan 31, 2009 06:56 |
|
var1ety posted:In my experience the query convenience and the time saved by putting two pieces of data that are only marginally similar into the same table is almost always outweighed by the grief it causes you later in the application's life. Also it violates first normal form. A table models an entity, so you would not want to store different entities in the same table if they are only marginally similar. Besides that you would end up with a table where most of the column values are NULL.
|
# ? Feb 2, 2009 03:05 |
|
Can someone explain aggregate functions and GROUP BY to me? I've read a bunch of simple tutorials and can't seem to make sense of this. All this code has to do is, for each product we sell, find the latest purchase order (PO) that was generated (the MAX(po date)) and return some basic info about it. code:
What am I missing here? It seems like it should be this simple.
|
# ? Feb 3, 2009 19:05 |
|
Forkboy posted:Can someone explain aggregate functions and GROUP BY to me? I've read a bunch of simple tutorials and can't seem to make sense of this. It's exactly what the error message says. Either add the other fields to the GROUP BY or contain them in aggregate functions. Edit: I'm sure it's not the way to do it, but I'd do a subquery. I'm sure you've got some sort of order ID, right? code:
Pizza Partisan fucked around with this message at 19:25 on Feb 3, 2009 |
# ? Feb 3, 2009 19:14 |
|
surrealcatalyst posted:It's exactly what the error message says. Either add the other fields to the GROUP BY or contain them in aggregate functions. This is what I'm not getting about the GROUP BY clause... adding every other field to the GROUP BY returns more than one record for each product code. I just need the one with the latest date.
|
# ? Feb 3, 2009 19:37 |
|
Forkboy posted:This is what I'm not getting about the GROUP BY clause... adding every other field to the GROUP BY returns more than one record for each product code. I just need the one with the latest date. GROUP BY foo, bar, baz sorts the input relation into groups for all the unique combinations of values from foo, bar, and baz; its output relation then has one row summarizing each group. So the columns in the output relation have to summarize (potentially) multiple rows; therefore they must be either (1) aggregate functions (like AVG or MAX) over some input column or (2) one of the group-by columns, which are guaranteed to have a unique value within each group. What you want is an aggregate function which says "pick the row with the maximum date, then return such-and-such column from that row", which is not an a priori unreasonable request, but sadly there's just no such function built in to any databases that I know of. So you'll basically need a subquery. rjmccall fucked around with this message at 20:07 on Feb 3, 2009 |
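The subquery rjmccall describes, with guessed table and column names: first find each product's latest date, then join back to pick up the rest of that row.

```sql
SELECT po.product_code, po.po_number, po.vendor, po.po_date
FROM   purchase_orders po
JOIN  (SELECT product_code, MAX(po_date) AS max_date
       FROM   purchase_orders
       GROUP  BY product_code) latest
       ON  latest.product_code = po.product_code
       AND latest.max_date     = po.po_date;
```

If two POs for the same product share the latest date you'll still get both rows; break the tie on po_number or similar if that matters.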
# ? Feb 3, 2009 20:04 |
|
I present The Ark of the Covenant (SQL 2005): code:
We complained to the provider of the above code that if we hired a people farm they could search our database manually faster than this stored procedure. To which they replied "do you have any pointers on how to make it faster?". I am extremely frustrated right now and can't verbalize any reply to the question. Short of telling them to gently caress off and fire all their staff that has worked on the stored procedures, what resources can I give them on how not to be a worthless loving SQL programmer? edit: table breakage fix
|
# ? Feb 3, 2009 23:33 |
|
Make them read these books. Maybe look at some of my sql scripts such as historical data via EAV. Demand that they perform better, and when they say they can't, prove them wrong. Maybe if they were to read BetterSQL, they could understand some of what they're missing?
|
# ? Feb 4, 2009 00:27 |
|
I would write a query that doesn't use dynamic SQL and is optimized, then run it on a test database where you can do this: code:
Something tells me you won't get anyone to see the light about EAV. Celko has some good stories to tell on USENET. He worked for a company in Austin that used EAV, and over time data integrity, performance, and maintainability got worse and worse and worse until the company went under.
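The measurement block above was stripped; the standard T-SQL switches for this kind of before/after comparison look like this (the SET commands are real, the placement is a guess):

```sql
SET STATISTICS IO ON;    -- per-table logical/physical reads
SET STATISTICS TIME ON;  -- CPU and elapsed time per statement

-- ... run the slow stored procedure, then the rewritten query ...

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```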
|
# ? Feb 4, 2009 01:22 |
|
Also, even if you can't make any changes to the stored proc or to the table design, I wouldn't be surprised if there are missing, unneeded, and suboptimal indexes on the tables. Again it might be the kind of thing where you could use a test database to make a case for some changes, if you can get it to run much faster.
|
# ? Feb 4, 2009 01:26 |
|
How do I assign a single returned number from a select statement to a variable? Basically I'm looking at something like this: code:
And then I guess as a follow up if anyone knows, I'm going to have to do incremental values of that variable, based on rn + 1 to do this with multiple rows. Unless there's an easier way to do all this... Basically I need to run a case when speedsensorid = something in particular, and I'm using this to try and get those IDs and then put them into the case statement, because it seems like I can't use subselects in my case statements, unless I'm just doing that all wrong.
|
# ? Feb 4, 2009 21:42 |
|
Khatib posted:How do I assign a single returned number from a select statement to a variable? What you're looking for are SQL cursors. I don't have enough confidence to write out the code for you, but I can warn you to make sure they properly deallocate themselves so they don't kill your SQL server. As for part one, code:
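The "part one" snippet was presumably a straight SELECT assignment; in T-SQL it looks like this (speedsensorid is from the question, the rest of the names are guesses):

```sql
DECLARE @sensor_id INT;

SELECT @sensor_id = speedsensorid
FROM   sensors
WHERE  rn = 1;   -- if this matches several rows, @sensor_id ends up
                 -- holding the value from the last row processed
```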
|
# ? Feb 4, 2009 23:05 |
|
Let's say I'm getting a large (2 million rows?) amount of data that's supposed to be static and unchanging. Supposed to be. And this data gets republished monthly. What methods are available to 1) be aware of what data points have changed from month to month and 2) consume the data given a point in time? Solution 1) Naively save every snapshot of data, annotated by date. Diff awareness is handled by some in-house program, but consumption of the data by date is trivial. Cons, space requirements balloon by an order of magnitude. Solution 2A) Using an in-house program, track when the diffs happen and store them in an EAV table, annotated by date. Space requirements are low, but consumption integrated with the original data becomes unwieldy. Solution 2B) Using an in-house program, track when the diffs happen and store them in a sparsely filled table that looks much like the original table, filled only with the data that's changed and the date when changed. Cons, model is sparse and consumption integrated with the original data is non-trivial. I guess, basically, how do I integrate the dimension of time into a relational database, keeping in mind both the viewing of the data and awareness of differences between time periods? Does this relate to data warehousing at all? Smells like... Slowly changing dimension? Triple Tech fucked around with this message at 23:23 on Feb 4, 2009 |
# ? Feb 4, 2009 23:09 |
|
Khatib posted:How do I assign a single returned number from a select statement to a variable? Can you use a lookup table? Cursors and row-at-a-time processing are slow as poo poo. I would look for a way to deal with whole sets of data instead of using procedural, one row at a time programming.
|
# ? Feb 4, 2009 23:28 |
|
Triple Tech posted:Let's say I'm getting a large (2 million rows?) amount of data that's supposed to be static and unchanging. Supposed to be. And this data gets republished monthly. What methods are available to 1) be aware of what data points have changed from month to month and 2) consume the data given a point in time? What RDBMS are you using? I know SQL Server has database diff tools that might give you what you need. Here's one: Apex SQL Data Diff. It can diff database backups.
|
# ? Feb 4, 2009 23:44 |
|
Triple Tech, how often do you have to query the 2 million rows and how often are you looking at data that isn't the most current? The specific usage of this table will almost definitely determine how you proceed. You might want to look into table-valued UDFs and see if they are smart about not fully "rendering" the table returned. If SQL Server is this smart, then you can abstract the implementation from the interface in a nice way. Otherwise, you might have to hard-code stuff and have it be relatively ugly.
|
# ? Feb 4, 2009 23:49 |
|
Triple Tech, Not sure what the specifications are, but I'm guessing that you can do what you need on the database side. What exactly do you need to do? The diff tool I mentioned can compare two databases (or backups) and generate a .csv file of the data differences. There are also tools that can analyze and manipulate transaction log data; actually it looks like Apex SQL makes one. I've used one before but can't remember who makes it. If there is a tool that can analyze a differential database backup that would be p. sweet. Another approach, if the data changes from month to month are relatively small, would be to use triggers to write to auditing tables that you could query to see all the data that changed in that month. I don't really see a need to write much code on the client side, or to use table-valued UDFs here. If you log changes to auditing tables then you can just use a view or straight query the data. If you use SQL Server then you can use a tool that you can buy for 500 bucks.
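The trigger idea, sketched in T-SQL — the audited table, audit table, and column names are all made up:

```sql
CREATE TRIGGER trg_prices_audit ON prices
AFTER UPDATE
AS
BEGIN
    -- 'deleted' holds the pre-update rows, 'inserted' the post-update rows
    INSERT INTO prices_audit (price_id, old_value, new_value, changed_at)
    SELECT d.price_id, d.value, i.value, GETDATE()
    FROM   deleted  d
    JOIN   inserted i ON i.price_id = d.price_id;
END;
```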
|
# ? Feb 5, 2009 00:19 |
|
It's SQL Server 2005. The hosed up part is that neither my boss nor I know the use-case for this data, they just want it logged. Personally I think this whole venture is retarded and time consuming but my boss doesn't seem to want to budge. He's just out to fulfill this business directive. And without getting into the text space, how do you diff two rows? How do I know that one whole row of data is equal to another row, in SQL space?
|
# ? Feb 5, 2009 04:58 |
|
Triple Tech posted:It's SQL Server 2005. The hosed up part is that neither my boss or I know the use-case for this data, they just want it logged. Personally I think this whole venture is retarded and time consuming but my boss doesn't seem to want to budge. He's just out to fulfill this business directive. I think it would make more sense to capture all the data changes over the one-month-period than diffing all the data in the two databases by hand. First I would consider using a tool to analyze the transaction logs, or using database triggers to write modified rows to audit tables. Second, if you absolutely have to diff all the data I would use a third-party tool instead of rolling your own code.
|
# ? Feb 5, 2009 05:24 |
|
My web host provides PHPMyAdmin as one of the tools that can be used to manage MySQL databases. I was idly looking through just now and noticed that a warning message of the following form appears on the page for each of the tables in my database (except for one of them which has two columns for its primary key):quote:The following indexes appear to be equal and one of them should be removed: PRIMARY, UserID Is this something I should be concerned about? The tables were created using commands of the following form: quote:CREATE TABLE User(UserID SERIAL, some other columns, PRIMARY KEY(UserID))
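For what it's worth, the warning is almost certainly a side effect of SERIAL itself, which in MySQL is shorthand that already includes a unique key:

```sql
-- SERIAL expands to: BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE
-- so this creates both an implicit UNIQUE index (named after the column)
-- and the explicit PRIMARY KEY on the same column:
CREATE TABLE User (
    UserID SERIAL,
    PRIMARY KEY (UserID)
);

-- dropping the redundant unique index should clear the warning:
ALTER TABLE User DROP INDEX UserID;
```

The data is safe either way; the duplicate index just wastes a little space and write time.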
|
# ? Feb 5, 2009 21:41 |
|
Hammerite posted:My web host provides PHPMyAdmin as one of the tools that can be used to manage MySQL databases. I was idly looking through just now and noticed that a warning message of the following form appears on the page for each of the tables in my database (except for one of them which has two columns for its primary key): Why create tables with commands rather than using PHPMyAdmin, if you've got access to it?
|
# ? Feb 5, 2009 21:52 |
|
|
|
surrealcatalyst posted:Why create tables with commands rather than using PHPMyAdmin, if you've got access to it? I didn't realise at the time that there was a tool like PHPMyAdmin available. I was learning PHP and MySQL out of a book and pretty much did what the book said to do. It gave the PHP code for a page that could be used for sending queries to the database. I still use that page when I want to query the database, except for security it now needs the MySQL username, password and database name to be entered by the user.
|
# ? Feb 5, 2009 23:18 |