|
pseudorandom posted:What's the best way to go about handling items of information in a database that potentially have an unlimited number of attributes to each item? When you say an unlimited number of attributes, do you mean an unlimited number of the same type of attribute, i.e. a post has an unlimited number of comments but no other attributes? If that is the case, then first make sure all your tables have a primary key which exists only to identify each record in the table. You can then associate any number of records from one table with the other, as DankTamagachi describes. The association can be enforced in the DB with a foreign key relationship. Your schema for both the cases above would look something like this (if I recall MySQL syntax correctly): code:
EDIT: I think it should be obvious how you'd add contacts using this scheme. You'll need two columns referencing Users(user_id)
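[The original code block was stripped from the archive; a minimal sketch of the scheme described above, with hypothetical table and column names, might look like this:]

```sql
-- Every table gets a surrogate primary key that exists only to
-- identify each record.
CREATE TABLE Users (
    user_id INT NOT NULL AUTO_INCREMENT,
    name    VARCHAR(100) NOT NULL,
    PRIMARY KEY (user_id)
) ENGINE=InnoDB;

-- A post/user can have any number of comments; the association is
-- enforced with a foreign key relationship.
CREATE TABLE Comments (
    comment_id INT NOT NULL AUTO_INCREMENT,
    user_id    INT NOT NULL,
    body       TEXT NOT NULL,
    PRIMARY KEY (comment_id),
    FOREIGN KEY (user_id) REFERENCES Users (user_id)
) ENGINE=InnoDB;

-- Contacts need two columns referencing Users(user_id).
CREATE TABLE Contacts (
    user_id    INT NOT NULL,
    contact_id INT NOT NULL,
    PRIMARY KEY (user_id, contact_id),
    FOREIGN KEY (user_id)    REFERENCES Users (user_id),
    FOREIGN KEY (contact_id) REFERENCES Users (user_id)
) ENGINE=InnoDB;
```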
|
# ? Sep 24, 2011 00:24 |
|
pseudorandom posted:potentially unlimited number of attributes for each item? Define "potentially unlimited attributes." Do you mean that any given item can have any number of user-defined or item-specific attributes ("attributes" in this case would be columns in your table)? Or do you mean that you as the designer are going to specify the attributes (again, columns) that are common to all items and there can be an unlimited number of entries that correspond to one unique item? Your examples seem to indicate the latter. So say item "widget27" could have a million people all enter a comment about that item. If this is the case, I would do what you suggested: have one table for comments and have foreign keys for the item ID and the user ID who entered the comment (if you're storing user information). For ratings you could basically do the same thing. Have a table with UserID, ItemID, and Rating. The first two columns are foreign keys to maintain referential integrity, and you can also make the two together the composite primary key for the table, which will make sure each user only rates each item once and avoids needing to create a separate ID for ratings. If, on the other hand, you want each item to have user-defined or item-specific attributes, that's tougher. You can look at the Entity-Attribute-Value (EAV) model. Be warned: EAV is hairy, hurts readability, and breaks some of the advantages of the relational database model.
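[A sketch of the ratings table described above, with hypothetical names:]

```sql
CREATE TABLE Ratings (
    UserID INT NOT NULL,
    ItemID INT NOT NULL,
    Rating TINYINT NOT NULL,
    -- Composite primary key: each user rates each item at most once,
    -- and no separate surrogate rating ID is needed.
    PRIMARY KEY (UserID, ItemID),
    -- Foreign keys maintain referential integrity.
    FOREIGN KEY (UserID) REFERENCES Users (UserID),
    FOREIGN KEY (ItemID) REFERENCES Items (ItemID)
);
```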
|
# ? Sep 24, 2011 00:35 |
|
Che Delilas posted:Define "potentially unlimited attributes." Do you mean that any given item can have any number of user-defined or item-specific attributes ("attributes" in this case would be columns in your table)? Or do you mean that you as the designer are going to specify the attributes (again, columns) that are common to all items and there can be an unlimited number of entries that correspond to one unique item? Yeah, the latter is pretty much what I meant. Sorry I couldn't better explain it myself. Actually, I think the reason why I was confused is because several of the things I'm trying to do could be done both ways, so that's why I needed help. And, thank you very much, Zombywuf. That seems to be perfect! Thanks for your help, guys!
|
# ? Sep 24, 2011 01:16 |
|
I'm trying to import a flat file of people/employee details into a new MSSQL 2005 database. I'm using bulk insert on a view of a staging/temp table, but I'm not sure how to move this data into my relational tables. My biggest problem is that each person has multiple IDs associated with them from other systems. I assign them a surrogate primary key "pID" (person ID) in my person table. I then use the pID as a foreign key in a table that stores all IDs each person could be referenced with.
|
# ? Sep 25, 2011 21:30 |
|
Ugh. I'm trying to build a view in MySQL based on different types of user permissions. There is a "friends" table that shows if two users are friends, and a "lock" table that shows if a photo has been given viewing permissions to a specific user. tphoto.sec has security settings: 0 = anonymous viewing, 1 = friends only, 2 = specific users. User 54 is friends with user 2 and user 56. There is a row in the lock table that gives permission on photo 21 to user ID 2. code:
code:
code:
Reo fucked around with this message at 23:58 on Sep 25, 2011 |
# ? Sep 25, 2011 23:55 |
|
WHEN 1 and EXISTS (...) This does not do what you think it does. This case is testing whether tphoto.sec is equal to (1 AND EXISTS(...)), not whether (tphoto.sec = 1) AND EXISTS(...). The WHEN expression in the "2" case is always going to evaluate to either 1 or 0, never 2, so that branch can never match. You need to rewrite the CASE statement as: CASE WHEN tphoto.sec = 0 THEN 'Approved-Anon' WHEN tphoto.sec = 1 AND EXISTS ( ... Edit: Got confused between AND and OR, same issue applies though. Hammerite fucked around with this message at 00:10 on Sep 26, 2011 |
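[A sketch of the fix described above. The table and column names and the 'Approved-Anon' label come from the posts; the other labels and the elided EXISTS subqueries are hypothetical:]

```sql
-- In a simple CASE (CASE tphoto.sec WHEN ...), each WHEN value is
-- compared to tphoto.sec as a whole, so "WHEN 1 AND EXISTS (...)"
-- compares tphoto.sec to the 0/1 result of the boolean expression.
-- A searched CASE puts the full condition in each WHEN instead:
SELECT CASE
         WHEN tphoto.sec = 0 THEN 'Approved-Anon'
         WHEN tphoto.sec = 1 AND EXISTS (SELECT 1 FROM friends ...)
              THEN 'Approved-Friend'
         WHEN tphoto.sec = 2 AND EXISTS (SELECT 1 FROM ...)
              THEN 'Approved-Locked'
         ELSE 'Denied'
       END AS access
  FROM tphoto;
```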
# ? Sep 26, 2011 00:06 |
|
I thought the same thing and changed the security value to a char column (A, B, C) instead of integers, and got vastly different results (the case statements now show WHEN tphoto.secB = 'A' etc.). code:
Reo fucked around with this message at 00:13 on Sep 26, 2011 |
# ? Sep 26, 2011 00:10 |
|
Reo posted:I thought the same thing and changed the security value to a char column (A,B,C) instead of integers, and got vastly different results - (the case statements now show WHEN tphoto.secB = 'A' etc. post the amended query in full?
|
# ? Sep 26, 2011 00:11 |
|
I just updated the results; apparently it flips them if I use WHEN 'A' vs. WHEN tphoto.secB = 'A'. Anyways, here it is: code:
code:
Reo fucked around with this message at 00:16 on Sep 26, 2011 |
# ? Sep 26, 2011 00:14 |
|
CASE tphoto.secB WHEN ... This is not what you want. Change it to CASE WHEN ... The workings of the CASE statement are explained on this page: http://dev.mysql.com/doc/refman/5.0/en/case-statement.html Note that there are two different ways to use it.
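[The two forms from that manual page, sketched with the names from the thread (result labels hypothetical):]

```sql
-- Simple CASE: the expression after CASE is compared against each
-- WHEN value. Fine for plain equality tests:
CASE tphoto.secB
    WHEN 'A' THEN 'anonymous'
    WHEN 'B' THEN 'friends'
    ELSE 'specific users'
END

-- Searched CASE: each WHEN is a full boolean condition, which is what
-- you need when a branch also has to test an EXISTS subquery:
CASE
    WHEN tphoto.secB = 'A' THEN 'anonymous'
    WHEN tphoto.secB = 'B' AND EXISTS (SELECT 1 FROM friends ...)
         THEN 'friends'
    ELSE 'denied'
END
```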
|
# ? Sep 26, 2011 00:17 |
|
Aha! God drat it. That was it all along. Thank you! code:
|
# ? Sep 26, 2011 00:19 |
|
I have to loop through a table and perform inserts/updates depending on whether a certain column value exists. Is there any way to make variables that contain the first entry in a table? I want to do something like the code below, but the code doesn't work. code:
|
# ? Sep 26, 2011 04:09 |
|
Super Delegate posted:I have to loop through a table and perform inserts/updates depending on if a certain column exists. Is there anyway to make variables that contain the first entry in a table? It sounds like what you're doing is horribly wrong, but to do it your code should look something like this: code:
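[The code block above was stripped from the archive; a minimal T-SQL sketch of pulling the first row into variables, with hypothetical table and column names, might look like:]

```sql
DECLARE @firstName nvarchar(50),
        @lastName  nvarchar(50);

-- TOP 1 with an ORDER BY makes "first entry" deterministic.
SELECT TOP 1
       @firstName = firstname,
       @lastName  = lastname
  FROM staging
 ORDER BY stagingID;
```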
|
# ? Sep 26, 2011 09:17 |
|
If that's MSSQL you need to give the nvarchar a size when you declare it. So if the column is 50 characters wide, you'd do code:
|
# ? Sep 26, 2011 17:14 |
|
Jethro posted:If that's MSSQL you need to give the nvarchar a size when you declare it. You don't have to. Interesting things happen when you don't though.
|
# ? Sep 26, 2011 17:44 |
|
Alright, I'm wondering if anyone can help me figure out some FileMaker nonsense? I know, I know, but it's what the client wants/already has in place. I have a data structure containing a non-unique ID and a dollar value. I want to have a field display the total dollar value for all instances of a given non-unique ID, across records. How can I do this?
|
# ? Sep 26, 2011 23:20 |
|
Zombywuf posted:You don't have to. Interesting things happen when you don't though. It doesn't just default to nvarchar(max)?
|
# ? Sep 27, 2011 01:59 |
|
Now that I think about it, I believe it defaults to nvarchar(40), but that may be context dependent.
|
# ? Sep 27, 2011 15:10 |
|
Jethro posted:Now that I think about it, I believe it defaults to nvarchar(40), but that may be context dependent. It can be very context dependent. I've seen it default to Varchar(1) for example (and then not give warnings about string truncation).
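[For what it's worth, in SQL Server the default length does depend on where the bare type appears; a quick sketch of the two documented cases:]

```sql
-- In a DECLARE, nvarchar with no length defaults to nvarchar(1):
DECLARE @s nvarchar;
SET @s = N'hello';
SELECT @s;  -- silently truncated to a single character

-- In CAST/CONVERT, the default length is 30 instead:
SELECT CAST(N'a string longer than thirty characters!' AS nvarchar);
```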
|
# ? Sep 27, 2011 15:57 |
|
We're slightly paranoid about Oracle's moves on MySQL; does anyone have any experience or comment on MariaDB?
|
# ? Sep 27, 2011 19:23 |
|
Golbez posted:We're slightly paranoid about Oracle's moves on MySQL; does anyone have any experience or comment on MariaDB? The only things I've heard are that the people who tried it had a seamless transition if they were on 5.x. If you're on 4.x you have the same issues as when you're going to 5.x.
|
# ? Sep 27, 2011 19:28 |
|
General database design question; I have lots of relationships that boil down to assigning object A to object B, of varying types. For example, descriptive tag A is tied to photo B, photo B is tied to photoset Q, etc. Is there any advantage/disadvantage to having one table thus:

Table [Assoc]
ParentID
ChildID
type

--- versus N tables ---

Table [TagToPhoto]
TagID
PhotoID

Table [PhotoToSet]
PhotoID
SetID
...
|
# ? Sep 27, 2011 20:41 |
|
Reo posted:General database design question; Unless you need to have nested sets of photos, go for the second option. It will result in nicer queries and less banging your head against the wall.
|
# ? Sep 27, 2011 20:43 |
|
Reo posted:General database design question; Anything but the first one: you will not be able to create foreign key relationships to enforce integrity, the resulting queries will be terrible, maintenance will make you tear out your eyes, the dead will rise from their graves, etc... What is the relationship between the objects? This should be considered for each pair in turn. For example, if a photo always belongs to exactly one photoset (which I assume contains 0 or more photos), then each photo should have a photoset_id. If it's a generic n-to-m relationship, as I would assume is the case between tags and photos, then yes, you want one table per relationship (and put REFERENCES constraints on the drat columns).
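[A sketch of the two relationship shapes described above, with hypothetical names:]

```sql
-- A photo belongs to exactly one photoset: just a column plus a
-- foreign key, no association table needed.
CREATE TABLE photo (
    photo_id    INT NOT NULL PRIMARY KEY,
    photoset_id INT NOT NULL REFERENCES photoset (photoset_id)
);

-- Tags-to-photos is n-to-m: one junction table per relationship,
-- with REFERENCES constraints on both columns.
CREATE TABLE tag_to_photo (
    tag_id   INT NOT NULL REFERENCES tag (tag_id),
    photo_id INT NOT NULL REFERENCES photo (photo_id),
    PRIMARY KEY (tag_id, photo_id)
);
```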
|
# ? Sep 27, 2011 22:27 |
|
Zombywuf posted:It sounds like what you're doing is horribly wrong, but to do it your code should look something like this: I mostly create reports so I don't have much experience outside of that. MERGE sounds like it will be useful, but I'll explain my DB setup, and hopefully someone can confirm this can be done with MERGE. My new database (SQL Server 2005) stores information about employees and will be replacing several legacy databases. Unfortunately, the transition will take a long time, and I will have to keep my database in sync until it completely replaces the legacy systems. I receive a CSV from each legacy system that looks similar to this: code:
[Person] - table for person information
COLUMNS:
pID - PK/person ID number (auto increment)
firstname
lastname
...

[personAlias] - each person has multiple IDs (each legacy system assigned them unrelated IDs, so this table links all of their old IDs back to them)
COLUMNS:
PersonAliasKey - table PK
pID - FK from Person
LegacyID

[PersonTitle]
PersonTitleKey - table PK
pID - FK from Person
title
...

[PersonTask]
PersonTaskKey - table PK
pID - FK from Person
task - each task is a note about a job an employee is working on

This is the logic I want for the insert: for each row of the CSV, check to see if the legacyID exists in the personAlias table.

If the legacyID is present in my DB: grab the pID associated with the record and make an entry into PersonTitle along with the title. (This is an upsert; the person's title should be updated if an entry has already been made, or inserted if it was not previously present. This table is normalized because in the future we will associate multiple titles with people.) Next, insert the pID and task into the PersonTask table. Each person will have multiple tasks and records will never be deleted/updated, only inserted. If the task data is an exact duplicate of an entry already associated with that pID, no changes are needed to this table.

If the legacyID is not present in my DB: insert firstname + lastname into the Person table, use the pID that was just created and enter that pID + LegacyID into personAlias, and pID + task into PersonTask.
|
# ? Sep 28, 2011 05:46 |
|
Super Delegate posted:For each row of the CSV, Check to see if the legacyID exists in the personAlias table. Urgh, joined upserts. The general strategy I would take is to attempt to mirror the structure you want in temporary tables then merge the changes in. So after reading the CSV into a temp table do something like: code:
This is only a sketch solution; sorry if it's a little unclear. These kinds of tasks are often painful.
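[The code block above was stripped from the archive. A hypothetical T-SQL sketch of the "mirror the structure in a temp table, then merge the changes in" strategy, written for SQL Server 2005 (which predates the MERGE statement, so it uses joined UPDATE/INSERT pairs instead); all names are illustrative:]

```sql
-- 1. New people: legacy IDs never seen before.
INSERT INTO Person (firstname, lastname)
SELECT s.firstname, s.lastname
  FROM #staging s
 WHERE NOT EXISTS (SELECT 1 FROM personAlias a
                    WHERE a.LegacyID = s.LegacyID);
-- (...then link the newly generated pIDs to their LegacyIDs in
-- personAlias, e.g. by matching back on the inserted rows...)

-- 2. Upsert titles: update existing rows, insert missing ones.
UPDATE pt
   SET pt.title = s.title
  FROM PersonTitle pt
  JOIN personAlias a ON a.pID = pt.pID
  JOIN #staging s    ON s.LegacyID = a.LegacyID;

INSERT INTO PersonTitle (pID, title)
SELECT a.pID, s.title
  FROM #staging s
  JOIN personAlias a ON a.LegacyID = s.LegacyID
 WHERE NOT EXISTS (SELECT 1 FROM PersonTitle pt
                    WHERE pt.pID = a.pID);
```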
|
# ? Sep 28, 2011 11:47 |
|
Golbez posted:We're slightly paranoid about Oracle's moves on MySQL; does anyone have any experience or comment on MariaDB? Also consider Percona's not-a-fork. I've used it as a drop-in replacement a few times, and it's worked well. The only problems I've had are the same problems you usually have when upgrading, like forgetting about old_passwords and client library versions.
|
# ? Sep 28, 2011 22:58 |
|
In Oracle 10, does anyone have a good query solution for concatenating multiple rows into a single comma-delimited column? I need to do this with a straight query with no custom function (i.e. stragg), and it's on version 10, so no listagg. So that: code:
code:
Withnail fucked around with this message at 03:29 on Sep 29, 2011 |
# ? Sep 29, 2011 02:40 |
|
Does Oracle not have a group concat function?
|
# ? Sep 29, 2011 16:28 |
|
Yes, listagg, but not until 11g release 2. There's also stragg, which is a commonly used one written by Tom Kyte, but it's not built in. The best "straight SQL" way I know of is the connect_by_path trick, but you have to be careful about nulls: if they can appear in the group and you don't want them in the concatenated string, then you'll need to do something ugly like regexp_replace(regexp_replace(concatenated_column, '(, )(, )+', ', '), '(^(, )+)|((, )+$)', '')
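[A sketch of that trick on a hypothetical emp(deptno, ename) table: number the rows within each group, chain them together with CONNECT BY, and keep the longest path per group:]

```sql
SELECT deptno,
       LTRIM(MAX(SYS_CONNECT_BY_PATH(ename, ',')), ',') AS name_list
  FROM (SELECT deptno, ename,
               ROW_NUMBER() OVER (PARTITION BY deptno
                                  ORDER BY ename) AS rn
          FROM emp)
 START WITH rn = 1
CONNECT BY PRIOR deptno = deptno
       AND PRIOR rn = rn - 1
 GROUP BY deptno;
```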
|
# ? Sep 29, 2011 19:35 |
|
Possible silly question. I'm running a report to compare two different counts: code:
I'm using DB2 on an IBM iSeries machine (Mainframe-lite midrange machine, 16 CPUs, 144 gigs of RAM, 144 15k RPM SAS disks, FC-SAN storage). The to_yyyymmdd thing is a little function to convert real database date types to the yyyymmdd int that our vendor's tables use to store dates. If that runtime seems about right, that's OK, since I can just schedule this thing to run once a day at 5 AM or something and park it in a table in my reports library, but the user in question is one of those types that really goes nuts over wanting things that are accurate as of Right Now, so I'd prefer being able to stick it in a view and get semi-decent response time. e: Most of the time used comes from doing the general count without a filter on it, this one: code:
Gwaihir fucked around with this message at 22:35 on Sep 29, 2011 |
# ? Sep 29, 2011 22:20 |
|
What happens when you take out the GROUP BY?
|
# ? Sep 29, 2011 22:49 |
|
That seems ludicrously long, though I am by no means an expert on IBM hardware. Is the 250,000 rows the result, or the size of the outer join table? Because the most expensive thing there seems to be that join, or potentially the group by depending on how many rows it's compressing.
|
# ? Sep 30, 2011 00:02 |
|
Gwaihir posted:e: Most of the time used comes from doing the general count without a filter on it, this one: What indexes do you have and how wide is the table?
|
# ? Sep 30, 2011 01:14 |
|
Scaramouche posted:That seems ludicrously long, though I am by no means an expert on IBM hardware. Is the 250,000 rows the result, or the size of the outer join table? The outer join select is looking at about 250,000 records for its count operation. That table has probably 15+ indexes on it; I'll have to check exactly what columns are indexed, and how wide it is, tomorrow morning. I would guess about ~30 columns wide. I know the compdt column is included in at least one of the indexes, as well as medcon, but I'm largely hamstrung making any new useful index. These tables are all part of our vendor's software package, and I can't really touch/fix even the most bone-headed things. (The most often advised index from running the index adviser on their library has been logged as a suggestion approximately 67 million times!)
|
# ? Sep 30, 2011 03:55 |
|
Gwaihir posted:e: Most of the time used comes from doing the general count without a filter on it, this one: Store the To_yyyymmdd(CURRENT DATE - 6 MONTHS) and To_yyyymmdd(CURRENT DATE) as variables first maybe? A guy gave me a query the other day that took 20 seconds and I cut it down to a fraction of a second by doing that.
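[A hypothetical DB2 SQL sketch of that suggestion: evaluate the date bounds once up front instead of calling the function per comparison; compdt is the column named in the thread, everything else is illustrative:]

```sql
BEGIN
  DECLARE v_from INT;
  DECLARE v_to   INT;
  SET v_from = to_yyyymmdd(CURRENT DATE - 6 MONTHS);
  SET v_to   = to_yyyymmdd(CURRENT DATE);

  -- ...then use the precomputed bounds in the counting query:
  -- WHERE compdt BETWEEN v_from AND v_to
END
```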
|
# ? Sep 30, 2011 04:02 |
|
I can't really give much advice, but you're right to think something's wrong. I have a cross join on about 2 million rows for a table 90 columns wide (oh how I hate this table), and it's running on a Windows/MSSQL server that's a shadow of your IBM server, and it takes about 2 mins. My server has a ton of RAM so it's rarely touching the disc, but that should be true of yours as well. It could be the variable assignment, as Doctor rear end in a top hat states; I'm not really familiar with how DB2 handles dynamic assignment like that. It's unlikely, but the function could just be very, very badly written as well.
|
# ? Sep 30, 2011 07:10 |
|
Gwaihir posted:the outer join select is looking at about 250,000 records for it's count operation. That table has probably 15+ indexes on it, I'll have to check exactly what columns are indexed, and how wide it is tomorrow morning. I would guess about ~30 columns wide. I know the compdt column is included in at least one of the indexes, as well as medcon, but I'm largely hamstrung making any new useful index. These tables are all part of our vendor's software package, and I can't really touch/fix even the most bone-headed things. (The most often advised index from running the index adviser on their library has been logged as a suggestion approximately 67 million times!) Well, the index you need is (compdt, medcon, t_mreq), possibly even with the first two columns swapped but that depends heavily on the density of medcon. Also, is your database one of those where count(1) is much faster than count(*)?
|
# ? Sep 30, 2011 09:32 |
|
Goat Bastard posted:Yes, listagg, but not until 11g release 2. There's also stragg, which is a commonly used one written by Tom Kyte, but it's not built in. There is an undocumented function in 10g that will do this, wmsys.wm_concat.
|
# ? Sep 30, 2011 13:16 |
|
Allll right.. Problem was totally in my user-defined function. No difference in execution time when I replaced count(*) with count(compdt). There is an index on T_MREQ with COMPDT, MEDCON, (bunch of other columns I don't need). Taking the function out and changing it to variables chopped execution time from 563635ms (9.4 minutes) to 500ms, much better. Guess I'm going to have to go back and see what junk my predecessor actually put in that function... E: I wish I hadn't looked. Augh, string concatenation! code:
Gonna replace that mess with RETURN(int(replace(char(YDATE, ISO),'-',''))), which runs just about as fast as the static value variables. Gwaihir fucked around with this message at 14:53 on Sep 30, 2011 |
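[A sketch of what the replacement function might look like as a DB2 SQL scalar function; the RETURN expression is the one from the post, the surrounding clauses are assumptions. Marking it DETERMINISTIC matters here, since it tells the optimizer the result depends only on the input:]

```sql
CREATE FUNCTION to_yyyymmdd (YDATE DATE)
RETURNS INT
LANGUAGE SQL
DETERMINISTIC
NO EXTERNAL ACTION
-- Format the date as ISO (yyyy-mm-dd), strip the dashes, cast to int.
RETURN int(replace(char(YDATE, ISO), '-', ''));
```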
# ? Sep 30, 2011 14:20 |