|
Lone_Strider, please provide rowcounts for those tables. Is there any possibility of upgrading to SQL2005? Its pivot operator is exactly what you want. MoNsTeR posted:It is highly discouraged and very wrong and makes me want to drink hemlock.
|
# ? Nov 5, 2008 02:23 |
|
|
|
Victor posted:Lone_Strider, please provide rowcounts for those tables. Is there any possibility of upgrading to SQL2005? Its pivot operator is exactly what you want. Funny story: we had plans to upgrade to SQL 2005 a year ago, but we work with the mortgage industry, and when it took a poo poo, any upgrade plans went with it. I can push for it, arguing how much of an impact it will make, though, and hopefully we can get it done.
|
# ? Nov 5, 2008 02:41 |
|
Triple Tech posted:What is the alternative to a workflow that would like to support ad-hoc fields? It's still a pertinent, albeit ugly, question. In Oracle you could speed up your query by using aggregate functions to avoid the multiple table scans forced by scalar subqueries. Something like this: code:
code:
code:
The user then modifies the view to include this new column, allowing himself to easily include it in his queries. code:
code:
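The Oracle snippets above were lost, but the idea described — using aggregate functions to pivot EVA attributes in a single scan instead of one scalar subquery per attribute — can be sketched. This is a minimal illustration using SQLite through Python (the `attrib` table and its column names are made up for the example, and `MAX(CASE ...)` stands in for whatever aggregates the original used):

```python
import sqlite3

# Hypothetical EVA table: one row per (entity, attribute) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attrib (entity_id INTEGER, name TEXT, value TEXT);
    INSERT INTO attrib VALUES
        (1, 'color', 'red'), (1, 'size', 'L'),
        (2, 'color', 'blue');
""")

# One pass over attrib: each attribute becomes a column via MAX(CASE ...),
# instead of one correlated scalar subquery (and table scan) per column.
rows = conn.execute("""
    SELECT entity_id,
           MAX(CASE WHEN name = 'color' THEN value END) AS color,
           MAX(CASE WHEN name = 'size'  THEN value END) AS size
    FROM attrib
    GROUP BY entity_id
    ORDER BY entity_id
""").fetchall()
```

Wrapping a query like this in a view, as described, lets users add a new attribute column by editing one `MAX(CASE ...)` line.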
|
# ? Nov 5, 2008 02:48 |
|
Try the case when approach below, along with left joins instead of subqueries.
|
# ? Nov 5, 2008 03:38 |
|
So in finally moving to SQL 2005 at work I'm revisiting some queries we pushed off because they do deep aggregation and would benefit from ROW_NUMBER/RANK/PIVOT/etc. If I have a schema where, let's say, a Product table can have 0 to many ProductReviews, where ProductReviews.ProductId references Product.ProductId, how would I get the first 9 reviews in a column/cross-tab fashion? I.e. instead of one row per review, like (ProductId, ReviewId, ReviewText) = (1, 1, ...), (1, 2, ...), ..., (1, 9, ...), I would see one row per product, like (ProductId, Review1Id, Review1Text, Review2Id, Review2Text, ..., Review9Id, Review9Text) = (1, 1, 'lorem ipsum', 2, 'lorem ipsum2', ..., 9, 'lorem ipsum9'). I've read that the norm in SQL 2000 is using CASE statements in the select query, which works fine if the values are certain at query time, but a ProductReviewId clearly wouldn't be.
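One way to get that crosstab is ROW_NUMBER to number each product's reviews, then conditional aggregation to fold review *n* into its columns. A minimal sketch (two review columns instead of nine, run through SQLite in Python for illustration; SQL Server 2005 would use the same `ROW_NUMBER() OVER (...)` and could swap the `MAX(CASE ...)` for its PIVOT operator — note SQLite only gained window functions in 3.25):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductReviews (ProductId INTEGER, ReviewId INTEGER, ReviewText TEXT);
    INSERT INTO ProductReviews VALUES
        (1, 1, 'lorem ipsum'), (1, 2, 'lorem ipsum2'), (1, 3, 'lorem ipsum3');
""")

# Number each product's reviews, then fold the first N into columns.
rows = conn.execute("""
    WITH numbered AS (
        SELECT ProductId, ReviewId, ReviewText,
               ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY ReviewId) AS rn
        FROM ProductReviews
    )
    SELECT ProductId,
           MAX(CASE WHEN rn = 1 THEN ReviewId   END) AS Review1Id,
           MAX(CASE WHEN rn = 1 THEN ReviewText END) AS Review1Text,
           MAX(CASE WHEN rn = 2 THEN ReviewId   END) AS Review2Id,
           MAX(CASE WHEN rn = 2 THEN ReviewText END) AS Review2Text
    FROM numbered
    WHERE rn <= 2
    GROUP BY ProductId
""").fetchall()
```

The ReviewId values don't need to be known at query time: only the slot numbers (1..N) are fixed, which is what makes this work where a plain SQL 2000 CASE-on-values approach wouldn't.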
|
# ? Nov 5, 2008 07:24 |
|
Victor posted:Yet another person says EVA is bad for every single situation out there. Depending on how much data you have, it isn't that bad at all. It's especially useful for when attributes are very sparsely populated. More to the point, most of the time someone thinks they need EVA, they don't actually need it, and they are underestimating the maintenance effort by an order of magnitude. Hence the exhortation to reconsider. MoNsTeR fucked around with this message at 16:55 on Nov 5, 2008 |
# ? Nov 5, 2008 16:40 |
|
MoNsTeR posted:There's little that EVA can do that you can't do with real tables, in which case you're actually using the RDBMS for what it's for. If you have a problem that truly demands an EVA solution, don't use an RDBMS. We use an EVA successfully in our web-based application. The same code and database layout is used for a number of different customers and industries, allowing them to customize the data they capture about objects in their system. Some of our customers have 20 attribute fields for each object, some of them have over 600. In the case of the 600-field-client (with millions of objects tracked), only 20-40 of those attributes might be used by any given object, but they have many different departments that store different data. To overcome the performance hits regarding data extraction/mining, we do a pivot of the metadata tables every few hours to a standard table that gets indexed. Search results are then based on this table. EVA has its uses -- we couldn't replicate this in any simple fashion with a standard relational database table setup.
|
# ? Nov 5, 2008 18:33 |
|
But why not model them "the normal way"? What's so insanely beneficial to this setup? It's like the cost of adding a new field completely outweighs the cost incurred by modeling it properly? And that includes maintenance costs and complexity for both?
|
# ? Nov 5, 2008 18:53 |
|
Triple Tech posted:But why not model them "the normal way"? What's so insanely beneficial to this setup? It's like the cost of adding a new field completely outweighs the cost incurred by modeling it properly? And that includes maintenance costs and complexity for both? I wouldn't call it "insanely" beneficial, but the structure of the database is not nearly as complex or difficult to maintain as one might think, and we don't have any tables in the system that contain 600 unused columns for a client that only needs 30 columns. There also aren't any particular limits on the number of different columns one might try to track, which could easily go much higher than the 600 with some of the future proposed applications of the software.
|
# ? Nov 5, 2008 19:12 |
|
I'm not convinced that this is the "right" or most effective thing to do (just saying, no convincing needed). Having 600 attributes to one object just smells like an attempt to model what are really heterogeneous objects whose dimensions are skew and have nothing to do with one another.
|
# ? Nov 5, 2008 19:47 |
|
Triple Tech posted:I'm not convinced that this is the "right" or most effective thing to do (just saying, no convincing needed). Having 600 attributes to one object just smells like an attempt to model what are really heterogeneous objects whose dimensions are skew and have nothing to do with one another. A specific application is for the tracking of a piece of evidence. Let's say that the item is a gun. Then the gun has a make, model, serial number, caliber. Then we also want to know if the gun has been tested by a lab. Yes? What types of tests were done to the item? What were the results of those tests? Who performed them? Who found this gun? Where did they find it? And so on and so forth. Now imagine I am a police department with 100 different types of evidence (guns, knives, blood, saliva, clothes, found property, drugs, etc), and for each type of evidence, we might go through a totally different collection process with a whole new set of "questions" or attributes to answer about the item in question. We want to treat the guns and drugs the same at a base level, because we will be performing other functions on these items (chain of custody, disposition, etc) in an identical manner.
|
# ? Nov 5, 2008 20:06 |
|
So you have a base table called "evidence" and then you have a knives table, a guns table, a clothes table, that all join off this evidence table. The only weakness to this system is the inability to address completely brand new types of evidence, which I feel is the only scenario where EVA shines, and it doesn't happen often enough, I feel, to merit using it anyway.
|
# ? Nov 5, 2008 20:11 |
|
Triple Tech posted:So you have a base table called "evidence" and then you have a knives table, a guns table, a clothes table, that all join off this evidence table. The only weakness to this system is the inability to address completely brand new types of evidence, which I feel is the only scenario where EVA shines, and it doesn't happen often enough, I feel, to merit using it anyway. That's exactly what happens with our customer base. The customization of this structure is completely in the hands of the clients and they do, on a very regular basis, modify the types of evidence they are tracking -- especially forensic labs who might deal with items from a crime scene one day and corporate information theft the next. We use the same software for any of the many industries that can benefit from its use, and it is flexible enough to allow them to track anything the way they want to. I completely agree that for most (the vast majority) of purposes/applications EVA can make a mess of things quick, but the only claim I am making (from experience) is that it does, indeed, have non-abhorrent uses.
|
# ? Nov 5, 2008 20:41 |
|
Triple Tech posted:So you have a base table called "evidence" and then you have a knives table, a guns table, a clothes table, that all join off this evidence table. The only weakness to this system is the inability to address completely brand new types of evidence, which I feel is the only scenario where EVA shines, and it doesn't happen often enough, I feel, to merit using it anyway. N.B. this only applies if one is not using table inheritance: Views, if they exist, need replicating. Want to view "all items"? Then N tables need to be unioned together somehow. Whatever does that unioning needs to be kept up-to-date. Do you really want to require all this if we can make performance a non-issue? Victor fucked around with this message at 02:26 on Nov 6, 2008 |
# ? Nov 5, 2008 22:57 |
|
You don't union heterogeneous objects, that's what the evidence table is for. It has all of the pieces of evidence in the entire universe. The only thing in the other types is information specific to that piece of evidence. It's the whole "let's treat things that are wildly different as the same without factoring out the similarities" angle that I find ridiculous.
|
# ? Nov 5, 2008 23:03 |
|
Thanks for the help and recommendations guys, I'm talking my boss into upgrading to SQL 2005, and I'm getting familiar with pivot in the meantime.
|
# ? Nov 5, 2008 23:08 |
|
Triple Tech posted:The only weakness to this system is the inability to address completely brand new types of evidence, which I feel is the only scenario where EVA shines, and it doesn't happen often enough, I feel, to merit using it anyway. There's nothing wrong with having thousands or even tens of thousands of tables in your application's schema, and there's nothing wrong with tables that have hundreds of columns. Intuitively, there's a point at which this becomes unmanageable; if nothing else, many implementations have hard limits on things like columns-per-table. But unless you're scratching at that limit, implementing an EVA is just an exercise in abstracting the RDBMS from itself, and re-inventing everything it normally does for you, like referential integrity, in ways that don't actually work. The fallacy that leads people to create unnecessary EVAs is thinking that it's easier to maintain a set of abstracted, generic tables that use homegrown faux-RI (or none at all) than it is to create lots and lots of tables. It isn't.
|
# ? Nov 5, 2008 23:14 |
|
MoNsTeR posted:Make a new table. More specifically, enable your application to create new tables behind the scenes as necessary. You can (and should) even create the corresponding referential integrity. I have to agree with this guy... we use EAV at work and it is horrible. I want to punch the "architect" who came up with our schema... we are years into using an EAV schema -- and now we have a huge project to get out of it because of all of the problems we are having.... There is no referential integrity -- we have attributes of the same type for the same entity all over the place -- in one case I've seen 46 active attributes of the same type (with different values). An example of this is a car -- we have one attribute for the car -- called color. Imagine if you go in to query the attributes for CAR 123, and you get back 46 attributes -- some saying the car is blue, some saying the car is red, etc. In addition, every time you want to display another attribute, you either join to the attribute table again, or use another call to a function to get the attribute's value. You also have to deal with types. I don't know what the standard way to deal with types is for EAV models, but ours uses a god drat varchar to store every single type of data, and all kinds of crap is built in to convert it to a real type (for example, a money amount in our system is stored as a varchar with a precision and scale and a type id specifying what kind of value it is... when you go to retrieve the value, it reads the mask and scale and turns '010033' into 100.33). Furthermore, there are attributes out there for certain customers that are the same as other previously created attributes -- back to the car example -- one customer stores his car colors as a 'color' attribute while another calls it a 'hue' attribute. So you want to know which of your customers have blue cars, and now you are just supposed to know that you need to check for both 'hue' and 'color' attributes.
The system sure as hell doesn't know the two attributes are synonymous (unless you program in more convoluted logic to tell it so). Hey -- guess what -- you also get the fun of debugging 300-line queries full of magic numbers -- and I remember one day a report stopped working "all of a sudden." It was a query with 17 left joins to the attributes table -- it started with 2000 rows and, due to data errors (this is where I found those 46 duplicate attributes for a given entity), it ballooned up to 16,000,000 rows (which a DISTINCT clause would pare back down to a few thousand, albeit at huge expense). I HATE EAV... every day I think of quitting because of the horrible poo poo queries I see and have to debug.... but for some reason I don't. Maybe I am a masochist.
|
# ? Nov 6, 2008 04:59 |
|
dancavallaro posted:This man speaks the truth. It didn't work when I tried it in the WHERE clause, but with HAVING it worked fine. You can also do a join if you don't want to do a bunch of subselects... there are two ways you can do it: I prefer: code:
code:
|
# ? Nov 6, 2008 05:10 |
|
Oh god. Please help me figure out what index to put in for this. I see this in all my slow logs since it doesn't use indexes. It's called on pretty much every page on my forum. code:
code:
|
# ? Nov 6, 2008 07:34 |
|
Triple Tech posted:You don't union heterogeneous objects, that's what the evidence table is for. It has all of the pieces of evidence in the entire universe. The only thing in the other types is information specific to that piece of evidence. Just to note, I have quite a bit of experience with using metadata to describe data structure and then automatically generate user interfaces from that metadata. In fact, my shop has an app that would very easily support the table inheritance model you propose. However, this is only because we have a pretty extraordinary metadata architecture -- something that took a lot of time to develop. quote:It's the whole "let's treat things that are wildly different as the same without factoring out the similarities" angle that I find ridiculous. Rumsfoord, be careful of criticizing technique X because you have personal experience with technique X implemented in a particularly crappy way. Your multiple-values-for-a-given-attribute problem can be solved with a simple unique constraint. Trust me, duplicate, crappy data is in no way unique to EVA. Your performance worries about many joins or subqueries are alleviated by SQL2005's pivot operator, which I demonstrate in my short snippet on using EVA to store audit logs. Some of the crappier databases don't support pivot and it's a crying shame. I've written SQL2005 views that pivot over 25 columns, resulting in a view that has 2,000,000 rows, and performance was lightning-fast. Your color-vs-hue problem is merely a problem with enforcing semantically-unique attribute names.
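The unique-constraint fix for the duplicate-attribute horror story is a one-liner in the table definition. A minimal sketch using SQLite through Python (hypothetical EVA table; any engine's composite UNIQUE constraint behaves the same way here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE attrib (
        entity_id INTEGER,
        name      TEXT,
        value     TEXT,
        UNIQUE (entity_id, name)   -- at most one value per attribute per entity
    )
""")
conn.execute("INSERT INTO attrib VALUES (123, 'color', 'blue')")

# A second 'color' row for entity 123 violates the constraint,
# so 46 duplicate attributes can never accumulate.
try:
    conn.execute("INSERT INTO attrib VALUES (123, 'color', 'red')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```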
|
# ? Nov 6, 2008 09:08 |
|
I'm trying to write an SQL query for my .NET AJAX AutoCompleteExtender control. I have a bunch of records in a column in my database which are class titles and I'm trying to write the query that will get the suggestion results for the auto-suggest box. Example class records might be: "Contemporary Documentary Films" "Electromagnetic Fields" "Intermediate Calculus" So far all I have is: SELECT * FROM Classes WHERE Title Like 'Prefix%' But that only matches the first word in the title string. I want to make it so that typing "Doc" would match "Contemporary Documentary Films" just like typing "Cont" and "Fil" would. I can't just do WHERE Title Like '%Prefix%' because then weird stuff like "cum" would match "Documentary" in "Contemporary Documentary Films" which is undesirable. Could anyone please let me know how I can do this? Thanks!
|
# ? Nov 10, 2008 10:17 |
|
Rabbi Dan posted:I'm trying to write an SQL query for my .NET AJAX AutoCompleteExtender control. It looks like a simple solution to this would be to match only the beginning of each word, which you could do with: WHERE Title LIKE 'Prefix%' OR Title LIKE '% Prefix%'; Be sure to note the space in the second condition.
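The two-pattern trick can be checked end to end. A small sketch using SQLite through Python with the class titles from the question (parameterized rather than string-concatenated, which is what the .NET code should do too):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Classes (Title TEXT);
    INSERT INTO Classes VALUES
        ('Contemporary Documentary Films'),
        ('Electromagnetic Fields'),
        ('Intermediate Calculus');
""")

def suggest(prefix):
    # Match the start of the title, or the start of any later word
    # (the leading space in the second pattern is what prevents
    # 'cum' from matching the middle of 'Documentary').
    return [r[0] for r in conn.execute(
        "SELECT Title FROM Classes WHERE Title LIKE ? OR Title LIKE ?",
        (prefix + "%", "% " + prefix + "%"))]

doc = suggest("Doc")   # word 'Documentary' starts with 'Doc'
cum = suggest("cum")   # no word starts with 'cum', so nothing matches
```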
|
# ? Nov 10, 2008 15:37 |
|
Puddleboy posted:It looks like a simple solution to this would to be match only the beginning of each word which you could do with: Wow that works...and I get it...Thank you.
|
# ? Nov 10, 2008 15:51 |
|
I'm working in a SQL Server 2005 environment here. I have a large data table with a 3 part primary key. The 3 part primary key is something like CustomerNumber, BillID, LineItem. Changes to this table are difficult at best, because it's large and partitioned. I've got another table, now, where I'm working with the line items for a given (CustomerNumber, BillID) combination. Basically, this table is an identity int column, CustomerNumber, BillID, and some XML data about those line items and some associated data for each. What I want is a constraint on this table that, for any inserted row, the (CustomerNumber, BillID) combination matches a record in my larger table. I can't do it as a foreign key, because of the third part of the key(and the PK without the third part is not unique to each record). How can I enforce this? It seems like I'm missing something obvious, but I can't see it.
|
# ? Nov 14, 2008 00:52 |
|
thefncrow posted:I'm working in a SQL Server 2005 environment here. I think you're looking for a TSQL trigger. Basically, you would need to run a query on the 3-key table to search for the specific CustomerNumber and BillID combo that is coming into the database and, if it exists in the 3-key table, you go to the INSERT statement and, if not, throw back some kind of error message. Sorry, but I'm not familiar with TSQL; the limited experience I have with this stuff has been through Oracle's PL/SQL.
|
# ? Nov 14, 2008 02:27 |
|
Sergeant Hobo posted:I think you're looking for a TSQL trigger. Basically, you would need to run a query on the 3-key table to search for the specific CustomerNumber and BillID combo that is coming into the database and, if it exists in the 3-key table, you go to the INSERT statement and, if not, throw back some kind of error message. Sorry, but I'm not familiar with TSQL; the limited experience I have with this stuff has been through Oracle's PL/SQL. Yeah, I just was thinking about a trigger, and I think that's the way I'm going to want to go at this point. It just didn't occur to me at first because I kept thinking there was some sort of constraint I could apply to the table that would do the trick, but I think it'll just have to be a trigger. Thanks.
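The trigger idea can be sketched in miniature. This uses SQLite through Python purely to make it runnable (T-SQL trigger syntax differs, and all table/column names here are hypothetical stand-ins for the bill tables described): a BEFORE INSERT trigger rejects any row whose (CustomerNumber, BillID) pair has no matching line item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BillLines (CustomerNumber INTEGER, BillID INTEGER, LineItem INTEGER,
                            PRIMARY KEY (CustomerNumber, BillID, LineItem));
    CREATE TABLE BillXml (Id INTEGER PRIMARY KEY, CustomerNumber INTEGER,
                          BillID INTEGER, Data TEXT);

    -- Reject inserts whose (CustomerNumber, BillID) has no matching line item.
    CREATE TRIGGER BillXml_fk BEFORE INSERT ON BillXml
    WHEN NOT EXISTS (SELECT 1 FROM BillLines
                     WHERE CustomerNumber = NEW.CustomerNumber
                       AND BillID = NEW.BillID)
    BEGIN
        SELECT RAISE(ABORT, 'no matching (CustomerNumber, BillID)');
    END;

    INSERT INTO BillLines VALUES (7, 1, 1);
""")

# (7, 1) exists in BillLines, so this insert passes the trigger.
conn.execute("INSERT INTO BillXml (CustomerNumber, BillID, Data) VALUES (7, 1, '<x/>')")

# (8, 9) has no line items, so the trigger aborts the insert.
try:
    conn.execute("INSERT INTO BillXml (CustomerNumber, BillID, Data) VALUES (8, 9, '<x/>')")
    orphan_rejected = False
except sqlite3.DatabaseError:
    orphan_rejected = True
```

Note this only guards inserts; a complete version would also need to handle updates on BillXml and deletes from the line-item table.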
|
# ? Nov 14, 2008 03:07 |
|
Before making that last post, I was thinking you might be able to do this with some kind of assertion on the tables, but I don't think that would've been correct, unless you can do some line like: code:
|
# ? Nov 14, 2008 03:14 |
|
I've finished the design work and I'm about to start coding a web application that will use about sixty tables with around twenty to one hundred columns in each, most of those integers and booleans. A few of the tables are going to be update intensive. I was thinking of using Postgres. I looked for database comparisons using Google, but I can't find anything after 2005. In fact, all of the old news I read misled me into believing that MySQL couldn't perform stored procedures, or that Postgres had performance problems. But apparently Postgres is pretty fast and MySQL has had stored procedures built in since 2005. It's even worse if I limit my search results within the last year - just a bunch of articles from 2001 and 2003 that have website redesigns. So what's the new consensus on Postgres for web applications? Familiarity isn't a problem, nor is hosting. Should anything deter me from using it? Does this come down to religion?
|
# ? Nov 14, 2008 18:39 |
|
thefncrow posted:I'm working in a SQL Server 2005 environment here. Why is there not a "bills" table which has a key of (CustomerNumber, BillId) instead of just a "bill line items" table? If there is such a table, then why aren't you using that for the foreign key constraint? If your new table has XML data about the line items, why is LineItem not part of the key? If the XML data is about the bills and not the bill line items, see the above questions. If you don't have any real ability to change the schema then neither of those questions is particularly helpful, but it seems like it points out why you're having trouble. As for your problem, I just recently learned about check constraints, and they're my new best friend. To make one that does what you want, create a UDF that returns true or false depending on whether a given (CustomerNumber, BillId) is in the line item table, and then create a check constraint on the table which requires that the function return true for the (CustomerNumber, BillId) of the record in question.
|
# ? Nov 14, 2008 18:42 |
|
In SQL 2005 I'm writing a query that works fine as an inner select or a join but I know there's gotta be a way to do it without doing either of those, for obvious performance reasons. I just want to hit the database once. Let's say you have a table with 2 columns, users and shipments. A user can have multiple shipments but those shipments can only be inbound or outbound so the table would look like this (I'm keeping it very basic for the sake of this example): Shipments code:
code:
|
# ? Nov 14, 2008 19:13 |
|
pikes posted:This seems so simple but I don't know how I'd do it without hitting the database twice. Any ideas? code:
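The answer's code block didn't survive, but a single-pass version of "users with no outbound shipments" is a GROUP BY with a conditional count in HAVING, which scans the table once instead of once per subquery. A sketch using SQLite through Python (table and column names are guesses based on the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Shipments (UserId INTEGER, Direction TEXT);
    INSERT INTO Shipments VALUES
        (1, 'inbound'), (1, 'outbound'),
        (2, 'inbound'), (2, 'inbound');
""")

# Single scan: keep users whose shipments include zero 'outbound' rows.
users = [r[0] for r in conn.execute("""
    SELECT UserId
    FROM Shipments
    GROUP BY UserId
    HAVING SUM(CASE WHEN Direction = 'outbound' THEN 1 ELSE 0 END) = 0
""")]
```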
|
# ? Nov 14, 2008 19:37 |
|
Forzan posted:I've finished the design work and I'm about to start coding a web application that will use about sixty tables with around twenty to one hundred columns in each, most of those integers and booleans. A few of the tables are going to be update intensive. I was thinking of using Postgres.
|
# ? Nov 14, 2008 19:49 |
|
pikes posted:I want to pull back all users that did not have any outbound shipments. I know I can do it with an inner select:
|
# ? Nov 14, 2008 21:58 |
|
Got a bit of a potentially complex design/query issue.. I work for an image library, which has roughly 200k images on file. We're a small company and our system has grown organically which invariably means it's a bit of a mess. At the moment, image information is stored in a MySQL database along the lines of: Reference code | Keywords | Cat1 | Cat2 | Cat3 | Cat4 | Cat5 The keywords field is a string of 20-30 keywords delimited by commas (e.g. "keyword1,keyword2,keyword3......"). There are tens of thousands of keywords (maybe more). There are maybe 50 different category types (stored in the 'cat' fields). When people search at the moment, the query finds any image refs where the keywords are LIKE '%input%', or any of the 5 categories are like '%input%'. This means that when it's searching for say '%dog%', it goes through a string of up to 255 characters in the keywords field, plus each category field, for 100,000 records. It's painful to watch it struggle like this. I'm wondering if you guys know of a better way to 1) store the information, and 2) search the information. I'm not particularly advanced with mysql, but I thought perhaps a relational database might be in order as I hate comma-delimited fields. Any input is appreciated.
|
# ? Nov 17, 2008 22:00 |
|
Wow, interesting problem, I've never dealt with anything quite like that. I'm curious to hear if there are "standard" solutions to this. The one thing I do know is that putting any kind of delimited data into a DB field completely defeats the purpose of a relational DB. That could be most of your performance problems right there. To fix the delimited field, I would make a three column table just for the keywords: unique key, picture ID, keyword. It will have a ton of rows, but you'll be able to properly join and filter on specific keywords.
|
# ? Nov 17, 2008 22:09 |
|
Dr.Khron posted:Wow, interesting problem, I've never dealt with anything quite like that. I'm curious to hear if there are "standard" solutions to this. Yeah sorry I didn't explain that very well. By relational DB I kind of meant what you described - basically a row per keyword per image. It concerns me that the rows in this database will be in the millions - will that slow things down or can mysql handle it?
|
# ? Nov 17, 2008 22:25 |
|
Is there any reason not to make a keywords table, with just unique id and keyword, a picture table, with picture info, and a table to tie the two together (keyword id, picture id)?
|
# ? Nov 17, 2008 22:26 |
|
Vince McMahon posted:Yeah sorry I didn't explain that very well. By relational DB I kind of meant what you described - basically a row per keyword per image. It concerns me that the rows in this database will be in the millions - will that slow things down or can mysql handle it?
|
# ? Nov 17, 2008 22:27 |
|
|
|
toby posted:Is there any reason not to make a keywords table, with just unique id and keyword, a picture table, with picture info, and a table to tie the two together (keyword id, picture id)? Would that be like this? code:
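The code block here was lost, but the three-table layout toby describes is a standard many-to-many schema: pictures, keywords, and a junction table. A hypothetical sketch using SQLite through Python (all names invented); the payoff is that a search becomes an indexed equality lookup on the keywords table instead of `LIKE '%...%'` over a 255-character string:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pictures (pic_id INTEGER PRIMARY KEY, ref_code TEXT);
    CREATE TABLE keywords (kw_id INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE picture_keywords (
        kw_id  INTEGER REFERENCES keywords,
        pic_id INTEGER REFERENCES pictures,
        PRIMARY KEY (kw_id, pic_id)   -- also serves as the lookup index
    );

    INSERT INTO pictures VALUES (1, 'IMG001'), (2, 'IMG002');
    INSERT INTO keywords VALUES (10, 'dog'), (11, 'beach');
    INSERT INTO picture_keywords VALUES (10, 1), (11, 1), (11, 2);
""")

# Exact keyword match: resolve the word to its id, then join through
# the junction table -- no full-table string scan required.
refs = [r[0] for r in conn.execute("""
    SELECT p.ref_code
    FROM keywords k
    JOIN picture_keywords pk ON pk.kw_id = k.kw_id
    JOIN pictures p          ON p.pic_id = pk.pic_id
    WHERE k.word = 'dog'
""")]
```

Millions of junction rows are fine for MySQL as long as the join columns are indexed; it's the `%input%` scans over comma-delimited strings that hurt.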
|
# ? Nov 17, 2008 22:56 |