minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
The events system was actually just an example to couch the general principle in and get it across, I'm not actually writing an events system. In my particular case, the FKs will not introduce significant overhead.

I was more concerned about what people thought of breaking 3rd-normal form in order to simplify the matter of referring to various different types, and whether people had actually used the EAV anti-design pattern I mentioned in Option 3 and regretted it. Thanks for the comments so far!

cletus42o
Apr 11, 2003

Try to imagine all life as you know it stopping instantaneously and every molecule in your body exploding at the speed of light.
College Slice

minato posted:

In the past I've implemented Option 1, like a good little database normalizer. However the clunkiness of doing so has led me to want to implement Option 2. The rest of my team (not very DB-design-savvy) is advocating Option 3, which to me is right out because it would take god himself to pry FK constraints from my dead hands.
I've always used option 3, but then I never use constraints in my database - I handle all of that in my application. But maybe I should rethink that?

But I don't know, I've never had problems with it. The most common way I've used option 3 is in a "files" table, where a file can be attached to any other item in the system. I have the file table itself, which only stores info pertaining to the file, then a table I usually call file_item_jn - the join table has the file ID and two fields called foreign_table and foreign_key.

So I haven't regretted it in the places I've used it.

But then I did develop another system where I have separate join tables for each "type" - and then I have a view that unions them all and creates the type field for me (so a "virtual option 3"), since there are places where I have to provide a snapshot of all the data at once.
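
For concreteness, here's a minimal sketch of the layout described above. Only file_item_jn, foreign_table, and foreign_key come from the post; every other table and column name is made up for illustration, and the view in the second half corresponds to the "virtual option 3" variant.
code:
-- The file table plus the generic join table. No FK constraint is
-- possible on foreign_key, because the row it points to lives in
-- whichever table foreign_table names.
CREATE TABLE files (
    file_id   INT PRIMARY KEY,
    file_name VARCHAR(255)
);

CREATE TABLE file_item_jn (
    file_id       INT NOT NULL REFERENCES files (file_id),
    foreign_table VARCHAR(64) NOT NULL,
    foreign_key   INT NOT NULL
);

-- "Virtual option 3": one join table per type, unioned into a view
-- that manufactures the type column.
CREATE VIEW file_items AS
    SELECT file_id, 'invoice' AS foreign_table, invoice_id AS foreign_key
      FROM file_invoice_jn
    UNION ALL
    SELECT file_id, 'customer', customer_id
      FROM file_customer_jn;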

m5
Oct 1, 2001

minato posted:

The events system was actually just an example to couch the general principle in and get it across, I'm not actually writing an events system. In my particular case, the FKs will not introduce significant overhead.

I was more concerned about what people thought of breaking 3rd-normal form in order to simplify the matter of referring to various different types, and whether people had actually used the EAV anti-design pattern I mentioned in Option 3 and regretted it. Thanks for the comments so far!

The EAV approach would be laughably inadequate at the scale at which my database operates.

The normalized version is a little tricky to deal with when a system wants to read through events, like the mail sender I mentioned. What it does is operate over new events in batches. Over each batch (range of event IDs), it does a single query to get a count of events of each type that happen to be there. It can then construct a set of queries, one per type in the batch, to get the events of each type. The queries are built from SQL generated automatically from the XML. That gnarliness is all centralized in one piece of code, so nothing else in the system really needs to care about it.

When loading just a single event, of course, there's no problem because again the SQL is built straight from the event specification.
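
To make the batch pattern above concrete, here is a rough sketch of the two steps; the table and column names (events, event_type, event_mail) are invented, and in the real system the per-type queries are generated from the XML specifications rather than written by hand.
code:
-- 1) One query per batch: how many events of each type are in this
--    range of event IDs?
SELECT event_type, COUNT(*) AS n
  FROM events
 WHERE event_id BETWEEN :batch_start AND :batch_end
 GROUP BY event_type;

-- 2) For each type that appears, one generated query joining the head
--    table to that type's auxiliary table, e.g. for mail events:
SELECT e.event_id, e.created_at, m.recipient, m.subject
  FROM events e
  JOIN event_mail m ON m.event_id = e.event_id
 WHERE e.event_id BETWEEN :batch_start AND :batch_end;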

increased
Feb 20, 2001

Golbez posted:

Elaborate please, are they getting rid of/drastically fixing fulltext in a few months?

No, we'll just be migrating to a new solution.

Xae
Jan 19, 2005

cletus42o posted:

I've always used option 3, but then I never use constraints in my database - I handle all of that in my application. But maybe I should rethink that?

But I don't know, I've never had problems with it. The most common way I've used option 3 is in a "files" table, where a file can be attached to any other item in the system. I have the file table itself, which only stores info pertaining to the file, then a table I usually call file_item_jn - the join table has the file ID and two fields called foreign_table and foreign_key.

So I haven't regretted it in the places I've used it.

But then I did develop another system where I have separate join tables for each "type" - and then I have a view that unions them all and creates the type field for me (so a "virtual option 3"), since there are places where I have to provide a snapshot of all the data at once.

Yes, yes you should.

There is a reason why millions of dollars are put into R&D. The database will do constraints and transactions faster, more accurately, and just plain better than anything you can possibly conceive of.

m5
Oct 1, 2001

Xae posted:

Yes, yes you should.

There is a reason why millions of dollars are put into R&D. The database will do constraints and transactions faster, more accurately, and just plain better than anything you can possibly conceive of.

Yes, but constraints are not free.

Example: if you have a column in a table that references another table, and you put a FK constraint on that, you'd better put an index on it too - even if you have no reason to do so. Why? Because if you don't, when you delete rows from the target table the server will have no choice but to do a table scan of the other table to make sure you're not violating the constraint.

Another example: in the case discussed above, with the head table and a heterogeneous set of auxiliary tables, if each auxiliary table has a FK constraint back to the main table you'll have hell to pay when it's time to clean out a bunch of rows from the main table. (Actually what you'll have to do is drop all the constraints, clean out the crap, and then re-add the constraints - if you don't you'll be waiting days. This has happened to us and we now don't have those constraints.)

I believe that FK constraints are a good idea when appropriate and when you can afford them, but to blindly throw them into your schema whenever you have a cross-table reference is not wise.
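
To illustrate the first example with hypothetical tables (orders rows referencing customers), the "useless" index is the one on the referencing column:
code:
ALTER TABLE orders
  ADD CONSTRAINT fk_orders_customer
  FOREIGN KEY (customer_id) REFERENCES customers (customer_id);

-- Without this index, every DELETE FROM customers forces a scan of
-- orders to prove no row still references the deleted customer.
CREATE INDEX ix_orders_customer_id ON orders (customer_id);

DELETE FROM customers WHERE customer_id = 42;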

npe
Oct 15, 2004
It's true that constraints of any sort make deletes more costly, but deletes are costly in the first place and should be avoided as a matter of design if possible. I would say start with FK's where they should go, and consider dropping them for cases where you absolutely need to in order to maintain performance.

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender

m5 posted:

Example: if you have a column in a table that references another table, and you put a FK constraint on that, you'd better put an index on it too - even if you have no reason to do so. Why? Because if you don't, when you delete rows from the target table the server will have no choice but to do a table scan of the other table to make sure you're not violating the constraint.
Sure, but 99% of the time the FK points to the PK of the target table, and PKs should always have an index on them to ensure uniqueness.

m5 posted:

I believe that FK constraints are a good idea when appropriate and when you can afford them, but to blindly throw them into your schema whenever you have a cross-table reference is not wise.
I see them as just a good database design principle. They ensure my data integrity, they can help prevent silly coding errors, they can improve performance by ensuring I don't have crufty tables, and they help describe the relationship between two tables.

However like all principles, they sometimes need to be violated in the name of performance.

Xae
Jan 19, 2005

minato posted:

Sure, but 99% of the time the FK points to the PK of the target table, and PKs should always have an index on them to ensure uniqueness.

I see them as just a good database design principle. They ensure my data integrity, they can help prevent silly coding errors, they can improve performance by ensuring I don't have crufty tables, and they help describe the relationship between two tables.

However like all principles, they sometimes need to be violated in the name of performance.

Foreign keys, by definition, point to the primary key (or a unique field, which could be a primary key). If it isn't a key, it is just a lovely denormalized field.

Having primary keys doesn't hurt performance, unless the developers are morons. So only semi-frequently does it hurt performance.

Most databases will automatically create an index on a column once it is the primary key anyway.

quote:

Yes, but constraints are not free.

Example: if you have a column in a table that references another table, and you put a FK constraint on that, you'd better put an index on it too - even if you have no reason to do so. Why? Because if you don't, when you delete rows from the target table the server will have no choice but to do a table scan of the other table to make sure you're not violating the constraint.

Another example: in the case discussed above, with the head table and a heterogeny of auxilliary tables, if each auxilliary table has a FK constraint back to the main table you'll have hell to pay when it's time to clean out a bunch of rows from the main table. (Actually what you'll have to do is drop all the constraints, clean out the crap, and then re-add the constraints - if you don't you'll be waiting days. This has happened to us and we now don't have those constraints.)

I believe that FK constraints are a good idea when appropriate and when you can afford them, but to blindly throw them into your schema whenever you have a cross-table reference is not wise.
It is not only wise, it is the loving point of a relational database system. Otherwise you could just get a really fast disk and use flat files; it would be cheaper and faster. The database will execute and maintain constraints faster than any system you can make.

Plus, if you can't be sure your data is accurate, why are you bothering to store it?

The full scan (which only happens if you don't bother making an index) is a required thing. Otherwise you breach data integrity. You have orphaned records. You are now hoping that something else doesn't take the old PK and completely gently caress your system. I prefer to rely on something more than hope and dumb luck...

Any system in which deletes would cause so much strain that it would be unworkable with today's hardware probably isn't deleting anyway. They simply end-date the record, set a delete flag, or otherwise execute a logical delete. Data has value, and you don't just throw it away.

Xae fucked around with this message at 04:41 on Jan 17, 2008

Victor
Jun 18, 2004

Xae posted:

Plus, if you can't be sure your data is accurate, why are you bothering to store it?
I'm afraid I have a hard time seeing how you have the experience to back this up. At least, where I work, sometimes we go without constraints, but have nightly integrity checks that run to ensure there are no orphaned records. This doesn't catch the circumstance where one can have orphan records created, incorrectly, that then get deleted before this integrity check runs. However, from my small amount of experience, I do not see this special case as a bug pattern common enough to worry about.
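
A nightly check of that sort is typically just an anti-join per relationship; the table names here are hypothetical:
code:
-- Report child rows whose parent no longer exists.
SELECT c.child_id
  FROM child_table c
  LEFT JOIN parent_table p ON p.parent_id = c.parent_id
 WHERE p.parent_id IS NULL;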

Fly
Nov 3, 2002

moral compass

minato posted:

Sure, but 99% of the time the FK points to the PK of the target table, and PKs should always have an index on them to ensure uniqueness.
I believe m5 means you had better have an otherwise useless index on the field in the table where it is not the PK, right?

Xae
Jan 19, 2005

Victor posted:

I'm afraid I have a hard time seeing how you have the experience to back this up. At least, where I work, sometimes we go without constraints, but have nightly integrity checks that run to ensure there are no orphaned records. This doesn't catch the circumstance where one can have orphan records created, incorrectly, that then get deleted before this integrity check runs. However, from my small amount of experience, I do not see this special case as a bug pattern common enough to worry about.

You are still running the checks. Is your machine so pegged on CPU that you can't afford to ensure that the data you are bothering to store is being stored correctly?

Let's face it, hardware is cheap; the programmer time to build and maintain your own custom system to keep integrity costs more than just buying a better box. Also, if you are that desperate for computing power, why are you deleting in the first place?

That isn't even touching on the fact that a single failure on an enterprise level system can cost serious cash.

Victor
Jun 18, 2004
You're just assuming too much. The rule that programmer time is more valuable than computer cost is a good one, but it isn't all-covering, all-important, never-inapplicable. To give you an idea, checking a foreign key constraint can keep transactions open a tiny bit longer, which results in locks being held a bit longer, which can, if one is near a certain breaking point (think adding 1% to a fully packed highway), cause the system to go crazy. Moreover, dealing with constraints can be a royal PITA, due to having to delete in the right order, and having major issues with self-referencing tables.

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender

Victor posted:

At least, where I work, sometimes we go without constraints, but have nightly integrity checks that run to ensure there are no orphaned records.
As I see it, you do have FK constraints - they're just deferred to a nightly check as opposed to an immediate check, for performance reasons. I think that would be acceptable, as long as I were sure that my own integrity checks were correct, and that my app wouldn't crash if it came across an orphaned record.

Currently I'm not sure of either of those things in my own app, which is why I'd be reluctant to do it this way.

Xae, I see your point of view but you seem to be under the assumption that you can throw enough hardware and money at any scalability problem in order to fix it. I don't think that's always the case, and isn't always practical. I'd love it if I could get better performance out of my app just by putting it on bigger iron. :sigh:
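
(As an aside, some engines - PostgreSQL and Oracle, for example - offer a middle ground between immediate checks and a nightly job: a constraint that is checked at COMMIT rather than per statement. This isn't what either shop above is doing, just the closest in-database analogue; the table names are hypothetical.)
code:
ALTER TABLE child_table
  ADD CONSTRAINT fk_child_parent
  FOREIGN KEY (parent_id) REFERENCES parent_table (parent_id)
  DEFERRABLE INITIALLY DEFERRED;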

Xae
Jan 19, 2005

Victor posted:

You're just assuming too much. The rule that programmer time is more valuable than computer cost is a good one, but it isn't all-covering, all-important, never-inapplicable. To give you an idea, checking a foreign key constraint can keep transactions open a tiny bit longer, which results in locks being held a bit longer, which can, if one is near a certain breaking point (think adding 1% to a fully packed highway), cause the system to go crazy. Moreover, dealing with constraints can be a royal PITA, due to having to delete in the right order, and having major issues with self-referencing tables.

If your system is at capacity why not execute an immediate logical delete, then come back for the "real" delete during a batch window? It will relieve stress on your system and keep data integrity.
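
A minimal sketch of that logical-delete-then-purge pattern; the events table, deleted_at column, and the placeholders are invented for illustration:
code:
-- Immediate "delete": just mark the row. Cheap, no cascades, no FK checks.
UPDATE events
   SET deleted_at = CURRENT_TIMESTAMP
 WHERE event_id = :event_id;

-- Later, in a batch window, physically remove rows marked long enough ago.
DELETE FROM events
 WHERE deleted_at < :purge_cutoff;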

The database doesn't simply decide to do checks for a random reason. It does them because it needs to. It needs to make sure that the integrity of the database is being kept. If you do not check for child records you will have problems. Someone running an OLAP system is now getting records which should have been deleted counted in their statistics.

You don't need to throw big iron at things to solve performance problems. Even a $5k or $10k server is a hell of a machine these days. If an application takes a month of programmer time, it costs ~$10K between salary, benefits and overhead. This is why throwing hardware at problems is popular: because it works. If you are working in a completely budget-constrained organization things may be different, but in that case you shouldn't delete immediately anyway, because it takes too many resources.

Constraints are a tool, and a very useful one. They may force you to change how you normally operate, but they do so for good reasons.

m5
Oct 1, 2001

Xae posted:

The full scan (which only happens if you don't bother making an index) is a required thing. Otherwise you breach data integrity. You have orphaned records. You are now hoping that something else doesn't take the old PK and completely gently caress your system. I prefer to rely on something more than hope and dumb luck...

Any system in which deletes would cause so much strain that it would be unworkable with today's hardware probably isn't deleting anyway. They simply end-date the record, set a delete flag, or otherwise execute a logical delete. Data has value, and you don't just throw it away.

You're not familiar with the real world. The tables I'm talking about have tens of millions of rows in them. We delete stuff because we need to curb infinite growth in the database - I think it's well over 100GB now in backup form.

You only have orphaned records if the software has bugs in it, mostly. If there are a few orphaned records floating around among millions and millions of fine happy ones, I'm OK with that. I don't need my database to be a perfect shining jewel of integrity. I need it to mostly work, and to support the heavy load it's under with acceptable performance.

Victor
Jun 18, 2004
Xae, one major problem with your argument is that you assume getting a new server is an easy thing to do. Combine legacy issues with politics, and it's not a simple choice. Yes, it would be nice if people were all-around competent and powerful hardware could simply be purchased. I know this because we recently got a $6K DB server and it screams. However, your blatant generalizations are... unsettling.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Filash posted:

No, we'll just be migrating to a new solution.

Tease.

Xae
Jan 19, 2005

m5 posted:

You're not familiar with the real world. The tables I'm talking about have tens of millions of rows in them. We delete stuff because we need to curb infinite growth in the database - I think it's well over 100GB now in backup form.

I have worked on systems ranging from a ~1995 Sun box to a 45-node Teradata array. I have worked on databases that were written before "relational" was an option. I am quite familiar with the real world. I mainly do remediation work. I come in when a company decides they want a data warehouse, only to realize that their OLTP side is a mess and can't be processed until they unfuck it. I see this crap about "oh, we can manage it better than the database" every goddamn time. Other popular excuses include "Well, it slows down the database!" or "We didn't want to write joins".



quote:

You only have orphaned records if the software has bugs in it, mostly.
So, you just have to trust your programmers to be perfect?

quote:

If there are a few orphaned records floating around among millions and millions of fine happy ones, I'm OK with that. I don't need my database to be a perfect shining jewel of integrity. I need it to mostly work, and to support the heavy load it's under with acceptable performance.
Do not lose sight of the purpose of a database. People base decisions on the data in your database. You are feeding them incorrect or incomplete information. The point of a database isn't just to be a tower of zeros and ones. It is so that the data can be accessed and used. Incorrect data is infinitely worse than no data.


There is a cost associated with using bad data, and it is almost always far greater than the cost of fixing the problem.

m5
Oct 1, 2001
Look, I don't know how to explain the situation other than to say that maintenance of foreign key constraints in the tables we have has sometimes become untenable.

I have not claimed that the code is perfect, nor that I expect no software errors. The issues are performance issues. We cannot buy a server much bigger than what we've got, and partitioning the database is more work than can be afforded now (possibly ever).

Xae
Jan 19, 2005

Part of the thing is that he is asking for design advice. You need to catch things like "will we need to partition" before you start making the system. If you don't, then you end up in a situation where you become trapped and can't fix it.

If he were already running the system I wouldn't advise changing it, but since the system isn't yet built it is best to design it correctly, so that when the inevitable happens you are in the best position to deal with the situation.

You design following "the rules", then break them where needed due to money, politics, or other "real" constraints.

m5
Oct 1, 2001

Xae posted:

Part of the thing is that he is asking for design advice. You need to catch things like "will we need to partition" before you start making the system. If you don't, then you end up in a situation where you become trapped and can't fix it.

If he were already running the system I wouldn't advise changing it, but since the system isn't yet built it is best to design it correctly, so that when the inevitable happens you are in the best position to deal with the situation.

You design following "the rules", then break them where needed due to money, politics, or other "real" constraints.

Oh we knew we would have to partition, and we definitely can, but it's not going to be a small project. There's more to it than just the database part. (The overall situation is not the best one could hope for; retards are now in control of what we do.)

Note that what we've done is remove FK constraints where they became too burdensome. Where they're not, we still have them.

schzim
May 24, 2006
tl;dr
So I am a terrible SQL noob and basically am trying to execute a reasonably complicated expression.

I am importing some crap from some xml/txt files into a MySQL database through python and basically need help with an INSERT statement into a table that can reference information from 2 other tables.

So I have a table of users with attributes and a table of groups.

I have a txt file for groups that lists all the users of each group and I want to build a relationship table to have an id and then a user_id and group_id.

So I want to be able to write an INSERT statement to insert into relationship (user_id, group_id) but I need to look up the user_id and group_id from my other tables as I go.

So what I think I need the most guidance on is SQL user variables.

Victor
Jun 18, 2004
It's typically best to load the data you have into some table created to hold said data, and then do the transformation with SQL statements.

schzim
May 24, 2006
tl;dr
I'd like to include an example of what I did so someone can explain
A) Why this was so slow
and
B) What a better method would have been.

INSERT INTO memberships (user_id, group_id) SELECT users.user_id, groups.group_id FROM users, groups WHERE users.user_name = '%PYTHONVAR' AND groups.name = '%PYTHONVAR'

As I parsed through the txt file and found a group, and then as I found a user, I would generate a query like this.

PS Overall it made like 70k writes into memberships ;)

m5
Oct 1, 2001

schzim posted:

I'd like to include an example of what I did so someone can explain
A) Why this was so slow
and
B) What a better method would have been.

INSERT INTO memberships (user_id, group_id) SELECT users.user_id, groups.group_id FROM users, groups WHERE users.user_name = '%PYTHONVAR' AND groups.name = '%PYTHONVAR'

As I parsed through the txt file and found a group, and then as I found a user, I would generate a query like this.

PS Overall it made like 70k writes into memberships ;)

You could have uploaded the (username, groupname) pairs into their own table and then run a single insert/select (which is what Victor was suggesting).
code:
insert into memberships (user_id, group_id)
select u.user_id, g.group_id
  from namepairs np
  join users u on np.username = u.user_name
  join groups g on np.groupname = g.name
You'd save on the overhead of all those separate database interactions.
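
In MySQL the first step might look like this; the namepairs table comes from the query above, while the file path and tab-delimited format are assumptions about how the txt file could be staged:
code:
CREATE TABLE namepairs (
    username  VARCHAR(64) NOT NULL,
    groupname VARCHAR(64) NOT NULL
);

-- Bulk-load the pairs in one statement instead of ~70k single-row inserts.
LOAD DATA LOCAL INFILE '/tmp/group_members.txt'
INTO TABLE namepairs
FIELDS TERMINATED BY '\t'
(username, groupname);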

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
Is there any site or book that goes into the best way to construct good search engines in PHP/MySQL? Surely fulltext isn't the be-all and end-all of it... right?

deimos
Nov 30, 2006

Forget it man this bat is whack, it's got poobrain!

Golbez posted:

Is there any site or book that goes into the best way to construct good search engines in PHP/MySQL? Surely fulltext isn't the be-all and end-all of it... right?

Step 1: use a text indexing search engine like Lucene

Walked
Apr 14, 2003

code:
EXAMPLETABLE

ID	TITLE		TEXT
1	Hello		Goodbye
2	Words		Lots of words
3	More		Third one
Totally new to SQL, just trying to get a feel for how it would be used in real life.

What sort of query could I run to return the row with the largest ID number? I know it has to do with MAX.. but I just can't quite seem to figure it out (yes I know :downs: )


code:
SELECT *
FROM EXAMPLETABLE
WHERE ID = (SELECT MAX(ID)
            FROM EXAMPLETABLE)
(Edited to make a fix so that maybe this is actually a feasible query?)

Or something like this? All IDs are unique. And I doubt the query proposed there would even execute. This isn't a real database, I'm just trying to make sure I understand some things at the moment.

Walked fucked around with this message at 16:40 on Jan 21, 2008

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
That's exactly how you'd do it. That query would work fine.

Walked
Apr 14, 2003

minato posted:

That's exactly how you'd do it. That query would work fine.

Woah, raddical :2bong:
Thanks :)

To bump it up, how would one get the most recent n entries? I know I could script it with multiple successive queries from C#, but is there a way to run a single query to get the two highest IDs?

noonches
Dec 5, 2005

It represents my lifestyle and stratus as a street savy irreverent youth, who lives large, and yet hungers for the next level in life.
You could do a SELECT * FROM exampletable ORDER BY id DESC LIMIT 0,2

m5
Oct 1, 2001

Walked posted:

Woah, raddical :2bong:
Thanks :)

To bump it up, how would one get the most recent n entries? I know I could script it with multiple successive queries from C#, but is there a way to run a single query to get the two highest IDs?

Some databases have pagination or "TOP" qualifiers. In SQL Server you can say
code:
select top 2 * from ...

ray2k
Feb 7, 2004

Puppet scum beware!
In sql server 2000, is there any way to check if a Text or varchar parameter is a valid xml document or fragment? Sql 2005 is not available for financial and political reasons. Basically, calling sp_xml_preparedocument with inputs that aren't valid xml causes an error that appears to be un-trappable. I'd like to have this SP indicate that condition by return code instead of bailing during execution.

Alex007
Jul 8, 2004

ray2k posted:

In sql server 2000, is there any way to check if a Text or varchar parameter is a valid xml document or fragment? Sql 2005 is not available for financial and political reasons. Basically, calling sp_xml_preparedocument with inputs that aren't valid xml causes an error that appears to be un-trappable. I'd like to have this SP indicate that condition by return code instead of bailing during execution.

This should work:
code:
CREATE PROCEDURE GetHandleToXMLDocument

        /*
        This procedure expects 1 parameter:
                1. An output parameter that is a handle to a DomDocument

        A temporary table named #T_XML ( doc_xml ntext ) containing one row with the XML text must also exist.
        Instead of using this temp table you could add one parameter of type ntext with your XML; inside the
        procedure you would create the table and insert that ntext. This procedure is part of a database project,
        and a temporary table is used because the XML is stored in table columns and ntext variables cannot be
        declared, so this is a workaround.

        This procedure creates a stream object, loads the XML text from the #T_XML table, creates a DomDocument object
        and then loads this DomDocument object from the stream.

        This procedure returns 0 if the text from #T_XML was a well-formed XML document, or <> 0 if it wasn't.
        It also returns in the output parameter a handle to that DomDocument.

        This procedure calls the sp_OA* extended stored procedures, so if you do not have permission to execute
        those extended stored procedures it will fail.
        MDAC 2.6 must also be installed on the server. It is possible to use another MDAC version; simply
        replace 'ADODB.Stream.2.6' and 'Msxml2.DOMDocument.2.6' with your version. If Enterprise Manager is
        installed on the server, MDAC 2.6 will already be present.

        */

        @oXMLDoc int output                     -- handle to a DomDocument

AS

        /* Variables */
        declare @oStream int,
                @hr int,
                @VaLen int,
                @VaPortion varbinary(8000),
                @VaRest nvarchar(200),
                @VaOffset int,
                @VaNocount int,
                @VaLoad int,
                @VaEncoding varchar(10)

        /* No count */
        select @VaNocount = @@options & 512
        set nocount on

        /* Default encoding */
        select @VaEncoding = 'UTF-8'

        /* Creation of the stream object */
        execute @hr = sp_OACreate 'ADODB.Stream.2.6', @oStream output
        if @hr <> 0 goto Finish
        /* a binary stream */
        execute @hr = sp_OASetProperty @oStream, 'Type', 1
        if @hr <> 0 goto Finish
        /* Open stream */
        execute @hr = sp_OAMethod @oStream, 'Open', null
        if @hr <> 0 goto Finish

        /* Put text into stream: I read portions of 100 characters; you can change it (up to 3000) */

        select @VaLen = datalength ( doc_xml ) / 2 FROM #T_XML  -- length in bytes of text
        select @VaOffset = 1
        while @VaOffset <= @VaLen
        begin
                /* Take 100 characters */
                select @VaPortion = convert ( varbinary(8000), convert ( varchar(3000), substring ( doc_xml, @VaOffset, 100 ) ) ) from #T_XML
                /* put data in stream */
                execute @hr = sp_OAMethod @oStream, 'Write', null, @VaPortion
                if @hr <> 0 goto Finish
                /* next portion */
                select @VaOffset = @VaOffset + 100
        end

        /* reset stream */
        execute @hr = sp_OASetProperty @oStream, 'Position', 0
        if @hr <> 0 goto Finish
        /* change stream to text */
        execute @hr = sp_OASetProperty @oStream, 'Type', 2
        if @hr <> 0 goto Finish
        /* set encoding */
        execute @hr = sp_OASetProperty @oStream, 'Charset', @VaEncoding
        if @hr <> 0 goto Finish

        /* create DomDocument */
        execute @hr = sp_OACreate 'Msxml2.DOMDocument.2.6', @oXMLDoc output
        if @hr <> 0 goto Finish
        /* Load DomDocument from stream */
        execute @hr = sp_OAMethod @oXMLDoc, 'Load', @VaLoad output, @oStream
        if @hr <> 0 goto Finish
        /* if @VaLoad is 0, XML is not well formed. return anything <> 0 to show there was an error */      
        if @hr = 0 and @VaLoad = 0 select @hr = 2

Finish:
        /* kill objects */
        if @oStream is not null execute sp_OADestroy @oStream        
        /* no count */
        if @VaNocount > 0 set nocount on else set NOCOUNT off
        /* Delete data from temp table */
        delete from #T_XML

        /* return result */
        return ( @hr )
GO 
(From: http://tinyurl.com/3684yt)

No Safe Word
Feb 26, 2005

Not a question, just a potentially handy proc for folks using MS SQL Server (works in 2k5, though I think it should work in most versions). If you've ever wondered if a table/proc/view/whatever was referenced in one of your stored procedures and didn't want to rely on your RDBMS's dependency tracking (if any), here's a proc that will do a full text search of each proc for whatever text you give it. This should help you hunt down any instances of usage of procs/tables/views, unless of course they are data-driven (ie, you've got table names/SQL in actual tables and you execute those).

code:
CREATE PROCEDURE ProcedureTextSearch 
	@Txt		VARCHAR(50),
	@FullTxt	BIT = 0
AS
	SELECT
		DISTINCT Name,
		CASE 
			WHEN @FullTxt = 1 THEN Definition
			ELSE '' 
		END AS Definition
	FROM
		Sys.sql_modules sc
		JOIN Sys.objects so
		ON so.Object_id = sc.Object_id
	WHERE
		LOWER(Definition) LIKE LOWER('%' + @Txt + '%')
Takes just the text as an argument, and optionally a "1" in the second arg to make it display the full definition of the proc as well. It's case-insensitive because it LOWER()'s the definition and the comparison string first, though you can take those two bits out if you want it to be case-sensitive (though, why?).
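
A quick usage example (the search string here is just a placeholder):
code:
-- Find every module whose definition mentions "Customers" and show the
-- full definition of each match.
EXEC ProcedureTextSearch 'Customers', 1;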

USSR Lives On
Jun 10, 2003
asdf
I have a fun little query that I wanted to write, and I thought it would be really simple, but turns out that it may actually be impossible to do in one query. I wanted to get your guys' ideas about what may be the best way of doing this.

I have two tables. One is a table of items. The other is a table of attributes, with multiple possible attributes of several attribute types possible for each item. So I have something like this:

code:
 item_id   item_name
---------------------
    1        foo
    2        bar

 item_id     tag_name    tag_type
----------------------------------
   1           red         color
   1          green        color
   1          heavy        weight
Given that there will only be 2-3 types of tags, is it possible to create a query that will return the number of tags of each type that were assigned to each item? For example, is it possible to run a query on the example above and get back that item 1 was assigned a color tag twice and a weight tag once?

I've been trying different group by and count combinations, but at this point I'm ready to give up and just give each tag type its own table. Is there a way of just making this one query or should I just separate the tag types into their own tables?

Tad Naff
Jul 8, 2004

I told you you'd be sorry buying an emoticon, but no, you were hung over. Well look at you now. It's not catching on at all!
:backtowork:
Quick one, and the answer is probably "no", but is there a way to insert a row into a MySQL table with an autoincremented field and get the value of the autoincrement field, all in one step?

noonches
Dec 5, 2005

It represents my lifestyle and stratus as a street savy irreverent youth, who lives large, and yet hungers for the next level in life.
If you're using php, you can call mysql_insert_id() after the insert.
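
In SQL itself (which is what mysql_insert_id() wraps), MySQL's LAST_INSERT_ID() gives you the value on the same connection right after the INSERT. It's still two statements rather than one, but there's no extra lookup against the table; the items table here is hypothetical:
code:
INSERT INTO items (name) VALUES ('widget');
SELECT LAST_INSERT_ID();  -- auto_increment value generated by this connection's last INSERT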

Rumsfoord
Jan 29, 2005

Eat or Be Eaten

USSR Lives On posted:

I have a fun little query that I wanted to write, and I thought it would be really simple, but turns out that it may actually be impossible to do in one query. I wanted to get your guys' ideas about what may be the best way of doing this.

I have two tables. One is a table of items. The other is a table of attributes, with multiple possible attributes of several attribute types possible for each item. So I have something like this:

code:
 item_id   item_name
---------------------
    1        foo
    2        bar

 item_id     tag_name    tag_type
----------------------------------
   1           red         color
   1          green        color
   1          heavy        weight
Given that there will only be 2-3 types of tags, is it possible to create a query that will return the number of tags of each type that were assigned to each item? For example, is it possible to run a query on the example above and get back that item 1 was assigned a color tag twice and a weight tag once?

I've been trying different group by and count combinations, but at this point I'm ready to give up and just give each tag type its own table. Is there a way of just making this one query or should I just separate the tag types into their own tables?


code:
SELECT COUNT(attributes.item_id) AS item_attribute_type_count, attributes.tag_type, attributes.item_id
FROM attributes
GROUP BY attributes.tag_type, attributes.item_id


is that what you mean?
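
If the goal is instead one row per item with a column per tag type (rather than one row per item/type pair), conditional aggregation also works, since there are only 2-3 known types. The attributes table name is taken from the query above; items and the item_name column are assumed from the example data:
code:
SELECT i.item_id,
       i.item_name,
       SUM(CASE WHEN a.tag_type = 'color'  THEN 1 ELSE 0 END) AS color_tags,
       SUM(CASE WHEN a.tag_type = 'weight' THEN 1 ELSE 0 END) AS weight_tags
  FROM items i
  LEFT JOIN attributes a ON a.item_id = i.item_id
 GROUP BY i.item_id, i.item_name;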
