Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
McGlockenshire
Dec 16, 2005

GOLLOCKS!
As long as the MySQL version involved doesn't silently chomp CHAR columns over four characters long into VARCHARs, the identity table sounds perfectly fine.

(Keep in mind that you really only need to worry horribly about enums-as-data when the enum is used in multiple tables. A status column in a single table with a restricted set of possibilities that won't change frequently is ok to keep enumized, though it's still not a super solution. Oh if only MySQL supported CHECK clauses...)

McGlockenshire fucked around with this message at 18:36 on Mar 14, 2011

Adbot
ADBOT LOVES YOU

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

McGlockenshire posted:

As long as the MySQL version involved doesn't silently chomp CHAR columns over four characters long into VARCHARs, the identity table sounds perfectly fine.

(Keep in mind that you really only need to worry horribly about enums-as-data when the enum is used in multiple tables. A status column in a single table with a restricted set of possibilities that won't change frequently is ok to keep enumized, though it's still not a super solution. Oh if only MySQL supported CHECK clauses...)

I did in fact decide to return to an enum. The odds of these three statuses ever changing is closer to zero than the odds of my MySQL database deciding one day to simply delete itself. And if they do, so be it.

Yakattak
Dec 17, 2009

I am Grumpypuss
>:3

Is there a way with a SQL query with MySQL to prevent inserting a duplicate? I'd rather use a SQL query that checks for dupes first before inserting, rather than doing in my own code.

Vanadium Dame
May 29, 2002

HELLO I WOULD LIKE TO TALK TO YOU ABOUT MY VERY STRONG OPINIONS

Yakattak posted:

Is there a way with a SQL query with MySQL to prevent inserting a duplicate? I'd rather use a SQL query that checks for dupes first before inserting, rather than doing in my own code.

I'm not sure about MySQL, but in PL/SQL you can wrap the insert in a begin/end and in the exception block add when dup_val_on_index then NULL. I'm thinking this might be the MySQL equivalent: http://dev.mysql.com/doc/refman/5.0/en/declare-handler.html

EDIT: er, I'm assuming by dupe you mean something that would violate a unique index

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Yakattak posted:

Is there a way with a SQL query with MySQL to prevent inserting a duplicate? I'd rather use a SQL query that checks for dupes first before inserting, rather than doing in my own code.

It sounds like you're looking for INSERT IGNORE. If you try inserting a row that wouldn't be inserted because of a duplicate key, it ignores the insert. So I think you would want to also create a unique key encompassing the values you don't want duplicated.

The different commands are:

*INSERT IGNORE - Tries to insert, and if it duplicates a unique key, ignores the insert.
*INSERT ... ON DUPLICATE KEY UPDATE - Tries to insert, and if it duplicates a unique key, performs the update clause instead.
*REPLACE - Tries to insert, and if it duplicates a unique key, the old row is deleted and the new one is inserted in its place.

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Be careful with REPLACE INTO when the table you're operating on is referred to by other tables with ON DELETE CASCADE. Tt performs a delete and re-insert and the delete cascades immediately.

poemdexter
Feb 18, 2005

Hooray Indie Games!

College Slice
I have 2 claim tables. One is regular claims and the second is supporting claims that are attached to main claims on the first table. I'm inner joining them to get a result set of supporting claims attached to claims.

The first table has the claim id and paid amount.
The second table has the claim id that it is attached to.

What I want is a way to get the sum of the original claims without counting duplicates.

code:
claim | support | paid amt
--------------------------
A       B         10.00
A       C         10.00
A       D         10.00
A       E         10.00
X       Y         20.00
X       Z         20.00
This would be an example of my result set but I need to know that claim A is just 10 dollars and X is 20 dollars so I can do some sum at the end and know that this whole result set is worth 30 dollars if that makes any sense.

Of course the real query behind this is way more complicated but the simple question is how do I sum the claim paid amounts to get 30 bucks instead of 80 bucks.

bamhand
Apr 15, 2010

poemdexter posted:

I have 2 claim tables. One is regular claims and the second is supporting claims that are attached to main claims on the first table. I'm inner joining them to get a result set of supporting claims attached to claims.

The first table has the claim id and paid amount.
The second table has the claim id that it is attached to.

What I want is a way to get the sum of the original claims without counting duplicates.

code:
claim | support | paid amt
--------------------------
A       B         10.00
A       C         10.00
A       D         10.00
A       E         10.00
X       Y         20.00
X       Z         20.00
This would be an example of my result set but I need to know that claim A is just 10 dollars and X is 20 dollars so I can do some sum at the end and know that this whole result set is worth 30 dollars if that makes any sense.

Of course the real query behind this is way more complicated but the simple question is how do I sum the claim paid amounts to get 30 bucks instead of 80 bucks.

Can you do a sum distinct? I know there's a count distinct but don't know off the top of my head if you can sum as well. If not you can always do

select claim, sum(amount)/count(1) from tablename group by claim

So ugly though.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

bamhand posted:

Can you do a sum distinct? I know there's a count distinct but don't know off the top of my head if you can sum as well. If not you can always do

select claim, sum(amount)/count(1) from tablename group by claim

So ugly though.

The way I read it, he wants to integrate this into his existing query so that he doesn't need to run two queries. This suggestion seems to call for a second query, at which point surely you may just as well not perform the join and grab the information straightforwardly from the left-hand table.

poemdexter
Feb 18, 2005

Hooray Indie Games!

College Slice

bamhand posted:

Can you do a sum distinct? I know there's a count distinct but don't know off the top of my head if you can sum as well. If not you can always do

select claim, sum(amount)/count(1) from tablename group by claim

So ugly though.

This whole thing is in a huge stored procedure and would require placing the result set in a temp table then doing a query on that with a sum distinct. Not possible with the large amount of data we're dealing with.

My boss eventually came up with a solution doing a union instead of inner join and throwing zeros in rows where needed on union. I just thought I'd drop a line here to see if anyone could come up with something more elegant or if there was some SQL magic words i didn't know about. This is all IBM DB2 databases if that helps.

Aredna
Mar 17, 2007
Nap Ghost
I'm not familiar with DB2, but it looks like it supports the ROLLUP command that I've used with SQL Server do what you want: http://it.toolbox.com/blogs/db2luw/olap-sql-part-1-rollup-7985

Yakattak
Dec 17, 2009

I am Grumpypuss
>:3

Golbez posted:

It sounds like you're looking for INSERT IGNORE. If you try inserting a row that wouldn't be inserted because of a duplicate key, it ignores the insert. So I think you would want to also create a unique key encompassing the values you don't want duplicated.

The different commands are:

*INSERT IGNORE - Tries to insert, and if it duplicates a unique key, ignores the insert.
*INSERT ... ON DUPLICATE KEY UPDATE - Tries to insert, and if it duplicates a unique key, performs the update clause instead.
*REPLACE - Tries to insert, and if it duplicates a unique key, the old row is deleted and the new one is inserted in its place.

In regards to INSERT IGNORE, what do you mean by duplicating the unique queue? For instance, my unique key is the ID, but this is generated automatically by MySQL. So how would that work?

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Yakattak posted:

In regards to INSERT IGNORE, what do you mean by duplicating the unique queue? For instance, my unique key is the ID, but this is generated automatically by MySQL. So how would that work?

Then how would you be able to tell it's a duplicate? :) You have to make another unique, non-primary, key, containing the fields you want to maintain as unique. So while your primary key might be the ID, the unique key might consist of user, timestamp, and action, for a log, for a basic example.

Yakattak
Dec 17, 2009

I am Grumpypuss
>:3

Golbez posted:

Then how would you be able to tell it's a duplicate? :) You have to make another unique, non-primary, key, containing the fields you want to maintain as unique. So while your primary key might be the ID, the unique key might consist of user, timestamp, and action, for a log, for a basic example.

Oh I see now, I'm pretty stupid with SQL in general :downs:

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
Is there any difference between SELECT DISTINCT column and SELECT column ... GROUP BY column?

For example:

code:
SELECT column FROM table WHERE mydate >= blah GROUP BY column ORDER BY mydate DESC
versus

code:
SELECT DISTINCT column FROM table WHERE mydate >= blah ORDER BY mydate DESC
?

bamhand
Apr 15, 2010
Some databases may run one of those faster than the other but the end result will be the same.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
Let's say I have a nightly process that imports millions of rows from a file. We create a table and indexes, then load the file. This process is running so long, it's starting to run into the work day. So I was wondering: Do you think there'd be quantifiable optimization by not defining the indexes before we load the file (thus not having to reindex on every entry), and defining the indexes only after the file is loaded?

butt dickus
Jul 7, 2007

top ten juiced up coaches
and the top ten juiced up players

Golbez posted:

Is there any difference between SELECT DISTINCT column and SELECT column ... GROUP BY column?

For example:

code:
SELECT column FROM table WHERE mydate >= blah GROUP BY column ORDER BY mydate DESC
versus

code:
SELECT DISTINCT column FROM table WHERE mydate >= blah ORDER BY mydate DESC
?

If you just select one column, you'll get the same result set either way. If you select multiple columns, you'll get different results (this bit me before, but I don't remember exactly how).

bamhand
Apr 15, 2010

Doctor rear end in a top hat posted:

If you just select one column, you'll get the same result set either way. If you select multiple columns, you'll get different results (this bit me before, but I don't remember exactly how).

Huh really? How is that possible?

Also, question:

I have a query that groups and sums a bunch of stuff with the end result being:

category total
--------------
A 1
B 5
C 8

How can I get this to become

A B C
- - -
1 5 8

Using Access (I have no choice) so no pivots (I think). Assume I am unioning 20 rows together all with different categories (max of 3 columns per row) so typing out a big switch would not be practical.

bamhand fucked around with this message at 22:15 on Mar 16, 2011

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Golbez posted:

Let's say I have a nightly process that imports millions of rows from a file. We create a table and indexes, then load the file. This process is running so long, it's starting to run into the work day. So I was wondering: Do you think there'd be quantifiable optimization by not defining the indexes before we load the file (thus not having to reindex on every entry), and defining the indexes only after the file is loaded?

Yes. It makes perfect sense if populating a table with a huge amount of data, to add the indexes after the table is populated, for just that reason. Having said that there is a strong case for having any indexes that are needed to enforce relational constraints (unique keys and indexes on foreign keys). But the indexes that are just for querying the table, you can add after the table is populated.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Hammerite posted:

Yes. It makes perfect sense if populating a table with a huge amount of data, to add the indexes after the table is populated, for just that reason. Having said that there is a strong case for having any indexes that are needed to enforce relational constraints (unique keys and indexes on foreign keys). But the indexes that are just for querying the table, you can add after the table is populated.

MyISAM, so no foreign keys. Primary keys I wouldn't touch. No otherwise unique keys being imported.

Would it be equivalent to create the table as it is now, and simply issue an ALTER TABLE ... DISABLE KEYS before, and ALTER TABLE ... ENABLE KEYS after?

Aredna
Mar 17, 2007
Nap Ghost

Doctor rear end in a top hat posted:

If you just select one column, you'll get the same result set either way. If you select multiple columns, you'll get different results (this bit me before, but I don't remember exactly how).

I'd also be curious if someone can show how this can happen.

My only complaint about distinct is 99% of the time I see it in use it's when someone is getting duplicate records in their data they don't expect and rather than figuring out why they just get rid of them.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Golbez posted:

MyISAM, so no foreign keys. Primary keys I wouldn't touch. No otherwise unique keys being imported.

Would it be equivalent to create the table as it is now, and simply issue an ALTER TABLE ... DISABLE KEYS before, and ALTER TABLE ... ENABLE KEYS after?

I think that's true. That's one of the use cases for disabling keys. The other being when you have two or more tables that have foreign key relationships going both ways and you don't want to have to fuss about what order you insert data in.

Namlemez
Jul 9, 2003

bamhand posted:

I have a query that groups and sums a bunch of stuff with the end result being:

category total
--------------
A 1
B 5
C 8

How can I get this to become

A B C
- - -
1 5 8

Using Access (I have no choice) so no pivots (I think). Assume I am unioning 20 rows together all with different categories (max of 3 columns per row) so typing out a big switch would not be practical.


http://www.databasejournal.com/features/mssql/article.php/3516331/SQL-Pivot-and-Cross-Tab.htm

Lister_of_smeg
May 25, 2005
This may not be the correct place for this. I apologise if it's not.

Does anyone have any pointers for good places to read about investigating performance issues with MySQL? A bit of a weird problem has been dumped on me at work, and my MySQL knowledge begins and ends at simple SELECT statements.

The situation:.
We have a MySQL database running on a Linux server. Let's call it old-server. old-server is a dual-Xeon machine with 16Gb of RAM. This is running MySQL 5.0 on CentOS 4.6.

We have a nice, shiny, new Linux server that we'll call new-server. It has 12-cores (Not sure how they're split across physical CPUs) and 96Gb of RAM. This is running MySQL 5.5 on CentOS 5.5.

We have migrated the data over to the new server. MySQL is configured almost identically on the new server (I've increased the InnoDB buffer pool size inline with the fact the machine has much more RAM).

When our software is using the database on new-server, the creation of new records is taking a lot longer than on old-server (Creating 1000 records taks 647 seconds on old-server, 902 on new-server. Note that this isn't just inserts into one table, but one 'job' record creates records across multiple tables).

I'm left thinking one of a few things could be causing the problem. 1) MySQL 5.5 sucks for performance. 2) CentOS 5.5 sucks for performance. 3) The disks on new-server are much slower than old-server, so are creating a bottleneck. 4) If neither of those are the case then there must be some magical configuration options that will help.

If anybody can help, or point me to some good places to read up on this it would be greatly appreciated. Google is throwing up so much information that I don't know where to start.

butt dickus
Jul 7, 2007

top ten juiced up coaches
and the top ten juiced up players

Lister_of_smeg posted:

3) The disks on new-server are much slower than old-server, so are creating a bottleneck.

Are you just guessing here or do you know that the new disks are slower? The best thing we did to increase performance was switch to SSDs on a dedicated RAID card.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Hammerite posted:

I think that's true. That's one of the use cases for disabling keys. The other being when you have two or more tables that have foreign key relationships going both ways and you don't want to have to fuss about what order you insert data in.

Merely wrapping the load file in ENABLE/DISABLE KEYS cut two hours off a six hour import. :holy:

Also, changing a SELECT * to SELECT field1, field2, ... (when all we needed was 5 fields from a 40-field table) cut 3.5 hours off a 4 hour script :holy:

I am hero here.

Lister_of_smeg
May 25, 2005

Doctor rear end in a top hat posted:

Are you just guessing here or do you know that the new disks are slower? The best thing we did to increase performance was switch to SSDs on a dedicated RAID card.

At this point I'm just speculating. I'm waiting for our Nagios guy to get some monitoring setup so I can hammer both old and new server and take a look at what's happening as far as disk I/O is concerned.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Doctor rear end in a top hat posted:

If you just select one column, you'll get the same result set either way. If you select multiple columns, you'll get different results (this bit me before, but I don't remember exactly how).
This should only be true if every column in the select list is not also in the group by list, and even then most RDMBs will yell at you if you have columns that are not in either the group by or an aggregate function. MySQL (or at least whatever flavor of MySQL I last used 5 years ago) will let you have columns not in the group by, in which case it just returns the "first" value it sees.

butt dickus
Jul 7, 2007

top ten juiced up coaches
and the top ten juiced up players

Jethro posted:

This should only be true if every column in the select list is not also in the group by list

It was something like this.
The query
SELECT DISTINCT butts, pizza FROM table
might return something different than
SELECT butts, pizza FROM table GROUP BY butts
Now that I think about it, the first one doesn't even make any sense, no idea why it didn't blow up.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Lister_of_smeg posted:

This may not be the correct place for this. I apologise if it's not.

Does anyone have any pointers for good places to read about investigating performance issues with MySQL? A bit of a weird problem has been dumped on me at work, and my MySQL knowledge begins and ends at simple SELECT statements.

There's this blog, I've occasionally read articles from it and they really seem to know what they are talking about, but I have no idea if it will be of any help to you in diagnosing your problem. Good luck.

bamhand
Apr 15, 2010

Doctor rear end in a top hat posted:

It was something like this.
The query
SELECT DISTINCT butts, pizza FROM table
might return something different than
SELECT butts, pizza FROM table GROUP BY butts
Now that I think about it, the first one doesn't even make any sense, no idea why it didn't blow up.

The first one makes perfect sense? The DBs I've used won't allow the second one to run since it doesn't group by all the fields you selected.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Doctor rear end in a top hat posted:

It was something like this.
The query
SELECT DISTINCT butts, pizza FROM table
might return something different than
SELECT butts, pizza FROM table GROUP BY butts
Now that I think about it, the first one doesn't even make any sense, no idea why it didn't blow up.
SELECT DISTINCT butts, pizza FROM table
will return the same result as
SELECT butts, pizza FROM table GROUP BY butts, pizza
i.e. one row for every distinct (duh) combination of butts and pizza.

On the other hand,
SELECT butts, pizza FROM table GROUP BY butts
will throw an error in most sensible DBs because you're not supposed to have columns that are not in the group or part of an aggregate function. In MySQL it will return one row for each butts, with an indeterminate pizza for each butts.


EDIT: \/\/\/ distinct applies to the whole select statement, not just the first (or whatever) column.

Jethro fucked around with this message at 19:02 on Mar 17, 2011

butt dickus
Jul 7, 2007

top ten juiced up coaches
and the top ten juiced up players

bamhand posted:

The first one makes perfect sense? The DBs I've used won't allow the second one to run since it doesn't group by all the fields you selected.

I just tested it on one of my tables. In the first query, it acts as if the DISTINCT isn't even there and returns all 36 rows in the table. In the second one, it does the grouping and returns 16 rows like I want it to. If I just do
SELECT DISTINCT pizza FROM table
and
SELECT pizza FROM table GROUP BY pizza
they return the same thing (16 rows).
Maybe MySQL is just weird?

e: VVVVVVVVVVVV

Well holy poo poo. The custom reporting tool I just built would have been a huge pain in the rear end without being able to do this.

butt dickus fucked around with this message at 19:11 on Mar 17, 2011

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
MySQL is notorious for being one of the only, if not the only, servers which will allow the GROUP BY and SELECT to contain different fields.

Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe

Golbez posted:

MySQL is notorious for being one of the only, if not the only, servers which will allow the GROUP BY and SELECT to contain different fields.

This bit me in the rear end as I started porting my application to postgres and I noticed a few of our devs were using GROUP BY instead of DISTINCT.

Aredna
Mar 17, 2007
Nap Ghost

Golbez posted:

MySQL is notorious for being one of the only, if not the only, servers which will allow the GROUP BY and SELECT to contain different fields.

I'm being pedantic here, but the GROUP BY can contain fields not in the SELECT statement on most DBs, but not the other way around.

namol
Mar 21, 2007
Stupid SQL Question here.

System: SQL Server 2005 running on Server 2003 Enterprise 64bit.

Can you mix using the like clause of where and the contains clause? I'm trying to implement a google like search feature on an internal site and I'm having issues searching for numerical values. The table that being hit has been setup for full text searching. Whenever I search for an exact numerical value it never returns any results but if I search for using a wild card it will find it.

SELECT IDNum, QuoteNumber, DodgeNumber, JobName, Estimator, City, State, Rep, Rep2, ProjectType, Zip, County FROM estdodge WHERE CONTAINS(*, '"12345"') Order by QuoteNumber DESC

SELECT IDNum, QuoteNumber, DodgeNumber, JobName, Estimator, City, State, Rep, Rep2, ProjectType, Zip, County FROM estdodge WHERE CONTAINS(*, '"1234*"') Order by QuoteNumber DESC

IE:
Search for 12345 returns no results.
Search for 1234* returns results like 12341,12342,12343,12344,12345,123456 etc.

Is there some reason for this behavior that I'm missing?

namol fucked around with this message at 20:04 on Mar 22, 2011

Sprawl
Nov 21, 2005


I'm a huge retarded sperglord who can't spell, but Starfleet Dental would still take me and I love them for it!
mysql uses % as a string wildcard i dont know what sql engine your using but it looks like mysql but if you have fulltext shouldn't you be using MATCH

Sprawl fucked around with this message at 19:42 on Mar 22, 2011

Adbot
ADBOT LOVES YOU

namol
Mar 21, 2007

Sprawl posted:

mysql uses % as a string wildcard i dont know what sql engine your using but it looks like mysql but if you have fulltext shouldn't you be using MATCH

SQL Server 2005.

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply