Civil
Apr 21, 2003

Do you see this? This means "Have a nice day".
This should be really simple, but I'm just not getting it. Pretty new to mysql.
code:
Table A
----------
Field 1
apple
grape
orange
banana


Table B
----------
Field 1   Field 2
apple      car
banana     bus
banana     plane
orange     train
I want to do a JOIN that returns all records from Table A, and only one record (or none for 'grape') from Table B for each join. I started out doing a left join, and realized I screwed up when I was getting a record for each banana.

My first query was:
SELECT * FROM tablea left join tableb on tablea.field1 = tableb.field1

Which join do I want, or do I need another operator?

Alex007
Jul 8, 2004

Civil posted:

This should be really simple, but I'm just not getting it. Pretty new to mysql.

...

My first query was:
SELECT * FROM tablea left join tableb on tablea.field1 = tableb.field1

Which join do I want, or do I need another operator?

The answer is not in the JOIN, it's in the GROUP BY:
code:
SELECT TableA.Field1, MIN(TableB.Field2) AS MyField2
FROM TableA INNER JOIN TableB ON TableA.Field1 = TableB.Field1
GROUP BY TableA.Field1
Edit: Or a LEFT JOIN if you want rows from TableA even if they don't have a row in TableB.

Civil
Apr 21, 2003

Do you see this? This means "Have a nice day".
I do want records from TableA, even if they don't have a database match in TableB. My problem is that LEFT JOIN is giving me multiple returns on TableA when there are multiple matches on TableB. I only want the first match on TableB.

Alex007
Jul 8, 2004

Civil posted:

I do want records from TableA, even if they don't have a database match in TableB. My problem is that LEFT JOIN is giving me multiple returns on TableA when there are multiple matches on TableB. I only want the first match on TableB.

This is why you need to GROUP BY, because JOIN *WILL* return all rows from table B when the join condition is met.

Either you GROUP BY, just like I posted, or you join on a subquery to "prepare" table B, which would be slower and stupid. Here's what it would look like anyway:

code:
SELECT TableA.Field1, MyTableB.Field2
FROM TableA LEFT JOIN (SELECT TableB.Field1, MAX(TableB.Field2) AS Field2
                       FROM TableB GROUP BY TableB.Field1) AS MyTableB
            ON TableA.Field1 = MyTableB.Field1
But the correct way is what I posted earlier:

code:
SELECT TableA.Field1, MAX(TableB.Field2) AS MyField2 
FROM TableA LEFT JOIN TableB ON TableA.Field1 = TableB.Field1 
GROUP BY TableA.Field1
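To make that concrete, here's a quick sketch of the sample tables from the question run through the grouped LEFT JOIN (table and column names spelled as in the posts); 'grape' comes back with a NULL MyField2:
code:
CREATE TABLE TableA (Field1 VARCHAR(20));
CREATE TABLE TableB (Field1 VARCHAR(20), Field2 VARCHAR(20));

INSERT INTO TableA VALUES ('apple'), ('grape'), ('orange'), ('banana');
INSERT INTO TableB VALUES ('apple', 'car'), ('banana', 'bus'),
                          ('banana', 'plane'), ('orange', 'train');

SELECT TableA.Field1, MAX(TableB.Field2) AS MyField2
FROM TableA LEFT JOIN TableB ON TableA.Field1 = TableB.Field1
GROUP BY TableA.Field1;

-- apple -> car, banana -> plane, grape -> NULL, orange -> train
MIN vs MAX just decides which of banana's two rows survives, so either works for "give me any one match".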

Zedlic
Mar 10, 2005

Ask me about being too much of a sperging idiot to understand what resisting arrest means.
Here's a problem I've been working on for longer than I'd like. I know most of the SQL basics but this seems to be a bit over my head.

Let's say I have this simplified table of messages sent to and from a server. Every outgoing message gets an ACK back from the receiver once it arrives. Outgoing messages are given an ACK number (which loops from 0 to 999), and the corresponding ACK carries the same number.

Messages that are sent to the server don't have any ACKs, so they have an ACK number of 0.

So in this example, the first four rows are the server sending two messages of the type ABC, then getting an ACK to the first one, then the second one.
code:
+----+-----------+----------+----------+--------+
| ID | Direction | Time     | MSG type | ACK num|
+----+-----------+----------+----------+--------+
|  1 | UP        | 00:11:00 | ABC      |     12 |
|  2 | UP        | 00:11:00 | ABC      |     13 |
|  3 | DN        | 00:15:00 | ACK      |     12 |
|  4 | DN        | 00:20:00 | ACK      |     13 |
|  5 | UP        | 00:25:00 | DEF      |     14 |
|  6 | DN        | 00:26:00 | ACK      |     14 |
|  7 | UP        | 00:26:00 | ABC      |     15 |
|  8 | DN        | 00:26:00 | ACK      |     15 |
|  9 | DN        | 00:27:00 | ABC      |      0 |
| 10 | DN        | 00:29:00 | GHI      |      0 |
| 11 | UP        | 00:29:00 | DEF      |     16 |
| 12 | DN        | 00:29:00 | ACK      |     16 |
| 13 | UP        | 00:29:00 | ABC      |     17 |
| 14 | DN        | 00:30:00 | ACK      |     17 |
| 15 | DN        | 00:30:00 | ABC      |      0 |
+----+-----------+----------+----------+--------+
Now, what I want to do is pair each outgoing message with its corresponding ACK message, and create a new table that looks something like this:

code:
+----+-----------+-----------+---------------+---------+
| ID | Direction | Time sent | Time received | MSG type|
+----+-----------+-----------+---------------+---------+
|  1 | UP        | 00:11:00  | 00:15:00      | ABC     |
|  2 | UP        | 00:11:00  | 00:20:00      | ABC     |
|  5 | UP        | 00:25:00  | 00:26:00      | DEF     |
|  7 | UP        | 00:26:00  | 00:26:00      | ABC     |
|  9 | DN        | 00:27:00  | 00:27:00      | ABC     |
| 10 | DN        | 00:29:00  | 00:29:00      | GHI     |
| 11 | UP        | 00:29:00  | 00:29:00      | DEF     |
| 13 | UP        | 00:29:00  | 00:30:00      | ABC     |
| 15 | DN        | 00:30:00  | 00:30:00      | ABC     |
+----+-----------+-----------+---------------+---------+
Here we have all the outgoing messages paired with their corresponding ACK messages, so we can see when each message was sent and when it was received. The incoming messages are also displayed, but their sent and received times are the same, for the moment.

I've been trying various forms of self join on the table to get the desired result, but it either excludes what I want, includes what I don't want, or is way too slow. Or all of the above.

This might be way too specific for anyone to bother reading and understanding, but I figured I might as well try asking.

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
When I get crazy queries that are hard to figure out, I ask myself, what is the primary concept I want vs what is joined.

Primary concept: Rows where type != ACK
Join concept: Time
Join condition: ACK num = ACK num (selecting from ACKs only)

So, in pseudo code...

code:
with sent_messages as (
  select *
  from messages
  where type != ACK
), received_messages as (
  select *
  from messages
  where type = ACK
)
select *
from sent_messages
join received_messages
on ACK = ACK
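In more concrete terms, a sketch of that pseudocode against the example table might look like this (assuming MySQL 8+ for the WITH clause, and guessing column names like msg_type, ack_num and msg_time from the sample; it only behaves if each ack_num appears once in the window you query, which is the wraparound problem discussed below):
code:
WITH sent_messages AS (
  SELECT id, direction, msg_time, msg_type, ack_num
  FROM messages
  WHERE msg_type <> 'ACK'
), received_messages AS (
  SELECT ack_num, msg_time
  FROM messages
  WHERE msg_type = 'ACK'
)
SELECT s.id,
       s.direction,
       s.msg_time AS time_sent,
       -- inbound rows (ack_num = 0) have no ACK, so reuse the sent time
       COALESCE(r.msg_time, s.msg_time) AS time_received,
       s.msg_type
FROM sent_messages s
LEFT JOIN received_messages r
       ON r.ack_num = s.ack_num AND s.ack_num <> 0
ORDER BY s.id;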

Zedlic
Mar 10, 2005

Ask me about being too much of a sperging idiot to understand what resisting arrest means.

Triple Tech posted:

When I get crazy queries that are hard to figure out, I ask myself, what is the primary concept I want vs what is joined.

Primary concept: Rows where type != ACK
Join concept: Time
Join condition: ACK num = ACK num (selecting from ACKs only)

So, in pseudo code...

code:
with sent_messages as (
  select *
  from messages
  where type != ACK
), received_messages as (
  select *
  from messages
  where type = ACK
)
select *
from sent_messages
join received_messages
on ACK = ACK

Thanks for this.

My basic idea was to do something like this, but on a table that will hold hundreds of thousands of rows and grow by about 5,000 rows a day, this query will get pretty expensive.

The main problem is that the ACK_number isn't unique (loops from 0 to 999) so joining on that will return a lot of unwanted rows. So my fix was to only join on ACK_numbers that are less than 10 minutes apart, but that's a pretty cheap fix that made the query even more needlessly complicated and slow.

And as a side effect this approach pairs every incoming message with every other incoming message as well, since they all have an ACK_number of 0.

npe
Oct 15, 2004
Your table is going to be terrible for performance. Can you add a column, something like "ack_of", that is a foreign key back to the ID of the message being acknowledged? This would make querying much simpler and faster.
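If that's an option, a hypothetical version of that change could look like this (table and column names are guesses from the example):
code:
ALTER TABLE messages
  ADD COLUMN ack_of INT UNSIGNED NULL,
  ADD CONSTRAINT fk_messages_ack_of
      FOREIGN KEY (ack_of) REFERENCES messages (id);

-- The pairing query then collapses to a plain self-join:
SELECT m.id, m.direction,
       m.msg_time AS time_sent,
       COALESCE(a.msg_time, m.msg_time) AS time_received,
       m.msg_type
FROM messages m
LEFT JOIN messages a ON a.ack_of = m.id
WHERE m.msg_type <> 'ACK';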

var1ety
Jul 26, 2004

Zedlic posted:

Thanks for this.

My basic idea was to do something like this, but on a table that will hold hundreds of thousands of rows and grow by about 5,000 rows a day, this query will get pretty expensive.

The main problem is that the ACK_number isn't unique (loops from 0 to 999) so joining on that will return a lot of unwanted rows. So my fix was to only join on ACK_numbers that are less than 10 minutes apart, but that's a pretty cheap fix that made the query even more needlessly complicated and slow.

And as a side effect this approach pairs every incoming message with every other incoming message as well, since they all have an ACK_number of 0.

If an UP/DN ACK pair is *always* present (the other end always replies) and they are logged reasonably in order and there are no duplicates and you're using Oracle or SQL Server you could use an analytic function like FIRST_VALUE to compute the final table efficiently.

Otherwise, the query is going to be necessarily slow unless you help stage the join condition. One way you could do this is translating this rolling 0-999 value into a real value in the range 0-999,999,999 (a big number you won't ever reach) and storing this in another table, or computing it on the fly with a function. You could then index this function/column, and joins would become very fast.
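For the analytic route, here's a rough window-function sketch (not FIRST_VALUE exactly, but a window aggregate doing the same job; it assumes column names from the example, ignores the 0-999 wraparound, and needs a database with window functions such as MySQL 8+, Oracle or SQL Server):
code:
SELECT id, direction, time_sent, time_received, msg_type
FROM (
  SELECT id,
         direction,
         msg_time AS time_sent,
         CASE
           WHEN ack_num = 0 THEN msg_time   -- inbound messages carry no ACK
           ELSE MIN(CASE WHEN msg_type = 'ACK' THEN msg_time END)
                  OVER (PARTITION BY ack_num)
         END AS time_received,
         msg_type
  FROM messages
) t
WHERE msg_type <> 'ACK'
ORDER BY id;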

Aredna
Mar 17, 2007
Nap Ghost

var1ety posted:

If an UP/DN ACK pair is *always* present (the other end always replies) and they are logged reasonably in order and there are no duplicates and you're using Oracle or SQL Server you could use an analytic function like FIRST_VALUE to compute the final table efficiently.

I'll preface this by saying that I'm not very familiar with analytic functions. Would either of these work logically and efficiently in the case that an ACK is not always present?

code:
CASE
  WHEN [Time] - FIRST_VALUE([Time]) Over (...) > 10 minutes THEN NULL
  ELSE FIRST_VALUE([Time]) Over (...)
END AS [Time Received]
or:

- Calculate FIRST_VALUE to fill in the [Time received]
- Having clause to filter out rows with more than a 10 minute lag before the ACK

Stephen
Feb 6, 2004

Stoned
I have two tables:
autos
- id
- other columns that don't matter

and

votes
- id
- auto_id
- vote_value
- ip_address
- date_added

I wrote a query to select all autos that do not have any votes for the current IP address and day:
code:
SELECT a.id 
FROM autos a 
LEFT JOIN votes v ON v.auto_id = a.id 
WHERE v.ip_address != "This IP address" AND v.date_added != "Today"
The autos table has one entry, and the votes table has been truncated, so I should be getting one result back, however my query is entirely empty. Can anyone spot what I'm doing wrong here?

Alex007
Jul 8, 2004

Stephen posted:

...

Untested, but this is what you need I'm pretty sure
code:
SELECT a.id 
FROM autos a 
LEFT JOIN votes v ON v.auto_id = a.id 
WHERE (v.ip_address != "This IP address" AND v.date_added != "Today") OR v.id IS NULL
VVV Too slow old man :)

Aredna
Mar 17, 2007
Nap Ghost

Stephen posted:

...
I wrote a query to select all autos that do not have any votes for the current IP address and day:
code:
SELECT a.id 
FROM autos a 
LEFT JOIN votes v ON v.auto_id = a.id 
WHERE v.ip_address != "This IP address" AND v.date_added != "Today"
The autos table has one entry, and the votes table has been truncated, so I should be getting one result back, however my query is entirely empty. Can anyone spot what I'm doing wrong here?

Is it possible that your votes table does not have a matching id at all? If so, you will need to also check if v.id is null.

Try
code:
WHERE (v.ip_address != "This IP address" AND v.date_added != "Today") OR v.id IS NULL
There's probably a better way to write this so it makes sense logically as well, perhaps someone can point it out since it's not coming to me right now... or perhaps this is fine.
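One equivalent that reads a bit more directly (a sketch keeping the placeholder literals from the post): push the per-IP and per-day conditions into the ON clause, then keep only the autos with no surviving match.
code:
SELECT a.id
FROM autos a
LEFT JOIN votes v ON v.auto_id    = a.id
                 AND v.ip_address = "This IP address"
                 AND v.date_added = "Today"
WHERE v.id IS NULL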

Stephen
Feb 6, 2004

Stoned

Alex007 posted:

Untested, but this is what you need I'm pretty sure
code:
SELECT a.id 
FROM autos a 
LEFT JOIN votes v ON v.auto_id = a.id 
WHERE (v.ip_address != "This IP address" AND v.date_added != "Today") OR v.id IS NULL
VVV Too slow old man :)

Hah, this is perfect, thanks. I just assumed that since the IP address and Date values did not equal null, it should be returning the results anyways.

Thanks again.

Alex007
Jul 8, 2004

Stephen posted:

Hah, this is perfect, thanks.

Just doing my job :)

Stephen posted:

I just assumed that since the IP address and Date values did not equal null, it should be returning the results anyways.

Thanks again.

You can't test NULL with = or != because NULL is unknown.

If I show you two boxes and you do not know what's in them, you can't say their contents are the same or different, because they are BOTH unknown. = and != both come back UNKNOWN (NULL), which a WHERE clause treats like FALSE.

code:
SELECT CASE WHEN NULL = NULL THEN 'TRUE!!' ELSE 'FALSE' END AS What1
SELECT CASE WHEN NULL != NULL THEN 'TRUE!!' ELSE 'FALSE' END AS What2
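If you genuinely need to compare two possibly-NULL values in MySQL there's the NULL-safe operator <=> (some other databases spell the same idea IS NOT DISTINCT FROM); for everything else, stick to IS NULL / IS NOT NULL. A quick sketch:
code:
SELECT NULL = NULL    AS plain_equal,   -- NULL (unknown), treated as not-true
       NULL <=> NULL  AS null_safe,     -- 1
       NULL IS NULL   AS is_null_test;  -- 1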

Zedlic
Mar 10, 2005

Ask me about being too much of a sperging idiot to understand what resisting arrest means.

yaoi prophet posted:

Your table is going to be terrible for performance. Can you add a column, something like "ack_of", that is a foreign key back to the ID of the message being acknowledged? This would make querying much simpler and faster.

That's something I didn't really look into. Thanks for the tip, I'll check tomorrow.

It's going to be hard though, since the ID is just an auto-incremented index and is not related to the message itself in any way. Within the message data, the only connection between an ACK message and the original message is the ACK_number, which loops from 0-999.

var1ety posted:

If an UP/DN ACK pair is *always* present (the other end always replies) and they are logged reasonably in order and there are no duplicates and you're using Oracle or SQL Server you could use an analytic function like FIRST_VALUE to compute the final table efficiently.

Otherwise, the query is going to be necessarily slow unless you help stage the join condition. One way you could do this is translating this rolling 0-999 value into a real value in the range 0-999,999,999 (a big number you won't ever reach) and storing this in another table, or computing it on the fly with a function. You could then index this function/column, and joins would become very fast.

There is always an ACK to an UP message, either an ACK/Failed or ACK/Success. (The ACK isn't coming from the recipient himself, but that's not really important to this problem.) However, I only have MySQL. Don't know if something like that is available there.

I've thought about mapping the ACK_number to something bigger that won't realistically loop ever, but the problem is finding a mapping function that maps each pair of ACK_numbers to a unique number. Any ideas on that are appreciated.

var1ety
Jul 26, 2004

Zedlic posted:

I've thought about mapping the ACK_number to something bigger that won't realistically loop ever, but the problem is finding a mapping function that maps each pair of ACK_numbers to a unique number. Any ideas on that are appreciated.

If you created a new column ack_number_noloop then you could count the number of ACKs with that sequence number that have a smaller timestamp, multiply that count by 1000, and add the ack_number. This is a way to link the two records together as yaoi prophet recommended. It will work as long as, by the time you get message 500 again, the ACK for the previous message 500 has already been committed to the database.
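A hypothetical way to compute that value on the fly (table and column names are guesses from the example):
code:
SELECT m.id,
       m.ack_num,
       (SELECT COUNT(*)
          FROM messages prior
         WHERE prior.ack_num  = m.ack_num
           AND prior.msg_type = 'ACK'
           AND prior.msg_time < m.msg_time) * 1000 + m.ack_num AS ack_num_noloop
FROM messages m;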

Ashex
Jun 25, 2007

These pipes are cleeeean!!!
I keep getting this error when trying to import a database:

quote:

ERROR 1064 (42000) at line 35657: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''1' at line 1

I'm assuming there is something up with the very first chunk of code. Does anyone see something off about this?

code:
DROP TABLE IF EXISTS ACCESS;

CREATE TABLE `ACCESS` (
  `USERID`     INT(10) UNSIGNED   NOT NULL   DEFAULT '0',
  `FORUMID`    SMALLINT(5) UNSIGNED   NOT NULL   DEFAULT '0',
  `ACCESSMASK` SMALLINT(5) UNSIGNED   NOT NULL   DEFAULT '0',
    PRIMARY KEY ( `USERID`,`FORUMID` ))
ENGINE = MYISAM
DEFAULT CHARSET = LATIN1;

Diocenes
Jul 18, 2003
Bad Drummer

Ashex posted:

I keep getting this error when trying to import a database:

I think the error is actually near line 35657 of the dump -- the "at line 1" part refers to line 1 of that particular statement.

epswing
Nov 4, 2003

Soiled Meat
Ok, I've got sizes and persons. Persons have a foreign key (sid) referencing size (id), standard one-to-many. I want a listing of how many people are wearing which sizes, so if I have sizes S, M and L, I want a result telling me S=10, M=40, L=15.

This is all well and good, except using an inner join leaves out the sizes with totals of 0. I've put together a left join query which actually does do exactly what I want, but I get the feeling that there's a better way to do it.

code:
drop table if exists size;
create table size (
	id int primary key,
	name char(1)
);

insert into size set id=1, name='S';
insert into size set id=2, name='M';
insert into size set id=3, name='L';

drop table if exists person;
create table person (
	id int primary key,
	sid int,
	foreign key (sid) references size(id)
);

-- note: no one is wearing L
insert into person set id=1, sid=1;
insert into person set id=2, sid=1;
insert into person set id=3, sid=2;
insert into person set id=4, sid=2;
insert into person set id=5, sid=2;

-- does what i want, except leaves out sizes with totals of 0
select   s.name as size, count(s.id) as total
from     size s inner join person p on s.id=p.sid
group by s.id
order by s.id;

-- does exactly what i want, but i get the feeling there's a better way
select   s.name as size, sum(case when p.sid is null then 0 else 1 end) as total
from     size s left join (select sid from person) p on s.id=p.sid
group by s.id
order by s.id;
The first query produces
code:
|size | total|
|-----+------|
|S    | 2    |
|M    | 3    |
The second query (correctly) produces
code:
|size | total|
|-----+------|
|S    | 2    |
|M    | 3    |
|L    | 0    |
I just get the feeling that joining a subquery isn't the best solution to what I'm thinking might be a simple problem to solve.

var1ety
Jul 26, 2004
I'd do this

code:
select s.name AS size, coalesce(t.ct, 0) AS ct
  from (select sid, count(*) AS ct from person group by sid) t
 right outer join size s on (s.id = t.sid)
 order by 1

epswing
Nov 4, 2003

Soiled Meat
Can you tell me why this is the same/better/worse?

var1ety
Jul 26, 2004

epswing posted:

Can you tell me why this is the same/better/worse?

I think it's easier to understand what the query is doing when you break the query up. Additionally, while using a CASE expression in that way is convenient, it is hard on the database because Oracle has to evaluate it for every row.

I benchmarked the two original queries and found that my query ran in 50% of the time.

I then benchmarked my original query against a modified version that uses the case method, and found that my query ran in 70% of the time.

code:
select sum(ct)
  from (select s.name, coalesce(t.ct, 0) AS ct
          from (select sid,
                       sum(case
                             when sid is not null then
                              1
                             else
                              0
                           end) AS ct
                  from p2
                 group by sid) t
         right outer join s on (s.id = t.sid));

chocojosh
Jun 9, 2007

D00D.

epswing posted:

-- does exactly what i want, but i get the feeling there's a better way
select s.name as size, sum(case when p.sid is null then 0 else 1 end) as total
from size s left join (select sid from person) p on s.id=p.sid
group by s.id
order by s.id;

Hmm, I haven't tested this but shouldn't you just be able to:

select s.name as size, count(*) as total
from size s left join person p on s.id=p.sid
group by s.id
order by s.id;

http://www.java2s.com/Code/SQL/Select-Clause/COUNTandGROUPBY.htm

epswing
Nov 4, 2003

Soiled Meat
I'm almost positive I've tried exactly that, no dice.

code:
|size | total|
|-----+------|
|S    | 2    |
|M    | 3    |
|L    | 1    |
Incorrect total for L. It works if the true total is at least 1 for each size.

chocojosh
Jun 9, 2007

D00D.

epswing posted:

I'm almost positive I've tried exactly that, no dice.

code:
|size | total|
|-----+------|
|S    | 2    |
|M    | 3    |
|L    | 1    |
Incorrect total for L. It works if the true total is at least 1 for each size.

http://www.w3schools.com/sql/func_count.asp This says that it will return the count of NON-Null values.

As there is ONE record containing the value L after the left join, the COUNT(*) for L is 1. However, the p.id field in that row should be NULL (you can confirm this by running just the left join without the grouping). Thus, replacing the count(*) with count(p.id) should fix it... in theory. This is untested :)

epswing
Nov 4, 2003

Soiled Meat
You win!

code:
select   s.name as size, count(p.id) as total
from     size s left join person p on s.id=p.sid
group by s.id
order by s.id;
code:
|size | total|
|-----+------|
|S    | 2    |
|M    | 3    |
|L    | 0    |
I just needed to change count(s.id) to count(p.id) and inner join to left join in my initial attempt. I knew it was something simple that didn't require crazy sub-selects.

Muchas gracias!


uh zip zoom
May 28, 2003

Sensitive Thugs Need Hugs

In MS SQL, I'm trying to alter a nullable integer column so that it's not null, with a default value of zero.

How could this best be done using a constraint (I believe that's the only way to go about it, since I don't think you can add default values when altering a column)?

Xae
Jan 19, 2005

uh zip zoom posted:

In MS SQL, I'm trying to alter a nullable integer column so that it's not null, with a default value of zero.

How could this best be done using a constraint (I believe that's the only way to go about it, since I don't think you can add default values when altering a column)?
Update the Column, set =0 where col is null
Create a Before Insert Row Trigger
Add Constraint

uh zip zoom
May 28, 2003

Sensitive Thugs Need Hugs

Xae posted:

Update the Column, set =0 where col is null
Create a Before Insert Row Trigger
Add Constraint

Suppose a user adds a record and doesn't include a value for the column that has the default value; will that field still get said default value?

Alex007
Jul 8, 2004

Xae posted:

Update the Column, set =0 where col is null
Create a Before Insert Row Trigger
Add Constraint

Triggers ? For Default values ? God dammit this is a stupid idea.

This is what you're looking for:
code:
ALTER TABLE MyTable ADD DEFAULT (0) FOR MyCol
ALTER TABLE MyTable ALTER COLUMN MyCol int NOT NULL 
Yes, if you INSERT into that table without specifying a value for MyCol, it will use your default value.
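One caveat worth hedging: if the column already contains NULLs, the ALTER COLUMN ... NOT NULL step will fail until they're backfilled (which is the UPDATE from Xae's list), for example:
code:
UPDATE MyTable SET MyCol = 0 WHERE MyCol IS NULL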

uh zip zoom
May 28, 2003

Sensitive Thugs Need Hugs

Alex007 posted:


This is what you're looking for:

yes, yes it is. Thank you very much.

Xae
Jan 19, 2005

Alex007 posted:

Triggers ? For Default values ? God dammit this is a stupid idea.

This is what you're looking for:
code:
ALTER TABLE MyTable ADD DEFAULT (0) FOR MyCol
ALTER TABLE MyTable ALTER COLUMN MyCol int NOT NULL 
Yes, if you INSERT into that table without specifying a value for MyCol, it will use your default value.

What the gently caress do you think a default value is?

Alex007
Jul 8, 2004

Xae posted:

What the gently caress do you think a default value is?

Default values are better implemented using, well, default values, not triggers.

Xae
Jan 19, 2005

Alex007 posted:

Default values are better implemented using, well, default values, not triggers.

It's a chunk of code that goes "If blah is null, then set to x". Oddly similar to what a trigger would be. And for almost any database the performance hit would be negligible. Regardless of the terminology, the code running is going to be shockingly similar.

derdewey
Dec 3, 2004
Help with indices and using mysql's explain! I'm generating a so-called square table which joins 3 tables, but repeats some rows as columns so that my guy's analysis software can plow through it. The end result will be spit out into a csv. Here's the query:

These are InnoDB tables.
code:
SELECT table_0.response AS sex,
       table_1.response AS country,
       table_2.response AS last_serious_relationship,
       table_3.response AS chemistry_predictor,
       table_4.response AS interested_in,
       table_5.response AS age,
       data.respondent_id, data.response, data.primitive_id
FROM numeric_responses AS data
JOIN string_responses AS table_0
                      ON table_0.respondent_id = data.respondent_id
                      AND table_0.primitive_id = 1
JOIN string_responses AS table_1
                      ON table_1.respondent_id = data.respondent_id
                      AND table_1.primitive_id = 2
JOIN numeric_responses AS table_2
                      ON table_2.respondent_id = data.respondent_id
                      AND table_2.primitive_id = 3
JOIN numeric_responses AS table_3
                      ON table_3.respondent_id = data.respondent_id
                      AND table_3.primitive_id = 891
JOIN string_responses AS table_4
                      ON table_4.respondent_id = data.respondent_id
                      AND table_4.primitive_id = 892
JOIN string_responses AS table_5
                      ON table_5.respondent_id = data.respondent_id
                      AND table_5.primitive_id = 893
WHERE data.primitive_id NOT IN (1,2,3,891,892,893)
ORDER BY data.respondent_id, data.primitive_id
And boy howdy is this thing dog slow

And so I ran explain on that bad boy. I'm very unfamiliar with indices but I know they can be handy. Could someone point me in the right direction?
(sorry for the table breakage)
code:
+----+-------------+---------+------+----------------------------+---------------+---------+------------------------------------------+------+---------------------------------+
| id | select_type | table   | type | possible_keys              | key           | key_len | ref                                      | rows | Extra                           |
+----+-------------+---------+------+----------------------------+---------------+---------+------------------------------------------+------+---------------------------------+
|  1 | SIMPLE      | table_2 | ref  | primitive_id,respondent_id | primitive_id  | 4       | const                                    | 9315 | Using temporary; Using filesort | 
|  1 | SIMPLE      | table_0 | ref  | primitive_id,respondent_id | respondent_id | 4       | survey_development.table_2.respondent_id |    3 | Using where                     | 
|  1 | SIMPLE      | table_1 | ref  | primitive_id,respondent_id | respondent_id | 4       | survey_development.table_2.respondent_id |    3 | Using where                     | 
|  1 | SIMPLE      | table_4 | ref  | primitive_id,respondent_id | respondent_id | 4       | survey_development.table_0.respondent_id |    3 | Using where                     | 
|  1 | SIMPLE      | table_5 | ref  | primitive_id,respondent_id | respondent_id | 4       | survey_development.table_1.respondent_id |    3 | Using where                     | 
|  1 | SIMPLE      | table_3 | ref  | primitive_id,respondent_id | respondent_id | 4       | survey_development.table_4.respondent_id |   66 | Using where                     | 
|  1 | SIMPLE      | data    | ref  | primitive_id,respondent_id | respondent_id | 4       | survey_development.table_5.respondent_id |   66 | Using where                     | 
+----+-------------+---------+------+----------------------------+---------------+---------+------------------------------------------+------+---------------------------------+

Xae
Jan 19, 2005

Start with the Postgres Docs Link. Even if you don't use Postgres their docs are some of the best out there for most core DB things like Indexes, constraints, etc.

derdewey
Dec 3, 2004
Those are great docs. So I gave them a read and now I'm trying things out, but I can't get it to go fast yet. Even when I limit the result to 100 rows it goes dog slow.

Here is what I used to create the indices...
code:
CREATE INDEX numeric_responses_primitive_id_respondent_id_index ON numeric_responses (`primitive_id`, `respondent_id`)
CREATE INDEX numeric_responses_primitive_id_index ON numeric_responses (`primitive_id`)
CREATE INDEX numeric_responses_respondent_id_index ON numeric_responses (`respondent_id`)
CREATE INDEX string_responses_primitive_id_respondent_id_index ON string_responses (`primitive_id`, `respondent_id`)
CREATE INDEX string_responses_primitive_id_index ON string_responses (`primitive_id`)
CREATE INDEX string_responses_respondent_id_index ON string_responses (`respondent_id`)
CREATE INDEX text_responses_primitive_id_respondent_id_index ON text_responses (`primitive_id`, `respondent_id`)
CREATE INDEX text_responses_primitive_id_index ON text_responses (`primitive_id`)
CREATE INDEX text_responses_respondent_id_index ON text_responses (`respondent_id`)
And here's what I get when I do a describe of that query from my post before (eh, those are long index names so I put it in a file to avoid breaking the tables any more).

code:
http://stuff.360test.info/describe.txt
It looks like those indexes are being used... but it's too slow to be usable. And at 100 rows per 32 seconds it'll take just about forever.

nbv4
Aug 21, 2002

by Duchess Gummybuns
I've got a pretty simple problem:

code:
SELECT DISTINCT route FROM flights, planes LEFT JOIN tags ON planes.plane_id = tags.plane_id
WHERE flights.plane_id = planes.plane_id
AND ((planes.category_class = 1 OR planes.category_class = 3))
AND flights.pilot_id = 22
AND flights.route IS NOT NULL
AND tags.tag != "Turbine"
Basically I have a table called "flights" which is linked to one plane, which is then linked to multiple tags via a tag table. If I run this query, it will return no results. If I remove the last line, it'll work perfectly. Even if I change the "!=" to "=", I still get zero results. I need to exclude any flights where the plane has that certain "Turbine" tag.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

nbv4 posted:

I've got a pretty simple problem:

code:
SELECT DISTINCT route FROM flights, planes LEFT JOIN tags ON planes.plane_id = tags.plane_id
WHERE flights.plane_id = planes.plane_id
AND ((planes.category_class = 1 OR planes.category_class = 3))
AND flights.pilot_id = 22
AND flights.route IS NOT NULL
AND tags.tag != "Turbine"
Basically I have a table called "flights" which is linked to one plane, which is then linked to multiple tags via a tag table. If I run this query, it will return no results. If I remove the last line, it'll work perfectly. Even if I change the "!=" to "=", I still get zero results. I need to exclude any flights where the plane has that certain "Turbine" tag.
A few questions:

What RDBMS? MySQL?

Try using single quotes around 'Turbine'

If a plane has a 'Turbine' tag and a different tag (say, 'Butts'), do you want the flights associated with that plane? If not, then what you are trying to do will not work. I believe for that purpose, you would need to use a sub-select like
code:
SELECT route
  FROM flights,
       planes
 WHERE flights.plane_id = planes.plane_id
       AND ((planes.category_class = 1 OR planes.category_class = 3))
       AND flights.pilot_id = 22
       AND flights.route IS NOT NULL
       AND planes.plane_id not in
          (SELECT plane_id
             FROM tags
            WHERE tag = 'Turbine')
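If the sub-select bothers you, an equivalent anti-join sketch over the same tables (only the 'Turbine' tags get left-joined, and any flight whose plane found one is thrown away):
code:
SELECT DISTINCT flights.route
  FROM flights
  JOIN planes ON flights.plane_id = planes.plane_id
  LEFT JOIN tags ON tags.plane_id = planes.plane_id
                AND tags.tag = 'Turbine'
 WHERE (planes.category_class = 1 OR planes.category_class = 3)
   AND flights.pilot_id = 22
   AND flights.route IS NOT NULL
   AND tags.plane_id IS NULL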
