SELECT * FROM Questions WHERE Type = 'Stupid'

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'

«‹›303 »

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Bonus posted:

You can see how efficient your query is. How much records needed to be traversed to get the results of the query. For instance, you almost never want to do full table scans. It's good to check your queries with explain to see if full table scans are happening at certain queries so you can introduce indices or reform the queries.

So how do I read and interpret the results of the EXPLAIN to do this?

# ? Dec 4, 2007 20:17

Adbot: ADBOT LOVES YOU

# ? May 13, 2024 11:33

Victor: Jun 18, 2004

Wow, I can see how reading EXPLAIN results would be a bit painful. Compare that to MS SQL Server query plans:

# ? Dec 4, 2007 20:23

m5: Oct 1, 2001

Check out Aqua Fold "DataStudio" for a free, flawed, but overall useful tool to provide graphical "EXPLAIN" rendering for a variety of databases.

# ? Dec 4, 2007 20:34

Melraidin: Oct 11, 2005

I've run into some spooky MySQL stuff recently. Specifically, expressions being evaluated for each row of a resultset in a manner I'm really not comfortable with.

I'm looking for a running total of some values. This, at least in my past experience, is difficult with only SQL (at least, if you avoid cursors).

I've now found that I can do this in MySQL:

code:

SET @Total := 0;

SELECT
   UserID,
   @Total := IFNULL( @Total, 0 ) + SUM( BandwidthUsed ) AS BandwidthUsageToDate
FROM
   UserBandwidth
GROUP BY
   UserBandwidth.UserID,
   DATE( UserBandwidth.LogDate )
ORDER BY
   UserBandwidth.LogDate ASC

This returns a resultset with a user ID and their total bandwidth usage to date, grouped by day.

If I were to try the same in MS SQL:

code:

SET @Total = 0;

SELECT
   UserID,
   @Total = ISNULL( @Total, 0 ) + SUM( BandwidthUsed ) AS BandwidthUsageToDate
FROM
   UserBandwidth
GROUP BY
   UserBandwidth.UserID,
   DATEPART( YEAR, UserBandwidth.LogDate ),
   DATEPART( MONTH, UserBandwidth.LogDate ),
   DATEPART( DAY, UserBandwidth.LogDate )
ORDER BY
   UserBandwidth.LogDate ASC

I get this error: 'A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations.'

Is there a way to get a running total from MS SQL in the same manner as from MySQL? Or, as I suspect, is getting a running total from MySQL a bad idea to begin with?

# ? Dec 5, 2007 15:41

Victor: Jun 18, 2004

How does the MySQL example actually do what you think it is doing? You are grouping by UserID, but only getting one row back -- do you only have one user? If you are convinced that MySQL really is doing the proper thing, do post the results, because I would be very surprised.

The TSQL version can be fixed by only assigning to variables, and not retrieving data. However, be warned that, at least if you throw in an ORDER BY clause, SQL Server is not guaranteed to populate your variable properly. If you'd like, I can yank a script that generates pretty CREATE TABLE code from INFORMATION_SCHEMA, which assigns to a @sql variable in the select statement in the style above, and breaks (only returns one row) if I add an ORDER BY ORDINAL_POSITION, in SQL 2005 (SQL 2000 works fine with that ORDER BY).

Edit: if you're using SQL 2005, do this properly by writing an SQL CLR aggregate. I have two such aggregates: dbo.JoinWithPipe and dbo.JoinWithCommaSpace.

# ? Dec 5, 2007 15:53

Alex007: Jul 8, 2004

Melraidin posted:

(...)

Is there a way to get a running total from MS SQL in the same manner as from MySQL? Or, as I suspect, is getting a running total from MySQL a bad idea to begin with?

Calculating Running Totals, Subtotals and Grand Total in MSSQL Without a Cursor

# ? Dec 5, 2007 16:10

Melraidin: Oct 11, 2005

Victor posted:

How does the MySQL example actually do what you think it is doing? You are grouping by UserID, but only getting one row back -- do you only have one user? If you are convinced that MySQL really is doing the proper thing, do post the results, because I would be very surprised.

The TSQL version can be fixed by only assigning to variables, and not retrieving data. However, be warned that, at least if you throw in an ORDER BY clause, SQL Server is not guaranteed to populate your variable properly. If you'd like, I can yank a script that generates pretty CREATE TABLE code from INFORMATION_SCHEMA, which assigns to a @sql variable in the select statement in the style above, and breaks (only returns one row) if I add an ORDER BY ORDINAL_POSITION, in SQL 2005 (SQL 2000 works fine with that ORDER BY).

Edit: if you're using SQL 2005, do this properly by writing an SQL CLR aggregate. I have two such aggregates: dbo.JoinWithPipe and dbo.JoinWithCommaSpace.

You're right that my example sucked. I tried to write something that would be a bit easier to understand than the actual code, but I sure screwed that one up...

A better example, I hope (better since it's not broken):

code:

SELECT
   DATE( UserBandwidth.LogDate ) AS EntryDate,
   @Total := IFNULL( @Total, 0 ) + SUM( BandwidthUsed ) AS BandwidthUsageToDate
FROM
   UserBandwidth
WHERE
   UserBandwidth.UserID = 21
GROUP BY
   DATE( UserBandwidth.LogDate )
ORDER BY
   DATE( UserBandwidth.LogDate ) ASC

This query is intended to retrieve the date this user (21) used some bandwidth as well as a running total of the user's bandwidth usage.

In MySQL the equivalent of this query on my DB gives:

code:

EntryDate      BandwidthUsageToDate
2007-01-01     31
2007-01-02     63
2007-01-04     82
2007-01-09     123

The second paragraph of your post refers to only assigning to a variable in the query. This is all that I would expect to be allowed to do with MS SQL, but this doesn't resolve the issue, as I would of course only be getting the single value, where I need a value for each log date.

Could you elaborate on your edit? I'm not certain what portion of my post this is in reference to.

Alex007 posted:

Calculating Running Totals, Subtotals and Grand Total in MSSQL Without a Cursor

Thanks for the link, and this does do what I need. This article shows examples of solving this in the manner I would expect, but hadn't thought of. The major difference between this method (a sub-query for each row) and the MySQL method, at least from my understanding, is the MySQL method will only do the single query on the table, while the article's method will be doing Number of Rows + 1 queries on the table.

# ? Dec 5, 2007 16:29

Alex007: Jul 8, 2004

Melraidin posted:

Thanks for the link, and this does do what I need. This article shows examples of solving this in the manner I would expect, but hadn't thought of. The major difference between this method (a sub-query for each row) and the MySQL method, at least from my understanding, is the MySQL method will only do the single query on the table, while the article's method will be doing Number of Rows + 1 queries on the table.

If you want a GREAT book with a lot of complex problems solved, read Joe Celko's "SQL for Smarties - Advanced SQL Programming", it's an AWESOME book with lots of great solutions.

# ? Dec 5, 2007 16:39

Victor: Jun 18, 2004

What version of SQL Server are you using? I realize I misread your first post after you posted sample results, so I'm pretty sure you can't do what you want with variable assignment, but I think you can another way. SQL2005 offers a lot of sweet functionality that could potentially make this a lot easier.

# ? Dec 5, 2007 17:58

Melraidin: Oct 11, 2005

Victor posted:

What version of SQL Server are you using? I realize I misread your first post after you posted sample results, so I'm pretty sure you can't do what you want with variable assignment, but I think you can another way. SQL2005 offers a lot of sweet functionality that could potentially make this a lot easier.

MS SQL Server 2000 I believe. Even so, could you point me to these functions in SQL Server 2005, just for interest's sake?

# ? Dec 5, 2007 19:23

Victor: Jun 18, 2004

This guy says cursors are the best method for doing running totals.

Here are a few different ways to do running totals. I don't have time to explain them now; I can do that later if necessary. Only one of them requires SQL Server 2005; it uses recursive CTEs.

code:

-- play with these (turn them on)
set statistics profile off
set statistics time off

create table #data (a int not null, b int not null)
insert #data
    select a = 1, b = 1 union
    select 1, 2 union
    select 2, 1 union
    select 2, 2

;with 
ordered as (
    select  *, row_number = row_number() over (order by a, b)
    from    #data
),
running as (
    select  a, b, nest = 0, row_number, sum = b
    from    ordered
    where   row_number = 1
    union all
    select  cur.a, cur.b, prev.nest + 1, cur.row_number, prev.sum + cur.b
    from    ordered cur
    inner join running prev on prev.row_number + 1 = cur.row_number
)
select * from running

select  *, (select sum(b) from #data where a = t.a and b <= t.b or a < t.a)
from    #data t

select  t1.a, t1.b, sum(t2.b)
from    #data t1
inner join #data t2 on t2.a < t1.a or t2.a = t1.a and t2.b <= t1.b
group by t1.a, t1.b

declare @a int, @b int, @sum int
declare c cursor local fast_forward for
select  a, b from #data

open c
set @sum = 0
create table #results (a int not null, b int not null, sum int not null)

fetch next from c into @a, @b

while @@fetch_status = 0
begin
    set @sum = @sum + @b
    insert  #results values (@a, @b, @sum)
    
    fetch next from c into @a, @b
end

select * from #results

drop table #results
drop table #data

# ? Dec 5, 2007 20:53

<deleted user>

Melraidin posted:

Is there a way to get a running total from MS SQL in the same manner as from MySQL? Or, as I suspect, is getting a running total from MySQL a bad idea to begin with?

Pull the data out of the database as-is, and model the data in your application. Keeping computation out of your database makes it perform better, and promotes a looser coupling between your application and any one database platform.

Just my opinion!

# ? Dec 5, 2007 21:28

wwb: Aug 17, 2004

^^^What that man said. Most reporting layers easily support running totals with ease.

# ? Dec 5, 2007 21:30

Golbez: Oct 9, 2002; 1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

fletcher posted:

So how do I read and interpret the results of the EXPLAIN to do this?

I found a great tutorial to EXPLAIN and indices at http://hackmysql.com/

Specifically, the urls http://hackmysql.com/case1 and up, just change the number.

# ? Dec 5, 2007 22:01

Dessert Rose: May 17, 2004; awoken in control of a lucid deep dream...

Related to my post in the Excel thread.

This is probably really, really simple for anyone who actually knows SQL well. Me? It's just that thing that ActiveRecord hides from me. And my only experience is with MySQL, not MSSQL. But man, you stick a JOIN in there and my eyes just glaze over.

In this spreadsheet we have two queries. One, amusingly, queries the other. I'm trying to figure out why.

Is there any reason this:

code:

SELECT V_SP1OPERATIONTYPE.NAME                                                                                                                 OPERATION,
       DP_CBS_STACK_VERSION_REVISION.DATAPOINTVALUE                                                                                            BUILD,
       V_PROCESSORARCHITECTURE.NAME                                                                                                            ARCH,
       V_VISTASKU.NAME                                                                                                                         SKU,
       H_SESSIONHEADER.CLIENTSESSIONENDTIME                                                                                                    DATEANDTIME,
       DP_CBS_MACHINE_NAME.DATAPOINTVALUE                                                                                                      MACHINE_NAME,
       ((H_SESSIONHEADER.FLAGS & 16) / 16)                                                                                                     IS_TEST,
       HASHBYTES('SHA1',DP_CBS_MACHINE_NAME.DATAPOINTVALUE) + DP_CBS_STACK_VERSION_REVISION.DATAPOINTVALUE + V_SP1OPERATIONTYPE.DATAPOINTVALUE HASH
FROM   DP_CBS_MACHINE_NAME
       INNER JOIN (V_SP1OPERATIONTYPE
                   INNER JOIN (DP_CBS_STACK_VERSION_REVISION
                               INNER JOIN (H_SESSIONHEADER
                                           INNER JOIN (V_PROCESSORARCHITECTURE
                                                       INNER JOIN (V_VISTASKU
                                                                   FULL OUTER JOIN V_SP1OPERATIONRESULT
                                                                     ON V_SP1OPERATIONRESULT.SESSIONID = V_VISTASKU.SESSIONID)
                                                         ON V_VISTASKU.SESSIONID = V_PROCESSORARCHITECTURE.SESSIONID)
                                             ON V_PROCESSORARCHITECTURE.SESSIONID = H_SESSIONHEADER.SESSIONID)
                                 ON H_SESSIONHEADER.SESSIONID = DP_CBS_STACK_VERSION_REVISION.SESSIONID)
                     ON DP_CBS_STACK_VERSION_REVISION.SESSIONID = V_SP1OPERATIONTYPE.SESSIONID)
         ON V_SP1OPERATIONTYPE.SESSIONID = DP_CBS_MACHINE_NAME.SESSIONID
WHERE  DP_CBS_MACHINE_NAME.DATAPOINTVALUE != '(null)'
       AND (H_SESSIONHEADER.CLIENTUPLOADTIME < (GETDATE() - .5))
       AND V_SP1OPERATIONRESULT.NAME IS NULL

would somehow return results that this:

code:

SELECT
  HASHBYTES('SHA1',DP_CBS_MACHINE_NAME.DATAPOINTVALUE) + DP_CBS_STACK_VERSION_REVISION.DATAPOINTVALUE + V_SP1OPERATIONTYPE.DATAPOINTVALUE HASH,
  V_SP1OPERATIONRESULT.NAME RESULT,
  V_SP1OPERATIONTYPE.NAME OPERATION,
  DP_CBS_STACK_VERSION_REVISION.DATAPOINTVALUE BUILD,
  H_SESSIONHEADER.CLIENTSESSIONENDTIME DATEANDTIME,
  DP_CBS_MACHINE_NAME.DATAPOINTVALUE MACHINE_NAME
FROM
 DP_CBS_MACHINE_NAME
  INNER JOIN (V_SP1OPERATIONRESULT
   INNER JOIN (DP_CBS_STACK_VERSION_REVISION
    INNER JOIN (H_SESSIONHEADER
  INNER JOIN V_SP1OPERATIONTYPE
  ON V_SP1OPERATIONTYPE.SESSIONID = H_SESSIONHEADER.SESSIONID)
    ON H_SESSIONHEADER.SESSIONID = DP_CBS_STACK_VERSION_REVISION.SESSIONID)
   ON DP_CBS_STACK_VERSION_REVISION.SESSIONID = V_SP1OPERATIONRESULT.SESSIONID )
  ON V_SP1OPERATIONRESULT.SESSIONID = DP_CBS_MACHINE_NAME.SESSIONID
WHERE
 DP_CBS_MACHINE_NAME.DATAPOINTVALUE != '(null)'

would NOT?

I've been over this in my head lots and lots of times and I can't see how a query with more criteria would somehow return more data.

The only thing I can think of is that FULL OUTER JOIN in there, but that doesn't make a lot of sense (actually, the existence of the entire join doesn't make sense) because one of the criteria is for that table not to be null.

The way the Excel sheet works is that it takes each query and places them into a table. Then there's a field in the first query's table that checks the second query's table for the existence of the hash, and if it's not there, it assumes there was never a response and says so.

But it doesn't make a lot of sense to me - they're both querying the same table, they're both using the same restraints - the one that returns "more" data actually has more restraints...

I have the feeling that if I could condense this into a single query then I could merge this one with the other query in the spreadsheet and be done with it.

Also, I don't know a lot about SQL, but is this an incredibly messy query or is it just my inexperience with the language?

Dessert Rose fucked around with this message at 02:21 on Dec 6, 2007

# ? Dec 6, 2007 02:14

Victor: Jun 18, 2004

That SQL is revolting, especially the table breaking part. The SQL formatter in the OP sucks, IMHO.

You have two tables in the first query that don't show up in the second query.

code:

select operation    = v_sp1operationtype.name,
       build        = dp_cbs_stack_version_revision.datapointvalue,
       arch         = v_processorarchitecture.name,
       sku          = v_vistasku.name,
       dateandtime  = h_sessionheader.clientsessionendtime,
       machine_name = dp_cbs_machine_name.datapointvalue,
       is_test      = ((h_sessionheader.flags & 16) / 16),
       hash         = hashbytes('SHA1',dp_cbs_machine_name.datapointvalue) + ...
       [b]-- result[/b]
from   dp_cbs_machine_name
       inner join (v_sp1operationtype
       inner join (dp_cbs_stack_version_revision
       inner join (h_sessionheader
       inner join ([b]v_processorarchitecture[/b]
       inner join ([b]v_vistasku full 
       outer join[/b] v_sp1operationresult 
         on v_sp1operationresult.sessionid = v_vistasku.sessionid)
         on v_vistasku.sessionid = v_processorarchitecture.sessionid)
         on v_processorarchitecture.sessionid = h_sessionheader.sessionid)
         on h_sessionheader.sessionid = dp_cbs_stack_version_revision.sessionid)
         on dp_cbs_stack_version_revision.sessionid = v_sp1operationtype.sessionid)
         on v_sp1operationtype.sessionid = dp_cbs_machine_name.sessionid
where  dp_cbs_machine_name.datapointvalue != '(null)'
       [b]and (h_sessionheader.clientuploadtime < (getdate() - .5))
       and v_sp1operationresult.name is null[/b]


select operation    = v_sp1operationtype.name,
       build        = dp_cbs_stack_version_revision.datapointvalue,
       [b]-- arch[/b]
       [b]-- sku[/b]
       dateandtime  = h_sessionheader.clientsessionendtime,
       machine_name = dp_cbs_machine_name.datapointvalue,
       [b]-- is_test[/b]
       hash         = hashbytes('SHA1',dp_cbs_machine_name.datapointvalue) + ... ,
       result       = v_sp1operationresult.name
from   dp_cbs_machine_name
       inner join (v_sp1operationresult
       inner join (dp_cbs_stack_version_revision
       inner join (h_sessionheader
       inner join v_sp1operationtype
         on v_sp1operationtype.sessionid = h_sessionheader.sessionid)
         on h_sessionheader.sessionid = dp_cbs_stack_version_revision.sessionid)
         on dp_cbs_stack_version_revision.sessionid = v_sp1operationresult.sessionid)
         on v_sp1operationresult.sessionid = dp_cbs_machine_name.sessionid
where  dp_cbs_machine_name.datapointvalue != '(null)'

# ? Dec 6, 2007 02:47

Dessert Rose: May 17, 2004; awoken in control of a lucid deep dream...

Victor posted:

That SQL is revolting, especially the table breaking part. The SQL formatter in the OP sucks, IMHO.

You have two tables in the first query that don't show up in the second query. (snip)

Glad to know I'm not the only one who thinks it looks like poo poo.

Wouldn't an inner join not actually add any records to the result, though? My understanding of INNER JOIN is that it only pulls extra records in if there's something on both sides.

# ? Dec 6, 2007 03:04

Victor: Jun 18, 2004

Let's say I run a query on the table Thread. Then I add an inner join to Post. Will not the number of results balloon?

# ? Dec 6, 2007 03:10

Dessert Rose: May 17, 2004; awoken in control of a lucid deep dream...

Victor posted:

Let's say I run a query on the table Thread. Then I add an inner join to Post. Will not the number of results balloon?

Okay, that makes sense for a has-many association. To rephrase:

Look at the two tables in question. A given install will only have one arch and only one sku; it's impossible to install the x64 and x86 versions simultaneously, for example.

So, given that I'm only doing inner joins on tables that have one-to-one relationships with the initial table, is the same true?

This is also a little off what my actual question is.

Is there a way to construct a query such that I only get the results returned by the first query, that aren't returned by the second? I look at the queries and what I come up with is already in there (the OperationResult.NAME IS NULL bit) but all I'm really doing is stumbling in the dark.

And of course I can't find the guy who wrote this to begin with to beat him with a stick. Which is understandable; I know I wouldn't want to maintain this crap.

# ? Dec 6, 2007 03:20

mister_gosh: May 24, 2002

Just the thread I need!

I need some help. I have a table called relations. It stores an OBJECT, its PARENT, and the objects VERSION. So, for example,

code:

OBJECT  PARENT  VERSION AREA
============================
4567    1234    98989   1
4567    4356    98989   1
4522    4356    91102   1
4522    4356    91102   2

I want a result set that contains distinct VERSIONs, I don't care which PARENT I get, but I need it. I only want results from AREA 1. So I want my results to be:

4567 1234 98989
4522 4356 91102

My query is:

code:

SELECT object, parent, version from relations where area='1';

Which gives me the results:
4567 1234 98989
4567 4356 98989
4522 4356 91102

Just throwing a distinct command after the select distinct obviously won't do it. I've tried constructing a nested join but can't get it right (I keep putting things in the wrong order, like - select distinct version from relations in (select...)).

Can anyone help? I'm totally lost.

mister_gosh fucked around with this message at 03:48 on Dec 6, 2007

# ? Dec 6, 2007 03:45

kalleboo: Jan 13, 2001; Hj�lp

mister_gosh posted:

Can anyone help? I'm totally lost.

You need to look into GROUP BY.

# ? Dec 6, 2007 04:43

npe: Oct 15, 2004

The cheap way around this is to just use max() or min() to get the one you want. Like:

code:

SELECT r.object, max(r.parent), r.version
  FROM relationships r
  WHERE r.area = 1
  GROUP BY r.object, r.version;

This will work as long as you only have the single un-grouped column you want data from, if you have multiple then you're going to need an inline view and a join back to the ungrouped table.

# ? Dec 6, 2007 04:45

mister_gosh: May 24, 2002

Satan's Scallion posted:

The cheap way around this is to just use max() or min() to get the one you want. Like:
code:
SELECT r.object, max(r.parent), r.version
  FROM relationships r
  WHERE r.area = 1
  GROUP BY r.object, r.version;
This will work as long as you only have the single un-grouped column you want data from, if you have multiple then you're going to need an inline view and a join back to the ungrouped table.

Works perfectly!

# ? Dec 6, 2007 05:16

Victor: Jun 18, 2004

Ryouga Inverse posted:

Look at the two tables in question. A given install will only have one arch and only one sku; it's impossible to install the x64 and x86 versions simultaneously, for example.

So, given that I'm only doing inner joins on tables that have one-to-one relationships with the initial table, is the same true?

Have you verified that they are one-to-one? The only way I can see this not being the culprit is if I don't fully understand the semantics of full outer join when used like you have in your example. You can figure this out (and report back to me): select sessionid from all the tables in the first query and tell me if any are ever null.

quote:

Is there a way to construct a query such that I only get the results returned by the first query, that aren't returned by the second? I look at the queries and what I come up with is already in there (the OperationResult.NAME IS NULL bit) but all I'm really doing is stumbling in the dark.

If you were using SQL 2005, you could use the except operator. Instead, follow this pattern:

code:

select  a, b
from    table_name_or_derived_table t1
left join table_name_or_derived_table t2 on 
        (t2.a = t1.a or t2.a is null and t1.a is null)
    and (t2.b = t1.b or t2.b is null and t1.b is null)
where   t2.a is null
    and t2.b is null

Use derived tables that contain your two massive queries:

code:

from (
massive query here
) t1

# ? Dec 6, 2007 05:19

Dessert Rose: May 17, 2004; awoken in control of a lucid deep dream...

Victor posted:

Have you verified that they are one-to-one? The only way I can see this not being the culprit is if I don't fully understand the semantics of full outer join when used like you have in your example. You can figure this out (and report back to me): select sessionid from all the tables in the first query and tell me if any are ever null.

If you were using SQL 2005, you could use the except operator. (derived table neat stuff)

I think we might be... I don't know what the database is running but I have a feeling it would be 2005

So just so I "get" your pattern, what I would do is essentially create two derived tables, one from each query, and then do that left join on it? I :google:

d derived tables and that aspect of it makes sense to me. What does "t2.a = t1.a or t2.a is null and t1.a is null" do in the context of a join, exactly? Or would I just replace that with some entirely different code, like "t1.v_spoperationtype.sessionid = t2.v_spoperationtype.sessionid"?

# ? Dec 6, 2007 07:30

Victor: Jun 18, 2004

Ok, if you have SQL2005.. just comment out the columns in each query that aren't found in another, make them in the same order in each query, then stick an except in between the two queries and try to run it.

With derived tables, you'd literally replace "massive query here" with your massive query. However, using except is much cleaner and leaves much less room for error, so try that!

# ? Dec 6, 2007 12:03

mister_gosh: May 24, 2002

$5 paypal to whoever can fix this:

Now I have a situation where I'm retrieving an OBJECT.

Once I have the number, I want to get the OBJECT parent but only if it is of TYPE 201.

So for example, I'm working with OBJECT 4567. Since there are multiple PARENT entries, I only want to get the one that is of TYPE 201 but that info is in a separate table.

TABLE RELATIONS:

code:

OBJECT  PARENT  VERSION AREA
============================
4567    1234    98989   1
4567    5678    98989   1

TABLE INFOPOOL:

code:

ID      TYPE  
=============
1234    201  
5678    202

This has to be simple, I'm just not getting it. I'm thinking it's a join, but then it needs to be part of a WHERE clause.

mister_gosh fucked around with this message at 05:10 on Dec 7, 2007

# ? Dec 7, 2007 05:06

npe: Oct 15, 2004

You mean like

code:

SELECT r.object, r.parent, r.version
  FROM relations r
  JOIN infopool i
    ON r.parent = i.id
 WHERE i.type = 201

?

Sounds like you just need a primer on basic SQL joins, there has to be a million of them out there.

Edit: please don't paypal me anything

# ? Dec 7, 2007 05:16

mister_gosh: May 24, 2002

Satan's Scallion posted:

You mean like
code:
SELECT r.object, r.parent, r.version
  FROM relations r
  JOIN infopool i
    ON r.parent = i.id
 WHERE i.type = 201
?

Sounds like you just need a primer on basic SQL joins, there has to be a million of them out there.

Edit: please don't paypal me anything

In this instance, I know as I construct the query the OBJECT value, in this case 4567, so I want to include that in the conditions to ensure I only get one or so results.

I also failed to mention area needs to be included in this test. This appears to work:

code:

SELECT r.object, r.parent, r.version
  FROM relations r
  JOIN infopool i
    ON r.parent = i.id
 WHERE i.type = 201
   AND r.object='4567'
   AND r.area='1'

Thanks for the free code! I'm just overwhelmed with other work and didn't have the mental capacity to deal with this at the moment. I'm pretty sure I'm getting it now. Thanks again!

# ? Dec 7, 2007 05:34

duck monster: Dec 15, 2004

Sort of on/off topic, BUT;-

If anyones ever wondered about how sql type databases actually work at the core, have a read of this;-

First read this;-
http://www.sqlite.org/arch.html
Then read this;-
http://www.sqlite.org/vdbe.html

Its on how the SQL engine in SQLLite's opcode system works. Its fascinating stuff, and really is quite enlightening the process of going from really high level SQL query, to snappy-as-gently caress low level data retrieval. Its amazing stuff (Probably more so , considering what a wonder that little database really is.)

Honestly, after having a dig around the *excelent* developer documentation for SQLLite, I now know a hell of a lot more about how these things work than I used to. Required reading.

# ? Dec 7, 2007 08:55

USSR Lives On: Jun 10, 2003; asdf

I have a table that's basically a log of searches that has a userid, the search, and the timestamp as the columns. I'm trying to run a query that gets a users last few unique searches. Let's say I have a table like:

code:

userid   search    timestamp
============================
  1       foo     1234567890
  1       bar     1234567891
  1       you     1234567892
  1       foo     1234567893

If I want the last two unique searches, I would run a query like:

code:

SELECT DISTINCT search FROM searches WHERE userid = 1 ORDER BY timestamp DESC LIMIT 2

What I expect to get is foo and you.
What I actually get is you and bar.

I suspect that the last foo is cut out of the results because the DISTINCT clause matches it with the first foo and eliminates it before the ORDER BY is ever applied. Is there some way to get around this?

# ? Dec 8, 2007 05:15

Land Mime: Jul 13, 2007; Love never blows up and gets killed

This seems...wrong somehow, but creating an index on `timestamp` will make your query return the correct result. At least on MySQL 5.

edit: I mean I get why it works, it just seems to me that an index shouldn't affect the result of a query.

Land Mime fucked around with this message at 06:01 on Dec 8, 2007

# ? Dec 8, 2007 05:58

noonches: Dec 5, 2005; It represents my lifestyle and stratus as a street savy irreverent youth, who lives large, and yet hungers for the next level in life.

Since foo is not distinct, it is not part of the result. try a GROUP BY on search and select MAX(timestamp)

# ? Dec 8, 2007 05:59

npe: Oct 15, 2004

Land Mime posted:

This seems...wrong somehow, but creating an index on `timestamp` will make your query return the correct result. At least on MySQL 5.

edit: I mean I get why it works, it just seems to me that an index shouldn't affect the result of a query.

The reason it seems wrong is because MySQL is silently letting you get away with doing something you shouldn't be able to, and that is wrong. When you refer to a non-grouped column anywhere in your query, what are you really asking? For example...

code:

 SELECT a,b  FROM foo GROUP BY a

...this will give me rows with identical values of `a` grouped together, but what about b? There could be multiple rows, so which one is it supposed to give you? This is what aggregate functions like max() and min() are for, they let you specify which row you want.

Other databases, like Oracle, will actually error if you try to do this; you are required to have all columns in either your GROUP BY or wrapped in aggregate functions. MySQL just gives you "whichever" and I don't know if it's well-defined. It seems like adding an index has altered the result for you and should be a good indication to not trust it long term.

# ? Dec 8, 2007 06:09

noonches: Dec 5, 2005; It represents my lifestyle and stratus as a street savy irreverent youth, who lives large, and yet hungers for the next level in life.

I disagree, I don't think it's wrong, although in my attempt to figure out a good reason to debate your point, I couldn't think of anything that couldn't really just be solved with a MIN, MAX, or GROUP_CONCAT, or a well defined query. So it may be wrong, but it definitely make things easier at to just have MySQL overlook and just give you whichever column it feels like.

# ? Dec 8, 2007 06:17

Victor: Jun 18, 2004

noonches, you have to be kidding me. You're advocating an RDBMS which returns different results based on configuration like indexes. That means I could attempt to speed of a platform backed by MySQL by adding an index, and all of a sudden break the application. How on earth could that be construed in any way other than retarded?

This query can be made to work:

code:

SELECT DISTINCT search FROM searches WHERE userid = 1 ORDER BY timestamp DESC LIMIT 2

I don't see why LIMIT 2 can cause an incorrect result, as it is nonintuitive to me. I could blame it on MySQL being a toy RDBMS, but that doesn't help out the "OP". (freaking megathreads)

Try this:

code:

select  timestamp = max(timestamp),
        search
from    searches
where userid = 1
group by search
order by max(timestamp)
limit 2

# ? Dec 8, 2007 17:21

Land Mime: Jul 13, 2007; Love never blows up and gets killed

Victor posted:

code:

select  timestamp = max(timestamp),
        search
from    searches
where userid = 1
group by search
order by max(timestamp)
limit 2

This also returns the incorrect result (at least on mysql 5). I think you might just have to chalk this up to a mysql bug since sqlite and postgres seem to do this correctly. Create the index and, ideally, compare database versions somewhere and die if its not your current one.

# ? Dec 8, 2007 17:53

Victor: Jun 18, 2004

Try the following:

code:

select  *
from (
    select  timestamp = max(timestamp),
            search
    from    searches
    where userid = 1
    group by search
) t
order by timestamp
limit 2

Maybe even try putting the "order by" in the derived table. (SQL Server does not allow this, but perhaps MySQL requires it.)

# ? Dec 8, 2007 18:05

USSR Lives On: Jun 10, 2003; asdf

Sorry guys I thought I replied, but apparently not. I seem to have solved it by running this:

code:

SELECT search FROM searches WHERE userid = 1 GROUP BY search ORDER BY MAX(timestamp) DESC LIMIT 2

I'm not sure if it's going to start breaking due to some of the other things you guys have mentioned, but it's working pretty well so far... I'll report back if I notice things breaking. A couple of people have said that it's not working though, so I'm going to do some more thorough testing.

# ? Dec 9, 2007 17:20

Adbot: ADBOT LOVES YOU

# ? May 13, 2024 11:33

wolrah: May 8, 2006; what?

Victor posted:

noonches, you have to be kidding me. You're advocating an RDBMS which returns different results based on configuration like indexes. That means I could attempt to speed of a platform backed by MySQL by adding an index, and all of a sudden break the application. How on earth could that be construed in any way other than retarded?

AFAIK there are MySQL options that can disable most if not all of the retardedness. Unfortunately it suffers from the same problem as PHP where there are a ton of applications out there built with the assumption of retarded behavior. Obviously these break when the system is configured in a sane manner, so the defaults and settings commonly used on shared hosts are insane.

It's a bad cycle too, since leaving that crap enabled just means more idiots will use it, meaning there are even more applications which depend on bad features, and so on.

# ? Dec 9, 2007 21:03

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'

«‹›303 »