Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...
Related to my post in the Excel thread.

This is probably really, really simple for anyone who actually knows SQL well. Me? It's just that thing that ActiveRecord hides from me. And my only experience is with MySQL, not MSSQL. But man, you stick a JOIN in there and my eyes just glaze over.

In this spreadsheet we have two queries. One, amusingly, queries the other. I'm trying to figure out why.

Is there any reason this:

code:
SELECT V_SP1OPERATIONTYPE.NAME                                                                                                                 OPERATION,
       DP_CBS_STACK_VERSION_REVISION.DATAPOINTVALUE                                                                                            BUILD,
       V_PROCESSORARCHITECTURE.NAME                                                                                                            ARCH,
       V_VISTASKU.NAME                                                                                                                         SKU,
       H_SESSIONHEADER.CLIENTSESSIONENDTIME                                                                                                    DATEANDTIME,
       DP_CBS_MACHINE_NAME.DATAPOINTVALUE                                                                                                      MACHINE_NAME,
       ((H_SESSIONHEADER.FLAGS & 16) / 16)                                                                                                     IS_TEST,
       HASHBYTES('SHA1',DP_CBS_MACHINE_NAME.DATAPOINTVALUE) + DP_CBS_STACK_VERSION_REVISION.DATAPOINTVALUE + V_SP1OPERATIONTYPE.DATAPOINTVALUE HASH
FROM   DP_CBS_MACHINE_NAME
       INNER JOIN (V_SP1OPERATIONTYPE
                   INNER JOIN (DP_CBS_STACK_VERSION_REVISION
                               INNER JOIN (H_SESSIONHEADER
                                           INNER JOIN (V_PROCESSORARCHITECTURE
                                                       INNER JOIN (V_VISTASKU
                                                                   FULL OUTER JOIN V_SP1OPERATIONRESULT
                                                                     ON V_SP1OPERATIONRESULT.SESSIONID = V_VISTASKU.SESSIONID)
                                                         ON V_VISTASKU.SESSIONID = V_PROCESSORARCHITECTURE.SESSIONID)
                                             ON V_PROCESSORARCHITECTURE.SESSIONID = H_SESSIONHEADER.SESSIONID)
                                 ON H_SESSIONHEADER.SESSIONID = DP_CBS_STACK_VERSION_REVISION.SESSIONID)
                     ON DP_CBS_STACK_VERSION_REVISION.SESSIONID = V_SP1OPERATIONTYPE.SESSIONID)
         ON V_SP1OPERATIONTYPE.SESSIONID = DP_CBS_MACHINE_NAME.SESSIONID
WHERE  DP_CBS_MACHINE_NAME.DATAPOINTVALUE != '(null)'
       AND (H_SESSIONHEADER.CLIENTUPLOADTIME < (GETDATE() - .5))
       AND V_SP1OPERATIONRESULT.NAME IS NULL
would somehow return results that this:
code:
SELECT
  HASHBYTES('SHA1',DP_CBS_MACHINE_NAME.DATAPOINTVALUE) + DP_CBS_STACK_VERSION_REVISION.DATAPOINTVALUE + V_SP1OPERATIONTYPE.DATAPOINTVALUE HASH,
  V_SP1OPERATIONRESULT.NAME RESULT,
  V_SP1OPERATIONTYPE.NAME OPERATION,
  DP_CBS_STACK_VERSION_REVISION.DATAPOINTVALUE BUILD,
  H_SESSIONHEADER.CLIENTSESSIONENDTIME DATEANDTIME,
  DP_CBS_MACHINE_NAME.DATAPOINTVALUE MACHINE_NAME
FROM
 DP_CBS_MACHINE_NAME
  INNER JOIN (V_SP1OPERATIONRESULT
   INNER JOIN (DP_CBS_STACK_VERSION_REVISION
    INNER JOIN (H_SESSIONHEADER
  INNER JOIN V_SP1OPERATIONTYPE
  ON V_SP1OPERATIONTYPE.SESSIONID = H_SESSIONHEADER.SESSIONID)
    ON H_SESSIONHEADER.SESSIONID = DP_CBS_STACK_VERSION_REVISION.SESSIONID)
   ON DP_CBS_STACK_VERSION_REVISION.SESSIONID = V_SP1OPERATIONRESULT.SESSIONID )
  ON V_SP1OPERATIONRESULT.SESSIONID = DP_CBS_MACHINE_NAME.SESSIONID
WHERE
 DP_CBS_MACHINE_NAME.DATAPOINTVALUE != '(null)'
would NOT?

I've been over this in my head lots and lots of times and I can't see how a query with more criteria would somehow return more data.

The only thing I can think of is that FULL OUTER JOIN in there, but that doesn't make a lot of sense (actually, the existence of the entire join doesn't make sense) because one of the criteria is for that table not to be null.

The way the Excel sheet works is that it takes each query and places them into a table. Then there's a field in the first query's table that checks the second query's table for the existence of the hash, and if it's not there, it assumes there was never a response and says so.

But it doesn't make a lot of sense to me - they're both querying the same table, they're both using the same restraints - the one that returns "more" data actually has more restraints...

I have the feeling that if I could condense this into a single query then I could merge this one with the other query in the spreadsheet and be done with it.

Also, I don't know a lot about SQL, but is this an incredibly messy query or is it just my inexperience with the language?

Dessert Rose fucked around with this message at 02:21 on Dec 6, 2007

Adbot
ADBOT LOVES YOU

Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...

Victor posted:

That SQL is revolting, especially the table breaking part. The SQL formatter in the OP sucks, IMHO.

You have two tables in the first query that don't show up in the second query. (snip)

Glad to know I'm not the only one who thinks it looks like poo poo.

Wouldn't an inner join not actually add any records to the result, though? My understanding of INNER JOIN is that it only pulls extra records in if there's something on both sides.

Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...

Victor posted:

Let's say I run a query on the table Thread. Then I add an inner join to Post. Will not the number of results balloon?

Okay, that makes sense for a has-many association. To rephrase:

Look at the two tables in question. A given install will only have one arch and only one sku; it's impossible to install the x64 and x86 versions simultaneously, for example.

So, given that I'm only doing inner joins on tables that have one-to-one relationships with the initial table, is the same true?

This is also a little off what my actual question is.

Is there a way to construct a query such that I only get the results returned by the first query, that aren't returned by the second? I look at the queries and what I come up with is already in there (the OperationResult.NAME IS NULL bit) but all I'm really doing is stumbling in the dark.

And of course I can't find the guy who wrote this to begin with to beat him with a stick. Which is understandable; I know I wouldn't want to maintain this crap.

Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...

Victor posted:

Have you verified that they are one-to-one? The only way I can see this not being the culprit is if I don't fully understand the semantics of full outer join when used like you have in your example. You can figure this out (and report back to me): select sessionid from all the tables in the first query and tell me if any are ever null.

If you were using SQL 2005, you could use the except operator. (derived table neat stuff)

I think we might be... I don't know what the database is running but I have a feeling it would be 2005 :)

So just so I "get" your pattern, what I would do is essentially create two derived tables, one from each query, and then do that left join on it? I :google:d derived tables and that aspect of it makes sense to me. What does "t2.a = t1.a or t2.a is null and t1.a is null" do in the context of a join, exactly? Or would I just replace that with some entirely different code, like "t1.v_spoperationtype.sessionid = t2.v_spoperationtype.sessionid"?

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply