lord funk
Feb 16, 2004

Of course, thanks for the Java syntax!

Sab669
Sep 24, 2009

How would I do an INSERT-SELECT with some 'hardcoded' values?

ie.
SQL code:
INSERT INTO Table1 (col1, col2, col3)
SELECT col1
FROM Table2
WHERE qualifier = 25
That's what I have, but col2 and col3 cannot be null, so I need to throw some placeholders in there. I don't have the privileges to modify the table structure to add a default.

No Safe Word
Feb 26, 2005

Sab669 posted:

How would I do an INSERT-SELECT with some 'hardcoded' values?

ie.
SQL code:
INSERT INTO Table1 (col1, col2, col3)
SELECT col1
FROM Table2
WHERE qualifier = 25
That's what I have, but col2 and col3 cannot be null, so I need to throw some placeholders in there. I don't have the privileges to modify the table structure to add a default.

SQL code:
INSERT INTO Table1 (col1, col2, col3)
SELECT col1, 'whatever', 'you want'
FROM Table2
WHERE qualifier = 25

Sab669
Sep 24, 2009

So simple :argh:
I always overcomplicate stuff :v:

Mindless
Dec 7, 2001

WANTED: INFO on Mindless. Anything! Everything! Send to
Pillbug
Structure and efficiency question: Outer Join vs. Union with a NOT IN (for SQL Server 2008). Simplified from the current situation for clarity:

code:
User Table
-------
UserUID
UserName

Group Table
----------
GroupUID
GroupName

Training Table
------------
TrainingUID
ProgramName

UserGroup Table
-------------
UserUID
GroupUID

GroupTraining Table
------------
GroupUID
TrainingUID
AnnuallyRequired (bit)

UserTraining Table
-----------
UserTrainingUID
UserUID
TrainingUID
Started (date)
Completed (date)
Active (bit)

There are also functions to calculate the user's progress by training modules and quiz scores, etc.

A user can belong to several groups. All training assigned to the group is required to be taken by the users in that group at least once, but possibly annually. Therefore a user may have several UserTraining records for a single training, or none at all if the user has never taken the training. Users also have the option of taking training that is not required by the group. For any User + Training combination, there is only 1 record marked Active.

The question is what is the most efficient way to build the complete list of both started/completed training and training that is required but not started or completed?

I've got two solutions given @UserUID:
code:
SELECT UT.usertraininguid, 
       T.programname, 
       UT.traininguid, 
       UT.started, 
       UT.completed, 
       Maxscore(UT.traininguid)                     AS MaxScore, 
       Progress(UT.usertraininguid)                 AS Progress, 
       Isannuallyrequired(@UserUID, UT.traininguid) AS IsAnnuallyRequired, 
       Isdelinquent(UT.completed)                   AS IsDelinquent 
FROM   usertraining UT 
       JOIN training T 
         ON UT.traininguid = T.traininguid 
WHERE UT.useruid = @UserUID
AND UT.active=1
UNION 
SELECT 0                                           AS UserTrainingUID, 
       T.programname, 
       A.traininguid, 
       NULL                                        AS Started, 
       NULL                                        AS Completed, 
       Maxscore(A.traininguid)                    AS MaxScore, 
       0                                           AS Progress, 
       Isannuallyrequired(@UserUID, A.traininguid) AS IsAnnuallyRequired, 
       0                                           AS IsDelinquent 
FROM   (SELECT DISTINCT GT.traininguid 
        FROM   usergroup UG 
               JOIN grouptraining GT 
                 ON UG.groupuid = GT.groupuid 
        WHERE  UG.useruid = @UserUID 
               AND GT.traininguid NOT IN (SELECT traininguid 
                                          FROM   usertraining 
                                          WHERE  useruid = @UserUID)) A 
       JOIN training T 
         ON A.traininguid = T.traininguid 
Secondly,
code:
SELECT UT.usertraininguid, 
       T.programname, 
       A.traininguid, 
       UT.started, 
       UT.completed, 
       Maxscore(A.traininguid)                     AS MaxScore, 
       CASE 
         WHEN UT.usertraininguid IS NOT NULL THEN Progress(UT.usertraininguid) 
         ELSE 0 
       END                                         AS Progress, 
       Isannuallyrequired(@UserUID, A.traininguid) AS IsAnnuallyRequired, 
       CASE 
         WHEN ut.completed IS NOT NULL THEN Isdelinquent(UT.completed) 
         ELSE 0 
       END                                         AS IsDelinquent 
FROM   (SELECT UTA.traininguid 
        FROM   usertraining UTA 
        WHERE  UTA.useruid = @UserUID 
               AND UTA.active = 1 
        UNION 
        SELECT gt.traininguid 
        FROM   usergroup UG 
               JOIN grouptraining gt 
                 ON UG.groupuid = gt.groupuid 
        WHERE  UG.useruid = @UserUID) A 
       JOIN training T 
         ON A.traininguid = T.traininguid 
       LEFT OUTER JOIN usertraining UT 
                    ON A.traininguid = UT.traininguid 
                       AND UT.useruid = @UserUID 
                       AND UT.active = 1
My concern about the first solution is the NOT IN subquery; the second one relies on a LEFT OUTER JOIN instead. Currently I have no data, but once there are a million records in UserTraining, which is going to perform better? Is there a better solution or some tweak I'm overlooking?

Mindless fucked around with this message at 21:52 on Nov 12, 2012

McGlockenshire
Dec 16, 2005

GOLLOCKS!
You probably aren't going to find out until you insert a million rows and look at the execution plan, SQL Server's analogue of EXPLAIN. Query plans can change drastically based on the amount of data present in all of the other database systems, and SQL Server is surely no different.
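A minimal sketch of doing that in Management Studio once the table is populated (these are the standard SET options; run each candidate query and compare the output, or hit Ctrl+M to include the actual execution plan):

SQL code:
SET STATISTICS IO ON;    -- logical/physical reads per table
SET STATISTICS TIME ON;  -- parse/compile and execution times

-- run candidate query #1 here, then candidate query #2, and compare

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;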

Sab669
Sep 24, 2009

This isn't necessarily a SQL question, but what is the difference between Windows Authentication and SQL Server Authentication for SQL Server Management Studio, and why would one use one over the other? In school we always did SQL Authentication, but this customer I'm working with uses Windows Authentication. I wasn't sure if there's a difference / what might persuade a user to go one way instead of the other.

glompix
Jan 19, 2004

propane grill-pilled

Sab669 posted:

This isn't necessarily a SQL question, but what is the difference between Windows Authentication and SQL Server Authentication for SQL Server Management Studio, and why would one use one over the other? In school we always did SQL Authentication, but this customer I'm working with uses Windows Authentication. I wasn't sure if there's a difference / what might persuade a user to go one way instead of the other.

Windows authentication provides a bunch of advantages. For one thing, you're not transmitting or storing a password of any kind - you're leaving it up to Active Directory. That's far more secure, especially when you think about how SQL connection strings are usually stored. What's better to keep stored around your production web servers, "Source=Blah;Security=Integrated" or "Source=Blah;Uid=sa;Password=who-cares-if-it's-hard-it's-plain-text"?

It's also a lot easier as a developer, since you can just log in as the Windows user you're currently using (think single sign-on).
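For comparison, the two styles look roughly like this (server, database, and credentials are placeholders; exact keywords vary a bit by provider):

code:
Data Source=myServer;Initial Catalog=myDb;Integrated Security=True
Data Source=myServer;Initial Catalog=myDb;User ID=appUser;Password=hunter2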

bucketmouse
Aug 16, 2004

we con-trol the ho-ri-zon-tal
we con-trol the verrr-ti-cal
Here's a really really dumb question.

So, Oracle:

I've got a table with a dozen or so fields which need to be concatenated together, inline, in a select statement, separated by newlines.

code:
select (first || chr(10) || second || chr(10) || third .. ) from table;
works fine, except some of the fields might be null leading to multiple newlines and empty space in the results. Is there a ternary operator or something I can use to keep the extra newlines from being inserted if a field is null?

Bad Titty Puker
Nov 3, 2007
Soiled Meat

bucketmouse posted:

Here's a really really dumb question.

So, Oracle:

I've got a table with a dozen or so fields which need to be concatenated together, inline, in a select statement, separated by newlines.

code:
select (first || chr(10) || second || chr(10) || third .. ) from table;
works fine, except some of the fields might be null leading to multiple newlines and empty space in the results. Is there a ternary operator or something I can use to keep the extra newlines from being inserted if a field is null?

NULL || 'any string' returns NULL. You can use COALESCE to convert your NULL column values to non-null values:

code:
SELECT COALESCE(first, '') || CHR(10) || COALESCE(second, '') etc.

Goat Bastard
Oct 20, 2004

camels posted:

NULL || 'any string' returns NULL. You can use COALESCE to convert your NULL column values to non-null values:

code:
SELECT COALESCE(first, '') || CHR(10) || COALESCE(second, '') etc.

Not in Oracle. || is the concatenation operator, and NULL || 'any string' returns the string.

Equally unintuitively, Oracle considers the empty string to be null, so those COALESCE calls are no-ops.

code:
SQL> SELECT NULL || 'x' FROM dual;

x

SQL> SELECT 'x' || NULL FROM dual;

x


SQL> SELECT CASE WHEN '' IS NULL THEN 'true' ELSE 'false' END FROM dual;

true
There's no ternary operator in Oracle SQL, but if you're able to create functions then you could do something like

code:
SQL> CREATE OR REPLACE FUNCTION appendNewline(str IN VARCHAR2)
  2  RETURN VARCHAR2
  3  DETERMINISTIC
  4  AS
  5  BEGIN
  6    IF str IS NOT NULL THEN
  7      RETURN str || chr(10);
  8    END IF;
  9    RETURN NULL;
 10  END;
 11  /

Function created.

SQL> SELECT appendNewline('first') || appendNewline(NULL) || appendNewline('third') FROM dual;

first
third

pigdog
Apr 23, 2004

by Smythe

bucketmouse posted:

Here's a really really dumb question.

So, Oracle:

I've got a table with a dozen or so fields which need to be concatenated together, inline, in a select statement, separated by newlines.

code:
select (first || chr(10) || second || chr(10) || third .. ) from table;
works fine, except some of the fields might be null leading to multiple newlines and empty space in the results. Is there a ternary operator or something I can use to keep the extra newlines from being inserted if a field is null?

Take a look at NVL2().
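NVL2(expr, value_if_not_null, value_if_null) returns its second argument when the first is not null and its third otherwise, so each newline can ride along with the field it belongs to. A rough sketch against a placeholder table name:

SQL code:
SELECT first
       || NVL2(second, CHR(10) || second, '')
       || NVL2(third, CHR(10) || third, '')
FROM   my_table;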

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Goat Bastard posted:

Not in Oracle. || is the concatenation operator, and NULL || 'any string' returns the string.
:what: How do people come up with these things

bucketmouse
Aug 16, 2004

we con-trol the ho-ri-zon-tal
we con-trol the verrr-ti-cal
Yeah I figured I'd have to do something like that. I've already had the fun experience of reimplementing AUTO_INCREMENT with sequences. Thanks.

Golbez posted:

:what: How do people come up with these things

Oracle is so helpful, you see:

code:
SQL> select 'x' | NULL from dual;
select 'x' | NULL from dual
           *
ERROR at line 1:
ORA-00996: the concatenate operator is ||, not |
:suicide:

Goat Bastard
Oct 20, 2004

Golbez posted:

:what: How do people come up with these things

I think it's actually defined as that in the ANSI SQL standard, which probably pre-dates '+' becoming the de facto standard.

KennyG
Oct 22, 2002
Here to blow my own horn.
Not quite a code question but here goes. I have a SQL server 2k5 app that is currently distributed over 5 nodes by application partition (think Asia on n1, Europe n2, etc). This leads to inefficiencies and confusion. Where do you put Russian records? Etc.

We are looking at consolidating onto 2k8r2 or even 2012 :monocle: However, as we would be multiplying various tables by 5 and demand by 5, I'm concerned about performance. Short of buying a 4x8-core Intel box and shoving in 1TB of RAM and some crazy solid-state storage, are there any load distribution options? Oracle has RAC, but it looks as if MS doesn't have an equivalent technology.

How do SQL Server shops handle large-scale databases that outstrip current hardware resources? Any resources or keywords to improve my googling would be appreciated.

Bad Titty Puker
Nov 3, 2007
Soiled Meat

KennyG posted:

Not quite a code question but here goes. I have a SQL server 2k5 app that is currently distributed over 5 nodes by application partition (think Asia on n1, Europe n2, etc). This leads to inefficiencies and confusion. Where do you put Russian records? Etc.

We are looking at consolidating onto 2k8r2 or even 2012 :monocle: However, as we would be multiplying various tables by 5 and demand by 5, I'm concerned about performance. Short of buying a 4x8-core Intel box and shoving in 1TB of RAM and some crazy solid-state storage, are there any load distribution options? Oracle has RAC, but it looks as if MS doesn't have an equivalent technology.

How do SQL Server shops handle large-scale databases that outstrip current hardware resources? Any resources or keywords to improve my googling would be appreciated.

http://www.sql-server-performance.com is probably a good place to ask about this and to do research.

Sab669
Sep 24, 2009

e; Eh, fixed it :downs:

Sab669 fucked around with this message at 21:14 on Nov 26, 2012

glompix
Jan 19, 2004

propane grill-pilled

KennyG posted:

Not quite a code question but here goes. I have a SQL server 2k5 app that is currently distributed over 5 nodes by application partition (think Asia on n1, Europe n2, etc). This leads to inefficiencies and confusion. Where do you put Russian records? Etc.

We are looking at consolidating onto 2k8r2 or even 2012 :monocle: However, as we would be multiplying various tables by 5 and demand by 5, I'm concerned about performance. Short of buying a 4x8-core Intel box and shoving in 1TB of RAM and some crazy solid-state storage, are there any load distribution options? Oracle has RAC, but it looks as if MS doesn't have an equivalent technology.

How do SQL Server shops handle large-scale databases that outstrip current hardware resources? Any resources or keywords to improve my googling would be appreciated.

Some keywords that may or may not fit, but might lead you in a direction: eventual consistency, AlwaysOn Availability Groups (you get perfectly-replicated readable secondaries), peer-to-peer replication, merge replication, any kind of replication really, master data services, etc... I don't even know what that last one is, but I've heard it muttered a couple of times.

I don't know if any of these apply at all, but we've got a pretty twisted database infrastructure here (even though all of our instances are in the same city...). We actually use transactional replication to feed databases on the web server(s), while the web server(s) themselves send all writes (and reads performed soon after a write) back to the "master." This is a terrible pain and we're looking to consolidate into one data center, but what you're asking for can probably be done as long as your business is willing to live with a little inconsistency or complexity.

wolrah
May 8, 2006
what?
I'm working on a pet-project disk-spanning pseudo-filesystem (think Greyhole or Windows Home Server) for which I'm currently using SQLite to keep track of which files are on which actual disks.

The issue I'm running into is that my main goal is to tolerate disk failure with as little impact as possible. The file-handling side of this is pretty easy: just don't show any files that only exist on a disk that's unavailable. My problem is on the database side. I don't want to depend on any data outside of the disks in the pool, so I currently have a "master" disk which contains the live SQLite DB, and the service simply copies it to the other drives when the pseudo-FS is unmounted.

Obviously this has major flaws if it's not unmounted cleanly, particularly if the master disk were to fail while mounted.

It doesn't seem like there's any way to simultaneously write to multiple database files in SQLite, so is there another similar self-contained database which might be better suited to my use (I don't need much beyond basic SELECTs with a bit of JOIN going on)? Is there some third-party SQLite replication system I've been unable to find? Or is the overlap between those who need self-contained databases and those who need replication so small that there's just nothing there?

Mindisgone
May 18, 2011

Yeah, well you know...
That's just like, your opinion man.
I have a stupid question about normalization (still a new concept to me but I shouldn't sound like a total rear end in a top hat right now). I have a table that holds users' information (username, password, state, etc.) and I originally designed this table to include a column for skills (general skills, weaving, calligraphy, programming, etc.). The problem is that multiple users can have the same skill, and one user also has many skills. It is clear I cannot use skills as a column in the users table because a row could have multiple values for that column. At the same time, I'm unsure of how to make skills its own table. If "skills" were its own table, I believe the only columns it would need are a skill name column and a user name column (plus a description column if I decide to add one, but it would at least need the first two I listed). Am I thinking about this correctly for a normalized database?

A MIRACLE
Sep 17, 2007

All right. It's Saturday night; I have no date, a two-liter bottle of Shasta and my all-Rush mix-tape... Let's rock.

Mindisgone posted:

I have a stupid question about normalization (still a new concept to me but I shouldn't sound like a total rear end in a top hat right now). I have a table that holds users' information (username, password, state, etc.) and I originally designed this table to include a column for skills (general skills, weaving, calligraphy, programming, etc.). The problem is that multiple users can have the same skill, and one user also has many skills. It is clear I cannot use skills as a column in the users table because a row could have multiple values for that column. At the same time, I'm unsure of how to make skills its own table. If "skills" were its own table, I believe the only columns it would need are a skill name column and a user name column (plus a description column if I decide to add one, but it would at least need the first two I listed). Am I thinking about this correctly for a normalized database?

Is a `skill` unique to each user, or will users share the same skill entities? If they're shared, you'll need two tables: a table containing information about skills and a join table to match users with skills. If not, then just one table will suffice.

Mindisgone
May 18, 2011

Yeah, well you know...
That's just like, your opinion man.
^ Users will share the same skill entities. So a separate skills table should contain columns for skill name, description, and user name to join the tables? I am a bit lost as to the design of the skills table because I have never had to join tables before, but now that I think about it there is no reason to duplicate user names in a column in the skills table. I suppose my back-end PHP code handles the logistics of which users get which skills, and the only columns the skills table needs are skill name and description?

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
Users table:
user_id
user_name
[other user info]

Skills table:
skill_id
skill_name
[other skill info]

Join table:
user_id
skill_id

This lets you define each user and each skill once, and in the linking/join table you can have as many entries as you want. So if user 1 has skills 3, 7, and 10, there would be three rows in that table for him.

So, to find out what skills someone has, you would query:
SELECT skill_name FROM skills LEFT JOIN user_skills USING (skill_id) WHERE user_id = $i
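A minimal sketch of that layout as DDL (generic SQL; adjust the types and the auto-increment syntax to whatever database you're on):

SQL code:
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    user_name VARCHAR(100) NOT NULL
);

CREATE TABLE skills (
    skill_id   INTEGER PRIMARY KEY,
    skill_name VARCHAR(100) NOT NULL
);

CREATE TABLE user_skills (
    user_id  INTEGER NOT NULL REFERENCES users (user_id),
    skill_id INTEGER NOT NULL REFERENCES skills (skill_id),
    PRIMARY KEY (user_id, skill_id)
);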

Mindisgone
May 18, 2011

Yeah, well you know...
That's just like, your opinion man.

Golbez posted:

Users table:
user_id
user_name
[other user info]

Skills table:
skill_id
skill_name
[other skill info]

Join table:
user_id
skill_id

This lets you define each user and each skill once, and in the linking/join table you can have as many entries as you want. So if user 1 has skills 3, 7, and 10, there would be three rows in that table for him.

So, to find out what skills someone has, you would query:
SELECT skill_name FROM skills LEFT JOIN user_skills USING (skill_id) WHERE user_id = $i

Excellent, thank you very much for the quick lesson. I believe I still have some problems though. Specifically, users should be able to add as many unique skills to their profile as they please. If we do not have a record of the skill, the database should add it, which I can already do. I just still don't see from the above example how it keeps a permanent record of which users assigned which skills to their profile, given that some users have the same skills.

EDIT: I'm a dumbass and believe some if not all join functions save the joins to another permanent table...right?

Mindisgone fucked around with this message at 06:37 on Nov 29, 2012

karms
Jan 22, 2006

by Nyc_Tattoo
Yam Slacker

Mindisgone posted:

Excellent, thank you very much for the quick lesson. I believe I still have some problems though. Specifically, users should be able to add as many unique skills to their profile as they please. If we do not have a record of the skill, the database should add it, which I can already do. I just still don't see from the above example how it keeps a permanent record of which users assigned which skills to their profile, given that some users have the same skills.

The permanent record is the users_skills table. Maybe this will clear things up a bit:

code:
users:
-----
1 alfred
2 brock
3 carl

skills:
-----
1 cooking
2 cleaning
3 driving
4 analrapist

users_skills:
-----
1 1
1 2
2 3
3 1
3 2
3 3
3 4
Brock decided that he now knows how to fly. The system does this:

code:
users:
-----
1 alfred
2 brock
3 carl

skills:
-----
1 cooking
2 cleaning
3 driving
4 analrapist
5 flying

users_skills:
-----
1 1
1 2
2 3
3 1
3 2
3 3
3 4
2 5
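In other words, two inserts happened behind the scenes (a sketch; in practice the new skill_id would come from whatever auto-increment mechanism you use):

SQL code:
INSERT INTO skills (skill_id, skill_name) VALUES (5, 'flying');
INSERT INTO users_skills (user_id, skill_id) VALUES (2, 5);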

Mindisgone
May 18, 2011

Yeah, well you know...
That's just like, your opinion man.

KARMA! posted:

A great example

Ah yes that is very clear now that I can see it, thanks.

Sockser
Jun 28, 2007

This world only remembers the results!




So I'm taking a databases course and the professor just decided to change the requirements on an assignment we turned in three weeks ago.

This here's a trigger I have: (Oracle)
SQL code:
CREATE OR replace TRIGGER sendmessage 
  AFTER INSERT ON MESSAGES 
  FOR EACH ROW 
DECLARE 
    CURSOR c1 IS 
      SELECT userid 
      FROM   GROUPMEMBERSHIP 
      WHERE  GROUPMEMBERSHIP.gid = :new.togroupid; 
BEGIN 
    IF :new.togroupid IS NOT NULL THEN 
      --OPEN c1;  
      FOR myrow IN c1 LOOP 
          INSERT INTO MESSAGERECIPIENT 
                      (msgid, 
                       userid) 
          VALUES      (:new.msgid, 
                       myrow.userid); 
      END LOOP; 
    END IF; 
END sendmessage; 
And I guess we have to rewrite it without using a loop? How uh... how would I go about doing that?

var1ety
Jul 26, 2004

Sockser posted:

So I'm taking a databases course and the professor just decided to change the requirements on an assignment we turned in three weeks ago.

This here's a trigger I have: (Oracle)
SQL code:
CREATE OR replace TRIGGER sendmessage 
  AFTER INSERT ON MESSAGES 
  FOR EACH ROW 
DECLARE 
    CURSOR c1 IS 
      SELECT userid 
      FROM   GROUPMEMBERSHIP 
      WHERE  GROUPMEMBERSHIP.gid = :new.togroupid; 
BEGIN 
    IF :new.togroupid IS NOT NULL THEN 
      --OPEN c1;  
      FOR myrow IN c1 LOOP 
          INSERT INTO MESSAGERECIPIENT 
                      (msgid, 
                       userid) 
          VALUES      (:new.msgid, 
                       myrow.userid); 
      END LOOP; 
    END IF; 
END sendmessage; 
And I guess we have to rewrite it without using a loop? How uh... how would I go about doing that?

You can insert rows in bulk by feeding the INSERT a SELECT statement's results -

code:
insert into foo (bar, baz) select 'bar', 'baz' from all_objects;

Sockser
Jun 28, 2007

This world only remembers the results!




Ah, I guess I should have explained a bit better

so we're making some janky-rear end console-based facebook for our final project. This trigger is for when you send a message to a 'group': it finds all members of that group and sends an individual message to each member.

So User1 sends a message to GroupA, and the table I insert needs to look like

FROM -- TO
User1 -- GroupA:User1
User1 -- GroupA:User2
User1 -- GroupA:User3
etc

so I can do a select and get all members of a group, but then when I insert I need to join it with a single value, that is, User1. Is there a way to do that?

var1ety
Jul 26, 2004

Sockser posted:

Ah, I guess I should have explained a bit better

so we're making some janky-rear end console-based facebook for our final project. This trigger is for when you send a message to a 'group': it finds all members of that group and sends an individual message to each member.

So User1 sends a message to GroupA, and the table I insert needs to look like

FROM -- TO
User1 -- GroupA:User1
User1 -- GroupA:User2
User1 -- GroupA:User3
etc

so I can do a select and get all members of a group, but then when I insert I need to join it with a single value, that is, User1. Is there a way to do that?

You have a SELECT statement that will give you all the users in the group, so just add another column with the message id to the statement (you don't need to use a cursor), and feed that to the INSERT directly.

Sockser
Jun 28, 2007

This world only remembers the results!




right, understood, and I'm sure I sound just dumb as hell right now, but how would I go about joining one cell as a column with a table? What's the syntax for doing such a thing? A join would just give me a column full of nulls, wouldn't it?


vvv badaboom. Cross join, a thing I don't recall ever being mentioned in class.

Sockser fucked around with this message at 03:49 on Nov 30, 2012

Aredna
Mar 17, 2007
Nap Ghost
Outer joins will give you NULL when a value is not found. Inner joins will only return rows with results.

This is one of the better join explanations: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html

Amarkov
Jun 21, 2010

Sockser posted:

vvv badaboom. Cross join, a thing I don't recall ever being mentioned in class.

Yeah, lots of people don't bother to cover them explicitly. There are two reasons for this: they're equivalent to any other kind of join with a trivially true join condition, and your situation is one of the very few where cross joining is a smart thing to do.
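For the record, these all mean the same thing (placeholder tables a and b):

SQL code:
SELECT * FROM a CROSS JOIN b;
SELECT * FROM a JOIN b ON 1 = 1;   -- the "trivially true join condition"
SELECT * FROM a, b;                -- old-style comma join with no WHERE clause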

Sockser
Jun 28, 2007

This world only remembers the results!




Alright, so I loving hate Oracle but what the gently caress is going on here
SQL code:
CREATE OR replace TRIGGER sendmessage 
  AFTER INSERT ON messages 
  FOR EACH ROW 
BEGIN 
    IF :new.togroupid IS NOT NULL THEN 
      -- ??    
      INSERT INTO messagerecipient 
                  (msgid, 
                   userid) 
      SELECT msgid, 
             userid 
      FROM   :new.msgid 
             cross join groupmembership 
      WHERE  ( groupmembership.gid = :new.togroupid ); 
    END IF; 
END sendmessage; 

/ 
code:
Errors for TRIGGER SENDMESSAGE:

LINE/COL ERROR
-------- -----------------------------------------------------------------
4/7	 PL/SQL: SQL Statement ignored
10/14	 PL/SQL: ORA-00903: invalid table name
my tables are named
profile, friends, pendingfriends, messages, messagerecipient, groups, groupmembership

so uh... what the gently caress is going on here, then?

e:

for reference, this works:
SQL code:
CREATE OR replace TRIGGER sendmessage 
  AFTER INSERT ON MESSAGES 
  FOR EACH ROW 
DECLARE 
    CURSOR c1 IS 
      SELECT userid 
      FROM   GROUPMEMBERSHIP 
      WHERE  GROUPMEMBERSHIP.gid = :new.togroupid; 
BEGIN 
    IF :new.togroupid IS NOT NULL THEN 
      FOR myrow IN c1 LOOP 
          INSERT INTO MESSAGERECIPIENT 
                      (msgid, 
                       userid) 
          VALUES      (:new.msgid, 
                       myrow.userid); 
      END LOOP; 
    END IF; 
END sendmessage; 

/ 

Sockser fucked around with this message at 19:33 on Nov 30, 2012

var1ety
Jul 26, 2004

Sockser posted:

Alright, so I loving hate Oracle but what the gently caress is going on here
SQL code:
CREATE OR replace TRIGGER sendmessage 
  AFTER INSERT ON messages 
  FOR EACH ROW 
BEGIN 
    IF :new.togroupid IS NOT NULL THEN 
      -- ??    
      INSERT INTO messagerecipient 
                  (msgid, 
                   userid) 
      SELECT msgid, 
             userid 
      FROM   :new.msgid 
             cross join groupmembership 
      WHERE  ( groupmembership.gid = :new.togroupid ); 
    END IF; 
END sendmessage; 

/ 

code:

Errors for TRIGGER SENDMESSAGE:

LINE/COL ERROR
-------- -----------------------------------------------------------------
4/7	 PL/SQL: SQL Statement ignored
10/14	 PL/SQL: ORA-00903: invalid table name

my tables are named
profile, friends, pendingfriends, messages, messagerecipient, groups, groupmembership

so uh... what the gently caress is going on here, then?

e:

for reference, this works:
SQL code:
CREATE OR replace TRIGGER sendmessage 
  AFTER INSERT ON MESSAGES 
  FOR EACH ROW 
DECLARE 
    CURSOR c1 IS 
      SELECT userid 
      FROM   GROUPMEMBERSHIP 
      WHERE  GROUPMEMBERSHIP.gid = :new.togroupid; 
BEGIN 
    IF :new.togroupid IS NOT NULL THEN 
      FOR myrow IN c1 LOOP 
          INSERT INTO MESSAGERECIPIENT 
                      (msgid, 
                       userid) 
          VALUES      (:new.msgid, 
                       myrow.userid); 
      END LOOP; 
    END IF; 
END sendmessage; 

/ 

You cannot join to a scalar value, nor is there any need to. Just select from your table and add the scalar as a column on the SELECT query.
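Put together, the loop-free version of your trigger would look roughly like this (same table and column names as your working version; untested sketch):

SQL code:
CREATE OR REPLACE TRIGGER sendmessage 
  AFTER INSERT ON messages 
  FOR EACH ROW 
BEGIN 
    IF :new.togroupid IS NOT NULL THEN 
      INSERT INTO messagerecipient 
                  (msgid, 
                   userid) 
      SELECT :new.msgid,   -- the scalar rides along as a constant column 
             gm.userid 
      FROM   groupmembership gm 
      WHERE  gm.gid = :new.togroupid; 
    END IF; 
END sendmessage; 
/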

Sockser
Jun 28, 2007

This world only remembers the results!




Well son of a bitch, I made that a lot harder than it needed to be.

Bad Munki
Nov 4, 2008

We're all mad here.


There is an SQL thread! Wonderful.

Using PostgreSQL 8.4, I have the following table:
code:
CREATE TABLE ems_in_out 
  ( 
     user_id     INTEGER NOT NULL, 
     "timestamp" TIMESTAMP WITH time zone NOT NULL, 
     status      CHARACTER VARYING(3) NOT NULL, 
     app_name    CHARACTER VARYING NOT NULL, 
     ip          INET NOT NULL 
  ); 
We should be able to ignore app_name and ip for the sake of this question. There is a unique constraint on (user_id, timestamp).

What the table actually contains: check-in/check-out data for a door scanner. There is one small caveat: a user's presence is considered expired after 12 hours if they don't actually check out. Status will always be either IN or OUT.

Here's some sample data:
code:
54	2012-11-21 20:50:16-07	IN	Door Scanner	72.92.211.20
54	2012-11-21 20:50:19-07	OUT	Door Scanner	72.92.211.20
122	2012-11-28 12:14:45-07	IN	Door Scanner	72.92.211.20
54	2012-11-28 17:48:17-07	IN	Door Scanner	72.92.211.20
62	2012-11-28 17:52:44-07	IN	Door Scanner	72.92.211.20
54	2012-11-28 17:58:39-07	OUT	Door Scanner	72.92.211.20
62	2012-11-28 17:58:45-07	OUT	Door Scanner	72.92.211.20
54	2012-12-01 11:34:19-07	IN	Door Scanner	72.92.211.20
62	2012-12-01 11:35:04-07	IN	Door Scanner	72.92.211.20
57	2012-12-01 11:50:07-07	IN	Door Scanner	72.92.211.20
54	2012-12-01 13:57:21-07	OUT	Door Scanner	72.92.211.20
57	2012-12-01 16:00:07-07	OUT	Door Scanner	72.92.211.20
62	2012-12-01 16:00:08-07	OUT	Door Scanner	72.92.211.20
54	2012-12-02 10:14:23-07	IN	Door Scanner	198.228.201.155
54	2012-12-02 10:57:29-07	OUT	Door Scanner	198.228.201.150
What I'm trying to do: I want to get an hourly count of how many people are in, over an arbitrary time span (say, a month). I've gotten close, but I just can't seem to make the leap from what I have to what I want. Here's what I've got so far:

(Sorry if this is mis-formatted: the automatic formatter was choking on it so I just tried to reasonably fake it)
code:
SELECT s.timestamp          AS timestamp,
       coalesce(c.count, 0) AS count
FROM (SELECT *
      FROM generate_series((to_timestamp(to_char(now(), 'YYYY-MM-DD HH24'), 'YYYY-MM-DD HH24') - INTERVAL '1 month')::timestamp, now(), '1 hour') AS timestamp) s
LEFT OUTER JOIN
     (SELECT to_timestamp(to_char(timestamp, 'YYYY-MM-DD HH24'), 'YYYY-MM-DD HH24') AS timestamp,
             count(*) AS count
      FROM ems_in_out
      GROUP BY to_char(timestamp, 'YYYY-MM-DD HH24')) c
ON (c.timestamp = s.timestamp)
ORDER BY s.timestamp ASC
So that's maybe sorta close, except it only gives me a count of how many actions occurred during each hour. What I really need is to perform some sort of temporal intersection between each hour-long interval and each IN/OUT span (or IN/+12 hours if they forgot to check out). Anyone have any ideas on how to do this correctly? Part of the problem is that such an intersection of intervals has a number of edge cases.

At this point, I'm honestly considering breaking out the spatial extensions and treating it that way, but I really don't want to go down that road for this.

If it helps, I can provide some extra example sql for performing such queries as "how many members are in right now?"

Bad Munki fucked around with this message at 02:09 on Dec 3, 2012

Aredna
Mar 17, 2007
Nap Ghost
This isn't really the best type of problem to solve with SQL queries, but it's certainly possible and I've had to do something similar in the past.

The way I solve this is to add dummy rows to "fix" all of the corner cases by making them simple cases.

The key to this is going to be the lead and lag window functions. This will allow you to compare the values on this row to the values on the next row and previous row as needed.

I may be missing some, but I think you really have 3 basic cases to consider. I would do the following to fix each, pending your business rules.
  1. User has 2 IN scans in a row within 12 hours
    • If the difference in time stamps between the scans is more than the minimum time difference (1 second?) then add an OUT scan at the minimum time difference prior to the 2nd IN scan, otherwise delete the first IN scan as invalid
  2. User does not have an OUT scan within 12 hours of an IN scan
    • add an OUT scan 12 hours after the IN scan
  3. User has an OUT scan without an IN scan in the previous 12 hours
    • Delete one of the OUT scans as invalid if they are the minimum time difference apart, otherwise add an IN scan 1 second after the first occurring OUT scan

I think fixing them in this order will solve your corner cases, but it's difficult to say without really digging into all of the data. I would normally do this by copying the data for this date range, with an extra day on both sides, to a temp table and doing all of the manipulations there since this is strictly for reporting purposes.

It will take several SQL statements, 5 if I didn't miss any cases above, to clean up the data, but once you've done that it will make any sort of reporting off the data much easier. The key is just to do lots of analysis around the actual corner cases in your data before and after these changes, to be sure the fixes don't make anything look wrong. I like to check the users I'm doing the most manipulations to, to make sure my code is working as intended, as well as users I'm doing no manipulations to, to make sure I'm not missing something in their data.

If you need help with a sample statement for one of the corner cases let me know and I'll whip something up later on today when I have some more time.
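In the meantime, the general shape of a start/stop query using lead() would be something like this: a rough sketch against your ems_in_out table (8.4 has window functions), capping each visit at 12 hours when there's no matching OUT scan:

code:
SELECT user_id,
       "timestamp" AS start_time,
       CASE
         WHEN next_status = 'OUT'
              AND next_ts <= "timestamp" + INTERVAL '12 hours' THEN next_ts
         ELSE "timestamp" + INTERVAL '12 hours'
       END AS stop_time
FROM   (SELECT user_id,
               "timestamp",
               status,
               lead(status)      OVER (PARTITION BY user_id ORDER BY "timestamp") AS next_status,
               lead("timestamp") OVER (PARTITION BY user_id ORDER BY "timestamp") AS next_ts
        FROM   ems_in_out) t
WHERE  status = 'IN';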

Bad Munki
Nov 4, 2008

We're all mad here.


I can surely clean the data up, I think that's pretty straightforward. I would also consider, when a user checks in, pre-emptively inserting an OUT item 12 hours from then, and just updating it if they check out before the pre-recorded time on that entry. Then, at least, all INs would be guaranteed to have a corresponding OUT, even if it hasn't happened yet.

When I was talking about edge cases, it was more in reference to checking for intersections between ranges. To wit:
code:
target range:          [--------]
case A:       [-----]
case B:                           [-----]
case C:             [-----]
case D:                      [-----]
case E:                  [-----]
case F:             [--------------]
Case A: The intervals do not intersect: the test range is entirely before the target range.
Case B: The intervals do not intersect: the test range is entirely after the target range (isomorphic with case A)
Case C: The intervals intersect: the test range ends during the target range
Case D: The intervals intersect: the test range begins during the target range (isomorphic with case C)
Case E: The intervals intersect: the test range is entirely within the target range.
Case F: The intervals intersect: the test range entirely contains the target range (isomorphic with case E)

Because of the mentioned isomorphism between some of the test cases, there are really just three cases: one interval is entirely outside the other, one interval partially intersects the other, and one interval is wholly contained within the other.

So, that's fine; it'd be a little cumbersome, but a few ORs would take care of it... I'm looking at the link you provided, and I think you're right that if I take the time to clean the data up a little, it may make my job a lot easier. My only concern is that this data will be queried pretty frequently (and updated arbitrarily often), so I'd be worried about potential performance issues. It needs to stay fairly snappy.
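(In practice those ORs collapse to a single test: two ranges intersect exactly when each one starts before the other ends, which covers cases C through F in one go. A sketch with placeholder table/column names; PostgreSQL also has a built-in OVERLAPS operator:)

code:
SELECT *
FROM   hours h
       JOIN presence p
         ON p.start_time < h.stop_time
        AND h.start_time < p.stop_time;
-- or: (p.start_time, p.stop_time) OVERLAPS (h.start_time, h.stop_time)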

Another option I'm considering is to sidestep the whole problem of deriving the hourly data and just recording it straight up. That is, have a table with nothing but (rounded) timestamps and counts. Then have an hourly cron run that will examine the currently-present users (I have sql for that already) and add a row with that information. One problem there is that it'll only catch people that are checked in for at least an hour. Another problem is that it's less flexible, although a group and sum would at least let me lower the resolution of the data (to, say, per day.)

What I'd really like is a select statement that will return two columns, start time and stop time. With those, I could pretty easily apply that information to a generated series of time ranges and use the cases I described above to figure out actual accurate counts on the fly with an arbitrary resolution. I'm hoping that once I understand the lead and lag functionality you linked, that it will lead to that possibility.

Oof, I wrote more than intended. Thanks for the help thus far! :D

e: Oh, and I didn't mention, this is a new system and isn't fully live yet, so while I'd like to keep the data collected up to this point, it'd be okay to nuke it as needed. Some of the early data was strictly IN logs, so that may well go away just for good measure. I mean, realistically, I could nuke ALL the data if I wanted, but where's the fun in that?

Bad Munki fucked around with this message at 04:10 on Dec 3, 2012
