Bad Munki
Nov 4, 2008

We're all mad here.


Shoot, why don't I just make each record a self-contained IN/OUT? That is, instead of a timestamp and a status, drop the status and replace it with a second timestamp. When they check in, set that timestamp as now + 12 hours, and when they check out, update it. If they don't check out, done. Then I have my start/stop time and the rest is easy as pie.

:doh:

I'll have to go back and refigure all the data, but it shouldn't be that hard, and this'll make working with the data a hell of a lot easier in pretty much every regard.
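
e: For reference, the reworked table looks something like this (just a sketch; column names per the queries below, plus a couple of bookkeeping columns):

```sql
-- One row per visit: check-in inserts the row with time_out
-- pre-set 12 hours out; check-out just updates time_out.
CREATE TABLE ems_in_out (
    user_id  integer   NOT NULL,
    time_in  timestamp NOT NULL DEFAULT now(),
    time_out timestamp NOT NULL DEFAULT now() + interval '12 hours',
    app_name text,
    ip       text
);
```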


Aredna
Mar 17, 2007
Nap Ghost
If you're in the design phase definitely consider a redesign. I assumed it was in production already and you were trying to report off of a dirty log.

You may also want to update the previous OUT stamp when a user clocks IN, in certain cases. My reasoning is that in the following scenario it would be more accurate:
8:00 AM - IN
1:00 PM - IN
5:00 PM - OUT

We would not want to assume that the user was in the building until 8PM when they clearly left at 5PM.

What would you want the system to do in this case:
8:00 AM - IN
9:00 PM - OUT

Out of curiosity, how sensitive is the business to accuracy in this data? Is it worth making 2 IN scans in a row invalid and forcing a supervisor to fix their time on the spot?

I don't have time to put it together right now, but the reporting piece is pretty easy. I would generate a dummy table of dates and hours that we want to output and left join to this new, and easier to use table. I'll post more later on how to do that.

Bad Munki
Nov 4, 2008

We're all mad here.


Haha, it's nothing even close to that serious. It's for a new makerspace that is trying to get organized. Someone hacked together a quick check-in system, and I'm giving it a massive overhaul to be more useful/robust. The old one had a list of codes in a text file, wasn't even properly compiled (they were literally running it from within eclipse every time), and would post to twitter, which then got forwarded/spammed to facebook. This new version posts to a DB-driven web service which keeps the website updated with who is around and lets users manage their check-in code through their forum/wiki profile, which will also allow for alternate check-in mechanisms (qr code, nfc, keypad, etc.). Right now, I'm trying to get some useful/interesting data out of the log in order to make some pretty plots.

I shouldn't even need an intermediate table; the left outer join I posted above using the generated series and coalesce should suffice with very little modification.

e: Yeah, this is a lot better. Now getting currently-checked-in members is as easy as looking for anyone with an out time in the future. That is, it went from this:
code:
SELECT phpbb_users.username AS username, 
       phpbb_users.user_id  AS user_id, 
       ems_in_out.timestamp AS timestamp, 
       status 
FROM   (SELECT user_id, 
               Max(timestamp) AS timestamp 
        FROM   ems_in_out 
        GROUP  BY user_id) recent, 
       ems_in_out, 
       phpbb_users 
WHERE  recent.user_id = ems_in_out.user_id 
       AND recent.user_id = phpbb_users.user_id 
       AND phpbb_users.username != 'Guest' 
       AND recent.timestamp = ems_in_out.timestamp 
       AND status = 'IN' 
       AND Now() - ems_in_out.timestamp <= '12 hours' 
ORDER  BY ems_in_out.timestamp 
to this:

code:
SELECT phpbb_users.username AS username, 
       phpbb_users.user_id  AS user_id, 
       ems_in_out.time_in   AS time_in, 
       'IN'                 AS status 
FROM   ems_in_out, 
       phpbb_users 
WHERE  phpbb_users.username != 'Guest' 
       AND phpbb_users.user_id = ems_in_out.user_id 
       AND ems_in_out.time_out > now() 
ORDER  BY ems_in_out.time_in ASC; 
It makes setting someone's status a LITTLE more cumbersome, since I'm providing an "AUTO" option for status in the web service which just toggles the user from their current state, but that's solved by checking whether their newest time_out is in the future, and updating or inserting as appropriate. So, one last question: instead of running two queries from the php script (the first controlling an if(), the second being the actual action), how do I roll this into one statement?

code:
--check if currently in 
SELECT 1 
FROM   ems_in_out 
WHERE  ems_in_out.user_id = {$userID} 
       AND ems_in_out.time_out > now(); 

--checking in 
INSERT INTO ems_in_out 
            (user_id, 
             time_in, 
             time_out, 
             app_name, 
             ip) 
VALUES      ({$userID}, 
             Now(), 
             Now() + interval '12 hours', 
             {$appName}, 
             {$ip}); 

--checking out 
UPDATE ems_in_out 
SET    ems_in_out.time_out = now() 
WHERE  ems_in_out.user_id = {$userID} 
       AND ems_in_out.time_out > now(); 
Sorry about the interspersed php variables on that last one. Oh, and note that I haven't even tried running any of this new sql yet, so there may be mistakes. ;)
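
e2: One way to roll those into a single statement might be a data-modifying CTE (Postgres 9.1+). Totally untested, and the same caveat applies about the interpolated php variables:

```sql
-- If an open visit exists, close it; otherwise open a new one.
WITH checked_out AS (
    UPDATE ems_in_out
    SET    time_out = now()
    WHERE  user_id = {$userID}
           AND time_out > now()
    RETURNING user_id
)
INSERT INTO ems_in_out (user_id, time_in, time_out, app_name, ip)
SELECT {$userID}, now(), now() + interval '12 hours', {$appName}, {$ip}
WHERE  NOT EXISTS (SELECT 1 FROM checked_out);
```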

Bad Munki fucked around with this message at 05:48 on Dec 3, 2012

Pardot
Jul 25, 2001




Here is my attempt that doesn't account for several of the corner cases discussed, but is pretty straightforward: https://dataclips.heroku.com/qnldojpgggerodgmezjckiecidvp

The result of just the with clause is in version 1

Even with the omissions of some corner cases, this might be close enough.

Bad Munki
Nov 4, 2008

We're all mad here.


Yeah, if I ignore ensuring continuous data, getting a head-count log is pretty easy now with the new table design:

code:
SELECT count(DISTINCT( user_id )) as usage, max(s.start_time) as time_in
  FROM   ems_in_out_test, 
         ( SELECT start_time, 
                 start_time + interval '1 hour' as end_time 
           FROM generate_series((to_timestamp(to_char(now(), 'YYYY-MM-DD HH24'), 'YYYY-MM-DD HH24') - interval '1 month')::timestamp, now(), '1 hour') start_time) s
  WHERE  time_in BETWEEN s.start_time AND s.end_time
          OR time_out BETWEEN s.start_time AND s.end_time
          OR s.start_time BETWEEN time_in AND time_out
          OR s.end_time BETWEEN time_in AND time_out
  GROUP BY s.start_time
That handles all the possible intersections of the signed-in time periods and the report periods, and also (I think correctly) handles a user checking in/out multiple times in an hour, by only counting them once for that hour.

However, for this, I need the data to be continuous, i.e. covering every period in the report, even if nobody was there and thus no log entry exists in that time. So I came up with this:

code:
SELECT a.start_time AS time_period, coalesce(b.usage, 0) AS usage
FROM 
( SELECT start_time, 
         start_time + interval '1 hour' AS end_time 
  FROM generate_series((to_timestamp(to_char(now(), 'YYYY-MM-DD HH24'), 'YYYY-MM-DD HH24') - INTERVAL '1 month')::timestamp, now(), '1 hour') start_time) a
LEFT OUTER JOIN
( SELECT count(DISTINCT( user_id )) AS usage, max(s.start_time) AS time_in
  FROM   ems_in_out_test, 
         ( SELECT start_time, 
                  start_time + interval '1 hour' AS end_time 
           FROM generate_series((to_timestamp(to_char(now(), 'YYYY-MM-DD HH24'), 'YYYY-MM-DD HH24') - INTERVAL '1 month')::timestamp, now(), '1 hour') start_time) s
  WHERE  time_in BETWEEN s.start_time AND s.end_time
          OR time_out BETWEEN s.start_time AND s.end_time
          OR s.start_time BETWEEN time_in AND time_out
          OR s.end_time BETWEEN time_in AND time_out
  GROUP BY s.start_time) b
ON (a.start_time = b.time_in)
ORDER BY a.start_time ASC
(Again, apologies for formatting errors: the sql formatter chokes on the '1 month' interval notation.)

That sql isn't quite right, but I'm pretty sure it's very close. That is, I think the data is technically accurate, although it seems to be sorted strangely (fixed). Oh, and it's slow. Takes a full second to run on less than 100 rows. :downs:

Bad Munki fucked around with this message at 07:08 on Dec 3, 2012

Aredna
Mar 17, 2007
Nap Ghost
It's been a while since I've used PostgreSQL, but I think this works using the generate_series magic that I don't get to play with in MSSQL 2008 :(.

If you start with the series as your base table you shouldn't need to do anything special to generate the 0 rows for hours with no users.

I based the conditions off of your cases at the top of the screen and as best I can tell it counts only the 4 cases we would count as valid for that hour.
code:
select
   b.start_time,
   count(distinct e.user_id) as users
from generate_series(now()::date::timestamp - '1 month'::interval, now(), '1 hour'::interval) as b(start_time)
   left join ems_in_out_test e
      on (e.time_in <= b.start_time and e.time_out >= b.start_time)                      -- cases C and F
      or (e.time_in >= b.start_time and e.time_in  <= b.start_time + '1 hour'::interval) -- cases D and E     
group by b.start_time
order by b.start_time

Aredna fucked around with this message at 08:11 on Dec 3, 2012

technic
May 7, 2007
Edit: Nevermind...

technic fucked around with this message at 16:04 on Dec 3, 2012

Bad Munki
Nov 4, 2008

We're all mad here.


Aredna posted:

It's been a while since I've used PostgreSQL, but I think this works using the generate_series magic that I don't get to play with in MSSQL 2008 :(.

If you start with the series as your base table you shouldn't need to do anything special to generate the 0 rows for hours with no users.

I based the conditions off of your cases at the top of the screen and as best I can tell it counts only the 4 cases we would count as valid for that hour.

Yeah, that's basically what I had last time, except I had some late-night-induced frenzy going on with an extra level of nesting and an unnecessary generate_series call that was hosing it all up. I used a left outer join instead of a left join; I honestly don't really know the difference (sql is not my strong point), but both options work, although of course the simpler version works better. ;) Made a couple of small adjustments, but they're mostly cosmetic.

code:
select 
   b.start_time, 
   count(distinct e.user_id) as users 
from generate_series(now() - interval '1 month', now(), '1 hour') as b(start_time) 
   left join ems_in_out_test e 
      on (e.time_in <= b.start_time and e.time_out >= b.start_time) 
      or (e.time_in >= b.start_time and e.time_in  <= b.start_time + '1 hour'::interval) 
group by b.start_time 
order by b.start_time 
Anyhow, this seems to work just fine. Still takes over half a second, but that's just how it's gotta be, short of dumping the results into a table for easy further querying.

e: Oh, and you took a different approach to the interval intersection than I did. In the end, I reduced it a little further to 4 cases in 2 isomorphisms, but I'm not sure if there are any real performance differences there.

Bad Munki fucked around with this message at 17:33 on Dec 3, 2012

Aredna
Mar 17, 2007
Nap Ghost
How many records do you have in the scan table? If it's still half a second it's going to scale pretty poorly. Obviously the first thing would be to look at some indexes; maybe try one on (time_in, time_out, user_id) to see if that helps?

LEFT JOIN is just shorthand for LEFT OUTER JOIN - no difference.
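
Something along these lines (untested, and the column list is just a guess at what the planner would want):

```sql
-- Candidate index covering the time-range predicates in the report query
CREATE INDEX ems_in_out_test_times_idx
    ON ems_in_out_test (time_in, time_out, user_id);
```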

Bad Munki
Nov 4, 2008

We're all mad here.


Right now, not many (maybe a hundred?), but I'm just playing around at the moment. And yeah, I'm pretty sure the big problem right now is a lack of useful indexes for this operation. I'm not sure how to get an execution plan like I can with oracle, but I'm sure it includes a full table scan somewhere in the tree.

e: Ah ha!

code:
    SELECT b.start_time AS time, 
           count(distinct e.user_id) AS users 
    FROM   (select start_time, start_time + interval '1 hour' as end_time from generate_series(now() - INTERVAL '1 month', now(), '1 hour') as start_time) b
           LEFT JOIN ems_in_out_test e 
             ON (e.time_in BETWEEN b.start_time AND b.end_time
               OR e.time_out BETWEEN b.start_time AND b.end_time
               OR b.start_time BETWEEN e.time_in AND e.time_out
               OR b.end_time BETWEEN e.time_in AND e.time_out)
    group by b.start_time 
    order by b.start_time 
76ms. :)

Difference: using betweens and the four-way OR statement, and changing how the series is generated slightly in that it generates both start and end times.

Bad Munki fucked around with this message at 17:14 on Dec 3, 2012

Suran37
Feb 28, 2009
I actually have a question about PL/SQL in Oracle and I assume this is the correct thread to ask. Anyway, I am trying to print off the names of people who participated in a certain event based on the meet_id that is passed in. I am attempting to do this with a stored procedure, and the SQL seems to work fine, but when I put it into the procedure it just lists everybody no matter what meet_id I provide. Even if I ask for meet 5 (which doesn't exist) it still lists everyone's name, which should be impossible. Anyway, my procedure is:
code:
CREATE OR REPLACE
PROCEDURE VALIDATE_DIVE_LIST 
(
  MEET_ID IN NUMBER
) AS
    divers_name diver_diver.diver_name%TYPE;
    divers_id diver_diver.diver_id%TYPE;
    CURSOR c1 IS
      SELECT DISTINCT dd.diver_id, dd.diver_name
      FROM diver_diver dd JOIN diver_meet_dive dmd
      ON dd.diver_id = dmd.diver_id
      WHERE dmd.meet_id = MEET_ID
      ORDER BY dd.diver_id;
BEGIN
  FOR div_rec IN c1 LOOP
      dbms_output.put_line(div_rec.diver_name);
   END LOOP;
END VALIDATE_DIVE_LIST;
and I run it with EXECUTE VALIDATE_DIVE_LIST(3)

var1ety
Jul 26, 2004

Suran37 posted:

I actually have a question about PL/SQL in Oracle and I assume this is the correct thread to ask. Anyway, I am trying to print off the names of people who participated in a certain event based on the meet_id that is passed in. I am attempting to do this with a stored procedure, and the SQL seems to work fine, but when I put it into the procedure it just lists everybody no matter what meet_id I provide. Even if I ask for meet 5 (which doesn't exist) it still lists everyone's name, which should be impossible. Anyway, my procedure is:
code:

CREATE OR REPLACE
PROCEDURE VALIDATE_DIVE_LIST 
(
  MEET_ID IN NUMBER
) AS
    divers_name diver_diver.diver_name%TYPE;
    divers_id diver_diver.diver_id%TYPE;
    CURSOR c1 IS
      SELECT DISTINCT dd.diver_id, dd.diver_name
      FROM diver_diver dd JOIN diver_meet_dive dmd
      ON dd.diver_id = dmd.diver_id
      WHERE dmd.meet_id = MEET_ID
      ORDER BY dd.diver_id;
BEGIN
  FOR div_rec IN c1 LOOP
      dbms_output.put_line(div_rec.diver_name);
   END LOOP;
END VALIDATE_DIVE_LIST;

and I run it with EXECUTE VALIDATE_DIVE_LIST(3)

Choose a different parameter name (p_meet_id); naming it the same as a column introduces ambiguity, and is probably adding an "always true" to your statement.
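
For example (same procedure, parameter renamed; inside the original, the unqualified MEET_ID in the WHERE clause resolves to the column, so the condition compares the column to itself):

```sql
CREATE OR REPLACE PROCEDURE VALIDATE_DIVE_LIST
(
  p_meet_id IN NUMBER
) AS
    CURSOR c1 IS
      SELECT DISTINCT dd.diver_id, dd.diver_name
      FROM diver_diver dd JOIN diver_meet_dive dmd
      ON dd.diver_id = dmd.diver_id
      WHERE dmd.meet_id = p_meet_id  -- now unambiguously the parameter
      ORDER BY dd.diver_id;
BEGIN
  FOR div_rec IN c1 LOOP
      dbms_output.put_line(div_rec.diver_name);
  END LOOP;
END VALIDATE_DIVE_LIST;
```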

Suran37
Feb 28, 2009

var1ety posted:

Choose a different parameter name (p_meet_id); naming it the same as a column introduces ambiguity, and is probably adding an "always true" to your statement.

Once again I feel like a jackass, thanks!

Bad Titty Puker
Nov 3, 2007
Soiled Meat

Suran37 posted:

I actually have a question about PL/SQL in Oracle and I assume this is the correct thread to ask. Anyway, I am trying to print off the names of people who participated in a certain event based on the meet_id that is passed in. I am attempting to do this with a stored procedure, and the SQL seems to work fine, but when I put it into the procedure it just lists everybody no matter what meet_id I provide. Even if I ask for meet 5 (which doesn't exist) it still lists everyone's name, which should be impossible. Anyway, my procedure is:
code:
CREATE OR REPLACE
PROCEDURE VALIDATE_DIVE_LIST 
(
  MEET_ID IN NUMBER
) AS
    divers_name diver_diver.diver_name%TYPE;
    divers_id diver_diver.diver_id%TYPE;
    CURSOR c1 IS
      SELECT DISTINCT dd.diver_id, dd.diver_name
      FROM diver_diver dd JOIN diver_meet_dive dmd
      ON dd.diver_id = dmd.diver_id
      WHERE dmd.meet_id = MEET_ID
      ORDER BY dd.diver_id;
BEGIN
  FOR div_rec IN c1 LOOP
      dbms_output.put_line(div_rec.diver_name);
   END LOOP;
END VALIDATE_DIVE_LIST;
and I run it with EXECUTE VALIDATE_DIVE_LIST(3)

Why are you using a cursor? It will perform like rear end. I would use a straight SELECT statement and not call a procedure to output each line, which mixes presentation layer stuff with the database stuff.

Bad Munki
Nov 4, 2008

We're all mad here.


Okay, I got another one. In my statement, I'm selecting the following:

code:
count(distinct e.user_id)
However, what I really want is for it to give me a count of distinct user_ids NOT including user_id 122, plus NON-distinct user_id 122.

That is, the following data currently gives me a count of 3 (67, 54, 122)
122
122
54
54
54
67

But I want it to give me a count of 4 (67, 54, 122, 122). Is there a straightforward way to do this? Here's the full statement in question:

code:
    SELECT b.start_time AS time, 
           count(DISTINCT e.user_id) AS users 
    FROM   ( SELECT start_time,
                    start_time + ('1 '|| {$res})::interval as end_time
             FROM   generate_series(date_trunc({$res}, timezone('EST', now()) - INTERVAL {$span}), timezone('EST', now()), ('1 '|| {$res})::interval) as start_time) b 
           LEFT JOIN ems_in_out e 
             ON (e.time_in BETWEEN b.start_time AND b.end_time 
               OR e.time_out BETWEEN b.start_time AND b.end_time 
               OR b.start_time BETWEEN e.time_in AND e.time_out 
               OR b.end_time BETWEEN e.time_in AND e.time_out) 
    GROUP BY b.start_time 
    ORDER BY b.start_time ASC

var1ety
Jul 26, 2004

Bad Munki posted:

Okay, I got another one. In my statement, I'm selecting the following:

code:
count(distinct e.user_id)
However, what I really want is for it to give me a count of distinct user_ids NOT including user_id 122, plus NON-distinct user_id 122.

That is, the following data currently gives me a count of 3 (67, 54, 122)
122
122
54
54
54
67

But I want it to give me a count of 4 (67, 54, 122, 122). Is there a straightforward way to do this? Here's the full statement in question:

I would just aggregate the results, and then sum those using your rules
code:
select sum(case when userid = 122 then ct else 1 end) AS modified_sum
  from (select userid, count(*) AS ct
          from myresults
         group by userid)
Another option would be to count the results of a function, instead of your data directly. The idea would be to return your userid for non-122, and a unique value (rownum, a GUID, etc) for 122.

The first method (count/sum) would be more straightforward and self-documenting.

Aredna
Mar 17, 2007
Nap Ghost

Bad Munki posted:

Okay, I got another one. In my statement, I'm selecting the following:

code:
count(distinct e.user_id)
However, what I really want is for it to give me a count of distinct user_ids NOT including user_id 122, plus NON-distinct user_id 122.

I normally would do that by changing 122 to NULL in the distinct count and then counting 122 separately. You've also got several options as var1ety has shown as well. This is a case where I would add a comment to explain why I'm doing this for anyone else that has to read it, since in a year someone may wonder what exactly is special about 122.
code:
    SELECT b.start_time AS time, 
           count(DISTINCT case when e.user_id = 122 then NULL else e.user_id end) + count(case when e.user_id = 122 then e.user_id else NULL end) AS users 
    FROM   ( SELECT start_time,
                    start_time + ('1 '|| {$res})::interval as end_time
             FROM   generate_series(date_trunc({$res}, timezone('EST', now()) - INTERVAL {$span}), timezone('EST', now()), ('1 '|| {$res})::interval) as start_time) b 
           LEFT JOIN ems_in_out e 
             ON (e.time_in BETWEEN b.start_time AND b.end_time 
               OR e.time_out BETWEEN b.start_time AND b.end_time 
               OR b.start_time BETWEEN e.time_in AND e.time_out 
               OR b.end_time BETWEEN e.time_in AND e.time_out) 
    GROUP BY b.start_time 
    ORDER BY b.start_time ASC

Bad Munki
Nov 4, 2008

We're all mad here.


Oh, actually, I like that option more. I ended up changing it just a little further to give me separate columns for non-122 and 122 (members vs. guests) because it makes for a more "interesting" report.

Thanks, all, you've all been super helpful. :)

subx
Jan 12, 2003

If we hit that bullseye, the rest of the dominoes should fall like a house of cards. Checkmate.
So we completely redesigned an old system, and everything is fine there, but they want the old data available in the new one. Copying the data is easy enough (and I can reformat the outdated parts), but the problem arises when trying to preserve the bridges between different tables. With the PK's obviously changing, I'm not sure how I can go about recreating the Many-to-Many relationships.

Is there anything simple I'm not aware of to do this?

JeansW
Aug 4, 2004
Yes. Like the jeans...

subx posted:

So we completely redesigned an old system, and everything is fine there, but they want the old data available in the new one. Copying the data is easy enough (and I can reformat the outdated parts), but the problem arises when trying to preserve the bridges between different tables. With the PK's obviously changing, I'm not sure how I can go about recreating the Many-to-Many relationships.

Is there anything simple I'm not aware of to do this?

Can you just insert the old data with the same old PKs? Were they incrementing IDs? If so you could just start your new tables somewhere higher than the old records. For instance, if your old table's max ID was 850232 you could just make all your new tables' IDs start at 1000000 and then set IDENTITY_INSERT on and populate the tables with the old data, preserving their original IDs.

That would also have the added bonus of making it easy to identify old data versus new data.

EDIT: Assuming this is in MSSQL. If it's some other DBMS I'm sure there's a similar setting.
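
Something like this in T-SQL (table/column names made up):

```sql
-- Migrate old rows, keeping their original identity values
SET IDENTITY_INSERT dbo.widgets ON;

INSERT INTO dbo.widgets (id, name)
SELECT id, name
FROM   old_db.dbo.widgets;

SET IDENTITY_INSERT dbo.widgets OFF;

-- Reseed so brand-new rows start well above the old max ID
DBCC CHECKIDENT ('dbo.widgets', RESEED, 1000000);
```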

Suran37
Feb 28, 2009
Ok I'm getting really tired of PL/SQL. Anyway, I have to make a package and after googling for well over an hour I came up with: http://pastebin.com/rsmxABKq

However, I just get an error on compile. PLS-00103: Encountered the symbol /
Pointing to line 73.

Bad Munki
Nov 4, 2008

We're all mad here.


How are you trying to execute it? I've had problems with some tools, like DBVisualizer. It works for 90% of the stuff I need to do, but it chokes on some things, specifically those slashes in packages or scripts. Sometimes executing the entire buffer as a single command helps, depending on the sql in question.

subx
Jan 12, 2003

If we hit that bullseye, the rest of the dominoes should fall like a house of cards. Checkmate.

JeansW posted:

Can you just insert the old data with the same old PKs? Were they incrementing IDs? If so you could just start your new tables somewhere higher than the old records. For instance, if your old table's max ID was 850232 you could just make all your new tables' IDs start at 1000000 and then set IDENTITY_INSERT on and populate the tables with the old data, preserving their original IDs.

That would also have the added bonus of making it easy to identify old data versus new data.

EDIT: Assuming this is in MSSQL. If it's some other DBMS I'm sure there's a similar setting.

Well we already have some entries in the new format, but I might take your idea and modify it a bit. Easier to renumber all the tables/bridges with +100000 and then insert the old ones with their old PK's than it is to try renumbering the new inserts. Thanks for the idea!

Suran37
Feb 28, 2009

Bad Munki posted:

How are you trying to execute it? I've had problems with some tools, like DBVisualizer. It works for 90% of the stuff I need to do, but it chokes on some things, specifically those slashes in packages or scripts. Sometimes executing the entire buffer as a single command helps, depending on the sql in question.

Are you talking to me? I'm using Oracle. I clicked on procedures > new, which generated some of the code, I filled in the rest, and when I press the compile button I get the error. If I get rid of the / I get the same error, but with CREATE instead of /.

Bad Munki
Nov 4, 2008

We're all mad here.


Suran37 posted:

Are you talking to me? I'm using oracle. I clicked on procedures > new, which generated some of the code, I filled in the rest, and when I press the compile button I get the Error. If I get rid of the / I get the same error, but CREATE instead of /.

Yeah, I was wondering what client you're using to actually apply this to the DB. You're using their little web interfacey thing, I guess? As opposed to, say, sqlplus or what have you. There are lots of options, but if you're doing it through their own provided mechanism, that's probably not the issue, I guess. v:shobon:v

Goat Bastard
Oct 20, 2004

Suran37 posted:

Ok I'm getting really tired of PL/SQL. Anyway, I have to make a package and after googling for well over an hour I came up with: http://pastebin.com/rsmxABKq

However, I just get an error on compile. PLS-00103: Encountered the symbol /
Pointing to line 73.

That particular error usually means that there was a syntax error earlier, but for whatever reason the compiler could keep interpreting stuff until you tried to close it all off.

I ran the package create part in SQL*Plus (on 10g r2 XE because that's what I have on this machine) and got

code:
Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

SQL> CREATE OR REPLACE
  2  PACKAGE MEET_SCORING_PACKAGE IS

...

 72  END MEET_SCORING_PACKAGE;
 73  /

Warning: Package created with compilation errors.

SQL> sho err
Errors for PACKAGE MEET_SCORING_PACKAGE:

LINE/COL ERROR
-------- -----------------------------------------------------------------
3/3      PL/SQL: Declaration ignored
3/3      PLS-00360: cursor declaration without body needs return type
4/3      PL/SQL: Declaration ignored
4/3      PLS-00360: cursor declaration without body needs return type
5/3      PL/SQL: Declaration ignored
5/3      PLS-00360: cursor declaration without body needs return type
So it looks like it should be fine if you also declare the package body (which I didn't do because setting up all the tables would be a pain in the rear end), so I agree with Bad Munki - try running your script in a different tool. You should have access to SQL*Plus, or else Oracle provides SQL Developer for free.

Edit: actually that error says nothing about a package body.
This link suggests you need to define a return type for your cursors. I'd normally just use an implicit cursor myself, or maybe a view if I need to reuse the query like you seem to, although I don't have any performance stats or snappy words about best practices to back that up.
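
i.e. in the package spec, each cursor declaration needs a RETURN type, something like this (cursor name made up; the actual query goes in the body):

```sql
CREATE OR REPLACE PACKAGE MEET_SCORING_PACKAGE IS
  -- spec-only cursor: declare the row shape here, define the query in the body
  CURSOR c_divers RETURN diver_diver%ROWTYPE;
END MEET_SCORING_PACKAGE;
```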

Goat Bastard fucked around with this message at 07:49 on Dec 7, 2012

Suran37
Feb 28, 2009
Finally figured it out. Had the body in with the spec, when I needed to hit a button and paste the body into a different window.

Suran37 fucked around with this message at 18:25 on Dec 7, 2012

Sockser
Jun 28, 2007

This world only remembers the results!




Alright so... writing some dumb loving query and I'm not sure how to proceed from where I'm at.
(Oracle)

Building this stupid facebook knock-off as our final project for a databases course,
Almost done, I think this is the last actual SQL thing I need to knock out.

Relevant to this are two tables,
FRIENDS (userid1, userid2, jdate, message)
MESSAGES (msgid, fromid, message, touserid, msgdate)

I'll let my professor's assignment speak for what I'm doing here:

quote:

My Statistics
This task displays the top k friends who have sent or received the highest number of messages during the past x months. x and k should be input parameters that can be specified by the users of your system.

Right now I've got all of that except I'm querying all users instead of just friends.

So, my current query:
SQL code:
SELECT ROWNUM, 
       xx.* 
FROM   (SELECT id, 
               Count(*) AS count 
        FROM   (SELECT fromid AS id 
                FROM   messages 
                WHERE  msgdate > '08-nov-12' 
                UNION ALL 
                SELECT touserid AS id 
                FROM   messages 
                WHERE  msgdate > '08-nov-12') 
        GROUP  BY id 
        ORDER  BY count DESC) xx 
WHERE  id IS NOT NULL 
       AND ROWNUM < 3; 
This works fine but it checks against all users. So what I need to do is uh
gently caress this is getting complicated to type out

I need to join that query up there with something along the lines of this query here:
SQL code:
select * from messages where fromid = 1 or touserid=1;
and I have no idea the best way to do it...

e: I might have figured this out, having typed it all out. Rubber duck debugging.

e2: yep, looks like I got it. Now I wish the professor had given us enough loving sample data to actually test our code.

e3: No, I'm stupid and didn't do anything and I still need help here.

Sockser fucked around with this message at 02:18 on Dec 9, 2012

Amarkov
Jun 21, 2010
How many records does your innermost derived table contain for each message?

How many records should your innermost derived table contain for each message?


e:
This is one of the dangers of reasoning about a problem entirely in natural languages. English is not designed to be a database query language, so "find the user who has sent or received the most messages" sounds incredibly similar to the very different problem "find the friend (of this user) who has sent or received the most messages (to this user)".

Amarkov fucked around with this message at 05:14 on Dec 9, 2012

Sockser
Jun 28, 2007

This world only remembers the results!




yeah the latter problem there is really easy to work out and when I thought I had a solution it was actually just that.

The former is uhhh yeah I'm not sure how to do, and I'm not sure I entirely understand your questions.

Amarkov
Jun 21, 2010
The way I'm interpreting your professor's assignment is this:

Given a user, find the k friends of that user who have communicated with that user the most times within the past n months.


If that's not right, please translate for me or something, because the description is confusing.


e: And the former is the problem that your unmodified query solves, so I'm not quite sure what you mean by not knowing how to do it...

Sockser
Jun 28, 2007

This world only remembers the results!




Given a user, find the k friends who have communicated with anyone the most within the past n months.

And yeah, basically the entire course has been that confusing and muddled and awful so I'm stressing a bit.

e:
and the problem is that I have the most communicating users but I need the most communicating friends of the current user and augh

I'm having trouble even explaining what I'm doing.

e: it solves for all users, not one particular user.

Sockser fucked around with this message at 06:06 on Dec 9, 2012

Amarkov
Jun 21, 2010
What why in the world would such a thing ever be useful

Anyway, your basic query structure will work for this. What you're currently doing is selecting all fromids where the touserid is anything, and selecting all touserids where the fromid is anything. What you want to do is select all fromids where the touserid is the given user and the users are friends, and select all touserids where the fromid is the given user and the users are friends.

All you need to do is replace the table "messages" with a derived table. The derived table will be just as long and ugly as you're thinking; as far as I can tell, there's no way to avoid this with the database structure you have.

e: No, wait, that's for the way I want to interpret it. If it's supposed to be all communications, the only condition you want is that the users are friends.
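For the first interpretation (messages exchanged with the given user, restricted to friends, within the time window), a rough sketch of that structure (every table and column name here is a guess at your schema) would be:

code:
SELECT other, COUNT(*) AS cnt
  FROM (SELECT fromid AS other
          FROM messages
         WHERE touserid = :me AND sentdate >= :cutoff
        UNION ALL
        SELECT touserid AS other
          FROM messages
         WHERE fromid = :me AND sentdate >= :cutoff) AS m
 WHERE other IN (SELECT friendid FROM friends WHERE userid = :me)
 GROUP BY other
 ORDER BY cnt DESC
 LIMIT :k;
For the all-communications reading, drop the touserid = :me / fromid = :me conditions and keep only the friends filter.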

Amarkov fucked around with this message at 06:30 on Dec 9, 2012

Bad Munki
Nov 4, 2008

We're all mad here.


Are you sure that's what the question is after, for all users? As opposed to "find the top k friends who have communicated most with the user who's asking, for the past x months."

That is, "Who is talking to me the most?"

glompix
Jan 19, 2004

propane grill-pilled
Holy poo poo guys SQL Server 2012 AlwaysOn is awesome. I just installed an availability group (kinda like a clustered mirror with readable secondary) in production this weekend and it went so smooth. Didn't even have to take the database down to do it. (our old singular production database had been built as a node in the cluster to prepare)

Bugamol
Aug 2, 2006
code:
SELECT part.id, 
       Sum(part_location.qty)   AS 'QTY OH', 
       part.qty_in_demand       AS 'QTY IN DEMAND', 
       part.qty_on_order        AS 'QTY ON ORDER', 
       part.pref_vendor_id      AS 'PREF VENDOR', 
       part.abc_code            AS 'ABC', 
       part.planning_leadtime   AS 'LEADTIME', 
       Sum(inventory_trans.qty) AS '90 DAY USAGE' 
FROM   part, 
       part_location, 
       inventory_trans 
WHERE  part.id = part_location.part_id 
       AND part.id = inventory_trans.part_id 
       AND part_location.part_id = inventory_trans.part_id 
       AND part_location.warehouse_id = inventory_trans.warehouse_id 
       AND part_location.location_id = inventory_trans.location_id 
       AND part_location.qty > 0 
       AND part_location.warehouse_id = 'FWS' 
       AND inventory_trans.transaction_date >= Getdate() - 90 
       AND inventory_trans.type = 'O' 
       AND inventory_trans.class = 'I' 
GROUP  BY part.id, 
          part.qty_in_demand, 
          part.qty_on_order, 
          part.pref_vendor_id, 
          part.abc_code, 
          part.planning_leadtime 
ORDER  BY id 
Background: I'm relatively new to writing my own SQL code. I've been using Microsoft Query alongside Excel to run reports in the past, but I want to start converting my reports to SSRS for ease of use. Normally, when I wrote a query for Excel and couldn't get the joins to work correctly, I would just create two queries and then sum the data in Excel. Moving forward, however, I want to understand how to make these joins work correctly.

Issue: Two of the tables above (inventory_trans and part_location) both have a field named QTY; they are distinct fields holding distinct data that just happen to share the same name. If I sum inventory_trans.qty and group by part_location.qty, I don't have a problem, but if I try to sum both inventory_trans.qty and part_location.qty, the part_location.qty total blows up and gives me huge, incorrect numbers. Any suggestions?

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Bugamol posted:

code:
SELECT part.id, 
       Sum(part_location.qty)   AS 'QTY OH', 
       part.qty_in_demand       AS 'QTY IN DEMAND', 
       part.qty_on_order        AS 'QTY ON ORDER', 
       part.pref_vendor_id      AS 'PREF VENDOR', 
       part.abc_code            AS 'ABC', 
       part.planning_leadtime   AS 'LEADTIME', 
       Sum(inventory_trans.qty) AS '90 DAY USAGE' 
FROM   part, 
       part_location, 
       inventory_trans 
WHERE  part.id = part_location.part_id 
       AND part.id = inventory_trans.part_id 
       AND part_location.part_id = inventory_trans.part_id 
       AND part_location.warehouse_id = inventory_trans.warehouse_id 
       AND part_location.location_id = inventory_trans.location_id 
       AND part_location.qty > 0 
       AND part_location.warehouse_id = 'FWS' 
       AND inventory_trans.transaction_date >= Getdate() - 90 
       AND inventory_trans.type = 'O' 
       AND inventory_trans.class = 'I' 
GROUP  BY part.id, 
          part.qty_in_demand, 
          part.qty_on_order, 
          part.pref_vendor_id, 
          part.abc_code, 
          part.planning_leadtime 
ORDER  BY id 
Background: I'm relatively new to writing my own SQL code. I've been using Microsoft Query alongside Excel to run reports in the past, but I want to start converting my reports to SSRS for ease of use. Normally, when I wrote a query for Excel and couldn't get the joins to work correctly, I would just create two queries and then sum the data in Excel. Moving forward, however, I want to understand how to make these joins work correctly.

Issue: Two of the tables above (inventory_trans and part_location) both have a field named QTY; they are distinct fields holding distinct data that just happen to share the same name. If I sum inventory_trans.qty and group by part_location.qty, I don't have a problem, but if I try to sum both inventory_trans.qty and part_location.qty, the part_location.qty total blows up and gives me huge, incorrect numbers. Any suggestions?
The problem has nothing to do with duplicate field names. The problem is that each unique part_location record is getting repeated for each matching inventory_trans record. You'll need to rewrite the query to aggregate before you join.
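For instance, suppose one part_location row with qty = 10 matches three inventory_trans rows. The join emits that location row once per matching transaction, so its qty gets counted three times:

code:
-- part_location:   part_id=1, qty=10              (one row)
-- inventory_trans: part_id=1, qty IN (2, 3, 4)    (three rows)
--
-- joined rows (one per matching inventory_trans row):
--   part_id=1, part_location.qty=10, inventory_trans.qty=2
--   part_id=1, part_location.qty=10, inventory_trans.qty=3
--   part_id=1, part_location.qty=10, inventory_trans.qty=4
--
-- SUM(inventory_trans.qty) = 9    (correct)
-- SUM(part_location.qty)   = 30   (10 counted three times)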

I think the best solution for you would be something like:

code:
SELECT part.id, 
       Sum(pl.qty)   AS 'QTY OH', 
       part.qty_in_demand       AS 'QTY IN DEMAND', 
       part.qty_on_order        AS 'QTY ON ORDER', 
       part.pref_vendor_id      AS 'PREF VENDOR', 
       part.abc_code            AS 'ABC', 
       part.planning_leadtime   AS 'LEADTIME', 
       Sum(pl.trans_qty) AS '90 DAY USAGE' 
FROM   part
       INNER JOIN 
         (SELECT part_location.part_id,
                 part_location.qty,
                 Sum(inventory_trans.qty) AS trans_qty
            FROM part_location
                 INNER JOIN inventory_trans
           WHERE part_location.part_id = inventory_trans.part_id
                 AND part_location.warehouse_id = inventory_trans.warehouse_id 
                 AND part_location.location_id = inventory_trans.location_id 
                 AND part_location.qty > 0 
                 AND part_location.warehouse_id = 'FWS' 
                 AND inventory_trans.transaction_date >= Getdate() - 90 
                 AND inventory_trans.type = 'O' 
                 AND inventory_trans.class = 'I'
          GROUP BY part_location.part_id,
                   part_location.qty,
                   part_location.warehouse_id,
                   part_location.location_id)
         AS pl ON part.id = pl.part_id
GROUP  BY part.id, 
          part.qty_in_demand, 
          part.qty_on_order, 
          part.pref_vendor_id, 
          part.abc_code, 
          part.planning_leadtime 
ORDER  BY id 
It sums up the transaction data by location, and then it sums up the location data.

Bugamol
Aug 2, 2006

Jethro posted:

The problem has nothing to do with duplicate field names. The problem is that each unique part_location record is getting repeated for each matching inventory_trans record. You'll need to rewrite the query to aggregate before you join.

I think the best solution for you would be something like:

code:
SELECT part.id, 
       Sum(pl.qty)   AS 'QTY OH', 
       part.qty_in_demand       AS 'QTY IN DEMAND', 
       part.qty_on_order        AS 'QTY ON ORDER', 
       part.pref_vendor_id      AS 'PREF VENDOR', 
       part.abc_code            AS 'ABC', 
       part.planning_leadtime   AS 'LEADTIME', 
       Sum(pl.trans_qty) AS '90 DAY USAGE' 
FROM   part
       INNER JOIN 
         (SELECT part_location.part_id,
                 part_location.qty,
                 Sum(inventory_trans.qty) AS trans_qty
            FROM part_location
                 INNER JOIN inventory_trans
           WHERE part_location.part_id = inventory_trans.part_id
                 AND part_location.warehouse_id = inventory_trans.warehouse_id 
                 AND part_location.location_id = inventory_trans.location_id 
                 AND part_location.qty > 0 
                 AND part_location.warehouse_id = 'FWS' 
                 AND inventory_trans.transaction_date >= Getdate() - 90 
                 AND inventory_trans.type = 'O' 
                 AND inventory_trans.class = 'I'
          GROUP BY part_location.part_id,
                   part_location.qty,
                   part_location.warehouse_id,
                   part_location.location_id)
         AS pl ON part.id = pl.part_id
GROUP  BY part.id, 
          part.qty_in_demand, 
          part.qty_on_order, 
          part.pref_vendor_id, 
          part.abc_code, 
          part.planning_leadtime 
ORDER  BY id 
It sums up the transaction data by location, and then it sums up the location data.

This looks good. I'm getting a syntax error on line 16 ("Incorrect syntax near the keyword 'WHERE'."), which I'm trying to troubleshoot, but I think I understand what I need to do now. I appreciate the help and the explanation of what was going on.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
The syntax error is that it has INNER JOIN inventory_trans but never says which column to join on.
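In other words, the join conditions in the subquery presumably need to move from the WHERE clause into an ON clause, something like:

code:
  FROM part_location 
       INNER JOIN inventory_trans 
          ON part_location.part_id = inventory_trans.part_id 
         AND part_location.warehouse_id = inventory_trans.warehouse_id 
         AND part_location.location_id = inventory_trans.location_id 
 WHERE part_location.qty > 0 
   AND part_location.warehouse_id = 'FWS' 
   AND ...
(the remaining single-table filters stay in the WHERE clause)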


Bugamol
Aug 2, 2006

Golbez posted:

The syntax error is that it has INNER JOIN inventory_trans but never says which column to join on.

Just got this figured out. Thanks for pointing it out though. Thanks to you both for your help; looks like everything is working great now.
