Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
geetee
Feb 2, 2004

>;[

sonic bed head posted:

I am trying to import a few thousand row CSV file (from Excel) into mysql and everything works fine except the few lines that have a comma inside them. There are about 150 of these and it's a pain to manually change them. Is there anyway to get mysql's load function to respect "Smith, Jr." as one entity without the quotes? That's how excel exports it and I don't think that I can format it any other way. Thanks!

How is MySQL going to separate good commas from bad commas if the good text isn't wrapped in quotes? I know that's your question, but think about it.

Excel 2007 wraps text in quotes if it has a comma. I'm not sure about older versions. Worst case scenario, just download OpenOffice and export to CSV from there.

I'm assuming you're using this: http://dev.mysql.com/doc/refman/5.0/en/load-data.html Be sure to have OPTIONALLY ENCLOSED BY '"' in there.

geetee fucked around with this message at 05:08 on Apr 10, 2009

Adbot
ADBOT LOVES YOU

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?

sonic bed head posted:

I am trying to import a few thousand row CSV file (from Excel) into mysql and everything works fine except the few lines that have a comma inside them. There are about 150 of these and it's a pain to manually change them. Is there anyway to get mysql's load function to respect "Smith, Jr." as one entity without the quotes? That's how excel exports it and I don't think that I can format it any other way. Thanks!

Either export the source table as a TSV or character-I'm-not-using-SV.

sonic bed head
Dec 18, 2003

this is naturual, baby!
Thanks Triple Tech and geetee!

geetee posted:

How is MySQL going to separate good commas from bad commas if the good text isn't wrapped in quotes? I know that's your question, but think about it.

Excel 2007 wraps text in quotes if it has a comma. I'm not sure about older versions. Worst case scenario, just download OpenOffice and export to CSV from there.

I'm assuming you're using this: http://dev.mysql.com/doc/refman/5.0/en/load-data.html Be sure to have OPTIONALLY ENCLOSED BY '"' in there.

:) I had no idea there was an "OPTIONALLY" option to that part of the query. I had found enclosed by, and was thinking about writing a script to remove all quotes from the text file and add them around every comma to use that. Now everything works great. Thank you.

P.S. I probably didn't explain well, but that's what I was asking. I know that it couldn't tell the difference between an intended comma and a data delimiting comma if there was no quotes, I was asking what to do if some had quotes and some didn't.

DTMTCM
Feb 18, 2005
I misclicked post icon so I have this account now.
I've recently picked up some code that I need to maintain (i.e. stop the horrid, horrid bleeding). The currently application allows people to make work order requests that are scheduled to run through multiple departs (about 5) with each department completing a give task that can have upwards to 40 to 50 settings. The values of these setting tend to be text, numbers, or enumerations (ex. for colors, blue, red, yellow, etc). There needs to be logic in the code so that some settings are dependent on the existence of other settings but I don't think this can really be modeled in the database too well (or it isn't worth doing).

With this vague description in mind, is this a candidate situation for an EAV database model? And if I do decide to go with an EAV model, is it worth defining types for each setting so that the GUI can be dynamically built (text box for text and numbers, checkbox for boolean, radiobutton group for enumeration, etc...)? Are there any good resources that cover implementations of EAV databases?

If this isn't a candidate for an EAV model, am I probably better off defining each of these settings as a column (for booleans) table (for enumerations) and just hardcoding the GUI to handle each field specifically?

Bad Titty Puker
Nov 3, 2007
Soiled Meat

urmomlolzinbed posted:

I've recently picked up some code that I need to maintain (i.e. stop the horrid, horrid bleeding). The currently application allows people to make work order requests that are scheduled to run through multiple departs (about 5) with each department completing a give task that can have upwards to 40 to 50 settings. The values of these setting tend to be text, numbers, or enumerations (ex. for colors, blue, red, yellow, etc). There needs to be logic in the code so that some settings are dependent on the existence of other settings but I don't think this can really be modeled in the database too well (or it isn't worth doing).

With this vague description in mind, is this a candidate situation for an EAV database model? And if I do decide to go with an EAV model, is it worth defining types for each setting so that the GUI can be dynamically built (text box for text and numbers, checkbox for boolean, radiobutton group for enumeration, etc...)? Are there any good resources that cover implementations of EAV databases?

If this isn't a candidate for an EAV model, am I probably better off defining each of these settings as a column (for booleans) table (for enumerations) and just hardcoding the GUI to handle each field specifically?

Well, you said it -- "vague description". Really need better requirements info to get good advice. I don't see a good reason, from what you wrote, to use EAV; but then again, who knows?

Squashy Nipples
Aug 18, 2007

I'm having trouble with the DB2 OLAP functions ("window" functions). This code works just fine:
code:
SELECT
 COALESCE(MPQ.SEC_NAME, MFT.SEC_NAME) AS FUNDNAME,
 COALESCE(BALANCE, 0.0) AS ASSETS,
 SUM(COALESCE(BALANCE, 0.0)) OVER () AS TOTALASSETS,
 COALESCE(PARTSBAL, 0.0) AS ACTIVEPARTICIPANTS

FROM [Complicated FULL OUTER JOIN)
But this code gives me back all goose eggs for the Percent of Total:

code:
SELECT
 COALESCE(MPQ.SEC_NAME, MFT.SEC_NAME) AS FUNDNAME,
 COALESCE(BALANCE, 0.0) AS ASSETS,
 ( COALESCE(BALANCE, 0.0) / SUM(COALESCE(BALANCE, 0.0)) OVER () ) AS PERCENTOFTOTAL,
 COALESCE(PARTSBAL, 0.0) AS ACTIVEPARTICIPANTS

FROM [Complicated FULL OUTER JOIN)
Its trivial to divide the two numbers programatically from the retrived data, but I just learned how to use OVER and its pissing me off that something this simple doesn't work.

Maybe I'm having data type conversion problems? I tried DECIMAL in various places to try and keep the data types similar, but that didn't work.


EDIT: Nevermind, I got it to work. For some reason, DEC() withouth any parameters works fine. All of the 'Percent of Total' examples I googled up used parameters.

Squashy Nipples fucked around with this message at 23:07 on Apr 12, 2009

Tap
Apr 12, 2003

Note to self: Do not shove foreign objects in mouth.
MySQL Question

Is there a way to move rows from one table to another? I know there's a way to copy rows from one table to another by using the INSERT INTO syntax, but it doesn't remove the row.

Code:

code:
INSERT INTO $this->prefix" . "user (username, country_id, password, email) 
SELECT username, country_id, password, email 
FROM $this->table 
WHERE id = $id AND activation_code = '$activation_code'

Begby
Apr 7, 2005

Light saber? Check. Black boots? Check. Codpiece? Check. He's more machine than kid now.

Tap posted:

MySQL Question

Is there a way to move rows from one table to another? I know there's a way to copy rows from one table to another by using the INSERT INTO syntax, but it doesn't remove the row.

Code:

code:
INSERT INTO $this->prefix" . "user (username, country_id, password, email) 
SELECT username, country_id, password, email 
FROM $this->table 
WHERE id = $id AND activation_code = '$activation_code'

You have to issue a DELETE statement after your insert using the same where class. Thats the only way I have ever heard of doing it.

It might be a good idea to put both statements into a transaction if you are trying to avoid a race condition or something.

Tap
Apr 12, 2003

Note to self: Do not shove foreign objects in mouth.
Yeah, that's the way I have it now, I just wasn't sure if there was a way to issue the delete in a single statement. Thanks though.

No Safe Word
Feb 26, 2005

Tap posted:

Yeah, that's the way I have it now, I just wasn't sure if there was a way to issue the delete in a single statement. Thanks though.

Wrap it in a transaction!

That is, if you're looking to guarantee that both happen and you don't get duplicated data.

cannibustacap
Jul 7, 2003

Brrrruuuuuiinnssss
(I decided to redo this post, it was all over the place before)

I have two main issues:
1. In doing a Group By, one of the columns is not using the right corresponding row. That is, in the Group By, it seems almost random which value is being presented (I will show an example.

2. I'm not sure if I am asking too much from one query, but is there a way to further refine this query so that it will return only the highest ranking row* (noted by ranked_rank) and if there are ties, the highest totalcount will break the tie?

code:
SELECT    bbh.best_nid, MAX(DISTINCT(bbh.best_vid)), bbh.ranked_nid, bbh.ranked_rank, nc.totalcount
FROM      bc_bests_history bbh
LEFT JOIN node_counter nc
ON        nc.nid = bbh.best_nid
WHERE     bbh.ranked_nid = 7966
GROUP BY  bbh.best_nid
code:

+----------+-----------------------------+------------+-------------+------------+
| best_nid | MAX(DISTINCT(bbh.best_vid)) | ranked_nid | ranked_rank | totalcount |
+----------+-----------------------------+------------+-------------+------------+
|     8286 |                       19524 |       7966 |           1 |      25083 |
|     8287 |                        8287 |       7966 |           3 |       7245 |
|     8289 |                        8289 |       7966 |           2 |      13706 |
|     8290 |                        8290 |       7966 |           1 |       3827 |
|     8538 |                        8538 |       7966 |           1 |       2181 |
|     8542 |                        8542 |       7966 |           1 |       1561 |
|    10848 |                       10848 |       7966 |           2 |       1421 |
+----------+-----------------------------+------------+-------------+------------+
7 rows in set (0.00 sec)

(continue of issue #2)
So it would return only one single row, and that would be the row with rank = 1 and totalcount = 25081.

How can I further refine my SQL statement to make sure it does that across all ranked_nids?

If not, I will probably break this up into a couple more queries in PHP and go from there. It'd be sweet if SQL could do this in one shot

(Back to issue #1)
And also, the "ranked_rank" for the first row should be returning 2!

The best_vid is the Version ID. The version ID of best_nid 8286 are: 8286 (the first version) and 19524 (the new version). The rank of the new version is 2. The rank of the older version is 1. But it is reporting the version as #1...

So I just need to make sure all rows correspond to that best_vid. The 19524 best_vid will have a ranked_rank of 2 for that item. I have no idea why it is reporting the older version ID's ranked rank row.

Then after that, I just need to refine the whole thing down to 1 single row with the highest (correct) rank and highest popularity.

cannibustacap fucked around with this message at 22:11 on Apr 21, 2009

Aredna
Mar 17, 2007
Nap Ghost
SQL Server 2000

Which of the following should I expect to run faster? I can't find any appreciable difference in tests and the first clearly shows the intention, but if the later is faster on large data sets then I'd rather go ahead and pick use it now as performance may matter later in this year with higher volume.

code:
datepart(dw,c.deliverydate) = datepart(dw,getdate())
datediff(d,c.deliverydate,getdate()) % 7 = 0

Bad Titty Puker
Nov 3, 2007
Soiled Meat

Aredna posted:

SQL Server 2000

Which of the following should I expect to run faster? I can't find any appreciable difference in tests and the first clearly shows the intention, but if the later is faster on large data sets then I'd rather go ahead and pick use it now as performance may matter later in this year with higher volume.

code:
datepart(dw,c.deliverydate) = datepart(dw,getdate())
datediff(d,c.deliverydate,getdate()) % 7 = 0

Measure, don't expect. But in this case, neither one is sargable. Writing it with functions applied to the data column will prevent the optimizer from using any indexes on c.deliverydate. If that's an issue, consider using a calendar table.

epswing
Nov 4, 2003

Soiled Meat
phpmyadmin is warning me that I shouldn't have both UNIQUE and INDEX on a particular column.

So, if in a table A there are columns x and y, where (A.x, A.y) is unique and A.x is index, there's no need for this separate index?

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
That doesn't make sense.

epswing
Nov 4, 2003

Soiled Meat
My explanation doesn't make sense?

Here's an example.
code:
CREATE TABLE `t` (
  `id` int(11) NOT NULL auto_increment,
  `a` int(11) NOT NULL,
  `b` varchar(255) NOT NULL,
  PRIMARY KEY  (`id`),
  UNIQUE `uniq` (`a`,`b`),
  INDEX `idx` (`a`)
)
I don't want (a,b) duplicates, hence the unique. But I'll also be issuing queries like SELECT * FROM t WHERE t.a = ?. What I'm asking for clarification about is the redundancy of INDEX `idx` (`a`). It's redundant, yes?

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
Hmm, the more I think about it I guess yeah it's redundant. Philosophically, no. But uniqueness constraints, at least in SQL Server, are implemented as indices, and rightly so; because of their uniqueness, they qualify as surrogate keys. Since an A,B key also implements A, then yes technically a lone explicit index on A would be redundant.

Had to think about that for a bit, sorry.

Aredna
Mar 17, 2007
Nap Ghost
edit: Just figured it out, turns out I misread a page an hour ago and when I verified that my tables defaulted to MyISAM storage engine, it really was supposed to that MyISAM does NOT support transactions and I want to change them to all be InnoDB.

I'm having trouble making transactions work, specifically rolling them back.

To try and find the cause of the problem I'm connecting to mysql via the terminal as root.

MySQL Server 5.1.33
Installed with xampp 1.7.1, using xampp's default configuration
running on Ubuntu 8.10

I've been able to recreate the problem with only running:
code:
begin;
insert into forms.frm_usernames (userid, name_last, name_first) values (7777, 'User XYZ', 'TestTest');
rollback;
And the values are still inserted into the database. Just to make sure I wasn't losing it I completely closed the terminal and used phpMyAdmin to verify that it was indeed inserted.

I have tried running "set autocommit = 0;" first and replacing "begin;" with "start transaction;"

DDL for the table is:
code:
CREATE TABLE IF NOT EXISTS `frm_usernames` (
  `userid` varchar(20) NOT NULL DEFAULT '',
  `name_last` varchar(100) DEFAULT NULL,
  `name_first` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`userid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Aredna fucked around with this message at 07:39 on Apr 25, 2009

Vince McMahon
Dec 18, 2003

cannibustacap posted:

(I decided to redo this post, it was all over the place before)

I have two main issues:
1. In doing a Group By, one of the columns is not using the right corresponding row. That is, in the Group By, it seems almost random which value is being presented (I will show an example.

2. I'm not sure if I am asking too much from one query, but is there a way to further refine this query so that it will return only the highest ranking row* (noted by ranked_rank) and if there are ties, the highest totalcount will break the tie?

code:
SELECT    bbh.best_nid, MAX(DISTINCT(bbh.best_vid)), bbh.ranked_nid, bbh.ranked_rank, nc.totalcount
FROM      bc_bests_history bbh
LEFT JOIN node_counter nc
ON        nc.nid = bbh.best_nid
WHERE     bbh.ranked_nid = 7966
GROUP BY  bbh.best_nid
code:

+----------+-----------------------------+------------+-------------+------------+
| best_nid | MAX(DISTINCT(bbh.best_vid)) | ranked_nid | ranked_rank | totalcount |
+----------+-----------------------------+------------+-------------+------------+
|     8286 |                       19524 |       7966 |           1 |      25083 |
|     8287 |                        8287 |       7966 |           3 |       7245 |
|     8289 |                        8289 |       7966 |           2 |      13706 |
|     8290 |                        8290 |       7966 |           1 |       3827 |
|     8538 |                        8538 |       7966 |           1 |       2181 |
|     8542 |                        8542 |       7966 |           1 |       1561 |
|    10848 |                       10848 |       7966 |           2 |       1421 |
+----------+-----------------------------+------------+-------------+------------+
7 rows in set (0.00 sec)

(continue of issue #2)
So it would return only one single row, and that would be the row with rank = 1 and totalcount = 25081.

How can I further refine my SQL statement to make sure it does that across all ranked_nids?

If not, I will probably break this up into a couple more queries in PHP and go from there. It'd be sweet if SQL could do this in one shot

(Back to issue #1)
And also, the "ranked_rank" for the first row should be returning 2!

The best_vid is the Version ID. The version ID of best_nid 8286 are: 8286 (the first version) and 19524 (the new version). The rank of the new version is 2. The rank of the older version is 1. But it is reporting the version as #1...

So I just need to make sure all rows correspond to that best_vid. The 19524 best_vid will have a ranked_rank of 2 for that item. I have no idea why it is reporting the older version ID's ranked rank row.

Then after that, I just need to refine the whole thing down to 1 single row with the highest (correct) rank and highest popularity.

group by just goes with the first value it finds in that group, so chances are it will be wrong.

try this:
code:
SELECT 
bbh.best_nid,
bbh.best_vid,
bbh.ranked_nid,
bbh.ranked_rank,
nc.totalcount

FROM bc_bests_history bbh

LEFT JOIN node_counter nc
ON bbh.best_nid = nc.nid

INNER JOIN (
	SELECT 
	MAX(best_vid) as best_vid
	FROM bc_bests_history
	WHERE ranked_nid = 7966
	GROUP BY best_nid
) AS max_subquery
ON bbh.best_vid = max_subquery.best_vid
I can't tell if you want just one result or one result for each best_vid, but use limits and order by if need be.

cannibustacap
Jul 7, 2003

Brrrruuuuuiinnssss

Vince McMahon posted:

group by just goes with the first value it finds in that group, so chances are it will be wrong.

try this:
code:
SELECT 
bbh.best_nid,
bbh.best_vid,
bbh.ranked_nid,
bbh.ranked_rank,
nc.totalcount

FROM bc_bests_history bbh

LEFT JOIN node_counter nc
ON bbh.best_nid = nc.nid

INNER JOIN (
	SELECT 
	MAX(best_vid) as best_vid
	FROM bc_bests_history
	WHERE ranked_nid = 7966
	GROUP BY best_nid
) AS max_subquery
ON bbh.best_vid = max_subquery.best_vid
I can't tell if you want just one result or one result for each best_vid, but use limits and order by if need be.

Wow thanks a lot!

I will play with that.

For now, I had to resort to combining the SQL with PHP and created a few queries to do this. It takes roughly 30 seconds my way, but it goes through 5,000 rows, at least. Actually more because it does check on all of them.

But 30 seconds can work, it will only be run once a week.

Anyway, thanks for the code. I may just tweak it out of curiosity.

Vince McMahon
Dec 18, 2003

cannibustacap posted:

Wow thanks a lot!

I will play with that.

For now, I had to resort to combining the SQL with PHP and created a few queries to do this. It takes roughly 30 seconds my way, but it goes through 5,000 rows, at least. Actually more because it does check on all of them.

But 30 seconds can work, it will only be run once a week.

Anyway, thanks for the code. I may just tweak it out of curiosity.

No probs. I had a very similar problem and ended up doing something like this. I think with GROUP BY it keeps all of the values of the first record in that group (except when you use MAX etc). This way you just pull out the max value and then pull out the rest of the data from there.

I'm surprised it takes 30 seconds though - what indexes do you have on the columns?

Whilst farting I
Apr 25, 2006

More fun from this old set of tables!

What I have right now is 3 tables: division, employees, and intersection. Division corresponds to academic department, employee is employee, and intersection is a table that contains information from two of the other tables, as well as additional data.

code:
   
DIVNUM     DIVNAME                        DIVLOCATION          DIVPHONE
---------- ------------------------------ -------------------- ------------
         1 Accounting                     Math/Science 342     555-555-5555
         2 Applied Statistics             Business 344         555-555-5555
         3 Biology                        Life Science 324     555-555-5555
		 
EMPNUM     EMPNAME              EMPRANK         EMPYEARSTART
---------- -------------------- --------------- ------------
         1 Boehnstrom, James    Assoc Prof              1992
         2 Charles, Larry       Asst Prof               2005
         3 Ersatz, Manuel       Assoc Prof              1984

IDIVNUM    IEMPNUM    IOFFICE              IPHONE
---------- ---------- -------------------- ------------
         2          1 Business 344         555-555-5555
         1          2 Math/Science 342     555-555-5555
         3          3 Life Science 324     555-555-5555
In the third table, the intersection table, IDIVNUM corresponds to the DIVNUM, the division number, in the first table (number 2 is statistics, in the Business building) and IEMPNUM corresponds to the EMPNUM, the employee number, in the second table (number 1 is Boehnstrom).

I want to delete a DIVISION, and if it has employees associated with it, those employees should be removed from the INTERSECTION table. This

code:
delete from (select * from EMP, INTS, DIVS where EMP.EMPNUM = :numReq and EMP.EMPNUM = INTS.IEMPNUM and INTS.IDIVNUM = DIVS.DIVNUM)
gets rid of any and all employees in the intersection table associated with that division, but there's still a tuple in the division table of that division. Let's say I wanted to delete Division 2, here's what I'd be left with.

code:
DIVNUM     DIVNAME                        DIVLOCATION          DIVPHONE
---------- ------------------------------ -------------------- ------------
         1 Accounting                     Math/Science 342     555-555-5555
         2 Applied Statistics             Business 344         555-555-5555
         3 Biology                        Life Science 324     555-555-5555
		 
EMPNUM     EMPNAME              EMPRANK         EMPYEARSTART
---------- -------------------- --------------- ------------
         1 Boehnstrom, James    Assoc Prof              1992
         2 Charles, Larry       Asst Prof               2005
         3 Ersatz, Manuel       Assoc Prof              1984

IDIVNUM    IEMPNUM    IOFFICE              IPHONE
---------- ---------- -------------------- ------------
         1          2 Math/Science 342     555-555-5555
         3          3 Life Science 324     555-555-5555
The employee that's associated with it is gone from the Intersection, but it remains in the Division table itself. I'm looking to remove it from there, too. How would I do that?

Bad Titty Puker
Nov 3, 2007
Soiled Meat

Whilst farting I posted:

More fun from this old set of tables!

What I have right now is 3 tables: division, employees, and intersection. Division corresponds to academic department, employee is employee, and intersection is a table that contains information from two of the other tables, as well as additional data.

code:
   
DIVNUM     DIVNAME                        DIVLOCATION          DIVPHONE
---------- ------------------------------ -------------------- ------------
         1 Accounting                     Math/Science 342     555-555-5555
         2 Applied Statistics             Business 344         555-555-5555
         3 Biology                        Life Science 324     555-555-5555
		 
EMPNUM     EMPNAME              EMPRANK         EMPYEARSTART
---------- -------------------- --------------- ------------
         1 Boehnstrom, James    Assoc Prof              1992
         2 Charles, Larry       Asst Prof               2005
         3 Ersatz, Manuel       Assoc Prof              1984

IDIVNUM    IEMPNUM    IOFFICE              IPHONE
---------- ---------- -------------------- ------------
         2          1 Business 344         555-555-5555
         1          2 Math/Science 342     555-555-5555
         3          3 Life Science 324     555-555-5555
In the third table, the intersection table, IDIVNUM corresponds to the DIVNUM, the division number, in the first table (number 2 is statistics, in the Business building) and IEMPNUM corresponds to the EMPNUM, the employee number, in the second table (number 1 is Boehnstrom).

I want to delete a DIVISION, and if it has employees associated with it, those employees should be removed from the INTERSECTION table. This

code:
delete from (select * from EMP, INTS, DIVS where EMP.EMPNUM = :numReq and EMP.EMPNUM = INTS.IEMPNUM and INTS.IDIVNUM = DIVS.DIVNUM)
gets rid of any and all employees in the intersection table associated with that division, but there's still a tuple in the division table of that division. Let's say I wanted to delete Division 2, here's what I'd be left with.

code:
DIVNUM     DIVNAME                        DIVLOCATION          DIVPHONE
---------- ------------------------------ -------------------- ------------
         1 Accounting                     Math/Science 342     555-555-5555
         2 Applied Statistics             Business 344         555-555-5555
         3 Biology                        Life Science 324     555-555-5555
		 
EMPNUM     EMPNAME              EMPRANK         EMPYEARSTART
---------- -------------------- --------------- ------------
         1 Boehnstrom, James    Assoc Prof              1992
         2 Charles, Larry       Asst Prof               2005
         3 Ersatz, Manuel       Assoc Prof              1984

IDIVNUM    IEMPNUM    IOFFICE              IPHONE
---------- ---------- -------------------- ------------
         1          2 Math/Science 342     555-555-5555
         3          3 Life Science 324     555-555-5555
The employee that's associated with it is gone from the Intersection, but it remains in the Division table itself. I'm looking to remove it from there, too. How would I do that?

Please post DDL, including keys and indexes. What RDBMS are you using?

Can an employee belong to more than one division? If an employee can never belong to more than one division, get rid of the intersection table and add DIVNUM to EMP, nullable, foreign key referencing DIV.

If an employee can belong to more than one division, do you want to delete employees that are in both div #1 and in another division?

Do you really want to delete the employee data? Might make sense to add a status code column and mark them as inactive.

You could use a temporary table:

code:
CREATE TABLE #employees_to_delete (EMPNUM int NOT NULL PRIMARY KEY);

INSERT INTO #employees_to_delete(EMPNUM)
SELECT DISTINCT EMPNUM FROM DivEmp WHERE DIVNUM = 1;

DELETE FROM DivEmp WHERE DIVNUM = 1;

DELETE FROM Employees 
  WHERE EXISTS 
     (SELECT * FROM #employees_to_delete e WHERE e.EMPNUM = Employees.EMPNUM);

DELETE FROM Divisions WHERE DIVNUM = 1;

DROP TABLE #employees_to_delete;

Whilst farting I
Apr 25, 2006

That would be ideal, but unfortunately, a temporary table is not an option. :(

Here's the script used to create it.

code:
create table DIVISION
(
   DivNum       NUMBER   (02) Primary Key,
   DivName      VARCHAR2 (30),
   DivLocation  VARCHAR2 (20),
   DivPhone     VARCHAR2 (12)
);

create table EMPLOYEES
(
   EmpNum        NUMBER   (03) Primary Key,
   EmpName       VARCHAR2 (20),
   EmpRank       VARCHAR2 (15),
   EmpYearStart  NUMBER   (04)
);

create table INTERSECTION
(
   IDivNum      NUMBER   (02) References DIVISION(DivNum)
                               on delete cascade,
   IEmpNum       NUMBER   (03) References EMPLOYEES(EmpNum)
                               on delete cascade,
   IOffice       VARCHAR2 (20),
   IPhone        VARCHAR2 (12)
);
I'm using Oracle. The thing is, I don't want to delete the Employees from the Employee table, and this doesn't do that - it just deletes them from the Intersection table, because the Intersection table contains their office and phone number at that office (and if they aren't in a division, they're not going to have either of those). They remain in the Employee table, however.

For employees in more than one division, right now, with that code it'll only remove the tuple from the Intersection table directly related to that Division.

Let's say it looks like this

code:
DIVNUM     DIVNAME                        DIVLOCATION          DIVPHONE
---------- ------------------------------ -------------------- ------------
         1 Accounting                     Math/Science 342     555-555-5555
         2 Applied Statistics             Business 344         555-555-5555
         3 Biology                        Life Science 324     555-555-5555
		 
EMPNUM     EMPNAME              EMPRANK         EMPYEARSTART
---------- -------------------- --------------- ------------
         1 Boehnstrom, James    Assoc Prof              1992
         2 Charles, Larry       Asst Prof               2005
         3 Ersatz, Manuel       Assoc Prof              1984

IDIVNUM    IEMPNUM    IOFFICE              IPHONE
---------- ---------- -------------------- ------------
         2          1 Business 344         555-555-5555
         1          2 Math/Science 342     555-555-5555
         3          3 Life Science 324     555-555-5555
         1          3 Business 356         555-555-5555
and I delete Division 3, the outcome looks like this

code:
DIVNUM     DIVNAME                        DIVLOCATION          DIVPHONE
---------- ------------------------------ -------------------- ------------
         1 Accounting                     Math/Science 342     555-555-5555
         2 Applied Statistics             Business 344         555-555-5555
         3 Biology                        Life Science 324     555-555-5555
		 
EMPNUM     EMPNAME              EMPRANK         EMPYEARSTART
---------- -------------------- --------------- ------------
         1 Boehnstrom, James    Assoc Prof              1992
         2 Charles, Larry       Asst Prof               2005
         3 Ersatz, Manuel       Assoc Prof              1984

IDIVNUM    IEMPNUM    IOFFICE              IPHONE
---------- ---------- -------------------- ------------
         2          1 Business 344         555-555-5555
         1          2 Math/Science 342     555-555-5555
         1          3 Business 356         555-555-5555
Employee 3 is still in the Intersection, the Employee table is unaffected but I'm also looking for Division 3 to be removed from the Division table itself.

Bad Titty Puker
Nov 3, 2007
Soiled Meat
This is untested:

code:
/* Delete Employees that are in division 3 
   and _only_ in division 3 */
DELETE FROM 
   Employee
WHERE EXISTS 
   (SELECT * 
     FROM Intersection i
    WHERE i.EMPNUM = Employee.EMPNUM
      AND i.DIVNUM = 3)
AND NOT EXISTS
   (SELECT * 
     FROM Intersection i
    WHERE i.EMPNUM = Employee.EMPNUM
      AND i.DIVNUM <> 3);

/* Deletes are cascaded to "Intersection" */

DELETE FROM Divisions WHERE DIVNUM = 3;

Whilst farting I
Apr 25, 2006

I never even thought to make it two separate statements, I for some reason was trying to get everything done at once. :doh:

I've got it now, thanks :)

LightI3ulb
Oct 28, 2006

Standard pleasure model.
I've got three tables:

code:
+-------------+
| contract_id |
+-------------+
|         199 |
|         200 |
+-------------+



+--------------+-------------+------------------+
| approvals_id | contract_id | approval_type_id |
+--------------+-------------+------------------+
|            7 |         199 |                4 |
|            8 |         199 |               11 |
|            9 |         199 |                1 |
|           10 |         199 |                5 |
|           11 |         199 |                6 |
|           12 |         199 |                7 |
|           13 |         199 |                2 |
|           14 |         199 |               10 |
|           15 |         199 |                9 |
|           18 |         199 |                8 |
|           19 |         199 |                3 |
+--------------+-------------+------------------+



+------------------+----------------+-------------------------+
| approval_type_id | type           | parent_approval_type_id |
+------------------+----------------+-------------------------+
|                1 | Business       |                    NULL |
|                2 | Legal          |                    NULL |
|                3 | Engineering    |                    NULL |
|                4 | Exploratory    |                       1 |
|                5 | Negotiated     |                       2 |
|                6 | Finalized      |                       2 |
|                7 | Signed         |                       2 |
|                8 | Specs received |                       3 |
|                9 | Test approved  |                       3 |
|               10 | Live           |                       3 |
|               11 | Live           |                       1 |
+------------------+----------------+-------------------------+
I'm trying to come up with a query that will show me the approval types that are missing from approvals with relation to a contract. Originally, I had this (specifically for parent_approval_type_id='1'):

code:
select at.type, at.approval_type_id ati, a.approval_date
from dmb_control.approval_types at 
left outer join dmb_control.approvals a 
      on at.approval_type_id=a.approval_type_id 
left outer join 
      (select * from dmb_control.contracts c where contract_id=?) as t1    
      on t1.contract_id=a.contract_id 
where at.parent_approval_type_id='1' and at.approval_type_id != '1'
What's happening is that when I run that query with 200, it gives me the data for 199, making it look like all of the approvals are in for 200. Left joining 3 tables is difficult :smith: Help?

LightI3ulb fucked around with this message at 21:28 on Apr 28, 2009

Popoi
Jul 23, 2000

What you're getting is essentially all of approvals joined to approval_type joined to nothing. If you had other contracts with approvals in your listing, they'd be showing too, because you're not limiting which approvals you're looking at.

This will give you the list of approvals for a given contract:

code:
select at.type, at.approval_type_id ati, a.approval_date
from dmb_control.approval_types at 
left outer join dmb_control.approvals a 
      on at.approval_type_id=a.approval_type_id 
where a.contract_id = ?
This is basically what you had but without the (unnecessary?) join to contract.

If you want the approvals that are missing:

code:
select at.type, at.approval_type_id
from dmb_control.approval_types at 
where at.id not in 
(select approval_type_id from dmb_control.approvals where contract_id = ?)
(This may be the wrong syntax, but it should get the point across)

LightI3ulb
Oct 28, 2006

Standard pleasure model.

Popoi posted:

What you're getting is essentially all of approvals joined to approval_type joined to nothing. If you had other contracts with approvals in your listing, they'd be showing too, because you're not limiting which approvals you're looking at.

This will give you the list of approvals for a given contract:

code:
select at.type, at.approval_type_id ati, a.approval_date
from dmb_control.approval_types at 
left outer join dmb_control.approvals a 
      on at.approval_type_id=a.approval_type_id 
where a.contract_id = ?
This is basically what you had but without the (unnecessary?) join to contract.

If you want the approvals that are missing:

code:
select at.type, at.approval_type_id
from dmb_control.approval_types at 
where at.id not in 
(select approval_type_id from dmb_control.approvals where contract_id = ?)
(This may be the wrong syntax, but it should get the point across)

Wow, drat. I don't know what I was thinking. Heh, I guess sometimes you can fall into overcomplication.

Little Brittle
Nov 1, 2004

Come visit me dawg
How would you design a MySQL table to deal with a Twitter-style follow relationship? I already have a users table, so I imagine a table like this mapping two user IDs together:

Table: followers
code:
  following_user, followed_user
  18              349
  265             48             
  4668            681
  48              265
Is it efficient to insert a new row for every follow, even if it is just reciprocating an existing one (see row 2 and 4)? If so, how would you query a mutual relationship?

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
I wouldn't be so concerned about efficiency... (optimizing prematurely?) And how else would you do it relationally?

code:
select * --untested

from followers a

join followers b
 on a.following_user = b.followed_user
and b.following_user = a.followed_user
At the very least there would be two indexes defined. 1) A,B 2) B,A (to make reverse lookups as fast)

Little Brittle
Nov 1, 2004

Come visit me dawg

Triple Tech posted:

I wouldn't be so concerned about efficiency... (optimizing prematurely?) And how else would you do it relationally?

At the very least there would be two indexes defined. 1) A,B 2) B,A (to make reverse lookups as fast)
That makes sense. I'm not overly concerned about efficiency, I just want to make sure there isn't a better way to do it. If I do go with that format, is there a way to query all a user's followers while having a way to determine whether or not it is mutual? For example, I want to create a page displaying all of my followers and have a "follow" link only on users that I am not following.

My Followers:
Frank (you are following Frank)
Tom (you are following Tom)
Bill (click here to follow Bill)
Mike (you are following Mike)

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?

Little Brittle posted:

My Followers:
Frank (you are following Frank)
Tom (you are following Tom)
Bill (click here to follow Bill)
Mike (you are following Mike)

code:
select -- untested
  fan          = a.following_user,
  is_following = b.following_user is not null

from followers a

left join followers b
 on a.following_user = b.followed_user
and b.following_user = 12345

where a.followed_user = 12345 -- the combination to the airlock and luggage

Little Brittle
Nov 1, 2004

Come visit me dawg

Triple Tech posted:

code:
select -- untested
  fan          = a.following_user,
  is_following = b.following_user is not null

from followers a

left join followers b
 on a.following_user = b.followed_user
and b.following_user = 12345

where a.followed_user = 12345 -- the combination to the airlock and luggage
That did the job, thanks for the help.

MattO
Oct 10, 2003

I have a question if someone could help please.

I have a table in a MSSQL 2008 DB.

One column contains index numbers, the other is a comma delimited set of names, kind of like this:
code:
vid_id           names
-----------------------------
1                sally,amy,bruce,dinah
2                carl,sam,amy,sarah
3                bruce,amanda,george
What I need to do is extract a list of distinct names from the names column, just the names, the index numbers don't matter. My usage of SQL is generally limited to pulling basic data and some simple joins, I haven't had to do anything like this before so I haven't a clue.

Thanks!

SLOSifl
Aug 10, 2002


I'm a bit of a SQL-CLR whore, so I'm sure there's a pure T-SQL method I'm ignoring. If you want to just use T-SQL then you can write a SP that does basically the same thing. Or Google for T-SQL string split and see where that gets you.

You could make a CLR Stored Procedure that runs your query, parses the [names] column and adds the values to a List<string>, returning the name using SqlContext.Pipe.Send (SqlDataRecord) if the list didn't already contain that name.
code:
[Microsoft.SqlServer.Server.SqlProcedure]
public static int GetNames() {
   // Create a list to keep track of names you've encountered
   var nameList = new List<string>();

   // Create a SqlDataRecord
   var metadata = new SqlMetaData("Name", SqlDbType.NVarChar, 40 );
   var outputRecord = new SqlDataRecord ( new SqlMetaData[] { metadata } );

   using ( var conn = new SqlConnection ("context connection=true") ) {
      using ( var cmd = conn.CreateCommand() ) {
         cmd.CommandText = "SELECT [names] from [Table]";

         using ( var reader = cmd.ExecuteReader () ) {
             while ( reader.Read () ) {
                // Skip the row if it's null
                if( reader.IsDbNull ( 0 ) ) continue;

                var nameColumn = reader.GetString ( 0 );
             
                // Split the values into an array
                foreach ( var name in nameColumn.Split ( ',',
                            StringSplitOptions.RemoveEmptyEntries) ) {
                   // Ignore names you've seen already
                   if( nameList.Contains ( name.ToLower() ) ) continue;
                 
                   // Add the name to the list
                   nameList.Add ( name.ToLower() );

                   // Set the record's value and spit it out
                   outputRecord.SetString ( 0, name );
                   SqlContext.Pipe.Sent ( outputRecord );
                }
             }
         }         
      }
   }

   return nameList.Count;
}
Something like that. I don't expect that code to work since I wrote it off the top of my head.

SLOSifl fucked around with this message at 21:20 on Apr 30, 2009

MattO
Oct 10, 2003

Thanks for the help. Looking at the table it might be wrong to try to do this in a single go, as (sorry I didn't mention) there are 40,000 rows and probably 30,000 unique names.

I think I might just export the entire name list with commas into a flat file, then put it into a temporary table and then run a distinct() on it.

var1ety
Jul 26, 2004
You can split apart strings in SQL using a pattern like this

code:
with t AS (select row_number() OVER (order by object_name) AS rn from all_objects),
     f AS (select id, list, length(list) - length(replace(list, ',')) + 1 AS elements from foo_test)
select unique case when t.rn = 1
                   then case when f.elements = 1
                             then f.list
                             else substr(f.list, 1, instr(f.list, ',') - 1)
                         end
                   when t.rn = f.elements
                   then substr(f.list, instr(f.list, ',', 1, f.elements - 1) + 1)
                   else substr(f.list,
                               instr(f.list, ',', 1, t.rn - 1) + 1,
                               instr(f.list, ',', 1, t.rn) - instr(f.list, ',', 1, t.rn - 1) - 1)
                    end AS ss
  from f
 inner join t on (t.rn <= f.elements)
 order by 1
The general idea is that if your source table has at most N elements per record then join against a table with N rows, number them 1..N, and use this number to pluck out the individual records one at a time. In Oracle ALL_OBJECTS is a system table known to have many rows in it, which is ideal for our purposes.

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
And yet neither of you mention how that type of data is set up poorly for relational databases?

code:
-- this is mo betta
vid_id           name
-----------------------------
1                sally
1                amy
1                bruce
1                dinah
2                carl
2                sam
2                amy
2                sarah
If this is someone else's data, you might as well spend the time to scrub it and do it the Right Way (tm) instead of doing it in pure SQL. If you're actually the authors/generators of this data and you're asking how to access it, shame on you! <:mad:>

See also: normalization

Adbot
ADBOT LOVES YOU

MattO
Oct 10, 2003

Triple Tech posted:



If this is someone else's data, you might as well spend the time to scrub it and do it the Right Way (tm) instead of doing it in pure SQL. If you're actually the authors/generators of this data and you're asking how to access it, shame on you! <:mad:>[/b]

Yes this isn't my original data, it was imported into my DB and I wasn't going to scrub it because I just wanted to know if there was a way to get a DISTINCT list of the names for a one time use but I might as well just re-do the table a usable way.

Thanks everyone.

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply