Argue
Sep 29, 2005

I represent the Philippines

fletcher posted:

What does the INNER do? What happens if you take it out?

The inner join is a type of join, and as far as I know, the default one used when you omit the INNER keyword. It only returns rows where the join condition finds a match in both tables. You would use other kinds of joins for the cases inner joins don't cover -- for instance, if you had a table of books and one of its foreign keys to the authors table for some reason pointed to a nonexistent row, you could use an outer join to list each book next to either its author's name or NULL if the author doesn't exist. An inner join would drop that book entirely; there's an example in that link I gave you.
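Roughly, with made-up table names (books has an author_id foreign key into authors):
code:
-- INNER JOIN: a book whose author_id matches no row in authors disappears from the results
SELECT b.title, a.name
FROM books b
INNER JOIN authors a ON a.id = b.author_id;

-- LEFT (outer) JOIN: every book is listed; name comes back NULL when the author row is missing
SELECT b.title, a.name
FROM books b
LEFT JOIN authors a ON a.id = b.author_id;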


noonches
Dec 5, 2005

It represents my lifestyle and stratus as a street savy irreverent youth, who lives large, and yet hungers for the next level in life.

fletcher posted:

What does the INNER do? What happens if you take it out?

The INNER JOIN will select all rows from both tables as long as there is a match between the columns you are matching on. If you take it out and just say JOIN, it will default to an INNER JOIN. I may have misread what you are trying to do, but maybe you should try a RIGHT JOIN.

noonches fucked around with this message at 14:51 on Nov 21, 2007

Victor
Jun 18, 2004
Join illustrations.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
My company's been running MySQL 4 and PHP 4, mainly because, well, it worked. Now we're testing a new server, and we think it's time to upgrade both to version 5. However, when I migrated the version 4 databases to the version 5 test server, while some straight lookups work fine, some queries are killing the server. A search that took 1.2 seconds on the old server has to be killed after running for 600 seconds on the new one. I'm guessing it has to do with it having LIKEs or JOINs or some crap that 5 doesn't like, or at least in an order that 5 doesn't like, but this brings up my question:

Is there a list of changes from MySQL 4 to MySQL 5 that a programmer should know about? Most of what I've seen is about the new features, but what about how existing queries behave? Obviously not all of the commands are handled identically to how they were before. I've googled all I can but I have yet to find anything about performance hits with certain queries or whatnot.

hey mom its 420
May 12, 2007

I don't know about the queries not running; it's best to just see what the SQL error messages tell you and act on that. But as for the slow queries, have you created indices everywhere they used to be? Because if, say, your foreign keys aren't indexed, the DBMS will be doing full table scans on joins.
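For instance, something along these lines (table and column names are just placeholders):
code:
-- index the foreign key so joins against it can use an index lookup instead of a full table scan
CREATE INDEX idx_orders_customer_id ON orders (customer_id);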

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
How taxing is something like a SELECT COUNT(id) FROM messages WHERE owner = me if there are hundreds of thousands of entries in the messages table? Is it better to just keep a count in the users table and update that?

fletcher fucked around with this message at 23:19 on Nov 23, 2007

Victor
Jun 18, 2004
If you have an index on owner, it's not very expensive. However, that makes it more expensive to insert/remove rows from your messages table. Whether this actually matters depends on details you have not provided.
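A rough sketch of the two options, reusing the names from the question (message_count is a hypothetical column you'd have to add to users):
code:
-- option 1: index the filter column so the count can be answered from the index
CREATE INDEX idx_messages_owner ON messages (owner);
SELECT COUNT(id) FROM messages WHERE owner = 42;

-- option 2: denormalized counter on users, updated whenever a message is added or removed
UPDATE users SET message_count = message_count + 1 WHERE id = 42;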

Dakha
Feb 18, 2002

Fun Shoe
I'm extracting data from an oracle database using SQL+. One of the tables I need to get has several columns such as '/AB/CDE' and '/AB/CDF' etc.

However, when I attempt to execute:

SELECT KONP./AB/CDE, KONP./AB/CDF FROM KONP
WHERE....

it doesn't recognise the columns. Nor does:

SELECT /AB/CDE, /AB/CDF FROM KONP
WHERE....

Is this an escape character thing? I thought the escape character was backslash not slash. What should I do so that SQL+ can get the columns I need?

m5
Oct 1, 2001

Dakha posted:

I'm extracting data from an oracle database using SQL+. One of the tables I need to get has several columns such as '/AB/CDE' and '/AB/CDF' etc.

However, when I attempt to execute:

SELECT KONP./AB/CDE, KONP./AB/CDF FROM KONP
WHERE....

it doesn't recognise the columns. Nor does:

SELECT /AB/CDE, /AB/CDF FROM KONP
WHERE....

Is this an escape character thing? I thought the escape character was backslash not slash. What should I do so that SQL+ can get the columns I need?

Square brackets maybe? I'm guessing because I'm not sure what you're saying - are those slashes supposed to mean something or are they part of the column names and not otherwise interesting? (If so then wtf)

So maybe
code:
SELECT [/AB/CDE], [/AB/CDF] from KONP WHERE ...

Dakha
Feb 18, 2002

Fun Shoe

m5 posted:

Square brackets maybe? I'm guessing because I'm not sure what you're saying - are those slashes supposed to mean something or are they part of the column names and not otherwise interesting? (If so then wtf)

So maybe
code:
SELECT [/AB/CDE], [/AB/CDF] from KONP WHERE ...

Yeah, they are just really stupidly named columns. The columns are in several tables I'm joining, so do I use:

KONP.[/AB/CDE] or [KONP./AB/CDE] ?

npe
Oct 15, 2004
Use double quotes; just beware that when you do this, Oracle suddenly gets case-sensitive:

code:
SQL> create table dumb (
  2   "ABC/123" number(5),
  3   "xyz/789" number(5)
  4  );

Table created.

SQL> select "ABC/123" from dumb;

no rows selected

SQL> select "XYZ/789" from dumb;
select "XYZ/789" from dumb
       *
ERROR at line 1:
ORA-00904: "XYZ/789": invalid identifier


SQL> select "xyz/789" from dumb;

no rows selected

uh zip zoom
May 28, 2003

Sensitive Thugs Need Hugs

I'm trying to use Microsoft Excel to build a script to insert spreadsheet data (and not the headers) into an MS SQL database, unfortunately I don't know jack about (read: forgot the first time I was told) the syntax involved with creating a string with Excel. Help?

kalleboo
Jan 13, 2001

Hjälp

Dakha posted:

Yeah, they are just really stupidly named columns. The columns are in several tables I'm joining, so do I use:

KONP.[/AB/CDE] or [KONP./AB/CDE] ?
The former (if it was the latter it would mean that table names couldn't contain periods).
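Applied to the question with npe's double-quote form (square brackets are more of a SQL Server habit; Oracle wants double quotes), that would look roughly like:
code:
-- remember npe's warning: the name inside the double quotes is case-sensitive
SELECT KONP."/AB/CDE", KONP."/AB/CDF"
FROM KONP
WHERE ...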

Wizzle
Jun 7, 2004

Most
Parochial
Poster


uh zip zoom posted:

I'm trying to use Microsoft Excel to build a script to insert spreadsheet data (and not the headers) into an MS SQL database, unfortunately I don't know jack about (read: forgot the first time I was told) the syntax involved with creating a string with Excel. Help?

There's an import utility for SQL 2005 that you can have generate the SQL syntax. It includes the ability to skip column headers and has a GUI for things like mapping columns. You could easily make it a SQL Agent job as long as your Excel file is in a predictable place with a predictable name.

Alternatively, you could make a stored procedure that takes the file name/location as a parameter.
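If you'd rather stay inside Excel, a plain concatenation formula dragged down an empty column will also spit out one INSERT per data row. A rough sketch (sheet layout, column letters, and table/column names are all made up, and it doesn't escape quotes inside the data):
code:
-- hypothetical formula in C2, filled down (names in column A, emails in column B):
--   ="INSERT INTO Customers (Name, Email) VALUES ('" & A2 & "', '" & B2 & "');"
-- which generates statements like:
INSERT INTO Customers (Name, Email) VALUES ('Alice', 'alice@example.com');
INSERT INTO Customers (Name, Email) VALUES ('Bob', 'bob@example.com');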

Victor
Jun 18, 2004
I might make a thread about sweet SQL2005 features at some point, but I just wanted to get this out there: using cross apply/outer apply allows you to join a table-valued function to a table, using values from the table as input to the UDF. That's useful for, among other things, normalizing data with a CLR Regex Split function. I learned about this tonight, after getting pissed that there was no way to link a table-valued UDF to a table, since relations can't refer to each other outside of the join condition. SQL2005 to the rescue!

Wizzle
Jun 7, 2004

Most
Parochial
Poster


Victor posted:

I might make a thread about sweet SQL2005 features at some point, but I just wanted to get this out there: using cross apply/outer apply allows you to join a table-valued function to a table, using values from the table as input to the UDF. That's useful for, among other things, normalizing data with a CLR Regex Split function. I learned about this tonight, after getting pissed that there was no way to link a table-valued UDF to a table, since relations can't refer to each other outside of the join condition. SQL2005 to the rescue!

Can you elaborate on this or maybe give an example? I work with SQL 2005 a lot and I have no clue what you're talking about. :downs:

Victor
Jun 18, 2004
Let's say I have deployed a table-valued function called dbo.RegexSplit that returns a table containing a single column, value. Let's say I have a table that is (foo int not null, list_of_bars varchar(100) not null). I want to normalize that to (foo int not null, bar int not null). First, I tried
code:
insert  Normalized
select  foo,
        value
from    Denormalized
inner join dbo.RegexSplit(/* no access to Denormalized */) on /* nothing to put here */
The solution is cross apply:
code:
insert  Normalized
select  foo,
        cast(value as int)
from    Denormalized
cross apply dbo.RegexSplit(list_of_bars, ',')
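If you don't want to go the CLR route, a plain T-SQL table-valued function will plug into the same cross apply. A minimal sketch (dbo.SimpleSplit is hypothetical, a dumb comma-splitter standing in for the CLR regex version):
code:
CREATE FUNCTION dbo.SimpleSplit (@list varchar(100), @sep char(1))
RETURNS @out TABLE (value varchar(100) NOT NULL)
AS
BEGIN
    -- peel off one delimited piece per loop iteration
    DECLARE @pos int
    WHILE LEN(@list) > 0
    BEGIN
        SET @pos = CHARINDEX(@sep, @list)
        IF @pos = 0
        BEGIN
            INSERT @out VALUES (@list)
            SET @list = ''
        END
        ELSE
        BEGIN
            INSERT @out VALUES (LEFT(@list, @pos - 1))
            SET @list = SUBSTRING(@list, @pos + 1, LEN(@list))
        END
    END
    RETURN
END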

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
I figured out my earlier problem with MySQL 5 being slow as hell - my lovely query that I designed three years ago has more parentheses and joins in it than anything should have. But while redesigning it, I had this join question:

Right now, it is (simplified) like this:

code:
SELECT e.EntryID, e.EntryText, d.DepartmentName, a.AgencyName, p.PhaseName
  FROM Entries e
       LEFT JOIN Departments d ON e.DepartmentID = d.DepartmentID
       LEFT JOIN Agencies a ON d.DepartmentID = a.DepartmentID
       LEFT JOIN Phases p ON e.PhaseID = p.PhaseID
In essence, getting information about the entry, plus the name of the department it's in, plus the name of the agency within that department, plus whatever phase it's in (new, old, replaced, etc).

Again, this is vastly simplified, but my point is, would moving to commas give me the exact same results? Also, should the above be reordered, putting the Dept/Agency join at the end, or using parentheses to separate them out from the rest of the joins?

This is what I'd expect the above to look like in comma form:

code:
SELECT e.EntryID, e.EntryText, d.DepartmentName, a.AgencyName, p.PhaseName
FROM Entries e, Departments d, Agencies a, Phase p
WHERE d.DepartmentID = e.DepartmentID
  AND a.DepartmentID = d.DepartmentID
  AND p.PhaseID = e.PhaseID
Or would this mess up the joining I have in place now, which I know works? Like I said, I haven't really touched this query in a long time, but it clearly needs work done on it.

npe
Oct 15, 2004
Commas are syntactically equivalent to inner join syntax, so don't get hung up on it. Have you tried running EXPLAIN on your query to see where the cost is?

Complicated looking queries are not necessarily problematic, you'll want to learn to trust EXPLAIN.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Satan's Scallion posted:

Commas are syntactically equivalent to inner join syntax, so don't get hung up on it. Have you tried running EXPLAIN on your query to see where the cost is?

Complicated looking queries are not necessarily problematic, you'll want to learn to trust EXPLAIN.

Using indexes (one of the joined tables had somehow never been indexed before) cut the execution time from interminable (10+ minutes) down to 12 seconds. Getting rid of some of the joins, however, cut it down to instantaneous, so I was trying to figure out how best to clean up the web of joins. EXPLAIN is helping a lot, but I've done all I can with indexing; now I need to clean up the query itself.

If they're equivalent to an inner join, they should only be used when there's positively going to be a DepartmentID and PhaseID in Entries, right? (I apologize if some of this sounds amateur; this system has let me coast by on rather simple queries for a while, so now it's hitting me with stuff I haven't really had to think about in years)

And I actually just discovered EXPLAIN last week - it's the best thing in my life since sex. :holy:

m5
Oct 1, 2001
With left joins like that, what helps is making sure you get your qualifying (filtering, in other words) criteria into the JOIN expressions. Join in an order such that your most restrictive conditions are evaluated as early as possible.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

m5 posted:

With left joins like that, what helps is making sure you get your qualifying (filtering, in other words) criteria into the JOIN expressions. Join in an order such that your most restrictive conditions are evaluated as early as possible.

I'm not sure what you mean by getting the qualifying criteria into the join expressions... In this case, the joins are just there to fill out the info -- getting the actual name of the department instead of the DepartmentID, and so on. Sometimes the WHERE will only be on Entries, and sometimes it will be on the DepartmentID, but even then it's still searching for the DepartmentID in Entries and only using the one in Departments to pull in the DepartmentName.

m5
Oct 1, 2001

Golbez posted:

I'm not sure what you mean by getting the qualifying criteria into the join expressions... In this case, the joins are just there to fill out the info -- getting the actual name of the department instead of the DepartmentID, and so on. Sometimes the WHERE will only be on Entries, and sometimes it will be on the DepartmentID, but even then it's still searching for the DepartmentID in Entries and only using the one in Departments to pull in the DepartmentName.

I mean that if you're filtering on
code:
d.DepartmentName = 'SALES'
then the query plan isn't (necessarily) the same if you put that in the WHERE clause instead of the JOIN:
code:
SELECT e.EntryID, e.EntryText, d.DepartmentName, a.AgencyName, p.PhaseName
  FROM Departments d
       LEFT JOIN Entries e ON e.DepartmentID = d.DepartmentID
             AND d.DepartmentName = 'SALES'
       LEFT JOIN Agencies a ON d.DepartmentID = a.DepartmentID
       LEFT JOIN Phases p ON e.PhaseID = p.PhaseID
The book SQL Tuning has a bunch of extremely good stuff about query-writing. It's mostly database-agnostic: it's not one of those guides about how to get Oracle or MS SQL Server to do particular things, but instead about how to write SQL code to express query algorithms as efficiently as possible given what you know (and what the server can't know) about your data.

m5 fucked around with this message at 16:16 on Nov 27, 2007

Victor
Jun 18, 2004
Something tells me we should write up a tutorial on joins or create a FAQ entry. I've made the point m5 notes above before -- sometimes you just want a left/right join to fail but don't want the entire row eliminated from the result set.

Powdered Toast Man
Jan 25, 2005

TOAST-A-RIFIC!!!
Glad to see this thread, and I hope I can contribute something useful here and there.

And now, my plea for sympathy:

I support a vertical market software package that is based on multiple Microsoft Access databases accessed via Jet 3.0 and programmed in Visual Basic 6. The amount of bullshit I have to put up with on a daily basis keeping up the slices of swiss cheese that are my clients' databases makes me want to :suicide:

On the plus side, I have learned a great deal about SQL in a short period of time and within the next year we will be migrating to a SQL Server 2005 Express platform. :banjo:

Edit: Golbez, I have a great deal of problems due to Jet 3 being ridiculously inefficient with anything beyond a simple SELECT...especially over a network. Can I assume that using indexed fields in my queries will speed that up? I do notice that JOINs are almost always faster than a nested query...

Powdered Toast Man fucked around with this message at 19:21 on Nov 27, 2007

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Powdered Toast Man posted:

Edit: Golbez, I have a great deal of problems due to Jet 3 being ridiculously inefficient with anything beyond a simple SELECT...especially over a network. Can I assume that using indexed fields in my queries will speed that up? I do notice that JOINs are almost always faster than a nested query...

I would say using indexed fields can help things greatly, though I don't know much outside of MySQL these days.

wwb
Aug 17, 2004

Powdered Toast Man posted:

Edit: Golbez, I have a great deal of problems due to Jet 3 being ridiculously inefficient with anything beyond a simple SELECT...especially over a network. Can I assume that using indexed fields in my queries will speed that up? I do notice that JOINs are almost always faster than a nested query...

Indexes definitely help. The other big thing that helps with JET is to not open and close connections like one should do with SQL Server (or most other server RDBMSs). Seems like there is no connection pool. Depending on how much you are reading/writing, it can speed things up greatly.

I should note my principal use for Access DBs these days is as data stores/logging tools for some command line utilities we use to import things into the CRM system here.

cka
May 3, 2004
How reliable is using phpMyAdmin for dumping a ~600mb database into a mysqldump style output file? Normally I'd just ssh in and use mysqldump itself, but unfortunately this time around I don't have remote access for ssh (nor can I use a local mysqldump connecting to the remote db server, which is how I previously did what I'm trying to accomplish tonight.) Would this pose any risk of corrupting the existing data and/or abruptly stopping mid-process via a php time-out, corrupting the dump? I am, for all intents and purposes, a phpMyAdmin newbie and need to take the utmost care of this data.

ohgodwhat
Aug 6, 2005

I doubt it would work. It won't corrupt anything, but it will probably fail. In my (small) experience with large database backups produced by PHP interfaces, it tends to overrun memory limits very quickly.

cornmonger
Feb 15, 2007

Eat thy corn, fish mongers
Yeah, that'll most likely fail. If you don't have SSH, take a look at your cpanel or what-have-you. Most hosts provide database backups through that as well.

cka
May 3, 2004

Kylratix posted:

Yeah, that'll most likely fail. If you don't have SSH, take a look at your cpanel or what-have-you. Most hosts provide database backups through that as well.

Yeah, I found the link inside cPanel to download the mysqldump output of the database in question, which worked like gangbusters. phpMyAdmin really is horrible; even after I got the database and did what I needed to do with it (namely trimming it down from 600 megs to 20, hooray for old apps that store images in the db), the re-import still didn't work all that well. I had to split the 15 meg .sql text file into 2 or 3 separate chunks to make it work properly. :bang:

Ah well, it's done at any rate, thanks for the help.

Nigger Goku
Dec 11, 2004

あっちに行け
I've got a table set up for allowing people to vote on content items, with the structure like so:
code:
INT id    INT content_id    ENUM(y,n) vote
The vote column is just a 'yes/no' on whether they liked that particular item or not.

The problem is trying to find the most-liked content item from this table. The query I'm currently using is:
code:
SELECT
	`content_id`,COUNT(content_id) AS count
FROM
	`votes`
WHERE
	`vote` = 'yay'
GROUP BY
	`content_id`
ORDER BY
	`count` DESC
LIMIT 6
However, instead of just finding the items with the most 'yay' votes, I'd like to find the ones with the highest ratio of 'yay' votes to 'nay' votes.

Is there a particular way to assign ENUM values to 'fake' columns, so I could do '(yay/nay) AS ratio'? Or is there a better way to achieve the same result?

Any help is appreciated.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Opius posted:

I've got a table set up for allowing people to vote on content items, with the structure like so:
code:
INT id    INT content_id    ENUM(y,n) vote
The vote column is just a 'yes/no' on whether they liked that particular item or not.

The problem is trying to find the most-liked content item from this table. The query I'm currently using is:
code:
SELECT
	`content_id`,COUNT(content_id) AS count
FROM
	`votes`
WHERE
	`vote` = 'yay'
GROUP BY
	`content_id`
ORDER BY
	`count` DESC
LIMIT 6
However, instead of just finding the items with the most 'yay' votes, I'd like to find the ones with the highest ratio of 'yay' votes to 'nay' votes.

Is there a particular way to assign ENUM values to 'fake' columns, so I could do '(yay/nay) AS ratio'? Or is there a better way to achieve the same result?

Any help is appreciated.
Perhaps something like
code:
SELECT
	`content_id`,
        sum(case `vote` when 'yay' then 1 else 0 end) AS count,
        sum(case `vote` when 'yay' then 1 else 0 end)
         /(sum(case `vote` when 'nay' then 1 else 0 end)+1) AS ratio
FROM
	`votes`
GROUP BY
	`content_id`
ORDER BY
	`ratio` DESC
LIMIT 6
Though now that I think about it, I'd probably do 'yay' over total rather than 'yay' over 'nay' as a better way to avoid the divide-by-zero error. So
code:
sum(case `vote` when 'yay' then 1 else 0 end)
 /count(content_id) AS ratio

minato posted:

Edit: Jethro's solution is better, replace my count(vote) with "sum(case when vote="yay" then 1 else 0 end)".
But be sure to keep the cast to float, which I forgot to do.

Jethro fucked around with this message at 02:09 on Dec 3, 2007

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender

Opius posted:

However, instead of just finding the items with the most 'yay' votes, I'd like to find the ones with the highest ratio of 'yay' votes to 'nay' votes.

Is there a particular way to assign ENUM values to 'fake' columns, so I could do '(yay/nay) AS ratio'? Or is there a better way to achieve the same result?
If you were storing "nay" votes as NULL and "yay" votes as any non-null value, then you could simply do:
code:
SELECT content_id, count(vote)::float/count(*)::float AS ratio
FROM votes
GROUP BY content_id
ORDER BY ratio DESC
This works because count() will only count non-null values if you specify a column.

Edit: Jethro's solution is better, replace my count(vote) with "sum(case when vote="yay" then 1 else 0 end)".
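Putting the two together for the vote column as it's actually stored, that works out to roughly this (MySQL's / already does non-integer division, so the float cast matters more on databases that truncate integer division):
code:
SELECT content_id,
       SUM(CASE vote WHEN 'yay' THEN 1 ELSE 0 END) / COUNT(*) AS ratio
FROM votes
GROUP BY content_id
ORDER BY ratio DESC
LIMIT 6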

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Similar to the yay/nay voting, if you are trying to track who has voted on what, where each user is only allowed to vote once on each item being voted on, is it a bad move to just make a one-column table where the primary key is a concat of userId_contentId? Or should you split it up into 3 columns like id, userId, contentId? What yields better performance?

Victor
Jun 18, 2004
Two columns: userId and contentId. The PK is both of those columns.
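i.e. something like this (table name and column types are assumptions):
code:
CREATE TABLE content_votes (
    userId    INT NOT NULL,
    contentId INT NOT NULL,
    PRIMARY KEY (userId, contentId)  -- composite key: one vote per user per content item
)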

Senso
Nov 4, 2005

Always working
I'm using MySQL and I've read a lot about the different possible engines but it's still not clear which one is the best since all of them usually claim to be.

Specifically, I'm interested in the differences and pros/cons between MyISAM and InnoDB. I think InnoDB is better when you have a lot of concurrent queries since it does row-level locking, but MyISAM seems to have a lot of optimizations too.

EDIT: VVVV Yes I did. I was looking for actual experience and opinions from people here, other than posts in other forums or technical whitepapers and benchmarks.

Senso fucked around with this message at 15:28 on Dec 4, 2007

Victor
Jun 18, 2004
Have you googled your question?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
How do I use the mysql EXPLAIN to my advantage?


hey mom its 420
May 12, 2007

You can see how efficient your query is -- how many records need to be traversed to get the results of the query. For instance, you almost never want to do full table scans. It's good to check your queries with EXPLAIN to see whether full table scans are happening in certain queries, so you can introduce indices or rework the queries.
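For example (query and output interpretation are just illustrative, using the table names from earlier in the thread):
code:
EXPLAIN SELECT e.EntryID, d.DepartmentName
FROM Entries e
LEFT JOIN Departments d ON e.DepartmentID = d.DepartmentID
WHERE e.PhaseID = 3;
-- in the output, type = ALL on a big table means a full table scan;
-- type = ref/eq_ref with a key listed means an index is being used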
