fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Puddleboy posted:

So for each user, there are multiple articles?

You want to know how many hits all of their articles together have gotten? It looks like your query will only work if each user only has one article.

I think you want to group by the article owner, summing the total hits across multiple articles for each owner:

code:
SELECT   user.username, SUM(article.hits) as totalHits
FROM user
JOIN article ON article.owner=user.id
GROUP BY article.owner
ORDER BY totalHits DESC
LIMIT 20;

Ah, that is what I needed. So what about the performance issue? It takes ~4s to run this query.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

minato posted:

Make sure you have an index on both article.owner and user.id.

user.id is the primary key for the user table

For the article table I have article.id as the primary.

There are also separate indexes for article.owner, article.pictureId, and article.hits.

When I do an EXPLAIN for the slow query, I see this:

possible_keys: owner
key: null
ref: null
rows: 315615 (which is the size of that table)
Extra: Using where; Using temporary; Using filesort
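
If it would help, this is the kind of composite index I was thinking of trying, so the GROUP BY and the SUM() could both be served from the index (untested; column names taken from the query above):

code:
ALTER TABLE article ADD INDEX idx_owner_hits (owner, hits);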

edit: Side question, but how do I choose the CHARSET and COLLATE for a table?
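
From the manual the syntax looks something like this (table/column names made up), I just don't know how to pick the values:

code:
CREATE TABLE example (
  name VARCHAR(100)
) DEFAULT CHARSET = utf8 COLLATE = utf8_general_ci;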

fletcher fucked around with this message at 07:47 on Mar 21, 2008

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Anybody have any ideas about my indexing/performance issue a few posts up? Or about CHARSET and COLLATE?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
If I'm trying to optimize a query that has joins in it, am I supposed to put a multi-column index on the fields I'm joining on? Or just the fields in the WHERE? What about the field I'm doing an ORDER BY on?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

minato posted:

<wisdom>

So will I end up having columns that are indexed multiple times depending on how I'm looking up the data?

Like:

indexA: (first_name, last_name)

indexB: (birthday, last_name)
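
In other words something like this (made-up table name, just to illustrate):

code:
CREATE INDEX indexA ON users (first_name, last_name);
CREATE INDEX indexB ON users (birthday, last_name);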

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
If you have a users table and an articles table, what do you guys name the column that links the article to the user? 'userId' or something more descriptive like 'authorId'?

edit: I guess I should note this is for MySQL/MyISAM table, if that makes any difference.

fletcher fucked around with this message at 06:23 on Apr 8, 2008

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
I wanna add the ability to search my site and I'm wondering if I should even bother with MySQL's fulltext after hearing so many bad things about it. What are some alternatives I should look at that I can use with PHP?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Why does adding an index on one column speed up one SELECT query a lot while slowing down another SELECT query? Or should that not happen, and something else is causing it?

fletcher fucked around with this message at 07:28 on Apr 15, 2008

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

genericadmin posted:

Many databases allow for index hints of some kind. MySQL has "USE/IGNORE/FORCE INDEX"....I've had lots of instances with MySQL where hints were a must.

I was just about to ask about this. It seems like a strange thing to need to specify. How do I know if it is a situation where I have to specify the index to use?
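
For anyone following along, the hint syntax from the quoted post looks like this (untested; I'm just reusing my earlier article query and its 'owner' index as an example):

code:
SELECT user.username, SUM(article.hits) AS totalHits
FROM article FORCE INDEX (owner)
JOIN user ON article.owner = user.id
GROUP BY article.owner;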

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
I'd like to log the slow queries in separate files for each website on the server. I see that you can specify a user config file in ~/.my.cnf. Is that supposed to be the username of the user you are logging into the DB with? I have no clue how to set this up.
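
From what I can tell the slow query log is a server-wide setting under [mysqld] rather than something per-user in ~/.my.cnf, so separate files per website may not even be possible. The basic setup seems to be something like this (path made up, untested):

code:
[mysqld]
log-slow-queries = /var/log/mysql/slow.log
long_query_time = 2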

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Is there a way to use mysqldump to dump a database and stuff it into a different one in a single command?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

fansipans posted:

Since mysqldump just outputs SQL you can do this on the command line:

code:
 mysqldump foo | mysql bar

Hah! Good stuff, thanks. Is there a way to get it to pause so I can enter the passwords when I do something like:

code:
mysqldump -u root -p foo | mysql -u root -p bar
It gives me:

Enter password: Enter password:
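
One workaround might be putting the credentials in ~/.my.cnf so neither command prompts at all (untested, and obviously not great security-wise; the values below are placeholders):

code:
[client]
user = root
password = yourpasswordhere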

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
What's the most efficient way to query a table for timestamps that are < 15 minutes old in MySQL?

code:
SELECT field FROM table WHERE (TIME_TO_SEC(TIMEDIFF(NOW(), timestamp))/60) < 15
or something like this

code:
SELECT field FROM table WHERE timestamp > UNIX_TIMESTAMP()-15*60
or something else?
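
Or maybe a third option that keeps the comparison on the bare column, so an index on timestamp could actually be used (untested):

code:
SELECT field FROM table WHERE timestamp > NOW() - INTERVAL 15 MINUTE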

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Is there any way to keep mysqldump from killing server performance while it's running?

edit: It seems it is because it locks the tables. Can I do a live replication on the same machine?
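
If the tables were InnoDB instead of MyISAM, it sounds like --single-transaction would avoid the locking entirely (untested on my setup):

code:
mysqldump --single-transaction foo > foo.sql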

fletcher fucked around with this message at 02:02 on Jul 31, 2008

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
In MySQL, when you delete a row from a table it creates 'overhead', which from my understanding means the row still exists but is just marked as deleted. To clear the overhead, I can do an OPTIMIZE TABLE. Is this supposed to just be done on a weekly basis with a cron job or what?
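
Something like a weekly crontab entry is what I had in mind (untested; credentials are placeholders):

code:
# run OPTIMIZE on every table at 4am on Sundays
0 4 * * 0 mysqlcheck --optimize --all-databases --user=root --password=secret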

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
If there are two tables (mysql):

product
-id
name

productvote
-productId
-timestamp

I'm trying to get a list of products ordered by the time it was last voted on.

SELECT product.id, product.name FROM productvote JOIN product ON product.id = productvote.productId ORDER BY productvote.timestamp DESC;

That seemed to work, but it can show duplicate products. Doing a SELECT DISTINCT instead didn't seem to work like I thought it would. Why not?

Or should I do it like this? How do I maintain the order of the productIds inside the IN() in the result of the parent query?

SELECT product.id, product.name FROM product WHERE product.id IN (SELECT DISTINCT productvote.productId FROM productvote ORDER BY productvote.timestamp DESC);

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Thanks for all the responses guys.

Jethro posted:

given that it's mysql, you probably want something like
code:
SELECT product.id,
       product.name 
  FROM (select productId
              ,max(timestamp) as timestamp
          from productvote
         group by productId) as pvote JOIN
       product ON product.id = pvote.productId ORDER BY pvote.timestamp DESC;
Though I'm not sure about the aliasing syntax, and it's possible your version of mysql may not allow subqueries.

This query seems really slow (~2 sec for a pretty small data set)

What indexes would help me with a query like this?

code:
mysql> EXPLAIN SELECT product.id, product.text FROM (SELECT productId, 
MAX(timestamp) AS timestamp FROM productvote GROUP BY productId) AS pvote JOIN 
product ON product.id = pvote.productId ORDER BY pvote.timestamp DESC LIMIT 0,25;
+----+-------------+-------------+------+-----------------+------+---------+-----------------+--------+---------------------------------+
| id | select_type | table       | type | possible_keys   | key  | key_len | ref             | rows   | Extra                           |
+----+-------------+-------------+------+-----------------+------+---------+-----------------+--------+---------------------------------+
|  1 | PRIMARY     | <derived2>  | ALL  | NULL            | NULL | NULL    | NULL            | 152254 | Using filesort                  |
|  1 | PRIMARY     | product     | ref  | PRIMARY,id_2,id | id   | 4       | pvote.productId |      1 | Using where                     |
|  2 | DERIVED     | productvote | ALL  | NULL            | NULL | NULL    | NULL            | 958686 | Using temporary; Using filesort |
+----+-------------+-------------+------+-----------------+------+---------+-----------------+--------+---------------------------------+
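
Would a composite index be the right idea here, so the GROUP BY / MAX() in the derived table can be read straight off the index instead of scanning all ~958k rows? Something like this (untested, index name made up):

code:
ALTER TABLE productvote ADD INDEX idx_productId_timestamp (productId, timestamp);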

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Yep, there is.

edit: ^ yeah that's why I asked, I don't know how to make sense of it either.

fletcher fucked around with this message at 04:44 on Dec 18, 2008

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
How do you decide when it's appropriate to use a trigger vs. doing it from the application code?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
I'm using MySQL and I was wondering if there is an easy way to do the following:

SELECT Product, SUM(B1) as group0, SUM(B2) as group1 FROM table WHERE Product IN ('a', 'b', 'c') GROUP BY Product

Now let's say there's also a Product 'd' which is defined as a family of products: the sum of 'a' and 'b' = 'd'. Is there a way to have MySQL calculate 'd' for me as well? I'm currently just doing it in Java after the above query executes. Note that the product family is not defined in a MySQL table (retarded, yes).

fletcher fucked around with this message at 02:30 on Jan 24, 2009

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

camels posted:

something like this?

code:
SELECT
  Product,
  SUM(B1) AS group0,
  SUM(B2) AS group1
FROM
  table
WHERE
  Product IN ('a', 'b', 'c')
GROUP BY 
  Product

UNION

SELECT
  'd' AS Product,
  SUM(B1) AS group0,
  SUM(B2) AS group1
FROM
  table
WHERE
  Product IN ('a' , 'b')

Yes! How come the portion after UNION won't properly GROUP BY the Product though?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

camels posted:

The result set in effect only has one Product, so the SUM() acts on the whole result set, and doesn't need a GROUP BY statement, if that makes sense.

Oh I was trying to be lazy and reuse most of the SQL statement already built since it's huge. I don't know why I didn't realize that. I need some sleep I think. Thanks!

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
How can you speed up a query such as the following? (MySQL)

code:
SELECT vote.articleId, SUM(vote.value) AS votes
FROM vote
WHERE vote.timestamp > FROM_UNIXTIME(1205996595)
GROUP BY vote.articleId
ORDER BY votes DESC
LIMIT 0 , 30
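
The only thing I've come up with so far is a covering index, so the range scan on timestamp never has to touch the table itself (untested, index name made up):

code:
ALTER TABLE vote ADD INDEX idx_ts_article_value (timestamp, articleId, value);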

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Why does

code:
select column from table where something in ("asdf", "fdsa")
return 0 rows, but

code:
select sum(column) from table where something in ("asdf", "fdsa")
return 1 row of just a NULL? Shouldn't it return 0 rows as well?

This is on MySQL 5.1, running on Windows.
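
For comparison, I'd expect adding a GROUP BY to give 0 rows, since there are no groups left to aggregate (untested):

code:
select sum(column) from table where something in ("asdf", "fdsa") group by something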

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
I want to add the ability for users to specify a timezone so I can display timestamps with the appropriate offset in a MySQL based application. Is using SET time_zone a bad idea? Should I be doing this in my application code, rather than just setting it at the database level?

edit: actually, I'm not sure it's even necessary to display exact timestamps in this app. I think X hours/days/weeks/months/years ago is all people will need. Still curious about where in an application timezone conversion should be done, though.
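
For reference, the two approaches I was weighing look something like this (column/table names and offsets are placeholders; I believe CONVERT_TZ with named zones needs the time zone tables loaded):

code:
-- per connection, so TIMESTAMP columns come back in the user's zone
SET time_zone = '-08:00';

-- or convert a single column explicitly
SELECT CONVERT_TZ(created_at, '+00:00', '-08:00') FROM some_table;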

fletcher fucked around with this message at 23:52 on Sep 22, 2009

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
MySQL question here:

Let's say I've got a table with 52 columns, named COLUMN1, COLUMN2, etc. Each row of this table has data for each column, but for some rows COLUMN1 could correspond to 12/1/2009 and for other rows COLUMN1 could correspond to 10/15/2009.

For each row, the amount of time between COLUMN1 and COLUMN2 could be 1 week or 1 month.

First of all, is this type of thing commonly done? Does it have a name, so I can read up on working with data like this?

I'm trying to do things like viewing weekly data at a monthly or quarterly scale, and I'm running into all kinds of weird situations and need some direction on this.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Markoff Chaney posted:

What Sprawl said. So your rows look like DATE1, DATE2 (which is later than DATE1), DATA_1, DATA_2, DATA_N? Or is this something more sinister and insane?

Kinda. Each row has a date assigned to the first DATA_1 column, and a scale (weekly or monthly) that indicates how far away DATA_2 is from it.

Scaramouche posted:

How many rows are involved? Is each column time stamped, or does column2 - column1 = 1 week | 1 month occur randomly?

100s of millions

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Sprawl posted:

This sounds like it was bad data planning from the beginning. I would refactor the whole thing hth.


So you have Date1, Value1, scale1 as separate columns? And then the same for date2, 3, etc.?

No, one date and one scale for each row.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Jethro posted:

What on earth does "Each row has a date assigned to the first DATA_1 column" mean? When you say assigned, that sounds like it means DATA_1 is where that date is stored. But from the context of your explanation, I think you mean that DATA_1 has actual data, and the "start date" and scale (and presumably other identifying information) are stored in other columns in the row that you didn't mention.

So to clarify:

Each row has a start date, a scale, and then the 52 data columns. These 52 data columns hold data, like an account balance or something like that, for 52 weeks or months, where the DATA_1 column corresponds to the balance on the start date, DATA_2 is the balance for the next period, etc. Is this correct?

That sounds kinda like a pivot table or query.

Ya! That sounds exactly right.
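
If it ever gets refactored like Sprawl suggested, I'm picturing one row per period instead of 52 columns, something like this (names and types made up):

code:
CREATE TABLE balance_period (
  accountId   INT NOT NULL,
  periodStart DATE NOT NULL,
  scale       ENUM('week','month') NOT NULL,
  balance     DECIMAL(12,2),
  PRIMARY KEY (accountId, periodStart)
);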

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

badatshess4ever posted:

I apologize if this has been asked in some form, I read through a few pages but can't go through all 71. I took a database class in college and have some experience with querying, but none actually building databases.

Anyways, I want to build a database that includes lots of information that is easy to access on the web. This includes how the stock market did each day, the weather in a certain city each day, average gas prices, etc. My question is, is there a way to have this information automatically added to the database, and not manually added? Thank you.

Nice troll

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
What is the correct way to do what I am trying to do?

This obviously won't work, since I can only have 1 field in the subquery:
code:
SELECT col_a FROM table WHERE col_b IN (SELECT col_a, col_b FROM temp_table)
But this gives me a Can't reopen table: 'temp_table' error:
code:
SELECT col_a FROM table WHERE (col_b IN (SELECT col_a FROM temp_table) OR col_b IN (SELECT col_b FROM temp_table))
It's addressed in the temporary tables problems page, but there doesn't seem to be a solution.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

camels posted:

Something like this (untested)?

code:
SELECT 
   col_a
FROM
   table
WHERE EXISTS
   (SELECT *
    FROM temp_table
    WHERE temp_table.col_a = table.col_b
    OR temp_table.col_b = table.col_b)

Worked great! Thanks!

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
I have a VARCHAR(1300) column that I want to sort my results by. There wasn't an index on it so it was pretty slow, but when I try to add an index to it I get: "Specified key was too long; max key length is 1000 bytes"

Is there another way to get the query to run faster?
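
The only other thing I've found so far is a prefix index, which fits under the 1000-byte limit, though I'm not sure it can fully cover an ORDER BY on the whole value (table/column names are placeholders):

code:
ALTER TABLE mytable ADD INDEX idx_mycol_prefix (mycolumn(255));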

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Sprawl posted:

use a fulltext index? or convert it to a text field and put a fulltext index on it?

I tried using a fulltext index, but it was still slow, and EXPLAIN said it wasn't using an index. I also tried specifying with USE INDEX() and it was still slow. Same results whether it's a VARCHAR or TEXT field.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Using MySQL, I'm creating a temp table:

code:
CREATE TEMPORARY TABLE test (
	Id VARCHAR(18) NOT NULL,
	SomeField VARCHAR(1300),
	SomeOtherField VARCHAR(20),
	YetAnotherField VARCHAR(100),
	PRIMARY KEY (Id, SomeOtherField),
	INDEX (YetAnotherField)
)
With the default my.ini, it takes 2 minutes to load 50,000 rows into this table (using a prepared statement w/ batch). If I change my create statement to specify ENGINE = MEMORY and set tmp_table_size and max_heap_table_size both to 256M, it takes 5 seconds to load the same data.

Unfortunately, I don't have enough memory on this machine to have temp tables in memory that are that big for X number of users. Are there other ways I can speed this up? 2 minutes to load 50,000 rows seems like way too long.

edit: specifying ENGINE = MyISAM, overriding the InnoDB default, yields a 10 second load time. Hurrah!

fletcher fucked around with this message at 00:04 on Dec 3, 2009

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Triple Tech posted:

Take off the index. If you want to speed up writes, drop all indices. Then if you need the index for reporting purposes, recreate it.

Interesting...so it would be faster to not have the indices in the CREATE, and to just create them once I'm done inserting?
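
So something along these lines, if I understand right (untested sketch, reusing the table from my earlier post):

code:
CREATE TEMPORARY TABLE test (
	Id VARCHAR(18) NOT NULL,
	SomeField VARCHAR(1300),
	SomeOtherField VARCHAR(20),
	YetAnotherField VARCHAR(100)
) ENGINE = MyISAM;

-- bulk insert here, then add the keys afterwards
ALTER TABLE test
	ADD PRIMARY KEY (Id, SomeOtherField),
	ADD INDEX (YetAnotherField);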

Sprawl posted:

i can't even imagine why you would put a 1300 character field into the primary key that doesn't make any sense.

Whoops, that should really be SomeOtherField, not the long one.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Comatoast posted:

So, how do I go about creating a table which can hold an infinite number of authors?

Sorry, but that is a really dumb question. Go buy an infinite number of hard drives? What class is this for?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Comatoast posted:

Bah. You're right. I'm just going to say 100 is the max and go from there. I know how to do that.

You don't need to say infinite or even 100, just say "arbitrary number of authors" or something.

You need three tables: publication, author, and a table to relate the two to each other.
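
A rough sketch of what I mean (names and types are just examples):

code:
CREATE TABLE publication (
  id INT AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(255)
);

CREATE TABLE author (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255)
);

-- the linking table: one row per (publication, author) pair
CREATE TABLE publication_author (
  publicationId INT NOT NULL,
  authorId INT NOT NULL,
  PRIMARY KEY (publicationId, authorId)
);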

If this whole class is about databases and you can't do this by the end of the semester...well...start putting effort into school and stop wasting your parents' money?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Sprawl posted:

This is really easy and i shouldn't give it to you but you sound a little slow, think second normal.

Boooooooo! You should have let him figure this poo poo out.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
I needed to remove a trailing space from a certain column and I tried to use this query to find out which rows I would be updating:

code:
SELECT Product FROM Table WHERE Product != TRIM(Product)
But it didn't return any rows, even though I know there are some Products in there with trailing spaces. Why didn't this work like I thought it would?
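
For what it's worth, this variation seems like it should find them, since (from what I've read) LIKE doesn't ignore trailing spaces the way = does on VARCHAR columns (untested):

code:
SELECT Product FROM Table WHERE Product LIKE '% '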
