Victor
Jun 18, 2004
You can do exactly what you want to do, probably without much pain. If I am understanding you properly, I've done what you are describing a thousand times over. I write SQL which generates SQL very frequently. Unless I greatly misunderstand you, you're just missing a crucial piece of how to go about this.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Atimo posted:

What I'd like to do is set up the T-SQL that's creating the string as a sproc call, populate a table of table names to run it against, get it all back as one big lump table, then have a bit of .NET code or something write each row back out to a file, but without the line breaks it's worthless.

Guess I'll just go back to my nightmare now.
Are you sure that the line breaks are being eaten as opposed to simply not showing up in SQL management studio? I'm pretty sure that if you do
code:
DECLARE @Results varchar(max)
SET @Results = 'Line 1
Line 2'
SELECT @Results
PRINT @Results
code:
Line 1 Line 2
will show up in the results tab, but the messages tab will properly show
code:
Line 1
Line 2
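Jethro's point can be checked outside Management Studio entirely; a quick Python sketch (sqlite3 standing in for the SQL Server round trip, which the thread doesn't show) confirms the newline lives in the value itself:

```python
import sqlite3

# sqlite stand-in for fetching the varchar from application code:
# the break is part of the stored value, not a display artifact.
conn = sqlite3.connect(":memory:")
(result,) = conn.execute(
    "SELECT 'Line 1' || char(10) || 'Line 2'"  # char(10) is the line feed
).fetchone()
# result now contains a real newline between the two lines
```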

Atimo
Feb 21, 2007
Lurking since '03
Fun Shoe

Victor posted:

You can do exactly what you want to do, probably without much pain. If I am understanding you properly, I've done what you are describing a thousand times over. I write SQL which generates SQL very frequently. Unless I greatly misunderstand you, you're just missing a crucial piece of how to go about this.

Well it boils down to
code:
DECLARE @Sql varchar(max)
SET @Sql = 'line one' + ( a line break ) + 'line two'
SELECT @Sql
When that comes back from a .NET call, or however you consume it, it looks like:

'line1 line2'

It looks fine in Management Studio in text mode, sure, but that's not all that useful. For now I'm going to have to run this against all 100 or so tables, copy & paste each out by hand, then go back and hand-edit a portion of the outputted code.

If I could keep the line breaks, I could cut the steps to just the hand edit per file.

(Which is still a bitch and a half, thanks boss)

That make any more sense?

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!
What are you using to query the database? It seems like there may be a setting there that is causing the problems, not the database itself.

If worse comes to worst, you could always replace the line breaks with some delimiter that isn't used anywhere else, like '|' or something, and then parse the results and replace '|' with a line break.
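In whatever consumes the results (the thread mentions .NET; this sketch uses Python, and the names are illustrative), undoing the delimiter is one call:

```python
def restore_line_breaks(generated_sql: str, delimiter: str = "|") -> str:
    """Swap the placeholder delimiter back into real line breaks."""
    return generated_sql.replace(delimiter, "\n")

# One generated script per table, '|' standing in for the eaten breaks
rows = ["SELECT 1|FROM TableA", "SELECT 2|FROM TableB"]
scripts = [restore_line_breaks(row) for row in rows]
```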

Atimo
Feb 21, 2007
Lurking since '03
Fun Shoe

Jethro posted:

What are you using to query the database? It seems like there may be a setting there that is causing the problems, not the database itself.

If worse comes to worst, you could always replace the line breaks with some delimiter that isn't used anywhere else, like '|' or something, and then parse the results and replace '|' with a line break.

/facepalm

Thanks, I got so frustrated with the breaks that it didn't even occur to me.

If only I'd thought of that 3 hours ago. Well, at the pace this is taking me I can still add in the delimiters, scrap what I have, and still be doing better than without it.

Victor
Jun 18, 2004
This may be of use: http://luke.breuer.com/time/item/SQL2005_print_and_nvarchar(max)/449.aspx

Xae
Jan 19, 2005

Victor posted:

Curious. I'm not aware of any such issues with our triggers, but I have not personally worked with the more complicated scenarios. What I will say is that it is nice to be able to write normal insert/update statements to do things instead of executing sprocs that necessarily have to operate one-at-a-time, at least until we get table-valued parameters in SQL2008. I don't know what the TVP support situation is for other databases. Having to operate through a web service seems even more gross.

Perhaps you have comments on Kenneth Downs's advocacy of triggers? He and I appear to agree, although I have not investigated this thoroughly.

Tom Kyte from Oracle is able to explain my aversion to triggers far better than I can. I ran into a similar situation where I used them to maintain denormalized data, much like Kenneth Downs's first example.


Triggers have their place, such as setting watermark fields or simple checks, but any 'business logic' (I loving hate that phrase, but it works) should be explicit.

Aside from simple architecture preference, there can be some wonky things that happen with triggers and transactions. I can't remember the code of the ORA error, but my previous employer got them by the bucketful.

Fly
Nov 3, 2002

moral compass

Xae posted:

Triggers have their place, such as setting watermark fields or simple checks, but any 'business logic' (I loving hate that phrase, but it works) should be explicit.
I think you state it well. Business logic split between code and database triggers is more difficult to maintain than if it is done consistently. That applies for any X and Y other than "code" and "database triggers".

tayl0r
Oct 3, 2002
This post brought to you by SOE
90% of the poo poo I know about Oracle is from AskTom

<3 tom kyte

Victor
Jun 18, 2004
Kyte's argument really seems to boil down to "people are too dumb to write triggers correctly and therefore triggers should not be used much". Either I'm missing something, or my company mysteriously avoids most of the problems he notes, most of the time, with minimal frustration in the process. Methinks I will need to do some asking around and see if my current view of things is correct.

tayl0r
Oct 3, 2002
This post brought to you by SOE

Victor posted:

Kyte's argument really seems to boil down to "people are too dumb to write triggers correctly and therefore triggers should not be used much". Either I'm missing something, or my company mysteriously avoids most of the problems he notes, most of the time, with minimal frustration in the process. Methinks I will need to do some asking around and see if my current view of things is correct.

That is not how I understood his article at all. He seems to be saying that if you have to use triggers to enclose some business logic, you're doing something wrong. Then he presents some common situations where people use triggers and how you can get the same result without triggers.

That seems totally valid to me. I cannot think of any situation where it actually makes sense to use a trigger in your application. It *will* come back and bite someone in the rear end when they're trying to debug a problem or modify the code somewhere down the line.

I've used triggers to help debug and keep track of what/when/how the data in a table is changing, but that type of activity is outside the actual application and isn't part of the business logic of anything.

Victor
Jun 18, 2004
At my shop, we just assume there are triggers. Thus, there's no "oh poo poo, what happened" moment. Kenneth Downs explains the benefits of supertight coupling in the article I posted above. So far, we've really benefited from that coupling ourselves. We don't do stuff in triggers that is not roll-back-able, but honestly, if the email sends are in sprocs, and execution goes through them before the transaction is rolled back, the same problem happens. I don't want to be an rear end and all, but "you're doing it wrong" is most of what comes to my mind.

Dumb people don't know that you can use SQL Server Profiler to examine what's going on in triggers. Profiler is one of the best pieces of software on this planet IMHO, because it complements TSQL and SSMS so loving well. Want to script an action in SSMS? Just fire up Profiler, do the action in SSMS, and BOOM, your dream is reality.

Victor
Jun 18, 2004
More links on people discussing triggers in a non-retarded, non-trivial way would be much appreciated.

Xae
Jan 19, 2005

The issue is less about when the original team is working on the project and more about 5-10 years later. Databases tend to long outlive the application built on them. I do a bit of digging in old databases, and time after time some 'clever' trigger solution is the cause of problems later on.

Oracle's mutating table error causes even more headaches. The intent is to warn people that they are bumping into ACID problems; the reality is that they come up with convoluted schemes to circumvent it.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
Is there a more elegant way of doing what I'm trying to do here?

SELECT (some stuff) FROM Game WHERE (TimeLimitBunit = 'MINUTE' AND TIMESTAMPADD(MINUTE,TimeLimitB,LastMove) < NOW()) OR (TimeLimitBunit = 'HOUR' AND TIMESTAMPADD(HOUR,TimeLimitB,LastMove) < NOW()) OR (TimeLimitBunit = 'DAY' AND TIMESTAMPADD(DAY,TimeLimitB,LastMove) < NOW())

What I really wanted to do was just this:

SELECT (some stuff) FROM Game WHERE TIMESTAMPADD(TimeLimitBunit,TimeLimitB,LastMove) < NOW()

but it doesn't work, presumably because TIMESTAMPADD receives 'MINUTE' rather than MINUTE as its first argument and doesn't know what to do.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Hammerite posted:

Is there a more elegant way of doing what I'm trying to do here?

SELECT (some stuff) FROM Game WHERE (TimeLimitBunit = 'MINUTE' AND TIMESTAMPADD(MINUTE,TimeLimitB,LastMove) < NOW()) OR (TimeLimitBunit = 'HOUR' AND TIMESTAMPADD(HOUR,TimeLimitB,LastMove) < NOW()) OR (TimeLimitBunit = 'DAY' AND TIMESTAMPADD(DAY,TimeLimitB,LastMove) < NOW())

What I really wanted to do was just this:

SELECT (some stuff) FROM Game WHERE TIMESTAMPADD(TimeLimitBunit,TimeLimitB,LastMove) < NOW()

but it doesn't work, presumably because TIMESTAMPADD receives 'MINUTE' rather than MINUTE as its first argument and doesn't know what to do.
Depending upon your definition of 'elegant'

code:
SELECT (some stuff) FROM Game
 WHERE TIMESTAMPADD(MINUTE,
                    TimeLimitB*CASE TimeLimitBunit WHEN 'MINUTE' THEN 1
                                                   WHEN 'HOUR' THEN 60
                                                   WHEN 'DAY' THEN 24*60 END,
                    LastMove) < NOW()
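The CASE is just a unit-to-minutes conversion table; the same logic in application code (hypothetical names, Python for illustration) makes the equivalence to the three-branch WHERE clear:

```python
# Minutes per unit, mirroring the CASE branches above
MINUTES_PER_UNIT = {"MINUTE": 1, "HOUR": 60, "DAY": 24 * 60}

def limit_in_minutes(time_limit: int, unit: str) -> int:
    """TimeLimitB normalized to minutes, as the CASE expression does."""
    return time_limit * MINUTES_PER_UNIT[unit]
```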

jovial_cynic
Aug 19, 2005

SQL:
SELECT tValues.itemValue, tValues.date
FROM tValues
INNER JOIN tItems ON tItems.ID = tValues.itemID
WHERE tItems.systemItem = 6

This simple SQL statement gives me this:

code:
itemValue   date  
  2      2008-12-20 <--
  12     2008-12-20 <-- two values, same date
  4      2008-12-21
  6      2008-12-18
  1      2008-12-22 <--
  3      2008-12-22 <-- two values, same date
  6      2008-12-19
For the two dates that are repeated (12/20 & 12/22), I want the average of all the values in that date, so that the returned table looks like this:

code:
itemValue   date  
  7      2008-12-20 <--- averaged data
  4      2008-12-21
  6      2008-12-18
  2      2008-12-22 <--- averaged data
  6      2008-12-19
How would I go about doing this? I looked up the avg function, but I'm having a hard time getting my head around it for this application.

var1ety
Jul 26, 2004

jovial_cynic posted:

SQL:
SELECT tValues.itemValue, tValues.date
FROM tValues
INNER JOIN tItems ON tItems.ID = tValues.itemID
WHERE tItems.systemItem = 6

This simple SQL statement gives me this:

code:
itemValue   date  
  2      2008-12-20 <--
  12     2008-12-20 <-- two values, same date
  4      2008-12-21
  6      2008-12-18
  1      2008-12-22 <--
  3      2008-12-22 <-- two values, same date
  6      2008-12-19
For the two dates that are repeated (12/20 & 12/22), I want the average of all the values in that date, so that the returned table looks like this:

code:
itemValue   date  
  7      2008-12-20 <--- averaged data
  4      2008-12-21
  6      2008-12-18
  2      2008-12-22 <--- averaged data
  6      2008-12-19
How would I go about doing this? I looked up the avg function, but I'm having a hard time getting my head around it for this application.

SELECT tValues.date, avg(tValues.itemValue) AS avg_itemvalue
FROM tValues
INNER JOIN tItems ON tItems.ID = tValues.itemID
WHERE tItems.systemItem = 6
GROUP BY tValues.date
ORDER BY 1
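The GROUP BY fix checks out against the sample data; here is the post's query run against an in-memory sqlite copy (sqlite standing in for the poster's MySQL database):

```python
import sqlite3

# Tiny copy of the tables from the question; one item row so the join works.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tItems (ID INTEGER, systemItem INTEGER);
CREATE TABLE tValues (itemID INTEGER, itemValue INTEGER, date TEXT);
INSERT INTO tItems VALUES (1, 6);
INSERT INTO tValues VALUES
  (1, 2, '2008-12-20'), (1, 12, '2008-12-20'),
  (1, 4, '2008-12-21'), (1, 6, '2008-12-18'),
  (1, 1, '2008-12-22'), (1, 3, '2008-12-22'),
  (1, 6, '2008-12-19');
""")
rows = conn.execute("""
SELECT tValues.date, avg(tValues.itemValue) AS avg_itemvalue
  FROM tValues
 INNER JOIN tItems ON tItems.ID = tValues.itemID
 WHERE tItems.systemItem = 6
 GROUP BY tValues.date
 ORDER BY 1
""").fetchall()
# The duplicated dates collapse to one row each: 12-20 averages to 7,
# 12-22 averages to 2, and the other dates pass through unchanged.
```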

jovial_cynic
Aug 19, 2005

var1ety posted:

SELECT tValues.date, avg(tValues.itemValue) AS avg_itemvalue
FROM tValues
INNER JOIN tItems ON tItems.ID = tValues.itemID
WHERE tItems.systemItem = 6
GROUP BY tValues.date
ORDER BY 1

aha! It's the GROUP BY that I was missing. I couldn't figure out how to keep it from just averaging everything together. Thanks!

jovial_cynic fucked around with this message at 17:45 on Jan 8, 2009

chocojosh
Jun 9, 2007

D00D.
I had a small query to do today and I was thinking through it and wanted the opinion of the much more experienced DBAs here.

I have a table Language, with columns Key, Language, Value. It provides a way to support internationalization for a bilingual website I'm doing, inspired by .NET's resource files and GetString call.

I needed to provide a dump of the text for a translator, so I ended up doing the following:

code:
SELECT EL.Name As 'Key', EL.Value as 'English Text', FL.Value as 'French Text' 
FROM Language EL INNER JOIN Language FL ON EL.Name = FL.Name 
WHERE EL.Language = 'EN' AND FL.Language = 'FR';
This provides a table with (Key, English Text, French Text), so that the translator can take my English text, give me the correct French text, and then I can redump it to the database with the key.

I decided to use a self-join because I wanted to show the data of two records of the table in one record of the output. However, this query definitely does not scale well, as each new language would require another self-join. I'm wondering if I'm using self-joins in the appropriate way (to merge multiple records of a table into one record) and if there's a better way to be doing this type of query.

var1ety
Jul 26, 2004

chocojosh posted:

I had a small query to do today and I was thinking through it and wanted the opinion of the much more experienced DBAs here.

I have a table Language, with columns Key, Language, Value. It provides a way to support internationalization for a bilingual website I'm doing, inspired by .NET's resource files and GetString call.

I needed to provide a dump of the text for a translator, so I ended up doing the following:

code:
SELECT EL.Name As 'Key', EL.Value as 'English Text', FL.Value as 'French Text' 
FROM Language EL INNER JOIN Language FL ON EL.Name = FL.Name 
WHERE EL.Language = 'EN' AND FL.Language = 'FR';
This provides a table with (Key, English Text, French Text), so that the translator can take my English text, give me the correct French text, and then I can redump it to the database with the key.

I decided to use a self-join because I wanted to show the data of two records of the table in one record of the output. However, this query definitely does not scale well, as each new language would require another self-join. I'm wondering if I'm using self-joins in the appropriate way (to merge multiple records of a table into one record) and if there's a better way to be doing this type of query.

You can do it in a single scan using a pattern like this:

code:
select name AS 'Key',
       max(case when language = 'EN' then value end) AS english,
       max(case when language = 'FR' then value end) AS french
  from language
 where language IN ('EN', 'FR')
 group by name
 order by 1
Depending on your personal situation you should consider changing your language table so it is wider (name, english, french, dutch, spanish), as it will improve performance on these types of queries.
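The single-scan pivot is easy to verify; here it is against a toy sqlite copy of the Language table (the sample strings are invented, the query is var1ety's):

```python
import sqlite3

# Key/value translation rows in the shape the question describes
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE language (name TEXT, language TEXT, value TEXT);
INSERT INTO language VALUES
  ('greeting', 'EN', 'Hello'), ('greeting', 'FR', 'Bonjour'),
  ('farewell', 'EN', 'Goodbye'), ('farewell', 'FR', 'Au revoir');
""")
# Conditional aggregation: each language becomes a column in one pass
rows = conn.execute("""
select name,
       max(case when language = 'EN' then value end) AS english,
       max(case when language = 'FR' then value end) AS french
  from language
 where language IN ('EN', 'FR')
 group by name
 order by 1
""").fetchall()
```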

Plinkey
Aug 4, 2004

by Fluffdaddy
So this is probably a pretty easy question, but I'm new to SQL and just messing around with some stuff.

Anyways, I want to do something like this with MySQL:

code:
SELECT DAYOFMONTH( SELECT date_created FROM issues where id='X')
To convert a DATETIME value to just the day of the month/monthname...etc on the fly. What am I doing wrong here?

Pizza Partisan
Sep 22, 2003

Plinkey posted:

So this is probably a pretty easy question, but I'm new to SQL and just messing around with some stuff.

Anyways, I want to do something like this with MySQL:

code:
SELECT DAYOFMONTH( SELECT date_created FROM issues where id='X')
To convert a DATETIME value to just the day of the month/monthname...etc on the fly. What am I doing wrong here?

code:
SELECT DATEPART(DD, date_created) + '/' + DATEPART(MM, date_created) AS [date] FROM issues where id='x'
More info on DATEPART
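One dialect hedge: DATEPART is SQL Server syntax, while MySQL (which Plinkey is using) has DAYOFMONTH and MONTHNAME built in, and the original attempt mostly needs the function applied to the column rather than wrapped around a bare subquery. A sqlite3 sketch of that flattened shape (sqlite spells the extraction strftime('%d', ...); the sample row is invented):

```python
import sqlite3

# Flattened query: apply the date function to the column directly.
# In MySQL this would be DAYOFMONTH(date_created); sqlite uses strftime.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id TEXT, date_created TEXT)")
conn.execute("INSERT INTO issues VALUES ('X', '2009-01-08 17:45:00')")
(day,) = conn.execute(
    "SELECT strftime('%d', date_created) FROM issues WHERE id = 'X'"
).fetchone()
```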

hexadecimal
Nov 23, 2008

by Fragmaster

Victor posted:

More links on people discussing triggers in a non-retarded, non-trivial way would be much appreciated.

I have a problem about compiling visiting statistics, and I wonder if triggers could prove to be an efficient way of solving this.

I have a main table that relates:
| sensor_name | user_name | in_time | out_time |

I compile statistics for users that combine one or more sensor_names in one report. I calculate total number of visits, last visit time, total length of visit, and also average time spent.

I use PHP to do these calculations, and for around 200,000 tuples I manage to generate the webpage in a bit under a second.

The largest bottleneck here, I believe, is that I retrieve all related tuples from the main table to compute this, so it's a poo poo ton of data to process, and I do it every time somebody loads a page or changes some option in the report.

One solution, that I think will work well, is to use more tables that contain some precompiled information, such as
| sensor_name | user_name | total_visits | last_time | total_visit_time |

This way I have to read out a lot less tuples from SQL.

However now I have a problem of how to efficiently and correctly generate and maintain such tables.

I could modify stat table each time I add new tuple to main logging table in my php script, but I could also use a trigger. Which way would be better? And what other thoughts do you guys have on compiling visiting statistics in general?

Victor
Jun 18, 2004
Remember that you are fundamentally doing caching. I think Rico Mariani has a good post detailing how a lot of programming can be reduced to caching (and people would be better off if they were explicit about this), but I couldn't find it. Another way to think of this is to think of memoized functions. You're calling a function that returns data from the DB and you need to memoize the result, or a portion of the result. Does this make sense?

hexadecimal
Nov 23, 2008

by Fragmaster

Victor posted:

Remember that you are fundamentally doing caching. I think Rico Mariani has a good post detailing how a lot of programming can be reduced to caching (and people would be better off if they were explicit about this), but I couldn't find it. Another way to think of this is to think of memoized functions. You're calling a function that returns data from the DB and you need to memoize the result, or a portion of the result. Does this make sense?

How exactly do I memoize a PHP function? I read all the tuples from the DB only once per script invocation, so memoization wouldn't help. But it would if I could have a persistent data structure between the PHP script's invocations. Is that even possible? Would it make sense to use session variables for this?

Victor
Jun 18, 2004
First think about what you want to do, divorced from the technology you are forced to use. Just think, theoretically, about what you're doing. You're running some sort of map/reduce on a large body of information, and it's costly. This large body of information is continually getting new pieces of data, so you can't just memoize in a dumb way.

One option is to always maintain an up-to-date value, but that could be expensive. Another option is to maintain values for all days prior to the given day. I hinted at this earlier: this "prior day" thing is really memoizing part of the function's return value.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
Tell me whether the query I have constructed is a reasonably sensible one for the task I want MySQL to do. Is there a far simpler way of doing this?

code:
SELECT User,
       Name,
       NumWins,
       NumPlays,
       ROUND(100*NumWins/NumPlays, 2) AS PercentageWon
       FROM (
           SELECT User,
                  Name,
                  SUM(Co) AS NumPlays,
                  SUM(Co*CASE GameResult WHEN 'Finished 1st' THEN 1
                                         ELSE 0 END)
                      AS NumWins
                  FROM (
                      SELECT COUNT(PlayerGameRcd.User) AS Co,
                             PlayerGameRcd.User,
                             PlayerGameRcd.GameResult,
                             User.Name
                             FROM PlayerGameRcd
                                 LEFT JOIN User
                                 ON PlayerGameRcd.User = User.UserID
                                 LEFT JOIN Game
                                 ON PlayerGameRcd.Game = Game.GameID
                             WHERE PlayerGameRcd.GameResult <> 'Game Aborted'
                                   AND PlayerGameRcd.GameCounts = 1
                                   AND Game.Friendly = 0
                             GROUP BY PlayerGameRcd.User,
                                      PlayerGameRcd.GameResult
                       ) AS QRYA
                   GROUP BY User
            ) AS QRYB
       ORDER BY PercentageWon DESC
       LIMIT 50
The purpose of the query is to populate a table of up to 50; the table's purpose is to show the details of the people who have won the greatest proportion of games (out of all those who have played at least one game to completion). I want to tabulate their names, the number of games they have played to completion, the number of games they have won, and the percentage of the games they played to completion that they won. I also want to get back their user ID numbers so that I can make each name in the table a hyperlink to the appropriate user details page. PlayerGameRcd is a table containing records of players' performance in games. GameResult is the player's result from a game.

Victor
Jun 18, 2004
As far as I can tell, you need to group three different ways, so what you have is probably the minimum complexity possible. The only potential issue is if MySQL sucks rear end at executing that query, which would not surprise me in the least. MSSQL would have no problems whatsoever.

hexadecimal
Nov 23, 2008

by Fragmaster

Victor posted:

First think about what you want to do, divorced from the technology you are forced to use. Just think, theoretically, about what you're doing. You're running some sort of map/reduce on a large body of information, and it's costly. This large body of information is continually getting new pieces of data, so you can't just memoize in a dumb way.

One option is to always maintain an up-to-date value, but that could be expensive. Another option is to maintain values for all days prior to the given day. I hinted at this earlier: this "prior day" thing is really memoizing part of the function's return value.

Can you explain better what you mean by the last paragraph? Do you mean running a script at the end of the day to compile stat values for that day? I guess this could be a good solution, since there are only so many visits that can occur in a day, and it wouldn't be too expensive to combine the daily information. Is this what you mean?

Also, the stat table I described is what you seem to be describing, only I wasn't thinking of it in daily terms. Is this correct? Is it really memoization, though? You are still doing SQL queries to get that information; you just do a lot fewer of them.

Can you also comment on the trigger approach? If I add a trigger to add/modify a tuple in the stats table each time I add/modify a tuple in the actual log table, is that better than making explicit SQL queries to do the same thing?

hexadecimal fucked around with this message at 06:22 on Jan 10, 2009

Victor
Jun 18, 2004
Nightly batch processes are extremely common in the software world, so yes, at midnight or sometime thereafter, you would update the daily table so that when you do your aggregation, you'll take a value or values from the daily table, and only the current day's worth of values from the detail table.

The cost of using triggers is keeping a value up-to-date. If there are possible concurrency concerns (such as lots of people updating the detail table), then you run the danger of locking the aggregate values while doing an update, have that take a nontrivial amount of time, and end up locking people out of selecting those aggregate values until your update is finished. Wow, that's a long sentence -- if you don't understand, I can elaborate.

I suggest you first read up on memoization and some different flavors of it. Then, give me links of what you read and I'll try to fill in any gaps.
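As a concrete starting point (Python rather than the poster's PHP; the function body is a placeholder, not the real stats query), plain memoization looks like this, and the daily-stats table in the thread is the same idea applied to only part of the result:

```python
from functools import lru_cache

calls = []  # records how often the "expensive" work actually runs

@lru_cache(maxsize=None)
def visit_stats(day: str) -> int:
    calls.append(day)   # stands in for the costly scan of the log table
    return len(day)     # placeholder computation

visit_stats("2009-01-09")
visit_stats("2009-01-09")  # second call is served from the cache
```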

hexadecimal
Nov 23, 2008

by Fragmaster

Victor posted:

Nightly batch processes are extremely common in the software world, so yes, at midnight or sometime thereafter, you would update the daily table so that when you do your aggregation, you'll take a value or values from the daily table, and only the current day's worth of values from the detail table.

The cost of using triggers is keeping a value up-to-date. If there are possible concurrency concerns (such as lots of people updating the detail table), then you run the danger of locking the aggregate values while doing an update, have that take a nontrivial amount of time, and end up locking people out of selecting those aggregate values until your update is finished. Wow, that's a long sentence -- if you don't understand, I can elaborate.

I suggest you first read up on memoization and some different flavors of it. Then, give me links of what you read and I'll try to fill in any gaps.

That makes sense. I understand what memoization is; I just don't have much experience with SQL, so I wasn't sure what an efficient way of doing this would be, but it makes more sense now.

Another question: is it faster to sort in SQL or PHP? That is, I can use ORDER BY when selecting tuples out of the stats table, or I can read those tuples into PHP objects and use quicksort to do the job. Which one would be more efficient? From what I have tried it is quicksort in PHP, but that was without the stats table.

vvvv Well, I was compiling stats and sorting in like one SQL statement from the log table, and it was loving slow compared to just reading all tuples in PHP and sorting using my own sort algo.

hexadecimal fucked around with this message at 21:27 on Jan 10, 2009

Victor
Jun 18, 2004
Unless you have a really really really good reason to do otherwise, sort in MySQL.

jovial_cynic
Aug 19, 2005

I've got a table tblFoo that's like this

code:
userID | dateAdded
-------------------
  1    | 2009-01-01
  2    | 2009-01-01
  3    | 2009-01-01
  4    | 2009-01-02
  5    | 2009-01-03
  6    | 2009-01-03
I want a select statement that'll give me the total number of users on a given date range. I know I can do:

select count(userID) as numberOfUsers, dateAdded from tblFoo where dateAdded < '2009-01-03'

... which will return numberOfUsers -> 4.

But since I'm graphing this, I'd want the query to give me the total for each day. However, I don't want this:

code:
numberOfUsers | dateAdded
--------------------------
     3        | 2009-01-01   \
     1        | 2009-01-02    |---- count per day
     2        | 2009-01-03   /
I want this:

code:
numberOfUsers | dateAdded
--------------------------
     3        | 2009-01-01   \
     4        | 2009-01-02    |---- count tally, grows each day
     6        | 2009-01-03   /
Can this be done in a query without scripting out a series of unions and (x < date) queries?


If this requires a join with a table that contains the date range, good. Because I already have the SQL for that handy.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

hexadecimal posted:

vvvv Well, I was compiling stats and sorting in like one SQL statement from the log table, and it was loving slow compared to just reading all tuples in PHP and sorting using my own sort algo.
That probably means you don't have good indexes on your tables.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

quote:

If this requires a join with a table that contains the date range, good. Because I already have the SQL for that handy.

code:
select count(userID) as numberOfUsers,
       dateFromDateTable as dateAdded
  from tblFoo inner join
       tblDate on dateFromDateTable >= dateAdded
group by dateFromDateTable
That's the first/easiest thing that comes to mind. I dunno how it would perform, especially if you had very large tables.
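Against the six-row tblFoo from the question, the join does produce the running tally; a sqlite3 sandbox (with an ORDER BY added so the row order is deterministic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblFoo (userID INTEGER, dateAdded TEXT);
INSERT INTO tblFoo VALUES
  (1,'2009-01-01'),(2,'2009-01-01'),(3,'2009-01-01'),
  (4,'2009-01-02'),(5,'2009-01-03'),(6,'2009-01-03');
CREATE TABLE tblDate (dateFromDateTable TEXT);
INSERT INTO tblDate VALUES ('2009-01-01'),('2009-01-02'),('2009-01-03');
""")
# Each date row joins to every signup on or before it, so the
# per-group count is a cumulative total rather than a per-day count.
rows = conn.execute("""
select count(userID) as numberOfUsers,
       dateFromDateTable
  from tblFoo inner join
       tblDate on dateFromDateTable >= dateAdded
 group by dateFromDateTable
 order by dateFromDateTable
""").fetchall()
```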

Are you using MySQL or something else (read: better)?

Jethro fucked around with this message at 00:53 on Jan 11, 2009

hexadecimal
Nov 23, 2008

by Fragmaster

Jethro posted:

That probably means you don't have good indexes on your tables.

Can you elaborate? I didn't have any primary key in my table because it wasn't needed. In the log table the whole tuple could be considered a composite key, but I don't think it's possible to create one in MySQL.

Should I just add a primary int key that is unique for all tuples but is not really meaningful? Would that speed up the operations?

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

hexadecimal posted:

Can you elaborate? I didn't have any primary key in my table because it wasn't needed. In the log table the whole tuple could be considered a composite key, but I don't think it's possible to create one in MySQL.

Should I just add a primary int key that is unique for all tuples but is not really meaningful? Would that speed up the operations?
I mean: what are you sorting by? Create an index (not necessarily a key) on those columns so that MySQL can then use that index when you ask it to sort.
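Concretely (sqlite syntax in the sandbox below, but CREATE INDEX reads the same in MySQL; the table mirrors the visit-log columns from earlier and the index name is invented), the query plan shows the sort being satisfied by the index:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE visit_log (
    sensor_name TEXT, user_name TEXT, in_time TEXT, out_time TEXT)""")
# A plain (non-unique, non-key) index on the column we sort by
conn.execute("CREATE INDEX ix_visit_log_in_time ON visit_log (in_time)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM visit_log ORDER BY in_time"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
# The plan walks the index instead of building a temp sort structure
```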

hexadecimal
Nov 23, 2008

by Fragmaster

Jethro posted:

I mean: what are you sorting by? Create an index (not necessarily a key) on those columns so that MySQL can then use that index when you ask it to sort.

I sort by different criteria, but an index is a primary key, right? And that only applies to one column in a table?

jovial_cynic
Aug 19, 2005

Jethro posted:

code:
select count(userID) as numberOfUsers,
       dateFromDateTable as dateAdded
  from tblFoo inner join
       tblDate on dateFromDateTable >= dateAdded
group by dateFromDateTable
That's the first/easiest thing that comes to mind. I dunno how it would perform, especially if you had very large tables.

Are you using MySQL or something else (read: better)?

I'm using mysql.

Running the following query:

code:
SELECT count( ID ) , dateFromDateTable AS dateAdded
FROM tblUsers
INNER JOIN 
(
  SELECT '2009-01-01' AS dateFromDateTable
  UNION
  SELECT '2009-01-02' AS dateFromDateTable
  UNION
  SELECT '2009-01-03' AS dateFromDateTable
  ) AS tblDate 
ON dateFromDateTable >= dateAdded
GROUP BY tblDate.dateFromDateTable
(I'm just using the body of that subquery for simplicity; I have a script that'll actually generate a table of dates).

Anyhow, it gives me this:

code:
count(ID) | dateFromDateTable
-----------------------------
311 	  |   2009-01-01
311 	  |   2009-01-02
311 	  |   2009-01-03
And this is not accurate.
