Thel
Apr 28, 2010

Sprawl posted:

It's very possible the ADO recordset was using an ODBC driver at some point that was updated to no longer replace blank spaces with underscores.

That actually happens? :aaaaa:

Backhand
Sep 25, 2008
Apparently so.

Yeah, I'm gonna write this one off as an ODBC driver issue. I didn't update the sucker, and if someone else did, they didn't tell me... but honestly, I don't care. It was wrong, it's fixed, end of story as far as I'm concerned. The boss is just obsessed with accountability is all. Not entirely without reason, I must admit... I didn't update IIS or any drivers recently, and if someone else did they REALLY should have let me know.

And yeah Thel, it's an ASP Classic environment. I know, I know... I'd love to upgrade to .NET, but every single time the DBA and I bring the matter up we get shot down on the grounds that it would take too much time and energy away from other things.

Gravy Jones
Sep 13, 2003

I am not on your side
I'm using MSSQL 2005 so I'm stuck with the datetime datatype.

I have a table which has a series of dated events indexed by the id of the company where each event occurred. In these events time does not and will never matter. I'm going to have to run a lot of queries based on the date events occurred. As an example, SELECT DISTINCT DatePartofTheDateField WHERE CompanyID = 123 will be a particularly common one.

Looking around on Google, there are a lot of different methods for querying only the date part of a datetime field. However, most of them involve some form of conversion, and it seems to me that the sort of query I'm talking about above would be really, really inefficient in a large (millions of records) database, and no amount of indexing is going to help that.

My first thought (well, other than upgrading to 2008) was to see if it is possible to create an index on a function of the date column, but I'm not sure if/how that works.

My next was some kind of trigger that would set the datetime column (or a second column) to the date at midnight automatically whenever a record was updated, and index that.

My final thought was to do the above with a trigger, except convert it to some kind of integer-based representation of the date (number of days since 1/1/1900 or something, a timestamp I guess) and index that, on the assumption that indexed integers are a lot more efficient than indexed dates and times. This data is only ever going to be accessed by a single .NET class, so I can do all the conversions before the query and on the results in this class.

I have no idea if any of this makes sense or if I'm on the right track.

I guess my main concern is that there's some degree of reinventing the wheel here, and/or that if we do eventually upgrade to 2008 (which has a date datatype) the whole thing would become an ugly, pointless legacy.

Edit: Just ran a few tests with 100,000 records. I was surprised to find that querying the integer timestamp and the datetime column where all the dates were set to midnight took literally the same amount of time. Where I was converting the dates as part of the query it took about ten times as long, which was not surprising.

Gravy Jones fucked around with this message at 15:08 on Jul 4, 2011
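The first thought above (an index on a function of the date column) does work in SQL Server 2005 via a persisted computed column. A rough sketch, where the table and column names (Events, EventDate, CompanyID) are made up for illustration:

```sql
-- Sketch for SQL Server 2005; Events, EventDate and CompanyID are assumed names.
-- A persisted computed column strips the time portion and can be indexed directly.
ALTER TABLE Events
    ADD EventDay AS DATEADD(day, DATEDIFF(day, 0, EventDate), 0) PERSISTED;

CREATE INDEX IX_Events_Company_Day ON Events (CompanyID, EventDay);

-- The common query then becomes a plain index seek on (CompanyID, EventDay):
SELECT DISTINCT EventDay
FROM   Events
WHERE  CompanyID = 123;
```

No triggers needed, and if the table later moves to 2008's date type the computed column can just be dropped.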

benisntfunny
Dec 2, 2004
I'm Perfect.

Gravy Jones posted:


I have no idea if any of this makes sense or if I'm on the right track.

I guess my main concern is that there's some degree of reinventing the wheel here, and/or that if we do eventually upgrade to 2008 (which has a date datatype) the whole thing would become an ugly, pointless legacy.

Short of switching to 2k8, I think you're on the right track. I would serialize the dates into number strings and index those strings in the database. The date would be converted going in and out of the table, either by your application or the database itself. If anyone ever did want to look at the database easily, you could just create a view of the table.

Sprawl
Nov 21, 2005


I'm a huge retarded sperglord who can't spell, but Starfleet Dental would still take me and I love them for it!

Gravy Jones posted:

I'm using MSSQL 2005 so I'm stuck with the datetime datatype.

...

I have no idea if any of this makes sense or if I'm on the right track.
You could create a column that stores the part of the date you want.

You should also be able to create a view that is able to do the same thing.

Thel
Apr 28, 2010

Gravy Jones posted:

Date stuff in SQL Server 2005

What I would do is either add a field to the table (and update via triggers), or as Sprawl suggested, use a view. It depends on your data and usage as to which is better (If the data changes often but you only need to query on the dates occasionally, then go with the view. If your data is mostly static but you're hitting it with datepart queries every second, then the table might be better*).

As far as actually converting the datetime, instead of doing the Excel-style "days since epoch start", just convert the dateparts to YYYYMMDD (i.e. 20110706). Less mess, less fuss, more usability. (Unless you're running queries like "give me from now to six weeks ago", in which case, augh. Either way it's going to be a mess.)

* I haven't actually tested this, I don't know if this is true. :v: Now that I think about it, if you make the view an indexed view the performance should be about the same.
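Thel's YYYYMMDD suggestion might look like this in T-SQL; the table and column names (Events, EventDate, CompanyID) are placeholders:

```sql
-- Assumed names (Events, EventDate, CompanyID). Convert style 112 is ISO
-- yyyymmdd, so the cast turns e.g. '2011-07-06 14:32:00' into 20110706.
SELECT DISTINCT CAST(CONVERT(varchar(8), EventDate, 112) AS int) AS EventDayInt
FROM   Events
WHERE  CompanyID = 123;
```

Because YYYYMMDD integers sort in chronological order, range predicates like BETWEEN 20110601 AND 20110714 still behave correctly against an indexed column of them.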

J. Elliot Razorledgeball
Jan 28, 2005
So is there any way to do an ALTER TABLE on a MySQL 5.1 system without write locking? I know there isn't normally, but surely there's some strange voodoo way of creating another instance and keeping track of checkpoints and such, or copying to temp tables, blah blah.

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Because ALTER has to basically rewrite the table, yes, it has to get a lock, and no, there's no sane way to bypass this restriction. What are you trying to accomplish?

J. Elliot Razorledgeball
Jan 28, 2005

McGlockenshire posted:

Because ALTER has to basically rewrite the table, yes, it has to get a lock, and no, there's no sane way to bypass this restriction. What are you trying to accomplish?

Adding a field to a large table without write/read locking, or with write/read locking for as short a time as possible.

Vanadium Dame
May 29, 2002

HELLO I WOULD LIKE TO TALK TO YOU ABOUT MY VERY STRONG OPINIONS

J. Elliot Razorledgeball posted:

Adding a field to a large table without write/read locking, or with write/read locking for as short a time as possible.

Dunno if this works in MySQL, but in Oracle you can add a column to a giant table with no default and allowing nulls, and it will lock the table for only a very short period of time. Then, if you need to populate the column and set a default/not null, you can update the column values via a loop that commits in batches to keep any locks from lasting too long, and then alter the column to have a default/be not null once the update loop is done. That last step should also be quite fast, since the column should be fully populated with values (or close to it).

Vanadium Dame fucked around with this message at 02:18 on Jul 7, 2011
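The Oracle pattern described above, sketched out; big_table and new_col are placeholder names and the batch size is arbitrary:

```sql
-- Step 1: adding a nullable column with no default is a near-instant change.
ALTER TABLE big_table ADD (new_col NUMBER NULL);

-- Step 2: backfill in batches so no single transaction holds locks for long.
BEGIN
  LOOP
    UPDATE big_table
    SET    new_col = 0
    WHERE  new_col IS NULL
    AND    ROWNUM <= 10000;
    EXIT WHEN SQL%ROWCOUNT = 0;
    COMMIT;
  END LOOP;
  COMMIT;
END;
/

-- Step 3: once the column is fully populated, the constraint change is fast.
ALTER TABLE big_table MODIFY (new_col DEFAULT 0 NOT NULL);
```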

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

Markoff Chaney posted:

Dunno if this works in MySQL, but in Oracle you can add a column to a giant table with no default and allowing nulls, and it will lock the table for only a very short period of time. Then, if you need to populate the column and set a default/not null, you can update the column values via a loop that commits in batches to keep any locks from lasting too long, and then alter the column to have a default/be not null once the update loop is done. That last step should also be quite fast, since the column should be fully populated with values (or close to it).

This works in MSSQL too; watch out if you're replicating though. Actually altering a replicated table in the first place means you'll probably have to resubscribe/publish anyway.

McGlockenshire
Dec 16, 2005

Ah, you people and your modern, sane databases.

Traditional MySQL replication works by just sending the actual SQL commands to all listening slaves. If you run the ALTER on the master, the slaves will also receive the command (once completed) unless you halt replication, disconnect every client, record the log position, run the command, record the new log position, then update every slave to pick up replication at the new log position.

Modern MySQL replication can just send over changed data, but DDL statements are still replicated as actual SQL text.

quote:

Adding a field to a large table without write/read locking, or with write/read locking for as short a time as possible.
Depending on the size of the data, it might be faster to create a copy of the table structure using CREATE TABLE ... LIKE ..., alter the empty clone, then do an INSERT INTO ... SELECT ... to copy all the data over, and then a pair of ALTERs to swap the table names. Dropping all the indexes in the empty clone and adding them back after the data has been re-entered might also be faster, depending.

Depending on whether or not you can just disable connections to the database, this may or may not be filled with peril and the chance of subtle data loss if the original table is changed between the time you INSERT INTO ... SELECT and do the rename. MySQL does not support DDL statements inside a transaction, so an exclusive lock is your only choice here.

Running the alter directly is the thing least likely to be trouble for you.
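The clone-and-swap approach might be sketched like this in MySQL; t and the new column name are placeholders, and RENAME TABLE swaps both names in one atomic statement, which is a little safer than a pair of ALTERs:

```sql
CREATE TABLE t_new LIKE t;                    -- empty clone of the structure
ALTER TABLE t_new ADD COLUMN extra INT NULL;  -- altering an empty table is instant

INSERT INTO t_new SELECT *, NULL FROM t;      -- the slow copy; t must not change meanwhile

RENAME TABLE t TO t_old, t_new TO t;          -- both renames happen atomically
```

As noted above, any rows written to t during the copy are silently lost, so this only works if writes can be stopped first.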

J. Elliot Razorledgeball
Jan 28, 2005

McGlockenshire posted:

:words:

I'm going to give this a shot :nyan:

Thel
Apr 28, 2010

McGlockenshire posted:

:words:

:stare:

Remind me never to use MySQL for ... anything. Jesus.

J. Elliot Razorledgeball
Jan 28, 2005

Thel posted:

:stare:

Remind me never to use MySQL for ... anything. Jesus.

I think that's a bit of an overreaction; MySQL obviously has its uses. Online schema changes just happen to not be a strong point.

Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe
Does anyone know how Postgres handles it?

McGlockenshire
Dec 16, 2005


quote:

Does anyone know how Postgres handles it?
Unfortunately I don't have a large enough data set in PG to properly test this -- only a few thousand rows, only a few hundred megs. Adding a column was instantaneous in my largest tables, as was dropping.

The ALTER TABLE docs say that DROP COLUMN doesn't require a table rewrite, but changing a column data type does require one. DDL is also supported in transactions, thank goodness.
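Since Postgres allows DDL inside transactions, a migration can be wrapped so a failure leaves nothing half-applied. A minimal sketch with placeholder names:

```sql
BEGIN;
ALTER TABLE t ADD COLUMN extra integer;   -- no default set, so no table rewrite
UPDATE t SET extra = 0;                   -- optional backfill, same transaction
COMMIT;                                   -- or ROLLBACK undoes the ALTER itself
```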

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.
I have a table of data that needs to be updated using the following rules and I don't have the foggiest notion of how to go about solving it:

code:
BEFORE:
GUID  SessionID  MemberID  Timestamp
10    11         0         101
10    11         0         102
10    11         0         103
10    22         0         104
10    33         0         105
10    44         1         106
10    55         0         107
10    66         0         108
10    77         0         109
10    77         2         110
10    77         2         111
10    88         0         112
10    88         0         113
10    99         0         114
10    99         1         115
becomes

code:
AFTER:
GUID  SessionID  MemberID  Timestamp
10    11         1         101
10    11         1         102
10    11         1         103
10    22         1         104
10    33         1         105
10    44         1         106
10    55         1         107
10    66         1         108
10    77         2         109
10    77         2         110
10    77         2         111
10    88         2         112
10    88         2         113
10    99         1         114
10    99         1         115
Basically the rules are:

1. When a MemberID is established for a GUID, update all of that GUID's prior records whose MemberID equals zero with that MemberID (for all sessions).
2. If a different MemberID is then established for that GUID, use that MemberID for all subsequent sessions.
3. If a session is started with no MemberID, but a MemberID is subsequently established, update the entire session with that MemberID.
4. If a session has an existing MemberID (from rule 2) and a MemberID is established during the session, update the whole session with the new MemberID.

Agrikk fucked around with this message at 19:29 on Jul 7, 2011

RoadCrewWorker
Nov 19, 2007

camels aren't so great

McGlockenshire posted:

Unfortunately I don't have a large enough data set in PG to properly test this -- only a few thousand rows, only a few hundred megs. Adding a column was instantaneous in my largest tables, as was dropping.
You (or anybody, really) got any idea how to fix Postgres' horrible batch update performance, where "update table set x=x+1" basically copies every single record in the table (even if x is just one of 100 columns) and keeps the previous versions as rows marked invalid, effectively doubling the table size until the next VACUUM FULL?

I've got about 50 or so 15-20 GB tables (not counting indices) with >200 million rows each, so that'd come in terribly handy. Queries I used with MySQL are now completely impossible.

I really love Postgres and couldn't go back to MySQL, but man, there are some severely headache-inducing flaws that just make me think I must be missing something incredibly obvious.
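One common workaround (not a fix for MVCC itself) is to batch the update by key range and run a plain VACUUM between batches, so dead row versions get reclaimed and reused instead of growing the table. All names here (big_table, id, x) are placeholders:

```sql
-- Assumed schema: big_table with integer primary key id and column x.
UPDATE big_table SET x = x + 1 WHERE id >= 0       AND id < 1000000;
VACUUM big_table;   -- plain VACUUM marks the dead versions reusable
UPDATE big_table SET x = x + 1 WHERE id >= 1000000 AND id < 2000000;
VACUUM big_table;
-- ...and so on: later batches recycle the freed space rather than
-- extending the table, so it never balloons to double its size.
```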

McGlockenshire
Dec 16, 2005

GOLLOCKS!

RoadCrewWorker posted:

I really love Postgres and couldn't go back to MySQL, but man, there are some severely headache-inducing flaws that just make me think I must be missing something incredibly obvious.
Autovacuum?

Goat Bastard
Oct 20, 2004

Agrikk posted:

I have a table of data that needs to be updated using the following rules and I don't have the foggiest notion of how to go about solving it:

...

Basically the rules are:

1. When a MemberID is established for a GUID, update all of that GUID's prior records whose MemberID equals zero with that MemberID (for all sessions).
2. If a different MemberID is then established for that GUID, use that MemberID for all subsequent sessions.
3. If a session is started with no MemberID, but a MemberID is subsequently established, update the entire session with that MemberID.
4. If a session has an existing MemberID (from rule 2) and a MemberID is established during the session, update the whole session with the new MemberID.

From your previous posts I'm assuming you're using SQL Server and therefore have access to the row_number() function.

I think these three beasts of update statements should do what you want (they need to be run in order):
code:
UPDATE gd
SET    memberid = (SELECT ranked_memberids.memberid
                   FROM   (SELECT memberid
                                 ,sessionid
                                 ,guid
                                 ,row_number() OVER (PARTITION BY guid
                                                                 ,sessionid
                                                     ORDER BY     timestamp DESC
                                                    ) ranking
                           FROM   guid_data
                           WHERE  memberid IS NOT NULL
                           AND    memberid > 0
                          ) ranked_memberids
                   WHERE  gd.sessionid = ranked_memberids.sessionid
                   AND    gd.guid = ranked_memberids.guid
                   AND    ranked_memberids.ranking = 1
                  )
FROM   guid_data gd
;

UPDATE gd
SET    memberid = (SELECT DISTINCT previous_data.memberid
                   FROM   guid_data current_data
                   JOIN   guid_data previous_data
                          ON  current_data.guid = previous_data.guid
                          AND current_data.timestamp > previous_data.timestamp
                          AND previous_data.memberid IS NOT NULL
                   WHERE  current_data.guid = gd.guid
                   AND    current_data.sessionid = gd.sessionid
                   AND    previous_data.timestamp = (SELECT max(timestamp)
                                                     FROM   guid_data
                                                     WHERE  timestamp < current_data.timestamp
                                                     AND    memberid IS NOT NULL
                                                    )
                  )
FROM   guid_data gd
WHERE  gd.memberid IS NULL
;

UPDATE gd
SET    memberid = (SELECT DISTINCT previous_data.memberid
                   FROM   guid_data current_data
                   JOIN   guid_data previous_data
                          ON  current_data.guid = previous_data.guid
                          AND current_data.timestamp < previous_data.timestamp
                          AND previous_data.memberid IS NOT NULL
                   WHERE  current_data.guid = gd.guid
                   AND    current_data.sessionid = gd.sessionid
                   AND    previous_data.timestamp = (SELECT max(timestamp)
                                                     FROM   guid_data
                                                     WHERE  timestamp > current_data.timestamp
                                                     AND    memberid IS NOT NULL
                                                    )
                  )
FROM   guid_data gd
WHERE  gd.memberid IS NULL
;
Be careful - if I've made a mistake and they don't catch everything then any member ids that used to be 0 will have been set to NULL by the first statement.

The second two could also be achieved in a much clearer way by writing procedural code to iterate over a cursor and update the rows that way, but I wanted to see if I could solve it using only SQL.

Goat Bastard fucked around with this message at 10:16 on Jul 8, 2011

angrytech
Jun 26, 2009
I've started messing around with PostgreSQL, and I've reached a point where I need to actually crack open a book. Are there any that are strongly recommended for beginners?

No Safe Word
Feb 26, 2005

angrytech posted:

I've started messing around with postgreSQL, and I've reached a point where I need to actually crack open a book. Are there any that are strongly recommended for beginners?

Do you need a Postgres book or just a plain SQL book? The documentation online for Postgres is pretty solid/complete.

angrytech
Jun 26, 2009

No Safe Word posted:

Do you need a Postgres book or just a plain SQL book? The documentation online for Postgres is pretty solid/complete.

I just want to kill trees when I learn Postgres, although if there's a really awesome SQL book, recommend away.

Lyon
Apr 17, 2003
I'm very interested in an introduction to SQL book. Sounds like angrytech is far more advanced than I am, but what would be a good first book for learning SQL?

angrytech
Jun 26, 2009

Lyon posted:

...angrytech ... far more advanced than I am

Oh god no, but thanks. :)
An intro to SQL book would be awesome. I just don't see myself using MSSQL, Oracle, or MySQL anytime soon, so I feel like I can afford to specialize in Postgres.

Boogeyman
Sep 29, 2004

Boo, motherfucker.
Quick MSSQL memory question. SQL 2008 R2 Standard supports up to 64GB of RAM. Does anyone know if that limit is per server or per instance? I'd really like to get one beefy server with 128GB of RAM and run two instances (64GB each) instead of having to buy two separate servers.

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!
Some quick googling says that it's 64GB per instance.

Boogeyman
Sep 29, 2004


Jethro posted:

Some quick googling says that it's 64GB per instance.

That's the same thing I was seeing, but I was worried since that info wasn't coming directly from Microsoft. I ended up having my boss talk to our CDW rep today and he confirmed that it is 64GB per instance, which makes things easier.

Bob Morales
Aug 18, 2006


Just wear the fucking mask, Bob

I don't care how many people I probably infected with COVID-19 while refusing to wear a mask, my comfort is far more important than the health and safety of everyone around me!

This is a longshot, but does anyone use Webyog sja to synchronize MySQL databases?

I'm trying to use it, but I get this error every time:

code:
Table               SrcRows  TgtRows   Inserted Update Deleted
================    =======  =======   ======== ====== =======
'activities'             85       85          0      0       0
'addresses'            8054     8054 There was an error while executing a query. The querey and the error has been logged in the log file.
The log file simply says "Error No. 1205 Lock wait timeout exceeded; try restarting transaction"

These are InnoDB (I think) tables. I'm trying to see if using this tool is faster than dropping/re-dumping the databases (we might only need to sync up the last week's worth of data).

I increased the innodb_lock_wait_timeout to 100 and it does the same thing, it just takes longer. I also tried adding innodb_table_lock=0, no difference.

Both servers are CentOS 5.6 x86_64, and MySQL is Ver 14.12 Distrib 5.0.77.

McGlockenshire
Dec 16, 2005

Looks like you're hitting a deadlock.

Increase it again and wait for it to be near a timeout, then run SHOW FULL PROCESSLIST as root on both instances to reveal the queries that are deadlocking.
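For reference, the diagnostic statements involved; SHOW ENGINE INNODB STATUS additionally prints a TRANSACTIONS section showing which transactions hold locks and which are waiting:

```sql
SHOW FULL PROCESSLIST;        -- every connection and its full current query
SHOW ENGINE INNODB STATUS\G   -- lock and transaction detail for InnoDB
```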

Bob Morales
Aug 18, 2006



McGlockenshire posted:

Looks like you're hitting a deadlock.

Increase it again and wait for it to be near a timeout, then run SHOW FULL PROCESSLIST as root on both instances to reveal the queries that are deadlocking.

http://pastebin.us/463

That's the target server, the other one didn't have anything interesting

Sprawl
Nov 21, 2005



Bob Morales posted:

http://pastebin.us/463

That's the target server, the other one didn't have anything interesting

Looks like that site has been hijacked or something, paste it here.

Bob Morales
Aug 18, 2006



I'm dyslexic.

http://pastebin.us/643

isnoop
Jan 9, 2001

I used to be an admin,
but then I took an arrow
to the knee.
I'm having a complete brain fart here, so perhaps someone can lend a hand.

I have three tables:
Routes, Flights, and Fares.
* Routes contains rows that indicate you can fly between A and B and all A-B pairs are unique.
* Flights contains rows that describe each individual flight between A and B (AF 443 on July 14, 2011)
* Fares contains a history of every price we've seen reported for a given flight

I'm looking to run a single query that finds the most recent fares for each flight then picks the cheapest flight to every destination from a given origin.

Here's the best stab I've made so far:
code:
SELECT 
       *    # not really.  this is a simplified example.
FROM   route r
       INNER JOIN flight flt
         ON flt.routefk = r.id
       INNER JOIN fare fare
         ON flt.id = fare.flightfk
       LEFT OUTER JOIN fare fare2
         ON ( fare.flightfk = fare2.flightfk
              AND fare.id < fare2.id )
WHERE  r.status = 'active'
       AND flt.takeoffdate > '2011-07-13'
       AND flt.takeoffdate <= '2011-07-25'
       AND r.fromiata = 'JFK'
       AND fare2.id IS NULL
GROUP  BY r.id
ORDER  BY fare.price ASC
This returns data that looks correct and it works very quickly on large datasets. Unfortunately, the data it returns is wrong. I've found that the fare row returned isn't representative of the cheapest flight from A to B but is somewhere in the middle of the pack.

It would seem that I need to apply a sort to my GROUP, but that doesn't change the results. Any thoughts?

Thel
Apr 28, 2010

isnoop posted:

I'm having a complete brain fart here, so perhaps someone can lend a hand.

What's with fare2? What is it doing?

Also, when you're doing a composite query like this, it's probably easier to build it up in stages and test each stage. Just to make sure, you have a given origin (i.e. JFK), and want one result for each single outbound route, with the cheapest price in your date range?

McGlockenshire
Dec 16, 2005


Bob Morales posted:

I'm dyslexic.

http://pastebin.us/643

Lines 36-40: connection 5 (second row in the process list) is holding open a transaction that, for some reason, connection 4 (first row) wants to take a lock on. Unfortunately there's no clue there as to why it's blocking, as nothing interesting is listed in those four lines. Connection 4 is just trying to get a shared lock ("LOCK mode S locks rec but NOT gap waiting"), no idea why it'd be denied here. Does the tool make two connections?

There's lots of noise in Google about that particular lock state, but nothing looks immediately relevant. You may find Percona's SHOW INNODB STATUS walkthrough helpful.

McGlockenshire fucked around with this message at 06:48 on Jul 14, 2011

Goat Bastard
Oct 20, 2004

isnoop posted:

...

If your database supports analytic functions then you probably want something like

code:
WITH   sorted_fare AS
      (SELECT flightfk
             ,price
             ,row_number() OVER (PARTITION BY flightfk ORDER BY price) AS sort_order
       FROM   fare fare
      )
SELECT *
FROM   route r
JOIN   flight flt
       ON flt.routefk = r.id
JOIN   sorted_fare fare
       ON  flt.id = fare.flightfk
       AND fare.sort_order = 1
WHERE  r.status = 'active'
AND    flt.takeoffdate > '2011-07-13'
AND    flt.takeoffdate <= '2011-07-25'
AND    r.fromiata = 'JFK'

revmoo
May 25, 2006

#basta
I've just started working with MySQL Workbench, and if I'm logged into a SQL server and I've selected my schema and I type:

DROP DATABASE 'nameofdb';

and hit run, it says there is an error in my syntax. Now I'm sure that syntax is correct. What is the problem here?

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

revmoo posted:

I've just started working with mysql workbench, and if I'm logged into a sql server and I've selected my schema and I type:

DROP DATABASE 'nameofdb';

and hit run, it says there is an error in my syntax. Now I'm sure that syntax is correct. What is the problem here?

Single quotes ('') are for strings; try using backticks (``).
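For comparison (nameofdb as in the question):

```sql
DROP DATABASE 'nameofdb';   -- syntax error: single quotes make a string literal
DROP DATABASE `nameofdb`;   -- backticks quote an identifier, so this parses
DROP DATABASE nameofdb;     -- also fine when the name needs no quoting
```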
