Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
Okita
Aug 31, 2004
King of toilets.

Golbez posted:

That would seem to duplicate part of an audit trail table, since all of that data would be contained within that table as well. Basically, don't I want to keep all the metadata in one place, rather than half in the audit trail table and half in the data table?

On the other hand, if that's what's necessary to get said metadata into the audit trail table, and we want a true audit trail (that is to say, automatic rather than relying on PHP, or as I call it, "manual"), then that's the way we'll have to go, isn't it.

Do you have a data abstraction layer in PHP, or is the app just running straight-up SQL queries against the database?

If you do have a data abstraction layer, that's one place where you could put the logic in to do your audit trail thing every time something touches the database.

If you don't have a data abstraction layer, you'll have to hit basically everywhere in the code that makes SQL calls anyway to incorporate these new CreateById and LastModifiedById columns.

In such a scenario, I would encourage just writing even a superficial data abstraction layer to encapsulate this functionality. You know, maybe just a simple class that lets you call a method to execute some SQL. You could even add code in there to scrub the SQL and prevent SQL injection... It would centralize this aspect of your application's relationship to the database at least.
This class could also do your audit trail thingy. Make everything else run it's queries through that class and you have your problem solved(you'll have to hit everything else anyway if you wanna add those other columns).

E: Confuddled concepts... I'm used to just calling it a DAL, but it's not a data abstraction layer I'm talking about, it's called a data access layer.

Okita fucked around with this message at 23:27 on Sep 28, 2012

Adbot
ADBOT LOVES YOU

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Okita posted:

Do you have a data abstraction layer in PHP, or is the app just running straight-up SQL queries against the database?

If you do have a data abstraction layer, that's one place where you could put the logic in to do your audit trail thing every time something touches the database.

If you don't have a data abstraction layer, you'll have to hit basically everywhere in the code that makes SQL calls anyway to incorporate these new CreateById and LastModifiedById columns.

In such a scenario, I would encourage just writing even a superficial data abstraction layer to encapsulate this functionality. You know, maybe just a simple class that lets you call a method to execute some SQL. You could even add code in there to scrub the SQL and prevent SQL injection... It would centralize this aspect of your application's relationship to the database at least.
This class could also do your audit trail thingy. Make everything else run it's queries through that class and you have your problem solved(you'll have to hit everything else anyway if you wanna add those other columns).

I don't know if it counts as a data abstraction layer but yes, in the app we're talking about, there's a single point of modification for each table, so there's only five functions I'd have to modify.

Right now my crazy caffeine-shorted brain is considering the following model:
* Main table, stays as it is
* History table. This contains the current and all previous versions of the rows from the main table, rather than only recording changed columns. Disk space is cheap. This is kind of the meta data of the main table.
* The audit trail which is kind of the meta data of the history table. This records who made what change, the details of that change (was it through a web form, through an uploaded file, on what page, etc.).
* When main table is altered, a trigger fires which adds a record to the history table.
* It also adds a record to the audit trail table.
* Is it possible to return the last_insert_id of a triggered table? In this case, the *audit trail table*? Because then I could have PHP modify it to add the additional data of user ID, details, etc.

So we'd have, say, id 10 in the main table. It gets edited, stays 10, but a new history record, let's say id 222, get created in the history table. The audit trail table would then have a composite key of the two columns, 10-222. So on the website I could simply query from that and have it say "This record was edited on ... by ..., click here to view its details, click here to view the differences between the current record." And if PHP isn't around to handle this (e.g. this was triggered because I was messing around in the DB directly), then it will be obvious because there'd be no user ID to insert.

How hosed up is this idea and will I regret it in the morning?

Golbez fucked around with this message at 22:33 on Sep 28, 2012

glompix
Jan 19, 2004

propane grill-pilled

Golbez posted:

* Main table, stays as it is
* History table. This contains the current and all previous versions of the rows from the main table, rather than only recording changed columns. Disk space is cheap. This is kind of the meta data of the main table.

So you're not okay with an extra ModifiedByUserID field but you're okay with duplicating the entire table? In this implementation, I'd just dump the main table and only insert into the history table - never updating any row. That's kind of how your typical banking transaction database is used.

I'm not sure about MySQL, but in SQL Server, @@identity (but not scope_identity()) is changed when a trigger fires an insert to an identity-keyed table. There's probably something similar in MySQL.

butt dickus
Jul 7, 2007

top ten juiced up coaches
and the top ten juiced up players

Golbez posted:

* Is it possible to return the last_insert_id of a triggered table? In this case, the *audit trail table*? Because then I could have PHP modify it to add the additional data of user ID, details, etc.
If you're going to change the PHP anyway, just put an audit method in your user class that adds a record to the audit table.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

glompix posted:

So you're not okay with an extra ModifiedByUserID field but you're okay with duplicating the entire table? In this implementation, I'd just dump the main table and only insert into the history table - never updating any row. That's kind of how your typical banking transaction database is used.

I'm not sure about MySQL, but in SQL Server, @@identity (but not scope_identity()) is changed when a trigger fires an insert to an identity-keyed table. There's probably something similar in MySQL.

Well even if I had ModifiedByUserID in the main table, I'd probably still be storing a copy of the rows; a lot easier to code than storing a row for each changed column, and having to code in the handling thereof, which would be instantly obsolete if I changed the table schema. ... Then again, so would the audit table, but at least that's a single point of change, rather than having to change queries and triggers.

The point of this story is, I only started wondering about audit trails roughly yesterday so all this has been very quick and new to me. :)

Dumping the main table doesn't quite work in this regard, as this is indeed a transaction log, so I need to be able to say "select sum(amount) as balance from transactions where person_id = #". I'd rather not have to deal with only getting the ones with max(version), because MAX() and I have a very sour relationship, to the point of avoiding it every chance I get. Would there be any way to even do that without a subquery?

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Doctor rear end in a top hat posted:

If you're going to change the PHP anyway, just put an audit method in your user class that adds a record to the audit table.

Good point. Though the actual data would be inserted by the trigger. I guess I'm getting stuck on reading someone somewhere saying "If it requires PHP, it's not purely automatic and thus is not an audit trail." Which may be more dogmatic than I need.

Pardot
Jul 25, 2001




Golbez posted:

Good point. Though the actual data would be inserted by the trigger. I guess I'm getting stuck on reading someone somewhere saying "If it requires PHP, it's not purely automatic and thus is not an audit trail." Which may be more dogmatic than I need.

Use a transaction around the change and the audit log.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Pardot posted:

Use a transaction around the change and the audit log.

Good point, and new to me since we're only on myisam at the moment, but after this weekend will finally start to migrate to innodb.

That doesn't audit changes directly to the db, though...

No Safe Word
Feb 26, 2005

Why not use views and have the audit info on the table along with a flag indicating the nature of the record?

Table:
  • <your data rows>
  • created by userid
  • created datetime
  • changed by userid
  • changed datetime
  • status/flag (say, 0 = current, 1 = modified, 2 = deleted)

Then make a view that folks who only want current data get:
code:
select <your data rows> from table where flag = 0

Zombywuf
Mar 29, 2008

No Safe Word posted:

Why not use views and have the audit info on the table along with a flag indicating the nature of the record?

Table:
  • <your data rows>
  • created by userid
  • created datetime
  • changed by userid
  • changed datetime
  • status/flag (say, 0 = current, 1 = modified, 2 = deleted)

Then make a view that folks who only want current data get:
code:
select <your data rows> from table where flag = 0

If MySQL supports partial indexes this is a good idea. It means no app needs update permissions and minimises writes.

glompix
Jan 19, 2004

propane grill-pilled

No Safe Word posted:

Why not use views and have the audit info on the table along with a flag indicating the nature of the record?

Table:
  • <your data rows>
  • created by userid
  • created datetime
  • changed by userid
  • changed datetime
  • status/flag (say, 0 = current, 1 = modified, 2 = deleted)

Then make a view that folks who only want current data get:
code:
select <your data rows> from table where flag = 0

drat, this is really nice. Throw some indexed views `where flag = 0` on top of that and even better. Wonder what maintenance of that index would be like though... I'm actually not that fond of databases, but they are kind of fun to muse about.

big trivia FAIL
May 9, 2003

"Jorge wants to be hardcore,
but his mom won't let him"

All of you saying "disk space is cheap" is kind of funny because when you're processing/writing/storing hundreds of millions of transactions a day and each one of those needs to be stored literally ~*forever*~ and you can't just expand by a TB or 2 by throwing a couple hundred at Newegg, disk space is most certainly not cheap and 99% of people at my place would slap you in the mouth for saying so.

Pardot
Jul 25, 2001




-S- posted:

All of you saying "disk space is cheap" is kind of funny because when you're processing/writing/storing hundreds of millions of transactions a day and each one of those needs to be stored literally ~*forever*~ and you can't just expand by a TB or 2 by throwing a couple hundred at Newegg, disk space is most certainly not cheap and 99% of people at my place would slap you in the mouth for saying so.

Disk space for OLTP isn't that cheap, but storing your WAL archive forever is pretty cheap and mostly just as good.

big trivia FAIL
May 9, 2003

"Jorge wants to be hardcore,
but his mom won't let him"

Pardot posted:

Disk space for OLTP isn't that cheap, but storing your WAL archive forever is pretty cheap and mostly just as good.

We're talking about every change, every database transaction on an item that occurs through its life that needs to be readily accessibly 24/7 for 10 years and pulled at a moment's notice, and after 10 years it can be archived for "next day" pull should the client require it. And we're talking of your smallest of clients having over 1,000 transactions a day on millions of records.

Thel
Apr 28, 2010

Re: GUIDs vs sequential ints - GUIDs are the devil. If you're just mucking around with a toy database they're fine, but for any non-trivial workload you're just making things harder and harder for yourself by using GUIDs. Read Kimberley Tripp's posts here - GUIDs (especially random ones) cause horrendous slowdowns in typical use cases.

If you absolutely need them, that's fine. However they should not be your default primary key, and they very definitely should not be your default clustering key. (Remember that a clustered key gets included at the leaf level of every non-clustered index, so if you use a GUID instead of an int as the clustering key on a table that has 200 million rows and 5 NC indexes, that's an additional 30 gigabytes of space utterly wasted.)

Zombywuf
Mar 29, 2008

-S- posted:

We're talking about every change, every database transaction on an item that occurs through its life that needs to be readily accessibly 24/7 for 10 years and pulled at a moment's notice, and after 10 years it can be archived for "next day" pull should the client require it. And we're talking of your smallest of clients having over 1,000 transactions a day on millions of records.

In this case disk should be cheap because your clients should be paying for it :-)

Orbis Tertius
Feb 13, 2007

Golbez posted:

Good point, and new to me since we're only on myisam at the moment, but after this weekend will finally start to migrate to innodb.

That doesn't audit changes directly to the db, though...

Something to note is that MyISAM has a very useful feature for history tables, which is you can do multiple sequences by setting the last column in a multi column index to be auto incrementing. In a history table it looks something like this:

code:

ID    revision   action      description
1     1         'insert'     init value for ID = 1
1     2         'update'     changed value for ID = 1
2     1         'insert'     init value for ID = 2
3     1         'insert'     init value for 3 
1     3         'update'     changed value on 1 etc, etc


You can convert existing tables to fancy pants sequenced versions of themselves like this:

code:

CREATE TABLE MyDB.data_history LIKE MyDB.data

ALTER TABLE MyDB.data_history MODIFY COLUMN your_pk_column int(11) NOT NULL, 
   DROP PRIMARY KEY, ENGINE = MyISAM, ADD action VARCHAR(8) DEFAULT 'insert' FIRST, 
   ADD revision INT(6) NOT NULL AUTO_INCREMENT AFTER action, 
   ADD PRIMARY KEY (your_pk_column, revision)

Here are trigger queries that work with the above:

code:

CREATE TRIGGER MyDB.data__ai AFTER INSERT ON mydb.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'insert', NULL, d.* 
    FROM MyDB.data AS d WHERE d.id = NEW.id;

CREATE TRIGGER MyDB.data__au AFTER UPDATE ON MyDB.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'update', NULL, d.*
    FROM MyDB.data AS d WHERE d.id = NEW.id;

CREATE TRIGGER MyDB.data__bd BEFORE DELETE ON MyDB.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'delete', NULL, d.* 
    FROM MyDB.data AS d WHERE d.id = OLD.id;

Orbis Tertius fucked around with this message at 05:29 on Oct 5, 2012

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
Usually our load is around 1-2. When it passes 3, the CS people start to bitch. When it passes 6, the site becomes unusable. We have 16 cores, by the way, so this has always annoyed me; in the past it's usually been because the HDD needle is being pulled between MySQL and something else.

So I get in this morning and load is at 10. And nothing bad is running. The site is fast. htop shows all core with minimal activity.

Before the weekend I set myisam_repair_threads to 2 (we only just upgraded to 5.5 so I haven't switched from MyISAM to InnoDB yet), but that's it. Suddenly, our two-week-old installation of MySQL 5.5 has decided, I guess, to use more than one core all the time? Does anyone know if this might be what happened and why it would start now? Or could Linux be screwing up the load reporting somehow?

Zombywuf
Mar 29, 2008

Golbez posted:

Usually our load is around 1-2. When it passes 3, the CS people start to bitch. When it passes 6, the site becomes unusable. We have 16 cores, by the way, so this has always annoyed me; in the past it's usually been because the HDD needle is being pulled between MySQL and something else.

So I get in this morning and load is at 10. And nothing bad is running. The site is fast. htop shows all core with minimal activity.

Before the weekend I set myisam_repair_threads to 2 (we only just upgraded to 5.5 so I haven't switched from MyISAM to InnoDB yet), but that's it. Suddenly, our two-week-old installation of MySQL 5.5 has decided, I guess, to use more than one core all the time? Does anyone know if this might be what happened and why it would start now? Or could Linux be screwing up the load reporting somehow?

Linux load reporting is a bit voodoo, however it does (in modern kernels at least) include those threads that are waiting on disk IO. Based on what you're seeing I would first check for swapping and then (somewhat relatedly) how much free RAM you have. More RAM means bigger disk caches means less processes waiting to read from the disk means less load. If you are seeing any swapping whatsoever then you need to run, not walk, to your supplier and get more RAM.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Zombywuf posted:

Linux load reporting is a bit voodoo, however it does (in modern kernels at least) include those threads that are waiting on disk IO. Based on what you're seeing I would first check for swapping and then (somewhat relatedly) how much free RAM you have. More RAM means bigger disk caches means less processes waiting to read from the disk means less load. If you are seeing any swapping whatsoever then you need to run, not walk, to your supplier and get more RAM.

Thing is, nothing has changed in our setup in the last few days. On Friday, if load was above 4 things would slow down, but now it's gotten all the way up to 16, and is staying there, and the site is still perfectly usable. Now, when it spiked to 19, then people reported problems.

Swap is very low, htop reporting 152/16378MB; memory usage is acceptable (4gb/16gb currently reported used); all cores are at 10% or below.

Two weeks ago we went from MySQL 5.1 to 5.5, maybe it just took two weeks to realize it could run amok?

Zombywuf
Mar 29, 2008

Golbez posted:

Swap is very low, htop reporting 152/16378MB; memory usage is acceptable (4gb/16gb currently reported used); all cores are at 10% or below.

Is the 4gb including buffers? i.e. is the free stat at 12gb?

If so then I can't think of anything beyond backup and restore to a new machine and see what happens. Otherwise, how big is the db compared to buffered disk? You might have hit a threshold where it suddenly needs to use the disk.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Zombywuf posted:

Is the 4gb including buffers? i.e. is the free stat at 12gb?

If so then I can't think of anything beyond backup and restore to a new machine and see what happens. Otherwise, how big is the db compared to buffered disk? You might have hit a threshold where it suddenly needs to use the disk.

Backup and restore, but ... why? The system's working fine. I'm just curious why it's suddenly reporting a 16 load (which implies that something, and that something is almost certainly mysql, is now running on all 16 cores) rather than the 1 load it used to.

The DB has not appreciably changed in size over the last two weeks, and its drive is still more than 50% free.

Memory: free reports total 15933, used 13689, free 2243, buffers 610. Swap 16378, used 152, free 16226.

One issue we did have is that, for some reason, load spiked to 19, and a bunch of zombie connection showed up on MySQL and both slaves. This usually happens when things get too slow, restarting mysqld dumped them but didn't dump the 16 core usage. I should turn it off entirely tonight and see what happens to load.

Zombywuf
Mar 29, 2008

Golbez posted:

Backup and restore, but ... why?

To isolate a hardware fault, if there is one.

quote:

One issue we did have is that, for some reason, load spiked to 19, and a bunch of zombie connection showed up on MySQL and both slaves. This usually happens when things get too slow, restarting mysqld dumped them but didn't dump the 16 core usage. I should turn it off entirely tonight and see what happens to load.

You might have an issue at the application layer. If something's doing aggressive reconnection attempts for some reason. Other than that I don't know, a MySQL expert may have some ideas.

We had a similar issue recently where load suddenly spiked at around 6500 when a problem at the db caused a thundering herd of reconnections. The machine was still usable enough to kill everything and restart strangely.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
I forgot to point out that it appears all of our scheduled processes finished much faster than usual over the weekend. My theory is still that MySQL decided to notice there were 16 cores and use them, my question is, why did it take two weeks?

edit: lol there may be a far more mundane solution to this, due to a LAN config issue we have a bunch of processes taking up 0% CPU yet appear to be counted in the load number

Golbez fucked around with this message at 20:54 on Oct 8, 2012

Zombywuf
Mar 29, 2008

Golbez posted:

edit: lol there may be a far more mundane solution to this, due to a LAN config issue we have a bunch of processes taking up 0% CPU yet appear to be counted in the load number

Are processes waiting on network IO counted in the load number?

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Zombywuf posted:

Are processes waiting on network IO counted in the load number?

It would seem so, because I'm able to reliably increase the load number by loading df, which then freezes when it tries to display info on a network share.

Zombywuf
Mar 29, 2008

Golbez posted:

It would seem so, because I'm able to reliably increase the load number by loading df, which then freezes when it tries to display info on a network share.

Ah, that's file IO that's blocking. Just the underlying subsystem is network based. That makes sense.

Pilsner
Nov 23, 2002

So I installed SQL Server Management Studio 2012.

1) For every single query window I open, I have to press Ctrl+Alt+Space to force it into a usable intellisense mode.
2) Intellisense still stops working randomly.
3) There's still no decent solution setup where you can save your list of frequently connected databases, and have permanent query files.

Thanks for nothing Microsoft. Absolute poo poo.

JeansW
Aug 4, 2004
Yes. Like the jeans...

Pilsner posted:

So I installed SQL Server Management Studio 2012.

1) For every single query window I open, I have to press Ctrl+Alt+Space to force it into a usable intellisense mode.
2) Intellisense still stops working randomly.
3) There's still no decent solution setup where you can save your list of frequently connected databases, and have permanent query files.

Thanks for nothing Microsoft. Absolute poo poo.

Wait, so the "Registered Servers" window from previous years no longer exists? Also, Intellisense will always be poo poo for SQL until the language itself gets rewritten to be similar to LINQ.

glompix
Jan 19, 2004

propane grill-pilled
My coworkers tout RedGate's SQL Prompt as being a lot better than the built in Intellisense. We have it here at work, but it brings my 3 year old development laptop to a halt about 20% of the time I try to use it. It used to be a lot worse for me, so at least it's improving. I'm inclined to give a company like RedGate the benefit of the doubt and chalk the product's failings for me up to our awful schemas.

Still shouldn't need a third party product to fix problems in a crazy expensive piece of software like this.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
I've been using the WEEK(date, 3) function in MySQL to aggregate daily data into weekly data, which from what I understand will follow the ISO-8601 definition of a week number.

I've been asked to support different locales, and I'm not sure how to support one like ar_SA (Saudi Arabia), since the first day of their week is a Saturday. The MySQL WEEK() function doesn't seem to have a mode for Saturday being the first day of the week. Any ideas?

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

fletcher posted:

I've been using the WEEK(date, 3) function in MySQL to aggregate daily data into weekly data, which from what I understand will follow the ISO-8601 definition of a week number.

I've been asked to support different locales, and I'm not sure how to support one like ar_SA (Saudi Arabia), since the first day of their week is a Saturday. The MySQL WEEK() function doesn't seem to have a mode for Saturday being the first day of the week. Any ideas?

Something like ... "SELECT IF(DAYOFWEEK(date) = 7, WEEK(date, 3) - 1, WEEK(date, 3)) AS week_number" ? And use that idea to populate where clauses and the like.

abelwingnut
Dec 23, 2002


I've been learning SQL on the job over the last 3 months. Pretty useful, but it's definitely got some weird characteristics. Are there any more recent, friendlier scripting languages that are in vogue these days? I realize SQL's old and there's probably something 'better', at least in some sense, out there. Just curious what the alternatives are.

Mac friendly would be nice too.

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice

Abel Wingnut posted:

I've been learning SQL on the job over the last 3 months. Pretty useful, but it's definitely got some weird characteristics. Are there any more recent, friendlier scripting languages that are in vogue these days? I realize SQL's old and there's probably something 'better', at least in some sense, out there. Just curious what the alternatives are.

Mac friendly would be nice too.

The short, glib answer:
Nope, suck it up and get good at SQL. It's actually well suited to its purpose (querying relational datastores) despite the occasional annoyance. When I interview, I strongly favour developers that understand relational theory, can write SQL well, and recognize the huge performance costs of a poorly designed data model. For context, I only started my career 11 years ago, so I'm not some graybeard ORACLE wanker. :)


The longer, hopefully more explanatory answer:

For querying a relational database it's the defacto standard and the only alternative I've ever heard of that gained any traction at all is Quel (which I know little about but the syntax isn't completely unlike SQL because it's also specific to dealing with relational databases).

While there are some missing niceties (such as natural joins, though these were added to a recent ANSI SQL draft I believe) and some weirdness (such as having to define columns before tables and joins), I'm not sure what you mean by "weird characteristics" as it's generally well suited to its purpose (though I personally wish the syntax was less verbose). Perhaps you could give some examples and we can explain them. It could also be that you are dealing with a poorly designed database; that can turn anybody off of SQL.

If you do some deeper reading into the subject you might find that database vendors/developers have already circumvented a number of fundamental features of a strictly normalized relational database in the name of making SQL easier and more friendly to use, so it could be a lot worst (imagine a world without NULLable columns for example; this one is especially contentious due to the special handling NULLs require in JOINs for example).

If you want to avoid SQL while still querying an RDBMS there are alternatives but they tend to be dedicated querying libraries tied to existing programming languages or ORMs such as LINQ, HQL, SchemeQL, and so on. However, they all spit out SQL in the end. This is something you'll just have to accept and be prepared to deal with, much like every web developer needs to know JavaScript even if they prefer coding in CoffeeScript.

At the storage level, there are alternatives to SQL databases with their own advantages and disadvantages. Your best bet would be to learn proper RDBMS theory and ANSI-SQL (something most CS programs don't do, shockingly enough), as well as the alternatives, and use the one that makes the most sense in each situation whenever possible. There's a reason why big tech companies like Google use a mix of SQL (MySQL for adwords, though I guess they've move that to F1) and NoSQL (BigTable for Google Maps) these days.

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

Pretty well what he said. SQL is the de facto standard for a lot of reasons and frankly nothing has really evolved to compete with it.

Once you get out of relational there are a few new options (e.g. things like MongoDB) where the data is presented as a series of objects/collections and you use your language of choice to manipulate them.

Sprawl
Nov 21, 2005


I'm a huge retarded sperglord who can't spell, but Starfleet Dental would still take me and I love them for it!
LINQ and the other Nosql alternatives have SQL model in them anyways so no SQL is really something need to learn if you are doing anything database related

glompix
Jan 19, 2004

propane grill-pilled

Sprawl posted:

LINQ and the other Nosql alternatives have SQL model in them anyways so no SQL is really something need to learn if you are doing anything database related

This might be pedantic, but I think your post may confuse. LINQ is just a query language built on .NET. It can generate SQL, and often does in practice. Many ORMs have their own integrated query language, which always seems so redundant to me excepting LINQ. NoSQL refers to a class of nonrelational database software, and so SQL is involved nowhere in the process. Knowledge of SQL would be as useful as knowledge of LISP in them.

Zombywuf
Mar 29, 2008

Cold on a Cob posted:

The short, glib answer:
Nope, suck it up and get good at SQL. It's actually well suited to its purpose (querying relational datastores) despite the occasional annoyance. When I interview, I strongly favour developers that understand relational theory, can write SQL well, and recognize the huge performance costs of a poorly designed data model. For context, I only started my career 11 years ago, so I'm not some graybeard ORACLE wanker. :)

While I do generally agree with your answer I think it's important to point out that SQL and RDBMS's are different things. The problem with replacing SQL with a better language is essentially the first-mover problem. Although you would struggle to make a language with more features than SQL, you could make one with much nicer syntax. The problem comes when you have to retrain all your developers (which shouldn't really take long but businesses are extremely conservative) and that most of the NoSQL crowd seem to want to shed RDBMS's and keep the SQL part :psyduck: (see, GQL, VQL, etc...)

SQL is a really terrible language, I seem to spend half my time using it checking where exactly a particular clause goes, whether this clause needs brackets or whether a particular statement needs an 'as'.

It would be nice to be able to use Datalog. :unsmith:

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice

Zombywuf posted:

While I do generally agree with your answer I think it's important to point out that SQL and RDBMS's are different things. The problem with replacing SQL with a better language is essentially the first-mover problem. Although you would struggle to make a language with more features than SQL, you could make one with much nicer syntax. The problem comes when you have to retrain all your developers (which shouldn't really take long but businesses are extremely conservative) and that most of the NoSQL crowd seem to want to shed RDBMS's and keep the SQL part :psyduck: (see, GQL, VQL, etc...)

Agreed. But I wanted to make it clear that SQL is what it is because it is tied to RDBMS's.

Zombywuf posted:

SQL is a really terrible language, I seem to spend half my time using it checking where exactly a particular clause goes, whether this clause needs brackets or whether a particular statement needs an 'as'.

So wait, you're complaining about the language because you forget how to use it? I put my HAVING clauses in the wrong place all the time and I always forget the syntax for common table expressions, but that's not SQL's fault. There are plenty of valid complaints about SQL and I agree it could be improved, but your statement is quite vague.

Some specific complaints I personally have with SQL and it's implementations:
1. Wordy syntax; I personally want compact, succinct syntax. Give me a c/c#/java style RDBMS query language anyday, please! :)
2. Further to point 1, I hate the quirks with making it fit into a natural language syntax. Logically, the select clause of a select statement should come last, after you define your from clause, joins, where clause, etc. This would also help with intellisense-like features. LINQ fixes this.
3. Proprietary extensions which introduce incompatibilities between RDBMS implementations (ever ported an ORACLE application to SQL Server? I have :smith:)
4. In contrast to proprietary extensions, most SQL implementations are also missing a lot of the newer ANSI standard features, such as natural joins
5. The only way to override the query engine optimizer is usually via proprietary SQL hints, which the engine can and will cheerfully ignore. I've only ever really needed to do this a few times and this was almost ten years ago, but it's frustrating when it happens. Further, a bad engine can really make things hard for you - I seem to recall a day when ORACLE used to evaluate your where/join clause statements in the order provided and this causing problems, but I've been mostly using SQL Server for years and haven't had problems like this in awhile. I'm sure ORACLE is much better now this was in the 8i days. :)
6. Object/Relational impedance can really suck.
7. Code reuse is usually pretty difficult, though with (again) proprietary extensions there is a lot you can do in the major RDBMS's here.
8. Not enough warnings - for example, warn me if I join on a NULLable column without an ISNULL or if my query is doing full table-scans. This is where the declarative nature of SQL works against it.
9. UPDATE and DELETE statement should always require a WHERE clause, as it's way too easy to clobber a whole table, i.e.
code:
update foo set bar = 'foobar' 
where true
instead of just
code:
update foo set bar = 'foobar' 
That's all I can think of off the top of my head. I will say I might have been a bit unfair to the original post I responded to; my experience is most new developers hate SQL mostly due to their experience and exposure in other (OO or functional) languages and a lack of understanding of relational theory and normalization and that might have coloured my response. Perhaps he or she has the same problems I do with SQL, which have more to do with specific implementation level quirks rather than thinking "relational theory is dumb old school stuff I don't need to know".

Zombywuf posted:

It would be nice to be able to use Datalog. :unsmith:

I have little exposure to Datalog, but I am told the good news is some of the features of datalog (like common table expressions for recursion) are making it into SQL so don't lose all hope. :)

Cold on a Cob fucked around with this message at 16:31 on Oct 14, 2012

Adbot
ADBOT LOVES YOU

Zombywuf
Mar 29, 2008

Cold on a Cob posted:

So wait, you're complaining about the language because you forget how to use it?
If there was any rhyme or reason to the syntax it would be much easier to remember. Every feature has its own special syntax. I mean, imagine if instead of verbs we had different syntactical forms in English.

Anyway, in spirit of imagining a world where we get to dictate what languages people use so I can have my pony:

quote:

1. Wordy syntax; I personally want compact, succinct syntax. Give me a c/c#/java style RDBMS query language anyday, please! :)
I actually quite like the words, I find reaching for punctuation to more typing effort than typing a word. But then again I also think the language should be mauve.

quote:

2. Further to point 1, I hate the quirks with making it fit into a natural language syntax. Logically, the select clause of a select statement should come last, after you define your from clause, joins, where clause, etc. This would also help with intellisense-like features. LINQ fixes this.
3. Proprietary extensions which introduce incompatibilities between RDBMS implementations (ever ported an ORACLE application to SQL Server? I have :smith:)
4. In contrast to proprietary extensions, most SQL implementations are also missing a lot of the newer ANSI standard features, such as natural joins
5. The only way to override the query engine optimizer is usually via proprietary SQL hints, which the engine can and will cheerfully ignore. I've only ever really needed to do this a few times and this was almost ten years ago, but it's frustrating when it happens. Further, a bad engine can really make things hard for you - I seem to recall a day when ORACLE used to evaluate your where/join clause statements in the order provided and this causing problems, but I've been mostly using SQL Server for years and haven't had problems like this in awhile. I'm sure ORACLE is much better now this was in the 8i days. :)
Yes, totally, all of this.

quote:

6. Object/Relational impedance can really suck.
I let other people worry about objects most of the time :-)

quote:

7. Code reuse is usually pretty difficult, though with (again) proprietary extensions there is a lot you can do in the major RDBMS's here.
No language does good code re-use. SQL does tend to be quite easy to copy and paste chunks of.

quote:

8. Not enough warnings - for example, warn me if I join on a NULLable column without an ISNULL or if my query is doing full table-scans. This is where the declarative nature of SQL works against it.
But what about when the table scan is the fastest approach? A standard set of event/messages would be nice, where you would be able to enable specific warnings about table scans or just generic stats. Preferably without having a big list of fixed strings with fixed ids. See also: proprietary extensions.

quote:

9. UPDATE and DELETE statement should always require a WHERE clause, as it's way too easy to clobber a whole table,
But that would take all the fun out of it :-)

quote:

Perhaps he or she has the same problems I do with SQL, which have more to do with specific implementation level quirks rather than thinking "relational theory is dumb old school stuff I don't need to know".
I'd quite like to just write relational calculus...

quote:

I have little exposure to Datalog, but I am told the good news is some of the features of datalog (like common table expressions for recursion) are making it into SQL so don't lose all hope. :)
It is pretty much entirely the language itself (i.e. syntax and set of keywords) that I dislike, so I'd prefer they didn't add new features and let it wither and die so a better syntax could come along. As for Datalog I pretty much love Prolog in a totally irrational way therefore Datalog.

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply