McGlockenshire
Dec 16, 2005

GOLLOCKS!

Hammerite posted:

As an additional thought, googling tells me that in order to use foreign keys in MySQL I need to switch the tables that will use them from the MyISAM engine to InnoDB. Idle reading suggests that MyISAM is a generally somewhat higher-performance engine than InnoDB but that since in particular some of my tables are UPDATE'd quite often in comparison with how often they are read, InnoDB with foreign keys may or may not be superior. It is tough for me to know what is optimal.
InnoDB and MyISAM no longer have major performance differences when the server is properly configured, i.e. not using defaults. Using FKs always incurs a performance penalty due to the lookups required, but that's true with any database from any vendor. It's better to eat the possible performance penalty and only worry about it if it becomes a measurable issue.

The only legitimate reason to use MyISAM over InnoDB is for fulltext indexing, and there are plenty of other ways to go about searching (e.g. Sphinx, though that's another can of worms).

McGlockenshire
Dec 16, 2005

GOLLOCKS!
You might have more luck attacking INFORMATION_SCHEMA. None of the special info MySQL provides can be treated like a table, which is very silly.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

awdio posted:

Eventually I want to ignore spaces, case and characters that aren't 0-9 and a-z.
Well, this should ignore spaces:
code:
SELECT prodName FROM products WHERE REPLACE(prodName, ' ', '') LIKE '%stuff%'
But for the rest, you'll probably just want to create an entire new column that contains the search-safe data you're looking for. Beware that this entire thing is a Bad Idea(tm) and is very likely to produce bogus search results that will confuse users.
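
If you do go the extra-column route, a rough sketch might look like this. The column name is made up, and keeping it in sync is on you (or a trigger); anything fancier than stripping spaces and case is easier to do in application code before you write the row:
code:
ALTER TABLE products ADD COLUMN prodNameSearch VARCHAR(255) NOT NULL DEFAULT '';
-- One-off backfill: lower-case it and strip spaces.
UPDATE products SET prodNameSearch = LOWER(REPLACE(prodName, ' ', ''));
-- Then search the pre-cleaned column with a pre-cleaned term.
SELECT prodName FROM products WHERE prodNameSearch LIKE '%widget%';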

McGlockenshire
Dec 16, 2005

GOLLOCKS!
MySQL uses collations when thinking about case sensitivity. The default collation is case insensitive.

Can we have a sample of a few real product names, and a sample of a real search string?

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Who or what is providing that search term?

I'm gonna fall back to my previous suggestion of just adding a column explicitly for the condensed search terms and populating it programmatically. There's no way in SQL to replace ranges of characters, regex-style. Not without a REPLACE() chain a few dozen layers deep, and that's just stupid.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Doctor rear end in a top hat posted:

MySQL engines: Right now I'm using the default MyISAM engines and I'm thinking about switching to InnoDB for foreign keys. I'm currently using triggers to perform the same cascading deletes/updates that foreign keys would provide. Is there any reason I should switch? I don't have need right now for row-level locking or transactions. If I do switch, is there an easier way to do it than copying my data to a new table with the same schema (but different engine), deleting the old one and then renaming the new one?
Emphasis added.

There is no reason not to use InnoDB, especially given that you're already doing something insanely silly to work around not using it. Converting is easy:
code:
ALTER TABLE foo ENGINE=InnoDB;
This basically rewrites the table, so it might take a while. Before doing this, if you're the server admin, you should examine the various innodb_ settings available to you and decide if you need to change them from their defaults.
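
For example, the usual suspects look something like the following. The values here are placeholders, not recommendations; they depend entirely on your RAM and workload:
code:
# my.cnf sketch -- placeholder values, tune for your own hardware
[mysqld]
innodb_buffer_pool_size        = 1G     # the biggest single win; most of your free RAM
innodb_log_file_size           = 256M   # bigger logs smooth out heavy writes
                                        # (older versions need a clean shutdown and the
                                        #  old ib_logfile* moved aside to change this)
innodb_flush_log_at_trx_commit = 1      # 1 = full durability; 2 trades safety for speed
innodb_file_per_table          = 1      # one tablespace file per table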

McGlockenshire
Dec 16, 2005

GOLLOCKS!
If you are running entirely InnoDB, you probably should look at Percona's "XtraBackup". It works without locking the entire database, and hey, it's OSS.

And if you aren't running InnoDB, fix that.

McGlockenshire
Dec 16, 2005

GOLLOCKS!
It honestly might be easier to separate the two pieces of data into two separate tables, one to store *only* first-time logins, and one to store *only* activity markers. You can then combine this with INSERT IGNORE and a unique key on the id/timestamp column to blindly fire updates without worrying about the state of the other column.

It's a workaround and not an elegant solution. It reduces one area of complexity (entering data) while increasing another area of complexity (reporting on both sets).
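
A rough sketch of what that split might look like, with invented names you'd adjust to your schema:
code:
-- One row per user, written exactly once.
CREATE TABLE first_logins (
    user_id    INT UNSIGNED NOT NULL PRIMARY KEY,
    first_seen DATETIME NOT NULL
) ENGINE=InnoDB;
-- One row per user per activity marker.
CREATE TABLE activity (
    user_id INT UNSIGNED NOT NULL,
    seen_at DATETIME NOT NULL,
    PRIMARY KEY (user_id, seen_at)
) ENGINE=InnoDB;
-- Fire both blindly; the IGNORE turns the duplicate first-login into a no-op.
INSERT IGNORE INTO first_logins (user_id, first_seen) VALUES (42, NOW());
INSERT IGNORE INTO activity (user_id, seen_at) VALUES (42, NOW());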

McGlockenshire
Dec 16, 2005

GOLLOCKS!

The pHo posted:

I see on edit you've put square brackets around some of the variables... what is that for? (Everything is working as I hoped so I presume it's just a 'good measures' thing?)

MySQL's default setting is to use the backtick to escape identifiers.

MSSQL's default setting is to use square brackets.

The ANSI standard dictates double quotes, FWIW.

Really though, you don't need to escape most identifiers, just ones that could be mistaken for SQL keywords, like DESC and DATE, though most SQL parsers are smart enough to not screw it up.
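
For illustration, the same (made-up) identifiers quoted each way:
code:
SELECT `order`, `desc` FROM `my table`;   -- MySQL backticks
SELECT "order", "desc" FROM "my table";   -- ANSI double quotes (and MySQL with ANSI_QUOTES enabled)
SELECT [order], [desc] FROM [my table];   -- MSSQL square brackets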

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Out of morbid curiosity, are you using that UUID as a primary key, and are you also using InnoDB?

If so, you should be aware that InnoDB stores rows on disk in the order of the primary key. Other databases call this a "clustered index." UUIDs are not guaranteed to be anything near sequential, so you may encounter a performance hit when entering new rows if things need to be shuffled around.
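
One common layout that sidesteps the random-insert problem (just a sketch, not necessarily what you need) keeps a small auto-increment surrogate as the clustered primary key and demotes the UUID to a secondary unique index, so inserts stay append-only:
code:
CREATE TABLE things (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- clustered, always appends
    uuid CHAR(36) NOT NULL,
    UNIQUE KEY uq_things_uuid (uuid)                         -- still lets you look rows up by UUID
) ENGINE=InnoDB;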

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Hammerite posted:

Also, you described it as "basically a Varchar(36)..." Looking up UUIDs it looks like they're unvarying in length, so a CHAR(36) would presumably be better.

I was about to say "except when MySQL decides that your CHAR should be a VARCHAR because it's loving retarded and knows better than you", but I couldn't make it do that in 5.1. When did they fix this? It makes me happy.

e: They fixed that in 5.0. Amazing.

McGlockenshire
Dec 16, 2005

GOLLOCKS!
That won't help much if it's limited server resources that are the problem. One instance could easily consume all available I/O.

But if the other database is small enough to be kept in RAM, and both instances are tuned properly, it could work out.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Golbez posted:

Let's say I have a status history table...

Totally doable. I think. This will use MySQL's don't-have-to-GROUP BY-all-the-columns logic, which may not work in not-MySQL databases. You did not specify which SQL server you're using, but you're using PHP, and PHP and MySQL go together like Dumb and Dumber.

code:
SELECT `User ID`, `Status`, `Date`, Newest_Data.*
  FROM Status_History,
       (SELECT `User ID` AS nd_uid, `Status` AS nd_status, `Date` AS nd_date
          FROM Status_History
         GROUP BY `User ID`
         ORDER BY `Date` DESC
        ) AS Newest_Data
 WHERE Newest_Data.nd_uid = Status_History.`User ID`
   AND Newest_Data.nd_date > Status_History.`Date`
 GROUP BY `User ID`
 ORDER BY `Date` DESC
What this should give you is two sets of columns. The right-most set will be each user's newest status and the date that status was given. The left-most set will be the user's immediately previous status and its date. Once it's shown to work, you can add Status_History.`Status` = Newest_Data.nd_status to the WHERE clause to only catch those with two updates of the same status in a row.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Golbez posted:

Thanks, though I'm not really getting it to work, but... immediate last status? What if there's three in a row? :v:

Uh, yeah, dunno about that one. Time to break out the PHP.

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Keep in mind that different cultures have different naming rules. Some have tribal names, some place the family name first, then the person's given name. Some have family names that are derived from their parents. Some have family names that have multiple words. Some people have multiple middle names.

Creating rules for Western-pattern names is going to end up excluding someone, and knowing how these things tend to work, that someone will proceed to raise a royal stink.

For these reasons, the most workable approach is to give the user one and only one name field, store what they enter verbatim, and only then try to parse it.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Sharrow posted:

Does Postgres have anything that approaches high availability? There seem to be like 20 different third-party solutions and I can't always tell what's semi-official and what's a hacked-up Perl script.

Here's the official word on the matter. As of 9.0, it supports hot-standby / read-only slaves without third party solutions. It does this using async WAL shipping, akin to MySQL's binary replication log. They're working on synchronous replication in 9.1.

For failover and reader load balancing, pgpool-II seems like one of the better choices.

If you need multiple writers (multi-master), you'll need one of the many trigger-based third-party solutions. Bucardo seems to be a popular and relatively sane one.

I wish I had something to recommend. We're still deciding how we're going to replace our MySQL multi-master setup, and are debating either Bucardo or making our app perform the replication itself through a message queue. The only reason we're considering doing it ourselves is because we don't use replication for load balancing, but as a communication mechanism between the front-facing website and the in-house bits.

McGlockenshire fucked around with this message at 21:57 on Nov 15, 2010

McGlockenshire
Dec 16, 2005

GOLLOCKS!
e: n/m, contradictory data.

McGlockenshire fucked around with this message at 07:51 on Dec 30, 2010

McGlockenshire
Dec 16, 2005

GOLLOCKS!
They could also use "shouldn't use SQL" to mean "don't need ACID," referring to one of the new shard-based/distributed "NoSQL" key-value stores that sacrifice one or more of the ACID principles in the name of performance.

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Have you considered adding triggers on the inventory table instead? You can use this to capture before-and-after data into another table, which may prove more useful than a daily snapshot. You can even build the daily snapshot from that data by effectively re-playing the log.

(Any sane inventory system should already be doing this, either at the table level or at the application level...)
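
A minimal sketch of that kind of audit trigger, assuming a simple inventory table with a quantity column (all names here are invented):
code:
CREATE TABLE inventory_log (
    log_id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    item_id    INT UNSIGNED NOT NULL,
    old_qty    INT,
    new_qty    INT,
    changed_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB;
DELIMITER //
CREATE TRIGGER inventory_audit_update
AFTER UPDATE ON inventory
FOR EACH ROW
BEGIN
    -- Capture before-and-after quantities; re-playing this log rebuilds any daily snapshot.
    INSERT INTO inventory_log (item_id, old_qty, new_qty)
    VALUES (NEW.item_id, OLD.quantity, NEW.quantity);
END//
DELIMITER ;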

McGlockenshire
Dec 16, 2005

GOLLOCKS!
I think that PhotoRealAuthorisationRequired is premature optimization. You've only allocated a 16-bit integer for your photo identifier. You'll run out of photo ids way, way before the join or union you'd use to calculate viewer permissions from the other tables becomes a performance concern.

You can easily make that table a view and replace it with a table+triggers later on if you identify it as an actual performance hog. The information can also be cached at the application level instead.
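
For instance, a hedged sketch of what the view version could look like, guessing at column names from your table names (treat these as placeholders):
code:
CREATE VIEW PhotoRealAuthorisationRequired AS
    SELECT PhotoAuthorisationRequired.PhotoId, PhotoAuthorisationRequired.AuthorisationId
      FROM PhotoAuthorisationRequired
    UNION
    SELECT CategoryMembership.PhotoId, CategoryAuthorisationRequired.AuthorisationId
      FROM CategoryMembership
      JOIN CategoryAuthorisationRequired
        ON CategoryAuthorisationRequired.CategoryId = CategoryMembership.CategoryId;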

Otherwise, the rest looks fine.

Related, but unimportant if you're the only dev: your naming scheme is funny. CategoryAuthorisationRequired is a link table between Category and Authorisation. Consider naming it "CategoryAuthorisations" instead. Same for PhotoAuthorisationRequired => PhotoAuthorisations. CategoryMembership is a link table between Category and Photo. Consider "PhotoCategories" instead. The shorter names make it very clear to other coders that these are link tables. PhotoRealAuthorisationRequired could become, say, PhotoCalculatedAuthorisation.

McGlockenshire fucked around with this message at 19:02 on Jan 24, 2011

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Someone somewhere recently brought up a technique to represent a parent/child/tree relationship in a relational database without happy things like WITH RECURSIVE. If you have that in your database engine of choice, use it, it's a no-brainer.
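
For reference, the WITH RECURSIVE version over a plain parent_id column looks roughly like this in an engine that supports it (table and column names are made up):
code:
WITH RECURSIVE subtree AS (
    SELECT id, parent_id, name, 0 AS depth
      FROM categories
     WHERE id = 42                 -- the root you care about
    UNION ALL
    SELECT c.id, c.parent_id, c.name, subtree.depth + 1
      FROM categories AS c
      JOIN subtree ON c.parent_id = subtree.id
)
SELECT * FROM subtree;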

I'm still trying to figure out why I didn't bookmark the article I'm thinking about. Going to have to go diving...

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Yeah, that looks similar to if not identical to what I was thinking of. Bookmarking it this time around, thanks!

McGlockenshire
Dec 16, 2005

GOLLOCKS!
With Percona's XtraBackup providing full support for both InnoDB and MyISAM, there's really no excuse to have a MyISAM-specific backup process in place.

To be frank, this sounds more like a "we're afraid of change" excuse than anything else. Your code shouldn't give a drat about the underlying storage engine (and unless you're using MySQLisms, your code shouldn't even care that it's running on MySQL to begin with...)

Really, if "the backups will break" is all they've got, it's time for either some hardcore re-education or time to find another job. InnoDB has been superior to MyISAM in almost every way since 5.0, and with the sole exception of full text search, there is zero reason to keep it in use.

McGlockenshire fucked around with this message at 21:46 on Mar 4, 2011

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Aredna posted:

As someone who has to create reports off of databases with little to no documentation: Trying to reverse engineer enums is a pain in the rear end and any with a table set up have made my life 10x easier.

I guess what I'm trying to say is that if anyone ever wants any reports built off of your data then you need good documentation, but if there is any chance it is slacking at all then having a lookup table is extremely valuable.

Related: make use of COMMENT fields, available both at the column and table levels. One of the little things about MySQL that doesn't suck.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

quote:

Similarly, do people have experience with TRUNCATE performance for InnoDB tables, or at least those without foreign-key relationships?
The largest InnoDB table I've ever had to truncate was about 9 million rows and about 2.4 gigs. It took a lot longer than we expected; it really felt identical to a blind DELETE FROM.

We found it faster to do a CREATE TABLE new LIKE old to copy the structure of the table, then do a two-way rename -- old => old_old, new => old. It should only take a split second, and then you can deal with the data in the old table as you'd like. I'd imagine this would completely break poo poo if you have foreign keys.
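
Spelled out, with the same placeholder names:
code:
-- Copy only the structure (indexes included); the data stays put.
CREATE TABLE `new` LIKE `old`;
-- Both renames happen in a single atomic statement, so nothing sees a missing table.
RENAME TABLE `old` TO `old_old`, `new` TO `old`;
-- `old` is now an empty table; the data sits in `old_old` to deal with whenever you like.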

McGlockenshire
Dec 16, 2005

GOLLOCKS!
WRT bizarreness when doing a table rename, I've never seen any sort of copying or *any* view reference updating... but then again, our data is stupid and we don't use views. The mere fact that you can turn off foreign key refs and drop a table full of keys referenced elsewhere pretty much says it all when the conversation turns to MySQL and data integrity.

McGlockenshire
Dec 16, 2005

GOLLOCKS!
As long as the MySQL version involved doesn't silently chomp CHAR columns over four characters long into VARCHARs, the identity table sounds perfectly fine.

(Keep in mind that you really only need to worry horribly about enums-as-data when the enum is used in multiple tables. A status column in a single table with a restricted set of possibilities that won't change frequently is ok to keep enumized, though it's still not a super solution. Oh if only MySQL supported CHECK clauses...)

McGlockenshire fucked around with this message at 18:36 on Mar 14, 2011

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Be careful with REPLACE INTO when the table you're operating on is referred to by other tables with ON DELETE CASCADE. It performs a delete and re-insert, and the delete cascades immediately.
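
A tiny sketch of the hazard (invented tables), plus the usual alternative that updates in place instead:
code:
CREATE TABLE parents (
    id   INT UNSIGNED NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL
) ENGINE=InnoDB;
CREATE TABLE children (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    parent_id INT UNSIGNED NOT NULL,
    FOREIGN KEY (parent_id) REFERENCES parents (id) ON DELETE CASCADE
) ENGINE=InnoDB;
-- REPLACE INTO deletes the existing parent row and re-inserts it,
-- so the delete fires the cascade and wipes the matching children.
REPLACE INTO parents (id, name) VALUES (1, 'renamed');
-- INSERT ... ON DUPLICATE KEY UPDATE modifies the row in place and leaves children alone.
INSERT INTO parents (id, name) VALUES (1, 'renamed')
    ON DUPLICATE KEY UPDATE name = VALUES(name);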

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Possibly-stupid drowsy-before-bed idea.

Are there ever conditions under which word A is synonymous with word B, but word B is not synonymous with A?

I'm pretty sure the answer is no.

Therefore, you can effectively group all of the related words together into synonym families.

Table Families ( family integer primary key )
Table Words ( word text, family integer references Families(family), unique(word, family) )

To get synonyms, SELECT word, family FROM Words WHERE family = ( SELECT family FROM Words WHERE word = ? )

Keep in mind that words can be multiple parts of speech and have a different meaning based on that context. This implies that each word can be a member of multiple families, which instantly breaks the query I just wrote. You'd need to add part-of-speech or context or something else to the Words table to differentiate different word meanings to tie them together with the appropriate family of synonyms, then find a way to query that one specific word+pos+context.
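
A hedged, Postgres-flavoured sketch of that extended layout (column names are guesses):
code:
CREATE TABLE Families ( family INTEGER PRIMARY KEY );
CREATE TABLE Words (
    word           TEXT    NOT NULL,
    part_of_speech TEXT    NOT NULL,
    family         INTEGER NOT NULL REFERENCES Families (family),
    UNIQUE (word, part_of_speech, family)
);
-- Synonyms of a word in one part of speech. If the same word/POS pair can carry
-- several senses, this still pulls in every family it belongs to, which is the
-- context problem described above.
SELECT w.word, w.family
  FROM Words AS w
 WHERE w.family IN ( SELECT family FROM Words
                      WHERE word = ? AND part_of_speech = ? );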

Given that there could be multiple entries for one word, you could end up returning words from the synonym query that belong to other families, which you could then crawl out to a certain extent to get related-but-not-synonymous words.

Anyway, if this seems stupid, pick option #1. Option #2, the creation of multiple tables for specific data, is an anti-pattern and you should forget you ever proposed it.

McGlockenshire fucked around with this message at 07:32 on Mar 28, 2011

McGlockenshire
Dec 16, 2005

GOLLOCKS!
You have tuned my.cnf for your server specs, right?

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Because ALTER has to basically rewrite the table, yes, it has to get a lock, and no, there's no sane way to bypass this restriction. What are you trying to accomplish?

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Ah, you people and your modern, sane databases.

Traditional MySQL replication works by just sending the actual SQL commands to all listening slaves. If you run the ALTER on the master, the slaves will also receive the command (once completed) unless you halt replication, disconnect every client, record the log position, run the command, record the new log position, then update every slave to pick up replication at the new log position.

Modern MySQL replication can just send over changed data, but DDL statements are still replicated as actual SQL text.

quote:

Adding a field to a large table without write/read locking, or write/read locking for as short time as possible.
Depending on the size of the data, it might be faster to create a copy of the table structure using CREATE TABLE ... LIKE ..., alter the empty clone, then do an INSERT INTO ... SELECT ... to copy all the data over and then a pair of ALTERs to swap the table names. Dropping all the indexes in the empty clone and adding them after the data has been re-entered also might be faster, depending.
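
Roughly, that dance looks like this (all names are placeholders; the swap is shown as one atomic RENAME TABLE rather than a pair of ALTERs so nothing sees a half-swapped state):
code:
CREATE TABLE big_table_new LIKE big_table;              -- empty clone, same indexes
ALTER TABLE big_table_new ADD COLUMN new_col INT NULL;  -- instant, the clone is empty
INSERT INTO big_table_new (col_a, col_b)                -- the slow part; new_col gets its default
     SELECT col_a, col_b FROM big_table;                -- rows written after this starts are lost, see below
RENAME TABLE big_table TO big_table_old, big_table_new TO big_table;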

Depending on whether or not you can just disable connections to the database, this may or may not be filled with peril and the chance of subtle data loss if the original table is changed between the time you INSERT INTO ... SELECT and do the rename. MySQL does not support DDL statements inside a transaction, so an exclusive lock is your only choice here.

Running the alter directly is the thing least likely to be trouble for you.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

quote:

Does anyone know how Postgres handles it?
Unfortunately I don't have a large enough data set in PG to properly test this -- only a few thousand rows, only a few hundred megs. Adding a column was instantaneous in my largest tables, as was dropping.

The ALTER TABLE docs say that DROP COLUMN doesn't require a table rewrite, but changing a column data type does require one. DDL is also supported in transactions, thank goodness.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

RoadCrewWorker posted:

I really love postgres and couldn't go back to mysql, but man, there's some severely headache inducing flaws that just make me think i must be missing something incredibly obvious.
Autovacuum?

McGlockenshire
Dec 16, 2005

GOLLOCKS!
Looks like you're hitting a deadlock.

Increase it again and wait for it to be near a timeout, then run SHOW FULL PROCESSLIST as root on both instances to reveal the queries that are deadlocking.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Bob Morales posted:

I'm dyslexic.

http://pastebin.us/643

Lines 36-40: connection 5 (second row in the process list) is holding open a transaction that, for some reason, connection 4 (first row) is waiting on for a lock. Unfortunately there's no clue there about why it's blocking, as nothing interesting is listed in those four lines. Connection 4 is just trying to get a shared lock ("LOCK mode S locks rec but NOT gap waiting"); no idea why it'd be denied here. Does the tool make two connections?

There's lots of noise in Google about that particular lock state, but nothing looks immediately relevant. You may find Percona's SHOW INNODB STATUS walkthrough helpful.

McGlockenshire fucked around with this message at 06:48 on Jul 14, 2011

McGlockenshire
Dec 16, 2005

GOLLOCKS!
No, SQL uses double quotes. MySQL uses backticks, but also accepts double quotes. Oh, and MSSQL uses loving square brackets, I am not making that up.

If you want to think cross-platform, think double quotes.

McGlockenshire
Dec 16, 2005

GOLLOCKS!
If not for it being MyISAM and other shenanigans, I'd recommend XtraBackup, which can take and restore entire snapshots, and even has a mode designed to create a new slave from a master. While it works with MyISAM in addition to InnoDB, it's really designed to copy entire *instances*, not just specific databases within the instance.

The various replicate-ignore options do work; however, the data still has to be transmitted from master to slave even if the slave ends up ignoring it. This shouldn't be a problem as long as master and slave are local to each other. Due to the way the binary log works, though, that amount of data is going to cause some non-trivial replication lag while the slave works through the transmitted log.

McGlockenshire fucked around with this message at 04:31 on Aug 20, 2011

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Golbez posted:

We're slightly paranoid about Oracle's moves on MySQL; does anyone have any experience or comment on MariaDB?

Also consider Percona's not-a-fork. I've used it as a drop-in replacement a few times, and it's worked well. The only problems I've had are the same problems you usually have when upgrading, like forgetting about old_passwords and client library versions.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

mister_gosh posted:

I'm looking to create hot (and cold) backups of a production MySQL database. Are there any tools out there that are proven/standard?

Percona's XtraBackup is pretty much the gold standard here and works with both MyISAM and InnoDB tables.
