|
Hammerite posted:Thanks for the overview, that's good information. You should upload it somewhere, rather than let it get buried in this thread. Glad to help. One day I'll resurrect my blog or something, and that'll go there.
|
# ? Jun 11, 2012 20:46 |
|
|
# ? May 28, 2024 14:22 |
|
Hammerite posted:Handy if you want to use UTF-8 to support internationalisation, but also want to use ASCII text for some key columns used internally. For example, if you have a table of user privileges you might use (say) a CHAR(20) CHARACTER SET ascii COLLATE ascii_bin NOT NULL as the primary key so that you can identify privileges by descriptive names. You don't want the indexed column to be 80 bytes when it should be 20 bytes. I'm not sure where to start. Let's see, if you're storing strings they are for user consumption. Your users may be from anywhere, and they should be able to use their own alphabet. If you are storing strings and you care about space use UTF-8, which actually works in postgres. Also in postgres you almost certainly don't want to use length specifiers for text unless you really really mean "a string of exactly/up to this length". Databases identify data with integers or guids, never strings, your primary key should be an integer referencing a table of descriptions for humans to read. Those descriptions should be stored as UTF-8. Hammerite posted:Although, that brings up another possible example for per-column character sets. I could see the possibility that you might want to offer UTF-8 support but want to restrict some columns to be BMP characters only. For example, perhaps you might want to allow user names only to use BMP characters. For reference, here are some features of Postgres that as far as I'm aware MySQL doesn't support:
|
# ? Jun 11, 2012 22:22 |
|
Zombywuf posted:Wat? Why? I see no reason why strings cannot be for internal use. Indeed, I gave you an example of a possible internal use for ascii strings. Another example (of strings that are not for user consumption, not of strings that need be ascii) could be logging data. quote:Your users may be from anywhere, and they should be able to use their own alphabet. If you are storing strings and you care about space use UTF-8, which actually works in postgres. It also works in MySQL, up to poorly-chosen names. The space issue is to do with indexes, not with space taken up in the table on disk. I don't know how it works in Postgres but in MySQL each record in an index takes up the same amount of space. So if a column is CHAR(20) CHARACTER SET utf8mb4 and you index it, each index record is 80 bytes. If you are happy keeping just ascii text in that column, you could make it just 20 bytes, which was my example. quote:... Databases identify data with integers or guids, never strings, your primary key should be an integer referencing a table of descriptions for humans to read. Those descriptions should be stored as UTF-8. I disagree. There is no reason why a table should not use strings as its primary key. quote:This has nothing to do with character sets. This is what constraints are for. But the character set of a column is a constraint on that column! Your statement makes no sense. quote:For reference, here are some features of Postgres that as far as I'm aware MySQL doesn't support: But neither does MySQL? The report linked by Janin notes that the bug has been fixed in up-to-date versions of MySQL.
|
# ? Jun 11, 2012 23:35 |
|
Hammerite posted:I disagree. There is no reason why a table should not use strings as its primary key. Performance. edit: Although I guess that doesn't matter with MySQL
|
# ? Jun 12, 2012 00:08 |
|
Hammerite posted:I disagree. There is no reason why a table should not use strings as its primary key.
|
# ? Jun 12, 2012 00:17 |
|
Coding horrors: The horror is coming from inside the thread
|
# ? Jun 12, 2012 01:55 |
|
Zombywuf posted:
Am I understanding this correctly? If you just make a couple of attempts with whatever password, you'll get in? gently caress.
|
# ? Jun 12, 2012 09:07 |
|
http://www.scmagazine.com.au/News/304509,mysql-servers-vulnerable-to-password-bypass.aspx posted:All MariaDB and MySQL versions up to 5.1.61, 5.2.11, 5.3.5 and 5.5.22 were vulnerable, Golubchik said.
|
# ? Jun 12, 2012 09:44 |
|
dwazegek posted:
It depends on how the code was build. Generic, x86 builds would be fine, but savagely optimised builds are at risk as they have weird bit hack memcmp implementations.
|
# ? Jun 12, 2012 11:31 |
|
Hammerite posted:Why? I see no reason why strings cannot be for internal use. Indeed, I gave you an example of a possible internal use for ascii strings. Another example (of strings that are not for user consumption, not of strings that need be ascii) could be logging data. quote:It also works in MySQL, up to poorly-chosen names. quote:The space issue is to do with indexes, not with space taken up in the table on disk. I don't know how it works in Postgres but in MySQL each record in an index takes up the same amount of space. quote:So if a column is CHAR(20) CHARACTER SET utf8mb4 and you index it, each index record is 80 bytes. If you are happy keeping just ascii text in that column, you could make it just 20 bytes, which was my example. quote:But the character set of a column is a constraint on that column! Your statement makes no sense. quote:But neither does MySQL? The report linked by Janin notes that the bug has been fixed in up-to-date versions of MySQL.
|
# ? Jun 12, 2012 11:40 |
|
Zombywuf posted:The people who read your logs are users. If I get you, you're applying "user" as in a user of the database, in which case anyone who interacts with the database in any capacity is a user. But then your statement "if you're storing strings they are for user consumption" becomes facile; any data stored in the database is for user consumption. It seems to me you're not really objecting to strings per se but rather to primary keys with a business meaning. quote:With the correct phase of the moon etc... I'm not sure what you're alluding to here. You'll have to be more explicit. But please assume that MySQL is operating in 'TRADITIONAL' mode. I have no interest in defending its behaviour when not in strict mode. quote:This would be impossible in Postgres as far as I'm aware due to the fact that the TEXT datatype is unbounded (for given values of unbounded) so your indexes would be very large. I thought Postgres had fixed-length character types analogous to MySQL's CHAR type. But even if you insist on TEXT, Postgres lets you create indexes based on the image of column values under a deterministic function, does it not? So like in MySQL you can index a prefix of a text column's value. quote:A bigint is 8 bytes and will not get mistaken for having semantic content. You do not want your database relying on primary keys with semantic information in them. What happens when someone corrects a spelling error in your primary key? The primary key value changes. Cascading updates transmit the change elsewhere in the database. If material external to the database relies upon the content of the primary key record in question, then that material will become inaccurate. In most cases, that will be unacceptable. In those cases, it is clear that you should not be using primary keys with a business meaning. quote:What the hell has MySQL done to you? Again, you are going to need to be more explicit regarding what your objection is to the statement "the character set of a column is a constraint on that column". quote:Also please explain the differences in allowable values between iso-8859-1 and iso-8859-15. Based on a quick visit to Wikipedia: it depends whether you mean allowable binary values or allowable Unicode characters. In the first case there is no difference in allowable values. In the second case there are eight characters that are valid in iso-8859-1 columns but not in iso-8859-15 columns, and eight characters that are valid in iso-8859-15 columns but not in iso-8859-1 columns. I'm not sure what you meant to demonstrate with this question. It seems straightforward to me.
|
# ? Jun 12, 2012 15:25 |
|
Hammerite posted:If I get you, you're applying "user" as in a user of the database, in which case anyone who interacts with the database in any capacity is a user. But then your statement "if you're storing strings they are for user consumption" becomes facile; any data stored in the database is for user consumption. It seems to me you're not really objecting to strings per se but rather to primary keys with a business meaning. quote:I'm not sure what you're alluding to here. You'll have to be more explicit. But please assume that MySQL is operating in 'TRADITIONAL' mode. I have no interest in defending its behaviour when not in strict mode. quote:I thought Postgres had fixed-length character types analogous to MySQL's CHAR type. quote:But even if you insist on TEXT, Postgres lets you create indexes based on the image of column values under a deterministic function, does it not? So like in MySQL you can index a prefix of a text column's value. quote:The primary key value changes. Cascading updates transmit the change elsewhere in the database. If material external to the database relies upon the content of the primary key record in question, then that material will become inaccurate. In most cases, that will be unacceptable. In those cases, it is clear that you should not be using primary keys with a business meaning. quote:Again, you are going to need to be more explicit regarding what your objection is to the statement "the character set of a column is a constraint on that column".
|
# ? Jun 12, 2012 17:09 |
|
This all reminds me of some of the stuff I saw in an old design for our database.code:
|
# ? Jun 12, 2012 18:11 |
|
God help you if you order the same amount of something twice in one day.
|
# ? Jun 12, 2012 19:26 |
|
This was never actually put into production, fortunately.
|
# ? Jun 12, 2012 20:08 |
|
Golbez posted:God help you if you order the same amount of something twice in one day. Well, not to excuse the godawful design, but assuming that DATETIME has a time component as I assume it ought, then you're probably ok.
|
# ? Jun 12, 2012 20:50 |
|
Factor Mystic posted:Well, not to excuse the godawful design, but assuming that DATETIME has a time component as I assume it ought, then you're probably ok. Oh. I missed that. Now I am become the horror.
|
# ? Jun 12, 2012 21:03 |
|
Factor Mystic posted:Well, not to excuse the godawful design, but assuming that DATETIME has a time component as I assume it ought, then you're probably ok. I ran into a similar thing at an old job where DATETIME was only stored up to seconds. We were processing over a thousand orders a day it got kind of inevitable that a collision would occur
|
# ? Jun 12, 2012 21:20 |
|
This API I have to work with has no concept of joining. I have to do all the data manipulation in the code. So, to get all the time submitted against a single item I have to do this poo poo: http://pastebin.com/jLhm0wWE Or, to translate, I have to get the project task that's referenced by deliverable ID, and then I have to ensure the person referenced (by ICUserName) is actually assigned to that same deliverable (via the Projecttaskassign request) and then I have to sum up the hours for all the time they submitted against that task. And this is including my various wrapper and helper methods. It'd probably be twice as long without it I am getting very tired of this project.
|
# ? Jun 12, 2012 22:01 |
|
I gotta say, I'm impressed. I had an example from our in-house API that I thought would rival that, but then I realized yours was for a single query, where my example actually operated on several values.
|
# ? Jun 12, 2012 23:48 |
|
ijustam posted:I have to do all the data manipulation in the code. code:
|
# ? Jun 13, 2012 00:08 |
|
code:
|
# ? Jun 13, 2012 00:37 |
|
Zombywuf posted:The number of hoops you have to jump through in order to store a unicode string in Postgres is exactly 0. Again, what hoops specifically do you have in mind? If you regard setting up the software correctly for your needs as a "hoop to jump through", then I am sure I would be able to name a "hoop" or two once I made a serious attempt to build something with Postgres. It does have a reputation for being a little more difficult to get to grips with as a beginner than MySQL, doesn't it? quote:It does but it also has variable length strings you can index. If indexes were required to be of a fixed width this would be impossible. Well, I guess what I was talking about doesn't apply then. I can't really comment not knowing how indexes work in Postgres. quote:The character set of a column is how the data in that column is interpreted semantically. Every sequence of bytes that is a valid iso-8859-1 (or even utf-8) string is a valid iso-8859-15 string. The character set is not a constraint, it is metadata that tells you how to interpret the data. This is true everywhere from POSIX locales, to HTTP to databases. A constraint is a specific requirement of the data, which may (probably will) be specified in terms of the metadata defining the semantics of the data. When you talk to the database server you are using a particular character set to do so. For example, you might be using UTF-8, which can express a superset of the characters that can be expressed in iso-8859-1 or in iso-8859-15. If I send an INSERT or UPDATE statement that stores a particular string value in a column whose character set is iso-8859-1 (say), the string value needs to contain only valid iso-8859-1 characters. If it contains other characters, it is not a valid string value for that column. That is a "specific requirement of the data", in your words. This is what I mean when I say that the character set of a column is a constraint on that column. It also implies an interpretation of the data stored in the column, as you say. Suppose that you have two columns, one INT and the other BINARY(4). Would you say that there are different constraints on the two columns, or would you say that there is only a difference in how the columns are interpreted semantically?
|
# ? Jun 13, 2012 01:50 |
|
Hammerite posted:My MySQL is better than your Postgres! Dude you asked for a comparison and then you keep moving the goalposts every single time someone raises a valid point against MySQL. I have no idea what you're trying to gain from this seeing as MySQL has objectively lagged behind Postgres in basically everything, from features (subqueries) to security (this thing just this month). Do you own stock in Oracle or some poo poo? Also, seriously why the gently caress do you think using a string as a primary key is ever a good idea? I actually want an answer for this. e: I guess using them as a composite key maybe? Using them as a sole primary key of a table is pretty loving dumb though. Look Around You fucked around with this message at 02:23 on Jun 13, 2012 |
# ? Jun 13, 2012 02:19 |
|
Look Around You posted:Dude you asked for a comparison and then you keep moving the goalposts every single time someone raises a valid point against MySQL. I'm dismayed that you think so. What have I posted that you feel is an instance of goalpost-moving? I'm not trying to argue that MySQL is "better than" PostgreSQL. I don't believe that it is, in fact, "better than" PostgreSQL. I would also say that given that PostgreSQL has more support for advanced SQL features, I can see why it would meet many users' needs much better than MySQL does. In addition, my and Zombywuf's argument is primarily about topics that I see as quite tangential to the topic of differences between SQL implementations (it arose from Zombywuf asking: "Why, for the love of Codd, would you want per column character sets?"). I do not see us as being engaged in an argument about whether PostgreSQL is better than MySQL or vice-versa. I cannot speak for Zombywuf (with regards to how he sees the argument, that is. It is clear that he thinks PostreSQL better than MySQL). quote:Also, seriously why the gently caress do you think using a string as a primary key is ever a good idea? I actually want an answer for this. I've already suggested a use case in this thread, so you already have an answer for this question. You could criticise that suggestion, or you could offer a general reason why not to use a string as a primary key. I don't think the presumption should be that using a string as a primary key is a bad idea, so I don't think the onus is on me to show that it actually isn't. quote:e: I guess using them as a composite key maybe? Using them as a sole primary key of a table is pretty loving dumb though. Surely if a string is part of a composite key, then you can trace it back eventually to a table in which it forms part of a primary key, so I don't think the exception you posited here makes sense.
|
# ? Jun 13, 2012 02:43 |
|
Hammerite posted:I'm dismayed that you think so. What have I posted that you feel is an instance of goalpost-moving? Your multiple instances of "Well then assume it's in strict mode/assume you use the right 'utf8m4' instead of utf8 datatype/etc" are in effect moving the goalposts. Whether you want to admit it or not, the difficulty of setting the software up correctly actually is of merit in a debate about the pros/cons of two different software packages, so your telling us to overlook this is pretty disingenuous. quote:In addition, my and Zombywuf's argument is primarily about topics that I see as quite tangential to the topic of differences between SQL implementations (it arose from Zombywuf asking: "Why, for the love of Codd, would you want per column character sets?"). I do not see us as being engaged in an argument about whether PostgreSQL is better than MySQL or vice-versa. I cannot speak for Zombywuf (with regards to how he sees the argument, that is. It is clear that he thinks PostreSQL better than MySQL). But that is definitely a difference between SQL implementations, and thus is also something that one may bring up in a debate about the differences between SQL implementations. quote:I've already suggested a use case in this thread, so you already have an answer for this question. You could criticise that suggestion, or you could offer a general reason why not to use a string as a primary key. I don't think the presumption should be that using a string as a primary key is a bad idea, so I don't think the onus is on me to show that it actually isn't. So your use-case is something that by (almost) all accounts is against best practices and is horrible for maintainability. Okay. You also never said your use-case for it, you just handwaved it away by quote:Surely if a string is part of a composite key, then you can trace it back eventually to a table in which it forms part of a primary key, so I don't think the exception you posited here makes sense. Not necessarily, consider a relationship: code:
e: for correctness, (last_name, first_name, dob) can be an alternate key for the table and you can have another unique key as its primary key, which may or may not contain a string. e2: I still can't think of a very good use-case for a string as (part of a) PK though. Look Around You fucked around with this message at 03:55 on Jun 13, 2012 |
# ? Jun 13, 2012 03:09 |
|
What's the difference between a primary key and a mere (unique + not-null) constraint?
|
# ? Jun 13, 2012 03:24 |
|
Jabor posted:What's the difference between a primary key and a mere (unique + not-null) constraint? Point taken. Hammerite: what did you mean by that statement about composite keys with strings? Were you trying to say that it would have to be the PK of another table if it were part of a composite key, that it'd have to be part of the table's PK if it were in a composite key, or what? I had assumed that you were trying to say that it needed to be the PK of another table if it were part of a composite key and I probably shouldn't have assumed that's what you meant. e: Wait, are you getting at the fact that there's no theoretical difference (since any candidate key can technically be used as a primary key), but depending on the DB software you're using you may not be able to refer to a (unique + not-null) constraint as a foreign key? Look Around You fucked around with this message at 04:22 on Jun 13, 2012 |
# ? Jun 13, 2012 04:03 |
|
Look Around You posted:Your multiple instances of "Well then assume it's in strict mode/assume you use the right 'utf8m4' instead of utf8 datatype/etc" are in effect moving the goalposts. Whether you want to admit it or not, the difficulty of setting the software up correctly actually is of merit in a debate about the pros/cons of two different software packages, so your telling us to overlook this is pretty disingenuous. But as I said, I don't see the discussion of the merits, or otherwise, of MySQL as being the real focus of discussion here. (Actually of the things being discussed, I find the thing about whether a column character set is a constraint the most interesting, but perhaps I am weird.) I was moved to defend (my perception of) MySQL's acceptable (although it has one or two warts) Unicode support by Zombywuf's barb that UTF-8 "actually works in Postgres". I'll drop the whole thing. I'm not really interested in defending MySQL from its critics, and heaven knows it has a lot of them. You will notice I was drawn into this argument in the first place to defend the usefulness/desirability of per-column character sets. For the record, I do think that per-column character sets are nice to have, just as unsigned integer types are nice to have and just as various things PostgreSQL offers but MySQL doesn't would be nice to have. quote:So your use-case is something that by (almost) all accounts is against best practices and is horrible for maintainability. Okay. Well I do think that's the issue you and Zombywuf have with what I'm saying, isn't it? What requirements are there of a primary key? By definition, it is meant to uniquely identify rows in the table. To that, you and Zombywuf would both add (unless I misunderstand you) that the primary key should be meaningless. (And I would agree that in the great majority of cases it should be meaningless, I just don't join you is saying that it must 100% of the time be meaningless.) It is possible to have an integer that is meaningful, and it is possible to have a string that is meaningless. So I think when I said "you don't think PKs should have business meaning", I was not handwaving but getting to the heart of the matter. The use-case I was referring to is the following (bold not in original): Hammerite posted:Handy if you want to use UTF-8 to support internationalisation, but also want to use ASCII text for some key columns used internally. For example, if you have a table of user privileges you might use (say) a CHAR(20) CHARACTER SET ascii COLLATE ascii_bin NOT NULL as the primary key so that you can identify privileges by descriptive names. You don't want the indexed column to be 80 bytes when it should be 20 bytes. quote:Not necessarily, consider a relationship: I guess I was incorrect when I said that you would have to be able to trace it back to another table's primary key. I still don't think your suggestion of a composite key containing a string is a sensible exception to the rule "no meaningful primary keys", if we accept that rule. Essentially because a primary key that consists of a single string column is surely just a special case of "composite primary key that includes a string". You note that the example (last name, first name, dob) is a bad one (on this we both agree), but I am not convinced that there is a class of examples where a reasonable argument could be made for a composite primary key containing a string but the argument would collapse if the scenario were changed to make the string the sole component of the primary key. quote:e: Wait, are you getting at the fact that there's no theoretical difference (since any candidate key can technically be used as a primary key), but depending on the DB software you're using you may not be able to refer to a (unique + not-null) constraint as a foreign key? No, your original assumption about what I meant was correct. I was mistaken.
|
# ? Jun 13, 2012 10:15 |
|
quote:Handy if you want to use UTF-8 to support internationalisation, but also want to use ASCII text for some key columns used internally. For example, if you have a table of user privileges you might use (say) a CHAR(20) CHARACTER SET ascii COLLATE ascii_bin NOT NULL as the primary key so that you can identify privileges by descriptive names. You don't want the indexed column to be 80 bytes when it should be 20 bytes. This really begs the question of why it's not worth making the indexed column 4 bytes (or even 1 byte if you don't have that many unique options) and having another table mapping those to your meaningful (full unicode) names. I mean if 20 is better than 80, 4 is better than 20, right?
|
# ? Jun 13, 2012 10:28 |
|
Hammerite posted:I don't think the presumption should be that using a string as a primary key is a bad idea, Frankly, beyond percussive maintenance, I am unsure how to transmit this information to you. Hammerite posted:so I don't think the onus is on me to show that it actually isn't. 'I don't understand why it is bad so it isn't my job to try' Unfortunately, the onus isn't on us to explain why this is a BadIdea™. Hammerite posted:What requirements are there of a primary key? By definition, it is meant to uniquely identify rows in the table. To that, you and Zombywuf would both add (unless I misunderstand you) that the primary key should be meaningless. (And I would agree that in the great majority of cases it should be meaningless, I just don't join you is saying that it must 100% of the time be meaningless.) Artisan primary keys. Hammerite posted:Handy if you want to use UTF-8 to support internationalisation, but also want to use ASCII text for some key columns used internally I know you brought this up earlier, but UTF-8 is an extension of ASCII You don't need to use two different encodings here, just use UTF-8 everywhere. tef fucked around with this message at 10:34 on Jun 13, 2012 |
# ? Jun 13, 2012 10:30 |
|
This thread is becoming a horror in itself, again. There's something amusing about posters in threads like this one or *players.txt becoming just like the people being mocked.
|
# ? Jun 13, 2012 10:36 |
|
When you stare into the abyss and all that.
|
# ? Jun 13, 2012 10:50 |
|
KaneTW posted:This thread is becoming a horror in itself, again. This will never stop happening.
|
# ? Jun 13, 2012 10:57 |
|
So I've tried signing up for AppFog, a cloud platform provider.
|
# ? Jun 13, 2012 11:45 |
|
You should shoot them an email and suggest they move their service into the cloud to improve availability.
|
# ? Jun 13, 2012 12:14 |
|
In fairness, you are their first ever account.
|
# ? Jun 13, 2012 12:53 |
|
w00tz0r posted:
|
# ? Jun 13, 2012 14:14 |
|
LOOK I AM A TURTLE posted:In fairness, you are their first ever account. Well, not really. Sagacity posted:So I've tried signing up for AppFog, a cloud platform provider.
|
# ? Jun 13, 2012 15:04 |
|
|
# ? May 28, 2024 14:22 |
|
Usually I don't post in this thread or try to be judgemental because I'm not that great of a programmer myself but this is just too much. Component in question was made by another department of our company and we are using it. It was working fine but some subtle bugs have been showing up so I go to investigate. Behold:code:
code:
code:
Whole code base is a complete mess. Half of it is commented out, full of debug statements and really I have no idea how it works at all. code:
Every method I look at is a coding horror. And it's like 30Kloc of pure code. What the gently caress. I'll be buying a drink today after work.
|
# ? Jun 13, 2012 16:58 |