|
Defghanistan posted:Sup nerds, heard you guys like computers you heard wrong.
|
# ? Dec 6, 2012 02:42 |
|
computers are awesome as long as they work. When there's a problem it's always due to the shittiest reason and it's frustrating as hell. When there's a serious problem it's often due to a deeply rooted conceptual problem and it's depressing as hell.
|
# ? Dec 6, 2012 02:48 |
|
MononcQc posted:computers are awesome as long as they work. When there's a problem it's always due to the shittiest reason and it's frustrating as hell. When there's a serious problem it's often due to a deeply rooted conceptual problem and it's depressing as hell. and we're mostly dealing with legacy constraints made in good faith and short sight
|
# ? Dec 6, 2012 02:49 |
|
tef posted:and we're mostly dealing with legacy constraints made in good faith and short sight this and everything having to do with l10n and i18n.
|
# ? Dec 6, 2012 02:53 |
|
i've been looking at imap and it is a total clusterfuck. old protocols never seem to handle unicode well. http://tools.ietf.org/html/rfc3501#section-5.1.3 we'll embed unicode using utf-7 (putting base64 in it yay). except with a different alphabet and control characters. hooray!
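For anyone who wants to see the damage: a minimal sketch of the mailbox-name encoding from RFC 3501 section 5.1.3. Printable ASCII stays literal, `&` becomes `&-`, and everything else is UTF-16BE run through base64 with `,` instead of `/` and no padding, wrapped in `&...-`. The function name `imap_utf7_encode` is made up for this example; as far as I know the Python standard library has no codec for this variant.

```python
import base64


def imap_utf7_encode(name: str) -> str:
    """Encode a mailbox name with IMAP's modified UTF-7 (illustrative helper)."""
    out = []
    pending = []  # run of characters that cannot be written literally

    def flush() -> None:
        # Emit the pending run as &<modified base64 of UTF-16BE>-
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


encoded = imap_utf7_encode("~peter/mail/\u53f0\u5317/\u65e5\u672c\u8a9e")
```

The input above is the mailbox name used as the example in the RFC, which makes it a convenient sanity check.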
|
# ? Dec 6, 2012 03:05 |
|
MononcQc posted:this and everything having to do with l10n and i18n. our written language is a legacy system from when we scratched things with implements.
|
# ? Dec 6, 2012 03:11 |
|
"You know UCS-4 would be very nice to use with no surrogate pairs ever, it just would take a bit more storage for text, which is dwarfed by whatever loving JPEG you'll attach to your content. Instead let's make sure us English speakers and some of the other latin-1 retards we share bits of culture with get to keep our 8 bit character representation and make the final encoding have a variable width with surrogate pairs that make it impossible to know where you are in the whole thing so we can then support other people's languages. Once everyone understands this, we'll introduce them to the idea that code points are not a decent unit anyway and we need to go further with combining accents and grapheme clusters and poo poo." "yeah let's do that, but in UTF-7 and base64, too!"
|
# ? Dec 6, 2012 03:12 |
|
challenge for today: find a Unicode sequence which is larger to represent as an encoded UTF-8 string than its visual representation, either as a JPG, PNG-8 or GIF image. It is likely possible but I've not had the energy to do it.
|
# ? Dec 6, 2012 03:14 |
|
MononcQc posted:"You know UCS-4 would be very nice to use with no surrogate pairs ever
http://www.unicode.org/history/unicode88.pdf
quote:Are 16 bits, providing at most 65,536 distinct codes, sufficient to encode all characters of all the world's scripts? … The answer to this is Yes.
quote:, it just would take a bit more storage for text, which is dwarfed by whatever loving JPEG you'll attach to your content.
it isn't about the space saving property, it was the compatibility with ascii/byte based systems. the reason utf-8 is so popular is that it is one of the easiest ways to retrofit your system.
quote:Instead let's make sure us English speakers and some of the other latin-1 retards we share bits of culture with get to keep our 8 bit character representation and make the final encoding have a variable width with surrogate pairs that make it impossible to know where you are in the whole thing so we can then support other people's languages.
to be fair to utf-8 you do have a synchronisable bytestream so you know if you're in a multibyte bit or not. unlike the other proposals around at the time.
quote:Once everyone understands this, we'll introduce them to the idea that code points are not a decent unit anyway and we need to go further with combining accents and grapheme clusters and poo poo."
to be fair, people have really weird languages and scripts. (english being no exception, what with having a script bearing not much relation to the spoken language).
quote:"yeah let's do that, but in UTF-7 and base64, too!"
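The "synchronisable bytestream" point can be sketched in a few lines. In UTF-8 every continuation byte has the bit pattern `10xxxxxx`, so from any offset you can walk backwards until you hit a byte that is not a continuation byte. The helper name `char_start` is invented for this example, and it assumes the input is valid UTF-8.

```python
def char_start(data: bytes, index: int) -> int:
    """Return the offset where the code point containing data[index] begins."""
    # Continuation bytes look like 10xxxxxx; lead bytes and ASCII do not.
    while index > 0 and (data[index] & 0xC0) == 0x80:
        index -= 1
    return index


# 'a' is 1 byte, 'é' is 2 bytes, '€' is 3 bytes in UTF-8.
data = "a\u00e9\u20ac".encode("utf-8")
starts = [char_start(data, i) for i in range(len(data))]
```

No lookup table or scan from the beginning of the string is needed, which is the property being defended here.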
|
# ? Dec 6, 2012 03:23 |
|
MononcQc posted:challenge for today: find a Unicode sequence which is larger to represent as an encoded UTF-8 string than its visual representation, either as a JPG, PNG-8 or GIF image. It is likely possible but I've not had the energy to do it. probably via abuse of combining characters.
|
# ? Dec 6, 2012 03:24 |
|
genuine unicode support means giving up the whole petulant notion of a text string being an array indexed by character. them's the breaks. let's not get on to sorting
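A small sketch of why indexing by "character" falls apart, using only the standard `unicodedata` module. The same visible letter can be one code point or two, and naive reversal moves a combining accent off its base letter.

```python
import unicodedata

composed = "\u00e9"      # é as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# Both render as the same letter but have different lengths.
lengths = (len(composed), len(decomposed))

# Normalizing to NFC makes them compare equal again.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

# Reversing by code point puts the accent first, detached from the 'e'.
reversed_word = "cafe\u0301"[::-1]
```

Splitting on grapheme clusters is the usual answer, and the standard library does not offer that directly.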
|
# ? Dec 6, 2012 03:28 |
|
MononcQc posted:computers are awesome as long as they work. When there's a problem it's always due to the shittiest reason and it's frustrating as hell. When there's a serious problem it's often due to a PEBKAC
|
# ? Dec 6, 2012 03:36 |
|
Some characters still require more than one code unit to be represented in UTF-16. Supplementary characters, which include emoji for example, are represented with surrogate pairs (adding a second code unit to represent the character anyway); they fall into this category with UTF-16 and are incompatible with UCS-2.
tef posted:it isn't about the space saving property, it was the compatibility with ascii/byte based systems. the reason utf-8 is so popular is that it is one of the easiest ways to retrofit your system.
It could just be an artifact of whatever people describing the spec would speak at the time though. I'm not exactly aware of the history behind it.
tef posted:to be fair to utf-8 you do have a synchronisable bytestream so you know if you're in a multibyte bit or not.
that's why it happened.
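To make the surrogate pair business concrete, here is a rough sketch of the arithmetic for code points above U+FFFF, checked against Python's own `utf-16-be` codec. The function name `to_surrogate_pair` is made up for the example.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point into its two UTF-16 code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points need a surrogate pair")
    offset = code_point - 0x10000           # 20 bits remain
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


emoji = "\U0001F600"  # an emoji outside the Basic Multilingual Plane
high, low = to_surrogate_pair(ord(emoji))
utf16 = emoji.encode("utf-16-be")
code_units = len(utf16) // 2
```

A UCS-2 consumer would see two unrelated 16-bit values here, which is the incompatibility mentioned above.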
|
# ? Dec 6, 2012 03:41 |
|
tef posted:genuine unicode support means giving up the whole petulant notion of a text string being an array indexed by character. them's the breaks. Other poo poo that sucks: capitalization, title-casing stuff, comparison, string length. A non-breakable space should be equivalent to a normal space, É and E should be seen as identical in some French texts, but rarely should é and e be seen that way (artifacts of typewriters, yay!), not speaking of words containing characters like 'œ' which are often seen as equivalent to 'oe' but not exactly, so whatever string length or reversal means now.
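A hedged sketch of a "loose" comparison key built from the standard `unicodedata` module: compatibility-decompose, drop combining marks, then case-fold. The name `loose_key` is invented here, and it is deliberately too blunt: it treats é and e as equal everywhere, which is exactly what the post says is usually wrong in French, and it does nothing for 'œ' because that letter has no decomposition.

```python
import unicodedata


def loose_key(text: str) -> str:
    """Build a crude accent- and case-insensitive comparison key."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop combining marks such as the acute accent left over from 'É'.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


accents_equal = loose_key("\u00c9cole") == loose_key("ecole")
nbsp_equal = loose_key("a\u00a0b") == loose_key("a b")
ligature_equal = loose_key("c\u0153ur") == loose_key("coeur")
```

Getting this right per language needs locale-aware collation rules rather than a one-size-fits-all key.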
|
# ? Dec 6, 2012 03:45 |
|
I heard you liked Time zones and calendars!
Offsets are not always straight on the hour, sadly. Some of them are non-standard and go for a half-hour offset. Newfoundland in Canada does this. Then again, Nepal is UTC/GMT +5:45, so even half-hours aren't enough for the rest (the Chatham Islands, NZ, use a 45-minute offset too). We need more precision. So let's just add minutes and poo poo and overflow to hours, right?
Iran changes its timezone offset (DST) based on the Persian solar calendar, not the gregorian one. So we have to be careful. Also Kiribati has different timezones that make parts of the country be on different days at the same time.
Then we have more fun Calendar intricacies... Samoa had a 367-day year in 1892 after changing timezones, where it got two 4ths of July in the same year. Back in 2011 (I think?) they switched back, skipping an entire Friday by going forward a day.
Sweden used its own calendar for 12 years (starting 1700), but things got a bit out of hand:
quote:In November 1699, Sweden decided that, rather than adopting the Gregorian calendar outright, it would gradually approach it over a 40-year period. The plan was to skip all leap days in the period 1700 to 1740. Every fourth year, the gap between the Swedish calendar and the Gregorian would reduce by one day, until they finally lined up in 1740. In the meantime, this calendar would not only not be in line with either of the major alternative calendars, but also the differences between them would change every four years.
Feb 30 is now a valid date, but only in Sweden in 1712. In 1753, Feb 18+ do not exist for Sweden.
We also have to care for leap seconds, different calendars, administrative changes, etc.
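Two of these points are easy to poke at with the standard `datetime` module. This is only a sketch: the offsets are hard-coded standard-time values (an assumption, a real program would read the tz database), and `is_representable` is a made-up helper name.

```python
from datetime import date, datetime, timedelta, timezone

# Fixed offsets that are not whole hours (hard-coded for illustration).
NEPAL = timezone(timedelta(hours=5, minutes=45), "NPT")
NEWFOUNDLAND_STD = timezone(timedelta(hours=-3, minutes=-30), "NST")

instant = datetime(2012, 12, 6, 0, 0, tzinfo=timezone.utc)
in_nepal = instant.astimezone(NEPAL)
in_newfoundland = instant.astimezone(NEWFOUNDLAND_STD)


def is_representable(year: int, month: int, day: int) -> bool:
    """Check whether Python's proleptic Gregorian date type accepts a date."""
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


# Sweden's one-off 30 February 1712 cannot be stored in a plain date object.
swedish_leap_day = is_representable(1712, 2, 30)
```

The offsets work fine once minutes are allowed, but the Swedish date shows that a general-purpose date type quietly assumes one calendar for all of history.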
|
# ? Dec 6, 2012 04:00 |
|
timezones and calendars are some bullshit.
|
# ? Dec 6, 2012 04:02 |
|
MononcQc posted:I heard you liked Time zones and calendars! Who gives a poo poo about old dates, store them as simple Strings and don't expect to do any manipulation with dates older than the epoch.
|
# ? Dec 6, 2012 04:19 |
|
MononcQc posted:in ways that end up breaking all the time, I guess. You're right about the compatibility aspect, but overall it still feels like utf-8 itself is popular due to its Western-centric approach.
that's ascii compatibility for you. and there is a lot of it around.
quote:As far as I remember (and you may correct me on this), a lot of the lower Unicode code points share values similar to those of the latin-1 character set (and UTF-8 breaks a few of them only), compared to say, whatever is used for CJK characters in other standards.
utf-8 is multibyte for anything outside of ascii. the second unicode block is latin-1 http://en.wikipedia.org/wiki/C1_Controls_and_Latin-1_Supplement
quote:It could just be an artifact of whatever people describing the spec would speak at the time though. I'm not exactly aware of the history behind it.
http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
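A quick sketch of the distinction being drawn here, using Python's built-in codecs: the code point numbers for Latin-1 characters match their Latin-1 byte values, but only the ASCII range keeps the same bytes in UTF-8.

```python
ch = "\u00e9"  # é

code_point = ord(ch)                 # same number as the Latin-1 byte
latin1_bytes = ch.encode("latin-1")  # one byte
utf8_bytes = ch.encode("utf-8")      # two bytes

# ASCII text is byte-for-byte identical in ASCII, Latin-1 and UTF-8.
ascii_same = "A".encode("ascii") == "A".encode("latin-1") == "A".encode("utf-8")

# A lone Latin-1 byte for é is not valid UTF-8 on its own.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

So the shared values live at the code point level, and the byte-level compatibility stops at ASCII.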
|
# ? Dec 6, 2012 04:26 |
|
Hard NOP Life posted:Who gives a poo poo about old dates, store them as simple Strings and don't expect to do any manipulation with dates older than the epoch.
Well you still had this year's leap date, last year's Samoa jumping a day, Iran doing DST based on the Persian solar calendar, etc. This year Libya set their clocks back one hour, next year they start respecting DST. Russia is stuck in permanent summer time. Jordan also cancelled DST switching this year. Mexico is discussing adding a timezone to their area.
The thing is, converting to some standard time like UTC is still ultimately very lovely and prone to error. Sure once you're in epoch manipulating stuff isn't that hard, but going from one to the other is fairly bad.
Some events are intimately tied to relative, non-global time. Google Calendar had interesting ones for that a few years ago (not sure if they changed it). Any event saved in a Calendar had the timezone of the time the meeting is set (when it is created) for the event. If you set a meeting time before a DST change, but happening after, once daylight saving time got in effect, the meeting time would be off an hour in the final result, but still right according to UTC. If the meeting is something international, then it made sense, but for local meetups for people within a city, the behavior was off.
Time is just the best way to get screwed.
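The calendar behavior described above can be reproduced in a few lines. This is a sketch with hard-coded offsets standing in for a US Eastern city around the March 2012 DST switch (an assumption chosen for illustration, not how any real calendar product is implemented).

```python
from datetime import datetime, timedelta, timezone

# Hard-coded offsets for illustration; a real system would use the tz database.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

# A weekly 09:00 local meeting, created in winter and stored as a UTC instant.
first_local = datetime(2012, 3, 5, 9, 0, tzinfo=EST)
stored_utc = first_local.astimezone(timezone.utc)

# One week later DST is in effect; the stored instant is simply shifted 7 days.
next_utc = stored_utc + timedelta(days=7)
shown_local = next_utc.astimezone(EDT)

# What the locals actually wanted: 09:00 on the wall clock, whatever the offset.
wanted_local = datetime(2012, 3, 12, 9, 0, tzinfo=EDT)
wanted_utc = wanted_local.astimezone(timezone.utc)
```

Storing the wall-clock time plus a zone name, and resolving to UTC late, is one common way to keep local meetings stable; international meetings may want the opposite.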
|
# ? Dec 6, 2012 04:28 |
|
MononcQc posted:Other poo poo that sucks: capitalization, title-casing stuff, comparison, string length.
let's not even get into right to left/left to right stuff. oh and you should be case-folding too.
quote:A non-breakable space should be equivalent to a normal space,
similarly shy hyphens should be hidden
quote:É and E should be seen as identical in some French texts, but rarely should é and e be seen that way (artifacts of typewriters, yay!), not speaking of words containing characters like 'œ' which are often seen as equivalent to 'oe' but not exactly, so whatever string length or reversal means now.
ffuck
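A minimal sketch of the case-folding and soft-hyphen points using only built-in string methods. The helper name `fold` is made up, and dropping U+00AD before comparing is one possible policy rather than a standard.

```python
SOFT_HYPHEN = "\u00ad"


def fold(text: str) -> str:
    """Remove soft hyphens and case-fold for caseless comparison."""
    return text.replace(SOFT_HYPHEN, "").casefold()


# lower() is not enough: the German sharp s survives it.
lower_matches = "Stra\u00dfe".lower() == "STRASSE".lower()
fold_matches = fold("Stra\u00dfe") == fold("STRASSE")

# An invisible hyphenation hint should not make two words differ.
hyphen_matches = fold("co\u00adoperate") == fold("cooperate")
```

`str.casefold` applies Unicode case folding, which is why it maps the sharp s to "ss" while `lower` leaves it alone.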
|
# ? Dec 6, 2012 04:29 |
|
MononcQc posted:We also have to care for leap seconds, different calendars, administrative changes, etc. timezones used to be much more localized and imprecise. then again we didn't have to synchronize with people on the other side of the planet. gently caress utc, i want to use gps time.
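For what the GPS time wish amounts to, a rough sketch: GPS time counts seconds from 1980-01-06 and never inserts leap seconds, so it drifts ahead of UTC by one second per leap second. The 16 second figure is the offset in effect from July 2012, which covers the date of this thread; the constant and function names are invented for the example.

```python
from datetime import datetime, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)

# Leap seconds inserted into UTC between the GPS epoch and late 2012.
# Only valid for instants after the mid-2012 leap second.
GPS_MINUS_UTC_SECONDS = 16


def utc_to_gps_seconds(utc_time: datetime) -> float:
    """Convert a late-2012 UTC datetime to seconds since the GPS epoch."""
    # datetime arithmetic ignores leap seconds, so they are added back here.
    elapsed = (utc_time - GPS_EPOCH).total_seconds()
    return elapsed + GPS_MINUS_UTC_SECONDS


thread_day = datetime(2012, 12, 6, tzinfo=timezone.utc)
gps_seconds = utc_to_gps_seconds(thread_day)
```

The catch is that converting back for display still needs the leap second table, so the pain moves rather than disappears.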
|
# ? Dec 6, 2012 04:37 |
|
you rarely have to care about nepal or iran or kiribati at all let alone one day in the 1700s in sweden. why worry about any of this until the customer complains, then it's extra $$$ to fix their weird nonstandard problems
|
# ? Dec 6, 2012 04:39 |
|
I hope I never see the day space travel makes relativistic effects on time even more obvious than it is with GPS clocks. That would be so much bullshit to deal with.
MononcQc fucked around with this message at 04:43 on Dec 6, 2012
|
# ? Dec 6, 2012 04:41 |
|
MononcQc posted:I hope I never see the day space travel makes relativistic effects on time even more obvious than it is with GPS clocks. on the plus side I wouldn't have to worry about leapseconds. today is September 7037, 1993
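Assuming the date in this post is an "Eternal September" style count, where September 1993 never ends and the day number keeps climbing, the arithmetic checks out for the day of the post. The function name is made up for the example.

```python
from datetime import date

# Day 1 is 1 September 1993, so count from the day before.
DAY_ZERO = date(1993, 8, 31)


def eternal_september_day(today: date) -> int:
    """Return the day number of a September 1993 that never ended."""
    return (today - DAY_ZERO).days


post_day = eternal_september_day(date(2012, 12, 6))
```

No time zones, months or leap seconds involved, only a day counter, which is presumably the appeal.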
|
# ? Dec 6, 2012 04:47 |
|
MononcQc posted:I hope I never see the day space travel makes relativistic effects on time even more obvious than it is with GPS clocks. oh my god I never considered this
|
# ? Dec 6, 2012 05:51 |
|
Science fiction writer consensus: it is some bullshit for real
|
# ? Dec 6, 2012 06:39 |
|
MononcQc posted:challenge for today: find a Unicode sequence which is larger to represent as an encoded UTF-8 string than its visual representation, either as a JPG, PNG-8 or GIF image. It is likely possible but I've not had the energy to do it. *inserts several megabytes of zero width spaces*
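The size side of this answer is easy to sketch: U+200B ZERO WIDTH SPACE takes three bytes in UTF-8 and draws nothing. The count of one million is an arbitrary choice for the example, and the image size is left out since it depends on the renderer.

```python
ZERO_WIDTH_SPACE = "\u200b"

bytes_per_char = len(ZERO_WIDTH_SPACE.encode("utf-8"))

# One million invisible characters.
count = 1_000_000
payload = ZERO_WIDTH_SPACE * count
payload_size = len(payload.encode("utf-8"))
```

Whether an empty image counts as a faithful "visual representation" is left to the judges of the challenge.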
|
# ? Dec 6, 2012 12:49 |
|
MononcQc posted:Some events are intimately tied to relative, non-global time. Google Calendar had interesting ones for that a few years ago (not sure if they changed it). Any event saved in a Calendar had the timezone of the time the meeting is set (when it is created) for the event. If you set a meeting time before a DST change, but happening after, once daylight saving time got in effect, the meeting time would be off an hour in the final result, but still right according to UTC. If the meeting is something international, then it made sense, but for local meetups for people within a city, the behavior was off. I used to think about making better calendaring software. Then I thought about problems like wanting to schedule meetings on the second tuesday of each month while crossing timezones. I don't think about calendaring software any more.
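A sketch of the "second Tuesday" problem with the standard `datetime` module. The two fixed offsets are hypothetical participants picked for illustration, and `second_tuesday` is a made-up helper.

```python
from datetime import date, datetime, timedelta, timezone

TUESDAY = 1  # date.weekday() numbers Monday as 0


def second_tuesday(year: int, month: int) -> date:
    """Return the date of the second Tuesday of the given month."""
    first = date(year, month, 1)
    days_to_first_tuesday = (TUESDAY - first.weekday()) % 7
    return first + timedelta(days=days_to_first_tuesday + 7)


# Hypothetical participants: the host at UTC-5, a guest at UTC+9.
HOST_ZONE = timezone(timedelta(hours=-5))
GUEST_ZONE = timezone(timedelta(hours=9))

day = second_tuesday(2012, 12)
meeting = datetime(day.year, day.month, day.day, 20, 0, tzinfo=HOST_ZONE)
guest_view = meeting.astimezone(GUEST_ZONE)
```

The host's second Tuesday evening is the guest's Wednesday morning, so "second Tuesday" only means something once you say whose Tuesday it is.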
|
# ? Dec 6, 2012 13:00 |
|
Zombywuf posted:I used to think about making better calendaring software. Then I thought about problems like wanting to schedule meetings on the second tuesday of each month while crossing timezones. Arnold Schwarzenegger had the right idea to just not schedule any meetings.
|
# ? Dec 6, 2012 13:56 |
|
I'm waiting for the glorious day when computing is neither a gold-rush fad nor a dismal cesspit of cost-cutting
basically that's going to come around when we finally line up and execute all of the MBAs
so, not going to happen
|
# ? Dec 6, 2012 13:59 |
|
Cocoa Crispies posted:Arnold Schwarzenegger had the right idea to just not schedule any meetings. lol all he'll say is ILL BE BACK
|
# ? Dec 6, 2012 14:05 |
|
Mr Dog posted:I'm waiting for the glorious day when computing is neither a gold-rush fad nor a dismal cesspit of cost-cutting Sometimes I wish I worked in a field where people die if I make a mistake.
|
# ? Dec 6, 2012 14:10 |
|
who's the drone yosposter? people die when he does his job correctly!
|
# ? Dec 6, 2012 14:17 |
|
zoneinfo is the answer to all of your timezone questions
dunno about calendars though
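In Python terms, a hedged sketch of leaning on the tz database through the standard `zoneinfo` module (Python 3.9+). It assumes the system or the `tzdata` package provides the database; the helper `utc_offset_at` is invented here and returns `None` when the zone cannot be loaded.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def utc_offset_at(zone_name: str, instant_utc: datetime):
    """Look up a zone's UTC offset at a given instant, or None if unavailable."""
    try:
        zone = ZoneInfo(zone_name)
    except ZoneInfoNotFoundError:
        # No tz database on this machine.
        return None
    return instant_utc.astimezone(zone).utcoffset()


thread_date = datetime(2012, 12, 6, tzinfo=timezone.utc)
kathmandu = utc_offset_at("Asia/Kathmandu", thread_date)
st_johns = utc_offset_at("America/St_Johns", thread_date)
```

The database tracks the administrative changes listed earlier in the thread, which is why keeping it up to date matters as much as using it.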
|
# ? Dec 6, 2012 14:38 |
|
Nomnom Cookie posted:Science fiction writer consensus: it is some bullshit for real poul anderson talked about this about once per book in the flandry series. actually maybe the whole polesotechnic league series
|
# ? Dec 6, 2012 14:51 |
|
Mr Dog posted:I'm waiting for the glorious day when computing is neither a gold-rush fad nor a dismal cesspit of cost-cutting the day when computers program themselves and all that's required is to explain precisely just what the gently caress is it you want them to do.
|
# ? Dec 6, 2012 15:10 |
|
Stringent posted:the day when computers program themselves and all that's required is to explain precisely just what the gently caress is it you want them to do. perhaps we could develop some kind of specialised language in which to convey our ideas to computers
|
# ? Dec 6, 2012 15:25 |
|
Stringent posted:the day when computers program themselves and all that's required is to explain precisely just what the gently caress is it you want them to do.
|
# ? Dec 6, 2012 15:30 |
|
Stringent posted:the day when computers program themselves and all that's required is to explain precisely just what the gently caress is it you want them to do.
|
# ? Dec 6, 2012 15:50 |
|
PrBacterio posted:That's not actually going to help much with anything, like 99% of the problems with the software in existence right now is that there's no one who can figure out what the gently caress it is precisely that they want the software to do in the first place.
|
# ? Dec 6, 2012 15:52 |