|
qntm posted:utf-16 is also the main reason why we can never extend Unicode to more than 1,114,112 code points well theoretically, they could choose some unassigned code point and designate that as an extender that provides some arbitrary number of bits for further characters. if a text uses the code point, its new-style unicode with the extender and the reader has to be updated. if it doesnt use it, its backwards compatible
|
# ? May 20, 2016 23:37 |
|
Snapchat A Titty posted:well theoretically, they could choose some unassigned code point and designate that as an extender that provides some arbitrary number of bits for further characters. you might even call the extender and the further bits a pair. maybe one is a high and one is a low. maybe, and stick with me here, we call them "surrogate" because they're a substitute for the proper full value. nah this won't work. unicode has reached its maximum size
|
# ? May 21, 2016 00:27 |
|
how hosed up is unicode that 32 bits turns into a million values
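for anyone wondering where the million comes from, here's a rough sketch of the surrogate arithmetic. the function names are made up for illustration; the constants are the standard utf-16 surrogate ranges. each half of a pair carries 10 bits, so pairs cover 1024 * 1024 code points on top of the 65,536 in the BMP.

```python
# Sketch of UTF-16 surrogate arithmetic. Function names are illustrative only.
HIGH_START = 0xD800  # high (lead) surrogates: U+D800..U+DBFF, 1024 values
LOW_START = 0xDC00   # low (trail) surrogates: U+DC00..U+DFFF, 1024 values


def to_surrogates(cp):
    """Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16 pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000  # fits in 20 bits
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


def from_surrogates(high, low):
    """Recombine a high/low surrogate pair into the code point it stands for."""
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


# 65,536 BMP values plus 1024 * 1024 values reachable through pairs.
TOTAL_CODE_POINTS = 0x10000 + 1024 * 1024  # 1,114,112
```

so the ceiling isn't about 32 bits at all, it's 16 bits plus whatever two 10-bit halves can address.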
|
# ? May 21, 2016 00:28 |
|
looking forward to exhausting all remaining code points with emoji
|
# ? May 21, 2016 01:17 |
|
pokeyman posted:you might even call the extender and the further bits a pair. maybe one is a high and one is a low. maybe, and stick with me here, we call them "surrogate" because they're a substitute for the proper full value. i guess theres no way to make it work
|
# ? May 21, 2016 02:02 |
|
i think there's emoji ligatures now? only the dead can know peace from this madness
|
# ? May 21, 2016 02:18 |
|
just disregard RFC 3629 and allow utf-8 to represent code points up to U+7FFFFFFF. then force everything to use it. problem solved
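a minimal sketch of what that would look like, assuming the older 1-to-6-byte utf-8 layout that predates RFC 3629. `encode_utf8_extended` is a hypothetical name, not anything from a real library, and it deliberately skips the surrogate check a strict encoder would do.

```python
# Sketch of the pre-RFC 3629 UTF-8 layout (up to 6 bytes, up to U+7FFFFFFF).
# encode_utf8_extended is a hypothetical helper, not a standard library call.

# (continuation byte count, exclusive upper limit, lead byte marker)
_FORMS = (
    (1, 0x800, 0xC0),
    (2, 0x10000, 0xE0),
    (3, 0x200000, 0xF0),
    (4, 0x4000000, 0xF8),
    (5, 0x80000000, 0xFC),
)


def encode_utf8_extended(cp):
    """Encode cp using the 1..6 byte scheme; surrogates are NOT rejected here."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("code point out of range for 6-byte UTF-8")
    if cp < 0x80:
        return bytes([cp])
    for n_cont, limit, lead in _FORMS:
        if cp < limit:
            # Lead byte carries the top bits, each continuation byte carries 6.
            out = [lead | (cp >> (6 * n_cont))]
            for i in range(n_cont - 1, -1, -1):
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
```

anything up to U+10FFFF comes out identical to normal utf-8, which is the whole appeal. the catch is that utf-16 still can't represent the rest.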
|
# ? May 21, 2016 02:19 |
|
then use them all for emoji
|
# ? May 21, 2016 02:20 |
|
compuserved posted:then use them all for emoji 👍
|
# ? May 21, 2016 02:35 |
|
Mr Dog posted:i think there's emoji ligatures now? you can blame the blacks and the gays for that
|
# ? May 21, 2016 02:35 |
|
pseudorandom name posted:you can blame the blacks and the gays for that it was a good idea but because people are racist irl, nobody really wants to use the darker colors (except white people who want to be all yo)
|
# ? May 21, 2016 02:42 |
|
well, it's the degenerate satanists at Apple's fault, really
|
# ? May 21, 2016 02:42 |
|
there are a ton of multi-code-point glyphs, the amazing name for this is extended grapheme cluster. there are a lot of non-dumb examples like accent combining characters and decomposed encodings of hangul, but there are plenty of dumb examples, too! like non-racist emoji and the national flags that are encoded by spelling out the iso 2-letter country codes using the 26 regional indicator symbol code points
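a quick sketch of three of those cases using only the python standard library. `flag` is a made-up helper name; U+1F1E6 is REGIONAL INDICATOR SYMBOL LETTER A. note that python's `len` counts code points, not grapheme clusters, which is exactly the point.

```python
import unicodedata

REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code):
    """Spell an ISO two-letter country code with regional indicator symbols."""
    code = country_code.upper()
    if not (code.isascii() and code.isalpha()):
        raise ValueError("expected ASCII letters only")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


dk = flag("DK")                                    # one flag, two code points
e_acute = unicodedata.normalize("NFD", "\u00e9")   # 'e' + combining acute accent
hangul = unicodedata.normalize("NFD", "\ud55c")    # one syllable, three jamo

print(len(dk), len(e_acute), len(hangul))
```

each of those is meant to display as a single user-visible character, so anything that indexes or truncates by code point can cut one in half.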
|
# ? May 21, 2016 03:47 |
|
fun fact! the two-letter code things aren't actually bounded at two letters. thus an extended grapheme cluster is actually unbounded in size. swift is the one language i know about that supports this correctly. thank you rjmccall for "doing it right"
|
# ? May 21, 2016 03:51 |
|
Suspicious Dish posted:fun fact! the two-letter code things aren't actually bounded at two letters. thus an extended grapheme cluster is actually unbounded in size. yaaaay pretty sure we still don't canonicalize identifiers in the parser, fwiw
|
# ? May 21, 2016 03:56 |
|
probably a good idea to spec it, just to avoid problems further down the line
|
# ? May 21, 2016 03:59 |
|
btw rjmccall can i ask you whats the best way to ask apple to make an API for the global dictionary? i can make some xml stuff, but i dont want to download a million web-pages. i just want to make a small bit of code thats given a word, looks it up somewhere, transforms the html or json, and pushes it back to the dictionary. bugticket/email support/email tim?
|
# ? May 21, 2016 04:03 |
|
a bug, i guess? i have no idea what you're talking about but emailing customer support does not sound like a productive use of your time
|
# ? May 21, 2016 04:18 |
|
thx ill try and make a bug tomorrow for reference: theres a dictionary developers guide but its basically an XML file format description. theres no way to write a plugin that looks up stuff elsewhere (websites w other languages etc) https://developer.apple.com/library...0006152-CH3-SW7
|
# ? May 21, 2016 04:30 |
|
Perl6 supports all Unicode poo poo to spec (for now) ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM [Lo] (ﯹ)
|
# ? May 21, 2016 07:22 |
I think Rust handles Unicode pretty well, but I don't really do anything but goofy spare time projects in it so idk for sure
|
|
# ? May 21, 2016 07:34 |
|
VikingofRock posted:I think I handle Unicode pretty well, but I don't really do anything but goofy spare time projects in it so idk for sure
|
# ? May 21, 2016 07:38 |
I guess my post was a bit of a Rusty venture
|
|
# ? May 21, 2016 07:46 |
|
VikingofRock posted:I guess my post was a bit of a Rusty venture HEYOOOOOOO~
|
# ? May 21, 2016 08:01 |
|
it wasnt actually a dig at you btw, its just Rust handling unicode
|
# ? May 21, 2016 08:02 |
Snapchat A Titty posted:it wasnt actually a dig at you btw, its just Rust handling unicode Oooh I had missed your edit of my quote.
|
|
# ? May 21, 2016 08:24 |
|
the angular material design icon font which we use uses ligatures to represent icons so the text content of the div is "zoom_in" but thanks to ligatures it represents itself as a magnifying glass with a plus sign
|
# ? May 21, 2016 08:55 |
|
qntm posted:the angular material design icon font which we use uses ligatures to represent icons so youre using an encoding that your font doesnt support? definitely not your fault!
|
# ? May 21, 2016 09:58 |
|
This is why my own PL doesn't even have a character type. Mad props to those who can figure out how to implement the Unicode nonsense.
|
# ? May 21, 2016 10:05 |
|
qntm posted:the angular material design icon font which we use uses ligatures to represent icons this is so bizarre I guess it's the natural endpoint of those embedded fonts that encode scalable icons e: even better actually, it's meaningful text instead of some random character that maps to an icon
# ? May 21, 2016 10:08 |
|
http://sansbullshitsans.com/
|
# ? May 21, 2016 13:34 |
|
Athas posted:This is why my own PL doesn't even have a character type. Mad props to those who can figure out how to implement the Unicode nonsense. it's ez you just do whatever swift does
|
# ? May 21, 2016 13:38 |
|
hackbunny posted:this is so bizarre yeah, it's way better than the old approach of just using private range codepoints because you don't just have a bunch of boxes if the font fails to load or is disabled etc.
|
# ? May 21, 2016 16:47 |
|
utf-16 mainly just shows how good utf-8 is
|
# ? May 22, 2016 21:12 |
|
so does crates.io seriously not have package signing, wtf mozilla
|
# ? May 23, 2016 13:48 |
|
rjmccall posted:the national flags that are encoded by spelling out the iso 2-letter country codes using the 26 regional indicator symbol code points this one is my favorite by far
|
# ? May 24, 2016 19:46 |
|
also the way that facebook sometimes renders the skin-colored emojis using the regular emoji with a fitzpatrick-scale colored square beside it.
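for context, a skin-toned emoji is just the base emoji followed by one of the five fitzpatrick modifier code points (U+1F3FB..U+1F3FF). a small sketch, with `split_skin_tone` as a hypothetical helper name:

```python
# Sketch: a skin-toned emoji is a base emoji plus one modifier code point.
# split_skin_tone is a hypothetical helper, not part of any library.
FITZPATRICK_MODIFIERS = range(0x1F3FB, 0x1F400)  # U+1F3FB..U+1F3FF


def split_skin_tone(emoji):
    """Return (base, modifier), or (emoji, None) when no modifier is attached."""
    if len(emoji) >= 2 and ord(emoji[-1]) in FITZPATRICK_MODIFIERS:
        return emoji[:-1], emoji[-1]
    return emoji, None


thumbs_up_medium = "\U0001F44D\U0001F3FD"  # thumbs up + a mid-scale modifier
base, modifier = split_skin_tone(thumbs_up_medium)
```

a renderer that doesn't combine the two would presumably draw the base emoji and the modifier as separate glyphs, which would match the emoji-plus-colored-square look described above.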
|
# ? May 24, 2016 19:48 |
|
ultramiraculous posted:also the way that facebook sometimes renders the skin-colored emojis using the regular emoji with a fitzpatrick-scale colored square beside it. wasn't this a fuckup with text rendering in chrome
|
# ? May 25, 2016 02:00 |
|
what does it mean when a programming language is described as "expressive"? e.g. "Flat learning curve. Concise, readable and expressive syntax, easy to learn for Java developers"
|
# ? May 27, 2016 12:42 |
|
prefect posted:what does it mean when a programming language is described as "expressive"? "some sugar for the code i hate, it looks a bit like ruby, and we've borrowed the rest from java"
|
# ? May 27, 2016 12:48 |