Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



qntm posted:

utf-16 is also the main reason why we can never extend Unicode to more than 1,114,112 code points

well theoretically, they could choose some unassigned code point and designate that as an extender that provides some arbitrary number of bits for further characters.

if a text uses the code point, its new-style unicode with the extender and the reader has to be updated.

if it doesnt use it, its backwards compatible

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Snapchat A Titty posted:

well theoretically, they could choose some unassigned code point and designate that as an extender that provides some arbitrary number of bits for further characters.

if a text uses the code point, its new-style unicode with the extender and the reader has to be updated.

if it doesnt use it, its backwards compatible

you might even call the extender and the further bits a pair. maybe one is a high and one is a low. maybe, and stick with me here, we call them "surrogate" because they're a substitute for the proper full value.

nah this won't work. unicode has reached its maximum size

Bloody
Mar 3, 2013

how hosed up is unicode that 32 bits turns into a million values
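for anyone wondering where the million comes from: a rough sketch of the surrogate-pair arithmetic, assuming the standard UTF-16 ranges (high surrogates start at 0xD800, low surrogates at 0xDC00). the function and constant names are made up for illustration. each surrogate carries 10 bits, so a pair of 16-bit units only buys 20 bits of payload on top of the BMP.

```python
# Sketch of UTF-16 surrogate-pair encoding and the code space limit it implies.
# Names here are illustrative, not taken from any library.

HIGH_START = 0xD800   # first high (lead) surrogate
LOW_START = 0xDC00    # first low (trail) surrogate
BMP_SIZE = 0x10000    # code points reachable with a single 16-bit unit


def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into (high, low)."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - BMP_SIZE  # fits in 20 bits
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


def from_surrogates(high: int, low: int) -> int:
    """Recombine a high/low surrogate pair into a single code point."""
    return BMP_SIZE + ((high - HIGH_START) << 10) + (low - LOW_START)


# 1024 high surrogates x 1024 low surrogates, plus the 65,536 BMP values.
TOTAL_CODE_POINTS = 1024 * 1024 + BMP_SIZE

if __name__ == "__main__":
    high, low = to_surrogates(0x1F44D)  # thumbs up emoji
    print(hex(high), hex(low), TOTAL_CODE_POINTS)
```

so two 16-bit units give 2^20 extra values, not 2^32, and that total is the 1,114,112 figure quoted earlier in the thread.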

compuserved
Mar 20, 2006

Nap Ghost
looking forward to exhausting all remaining code points with emoji

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



pokeyman posted:

you might even call the extender and the further bits a pair. maybe one is a high and one is a low. maybe, and stick with me here, we call them "surrogate" because they're a substitute for the proper full value.

nah this won't work. unicode has reached its maximum size

:shrug: i guess theres no way to make it work :shrug:

Sapozhnik
Jan 2, 2005

Nap Ghost
i think there's emoji ligatures now?

only the dead can know peace from this madness

compuserved
Mar 20, 2006

Nap Ghost
just disregard RFC 3629 and allow utf-8 to represent code points up to U+7FFFFFFF. then force everything to use it. problem solved
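a minimal sketch of what that would look like, assuming the older pre-RFC 3629 byte layout (up to six bytes per code point, as in RFC 2279). `encode_utf8_extended` is a hypothetical helper, not a real library function, and it skips the surrogate check a strict encoder would do.

```python
def encode_utf8_extended(cp: int) -> bytes:
    """Encode a code point up to U+7FFFFFFF using the old 1-to-6-byte UTF-8 layout.

    Illustrative only: for values up to U+10FFFF the bytes match modern UTF-8,
    above that the output is invalid under RFC 3629.
    """
    if cp < 0:
        raise ValueError("code point must be non-negative")
    if cp < 0x80:
        return bytes([cp])  # plain ASCII, one byte
    # (exclusive upper bound, lead-byte marker, number of continuation bytes)
    layouts = (
        (0x800, 0xC0, 1),
        (0x10000, 0xE0, 2),
        (0x200000, 0xF0, 3),
        (0x4000000, 0xF8, 4),
        (0x80000000, 0xFC, 5),
    )
    for limit, lead, n_cont in layouts:
        if cp < limit:
            # Lead byte carries the top bits; each continuation carries 6 bits.
            out = [lead | (cp >> (6 * n_cont))]
            for i in range(n_cont - 1, -1, -1):
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
    raise ValueError("code point above U+7FFFFFFF")


if __name__ == "__main__":
    print(encode_utf8_extended(0x1F44D).hex())      # same bytes as modern UTF-8
    print(encode_utf8_extended(0x7FFFFFFF).hex())   # six bytes, not valid today
```

the catch is that nothing above U+10FFFF can be represented in utf-16, which is why the limit got written into the spec in the first place.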

compuserved
Mar 20, 2006

Nap Ghost
then use them all for emoji

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



compuserved posted:

then use them all for emoji

👍

pseudorandom name
May 6, 2007

Mr Dog posted:

i think there's emoji ligatures now?

only the dead can know peace from this madness

you can blame the blacks and the gays for that

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



pseudorandom name posted:

you can blame the blacks and the gays for that

it was a good idea but because people are racist irl, nobody really wants to use the darker colors (except white people who want to be all :whatup: yo)

pseudorandom name
May 6, 2007

well, it's the degenerate satanists at Apple's fault, really

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
there are a ton of multi-code-point glyphs, the amazing name for this is extended grapheme cluster. there are a lot of non-dumb examples like accent combining characters and decomposed encodings of hangul, but there are plenty of dumb examples, too! like non-racist emoji and the national flags that are encoded by spelling out the iso 2-letter country codes using the 26 regional indicator symbol code points
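a small sketch of two of those cases, assuming only the python standard library. `flag` is a made-up helper name. note that python's `len` counts code points, not grapheme clusters, which is exactly the mismatch being described.

```python
import unicodedata

REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code: str) -> str:
    """Spell a two-letter ISO country code with regional indicator symbols."""
    code = country_code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected a two-letter A-Z country code")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


# One flag on screen (font permitting), two code points in the string.
dk = flag("DK")

# One accented letter on screen, two code points: base letter + combining mark.
e_acute = "e\u0301"

if __name__ == "__main__":
    for text in (dk, e_acute):
        print(len(text), [unicodedata.name(ch) for ch in text])
```

whether the pair actually renders as a flag is up to the font and the platform; the string itself is just two regional indicator code points.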

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
fun fact! the two-letter code things aren't actually bounded at two letters. thus an extended grapheme cluster is actually unbounded in size.

swift is the one language i know about that supports this correctly. thank you rjmccall for "doing it right"

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Suspicious Dish posted:

fun fact! the two-letter code things aren't actually bounded at two letters. thus an extended grapheme cluster is actually unbounded in size.

swift is the one language i know about that supports this correctly. thank you rjmccall for "doing it right"

yaaaay

pretty sure we still don't canonicalize identifiers in the parser, fwiw
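to make "canonicalize" concrete: a sketch in python (not swift, and not how any particular compiler does it) of why two identifiers that look identical can compare unequal until they are normalized. NFC is an assumption here; a real parser might pick a different normalization form. `canonical_identifier` is a hypothetical name.

```python
import unicodedata


def canonical_identifier(name: str) -> str:
    """Return the NFC-normalized form of an identifier (illustrative choice of form)."""
    return unicodedata.normalize("NFC", name)


# Same rendered text, different code point sequences.
composed = "caf\u00e9"        # precomposed e with acute
decomposed = "cafe\u0301"     # e followed by COMBINING ACUTE ACCENT

# Decomposed hangul jamo versus the precomposed syllable.
hangul_jamo = "\u1112\u1161\u11ab"
hangul_syllable = "\ud55c"

if __name__ == "__main__":
    print(composed == decomposed)  # raw comparison sees different strings
    print(canonical_identifier(composed) == canonical_identifier(decomposed))
    print(canonical_identifier(hangul_jamo) == hangul_syllable)
```

without a step like this, a symbol table keyed on raw strings would treat the two spellings as different names.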

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



probably a good idea to spec it, just to avoid problems further down the line

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



btw rjmccall can i ask you whats the best way to ask apple to make an API for the global dictionary?

i can make some xml stuff, but i dont want to download a million web-pages. i just want to make a small bit of code thats given a word, looks it up somewhere, transforms the html or json, and pushes it back to the dictionary.

bugticket/email support/email tim?

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
a bug, i guess? i have no idea what you're talking about but emailing customer support does not sound like a productive use of your time

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



thx ill try and make a bug tomorrow

for reference: theres a dictionary developers guide but its basically an XML file format description. theres no way to write a plugin that looks up stuff elsewhere (websites w other languages etc)
https://developer.apple.com/library...0006152-CH3-SW7

fuck the mods
Mar 30, 2015
Perl6 supports all Unicode poo poo to spec (for now)

ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM [Lo] (ﯹ)

VikingofRock
Aug 24, 2008




I think Rust handles Unicode pretty well, but I don't really do anything but goofy spare time projects in it so idk for sure

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



VikingofRock posted:

I think I handle Unicode pretty well, but I don't really do anything but goofy spare time projects in it so idk for sure

                                                    /

VikingofRock
Aug 24, 2008




I guess my post was a bit of a Rusty venture

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



VikingofRock posted:

I guess my post was a bit of a Rusty venture

HEYOOOOOOO~

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



it wasnt actually a dig at you btw, its just Rust handling unicode

VikingofRock
Aug 24, 2008




Snapchat A Titty posted:

it wasnt actually a dig at you btw, its just Rust handling unicode

Oooh I had missed your edit of my quote.

qntm
Jun 17, 2009
the angular material design icon font which we use uses ligatures to represent icons

so the text content of the div is "zoom_in" but thanks to ligatures it represents itself as a magnifying glass with a plus sign

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



qntm posted:

the angular material design icon font which we use uses ligatures to represent icons

so the text content of the div is "zoom_in" but thanks to ligatures it represents itself as a magnifying glass with a plus sign

so youre using an encoding that your font doesnt support?

definitely not your fault!

Athas
Aug 6, 2007

fuck that joker
This is why my own PL doesn't even have a character type. Mad props to those who can figure out how to implement the Unicode nonsense.

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

qntm posted:

the angular material design icon font which we use uses ligatures to represent icons

so the text content of the div is "zoom_in" but thanks to ligatures it represents itself as a magnifying glass with a plus sign

this is so bizarre
I guess it's the natural endpoint of those embedded fonts that encode scalable icons

e: even better actually, it's meaningful text instead of some random character that maps to an icon

hackbunny fucked around with this message at 10:14 on May 21, 2016

Sapozhnik
Jan 2, 2005

Nap Ghost
http://sansbullshitsans.com/

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Athas posted:

This is why my own PL doesn't even have a character type. Mad props to those who can figure out how to implement the Unicode nonsense.

it's ez you just do whatever swift does

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

hackbunny posted:

this is so bizarre
I guess it's the natural endpoint of those embedded fonts that encode scalable icons

e: even better actually, it's meaningful text instead of some random character that maps to an icon

yeah, it's way better than the old approach of just using private range codepoints because you don't just have a bunch of boxes if the font fails to load or is disabled etc.

suffix
Jul 27, 2013

Wheeee!
utf-16 mainly just shows how good utf-8 is

Series DD Funding
Nov 25, 2014

by exmarx
so does crates.io seriously not have package signing, wtf mozilla

ultramiraculous
Nov 12, 2003

"No..."
Grimey Drawer

rjmccall posted:

the national flags that are encoded by spelling out the iso 2-letter country codes using the 26 regional indicator symbol code points

this one is my favorite by far

ultramiraculous
Nov 12, 2003

"No..."
Grimey Drawer
also the way that facebook sometimes renders the skin-colored emojis using the regular emoji with a fitzpatrick-scale colored square beside it.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

ultramiraculous posted:

also the way that facebook sometimes renders the skin-colored emojis using the regular emoji with a fitzpatrick-scale colored square beside it.

wasn't this a fuckup with text rendering in chrome

prefect
Sep 11, 2001

No one, Woodhouse.
No one.




Dead Man’s Band
what does it mean when a programming language is described as "expressive"?

e.g. Flat learning curve

Concise, readable and expressive syntax, easy to learn for Java developers

tef
May 30, 2004

-> some l-system crap ->

prefect posted:

what does it mean when a programming language is described as "expressive"?

e.g. Flat learning curve

Concise, readable and expressive syntax, easy to learn for Java developers

"some sugar for the code i hate, it looks a bit like ruby, and we've borrowed the rest from java"
