NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

TheFluff posted:

to a scandinavian, ö is an entirely separate letter which sorts at the end of the alphabet and is just as distinct from o as a is from b, to the point that text that replaces ö with o is markedly hard to read, and a text search that treated the two as equal would be borderline useless.

relevant

Captain Foo
May 11, 2004

we vibin'
we slidin'
we breathin'
we dyin'

oh so that's why you have to set the language at the first step of windows install and it warns you it cannot be changed

because it's not just selecting the display language, it's writing a fundamental part of the filesystem

lol

cinci zoo sniper
Mar 15, 2013




Lutha Mahtin posted:

at first i was like


but then i was like


and finally, i was like


:lsd:

The PL Thread: how a is not a, just compare them looool

Cybernetic Vermin
Apr 18, 2005

Captain Foo posted:

oh so that's why you have to set the language at the first step of windows install and it warns you it cannot be changed

because it's not just selecting the display language, it's writing a fundamental part of the filesystem

lol

nah, pretty sure that's stuff like the boot loader and recovery tools, which windows will not casually overwrite every time a language setting is changed

the downcasing i think is the same for all locales (which is why it is fascinating that it can be made to vary partition-to-partition), whereas stuff like collation is indeed changed by changing the per-user settings
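
to make the collation point concrete, here's a rough python sketch (not how windows does it, and `SWEDISH_ALPHABET` / `swedish_key` are names i made up for illustration; real collation should come from something like ICU/CLDR). it just shows that "sort by code point" and "sort like a swede" disagree about å and ä:

```python
import unicodedata

# Hypothetical, simplified alphabet table: Swedish puts å, ä, ö after z, in that order.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Sort key that ranks each letter by its position in SWEDISH_ALPHABET."""
    # Compose first so that "o" + combining diaeresis is looked up as "ö".
    composed = unicodedata.normalize("NFC", word).lower()
    # Characters outside the table sort after every known letter.
    return [RANK.get(ch, len(RANK)) for ch in composed]


words = ["öl", "åka", "äta", "zon", "ost"]

by_code_point = sorted(words)                 # plain code point order
by_swedish = sorted(words, key=swedish_key)   # order from the table above

print(by_code_point)  # ['ost', 'zon', 'äta', 'åka', 'öl']
print(by_swedish)     # ['ost', 'zon', 'åka', 'äta', 'öl']
```

a german-style collation would instead put "öl" right next to "ol", which is the whole reason this has to be a per-locale (or per-user) setting.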

pseudorandom name
May 6, 2007

Turkish and Azerbaijani have İ=i and I=ı and everybody else has I=i. It is literally the only locale-specific case folding rule, and affects both upcasing and downcasing.

Of course there's also locale independent rules like the uppercase Σ downcasing to either σ or ς depending on whether it is the final letter in the word. Fortunately you can just uppercase both of them to Σ, so that's easy to represent in the $UpCase table. (This should've been a shaping rule like Arabic, but Unicode was designed to losslessly round-trip to character encodings that it subsumed, and pre-Unicode Greek encodings had two distinct lowercase Σ because font shaping didn't exist yet.)
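
If you want to poke at both rules yourself, here is a minimal Python sketch. Python's `str.lower()` / `str.upper()` are locale-independent, so the Turkish behaviour has to be bolted on; `turkish_lower` and `turkish_upper` are hypothetical helpers I'm making up for illustration, not anything from a real library (ICU is what you'd actually use).

```python
def turkish_lower(text):
    """Lowercase using the Turkish/Azerbaijani dotted/dotless i pairing."""
    # Map the two special capitals first, then let the default rules do the rest.
    return text.replace("İ", "i").replace("I", "ı").lower()


def turkish_upper(text):
    """Uppercase using the Turkish/Azerbaijani dotted/dotless i pairing."""
    return text.replace("i", "İ").replace("ı", "I").upper()


# Default (locale-independent) behaviour: I <-> i.
print("I".lower())                 # i
print(len("İ".lower()))            # 2 (i followed by U+0307 combining dot above)

# Turkish-style behaviour via the hypothetical helpers.
print(turkish_lower("ISPARTA"))    # ısparta
print(turkish_lower("İSTANBUL"))   # istanbul
print(turkish_upper("istanbul"))   # İSTANBUL

# Final sigma: lowercasing is context-dependent, uppercasing is not.
print("ΟΔΟΣ".lower())              # οδος (ends in ς, the final form)
print("σ".upper() == "ς".upper())  # True (both become Σ)
```

The sigma half is why a simple one-way upcase table works: both lowercase forms collapse to the same capital, so you never have to decide which one to produce.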

Soricidus
Oct 21, 2010
freedom-hating statist shill

pseudorandom name posted:

Turkish and Azerbaijani have İ=i and I=ı and everybody else has I=i. It is literally the only locale-specific case folding rule, and affects both upcasing and downcasing.

Of course there's also locale independent rules like the uppercase Σ downcasing to either σ or ς depending on whether it is the final letter in the word. Fortunately you can just uppercase both of them to Σ, so that's easy to represent in the $UpCase table. (This should've been a shaping rule like Arabic, but Unicode was designed to losslessly round-trip to character encodings that it subsumed, and pre-Unicode Greek encodings had two distinct lowercase Σ because font shaping didn't exist yet.)

what about things like SS maybe downcasing to ß in german or maybe not because it depends on the word

pseudorandom name
May 6, 2007

I was under the impression that wasn't context-dependent, just one of those fun cases where case folding changes the grapheme cluster length of the string.

edit: the good news is that (as of 2017) ß uppercases to ẞ
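
For what it's worth, here is a small Python sketch of the length-changing part. Caveat: Python's `str.upper()` follows Unicode's default mappings, and there ß still uppercases to SS; the capital ẞ only lowercases back to ß. `round_trips` is a made-up helper name for illustration.

```python
def round_trips(word):
    """Return True if upper-casing then lower-casing gives the word back."""
    return word.upper().lower() == word


word = "straße"
shouted = word.upper()

print(shouted)                    # STRASSE
print(len(word), len(shouted))    # 6 7
print(shouted.lower())            # strasse (the ß is gone for good)

print(round_trips("wasser"))      # True
print(round_trips("straße"))      # False

# The capital sharp s exists, and lowercases to the ordinary one.
print("ẞ".lower() == "ß")         # True

# casefold() is the tool for caseless comparison: it expands ß to ss.
print("straße".casefold() == "STRASSE".casefold())  # True
```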

pseudorandom name fucked around with this message at 18:47 on Mar 5, 2019

Soricidus
Oct 21, 2010
freedom-hating statist shill
nah it's contextual. WASSER -> wasser, STRASSE -> straße. except in switzerland where STRASSE -> strasse, obviously

i see ẞ is also a thing now in standard german orthography, it's just optional so the SS form is still standard

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

Soricidus posted:

what about things like SS maybe downcasing to ß in german or maybe not because it depends on the word

windows doesn't implement Unicode casing rules, like that one. or at least that's what michael kaplan says
ALSO unicode has an uppercase sharp-s for some goddamned reason

pseudorandom name
May 6, 2007

hackbunny posted:

ALSO unicode has an uppercase sharp-s for some goddamned reason

it's because the Germans have been arguing about it for a century and finally gave in to sanity in 2017

echinopsis
Apr 13, 2004

by Fluffdaddy
the alphabet poo poo has been quite enlightening and i’m glad english has hosed up grammar and inconsistent pronunciation rather than dealing with weird looking letters

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



echinopsis posted:

the alphabet poo poo has been quite enlightening and i’m glad english has hosed up grammar and inconsistent pronunciation rather than dealing with weird looking letters

they wouldn't look weird if you had seen them before :wth:

Cybernetic Vermin
Apr 18, 2005

even dealing with unicode is more interesting than frustrating though, it is just important to not make any assumptions about what is well-defined (and the consortium pretty much enumerates the things that are)

pseudorandom name
May 6, 2007

also keep in mind that OS X has a case-insensitive Unicode normalized file system, but it doesn't store any kind of case folding table or deal with the fact that the Unicode normalization algorithm has changed with every revision of the spec
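
rough python sketch of why normalization matters for file names. to be clear this is not the actual HFS+ algorithm (that uses its own frozen variant of decomposition); `names_collide` is a hypothetical helper that just approximates "normalized and case-insensitive" with the standard library:

```python
import unicodedata

composed = "\u00f6"      # ö as one precomposed code point
decomposed = "o\u0308"   # o followed by a combining diaeresis

print(composed == decomposed)                                  # False
print(unicodedata.normalize("NFD", composed) == decomposed)    # True
print(unicodedata.normalize("NFC", decomposed) == composed)    # True


def names_collide(a, b):
    """Approximate a normalized, case-insensitive file name comparison."""
    def key(name):
        # Decompose, fold case, then decompose again in case folding
        # produced characters that are not in decomposed form.
        folded = unicodedata.normalize("NFD", name).casefold()
        return unicodedata.normalize("NFD", folded)
    return key(a) == key(b)


print(names_collide("\u00d6l.txt", "o\u0308l.TXT"))   # True
print(names_collide("ol.txt", "\u00f6l.txt"))         # False

# The tables are tied to a Unicode version, which is the versioning problem.
print(unicodedata.unidata_version)
```

the last line prints whatever Unicode version your interpreter shipped with, so two machines can in principle disagree about the result for newer characters.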

fritz
Jul 26, 2003

echinopsis posted:

the alphabet poo poo has been quite enlightening and i’m glad english has hosed up grammar and inconsistent pronunciation rather than dealing with weird looking letters

they wouldn't have been 'weird' b/c they'd have been in ascii

echinopsis
Apr 13, 2004

by Fluffdaddy
idea : scrap capital letters they don’t really do anything worthwhile they’re just there for the elite to feel smug about knowing of

echinopsis
Apr 13, 2004

by Fluffdaddy
:smugmrgw: “I’ve Heard Of Capital Letters” :smugmrgw: “Do You Have Any GreY PouPon?” :smugmrgw:

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



bring back the long ſ imo

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003

Krankenstyle posted:

bring back the long ſ imo

this guy ſucks

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

echinopsis posted:

the alphabet poo poo has been quite enlightening and i’m glad english has hosed up grammar and inconsistent pronunciation rather than dealing with weird looking letters

italian rules, the phonetics are extremely easy, the alphabet is technically smaller than english (jkwxy are only used in loanwords), all vowels can have a grave or acute accent and it always means the same thing (open vs closed), and the orthography is very regular (30 phonemes vs 30 graphemes). italian is written like it's pronounced

Krankenstyle posted:

bring back the long ſ imo

ß is a ligature of ſ and s, so, in a sense,

hackbunny fucked around with this message at 19:27 on Mar 5, 2019

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



danish has like 20-40* vowel phonemes compared to italian/english less than 10 :smug:

*depending how theyre counted

Zlodo
Nov 25, 2006

fritz posted:

they wouldn't have been 'weird' b/c they'd have been in ascii

I'm french and I find some of our poo poo looks weird
like ç, why the gently caress do we have this weird thing used by only like 3 words

fritz
Jul 26, 2003

Zlodo posted:

I'm french and I find some of our poo poo looks weird
like ç, why the gently caress do we have this weird thing used by only like 3 words

better question is you have that cool fancy tailed c why dont you use it in a lot more words

Notorious b.s.d.
Jan 25, 2003

by Reene

hackbunny posted:

italian rules, the phonetics are extremely easy, the alphabet is technically smaller than english (jkwxy are only used in loanwords), all vowels can have a grave or acute accent and it always means the same thing (open vs closed), and the orthography is very regular (30 phonemes vs 30 graphemes). italian is written like it's pronounced

not coincidentally, the latin alphabet has some connections to the italian language

gonadic io
Feb 16, 2011

>>=

Zlodo posted:

I'm french and I find some of our poo poo looks weird
like ç, why the gently caress do we have this weird thing used by only like 3 words

to be fair it's used in the most important french word, français

you can't have the language without the word for that language! that'd trigger Undefined Behaviour probably

echinopsis
Apr 13, 2004

by Fluffdaddy

gonadic io posted:

to be fair it's used in the most important french word, français

you can't have the language without the word for that language! that'd trigger Undefined Behaviour probably

and yet the word alphabet doesn’t contain the entire alphabet that it refers to, and you don’t complain about that?

cinci zoo sniper
Mar 15, 2013




echinopsis posted:

and yet the word alphabet doesn’t contain the entire alphabet that it refers to, and you don’t complain about that?

alphabet in latvian literally is "abc-thing"

Notorious b.s.d.
Jan 25, 2003

by Reene

cinci zoo sniper posted:

alphabet in latvian literally is "abc-thing"

it's called the alphabet because it starts alpha, beta, gamma ...

Lutha Mahtin
Oct 10, 2010

Your brokebrain sin is absolved...go and shitpost no more!

Zlodo posted:

I'm french and I find some of our poo poo looks weird
like ç, why the gently caress do we have this weird thing used by only like 3 words

my favorite bit in french is the little hat diacritic that sometimes means "heads up, we changed the spelling of this word a while ago". thanks, french language board, for letting me know!! :)

Soricidus
Oct 21, 2010
freedom-hating statist shill
at any rate i am grateful that i mostly write programs that deal with other computers rather than human input. trying to take some text that a human wrote and turn it into something a computer can do something useful with looks pretty awful. it's nice to be in a place where i only have to interpret well-defined things, maybe some text that's restricted to ascii, and anything a user typed is just a blob of data that can be stored without further processing and presented on demand for someone else to deal with

if that "someone else" is reading this: my condolences

Soricidus
Oct 21, 2010
freedom-hating statist shill

Lutha Mahtin posted:

my favorite bit in french is the little hat diacritic that sometimes means "heads up, we changed the spelling of this word a while ago". thanks, french language board, for letting me know!! :)

that's sometimes useful though, like you can stick an s in there and maybe be able to guess what the related english word is, like château <- chastel -> castle

echinopsis
Apr 13, 2004

by Fluffdaddy
american cultural imperialism has pretty much paved the way for just doing everything in english and ignoring everything else

gonadic io
Feb 16, 2011

>>=

echinopsis posted:

american cultural imperialism has pretty much paved the way for just doing everything in english and ignoring everything else

please, the british were whipping people for using their native language since literally before america existed

Soricidus
Oct 21, 2010
freedom-hating statist shill

gonadic io posted:

please, the british were whipping people for using their native language since literally before america existed

the english, please, not the british. plenty of the whipping happened to british people who unreasonably insisted on trying to speak cornish, welsh, gaelic, etc.

feedmegin
Jul 30, 2008

Soricidus posted:

the english, please, not the british. plenty of the whipping happened to british people who unreasonably insisted on trying to speak cornish, welsh, gaelic, etc.

Plenty of those Scots, Welsh etc were also fully on board with the empire and down with eg whipping brown people in the New World, too, though.

Notorious b.s.d.
Jan 25, 2003

by Reene

Soricidus posted:

that's sometimes useful though, like you can stick an s in there and maybe be able to guess what the related english word is, like château <- chastel -> castle

this is just because parisian french is weird

the original loan-word from norman french was "castel," which proceeds fairly obviously from old-french "chastel" and leads pretty obviously to english "castle"

how the gently caress did the weirdos in ile d'france go from "chastel" to "chateau" and then also need a special diacritic to express how weird the sound is?

Notorious b.s.d.
Jan 25, 2003

by Reene
"chateau" is also a cool word because it falls into my favorite bucket of loan-words: words borrowed from french twice, with different meanings


chateau vs castle
guarantee vs warranty
etc

Soricidus
Oct 21, 2010
freedom-hating statist shill

Notorious b.s.d. posted:

this is just because parisian french is weird

the original loan-word from norman french was "castel," which proceeds fairly obviously from old-french "chastel" and leads pretty obviously to english "castle"

how the gently caress did the weirdos in ile d'france go from "chastel" to "chateau" and then also need a special diacritic to express how weird the sound is?

gaining or losing some internal consonants or a final L isn't so weird. my favorite english example is the city of bristol, which used to be bricstow before the local accent mangled it

echinopsis
Apr 13, 2004

by Fluffdaddy
i’m quite intrigued by the evolution of the word dick into cock, but how we kept using dick anyway and now we have both

echinopsis
Apr 13, 2004

by Fluffdaddy
the whole “if we evolved from apes how come we still have apes” thing but with words about my own genitals
