|
TheFluff posted:to a scandinavian, ö is an entirely separate letter which sorts at the end of the alphabet and is just as distinct from o as a is from b, to the point that text that replaces ö with o is markedly hard to read, and a text search that treated the two as equal would be borderline useless. relevant
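A minimal sketch of the sorting point, assuming Swedish alphabet order (a to z, then å, ä, ö); Danish and Norwegian order their extra letters differently. `SWEDISH_ALPHABET` and `swedish_key` are made-up names for illustration, not a library API, and real code would probably lean on a proper collation library instead.

```python
# Hypothetical illustration: Swedish alphabet order, with å, ä, ö after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def swedish_key(word: str) -> list[int]:
    """Sort key that ranks each letter by its position in the Swedish alphabet."""
    ranks = []
    for ch in word.lower():
        pos = SWEDISH_ALPHABET.find(ch)
        # Characters outside the alphabet sort after it, ordered by code point.
        ranks.append(pos if pos >= 0 else len(SWEDISH_ALPHABET) + ord(ch))
    return ranks


words = ["öl", "zon", "ål", "är"]
by_code_point = sorted(words)                  # plain code point order: ä sorts before å
by_alphabet = sorted(words, key=swedish_key)   # alphabet order: å sorts before ä
```

Neither ordering ever treats ö as o, which is the point of the quoted post: `"ö" == "o"` is simply false unless something strips the diacritic on purpose.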
|
# ? Mar 5, 2019 14:29 |
|
oh so that's why you have to set the language at the first step of windows install and it warns you it cannot be changed because it's not just selecting the display language, it's writing a fundamental part of the filesystem lol
|
# ? Mar 5, 2019 15:45 |
Lutha Mahtin posted:at first i was like The PL Thread: how a is not a, just compare them looool
|
|
# ? Mar 5, 2019 17:15 |
|
Captain Foo posted:oh so that's why you have to set the language at the first step of windows install and it warns you it cannot be changed
nah, pretty sure that's stuff like the boot loader and recovery tools, which windows will not casually overwrite every time a language setting is changed
the downcasing i think is the same for all locales (which is why it is fascinating that it can be made to vary partition-to-partition), whereas stuff like collation is indeed changed by changing the per-user settings
|
# ? Mar 5, 2019 18:18 |
|
Turkish and Azerbaijani have İ=i and I=ı and everybody else has I=i. It is literally the only locale-specific case folding rule, and affects both upcasing and downcasing. Of course there are also locale-independent rules, like the uppercase Σ downcasing to either σ or ς depending on whether it is the final letter in the word. Fortunately you can just uppercase both of them to Σ, so that's easy to represent in the $UpCase table. (This should've been a shaping rule like Arabic, but Unicode was designed to losslessly round-trip to the character encodings that it subsumed, and pre-Unicode Greek encodings had two distinct lowercase Σ because font shaping didn't exist yet.)
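A rough sketch of both rules in Python, whose built-in `str.upper()` and `str.lower()` are locale-independent. `turkic_upper` and `turkic_lower` are hypothetical helpers written for this illustration (not a standard library API); they only special-case the dotted and dotless i pairs and leave everything else to the default mappings.

```python
def turkic_upper(text: str) -> str:
    """Uppercase using the Turkish/Azerbaijani dotted and dotless i pairs."""
    # Map i -> İ (U+0130) and ı (U+0131) -> I first, then apply default casing.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def turkic_lower(text: str) -> str:
    """Lowercase using the Turkish/Azerbaijani dotted and dotless i pairs."""
    # Map İ (U+0130) -> i and I -> ı (U+0131) first, then apply default casing.
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


# Default casing pairs I with i.
default_pair = ("I".lower(), "i".upper())

# Turkic casing pairs I with ı and İ with i.
turkic_pairs = (turkic_lower("I"), turkic_upper("i"))

# Final sigma: the Greek word ΟΔΟΣ, written with escapes to avoid lookalike letters.
odos_upper = "\u039f\u0394\u039f\u03a3"
odos_lower = odos_upper.lower()  # the last letter becomes ς (U+03C2), not σ (U+03C3)

# Both lowercase sigmas uppercase to the same Σ (U+03A3).
sigma_uppers = ("\u03c3".upper(), "\u03c2".upper())
```

Because both lowercase sigmas map to one uppercase letter, comparing upcased names sidesteps the final-sigma question entirely, which matches the point about the $UpCase table.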
|
# ? Mar 5, 2019 18:34 |
|
pseudorandom name posted:Turkish and Azerbaijani have İ=i and I=ı and everybody else has I=i. It is literally the only locale-specific case folding rule, and affects both upcasing and downcasing.
what about things like SS maybe downcasing to ß in german or maybe not because it depends on the word
|
# ? Mar 5, 2019 18:39 |
|
I was under the impression that wasn't context-dependent, just one of those fun cases where case folding changes the grapheme cluster length of the string.
edit: the good news is that (as of 2017) ß uppercases to ẞ
pseudorandom name fucked around with this message at 18:47 on Mar 5, 2019 |
# ? Mar 5, 2019 18:42 |
|
nah it's contextual. WASSER -> wasser, STRASSE -> straße. except in switzerland where STRASSE -> strasse, obviously
i see ẞ is also a thing now in standard german orthography, it's just optional so the SS form is still standard
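For what the default, context-free Unicode mappings do here, a small Python sketch (Python's string methods apply the default mappings only, so they know nothing about which German words take ß). `caseless_equal` is a hypothetical helper name used for illustration.

```python
lower_sharp = "\u00df"  # ß
upper_sharp = "\u1e9e"  # ẞ

# The default uppercase of ß is still the two-letter "SS", so the length changes.
upper_of_sharp = lower_sharp.upper()

# ẞ lowercases back to ß.
lower_of_capital = upper_sharp.lower()

# Round-tripping through uppercase loses the ß: no dictionary, no context.
round_trip = "stra\u00dfe".upper().lower()


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings with full case folding, which turns ß and ẞ into 'ss'."""
    return a.casefold() == b.casefold()
```

Getting STRASSE back to straße while leaving WASSER as wasser needs a word list, which is the contextual part; case folding just gives up and treats ß as ss everywhere.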
|
# ? Mar 5, 2019 18:47 |
|
Soricidus posted:what about things like SS maybe downcasing to ß in german or maybe not because it depends on the word
windows doesn't implement Unicode casing rules, like that one. or at least that's what michael kaplan says
ALSO unicode has an uppercase sharp-s for some goddamned reason
|
# ? Mar 5, 2019 18:49 |
|
hackbunny posted:ALSO unicode has an uppercase sharp-s for some goddamned reason
it's because the Germans have been arguing about it for a century and finally gave in to sanity in 2017
|
# ? Mar 5, 2019 18:51 |
|
the alphabet poo poo has been quite enlightening and i’m glad english has hosed up grammar and inconsistent pronunciation rather than dealing with weird looking letters
|
# ? Mar 5, 2019 18:53 |
|
echinopsis posted:the alphabet poo poo has been quite enlightening and i’m glad english has hosed up grammar and inconsistent pronunciation rather than dealing with weird looking letters
they wouldn't look weird if you had seen them before
|
# ? Mar 5, 2019 18:55 |
|
even dealing with unicode is more interesting than frustrating though, it is just important to not make any assumptions about what is well-defined (and the consortium pretty much enumerates the things that are)
|
# ? Mar 5, 2019 18:56 |
|
also keep in mind that OS X has a case-insensitive Unicode normalized file system, but it doesn't store any kind of case folding table or deal with the fact that the Unicode normalization algorithm has changed with every revision of the spec
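To make the normalization issue concrete, a hedged Python sketch of name matching that is both case-insensitive and normalization-insensitive. This is not how the OS X file system implements it; `name_key` and `same_name` are hypothetical helpers, and `unicodedata` uses whatever Unicode version ships with the interpreter, which is exactly the kind of version dependence the post is pointing at.

```python
import unicodedata


def name_key(filename: str) -> str:
    """Comparison key: decompose, case fold, then decompose again."""
    decomposed = unicodedata.normalize("NFD", filename)
    return unicodedata.normalize("NFD", decomposed.casefold())


def same_name(a: str, b: str) -> bool:
    """True if two names collide under caseless, normalized comparison."""
    return name_key(a) == name_key(b)


composed = "\u00d6l.txt"     # Ö as a single precomposed code point
decomposed = "O\u0308L.TXT"  # O followed by a combining diaeresis, different case
```

The two names are different strings code point by code point, yet `same_name` reports a collision, while `Ol.txt` without the diaeresis stays a distinct name.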
|
# ? Mar 5, 2019 19:01 |
|
echinopsis posted:the alphabet poo poo has been quite enlightening and i’m glad english has hosed up grammar and inconsistent pronunciation rather than dealing with weird looking letters
they wouldn't have been 'weird' b/c theyd have been in ascii
|
# ? Mar 5, 2019 19:06 |
|
idea : scrap capital letters they don’t really do anything worthwhile they’re just there for the elite to feel smug about knowing of
|
# ? Mar 5, 2019 19:06 |
|
“I’ve Heard Of Capital Letters” “Do You Have Any GreY PouPon?”
|
# ? Mar 5, 2019 19:08 |
|
bring back the long ſ imo
|
# ? Mar 5, 2019 19:13 |
|
Krankenstyle posted:bring back the long ſ imo this guy ſucks
|
# ? Mar 5, 2019 19:15 |
|
echinopsis posted:the alphabet poo poo has been quite enlightening and i’m glad english has hosed up grammar and inconsistent pronunciation rather than dealing with weird looking letters
italian rules, the phonetics are extremely easy, the alphabet is technically smaller than english (jkwxy are only used in loanwords), all vowels can have a grave or acute accent and it always means the same thing (open vs closed), and the orthography is very regular (30 phonemes vs 30 graphemes). italian is written like it's pronounced
Krankenstyle posted:bring back the long ſ imo
ß is a ligature of ſ and s, so, in a sense,
hackbunny fucked around with this message at 19:27 on Mar 5, 2019 |
# ? Mar 5, 2019 19:22 |
|
danish has like 20-40* vowel phonemes compared to italian/english less than 10
*depending how theyre counted
|
# ? Mar 5, 2019 19:26 |
|
fritz posted:they wouldn't have been 'weird' b/c theyd have been in ascii
I'm french and I find some of our poo poo looks weird like ç, why the gently caress do we have this weird thing used by only like 3 words
|
# ? Mar 5, 2019 19:31 |
|
Zlodo posted:I'm french and I find some of our poo poo looks weird better question is you have that cool fancy tailed c why dont you use it in a lot more words
|
# ? Mar 5, 2019 19:43 |
|
hackbunny posted:italian rules, the phonetics are extremely easy, the alphabet is technically smaller than english (jkwxy are only used in loanwords), all vowels can have a grave or acute accent and it always means the same thing (open vs closed), and the orthography is very regular (30 phonemes vs 30 graphemes). italian is written like it's pronounced not coincidentally, the latin alphabet has some connections to the italian language
|
# ? Mar 5, 2019 19:44 |
|
Zlodo posted:I'm french and I find some of our poo poo looks weird
to be fair it's used in the most important french word, français
you can't have the language without the word for that language! that'd trigger Undefined Behaviour probably
|
# ? Mar 5, 2019 20:48 |
|
gonadic io posted:to be fair it's used in the most important french word, français and yet the word alphabet doesn’t contain the entire alphabet that it refers to, and you don’t complain about that?
|
# ? Mar 5, 2019 20:52 |
echinopsis posted:and yet the word alphabet doesn’t contain the entire alphabet that it refers to, and you don’t complain about that? alphabet in latvian literally is "abc-thing"
|
|
# ? Mar 5, 2019 20:53 |
|
cinci zoo sniper posted:alphabet in latvian literally is "abc-thing" it's called the alphabet because it starts alpha, beta, gamma ...
|
# ? Mar 5, 2019 20:55 |
|
Zlodo posted:I'm french and I find some of our poo poo looks weird my favorite bit in french is the little hat diacritic that sometimes means "heads up, we changed the spelling of this word a while ago". thanks, french language board, for letting me know!!
|
# ? Mar 5, 2019 21:43 |
|
at any rate i am grateful that i mostly write programs that deal with other computers rather than human input. trying to take some text that a human wrote and turn it into something a computer can do something useful with looks pretty awful. it's nice to be in a place where i only have to interpret well-defined things, maybe some text that's restricted to ascii, and anything a user typed is just a blob of data that can be stored without further processing and presented on demand for someone else to deal with
if that "someone else" is reading this: my condolences
|
# ? Mar 5, 2019 21:45 |
|
Lutha Mahtin posted:my favorite bit in french is the little hat diacritic that sometimes means "heads up, we changed the spelling of this word a while ago". thanks, french language board, for letting me know!! that's sometimes useful though, like you can stick an s in there and maybe be able to guess what the related english word is, like château <- chastel -> castle
|
# ? Mar 5, 2019 21:48 |
|
american cultural imperialism has pretty much paved the way for just doing everything in english and ignoring everything else
|
# ? Mar 5, 2019 21:59 |
|
echinopsis posted:american cultural imperialism has pretty much paved the way for just doing everything in english and ignoring everything else please, the british were whipping people for using their native language since literally before america existed
|
# ? Mar 5, 2019 22:14 |
|
gonadic io posted:please, the british were whipping people for using their native language since literally before america existed the english, please, not the british. plenty of the whipping happened to british people who unreasonably insisted on trying to speak cornish, welsh, gaelic, etc.
|
# ? Mar 5, 2019 22:41 |
|
Soricidus posted:the english, please, not the british. plenty of the whipping happened to british people who unreasonably insisted on trying to speak cornish, welsh, gaelic, etc. Plenty of those Scots, Welsh etc were also fully on board with the empire and down with eg whipping brown people in the New World, too, though.
|
# ? Mar 5, 2019 22:45 |
|
Soricidus posted:that's sometimes useful though, like you can stick an s in there and maybe be able to guess what the related english word is, like château <- chastel -> castle
this is just because parisian french is weird
the original loan-word from norman french was "castel," which proceeds fairly obviously from old-french "chastel" and leads pretty obviously to english "castle"
how the gently caress did the weirdos in ile d'france go from "chastel" to "chateau" and then also need a special diacritic to express how weird the sound is?
|
# ? Mar 5, 2019 23:36 |
|
"chateau" is also a cool word because it falls into my favorite bucket of loan-words: words borrowed from french twice, with different meanings
chateau vs castle
guarantee vs warranty
etc
|
# ? Mar 5, 2019 23:38 |
|
Notorious b.s.d. posted:this is just because parisian french is weird gaining or losing some internal consonants or a final L isn't so weird. my favorite english example is the city of bristol, which used to be bricstow before the local accent mangled it
|
# ? Mar 5, 2019 23:49 |
|
i’m quite intrigued by the evolution of the word dick into cock, but how we kept using dick anyway and now we have both
|
# ? Mar 6, 2019 00:26 |
|
the whole “if we evolved from apes how come we still have apes” thing but with words about my own genitals
|
# ? Mar 6, 2019 00:26 |