|
pseudorandom name posted: Swift calls your HashFunction type Hasher
quote: The universal hash function used by Set and Dictionary.
|
# ? Mar 3, 2019 20:54 |
|
yeah, like I said, your complaint is that Set and Dictionary don't take custom hashers as a parameter to their initializers
|
# ? Mar 3, 2019 20:56 |
|
It's actually that custom hashers aren't even a concept that is compatible with Swift's concept of hashability, but I won't argue that Set and Dictionary not having initializers for them is a natural consequence of that.
|
# ? Mar 3, 2019 20:57 |
|
oh for fucks sake, I'm sorry, I understand what you're complaining about now
I was under the wrong impression that Hasher was a protocol not a struct because why would you correctly generalize Hashable and then gently caress up Hasher
|
# ? Mar 3, 2019 21:01 |
|
we did it!!!!!!
|
# ? Mar 3, 2019 21:03 |
|
pseudorandom name posted: see how few posts that took when reading comprehension was involved?
pseudorandom name posted: oh for fucks sake, I'm sorry, I understand what you're complaining about now
|
# ? Mar 3, 2019 21:05 |
|
i bet pseudorandom name's face is so red right now but respect for coming around in the end
|
# ? Mar 3, 2019 21:06 |
|
Makes sense, I had the same misunderstanding about Hasher initially. Sadly even if Hasher was a protocol, it would be a red herring, because at best you could choose which operation you use to combine the members together, rather than customize the entire notion of a hash for a particular type.
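To make that concrete, here's a rough sketch of what a type actually gets to decide. The `Point` type and its `label` field are made up for illustration. The type picks which components to feed into the `Hasher` it is handed; how they get mixed is up to `Hasher`, which is a concrete struct and can't be swapped out by the caller.

```swift
// Hypothetical example type: `label` is deliberately left out of equality and hashing.
struct Point: Hashable {
    var x: Int
    var y: Int
    var label: String

    static func == (lhs: Point, rhs: Point) -> Bool {
        lhs.x == rhs.x && lhs.y == rhs.y
    }

    // The only customization point: WHAT goes into the hasher, not HOW it is mixed.
    func hash(into hasher: inout Hasher) {
        hasher.combine(x)
        hasher.combine(y)
    }
}

let a = Point(x: 1, y: 2, label: "a")
let b = Point(x: 1, y: 2, label: "b")
let points = Set([a, b])
print(points.count) // 1: the two values are equal and hash the same
```

Nothing here lets a `Set<Point>` be told to hash its elements some other way; that decision lives in the element type.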
|
# ? Mar 3, 2019 21:07 |
|
oh god and you wrote it right there in the very first post
|
# ? Mar 3, 2019 21:08 |
|
if the hash suits smoke it
|
# ? Mar 3, 2019 21:09 |
|
DONT THREAD ON ME posted: i bet pseudorandom name's face is so red right now but respect for coming around in the end
|
# ? Mar 3, 2019 21:21 |
|
as someone who has just spent a day wrangling siphash and fnv, these last two pages were traumatic
|
# ? Mar 3, 2019 21:29 |
|
Volte posted: Makes sense, I had the same misunderstanding about Hasher initially
me too but I checked before committing to it. I edit so much wrong stuff out of my posts before hitting submit that I wonder if what's left makes sense
|
# ? Mar 3, 2019 21:40 |
|
y'all really hit my inferiority complex with this sheer corpus of knowledge you guys have. gd
|
# ? Mar 3, 2019 21:43 |
|
I just use trees so I don't have to worry about computing hashes.
|
# ? Mar 3, 2019 21:49 |
|
Athas posted: I just use trees so I don't have to worry about computing hashes.
let’s have this argument again but with generated ordering operators
|
# ? Mar 3, 2019 21:52 |
|
Sweeper posted: let’s have this argument again but with generated ordering operators
lol, I don't know of many languages that have ordered sets/maps that take an ord function in their constructor. most of the ones i use require you to wrap your type in a newtype wrapper with a different ord instance. haskell has this stuff built in, you can do stuff like
[3, 4].map(x => Down(x)).sort().map(d => unDown(d)) == [4, 3]
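For comparison, a minimal Swift sketch of the same newtype trick. The `Down` struct here is hand-rolled for illustration, not something the standard library provides. It wraps a value and flips its ordering, so a plain `sorted()` comes out descending.

```swift
// Hypothetical wrapper: same value, reversed ordering.
struct Down<Value: Comparable>: Comparable {
    let value: Value

    init(_ value: Value) {
        self.value = value
    }

    // Reverse the comparison; == is synthesized from the stored property.
    static func < (lhs: Down, rhs: Down) -> Bool {
        rhs.value < lhs.value
    }
}

let sortedDescending = [3, 4].map { Down($0) }.sorted().map { $0.value }
print(sortedDescending) // [4, 3]
```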
|
# ? Mar 3, 2019 21:56 |
|
gonadic io posted: lol, I don't know of many languages that have ordered sets/maps that take a ord function in their constructor. most of the ones i use require you to wrap your type in a newtype wrapper with a different ord instance. haskell has this stuff built in, you can do stuff like
have you heard the good news
|
# ? Mar 3, 2019 22:06 |
|
rjmccall posted: case-insensitive comparison is basically a broken concept
except to, you know, actual humans who use human languages
|
# ? Mar 3, 2019 23:09 |
|
Soricidus posted: c# does it that way because it’s a java clone and java does it that way because
Java is an Objective-C clone at heart and OpenStep NSObject does it that way probably because Stepstone’s original Object class did it that way and I bet it was originally that way in Smalltalk-76 too
|
# ? Mar 3, 2019 23:19 |
|
strings, are bad
|
# ? Mar 4, 2019 00:37 |
|
eschaton posted: except to, you know, actual humans who use human languages
actual humans who use actual human languages want a locale-sensitive comparison, and it is a huge fuckup for a programming language to use one of those as the default ordering of strings
in general, sorts are parameterized because using a custom sort on data is an incredibly common thing to do because sorts are user-meaningful. hashing is not user-meaningful so the use cases for non-standard hashes are niche as gently caress, and i say that as someone who does a lot of niche algorithms work. making hashing be driven by type is a totally reasonable choice, and the resistance to it comes from the fact that a lot of people treat defining a trivial wrapper type as one of the most onerous tasks a programmer can be asked to do, as opposed to something that sensible programmers should be doing as a matter of course whenever it’s useful
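A sketch of what such a trivial wrapper can look like in Swift. `CaseInsensitive` is a made-up name, and `lowercased()` is only a rough, locale-blind stand-in for whatever notion of "same string" you actually need.

```swift
// Hypothetical wrapper type: equality and hashing both go through a folded copy.
struct CaseInsensitive: Hashable {
    let original: String
    private let folded: String

    init(_ string: String) {
        original = string
        folded = string.lowercased() // assumption: simple lowercase folding is good enough
    }

    static func == (lhs: CaseInsensitive, rhs: CaseInsensitive) -> Bool {
        lhs.folded == rhs.folded
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(folded)
    }
}

var seen = Set<CaseInsensitive>()
seen.insert(CaseInsensitive("Hello"))
let insertedAgain = seen.insert(CaseInsensitive("HELLO")).inserted
print(insertedAgain) // false: already present under the wrapper's notion of equality
```

The set itself stays a plain `Set`; the different hashing behavior rides along with the element type.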
|
# ? Mar 4, 2019 00:43 |
|
agreed. consider the case where you want to sort book titles, and ignore "the" at the start for the purposes of sorting.
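Something like this, as a rough sketch. `sortKey` is a made-up helper and only handles an English leading "the ".

```swift
// Hypothetical helper: the key a title is sorted by, with a leading "the " dropped.
func sortKey(_ title: String) -> String {
    let lowered = title.lowercased()
    return lowered.hasPrefix("the ") ? String(lowered.dropFirst(4)) : lowered
}

let titles = ["The Hobbit", "Dune", "The Trial", "Emma"]

// The ordering is passed to the sort; the String type itself is left alone.
let shelved = titles.sorted { sortKey($0) < sortKey($1) }
print(shelved) // ["Dune", "Emma", "The Hobbit", "The Trial"]
```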
|
# ? Mar 4, 2019 00:45 |
|
DONT THREAD ON ME posted: we did it!!!!!!
This is a true YOSPOS success story and everybody should be happy!!!
|
# ? Mar 4, 2019 00:46 |
|
hackbunny posted: not really? think strings, there are so many ways to call two strings "equal". it's the reason why .net has an IEqualityComparer
i call bullshit
|
# ? Mar 4, 2019 00:52 |
|
rjmccall posted: actual humans who use actual human languages want a locale-sensitive comparison, and it is a huge fuckup for a programming language to use one of those as the default ordering of strings
this is true
nonetheless, when we’ve made affirmative design decisions that some things (like sorting) belong at the presentation layer rather than at lower layers, you would not believe the wailing and gnashing of teeth like, “who cares about the relational model and consistency, I need ordered relationships!” of course followed quickly by “what do you mean these don’t perform as well as unordered relationships?!”
|
# ? Mar 4, 2019 02:47 |
|
preemptive thanks for the swift Hasher discussion because I can already tell this will come up irl one day and I’ll be Prepared
also congrats everyone on figuring out what each other was saying, I think it's time for everyone involved who’s ok with it to have a hug
|
# ? Mar 4, 2019 03:50 |
|
I do comparisons in PHP by randomly alternating between 2 equals and 3 equals
Works 4 me
|
# ? Mar 4, 2019 06:53 |
|
floatman posted: I do comparisons in PHP by randomly alternating between 2 equals and 3 equals
same, which is possibly suboptimal because starting all those php subprocesses from java has a fair bit of overhead
|
# ? Mar 4, 2019 09:27 |
|
one does not simply normalize unicode
i guess this is sorta implying that 7-bit ascii is an evil artifact corrupting whoever uses it, but that's not exactly wrong is it (credit to @FakeUnicode, excellent unicode horror account)
|
# ? Mar 4, 2019 13:19 |
|
it's essentially impossible to do string normalization in a way that makes sense to humans without language metadata at the very least (but even that might not be enough). asking yourself what it actually means for two strings to be considered equal is a downright philosophical question.
|
# ? Mar 4, 2019 13:30 |
|
TheFluff posted: it's essentially impossible to do string normalization in a way that makes sense to humans without language metadata at the very least (but even that might not be enough). asking yourself what it actually means for two strings to be considered equal is a downright philosophical question.
clearly the solution lies somewhere in pattern recognition ai
|
# ? Mar 4, 2019 13:39 |
|
just normalize using a neural network to recognize similar character shapes
all you need to standardize is which font to use
|
# ? Mar 4, 2019 14:10 |
|
Zlodo posted: all you need to standardize is which font to use
this project is doomed from the start
|
# ? Mar 4, 2019 14:30 |
|
all internet arguments end with philosophizing over definitions and this one is no different
|
# ? Mar 4, 2019 15:22 |
|
Zlodo posted: just normalize using a neural network to recognize similar character shapes
comic Sans MS - an accessibility font
|
# ? Mar 4, 2019 15:27 |
|
TheFluff posted: one does not simply normalize unicode
idk the mongolians have a special character for 'ill get around to it eventually'
|
# ? Mar 4, 2019 16:22 |
|
Zlodo posted: just normalize using a neural network to recognize similar character shapes
to a scandinavian, ö is an entirely separate letter which sorts at the end of the alphabet and is just as distinct from o as a is from b, to the point that text that replaces ö with o is markedly hard to read, and a text search that treated the two as equal would be borderline useless. on the other hand, to an american reading the new yorker, ö is just a pretentious way of writing o in certain words, and in order for text search to work as people expect, it should sort as an o and compare equal to an o. the unicode representation is identical in both cases, assuming you did your homework and use NFC normalization like the best practices told you (and they're not wrong). then there's german, which is somewhere in between - it should probably be treated as distinct from a regular o, but on the other hand it usually sorts as if it was an o.
this is all babby tier compared to some of the poo poo going on in the east asian languages, where for example the same unicode codepoint can render a visually significantly different character depending on what language (or country, or part of a country) you're in.
it might sound like i'm stoned out of my mind when i go "but what does it meeeeeean to say that two character sequences are the same, duuuuuuude" but it's an extremely relevant question to ask yourself if you're doing that
e: unicode very intentionally stops short of the language level; the fact that a new yorker ö has the same representation as a swedish ö is a feature, not a bug, and all the spooky language stuff is somebody else's problem
TheFluff fucked around with this message at 21:27 on Mar 4, 2019 |
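A small sketch of the part Unicode does settle, using only the Swift standard library. Swift's `String` equality compares by canonical equivalence, so the precomposed and decomposed spellings of ö are equal, while ö and plain o are not. Whether ö should also match o for a given reader is the language-level question above, and nothing here attempts it.

```swift
let precomposed = "\u{F6}"   // ö as one scalar, the form NFC produces
let decomposed = "o\u{308}"  // o followed by COMBINING DIAERESIS

// Same text as far as canonical equivalence is concerned...
print(precomposed == decomposed)         // true

// ...even though the underlying scalar sequences differ.
print(precomposed.unicodeScalars.count)  // 1
print(decomposed.unicodeScalars.count)   // 2

// No locale is involved, so ö never compares equal to a bare o here.
print(precomposed == "o")                // false
```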
# ? Mar 4, 2019 21:11 |
|
thank you Suspicious Dish for the archived copy of Sorting it All Out, I've been re-reading it with great interest. I've also found a mistake!
"Comparison confusion: INVARIANT vs. ORDINAL" posted: Originally, LOCALE_INVARIANT had just one noble purpose -- to allow one to use CompareString (and LCMapString with the LCMAP_SORTKEY flag) in a way that would only use the "Default" Windows sorting table as mentioned a little bit here and especially here.
the invariant locale uses different casing data! to sort/search strings case-insensitively in the same way the filesystem, registry, etc. do, you want an ordinal comparison - CompareStringOrdinal. I know because this morning I checked all 4294967296 combinations of two WCHARs (including unpaired utf-16 surrogates and other illegal unicode characters - which are actually legal in windows object names) and CompareStringOrdinal matches RtlCompareUnicodeString (what the kernel and drivers use) exactly; I haven't checked wcsicmp but the fact that it folds to lowercase instead of uppercase is a guarantee of different behavior (counter-intuitively, lots of case-folding operations are one-way). I learned something new! and I can stop using a kind-of-undocumented function
(I wonder if michael eventually realized the mistake and fixed it in a later post. apropos of nothing lol at his final blog post. what a legacy)
why bother, you may ask? because when you duplicate OS behavior, you want the highest fidelity possible. winging it may open a security hole
hackbunny fucked around with this message at 21:33 on Mar 4, 2019 |
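A quick sketch of why fold direction matters. This uses the Unicode case mappings in Swift's standard library, not the Windows casing tables discussed above, so it only illustrates the general effect and says nothing about which specific pairs differ on Windows.

```swift
let longS = "\u{17F}" // LATIN SMALL LETTER LONG S

// Folding to uppercase: long s and s both become "S", so they compare equal.
let equalWhenFoldedUp = longS.uppercased() == "s".uppercased()

// Folding to lowercase: long s is already lowercase and stays distinct from "s".
let equalWhenFoldedDown = longS.lowercased() == "s".lowercased()

print(equalWhenFoldedUp)   // true
print(equalWhenFoldedDown) // false

// One-way mapping: ß uppercases to "SS", which lowercases to "ss", not back to ß.
let roundTrip = "ß".uppercased().lowercased()
print(roundTrip)           // ss
```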
# ? Mar 4, 2019 21:28 |
|
except NTFS and exFAT have embedded case-folding tables
|
# ? Mar 4, 2019 21:33 |