|
just put everything into an in-memory sqlite
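For what it's worth, the "just put everything into an in-memory sqlite" approach really is a couple of lines with Python's stdlib; a minimal sketch (the table and key names here are made up for illustration):

```python
import sqlite3

# "just put everything into an in-memory sqlite": the stdlib makes this
# trivial, and the PRIMARY KEY index gives you fast lookups without ever
# writing a hash table yourself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("answer", "42"))
(value,) = conn.execute("SELECT value FROM kv WHERE key = ?", ("answer",)).fetchone()
print(value)  # -> 42
```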
|
# ? May 30, 2017 15:49 |
|
my favorite kind of hash tables are the ones with an O(1) lookup time where the O(1) stands for "oh, there's only 1 bucket"
|
# ? May 30, 2017 15:50 |
|
hobbesmaster posted:well you need to have something for the follow up of "how do you handle collisions"
i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.
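A minimal sketch of the bucket option (separate chaining) in Python, purely for illustration — a real implementation would also resize and cache hashes:

```python
# Toy hash table with separate chaining: each slot holds a list ("bucket")
# of (key, value) pairs, so colliding keys simply share a bucket.
class ChainedTable:
    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

t = ChainedTable()
t.put("a", 1)
t.put("b", 2)
print(t.get("a"), t.get("b"))  # -> 1 2
```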
|
# ? May 30, 2017 15:52 |
|
ThePeavstenator posted:my favorite kind of hash tables are the ones with an O(1) lookup time where the O(1) stands for "oh, there's only 1 bucket"
don't they generally call hashtables "amortized O(1)" for the very reason that specific operations might actually require more than one op but over the lifetime of the hashtable it averages out to 1 op per insert or lookup?
|
# ? May 30, 2017 15:53 |
|
hobbesmaster posted:well you need to have something for the follow up of "how do you handle collisions"
*bursts into interview room* VERY CAREFULLY!!
|
# ? May 30, 2017 15:54 |
|
cis autodrag posted:i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.
yeah the term i've heard is open addressing. ideally you want to increment by a number that guarantees you'll check every cell in the table eventually but not sure how imperative that is
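The "check every cell eventually" property is easy to see in a sketch: if the table size is a power of two, any odd step size is coprime with it, so the probe sequence visits every slot exactly once (Python, names made up for illustration):

```python
# Open-addressing probe order: with a power-of-two table, any odd probe
# step is coprime with the size, so the sequence hits every slot once
# before repeating.
def probe_sequence(start, step, size):
    return [(start + i * step) % size for i in range(size)]

slots = probe_sequence(start=3, step=5, size=8)  # 5 is odd, 8 is 2**3
print(sorted(slots))  # -> [0, 1, 2, 3, 4, 5, 6, 7]
```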
|
# ? May 30, 2017 15:55 |
|
cis autodrag posted:i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.
yes the correct answer is use the built-in language hashtable/dictionary/etc... and override the appropriate equals/gethashcode functions for your objects.
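The Python analogue of overriding equals/gethashcode is defining `__eq__` and `__hash__` together over the same fields; a minimal sketch with a made-up Point class:

```python
# Override equality and hashing together so that equal objects always
# land in the same bucket — the Python equivalent of equals/GetHashCode.
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))  # hash the same fields you compare

d = {Point(1, 2): "here"}
print(d[Point(1, 2)])  # -> here
```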
|
# ? May 30, 2017 15:57 |
|
cis autodrag posted:i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.
there's usually a lot you can fiddle with though. load factor being a big one even if you can't choose between collision methods (I think most languages' default implementations use simple linear chaining)
|
# ? May 30, 2017 16:00 |
|
Shaggar posted:and override the appropriate equals/gethashcode functions for your objects.
this is the part where my knowledge falls down because i have no idea what makes a "good" hash function. my schooling just threw out random (seemingly arbitrary) examples like "for strings just multiply all the ascii codes together then divide by their sum", for numbers do the same thing but with bytes, etc. they never really stopped to explain what an ideal hash function does.
|
# ? May 30, 2017 16:01 |
|
hobbesmaster posted:there's usually a lot you can fiddle with though. load factor being a big one even if you can't choose between collision methods (I think most languages' default implementations use simple linear chaining)
don't most builtins just increase their size when the load factor gets over like 75%?
|
# ? May 30, 2017 16:01 |
|
^^^ yeah but you can change it
cis autodrag posted:this is the part where my knowledge falls down because i have no idea what makes a "good" hash function. my schooling just threw out random (seemingly arbitrary) examples like "for strings just multiply all the ascii codes together then divide by their sum", for numbers do the same thing but with bytes, etc. they never really stopped to explain what an ideal hash function does.
that's because you quickly get into the "never roll your own crypto" territory when diving into hash functions
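A sketch of the resize policy being discussed, in Python with made-up names: grow once entries/slots passes the maximum load factor (0.75 is Java's HashMap default; CPython's dict uses 2/3):

```python
# Minimal sketch of the usual built-in policy: once the load factor
# (entries / slots) passes a threshold, double the table and rehash.
class GrowingTable:
    MAX_LOAD = 0.75

    def __init__(self):
        self.size = 8
        self.items = {}  # stand-in for the real slot array

    def put(self, key, value):
        self.items[key] = value
        if len(self.items) / self.size > self.MAX_LOAD:
            self.size *= 2  # a real table would rehash every entry here

t = GrowingTable()
for i in range(7):
    t.put(i, i)
print(t.size)  # -> 16 (7/8 > 0.75 triggered one doubling)
```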
|
# ? May 30, 2017 16:02 |
|
hobbesmaster posted:^^^ yeah but you can change it hm, anywhere i can do some reading on when you'd want a lower maximum load factor? i hadn't really thought about it before. is it in applications where you'd genuinely want to as close-to-guarantee as possible o(1) on ever single op?
|
# ? May 30, 2017 16:10 |
|
there is no best answer when it comes to sorting. something simple like insertion sort can be the fastest option if the data set is small or already sorted/nearly sorted. quicksort has worse worst-case complexity than something like heapsort (O(n^2) vs O(n log n)), but in the real world quicksort is usually the faster option as long as you aren't feeding it data which has already been sorted. of course, quicksort and heapsort aren't stable sorting algorithms, so if you need that you need something like merge sort (which potentially needs to allocate memory) or the more naive sorts like insertion if it needs to be in-place
cis autodrag posted:i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.
open addressing with linear probing. it's real fast if you store the hashes and data in separate arrays since the search portion just becomes running over an array of integers. it's not too hard to write something which beats unordered_map since it ends up traversing a linked list for many operations
The_Franz fucked around with this message at 16:15 on May 30, 2017 |
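The insertion-sort case from the post above, as a Python sketch — quadratic in the worst case but linear on already-sorted input, stable, and in-place:

```python
# Insertion sort: O(n^2) worst case, but O(n) on already-sorted input
# (the inner loop never runs), stable, and in-place — which is why it
# wins on small or nearly-sorted data.
def insertion_sort(a):
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:  # strict > keeps equal keys stable
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x
    return a

print(insertion_sort([3, 1, 2]))  # -> [1, 2, 3]
```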
# ? May 30, 2017 16:12 |
|
cis autodrag posted:hm, anywhere i can do some reading on when you'd want a lower maximum load factor? i hadn't really thought about it before. is it in applications where you'd genuinely want to as close-to-guarantee as possible o(1) on every single op?
There's not really any advantage to trying to do that. You don't really reduce the lookup time once your load factor gets low enough and you just start eating up tons of memory.
|
# ? May 30, 2017 16:20 |
|
cis autodrag posted:hm, anywhere i can do some reading on when you'd want a lower maximum load factor? i hadn't really thought about it before. is it in applications where you'd genuinely want to as close-to-guarantee as possible o(1) on every single op?
I haven't done much of it myself, but this article using the java classes and the game of life to test has some great examples: http://pzemtsov.github.io/2015/12/14/choosing-the-hash-maps-capacity.html
|
# ? May 30, 2017 16:20 |
|
https://en.wikipedia.org/wiki/Jenkins_hash_function use this one
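Jenkins's one-at-a-time hash is short enough to sketch from the description on that page; the 32-bit masks below stand in for C's unsigned overflow (a Python sketch, not checked against official test vectors):

```python
# Jenkins one-at-a-time hash: mix each byte in, then a final avalanche.
# Masking with 0xFFFFFFFF emulates C's uint32_t wraparound.
def jenkins_one_at_a_time(data: bytes) -> int:
    h = 0
    for b in data:
        h = (h + b) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h

print(hex(jenkins_one_at_a_time(b"hello")))
```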
|
# ? May 30, 2017 17:32 |
|
tef disagrees
|
# ? May 30, 2017 18:11 |
|
hobbesmaster posted:you quickly get into the "never roll your own crypto" territory when diving into hash functions
|
# ? May 30, 2017 18:12 |
|
Blinkz0rz posted:i recognize that learning the formal notation around algorithmic complexity is important to go through the process.
quote:i'm just bemoaning its place in the interview process because imo (and this has always been imo) it doesn't tell the interviewer anything beyond the ability to regurgitate your intro to algorithms textbook
|
# ? May 30, 2017 19:14 |
|
He usually does
|
# ? May 30, 2017 19:14 |
|
but that's "just" because the implementation was bad, not a problem with the algorithm described? I mean that is the kind of stuff google and the like have to deal with and that is why they pay people a lot of figgies
|
# ? May 30, 2017 20:16 |
|
find out if http://www.darpa.mil/program/space-time-analysis-for-cybersecurity ever made any public results
|
# ? May 30, 2017 20:49 |
|
cis autodrag posted:this is the part where my knowledge falls down because i have no idea what makes a "good" hash function. my schooling just threw out random (seemingly arbitrary) examples like "for strings just multiply all the ascii codes together then divide by their sum", for numbers do the same thing but with bytes, etc. they never really stopped to explain what an ideal hash function does.
I don't think those are good, especially if you can have 0s. The important thing to remember is that the point is to distribute the input as evenly as you can across the hash table. Once you start getting a lot of collisions sadness happens, and with a lovely hash function you could get bad collisions and linear searching long before you approach your load factor.
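The "multiply the ascii codes then divide by their sum" scheme from the quoted post is easy to break: anagrams have identical products and sums, so they all collide. A Python sketch of the failure:

```python
# The textbook scheme from the post: product of character codes divided
# by their sum. Both the product and the sum are order-independent, so
# every anagram hashes to the same value — a guaranteed collision.
def toy_hash(s):
    prod = 1
    total = 0
    for ch in s:
        prod *= ord(ch)
        total += ord(ch)
    return prod // total

print(toy_hash("listen") == toy_hash("silent"))  # -> True
```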
|
# ? May 30, 2017 20:54 |
|
apseudonym posted:
well right, but everyone says this part and then doesn't go on to explain how you'd write a hash function that achieves this.
|
# ? May 30, 2017 21:49 |
|
cis autodrag posted:well right, but everyone says this part and then doesn't go on to explain how you'd write a hash function that achieves this.
That's PhD level stuff
|
# ? May 30, 2017 21:55 |
|
cis autodrag posted:well right, but everyone says this part and then doesn't go on to explain how you'd write a hash function that achieves this.
return new { prop1, prop2, prop3, prop4 }.GetHashCode();
|
# ? May 30, 2017 21:56 |
|
jre posted:No, it's practically useful because then you can understand documentation because this notation is used everywhere.
because in this industry there's always a million new things coming out that you have to learn, and learning is hard work. i'd rather spend my learning time learning something that is actually useful. also, it *is* possible to get a job without spewing out cs trivia. pretty much every single job i've ever gotten has been without any kind of technical interview.
|
# ? May 30, 2017 22:20 |
|
School of How posted:cs trivia
the difference between a code janitor and a software developer.txt
|
# ? May 30, 2017 22:27 |
|
School of How posted:because in this industry there's always a million new things coming out that you have to learn, and learning is hard work. i'd rather spend my learning time learning something that is actually useful. also, it *is* possible to get a job without spewing out cs trivia. pretty much every single job i've ever gotten has been without any kind of technical interview.
Being able to work out time complexity and being able to express it to other professionals in the industry-standard way: not useful and trivia
ADINSX posted:A lot of people really embarrassing themselves itt
|
# ? May 30, 2017 22:28 |
|
I've been trying to quit my job all day but my boss is unreachable in offsite meetings. very anticlimactic and I don't want to do this over email.
|
# ? May 30, 2017 22:32 |
|
JewKiller 3000 posted:don't quote stymie
|
# ? May 30, 2017 22:33 |
|
The Management posted:I've been trying to quit my job all day but my boss is unreachable in offsite meetings. very anticlimactic and I don't want to do this over email.
just did this earlier today. would highly recommend
|
# ? May 30, 2017 22:35 |
|
The Management posted:I've been trying to quit my job all day but my boss is unreachable in offsite meetings. very anticlimactic and I don't want to do this over email.
i hear bloody knows how to do this properly
|
# ? May 30, 2017 22:46 |
|
The Management posted:the difference between a code janitor and a software developer.txt
nah
|
# ? May 30, 2017 22:56 |
|
cis autodrag posted:well right, but everyone says this part and then doesn't go on to explain how you'd write a hash function that achieves this.
the algorithms section on the hash function Wikipedia page is surprisingly good: https://en.m.wikipedia.org/wiki/Hash_function
|
# ? May 30, 2017 22:57 |
|
lots of elitism in this thread. very contrary to the attitude of the 'pos
|
# ? May 30, 2017 23:03 |
|
jre posted:Being able to work out time complexity and able to express it to other professionals in the industry standard way : not useful and trivia
i've worked with hundreds of engineers in my career and never had trouble expressing the runtime complexity of an algorithm without using big o. i don't disagree that it's useful but there's a lot of "i learned it and therefore you should too" going on here with very little regard for the concepts we're actually talking about.
|
# ? May 30, 2017 23:06 |
|
idk why there's been multiple pages of meltdown around what's essentially "you should be able to tell the interviewer the ballpark performance of whatever code you write for them in the interview" especially since the hangup seems to literally be on the industry-standard, field-standard term for "ballpark performance of code"
|
# ? May 30, 2017 23:05 |
|
like this is at least as wild as spending pages hollerin about knowing what == is for
|
# ? May 30, 2017 23:07 |
|
apparently it is elitist to expect professionals to have even a basic understanding of their field when you interview them though!!! yospos bitch!
|
# ? May 30, 2017 23:08 |