  • Locked thread
spiritual bypass
Feb 19, 2008

Grimey Drawer
just put everything into an in-memory sqlite


ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:
my favorite kind of hash tables are the ones with an O(1) lookup time where the O(1) stands for "oh, there's only 1 bucket"

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer

hobbesmaster posted:

well you need to have something for the follow up of "how do you handle collisions"

i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.
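
like, the bucket approach is basically this (rough java sketch off the top of my head, names made up, don't ship it):

code:

import java.util.LinkedList;

// toy hash table using separate chaining: every slot holds a list,
// and colliding keys just get appended to the same list
class ChainedMap<K, V> {
    static class Entry<K, V> {
        final K key;
        V value;
        Entry(K k, V v) { key = k; value = v; }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedMap(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new LinkedList<>();
    }

    private int index(Object key) {
        // strip the sign bit so the modulo never goes negative
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    void put(K key, V value) {
        for (Entry<K, V> e : buckets[index(key)]) {
            if (e.key.equals(key)) { e.value = value; return; } // same key: overwrite
        }
        buckets[index(key)].add(new Entry<>(key, value));       // new key (or collision): chain it
    }

    V get(K key) {
        for (Entry<K, V> e : buckets[index(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}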

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer

ThePeavstenator posted:

my favorite kind of hash tables are the ones with an O(1) lookup time where the O(1) stands for "oh, there's only 1 bucket"

don't they generally call hashtables "amortized O(1)" for the very reason that specific operations might actually require more than one op but over the lifetime of the hashtable it averages out to 1 op per insert or lookup?
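
like my mental model is that the only expensive part is the occasional resize, and doubling the capacity means the total copy work over n inserts stays proportional to n. toy java demo of that arithmetic (growth policy made up, obviously not a real hashtable):

code:

// counts how much copying a doubling-on-full table does over n inserts.
// it stays around 1 copy per insert, which is the "amortized O(1)" claim.
public class AmortizedDemo {
    public static void main(String[] args) {
        int capacity = 1, size = 0;
        long copies = 0;
        int n = 1_000_000;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {   // "resize": pretend we copy every element to a bigger table
                copies += size;
                capacity *= 2;
            }
            size++;                   // the cheap O(1) part of the insert
        }
        System.out.println("copies per insert: " + (double) copies / n); // roughly 1
    }
}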

Shaggar
Apr 26, 2006

hobbesmaster posted:

well you need to have something for the follow up of "how do you handle collisions"

*bursts into interview room*
VERY CAREFULLY!!

carry on then
Jul 10, 2010

by VideoGames

(and can't post for 10 years!)

cis autodrag posted:

i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.

yeah the term i've heard is open addressing. ideally you want to increment by a number that guarantees you'll check every cell in the table eventually but not sure how imperative that is
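
something like this, i think (hypothetical java sketch; assumes the caller picks a step that's coprime with the table length, e.g. any step when the table size is prime):

code:

class OpenAddressing {
    // probe with a fixed step; if gcd(step, keys.length) == 1 the walk is
    // guaranteed to visit every slot once before giving up
    static int findSlot(Object[] keys, Object key, int step) {
        int n = keys.length;
        int i = (key.hashCode() & 0x7fffffff) % n;
        for (int probes = 0; probes < n; probes++) {
            if (keys[i] == null || keys[i].equals(key)) return i; // empty slot, or the key itself
            i = (i + step) % n;
        }
        return -1; // table is completely full and the key isn't in it
    }
}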

Shaggar
Apr 26, 2006

cis autodrag posted:

i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.

yes the correct answer is use the built in language hashtable/dictionary/etc... and override the appropriate equals/gethashcode functions for your objects.
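
(java spelling of the same idea, roughly; the class and fields here are just an example:)

code:

import java.util.Objects;

// a value type that's safe to use as a map key: equals and hashCode are
// overridden together and look at exactly the same fields
final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() {
        return Objects.hash(x, y); // combines the same fields equals compares
    }
}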

hobbesmaster
Jan 28, 2008

cis autodrag posted:

i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.

there's usually a lot you can fiddle with though. load factor being a big one even if you can't choose between collision methods (I think most languages' default implementations use a simple linear chaining)
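
e.g. java's HashMap takes both knobs right in the constructor (numbers here are arbitrary):

code:

import java.util.HashMap;
import java.util.Map;

class Tuning {
    // lower load factor = fewer collisions before the next resize,
    // at the cost of a sparser (bigger) table
    static Map<String, Integer> makeRoomyMap() {
        return new HashMap<>(1 << 16, 0.5f); // initial capacity, load factor
    }
}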

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer

Shaggar posted:

and override the appropriate equals/gethashcode functions for your objects.

this is the part where my knowledge falls down because i have no idea what makes a "good" hash function. my schooling just threw out random (seemingly arbitrary) examples like "for strings just multiply all the ascii codes together then divide by their sum", for numbers do the same thing but with bytes, etc. they never really stooped to explaining what an ideal hash function does.

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer

hobbesmaster posted:

there's usually a lot you can fiddle with though. load factor being a big one even if you can't choose between collision methods (I think most languages' default implementations use a simple linear chaining)

dont most builtins just increase their size when the load factor gets over like 75%?

hobbesmaster
Jan 28, 2008

^^^ yeah but you can change it

cis autodrag posted:

this is the part where my knowledge falls down because i have no idea what makes a "good" hash function. my schooling just threw out random (seemingly arbitrary) examples like "for strings just multiply all the ascii codes together then divide by their sum", for numbers do the same thing but with bytes, etc. they never really stooped to explaining what an ideal hash function does.

that's because you quickly get into the "never roll your own crypto" territory when diving into hash functions

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer

hobbesmaster posted:

^^^ yeah but you can change it


that's because you quickly get into the "never roll your own crypto" territory when diving into hash functions

hm, anywhere i can do some reading on when you'd want a lower maximum load factor? i hadn't really thought about it before. is it in applications where you'd genuinely want as close to guaranteed o(1) as possible on every single op?

The_Franz
Aug 8, 2003

there is no best answer when it comes to sorting

something simple like insertion sort can be the fastest option if the data set is small or already sorted/nearly sorted. quicksort has worse worst-case complexity than something like heapsort (o(n^2) vs o(n log n)), but in the real world quicksort is usually the faster option as long as you aren't feeding it data which has already been sorted. of course, quicksort and heapsort aren't stable sorting algorithms, so if you need stability you need something like merge sort (which potentially needs to allocate memory) or one of the more naive sorts like insertion sort if it also has to be in-place
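
(e.g. plain insertion sort, java sketch below: the inner loop barely runs when the input is already mostly in order, which is why it wins on nearly-sorted data)

code:

class Sorts {
    // textbook insertion sort: O(n^2) worst case, close to O(n) on
    // already-sorted or nearly-sorted input, in-place and stable
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) { // shift bigger elements right
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }
}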

cis autodrag posted:

i mean, the two options i know of are either using a bucket (which itself may be a hashtable, array, linked list, whatevs) or doing the thing i can't remember the name of where you put the collided value into the next nearest available index (possibly incrementing by some amount >1 in your search). but afaik the tradeoffs end up being super specific to the use case and most of the time you're still better off with the standard library version in whatever lang you use because they are less likely to have hosed up the basics than you are.

open addressing with linear probing. it's real fast if you store the hashes and data in separate arrays since the search portion just becomes running over an array of integers. it's not too hard to write something which beats unordered_map, since unordered_map ends up traversing a linked list for many operations
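
roughly this shape (java sketch; assumes a power-of-two capacity and that a real version resizes before the table ever fills up completely):

code:

class FlatMap<K, V> {
    // hashes, keys and values live in separate parallel arrays, so the probe
    // loop mostly just scans an int[] (cache friendly); 0 in hashes[] means "empty"
    final int[] hashes;
    final Object[] keys;
    final Object[] values;

    FlatMap(int capacity) {            // capacity must be a power of two
        hashes = new int[capacity];
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    private static int hashOf(Object key) {
        int h = key.hashCode();
        return h == 0 ? 1 : h;         // reserve 0 as the "empty slot" marker
    }

    void put(K key, V value) {
        int h = hashOf(key), mask = hashes.length - 1;
        int i = h & mask;
        while (hashes[i] != 0 && !(hashes[i] == h && key.equals(keys[i]))) {
            i = (i + 1) & mask;        // linear probe to the next slot
        }
        hashes[i] = h;
        keys[i] = key;
        values[i] = value;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        int h = hashOf(key), mask = hashes.length - 1;
        for (int i = h & mask; hashes[i] != 0; i = (i + 1) & mask) {
            if (hashes[i] == h && key.equals(keys[i])) return (V) values[i];
        }
        return null;
    }
}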

The_Franz fucked around with this message at 16:15 on May 30, 2017

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:

cis autodrag posted:

hm, anywhere i can do some reading on when you'd want a lower maximum load factor? i hadn't really thought about it before. is it in applications where you'd genuinely want as close to guaranteed o(1) as possible on every single op?

There's not really any advantage to trying to do that. You don't really reduce the lookup time once your load factor gets low enough and you just start eating up tons of memory.

hobbesmaster
Jan 28, 2008

cis autodrag posted:

hm, anywhere i can do some reading on when you'd want a lower maximum load factor? i hadn't really thought about it before. is it in applications where you'd genuinely want as close to guaranteed o(1) as possible on every single op?

I haven't done much of it myself, but this article, which uses the java classes and the game of life as a test, has some great examples: http://pzemtsov.github.io/2015/12/14/choosing-the-hash-maps-capacity.html

Sapozhnik
Jan 2, 2005

Nap Ghost
https://en.wikipedia.org/wiki/Jenkins_hash_function

use this one
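
the one-at-a-time variant off that page is tiny; java transcription (double check it against the page before trusting it):

code:

class Jenkins {
    // jenkins one-at-a-time hash; >>> keeps the right shifts unsigned in java
    static int oneAtATime(byte[] data) {
        int h = 0;
        for (byte b : data) {
            h += (b & 0xff);
            h += h << 10;
            h ^= h >>> 6;
        }
        h += h << 3;
        h ^= h >>> 11;
        h += h << 15;
        return h;
    }
}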

Progressive JPEG
Feb 19, 2003


tef disagrees

Progressive JPEG
Feb 19, 2003

hobbesmaster posted:

you quickly get into the "never roll your own crypto" territory when diving into hash functions

jre
Sep 2, 2011

To the cloud ?



Blinkz0rz posted:

i recognize that learning the formal notation around algorithmic complexity is important to go through the process.

No, it's practically useful: the notation is used everywhere, so you need it just to understand documentation.

quote:

i'm just bemoaning its place in the interview process because imo (and this has always been imo) it doesn't tell the interviewer anything beyond the ability to regurgitate your intro to algorithms textbook

I don't understand the mindset where, after you flunk an interview once for not knowing something important that can be learned in half a day, you just throw your hands up and qq about it rather than spending the time to learn it.

jre
Sep 2, 2011

To the cloud ?




He usually does

hobbesmaster
Jan 28, 2008


but that's "just" because the implementation was bad, not a problem with the algorithm described?

I mean that is the kind of stuff google and the like have to deal with, and that is why they pay people a lot of figgies

Cocoa Crispies
Jul 20, 2001

Vehicular Manslaughter!

Pillbug

find out if http://www.darpa.mil/program/space-time-analysis-for-cybersecurity ever made any public results

apseudonym
Feb 25, 2011

cis autodrag posted:

this is the part where my knowledge falls down because i have no idea what makes a "good" hash function. my schooling just threw out random (seemingly arbitrary) examples like "for strings just multiply all the ascii codes together then divide by their sum", for numbers do the same thing but with bytes, etc. they never really stooped to explaining what an ideal hash function does.

I don't think those are good, especially if you can have 0s.

The important thing to remember is that the point is to distribute the input as evenly as you can across the hash table. Once you start getting a lot of collisions sadness happens and with a lovely hash function you could get bad collisions and linear searching long before you approach your load factor.
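
quick java illustration of why the "multiply all the character codes" one falls over (toy example, strings picked to show the collision):

code:

class StringHashes {
    // the "multiply all the char codes" hash: a single NUL character zeroes
    // the whole thing, and every anagram of a string collides
    static int multiplyHash(String s) {
        int h = 1;
        for (int i = 0; i < s.length(); i++) h *= s.charAt(i);
        return h;
    }

    // the usual polynomial hash (this is what java's String.hashCode does):
    // order matters, so anagrams spread out instead of colliding
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) h = 31 * h + s.charAt(i);
        return h;
    }

    public static void main(String[] args) {
        System.out.println(multiplyHash("stop") == multiplyHash("pots")); // true: guaranteed collision
        System.out.println(polyHash("stop") == polyHash("pots"));         // false
    }
}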

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer

apseudonym posted:



The important thing to remember is that the point is to distribute the input as evenly as you can across the hash table. Once you start getting a lot of collisions sadness happens and with a lovely hash function you could get bad collisions and linear searching long before you approach your load factor.

well right, but everyone says this part and then doesnt go on to explain how you'd write a hash function that achieves this.

Captain Foo
May 11, 2004

we vibin'
we slidin'
we breathin'
we dyin'

cis autodrag posted:

well right, but everyone says this part and then doesnt go on to explain how you'd write a hash function that achieves this.

That's PhD level stuff

Shaggar
Apr 26, 2006

cis autodrag posted:

well right, but everyone says this part and then doesnt go on to explain how you'd write a hash function that achieves this.

return new { prop1,prop2,prop3,prop4}.GetHashCode();

School of How
Jul 6, 2013

quite frankly I don't believe this talk about the market

jre posted:

No, it's practically useful because then you can understand documentation because this notation is used everywhere.

I don't understand the mindset that after the first time you flunk an interview because of not knowing something important that can be learned in half a day you just throw your hands up and qq about it rather than spending the time to learn it.

because in this industry theres always a million new things coming out that you have to learn, and learning is hard work. i'd rather spend my learning time learning something that is actually useful. also, it *is* possible to get a job without spewing out cs trivia. pretty much every single job i've ever gotten has been without any kind of technical interview.

The Management
Jan 2, 2010

sup, bitch?

the difference between a code janitor and a software developer.txt

jre
Sep 2, 2011

To the cloud ?



School of How posted:

because in this industry theres always a million new things coming out that you have to learn, and learning is hard work. i'd rather spend my learning time learning something that is actually useful. also, it *is* possible to get a job without spewing out cs trivia. pretty much every single job i've ever gotten has been without any kind of technical interview.

Being able to work out time complexity and express it to other professionals in the industry standard way: not useful and trivia

ADINSX posted:

A lot of people really embarrassing themselves itt

The Management
Jan 2, 2010

sup, bitch?
I've been trying to quit my job all day but my boss is unreachable in offsite meetings. very anticlimactic and I don't want to do this over email.

RISCy Business
Jun 17, 2015

bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork
Fun Shoe

JewKiller 3000 posted:

don't quote stymie

PierreTheMime
Dec 9, 2004

Hero of hormagaunts everywhere!
Buglord

The Management posted:

I've been trying to quit my job all day but my boss is unreachable in offsite meetings. very anticlimactic and I don't want to do this over email.

just did this earlier today. would highly recommend

hobbesmaster
Jan 28, 2008

The Management posted:

I've been trying to quit my job all day but my boss is unreachable in offsite meetings. very anticlimactic and I don't want to do this over email.

i hear bloody knows how to do this properly

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

The Management posted:

the difference between a code janitor and a software developer.txt

nah

hobbesmaster
Jan 28, 2008

cis autodrag posted:

well right, but everyone says this part and then doesnt go on to explain how you'd write a hash function that achieves this.

the algorithms section on the hash function Wikipedia page is surprisingly good: https://en.m.wikipedia.org/wiki/Hash_function

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS
lots of elitism in this thread. very contrary to the attitude of the 'pos

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

jre posted:

Being able to work out time complexity and able to express it to other professionals in the industry standard way : not useful and trivia

i've worked with hundreds of engineers in my career and never had trouble expressing the runtime complexity of an algorithm without using big o. i don't disagree that it's useful but there's a lot of "i learned it and therefore you should too" going on here with very little regard for the concepts we're actually talking about.

Bloody
Mar 3, 2013

idk why there's been multiple pages of meltdown around whats essentially "you should be able to tell the interviewer the ballpark performance of whatever code you write for them in the interview" especially since the hangup seems to literally be on the industry-standard field-standard term for "ballpark performance of code"

Bloody
Mar 3, 2013

like this is at least as wild as spending pages hollerin about what == is for


Bloody
Mar 3, 2013

apparently it is elitist to expect professionals to have even a basic understanding of their field when you interview them though!!! yospos bitch!

  • Locked thread