Bloody
Mar 3, 2013

rotor, please do the needful

Soricidus
Oct 21, 2010
freedom-hating statist shill

Brain Candy posted:

if only java had immutable data structures with minimal performance costs, like, java.util.Collections.unmodifiableSet
that's not an immutable data structure

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

Bloody posted:

as of this post i would up the punishment to 3 day probations for lawchat

that seems reasonable, ok

double sulk
Jul 2, 2010

https://github.com/aspnet

Sweeper
Nov 29, 2007
The Joe Buck of Posting
Dinosaur Gum

Shaggar posted:

if the databases are on the same server, union the tables across all the databases.

different servers

Brain Candy
May 18, 2006

Soricidus posted:

that's not an immutable data structure
totally is, if nobody gets to alter the backing set. maybe you think persistent is the same as immutable?

Notorious b.s.d.
Jan 25, 2003

by Reene

Sweeper posted:

so i have a data set where it is a string and a count for some number of sets. i basically want to find the top 10 things by count in a union of all these sets. the problem is this data is very large and exists in multiple databases.... what should i be looking for? do i actually need to look through the entirety of each set to guarantee that nothing is missed?

hadoop / cascading :q:

Notorious b.s.d.
Jan 25, 2003

by Reene
seriously though, counting things is the "hello world" for hadoop/cascading.

write an etl package to dump everything to hdfs or s3, spin up a few nodes, whomp.

Soricidus
Oct 21, 2010
freedom-hating statist shill

Brain Candy posted:

totally is, if nobody gets to alter the backing set. maybe you think persistent is the same as immutable?
no, i think "if nobody gets to alter the backing set" is a very big if.

it solves one problem, i.e. passing someone a collection without copying it and knowing they won't modify it. but it doesn't solve other problems, such as being passed a collection without copying it and knowing the caller won't modify it.
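The view-versus-immutable distinction being argued here isn't specific to java. A minimal sketch in Python, as an analogue only (this is not the `java.util.Collections` API): `types.MappingProxyType` wraps a dict in a read-only view, much like an unmodifiable wrapper around a backing collection, while `frozenset` makes an independent immutable copy. The `demo` function and its variable names are made up for illustration.

```python
from types import MappingProxyType


def demo():
    backing = {"a": 1}
    view = MappingProxyType(backing)  # read-only view, shares storage with `backing`
    frozen = frozenset(backing)       # independent immutable copy of the keys

    # Writing through the view is rejected...
    try:
        view["b"] = 2
        view_rejects_writes = False
    except TypeError:
        view_rejects_writes = True

    # ...but whoever still holds `backing` can change what the view shows.
    backing["b"] = 2
    return view_rejects_writes, "b" in view, "b" in frozen


print(demo())  # prints (True, True, False)
```

The receiver of `view` can't modify it, but it also can't rule out the contents changing underneath it; only the copy gives that guarantee.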

ComradeCosmobot
Dec 4, 2004

USPOL July

Kevin Mitnick P.E. posted:

java gets it right. except i guess in the case that you give command line arguments that can't be decoded by the platform default codec in which case you've earned whatever hell you find yourself in

two words: surrogate characters

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
As long as humans continue to exist, data will continue to be messy, and attempts to say "this is text and this is bytes" are naive and unrealistic, no matter how firmly you say it.

Python 3's Unicode support is truly embarrassing.

Brain Candy
May 18, 2006

Soricidus posted:

no, i think "if nobody gets to alter the backing set" is a very big if.

it solves one problem, i.e. passing someone a collection without copying it and knowing they won't modify it. but it doesn't solve other problems, such as being passed a collection without copying it and knowing the caller won't modify it.

you know it's immutable because that's what it said in the documentation/annotations. as long as non-language designers get to make new data structures, that's about where you are going to end up, +/- pretend guarantees like const

everything you point out about the data structure also applies to the things it contains. if you want to get a stronger guarantee than this, you'll need to stop people from making mutable things period

ComradeCosmobot
Dec 4, 2004

USPOL July

Suspicious Dish posted:

As long as humans continue to exist, data will continue to be messy, and attempts to say "this is text and this is bytes" are naive and unrealistic, no matter how firmly you say it.

Python 3's Unicode support is truly embarrassing.

How long is the data here? Is it 4 bytes? Or is it 2 16-bit characters?
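For what it's worth, all three answers can be true at once. A small sketch, assuming the data in question is a single character outside the Basic Multilingual Plane (the G clef below is an arbitrary pick and `lengths` is a made-up helper):

```python
def lengths(text):
    """Measure the same string three ways."""
    return {
        "code_points": len(text),                           # what Python 3's len() counts
        "utf16_units": len(text.encode("utf-16-le")) // 2,  # 16-bit units, incl. surrogates
        "utf8_bytes": len(text.encode("utf-8")),
    }


clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP
print(lengths(clef))  # prints {'code_points': 1, 'utf16_units': 2, 'utf8_bytes': 4}
```

One code point, two UTF-16 code units (a surrogate pair), four bytes. Which number you get depends on which layer you ask; a Java `String.length()` reports UTF-16 code units, so it would say 2 here.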

Sweeper
Nov 29, 2007
The Joe Buck of Posting
Dinosaur Gum

Notorious b.s.d. posted:

seriously though, counting things is the "hello world" for hadoop/cascading.

write an etl package to dump everything to hdfs or s3, spin up a few nodes, whomp.

it's too slow for a UI, no? It is also filterable so it's in a relational database atm. whoever did the setup decided that when they needed more space they would just stand up another pair of databases (master/slave) and now we have 7 pairs of databases and I want to be able to do joins on the data across them

like I could have just dumped the data into dynamo and done scans on it, but it ends up being super loving expensive and a poo poo ton of full table scans to count the counts for every column in the database.

JewKiller 3000
Nov 28, 2006

by Lowtax

Brain Candy posted:

you know it's immutable because that's what it said in the documentation/annotations. as long as non-language designers get to make new data structures, that's about where you are going to end up, +/- pretend guarantees like const

everything you point out about the data structure also applies to the things it contains. if you want to get a stronger guarantee than this, you'll need to stop people from making mutable things period

you'd know it was immutable if you used a good programming language where everything is immutable unless explicitly specified otherwise

Notorious b.s.d.
Jan 25, 2003

by Reene

Sweeper posted:

it's too slow for a UI, no? It is also filterable so it's in a relational database atm. whoever did the setup decided that when they needed more space they would just stand up another pair of databases (master/slave) and now we have 7 pairs of databases and I want to be able to do joins on the data across them

there exist sharding proxies for mysql (tungsten) and postgres (pgpool) that let you use a subset of sql in parallel... but they're a pain in the butt

if your data is small enough, the cheapest/best answer is probably gonna be a DB server that is 10x as big.

off the shelf x86 will do several TB of RAM now

Sweeper posted:

like I could have just dumped the data into dynamo and done scans on it, but it ends up being super loving expensive and a poo poo ton of full table scans to count the counts for every column in the database.

by definition a precise count requires a table scan. that's why map/reduce frameworks are good at counting: guaranteed ceiling on worst-case time and it is a simple aggregate function

you can use indices to get estimated counts on postgres/oracle but they're only estimates
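A toy version of why counting fits map/reduce so well, assuming each database can be read as a stream of `(string, count)` rows. The in-memory lists stand in for the real databases and the function names are invented for the sketch. Each shard is scanned once (map), and the partial counts are merged with plain addition (reduce):

```python
from collections import Counter
from functools import reduce
from operator import add


def count_shard(rows):
    """Map step: one full scan of a shard, producing partial totals."""
    partial = Counter()
    for key, count in rows:
        partial[key] += count
    return partial


def top_k_exact(shards, k=10):
    """Reduce step: add the partial totals together, then take the top k."""
    totals = reduce(add, map(count_shard, shards), Counter())
    return totals.most_common(k)


shards = [
    [("a", 10), ("b", 4)],
    [("b", 3), ("c", 2)],
    [("a", 2)],
]
print(top_k_exact(shards, k=2))  # prints [('a', 12), ('b', 7)]
```

The answer is exact, but only because every row of every shard was read.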

Corla Plankun
May 8, 2007

improve the lives of everyone
if you can make any assumptions about the data you could probably whip up an estimator that was good to within an acceptable confidence interval in less execution time but this would probably take a lot more effort than just querying 7 databases

Sweeper
Nov 29, 2007
The Joe Buck of Posting
Dinosaur Gum

Notorious b.s.d. posted:

there exist sharding proxies for mysql (tungsten) and postgres (pgpool) that let you use a subset of sql in parallel... but they're a pain in the butt

if your data is small enough, the cheapest/best answer is probably gonna be a DB server that is 10x as big.

off the shelf x86 will do several TB of RAM now


by definition a precise count requires a table scan. that's why map/reduce frameworks are good at counting: guaranteed ceiling on worst-case time and it is a simple aggregate function

you can use indices to get estimated counts on postgres/oracle but they're only estimates

i'm thinking i could stop query after some count < x where x is probably the lowest in the top 10 counts. i was hoping for something simple some smart math person came up with where i could be like 85% sure or w/e that there won't be any lower summed counts than that

no one is going to green light my one giant db plan sadly :(
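The "stop once the counts drop below x" idea has an exact (not 85%) version, usually called the threshold algorithm (Fagin et al.). It is named here as a swapped-in technique, not something anyone in the thread proposed. The sketch assumes each database can hand back rows in descending count order and can answer a point lookup for a given string; the dicts below are stand-ins for that, and every name is hypothetical.

```python
import heapq


def top_k_threshold(shards, k):
    """Exact top-k by summed count, reading each shard only as deep as needed.

    shards: list of dict[str, int], one per database.
    """
    # Stand-in for an ORDER BY count DESC cursor on each database.
    cursors = [sorted(s.items(), key=lambda kv: -kv[1]) for s in shards]
    totals = {}
    depth = 0
    while True:
        threshold = 0     # upper bound on the total of any string not seen yet
        advanced = False
        for rows in cursors:
            if depth < len(rows):
                key, count = rows[depth]
                threshold += count
                advanced = True
                if key not in totals:
                    # Point lookup of this string in every database.
                    totals[key] = sum(s.get(key, 0) for s in shards)
        best = heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])
        # An unseen string sits below the current row in every shard, so its
        # total cannot exceed `threshold`. Once the k-th best beats that, stop.
        if not advanced or (len(best) == k and best[-1][1] >= threshold):
            return best
        depth += 1


shards = [{"a": 10, "b": 4, "c": 1}, {"b": 3, "c": 2, "d": 1}, {"a": 2, "e": 1}]
print(top_k_threshold(shards, k=2))  # prints [('a', 12), ('b', 7)]
```

How early it stops depends on the data: with a few heavy hitters it bails out after a handful of rows, with flat counts it degrades to reading everything. An unseen string can still tie the k-th total, which this sketch treats as good enough.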

Notorious b.s.d.
Jan 25, 2003

by Reene

Sweeper posted:

i'm thinking i could stop query after some count < x where x is probably the lowest in the top 10 counts. i was hoping for something simple some smart math person came up with where i could be like 85% sure or w/e that there won't be any lower summed counts than that

no one is going to green light my one giant db plan sadly :(

if estimates are good enough, take a random sample and work with that data instead.

i'm no ace at statistics, but a sample of 1,000 adults is good enough to make high-confidence estimates about the behavior of 100 million voters, so a little thought could go a long way

fritz
Jul 26, 2003

Sweeper posted:

i'm thinking i could stop query after some count < x where x is probably the lowest in the top 10 counts. i was hoping for something simple some smart math person came up with where i could be like 85% sure or w/e that there won't be any lower summed counts than that

that might work, i'd probably do a two pass thing where if you want the top 10 overall, take the top 50 (+/-) strings from each db, and then do a second pass where you find the counts of each of the strings from the first pass and take the top 10 of those
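A rough sketch of that two-pass idea, with dicts standing in for the databases. The function name and the `per_shard` parameter are made up, and 50 is just the number from the post:

```python
from collections import Counter


def two_pass_top_k(shards, k=10, per_shard=50):
    """Heuristic top-k: shortlist per shard, then count the shortlist exactly."""
    # Pass 1: the top `per_shard` strings from every database become candidates.
    candidates = set()
    for shard in shards:
        candidates.update(key for key, _ in Counter(shard).most_common(per_shard))

    # Pass 2: exact summed count of every candidate across every database.
    totals = Counter()
    for shard in shards:
        for key in candidates:
            totals[key] += shard.get(key, 0)
    return totals.most_common(k)


shards = [{"a": 10, "b": 4, "c": 1}, {"b": 3, "c": 2, "d": 1}, {"a": 2, "e": 1}]
print(two_pass_top_k(shards, k=2, per_shard=1))  # prints [('a', 12), ('b', 7)]
```

The counts it reports are exact, but the shortlist can miss things: a string that is second everywhere can beat one that is first in a single database. With `[{"x": 10, "m": 9}, {"y": 10, "m": 9}]` and `per_shard=1`, "m" (total 18) never becomes a candidate. A bigger `per_shard` makes that less likely, not impossible.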

fritz
Jul 26, 2003

Notorious b.s.d. posted:

if estimates are good enough, take a random sample and work with that data instead.

i'm no ace at statistics, but a sample of 1,000 adults is good enough to make high-confidence estimates about the behavior of 100 million voters, so a little thought could go a long way

depending on how the counts are distributed and how big the samples are i'd be worried that you wouldn't be able to tell #10 from #11 (or #5 from #30) very reliably. if there's lots of strings the frequencies of any individual one are probably gonna be pretty low, and then you're in the situation of estimating small binomial probabilities which is not where i personally like being
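Some back-of-the-envelope numbers for both posts, using the textbook normal approximation for a sampled proportion (margin is roughly z * sqrt(p(1-p)/n)). The helper names are invented, and the approximation itself gets unreliable for small p, which is part of the worry:

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin for an estimated proportion p from n samples."""
    return z * math.sqrt(p * (1 - p) / n)


def relative_error(p, n, z=1.96):
    """The same margin, expressed as a fraction of p itself."""
    return margin_of_error(p, n, z) / p


# The polling case: a 50/50 question, 1,000 respondents.
print(round(margin_of_error(0.5, 1000), 3))
# A string that makes up 0.1% of rows, same sample size.
print(round(relative_error(0.001, 1000), 2))
```

The first comes out at about 0.031, i.e. the familiar +/- 3 points. The second comes out at nearly 2, meaning the uncertainty is about twice the size of the thing being estimated, so ranking #10 against #11 from such a sample is mostly noise. Since the margin shrinks with sqrt(n), 100x the sample only buys 10x the precision.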

BONGHITZ
Jan 1, 1970

Booblord Zagats posted:

I figure I should share a story my dad told me about idiot contractors. Not sure if I've told it on here before, but since he just told it again the other night when we were talking to him, I'll go ahead and tell it again because it's a good story.

When my dad first made E-8 they moved him out of the Army Space Program Office and to a DISA project clearing the handing over of training assets like flight simulators and MILES gear to US allies. In all it was a pretty sweet gig because he'd get to fly all over the world and chill out for a few weeks in places like Belgium, Italy, and Australia while making sure what had been handed over by US contractors was satisfactory from both an operational and security standpoint. It's probably the main reason he was hired by a 3-letter within 48 hours of retiring from the Army.

He ended up making friends with a bunch of Australian helicopter pilots due to his frequent trips to their base when they were getting upgraded Kiowa simulators. The contractor for the sims was one my dad hated working with though, because they half-assed everything they touched, meaning he'd spend months sending them notes CC'd to DISA's chief which always led to a poo poo storm that was worse for him than the people who weren't doing their jobs. The Kiowa sims started off really lovely though, because the company making them just directly imported the software for USMC mid-80s AH-1 Cobra simulators and just made it work in a faux Kiowa cockpit. Which really didn't work for a lot of reasons. 1) Australia didn't really need to train pilots in 1994 on the proper way to close the Fulda Gap and 2) the Kiowa and the Cobra don't handle anywhere near the same.

So they (the contractors) go in and, after months of arguing with both DISA and the Australian pilots about whether a simulator that doesn't simulate the right thing is worthless, finally make some of the needed updates. They fixed the handling characteristics (by hiring two engineers from Bell who designed the sim the US Army uses) and changed Central Germany to Australia by making the ground brown instead of green and throwing Ayers Rock where some of the German Alps had been. And just to be cute, they replaced the random Soviet soldiers that would flee when they spotted the helicopter with kangaroos, just as an added "gently caress you, here's some added realism" poo poo.

The Australian pilots are pretty happy with the changes and spend the first few minutes flying from one side of their base to the other, attempting loops and all other manner of loving around. Then the Aussie lead pilot decides to fly out to Ayers Rock since one of the contractors had mentioned it. As he's flying out to it, he sees some movement on the ground and changes course to check it out. His copilot confirms it's a pack of kangaroos so dude figures "gently caress it, let's buzz some kangaroos."

He drops down to 10 feet above the ground and flies over them as fast as he can while the Kangaroos scatter. Then as he pulls up, he realizes he has lost control of his tail rotor and his panels are lighting up like a Christmas tree. Dude tries to put it down gentle, but then gets informed he's been hit by heavy machine gun fire and the sim freezes. Everyone is dumbfounded by this.

Then one of the sim techs has a revelation: they had been shot down by the kangaroos. After they buzzed them, one of the kangaroos fled to cover, then fired a shoulder-launched SAM at them, hitting the tail rotor. While he tried to recover, some other kangaroos set up a 13.7mm gun and shot him up. The coders had only changed the graphics of the Soviet soldiers to kangaroos and nothing else, meaning they still had a full platoon's worth of Soviet anti-helicopter weapons.


So if anything, it proved that everything in the Australian outback is designed to gently caress poo poo up.

Soricidus
Oct 21, 2010
freedom-hating statist shill

Brain Candy posted:

everything you point out about the data structure also applies to the things it contains. if you want to get a stronger guarantee than this, you'll need to stop people from making mutable things period
yes? i agree. java already did this for strings and it worked out really well. taking it further would be great.

but that doesn't alter the only fact i'm arguing, which is that java does not provide immutable collections. all it provides is a way to create an immutable view of a mutable collection, which is plang level poo poo. it's not even visible in the type system, let alone statically checked.

vapid cutlery
Apr 17, 2007

php:
<?
"it's george costanza" ?>

Damiya posted:

js owns and node remains a great platform for scripts and tasks..

Just don't do server in it that's all

typescript is cool

vapid cutlery
Apr 17, 2007

php:
<?
"it's george costanza" ?>

Soricidus posted:

yes? i agree. java already did this for strings and it worked out really well. taking it further would be great.

but that doesn't alter the only fact i'm arguing, which is that java does not provide immutable collections. all it provides is a way to create an immutable view of a mutable collection, which is plang level poo poo. it's not even visible in the type system, let alone statically checked.

are you an objc bro *tries to do the secret handshake with u*

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

Brain Candy posted:

totally is, if nobody gets to alter the backing set. maybe you think persistent is the same as immutable?

I can't believe a clanger like me has to point this out, but that's totally not an immutable data structure. an immutable data structure has mutator methods, except instead of mutating the object they return a new, immutable object with the result of the mutation, which you can then operate on, or compare-and-swap with the original to make the result globally visible (the GC will then handle the clean-up of the original object, and as a clanger I'm terribly envious of this). usually, the structure of a true immutable data structure is specifically optimized for this kind of copy-on-write mutation, so that multiple objects resulting from the mutation of a common object can share most or all of the data that wasn't affected by the mutation. this kind of structure tends to look an awful lot like the call tree of a recursive algorithm, frozen in memory, and that's why immutable data structures are a gateway drug to functional programming

(flangers please don't take offense at my very clangerish view of immutable data structures)
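A minimal sketch of the "mutators return a new object and share the rest" idea, using an unbalanced binary search tree with path copying. It is a toy, not any library's implementation, and `Node`, `insert` and `contains` are made-up names. Only the nodes on the path to the insertion point are rebuilt; every other subtree is shared between the old and new versions.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """An immutable tree node (NamedTuple fields can't be reassigned)."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Return a new tree containing `key`; the input tree is left untouched."""
    if root is None:
        return Node(key)
    if key < root.key:
        return root._replace(left=insert(root.left, key))
    if key > root.key:
        return root._replace(right=insert(root.right, key))
    return root  # already present: the "new" tree is just the old one


def contains(root, key):
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


old = insert(insert(insert(None, 5), 3), 8)
new = insert(old, 9)
print(contains(old, 9), contains(new, 9), new.left is old.left)  # prints False True True
```

The `new.left is old.left` check is the structural sharing: the untouched left subtree is the same object in both versions, and the shape of `insert` is exactly a recursive call tree frozen into the result.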

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

hackbunny posted:

I can't believe a clanger like me has to point this out, but that's totally not an immutable data structure. an immutable data structure has mutator methods, except instead of mutating the object they return a new, immutable object with the result of the mutation, which you can then operate on, or compare-and-swap with the original to make the result globally visible (the GC will then handle the clean-up of the original object, and as a clanger I'm terribly envious of this). usually, the structure of a true immutable data structure is specifically optimized for this kind of copy-on-write mutation, so that multiple objects resulting from the mutation of a common object can share most or all of the data that wasn't affected by the mutation. this kind of structure tends to look an awful lot like the call tree of a recursive algorithm, frozen in memory, and that's why immutable data structures are a gateway drug to functional programming

(flangers please don't take offense at my very clangerish view of immutable data structures)

yeah this is pretty much it

CoW is so amazingly powerful, systems peeps have been using it forever w.r.t virtual memory/fork etc

that + TCO basically making recursion costless is great

gonadic io
Feb 16, 2011

>>=
i can see why j-langers don't like immutability if the only experience they have with it is mutable containers except with the setters removed

Soricidus
Oct 21, 2010
freedom-hating statist shill
everything should be immutable

starting with this thread

Max Facetime
Apr 18, 2009

ComradeCosmobot posted:

two words: surrogate characters

just stream over .codePoints() no problem :D

Malcolm XML posted:

yeah this is pretty much it

CoW is so amazingly powerful, systems peeps have been using it forever w.r.t virtual memory/fork etc

that + TCO basically making recursion costless is great

yeah, costless except everything is a tree and causes a page fault

Cybernetic Vermin
Apr 18, 2005

i'd have to dig up the post i made like a hundred pages ago about doing reference counting by doing lifetime management on the collections level to make this post complete, but one of the niceties of that type of approach is that you can have the collections mutate if they only have a single reference, while maintaining the appearance of immutability in all other cases. makes for a solid optimization while keeping things nice and simple
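A guess at what that optimization looks like in miniature, since the original post isn't at hand: handles onto a shared buffer keep a count of how many handles exist, mutate in place when the count is 1, and copy first otherwise. `CowList`, `_Buffer` and `share` are invented names, and the count here is maintained by hand (dropped handles never decrement it), where a real implementation would tie it to actual lifetime management.

```python
class _Buffer:
    """Storage plus a count of the handles currently pointing at it."""

    def __init__(self, items):
        self.items = list(items)
        self.refs = 1


class CowList:
    """Looks like a private value to every holder; copies only when shared."""

    def __init__(self, items=()):
        self._buf = _Buffer(items)

    def share(self):
        """Hand out another handle onto the same storage (no copy yet)."""
        other = CowList.__new__(CowList)
        other._buf = self._buf
        self._buf.refs += 1
        return other

    def append(self, value):
        if self._buf.refs > 1:
            # Someone else can see this buffer: detach onto a private copy.
            self._buf.refs -= 1
            self._buf = _Buffer(self._buf.items)
        # Sole owner at this point, so in-place mutation is unobservable.
        self._buf.items.append(value)

    def to_list(self):
        return list(self._buf.items)


a = CowList([1, 2])
a.append(3)        # single reference: mutates in place
b = a.share()
b.append(4)        # shared: copies first, `a` is unaffected
print(a.to_list(), b.to_list())  # prints [1, 2, 3] [1, 2, 3, 4]
```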

Workaday Wizard
Oct 23, 2009

by Pragmatica

Cybernetic Vermin posted:

i'd have to dig up the post i made like a hundred pages ago about doing reference counting by doing lifetime management on the collections level to make this post complete, but one of the niceties of that type of approach is that you can have the collections mutate if they only have a single reference, while maintaining the appearance of immutability in all other cases. makes for a solid optimization while keeping things nice and simple

isnt that what rust is doing?

i wonder how rust would handle memory alignment and packing stuff that high performance software wants

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

lol

Brain Candy
May 18, 2006

Soricidus posted:

yes? i agree. java already did this for strings and it worked out really well. taking it further would be great.

but that doesn't alter the only fact i'm arguing, which is that java does not provide immutable collections. all it provides is a way to create an immutable view of a mutable collection, which is plang level poo poo. it's not even visible in the type system, let alone statically checked.

below is the definition of persistent data structures which everybody has been calling immutable for some reason. it's an immutable view of a mutable collection w. idempotence.

hackbunny posted:

I can't believe a clanger like me has to point this out, but that's totally not an immutable data structure. an immutable data structure has mutator methods, except instead of mutating the object they return a new, immutable object with the result of the mutation, which you can then operate on, or compare-and-swap with the original to make the result globally visible (the GC will then handle the clean-up of the original object, and as a clanger I'm terribly envious of this). usually, the structure of a true immutable data structure is specifically optimized for this kind of copy-on-write mutation, so that multiple objects resulting from the mutation of a common object can share most or all of the data that wasn't affected by the mutation. this kind of structure tends to look an awful lot like the call tree of a recursive algorithm, frozen in memory, and that's why immutable data structures are a gateway drug to functional programming

(flangers please don't take offense at my very clangerish view of immutable data structures)

Brain Candy
May 18, 2006

for my next trick i will get mad when somebody says you can't use recursion in a lang without TCO

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip
how about environments that have flaky TCE and detonate your stack when you expected you'd be fine
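An easy place to see the failure mode is CPython, which does no tail call elimination at all, so at least it fails consistently. A small sketch with made-up function names: the recursive version is a textbook tail call and still dies on deep input, while the hand-written loop (what TCE would have produced) doesn't.

```python
def count_down_recursive(n):
    """Tail-recursive: nothing is left to do after the inner call returns."""
    if n == 0:
        return 0
    return count_down_recursive(n - 1)


def count_down_loop(n):
    """The same computation with the tail call turned into a loop by hand."""
    while n != 0:
        n -= 1
    return 0


def blows_stack(fn, n):
    """Report whether calling fn(n) runs out of stack."""
    try:
        fn(n)
        return False
    except RecursionError:
        return True


print(blows_stack(count_down_recursive, 100_000))
print(blows_stack(count_down_loop, 100_000))
```

With the default recursion limit the first call reports True and the second False. The uncomfortable case from the post is the in-between one, where the elimination applies on some code paths or builds and not others, so nothing fails until the input gets big.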

Nomnom Cookie
Aug 30, 2009



Otto Skorzeny posted:

how about environments that have flaky TCE and detonate your stack when you expected you'd be fine

this never happens if you use this haskell extension

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Went ahead and published a new article in Xplain: http://magcius.github.io/xplain/article/window-tree.html

It doesn't contain everything I'd like it to contain, namely the WM part of it, but I figured you guys might appreciate it.

Max Facetime
Apr 18, 2009

oh god I hope this hasn't been posted yet http://codeofrob.com/entries/you-have-ruined-javascript.html

quote:

To configure the dealer, all we have to do is
JavaScript code:
app.config(function (CarProviderProvider) {
    CarProviderProvider.setDealerName('Good');
});
Hey, it's just config - no need to change any of the real code!!

I'd write a plain old JS equivalent but trying to wrap my head around all of the indirection in the above example is making me want to crawl under a desk and bang my head on the floor until the brainmeats come out so I don't have to subject myself to this madness any further.

quote:

“But, why CarProviderProvider instead of CarProvider”

Here's a tip. If you find yourself asking a question like this. If you find yourself asking a question which requires this sort of answer and then this sort of question to be asked YOU'VE DONE IT WRONG.

Soricidus
Oct 21, 2010
freedom-hating statist shill
i'm the blog that's not readable on phones in the year of our lord 2014
