  • Locked thread
GOOD TIMES ON METH
Mar 17, 2006

Fun Shoe

oystertoadfish posted:

gently caress p-values

for posterity here's the article about pecota
http://www.baseballprospectus.com/article.php?articleid=12082
and relevant quotes! i started the quoting early to give context and to give nate credit 'cause apparently he did pre-processing in STATA but the relevant part is bolded.

i included the part about one of the big names in sabermetrics parachuting in to save the day and being like ALL MY poo poo IS IN FORTRAN DEAL WITH IT bc i think that also is funny

Fortran rules

Arsenic Lupin
Apr 12, 2012

This particularly rapid💨 unintelligible 😖patter💁 isn't generally heard🧏‍♂️, and if it is🤔, it doesn't matter💁.


It strikes me as very strange that anybody is using landline polling at this point -- even people who are post-cellphone age have been trained by decades of phone spam to screen calls. My 85-year-old parents screen calls. If you poll only people who will pick up the phone when their landphone rings, and stay on the line when they're told it's a survey -- also a technique used by spammers -- that's getting to be a rarer and rarer subset of Americans.

Mirthless
Mar 27, 2011

by the sex ghost

Arsenic Lupin posted:

It strikes me as very strange that anybody is using landline polling at this point -- even people who are post-cellphone age have been trained by decades of phone spam to screen calls. My 85-year-old parents screen calls. If you poll only people who will pick up the phone when their landphone rings, and stay on the line when they're told it's a survey -- also a technique used by spammers -- that's getting to be a rarer and rarer subset of Americans.

Yeah, landline-only polling made sense 8 years ago but it is TYOOL 2016 and I know all of three people with a landline phone, at this point they're generally something that only specific people in specific demographic groups have. It really doesn't make any sense to limit your polling to landlines.

smokyprogg
Apr 9, 2008

BROKEN DOWN!
MISSION FAILED
Silver should have stopped using endorsements the moment Jeb! started polling under 5%, but he's still using them. When the polls plus more than doubles the predicted outcome for Marco loving Rubio (who's "winning" the "endorsement primary") to almost 30% likely win for Illinois, you really gotta just scrap that poo poo

Mirthless
Mar 27, 2011

by the sex ghost

smokyprogg posted:

Silver should have stopped using endorsements the moment Jeb! started polling under 5%, but he's still using them. When the polls plus more than doubles the predicted outcome for Marco loving Rubio (who's "winning" the "endorsement primary") to almost 30% likely win for Illinois, you really gotta just scrap that poo poo

Silver's problem is that he is too rigid with his analysis to write off endorsements as being irrelevant in this cycle, because endorsements mattered a lot (to his perception) in every previous election that polling was done in, so clearly it has to matter here as well. Ironically, despite the internet being one of the main reasons he is famous, he has completely failed to account for the impact that the internet has on campaigning and on the election cycle

In an age where people can get all the information they need to make up their own minds, they no longer need a preferred politician or union to tell them how to vote.

Arsenic Lupin
Apr 12, 2012

This particularly rapid💨 unintelligible 😖patter💁 isn't generally heard🧏‍♂️, and if it is🤔, it doesn't matter💁.


Mirthless posted:

Yeah, landline-only polling made sense 8 years ago but it is TYOOL 2016 and I know all of three people with a landline phone, at this point they're generally something that only specific people in specific demographic groups have. It really doesn't make any sense to limit your polling to landlines.

Make that four. :smugdog: We use it as a spam trap for businesses. Anything personal that is *not* from our parents goes to the appropriate cell.

C. Everett Koop
Aug 18, 2008

I hope Silver completely blows the election and the Unskewed polls guy just loving drills it and me and him will high-five and talk about how Silver is a bitch-rear end pussy while Silver cries into his spreadsheets as a discredited stupid dumb fraud who sucks at life and is bad.

oystertoadfish
Jun 17, 2003

Goetta posted:

Fortran rules

some of my work is in a model that's basically a quarter century's worth of one guy's stream of consciousness in Fortran. there are random exceptions and data from studies from the eighties up to a few years ago scattered throughout the source code and 90 code is shoved in the middle of blocks of 77. it's like some kind of historical forensics or archaeology just trying to understand what's going on

i like the computer history aspect of Fortran but i did my phd in r

btw this might be the thread to whine about it, but unskewing was such a silly thing for that one guy we've been making fun of for four years to call his fat fingered adjustments. did he ever actually correct for a skew so the distribution was symmetric about the mean, or was he just using it in the colloquial sense the whole time?

sometimes distributions have non zero skew, actually practically all of them do, it's OK! like the distribution of states or congressional districts by partisanship for example
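
A rough sketch of that point in Python. The numbers are made up for illustration (hypothetical partisan leans, not real district data), and `sample_skewness` is just a helper name chosen here. Skewness is the third central moment divided by the cubed standard deviation, so only a perfectly symmetric sample comes out at zero:

```python
from statistics import fmean


def sample_skewness(values):
    """Skewness as third central moment over the cubed (population) standard deviation."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment (variance)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical partisan leans in points; negative = leans one way, positive = the other.
# One very lopsided district drags out a long left tail.
leans = [-30, -12, -8, -5, -3, -1, 2, 4, 6, 9]

# A perfectly symmetric sample for comparison.
symmetric = [-2, -1, 0, 1, 2]

print(sample_skewness(leans))      # negative: long tail on the low side
print(sample_skewness(symmetric))  # zero: symmetric about the mean
```

Nothing about a nonzero value here means the data is "wrong" or needs correcting; it only describes the shape.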

C. Everett Koop
Aug 18, 2008

then I'll kick his rear end

e - goddammit

Pinterest Mom
Jun 9, 2009

oystertoadfish posted:

some of my work is in a model that's basically a quarter century's worth of one guy's stream of consciousness in Fortran. there are random exceptions and data from studies from the eighties up to a few years ago scattered throughout the source code and 90 code is shoved in the middle of blocks of 77. it's like some kind of historical forensics or archaeology just trying to understand what's going on

i like the computer history aspect of Fortran but i did my phd in r

btw this might be the thread to whine about it, but unskewing was such a silly thing for that one guy we've been making fun of for four years to call his fat fingered adjustments. did he ever actually correct for a skew so the distribution was symmetric about the mean, or was he just using it in the colloquial sense the whole time?

The ~methodology~ behind UnSkewed Polls was: "I think there are too many Democrats in this sample. I'm going to reweigh the polls so that the electorate looks like 2010."
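
For anyone who wants to see why that matters, here is a minimal sketch in Python of party-ID reweighting as described above. All the numbers are invented for illustration and the `reweight` helper is hypothetical; this is not the actual UnSkewed Polls spreadsheet, just the general idea of swapping the sample's party mix for one you like better:

```python
def reweight(support_by_party, party_mix):
    """Topline = sum over party groups of (group's share of electorate) * (support within group)."""
    total = sum(party_mix.values())  # normalize in case the shares don't sum to exactly 1
    return sum(party_mix[p] / total * support_by_party[p] for p in support_by_party)


# Hypothetical crosstabs: support for one candidate within each party-ID group.
candidate_support = {"D": 0.92, "R": 0.06, "I": 0.45}

# Party mix the pollster actually found in the sample (hypothetical).
sample_mix = {"D": 0.38, "R": 0.30, "I": 0.32}

# Party mix the "unskewer" assumes instead (hypothetical, e.g. modeled on an earlier electorate).
assumed_mix = {"D": 0.34, "R": 0.36, "I": 0.30}

as_polled = reweight(candidate_support, sample_mix)
unskewed = reweight(candidate_support, assumed_mix)

print(f"as polled: {as_polled:.1%}")
print(f"reweighted: {unskewed:.1%}")
```

The crosstabs never change; the topline moves by several points purely because of the assumed party mix, which is the whole trick.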

oystertoadfish
Jun 17, 2003

luv dat methodology

Mirthless
Mar 27, 2011

by the sex ghost

Pinterest Mom posted:

The ~methodology~ behind UnSkewed Polls was: "I think there are too many Democrats in this sample. I'm going to reweigh the polls so that the electorate looks like 2010."

Also boosted's unskewing spreadsheets just took an additional flat 5 or 10% off every state contest to account for media bias or some stupid bullshit

GOOD TIMES ON METH
Mar 17, 2006

Fun Shoe
Romney and the GOP leadership all apparently believed that guy is the funniest thing to me. Some real wizard behind the curtain poo poo that summed up that campaign nicely.

oystertoadfish posted:

some of my work is in a model that's basically a quarter century's worth of one guy's stream of consciousness in Fortran. there are random exceptions and data from studies from the eighties up to a few years ago scattered throughout the source code and 90 code is shoved in the middle of blocks of 77. it's like some kind of historical forensics or archaeology just trying to understand what's going on

i like the computer history aspect of Fortran but i did my phd in r

That's cool, it sounds like we do sort of similar things.

Jewel Repetition
Dec 24, 2012

Ask me about Briar Rose and Chicken Chaser.

I've done it

Mean Baby
May 28, 2005

Calling cellphones has become prohibitively expensive to the point where reaching 18-30 year olds, Bernie's strong demographic, is very difficult. Unless the law changes, doing public polls over the phone is going to get less and less representative.

And even if it changes, people are answering their phone less and less. Accurate polling is a huge challenge now and I think it will only get worse until the unlikely event a much larger share of the population is on online panels or people younger than 30 stop screening their calls.

MaxxBot
Oct 6, 2003

you could have clapped

you should have clapped!!
Has internet polling improved at all and can it realistically replace traditional phone polling? That seems like the only option at this point.

e_angst
Sep 20, 2001

by exmarx

MaxxBot posted:

Has internet polling improved at all and can it realistically replace traditional phone polling? That seems like the only option at this point.

I think if Facebook ever decided to move beyond its product-survey stuff and into political polling, it could provide some very accurate data. Unfortunately, political polling isn't nearly as profitable.

Baku
Aug 20, 2005

by Fluffdaddy
speaking from a social science rather than polling background i'm pretty sure internet surveys and polls had great response rates when they were new (because people thought using a computer made something more "official" or scientific), became horrible with the onslaught of spam and bullshit most people put up with online including needing to get around increasingly sophisticated filters, and have never really recovered; big social media sites like facebook and twitter would be amazing places to conduct research but they're very resistant to people outside their companies using their platforms that way

The Whole Internet
May 26, 2010

by FactsAreUseless

MaxxBot posted:

Has internet polling improved at all and can it realistically replace traditional phone polling? That seems like the only option at this point.

There are internet polling firms that get weighed into the averages on RCP, 538, etc. Zogby and Google Consumer Surveys for instance. These have not been very active in calling the state-by-state races though. They only do national polls.

IBD uses cellphones as well as landlines and has been more accurate than the landline-only polls thus far, though still off quite a bit at times.

Mr.48
May 1, 2007

NNick posted:

Calling cellphones has become prohibitively expensive to the point where reaching 18-30 year olds, Bernie's strong demographic, is very difficult. Unless the law changes, doing public polls over the phone is going to get less and less representative.

Seriously, I'm not even that young anymore at 28, and I haven't had a landline for over 2 years now.

Condiv
May 7, 2008

Sorry to undo the effort of paying a domestic abuser $10 to own this poster, but I am going to lose my dang mind if I keep seeing multiple posters who appear to be Baloogan.

With love,
a mod


the fact that nate silver built an excel file that took 10 minutes to load as a statistical analysis tool makes him look like a giant fool to me

saying "i'm not a programmer" has its limits. when he got to the point he did with his excel files, it's time to become a programmer and actually use appropriate tools

Vitamin P
Nov 19, 2013

Truth is game rigging is more difficult than it looks pls stay ded

Arsenic Lupin posted:

It strikes me as very strange that anybody is using landline polling at this point -- even people who are post-cellphone age have been trained by decades of phone spam to screen calls. My 85-year-old parents screen calls. If you poll only people who will pick up the phone when their landphone rings, and stay on the line when they're told it's a survey -- also a technique used by spammers -- that's getting to be a rarer and rarer subset of Americans.

In the last UK election the polls got it wrong, and the general explanation that's been pushed is that voters that were slightly more difficult to reach got written off too quickly, and those tended to be Conservative voters. So the pollsters tried to contact 2000 people, actually contacted 1600 and called it good, whereas trying to contact 800 and not stopping until they'd actually contacted all of them would have given a more accurate picture.

But part of why that explanation has gotten so much coverage is because it pushes the message "left-wing voters were all at home on welfare, while right-wing voters were out working :smug:". The Tories' new mass-data, highly targeted campaigning having a disproportionate but subtle effect was probably a much bigger factor.
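
The callback explanation itself (persisting with a smaller sample versus moving on from people who don't pick up) can be sketched in Python. This is a toy expected-value calculation under invented assumptions: two groups with made-up pickup probabilities and party shares, and a hypothetical `expected_estimate` helper. It is not a model of the actual UK polls.

```python
def expected_estimate(groups, attempts):
    """Expected party share among respondents when each person is tried up to `attempts` times.

    groups: list of (population_share, pickup_probability_per_call, party_share).
    """
    reached_total = 0.0
    reached_party = 0.0
    for pop_share, pickup_prob, party_share in groups:
        # Fraction of this group reached at least once within the allowed attempts.
        reached = pop_share * (1 - (1 - pickup_prob) ** attempts)
        reached_total += reached
        reached_party += reached * party_share
    return reached_party / reached_total


# Hypothetical population: easy-to-reach people vs. hard-to-reach people,
# with the hard-to-reach group leaning more toward the party in question.
groups = [
    (0.6, 0.8, 0.35),  # easy to reach: 60% of people, 80% pickup rate, 35% support
    (0.4, 0.3, 0.45),  # hard to reach: 40% of people, 30% pickup rate, 45% support
]

true_share = sum(pop * party for pop, _, party in groups)

print(f"true share:        {true_share:.3f}")
print(f"one call each:     {expected_estimate(groups, 1):.3f}")
print(f"up to ten calls:   {expected_estimate(groups, 10):.3f}")
```

With a single attempt the hard-to-reach group is underrepresented and the estimate undershoots; with repeated callbacks the respondent pool converges on the real population. Whether that effect was actually the main driver is exactly what's in dispute above.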

Oil!
Nov 5, 2008

Der's e'rl in dem der hills!


Ham Wrangler

Condiv posted:

the fact that nate silver built an excel file that took 10 minutes to load as a statistical analysis tool makes him look like a giant fool to me

saying "i'm not a programmer" has its limits. when he got to the point he did with his excel files, it's time to become a programmer and actually use appropriate tools

Spreadsheets that take 10 minutes to load aren't born that way, but are a running series of things getting bolted on. It would have been quite literally more than double the work to recreate it as a program, especially if the person has no idea how to code.

jojoinnit
Dec 13, 2010

Strength and speed, that's why you're a special agent.

Condiv posted:

the fact that nate silver built an excel file that took 10 minutes to load as a statistical analysis tool makes him look like a giant fool to me

saying "i'm not a programmer" has its limits. when he got to the point he did with his excel files, it's time to become a programmer and actually use appropriate tools

It's kinda adorable that you think this is rare and Nate Silver is somehow extra dumb to end up with a stupidly bloated excel file.

^^ yep. By the time anyone realises what's happened it's way too late.

Condiv
May 7, 2008

Sorry to undo the effort of paying a domestic abuser $10 to own this poster, but I am going to lose my dang mind if I keep seeing multiple posters who appear to be Baloogan.

With love,
a mod


jojoinnit posted:

It's kinda adorable that you think this is rare and Nate Silver is somehow extra dumb to end up with a stupidly bloated excel file.

^^ yep. By the time anyone realises what's happened it's way too late.

it not being rare doesn't make it an idiot move. excel is obviously a bad fit for this kind of problem and he blew it off with "i'm not a programmer". he should've learned instead of creating yet another shitheap.

Oil! posted:

Spreadsheets that take 10 minutes to load aren't born that way, but are a running series of things getting bolted on. It would have been quite literally more than double the work to recreate it as a program, especially if the person has no idea how to code.

of course it was a "simple" thing that had poo poo bolted on until it was unmanageable. but part of making any kind of algorithm is realizing when you need to redesign because your problem space has outgrown your current design

also, the "double the work to recreate it" point is kind of silly. yes it takes double the work to start doing things the right way instead of hacking on what you already have until it barely does what you want, but the longer you wait the harder it becomes and the harder it is to actually improve your work

Condiv has issued a correction as of 15:55 on Mar 11, 2016

Absurd Alhazred
Mar 27, 2010

by Athanatos
From the PYF awful/funny graphs and charts thread:

Judge Schnoopy posted:

Here have this abomination, where the scale is so bad it doesn't even encompass all (perceived and non-data based) points

e_angst
Sep 20, 2001

by exmarx
A good buddy of mine used to be an accountant. I remember years ago when he was asking about getting to borrow a faster computer, because he was trying to do work with an Excel spreadsheet. This was back in 2007, and his requirements were that it have Excel 07 on it, since that was the version where Excel boosted the row limit past 65,000. See, this spreadsheet had several hundred thousand rows in it. He also said the spreadsheet had "over a million large formulas" in it as well (this is why he needed the larger computer, his laptop couldn't actually open the thing anymore).

After he asked that, we proceeded to mock him mercilessly for not switching over to a database. I believe one exact reply was "When a spreadsheet gets that complicated or huge, it's time to recognize that it's become something else entirely, bite the bullet, and pay someone to build the database application it has become."

shrike82
Jun 11, 2005

technical debt is a pretty common issue in software projects big or small

to point to something close to hand, the forums are a good example of codebases tending to accrete rather than getting re-written from scratch periodically

Concerned Citizen
Jul 22, 2007
Ramrod XTreme
i watched this thread be closed, moved to SAS, and then somehow come back.

what's happening

Pinely
Jul 23, 2013
College Slice

Condiv posted:

the fact that nate silver built an excel file that took 10 minutes to load as a statistical analysis tool makes him look like a giant fool to me

saying "i'm not a programmer" has its limits. when he got to the point he did with his excel files, it's time to become a programmer and actually use appropriate tools

or use some of that ESPN cash to hire someone to program instead of sending Clare Malone to Oklahoma for little to no reason.

Condiv
May 7, 2008

Sorry to undo the effort of paying a domestic abuser $10 to own this poster, but I am going to lose my dang mind if I keep seeing multiple posters who appear to be Baloogan.

With love,
a mod


shrike82 posted:

technical debt is a pretty common issue in software projects big or small

to point to something close to hand, the forums are a good example of codebases tending to accrete rather than getting re-written from scratch periodically

i'm in the middle of rewriting a codebase from scratch because it's poo poo cobbled together with bash scripts

rewriting stuff is not a cardinal sin when stuff doesn't work or works very poorly. in this case, excel is a poor replacement for R or other languages with statistics libraries because it's much slower and can only handle a fraction of the computational load. when it started maxxing his computer out for hours he should've moved to a language that would finish calculations in minutes

Condiv has issued a correction as of 22:59 on Mar 11, 2016

shrike82
Jun 11, 2005

nah, it's pretty common at trading desks for example to use Excel spreadsheets backed by C libraries to build pricing models
probably more complex than PECOTA does

Condiv
May 7, 2008

Sorry to undo the effort of paying a domestic abuser $10 to own this poster, but I am going to lose my dang mind if I keep seeing multiple posters who appear to be Baloogan.

With love,
a mod


shrike82 posted:

nah, it's pretty common at trading desks for example to use Excel spreadsheets backed by C libraries to build pricing models
probably more complex than PECOTA does

C. Everett Koop
Aug 18, 2008

Pinely posted:

or use some of that ESPN cash to hire someone to program instead of sending Clare Malone to Oklahoma for little to no reason.

listen Nate Statboy Silver can either pay someone to code a database for his stupid wrong formulas or he can try to find out the BEST BURRITO IN AMERICA and keep his spergsheets I mean priorities man

Jewel Repetition
Dec 24, 2012

Ask me about Briar Rose and Chicken Chaser.
5-turdy-8

Thundercracker
Jun 25, 2004

Proudly serving the Ruinous Powers since as a veteran of the long war.
College Slice

e_angst posted:

A good buddy of mine used to be an accountant. I remember years ago when he was asking about getting to borrow a faster computer, because he was trying to do work with an Excel spreadsheet. This was back in 2007, and his requirements were that it have Excel 07 on it, since that was the version where Excel boosted the row limit past 65,000. See, this spreadsheet had several hundred thousand rows in it. He also said the spreadsheet had "over a million large formulas" in it as well (this is why he needed the larger computer, his laptop couldn't actually open the thing anymore).

After he asked that, we proceeded to mock him mercilessly for not switching over to a database. I believe one exact reply was "When a spreadsheet gets that complicated or huge, it's time to recognize that it's become something else entirely, bite the bullet, and pay someone to build the database application it has become."


I actually build excel macros for banks and funds and such. Yes, Excel is absolutely not ideal for a lot of high end database work, and you have to hack the poo poo out of it sometimes to do basic things

But the one feature it has, which will keep me employed even when robots take the other white collar jobs, is that it is the final word in ubiquity. Both in the sense that everyone has excel and also that everyone knows how to use it a bit.

The power of ubiquity can't be stressed enough. There are projects I know will be green lighted because I can build the reports and database, painfully, in Excel that wouldn't fly if even loving MS Access was needed.

Broken Machine
Oct 22, 2010

Condiv posted:

i'm in the middle of rewriting a codebase from scratch because it's poo poo cobbled together with bash scripts

rewriting stuff is not a cardinal sin when stuff doesn't work or works very poorly. in this case, excel is a poor replacement for R or other languages with statistics libraries because it's much slower and can only handle a fraction of the computational load. when it started maxxing his computer out for hours he should've moved to a language that would finish calculations in minutes

Ok, but imagine this is the scenario - millions of people are reading your projections on a daily basis, and you can't afford to take that time to learn a new language and refactor things - especially when you've never formally programmed anything like that before. You simply don't have the time, it's not there. What do you do? You continue along with your half-baked thing that works and besides which it's too late to do much about. Just look at how many websites are built on crap. SA makes heavy use of php, an abomination. So does wikipedia.

I'd think Silver probably eventually did what you're suggesting. But real world constraints are A Thing.

Condiv
May 7, 2008

Sorry to undo the effort of paying a domestic abuser $10 to own this poster, but I am going to lose my dang mind if I keep seeing multiple posters who appear to be Baloogan.

With love,
a mod


Broken Machine posted:

Ok, but imagine this is the scenario - millions of people are reading your projections on a daily basis, and you can't afford to take that time to learn a new language and refactor things - especially when you've never formally programmed anything like that before. You simply don't have the time, it's not there. What do you do? You continue along with your half-baked thing that works and besides which it's too late to do much about. Just look at how many websites are built on crap. SA makes heavy use of php, an abomination. So does wikipedia.

I'd think Silver probably eventually did what you're suggesting. But real world constraints are A Thing.

oddly enough, taking the time to rewrite it would free a poo poo ton of time to do projections. you know, since the excel file made his computer unusable for hours (and had to be restarted sometimes due to crashing). and like someone mentioned, he didn't exactly have to do it himself. in any case, it's pretty sad for a "statistician" to not know R these days. my biologist coworkers who need me to fix their computers daily all know R

Broken Machine
Oct 22, 2010

Condiv posted:

oddly enough, taking the time to rewrite it would free a poo poo ton of time to do projections. you know, since the excel file made his computer unusable for hours (and had to be restarted sometimes due to crashing). and like someone mentioned, he didn't exactly have to do it himself. in any case, it's pretty sad for a "statistician" to not know R these days. my biologist coworkers who need me to fix their computers daily all know R

I'm just saying, if you look at all of the popular technology things, most any of them have architectural cruft that happens due to growing pains coupled with bad decisions. It's why linux is rewriting init and displaying graphics is still problematic.

Arsenic Lupin
Apr 12, 2012

This particularly rapid💨 unintelligible 😖patter💁 isn't generally heard🧏‍♂️, and if it is🤔, it doesn't matter💁.


The thing is, Nate Silver was the brand. Annoying Nate Silver in any way increased the risk that he would become unavailable. If the 400-pound gorilla wants to write in Word*Star, everybody else puts up with the file formats.
