|
Ghost of Reagan Past posted:
DJ doesn't seem to fit most other running back narratives. I had no idea he preferred running up the middle so much, though.
|
# ? Feb 27, 2017 17:13 |
|
Just wrapped up both the analysis and the app for chapter 6 and I think it might be the coolest (and most useful) thing we've done so far. It probably won't make much sense until I actually write the damned chapter, but in honor of my boy Jamaal Charles who got cut this week, I'm going to release the beta app for y'all right now. Am I not merciful?
App #7: Tracking Single-Game Performances using Information Theory
I'm using a measure here that is conceptually similar to the game quality metrics I used for the "greatest games" series, only with less bullshit fingerwaving. Briefly, it works like this:
1) Turn every run from a distance (in yards) to a probability, given a league-average runner. A 30 yard run and a 35 yard run are five yards apart, but almost identical in probability. In contrast, a 3 yard run and a 4 yard run are only 1 yard apart, but differ quite a lot in probability.
2) Lump all the "bad" runs together (so that better and better runs decrease continually in probability). This gets past the problem of, like, -10 yard runs also being very improbable, just not in a "good" way. "Bad" runs were defined as 2 yards or less (where 2 yards is the most probable run).
3) Given those run probabilities, find the probability for the whole game for each game in our data set. We ignore sequencing when calculating this probability (e.g. a 2-carry game [2, 5] is the same as [5, 2]).
4) Convert these game scores from units of probability to units of information, in bits. This "surprisal" measure tells us how what we saw compares to what we expected ahead of time. When you flip a fair coin, you know ahead of time that heads and tails each have a 0.5 probability. So actually performing the flip and landing on "heads" is not totally out of nowhere. The "heads" conveys 1 "bit" of information - think of it like a computer "bit" (that can be either a 1 or a 0) that is unknown but equally likely to be a 1 or a 0: getting told which it is gains you 1 new bit of information. In contrast, an eight-sided die can land on any 1 of 8 different values. So being told the result is a "six" conveys more information (relative to your expectations) than flipping a coin and landing on "heads". Rolling a d8 actually conveys 3 bits of information (it's no accident that it takes 3 computer bits to reflect the 8 different possible states). We apply the same principles to the run probabilities: we know how probable each run is (more or less) given an average RB, and can calculate how each game compares to this expectation. This has a number of nice features that I will explain later.
5) Plot these game surprisal values from any given player over carries (when you flip lots of coins or take lots of carries, ANY specific outcome is relatively improbable and thus has higher surprisal, so this lets you compare like to like) and draw a smooth line through them in red, compared to the league-average in blue.
Bottom line: Higher is better. Each red dot is a game. In the app, you can also highlight games from specific years that interest you. The red line doesn't change - it just highlights the individual games with little diamonds. As you might intuit, the more that the red line is above the blue line, the better the performances.
If what we care about is the difference between the red dots and the blue line, we can just go ahead and calculate it directly. Here, I'm showing "marginal game surprisal", or the difference between that player's game score and the expectation for a league-average back taking that many carries. Positive numbers mean above-average and negative numbers mean below-average. On the bottom of the plot is a "rug" showing all of the individual games. Over that is a density plot showing the league-average variance in game performances in blue, and then the player's typical performances in red.
Like I said, this one's a little complicated, but it'll make more sense when we can walk through it. I'm pretty excited by it.
Run the app here (through RStudio): code:
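Separate from the launch code: for anyone who wants to poke at steps 1 through 4 before the chapter is out, here is a rough sketch in Python rather than the R the app runs on. Everything numeric in it is made up for illustration. `P_BAD`, `TAIL_POWER`, `MAX_RUN`, and the power-law shape of the toy distribution are assumptions, not the league-average distribution from the data, and `run_probability`, `game_probability`, and `surprisal_bits` are hypothetical names. "Ignoring sequencing" is read here as counting every ordering of the same runs, which may not be exactly how the app does it.

```python
import math
from collections import Counter

BAD_RUN_MAX = 2    # runs of 2 yards or less are lumped together as "bad"
P_BAD = 0.40       # invented probability of a "bad" run
TAIL_POWER = 2.0   # invented: longer runs get rarer, but more and more slowly
MAX_RUN = 99       # longest run the toy table covers

# Toy weights for runs of 3..99 yards, rescaled below to sum to 1 - P_BAD.
_TAIL = {y: y ** -TAIL_POWER for y in range(BAD_RUN_MAX + 1, MAX_RUN + 1)}
_TAIL_TOTAL = sum(_TAIL.values())


def run_probability(yards):
    """Step 1 and 2: toy probability of a run for a league-average back."""
    if yards <= BAD_RUN_MAX:
        return P_BAD
    return (1 - P_BAD) * _TAIL[yards] / _TAIL_TOTAL


def game_probability(runs):
    """Step 3: probability of this set of runs, ignoring their order."""
    buckets = Counter("bad" if y <= BAD_RUN_MAX else y for y in runs)
    # Number of distinct orderings that produce the same set of runs.
    orderings = math.factorial(len(runs))
    for count in buckets.values():
        orderings //= math.factorial(count)
    one_ordering = math.prod(run_probability(y) for y in runs)
    return orderings * one_ordering


def surprisal_bits(probability):
    """Step 4: convert a probability to information, in bits."""
    return -math.log2(probability)


for game in ([2, 5], [5, 2], [3, 4, 35]):
    print(game, round(surprisal_bits(game_probability(game)), 2))
```

With these toy numbers a fair coin flip comes out to 1 bit and a d8 roll to 3 bits, [2, 5] scores the same as [5, 2], and going from 3 to 4 yards is a bigger jump in bits than going from 30 to 35.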
Forever_Peace fucked around with this message at 14:39 on Apr 17, 2017 |
# ? Mar 2, 2017 00:33 |
|
Ghost of Reagan Past posted:
Great visual idea! Love this!
quote:
Stay tuned for better success metrics and random forests. I can make you charts of any backs you'd like. Code will eventually be up somewhere once I figure out where to drop Jupyter notebooks.
I've also played around with random forests but don't plan to write anything up about it so it would be a good topic for you to post about. Making an ensemble with both the GAM and the random forest over situation factors seemed to have the best explanatory power.
|
# ? Mar 2, 2017 00:37 |
|
Wow McCoy is a pretty good runner! That new graph system is amazing dude, I'd love to see poo poo like that on espn instead of blowhards talking about grit.
|
# ? Mar 5, 2017 06:37 |
|
got any sevens posted:
Wow McCoy is a pretty good runner! That new graph system is amazing dude, I'd love to see poo poo like that on espn instead of blowhards talking about grit.
Thanks! Yeah I'm pleased as punch with this one. In this analysis, McCoy is one of the best-looking runners in the database. However... RIP Charles, you would have been a shoo-in to the hall of fame if you hadn't lost three full seasons at your peak.
|
# ? Mar 5, 2017 17:04 |
|
Lordy Jamaal Charles was so good. RIP
|
# ? Mar 9, 2017 01:08 |
|
Jamaal Charles was otherworldly. Related, one of Sirius/XM's fantasy guys had a pretty neat breakdown on the run schemes of each team. https://twitter.com/JeffRatcliffe/status/841620961321467905?ref_src=twsrc%5Etfw
|
# ? Mar 14, 2017 15:08 |
|
DJExile posted:
Jamaal Charles was otherworldly.
This is rad. Ohmygosh the things I could do with that level of play charting. That's almost just showing off. Some quick reactions:
- The Atlanta outside zone was as beautiful as it looks on paper there this year.
- When the Niners line is simply asked to block the dude in front of them, poor Carlos is hit an average of half a yard behind the line of scrimmage. lol.
- I think the Saints power line there could be a partial consequence of low sample size and a pass-first offense. The Saints ran on only 36.5% of their plays, and IIRC mostly run zone. I would be surprised if they can repeat power run efficiency like that next year. Do like Ingram though.
- So much for the Kubiak Run Game theory. Bye Gary.
- The Buffalo run game has the potential to be so goddamned fun next year, I'm giddy just thinking about it. Tyrod-McCoy/Gilly as a dual-threat backfield, good run-blocking line, and most importantly, a recent addition of all the fatguys.
- I'm still in shock that Exotic Smashmouth actually ended up being a thing. And it... worked? Ish?
|
# ? Mar 14, 2017 16:44 |
|
Forever_Peace posted:
- I'm still in shock that Exotic Smashmouth actually ended up being a thing. And it... worked? Ish?
It owns
|
# ? Mar 14, 2017 21:26 |
|
I'm curious how they weighted the chart for coloring it in. At a glance, Outside Zone has the highest split between "good" and "bad", but again just eyeballing it also seems to have the lowest spread. Houston, the Jets, and Philly were all fairly internally consistent across concepts and Pittsburgh was wildly similar, but looking at the colors you might get a different idea. It's surprising that Dallas was so bad running Power, but it's also possible they basically never ran it.
|
# ? Mar 23, 2017 17:53 |
|
Guys I'm only halfway through writing this chapter and it's already as long as chapter 5 was. Hope y'all have reeeeeally long attention spans for nerd stats.
|
# ? Apr 15, 2017 23:58 |
|
Forever_Peace posted:
Run the app here (through Rstudio):
Goddamnit why didn't anybody tell me that I put the launch code for a different app by accident. This is now fixed, go peruse Game Score using this corrected code.
|
# ? Apr 17, 2017 15:23 |
|
Forever_Peace posted:
Guys I'm only halfway through writing this chapter and it's already as long as chapter 5 was. Hope y'all have reeeeeally long attention spans for nerd stats.
This is a book won one paragraph at a time.
|
# ? Apr 19, 2017 01:50 |
|
Forever_Peace posted:
Guys I'm only halfway through writing this chapter and it's already as long as chapter 5 was. Hope y'all have reeeeeally long attention spans for nerd stats.
This post is why I'm too intimidated to open the email to proofread it. I'll get you before this weekend though buddy.
|
# ? Apr 19, 2017 03:33 |
|
Spoeank posted:
This post is why I'm too intimidated to open the email to proof read it. I'll get you before this weekend though buddy.
Thanks, sounds good. I also sent out a bat signal to the app testers to help peer review the technical stuff. Which reminds me, if you were one of my app testers, check your PMs.
|
# ? Apr 19, 2017 03:59 |
|
Chapter 6 ("Surprisal Me") is now Live here
Wherein we invent, from scratch, a new statistic that:
- Is theoretically motivated.
- Is many times more stable from year to year than yards per carry.
- Works reasonably well for smaller, game-sized sample sizes.
- Handles the "long run" problem that YPC and other stats have.
- Comes with a special second script that anybody can use to calculate game score on their own.
The final draft weighs in at 7500 words, 28 figures, two equations, and 340 lines of code. Shoutout to my reviewers for helping me whip all this into respectable shape.
Chapter 6 app can be found here (I already released this earlier, but now it has its own page).
Would appreciate help sharing this one. Tweet it at all your football people.
|
# ? Apr 27, 2017 13:24 |
|
gently caress me you put some amazing effort into that. Well done.
|
# ? Apr 27, 2017 15:25 |
|
This is the best chapter yet, go read it goons
If there are any editing mistakes it's because I was in awe reading it
|
# ? Apr 27, 2017 15:59 |
|
Forever_Peace posted:
- Is theoretically motivated.
Could you expand on what this means?
|
# ? Apr 27, 2017 16:06 |
|
Jiminy Christmas! Shoes! posted:
Could you expand on what this means?
Good question. Sometimes, we do things with the numbers because it is convenient. When we did the player matching in chapter 3, we lined up the run distributions from -3 to 15 yards, not because we believed that longer runs were completely irrelevant, but because this smaller range captured most of the runs and was easy to see in the plots. Our motivation for that choice was ease of visualization.
Here, I tried to align our choices with things that we conceptually believe to be true ahead of time. Not all yards are created equal. That 45 yard run could have easily been a 43 yard run. Our knowledge about the typical run matters. Games provide information about players relative to our expectations. An efficiency metric should compare a player to the average. We start with these beliefs, and then apply them to the numbers. We are motivated by theory instead of convenience.
Of course, there are a lot of places where we could have operationalized a belief (like "the defense influences run probabilities") but didn't, for the sake of simplicity. There's much more left here to explore!
|
# ? Apr 27, 2017 16:30 |
|
quote:
1/6 x 1/6 = 1/32
I blame you Spoeank
edit: Request to re-name the thread "Ground Control: Enough chitchat, here's some Jamaal Fuckin Charles"
whypick1 fucked around with this message at 18:25 on Apr 27, 2017 |
# ? Apr 27, 2017 17:59 |
|
whypick1 posted:
Hey man I'm not here to check numbers, I'm here to check letters
|
# ? Apr 27, 2017 18:30 |
|
whypick1 posted:
Motherfucker. Good catch, fixed. Thanks! PS because I am actually insane and hosted the website on GitHub, literally anybody can submit edits to fix typos or whatever if you want.
|
# ? Apr 27, 2017 18:32 |
|
Forever_Peace posted:
Chapter 6 ("Surprisal Me") is now Live here
Good stuff. I see you added the log equation for surprisal as compared to the draft chapter. Could you also add the explicit formula for game probability? It would save some math nerds a trip to the R code to verify that the math you (presumably) implemented in the code matches the reader's interpretation of the verbal description given in the "whole-game probability" section.
I still want to see how this compares to some other historical data on advanced metrics or fantasy ranks, would be really interesting. e.g. surprisal ratings using data for 2010-2015, ranking those players, compare to fantasy rankings pre-season 2016, compare to other advanced metrics' ranks of players using ~2015 era data. Could do this for a whole series of years, of course. The questions being: is this merely an interesting statistic that yes, confirms our ideas about particular examples such as JCharles? Does it have predictive power greater than some other advanced metric? Does it confirm intuition from other sources?
An interesting tangent to that would be: how far does a player need to differentiate himself in one (or multiple) seasons on the surprisal metric in order to get into the HOF? It could be a necessary, if not sufficient, condition for HOF.
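On the game-probability formula: for what it's worth, here is one reader's interpretation of the verbal description, which may or may not be what the R code does. The notation is an assumption, not the chapter's: a game has n carries that fall into run buckets with counts k_1, ..., k_m, and p_i is the league-average probability of bucket i. Ignoring the order of the carries would then give a multinomial probability, with surprisal as its negative log in base 2.

```latex
P(\text{game}) = \frac{n!}{k_1!\,k_2!\cdots k_m!}\,\prod_{i=1}^{m} p_i^{\,k_i},
\qquad
S(\text{game}) = -\log_2 P(\text{game})
```

Under that reading, the 2-carry games [2, 5] and [5, 2] have the same counts and so the same probability.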
|
# ? Apr 27, 2017 22:49 |
|
I wonder how beast mode will do this year. Now to read chap 6...
|
# ? Apr 28, 2017 01:41 |
|
pmchem posted:
compare to fantasy rankings pre-season 2016
Fantasy rankings are heavily skewed (or should be) by both receptions and scoring, whereas this surprise rating is entirely focused on yardage (which also means it's still underrating goal-line backs and under-valuing x-and-goal carries). So I don't think the numbers are particularly comparable. You'll find some high-ranked fantasy guys who are also very surprising, but also a lot that aren't, because they're receiving backs or specialize at the goal line.
|
# ? Apr 30, 2017 19:35 |
|
Leperflesh posted:
Fantasy rankings are heavily skewed (or should be) by both receptions and scoring, whereas this surprise rating is entirely focused on yardage (which also means it's still underrating goal-line backs and under-valuing x-and-goal carries).
*Cough cough* LeGarrette Blount!
|
# ? Apr 30, 2017 19:44 |
|
Leperflesh posted:
Fantasy rankings are heavily skewed (or should be) by both receptions and scoring, whereas this surprise rating is entirely focused on yardage (which also means it's still underrating goal-line backs and under-valuing x-and-goal carries).
I agree, but it would be interesting. And let's face it: lots of these types of analyses are motivated by the fantasy football community. I mean, unless FP is gunning to get hired as a quant in some front office, his audience has immense overlap with fantasy football players.
|
# ? May 1, 2017 00:00 |
|
Leperflesh posted:
Fantasy rankings are heavily skewed (or should be) by both receptions and scoring, whereas this surprise rating is entirely focused on yardage (which also means it's still underrating goal-line backs and under-valuing x-and-goal carries).
I think the surprisal stat gives more weight to a 5 yard carry on 2nd and 4 than on 2nd and 6. That example was in the chapter somewhere, anyway, I got a little confused in the details and counter-examples. But that would be a clutch factor? I don't think there's any possible way to adequately compare a receiving back with a grinder type, they're very different roles and completely alter schemes for offense and defense.
|
# ? May 1, 2017 02:35 |
|
The following things didn't improve the predictive power of Game Score from year to year at all:
- Adjusting for the quality of the defenses (estimated using a procedure called Best Linear Unbiased Prediction).
- Calculating Game Probability using permutations instead of combinations. I had a concern that with combinations, which we used in the chapter, repeated yardages (like [4, 4, 4]) are improbable, but perhaps not improbably good (e.g. "better" than [3, 4, 5]). Turned out not to matter.
- Removing heteroskedasticity problems by converting Game Scores to percentiles (on the scale of simulated league-average games). This is a statistical quirk that is somewhat obscure, but important for me to test. It's a good thing that this didn't matter.
In contrast, as we suspected in the chapter, there is one clear thing that did improve year-to-year stability:
- Weighting the Game Scores by sample size when combining into yearly averages. Yay for not using the "dumb" approach.
I was writing this all up in a quick hit but it started to dawn on me that these are kind of weird and complex statistical procedures that don't really have any payoff in learning at the moment, since none of it seemed to help above and beyond what we've already done in the chapter. No reason to clutter up Game Score with a bunch of pointless garbage - I like where it's at so for now, I'm not changing a thing. Figured I'd just tell you folks instead.
Reminder that I'd really value you folks helping to spread the word. I use an open process so that other writers and analysts can benefit from the work, but they need to actually know it exists in the first place!
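Two of those points are easier to see with numbers, so here is a small Python sketch (not the chapter's R code). It assumes "sample size" means carries per game, which is a guess, and the game scores in the example are invented. `orderings` and `season_average` are hypothetical helper names. The first function shows why [4, 4, 4] comes out less probable than [3, 4, 5] when order is ignored: only one sequence produces it, versus six. The second compares the plain yearly average with the carry-weighted one.

```python
import math
from collections import Counter


def orderings(runs):
    """Number of distinct carry sequences that give the same set of runs."""
    count = math.factorial(len(runs))
    for repeats in Counter(runs).values():
        count //= math.factorial(repeats)
    return count


def season_average(games, weighted=True):
    """Average game score for a season.

    games: list of (carries, game_score) pairs.
    weighted=True weights each game by its carries (assumed sample size);
    weighted=False is the plain mean over games.
    """
    if weighted:
        total_carries = sum(carries for carries, _ in games)
        return sum(carries * score for carries, score in games) / total_carries
    return sum(score for _, score in games) / len(games)


# Invented season: one flashy 5-carry game, one dull 25-carry game.
season = [(5, 4.0), (25, -1.0)]
print(orderings([4, 4, 4]), orderings([3, 4, 5]))
print(season_average(season, weighted=False), season_average(season))
```

In the invented season the plain mean is positive because the 5-carry game counts as much as the 25-carry one, while the weighted mean is slightly negative.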
|
# ? May 18, 2017 14:32 |
|
Do you want a plug on the dynasty subreddit? Happy to gush about how awesome Ground Control is, just don't know what you would and wouldn't want, or even what the best way is to spread the word (chapter by chapter over a month or two? all at once in one big megapost that says "here's all this awesome stuff"?). It would be really warmly received there, though. Hell, you should post it there yourself and just copy/paste from posts here, that way you get the credit and direct feedback without having to do more work
|
# ? May 18, 2017 15:50 |
|
RVProfootballer posted:
Do you want a plug on the dynasty subreddit? Happy to gush about how awesome Ground Control is, just don't know what you would and wouldn't want, or even what the best way is to spread the word (chapter by chapter over a month or two? all at once in one big megapost that says "here's all this awesome stuff"?). It would be really warmly received there, though. Hell, you should post it there yourself and just copy/paste from posts here, that way you get the credit and direct feedback without having to do more work
Yeah, Reddit shoutouts sound great. Don't have an account myself but I browse it sometimes. I'd imagine folks would probably be interested in the apps and the chapters, though honestly you folks would probably be a better judge of what is interesting to the public than me, the guy writing them!
|
# ? May 18, 2017 16:53 |
|
http://www.espn.com/nfl/story/_/id/19411342/nfl-dwindling-yards-per-carry-stats-show-there-21st-century-running-back-dilemma
quote:
While yards per carry have been down, average yards after contact has remained steady -- an identical 1.74 in both 2009 and 2016. What the data shows is a progressive drop in blocking help for running backs over the past decade. The key numbers:
|
# ? May 24, 2017 19:53 |
|
That Chapter 6 is a loving masterstroke. Thank you, F_P, for all of this. Feeling blessed I get the chance to contribute some serious insight here. Ch. 6's "originally posted" date is 4/27/2016. Figure you'll want to save the correct date for posterity.
|
# ? Jun 13, 2017 22:27 |
|
FBG tryin to make Ground Control type observations, failin' https://twitter.com/JeffHaseley/status/876482239730774017
|
# ? Jun 18, 2017 18:06 |
|
Zombie Tsunami posted:
That Chapter 6 is a loving masterstroke. Thank you, F_P, for all of this.
Thanks for catching that - fixed! Added dates to the other chapters while I was at it. Let's just pretend that me loving up about 10% of the numbers everywhere is just to keep you on your toes and make reading this more interactive.
|
# ? Jun 18, 2017 18:43 |
|
Oh my goodness i just found this yesterday and binged it; it's so good
|
# ? Jun 21, 2017 13:46 |
|
Got something exciting in the works for y'all.
|
# ? Jun 22, 2017 15:09 |
|
Is that a graph of your erection for big data? I like big data and I cannot lie
|
# ? Jun 22, 2017 15:23 |
|
You got data going back to (at least) 1999!?
e. Wait no, James had 3,028 rushing attempts in his career, so I guess your data maybe just goes back to... 2002?
e2. lol Wikipedia tells me Edgerrin James
quote:
has six children: Edquisha, Emani, Eyahna, Edgerrin Jr., Euro, and Eden.
Euro James has the
Leperflesh fucked around with this message at 01:46 on Jun 23, 2017 |
# ? Jun 23, 2017 01:41 |