## Rebuilding the Favorite Toy again

Posted by Doug on April 20, 2006

A while back I posted a few entries (I, II, III) about estimating a player's chances of reaching a career milestone using a mathematical gadget called a Markov chain.

I figured that some baseball stathead had probably attempted something similar, so I did some googling to see if they had any luck. I did not find any Markov models, but what I did find was this interesting article at baseballthinkfactory. It was written by a guy named Jesse Frey and it's a neat idea. I'll run through the basic gist of it using --- guess who --- Clinton Portis and the rushing record as an example.

We start by collecting all 25-year-old running backs throughout NFL history (subject to some fine print). We then record how many yards they gained at age 23 and at age 24, and how many yards they gained in the rest of their careers. So we've got a list that looks something like this:

| Player | Age23RshYD | Age24RshYD | RestOfCareer |
|---|---|---|---|
| Robert Smith | 632 | 692 | 4989 |
| Ricky Ervins | 680 | 495 | 939 |
| Terrell Davis | 1117 | 1538 | 4952 |
| Barry Foster | 488 | 1690 | 1562 |

[... another hundred-or-so guys ...]

There is, of course, no exact formula that tells you the RestOfCareer rushing yards based on the age 23 and age 24 rushing yards, but using a technique called regression we can estimate the formula that works "best."
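For the curious, here is roughly what "estimating the formula that works best" can look like in code. This is only a sketch, assuming plain ordinary least squares (the post does not say which flavor of regression was used). The function name `fit_ols` and the row layout are made up for illustration, and the four sample rows are just the ones listed above.

```python
def fit_ols(rows):
    """Fit rest ~ b0 + b24*age24 + b23*age23 by ordinary least squares.

    rows: iterable of (age23_yards, age24_yards, rest_of_career_yards).
    Returns (b0, b24, b23). Hypothetical helper, not the post's actual code.
    """
    # Design matrix columns: constant, age-24 yards, age-23 yards.
    X = [(1.0, float(a24), float(a23)) for a23, a24, _ in rows]
    y = [float(rest) for _, _, rest in rows]

    # Normal equations (X^T X) beta = X^T y, stored as an augmented 3x4 matrix.
    A = [[sum(x[i] * x[j] for x in X) for j in range(3)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(3)]

    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, 3):
            factor = A[r][col] / A[col][col]
            for c in range(col, 4):
                A[r][c] -= factor * A[col][c]

    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (A[i][3] - tail) / A[i][i]
    return tuple(beta)


# Only the four rows shown in the post: (age 23 yards, age 24 yards, rest of career).
sample = [(632, 692, 4989), (680, 495, 939), (1117, 1538, 4952), (488, 1690, 1562)]
print(fit_ols(sample))
```

Four rows are nowhere near enough to pin down three coefficients sensibly, so this should not be expected to reproduce the numbers from the full list of a hundred-or-so backs.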

Given the above data, what we end up with is this:

Rest-of-career yards ~= -943 + 2.64*(age24yards) + 2.39*(age23yards)

Plugging Clinton Portis' 1516 age 24 yards and 1315 age 23 yards into that formula gives an estimate of 6202 yards for the remainder of his career.
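If you want to check the arithmetic yourself, the formula is easy to type in. A minimal sketch follows; the function name `rest_of_career_estimate` is just something I made up, and the coefficients are the rounded ones quoted above.

```python
def rest_of_career_estimate(age23_yards, age24_yards):
    """Rounded regression formula from the post (hypothetical helper name)."""
    return -943 + 2.64 * age24_yards + 2.39 * age23_yards


# Clinton Portis: 1315 yards at age 23, 1516 yards at age 24.
portis = rest_of_career_estimate(age23_yards=1315, age24_yards=1516)
print(round(portis))  # 6202
```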

That tells us that we expect Portis to gain about 6202 more yards in the rest of his career. But of course we're not saying he'll end up with *exactly* that. What we're saying is that we don't know, but our best guess is that it'll be *somewhere in the neighborhood* of 6202. But how big is that neighborhood? Obviously there is some chance of him exceeding that by a thousand yards. There is some chance of him exceeding that by 5000 yards. How big are those chances? To answer these questions in a mathematically justifiable way is beyond the scope of this post, but we can get pretty close with the data and our intuition.

Of the 106 running backs that made up this data set, 20 (about 19%) at least doubled the rest-of-career rushing yards estimate provided by this formula. Doubling his expected rest-of-career rushing yards is almost exactly what Portis needs to do to break Emmitt Smith's record. So this calculation suggests that Portis has roughly a 19% chance of retiring as the rushing king. That's pretty close to the original Favorite Toy estimate and generally agrees with my gut feeling.
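The counting step can be sketched the same way: for each back, compare what he actually did to some multiple of what the formula predicted, and see what share cleared the bar. The names `estimate` and `share_reaching` are hypothetical, and I'm assuming "doubled" means actual yards at least twice the estimate. Only the four rows listed earlier are included here, so the sample figure is not the real answer; the 19% comes from the full 106-player list.

```python
def estimate(age23_yards, age24_yards):
    """Rounded regression formula from the post."""
    return -943 + 2.64 * age24_yards + 2.39 * age23_yards


def share_reaching(rows, multiple):
    """Fraction of players whose actual rest-of-career yards were at least
    `multiple` times the formula's estimate (assumed reading of "doubled")."""
    hits = sum(1 for _, a23, a24, rest in rows
               if rest >= multiple * estimate(a23, a24))
    return hits / len(rows)


# (player, age 23 yards, age 24 yards, rest-of-career yards): the four rows shown.
sample = [
    ("Robert Smith", 632, 692, 4989),
    ("Ricky Ervins", 680, 495, 939),
    ("Terrell Davis", 1117, 1538, 4952),
    ("Barry Foster", 488, 1690, 1562),
]

print(share_reaching(sample, 2.0))  # 0.25 on these four rows only
print(round(20 / 106, 3))           # 0.189, the share reported for all 106 backs
```

The same function with a different `multiple` gives a rough read on other milestones, which is all the "how big is the neighborhood" question is really asking.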

Neat, huh? If I get some time, I'll run this for some other players.

That is pretty sweet.

I know this isn't intended to be fantasy related, but I just know there is a way to use this tool to own my fantasy league. I'm not nearly as statistically inclined as others here, but tell me this--I only know that RBs tend to take a dive after 30. Could this tool:

1) Project the probable peak years based on past performance? (maximum trade value in keeper leagues)

2) Do the same thing for other positions?

3) Factor in performances based on past injuries? (e.g., will Edge have his career shortened, based on statistics that prove RBs with ACL tears last on average 1-3 fewer years than other RBs?)

Food for thought anyway--I guess it isn't solely fantasy relevant.

Hey Ben,

This is quite a few years old now, but this might be what you were looking for. http://www.pro-football-reference.com/articles/des.htm

Thanks, Chase. That's precisely what I'm looking for. Was this done with just averages, or was 'the new toy' part of the calculations? (If it wasn't I'd be very interested in seeing how they compare side by side).