Breaking down “Yards Per Carry” II

Posted by Chase Stuart on July 11, 2006

Reading yesterday's post on YPC left me with the following question: is it more impressive to rush 50 times for 250 yards, or 300 times for 1250 yards? That's a simple question with lots of complicated answers.

Let's first look at how all RBs from 2002-2004 performed the following season.

In 2004, twenty-six RBs ran between 51 and 100 times, and as a group they averaged 4.18 YPC. Seven RBs had 250-300 carries, and that group averaged 4.19 YPC. I'm not sure exactly what you would say about the talent levels of the two groups. Are they similar because the players averaged the same YPC? Or is the high carry group better because those runners earned more carries, which is a reflection of how good they are?

You're probably expecting me to tell you that the low carry group averaged 5.0 YPC in 2005, while the high carry group averaged 3.0 YPC. Or maybe the reverse. Either way, I set you up for nothing. In 2005, the 26 members of the original low carry group averaged 4.15 YPC, and the original high carry group averaged 4.14 YPC. Here's the full chart:

Carries (Year N)    2003    2004    2005
01-50               4.03    4.17    3.96
51-100              3.64    4.05    3.95
101-150             4.37    4.26    4.22
151-200             3.61    4.34    3.95
201-250             4.31    3.79    3.81
251-300             4.51    4.28    4.13
301+                4.35    4.36    4.37

Remember exactly what this is saying. It means that all RBs with 151-200 carries in 2002 averaged 3.61 YPC in 2003 (on however many carries). The two high carry groups (251-300 and 301+) in Year N (2002, 2003 or 2004) averaged the most yards per carry in year N+1 (2003, 2004 or 2005). That's pretty interesting, although maybe not entirely surprising. It appears to reaffirm what we thought before: the best RBs get the most carries. And assuming that a player's ability remains relatively constant from year to year, then it makes sense that the RBs with the most carries one year would average the most yards per carry the next.

Here's the same table as above but with carries listed instead of YPC.

Carries (Year N)    2003    2004    2005
01-50               2087    2131    1330
51-100              1116    1661    1997
101-150              795    1527     965
151-200              731     699     780
201-250             1414    1474    1386
251-300             2407     728    1278
301+                2822    2866    2774

Don't be alarmed at the high number of carries from that first group. From 2002-2004, 251 RBs were in the 01-50 carries group, while only 31 RBs over those three seasons totaled more than 300 carries. What's more important is that only three RBs with 01-50 carries in Year N then rushed for 900 yards in Year N+1. Here's the list, with the first year on the left and the second season on the right.

                    ----- Year N -----    ---- Year N+1 ----
Player              Rush  Yards    YPC    Rush  Yards    YPC
Rudi Johnson          17     67   3.94     215    957   4.45
Reuben Droughns        6     14   2.33     275   1240   4.51
Willie Parker         32    186   5.81     255   1202   4.71

To be honest, I'm a bit surprised that only one runner per year came out of nowhere to have a big year. You probably want to temper your enthusiasm on any unproven runner unless you've got a really good reason to like him.

  • A couple more ways to look at YPC data

Yesterday I wrote that it was probably counterintuitive for the RBs with the fewest carries to have the lowest yards per carry average (as a group). But just because those runners as a group had a low average, that doesn't mean it's hard for individual runners to have a high average.

Over the four seasons, the top 26 runners in yards per carry all had fewer than 40 carries. On the flip side, none of the bottom 100 runners in yards per carry even had 50 carries. So yeah, you're going to get some extreme results when you look at these small sample sizes.

But we can still play around with the numbers a little bit. First, let's look at all RBs with a small number of carries one year, and a large number the next. There are thirty-three RBs in NFL history who rushed fewer than 100 times in Year N, and 250 or more times in Year N+1.

                             ----- Year N ----    ---- Year N+1 ---
Name                 Year    Rush  Yards   YPC    Rush  Yards   YPC
Lamont Jordan        2004      93    479   5.2     272   1025   3.8
Willie Parker        2004      32    186   5.8     255   1202   4.7
Reuben Droughns      2003       6     14   2.3     275   1240   4.5
Emmitt Smith         2003      90    256   2.8     267    937   3.5
Troy Hambrick        2002      79    317   4.0     275    972   3.5
Deuce McAllister     2001      16     91   5.7     325   1388   4.3
Fred Taylor          2001      30    116   3.9     287   1314   4.6
Shaun Alexander      2000      64    313   4.9     309   1318   4.3
Lamar Smith          1999      60    205   3.4     309   1139   3.7
James Allen          1999      32    119   3.7     290   1120   3.9
Jamal Anderson       1999      19     59   3.1     282   1024   3.6
Ahman Green          1999      26    120   4.6     263   1175   4.5
Stephen Davis        1998      34    109   3.2     290   1405   4.8
Duce Staley          1997       7     29   4.1     258   1065   4.1
Anthony Johnson      1995      30    140   4.7     300   1120   3.7
Garrison Hearst      1994      37    169   4.6     284   1070   3.8
Harvey Williams      1993      42    149   3.5     282    983   3.5
Erric Pegram         1992      21     89   4.2     292   1185   4.1
Barry Foster         1991      96    488   5.1     390   1690   4.3
Cleveland Gary       1991      68    245   3.6     279   1125   4.0
Gaston Green         1990      68    261   3.8     261   1037   4.0
Ottis Anderson       1988      65    208   3.2     325   1023   3.1
Greg Bell            1987      22     86   3.9     288   1212   4.2
Charles White        1986      22    126   5.7     324   1374   4.2
Curt Warner          1984      10     40   4.0     291   1094   3.8
Earnest Jackson      1983      11     39   3.5     296   1179   4.0
Curtis Dickey        1982      66    232   3.5     254   1122   4.4
Wendell Tyler        1980      30    157   5.2     260   1074   4.1
Terdell Middleton    1977      35     97   2.8     284   1116   3.9
Wilbert Montgomery   1977      45    183   4.1     259   1220   4.7
Otis Armstrong       1973      26     90   3.5     263   1407   5.3
Lydell Mitchell      1972      45    215   4.8     253    963   3.8
Ron Johnson          1971      32    156   4.9     298   1182   4.0
Totals                       1359   5583  4.11    9440  38500  4.08

I'm not sure what you would have predicted, but the same runners who averaged 4.1 YPC on an average of 41 carries ran just as well on an average of 286 carries the next year. But that's misleading if it makes you place more value on small sample sizes.

Inside the group there wasn't much consistency: only one-third of the RBs finished within half a yard per carry of their Year N YPC average. The correlation coefficient between the RBs' YPC in Year N and their YPC in Year N+1 was just 0.16. This means that only about 3% of the variation in the RBs' second-year YPC can be "explained by" their YPC average in the first year; the other 97% is other stuff. This is a long-winded way of saying a small bit of data (fewer than 100 carries) just doesn't tell you very much. What about a bigger piece of data?
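For anyone who wants to check the consistency numbers, here is a minimal Python sketch that recomputes them from the rush and yards columns of the table above. The names `BACKS` and `pearson` are made up for this illustration, and the sketch assumes a plain, unweighted Pearson correlation on each back's unrounded yards-per-carry, so the result may differ a little from the figure quoted in the post.

```python
from math import sqrt

# (name, Year N rush, Year N yards, Year N+1 rush, Year N+1 yards),
# copied from the table in the post.
BACKS = [
    ("Lamont Jordan", 93, 479, 272, 1025),
    ("Willie Parker", 32, 186, 255, 1202),
    ("Reuben Droughns", 6, 14, 275, 1240),
    ("Emmitt Smith", 90, 256, 267, 937),
    ("Troy Hambrick", 79, 317, 275, 972),
    ("Deuce McAllister", 16, 91, 325, 1388),
    ("Fred Taylor", 30, 116, 287, 1314),
    ("Shaun Alexander", 64, 313, 309, 1318),
    ("Lamar Smith", 60, 205, 309, 1139),
    ("James Allen", 32, 119, 290, 1120),
    ("Jamal Anderson", 19, 59, 282, 1024),
    ("Ahman Green", 26, 120, 263, 1175),
    ("Stephen Davis", 34, 109, 290, 1405),
    ("Duce Staley", 7, 29, 258, 1065),
    ("Anthony Johnson", 30, 140, 300, 1120),
    ("Garrison Hearst", 37, 169, 284, 1070),
    ("Harvey Williams", 42, 149, 282, 983),
    ("Erric Pegram", 21, 89, 292, 1185),
    ("Barry Foster", 96, 488, 390, 1690),
    ("Cleveland Gary", 68, 245, 279, 1125),
    ("Gaston Green", 68, 261, 261, 1037),
    ("Ottis Anderson", 65, 208, 325, 1023),
    ("Greg Bell", 22, 86, 288, 1212),
    ("Charles White", 22, 126, 324, 1374),
    ("Curt Warner", 10, 40, 291, 1094),
    ("Earnest Jackson", 11, 39, 296, 1179),
    ("Curtis Dickey", 66, 232, 254, 1122),
    ("Wendell Tyler", 30, 157, 260, 1074),
    ("Terdell Middleton", 35, 97, 284, 1116),
    ("Wilbert Montgomery", 45, 183, 259, 1220),
    ("Otis Armstrong", 26, 90, 263, 1407),
    ("Lydell Mitchell", 45, 215, 253, 963),
    ("Ron Johnson", 32, 156, 298, 1182),
]


def pearson(xs, ys):
    """Plain (unweighted) Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Each back's yards per carry in Year N and in Year N+1.
ypc_n = [y0 / r0 for _, r0, y0, _, _ in BACKS]
ypc_n1 = [y1 / r1 for _, _, _, r1, y1 in BACKS]

r = pearson(ypc_n, ypc_n1)
within_half = sum(abs(a - b) <= 0.5 for a, b in zip(ypc_n, ypc_n1))

print(f"r = {r:.2f}, r^2 = {r * r:.1%}")
print(f"{within_half} of {len(BACKS)} backs stayed within half a yard per carry")
```

Squaring `r` is what turns the correlation coefficient into the "explained by" percentage used in the text.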

There are 188 RBs in NFL history that recorded at least 250 carries in consecutive seasons. How do their numbers compare? The high workload RBs averaged 4.32 YPC in Year N, and 4.28 YPC in Year N+1. But as we saw above, that could be the result of lots of RBs cancelling each other out.

You'd expect the correlation coefficient to be higher than 0.16 here, and it is. But it's only 0.39; that means that even with RBs that we know a lot about, only 15% of each RB's YPC in Year N+1 can be "explained by" his YPC average from the previous year.
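The "explained by" percentages are just the correlation coefficients squared. A quick sketch of that arithmetic in Python (the function name `share_explained` is only for illustration):

```python
def share_explained(r):
    """Square a correlation coefficient to get the share of variance 'explained'."""
    return r * r


for label, r in [
    ("fewer than 100 carries in Year N", 0.16),
    ("250+ carries in both years", 0.39),
]:
    print(f"{label}: r = {r:.2f}, r^2 = {share_explained(r):.1%}")
```

So 0.16 squared is about 0.026 (the "3%") and 0.39 squared is about 0.152 (the "15%").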

Before we get to the last way to measure the data, an analogy might help here.

Let's say you flip a regular coin ten times, and it lands on heads ten times in a row. You'd probably still say the coin is only 50% likely to land on "heads" on the next flip. Even though a fair coin will land on heads ten straight times only about once in every 1,000 tries (1 in 1,024, to be exact), the odds that your regular coin was actually a weighted coin are a lot less than one in a thousand.

But now assume you take a coin with heads on both sides and put it in a bag with two other coins. If you pull out a coin without looking, flip it twice, and it lands on heads both times, you won't think the odds are 50/50 anymore that the next flip will produce heads. Even though getting two straight heads isn't very unlikely, it's more likely than not that you've grabbed the coin with two heads.
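The bag-of-coins example can be worked out exactly with Bayes' rule. This is a small sketch that assumes the two other coins in the bag are ordinary fair coins (the post doesn't say so explicitly); the function names are made up for the illustration.

```python
from fractions import Fraction


def posterior_two_headed(n_fair, n_heads_seen):
    """P(you drew the two-headed coin | every flip so far came up heads).

    Assumes the bag holds one two-headed coin plus n_fair fair coins.
    """
    total = n_fair + 1
    prior_two_headed = Fraction(1, total)
    prior_fair = Fraction(n_fair, total)
    like_two_headed = Fraction(1)                # always lands heads
    like_fair = Fraction(1, 2) ** n_heads_seen   # fair coin, n straight heads
    evidence = prior_two_headed * like_two_headed + prior_fair * like_fair
    return prior_two_headed * like_two_headed / evidence


def prob_next_heads(n_fair, n_heads_seen):
    """Chance the next flip is heads, averaging over which coin you hold."""
    p = posterior_two_headed(n_fair, n_heads_seen)
    return p * 1 + (1 - p) * Fraction(1, 2)


print(posterior_two_headed(2, 2))  # 2/3
print(prob_next_heads(2, 2))       # 5/6
print(Fraction(1, 2) ** 10)        # 1/1024: ten straight heads with a fair coin
```

Under those assumptions, two heads in a row move you from a one-in-three chance of holding the two-headed coin to two-in-three, which is the sense in which your prior belief about the coin (or the runner) shapes how you read a small sample.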

Once you think of football statistics like that, the following analogy is pretty simple. If Joe Runningback runs 75 times for 500 yards, you can either chalk it up to a combination of good luck and a small sample size, or you can rationalize the result by claiming that Joe Runningback's pretty good. It then becomes a question of what's more likely: that an average RB could do what Joe did, or that Joe's actually a very good runner? And that's why when you deal with small sample sizes, your own personal beliefs on a player become very important in how you interpret the data.

This gets us to the last idea for the day. There have been a few running backs in the NFL, so here's how I narrowed the list. Any RB that debuted before 1970, had fewer than 200 career carries or is still active was thrown out. I then looked at all RBs who had 51-100 career carries at the end of either their first or second season while averaging at least 4.00 yards per carry. That left us with 58 RBs who fit our basic profile: runners who had success on a small number of carries very early in their career.

What we'll add in is their draft position. Presumably, the round in which a player was drafted can serve as a proxy for "I think Joe Runningback is a good or bad runner."

I'll let you guys comment on the data. The numbers on the left represent the group's career-to-date totals after the season in which each RB passed the 50 carry mark (either their first or second); the second set of numbers show how the group performed for the remainder of their careers.

                 - Career to date -     - Rest of career -
Round   # RBs    Rush   Yards   YPC     Rush   Yards   YPC
1           9     680   3,229  4.75    5,099  21,270  4.17
2-3        10     751   3,519  4.69    6,164  26,316  4.27
4-6        13     975   4,419  4.53    6,743  25,689  3.81
7+         26     575   2,997  5.21    3,106  13,068  4.21
Total      58   2,981  14,164  4.75   21,112  86,343  4.09
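The YPC columns in the table are simply total yards divided by total carries for each draft-round group, so they can be re-derived, along with the change from the early sample to the rest of the career, in a few lines of Python. The names `ROUNDS` and `ypc` are made up for this sketch; the numbers are copied from the table.

```python
# round label -> (early rush, early yards, later rush, later yards)
ROUNDS = {
    "1": (680, 3229, 5099, 21270),
    "2-3": (751, 3519, 6164, 26316),
    "4-6": (975, 4419, 6743, 25689),
    "7+": (575, 2997, 3106, 13068),
}


def ypc(rush, yards):
    """Yards per carry for a group: total yards divided by total carries."""
    return yards / rush


for rnd, (rush0, yards0, rush1, yards1) in ROUNDS.items():
    early, rest = ypc(rush0, yards0), ypc(rush1, yards1)
    print(f"Round {rnd:>3}: {early:.2f} -> {rest:.2f} ({rest - early:+.2f})")

# Pooled figures across all 58 backs (the bottom row of the table).
all_early = ypc(sum(v[0] for v in ROUNDS.values()), sum(v[1] for v in ROUNDS.values()))
all_rest = ypc(sum(v[2] for v in ROUNDS.values()), sum(v[3] for v in ROUNDS.values()))
print(f"All rounds: {all_early:.2f} -> {all_rest:.2f}")
```

Note that these are pooled group averages, so backs with more carries count for more; they say nothing about how much individual backs within a round varied.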

For those curious, the correlation coefficient of each RB's original YPC average to his YPC average for the remainder of his career was 0.43; the correlation coefficient for draft round (with the number 10 used for any undrafted player or player drafted after round 9) with his remaining YPC average was -0.23. We'd expect a negative correlation here: the lower (better) the round a player was drafted in, the higher his expected yards per carry average.

This entry was posted on Tuesday, July 11th, 2006 at 2:05 am and is filed under General, Statgeekery.