## Gray ink reflections

Posted by Doug on February 16, 2007

Two days ago I presented a gray ink test for football players. The name is borrowed from Bill James' analogous test for baseball players and the purpose of the thing is to put a single number on the quality and quantity of a particular player's league-leading or near-league-leading seasons. Having had a couple of days to wade through the lists and reflect upon them, I have a couple of thoughts.

First, I decided the system was potentially a little too sensitive to the particular stats of the #10 (or #5) player. If there is a huge gap between #9 and #10, or between #10 and #11, then the stats of the #10 guy don't really reflect what I think they're supposed to reflect, which is the approximate production of a guy in roughly that position. So I decided to smooth things out a little by averaging the stats of the 9, 10, and 11 guys to get the baseline #10 production. Likewise, I averaged the 4, 5, and 6 players when using a #5 baseline.
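That smoothing is easy to sketch in Python (the yardage figures below are invented for illustration):

```python
# Baseline smoothing: average the players ranked just above, at, and just
# below the target rank, instead of taking the target rank's stats alone.
# (Yardage numbers are made up for illustration.)

def smoothed_baseline(sorted_yards, rank):
    """sorted_yards: season yardage totals, sorted descending.
    rank: the baseline rank, e.g. 10 (averages #9, #10, #11)."""
    window = sorted_yards[rank - 2 : rank + 1]  # ranks are 1-indexed
    return sum(window) / len(window)

yards_2006 = [1500, 1400, 1350, 1300, 1250, 1200,
              1150, 1100, 1050, 1000, 950, 900]
print(smoothed_baseline(yards_2006, 10))  # average of 1050, 1000, 950
print(smoothed_baseline(yards_2006, 5))   # average of 1300, 1250, 1200
```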

But that's pretty minor.

The major thing I noticed is that the really great seasons get *a lot* of credit. The point of this metric, of course, was to give credit to just those seasons, but I think it might go too far.

For example (working with receiving yards and a baseline of #10), Harold Carmichael's 1973 earns him 799 points while Chad Johnson's 2006 nets him 175. Both led the league in receiving, but one gets 624 points more than the other. I specifically said last time that its ability to distinguish between a truly great league-leading season (like Carmichael's) and a fairly weak league-leading season (Johnson in 2006 just happened to be at the top of a group of 6 guys separated by fewer than 100 yards) was one of the selling points of this method. But I'm wondering if I haven't overdone it.

And in 1978, Carmichael got 306 points for finishing third in the league in receiving. Isaac Bruce got roughly half that (158) for *leading* the league in receiving in 1996. Is that right? One could certainly argue that it is. Even though he led the league, Bruce was just a handful of catches from finishing out of the top 10; the fact that he was at the top of a homogeneous pack instead of in the middle or at the bottom is not very important. But still, he did lead the league.

While the lists produce a pretty nice mixture of receivers from all eras, I look at some of those 70s seasons --- like the Carmichael seasons mentioned above, Cliff Branch's 692 points in 1974, and Drew Pearson's 684 points for finishing second in the same season --- and wonder if they aren't being over-credited. In 1974, there were 26 teams, most of which utilized only two receivers at the most. In 2006, there were 32 teams, many of which used several wide receivers extensively. The #10 receiver in 1974 was #10 of around 50 or 60 "meaningful" receivers, while the #10 receiver in 2006 is #10 of about 80. Am I wrong about this?

Why not look at top 5 percentiles instead?

That's a good idea in theory, silentdibs, but there are some details to work out.

When finding the 95th percentile, are you counting all guys that played a game? All guys that caught a pass? All guys that caught at least X passes? Does X have to slide with the year?

I remember being surprised (since I'm a VBD guy) that you used division instead of subtraction. Why not simply subtract the baseline number from the individual player's number? That will help alleviate the problem from the '70s when some players got really low stats.

BTW unless I'm doing something wrong Carmichael "only" gets 530 points for 1973.

Chase, I used a division *in addition to* a subtraction, not instead of a subtraction. The important thing is the subtraction. The division is just there to ensure that all the subtractions are on equal footing in some sense. More specifically, here is the formula (where PY is the player's yards and BY is the baseline yards):

1000 * (PY - BY) / BY

That's the same as:

(1000 * PY / BY) - 1000

Another way to look at it is that I'm subtracting normalized yards instead of subtracting raw yards. This feels right to me because a 1973 yard is harder to obtain and is worth more than a 2006 yard.
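A minimal Python sketch of the two equivalent forms, using the PY/BY shorthand that appears later in the thread (the yardage numbers are illustrative):

```python
# points = 1000 * (PY - BY) / BY: subtract the baseline, then divide by it
# so that subtractions from different eras are on equal footing.

def gray_ink_points(py, by):
    return 1000 * (py - by) / by

# The equivalent "normalized yards" view: rescale the season so the
# baseline is worth 1000 yards, then subtract the 1000-yard baseline.
def gray_ink_points_normalized(py, by):
    return 1000 * py / by - 1000

print(gray_ink_points(1200, 800))             # 500.0
print(gray_ink_points_normalized(1200, 800))  # 500.0
```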

I don't have time to fully investigate Carmichael's 1973 score, but I suspect the discrepancy is due to the fact that you are looking at the top 10 in receiving yards, whereas I'm looking at the top 10 *WRs* in receiving yards (something I now realize I never made clear).

Good point on Carmichael, Doug, that is the source of my confusion. And I think you're right: we want to look at the top 10 WRs, not the top ten leaders in receiving yards.

As for your formula, I understand it. I'm not sure whether I agree or disagree with it, but I certainly think tweaking it would "solve" your problem. Subtracting normalized yards might seem fairer, but it has the potential to overreward players from the early eras. Regardless, I think strong arguments could be made that when the baseline = 800 in year X and 1000 in year Y, 1200 yards in year X is equal to either 1400 yards in year Y or 1500 yards in year Y. Alternatively, you could use straight subtraction but then multiply seasons with fewer than 16 games by a factor of 16/N. So now we've got arguments to make 1200 yards in year X equal 1400, 1457, or 1500 yards.
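Those three equivalencies can be checked with a quick calculation (assuming, for illustration, that year X is a 14-game season and year Y a 16-game season):

```python
bx, by = 800, 1000          # baselines in year X and year Y
px = 1200                   # player's yards in year X
games_x, games_y = 14, 16   # assumed season lengths

# 1) straight subtraction: same margin over the baseline
eq_subtraction = by + (px - bx)
# 2) straight subtraction with the 16/N short-season factor
eq_per_game = by + (px - bx) * games_y / games_x
# 3) normalized (division-based): same ratio to the baseline
eq_normalized = by * px / bx

print(eq_subtraction, round(eq_per_game), eq_normalized)  # 1400 1457 1500.0
```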

(Upon further review with Carmichael, I'm getting 690. I can get 175 for Chad Johnson without a problem, so I don't think it's entirely me messing up. I'm using Biletnikoff's 660 as the baseline. I really hope this doesn't end up like our Jerry Rice games played discussion.)

What about summing all WR yardage gained for a season and dividing by the number of WRs who caught at least one pass? Use that as the baseline. Or, maybe use that times 1.5 as the baseline.

Would that help correct for today's game, where teams have many WRs catching passes?
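A sketch of that baseline, with the optional 1.5 multiplier (the yardage totals are hypothetical):

```python
def avg_wr_baseline(wr_yards, multiplier=1.0):
    """wr_yards: yardage for every WR who caught at least one pass."""
    return multiplier * sum(wr_yards) / len(wr_yards)

yards = [1400, 1200, 1000, 800, 600, 400, 200, 100, 60, 40]
print(avg_wr_baseline(yards))        # 580.0
print(avg_wr_baseline(yards, 1.5))   # 870.0
```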

Say there's a year where you have these guys at the top of the list (just to throw out some names):

Jerry Rice

Michael Irvin

Marvin Harrison

Chad Johnson

Torry Holt

And say Rice leads the league in yards, but all these guys are close behind.

Now say that the next year, all those other guys are hurt, and Rice's competition for the top spot is much diminished, even assuming that everyone else's numbers are static.

It seems that he'll get more points in the second year than in the first, because he will lead by a greater amount, and the average will be lower as well.

Why does Rice get penalized under this formula for leading the league against much better competition in the first year, compared to the second? Is his second year really better than the first, assuming he had the same number of yards in each? What if he actually had more yards in the first season, but did not lead the league that year?

In short, it's difficult to say how valid normalization is, because it's impossible to isolate the necessary variables. One guy can lead the league because he's really amazingly good, or he can lead the league because everyone else sucks. Simple normalization won't be able to tell you which.

Why not just use deviations above the mean instead? If the #10 WR yardage is what you want to use as a baseline, just continue adding the top WRs of each year until the average used for the StDev calculation is approximately the same as the #10 WR.
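A sketch of that z-score approach (hypothetical data; `statistics.pstdev` stands in for whatever StDev calculation is intended):

```python
from statistics import mean, pstdev

def z_score(player_yards, group_yards):
    """How many standard deviations a season sits above the group mean."""
    return (player_yards - mean(group_yards)) / pstdev(group_yards)

# Hypothetical top-10 WR yardage totals for one season
top_wrs = [1500, 1400, 1300, 1250, 1200, 1150, 1100, 1050, 1000, 950]
print(round(z_score(1500, top_wrs), 2))
```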

"Why does Rice get penalized under this formula for leading the league against much better competition in the first year, compared to the second?"

Because in the first example, we THINK Rice was the best receiver in the league. He has a slight statistical edge, but it's close enough that other variables (strength of schedule, QB quality, game situation, etc.) might offset that edge.

"One guy can lead the league because he’s really amazingly good, or he can lead the league because everyone else sucks."

Either way, the point is, he was the dominant receiver in the league that year. I don't think the quality of the league is relevant, when you're talking about different eras in the same league. If you want to apply a similar formula to NFL Europe and find their top receiver, obviously that guy wouldn't be better than the NFL's number 10 guy.

In my example, I should have been more explicit that everything else would remain static - SoS, QB quality, game situation, etc.

The only thing that changes are the few guys in competition for the top spot. Every other player remains exactly the same.

In this example, Rice gets a much higher score in the second year, despite that he basically wins by default because there's no competition. With exactly the same stats, he gets fewer points in the first year, even though the other guys competing for that spot had zero effect on his individual stats.

"In this example, Rice gets a much higher score in the second year, despite that he basically wins by default because there’s no competition."

Well, why is there no competition? Did everyone else get injured? Then Rice deserves credit for durability. Did everyone else decline due to age? Then Rice deserves credit for, uh, not declining due to age.

"With exactly the same stats, he gets fewer points in the first year, even though the other guys competing for that spot had zero effect on his individual stats."

They don't affect his stats; they put his stats in context. The first year, it was really easy to gain (for example) 1500 yards. In year two, for whatever reason, it got much harder, but Rice still pulled it off.

"The #10 receiver in 1974 was #10 of around 50 or 60 “meaningful” receivers, while the #10 receiver in 2006 is #10 of about 80. Am I wrong about this?"

You're not wrong, but I think you're looking at it backwards. Remember, with major league athletes, your sample is the extreme high end of the bell curve. Thus, the lower the caliber of player, the more common the player.

There's a big difference between #1 and #20. There's a smaller difference between #20 and #40. There's a tiny difference between #40 and #60. There is virtually no difference between #60 and #80.

So being #10 out of 60 is just as impressive as being #10 out of 80, because 30 years ago, #61 through #80 wouldn't even have been in the league.

Doug, your formula ONLY has division, and no subtraction, as it can be rewritten as:

1000 * (PY/BY - 1)

And, I don't see why you would subtract the "1". Otherwise, you are arguing that anyone under the #10 receiver is a negative.

The key might be to modify that "-1" to something else. If a league leader's PY/BY is typically 1.5 (and the #10's score is 0, obviously), then how many #10 seasons equal a league leader who never reappears on any list? Surely 5 #10 seasons are better than 1 #1 season?

A PY/BY - .7 would make these two guys equivalent. You just have to play around with whatever equivalencies you want to make.
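The effect of swapping the "-1" for a smaller constant is easy to play with (1.5 is the illustrative league-leader ratio from the comment above):

```python
def points(py_over_by, c):
    # generalized score: 1000 * (PY/BY - c)
    return 1000 * (py_over_by - c)

for c in (1.0, 0.7, 0.5):
    leader = points(1.5, c)   # a typical league leader
    tenth = points(1.0, c)    # the #10 (baseline) player
    # how many baseline seasons add up to one league-leading season
    ratio = leader / tenth if tenth else float('inf')
    print(c, round(leader), round(tenth), round(ratio, 2))
```

Varying `c` changes how many baseline seasons a league-leading season is worth, which is exactly the equivalency being haggled over here.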

Vince, I totally agree with you about this:

"There's a big difference between #1 and #20. There's a smaller difference between #20 and #40. There's a tiny difference between #40 and #60. There is virtually no difference between #60 and #80."

...if you're talking about true talent levels. But we're talking about stats, and a big part of that is opportunity. In other words, while there is very little difference talent-wise between #40 and #80, there will be a big difference in their stats, because the #40 guy plays a lot and the #80 guy doesn't.

In 1973, the 80th-most-talented receiver in the league probably wasn't on the field for more than 5 offensive plays a game. In 2006, he's probably in for half the game.

So the issue in my mind isn't that the "average" receiver is more or less talented in 1973 than in 2006, it's that the league's passing yardage total is being split among 50 guys in 1973 and 80 guys in 2006.

Maybe a way to check on opportunity would be to look at a receiver's receptions compared to his team's receptions. This would be an indicator of his opportunity in the offense. I guess targets would be a better indicator, but I don't think stats are kept on those.
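That share is a one-liner (the numbers are hypothetical):

```python
def reception_share(player_receptions, team_receptions):
    """A receiver's share of his team's completions, as a rough opportunity proxy."""
    return player_receptions / team_receptions

# e.g. 90 catches on a team that completed 300 passes
print(round(reception_share(90, 300), 3))  # 0.3
```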