Another ranking system

Posted by Doug on May 9, 2006

I'm essentially writing these down for my own benefit, so that if I forget how some of these things work I'll have a document to refer to. If you enjoy reading along, sit a spell. If not, I should be on to different topics tomorrow.

There is a particular style of argument, rarely used in NFL discussions but a staple for college football fans, that is tempting to use because it is based on a very reasonable premise but that is always doomed to lose. You might call it the argument by transitivity. Notre Dame is better than LSU because Notre Dame beat Tennessee and Tennessee beat LSU. Oregon is better than Notre Dame because Oregon killed Stanford and Notre Dame barely beat them. Arizona State is better than Auburn because they beat Northwestern who beat Wisconsin who beat Auburn.

As you know, this argument can't be taken seriously because it can be used to prove that just about any team is better than just about any other team. If you want to have a little fun with it, this page will let you do just that. Now indulge me briefly while I break down the mathematics of this argument.

The scoreboard says:

Tennessee beat LSU by 3

It's not much of a stretch from there to:

Tennessee is 3 points better than LSU

If you wanted to construct a mathematical model out of that bit of information, you might do this:

R_ten - R_lsu = 3

where R_ten is Tennessee's rating and R_lsu is LSU's. Put that with the rest of your data, though, and your mathematical model is shot. It looks like this:

R_ten - R_lsu = 3
R_lsu - R_vandy = 28
R_vandy - R_ten = 4
[. . . about 800 more equations . . .]

You've got about 800 equations and about 120 unknowns, but you can already tell that there will be no solution. Tennessee's rating has to be bigger than LSU's, LSU's has to be bigger than Vanderbilt's, and Vanderbilt's has to be bigger than Tennessee's. Impossible. Mathematically speaking, there is simply no way to assign a number to every team in such a way that all the results match up with the numbers exactly. That's why the argument by transitivity fails.
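To see the contradiction in numbers, here is a minimal Python sketch that adds up the three equations above. The team keys and the helper name `cycle_sum` are made up for illustration; only the three margins come from the equations.

```python
# Each game is (winner, loser, margin), copied from the three equations above.
games = [("ten", "lsu", 3), ("lsu", "vandy", 28), ("vandy", "ten", 4)]

def cycle_sum(games):
    """Add up both sides of R_winner - R_loser = margin across all games."""
    coeff = {}        # net coefficient on each team's rating on the left side
    total_margin = 0  # sum of the right sides
    for winner, loser, margin in games:
        coeff[winner] = coeff.get(winner, 0) + 1
        coeff[loser] = coeff.get(loser, 0) - 1
        total_margin += margin
    return coeff, total_margin

coeff, total_margin = cycle_sum(games)
print(coeff)         # every rating cancels out of the left side
print(total_margin)  # but the right side adds up to 35
```

Every rating appears once with a plus sign and once with a minus sign, so the left side collapses to 0 while the right side is 35. No choice of ratings can make 0 equal 35.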

At this point, you probably think I'm insulting your intelligence. You understood all that without me having to get all mathy on you. But I needed to get all mathy to describe what happens next. We know the argument by transitivity doesn't work. But it's still popular, and the reason is that its premise is reasonable. So let's add some extra stuff to give the argument a bit of wiggle room. When Tennessee beats LSU by 3, instead of saying:

R_ten - R_lsu = 3

I'll say

R_ten - R_lsu = 3 + e1

The extra e1 is a fudge factor. The above equation says, "The difference between Tennessee and LSU is 3 points plus or minus some other stuff that didn't show up on the scoreboard." So our collection of equations now looks like this:

R_ten - R_lsu = 3 + e1
R_lsu - R_vandy = 28 + e2
R_vandy - R_ten = 4 + e3
[. . . about 800 more equations . . .]

Remember that the es represent the stuff that didn't show up on the scoreboard. Since we want our ranking system to be objective, we take the viewpoint that the scoreboard is what matters and the es are there only because they have to be. So what we want to do is make the combined size of the es as small as possible. (For technical reasons that aren't important to the argument, we will want to minimize the sum of the squares of the es rather than the es themselves, but don't worry about that.)
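As a rough illustration of "the combined size of the es," the sketch below scores a candidate set of ratings by the sum of squared es over the three example games. The function name `total_squared_error` and the two candidate rating sets are hypothetical, chosen only to show how the score behaves.

```python
# Each game is (winner, loser, margin), from the three example equations.
games = [("ten", "lsu", 3), ("lsu", "vandy", 28), ("vandy", "ten", 4)]

def total_squared_error(ratings, games):
    """Sum of e^2, where e = R_winner - R_loser - margin for each game."""
    total = 0.0
    for winner, loser, margin in games:
        e = ratings[winner] - ratings[loser] - margin
        total += e * e
    return total

# Candidate 1: call every team equal.
all_equal = {"ten": 0.0, "lsu": 0.0, "vandy": 0.0}
# Candidate 2: match the first two games exactly and let the third absorb the rest.
trust_first_two = {"ten": 3.0, "lsu": 0.0, "vandy": -28.0}

print(total_squared_error(all_equal, games))
print(total_squared_error(trust_first_two, games))
```

The first candidate scores 809 and the second scores 1225. Matching two games perfectly dumps the whole 35-point contradiction onto the third game, and squaring punishes that one big miss more than three moderate ones.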

Imagine that you have three dials --- one marked Tennessee, one marked LSU, and one marked Vanderbilt --- on a control panel. You can increase or decrease a team's rating by turning its dial. Now imagine that the total (squared) e is the volume. The object is to make the volume as low as possible. If you tune the Tennessee dial higher, then the volume from e1 goes down, but the volume from e3 goes up. As you tune LSU's dial, it affects the volume of e1 and e2, and Vandy's affects e2 and e3. The idea is to tune all three dials to a place that achieves the lowest possible volume. Now add 117 dials, each of which affects 11 or 12 es, tune to the lowest possible volume and you've got yourself a rating for all Division I college football teams.
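One way to picture the dial-turning in code is plain gradient descent: nudge every dial a little in whichever direction lowers the volume, and repeat. This is only a sketch of that idea, not necessarily how the ratings are actually computed. The name `tune_dials`, the step size, and the number of rounds are assumptions picked for this three-team toy example.

```python
# Each game is (winner, loser, margin), from the three example equations.
games = [("ten", "lsu", 3), ("lsu", "vandy", 28), ("vandy", "ten", 4)]

def tune_dials(games, step=0.1, rounds=500):
    """Lower the sum of squared es by repeatedly nudging every rating.

    step and rounds are assumed values that work for this tiny example;
    a full schedule would likely need a smaller step.
    """
    ratings = {}
    for winner, loser, _ in games:
        ratings.setdefault(winner, 0.0)  # every dial starts at zero
        ratings.setdefault(loser, 0.0)
    for _ in range(rounds):
        # slope[team] = how fast the volume rises if that team's dial goes up
        slope = {team: 0.0 for team in ratings}
        for winner, loser, margin in games:
            e = ratings[winner] - ratings[loser] - margin
            slope[winner] += 2 * e
            slope[loser] -= 2 * e
        for team in ratings:
            ratings[team] -= step * slope[team]  # turn each dial downhill
    return ratings

ratings = tune_dials(games)
for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(team, round(rating, 2))
```

For these three games the dials settle at about 8.33 for LSU, -0.33 for Tennessee, and -8.0 for Vanderbilt. Only the gaps between ratings matter, since adding the same number to every dial leaves every e unchanged. Starting all dials at zero happens to keep the average rating at zero.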

The lower the volume, the lower the sum of the squared es and hence the better that set of ratings matches up with the actual game results. What we want to do is to find the lowest possible total, out of all possible sets of ratings. That would be the set of ratings that is the best match for the actual data. A computer, properly programmed, can find this collection of ratings.

To summarize: if you want to play the transitivity game with any set of ratings, you're going to run into some contradictions. It's unavoidable. This system is designed to run into as few contradictions as possible. Or, more precisely, to minimize the total magnitude of all the contradictions.

OK, now here's the neat thing: the ranking system described above turns out to be the same as the one described yesterday. The descriptions are different and the mathematical tools used to get the answer are different, but you end up in the same place.

Have you ever, in your life, seen anything cooler than that?

This entry was posted on Tuesday, May 9th, 2006 at 4:15 am and is filed under BCS, Statgeekery.