Ten thousand seasons
You'd better read yesterday's post if you haven't yet.
So the plan is to simulate an NFL season a bazillion times and observe what kind of wacky stuff happens. Here are the particulars.
For each simulated season, I will assign each team a true strength which is a random number from a normal distribution with mean 0 and standard deviation 6. This means that the teams' true strengths are mostly somewhat close to zero. In particular, roughly two-thirds of all teams will have true strengths between -6 and +6, and about 95% of all teams will have true strengths between -12 and +12. As you probably guessed, these numbers were rigged so that they generally agree with the values that the simple rating system produces for real NFL seasons in this decade.
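Here is a minimal sketch of that strength draw, in Python. The function name, the seed, and the number of trial seasons are illustrative assumptions, not the code actually used for the post; the sketch only checks that draws from a normal distribution with mean 0 and standard deviation 6 land within one and two standard deviations about as often as described above.

```python
import random

def draw_true_strengths(n_teams=32, mean=0.0, sd=6.0, rng=None):
    """Draw one season's worth of true strengths (hypothetical helper)."""
    rng = rng or random.Random()
    return [rng.gauss(mean, sd) for _ in range(n_teams)]

# Fixed seed (an arbitrary choice) so that reruns give the same draws.
rng = random.Random(2006)

# Pool the strengths from 2,000 trial seasons of 32 teams each.
draws = [s for _ in range(2000) for s in draw_true_strengths(rng=rng)]

# Share of teams within one and two standard deviations of zero.
within_one_sd = sum(abs(s) <= 6 for s in draws) / len(draws)
within_two_sd = sum(abs(s) <= 12 for s in draws) / len(draws)

print(f"between -6 and +6:   {within_one_sd:.1%}")
print(f"between -12 and +12: {within_two_sd:.1%}")
```

Note that nothing here forces a single season's 32 strengths to average exactly zero; each team is drawn independently.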
You'll note that, even though it will be true for a real NFL season, I am not requiring that the teams' strengths in a given year average zero. Even though we can't observe it (at least not easily), there must surely be years when the league is stronger and years when it's weaker. And in any case, since we are primarily interested in questions like "how often does the best team in football (for that year) win the Super Bowl," it doesn't matter much.
Each simulated season had the same league structure and schedule as the 2005 NFL. That is, there were 32 teams divided into eight divisions of four teams each, and every team played the schedule it played in the 2005 NFL.
There is one potential complication here, but I think it's minor. In the simulated world, each season is independent of the previous one, so the two intra-conference games in each team's schedule that are determined by last season's finish are instead essentially against random teams. In the real NFL, the seasons are not independent and good teams probably end up playing very slightly stronger schedules in general than bad teams do. Fortunately, this effect isn't nearly as dramatic now as it was in the 80s and 90s.
Also, I was too lazy to program the tiebreakers. All ties were broken by coin flip. I don't think this will affect anything, but let me know if you think I'm wrong about that.
Finally, the individual games are played by using the same formula we used in this post:
Home team prob. of winning =~ 1 / (1 + e^(-.438 - .0826*diff))
where diff is the home team's true strength minus the visiting team's true strength.
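As a rough sketch, the formula translates into Python like this. The constants come straight from the formula above; the function names and the coin-flip helper are my own illustrative choices, not the original program.

```python
import math
import random

HOME_EDGE = 0.438  # constant term from the formula above
SLOPE = 0.0826     # coefficient on diff from the formula above

def home_win_prob(home_strength, away_strength):
    """Probability that the home team wins, per the logistic formula."""
    diff = home_strength - away_strength
    return 1.0 / (1.0 + math.exp(-HOME_EDGE - SLOPE * diff))

def home_team_wins(home_strength, away_strength, rng=random):
    """Play one game: True if the home team wins (hypothetical helper)."""
    return rng.random() < home_win_prob(home_strength, away_strength)

# Two evenly matched teams: only home field separates them.
even_match = home_win_prob(0.0, 0.0)

# A +6 home team hosting a -6 visitor (diff = 12).
strong_host = home_win_prob(6.0, -6.0)

print(f"even match, home team wins: {even_match:.1%}")
print(f"+6 hosting -6, home team wins: {strong_host:.1%}")
```

With diff = 0 the formula gives the home team a bit better than a 60% chance, so the constant term is carrying the home-field advantage.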
OK, that's that. Let's get to the question of the day, which is: how often does the best team in the NFL win the Super Bowl?
The answer is roughly 24% of the time.
I simulated 10,000 seasons. The table below shows that the best team won the Super Bowl 2,399 times, the second-best team won it 1,448 times, and so on.
Tm# SBwins
==========
1 2399
2 1448
3 1060
4 846
5 670
6 584
7 464
8 388
9 327
10 285
11 231
12 189
13 188
14 151
15 141
16 122
17 113
18 72
19 70
20 55
21 42
22 35
23 36
24 22
25 22
26 15
27 12
28 4
29 4
30 3
31 1
32 1
[NOTE: if you thought this table looked slightly different earlier, you're not seeing things. I accidentally included the wrong table at first, so I updated it about an hour later.]
Very nearly 50% of the time, the Super Bowl champion was one of the best three teams in football. And let me reiterate that when I say "the best team," I am not necessarily talking about the team with the best record. I am talking about the best team. Remember, we're omniscient here. We know which team really was the best.
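If you want to check shares like that yourself, here is a small Python sketch that tallies the table. The counts are copied from the table above; the variable names are just illustrative.

```python
from itertools import accumulate

# Super Bowl wins by true-strength rank (1 = best), copied from the table.
sb_wins = [2399, 1448, 1060, 846, 670, 584, 464, 388,
           327, 285, 231, 189, 188, 151, 141, 122,
           113, 72, 70, 55, 42, 35, 36, 22,
           22, 15, 12, 4, 4, 3, 1, 1]

total = sum(sb_wins)                     # should equal the 10,000 seasons
cumulative = list(accumulate(sb_wins))   # titles won by the top-N teams

top1_share = cumulative[0] / total
top3_share = cumulative[2] / total

print(f"seasons tallied: {total}")
print(f"best team:       {top1_share:.1%}")
print(f"top three teams: {top3_share:.1%}")
```

The top three rows add up to 4,907 of the 10,000 titles, which is where "very nearly 50%" comes from.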
I'm sure what caught your eye was that the 32nd-best (i.e. the worst) team in the NFL won the title once. Let me tell you about that season.
It was simulated season #6605. The Seattle Seahawks were truly a great team (true strength +15.1) and they played up to their potential, posting a 15-1 regular season record. The Chicago Bears were the worst team in football, but with a true strength of -9.0, they really weren't that bad, at least by worst-team-in-football standards. The NFC North was relatively weak, and Chicago took the division with an 8-8 record.
The Bears' first round playoff opponent was the Carolina Panthers, who were not great (+2.8) but had posted a 10-6 record to finish second in the NFC South. The game was in Chicago, of course, and it was therefore only a mild upset when Chicago won it. Chicago then beat the Saints in New Orleans and the Seahawks in Seattle to reach the Super Bowl.
The AFC was weak in 6605. The best they had to offer was the Jets (+7.2) who had gone 12-4 in the regular season and had beaten the Colts on the road to reach the Super Bowl. The Bears beat the Jets to win the title.
As James points out in his article, there is no single event here that is too hard to believe. It's not unlikely that there wouldn't be any truly terrible teams in the NFL in a given year. It's not unlikely that an entire division would be weak, and it's not unlikely that the worst team in such a division could win the title with an 8-8 record. In their four playoff games, their probabilities of victory were 37%, 10%, 8%, and 21%. That they'd win those four games is certainly unlikely, but no more unlikely than, say, an NL team getting four straight hits at the bottom of their batting order, and I'll bet you've seen that.
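To put a number on that playoff run, here is a quick Python sketch that multiplies the four game probabilities together. It assumes the games are independent, as they are in the simulation, and it uses the rounded percentages quoted above, so the result is approximate.

```python
import math

# Bears' win probabilities in their four playoff games (rounded, from the post).
game_probs = [0.37, 0.10, 0.08, 0.21]

# Independent games: the chance of winning all four is the product.
run_prob = math.prod(game_probs)

print(f"chance of winning all four: {run_prob:.4%}")
print(f"roughly 1 in {1 / run_prob:,.0f}")
```

That works out to a bit better than 1 in 2,000, and that is only the playoff part: it takes for granted that the team got into the playoffs and drew those opponents in the first place.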
No one of those things is terribly bizarre. Yet they all come together to create an almost-unbelievable occurrence. Almost unbelievable. Ten thousand years is a long time. Most of you have probably been watching NFL football for 20 or 30 years, and think of all the crazy stuff you've seen in that time. If you lived another 500 lifetimes, you'd see some even crazier stuff.
Do you think you'd ever see a team like the 2005 Jets win a Super Bowl? And I'm not talking about the Jets if Pennington and Curtis Martin had stayed healthy. I'm talking about the Brooks Bollinger Cedric Houston 2005 New York Jets. If you gave that team 10,000 tries, would they win a Super Bowl? Before you say no, think about all the times you've seen a really bad team rattle off three or four unexpected victories; think of the Craig Krenzel-led Bears during that stretch in 2004, for example. Such runs are unlikely, but you've seen lots of them. Don't you think that, in 10,000 years, some team could string a couple of those runs together, get some breaks from the schedule, and then fluke out in the playoffs?
It could happen.
This entry was posted on Thursday, June 1st, 2006 at 4:31 am and is filed under Statgeekery. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

Some questions:
1. Did you change the formula to simulate the Super Bowl, since neither team is a true home team?
2. How many 16-0 or 0-16 teams were there?
3. Were ties possible or was there a winner in every game?
4. What is the longest streak of the same team winning the Super Bowl in consecutive years?
5. How many Super Bowls can I expect the Bills to win in the next 10,000 seasons?
This is a great post, and it pretty well matches up with what I would have guessed: The best team wins the Super Bowl less than half of the time, and it is possible for some bizarre series of flukes to result in the worst team winning the whole thing. I gained a lot of respect for Bill Cowher a couple of weeks ago when I saw him on TV and he said, "We weren't the best team last year, we just played the best at the end." That's exactly right, and it's too bad most fans aren't as thoughtful as Cowher and would take it as an insult if you told them that their favorite team wasn't the best team the year it won the title.
Great, great stuff. Thanks Doug!
Bill M,
Yes, the formula did indeed take into account a neutral site Super Bowl each year.
No ties.
About the winless and undefeated teams, that will appear in a future post (most likely tomorrow or Monday). I plan to milk this idea for several posts.
This is all very interesting, but I think not knowing who the "best" team is at the end of the regular season is the essence of sports. You have the NFL's seeding process which assigns a team in each conference the "1st seed" - home field advantage, first round bye, yada-yada-yada. I am sure that fans of many a 2 seed, 3 seed etc can argue if we "got that field goal, didn't have that game losing INT for TD, etc" - their team would be the 1 seed. I am less concerned with some team having some slightly lower computer generated rating winning over some team that has a slightly higher computer generated rating. If the Cowboys lose to the 1998 Cards - TOO F'ING BAD - better luck next time. How about some analysis of how the NFL's seeding system (of byes and homefield advantage) affects "lesser rated teams" when they are seeded higher than "higher rated teams". Now THAT would be interesting. Like the NCAA Tourney - the #1 team in the country might not win every year but you gotta be DAMN good to crawl through all six straight games and even if you are Cinderella, you deserved to win and the teams you beat deserved to lose and there is no sympathy for crying about it. Without huge upsets there would be no sour grapes and without sour grapes there would be no making fun of fans of the teams you hate. Cowboy fans are BITTER about the 1994 loss to the 49ers and BITTER about the 1998 loss to the Cards - as well as a few others (when was the Jackie Smith drop - GOD I STILL HEAR ABOUT THAT TODAY) and trust me, the more bitter the better. Anyway, I can't remember what my original point was. Good Luck Sir.
You may already have this planned, but I'd be interested in results specific to the Superbowl itself, i.e. once the two Superbowl participants are determined, how often does the better team in the Superbowl end up losing? And, how often does the #1 team at least make it to the Superbowl?
Champions get way too much credit. If the "best team" is the "champion" less than a quarter of the time, given the fact that everybody's enamored with "winners", it follows that winning championships is overrated!
The following conversation could legitimately happen 3 times out of every 4 years:
-------------------------------------
"Best" Team: "We're better than you!"
Super Bowl Winner: "Scoreboard!"
-------------------------------------
This being said, I am not too surprised that the best team should win only 24% of the time. Assuming that the best team usually has 2 playoff games and a Super Bowl to win, that means they have to win 3 games against what are often other strong teams. This means that if they had a 62% chance of winning each such game, they would have just a 24% chance of winning all 3.
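That back-of-the-envelope figure is easy to check. The Python sketch below assumes, as the comment does, three independent games with the same win probability in each; the variable names are illustrative.

```python
# Chance of winning three straight games at 62% apiece.
per_game = 0.62
three_in_a_row = per_game ** 3

# Going the other way: the per-game probability that yields a 24% title chance.
target = 0.24
needed_per_game = target ** (1 / 3)

print(f"0.62 cubed: {three_in_a_row:.1%}")
print(f"per-game probability needed for 24%: {needed_per_game:.1%}")
```

Both directions agree: a team needs to be roughly a 62% favorite in every playoff game just to win the title about a quarter of the time.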
Further, I generally see football games (or any sports contest) as each team using their talent, skill, coaching, cleverness, etc. to stack the deck in their favor to give themselves the best chance to win a given game. The game basically comes down to one team using their stacked deck to defeat the other team and their stacked deck. Therefore, winning or losing is not a direct function of who's better. A 60% winner is, alas, a 40% loser.
Interesting, I would have thought one of the top 3 teams would win it more than 50% of the time.
Would it be possible for you to do true strength calculations for the past 40 seasons and see how the results match up against your simulation?
Obviously there would be differences because of league size and schedule format, but I think it would be interesting to know how often the top team won the superbowl, and how often one of the top 3 does.
You made the Jets finally make the Super Bowl and then lose to the worst team in the league???
I think this is an exercise in circular logic. OF COURSE you're going to get that win distribution if you use that math.
You need to add some rigor from game theory here. Remember, there is ZERO intrinsic value to the participants in any result other than win or loss. Thus the only stats (or numbers) that matter are those that predict wins (or losses) well.
In other words, the proper conclusion from your results is that having any concept of a ranking for "team strength" or "best" is meaningless since it is only a weak predictor of who wins.
This reminds me of how my daughter plays chess wrong because she "likes the horses." Once you assign INTRINSIC value to something beyond winning, then you miss the whole point of the game.
To make this clearer, what I'm suggesting you should do is reverse the equations and "solve for diff."
E.g. 50% of the time, the Best team (the winner) had a diff number of ___ or above.
What this shows is that if a team theoretically focused on the things that improved their diff number, how that would predict resulting increase in wins. There might be some interesting inflection or "supplemental value." For example, once you get your diff up to X don't bother trying to improve it because the return (in wins) won't be worth it.
Sorry not "diff" but "true strength" number in that last post.
There is absolutely nothing wrong with this viewpoint. It's simply not the viewpoint I'm taking in this post. I will cheerfully admit that the exercise is meaningless if that's your definition of "best."
I did know that I was going to get something different from your distribution:
But I didn't know what I was going to get. Instead of 24%, I might have gotten 11% or 53% or 29%. That was the point.
Twenty-four percent is shockingly high. I wonder if it'd be substantially lower if each team's true strength varied from week to week like it does in real life (where the "best team" would be the one with the highest average true strength over the course of the season [and postseason?]).
My point wasn't to dismiss the exercise. My point was, the exercise could be done to help determine what the "true strength" number really is (that is, corresponds to) in real life.
For example, James figured that the highest correlation to winning in MLB was run differential. If we can find some stat about a NFL team, where if they rank 1st in that stat they win the championship 24% of the time, then we have found what that "true strength" number is. Then we can ask, are there other numbers with a higher correlation? The number with the highest propensity to predict a winner is the best approximation at true strength.
Changing the definitions makes it a more interesting exercise.
Also, what I find really intriguing is that your exercise might suggest there is an upper limit of correlation between a team's inherent "strength" property and the outcome of the season. That is, can any inherent property of a team better the 24% correlation to them winning it all?
That in turn would suggest that teams are NOT better off maximizing any measurable property of their team for a few short years, but should instead attempt to be "just good enough" over many more years in order to maximize their chances of winning a championship.
Is 10 years of going 10-6 more likely to produce a championship than 2 years at 14-2 and the rest missing the playoffs?
I think what you have here is the inverse of half of your bell curve distribution. If you graph the numbers, you get a logarithmic curve. The thing to be gleaned from this is not how often the “best” team wins the superbowl, but how much more often than the next “best” team. In this case, the 1 team wins it 66% more than the 2 team (2399-1448=951, 951/1448=66%). This is extended to 125% over 3, 184% over 4 and 258% more than 5. The increase is about 100% per team over team over the first 10 teams, and it takes off from there (a curve once again). So, while the absolute number is only 24%, it is still 2/3 better to be the king than the next best thing.
[...] You’d better read Thursday’s post and Wednesday’s if you haven’t yet. Thanks to the many who posted thoughtful comments during the weekend, and apologies for not giving them the thought they deserve. I had a busy weekend. But I will do my best to address some of them when and if I get a chance. [...]
[...] I had intended to continue the ten thousand seasons experiment for another day or so, by investigating different playoff formats. The programming for that has proved more challenging than I thought it would. It’s not so challenging that I can’t get it done at some point, just challenging enough that I can’t get it done right now. I’ll get back to it soon. [...]
[...] A few months ago, Doug Drinen wrote a series of fascinating articles on his blog based on 10,000 simulations of the NFL season. One of the surprising findings of his study was that the strongest team in the NFL is likely to win the Superbowl only about 24% of the time, and even the worst team in the NFL could win the Superbowl by random chance. [...]
I was actually thinking about this last night. The 24% figure sounds about right. If the top team is also a number 1 seed, they'd have a 25% chance of winning the SB if they have a 70% chance of winning their first game, a 65% chance of winning their second, and a 55% chance of winning the SB. That seems about right.
[...] team benefits more from the breaks. This random component is so pervasive that, according to a fascinating study published a few years ago by Doug Drinen, the strongest team in any particular NFL season is [...]
[...] structure more than doubles the chances of the best team winning compared to an NFL-like season. An earlier study by Pro Football Reference’s Doug Drinen suggested the best NFL team wins about 24% of the time. [...]
[...] over the entire season, etc.) team wins the NCAA Tournament. We know that the NFL’s best team wins the Super Bowl about 24% of the time, that the best team in baseball wins the World Series about 29% of the time (or at least, they did [...]
I'm sure what caught your eye was that the 32nd-best (i.e. the worst) team in the NFL won the title once. Let me tell you about that season.
It was simulated season #6605 .... The Chicago Bears were the worst team in football, but with a true strength of -9.0, they really weren't that bad, at least by worst-team-in-football standards. The NFC North was relatively weak, and Chicago took the division with an 8-8 record.
The Bears' first round playoff opponent was the Carolina Panthers, who were not great (+2.8) but had posted a 10-6 record to finish second in the NFC South. The game was in Chicago, of course, and it was therefore only a mild upset when Chicago won it. Chicago then beat the Saints in New Orleans and the Seahawks in Seattle to reach the Super Bowl.
The AFC was weak in 6605. The best they had to offer was the Jets (+7.2) who had gone 12-4 in the regular season and had beaten the Colts on the road to reach the Super Bowl.
The Bears beat the Jets to win the title...
~~~
Of course they beat the Jets. The Jets finally get to the Super Bowl again, and lose to the worst team to come out of the NFC in 10,000 years.
It just shows: The contract Weeb Ewbank signed to sell the team's soul to win SB III against his former employer was so thorough that even in the 1-in-10,000 years scenario where the worst team in football gets to the Super Bowl, the Jets can't beat 'em.
Not even in a computer.
My Jetsies! Let's Go Jets!