Data mining the ’07 divisional matchups

Posted by Doug on January 9, 2008

Last year I whipped up a fun method for predicting playoff games. The idea is this. Let's say you're trying to figure out the Chargers/Colts game this weekend. One way to go would be to find previous playoff games where the two teams had characteristics similar to those of the Chargers and Colts, then see how those games turned out. I don't have much to add, so I'll just quote last year's post, but with the Colts' and Chargers' data filled in.

For every playoff (non-Super Bowl) game since the beginning of the 12-team playoff format in 1990, I recorded the following bits of data for each team:

1. Their regular season record.

2. Their record in the last six games of the regular season.

3. Their regular season point differential.

4. Whether or not it was a home game.

5. Whether or not they had a bye the previous week.

Then I compared each team to its playoff opponent in each of those categories. So, for example, the Colts look like this for this weekend’s game against the Chargers:

Record: two games better than San Diego’s. This gets recorded as '+2'.

Last six games record: Colts 5-1, Chargers 6-0, so the Colts are one game worse: -1

Point differential: Indy’s was +188 and San Diego's was +128. So this is a +60 for Indy.

Home field: yes.

Bye last week: yes.
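
Here is a minimal sketch of how that five-number profile could be built. The `TeamSeason` class and `matchup_profile` function are hypothetical names, not anything from the original method, and the 13-win and 11-win totals are the teams' 2007 regular season records (the text above only states the two-game gap).

```python
from dataclasses import dataclass


@dataclass
class TeamSeason:
    """Hypothetical container for the five inputs listed above."""
    wins: int          # regular season wins
    last6_wins: int    # wins in the last six regular season games
    point_diff: int    # regular season point differential
    home: bool         # is this playoff game a home game?
    bye: bool          # did the team have a bye the previous week?


def matchup_profile(team: TeamSeason, opp: TeamSeason) -> tuple:
    """Return (Rec, L6, Marg, H, Bye) from the team's point of view.

    The first three entries are differences against the opponent;
    the last two are 0/1 flags for the team itself.
    """
    return (
        team.wins - opp.wins,
        team.last6_wins - opp.last6_wins,
        team.point_diff - opp.point_diff,
        int(team.home),
        int(team.bye),
    )


# 2007 Colts vs. 2007 Chargers, divisional round in Indianapolis
colts = TeamSeason(wins=13, last6_wins=5, point_diff=188, home=True, bye=True)
chargers = TeamSeason(wins=11, last6_wins=6, point_diff=128, home=False, bye=False)

profile = matchup_profile(colts, chargers)
print(profile)  # (2, -1, 60, 1, 1)
```

The printed tuple matches the Colts' row at the top of the first table below.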

Now I searched through all the playoff matchups from 1990-2006 to find the ones that look most like this one. Here are the top 15:

                   Rec L6  Marg  H Bye  SIM   Result
====================================================
ind                 2  -1    60  1  1
====================================================
dal 1992  d  phi    2   0    57  1  1   897  W 34-10
jax 1998  w  nwe    2  -1    46  1  0   886  W 25-10
pit 2001  c  nwe    2  -1    41  1  0   881  L 17-24
buf 1993  d  rai    2   0   107  1  1   853  W 29-23
nwe 2004  d  ind    2   0     6  1  1   846  W 20- 3
nwe 1996  c  jax    2  -1   115  1  0   845  W 20- 6
chi 2001  d  phi    2   0     0  1  1   840  L 19-33
ten 2000  d  bal    1  -1   -13  1  1   827  L 10-24
ind 2005  d  pit    3   0    61  1  1   799  L 18-21
nyg 1990  d  chi    2   1    56  1  1   796  W 31- 3
dal 1995  d  phi    2   0   164  1  1   796  W 30-11
dal 1995  c  gnb    1  -1    54  1  0   794  W 38-27
phi 2003  d  gnb    2   0   -48  1  1   792  W 20-17
buf 1991  c  den    1  -1    71  1  0   789  W 10- 7
buf 1990  d  mia    1   0    71  1  1   789  W 44-34

SIM is the similarity score; 1000 is the maximum. You can see that the teams at the top of the list do indeed have very similar profiles to the 2007 Colts in their matchup against San Diego.
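
The post never spells out how SIM is computed, so treat the following as a guess rather than the actual formula. One simple rule that happens to reproduce the SIM values in the tables here: start at 1000, subtract 100 for each game of difference in Rec, 100 for each game of difference in L6, 1 for each point of difference in Marg, and 100 for a mismatch in the bye flag. The home-field penalty (`home_penalty`) is purely an assumption, since every comparable listed is a home team and the tables can't reveal it.

```python
def similarity(target, candidate, home_penalty=100):
    """Inferred similarity between two (Rec, L6, Marg, H, Bye) profiles.

    This is NOT a formula stated in the post. The weights were chosen
    because they match the published SIM column; home_penalty is an
    unverifiable assumption.
    """
    rec_t, l6_t, marg_t, home_t, bye_t = target
    rec_c, l6_c, marg_c, home_c, bye_c = candidate
    penalty = (
        100 * abs(rec_t - rec_c)              # games of record difference
        + 100 * abs(l6_t - l6_c)              # games of last-six difference
        + abs(marg_t - marg_c)                # points of margin difference
        + home_penalty * abs(home_t - home_c)
        + 100 * abs(bye_t - bye_c)
    )
    return 1000 - penalty


colts_2007 = (2, -1, 60, 1, 1)

# A few rows from the Colts table: (label, profile, published SIM)
rows = [
    ("dal 1992 d phi", (2, 0, 57, 1, 1), 897),
    ("jax 1998 w nwe", (2, -1, 46, 1, 0), 886),
    ("ten 2000 d bal", (1, -1, -13, 1, 1), 827),
    ("buf 1991 c den", (1, -1, 71, 1, 0), 789),
]

for label, prof, published in rows:
    print(label, similarity(colts_2007, prof), published)
```

Ranking every 1990-2006 playoff matchup by this score and keeping the 15 highest would give a list like the one above, assuming the guess holds outside the rows shown.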

When you get to the bottom of the list, claiming similarity is a dicey proposition. The 1991 Bills’ matchup certainly has some similarity to this Colt matchup, but there are also some significant differences. For one thing, the ’91 Bills were playing a conference championship game while this Colts’ game is only in the divisional round. You might argue (and I won’t disagree too strongly) that wildcard games should only be compared to wildcard games, divisional games to divisional games, and so on. That’s fine, but it limits an already-small data set.

As usual, we have to make some choices about the tradeoff between sample size and sample relevance. We’ve got to draw the line somewhere, and I felt that including all playoff rounds and looking at the 15 most comparable matchups achieved about the right balance.

Eleven of these 15 teams won, so we might estimate the Colts' chances at 11/15, which is about 73.3%. I think it makes sense to weight the more similar teams more heavily. In this case, it doesn't make much difference; a weighted average (weighted by similarity score) gives the Colts a 73.1% chance. A weighted average of the scores of those fifteen games is 24.3-16.8, roughly a 7.5-point spread.
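
The arithmetic in that paragraph can be checked directly from the SIM and Result columns of the Colts table. This is a minimal sketch; the variable names are mine, and the only data used is what appears in the table.

```python
# (SIM, points for, points against) for the 15 comparables in the Colts table
comps = [
    (897, 34, 10), (886, 25, 10), (881, 17, 24), (853, 29, 23), (846, 20, 3),
    (845, 20, 6), (840, 19, 33), (827, 10, 24), (799, 18, 21), (796, 31, 3),
    (796, 30, 11), (794, 38, 27), (792, 20, 17), (789, 10, 7), (789, 44, 34),
]

total_sim = sum(sim for sim, _, _ in comps)

# Unweighted: every comparable counts the same
wins = sum(1 for _, pf, pa in comps if pf > pa)
unweighted_pct = 100 * wins / len(comps)

# Weighted: each comparable counts in proportion to its similarity score
weighted_pct = 100 * sum(sim for sim, pf, pa in comps if pf > pa) / total_sim
proj_for = sum(sim * pf for sim, pf, _ in comps) / total_sim
proj_against = sum(sim * pa for sim, _, pa in comps) / total_sim

print(f"unweighted: {unweighted_pct:.1f} pct")                  # 73.3
print(f"weighted:   {weighted_pct:.1f} pct")                    # 73.1
print(f"projected score: {proj_for:.1f}-{proj_against:.1f}")    # 24.3-16.8
```

Because the 15 similarity scores only range from 789 to 897, the weights are nearly equal, which is why the weighted and unweighted estimates land so close together.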

The books have installed Indy as an 8.5-point favorite. The money lines imply roughly a 75-80% chance of a Colt victory, so these estimates are in the ballpark.

Here are the other three matchups. Unlike last season, the method has very strong feelings this year. It loves the favorites.

                   Rec L6  Marg  H Bye  SIM   Result
====================================================
dal                 3   1   108  1  1
====================================================
pit 2001  d  bal    3   1   102  1  1   994  W 27-10
den 2005  d  nwe    3   1    96  1  1   988  W 27-13
dal 1994  d  gnb    3   1    71  1  1   963  W 35- 9
buf 1991  d  kan    3   1    70  1  1   962  W 37-14
stl 2001  d  gnb    2   1   106  1  1   898  W 45-17
gnb 1997  d  tam    3   2   104  1  1   896  W 21- 7
sea 2005  d  was    3   0   115  1  1   893  W 20-10
stl 1999  d  min    3   1   220  1  1   888  W 49-37
stl 2001  c  phi    3   1    95  1  0   887  W 29-24
oak 2002  d  nyj    2   1   123  1  1   885  W 30-10
dal 1993  d  gnb    3   2    89  1  1   881  W 27-17
phi 2002  d  atl    3   2    86  1  1   878  W 20- 6
tam 2002  d  sfo    2   1   134  1  1   874  W 31- 6
den 1998  d  mia    4   1   136  1  1   872  W 38- 3
atl 2004  d  stl    3   0    76  1  1   868  W 47-17
====================================================
WEIGHTED AVERAGE:  100.0 pct chance of victory
PROJECTED SCORE:  32.1-13.3


                   Rec L6  Marg  H Bye  SIM   Result
====================================================
gnb                 3   0    42  1  1
====================================================
ind 2005  d  pit    3   0    61  1  1   981  L 18-21
atl 2004  d  stl    3   0    76  1  1   966  W 47-17
sea 2005  d  was    3   0   115  1  1   927  W 20-10
sfo 1990  d  was    4   0    34  1  1   892  W 28-10
ind 2006  w  kan    3   0    51  1  0   891  W 23- 8
dal 1992  d  phi    2   0    57  1  1   885  W 34-10
buf 1991  d  kan    3   1    70  1  1   872  W 37-14
dal 1994  d  gnb    3   1    71  1  1   871  W 35- 9
nwe 2004  d  ind    2   0     6  1  1   864  W 20- 3
chi 2006  c  nor    3   0    81  1  0   861  W 39-14
chi 2001  d  phi    2   0     0  1  1   858  L 19-33
den 2005  d  nwe    3   1    96  1  1   846  W 27-13
pit 2001  d  bal    3   1   102  1  1   840  W 27-10
kan 1995  d  ind    4   0   102  1  1   840  L  7-10
buf 1993  d  rai    2   0   107  1  1   835  W 29-23
====================================================
WEIGHTED AVERAGE:  79.7 pct chance of victory
PROJECTED SCORE:  27.4-13.7


                   Rec L6  Marg  H Bye  SIM   Result
====================================================
nwe                 5   2   208  1  1
====================================================
phi 2004  d  min    5   2   116  1  1   908  W 27-14
sfo 1994  d  chi    4   2   245  1  1   863  W 44-15
sfo 1992  d  was    5   3   150  1  1   842  W 20-13
min 1998  d  ari    6   2   313  1  1   795  W 41-21
jax 1999  d  mia    5   4   189  1  1   781  W 62- 7
chi 2006  d  sea    4   1   178  1  1   770  W 27-24
pit 2004  d  nyj    5   3    49  1  1   741  W 20-17
den 1998  d  mia    4   1   136  1  1   728  W 38- 3
sfo 1997  d  min    4   3   115  1  1   707  W 38-22
gnb 1997  d  tam    3   2   104  1  1   696  W 21- 7
stl 1999  d  min    3   1   220  1  1   688  W 49-37
dal 1993  d  gnb    3   2    89  1  1   681  W 27-17
phi 2002  d  atl    3   2    86  1  1   678  W 20- 6
rai 1990  d  cin    3   2    61  1  1   653  W 20-10
car 1996  d  dal    2   2   113  1  1   605  W 26-17
====================================================
WEIGHTED AVERAGE:  100.0 pct chance of victory
PROJECTED SCORE:  32.3-15.3

This entry was posted on Wednesday, January 9th, 2008 at 6:54 am and is filed under General.