What are the odds of that?
This post has nothing to do with football, unless you're the kind of person who sees football wherever you look. It was inspired by a question I saw on a fantasy football message board, and I might be able to make some loose football tie-ins at the end, but you may want to skip this one if you read this blog primarily for the football rather than the math.
Here is the inspiration: a post on the footballguys message board that said simply,
What are the odds of drawing the 14th pick out of 14 teams for three straight years?
The poster had apparently done just that. It was posted at 7:29 p.m. By 7:31, three people had posted the answer that most people would consider the right one.
(1/14)*(1/14)*(1/14) = 1/2744
Assuming everything is on the level, that guy's chances of drawing the fourteen slot in a given year would be 1 in 14. Since each year would be independent of the others, multiplying the yearly probabilities should give the overall probability of 1/2744. Fair enough. But the question "what are the odds of that?" can --- and often should --- be interpreted differently. [By the way, I am blurring, and will continue to blur, the distinction between odds and probability. Nothing bad will happen as a result.]
Suppose I flip a coin ten times and get: THHHTTHHTH. What are the odds of that?
Well, that depends on what you mean; it depends on what you think is the essence of a sequence of coin flips. The probability of that precise string is 1/2^10, or 1/1024. The probability of 6 heads and 4 tails is about 20.5% (that's 10-choose-6 divided by 2^10). The probability of a 6/4 split one way or the other is about 41%. So is THHHTTHHTH a rare event or not? It's a rare outcome, but all of the 1024 outcomes are rare. And we're often not interested in the outcome itself. Instead we group outcomes together into the events that we're interested in. We might be interested in the event "six of one, four of the other," which consists of a lot of different outcomes. That event is not particularly rare.
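A quick way to check those three numbers is to count sequences directly. This is only a sketch; the variable names are made up for illustration.

```python
from fractions import Fraction
from math import comb

FLIPS = 10
TOTAL = 2 ** FLIPS  # 1024 equally likely sequences of ten flips

# The outcome: one precise string such as THHHTTHHTH.
p_exact_sequence = Fraction(1, TOTAL)

# The event "exactly 6 heads and 4 tails": 10-choose-6 of the 1024 strings.
p_six_heads = Fraction(comb(FLIPS, 6), TOTAL)

# The event "a 6/4 split one way or the other": 6 heads or 6 tails.
p_six_four_either = 2 * p_six_heads

print(float(p_exact_sequence))
print(float(p_six_heads))
print(float(p_six_four_either))
```

Same ten flips, three different answers, depending only on which outcomes get grouped into the event.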
As a brief football-related aside, this is one reason --- not the only reason, but one reason --- to hate the expression
When you put the football in the air, three things can happen, and two of them are bad.
Actually, when you put the football in the air, there are about eleventy-bajillion different things that can happen. Grouping them into the three events: complete, incomplete, and interception, is arbitrary. If I was at my own 20-yard line, I could just as easily say, "when you put the football in the air, 75 different things can happen, and 72 of them are good." I could gain 80 yards, I could gain 79 yards, I could gain 78 yards, I could gain 77 yards, ..., I could gain 5 yards or less, I could throw an incompletion, or I could throw an interception.
Back to our fantasy football playing friend. If he had drawn the 6th pick for three straight years, or the 8th pick, or the 3rd pick, the same pick for three straight years, would we have heard from him? If so, then maybe the right question is "what are the odds of me getting the same draft slot for three straight years?" Answer: 1/14^2 = 1/196 (this was pointed out in the same thread, at 1:41 a.m.).
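The step from 1/2744 to 1/196 is just "there are 14 slots he could have been stuck in." A minimal sketch of that arithmetic, assuming fair and independent draws each year:

```python
from fractions import Fraction

SLOTS = 14
YEARS = 3

# One named slot (say, the 14th) in each of three independent draws.
p_one_named_slot = Fraction(1, SLOTS) ** YEARS

# Any slot at all, as long as it is the same one every year:
# 14 mutually exclusive ways for that to happen.
p_same_slot_any = SLOTS * p_one_named_slot

print(p_one_named_slot, p_same_slot_any)
```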
But as someone who wasn't involved, I might have a different perspective. If any single member of that league happened to draw the 14 slot all three years, you can bet I would have heard from him on the message board. Quite possibly if any person in that league had drawn the same slot for three years, we would have heard about it. But wait. This message board comprises people from many, many different leagues that probably have similar draws. From my perspective, maybe the question is "in a given year, what are the odds of me reading about some guy who got the same draft slot for each of the last three years?" And the probability of that is pretty high.
How high? Well, first, let's focus on a single 14-team league and examine the probability of somebody drawing the same slot (any slot) for three straight years. This turned out to be a hard question, at least for me. The answer is:
Sum_{k=1 to 14} P_k * Q_k
where P_k is the probability that there are exactly k guys with the same slot in the second year as the first, and Q_k is the probability that, given exactly k guys were in the same slot year one and year two, one or more of them would again land the same slot in year three. Even figuring P_k isn't easy, but it turns out (google one of my favorite official mathematical words --- derangements --- for details) that
P_k = (1 / k!) * sum_{i=0 to (14-k)} (-1)^i / i!.
I think there must be an easier way to express Q_k, but what I came up with was:
Q_k = (1 / 14!) * sum_{j=1 to k} [(-1)^(j+1) * (k-choose-j) * (14-j)!].
Crunching the numbers, we find that the chance of somebody in a 14-team league getting the same slot three years in a row is about .06876. Roughly 7%. Now suppose that all the members of, say, 10 different leagues of the same kind frequent my message board. Then there is a 1 - (1-.06876)^10 probability --- about 51% --- that I, a random member of the message board, will be hearing about somebody who had a 1-in-2744 event happen to him.
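For anyone who wants to crunch the numbers themselves, here is a sketch that evaluates the P_k and Q_k formulas above with exact fractions. The function names are invented for this example, and the 10-league figure inherits the assumption that the leagues draw independently.

```python
from fractions import Fraction
from math import comb, factorial

N = 14  # teams in the league


def p_exactly_k_repeat(k, n=N):
    """P_k: exactly k owners draw the same slot in year two as in year one."""
    return Fraction(1, factorial(k)) * sum(
        Fraction((-1) ** i, factorial(i)) for i in range(n - k + 1)
    )


def q_at_least_one_again(k, n=N):
    """Q_k: at least one of those k owners draws that slot again in year three."""
    return sum(
        (-1) ** (j + 1) * comb(k, j) * Fraction(factorial(n - j), factorial(n))
        for j in range(1, k + 1)
    )


# Chance that somebody in one league gets the same slot three years running.
p_league = sum(
    p_exactly_k_repeat(k) * q_at_least_one_again(k) for k in range(1, N + 1)
)

# Chance that it happens in at least one of 10 independent leagues.
LEAGUES = 10
p_board = 1 - (1 - p_league) ** LEAGUES

print(f"one league: {float(p_league):.5f}")
print(f"{LEAGUES} leagues: {float(p_board):.3f}")
```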
This discussion is vaguely connected to my annual splits happen posts, where I point out highly non-interesting facts, like that Thomas Jones averaged nearly 10 fantasy points per game more against teams whose city name starts with A--M than against those that started with N--Z. A friend of mine read that post and remarked, "I don't even know how you thought to look for something like that." But the point is that I don't have to be clever. I don't have to know where to look. If you look at enough things, enough players' splits, enough fantasy draft drawings, you will see some things that seem absurdly unlikely.
This is one reason that it's not always appropriate to apply standard significance tests to facts you read about or see flashed on the screen during a game. You think Marty Schottenheimer is a crummy playoff coach. What are the chances that the collection of teams he's coached would, against the playoff opponents they've faced, have a record of 5-13 or worse? You could make some assumptions and run the numbers; maybe you'd find that it is 4% or so. So you conclude: the probability of his record having happened due to chance is very, very low. Therefore, it probably isn't just chance; he must be a bad playoff coach.
That isn't necessarily appropriate. Why not? Because, even if all coaches were exactly average and chance was completely responsible for their records, there would undoubtedly still be coaches with records like Marty's (about 4% of all coaches, in fact!). You were (probably) checking the unlikelihood of Marty's record because you already knew it was bad. Do you see why that method is a self-fulfilling prophecy?
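To put rough numbers on that, here is a minimal sketch under two loud assumptions that are not in the post: every playoff game is a fair coin flip, and there is a made-up pool of 25 coaches who each get 18 playoff games. Neither is realistic; the point is only the shape of the argument.

```python
from math import comb


def prob_at_most(wins, games, p_win=0.5):
    """Chance of `wins` or fewer wins in `games` independent games."""
    return sum(
        comb(games, w) * p_win**w * (1 - p_win) ** (games - w)
        for w in range(wins + 1)
    )


# A record of 5-13 or worse, if every game is a coin flip (assumption).
p_bad = prob_at_most(5, 18)

# Hypothetical pool of exactly average coaches, 18 playoff games apiece.
N_COACHES = 25
expected_unlucky = N_COACHES * p_bad
p_some_coach = 1 - (1 - p_bad) ** N_COACHES

print(f"one coach 5-13 or worse: {p_bad:.3f}")
print(f"expected such coaches out of {N_COACHES}: {expected_unlucky:.2f}")
print(f"chance at least one exists: {p_some_coach:.2f}")
```

Under the coin-flip assumption the single-coach figure lands in the same ballpark as the "4% or so" above, yet a coach with a record that bad is more likely than not to exist somewhere in the pool.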
That's not to say that Marty isn't a bad playoff coach, or that statistical significance testing can never be used to examine such questions. It's just a reminder that "how unlikely is it that Marty's teams would go 5-13 in the postseason?" is a question that, just like "what are the chances of me getting the 14th slot in my draft three years in a row?" can be interpreted in different ways.
This entry was posted on Monday, August 27th, 2007 at 4:14 am and is filed under Statgeekery.

Good article. I think you should categorize this in the "boo-hoo, cry me a river" section. Why? Well, I think every fantasy owner has something to cry about - Jamal Lewis hasn't been up to form, Kevin Jones got hurt after I passed over Steven Jackson for him, Culpepper sucks now and Domanick Davis is in the witness protection program, Detroit can't throw a TD vs GB in week 15 to save their freaking life, LT traded for Charles Rogers in 2003, should I go on? When this "belly-aching" happens, the usual response from the other owners is "cry me a river, you little baby". My favorite (which I am usually guilty of) is "I would have won the championship if I had just started Jason Campbell instead of Jon Kitna in week 15" or some such and "therefore your championship is very shallow, especially because you haven't beaten me in 5 years". Of course, the appropriate response is "scoreboard". These kinds of whiners are what make fantasy football great. The more they whine, the more you can rub it in their face. Now, 14th for three straight is simply bad luck and not his fault, perhaps? Sort of, but this guy thinks he is being clever. He is setting up the dilemma (I am also an expert in this): if he does poorly, his bad luck at the drawing is surely the excuse. If he does well, he overcame adversity and his skills are even better than you can ever imagine, so when he finally wins it, it proves he is the greatest of all time. That being said, I don't think that playoff wins and losses are random, like, say, a coin flip or a drawing from a hat. However, if you make the assumption that all coaches that make the playoffs have decent teams, then yes, there will be a distribution of those with decent teams that play as expected, better than expected, and worse than expected. Unfortunately for Marty, he falls in the latter category. However, I believe it is not a random statistic but rather that Marty, in fact, blows.
Now, what are the chances that Green Bay shuts down Detroit's passing last year in Week 15? That, my friend, is something worth a 1300 word article.
I'd approach the math in a different vein.
You're counting a league with multiple people with the same pick the same as a league with only one person with the same pick. In actuality, I'd expect a league with multiple three in a rows to be more likely to cause someone to speak up than a league that only has one, and so I'd try to design any measurement of how common the odd case is to weight multiple three in a rows in a league more than a single three in a row.
Calculating the expected rate of three in a rows is much easier. It's a quick 1/14 with just basic expected value calculations: 14 owners, each with a 1/196 chance. Some leagues might have multiple, but the expected number of three in a rows is 1 per 14 leagues.
Going back to the point about leagues with multiple special occurrences being louder, I might model a league with n three in a rows as having a "speak up" factor of n^2 (n people each with n things to notice), rather than modeling it as n separate people each with 1 thing to notice (their own three in a row, the expected value case). Something like this would increase the overall "noise measure" past 1/14.
After all this math and theoretical chatter, I guess my point is that looking at just the interesting leagues (at least one three in a row) is less useful as a model of "how often" than a much simpler one: the expected number of three in a rows measures how often a player will speak up about their own experience, and it is trivial to calculate.
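The comparison can be made concrete with a sketch that builds the full distribution of n, the number of owners in one 14-team league holding the same slot three years running. It uses standard inclusion-exclusion; the helper names are invented here, and weighting by n^2 is just the "speak up" model floated above, not an established measure.

```python
from fractions import Fraction
from math import comb, factorial

N = 14  # teams in the league


def group_sum(j, n=N):
    """Sum over all j-owner groups of P(every owner in the group repeats twice)."""
    return comb(n, j) * Fraction(factorial(n - j), factorial(n)) ** 2


def p_exactly(m, n=N):
    """P(exactly m owners hold the same slot three years running)."""
    return sum(
        (-1) ** (j - m) * comb(j, m) * group_sum(j) for j in range(m, n + 1)
    )


dist = [p_exactly(m) for m in range(N + 1)]

expected_count = sum(m * p for m, p in enumerate(dist))  # expected n
p_at_least_one = 1 - dist[0]                             # "interesting" leagues
noise = sum(m**2 * p for m, p in enumerate(dist))        # expected n^2

print(expected_count, float(p_at_least_one), noise)
```

The expected count comes out to exactly 1/14, the chance of at least one is a little lower, and the n^2 weighting nudges the "noise measure" up to 1/13.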
http://footballpredictionnetwork.blogspot.com/2007/08/probabilities-of-playoff-droughts-and.html
In a similar vein, I took a look at the probability of the current 5+-year playoff droughts. Even when you get to 7-8 years, the probability of not having made the playoffs in that span is high enough so that you'd expect 1 team to fit the profile. Detroit may have bad ownership and bad general management, but given scheduling, the weakness of their division, and hot streaks, they could have made the playoffs at least once with an 8-8 record. That is, in fact, what I predict them to do this year.
In a similar vein, maybe some of Schottenheimer's decisions in playoff games were suboptimal, but most coaches given the same regular season success and same decision making in the postseason would not have suffered as bad a record as Schottenheimer. That game against the Patriots was not his fault, for one thing. But it's just plain odds that someone has to have that poor a record.
What are the odds of a team being in the Super Bowl 4 years in a row?
Well, let's make the assumption that making it one year is independent of making it any other year and that every team has an equal shot of making the Super Bowl, just because there are so many uncontrollable factors that these things are dependent on.
Two out of 32 teams reach the Super Bowl each year for a P(making Super Bowl) = 0.0625. If you do it in a more complicated fashion, P(have bye week and win divisional round and win conf. champ.)+P(in playoffs but no bye week and win each round)=(4/32)*(1/2)*(1/2)+(8/32)*(1/2)*(1/2)*(1/2)=0.0625. This also assumes that each game is fair, with each team having an equal shot of winning.
The probability of making the Super Bowl 4 years in a row given these assumptions is (2/32)^4 = (1/16)^4 = 0.001526%. So 1.5 in 100,000 teams will make it to the Super Bowl 4 years in a row. Maybe I shouldn't say 100,000 teams, but 100,000 4-year windows, so 1996-2000 would have 2 such windows for each team (1996-99, 1997-2000).
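Here is that arithmetic as a sketch, assuming the 12-team playoff format of the time: 4 teams with a bye that need two wins to reach the Super Bowl, and 8 teams without a bye that need three. Every game is treated as a coin flip.

```python
from fractions import Fraction

TEAMS = 32
HALF = Fraction(1, 2)  # every game assumed to be a fair coin flip

# 4 bye teams: win the divisional round and the conference championship.
p_bye_path = Fraction(4, TEAMS) * HALF**2

# 8 other playoff teams: win wild card, divisional, and conference rounds.
p_no_bye_path = Fraction(8, TEAMS) * HALF**3

p_super_bowl = p_bye_path + p_no_bye_path
p_four_straight = p_super_bowl**4

print(p_super_bowl, float(p_super_bowl))
print(p_four_straight, f"{float(p_four_straight):.6%}")
```

Both routes give 1/32, which is why the complicated version agrees with the simple 2-out-of-32 count.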
Doug, excellent philosophical take on what a lot of people misunderstand about probabilities. Sh*t happens, even "unlikely" sh*t.
#5 - It's logical to simply jump to (1/16)^4 if you're assuming that each team in each conference has an equal chance to make the Super Bowl each season. Although, you've set up a nice structure for altering those assumptions.
Given that there have been 41 SBs, that's 38 4-year windows. If we assume 32 teams for all those seasons (obviously incorrect), we'd expect to have witnessed 32*38*(1/16)^4 = .02 4-year streaks. We've seen it once (and I'm trying to forget about that.) Given that there haven't always been 32 teams, we'd expect that 4 in a row is less probable. But we really need some conditional probability to model the likelihood that a SB team gets back the following season (greater than 1/16) and that a non-SB team gets there the next season (less than 1/16).
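The expected-streaks figure can be reproduced in a couple of lines. This sketch keeps the same admittedly wrong simplification of 32 teams in every season.

```python
SUPER_BOWLS = 41
WINDOW = 4
TEAMS = 32  # assumed constant across all seasons (known to be incorrect)

# Consecutive 4-year windows that fit inside 41 Super Bowls.
windows = SUPER_BOWLS - WINDOW + 1

# Expected number of (team, window) pairs with four straight appearances.
p_four_straight = (1 / 16) ** WINDOW
expected_streaks = TEAMS * windows * p_four_straight

print(windows, round(expected_streaks, 4))
```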
What are the odds of 1 person winning on a 100-square Super Bowl board 2 years in a row? I understand that it is 1 in 100 each year, but does it go up adding the 2 years straight? Also, what if you add that the same square won the board each year? What does that do to the odds? Thank you. I appreciate anything you can tell me.