**SITE NEWS:**
We are moving all of our site and company news into a single blog for Sports-Reference.com. We'll tag all PFR content, so you can quickly and easily find the content you want.

Also, our existing PFR blog rss feed will be redirected to the new site's feed.

Pro-Football-Reference.com » Sports Reference

For more from Chase and Jason, check out their work at Football Perspective and The Big Lead.

# Archive for the 'Non-football' Category

## 2011 NCAA Tournament Game Previews

To get you prepared for the matchups in this year's NCAA Tournament, we now have printable game previews at **SR/College Basketball**:

Game Previews | College Basketball at Sports-Reference.com

Each preview contains key information about both teams, including SRS ratings; offensive and defensive ratings; and player statistics from the 2010-11 season. Check them out, and increase your knowledge when watching the games this month!

Comments Off | Posted in Announcements, Checkdowns, Non-football

## Checkdowns: Former Track Stars Turned Pro in Other Sports

At the website **TrackAndFieldNews.com**, **Heimo Elonen** has compiled a neat list of pro football, basketball, and baseball players who were track and field stars before pursuing a career in a different sport. All in all, it's an interesting piece of research if you're a sports fan, and especially if you like football -- as you might expect, *lots* of NFL players turn up here, including a punter! (Brian Moorman ran the 400m hurdles at Pittsburg State and was actually a highly accomplished Division II track athlete.)

9 Comments | Posted in Checkdowns, Non-football, Trivia

## P-F-R March Madness pool 2010

First prize is we will do a podcast (yeah, that's right) on the player or team/season of your choice. Also honor and glory.

Rules: each team has a price, listed below. Pick as many teams as you want, as long as the total price stays at 100 or less. The winner will be the entry with the most total wins by all teams in the entry. First tiebreaker is greatest number of 16 seeds, second tiebreaker is greatest number of 15 seeds, etc. No point will be awarded for winning the play-in game.

Enter by putting a comma-delimited string of team numbers in the comments, like this:

3,8,12,14,...

which would correspond to Baylor, Duke, Georgetown, Gonzaga, etc. It doesn't matter what order you put the teams in. I will try to check all the entries to make sure they're legal, but I make no guarantees. It's your responsibility to make sure your entry is legal. Deadline is tipoff of Thursday's first game.

| Number | Seed | Team | Price |
|---|---|---|---|
| 1 | 16 | Ark.-Pine Bluff | 1 |
| 2 | 7 | BYU | 7 |
| 3 | 3 | Baylor | 11 |
| 4 | 5 | Butler | 5 |
| 5 | 8 | California | 4 |
| 6 | 7 | Clemson | 4 |
| 7 | 12 | Cornell | 2 |
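If you want to check your own entry before posting it, the rules boil down to a budget check. Here is a minimal sketch of one way to do that; it is not the script used to judge the pool. The names `PRICES`, `parse_entry`, `entry_cost`, and `is_legal` are made up for illustration, the price table holds only the seven teams listed above, and rejecting duplicate team numbers is my assumption (the rules don't say either way).

```python
# Prices for the seven teams listed above only (team number -> price).
# The rest of the price list is not reproduced here.
PRICES = {1: 1, 2: 7, 3: 11, 4: 5, 5: 4, 6: 4, 7: 2}

BUDGET = 100  # total price must stay at 100 or less


def parse_entry(text):
    """Turn a comma-delimited string like '3,4,5' into a list of team numbers."""
    return [int(token) for token in text.split(",") if token.strip()]


def entry_cost(teams, prices=PRICES):
    """Sum the prices of the chosen teams."""
    return sum(prices[team] for team in teams)


def is_legal(text, prices=PRICES, budget=BUDGET):
    """Return True if every team number is known and the total fits the budget."""
    teams = parse_entry(text)
    if any(team not in prices for team in teams):
        return False  # unknown team number
    if len(set(teams)) != len(teams):
        return False  # assumption: the same team may not be picked twice
    return entry_cost(teams, prices) <= budget


print(is_legal("3,4,5"), entry_cost(parse_entry("3,4,5")))
```

Order doesn't matter here either, since the cost is just a sum.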

53 Comments | Posted in Non-football

## Introducing College Basketball at Sports-Reference.com

I am pleased to announce the launch of College Basketball at Sports-Reference.com, the latest addition to the Sports Reference family of web sites. We have had plans to launch a college basketball site for quite some time, but for one reason or another we always ran into roadblocks, most of them data-related. However, thanks to the efforts of researcher extraordinaire Kevin Johnson, we now have a college basketball database that we believe to be second-to-none. Let me tell you a little bit about what the site does (and doesn't) have:

3 Comments | Posted in Announcements, College, Non-football, Word from our Sponsors

## Bracket assistance

For those of you who don't get over to the basketball-reference.com blog regularly, you might want to head over there before filling out your brackets. Sports-reference.com's Sean (with help from Sagarin) and Neil (with help from Ken Pomeroy) both simulated the tournament a gazillion times and presented their results (Sean - Neil)

One caveat I'd add is that these simulations will tell you how to maximize the expected points you score in your pool. This does not always coincide, however, with maximizing your chances of winning. The bigger your pool is, and the more conservative/knowledgeable the people in it are, the more you need some longshots to maximize your chances of winning. I talked about this two Marches ago and was then convinced by a reader that #3 and #4 seeds are probably the key to maximizing your win chances in big pools. And it goes without saying, of course, that among the mind-bogglingly huge number of reasonable entries in any given pool, the differences we're talking about are very small. If you enter a pool with 50 or more participants every year, the rest of your life is not nearly enough time for you to perceive whether any particular strategy is working or not.

Good luck to all!

Comments Off | Posted in Non-football, Statgeekery

## Bill James supports BCS boycott

Thanks to Dr. Saturday for the pointer to this Slate article, in which Bill James articulates his reasons for not liking the BCS.

I don't have time to comment on all the items in the article that deserve comment, so I'll just say that, like everything Bill James has ever written, it's worth a read. I do have a question, though, for those out there who are a bit more in touch with what James has been doing for the past decade or so:

When did James start to refer to himself as a statistical analyst?

Twice in this article, he makes it clear that he does in fact consider himself to be one. My (possibly erroneous) recollection is that James has always specifically denied that, opting instead for something along the lines of, "I'm not a stat guy. I'm simply a guy who likes to ask questions, and then exhausts all possible avenues (some of which might happen to be statistical) of answering that question." Can any of you serious sabermetricians --- I know you're out there --- shed some light?

9 Comments | Posted in BCS, College, Non-football

## Usain Bolt and NFL combine 40 times

Let me preface this by saying that almost every single word of what I'm about to write could potentially be incorrect. I don't really know what I'm talking about. Possibly the *most* reliable source I've used here is Wikipedia, if that tells you anything.

But you guys will help correct me if I say something really stupid, right?

It all starts with a message board post from a guy I don't know that I saw linked from another message board.

**Usain Bolt's splits during the Olympic 100m race**

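| Mark | Time (s) | 10m split (s) |
|---|---|---|
| RT | 0.165 | |
| 10m | 1.85 | |
| 20m | 2.87 | 1.02 |
| 30m | 3.78 | 0.91 |
| 40m | 4.65 | 0.87 |
| 50m | 5.50 | 0.85 |
| 60m | 6.32 | 0.82 |
| 70m | 7.14 | 0.82 |
| 80m | 7.96 | 0.82 |
| 90m | 8.79 | 0.83 |
| 100m | 9.69 | 0.90 |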

30m is 32.8084 yards. So he needs to cover 7.1916 more yards from there.

He ran from 30m to 40m in .87 seconds, or .087 seconds per meter, or .0795528 seconds per yard. But he wasn't at top speed yet. So the first 7 yards of that would have been slightly slower than the average of the full ten meters, but faster than the .0832 seconds per yard at which he ran from meter 20 to meter 30. So let's say he averaged a nice round .08 seconds per yard. Multiply that by 7.1916 and you get .575. Add that to his 30m split and you're at 4.35 or 4.36.

So unless I've done something wrong, we have the following:

**At 40 yards of the actual Olympic 100m race, Bolt was at 4.35 or 4.36**

But wait...

His reaction time was .165. My understanding is that the combine 40 is timed from the runner's actual start rather than from a gun. So if this were in an NFL combine setting, that reaction time would be gone and he'd be at 4.19.
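For anyone who wants to check my arithmetic so far, here it is as a short sketch. The splits and the .08 seconds-per-yard pace are the numbers quoted above; the function name `forty_yard_time` is just something I made up for illustration, and the whole thing inherits every shaky assumption already mentioned.

```python
METERS_PER_YARD = 0.9144  # exact definition of the international yard


def forty_yard_time(split_30m, pace_s_per_yd, reaction_time=0.0):
    """Estimate a 40-yard time from a 30m split and an assumed pace afterward.

    reaction_time is subtracted to mimic timing from the runner's first
    movement rather than from the gun.
    """
    yards_at_30m = 30 / METERS_PER_YARD       # about 32.8084 yards
    remaining_yards = 40 - yards_at_30m       # about 7.1916 yards
    return split_30m + remaining_yards * pace_s_per_yd - reaction_time


# Bounds on the pace, converted from the 10m splits (seconds per yard).
pace_20_to_30 = 0.91 / 10 * METERS_PER_YARD   # about .0832
pace_30_to_40 = 0.87 / 10 * METERS_PER_YARD   # about .0796

from_gun = forty_yard_time(3.78, 0.08)
from_first_movement = forty_yard_time(3.78, 0.08, reaction_time=0.165)

print(f"from the gun:        {from_gun:.3f}")
print(f"from first movement: {from_first_movement:.3f}")
```

The assumed .08 pace sits between the two measured paces, which is all the "nice round" number is meant to do.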

But wait...

There are no starting blocks at the NFL combine. And my understanding is that this particular Olympic track is the fastest around. Those two things would push his NFL combine time up over 4.2, maybe up to 4.25 or even 4.3.

But wait...

If he were training specifically for the 40, he might be able to do some things somewhat differently to shave a few hundredths off.

I hereby declare that Bolt would run a 4.22 at the combine.

Chris Johnson ran a 4.24 at this year's combine. Does that make my Bolt estimate seem too high? Or does it mean that the timing at the combine is inexact or inconsistent or just plain generous? Could be either one --- or both --- but I'm not totally sure the two figures are incompatible. It was around the halfway point that Bolt really blew everyone else away; I don't even think he was leading at 40 yards. So it's not clear to me that Chris Johnson couldn't hang close to him for 40 yards.

59 Comments | Posted in NFL Draft, Non-football

## March Madness: how important is a team’s recent play?

I would like to thank Doug for allowing me to infect this week's blog with some temporary March Madness. Long before I was doing research on the NFL, I was an NCAA tournament junkie.

I still have the NCAA program from the 1988 Final Four, featuring Arizona, Oklahoma, and Duke. (I'll be deep in the cold, cold ground before I recognize the fourth team). I memorized that program, and can still to this day tell you who won the NCAA tournament in any year. In college, I once banged my head on the floor during an intramural basketball game. My teammates reportedly asked me who won the NCAA tournament in 1950, to see if I was okay. When I mumbled "CCNY, Irwin Dambrot, Nat Holman", they told the ref that I was okay. I don't really remember it.

My particular affliction, and the cause of an unhealthy love-hate relationship with March Madness, is that I am a Missouri Tigers fan. I cried when the 1987 team with Derrick Chievous lost in the first round to Xavier as a #4 seed. I cried the next year when they lost to Rhode Island in the first round. (Yes, I cried a lot as a kid). By the time the 1990 NCAA tournament rolled around, I was a sophomore in high school. I was at school, but skipped class that afternoon, snuck into the A/V room in the library with a couple of other guys, and watched the second half of the game against Northern Iowa. I can still see Maurice Newby's prayer of a shot sailing through the net.

That particular Missouri Tigers team was led by Anthony Peeler and Doug Smith (whom Doug had an opportunity to see up close at the first NFL game he attended). Less than a month earlier, they were the #1 ranked team in the country. They went into a bit of a slump at the end of the season, culminating in a first round Big 8 tourney loss to the #8 seed Colorado Buffaloes, which dropped them all the way down to a #3 seed in the NCAA tournament. I'll use that Missouri Tigers team to transition to the point of this post. How important is a team's finish to the regular season, and does the tournament committee properly weight the end of season performance versus the whole body of a team's performance?

4 Comments | Posted in Non-football

## The Baseball Economist

My good friend John-Charles Bradbury has written a book called *The Baseball Economist*, which just hit the shelves a few weeks ago. There are a couple of reasons why I can't call this post a review of the book, or at least not an objective review: (1) as I've already admitted, the author is a friend, and (2) some of my own work is included.

Still, if you have any interest in baseball books or thoughtful baseball analysis, please read on. Hopefully I can pique your interest enough to convince you to go out and pick up a copy. If you need a less biased review or endorsement, you can find several on the web: [Wall Street Journal] [Division of Labour] [Was Watching] [Baseball Crank] [The Sports Economist] [Wages of Wins blog] [Marginal Revolution] [Braves Journal]

Most of you are probably not aware that I used to be as intensely into baseball analysis as I currently am into football. I wrote for a few sabermetric sites and publications and I like to think I came up with a few nifty little studies along the way and contributed a bit to the field. But it's now clear that my biggest contribution (by far) was that I introduced John-Charles Bradbury to the existence of sabermetrics.

I could drone on about how much of a hero I am for leading J.C. out of the wilderness of RBIs and pitchers' wins, but I'll spare you. It is worth mentioning, though, that the fact that J.C. didn't encounter sabermetrics until relatively late in life probably prevented him from becoming Just Another Sabermetrician. Rather than viewing things through a traditional sabermetric lens --- as those of us who grew up with Bill James tend to do --- he started to look at them in light of his training as a professional economist. When combined with the tools of sabermetrics, it leads to a fresh perspective.

If you're turned off by "the business of baseball," don't let the title of the book scare you. While it does contain a few chapters about issues traditionally associated with economics --- salaries, the reserve clause, the anti-trust exemption, and so forth --- it's not just about the economics *of* the game; it's also about the economics *in* the game.

What does that mean exactly? It means that economics isn't about money. The best-selling book *Freakonomics* has made the general public aware that there is virtually no topic into which economists won't stick their noses (including the fourth-down decisions of NFL coaches and draft day trades in the NFL). Economics isn't about GDPs and discount rates and money multipliers; it's a framework for studying human behavior. Since baseball players (and managers, and owners, and umpires, and fans) are human beings, economic theories often attempt to predict how they will behave in certain situations.

If you're like me, you like to read books that introduce you to the core concepts of different disciplines. But you're not willing to work too hard. I don't want a PhD in economics (or psychology, or linguistics, or whatever), I just want a brief introduction to the general mode of thought. J.C. provides this in a clear and easy-to-read way. But, while this book is easy to read, don't get the idea that it's trivial. Economists and hard core sabermetricians will undoubtedly read a few things that are not news to them, but everyone will find plenty of new ideas in this book. I know I did, despite the fact that I ate lunch with J.C. every Friday during the entire time he was writing it!

*The Baseball Economist* does a very nice job of clearly explaining various economic theories and concepts and examining how they play out on the baseball diamond and in the front office. If you're not an economist, you probably won't agree with all his conclusions, or even his methods, but if you have an analytical mind, his arguments will make you think.

2 Comments | Posted in Non-football

## Bracketology update

On Monday I offered a few thoughts about how to fill in your tournament brackets.

My interest in the issue is mainly theoretical (I am a mathematician, after all). *If* the entries in your pool satisfy a certain assumption, then it's reasonable to conclude something about your optimal strategy. It's an interesting mathematical problem with what I think is an elegant solution. But I have to admit that, other than just making sure it passed the sniff test, I didn't give a whole lot of thought to the reasonableness of the assumption.

Blog reader Patrick L. sent me some data from his office pool which indicates that people tend to overbet high seeds in these pools. This is the percentage of people whose entry includes a national champion of the given seed. OP is Patrick's office pool, TS is the implied probabilities from tradesports.com, and Hist is the historical frequencies:

| Champion pick | OP | TS | Hist |
|---|---|---|---|
| #1 seeds | 59% | 53% | 55% |
| #2 seeds | 38% | 23% | 18% |
| everyone else | 3% | 24% | 27% |

Also, Yahoo's contest entry distributions are now posted, and they are even heavier on #1 seeds: 64/23/13.

In light of this, Patrick suggests that picking a #3 or #4 seed to win it all may be the optimal strategy. That seems believable to me.

Yahoo's distributions show further evidence of overbetting on favorites. Texas was picked in the first round by 97% of Yahoo's entrants, whereas the Vegas odds imply that they have only about a 79% chance of beating New Mexico State. Virginia Tech (a #5 seed) was picked by 71% of the entrants over Illinois (#12) despite only having a 57% chance of winning according to the Vegas odds. The former is probably due to Kevin Durant's likeness being plastered all over everything for the last few weeks. The latter is probably due to an over-reliance on seedings instead of other objective measures like computer rankings or Vegas lines, as was mentioned in the comments to Monday's post.

2 Comments | Posted in Non-football

## How to fill out your brackets

So you're not much of a college basketball fan. You follow your alma mater, and possibly keep loose tabs on the rest of their conference, but that's about as far as it goes. But the tourney is good entertainment and, as is customary, you enter a bracket pool so you can have a rooting interest where none would otherwise exist. How do you maximize your chances of winning the thing?

If you're like me, the first thing you do is head someplace like this smorgasbord of computer ranking algorithms and check out a few of them to get a quick feel for which teams appear to be over- or under-seeded. Some of them even do the work for you by putting a specific probability estimate on each team's chances of advancing to each round.

Whatever the rules of your bracket pool, you probably get some sort of score associated with your entry. And the highest score wins. In most pools, you can use estimates like those above to compute (at least approximately) the expected score of each possible entry. Now simply find the entry with the highest expected score and turn it in.

That's what I used to do. Only recently did I realize that that's wrong. **Maximizing your expected score is not the same as maximizing your chance of having the highest score**. Your goal is the latter, not the former.

To see why they're not the same, imagine a simple pool where you are simply trying to pick the winner of the tournament. Let's say that in the very likely event of a tie, the winner will be selected randomly from among those who correctly picked the champion. You believe these are the probabilities of each team winning the tourney:

Ohio State: 25%

UCLA: 20%

Kansas: 15%

UNC: 15%

Florida: 10%

Texas A&M: 10%

Washington State: 5%

The "score" of your entry in this simple pool is either one or zero, depending on whether you pick the champ correctly or not. So the entry with the highest expected score is Ohio State. But Ohio State might or might not be the entry that maximizes your chance of winning the pool. It depends on who everyone else picked. If you were the only Buckeye-picker, then great. But if 90% of the other pool participants picked Ohio State, then you'd be better off picking Washington State.
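To put rough numbers on that, here is a minimal sketch of the champion-only pool. It assumes a hypothetical pool with 100 other participants and the random tiebreaker described above; the function name `win_chance` is made up for illustration.

```python
def win_chance(p_champion, others_with_same_pick):
    """Chance of winning a champion-only pool with a random tiebreaker.

    You win only if your team takes the title (probability p_champion) and
    you are then drawn from everyone who picked that team, yourself included.
    """
    return p_champion / (others_with_same_pick + 1)


# Hypothetical pool: 100 other entrants, 90 of whom took Ohio State.
ohio_state = win_chance(0.25, 90)

# Worst case for Washington State: all 10 remaining entrants took the Cougars.
washington_state = win_chance(0.05, 10)

print(f"Ohio State:       {ohio_state:.4f}")
print(f"Washington State: {washington_state:.4f}")
```

Even in that worst case, the Cougars give you about 0.45% against about 0.27% for the Buckeyes, so the less likely champion is the better pick in this crowded pool.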

So, while Ohio State is the "best" pick in some sense, it's also likely to be a "crowded" pick, and that's the problem. You may be better off going with a "worse" pick, if it's a pick that's less popular. That's a simple example, but the same issues are present in a real pool. Even if there aren't necessarily ties, the best picks are also going to be the most popular picks, and that's going to cause the same kind of crowding. If you pick the entry that you believe is most likely to occur, then there will be lots of other entries that look very similar to yours. This is problematic because you *know* you're going to miss on a lot of games. And if your entry is too centrist, it's likely that there will be an entry that looks just like it except that it got a few of the games you missed.

The other extreme is to pick an entry with Cinderellas and longshots aplenty. This avoids the crowding problem. With a wacky entry, even if you miss a lot of games, there are not likely to be many entries close to yours to capitalize on your mistakes. The problem here is that, if you turn in a wacky entry, you probably won't end up being even close. That's what makes it a wacky entry.

To make this a little more concrete, imagine two extreme strategies:

**Strategy #1:** pick a final four with two #1 seeds, a #2 seed and a #3 seed.

**Strategy #2:** pick a final four with two #9 seeds and two #7 seeds.

The upside of Strategy #1 is that you're very likely to hit at least a couple of the final four teams. The downside is that, if your final four hits, you're probably not the only one who has it.

The upside of Strategy #2 is that, even if you just get one or two of the final four teams correct, you're probably still doing better than everyone else. The downside is that you're not likely to hit even one.

And of course you don't have to be at one extreme or the other. There is a continuum of possibilities in between. So where do you want to position yourself? You can't answer that question unless you know what the other entries in your pool look like, and you're probably not going to know that. So you have to make some assumptions.

If it's a big contest with a mixture of hardcore and casual fans, I think it's reasonable to expect that the entries will generally cluster around the most likely outcomes, but that there will be some longshot entries mixed in. With that in mind, I'm going to make the following assumption:

**Assume the entries in your pool are distributed the same as the distribution of actual outcomes of the tournament.**

Roughly speaking, what this means is that, if you think Ohio State has a 25% chance of winning the tourney, then about 25% of the pool's participants will pick Ohio State to win it. If you think there is a 1% chance of a final four consisting of Florida, UCLA, Texas A&M, and Georgetown, then about 1% of the pool's entries will have that for a Final Four. If you think Virginia Tech has a 59% chance of beating Illinois in the first round, then around 59% of the entries will have Virginia Tech beating Illinois. And so on.

Is this a reasonable assumption? I think it's at least in the ballpark. Yahoo.com publishes the entries in its Tournament pick 'em contest and they match up reasonably well with objectively-generated probabilities (e.g. from Sagarin ratings and the like). Not perfectly, but reasonably. This shouldn't be too surprising. Sports gambling markets are often cited as an example of the wisdom of crowds and are generally believed to be pretty efficient.

So let's go back and apply this assumption to our drastically simplified pool, where we are only picking the champion. If these are the probabilities of each of these teams winning the title:

Ohio State: 25%

UCLA: 20%

Kansas: 15%

UNC: 15%

Florida: 10%

Texas A&M: 10%

Washington State: 5%

Then our assumption would imply that the above is also the distribution of entries. Twenty-five percent of the people would take Ohio State, 20% UCLA, and so on. If that's the case, then what is the best pick?

**There is no best pick! Your chances of winning are the same no matter who you pick.**

If there are 100 entries for example, then 25 of them took Ohio State. So if you are one of those 25 riding the Buckeyes, your chances of winning are 1%: a 25% chance they'll win, and then a 1-in-25 chance that you'll win the tiebreaker. If you take Washington State, you've also got a 1% chance of winning: 5% chance of the Cougars winning, then a 1-in-5 chance of winning the tiebreaker. Regardless of which team you look at, the analysis will turn out the same: you have a 1% chance of winning. One percent, of course, is one of a hundred, because you are one of a hundred people in the pool.
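The same arithmetic can be run for all seven teams at once. This is just a sketch of the calculation in the paragraph above, using exact fractions so nothing gets lost to rounding; the names `title_odds` and `win_chances` are made up for illustration, and the 100-entry pool is the same hypothetical one.

```python
from fractions import Fraction

# Title probabilities from the list above, as exact fractions.
title_odds = {
    "Ohio State": Fraction(25, 100),
    "UCLA": Fraction(20, 100),
    "Kansas": Fraction(15, 100),
    "UNC": Fraction(15, 100),
    "Florida": Fraction(10, 100),
    "Texas A&M": Fraction(10, 100),
    "Washington State": Fraction(5, 100),
}


def win_chances(odds, pool_size):
    """Win chance per pick when entries are distributed like the odds."""
    chances = {}
    for team, p in odds.items():
        pickers = p * pool_size        # entries on this team, yours included
        chances[team] = p / pickers    # team wins, then you win the tiebreaker
    return chances


for team, chance in win_chances(title_odds, 100).items():
    print(f"{team}: {chance}")
```

The probability of the team cancels out of `p / (p * pool_size)`, which is why every pick lands on the same number.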

But that's an oversimplified situation. What happens in more complicated settings?

As many of you know, I teach math for a living. Last summer, I got a student and a colleague interested in investigating this question with me. Some very interesting (to us, anyway) mathematics arose from the investigation.

As an abstract model of the tournament prediction problem, we imagined the following game. Suppose that a random number, called the *target*, is to be chosen. Millions of participants will guess what the number will be, and whoever guesses closest is the winner. Let's say, just for example, that it is to come from the standard normal distribution. So there is about a 2/3 probability that the target will be between -1 and 1, about a 95% chance that it will be between -2 and 2, a better than 99% chance that it will be between -3 and 3, and so on. Your job is to guess closer to the target than any other competitor, and let's assume that their guesses are distributed as independent standard normals as well. In other words, two-thirds of the guesses will be between -1 and 1, 95% between -2 and 2, and so on.

If you guess near zero, then you are likely to be close to the target. But you are also likely to be crowded out by the multitudes of other guesses that are in the same vicinity. If you make a guess far out in the tail, like say 3.4, then there aren't many guesses near yours, but the target isn't likely to be near your guess either. If you picture a standard bell curve, you can picture the choice as being between a tall skinny piece of the distribution (a guess near zero) or a short fat piece (a guess far from zero). Which gives you the better chance of winning?

As it turns out, it doesn't matter. Either is as good as the other. And anywhere in between is also just as good.

Even more interesting is that it does not matter that the distribution is standard normal. No matter what the distribution is (well, there are a few technical caveats, but I don't feel like I'm betraying the spirit of the results to say that it doesn't matter), as long as the distribution of entries is the same as the distribution of possible outcomes, and as long as the pool has a lot of entries, it doesn't matter what you guess.
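If you'd like to see this for yourself, here is a rough numerical check of the standard normal case. It is not the proof, just a sketch under my own assumptions: 1,000 other competitors (`n_others`), a plain midpoint-rule integral over the target, and a hypothetical function name `win_probability`. For a given target, you win when no competitor's guess lands closer to it than yours, so the code averages that chance over all possible targets.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def win_probability(guess, n_others, lo=-8.0, hi=8.0, steps=80_000):
    """Chance that `guess` is closer to the target than all n_others guesses.

    Target and competitors are independent standard normals. For a target t,
    a competitor beats you by landing within |t - guess| of t; you win if
    none of them do. Integrate that over t with the midpoint rule.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * h
        d = abs(t - guess)
        crowd = normal_cdf(t + d) - normal_cdf(t - d)  # one competitor beats you
        total += normal_pdf(t) * (1.0 - crowd) ** n_others * h
    return total


n_others = 1000
fair_share = 1.0 / (n_others + 1)
for guess in (0.0, 1.0, 2.0):
    ratio = win_probability(guess, n_others) / fair_share
    print(f"guess {guess:.1f}: {ratio:.3f} times the fair share")
```

With a pool this size the ratio should sit very close to 1 for each of those guesses. Guesses much farther out in the tail need a much bigger pool before the result kicks in, which is the "lot of entries" condition at work.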

So, at least to the extent that you believe our abstract game models your pool reasonably well, any guess is as good as any other. Fill out your bracket based on geography, uniform color, fierceness of mascot, or whatever other criteria you want. Your chances are as good as anyone else's.

If you're a casual follower of college hoops, you might find this liberating. While I haven't given you any actual advice on how to fill out your brackets, at least I've absolved you of any guilt you may have had about entering a contest where you have no idea what you're doing.

17 Comments | Posted in Non-football, Statgeekery