Malcolm Gladwell, Google searches, and quarterback draft status versus performance as a predictor of future playing time
Last week, a high level cat fight broke out over Malcolm Gladwell's book "What The Dog Saw" when Dr. Steven Pinker wrote a critique in the NY Times. That critique included a reference to one essay in the book, which originally ran last December in the New Yorker, entitled "Most Likely to Succeed: how do we hire when we don't know who's right for the job?" In that essay, Gladwell states, in reference to what he calls the quarterback problem, that "[t]here are certain jobs where almost nothing you can learn about candidates before they start predicts how they’ll do once they’re hired." Pinker responded that "[i]t is simply not true that a quarterback’s rank in the draft is uncorrelated with his success in the pros."
Gladwell fought back on his blog. His response consisted primarily of attacks on the individuals Pinker later cited to support the position that draft position does matter, with only a brief note that the critics had failed to appreciate the difference between aggregate performance and per-play performance. He closed with:
I have enormous respect for Professor Pinker, and his description of me as “minor genius” made even my mother blush. But maybe on the question of subjects like quarterbacks, we should agree that our differences owe less to what can be found in the scientific literature than they do to what can be found on Google.
This, of course, piqued my interest. I admit to having heard reference to Gladwell's essay that originally ran last December, but had not paid it much attention. When I see defenses that are primarily based on attacks of the person, and what I see as an initially questionable assumption (per play statistics are all important; aggregates do not matter), well, I feel compelled to dig further. I happen to believe that the merits of an argument rise and fall on the quality of the facts and analysis, and not on who made it. This is true whether the arguments are presented in a scientific journal or on a blog. Oh, and I wanted to add something that could be found with a Google search.
I have little interest in whatever motivations drive the exchange between Pinker and Gladwell. I'm interested only in the football aspect. Does draft position matter in predicting success? To see where Gladwell is coming from, we need to turn to his source: an article published in the Journal of Productivity Analysis, entitled "Catching a draft: on the process of selecting quarterbacks in the National Football League amateur draft" by David Berri and Rob Simmons. In that article, Berri/Simmons looked at quarterbacks drafted between the 1st and 250th pick of a draft since 1970, found all cases where a quarterback from that group started at least one game in an NFL season, and then calculated performance using Berri's QB Score measure, on both an aggregate and a per-play basis. Looking at nearly 40 years of data, the authors concluded that no relationship could be found between a quarterback's draft position and his subsequent NFL performance on a per-play level. In other words, once the number of opportunities was ignored and quarterbacks were assessed on how they played per play, later-selected quarterbacks played as well as quarterbacks selected at the very top of the draft.
When the spat broke out between Gladwell and Pinker, Berri also posted some thoughts on his Wages of Wins website.
My sense is that Pinker never read our article. What he did find on the Internet is evidence that a quarterback’s aggregate performance (i.e. passing yards, seasons played, Pro Bowl appearances) is indeed related to draft position. And as Rob and I detailed in our article, this is true. Aggregate performance and draft position are statistically related. But as Rob and I argue, this is because in the NFL (like we see in the NBA) draft position is linked to playing time. And this link is independent of performance. In fact, Rob and I find that draft position – again, independent of performance – impacts a quarterback’s pay many years into a quarterback’s career.
It is one part of that quote that I want to focus on: the claim that the link between draft position and playing time is independent of performance. This is a key underlying assumption for the position that Gladwell (through Berri/Simmons) ultimately takes (although apparently it was a position that Gladwell had assumed long before that article was written). If quarterbacks get playing time because of their status, independent of how good they are, then Gladwell's position has some support.
The math in the Berri/Simmons article is fine. They use their own measure, QB Score, rather than others like Adjusted Net Yards per Attempt or QB Rating, but ultimately all those per-play measures include roughly the same things, and the differences are not going to change the conclusions. We can say with reasonable certainty that, as Berri/Simmons find, later-round quarterbacks who actually play in the NFL perform about as well as early draft picks who actually play. This part is adequately portrayed in the article.
What I cannot find support for in the article, other than the reference to a quarterback's pay, is the key underlying assumption that playing time is linked to draft position and that this link is independent of performance. Clearly, earlier-drafted quarterbacks get an opportunity at a younger age on average than later-drafted quarterbacks who eventually become starters, so draft position (along with opportunity and team need) does dictate how soon a quarterback gets a chance. Berri, however, separates out per-play performance based on years of experience and reports that correlations are non-existent on a per-play basis even when looking at quarterbacks with 5+ years of experience in the league. So, I don't think Berri is basing the "playing time related to draft position independent of performance" claim only on rookie starters at ages 22 and 23, with a belief that everything is equal thereafter.
To test the interplay of performance versus draft status, I looked at all quarterbacks going back to 1960 who threw 200 or more passes in the NFL at age 24. From this list, I excluded all undrafted free agents and all supplemental draft picks (where I can't assign a specific draft number), so that we are only measuring players who were drafted in the regular draft. I also excluded players who had not turned 29 years of age as of the 2008 season. I chose age 24 for a couple of reasons. First, prior to age 23, there are very few seasons where quarterbacks threw 200 passes. Matt Stafford, for example, has joined Fran Tarkenton and Drew Bledsoe as the only three quarterbacks to throw that many at age 21. Second, for most quarterbacks age 24 is not a true rookie season, and we get a much larger variety of draft positions, as a higher number of guys not selected in the first round start to get some playing time.
I ran a regression with the following input variables for that group of quarterbacks:
1) the quarterback's per play value above or below league average at age 24 ("PERFORM"), using the formula for value used by Chase in his most recent "Greatest Quarterback of All-Time" series; and
2) the quarterback's draft position ("DRAFT"), using a draft value number used by Chase originally here.
The three separate outputs used were games played from ages 25-29, games started from ages 25-29, and performance above or below replacement value from ages 25-29. I used the age 25-29 period as the output so that a) we would have a larger data set and not have to exclude all currently active players, and b) injuries later in the career or decisions to retire at one age versus another (which we would assume are not dependent on either draft position or performance at age 24) are not factored in. Of course, injuries can still determine how many games a specific QB played, but this limits that somewhat. Oh, and for players who had a strike year or played before 1978, all the outputs were normalized to a 16-game schedule (so that 80 games is the max available).
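As a rough sketch of that normalization, assuming a simple linear proration by the number of games scheduled in a given season (the function name and the sample numbers are hypothetical, and the actual adjustment may have differed in its details):

```python
def normalize_to_16(games, season_length):
    """Prorate a season's game count to a 16-game schedule.

    Assumes straight linear scaling by season length (an assumption,
    not a documented method).
    """
    if season_length <= 0:
        raise ValueError("season_length must be positive")
    return games * 16 / season_length


# Hypothetical example: 12 starts in a 14-game season.
print(round(normalize_to_16(12, 14), 2))  # 13.71

# Five full 14-game seasons prorate to the 80-game maximum.
print(sum(normalize_to_16(14, 14) for _ in range(5)))  # 80.0
```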
Here are the results:
GAMES STARTED FROM AGES 25 TO 29= ~ 34.69 + 8.49*PERFORM + 0.39*DRAFT
GAMES PLAYED FROM AGES 25 TO 29= ~ 46.78 + 6.56*PERFORM + 0.23*DRAFT
VALUE OVER REPLACEMENT FROM AGES 25 TO 29= ~ 1930 + 866*PERFORM + 31*DRAFT
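To make the three equations concrete, here is a minimal sketch that plugs a PERFORM and DRAFT value into them. The coefficients are copied from the results above; the function name and the sample inputs are hypothetical, and the scales of PERFORM and DRAFT are whatever the underlying value formulas produce, which this sketch does not attempt to reproduce.

```python
# (intercept, PERFORM coefficient, DRAFT coefficient) from the fitted equations.
COEFFS = {
    "games_started": (34.69, 8.49, 0.39),
    "games_played": (46.78, 6.56, 0.23),
    "value_over_replacement": (1930.0, 866.0, 31.0),
}


def project(perform, draft):
    """Return the projected age 25-29 outputs for one quarterback."""
    return {
        name: intercept + b_perform * perform + b_draft * draft
        for name, (intercept, b_perform, b_draft) in COEFFS.items()
    }


# Hypothetical inputs, chosen only to show the arithmetic.
example = project(perform=1.0, draft=10.0)
for name, value in example.items():
    print(name, round(value, 2))
```

With PERFORM = 1.0 and DRAFT = 10.0, the games started line works out to 34.69 + 8.49 + 3.90 = 47.08.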
All the variables were strongly significant. As it turns out, how the player performs at age 24 and the draft status both matter. The fact that draft status matters, though, doesn't automatically mean that all of it is due to teams starting high draft picks for no reason other than to justify their selection (though there is some of that). Some of the "draft status matters" effect is that the scouting is proven correct, and that the limitations of separating quarterback performance measures from the contributions of teammates are shown, because the higher-drafted quarterbacks go on to perform better than lower-drafted quarterbacks who happened to play about as well at age 24. The Troy Aikmans of the world sometimes pan out, and do so more frequently than the below-average performers who were not highly drafted.
That said, the results also support the idea that some highly drafted quarterbacks do get more opportunities than their performance dictates. This is something that I tried to look at last year in regard to Joey Harrington, and I do think that teams commit more false positive errors in regard to high picks (continuing to give them plays when the evidence suggests they are highly likely not to pan out) than false negatives (giving up on a high pick too soon, only to have him succeed elsewhere). If you compare the coefficients for Games Started versus Value over Replacement, you will see that the PERFORM coefficient is about 100 times larger for Value over Replacement than for Games Started. In contrast, the DRAFT coefficient is only about 80 times larger. Thus, performance at age 24 is relatively more important in determining value over replacement (for those that continue to play) than in determining games started. To put it into specific examples, occasionally Joey Harrington and Rick Mirer get to bounce around and start far more games than they should, and this happens more frequently for previous high picks than for low picks who played about like Harrington or Mirer at age 24.
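The "about 100 times" and "about 80 times" figures come straight from dividing the Value over Replacement coefficients by the Games Started coefficients. A quick check of that arithmetic (the variable names are just for illustration):

```python
# Coefficients from the Games Started and Value over Replacement equations.
perform_started, draft_started = 8.49, 0.39
perform_value, draft_value = 866, 31

perform_ratio = perform_value / perform_started
draft_ratio = draft_value / draft_started

print(round(perform_ratio, 1))  # 102.0
print(round(draft_ratio, 1))    # 79.5

# PERFORM carries roughly 28% more relative weight in the value equation.
print(round(perform_ratio / draft_ratio, 2))  # 1.28
```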
What is clear to me, though, is that performance matters. A lot. I know this is a shocking finding in a performance driven business like the NFL. I'll also add that my choice of a 200 pass attempts cutoff (instead of say, a 100 pass attempts cutoff) probably increased the importance of the DRAFT variable by excluding guys who were already being weeded out, despite their draft status, by age 24. Poster boys for the early draft busts such as Akili Smith, Art Schlichter, Todd Blackledge, Heath Shuler and Andre Ware didn't even make the data set because they didn't throw enough passes at age 24. Cade McNown was out of the league. Ryan Leaf, who is THE #1 poster boy, is included, and he started 3 more games in his NFL career. Performance (or at least not acting like a jerk while not performing) matters.
Here is a look at the projected Games, Games Started, and Value over Replacement for ages 25-29, for the players who threw 200 passes at age 24 since 2004.
Ben Roethlisberger is notably underprojected here, but how many quarterbacks have won 26 regular season games and a Super Bowl before age 24? 2006 was a down year for Big Ben, and our formula doesn't know about Ben's unique career before then. I included a projection for Russell based on his per-play stats. The fact that a player like Tyler Thigpen (who was below average, a 7th round pick, and hardly a star last year) has a similar projection in terms of games started to JaMarcus Russell (who has been horrible at age 24 as the first overall pick) sums it up in a nutshell.
If you believe that the only reason Carson Palmer has played a lot more than Gibran Hamdan is because Palmer was drafted a lot higher, then you can accept Gladwell's position. Otherwise, you probably cannot, at least to the extent Gladwell portrays, because we haven't accounted for the myriad late round picks where the initial scouting matched the performance teams were seeing in practice, and who never got any extended opportunity to play outside of the practice squad and pre-season contests. When we look at the top 20% of late round picks (those who are judged good enough to play or forced into action because of injuries) and find they are roughly similar to the top 80% of high draft picks, that does not mean that late round picks are equal to early picks, or that the NFL has a quarterback problem where nothing that happens before can predict what will happen in the future. Per-play stats matter, and it's important to look at quarterbacks from that perspective; otherwise we reward compilers who get opportunities without merit. More opportunities matter too, though, and a quarterback who plays well over a larger sample size is likely better than a quarterback who plays well over a small one, particularly when QB stats are more volatile due to outside factors such as teammates.
This is a distinction that, as far as I can tell, Gladwell fails to grasp.
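The selection effect in that argument can be illustrated with a toy calculation. This is only a sketch under made-up assumptions: "true ability" for early and late picks is modeled as two normal distributions one standard deviation apart, and the 80% and 20% playing fractions are taken from the example above. None of these numbers are estimated from real draft data.

```python
from statistics import NormalDist, fmean


def top_fraction_mean(dist, fraction, n=10_000):
    """Mean of the top `fraction` of a distribution, via an even quantile grid."""
    quantiles = [dist.inv_cdf((i + 0.5) / n) for i in range(n)]
    cutoff = int(round(n * (1 - fraction)))
    return fmean(quantiles[cutoff:])


# Assumed ability distributions (illustrative only).
early_picks = NormalDist(mu=0.5, sigma=1.0)
late_picks = NormalDist(mu=-0.5, sigma=1.0)

# Only the top 80% of early picks and the top 20% of late picks get to play.
early_who_play = top_fraction_mean(early_picks, 0.80)
late_who_play = top_fraction_mean(late_picks, 0.20)

print(round(early_picks.mean - late_picks.mean, 2))  # gap across all picks: 1.0
print(round(early_who_play, 2), round(late_who_play, 2))
```

Under these assumptions the two groups that actually play end up with nearly the same average ability, even though the full groups differ by a whole standard deviation. Similar per-play numbers among those who play say little about the picks who never saw the field.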
This entry was posted on Tuesday, November 24th, 2009 at 5:30 am and is filed under NFL Draft, Statgeekery.