In Dialogue: Christopher Phillips and Tim Chartier on Sports & Statistics

Question: How would you describe the intersection between statistics and sports? How does one inform the other?

Christopher Phillips, author of Scouting and Scoring: Sports have undoubtedly become one of the most visible and important sites for the rise of data analytics and statistics. In some respects, sports seem to be an easy, even inevitable place to apply new statistical tools: most sports produce a lot of data across teams and seasons; games have fixed rules and clear measures of success (e.g., wins or points); players and teams have incentives to adjust in order to gain a competitive edge.

But as I discuss in my new book Scouting and Scoring: How We Know What We Know About Baseball, it is also easy to fall prey to myths about the use of statistics in sports. Though these myths apply across many sports, it is easiest to home in on baseball, as that has been one of the most consequential areas for statistics.

Perhaps the most persistent and pernicious myth is that data emerge naturally from sporting events. There is no doubt that new video-, Doppler-, and radar-based technologies, especially when combined with increasingly cheap computing power and storage capability, have dramatically expanded the amount of data that can be collected. But it takes a huge amount of labor to create, collect, clean, and curate data, even before anyone tries to analyze them. Moreover, some data, like errors in baseball, are inescapably the product of individual judgment, which has to be standardized and monitored.

The second myth is that sport statistics emerged only recently, particularly after the rise of the electronic computer. In fact, statistical analysis in sports goes back decades: in baseball, playing statistics have been used to evaluate players for year-end awards and to negotiate contracts for as long as professional baseball has existed. (And statistics were collected and published for cricket decades before baseball’s rules were formalized.) As new methods of statistical analysis emerged in the early twentieth century in fields like psychology and physiology, some observers immediately tried to apply them to sports. In the 1910 book Touching Second, the authors promoted the use of data for shifting around fielders and for scouting prospects, two of the most important uses of statistical data in the modern era as well. There’s certainly been a flurry of new statistics over the last twenty years, but the general idea isn’t new: consider that Allen Guttmann’s half-century-old book From Ritual to Record highlights the “numeration of achievement” and the “quantification of the aesthetic” as defining features of modern sport.

Finally, it’s a myth that there is a fundamental divide between those who look at performance statistics (i.e., scorers) and those who evaluate bodies (i.e., scouts). The usual gloss is that scouts are holistic, subjective judges of quality whereas scorers are precise, objective measurers. In reality, baseball scouts have long used methods of quantification, whether for the pricing of amateur prospects, for the grading of skills, or for the creation of single metrics like the Overall Future Potential that reduce a player to a single number. There’s a fairly good case to be made that scouts and other evaluators of talent are even more audacious quantifiers than scorers, in that the latter mainly analyze things that can be easily counted.

Tim Chartier, author of Math Bytes: Data surrounds us. The rate at which data is produced can make us seem like specks in the cavernous expanse of digital information. Each day 3 billion photos and videos are shared on Snapchat. In the last minute, 300 hours of video were uploaded to YouTube. Data is offering new possibilities for insight. Sports is an area where data has both a traditional role and newfound possibilities, due in part to ever-larger datasets.

For years, baseball has had a number of constants, including the ball, bat, bases, and statistics like balls, strikes, hits, and outs. Statistics are, and have always been, simply a part of the game. You can find from the 1920 box score that Babe Ruth got 2 hits in 4 at-bats in his first game as a Yankee. While new metrics have emerged with analytical advances, the game has been well studied for some time. As Ford C. Frick stated in Games, Asterisks and People,

“Baseball is probably the world’s best documented sport.”

While this is true, the prevalence of data does not necessarily mean that people trust the recommendations of those who study it. For example, manager Bobby Bragan stated, “Say you were standing with one foot in the oven and one foot in an ice bucket. According to the percentage people, you should be perfectly comfortable.” This underscores an important aspect of data and analytics. Data can inherently lead to insight, but it becomes actionable only when one trusts how accurately it reflects our world.

Other sports, while not as statistically robust as baseball, also have an influx of data. In basketball, cameras positioned in the rafters report the (x,y) position of every player on the court and the (x,y,z) position of the ball every fraction of a second throughout the entire game. As such, we can replay aspects of games via this data for years to come. With such data comes new insight. For example, we know that Steph Curry, while averaging just over 34 minutes a game, runs, on average, just over 2.6 miles per game. He also runs almost a quarter of a mile more on offense than on defense.
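To make the idea concrete, here is a minimal sketch of how a distance figure like this could be derived from position samples. It is not the method any tracking vendor or league actually uses; the function name `distance_traveled`, the sample coordinates, and the assumption that positions are recorded in feet are all hypothetical choices made for illustration.

```python
import math

FEET_PER_MILE = 5280  # assumes positions are recorded in feet (an assumption)


def distance_traveled(positions):
    """Sum the straight-line distances between consecutive (x, y) samples."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    )


# Hypothetical samples for one player, in feet; real data would contain
# many samples per second for the entire game.
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0), (11.0, 16.0)]

total_feet = distance_traveled(track)
total_miles = total_feet / FEET_PER_MILE
print(f"{total_feet:.1f} feet, or {total_miles:.4f} miles")
```

Splitting the samples by possession before summing would give separate offensive and defensive totals. In practice, noisy measurements would likely need smoothing first, since jitter between samples inflates the summed distance.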

While such data can be stunning in its size and detail, it also comes with challenges. How do you distinguish a pick and roll from an isolation play when all you have are, essentially, dots moving in a plane? Further, basketball, like football but unlike baseball, generally involves multiple players at a time. How much credit do players get for a basket on offense? A player’s position may open up possibilities for scoring, even if that player didn’t touch the ball. As such, metrics have been and continue to be created in order to better understand the game.

Sports are played with a combination of analytics, gut, and experience. The right combination depends on the sport, player, coach, and context. Nonetheless, data is here and will continue to give insight into the game.

Celebrate Major League Baseball’s Opening Day by Reading about Baseball in Blue and Gray

Today is THE day, baseball fans. Major League Baseball is back in action. Over at the New York Times, they are celebrating by looking back at the early days of baseball. Specifically, they have posted an article from Princeton University Press author George B. Kirsch on baseball during the Civil War.

Compare Kirsch’s description of “spring training” and “opening day” in 1861 to the great hullabaloo today:

In late March and early April 1861, ballplayers in dozens of American towns looked forward to another season of play. But they were not highly paid professionals whose teams traveled to Florida or Arizona for spring training. Rather, they were amateur members of private organizations founded by men whose social standing ranged from the working class through the upper-middle ranks of society. There were no formal leagues or fixed schedules of games, although there were regional associations of clubs that drew up and enforced rules for each type of bat and ball game. Contests between the best teams attracted large crowds (including many gamblers), and reporters from daily newspapers and weekly sporting magazines wrote detailed accounts of the games.

While much has changed in American baseball since 1861, what hasn’t changed is the anticipation, excitement, and pure sport of the game. Unfortunately, according to Kirsch, this spirit wasn’t enough to hold the reality of the Civil War at bay. He writes:

As military action between the North and the South loomed, sportswriters highlighted the analogy between America’s first team sports and warfare. Yet they were also aware of the crucial differences between play and mortal combat. In March 1861, The New York Clipper anticipated the impending crisis:

God forbid that any balls but those of the Cricket and Baseball field may be caught either on the fly or first bound, and we trust that no arms but those of the flesh may be used to impel them, or stumps, but those of the wickets, injured by them.

But three months later sober realism replaced wishful thinking. A Clipper editor remarked:

Cricket and Baseball clubs … are now enlisted in a different sort of exercise, the rifle or gun taking the place of the bat, while the play ball gives place to the leaden messenger of death. Men who have heretofore made their mark in friendly strife for superiority in various games, are now beating off the rebels who would dismember this glorious “Union of States.”

Click over to read the complete article and peruse the Disunion feature at the New York Times. Disunion is tracking, day by day, the course of the Civil War in America through terrific articles from experts in a variety of fields. While there is certainly a lot of military history, the editors are also focusing on cultural and social issues (like baseball!), which make for truly compelling reading.