On March 24 we posted some bold predictions on who would win the NCAA basketball championship game. We now have a winner and send our congratulations to Kelly Davis.
Vickie Kearn: My bold pick was Duke based on a little math, past performance, and the luck factor. The one thing missing from my equation was the upset factor and I will be sure to add that next year.
The winner of the ESPN bracket challenge is Joe Pearlman, who filled out his bracket in 10 minutes and based his picks on a hunch. Out of 5.9 million entries, he is one of only two people who picked the final four, and he will be taking home the $10,000 prize. Does this mean we should throw out all of our math models and go solely on hunches or throw darts at a bracket next year? Absolutely not! As you will see, Tim Chartier and his students did very well with their brackets.
Tim Chartier: Any prediction method is, at some level, working on the odds of long-term success. This can be seen by our methods producing brackets that were in the 90th percentile in 3 of the past 4 years. However, this year was, indeed, quite different. Still, Kelly Davis, a senior math major at Davidson College, produced a bracket that beat many celebrity sports analysts’ brackets. We were only using the results of games, when in the season each game occurred, and whether the game was home, away or on neutral ground.
Caption: Kelly Davis with her prizes of Ben and Jerry’s and Princeton University Press books.
Vickie Kearn: What method did you use when preparing your bracket?
Kelly Davis: Like most students in my Math Modeling class, I used a linear weighted Colley Ranking method that we learned about in class, which uses a system of linear equations. Variations of the Colley Ranking Method are often used in sports rankings, including for the Bowl Championship Series. Each student then modified this method to emphasize or add in different factors that the student felt were important. Not knowing much about college basketball, I had to pull from my somewhat limited knowledge of sports to help me decide what factors from the regular season were important to help predict the tournament outcomes. The three major factors that I built into my code were the point difference between the winner and loser, when in the season the game was played and whether the game was home, away or on neutral ground.
Let me give a few more details on this. Factoring in point difference helps to indicate the strength of the win. Winning by a lot is a stronger win than only winning by a little. It also helps to account for games that are very close on points and ultimately come down to a bunch of fouls being called. Considering when in the season the game is played allowed me to place heavy emphasis on the end of the season. If a team is playing poorly at the end (such as due to the injury of a major player) then they will probably not do well. All teams are playing intensely at the end of the season in their conference tournaments, which I consider a good predictor for tournament play. Finally, I fold in a weight for location. Teams that typically do poorly at away games will have a hard time in March Madness, where no one plays a home game.
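For readers curious about the system of linear equations behind the Colley method, here is a minimal Python sketch. It is not Kelly's code. It assumes the standard Colley setup (each team starts with a 2 on the diagonal and a 1 on the right-hand side) and a common weighted generalization in which each game counts as `weight` games instead of one. The function names, the tiny solver, and the three-team example are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def colley_ratings(teams, games):
    """Return {team: rating} from games given as (winner, loser, weight).

    A weight of 1.0 for every game gives the uniform Colley method;
    other positive weights are an assumed generalization.
    """
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    c = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for winner, loser, weight in games:
        w, l = index[winner], index[loser]
        c[w][w] += weight      # weighted games played by the winner
        c[l][l] += weight      # weighted games played by the loser
        c[w][l] -= weight      # weighted head-to-head meetings
        c[l][w] -= weight
        b[w] += weight / 2     # half a weighted win
        b[l] -= weight / 2     # half a weighted loss
    return dict(zip(teams, solve_linear_system(c, b)))


# Hypothetical three-team season: A beat B and C, and B beat C.
ratings = colley_ratings(
    ["A", "B", "C"],
    [("A", "B", 1.0), ("A", "C", 1.0), ("B", "C", 1.0)],
)
```

With equal weights, this toy season gives ratings of about 0.7, 0.5 and 0.3 for A, B and C. Changing a game's weight, for example to reflect point difference, timing or location, shifts how much that result moves the ratings, while the average rating stays at 0.5.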
Dr. Chartier included the brackets generated by the linear and the uniform Colley methods in our ESPN group so that we could see how our brackets compared to the simplistic/conservative, non-modified versions. Despite ranking lower than the majority of the class last year, the linear Colley method ironically ended up being the next highest bracket after mine, placing in the 64.4 percentile. Last year it placed in the 82 percentile. Perhaps sometimes the safest approach is the best approach!
Vickie: Did you submit more than one bracket? If so, which performed the best?
Kelly: Each student in our class was allowed to submit up to three brackets and I ended up submitting two. My most successful bracket was my initial one, which I had to complete for a homework assignment. In this bracket, except for the very first portion of the season, I divided the season into 10 segments, increased the weight of each successive segment by 10%, and then gave the last two segments a somewhat higher weight because most teams are playing conference championship games during this time. In this bracket, I also subtracted a 3-point home court advantage from the score of the home team. My second bracket placed in the 64.4 percentile, the same percentile as the linear Colley method. For this one, I mostly shifted more weight to the end of the season, so that winning at the end of the season mattered more than winning at the beginning. I also penalized teams who tended to lose more at away games.
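The time weighting and home-court adjustment described here can be sketched roughly as follows. This is an illustration under stated assumptions, not Kelly's actual code: it reads "an increasing 10%" as additive steps of 0.1 per segment, ignores the special handling of the very first portion of the season, and uses a made-up `late_bonus` of 0.1 for the last two segments. Only the 10 segments and the 3-point home edge come from the interview.

```python
def segment_weight(day, season_length, n_segments=10, late_bonus=0.1):
    """Weight for a game played on `day` (0-based) of the season.

    The season is cut into `n_segments` equal pieces; later pieces get
    larger weights, and the last two get an extra (assumed) bonus.
    """
    segment = min(day * n_segments // season_length, n_segments - 1)
    weight = (segment + 1) / n_segments   # 0.1, 0.2, ..., 1.0
    if segment >= n_segments - 2:
        weight += late_bonus              # assumed bump for late games
    return weight


def adjusted_margin(home_score, away_score, neutral=False, home_edge=3):
    """Home-minus-away margin after removing the home-court edge."""
    if not neutral:
        home_score -= home_edge           # 3 points, as in the interview
    return home_score - away_score
```

Under these assumptions, a 70-68 home win has an adjusted margin of -1, so it counts against the home team, while the same score on a neutral court keeps its margin of 2. Weights like these could be passed as the per-game weight in a weighted Colley system.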
Vickie: Were there any surprises this year that you did not count on and that affected your bracket in a big way?
Kelly: With only 4.7% of the more than 5.9 million brackets submitted to the Tournament Challenge accurately predicting Connecticut to win, and only two people in the entire country correctly picking the final four, I think it is safe to say there were many surprises that most people did not count on! Early in the tournament, my model actually did very well at predicting the outcomes of the first two rounds. I finished the second round in the 91.0 percentile, which placed me above many experts on this subject such as Mike Greenberg, Dick Vitale, and Matthew Berry, who ended up in the 21.3, 21.3, and 11.9 percentiles, respectively. Then again, the 5-year-old son of Matt Hasselbeck (quarterback for the Seattle Seahawks) finished in the 93.4 percentile.
As the tournament progressed and more of the upsets occurred, my bracket, along with many others such as President Obama’s, became less successful at predicting these surprises. Many of the surprises my bracket missed were unexpected for most people. One was Kentucky’s win over Ohio State, the team that over a quarter of the brackets, including mine, had predicted would win. Another was Butler’s surprising series of wins, as evidenced by the fact that only 11,326 of the 5.9 million brackets had accurately predicted Butler being in the finals.
Vickie: Each round of the competition provides a certain number of points for a correct pick. For example, you get 10 points for each winner you pick in the second round of play, 80 points for selecting the Elite 8 and 320 points for selecting the champion. The most points you can get is 1920. What was your ESPN score? You didn’t win the $10,000 but what was your prize?
Kelly: As Dr. Chartier mentioned earlier, for the first time in the past four years, my class’s mathematical models were not as successful at predicting all of this year’s surprises and my ESPN score ended up being 560 points, placing me in the 68.1 percentile, which was the same percentile as Colin Cowherd, an American sports radio personality.
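The scoring arithmetic in Vickie's question can be checked with a short sketch. The 10, 80 and 320 point values and the 1920 maximum are quoted above; the 20, 40 and 160 values for the other rounds are assumed from the doubling pattern, and the per-round counts of correct picks in the example are hypothetical, not any real bracket.

```python
# Points per correct pick in each of the six scoring rounds.
# 10, 80 and 320 are quoted in the interview; 20, 40 and 160 are assumed.
POINTS_PER_PICK = [10, 20, 40, 80, 160, 320]
# Number of games (and so picks) in each of those rounds.
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]


def max_score():
    """Score of a perfect bracket."""
    return sum(p * g for p, g in zip(POINTS_PER_PICK, GAMES_PER_ROUND))


def bracket_score(correct_picks_per_round):
    """Score for a bracket given its count of correct picks in each round."""
    return sum(p * c for p, c in zip(POINTS_PER_PICK, correct_picks_per_round))


# Hypothetical bracket: 20, 10, 4, 1, 0 and 0 correct picks by round.
example = bracket_score([20, 10, 4, 1, 0, 0])
```

Each round is worth 320 points in total, which is where the 1920 maximum comes from. The hypothetical bracket above scores 640, and it shows why late-round upsets are so costly: missing the final two rounds forfeits a third of the available points.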
Despite not doing as well as other students in previous years, I was a bit more successful within my modeling class and ended up winning our in-class pool. As part of winning our class pool, I received $100 worth of books from Princeton University Press, a t-shirt from the Davidson College Athletics Department and several free cones at Ben & Jerry’s. The picture above shows me sitting in our local Ben & Jerry’s with some books on ranking published by Princeton University Press while I enjoy one of my victory cones.
I think the largest prize of all, however, was the opportunity to show my friends and fellow college students an exciting and cool application of math to a topic most people would never associate with math. Some of my friends hated seeing their carefully thought out brackets lose to a bracket generated by a “math nerd” who knows very little about college basketball, which made my ice cream victory taste even sweeter!
Vickie: What would you do differently next year?
Kelly: After having had a lot of success running my code on some of the past few seasons, fairly consistently predicting a large portion of the Elite Eight each year, in some ways I would be tempted to change very little. As with most mathematical models, my model has many limitations and flaws, and consequently will have instances such as this year where it is less successful at accurately predicting real world outcomes, but then again so were many experts. I think one of the coolest things about using math modeling to predict tournament outcomes is that you can use the same code to predict outcomes each year without having to spend the entire regular season keeping track of scores and top teams.
A couple of things I would be interested in exploring would be to look at a team’s patterns of wins and losses as an indicator of how to weight wins at different points in the season. After seeing how successful Butler was for the second year in a row, I also think it would be interesting to consider the success rates of teams in previous March Madness tournaments.
Vickie: In the earlier post, Lucy McMurry was doing well. How did her bracket do in the end?
Tim Chartier: Lucy was, indeed, doing very well. However, many of her picks did not lead to points as the tournament progressed and so she ended up in the 50.9 percentile. So, she was better than over half the brackets but it was indeed a difficult year! We look forward to next year and maybe this year will give us new ideas and even new statistics to fold into our methods. Nevertheless, there will always be upsets and a certain amount of madness in March as the tournament unfolds.