Just in time for the first Sweet Sixteen games, math editor Vickie Kearn speaks with Tim Chartier about how math and Google are used to predict the bracket winners. Enjoy this exclusive dialogue below.
Vickie Kearn: When I was at the University of Richmond, I only went to one football game in four years. That was not a program that got a lot of attention then. However, the Spiders had a great basketball team, as they do now. Because of their great success, I have been watching as many games as possible and following the brackets with a huge amount of energy. A few years ago, we published a book by Amy Langville and Carl Meyer, Google’s PageRank and Beyond: the Science of Search Engine Rankings, and part of the “Beyond” is how you use math to rank sports teams. It is really quite fascinating.
Amy and her students at the College of Charleston and Tim Chartier and his students at Davidson College use mathematical algorithms to rank teams, and they are doing fantastically well on their brackets this year. I asked Tim and his student, Lucy McMurry, a sophomore who plans to declare a major in math and a minor in Spanish, how they used their math background to make sense of a bracket with 68 teams. How do you pick who will be #1?
Tim Chartier: There are a variety of techniques that can be used to rank items. Sports teams are often ranked by winning percentage. Elections are often won by the person who gains the highest number of votes. Search engines often use techniques from linear algebra. Amy and Carl’s book discusses important aspects of Google’s PageRank algorithm that make it suitable for ranking webpages and how it scales to analyzing billions of items. For our brackets, we also rank sports teams with linear algebra. PageRank uses a stochastic matrix that is built from underlying probabilities based on a model of a surfer randomly traversing the web. Our method instead builds a linear system, Ax = b, whose matrix has different properties than a stochastic matrix like PageRank’s.
Lucy McMurry: To begin, we acquire all of the data about the 68 participating teams, including when each game was played, the score of each game, and whether it was a home or away game for each team. From this point, it is up to the student how he or she wants to implement the code. For example, an away win could be worth more than a home win, and a game played later in the season could be worth more than one played at the beginning. Once the code is set, a ranking is produced using all of the data. From here, we assume that the higher ranked team will win each game. Thus, the top ranked team will win the tournament.
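Editor's note: the last step Lucy describes, advancing the higher ranked team in every game, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the class's actual code: the function name fill_bracket, the four-team field, and the ratings are all hypothetical, the field size is assumed to be a power of two (the real 68-team field has play-in games first), and ties go to the team listed first.

```python
def fill_bracket(slots, rating):
    """Advance the higher-rated team in every game, round by round.

    slots  -- teams in bracket order; slots[0] plays slots[1], and so on.
              Assumed here to have a power-of-two length.
    rating -- dict mapping each team to its numeric rating.
    Returns a list of rounds, each a list of that round's predicted winners.
    """
    rounds = []
    current = list(slots)
    while len(current) > 1:
        winners = []
        # Pair up neighbours: (0, 1), (2, 3), ...
        for a, b in zip(current[::2], current[1::2]):
            # Assumption: a tie in rating goes to the first-listed team.
            winners.append(a if rating[a] >= rating[b] else b)
        rounds.append(winners)
        current = winners
    return rounds


# Hypothetical four-team field with made-up ratings.
slots = ["A", "B", "C", "D"]
rating = {"A": 0.7, "B": 0.5, "C": 0.3, "D": 0.6}
rounds = fill_bracket(slots, rating)
champion = rounds[-1][0]
print(rounds)    # [['A', 'D'], ['A']]
print(champion)  # A
```

Because every pick follows the ranking, the top ranked team always ends up as the predicted champion, which is the point Lucy makes.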
VK: When you found out on March 13 who would be playing in the tournament, how did you go about setting up your brackets and selecting who you predict will win on April 4?
LM: I personally don’t know very much about basketball and hadn’t followed the entire season. Therefore, seeing who would be playing on March 13th didn’t really affect how I wanted to structure my code. I based my code on what seemed reasonable to me from an outsider’s perspective, which led me to write code based on when in the season a game was won.
TC: This is quite true. Fundamentally, the students think mathematically about their models, which allows all the students to participate. I’ve seen some students fiddle with their models when they have a particular team they want to see perform well or even poorly. In many cases, this doesn’t help. While there are a number of different ways one could measure success, we submit our brackets to the ESPN Tournament Challenge, which allows us to compete against each other and millions of other brackets.
VK: Is there more than one way to rank the teams?
TC: Yes. Some people pick the winning team based on which team’s mascot they prefer! And yes, there are also multiple methods for doing this mathematically. In college football, the Bowl Championship Series (BCS) uses several methods to rank the teams. One such method is the Colley method, a linear system based on wins and losses; there is also the Massey method, which can integrate the scores of games. One can also adapt Google’s PageRank algorithm.
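Editor's note: for readers who want to see the Colley method as a linear system, here is a minimal sketch in plain Python. It uses the standard published form of the method: the matrix C has 2 plus a team's number of games on the diagonal and minus the number of meetings between two teams off the diagonal, and the right-hand side is b_i = 1 + (wins_i - losses_i)/2. The three-team schedule, the function names, and the small Gaussian-elimination solver are illustrative assumptions, not the code used in the class.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def colley_ratings(teams, games):
    """games is a list of (winner, loser) pairs; returns {team: rating}."""
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    C = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        C[w][w] += 1.0  # one more game played by each team
        C[l][l] += 1.0
        C[w][l] -= 1.0  # one more meeting between the two teams
        C[l][w] -= 1.0
        b[w] += 0.5     # builds up 1 + (wins - losses) / 2
        b[l] -= 0.5
    return dict(zip(teams, solve(C, b)))


# Hypothetical schedule: A beats B and C, and B beats C.
teams = ["A", "B", "C"]
games = [("A", "B"), ("A", "C"), ("B", "C")]
ratings = colley_ratings(teams, games)
ranking = sorted(teams, key=ratings.get, reverse=True)
print(ranking)  # ['A', 'B', 'C']
```

In this toy schedule the ratings come out to roughly 0.7, 0.5 and 0.3. Colley ratings always average 0.5, so a team above 0.5 is rated better than an average team.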
LM: In fact, I’m using, as many of my fellow students are, the Colley method. However, we use recent research by Drs. Chartier and Langville that allows us to model momentum. By using these different models and ideas, we are all able to come up with original codes that can produce very different results.
VK: Suppose you love math but don’t know anything about basketball. Will your rankings still predict a reasonable winner?
LM: I love math, but as I mentioned earlier, I do not know very much about basketball. However, I am tied for first place in our class pool along with three other students and am currently in the 91st percentile nationally! Therefore, I think I can safely say that my code predicted a very reasonable ranking of the teams.
TC: Lucy and the three other students who are currently in the lead are outperforming more than 5 million other brackets! Kelly Davis, who you will see in the video, is one of the students leading the class. Daniel Martin, who is also interviewed in the video, is in a pool outside the class and is currently in the 96th percentile. Interestingly, some students in the class pool tried very novel approaches to modeling momentum and a few such methods are performing quite poorly, with one such method ranking in the 1.5th percentile!
VK: Amy Langville’s students received quite a bit of publicity in the past because their predictions were so good. Have you experienced a bit of fame from your predictions?
TC: Yes. The media has covered our brackets for the past three years as we’ve done well each year.
LM: Just last week, Derek James of Fox News Charlotte came to our class to film Dr. Chartier talking to us about our brackets. Many of the students in the class were able to email their parents to watch Fox News and get a glimpse of us in class. You’ll see me in the news segment but I definitely have a good portion of any eventual 15 minutes of fame left! The interview concentrated on Dr. Chartier as well as a few students from my class discussing their brackets and the theories behind their code.
TC: In fact, I helped Derek create a bracket using our methods. I asked him to break the season into as many intervals as he wanted. For instance, suppose he chose 3. Then, he would weight the games during each interval. Suppose he chose weights of 1/2, 3/4 and 1. Then, all the games in the first, second and last third of the season would be worth 1/2, 3/4 and 1 game, respectively. He also gave weights to home and away games. In the end, he had a personalized bracket that is tied with Lucy’s! The winner of the class pool gets a prize from Ben and Jerry’s in Davidson. Derek isn’t eligible as coming for 15 minutes to class and talking during the lecture doesn’t qualify! We have great fun watching the brackets unfold and seeing how our modeling performs.
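Editor's note: the interval weighting Tim describes can be sketched by letting each game count as a fraction of a game inside a Colley-style linear system. The sketch below assumes that a game of weight w adds w (rather than 1) to the matrix entries and w/2 (rather than 1/2) to the right-hand side; this is one reading of the weighting, not necessarily the exact formulation used in the class. The names interval_weight and weighted_colley, the 90-day season, and the two-game schedule are hypothetical. Home and away weights are left out; one option would be to multiply them into the same per-game weight.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def interval_weight(day, season_length, weights):
    """Weight for a game on `day` (0-based) when the season is split into
    len(weights) equal intervals."""
    k = len(weights)
    return weights[min(day * k // season_length, k - 1)]


def weighted_colley(teams, games, season_length, weights):
    """games is a list of (winner, loser, day); returns {team: rating}."""
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    C = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for winner, loser, day in games:
        w, l = idx[winner], idx[loser]
        wt = interval_weight(day, season_length, weights)
        C[w][w] += wt  # the game counts as `wt` of a game for both teams
        C[l][l] += wt
        C[w][l] -= wt
        C[l][w] -= wt
        b[w] += wt / 2
        b[l] -= wt / 2
    return dict(zip(teams, solve(C, b)))


# Hypothetical 90-day season split into thirds, weighted 1/2, 3/4 and 1.
# A beats B early (day 10); B beats A late (day 80).
games = [("A", "B", 10), ("B", "A", 80)]
ratings = weighted_colley(["A", "B"], games, 90, [0.5, 0.75, 1.0])
print(max(ratings, key=ratings.get))  # B
```

With equal weights the two teams would tie at 0.5 each. With these weights, B's late win counts for more, so B is rated about 0.55 and A about 0.45, which is the kind of momentum effect the weighting is meant to capture.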
VK: To learn more about what Derek James learned in class, watch his interview below.
Derek James, Reporter FOX Charlotte-WCCB. Used with permission.
VK: In 2009 we published Mathletics: How Gamblers, Managers, and Sports Enthusiasts Use Mathematics in Baseball, Basketball, and Football by Wayne Winston, and this is another great source for you veteran bracketologists. Also, if you go to waynewinston.com you can see all of Wayne’s calculations and the odds of each team winning in a particular round. For example, in his Sweet Sixteen odds he gives the University of Richmond a 0.84% chance of winning the championship and Ohio State a 29.6% chance of taking home the trophy. Although my heart is still with Richmond, I am going to go with Duke for a repeat. All the rankings and calculations I have done give them less of a chance of winning than other teams, so they are a mathematical longshot. However, I added a little luck factor into my calculations. Wayne is going with Ohio State. Check back in a few weeks and we will let you know how we did.
Good luck to all!