# March Mathness — The Massey Method

In this post for March Mathness, Kenneth Massey, whose popular ratings (http://masseyratings.com) help rank the BCS teams each year, offers an overview of what goes into filling out his brackets for March Madness.

I’m a college basketball fan, but to be honest, I don’t watch many games during the regular season. Since my personal expertise is a function of ESPN highlights and commentary, I’ve learned to trust the math more than my own feelings about a matchup.

All season I compile a monster list of the various computer rankings for college basketball: http://www.masseyratings.com/cb/compare.htm

By following the results, I have a pretty good idea about which teams are overrated or underrated by the media, and which ones are coming on strong or fading as tournament time approaches.

Once the pairings are announced, I usually fill out a bracket based on my limited first-hand knowledge of the teams, the impressions I have from following the rankings, and maybe some “gut” intuitions. For that particular bracket, I don’t do any additional analysis, or even look at the numbers–I just rely on what’s already accumulated in my brain.

Now let me describe how I fill out my more analytical brackets. I have two different strategies, but both of them start with estimating the probabilities that each particular team will advance past each round.
In this post, I will describe that process. In a later post, I’ll describe how I use those probabilities to actually fill out my bracket picks.

I’ve been doing computer ratings for years, and have experimented with many mathematical models. One of them is described in the book Who’s #1? The model I currently use, and post on masseyratings.com, is proprietary, but I will list some of the pertinent aspects of it.

1) Margin of victory matters, but in an intelligent way. There are diminishing returns for blowouts, and adjustments are made for the pace of the game. For example, a 60-45 win may be more impressive than a 100-80 win.

2) Winning is rewarded, especially on the road. Even if the margin is small, a team gets a bump by winning games against good competition.

3) Schedule strength is implicit in all the equations. Everything is measured relative to the opponent, so there is higher reward and less risk for playing tough opponents.

4) The model has a decaying memory of early season games. The team in March is different from the team in November.

5) Games between mismatched opponents are not as important as games between well-matched opponents. There is a lot more information in a #18 vs #23 matchup than there is when #18 plays #230.

6) My model produces offensive and defensive ratings for each team, as well as home-field advantage estimates. From these, it is possible to predict the distribution of final scores for a hypothetical matchup between any two teams.
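To make item 6 concrete, here is a minimal sketch of how offensive and defensive ratings could be turned into a win probability. It is not the proprietary model. The function name `win_probability`, the additive points scale, the normal distribution for the final margin, and the default standard deviation of 11 points are all assumptions chosen for illustration.

```python
import math


def win_probability(off_a, def_a, off_b, def_b, home_edge=0.0, margin_sd=11.0):
    """Probability that team A beats team B under an assumed normal margin model.

    All inputs are hypothetical and measured in points:
      off_*     : points a team scores above an average team
      def_*     : points a team holds its opponent below an average team
      home_edge : bonus for team A at home (0.0 on a neutral court)
      margin_sd : assumed standard deviation of the final margin
    """
    # Expected scores relative to average: A scores off_a - def_b,
    # B scores off_b - def_a. The league average cancels in the margin.
    expected_margin = (off_a - def_b) - (off_b - def_a) + home_edge
    z = expected_margin / margin_sd
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical ratings: A is 6 points better on offense, B is 2 better on defense.
p = win_probability(off_a=6.0, def_a=1.0, off_b=0.0, def_b=3.0)
print(f"P(A beats B) = {p:.3f}")
```

On a neutral court the two probabilities for a matchup sum to one, which is a quick sanity check on any function of this kind.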

After the ratings are computed, I use conditional probability to effectively account for every possible scenario of how the bracket could “play out”. For example, if team X makes it to the Sweet 16, who are they likely to face? According to the seedings, some teams have easier paths of advancement. I can compute the probability that each team advances past a given round, the expected number of rounds a team will win, and ultimately each team’s probability of winning the championship.
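The conditional-probability bookkeeping can be sketched in a few lines. This is an illustration under stated assumptions, not the actual code behind the ratings: the function `advancement_probabilities`, the team names, and the toy `strength` values are hypothetical, and game outcomes are assumed to be independent given the pairwise win probabilities.

```python
def advancement_probabilities(teams, win_prob):
    """Return {team: [P(wins round 1), P(wins round 2), ...]}.

    teams    : team names in bracket order (length must be a power of two),
               so teams[0] plays teams[1], the winner meets the winner of
               teams[2] vs teams[3], and so on.
    win_prob : function (a, b) -> probability that a beats b.
    """
    n = len(teams)
    if n < 2 or n & (n - 1):
        raise ValueError("number of teams must be a power of two")

    alive = [1.0] * n              # P(team is still in at the start of the round)
    result = {t: [] for t in teams}
    block = 1                      # size of the sub-bracket each team has already won
    while block < n:
        nxt = [0.0] * n
        for i in range(n):
            # Possible opponents this round are the teams in the sibling block.
            start = ((i // block) ^ 1) * block
            p_beat_opponent = sum(
                alive[j] * win_prob(teams[i], teams[j])
                for j in range(start, start + block)
            )
            nxt[i] = alive[i] * p_beat_opponent
        alive = nxt
        for i, t in enumerate(teams):
            result[t].append(alive[i])
        block *= 2
    return result


# Toy four-team bracket with made-up strengths.
strength = {"A": 4.0, "B": 1.0, "C": 2.0, "D": 2.0}
probs = advancement_probabilities(
    ["A", "B", "C", "D"],
    lambda a, b: strength[a] / (strength[a] + strength[b]),
)
for team, p in probs.items():
    print(team, [round(x, 3) for x in p], "expected wins:", round(sum(p), 3))
```

Each team's list answers "how likely is it to survive this round?", its sum is the expected number of wins, and the last entry is the probability of winning the whole bracket. The last entries add up to one across all teams.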

The great thing about probabilities is that you are never “wrong”. For example, last year my calculations showed that UConn had an 86% chance of winning the first round, a 54% chance of advancing to the Sweet 16, a 29% chance of advancing to the Elite 8, a 12% chance of advancing to the Final 4, a 5% chance of playing in the championship game, and a 2.3% chance of winning it all.
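As a small worked example, the UConn numbers above can be turned into an expected number of wins and into per-game conditional probabilities. Only the six percentages come from the text. The variable names are illustrative.

```python
# Advancement probabilities quoted above for UConn, one entry per round won.
uconn = [0.86, 0.54, 0.29, 0.12, 0.05, 0.023]

# Expected number of wins is the sum of the per-round advancement probabilities.
expected_wins = sum(uconn)

# Chance of winning each game, given that the team reached it:
# P(win round k | won round k-1) = P(win round k) / P(win round k-1).
conditional = [uconn[0]] + [later / earlier for earlier, later in zip(uconn, uconn[1:])]

print(f"expected wins: {expected_wins:.3f}")
for round_number, p in enumerate(conditional, start=1):
    print(f"round {round_number}: {p:.3f}")
```

The sum works out to about 1.88 wins, so the calculation expected UConn to win close to two games on average.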

By the nature of randomness, it is not really surprising that underdogs occasionally win. Even a dominant #1 overall seed rarely has more than a 25% chance of winning the entire tournament. That’s what makes the event so exciting–nobody knows what will happen.
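A quick back-of-the-envelope check shows why that ceiling is so low. The sketch below assumes, purely for illustration, that a team needs six wins and has the same chance of winning each game. In reality the opponents get tougher as the rounds go on.

```python
ROUNDS = 6  # wins needed to take the title in a 64-team bracket

# Assumed constant per-game win probability for a dominant team.
per_game = 0.80
title_chance = per_game ** ROUNDS

# Per-game probability needed, on average, for a 25% title chance.
needed_per_game = 0.25 ** (1 / ROUNDS)

print(f"title chance at {per_game:.0%} per game: {title_chance:.1%}")
print(f"per-game probability needed for 25%: {needed_per_game:.1%}")
```

Even a team that wins 80% of the time in every round ends up with only about a 26% chance of winning six in a row.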

After all the probabilities are computed, I proceed to fill in my picks. Don’t I just pick the teams with the highest probabilities? Not exactly. I’ll address that in a subsequent post.