Tuesday, March 25, 2014

A stats-driven power ranking

Check out this post on the new version of Rink Stats. I used the win probability metric, which I described here, to calculate each team's average probability of winning in each of its games. Then, for each team, I calculated a weighted average of those per-game win probabilities, with weights based on the log of temporal distance. In other words, the most recent game gets the highest weight, while game 1 is still in the calculation but significantly down-weighted.
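
A minimal sketch of this kind of recency weighting, assuming a weight of log(k + 1) for a team's k-th game and normalizing by the sum of the weights. The function name `power_rank`, the +1 offset (which keeps game 1 from getting zero weight), and the sample values are illustrative assumptions, not the exact calculation behind the rankings.

```python
import math


def power_rank(avg_win_probs):
    """Recency-weighted average of per-game average win probabilities.

    avg_win_probs: floats in [0, 1], ordered from oldest game to newest.
    Assumption: the k-th game (1-indexed) gets weight log(k + 1), so later
    games count more and game 1 keeps a small non-zero weight.
    """
    if not avg_win_probs:
        raise ValueError("need at least one game")
    weights = [math.log(k + 1) for k in range(1, len(avg_win_probs) + 1)]
    weighted_sum = sum(w * p for w, p in zip(weights, avg_win_probs))
    return weighted_sum / sum(weights)


# Hypothetical per-game averages for two teams with the same plain mean (0.55).
early_strong = [0.70, 0.60, 0.50, 0.40]  # started hot, cooled off
late_strong = [0.40, 0.50, 0.60, 0.70]   # started cold, heating up

print(power_rank(early_strong))
print(power_rank(late_strong))
```

Both hypothetical teams have the same unweighted average, but the team that played better recently ends up with the higher value, which is the behavior described above.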

The result is a new power ranking metric that is based entirely on in-game statistics. As far as I know (and please correct me if I'm wrong), it's also the first power ranking (in any sport?) that doesn't just provide the ranks of the teams (1 through 30) but also gives a sense of how big the gaps are between teams 1 and 2, 2 and 3, etc. Check out the results below, which are based on every game this season (excluding last night's).

As you can see, not only is Boston the top-ranked team, it's also doing significantly better than the other teams in the top 5, mostly because of its dominance during its 12-game win streak. You can also see there's a huge drop-off for teams 25-30.


  1. Very interesting, but I'm a little confused by the write-up. I'm sure I'm missing something simple. Which probabilities do you average from each game? (Obviously they all start at close to 50% and end at 100%.) Also, by my eye, it looks like about 17 teams have percentages >=50%. And how does it account for things like schedule strength?

  2. For each game, I have the probability that either team wins the game at each second. I describe how I get these here: http://rinkstats.blogspot.com/2014/03/win-probabilities-metric-10.html. For the power rankings, I essentially take, for each of team A's games this season, the average over the game of the probability that team A wins. Then the power ranking for team A is a weighted average of all of those per-game averages, where the weight for each game is the log of how many games have been played by team A up to and including that game.

    So after a team has played 10 games, its power rank value is
    avg.win.prob.game10 / log(10) + avg.win.prob.game9 / log(9) + avg.win.prob.game8 / log(8) + avg.win.prob.game7 / log(7)...

    This means it's possible for more than half of the teams to be above 50% if there are some teams that are really, really bad. The average of the power ranks across all 30 teams will be roughly 50%. It won't necessarily be exactly 50% because of the weighting, but it will be incredibly close.

    And I haven't yet incorporated schedule strength, but that'll be a next step.
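
To illustrate how more than half of the teams can sit above 50% while the league-wide average stays at 50%, here is a small arithmetic sketch. The values are made up for illustration (18 teams at 0.55 and 12 at 0.425); they are not the actual rankings.

```python
from statistics import mean

# Hypothetical power rank values for 30 teams: a majority slightly above
# 50%, balanced by a smaller group well below it.
ranks = [0.55] * 18 + [0.425] * 12

above_half = sum(1 for r in ranks if r > 0.5)
league_average = mean(ranks)

print(above_half)      # 18 teams are above 50%
print(league_average)  # still about 0.5 across the league
```

A handful of very weak teams pulls the average down enough to let a majority of teams sit above 50%.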