For those interested, there is a Q/A at fivethirtyeight.com that discusses applying Elo to NFL football. Here I am a bit more descriptive than the Q/A, which fails to elaborate in some areas. I also recommend the Wikipedia article on the Elo rating system.
What is Elo?
Elo is a rating system designed for head-to-head matchups. It is named after its creator Arpad Elo, and it is not an acronym for anything in particular.
Elo is designed to take opinion and marketing out of the rating process. Only the actual result of a matchup is measured and credited to or debited from a participant’s rating. It helps to form ranking systems less influenced by human biases, except, of course, in the choice of values used to form the rating. That is not to say it is free from all bias. Mathematically, the past history of the participant is always going to bias their rating, at least temporarily. It cannot account for a recent accident which has made the competitor incapable of performing at a previous rating. Elo was also very useful before the internet enabled matchups between opponents who are distant geographically.
Elo was popularized as a chess rating system to deal with the difficulty of rating and ranking players for competitions. In fact, if you have ever heard of a chess master, much of what goes into determining their mastery is a high Elo-based rating. It is desirable for competitive purposes that better chess players play similarly graded opponents. Additionally, for the purposes of ranking, it is desirable that higher-skilled players are not rewarded for beating up on lower-ranked players in an attempt to pad their ranking. Elo is also designed to deal with the challenge that many players will never encounter each other, in other words, when the network of matchups is sparse. Increased sparsity does still bias the rating system. However, higher-level competitions bring together top performers to level out this problem.
Elo and similarly derived ranking systems are used across many competition platforms. Video games, sports and other competitions have adapted the Elo rating system to their purposes. In fact, the application of the Elo rating system to football is more expressive than its application to chess. In chess it is harder to quantify the strength of a win, as piece counts or turn counts can indicate style rather than strength. In football, by contrast, the point differential is a relatively good indicator of the difference in team quality, especially in offensive leagues like the CFL.
Why use it?
Elo is in a lot of ways a quantification of what humans do all the time with qualitative opinions on teams. We give credit to teams who win, and reduce our opinion of those who lose. Elo is also a zero-sum game: the team who wins gains exactly as much credit as the team who loses gives up. Elo can also be modified such that underdogs get more credit for a win over a favoured opponent and favourites gain less for beating up on uncompetitive opponents.
Elo is also in many ways more expressive than win-loss columns alone. Win-loss columns reduce each result to a single bit of information: did a team lose (zero points), or did they win (one point)? In case of ties this has to be expanded to allow for half points. In comparison, Elo starts with every team at the same initial point total. Then for every matchup this point total is increased on a win or decreased on a loss. The amount of this change starts with a standard value, which is then adjusted according to how favoured the competitor was to win or lose and by how many points the competitor won or lost. Rather than a single value, the multi-point total earned or lost expresses more information about the result of the competition.
What are the basics?
Every team begins with the same initial Elo rating. (Mathematically, the specific starting value does not matter in any way. However, it is nice visually to have one sufficiently positive that low performing teams don’t all of a sudden have negative values. You could start at zero, if you wanted, or even one million. However, in practice this is avoided.)
Additionally, we will give every game a value of $K$. Therefore, before any other factors, a team who wins will gain $K$ points and the other team will lose $K$ points. For example, if we have two teams $A$ and $B$ competing with ratings of $Elo_A$ and $Elo_B$, then if $A$ wins then $Elo_A^{new} = Elo_A + K$ and $Elo_B^{new} = Elo_B - K$. If they tie, their ratings would remain unchanged.
If we want to clean up this formula, $Elo^{new} = Elo^{old} + K \times WL$, where $WL = 1$ if the team won or $WL = -1$ if the team lost (and $WL = 0$ for a tie).
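As a rough illustration, the basic win/loss update described above could be sketched in Python as follows. The function name `basic_update`, the game value of 20, and the starting ratings of 1500 are assumptions chosen for the example, not values taken from this article.

```python
def basic_update(rating, k, wl):
    """Return a rating after one game.

    wl is +1 for a win, -1 for a loss and 0 for a tie.
    """
    return rating + k * wl


# Assumed values, for illustration only.
K = 20
elo_a, elo_b = 1500, 1500

# Team A beats team B: A gains K points and B loses K points.
new_a = basic_update(elo_a, K, +1)
new_b = basic_update(elo_b, K, -1)

print(new_a, new_b)  # 1520 1480
```

Because one team's gain is exactly the other team's loss, the sum of the two ratings is the same before and after the game.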
Why choose $K$? In many ways $K$ has more influence than simply being the value to adjust a team’s rating by. It affects how responsive a team’s rating is to an individual competition event. The larger the value, the greater the fluctuation. In European football, different levels of events are given different $K$ values, which attempt to express the rarity of the competition and hopefully how seriously the participating country takes the event. For example, in the World Football Elo Ratings, higher rated events such as World Cup finals get a value of $K = 60$, while friendlies are given $K = 20$.
Win expectancy is a measure of the odds that one team beats another. More particularly, it is the percentage chance that one team wins. For example, if the game was flipping a coin, then each team would have $50\%$ odds. A favoured team will have a greater number approaching $100\%$ and an underdog a value approaching $0\%$.
We will use a win expectancy value $WE$ to replace $WL$ from the existing formula. Instead of giving a team all the points indicated in $K$ we will adjust it by how the team’s win expectancy $WE$ compares to the actual win/loss result $R$. A team that has no chance of losing, even if they didn’t show up, would have a win expectancy of $WE = 1$, and a team that has no chance of winning would have $WE = 0$. A team who wins gets full value of a win, $R = 1$; a team that loses no value, $R = 0$; and a team that ties half value, $R = 0.5$.
We now determine the relative amount of points given to each party in the competition based on how their result compares to how they were expected to finish. Two even teams would have a $WE = 0.5$, and therefore the winner would receive $50\%$ of the points and the loser would lose $50\%$ of $K$. A favoured team with, say, $WE = 0.75$ who wins would get $25\%$ of $K$, while a winning underdog with $WE = 0.25$ would get $75\%$ of $K$. Inversely, a losing underdog only loses $25\%$ of the points and a losing favourite loses $75\%$ of the points.
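If we want to clean up this formula:
$$Elo^{new} = Elo^{old} + K \times (R - WE), \qquad WE = \frac{1}{10^{-\Delta Elo / 400} + 1}$$
where $R$ is the actual result, $WE$ is the win expectancy, and the differential in Elo values is
$$\Delta Elo = Elo_{team} - Elo_{opponent}$$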
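A minimal Python sketch of the win expectancy and the adjusted update follows. It assumes the conventional Elo logistic curve with base 10 and a scale of 400; the function names, the game value of 20, and the example ratings are hypothetical choices for illustration.

```python
def win_expectancy(elo_diff):
    """Chance of winning given a rating differential (own minus opponent)."""
    return 1.0 / (10 ** (-elo_diff / 400) + 1.0)


def update(rating, k, result, expectancy):
    """result is 1 for a win, 0.5 for a tie and 0 for a loss."""
    return rating + k * (result - expectancy)


K = 20  # assumed game value, for illustration only

# An underdog rated 100 points below its opponent pulls off the upset.
we_underdog = win_expectancy(-100)
we_favourite = win_expectancy(100)
underdog_gain = update(1450, K, 1, we_underdog) - 1450
favourite_loss = update(1550, K, 0, we_favourite) - 1550

print(round(we_underdog, 3), round(underdog_gain, 2), round(favourite_loss, 2))
```

The two win expectancies always sum to one, so the underdog's gain and the favourite's loss cancel out and the system stays zero-sum.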
Home field advantage?
So far we have adjusted for one team being considered a favourite. However, intuitively and statistically we know there is also an advantage for playing a game at home, be it travel, sleep, timezones, locker rooms, or other issues. To account for this we adjust the differential up by $HFA$ points for a team at home and down by the same amount for a road team. As a result
$$\Delta Elo = Elo_{team} - Elo_{opponent} \pm HFA$$
For some context, the rule of thumb is that $65$ Elo points is worth about $2.6$ points scored in an NFL game. From this you should be able to extrapolate that every $25$ Elo points is worth a single in-game point. For example, a differential of $\Delta Elo$ points is a theoretical point spread of $\Delta Elo / 25$ points.
This adjustment is pulled from the fivethirtyeight.com Q/A.
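The home field adjustment and the point spread rule of thumb could be sketched as below. The constants of 65 Elo points for home field and 25 Elo points per game point are the NFL values published by fivethirtyeight.com and are assumed here for illustration; the function names and the `site` argument are hypothetical.

```python
HFA = 65            # assumed home field advantage in Elo points (NFL value)
ELO_PER_POINT = 25  # assumed Elo points per point on the scoreboard


def adjusted_diff(elo_team, elo_opponent, site):
    """Rating differential for a team playing at 'home', 'away' or 'neutral'."""
    bonus = {"home": HFA, "away": -HFA, "neutral": 0}[site]
    return elo_team - elo_opponent + bonus


def point_spread(elo_diff):
    """Translate a rating differential into a theoretical point spread."""
    return elo_diff / ELO_PER_POINT


# Two equally rated teams: the home side is favoured by home field alone.
diff = adjusted_diff(1500, 1500, "home")
print(diff, point_spread(diff))  # 65 2.6
```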
What is left? Margin of Victory
We still have a final value to account for. This is the amount of points a team wins or loses by, also known as margin of victory. Often when a favourite wins, the point differential gets out of hand for more reasons than the competitive differential of the teams. Think of a top-five team from a Power Five college football conference playing a barely mid-major team. What we want to do is use a multiplier to adjust for the result of the game.
This multiplier will have two parts. The first will apply decreasing returns on the point total as the score differential gets larger. A well-suited math function is the natural logarithm $\ln$. The second is a multiplier that decreases when the Elo of the winner is larger than that of the loser and increases when the Elo of the loser is larger than that of the winner.
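For the first part we have the formula
$$\ln(|PD| + 1)$$
where $PD$ is the point differential of the game. A tie game would then be $\ln(0 + 1) = 0$, which results in a multiplier of zero. A single field goal difference is $\ln(3 + 1) \approx 1.39$ and a single touchdown difference is $\ln(7 + 1) \approx 2.08$. Note, we can see the decreasing returns for win point differential by $\ln(7 + 1) \approx 2.08$, $\ln(14 + 1) \approx 2.71$, $\ln(21 + 1) \approx 3.09$, and $\ln(28 + 1) \approx 3.37$.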
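For the second, we adjust based on the teams’ Elo differential before home and away adjustment. Following the fivethirtyeight.com formula, the result is
$$\frac{2.2}{(Elo_W - Elo_L) \times 0.001 + 2.2}$$
where $Elo_W$ and $Elo_L$ are the ratings of the winner and the loser. This multiplier starts at $1$ for evenly rated teams and decreases as the winner’s Elo advantage grows.
The accumulated multiplier is
$$MOV = \ln(|PD| + 1) \times \frac{2.2}{(Elo_W - Elo_L) \times 0.001 + 2.2}$$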
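The two-part margin of victory multiplier could be sketched in Python as below. The constants 2.2 and 0.001 are those of the fivethirtyeight.com NFL formula and are assumed to carry over; the function name `mov_multiplier` and the example ratings are hypothetical.

```python
import math


def mov_multiplier(point_diff, elo_winner, elo_loser):
    """Margin of victory multiplier: log of the margin times a rating damper."""
    # Decreasing returns: each extra point of margin adds less than the last.
    log_part = math.log(abs(point_diff) + 1)
    # Drops below 1 when the winner was the higher rated team,
    # rises above 1 when the underdog won.
    damper = 2.2 / ((elo_winner - elo_loser) * 0.001 + 2.2)
    return log_part * damper


# Evenly rated teams, so only the logarithm matters.
for margin in (3, 7, 14, 21, 28):
    print(margin, round(mov_multiplier(margin, 1500, 1500), 3))

# Same one-touchdown margin, favourite winning versus underdog winning.
print(round(mov_multiplier(7, 1600, 1500), 3))
print(round(mov_multiplier(7, 1500, 1600), 3))
```

The damper keeps a strong favourite from inflating its rating simply by running up the score against a weak opponent.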
Sounds like a lot of math.
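Here’s a neutral example: two average teams at a neutral site, with one team winning by a single touchdown.
We will have two teams, a winner $W$ and a loser $L$.
Both teams begin with the same average Elo, so $Elo_W = Elo_L$.
As a result we have a differential of $\Delta Elo = 0$.
A neutral site game means $HFA = 0$.
The resulting win expectancy is $WE = 0.5$.
The winning team will then get $50\%$ of $K \times MOV$ (the game value $K$ times the margin of victory multiplier $MOV$) and the losing team will lose $50\%$ of $K \times MOV$.
The value of $MOV$ itself is $\ln(7 + 1) \approx 2.08$, since the Elo part of the multiplier has no effect when the ratings are equal.
The winner will then have $Elo_W + 0.5 \times K \times 2.08 \approx Elo_W + 1.04K$,
while the loser will have $Elo_L - 1.04K$.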
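The worked example above can be run end to end with the sketch below. The game value of 20 and the average rating of 1500 are assumptions for illustration, as are the constants 400, 2.2 and 0.001 (taken from the fivethirtyeight.com NFL formulation); all function names are hypothetical.

```python
import math

K = 20              # assumed game value
AVERAGE_ELO = 1500  # assumed average rating
HFA = 0             # neutral site, so no home field adjustment


def win_expectancy(elo_diff):
    return 1.0 / (10 ** (-elo_diff / 400) + 1.0)


def mov_multiplier(point_diff, elo_winner, elo_loser):
    return math.log(abs(point_diff) + 1) * (
        2.2 / ((elo_winner - elo_loser) * 0.001 + 2.2)
    )


elo_w = elo_l = AVERAGE_ELO
we_w = win_expectancy(elo_w - elo_l + HFA)  # 0.5 for even teams
mov = mov_multiplier(7, elo_w, elo_l)       # one touchdown margin

# The winner's result is 1, so it gains K * MOV * (1 - WE);
# the loser gives up the same amount.
change = K * mov * (1 - we_w)
new_w = elo_w + change
new_l = elo_l - change

print(round(mov, 3), round(new_w, 1), round(new_l, 1))  # 2.079 1520.8 1479.2
```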
To account for the personnel turnover of the off-season, a team’s Elo is regressed towards the average value by a third.
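The off-season regression could be sketched as follows. The league average of 1500 is an assumption for illustration, and the team names, ratings, and function name `regress_to_mean` are hypothetical.

```python
def regress_to_mean(rating, mean=1500, fraction=1 / 3):
    """Pull a rating part of the way back toward the league average."""
    return rating - fraction * (rating - mean)


# Hypothetical end-of-season ratings.
ratings = {"Team A": 1650, "Team B": 1500, "Team C": 1410}

# Every team closes one third of its gap to the average.
new_season = {team: regress_to_mean(elo) for team, elo in ratings.items()}
print({team: round(elo, 1) for team, elo in new_season.items()})
```

Under these assumptions a team that finished at 1650 would open the next season at about 1600, while a team at 1410 would move up to about 1440.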
Significant CFL Elo Values
|Benchmark|Elo rating|
|---|---|
|Top 0.1% All-Time|1750|
|Top 1% All-Time|1700|
|Top 5% All-Time|1650|
|Average Grey Cup Team|1600|
|Average Conference Finals Team|1575|
|Average Conference Semi-Finals Team|1525|