Analyzing Premier League Performance so far

After yet another round of matches this weekend, Liverpool were unable to beat a 10-man Newcastle, and have slipped to third spot, with Chelsea going ahead of them on goal difference. Arsenal thumped Norwich to go clear at the top of the table. Manchester United continued to flounder, drawing at home to Southampton, who are the most improved team this season, compared to the last.

Now, the problem is that each team has a different fixture list. Some teams (such as Manchester United) have had an insanely tough set of fixtures so far this season. Others such as Arsenal have had it quite easy (the eight games Arsenal have played this season have all been fixtures that they won last year!). How do we account for this relative ease in fixtures to see how well teams have been performing?

In chess, one of the popular tie breaker methods used for “Swiss League” tournaments is called the “Solkoff method”. According to this method, the tie breaker score for each player is the sum of points scored by all his opponents. In a Swiss League, each player plays against a different set of players, so a higher Solkoff score means a player has played his games against tougher opponents, and has hence done better than someone else with the same points tally but who has played weaker opponents. The question is whether we can use these principles to evaluate football teams at this point in the season.

I propose what I call the “Modified Solkoff” score. Here, we not only take into account the total points of each opponent of a team, but also the result of the game against that particular opponent: each opponent’s points tally is multiplied by the points the team took from that game (3 for a win, 1 for a draw, 0 for a loss), and the sum is then normalized by the total points scored by all of the team’s opponents. Take Arsenal for example. Their opponents so far this season have a total of 69 points as of today. Of the eight games they’ve played, Arsenal have lost to Aston Villa and drawn at West Brom. So the numerator of Arsenal’s Modified Solkoff score becomes 0 * Aston Villa’s points (10) + 1 * West Brom’s points (10) + 3 * total points of all their other opponents (49), which amounts to 157. This is then normalized by the total points tally of their opponents so far (69) and we get Arsenal’s normalized Modified Solkoff score of 2.28. You can see that the maximum possible Modified Solkoff score is 3 (if a team has won all its games) and the minimum is 0 (losing all games). The higher the Modified Solkoff score the better (better performance against better opponents).
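For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation described above. The function name `modified_solkoff` and the input format (a list of opponent-points and result-points pairs) are my own choices for illustration. Since only Aston Villa’s and West Brom’s tallies are quoted above, the six opponents Arsenal beat are lumped into a single entry worth 69 - 10 - 10 = 49 points, which gives the same result because all six games were wins. Returning 0 when the opponents have no points at all is also an assumption, since the score is undefined in that case.

```python
def modified_solkoff(games):
    """Normalized Modified Solkoff score.

    games: list of (opponent_points, result_points) pairs, where
    result_points is 3 for a win, 1 for a draw and 0 for a loss.
    """
    total_opponent_points = sum(opp for opp, _ in games)
    if total_opponent_points == 0:
        # Assumption: treat the undefined 0/0 case as a score of 0.
        return 0.0
    weighted = sum(opp * res for opp, res in games)
    return weighted / total_opponent_points


# Arsenal as of 21st October 2013, using the figures quoted in the post.
arsenal = [
    (10, 0),  # lost to Aston Villa (10 points)
    (10, 1),  # drew at West Brom (10 points)
    (49, 3),  # six wins; these opponents' points lumped together
]

arsenal_score = modified_solkoff(arsenal)
print(round(arsenal_score, 2))  # 157 / 69, prints 2.28
```

In a real run you would pass one pair per game rather than a lumped entry, so that the same list can be reused when an opponent’s points tally changes after each matchday.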

This is what the Modified Solkoff table looks like as of today (21st October 2013). Arsenal may not have played the toughest opponents but the fact that they have won so many of their games means that they are on top. They are interestingly followed by Manchester City and then Southampton. Manchester United is buried somewhere in the bottom half of the table:


It is also interesting to note that Sunderland is ahead of Crystal Palace at the bottom of the table. This is due to the fact that Palace’s only points so far have come against Sunderland, while Sunderland earned their point from a draw with high-flying Southampton.

This also shows that Liverpool’s early season highs have come on the back of wins against relatively weaker teams (it doesn’t help their cause that Manchester United is classified as a “weak team” thanks to their performance so far), and thus their early season table topping is unlikely to last.

Let me know in the comments what you think of this method of computing a normalized score based on a team’s opponents so far.

PS: This table will be regularly updated (after each “matchday”), so if you are reading this after October, some of the notes may not match what is there in the table.

2 thoughts on “Analyzing Premier League Performance so far”

  1. How sure are you about your prediction? It’s as good as someone making a random guess! In other words, how “reliable” is it? For a start, maybe you can check the performance of the method (whatever fancy name you want to call it) for at least the last 3 seasons and see whether your predictions are in tune with the eventual final rankings.

    My suggestions:
    1) Run it at different stages – maybe once a month – and see when your method begins to resemble the final table.
    2) Quantify the error at each stage and see how that varies.
    3) This is after all a statistical model. So consider developing a continuous feedback system where your model absorbs the error obtained at each stage and uses it to recalibrate itself.
    4) Leave alone commercializing this model. If you even want anybody to take you seriously, you might want to calibrate your model with as much data as available – In this case, 21 years (since 1992-93). A better error estimate gives you a better reliability model.
    5) This might seem a little personal but it’s essential. I understand professionalism includes good presentation skills, but the “How?” is absolutely useless if the “What?” is mediocre. In other words, nobody bothers about how well you presented your plots and in what form you presented them if you are incompetent in the “data analysis” part of it. (You’re calling yourself a “Data Scientist” after all!)
