Duckworth Lewis and Sprinting a Marathon

How would you like it if you were running a marathon and someone were to set you targets for every 100 meters? “Run the first 100m in 25 seconds. The second in 24 seconds” and so on? It is very likely that you would hate the idea. You would argue that the idea of the marathon would be to finish the 42-odd km within the target time you have set for yourself and you don’t care about any internal targets. You are also likely to argue that different runners have different running patterns and imposing targets for small distances is unfair to just about everyone.

Yet this is exactly what cricketers are asked to do in games that are likely to be affected by rain. The Duckworth Lewis method, which has been used to adjust targets in rain-affected matches since 1999, assumes an average “scoring curve” according to which a team scores runs during its innings. It’s basically an extension of the old rule of thumb that a team is likely to score as many runs in the last 20 overs as it does in the first 30, but D/L also takes into account wickets lost. (This is the major innovation of D/L: earlier rain rules such as run rate or highest-scoring overs didn’t take wickets lost into consideration.)

The basic innovation of D/L is that it is based on “resources”. With 50 overs to go and 10 wickets in hand, a team has 100% of its resources. As a team uses up overs and loses wickets, its resources are correspondingly depleted. D/L extrapolates based on the resources left when the innings is cut short. Suppose, for example, that a team scores 100 in 20 overs for the loss of 1 wicket, and the match has to be curtailed right then. What would the team have scored at the end of 50 overs? According to the 2002 version of the D/L table (the first that came up when I googled), after 20 overs and the loss of 1 wicket, a team still has 71.8% of its resources left. Essentially, the team has scored 100 runs using 28.2% (100 – 71.8) of its resources. So at the end of the innings the team would be expected to score 100 * 100 / 28.2 = 354.6, or about 355.
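The extrapolation above can be written out as a few lines of Python. This is only a sketch of the arithmetic in the example: the function name projected_total is made up for illustration, and 71.8 is the table value quoted above rather than something the code looks up.

```python
def projected_total(runs_so_far: float, resources_left_pct: float) -> float:
    """Scale the current score up to 100% of resources (illustrative helper)."""
    resources_used_pct = 100.0 - resources_left_pct
    if resources_used_pct <= 0:
        raise ValueError("no resources used yet, so nothing to extrapolate from")
    return runs_so_far * 100.0 / resources_used_pct


# 100 runs after 20 overs, 1 wicket down, 71.8% of resources left (figure quoted above)
print(round(projected_total(100, 71.8), 1))  # 354.6
```

Note that the projection depends only on the score and the resource percentage. Nothing about how the team was planning to bat enters the calculation.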

How did Duckworth and Lewis arrive at these values for resource depletion? By simple regression, based on historical games. To simplify, they look at all historical games in which the team had lost 1 wicket at the end of 20 overs, take the ratio of the final score to the 20-over score in those games, and use that to arrive at the “resource score”.
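A toy version of that simplified description might look like the following. It is a sketch of the simplification only, not the actual D/L fitting procedure, and the four (20-over score, final score) pairs are invented for illustration rather than taken from real matches.

```python
from statistics import mean

# Hypothetical games, all 1 wicket down after 20 overs: (score at 20 overs, final score).
# These numbers are made up; they are not real match data.
past_games = [(100, 250), (80, 280), (90, 270), (120, 360)]


def resources_left_pct(games):
    """Estimate resources left at the cut-off from final-to-partial score ratios."""
    mean_ratio = mean(final / partial for partial, final in games)
    resources_used_pct = 100.0 / mean_ratio  # on average, this share of the total is already scored
    return 100.0 - resources_used_pct


print(f"{resources_left_pct(past_games):.1f}% of resources left")
```

Because the estimate is an average over every game in the sample, a team that typically triples its 20-over score and one that barely doubles it end up sharing the same resource figure.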

To understand why this is inherently unfair, let us take into consideration the champions of the first two World Cups that I watched. In 1992, Pakistan followed the principle of laying a solid foundation and then exploding in the latter part of the innings. A score of 100 in 30 overs was considered acceptable, as long as the team hadn’t lost too many wickets. And with hard hitters such as Inzamam-ul-Haq and Imran Khan in the lower order, they would have more than doubled that score by the end of the innings. In fact, most teams followed a similar strategy in that World Cup (New Zealand was a notable exception, using Mark Greatbatch as a pinch-hitter. India also tried that approach in two games, sending Kapil Dev to open).

Four years later in the subcontinent the story was entirely different. Again, there were teams that followed the approach of a slow build-up and late acceleration, but the winners, Sri Lanka, turned that formula on its head. Test opener Roshan Mahanama batted at seven, with the equally dour Hashan Tillekeratne preceding him. At the top were the explosive pair of Sanath Jayasuriya and Romesh Kaluwitharana. The idea was to exploit the field restrictions of the first 15 overs, and then bat on at a steady pace. It wasn’t unlikely in that setup that more runs would be scored in the first 25 overs than in the last 25.

Duckworth-Lewis treats both strategies alike. The D/L regression contains matches from both the 1992 and 1996 World Cups. It includes matches where pinch hitters have dominated, and matches with a slow build-up and a late slog. And the “average scoring curve” that it arrives at probably doesn’t represent either, since it is an average based on all games played. 100/2 after 30 overs would have been an excellent score for Pakistan in 1992, but for Sri Lanka in 1996 the same score would have represented a spectacular failure. D/L, however, treats them equally.
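To make the averaging problem concrete, here is a small sketch. The two scoring profiles are hypothetical numbers chosen only to resemble a slow-start side and a fast-start side; they are not actual 1992 or 1996 figures, and the sketch ignores wickets entirely.

```python
# Hypothetical share of the final total already scored at the 30-over mark.
slow_start_share = 100 / 220  # build a base, then slog at the end
fast_start_share = 170 / 260  # attack the field restrictions, then bat steadily

# A single "average curve" fitted across both kinds of team.
average_share = (slow_start_share + fast_start_share) / 2


def projected_from_30_overs(score_at_30: float, share_by_30: float) -> float:
    """Project a final total from the 30-over score and an assumed scoring profile."""
    return score_at_30 / share_by_30


profiles = [
    ("slow-start profile", slow_start_share),
    ("fast-start profile", fast_start_share),
    ("average curve", average_share),
]
for label, share in profiles:
    total = projected_from_30_overs(100, share)
    print(f"{label}: 100 after 30 overs projects to {total:.0f}")
```

Under these assumed profiles the average curve undersells the slow starters and flatters the fast starters, which is the sense in which it represents neither.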

So now you have the situation that if you know that a match is likely to be affected by rain, you (the team) have to abandon your natural game and instead play according to the curve. D/L expects you to score 5 runs in the first over? Okay, send in batsmen who are capable of doing that. You find it tough to score off Sunil Narine, and want to simply play him out? Can’t do, for you need to score at least 4 in each of his overs to keep up with the D/L target.

The much-touted strength of D/L is that it allows you to account for multiple rain interruptions and mid-innings breaks. At a more philosophical level, though, this is also its downfall. Because you now have a formula that micromanages and tells you what you should ideally be doing on every ball (as Kieron Pollard and the West Indies found out recently, simply going by over-by-over targets will not do), you are bound to play by the formula rather than the way you want to play the game.

There are a few other shortcomings with D/L, which are a result of its being a product of regression. It doesn’t take into account who has bowled or who has batted. Suppose you are the fielding captain and you know, given the conditions and forecasts, that there is likely to be a long rain delay after 25 overs of batting, after which the match is likely to be curtailed. You have three excellent seam bowlers who can take good advantage of the overcast conditions. Their backup is not so strong. So you now play for the rain break and choose to bowl out your best bowlers before that! Similarly, D/L doesn’t take into account the impact of powerplay overs. So if you are the batting captain, you want to take the batting powerplay ASAP, before the rain comes down!

D/L is a good system, no doubt; otherwise it would not have survived for 14 years. However, it creates a game that is unfair to both teams, and forces them to play according to a formula. We can think of alternatives that overcome some of the shortcomings (for example, I’ve developed a Monte Carlo simulation-based system which can take into account powerplays and the bowling out of the strongest bowlers). Nevertheless, as long as we have a system that can extrapolate after every ball, we will always have an unfair game, where teams have to play according to a curve. D/L encourages short-termism, at the cost of planning for the full quota of overs. This cannot be good for the game. It is like setting 100m targets for a marathon runner.

PS: The same arguments I’ve made here against the D/L apply to its competitor the VJD Method (pioneered by V Jayadevan of Thrissur) also.