Simulating Covid-19 Scenarios

I must warn that this is a super long post. Also, I wonder if I should put this on Medium in order to get more visibility.

Most models of disease spread use what is known as a “SIR” framework. This Numberphile video gives a good primer on the framework.

The problem with the framework is that it’s too simplistic. It depends primarily on one parameter, “R0”, which is the average number of people that each infected patient infects. When R0 is high, each patient infects many other people, and the disease spreads fast. With a low R0, the disease spreads slowly. It was the SIR model that was used to produce all those “flatten the curve” pictures that we were bombarded with a week or two back.

There is a second parameter as well – the recovery or removal rate. Some diseases are so lethal that they have a high removal rate (e.g. Ebola), and this puts a natural limit on how much the disease can spread, since infected people die before they can infect too many others.

In any case, such modelling is great for academic studies and post-facto analyses, where R0 can be estimated. As we are currently in the middle of an epidemic, this kind of simplistic modelling can’t take us far. Nobody has a clue yet what the R0 for Covid-19 is. Nobody knows what proportion of total cases are asymptomatic. Nobody knows the mortality rate.

And things are changing well-at-a-faster-rate. Governments are imposing distancing of various forms. First offices were shut down. Then shops were shut down. Now everything is shut down, and many of us have been asked to step out “only to get necessities”. And in such dynamic and fast-changing environments, a simplistic model such as the SIR can only take us so far, and uncertainty in estimating R0 means it can be pretty much useless as well.

In this context, I thought I’d simulate a few real-life situations, and try to model the spread of the disease in each of them. This can give us an insight into which kinds of services are more dangerous than others, and how we could potentially “get back to life” after going through an initial period of lockdown.

The basic assumption I’ve made is that the longer you spend with an infected person, the greater the chance of getting infected yourself. This is not an unreasonable assumption because the spread happens through activities such as sneezing, touching, inadvertently dropping droplets of your saliva on to the other person, and so on, each of which is more likely the longer the time you spend with someone.

Some basic reasoning suggested that this can be modelled as a sort of negative exponential curve that looks like this.

p = 1 - e^{-\lambda T}

T is the number of hours you spend with the other person. \lambda is a parameter of transmission – the higher it is, the more likely the disease will be transmitted (holding the amount of time spent together constant).

The function looks like this: 

We have no clue what \lambda is, but I’ll make an educated guess based on some limited data I’ve seen. I’ll take a conservative estimate and say that if an uninfected person spends 24 hours with an infected person, the former has a 50% chance of getting the disease from the latter.

Solving 1 - e^{-24\lambda} = 0.5 gives \lambda = \frac{\ln 2}{24} \approx 0.02888 per hour. We will now use this to model various scenarios.
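As a quick sanity check, here is how that number falls out (a minimal sketch in Python; the helper function name is mine, not from the code linked at the end of this post):

```python
import math

# Assumption from above: 24 hours with an infected person => 50% chance of infection,
# so 1 - exp(-24 * lam) = 0.5, i.e. lam = ln(2) / 24.
LAMBDA = math.log(2) / 24              # ~0.02888 per hour

def p_transmission(hours_together, lam=LAMBDA):
    """Probability of catching the infection after `hours_together` hours with one infected person."""
    return 1 - math.exp(-lam * hours_together)

print(p_transmission(24))       # 0.5, by construction
print(p_transmission(2 / 60))   # ~0.001, i.e. about 0.1% for a two-minute interaction
```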

1. Delivery

This is the simplest model I built. There is one shop, and N customers.  Customers come one at a time and spend a fixed amount of time (1 or 2 or 5 minutes) at the shop, which has one shopkeeper. Initially, a proportion p of the population is infected, and we assume that the shopkeeper is uninfected.

And then we model the transmission – based on our \lambda = 0.02888, for a two-minute interaction, the probability of transmission is 1 - e^{-\lambda T} = 1 - e^{-0.02888 \times \frac{2}{60}} \approx 0.1%.

In hindsight, I realised that this kind of a set-up better describes “delivery” than a shop. With a 0.1% probability, the delivery person gets infected from an infected customer during a delivery. With the same probability, an infected delivery person infects a customer. The only way the disease can spread through this “shop” is for the delivery person to first get infected, and then pass it on to subsequent customers.

How does it play out? I simulated 10000 paths where one guy delivers to 1000 homes (maybe over the course of a week? that doesn’t matter, as long as the overall infection rate in the population is otherwise constant), and spends exactly two minutes at each delivery, which is made to a single person. Let’s take a few cases, with different base rates of incidence of the disease – 0.1%, 0.2%, 0.5%, 1%, 2%, 5%, 10%, 20% and 50%.
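Roughly, the simulation works as sketched below (a simplified reconstruction with made-up names; the full, unedited code is linked at the end of this post):

```python
import math
import random

LAMBDA = math.log(2) / 24   # per hour, as derived above

def simulate_delivery(n_customers=1000, minutes=2, base_rate=0.05, n_paths=10000):
    """One initially uninfected delivery person visits n_customers homes.
    Returns, for each Monte Carlo path, the number of NEW infections caused."""
    p = 1 - math.exp(-LAMBDA * minutes / 60)   # ~0.1% for a two-minute interaction
    results = []
    for _ in range(n_paths):
        carrier_infected = False
        new = 0
        for _ in range(n_customers):
            customer_infected = random.random() < base_rate
            if customer_infected and not carrier_infected and random.random() < p:
                carrier_infected = True   # the delivery person picks it up
                new += 1
            elif carrier_infected and not customer_infected and random.random() < p:
                new += 1                  # an uninfected customer catches it
        results.append(new)
    return results

new_infections = simulate_delivery(base_rate=0.5)   # even at 50% base incidence
```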

The number of NEW people infected in each case is graphed here (we don’t care how many got the disease otherwise; we’re modelling how many got it from our “shop”). The right-hand graph excludes the case of zero new infections, just to show you the scale of the problem.

Notice this – even when 50% of the population is infected, as long as the shopkeeper or delivery person is not initially infected, the chances of additional infections through 2-minute delivery are MINUSCULE. A strong case for policy-makers to enable delivery of all kinds, essential or inessential.

2. Shop

Now, let’s complicate matters a little bit. Instead of a delivery person going to each home, let’s assume a shop. Multiple people can be in the shop at the same time, and there can be more than one shopkeeper.

Let’s use the assumptions of standard queueing theory, and assume that the inter-arrival time for customers follows an Exponential distribution, as does the time each customer spends in the shop.

While customers are in the shop, any infected customer (or shopkeeper) inside can infect any other customer or shopkeeper. So if you spend 2 minutes in a shop where there is 1 infected person, our calculation above tells us that you have a 0.1% chance of being infected yourself. If there are 10 infected people in the shop and you spend 2 minutes there, this is akin to spending 20 minutes with one infected person, and you have a roughly 1% chance of getting infected.
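The arithmetic here is simply that exposure adds up in the exponent – spending time T with k infected people is equivalent to spending time kT with one infected person:

p = 1 - e^{-k \lambda T} \approx k \lambda T \text{ (for small } k \lambda T)

For k = 10 and T = 2 minutes, this gives 1 - e^{-10 \times 0.02888 \times \frac{2}{60}} \approx 0.96%, roughly ten times the single-person figure.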

Let’s consider a couple of scenarios here. First is the “normal” case, where one customer arrives every 5 minutes on average, and each customer spends 10 minutes in the shop (note that the shop can “serve” multiple customers simultaneously, so the queue doesn’t blow up here). Again, let’s take a total of 1000 customers (assume a 24/7 open shop), and one shopkeeper.
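Here is a simplified, minute-by-minute sketch of one path of this queueing simulation (made-up variable names; the numbers quoted below come from the full code linked at the end of this post):

```python
import math
import random

LAMBDA = math.log(2) / 24            # per hour, as derived above
P_MIN = 1 - math.exp(-LAMBDA / 60)   # per-minute transmission probability from one infected person

def simulate_shop(n_customers=1000, mean_gap=5, mean_stay=10,
                  base_rate=0.05, n_shopkeepers=1):
    """One Monte Carlo path of the shop. Customers arrive with Exponential
    inter-arrival times and stay for Exponential durations; each minute, every
    uninfected person inside can catch the infection from each infected person
    present. Returns the number of NEW infections caused inside the shop."""
    customers, t = [], 0.0
    for _ in range(n_customers):
        t += random.expovariate(1 / mean_gap)                    # arrival time, in minutes
        customers.append([t, t + random.expovariate(1 / mean_stay),
                          random.random() < base_rate])          # [arrive, depart, infected?]
    shopkeepers = [False] * n_shopkeepers                        # initially uninfected
    new_infections = 0
    for minute in range(int(max(c[1] for c in customers)) + 1):
        inside = [c for c in customers if c[0] <= minute < c[1]]
        n_infected = sum(c[2] for c in inside) + sum(shopkeepers)
        if n_infected == 0:
            continue
        p_catch = 1 - (1 - P_MIN) ** n_infected                  # exposure to everyone infected inside
        for c in inside:
            if not c[2] and random.random() < p_catch:
                c[2] = True
                new_infections += 1
        for i in range(n_shopkeepers):
            if not shopkeepers[i] and random.random() < p_catch:
                shopkeepers[i] = True
                new_infections += 1
    return new_infections

print(simulate_shop(base_rate=0.05))   # one path with 5% of the population initially infected
```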


Notice that there is significant transmission of infection here, even though we started with only 5% of the population being infected. On average, another 3% of the population gets infected! Open supermarkets with the usual crowds can result in significant transmission.

Does keeping the shop open with some sort of social distancing (let’s say only one-fourth as many people arrive) work? So people arrive with an average gap of 20 minutes, and still spend 10 minutes in the shop. There are still 10 shopkeepers. What does it look like when we start with 5% of the people being infected?

The graph is pretty much identical so I’m not bothering to put that here!

3. Office

This scenario simulates N people who are working together for a certain number of hours. We assume that exactly one person is infected at the beginning of the meeting. We also assume that once a person is infected, she can start infecting others from the very next minute (with our transmission probability).

How does the infection grow in this case? This is an easier simulation than the earlier one, so we can run 10000 Monte Carlo paths. Let’s say we have a “meeting” with 40 people (it could just be 40 people working in a small room) which lasts 4 hours. If we start with one infected person, this is how the number of infected grows over the 4 hours.
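A simplified sketch of this meeting simulation (the graphs come from the full code linked at the end of this post):

```python
import math
import random

LAMBDA = math.log(2) / 24
P_MIN = 1 - math.exp(-LAMBDA / 60)   # per-minute transmission probability from one infected person

def simulate_meeting(n_people=40, hours=4, n_paths=10000):
    """Everyone shares one room; each minute, every uninfected person can catch
    the infection from each currently infected person. Returns the number of
    people infected at the end of each path (starting from exactly one)."""
    results = []
    for _ in range(n_paths):
        infected = 1
        for _ in range(int(hours * 60)):
            p_catch = 1 - (1 - P_MIN) ** infected
            # each of the (n_people - infected) susceptible people faces the same exposure this minute
            infected += sum(random.random() < p_catch for _ in range(n_people - infected))
        results.append(infected)
    return results

paths = simulate_meeting(n_people=10, hours=8)   # a small 10-person office over a working day
print(sum(paths) / len(paths) - 1)               # average number of OTHERS infected by day's end
```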


The spread is massive! When you have a large bunch of people in a small closed space over a significant period of time, the infection spreads rapidly among them. Even if you take a 10-person meeting over an hour, one infected person at the start results in an average of 0.3 other people being infected by the end of the meeting.

Ten people meeting over 8 hours (a small office), with one initially infected, means 3.5 others (on average) being infected by the end of the day.

Offices are dangerous places for the infection to spread. Even after the lockdown is lifted, some sort of work-from-home regulations need to be in place until the infection has been fully brought under control.

4. Conferences

This is another form of “meeting”, except that at each point in time, people don’t engage with the whole room, but only with a handful of others. These groups form at random, changing every minute, and infection can spread only within a particular group.

Let’s take a 100-person conference with 1 initially infected person, and assume it lasts 8 hours. Depending upon how many people come together at a time, the spread of the infection changes rapidly, as can be seen in the graph below.

If people talk two at a time, there’s a 63% probability that the infection doesn’t spread at all. If they talk 5 at a time, this probability is cut in half. And if people congregate 10 at a time, there’s only an 11% chance that by the end of the day the infection HASN’T propagated!
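Again, this is a simplified sketch of the set-up (the exact numbers above come from the full code linked at the end of this post):

```python
import math
import random

LAMBDA = math.log(2) / 24
P_MIN = 1 - math.exp(-LAMBDA / 60)   # per-minute transmission probability from one infected person

def simulate_conference(n_people=100, group_size=2, hours=8, n_paths=1000):
    """Every minute, attendees shuffle into random groups of `group_size`;
    infection spreads only within a group. Returns the total number infected
    at the end of each path (starting from exactly one infected person)."""
    results = []
    for _ in range(n_paths):
        infected = [False] * n_people
        infected[0] = True
        order = list(range(n_people))
        for _ in range(int(hours * 60)):
            random.shuffle(order)
            for g in range(0, n_people, group_size):
                group = order[g:g + group_size]
                k = sum(infected[i] for i in group)
                if k == 0:
                    continue
                p_catch = 1 - (1 - P_MIN) ** k
                for i in group:
                    if not infected[i] and random.random() < p_catch:
                        infected[i] = True
        results.append(sum(infected))
    return results

runs = simulate_conference(group_size=2)
print(sum(r == 1 for r in runs) / len(runs))   # fraction of runs where the infection didn't spread at all
```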

One takeaway from this is that even once offices start functioning, they need to impose social distancing measures (until the virus has been completely wiped out). All large-ish meetings by video conference. A certain proportion of workers working from home by rotation.

And I wonder what will happen to the conferences.

I’ve put my (unedited) code here. Feel free to use it and play around with it.

Finally, you might wonder why I’ve made so many Monte Carlo simulations. Well, as the great Matt Levine himself said, that’s my secret sauce!


The Old Shoe Theory of Relationships

When our daughter was young, some friends saw uncanny resemblances between her and me, and remarked that “Karthik could have married an old shoe and still produced a child that looks like this”, essentially implying that, at least as far as looks were concerned, the wife hadn’t contributed much (Bambi eyes apart).

Over time, the daughter has shown certain other traits that make her seem rather similar to me. For example, she has the practice of sticking her tongue out when performing tasks that require some degree of concentration. She laughs like me. Screeches like me. And makes a “burl-burl” noise with her fingers and lips like I do (admittedly the last one is taught). I’ve already written a fuller list of ways in which the daughter is similar to me.

If you are single and looking to get into a long-term gene-propagating relationship, you inevitably ask yourself the question of whether someone is “the one” for you. We have discussed this topic multiple times on this blog.

For example, we have discussed that as far as men are concerned, one thing they look for in potential partners is “consistent fuckability”. We have also discussed that whether someone is “the one” is not a symmetric question, and when you ask yourself the question, you either get “no” or “maybe” as an answer, implying that you need to use Monte Carlo algorithms. Being married to the Marriage Broker Auntie, I’m pretty sure I’ve discussed this topic on this blog several other times.

This is a tubelight post – at least two years too late (the “old shoe” comment came that long ago), but here is yet another framework you can use to determine if you want someone as your long-term gene-propagating partner. Basically, you replace yourself with an old shoe.

In other words, assume that the genes you propagate along with this person will result in kids who look like them, talk like them, act like them, and who, rather than being a “next best thing”, might just be a superior version of them. Ask yourself if you are okay with having a child who is like this, and whom you will be proud of.

This is another Monte Carlo type question, but if the answer in this case is no (you may not be particularly proud of a progeny who is exactly like the person under consideration – for whatever reason), you don’t want to risk propagating genes with this person. If the answer is yes – that you are willing to parent a child who is exactly like this counterparty – then you can seriously consider this long-term relationship.

Again, this applies if and only if you’re looking for a gene propagating relationship. If that isn’t an issue (no pun intended), then you don’t need to worry about old shoes of any kind.

Hypothesis Testing in Monte Carlo

I find it incredible, and not in a good way, that I took fourteen years to make the connection between two concepts I learnt barely a year apart.

In August-September 2003, I was auditing a graduate course on Advanced Algorithms, where we learnt about randomised algorithms (I soon stopped auditing since the maths got heavy). One important class of randomised algorithms is what is known as “Monte Carlo algorithms”. Not to be confused with Monte Carlo simulations, these are randomised algorithms that give a one-way result. So, using the most prominent example of such an algorithm, you can ask “is this number prime?” and the answer can be either “maybe” or “no”.

The randomised algorithm can never conclusively answer “yes” to the primality question. If the algorithm finds evidence that the number is composite (a “witness”), it answers “no” (this is conclusive). Otherwise it returns “maybe”. So the way you “conclude” that a number is prime is by running the test a large number of times. Each run that returns “maybe” reduces the probability that the true answer is “no” (since the runs are independent), and when the probability of “no” is low enough, you “think” it’s a “yes”. You might like this old post of mine regarding Monte Carlo algorithms in the context of romantic relationships.
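To make the one-way logic concrete, here is a small sketch of such a test (a Fermat primality test; the better-known Miller-Rabin test works on exactly the same principle):

```python
import random

def fermat_test(n, trials=20):
    """One-way Monte Carlo primality check: "no" is conclusive, "maybe" is not."""
    if n < 4:
        return "maybe" if n in (2, 3) else "no"
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:   # a is a witness: n is definitely composite
            return "no"
    return "maybe"   # every trial said "maybe"; confidence grows with the number of trials

# (Carmichael numbers can fool the Fermat test; Miller-Rabin fixes that,
#  but the one-way logic is identical.)
```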

Less than a year later, in July 2004, as part of a basic course in statistics, I learnt about hypothesis testing. Now (I’m kicking myself for failing to see the similarity then), the main principle of hypothesis testing is that you can never “accept” a hypothesis. You either reject a hypothesis or “fail to reject” it. And if you keep failing to reject a hypothesis (basically with more data, which implies more independent evaluations that don’t say “reject”), you will start thinking about “accept”.

Basically, hypothesis testing is a one-way test, where you are trying to reject a hypothesis. And not being able to reject a hypothesis doesn’t mean we necessarily accept it – there is still the chance of going wrong if we were to accept it (this is where we get into messy territory such as p-values). And this is exactly like Monte Carlo algorithms – one-way algorithms where we can only conclusively take a decision in one direction.

So I was thinking of these concepts when I came across this headline in ESPNCricinfo yesterday that said “Rahul Johri not found guilty” (not linking since Cricinfo has since changed the headline). The choice, or rather ordering, of words was interesting. “Not found guilty”, it said, rather than the usual “found not guilty”.

This is again a case of one-way testing. An investigation can either find someone guilty or it fails to do so, and the headline in this case suggested that the latter had happened. And it soon became apparent why the headline was deliberately constructed this way – it later emerged that the decision to clear Rahul Johri of sexual harassment charges was a contentious one.

In most cases, when someone is “found not guilty” following an investigation, it suggests that the evidence on hand was enough to say that the chance of the person being guilty was rather low. The phrase “not found guilty”, on the other hand, says only that this particular investigation failed to establish guilt – it doesn’t express sufficient confidence to positively clear the person.

So due credit to the Cricinfo copywriters, and due debit to the product managers for later changing the headline rather than putting out a fresh follow-up piece.

PS: The discussion following my tweet on the topic threw up one very interesting insight – that Scotland has historically had a “not proven” verdict for such cases (you can trust DD to come up with such gems).

Bankers predicting football

So the Football World Cup season is upon us, and this means that investment banking analysts are again engaging in the pointless exercise of trying to predict who will win the World Cup. And the funny thing this time is that thanks to MiFID II regulations, which prevent banking analysts from giving out reports for free, these reports aren’t in the public domain.

That means we have to rely on media reports of these reports, or on people tweeting insights from them. For example, the New York Times has summarised the banks’ predictions on the winner. And this scatter plot from Goldman Sachs will go straight into my next presentation on spurious correlations:

Different banks have taken different approaches to predicting who will win the tournament. UBS has still gone for a classic Monte Carlo simulation approach, but Goldman Sachs has gone one better and used “four different methods in artificial intelligence” to predict (for the third consecutive time) that Brazil will win the tournament.

In fact, Goldman also uses a Monte Carlo simulation, as Business Insider reports.

The firm used machine learning to run 200,000 models, mining data on team and individual player attributes, to help forecast specific match scores. Goldman then simulated 1 million possible variations of the tournament in order to calculate the probability of advancement for each squad.

But an insider at Goldman with access to the report tells me that they don’t use the phrase “Monte Carlo” itself in the report. Maybe it’s a suggestion that “data scientists” have taken over the investment research division at the expense of quants.

I’m also surprised by the reporting on Goldman’s predictions. Everyone simply reports that “Goldman predicts that Brazil will win”, but surely (based on the model they’ve used) that prediction has been made with a certain probability? A better way of reporting would’ve been to say “Goldman predicts Brazil most likely to win, with X% probability” (and the bank’s bets desk in the UK could have placed some money on it).
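Extracting that probability from a tournament Monte Carlo is trivial. Here is a toy sketch, with made-up team “strengths” and a knockout-only bracket – emphatically NOT Goldman’s model:

```python
import random
from collections import Counter

# Made-up strengths, for illustration only
STRENGTH = {"Brazil": 90, "Germany": 88, "Spain": 87, "France": 86,
            "Argentina": 85, "Belgium": 84, "England": 80, "Portugal": 79}

def match_winner(a, b):
    p_a = STRENGTH[a] / (STRENGTH[a] + STRENGTH[b])   # crude strength-ratio odds
    return a if random.random() < p_a else b

def simulate_knockout(teams):
    """Single-elimination bracket: pair teams off until one remains."""
    while len(teams) > 1:
        teams = [match_winner(teams[i], teams[i + 1]) for i in range(0, len(teams), 2)]
    return teams[0]

n = 100000
results = Counter(simulate_knockout(list(STRENGTH)) for _ in range(n))
winner, count = results.most_common(1)[0]
print(f"{winner} most likely to win, with {100 * count / n:.1f}% probability")
```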

ING went rather simple with their forecasts – they took players’ transfer values, summed them up by team, and concluded that Spain is most likely to win because their squad is the “most valued”. Now, I have two major questions about this approach. Firstly, it ignores the “correlation term” (remember the famous England conundrum of the noughties, of fitting Gerrard and Lampard into the same eleven?), and assumes that a set of strong players makes a strong team. Secondly, have they accounted for inflation? And if so, how? Player valuations (about which I have a chapter in my book) have simply gone through the roof in the last year, with Mo Salah at £35 million being considered a “bargain buy”.

Nomura also seems to have taken a similar approach, though they have in some ways accounted for the correlation term by including “team momentum” as a factor!

Anyway, I look forward to the football! That it is live on BBC and ITV means I get to watch the tournament from the comfort of my home (a luxury in England!). Also being in England means all matches are at a sane time, so I can watch more of this World Cup than the last one.


Matt Levine describes my business idea

When I was leaving the big bank I was working for (I keep forgetting whether this blog is anonymous or not, but considering that I’ve now mentioned it on my LinkedIn profile (and had people congratulate me “on the new job”), I suppose it’s not anonymous any more) in 2011, I didn’t bother looking for a new job.

I was going into business, I declared. The philosophy (that’s a word I’ve learnt to use in this context by talking to Venture Capitalists) was that while Quant in investment banking was already fairly saturated, there was virgin territory in other industries, and I’d use my bank-honed quant skills to improve the level of reasoning in these other industries.

Since then things have more or less gone well. I’ve worked in several sectors, and done a lot of interesting work. While a lot of it has been fairly challenging, very little of it has technically been of a level that would be considered challenging by an investment banking quant. And all this is by design.

I’ve long admired Matt Levine for the way in which he clearly explains fairly complicated finance stuff in his daily newsletter (which you can get delivered to your inbox for free), and talks about finance in an entertaining manner. I’ve sometimes mentioned that I’ve wanted to grow up to be like him, to write like him, to analyse like him and all that.

And I find that in yesterday’s newsletter he clearly encapsulates the idea with which I started off when I quit banking in 2011. He writes:

A good trick is, find an industry where the words “Monte Carlo model” make you sound brilliant and mysterious, then go to town.

This is exactly what I set out to do in 2011, and have continued to do since then. And you’d be amazed to find the number of industries where “Monte Carlo model” makes you sound brilliant and mysterious.

Considering the difficulties I’ve occasionally had in communicating to people what exactly I do, I think I should adopt Levine’s line to describe my work. I clearly can’t go wrong that way.


Wasps have thin tails, or why cricket prediction algorithms fail..

A couple of months back, I was watching a One Day International match between New Zealand and India. It was during this series that the WASP algorithm, which predicts the score at the end of the innings (for the team batting first) and the chances of victory (for the team batting second), first came into public prominence. During the games, the scorecard would include the above data, derived from the WASP algorithm.

In this one game that I was watching (I think it was the fourth match of the series), New Zealand was chasing. They were fifty-odd without loss, and WASP showed their chances of winning as somewhere around 60%. Then Jesse Ryder got out. Suddenly, the chances of winning, as shown on screen, dropped to the forties! How can the fall of one wicket, at an early stage in the game, influence a game so much? The problem is with the algorithm.

Last year, during the IPL, I tried running a graphic that I called the “Tug-of-War”, which was meant to depict how the game swung between the two teams. Taking the analogy forward, if you were to imagine the game of cricket as a game of tug-of-war, the graph plotted the position of the middle of the rope as a function of time. Here is a sample graphic from last year:

[Figure: Tug-of-War plot for the Pune Warriors vs Sunrisers Hyderabad game]

This shows how the game between the Pune Warriors and the Sunrisers “moved”. The upper end of the graph represents the Sunrisers’ “line” and the lower end the Pune Warriors’. So we can see from this graph that for the first half of the Sunrisers innings (they batted first), the Sunrisers were marginally ahead. Then Pune pulled it back in the second half, and then somewhere midway through the Pune innings, the Sunrisers pulled it back again, and eventually won the game.

At least that was the intention with which I started putting out this graphic. In practice, you can see that there is a problem. Check out the graph somewhere around the 8th over of the Pune innings. This was when Marlon Samuels got out. How can one event change the course of the game so dramatically? It is similar to the movement in the WASP when Ryder got out in the recent NZ-India match.

So what is the problem here? Based on the WASP algorithm that the designers have kindly published, and the algorithm I used for last year’s IPL (which was Monte Carlo-based), the one thing in common is that both algorithms are Markovian (I know mine is, and from what WASP has put out, I’m guessing theirs is, too). To explain in English, what our algorithms assume is that what happens on the next ball depends only on the current state of the match (score, wickets, balls remaining), and not on how the innings has unfolded to get there. The odds of different events on the next ball (dot, six, single, out, etc.) don’t depend on how the previous balls have shaped up beyond what is captured in that state – this is the assumption our algorithms use. And since that doesn’t accurately represent what happens in a cricket match, we end up with “thin tails”.
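To make the Markovian assumption concrete, here is a toy ball-by-ball simulator along those lines (the per-ball outcome probabilities are made up for illustration; a real model would estimate them from ball-by-ball data, possibly conditioned on the state of the innings):

```python
import random

# Made-up per-ball outcome probabilities (illustrative only)
OUTCOMES = [0, 1, 2, 3, 4, 6, "W"]
WEIGHTS  = [0.45, 0.35, 0.06, 0.01, 0.07, 0.03, 0.03]

def simulate_innings(balls=300, wickets=10):
    """Markovian innings: each ball is drawn independently of how the innings has gone so far."""
    runs, lost = 0, 0
    for _ in range(balls):
        outcome = random.choices(OUTCOMES, weights=WEIGHTS)[0]
        if outcome == "W":
            lost += 1
            if lost == wickets:
                break
        else:
            runs += outcome
    return runs

# The distribution of these simulated scores ends up much more tightly bunched
# ("thin-tailed") than the distribution of actual ODI scores.
scores = [simulate_innings() for _ in range(10000)]
```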

Recently, to evaluate IPL matches with a view to evaluating players ahead of the auction, I reverse-engineered the WASP algorithm, and decided to see what it says about the score at the end of an ODI innings. Note that my version is team-agnostic, and assumes that every ball is bowled by “the average bowler” to “the average batsman”. The distribution of team score at the end of the first innings, as calculated by my algorithm, can be seen in the blue line in the graph below. The red line shows the actual distribution of scores at the end of an ODI innings in the last 5 years (the same data that was used to construct the model).

[Figure: distribution of first-innings ODI scores – model (blue) vs actual (red)]

Note how the blue curve has a much higher peak, and tails off very quickly on either side. In other words, a lot of the “mass” is situated within a small range of scores, and this leads to the bizarre situations you can see in the first graph, and what I saw in the New Zealand-India game.

The problem with a dynamic programming-based approach, such as WASP’s, is that you need to make a Markovian assumption, and that assumption results in thin tails. And when you are trying to predict the probability of victory, and are using a curve such as the blue one above as your expected distribution of the score at the end of the innings, events such as a six or a wicket can drastically alter your calculated odds.

To improve cricket prediction systems, what we need is an algorithm that can replicate the “fat tails” that the actual distribution of cricket scores shows. My current Monte Carlo-based algorithm doesn’t cut it. Neither does the WASP.