Rare observations and observed distributions

Over the last four years, one of my most frequent commutes in Bangalore has been between Jayanagar and Rajajinagar – I travel between these two places once a week on average. There are several routes one can take to get to Rajajinagar from Jayanagar, and one of them goes through the interior of Chamrajpet. However, I can count the number of times I’ve taken that route in the last four years on the fingers of one hand. This is because the first time I took that route I got stuck in a massive traffic jam.

Welcome to the world of real distributions and observed distributions. The basic concept is that if you observe a particular event rarely, the distribution you observe can be very different from the actual distribution. Take, for example, the drive through inner Chamrajpet described above. Let us say that the average time to drive through that particular road on a Saturday evening is 10 minutes. Let us say that 99% of the time on a Saturday evening, you take less than 15 minutes to drive through that road. In the remaining 1% of the time, you can take as much as an hour to drive through the road.

Now, if you are a regular commuter who drives through this road every Saturday evening, you will be aware of the distribution. You will be aware that 99% of the time you will take at most 15 minutes to get past, and you will base your routing decision on that. When it takes an hour to drive past, you know that it is a rare event and discount it from your future calculations. If, however, you are an irregular commuter like me and happened to drive through that road on the one day when it took an hour to get past, you will assume that that is the average time it takes to get past! You are likely to mistake the rare event for the usual one, and that can lead to suboptimal decisions in the future.

In his book The Black Swan, Nassim Nicholas Taleb talks about the inability of people to model for rare events. He says that the problem is that people underestimate the probability of rare events and fail to account for them in their models, leading to blow-ups when they do occur. While I agree that is a problem, I contend that the opposite problem cannot be ignored either. Sometimes you fail to recognize that what has happened to you is a rare event and thus end up with a wrong model.

Let me illustrate both problems with the same example. Think of a game where 99 times out of 100 you win a rupee. The rest of the time (i.e. 1%) you lose fifty rupees. Regular players of the game, who have “sampled” it often enough, will know the full distribution, and will take that into account when deciding whether to play the game. Non-regular players, however, don’t have complete information.

Let us say there are a hundred cards. 99 of them have a +1 written on them, and the 100th has a -50. Let us suppose you pick ten cards. Ninety percent of the time, all ten cards you pick will be a “+1”, and you will conclude that all cards are “+1”. You will model for the game to give you a rupee each time you play. The other 10% of the time, however, you will draw nine +1s and one -50. You will then assume that the expected value of playing the game is Rs. -4.1 ( (9 * 1 + 1 * (-50))/10 ). Notice that both times you are wrong in your inference! The true expected value is Rs. 0.49 ( (99 * 1 + 1 * (-50))/100 ).
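For those who like to check such numbers, here is a minimal sketch in Python of the card example. It assumes the ten cards are drawn without replacement from the hundred; the function name and its parameters are my own, purely for illustration. It computes exactly how often a ten-card sampler sees each average payoff, and compares that with the true expected value of the deck.

```python
from fractions import Fraction
from math import comb


def observed_means(n_cards=100, n_bad=1, n_drawn=10, good=1, bad=-50):
    """Exact distribution of the average payoff a rare observer sees.

    Assumes n_drawn cards are picked without replacement from a deck
    of n_cards, of which n_bad carry the payoff `bad` and the rest
    carry the payoff `good`. Returns {observed average: probability}.
    """
    total_ways = comb(n_cards, n_drawn)
    dist = {}
    for k in range(min(n_bad, n_drawn) + 1):  # k = bad cards in the sample
        ways = comb(n_bad, k) * comb(n_cards - n_bad, n_drawn - k)
        mean = Fraction((n_drawn - k) * good + k * bad, n_drawn)
        dist[mean] = Fraction(ways, total_ways)
    return dist


# True expected value of one card from the full deck.
true_mean = Fraction(99 * 1 + 1 * (-50), 100)

dist = observed_means()
for mean, prob in sorted(dist.items(), reverse=True):
    print(f"observed average {float(mean):+.2f} with probability {float(prob):.2f}")
print(f"true expected value {float(true_mean):+.2f}")
```

Neither observed average matches the true one, yet weighted by their probabilities they average out to exactly the true value. The sampling is fair on the whole; it is each individual rare observer who is misled.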

So while it is important that you recognize black swans, it is also important that you don’t overestimate their probability. Always remember that if you are a rare observer, the distribution you observe may not reflect the real distribution.

Card Games

So the other day, while playing rummy with the members of the in-law family, I figured out why I suck so much at some card games despite having played them quite regularly when I was a kid. Back then, in family gatherings, it was common for the host to come up with a couple of packs of cards, and we would play either rummy or this game called donkey (some kind of variation of hearts is how I’ll describe it for those that don’t know it). Given how regularly we played them, I should have become rather good at both, which unfortunately is not the case.

Bridge was the first card game that I learnt “formally”, in the hostel blocks of IIT Madras. Soon after the rules of the game were explained to me, I was taught conventions, both in bidding and play. I was taught the math, the probabilities of various distributions and how to make intelligent guesses. While I quickly became decently good at bridge, it didn’t help my game in any of the other card games that I’d learnt.

So while playing recently, I realized that I know little about the science of rummy. And then I realized the reason for it – we used to play with incomplete decks. The problem with old family-held packs of cards with which no “formal” games are played is that cards tend to go missing over the course of time (especially if there are kids around), and no one really bothers to check. And when you play with incomplete packs of cards, all the beautiful math and rules of probability go out of the window. And if you have learnt playing with such a pack of cards, it is unlikely you’d have figured out much math also.

Last night, while playing rummy with the wife, I tried my best to use math: to keep a careful note of discarded cards and of the joker (for example, if the seven of hearts had turned up as the joker card, a six of hearts in hand was of less use than otherwise, since we were playing with only one pack), and to work out the probabilities of which cards were still available based on the discards. Then, it turned out that there was too much luck involved in the distribution of cards, and I started missing the duplicate bridge games that we used to play back in IIT.

The wife has shown an inclination to learn bridge, and I’m trying to teach her. We’re also trying to learn poker (we’d bought this nice poker set in Sri Lanka last year but it remains unused since neither of us can play the game). Yeah, becoming really good at these card games is one of the aims of my “project thirty”.

Chowka Baarah

Yesterday, after a gap of about fifteen years, I played chowka-baarah. For starters, the name intrigues me. It translates into four-twelve (I suppose), but that doesn’t make sense. Essentially, there are two primary variations of this game depending upon the size of the grid used (5 by 5 or 7 by 7), and the two numbers in the name are “big scores” in different versions. In the 5×5 version, the “big scores” are 4 and 8, while in the 7×7 version, they are 6 and 12.

A certain variety of seashell (called kavaDe in Kannada) is used as dice, four of them in the 5×5 version and six in the larger version. The “score” of a throw is determined by the number of kavaDes falling “face up”, and if all fall face down, the score is twice the number of dice. So if you have 4 shells and all fall face down, you get 8 points. I haven’t done much research on this but I do think the probability of a shell falling “face up” is much more than the probability of it falling “face down”. I don’t know the exact probability.
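Since I don’t know the real probability, here is a rough sketch in Python of how the score distribution could be worked out for any guess. It assumes each shell lands independently and falls face up with the same probability p_up; both are my assumptions, and the function name is made up for illustration. The fair-coin value of 1/2 below is only a baseline, not a measurement.

```python
from fractions import Fraction
from math import comb


def score_distribution(n_shells, p_up):
    """Distribution of the score of one throw of n_shells kavaDes.

    Assumes each shell independently lands face up with probability
    p_up (a guess, not a measured value). The score is the number of
    shells face up, except that all face down scores 2 * n_shells.
    Returns {score: probability}.
    """
    dist = {}
    for k in range(n_shells + 1):  # k = number of shells face up
        prob = comb(n_shells, k) * p_up**k * (1 - p_up) ** (n_shells - k)
        score = k if k > 0 else 2 * n_shells
        dist[score] = prob
    return dist


# Baseline for the 5x5 version: four shells, each treated as a fair coin.
fair = score_distribution(4, Fraction(1, 2))
for score in sorted(fair):
    print(f"score {score}: probability {fair[score]}")
```

Under the fair-coin baseline the two “big scores” of the 5×5 version, 4 and 8, are equally likely. If the shells really do favour falling face up, passing a larger p_up makes 4 more common and 8 rarer.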

The game itself is like Ludo: your pawns go round and round in circles and inward in order to reach the centre of the square, where they “queen”. The first player to queen all their pawns wins. There are concepts such as doubling pawns (they act as a pair then, move in pairs only on even throws of the dice, etc.) and cutting (if your pawn reaches a square where an opponent’s pawn is, the opponent’s pawn “goes home”, etc.). Simple game, and widely played in a lot of “traditional households”.

Considering that I had stopped playing this game when I was still quite small, I had never realized the strategies involved in playing the game. Back then I’d just generally move whatever pawn I fancied and somehow my grandparents would move in a way that simply enabled me to win. It was only yesterday that I realized that the game is not as simple as I thought, and that strategy dominates luck in determining how you do.

It’s not like bridge, where card distributions are exchanged across pairs in order to take the luck out of the game. Nevertheless, I realize that the number of “turns” in the game is large enough for the probabilities in the seashells to balance out across players. Instead, the decision that you need to make at each turn regarding which pawn to move is so important that it dwarfs the number you threw! Again, you will need to take into account stuff like the distribution of your next throw, your opponent’s next throw and so on.

I think I have a thing for games with randomness built into them rather than those that are completely a function of the players’ moves (like chess). I think this is because even with the same set of players, games with randomness built in lead to a larger variety of positions, which makes the game more exciting.

Coming back to Chowka Baarah, the other thing I was thinking about last night was whether the sunk cost fallacy applies here, when I was trying to decide which of two pawns to save: a reasonably advanced one or a backward one. Finally I decided that apart from the loss in terms of the pawn being sent home, the other things I had to take into consideration when I moved were which pawn capture would be more valuable for the opponent, the probabilities of different pawns getting captured, potential danger to other pawns, etc.

It’s a fun game, one of the most fun “traditional” games. Maybe one of the most “strategic” traditional games. I have missed playing it for the last fifteen years or so.