Relationships and the Iterated Prisoner’s Dilemma

It was around this time last year that something snapped, and things have never been the same again. Until then, whenever she threw a tantrum, or we had a fight, I’d always give her the benefit of the doubt, apologise unconditionally, and make an effort to bring the relationship back on track. But since then, I haven’t felt the same kind of sympathy for her. I don’t feel “paapa” for her like I used to, and I have asked myself several times why I even apologise instead of expecting her to do that.

One of the most successful strategies for the Iterated Prisoner’s Dilemma is a strategy called “Tit for tat”. To explain the problem: you play a series of games against an “opponent”, and in each iteration each of you chooses to either “co-operate” or “defect”. For each combination of choices, there is a certain payoff. The payoffs look similar to this, though the exact numbers might be different. In each cell of this table, the first value is the first player’s payoff and the second is the second player’s.

                        Player 1: Co-operate    Player 1: Defect
Player 2: Co-operate    1 / 1                   2 / 0
Player 2: Defect        0 / 2                   0.5 / 0.5

So you play this game several times, and your earnings are totalled. There was a tournament for computer programs playing this game around 1980, organised by Robert Axelrod, where the winner was “tit for tat”. According to this strategy, you start by co-operating in the first iteration, and in every successive iteration you copy what your opponent did in the previous iteration. Notice that if both players choose this strategy, both will co-operate in perpetuity, and have identical payoffs.
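Here is a minimal sketch of the game described above, assuming the payoffs from the table (a lone defector gets 2, the co-operator left holding the bag gets 0). The names PAYOFF, play, tit_for_tat and always_defect are made up for illustration, and always_defect is only there to give tit-for-tat something to play against.

```python
# Payoffs assumed from the table above: (my move, opponent's move) -> my payoff.
PAYOFF = {
    ("C", "C"): 1.0,   # both co-operate
    ("C", "D"): 0.0,   # I co-operate, opponent defects
    ("D", "C"): 2.0,   # I defect, opponent co-operates
    ("D", "D"): 0.5,   # both defect
}


def tit_for_tat(opponent_history):
    """Co-operate in the first iteration, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Hypothetical comparison strategy: defect no matter what."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the iterated game; return (total_a, total_b, list of move pairs)."""
    moves_a, moves_b = [], []
    total_a = total_b = 0.0
    for _ in range(rounds):
        # Each strategy only sees what the other player has done so far.
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        total_a += PAYOFF[(a, b)]
        total_b += PAYOFF[(b, a)]
        moves_a.append(a)
        moves_b.append(b)
    return total_a, total_b, list(zip(moves_a, moves_b))


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat, 10)[:2])
    print(play(tit_for_tat, always_defect, 10)[:2])
```

With two tit-for-tat players every round is co-operate/co-operate and the totals are identical. Against a constant defector, tit-for-tat loses only the first round and then defects right back.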

Relationships can be modelled as an iterated prisoner’s dilemma. You can either choose to be nice to your partner (co-operate), for which you get a steady return, or you can choose to be nasty (defect), in which case you get a superior payoff if your partner continues to be nice. If both of you are nasty simultaneously, both of you end up with inferior payoffs (as shown by the Defect-Defect box in the above matrix).

Early on in the relationship, I was very keen to make things work and did my best to prevent it from falling into any abyss. I played the “Gandhi strategy”, where irrespective of her play, I simply co-operated. The idea there was that whenever she defected, she would feel sympathy for my co-operative position and switch back to co-operate.

So something snapped sometime around this time last year, which led me to change my strategy. I wasn’t going to be Gandhi anymore. I wasn’t going to unconditionally defect, either. I switched to playing tit-for-tat. When both players are playing tit-for-tat, all it takes is one round in which both defect: after that each keeps copying the other’s defection, and you get into a long (and extremely suboptimal) sequence of defect-defects. And that is what happened to us. We started getting into long sequences of suboptimality, when we would fight way more than is required to sustain a relationship. Thankfully it never got so bad as to ruin the relationship.

Periodically, both of us would try to break the rut, and try to give the relationship a stimulus. We would play the co-operate card, and given that both of us were playing tit-for-tat, we’d be back to normal (Co-operate – Co-operate). Soon we learnt that long defect-defect sequences are bad for both of us, so we would quickly break the strategy and co-operate and get things back on track. We weren’t playing pure tit-for-tat any more. There was a small randomness in our behaviour, when we’d suddenly go crazy and defect. In the course of the year, we got formally engaged, and then we got married, and we’ve continued to play this randomized tit-for-tat strategy. And the payoffs have been a roller coaster.
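The randomized tit-for-tat described above can be sketched roughly as follows. This is only an illustration under my own assumptions: slip (the chance of suddenly defecting after the partner co-operated) and forgive (the chance of playing the co-operate card after the partner defected) are hypothetical parameters, and the payoffs are the ones from the table earlier.

```python
import random

# Payoffs assumed from the table earlier: (my move, partner's move) -> my payoff.
PAYOFF = {("C", "C"): 1.0, ("C", "D"): 0.0, ("D", "C"): 2.0, ("D", "D"): 0.5}


def randomized_tit_for_tat(partner_last, rng, slip, forgive):
    """Copy the partner's previous move, with two random deviations."""
    if partner_last == "C" and rng.random() < slip:
        return "D"  # suddenly go crazy and defect
    if partner_last == "D" and rng.random() < forgive:
        return "C"  # break the rut with a peace offering
    return partner_last  # plain tit-for-tat


def simulate(rounds, slip, forgive, seed=0):
    """Return (average payoff A, average payoff B, fraction of defect-defect rounds)."""
    rng = random.Random(seed)
    last_a = last_b = "C"  # treat the time before round one as co-operation
    total_a = total_b = 0.0
    both_defect = 0
    for _ in range(rounds):
        a = randomized_tit_for_tat(last_b, rng, slip, forgive)
        b = randomized_tit_for_tat(last_a, rng, slip, forgive)
        total_a += PAYOFF[(a, b)]
        total_b += PAYOFF[(b, a)]
        if a == "D" and b == "D":
            both_defect += 1
        last_a, last_b = a, b
    return total_a / rounds, total_b / rounds, both_defect / rounds


if __name__ == "__main__":
    for forgive in (0.0, 0.2, 0.5):
        print(forgive, simulate(10_000, slip=0.05, forgive=forgive))
```

With slip set to 0 the pair co-operates forever. With slip at 1 and forgive at 0 they defect together in round one and never recover, which is the rut. Anything in between gives the roller coaster, and the loop at the bottom lets you see how the share of defect-defect rounds moves as forgive goes up.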

Today I lost it. She randomly pulled out the defect card twice in the course of the day, and that made me go mad. While in earlier circumstances I’d wait a few iterations before I started to defect myself, something snapped today. I pulled out the defect card too. Maybe for the first time ever, I hung up on her. Do I regret it? Perhaps I do. I don’t want to get into a prolonged defect-defect sequence now.

And I hope one of us manages to give the relationship enough of a stimulus in the coming days to put us on a sustained co-operate co-operate path.

Wasting Youth

Nowadays everyone seems to be preparing for JEE. It is almost as if it is a logical progression to join some JEE coaching factory once you are done with 10th standard. Yeah, the numbers were quite large in my time (~10 yrs back) itself. But they are humongous now, and it is not funny.

Yeah, awareness about IIT and people feeling good about themselves and wanting to go study at India’s best undergraduate institutions is great. It is brilliant. Fantastic. What is not so great, brilliant and fantastic is that tens of thousands of youth are wasting two years of their prime youth trying to mug for an entrance exam in which they stand little chance of doing well.

I just hope I’m not sounding condescending here, but it intrigues me that so many people who have very little chance of making it through the JEE slog so much for it. I think it is due to the unhealthy equilibrium that has been reached with respect to the exam, which makes everyone waste so much time. Let me explain.

So over the years the JEE has got the reputation of being a “tough” exam. And over the years, maybe due to the way papers are structured or the way factories train people, people have figured out that hard work and extra hours of preparation help. I could get into studsandfighters mode here but in line with my promise let me try and explain without invoking the framework. And you need to remember that the JEE uses “relative grading”: how well you have done depends on how badly others have done.

So if everyone has put in that much extra hard work, you are likely to lose out by not putting in that extra work. And so you increase your effort. And so does everyone else. Yeah, this is a single-iteration game, but still, looking at the competition and peer pressure, everyone is forced to raise their effort. Everyone is forced to, to quote the Director of my JEE factory, “work up to human limit”.

Yeah, a few hundred people every year manage to “crack” the system and get through without putting in that much effort. But then their numbers are small compared to the number of people who get admitted, so people who get through based on sheer hard work do tend to get noticed more, and spur other aspirants to work even harder. And so forth.

Yes, there is a problem with the system. Something is not right when a large proportion of youth in the country is wasting away two years of prime youth in preparing for some entrance exam. It is easy to see the fundamental problem: a shortage of “really good quality” engineering colleges (I argue that this mad fight for IIT seats shows the gap between IITs and the next level of engineering colleges, at least in terms of public perception). But taking that as given, I wonder what we could change. I wonder what we could do in order to save our youth.

As an aside, one thing I’ve noticed about several JEE aspirants is that they don’t give up. I don’t know if this is necessarily a good thing, to carry on with the mad fight even if you know that your chances of making it are remote. Yeah, I’m sure there is peer pressure and status issues with respect to giving up. But then I suppose I would have a lot more respect for someone who would give up and enjoy life rather than continue the mad fight knowing full well that his chances are remote.

Looking back, I do regret wasting those two years in mad JEE mugging. Ok, I must admit I did have my share of fun back then, but still, I would definitely have preferred not to have worked so hard. And of course I count myself lucky that I got through the JEE and my hard work in those two years wasn’t in vain.

Arranged Scissors 13 – Pruning

Q: How do you carve an elephant?
A: Take a large stone and remove from it all that doesn’t look like an elephant

– Ancient Indian proverb, as told to us by Prof C Pandu Rangan during the Design of Algorithms course

As I had explained in a post a long time ago, this whole business of louvvu and marriage and all such follows a “Monte Carlo approach”. When you ask yourself the question “Do I want a long-term gene-propagating relationship with her?”, the answer is one of “No” or “Maybe”. Irrespective of how decisive you are, or how perceptive you are, it is impossible for you to answer that question with a “Yes” with 100% confidence.

Now, in Computer Science, the way this is tackled is by running the algorithm a large number of times. If you run the algo several times, and the answer is “Maybe” in each iteration, then you can put an upper bound on the probability that the answer is “No”. And with high confidence (though not 100%) you can say “Probably yes”. This is reflected in louvvu also – you meet several times, implicitly evaluate each other on several counts, and keep asking yourselves this question. And when both of you have asked yourselves this question enough times, and both have gotten consistent maybes, you go ahead and marry (of course, there is the measurement aspect also that is involved).

Now, the deal with the arranged marriage market is that you aren’t allowed to have too many meetings. In fact, in the traditional model, the “darshan” lasts only for some 10-15 mins. In extreme cases it’s just a photo but let’s leave that out of the analysis. In modern times, people have been pushing to get more time, and to get more opportunities to run iterations of the algo. Even then, the number of iterations you are allowed is bounded, which puts an upper bound on the confidence with which you can say yes, and also gives fewer opportunities for “noes”.

Management is about finding a creative solution to a system of contradictory constraints
– Prof Ramnath Narayanswamy, IIMB

So one way to deal with this situation I’ve described is by what can be approximately called “pruning”. In each meeting, you will need to maximize the opportunity of detecting a “no”. Suppose that in a normal “louvvu date”, the probability of a “no” is 50% (random number pulled out of thin air). What you will need to do in order to maximize information out of an “arranged date” (yes, that concept exists now) is to raise this probability of a “no” to a higher number, say 60% (again pulled out of thin air).

If you can design your interaction so as to increase the probability of detecting a no, then you will be able to extract more information out of a limited number of meetings. When the a priori rejection rate per date is 50%, you will need at least 5 meetings with consistent “maybes” in order to say “yes” with a confidence of over 95% (I’m too lazy to explain the math here), and this is assuming that the information you gather in one particular iteration is independent of all information gathered in previous iterations.

(In fact, considering that the amount of incremental information gathered in each subsequent iteration is a decreasing function, the actual number of meetings required is much higher.)

Now, if you raise the a priori probability of rejection in one particular iteration to 60%, then you will need only 4 independent iterations in order to say “yes” with a confidence of over 95% (and this again is by assuming independence).
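For what it’s worth, the arithmetic behind these meeting counts can be sketched in a few lines. This is a rough sketch under the same independence assumption, and it reads “confidence” as one minus the chance that a true “no” slips through every single meeting undetected (no prior is modelled). The function names and the 95% default target are my own choices for illustration.

```python
def confidence_of_yes(p_detect_no, meetings):
    """Confidence in "yes" after `meetings` independent, consistent "maybes".

    If the true answer were "no", one meeting would miss it with probability
    (1 - p_detect_no), so all meetings miss it with probability
    (1 - p_detect_no) ** meetings.
    """
    return 1.0 - (1.0 - p_detect_no) ** meetings


def meetings_needed(p_detect_no, target=0.95):
    """Smallest number of meetings whose confidence strictly exceeds `target`."""
    if not 0.0 < p_detect_no <= 1.0:
        raise ValueError("p_detect_no must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    n = 1
    while confidence_of_yes(p_detect_no, n) <= target:
        n += 1
    return n


if __name__ == "__main__":
    for p in (0.5, 0.6):
        n = meetings_needed(p)
        print(p, n, round(confidence_of_yes(p, n), 4))
```

At a 50% rejection rate, four meetings leave you at about 93.8% and the fifth takes you to about 96.9%. At 60%, three meetings give 93.6% and the fourth gives about 97.4%, which is where the 5 and the 4 come from.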

Ignore all the numbers I’ve put, none of them make sense. I’ve only given them to illustrate my point. The basic idea is that in an “arranged date”, you will need to design the interaction in order to “prune” as much as possible in one particular iteration. Yes, this same thing can be argued for normal louvvu also, but there I suppose the pleasure in the process compensates for larger number of iterations, and there is no external party putting constraints.