Stereotypes and correlations

Earlier on this blog, I’ve argued in favour of stereotypes. “In the absence of further information, stereotypes give you a strong Bayesian prior”, I had written (I’m paraphrasing myself here). I had gone on to say (paraphrasing myself yet again), “however, it is important that you treat this as a weak prior and update it as and when you get new information. So in the presence of additional information, you need to let go of the stereotypes”.

A lot of stereotyping is due to spurious correlations, often formed from a small number of training samples. My mother, for example, strongly believed that if you drink alcohol, you must be a bad person. She once explained to me why she thought so – a few of her friends had fathers or husbands who drank alcohol, and those friends had had to endure domestic abuse.

That is only one extreme correlation stereotype. We keep making these stereotypes based on correlation all the time. I’m not saying that the correlation is not positive – sometimes it can be extremely positive. Just that it may not have full explainability.

For example, certain ways of dressing have come to be associated with certain attitudes (black t-shirts and heavy metal, for example). So when we see someone exhibiting one side of this correlation, our minds are naturally drawn to associating them with the other side of the correlation as well (so you see someone in a black heavy metal band t-shirt, and immediately assume that they must be interested in heavy metal – to take a trivial example).

And then when their further behaviour belies the correlation that you had instinctively made, your mind gets messed up.

There was this guy in my batch at IIT Madras, who used to wear a naama (vertical religious mark on forehead commonly worn by Iyengars) on his forehead a lot of the time. Unlike most other undergrads, he also preferred to wear dhotis. So you would see him in his dhoti and naama and assume he was a religious conservative. And then you would see his hand, which would usually be held up showing a prominent middle finger, and all your mental correlations would go for a toss.

Another such example that I’ve spoken about on this blog before is that of the “puritan topper” – having seen a few topper types who otherwise led austere lives, I had assumed that kind of behaviour was correlated with being a topper (in some ways I can now argue that this blog is getting a bit meta).

I find myself doing this all the time. I observe someone’s accent and make assumptions about their abilities, or the lack of them. I see someone’s dressing sense and build a whole story in my head on that single data point. I see the way someone is walking, and that supposedly tells me about their state of mind that day.

The good thing I’ve done is to internalise my last year’s blogpost – while all these single data point correlations are fine as a prior (in the absence of other information), the moment I get more information I immediately update them, and the initial stereotypes go out of the window.

The other thing I’m thinking of is – sometimes some of these random spurious correlations are so ingrained in our heads that we let them influence us. We take a certain job and decide that it is associated with a certain way of dressing and also start dressing the same way (thus playing up the stereotypes). We know the sort of clothes most people wear to a certain kind of restaurant, and also dress that way – again playing up the stereotypes.

Without realising it, maybe because of mimetic desire or a desire to fit in, we end up furthering random correlations and stereotypes. So maybe it is time to make a conscious effort to start breaking these stereotypes? But no – you won’t see me wear a suit to work any time soon.

I’ll end with another school anecdote. For whatever reason, many of the topper types in my 11th standard class would wear the school uniform sweater to school every single day, irrespective of how hot or cold it was. And then one fine (and not cold) day, yet another guy showed up in the uniform sweater. “How come you’re wearing this sweater”, I asked. He replied, “Oh, I just wanted to look more intellectual!”

 

It’s not just about status

Rob Henderson writes that in general, relative to the value they add to their firms, senior employees are underpaid and junior employees are overpaid. This, he reasons, is because senior employees trade off money for status.

Quoting him in full:

Robert Frank suggests the reason for this is that workers would generally prefer to occupy higher-ranked positions in their work groups than lower-ranked ones. They’re forgoing more earnings to hold a higher-status position in their organization.

But this preference for a higher-status position can be satisfied within any given organization.

After all, 50 percent of the positions in any firm must always be in the bottom half.

So the only way some workers can enjoy the pleasure inherent in positions of high status is if others are willing to bear the dissatisfactions associated with low status.

The solution, then, is to pay the low-status workers a bit more than they are worth to get them to stay. The high-status workers, in contrast, accept lower pay for the benefit of their lofty positions.

I’m not sure I agree. Yes, I do agree that higher productivity employees are underpaid and lower productivity employees are overpaid. However, I don’t think status fully explains it. There are also issues of variance and correlation and liquidity (there – I’m talking like a real quant now).

On the variance front – the higher you are in the organisation and the higher your salary is, the more the variance of your contribution to the organisation. For example, if you are being paid $350,000 (the number Henderson hypothetically uses), the actual value you are bringing to your firm might have a mean of $500,000 and a standard deviation of $200,000 (pulling all these numbers out of thin air, while sense-checking that risk pricing broadly holds).

On the other hand, if you are being paid $35,000, then it is far more likely that the average value you bring to the firm is $40,000 with a standard deviation of $5,000 (again numbers entirely pulled out of thin air). Notice the drastic difference in the coefficient of variation in the two cases.

Putting it another way, the more productive you are, the harder it is for any organisation to put a precise value on your contribution. Henderson might say “you are worth 500K while you earn 350K” but the former is an average number. It is because of the high variance in your “worth” that you are paid far lower than what you are worth on average.
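To make the "coefficient of variation" comparison concrete, here is a minimal sketch that just runs the arithmetic on the two sets of numbers above. The figures are the same thin-air ones from this post; the function and variable names are made up for illustration.

```python
def coefficient_of_variation(mean, std_dev):
    """Standard deviation expressed as a fraction of the mean."""
    return std_dev / mean


# The thin-air figures from the post: pay, mean value added, and its spread.
senior = {"pay": 350_000, "mean_value": 500_000, "std_dev": 200_000}
junior = {"pay": 35_000, "mean_value": 40_000, "std_dev": 5_000}

senior_cv = coefficient_of_variation(senior["mean_value"], senior["std_dev"])
junior_cv = coefficient_of_variation(junior["mean_value"], junior["std_dev"])

# How much of the average value added is actually paid out as salary.
senior_pay_ratio = senior["pay"] / senior["mean_value"]
junior_pay_ratio = junior["pay"] / junior["mean_value"]

print(f"senior: CV = {senior_cv:.3f}, pay / mean value = {senior_pay_ratio:.3f}")
print(f"junior: CV = {junior_cv:.3f}, pay / mean value = {junior_pay_ratio:.3f}")
```

With these numbers the senior employee's coefficient of variation is 0.4 against the junior's 0.125, and the senior is paid 70% of average worth against the junior's 87.5%. The role with the noisier contribution gets the bigger discount.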

And why does this variance exist? It’s due to correlation.

More so at higher ranked positions (as an aside – my weird career path means that I’ve NEVER been in middle management) the value you can add to a company is tightly coupled with your interactions with your colleagues and peers. As a junior employee your role can be defined well enough that your contributions are stable irrespective of how you work with the others. At senior levels though a very large part of the value you can add is tied to how you work with others and leverage their work in your contributions.

So one way a company can get you to contribute more is to have a good set of peers you like working with, which increases your average contribution to the firm. Rather paradoxically, because you like your peers (assuming peer liking in senior management is two way), the company can get away with paying you a little less than your average worth and you will continue to stick on. If you don’t like working with your colleagues, there is the double whammy that you will add less to the company and you need to be paid more to stick on. And so if you look at people who are actually successful in their jobs at a senior level, they will all appear to be underpaid relative to their peers.

And finally there is liquidity (can I ever theorise about something without bringing this up?). The more senior you go, the less liquid is the market for your job. The number of potential jobs that you want to do, and which might want you, is very very low. And as I’ve explained in the first chapter of my book, when a market is illiquid, the bid-ask spread can be rather high. This means that even holding the value of your contribution to a company constant, there can be a large variation in what you are actually paid. And that is again why, on average, senior employees are underpaid.

So yes, there is an element of status. But there are also considerations of variance, correlation and bid-ask. And selection bias (senior employees who are overpaid relative to the value they add don’t last very long in their jobs). And this is why, on average, you can afford to underpay senior employees.

Losing My Religion

In terms of religion, I had a bit of a strange upbringing. My father was a rationalist, bordering on atheist. My mother was insanely religious, even following a godman. And no – I never once saw them fight about this.

Both of them tried to impress me with their own religions. My mother tried to inculcate in me the habit of praying every morning, and looking for strange patterns (“if this flower on this photo falls, then it will be a good day” types). My father would refute most of these things saying “how can you be a student of science and still believe this stuff?”. I suppose I consumed a lot of coffy bite when I was a kid.

In any case, with a combination of influences, both internal and external, in my early youth I was this strange concoction of “not religious but superstitious”. I had both a “lucky shirt” and a “lucky pen”. Back in class 12, I had convinced myself that “Wednesdays are a particularly bad day for me”.

I really don’t know if this has anything to do with my upbringing, but I would see patterns everywhere. I would draw correlations between random unconnected things, and assume causality. I staunchly refused to admit that I was religious, but allowed for strange patterns and correlations nevertheless.

When I had five minor car accidents during the course of 2007 (it wasn’t a great year for me, and I was quite messed up), I believed (or maybe was made to believe) that it was “my car’s way of protecting me” (I wasn’t hurt in any of those, though the car took a lot of beatings and scratchings). I had come to believe that a particular job didn’t go well because on the first day of work, I had splashed water on a kid on my way back by driving fast through a puddle.

The general discourse nowadays is that religion improves people’s mental health. That it helps people see meaning and purpose in their lives, and live through tragedies and other kinds of unhappiness. A common discourse on the right, on social media, is that it is the lack of religion that has led to the mental health epidemic that we have been going through for a while.

The way I see it, based on my own experience, this is completely backward. The basic thing about religion, at least based on my mixed upbringing, is “random correlations”. A lot of religion can be explained as “you do this, God will be happy with you and give you that”. Or that something was just “meant to be”, maybe based on actions in one’s past lives.

Religion is about “being a good person” and “karma”, and that all your mistakes will necessarily get punished, if not in this life in the next. The long period over which karma operates significantly increases the scope of random correlations that you can draw from life.

First of all I’m good at pattern recognition (something that has immensely helped me in my academics and careers). The downside of being good at pattern recognition is that there can be LOTS of false positives in patterns that you recognise. And when you recognise patterns that don’t really exist, you learn the wrong things, and after that live life the wrong way. And I think that was happening to me for a very very long time.

And so came the lucky shirts, the lucky pens, the precise order in which I would check websites at work every morning and many other things that were actually damaging to life, especially mental health. The pattern recognition was making me miserable, and the religion and superstition that I had come to believe in gave credence to these patterns, and (with the benefit of hindsight) made me more miserable.

In 2012, after having burnt out for the third time in six years, I began to see a psychiatrist and take antidepressants. It was the same time when I had started my “portfolio life”, and one of the items in that portfolio was volunteering with the Takshashila Institution, where I was asked to teach a class on logical fallacies.

That’s possibly a funny trigger, but hours of lecturing about “correlation not implying causation” meant that I started finally seeing the random correlations that I had formed in my own head. And one by one, I started dismantling them. There were no lucky days any more. There wasn’t that much karma any more. I started feeling less worried about things I wanted to say. I started realising that being “good” is good for its own merits, and not because some karma recommends that you should be good.

And I started feeling happier. Over the course of time, it seemed like a big load had been taken off my head. And so, whenever I see discourse on social media (and in books) that religion makes people happier, I fail to understand it.

In January 2014, I met an old friend for dinner. While walking back to the parking lot, he casually asked me what my views on religion were. I thought for a minute and said, “well, I firmly believe that correlation does not imply causation. And this means I can’t be religious”. That’s when I became convinced that I had lost my religion, and had become happier for it. And I continue to be happy because I’m not religious.

Resorts

We spent the last three days at a resort, here in Karnataka. The first day went off very peacefully. On the second day, a rather loud group checked in. However, our meal times generally didn’t intersect with theirs and they weren’t too much of a bother.

Yesterday, a bigger and louder (and rather obnoxious – they were generally extremely rude to the resort staff) group checked in. Unfortunately their meal times overlapped with ours, and their unpleasantness had a bearing on us. Our holiday would have been far better had this group not checked in to our resort, but there was no way we could have anticipated, or controlled for that.

The moral of the story, basically, is that your experience at a resort is highly dependent on who else is checked in to the resort at the same time.

The thing with resorts is that unlike “regular hotels”, you end up spending all your time during your holiday in the resort itself, so the likelihood of bumping into or otherwise encountering others who are staying at the resort is far higher. And this means that if you don’t want to interact with some of the people there, you sometimes don’t really have a choice.

Of course, it helped that the resort we were in had private swimming pools attached to each room, and was rather large. So the only times we encountered the other groups at the resort was at meal times. However, as we found during our last day there, that itself was enough to make the experience somewhat unpleasant.

My wife and I had a long conversation last night on what we could do to mitigate this risk. We wondered if the resorts we have been going to are “not premium enough” (then again, a resort with private swimming pools in each room can be considered to be as premium as it gets). However, we quickly realised that ability to pay for a holiday is not at all correlated with pleasantness.

We wondered if resorts that are out of the way or in otherwise not so popular places are a better hedge against this. Now, with smaller or less popular resorts, the risk of having unpleasant co-guests is smaller (since the number of co-guests is lower). However, if one or more of the co-guests happens to be unpleasant, it will impact you a lot more. And that’s a bit of a risk.
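The small-resort versus large-resort trade-off can be sketched with a toy model. The assumptions here are entirely mine: every co-guest group is independently "unpleasant" with the same probability (0.1 is a made-up number), and the impact of one unpleasant group is roughly its share of all the co-guest groups around you.

```python
def prob_at_least_one_unpleasant(n_groups, p_unpleasant):
    """Chance that at least one of n independent co-guest groups is unpleasant."""
    return 1 - (1 - p_unpleasant) ** n_groups


def share_of_one_group(n_groups):
    """Rough proxy for impact: the fraction of co-guests a single group represents."""
    return 1 / n_groups


P_UNPLEASANT = 0.1  # hypothetical; not measured from anything

for n_groups in (5, 20, 100):
    risk = prob_at_least_one_unpleasant(n_groups, P_UNPLEASANT)
    impact = share_of_one_group(n_groups)
    print(f"{n_groups:>3} groups: P(at least one unpleasant) = {risk:.3f}, "
          f"share of one group = {impact:.3f}")
```

Under these assumptions, a small resort lowers the chance of meeting an unpleasant group but makes any such group a big part of your holiday. A very large resort almost guarantees one, but dilutes it.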

Maybe the problem is with India, we thought, since one of the nice resort holidays we’ve had in the last couple of years was in Maldives. Then again, we quickly remembered the time at Taj Bentota (on our honeymoon) where the swimming pool had been taken over by a rather loud tour group, driving us nuts (and driving us away to the beach).

We thought of weekday vs weekend. Peak season vs off season. School holidays vs exam season. We were unable to draw any meaningful correlations.

There is no solution, it seemed. Then we spent time analysing why we didn’t get bugged by fellow-guests at Maldives (my wife helpfully remembered that the family at the table next to ours at one of the dinners was rather loud and obnoxious). It had to do with size. It was a massive resort. Because the resort was so massive, there would be other guests who were obnoxious. However, in the size of the resort, they would “become white noise”.

So, for now, we’ve taken a policy decision that for our further travel in India, we’ll either go to really large resorts, or we’ll do a “tourist tour” (seeing places, basically) while staying at “business hotels”. This also means that we’re unlikely to do another multi-day holiday until Covid-19 is well under control.

Postscript: Having spent a considerable amount of time in the swimming pool attached to our room, I now have a good idea on why public swimming pools haven’t yet been opened up post covid-19. Basically, I found myself blowing my nose and spitting into the pool a fair bit during the time when I was there. Since the only others using it at that time were my immediate family, it didn’t matter, but this tells you why public swimming pools may not be particularly safe.

Postscript 2: One other problem we have with Indian resorts is the late dinner. At home, we adults eat at 6pm (and our daughter before that). Pretty much every resort we’ve stayed in over the last year and a half has started serving dinner only by 8, or sometimes at 9pm. And this has sort of messed with our “systems”.

Facial appendages

Designers and manufacturers of things we wear on our face don’t seem to have taken into account the fact that people can wear multiple facial appendages at a time.

One problem that has bothered me since I was eighteen, when I got my first motorcycle, has been the clash between my spectacles (something I’ve worn since I was eight) and the full-face helmet. Design of full-face helmets has always meant that I’ve had to take the spectacles off, wear the helmet and then wear the specs back on (and then put on the visor of the helmet).

With some helmets it’s worked beautifully. But occasionally I’ve bought helmets one size too small (or borrowed my wife’s helmet), and in those cases this correlation hasn’t worked out well. There are days when I wear contact lenses first thing in the morning just because I need to take the scooter out.

And now, there is a third appendage which doesn’t work well with either the spectacles or the helmet – the facial mask to keep covid-19 germs away.

So far I’ve been completely unable to wear a helmet while not making the mask move out of position (this is irrespective of which helmet and which mask I use).

And most of my masks have not worked well with my spectacles either. They interfere with each other in several places – on the nose, on the ears, vapours from the mask fogging up my spectacles. I might start wearing my contact lenses first thing in the morning now as well, just so that I can wear a mask when I step out.

Now imagine what it would be like to wear spectacles, mask and helmet all at once.

I’m glad my hearing is good, for I’m sure you won’t be able to imagine what it’s like to wear spectacles, mask, helmet and hearing aids.

PS: I discovered this morning that I’m allergic to the N95 mask I have. It has an appendage to make it fit well on the nose, and my nose has developed rashes from it.

Correlation and causation

So I have this lecture on “smelling (statistical) bullshit” that I’ve delivered in several places, which I inevitably start with a lesson on how correlation doesn’t imply causation. I give a large number of examples of people mistaking correlation for causation, the class makes fun of everything that doesn’t apply to them, then everyone sees this wonderful XKCD cartoon and then we move on.

One of my favourite examples of correlation-causation (which I don’t normally include in my slides) has to do with religion. Praying before an exam in which one did well doesn’t necessarily imply that the prayer resulted in the good performance in the exam, I explain. So far, there has been no outward outrage at my lectures, but this does visibly make people uncomfortable.

Going off on a tangent, the time in life when I discovered for myself that I’m not religious was when I pondered over the correlation-causation issue some six or seven years back. Until then I’d had this irrational need to draw a relationship between seemingly unrelated things that had happened together once or twice, and that had given me a lot of mental stress. Looking at things from a correlation-causation perspective, however, helped clear up my mind on those things, and also made me believe that most religious activity is pointless. This was a time in life when I got immense mental peace.

Yet, for most of the world, it is not freedom from religion but religion itself that gives them mental peace. People do absurd activities only because they think these activities lead to other good things happening, thanks to a small number of occasions when these things have coincided, either in their own lives or in the lives of their ancestors or gurus.

In one of my lectures a few years back I had remarked that one reason why humans still mistake correlation for causation is religion – for if correlation did not imply causation then most religious rituals would be rendered meaningless and that would render people’s lives meaningless. Based on what I observed today, however, I think I’ve got this causality wrong.

It’s not because of religion that people mistake correlation for causation. Instead, we’ve evolved to recognise patterns whenever we observe them, and a side effect of that is that we immediately assume causation whenever we see things happening together. Religion is just a special case of application of this correlation-causation second nature to things in real life.

So my daughter (who is two and a half) and I were standing in our balcony this evening, observing that it had rained heavily last night. Heavy rain reminded my daughter of this time when we had visited a particular aunt last week – she clearly remembered watching the heavy rain from this aunt’s window. Perhaps none of our other visits to this aunt’s house really registered in the daughter’s imagination (it’s barely two months since we returned to Bangalore, so admittedly there aren’t that many data points), so this aunt’s house is inextricably linked in her mind to rain.

And this evening because she wanted it to rain heavily again, the daughter suggested that we go visit this aunt once again. “We’ll go to Inna Ajji’s house and then it will start raining”, she kept saying. “Yes, it rained the last time we went there, but it was random. It wasn’t because we went there”, I kept saying. It wasn’t easy to explain it.

You know when you are about to have a kid you develop visions of how you’ll bring her up, and what you’ll teach her, and what she’ll say to “jack” the world. Back then I’d decided that I’d teach my yet-unborn daughter that “correlation does not imply causation” and she could use it against “elders” who were telling her absurd stuff.

I hadn’t imagined that mistaking correlation for causation is so fundamental to human nature that it would be a fairly difficult task to actually teach my daughter that correlation does not imply causation! Hopefully in the next one year I can convince her.

English Premier League: Goal Difference to points correlation

So I was just looking down the English Premier League Table for the season, and I found that as I went down the list, the goal difference went lower. There’s nothing counterintuitive in this, but the degree of correlation seemed eerie.

So I downloaded the data and plotted a scatter-plot. And what do you have? A near-perfect regression. I even ran the regression and found a 96% R Square.

In other words, this EPL season has simply been all about scoring lots of goals and not letting in too many goals. It’s almost like the distribution of the goals itself doesn’t matter – apart from the relegation battle, that is!
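For anyone who wants to repeat the exercise, here is a minimal sketch of the regression and R Square calculation using nothing but the standard library. The goal difference and points figures below are made up purely for illustration; they are NOT the actual league table, and the function name is my own.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable regression, R squared is the squared correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Made-up (goal difference, points) pairs, for illustration only.
goal_difference = [40, 25, 10, 0, -5, -15, -20, -35]
points = [80, 70, 58, 50, 45, 40, 33, 25]

slope, intercept, r_squared = fit_line(goal_difference, points)
print(f"points ~ {intercept:.1f} + {slope:.2f} * goal difference, "
      f"R squared = {r_squared:.3f}")
```

Swap in the real table (one goal difference and one points total per team) to get the actual figure.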

PS: Look at the extent of Manchester City’s lead at the top. And what a scrap the relegation is!

Biases, statistics and luck

Tomorrow Liverpool plays Manchester City in the Premier League. As things stand now I don’t plan to watch this game. This entire season so far, I’ve only watched two games. First, I’d gone to a local pub to watch Liverpool’s visit to Manchester City, back in September. Liverpool got thrashed 5-0.

Then in October, I went to Wembley to watch Tottenham Hotspur play Liverpool. The Spurs won 4-1. These two remain Liverpool’s only defeats of the season.

I might consider myself to be a mostly rational person but I sometimes do fall for the correlation-implies-causation bias, and think that my watching those games had something to do with Liverpool’s losses in them. Never mind that these were away games played against other top sides which attack aggressively. And so I have this irrational “fear” that if I watch tomorrow’s game (even if it’s from a pub), it might lead to a heavy Liverpool defeat.

And so I told Baada, a Manchester City fan, that I’m not planning to watch tomorrow’s game. And he got back to me with some statistics, which he’d heard from a podcast. Apparently it’s been 80 years since Manchester City did the league “double” (winning both home and away games) over Liverpool. And that it’s been 15 years since they’ve won at Anfield. So, he suggested, there’s a good chance that tomorrow’s game won’t result in a mauling for Liverpool, even if I were to watch it.

With the easy availability of statistics, it has become a thing among football commentators to supply them during the commentary. And from first hearing, things like “never done this in 80 years” or “never done that for last 15 years” sound compelling, and you’re inclined to believe that there is something to these numbers.

I don’t remember if it was Navjot Sidhu who said that statistics are like a bikini (“what they reveal is significant but what they hide is crucial” or something). That Manchester City hasn’t done a double over Liverpool in 80 years doesn’t mean a thing, nor does it say anything that they haven’t won at Anfield in 15 years.

Basically, until the mid 2000s, City were a middling team. I remember telling Baada after the 2007 season (when Stuart Pearce got fired as City manager) that they’d be surely relegated next season. And then came the investment from Thaksin Shinawatra. And the appointment of Sven-Goran Eriksson as manager. And then the youtube signings. And later the investment from the Abu Dhabi investment group. And in 2016 the appointment of Pep Guardiola as manager. And the significant investment in players after that.

In other words, Manchester City of today is a completely different team from what they were even 2-3 years back. And they’re surely a vastly improved team compared to a decade ago. I know Baada has been following them for over 15 years now, but they’re unrecognisable from the time he started following them!

Yes, even with City being a much improved team, Liverpool have never lost to them at home in the last few years – but then Liverpool have generally been a strong team playing at home in these years! On the other hand, City’s 18-game winning streak (which included wins at Chelsea and Manchester United) only came to an end (with a draw against Crystal Palace) rather recently.

So anyways, here are the takeaways:

  1. Whether I watch the game or not has no bearing on how well Liverpool will play. The instances from this season so far are based on 1. small samples and 2. biased samples (since I’ve chosen to watch Liverpool’s two toughest games of the season)
  2. 80-year history of a fixture has no bearing since teams have evolved significantly in these 80 years. So saying a record stands so long has no meaning or predictive power for tomorrow’s game.
  3. City have been in tremendous form this season, and Liverpool have just lost their key player (by selling Philippe Coutinho to Barcelona), so City can fancy their chances. That said, Anfield has been a fortress this season, so Liverpool might just hold (or even win it).

All of this points to a good game tomorrow! Maybe I should just watch it!

 

 

Scott Adams, careers and correlation

I’ve written here earlier about how much I’ve been influenced by Scott Adams’s career advice about “being in top quartile of two or more things”. To recap, this is what Adams wrote nearly ten years back:

If you want an average successful life, it doesn’t take much planning. Just stay out of trouble, go to school, and apply for jobs you might like. But if you want something extraordinary, you have two paths:

1. Become the best at one specific thing.
2. Become very good (top 25%) at two or more things.

The first strategy is difficult to the point of near impossibility. Few people will ever play in the NBA or make a platinum album. I don’t recommend anyone even try.

Having implemented this to various degrees of success over the last 5-6 years, I propose a small correction – basically to follow the second strategy that Adams has mentioned, you need to take correlation into account.

Basically there’s no joy in becoming very good (top 25%) at two or more correlated things. For example, if you think you’re in the top 25% in terms of “maths and physics” or “maths and computer science” there’s not so much joy because these are correlated skills. Lots of people who are very good at maths are also very good at physics or computer science. So there is nothing special in being very good at such a combination.

Why Adams succeeded was that he was very good at 2-3 things that are largely uncorrelated – drawing, telling jokes and understanding corporate politics are not very correlated to each other. So the combination of these three skills of his was rather unique to find, and their combination resulted in the wildly successful Dilbert.

So the key is this – in order to be wildly successful, you need to be very good (top 25%) at two or three things that are not positively correlated with each other (either orthogonal or negative correlation works). That ensures that if you can put them together, you can offer something that very few others can offer.
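A quick simulation shows how much correlation matters here. If two skills are independent, 25% × 25% = 6.25% of people are in the top quartile of both. The sketch below assumes (my assumption, not Adams's) that skills are normally distributed and draws two skills with a chosen correlation, then counts how many people clear the top-25% bar on both. The function name and sample size are arbitrary.

```python
import math
import random


def share_in_top_quartile_of_both(correlation, n_people=50_000, seed=0):
    """Simulated share of people in the top 25% of two correlated skills."""
    rng = random.Random(seed)
    skill_a, skill_b = [], []
    for _ in range(n_people):
        a = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # Standard construction of a second normal with the given correlation to a.
        b = correlation * a + math.sqrt(1 - correlation ** 2) * noise
        skill_a.append(a)
        skill_b.append(b)
    # Empirical 75th-percentile cutoffs for each skill.
    cutoff_a = sorted(skill_a)[int(0.75 * n_people)]
    cutoff_b = sorted(skill_b)[int(0.75 * n_people)]
    both = sum(1 for a, b in zip(skill_a, skill_b)
               if a >= cutoff_a and b >= cutoff_b)
    return both / n_people


for correlation in (0.8, 0.0, -0.8):
    share = share_in_top_quartile_of_both(correlation)
    print(f"correlation {correlation:+.1f}: {share:.1%} are top-quartile in both")
```

The more positively correlated the two skills, the more crowded the "very good at both" club gets. With a negative correlation it gets much emptier than 6.25%, which is the rarity the argument above relies on.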

Then again, the problem there is that the market for this combination of skills will be highly illiquid – low supply means people who might demand such combinations would have adapted to make do with some easier to find substitute, so demand is lower, and so on. So in that sense, again, it’s a massive hit-or-miss!

Writing and depression

It is now a well-documented fact (that I’m too lazy to google and provide links) that there exists a relationship between mental illness and creative professions such as writing.

Most pieces that talk about this relationship draw the causality in one way – that the mental illness helped the writer (or painter or filmmaker or whoever) focus and channel emotions into the product.

Having taken treatment for depression in the past, and having just finished a manuscript of a book, I might tend to agree that there exists a relationship between creativity and depression. However, I wonder if the causality runs the other way.

I’ve mentioned here a couple of months back that writing a book is hard because you are working months together with little tangible feedback, and there’s a real possibility that it might flop miserably. Consequently, you put fight to make the product as good as you can.

In the absence of feedback, you are your greatest critic, and you read, and re-read what you’ve written; you edit, and re-edit your passages until you’re convinced that they’re as good as they can be.

You get obsessed with your product. You start thinking that if it’s not perfect it is all doomed. You downplay the (rather large) random component that might affect the success of the product, and instead focus on making it as perfect as you can.

And this obsession can drive you mad. There are days when you sit with your manuscript and feel useless. There are times when you want to chuck months’ effort down the drain. And that depresses you. And affects other parts of your life, mostly negatively!

Again, it’s rather early for me to be writing this blog post – I’m yet to start marketing my book to publishers. However, it’s important that I document this relationship and causality now – before either spectacular success or massive failure takes me over!