Stereotypes and correlations

Earlier on this blog, I’ve argued in favour of stereotypes. “In the absence of further information, stereotypes give you a strong Bayesian prior”, I had written (I’m paraphrasing myself here). I had gone on to say (paraphrasing myself yet again), “however, it is important that you treat this as a weak prior and update it as and when you get new information. So in the presence of additional information, you need to let go of the stereotypes”.

A lot of stereotyping is due to spurious correlations, often formed from a small number of training samples. My mother, for example, strongly believed that if you drink alcohol, you must be a bad person. She once explained to me why she thought so – a few of her friends had fathers or husbands who drank alcohol, and these friends had had to endure domestic abuse.

That is only one extreme correlation stereotype. We keep making these stereotypes based on correlation all the time. I’m not saying that the correlation is not positive – sometimes it can be extremely positive. Just that it may not have full explainability.

For example, certain ways of dressing have come to be associated with certain attitudes (black t-shirts and heavy metal, for example). So when we see someone exhibiting one side of this correlation, our minds are naturally drawn to associating them with the other side of the correlation as well (so you see someone in a black heavy metal band t-shirt, and immediately assume that they must be interested in heavy metal – to take a trivial example).

And then when their further behaviour belies the correlation that you had instinctively made, your mind gets messed up.

There was this guy in my batch at IIT Madras, who used to wear a naama (vertical religious mark on forehead commonly worn by Iyengars) on his forehead a lot of the time. Unlike most other undergrads, he also preferred to wear dhotis. So you would see him in his dhoti and naama and assume he was a religious conservative. And then you would see his hand, which would usually be held up showing a prominent middle finger, and all your mental correlations would go for a toss.

Another such example that I’ve spoken about on this blog before is that of the “puritan topper” – having seen a few topper types who otherwise led austere lives, I had assumed that kind of behaviour was correlated with being a topper (in some ways I can now argue that this blog is getting a bit meta).

I find myself doing this all the time. I observe someone’s accent and make assumptions about their abilities or the lack of them. I see someone’s dressing sense and build a whole story in my head on that single data point. I see the way someone is walking, and that supposedly tells me about their state of mind that day.

The good thing I’ve done is to internalise my last year’s blogpost – while all these single data point correlations are fine as a prior (in the absence of other information), the moment I get more information I immediately update them, and the initial stereotypes go out of the window.

The other thing I’m thinking of is – sometimes some of these random spurious correlations are so ingrained in our heads that we let them influence us. We take a certain job and decide that it is associated with a certain way of dressing and also start dressing the same way (thus playing up the stereotypes). We know the sort of clothes most people wear to a certain kind of restaurant, and also dress that way – again playing up the stereotypes.

Without realising it, maybe because of mimetic desire or a desire to fit in, we end up furthering random correlations and stereotypes. So maybe it is time to make a conscious effort to start breaking these stereotypes? But no – you won’t see me wear a suit to work any time soon.

I’ll end with another school anecdote. For whatever reason, many of the topper types in my 11th standard class would wear the school uniform sweater to school every single day, irrespective of how hot or cold it was. And then one fine (and not cold) day, yet another guy showed up in the uniform sweater. “How come you’re wearing this sweater”, I asked. He replied, “Oh, I just wanted to look more intellectual!”

 

Compression Stereotypes

One of the most mindblowing things I learnt while I was doing my undergrad in Computer Science and Engineering was Lempel-Ziv-Welch (LZW) compression. It’s one of the standard compression algorithms used everywhere nowadays.

The reason I remember this is twofold – firstly, I remember implementing this as part of an assignment (our CSE program at IITM was full of those), and feeling happy to be coding in C rather than in the dreaded Java (which we had to use for most other assignments).

The other is that this is one of those algorithms that I “internalised” while doing something totally different – in this case I was having coffee/tea with a classmate in our hostel mess.

I won’t go into the algorithm here. However, the basic concept is that as and when we see a new pattern, we give it a code, and every subsequent occurrence of that pattern is replaced by its corresponding code. And the beauty of it is that you don’t need to ship a separate dictionary – the compressed code itself encapsulates it.

Anyway, in practical terms, the more the same kind of patterns are repeated in the original file, the more the file can be compressed. In some sense, the more the repetition of patterns, the less the overall “information” that the original file can carry – but that discussion is for another day.
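
For the curious, here is a minimal sketch of the idea in Python. It is not the C code from that assignment, only an illustration under simplifying assumptions: the input is text whose characters all have code points below 256, the codes are kept as a plain list of integers rather than packed into bits, and the names lzw_compress and lzw_decompress are my own. Note that the decompressor never receives the dictionary; it rebuilds the same table from the codes as it goes.

```python
def lzw_compress(text):
    """Turn text into a list of integer codes (assumes every character has ord < 256)."""
    table = {chr(i): i for i in range(256)}  # one starting code per single character
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in table:
            # Pattern seen before: keep extending it.
            current = candidate
        else:
            # New pattern: emit the code for the known prefix, then register the pattern.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the text from the codes alone, reconstructing the table on the fly."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the pattern that is being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        pieces.append(entry)
        table[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)


if __name__ == "__main__":
    repetitive = "abab" * 50
    varied = "abcdefghijklmnopqrstuvwxyz"
    for sample in (repetitive, varied):
        codes = lzw_compress(sample)
        assert lzw_decompress(codes) == sample
        print(len(sample), "characters ->", len(codes), "codes")
```

The repetitive string needs far fewer codes than it has characters, while the string with no repeated patterns needs one code per character.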

I’ve been thinking of compression in general and LZW compression in particular when I think of stereotyping. The whole idea of stereotyping is that we are fundamentally lazy, and want to “classify” or categorise or pigeon-hole people using the fewest number of bits necessary.

And so, we use lazy heuristics – gender, caste, race, degrees, employers, height, even names, etc. to make our assumptions of what people are going to be like. This is fundamentally lazy, but also effective – in a sense, we have evolved to stereotype people (and objects and animals) because that allows our brain to be efficient; to internalise more data by using fewer bits. And for this precise reason, to some extent, stereotyping is rational.

However, the problem with stereotypes is that they can frequently be wrong. We might see a name and assume something about a person, and they might turn out to be completely different. The rational response to this is not to beat oneself up for stereotyping in the first place – it is to update one’s priors with the new information that one has learnt about this person.

So, you might have used a combination of pre-known features of a person to categorise him/her. The moment you realise that this categorisation is wrong, you ought to invest additional bits in your brain to classify this person so that the stereotype doesn’t remain any more.

The more idiosyncratic and interesting you are, the more the number of bits that will be required to describe you. You are very very different from any of the stereotypes that can possibly be used to describe you, and this means people will need to make that effort to try and understand you.

One of the downsides of being idiosyncratic, though, is that most people are lazy and won’t make the effort to use the additional bits required to know you, and so will grossly mischaracterise you using one of the standard stereotypes.

On yet another tangential note, getting to know someone is a Bayesian process. You make your first impressions of them based on whatever you find out about them, and go on building a picture of them incrementally based on the information you find out about them. It is like loading a picture on a website using a bad internet connection – first the picture appears grainy, and then the more idiosyncratic features can be seen.

The problem with refusing to use stereotypes, or demonising stereotypes, is that you fail to use the grainy pictures when that is the best available, and instead infinitely wait to get better pictures. On the other hand, failing to see beyond stereotypes means that you end up using grainy pictures when more clear ones are available.

And both of these approaches are suboptimal.
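
The “start with the grainy picture, then refine it” process can be sketched with Bayes’ rule. This is only an illustration: the function names and every number in it are made up for the example, and each observation is assumed to be independent of the others given the hypothesis. The stereotype supplies the prior; each new piece of information about the person moves the belief.

```python
def bayes_update(prior, p_if_true, p_if_false):
    """Posterior P(H | E), given P(H), P(E | H) and P(E | not H)."""
    numerator = prior * p_if_true
    evidence = numerator + (1.0 - prior) * p_if_false
    return numerator / evidence


def refine(prior, observations):
    """Apply bayes_update once per observation and return every belief along the way."""
    beliefs = [prior]
    for p_if_true, p_if_false in observations:
        beliefs.append(bayes_update(beliefs[-1], p_if_true, p_if_false))
    return beliefs


if __name__ == "__main__":
    # Hypothetical numbers: the stereotype says 70% "religious conservative".
    stereotype_prior = 0.7
    # Each pair is (P(observation | conservative), P(observation | not conservative)).
    # All three observations are more likely if the stereotype is wrong.
    contrary_evidence = [(0.2, 0.6), (0.2, 0.6), (0.2, 0.6)]
    for step, belief in enumerate(refine(stereotype_prior, contrary_evidence)):
        print("after", step, "observations: belief =", round(belief, 3))
```

Refusing to stereotype amounts to refusing to set stereotype_prior at all; failing to see beyond the stereotype amounts to never calling bayes_update.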

PS: I’ve sometimes wondered why I find it so hard to remember certain people’s faces. And I realise that it’s usually because they are highly idiosyncratic and not easy to stereotype / compress (both are the same thing). And so it takes more effort to remember them, and if I don’t really need to remember them so much, I just don’t bother.

The Base Rate in Hitting on People

Last week I met a single friend at a bar. He remarked that had I been late, or not turned up at all, he would have seriously considered chatting up a couple of women at the table next to ours.

This friend has spent considerable time in several cities. The conversation moved to how conducive these cities are for chatting up people, and what occasions are appropriate for chatting up. In Delhi, for example, he mentioned that you never try and chat up a strange woman – you are likely to be greeted with a slap.

In Bombay, he said, it depends on where you chat up. What caught my attention was when he mentioned that in hipster cafes, the ones that offer quinoa bowls and vegan smoothies, it is rather normal to chat up strangers, whether you are doing so with a romantic intent or not. One factor he mentioned was the price of real estate in Bombay which means most of these places have large “communal tables” that encourage conversation.

The other thing we spoke about was how the sort of food and drink such places serve creates a sort of “brotherhood” (ok, not an appropriate analogy when talking about chatting up women), which automatically “qualifies” you as not being a creep, so that your chatting up is taken seriously.

This got me thinking about the concept of “base rates” or “priors”. I spent the prime years of my youth in IIT Madras, which is by most accounts a great college, but where for some inexplicable reason, not too many women apply to get in. That results in a rather lopsided ratio that you would associate more with a dating app in India than with a co-educational college.

In marketing you have the concept of a “qualified lead”. When you randomly call a customer to pitch your product, there is a high chance that she will hang up on you. So you need to “prime” the customer to expect your call and respond positively. Building your brand helps. Also, doing something that gauges the customer’s interest before the call, and calling only once the interest is established, helps.

What you are playing on here in marketing is the “base rate” or the “prior” that the customer has in her head. By building your brand, you automatically place yourself in a better place in the customer’s mind, so she is more likely to respond positively. If, before the call, the customer expects to have a better experience with you, that increases the likelihood of a positive outcome from the call.

And this applies to chatting up women as well. The lopsided ratio at IIT Madras, where I spent the prime of my youth, meant that you started with a rather low base rate (the analogy with dating apps in India is appropriate). Consequently, chatting up women meant that you had to give an early signal that you were not a creep, or that you were a nice guy (the lopsided ratio also turns most guys there into misogynists, and not particularly nice. This is a rather vicious cycle). Of course, you could build your brand with grades or other things, but it wasn’t easy.

Coming from that prior, it took me a while to adjust to situations with better base rates. It made me hesitant for a long time, and for whatever reason I assumed I was a “low base rate” guy (I’m really glad, in hindsight, that my wife “approached” me (on Orkut) and said the first few words. Of course, once we’d chatted for a while, I moved swiftly to put her in my “basket”).

Essentially, when we lack information, we stereotype someone with the best information we have about them. When the best information we have about them is not much, we start with a rather low prior, and it is upon them to impress us soon enough to upgrade them. And upgrading yourself in someone’s eyes is not an easy business. And so you should rather start from a position where the base rate is high enough.

And this “upgrade” is not necessarily linear – you can also use this to brand yourself in the axes that you want to be upgraded. Hipster cafes provide a good base rate that you like the sort of food served there. Sitting in a hipster cafe with a MacBook might enhance your branding (increasingly, sitting in a cafe with a Windows laptop that is not a Surface might mark you out as an overly corporate type). Political events might help iff you are the overly political type (my wife has clients who specify the desired political leanings of potential spouses). Caste groups on Orkut or Facebook might help if that is the sort of thing you like. The axes are endless.

All that matters is that whatever improved base rate you seek to achieve by doing something, the signal you send out needs to be credible. Else you can get downgraded very quickly once you’ve got the target’s attention.