ethnicity – Pertinent Observations

Axes of diversity

Companies and educational institutions, especially those that have a global footprint and a reputation to protect, make a big deal of their diversity policies. It is almost impossible to sit through a recruitment or admissions talk by any such entity without a mention of its diversity policies, of which it is proud.

And they have good reasons to have a diverse workforce. It has been shown, for example, that diversity leads to better decision-making and better overall performance. Having a diverse workforce brings together people with different backgrounds, and since backgrounds influence opinion, a more diverse team is more likely to have diversity of opinion, which results in better decision-making. And so forth.

The problem, however, is that it is not easy to simultaneously achieve diversity on all possible axes. Let’s say that we have defined a number of axes, and are looking to recruit an incoming MBA class. If we want diversity on each of these axes, selection of each candidate is going to rule out a large number of other candidates and we will need a really large pool to choose from. In other words, it is akin to the eight queens problem (where you have to place eight queens on a chessboard such that no two of them are on the same row, column or diagonal). For those of you not familiar with chess, think of it like a Sudoku puzzle.

Since a pool of candidates large enough to achieve diversity on all axes is simply not feasible, firms and schools choose to prioritise certain axes over others, and seek to achieve diversity on these chosen axes. And since they can arbitrarily choose which axes to prioritise, the incentive is to pick those axes where diversity is most visible.

And so when you go to a global organisation or school that preaches diversity, you will notice that they indeed have a very diverse workforce/student body in terms of gender, race, and nationality, which are fairly visible dimensions. Beyond this, the choice of dimensions to impose diversity on is a matter of discretion. So you have organisations which seek diversity in sexual orientation. Others seek diversity in age profile. Yet others in educational backgrounds. And so forth.

The result of prioritising more “visible” dimensions to ensure diversity is that organisations end up becoming horribly similar in the “sacrificed dimensions”. Check out this excerpt from Peter Thiel’s Zero to One, for example, on the founding members of PayPal:

The early PayPal team worked well together because we were all the same kind of nerd. We all loved science fiction: Cryptonomicon was required reading, and we preferred the capitalist Star Wars to the communist Star Trek.

Now, remember that this was a fairly diverse team when it came to ethnicity, nationality and sexuality. But in a less visible dimension, the team was not diverse at all. And Thiel mentions it in his book as if it’s a good thing that they all thought so similarly.

On a similar note, I once worked for an organisation that made much of its diversity policy, and the organisation was pretty diverse in terms of pretty much every visible axis of diversity. And the seminars (some compulsory) they organised helped me significantly broaden my outlook on issues such as race or sexual orientation. But when it came to work, the (fairly large) team was horribly similar. Quoting from an earlier blogpost (a bit ranty, I admit):

First, a large number of guys building models come from similar backgrounds, so they think similarly. Because so many people think similarly, the rest train themselves to think similarly (or else get nudged out, by whatever means). So you have massive organizations full of massively talented brilliant minds which all think similarly! Who is to ask the uncomfortable questions?

So essentially, because you had a large organisation of people from basically similar educational backgrounds (masters and PhDs in similar subjects), their way of thinking became dominant, and others were forced to conform. This led to groupthink, which could have led to mishaps (but didn’t, at least not in my time).

And what of the Ivy League schools that again pride themselves on (visible forms of) diversity? Here is an excerpt from William Deresiewicz’s excellent 2008 essay:

Elite schools pride themselves on their diversity, but that diversity is almost entirely a matter of ethnicity and race. With respect to class, these schools are largely—indeed increasingly—homogeneous. Visit any elite campus in our great nation and you can thrill to the heartwarming spectacle of the children of white businesspeople and professionals studying and playing alongside the children of black, Asian, and Latino businesspeople and professionals. At the same time, because these schools tend to cultivate liberal attitudes, they leave their students in the paradoxical position of wanting to advocate on behalf of the working class while being unable to hold a simple conversation with anyone in it.

So the next time you want to make your organisation diverse, think of which axes you want diversity on. If you are public-minded and want to brag about your diversity, the obvious way to go would be to be diverse on visible axes, but that leaves other issues. On the other hand you could put together a team of people that look the same but think different!

It’s entirely up to you!

What is the feminine of Amit?

“Amit” is a word that is commonly used, often pejoratively, to refer to men from the North of India. The reason for the usage of “Amit” in this context is that while it is an extremely common name for men from North India, it is not as common in other parts of India, and thus it characterises men from North India.

A question that has been floating around in social media circles for a long time in this connection is what the feminine form of “Amit” is. If Amit characterises the median North Indian male, what name characterises the median North Indian female? Popular candidates for this are Neha, Isha and Pooja. Pooja suffers from the fact that it is also a fairly common name in other parts of India. Isha, while it might be strongly North Indian, is too obscure. And for some reason, people are loath to accept Neha as the feminine Amit. So how do we resolve this?

I, being a stud, am a big follower of the Hanuman principle. If you have to solve a problem, and it takes no more effort to solve a generic problem, then solve the generic problem and apply it to this problem as a special instance rather than spending time to solve each instance. Hence, we will rephrase this problem as “What first name uniquely identifies a particular ethnicity?”. I, being a quant, am going to use the quantitative hammer to hammer down this nail. So we can rephrase as “how can we quantitatively characterise ethnicities by first names?”

The first thing to notice is that we need a frame of reference. Amit is a good name to characterise a North Indian man among the universe of Indian men. However, if we define the universe differently, as “Asian” for example, or “men living in Delhi”, Amit may not be as characteristic at all. Hence, any formula that we develop needs to take into account the frame of reference.

Secondly, what makes a name ethnically characteristic? I argue that there are two factors, and these two will be used in deriving the final formula. Firstly, the name should be common among the particular ethnicity – for example, Murugaselvan is extremely characteristic of Tamil men, but its occurrence is so low that using Murugaselvan as the median Tamil man among all Indian men is futile. Secondly, the name should be distinctive for that particular community. For example, a possible competitor to Amit is Rahul, a name that is possibly as common among North Indians as Amit is (I haven’t seen the statistics). The problem with Rahul, however, is that it is a fairly common name in South India also! So it does a bad job in terms of discrimination. So basically what we are looking for is a name that is both popular in the ethnicity we want to characterise, and also characteristic of that particular ethnicity in comparison to the rest of the universe.

These two requirements lead to the following rather simple formula (I’m not claiming that this is the best formula – if there is a way to objectively evaluate such formulas, that is – but it is sufficiently good and simple to understand and evaluate). Let our universe be U and the community we are trying to characterise be C. C’ is {U – C} (I’m assuming all of you know set theoretic notation). The first name N that characterises the community C is the one that maximises P(N|C) – P(N|C’). That’s it. Simple.

To explain in English, for each first name, we calculate the incidence of that particular name in the community C. That is, for example, what proportion of North Indian girls are named Neha, Pooja, Isha, Nidhi, etc. Next, we calculate the incidence of the name in the “complement of C”, that is how likely is it that someone in the rest of the “universe” we have defined has the same name. In our above example, we calculate what proportion of Indian but NOT North Indian girls (taking Indian women as the universe) are named Neha, Pooja, Isha, Nidhi, etc. Then, for each name, we subtract the latter quantity from the former quantity and then select the name for which this difference is maximum! Rather simple, I would think!
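For those who prefer code to English, here is a minimal Python sketch of the same calculation. Everything in it is an assumption for illustration: the function name most_characteristic_name, the (first name, community label) record format, and above all the toy counts, which are invented to exercise the formula and are not real statistics about anybody's names.

```python
from collections import Counter

def most_characteristic_name(records, community):
    """Return (best_name, scores) for `community`.

    `records` is the universe U: an iterable of (first_name, label) pairs.
    C is every record whose label equals `community`; C' is the rest of U.
    scores[N] = P(N|C) - P(N|C').
    """
    in_c, out_c = Counter(), Counter()
    for name, label in records:
        if label == community:
            in_c[name] += 1
        else:
            out_c[name] += 1

    n_in, n_out = sum(in_c.values()), sum(out_c.values())
    if n_in == 0 or n_out == 0:
        raise ValueError("both C and its complement must be non-empty")

    # A name that never occurs in C has P(N|C) = 0, so its score is at
    # most zero and it cannot win; only names seen in C are scored.
    scores = {name: in_c[name] / n_in - out_c[name] / n_out for name in in_c}
    best = max(scores, key=scores.get)
    return best, scores

# Invented toy universe (NOT real data): 10 "north" records, 10 "rest".
TOY_RECORDS = (
    [("Neha", "north")] * 4 + [("Pooja", "north")] * 4
    + [("Isha", "north")] + [("Nidhi", "north")]
    + [("Pooja", "rest")] * 3 + [("Lakshmi", "rest")] * 4
    + [("Divya", "rest")] * 3
)

if __name__ == "__main__":
    best, scores = most_characteristic_name(TOY_RECORDS, "north")
    print(best, scores)
```

In this made-up universe Pooja is as common within C as the winner, but she is penalised for being common outside C as well, which is exactly the Rahul problem described earlier. Changing the frame of reference simply means passing in a different set of records.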

Now, we need data. Unfortunately I can’t seem to find any publicly available data sets that contain long lists of names along with markers of ethnicity (address or city or state or language preference or some such). If you can help me with some data sets, we can actually run the above formula for different ethnicities and characterise them. It is going to be a fun exercise, I promise! So pour in the data. And I request you to share publicly available data and not proprietary data.

And then we can once and for all finish this debate of what the feminine form of Amit is, along with many other fun ethnic classifications.

Remembering Names and Pattern Recognition

I spent the first half of this week attending a Pan-Asia training program in Hong Kong. Most of the people attending this program were from the Tokyo and Hong Kong offices of our firm, and most of them happened to be natives of China, Japan and Korea. It was a wonderful training program and gave much scope for networking. The biggest surprise to me, however, was how bad I was during the two days at remembering names – something I consider myself good at.

We Indians constantly crib that Westerners are usually bad at catching our names, while we don’t have much trouble remembering theirs. Thinking about it, I think name recognition is basically an exercise in pattern recognition, and the ease of remembering a certain class of names depends on how easily we can recognize those patterns.

If you are familiar with the broad class of names of a particular ethnicity (let’s say Indian Hindu, for example), you don’t really need to remember the name as a collection of syllables. You only need to know, say, the first letter, or an abstract concept which is what the name means, or a combination of these, and it is likely that you can remember the full name.

The thing with Western names, however, is that due to Hollywood, or sport, or colonial rule, or the fact that Indian Christians have names similar to those of mostly Christian Westerners, most Western names are familiar to us. And because of this familiarity, it is not hard at all for us to remember the name of the average Westerner. On the contrary, due to lack of exposure, Westerners can’t recognize patterns in Indian names, because of which it is hard for them to remember our names.

It is due to a lack of general familiarity with Chinese and Japanese names that I found it so hard to remember names during the recent trip. There was no way I could break names down into easy combinations of syllables (yeah, for example, Hi-ro-hi-to consists of all easy syllables, but how many people called Hirohito would you know, for you to remember the whole name by just remembering part of it?), and so I had the additional responsibility of remembering all the syllables in each name and the order in which they occurred.

On a related note, a disproportionate share of the people of Chinese origin at the training had a Christian (Western) first name and a Chinese last name (e.g. Michael Chang). But then I suppose this is because a lot of Chinese people adopt a “Western name” to make matters simple when they migrate or something (so, for example, someone called Chang Sun-Wang will convert his name to Stephen Chang).