It’s not just about status

Rob Henderson writes that in general, relative to the value they add to their firms, senior employees are underpaid and junior employees are overpaid. This, he reasons, is because senior employees trade off money for status.

Quoting him in full:

Robert Frank suggests the reason for this is that workers would generally prefer to occupy higher-ranked positions in their work groups than lower-ranked ones. They’re forgoing more earnings to hold a higher-status position in their organization.

But this preference for a higher-status position can be satisfied within any given organization.

After all, 50 percent of the positions in any firm must always be in the bottom half.

So the only way some workers can enjoy the pleasure inherent in positions of high status is if others are willing to bear the dissatisfactions associated with low status.

The solution, then, is to pay the low-status workers a bit more than they are worth to get them to stay. The high-status workers, in contrast, accept lower pay for the benefit of their lofty positions.

I’m not sure I agree. Yes, I do agree that higher productivity employees are underpaid and lower productivity employees are overpaid. However, I don’t think status fully explains it. There are also issues of variance and correlation and liquidity (there – I’m talking like a real quant now).

On the variance front – the higher you are in the organisation and the higher your salary is, the greater the variance of your contribution to the organisation. For example, if you are being paid $350,000 (the number Henderson hypothetically uses), the actual value you are bringing to your firm might have a mean of $500,000 and a standard deviation of $200,000 (pulling all these numbers out of thin air, while making some sense checks that broadly risk pricing holds).

On the other hand, if you are being paid $35,000, then it is far more likely that the average value you bring to the firm is $40,000 with a standard deviation of $5,000 (again numbers entirely pulled out of thin air). Notice the drastic difference in the coefficient of variation in the two cases.
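To put numbers on that difference, here is a minimal sketch that computes the coefficient of variation (standard deviation divided by mean) for the two cases. The inputs are the same thin-air figures used above, and the function name is just something I made up for illustration.

```python
# Coefficient of variation (CV) = standard deviation / mean.
# All figures are the illustrative, made-up numbers from the text above.

def coefficient_of_variation(mean, std_dev):
    """Return the standard deviation as a fraction of the mean."""
    return std_dev / mean

# Senior employee: paid 350K, value added has mean 500K and sd 200K
senior_cv = coefficient_of_variation(mean=500_000, std_dev=200_000)
senior_pay_share = 350_000 / 500_000

# Junior employee: paid 35K, value added has mean 40K and sd 5K
junior_cv = coefficient_of_variation(mean=40_000, std_dev=5_000)
junior_pay_share = 35_000 / 40_000

print(f"senior CV: {senior_cv:.3f}")  # 0.400
print(f"junior CV: {junior_cv:.3f}")  # 0.125
print(f"senior pay as share of mean value: {senior_pay_share:.1%}")  # 70.0%
print(f"junior pay as share of mean value: {junior_pay_share:.1%}")  # 87.5%
```

With these assumed numbers, the senior employee's contribution is over three times as uncertain relative to its mean, and the pay sits at a correspondingly deeper discount to that mean.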

Putting it another way, the more productive you are, the harder it is for any organisation to put a precise value on your contribution. Henderson might say “you are worth 500K while you earn 350K” but the former is an average number. It is because of the high variance in your “worth” that you are paid far lower than what you are worth on average.

And why does this variance exist? It’s due to correlation.

More so at higher ranked positions (as an aside – my weird career path means that I’ve NEVER been in middle management) the value you can add to a company is tightly coupled with your interactions with your colleagues and peers. As a junior employee your role can be defined well enough that your contributions are stable irrespective of how you work with the others. At senior levels though a very large part of the value you can add is tied to how you work with others and leverage their work in your contributions.

So one way a company can get you to contribute more is to have a good set of peers you like working with, which increases your average contribution to the firm. Rather paradoxically, because you like your peers (assuming peer liking in senior management is two way), the company can get away with paying you a little less than your average worth and you will continue to stick on. If you don’t like working with your colleagues, there is the double whammy that you will add less to the company and you need to be paid more to stick on. And so if you look at people who are actually successful in their jobs at a senior level, they will all appear to be underpaid relative to their peers.

And finally there is liquidity (can I ever theorise about something without bringing this up?). The more senior you go, the less liquid is the market for your job. The number of potential jobs that you want to do, and which might want you, is very very low. And as I’ve explained in the first chapter of my book, when a market is illiquid, the bid-ask spread can be rather high. This means that even holding the value of your contribution to a company constant, there can be a large variation in what you are actually paid. And that is again why, on average, senior employees are underpaid.

So yes, there is an element of status. But there are also considerations of variance, correlation and bid-ask. And selection bias (senior employees who are overpaid relative to the value they add don’t last very long in their jobs). And this is why, on average, you can afford to underpay senior employees.

A day at an award function

So I got an award today. It is called “exemplary data scientist”, and was given out by the Analytics India Magazine as part of their MachineCon 2022. I didn’t really do anything to get the award, apart from existing in my current job.

I guess having been out of the corporate world for nearly a decade, I had so far completely missed out on the awards and conferences circuit. I would see old classmates and colleagues put pictures on LinkedIn collecting awards. I wouldn’t know what to make of it when my oldest friend would tell me that whenever he heard “eye of the tiger”, he would mentally prepare to get up and go receive an award (he got so many I think). It was a world alien to me.

Parallelly, I used to crib about how while I’m well networked in India, and especially in Bangalore, my networking within the analytics and data science community is shit. In a way, I was longing for physical events to remedy this, and would lament that the pandemic had killed those.

So I was positively surprised when about a month ago Analytics India Magazine wrote to me saying they wanted to give me this award, and it would be part of this in-person conference. I knew of the magazine, so after asking around a bit on legitimacy of such awards and looking at who had got it the last time round, I happily accepted.

Most of the awardees were people like me – heads of analytics or data science at some company in India. And my hypothesis that my networking in the industry was shit was confirmed when I looked at the list of attendees – of 100 odd people listed on the MachineCon website, I barely knew 5 (of which 2 didn’t turn up at the event today).

Again I might sound like a n00b, but conferences like today’s are classic two sided markets (read this eminently readable paper on two sided markets and pricing of the same by Jean Tirole of the University of Toulouse). On the one hand are awardees – people like me and 99 others, who are incentivised to attend the event with the carrot of the award. On the other hand are people who want to meet us, who will then pay to attend the event (or sponsor it; the entry fee for paid tickets to the event was a hefty $399).

It is like “ladies’ night” that pubs have, where on particular days of the week, women who go to the pub get a free drink. This attracts women, which in turn attracts men who seek to court the women. And what the pub spends in subsidising the women it makes back in terms of greater revenue from the men on the night.

And so it was at today’s conference. I got courted by at least 10 people, trying to sell me cloud services, “AI services on the cloud”, business intelligence tools, “AI powered business intelligence tools”, recruitment services and the like. Before the conference, I had received LinkedIn requests from a few people seeking to sell me stuff at the conference. In the middle of the conference, I got a call from an organiser asking me to step out of the hall so that a sponsor could sell to me.

I held a poker face with stock replies like “I’m not the person who makes this purchasing decision” or “I prefer open source tools” or “we’re building this in house”.

With full benefit of hindsight, Radisson Blu in Marathahalli is a pretty good conference venue. An entire wing of the ground floor of the hotel is dedicated for events, and the AIM guys had taken over the place. While I had not attended any such event earlier, it had all the markings of a well-funded and well-organised event.

As I entered the conference hall, the first thing that struck me was the number of people in suits. Most people were in suits (though few wore ties; and as if the conference expected people to turn up in suits, the goodie bag included a tie, a pair of cufflinks and a pocket square). And I’m just not used to that. Half the days I go to office in shorts. When I feel like wearing something more formal, I wear polo T-shirts with chinos.

My colleagues who went to the NSE last month to ring the bell to take us public all turned up in company T-shirts and jeans. And that’s precisely what I wore to the conference today, though I had recently procured a “formal uniform” (polo T-shirt with company logo, rather than my “usual uniform” which is a round neck T-shirt). I was pretty much the only person there in “uniform”. Towards the end of the day, I saw one other guy in his company shirt, but he was wearing a blazer over it!

Pretty soon I met an old acquaintance (who I hadn’t known would be at the conference). He introduced me to a friend, and we went for coffee. I was eating a cookie with the coffee, and had an insight – at conferences, you should eat with your left hand. That way, you don’t touch the food with the same hand you use to touch other people’s hands (surprisingly I couldn’t find sanitiser dispensers at the venue).

The talks, as expected, were nothing much to write about. Most were by sponsors selling their wares. The one talk that wasn’t by a sponsor was delivered by a guy who was introduced as “his greatgrandfather did this. His grandfather did that. And now this guy is here to talk about ethics of AI”. Full Challenge Gopalakrishna feels happened (though, unfortunately, the Kannada fellows I’d hung out with earlier that day hadn’t watched the movie).

I was telling some people over lunch (which was pretty good) that talking about ethics in AI at a conference has become like worshipping Ganesha as part of any elaborate pooja. It has become the de rigueur thing to do. And so you pay obeisance to the concept and move on.

The awards function had three sections. The first section was for “users of AI” (from what I understood). The second (where I was included) was for “exemplary data scientists”. I don’t know what the third was for (my wife is ill today so I came home early as soon as I’d collected my award), except that it would be given by fast bowler and match referee Javagal Srinath. Most of the people I’d hung out with through the day were in the Srinath section of the awards.

Overall it felt good. The drive to Marathahalli took only 45 minutes each way (I drove). A lot of people had travelled from other cities in India to reach the venue. I met a few new people. My networking in data science and analytics is still not great, but far better than it used to be. I hope to go for more such events (though we need to figure out how to do these events without the talks).

PS: Everyone who got the award in my section was made to line up for a group photo. As we posed with our awards, an organiser said “make sure all of you hold the prizes in a way that the Intel (today’s chief sponsor) logo faces the camera”. “I guess they want Intel outside”, I joked. It seemed to be well received by the people standing around me. I didn’t talk to any of them after that, though.

The “intel outside” pic. Courtesy: https://www.linkedin.com/company/analytics-india-magazine/posts/?feedView=all

 

Proof of work

I like to say sometimes that one reason I never really get crypto is that it involves the concept of “proof of work”. That phrase sort of triggers me. It reminds me of all the times when I was in school when I wouldn’t get full marks in maths despite getting all the answers correct because I “didn’t show working”.

In any case, I spent about fifteen minutes early this morning drinking my aeropress and deleting LinkedIn connection requests. Yeah, you read that right. It took that long to refuse all the connection requests I had got since yesterday, when I put a fairly innocuous post saying I’m hiring.

I understand that the market is rather tough nowadays. Companies are laying employees off ($) left right and centre (in fact, this (paywalled) article prompted my post – I’m hoping to find good value in the layoff market). Interest rates are going up. Stock prices are going down. Startup funding has slowed. The job market is not easy. And so you see an innocuous post like this getting such a massive reaction.

In any case, the reason I was thinking about “proof of work” is that the responses to my post reminded me of my own (unsuccessful) job hunts from a few years back. I remember randomly applying through LinkedIn. I remember using easy apply. And I remember pretty much not hearing back from anyone.

Time for a bollywood break:

Yes, the choice of where I’ve started this video is deliberate. As I was spending time this morning refusing all the LinkedIn connection requests (some 500+ people I have no clue about had simply added me without any manner of introduction or purpose), I was thinking of this song.

I followed a simple strategy – I engaged with people who had cared to write a note (or InMail) to me along with the connection request, and I just ignored the rest. As I kept hitting “ignore ignore ignore … ” on my phone (while sipping coffee with the other hand), I realised that I almost hit “ignore” on one of my company HRs who had added me. A few minutes later, I actually hit ignore on a colleague who I’ve actually worked with (I made amends by sending him back a connection request that he accepted).

Given the flood of requests that I had got, I was forced to use a broad brush. I was forced to use simple heuristics rather than evaluating each application on its true merit. I’m pretty sure I’ve made plenty of errors of omission today (that said, my heuristic has thrown up a bunch of fairly promising candidates).

In any case, if you think about it, the heuristic I used can pretty well be described as “proof of work”. And what the proof of work achieved here was to help people stand out in a crowded market. That there was some work showed a certain minimum threshold of interest, and that was sufficient to get my attention, which is all that mattered here. And on a related note, during normal times (when I get a maximum of one or two LinkedIn requests each day), I do take the effort to evaluate each request on its own merit. No proof of work is necessary.

And if you think about it, “proof of work” is rather prevalent in the natural world. A peacock’s feathers are the most commonly quoted example of this one. The beautiful tail comes at a huge cost in terms of agility and ability to fly, and the tail is a way for the peacock to show off to potential mates that “I can carry this thing and yet stay alive so imagine how fit my genes are. Mate with me”.

Anyway, back to the hiring market, you need a way to stand out. Maybe a nicely written cover letter. Maybe a referral (or “influence” as we used to pejoratively call this back in the 90s). Maybe a strong github profile. (Ok the last one is literally a proof of work!)

Else you will just get swept away with the tide.

 

PS: In general, I was also thinking of the wisdom of writing to someone at a time when you know he/she will be flooded with other messages. The bar for you to stand out is much much higher. Being contrarian helps I guess.

So many numbers! Must be very complicated!

The story dates back to 2007. Fully retrofitting, I was in what can be described as my first ever “data science job”. After having struggled for several months to string together a forecasting model in Java (the bugs kept multiplying and cascading), I’d given up and gone back to the familiarity of MS Excel and VBA (remember that this was just about a year after I’d finished my MBA).

My seat in the office was near a door that led to the balcony, where smokers would gather. People walking to the balcony, with some effort, could see my screen. No doubt most of them would’ve seen me spending 90% (or more) of my time on Google Talk (it’s ironical that I now largely use Google Chat for work). If someone came at an auspicious time, though, they would see me really working, which was using MS Excel.

I distinctly remember this one time this guy who shared my office cab walked up behind me. I had a full sheet of Excel data and was trying to make sense of it. He took one look at my screen and exclaimed, “oh, so many numbers! Must be very complicated!” (FWIW, he was a software engineer). I gave him a fairly dirty look, wondering what was complicated about a fairly simple dataset on Excel. He moved on, to the balcony. I moved on, with my analysis.

It is funny that, fifteen years down the line, I have built my career in data science. Yet, I just can’t make sense of large sets of numbers. If someone sends me a sheet full of numbers I can’t make out the head or tail of it. Maybe I’m a victim of my own obsessions, where I spend hours visualising data so I can make some sense of it – I just can’t understand matrices of numbers thrown together.

At the very least, I need the numbers formatted well (in an Excel context, using either the “,” or “%” formats), with all numbers in a single column right aligned and rounded off to the exact same number of decimal places (it annoys me that by default, Excel autocorrects “84.0” (for example) to “84” – that disturbs this formatting. Applying “,” fixes it, though). Sometimes I demand that conditional formatting be applied on the numbers, so I know which numbers stand out (again I have a strong preference for red-white-green (or green-white-red, depending upon whether the quantity is “good” or “bad”) formatting). I might even demand sparklines.

But send me a sheet full of numbers and without any of the above mentioned decorations, and I’m completely unable to make any sense or draw any insight out of it. I fully empathise now, with the guy who said “oh, so many numbers! must be very complicated!”

And I’m supposed to be a data scientist. In any case, I’d written a long time back about why data scientists ought to be good at Excel.

Recruitment and diversity

This post has potential to become controversial and is related to my work, so I need to explicitly state upfront that all opinions here are absolutely my own and do not, in any way, reflect those of my employers or colleagues or anyone else I’m associated with.

I run a rather diverse team. Until my team grew inorganically two months back (I was given more responsibility), there were eight of us in the team. Each of us has masters degrees (ok we’re not diverse in that respect). Sixteen degrees / diplomas in total. And from sixteen different colleges / universities. The team’s masters degrees are in at least four disjoint disciplines.

I have built this part of my team ground up. And have made absolutely no attempt to explicitly foster diversity in my team. Yet, I have a rather diverse team. You might think it is by accident. You might find weird axes on which the team is not diverse at all (masters degrees is one). I simply think it is because there was no other way.

I like to think that I have fairly high standards when it comes to hiring. Based on the post-interview conversations I have had with my team members, these standards have percolated to them as well. This means we have a rather tough task hiring. This means very few people even qualify to be hired by my team. Earlier this year I asked for a bigger hiring budget. “Let’s see if you can exhaust what you’ve been given, and then we can talk”, I was told. The person who told me this was not being sarcastic – he was simply aware of my demand-supply imbalance.

Essentially, in terms of hiring I face such a steep demand-supply imbalance that even if I wanted to, it would be absolutely impossible for me to discriminate while hiring, either positively or negatively.

If I want to hire less of a certain kind of profile (whatever that profile is), I would simply be letting go of qualified candidates. Given how long it takes to find each candidate in general, imagine how much longer it would take to find candidates if I were to only look at a subset of applicants (to prefer a category I want more of in my team). Any kind of discrimination (apart from things critical to the job such as knowledge of mathematics and logic and probability and statistics, and communication) would simply mean I’m shooting myself in the foot.

Not all jobs, however, are like this. In fact, a large majority of jobs in the world are of the type where you don’t need a particularly rare combination of skills. This means potential supply (assuming you are paying decently, treating employees decently, etc.) far exceeds demand.

When you’re operating in this kind of a market, cost of discrimination (either positive or negative) is rather low. If you were to rank all potential candidates, picking up number 25 instead of number 20 is not going to leave you all that worse off. And so you can start discriminating on axes that are orthogonal to what is required to do the job. And that way you can work towards a particular set of “diversity (or lack of it) targets”.

Given that a large number of jobs (not weighted by pay) belong to this category, the general discourse is that if you don’t have a diverse team it is because you are discriminating in a particular manner. What people don’t realise is that it is pretty impossible to discriminate in some cases.

All that said, I still stand by my 2015 post on “axes on diversity”. Any externally visible axis of diversity – race / colour / gender / sex / sexuality – is likely to diminish diversity in thought. And – again this is my personal opinion – I value diversity in thought and approach much more than the visible sources of diversity.

 

Compensation at the right tail

Yesterday I was reading this article ($) about how Liverpool FC is going about (not) retaining its star forwards Sadio Mane and Mo Salah, who have been key parts of the team that has (almost) “cracked it” in the last 5 seasons.

One of the key ideas in the (paywalled) piece is that Liverpool is more careful about spending on its players than other top contemporary clubs. As Oliver Kay writes:

[…] the Spanish club have the financial strength to operate differently — retaining their superstars well into their 30s and paying them accordingly until they are perceived to have served their purpose, at which point either another A-list star or one of the most coveted youngsters in world football (an Eder Militao, an Eduardo Camavinga, a Vinicius Junior, a Rodrygo and perhaps imminently, an Aurelien Tchouameni) will usually emerge to replace them.

In an ideal world, Liverpool would do something similar with Salah and Mane, just as Manchester City did with Vincent Kompany, Fernandinho, Yaya Toure, David Silva and Sergio Aguero — and as they will surely do with De Bruyne.

But the reality is that the Merseyside club are more restricted. Not dramatically so, but restricted enough for Salah, Mane and their agents to know there is more to be earned elsewhere, and that presents a problem not just when it comes to retaining talent but also when it comes to competing for the signings that might fill the footsteps of today’s heroes.

To go back to fundamentals, earnings in sport follow a power law distribution – a small number of elite players make a large portion of the money. And the deal with the power law is that it is self-similar – you can cut off the distribution at any arbitrary amount, and what remains to the right is still a power law.

So income in football follows a power law. Income in elite football also follows the same power law. The English Premier League is at the far right end of this, but wages there again follow a power law. If you look at really elite players in the league, again it is a (sort of – since number of data points would have become small by now) power law.
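A quick way to convince yourself of this self-similarity is to simulate it. The sketch below draws synthetic "wages" from a Pareto distribution, then estimates the tail exponent on the full sample and again after chopping off everything below a cutoff. The exponent of 2.0, the cutoffs and the function names are all assumptions I've made for illustration – none of this is real wage data. The underlying reason is that for a Pareto variable, the chance of exceeding x given that you already exceed c is (x/c) raised to minus alpha, which is the same form as the original distribution.

```python
import math
import random

def sample_pareto(alpha, x_min, n, rng):
    """Draw n values from a Pareto distribution with minimum x_min."""
    # rng.paretovariate(alpha) samples a Pareto with minimum value 1
    return [x_min * rng.paretovariate(alpha) for _ in range(n)]

def estimate_alpha(values, cutoff):
    """Maximum-likelihood estimate of the tail exponent above cutoff."""
    tail = [v for v in values if v >= cutoff]
    return len(tail) / sum(math.log(v / cutoff) for v in tail)

rng = random.Random(42)  # fixed seed so the run is repeatable
wages = sample_pareto(alpha=2.0, x_min=1.0, n=200_000, rng=rng)

# Cut the distribution at higher and higher thresholds: the estimated
# exponent should stay close to 2.0, only getting noisier as data thins out.
for cutoff in (1.0, 5.0, 20.0):
    tail_size = sum(1 for w in wages if w >= cutoff)
    alpha_hat = estimate_alpha(wages, cutoff)
    print(f"cutoff={cutoff:>5}: n={tail_size:>6}, alpha estimate={alpha_hat:.2f}")
```

The estimate gets noisier as the cutoff rises, which is the same "number of data points would have become small by now" caveat as above.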

What this means is that if you can define “marginal returns to additional skill”, at this far right end of the distribution it can be massive. For example, the article talks about how Salah has been offered a 50% hike (to make him the best paid Liverpool player ever), but that is still short of what some other (perceptibly less skilled) footballers earn.

So how do you go about getting value while operating in this kind of a market? One approach, that Liverpool seems to be playing, is to go Moneyball. “The marginal cost of getting a slightly superior player is massive, so we will operate not so far out at the right tail”, seems to be their strategy.

This means not breaking the bank for any particular player. It means ruthlessly assessing each player’s costs and benefits and acting accordingly (though sometimes it comes across as acting without emotion). For example, James Milner has just got an extension in his contract, but at lower wages to reflect his marginally decreased efficiency as he gets older.

Some of the other elite clubs (Real Madrid, PSG, Manchester City, etc.), on the other hand, believe that the premium for marginal quality is worth it, and so splurge on the elite players (including keeping them till fairly late in their careers even if it costs a lot). The rationale here is that the differences (to the “next level”) might be small, but these differences are sufficient to outperform compared to their peers (for example, Manchester City has won the league by one point over Liverpool twice in the last four seasons).

(Liverpool’s moneyball approach, of course, means that they try to get these “marginal advantages” in other (cheaper) ways, like employing a throw in coach or neuroscience consultants).

This approach is not without risk, of course. At the far right end of the tail, the variance in output can be rather high. Because the marginal cost of small increases in competence is so high, even if a player slightly underperforms, the effective monetary value of this underperformance is massive – you have paid for insanely elite players to win you everything, but they win you nothing.

And the consequences can be disastrous, as FC Barcelona found out last year.

In any case, the story doing the rounds now is that Barcelona want to hire Salah, but given their financial situation, they can’t afford to buy out his contract at Liverpool. So, they are hoping that he will run down his contract and join them on a free transfer next year. Then again, that’s what they had hoped from Gini Wijnaldum two years ago as well. And he’s ended up at PSG, where (to the best of my knowledge) he doesn’t play much.

Alcohol and sleep

A few months back we’d seen this documentary on Netflix (I THINK) on the effects of alcohol on health. Like you would expect from a well-made documentary (rather than a polemic), the results were inconclusive. There were a few mildly positive effects, some negative effects, some indicators on how alcohol can harm your health, etc.

However, the one thing I remember from that documentary is about alcohol’s effect on sleep – that drinking makes you sleep worse (contrary to popular imagination where you can easily pass out if you drink a lot). And I have now managed to validate that for myself using data.

The more perceptive of you might know that I log my life. I have a spreadsheet where every day I record some vital statistics (sleep and meal times, anxiety, quality of work, etc. etc.). For the last three months I’ve also had an Apple Watch, which makes its own recordings of my vital statistics.

Until this morning these two data sets had been disjoint – until I noticed an interesting pattern in my average sleeping heart rate. And then I decided to join them and do some analysis. A time series to start:

Notice the three big spikes in recent times. And they only seem to be getting higher (I’ll come to that in a bit).

And then sometimes a time series doesn’t do justice to patterns – absent the three recent big spikes it’s hard to see from this graph if alcohol has an impact on sleep heart rate. This is where a boxplot can help.

The difference is evident here – when I have alcohol, my heart rate during sleep is much higher, which means I don’t rest as well.
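For anyone who wants to try this on their own logs, the join-and-compare step could look roughly like the sketch below. To be clear, this is not my actual analysis code and the rows are invented placeholders, not my data; the variable names and the date-keyed layout are assumptions too. It uses only the Python standard library.

```python
from statistics import median

# Hypothetical daily log: date -> whether I had alcohol that day
daily_log = {
    "2022-01-01": False,
    "2022-01-02": True,
    "2022-01-03": False,
    "2022-01-04": True,
    "2022-01-05": False,
    "2022-01-06": True,
    "2022-01-07": False,  # no watch reading for this day
}

# Hypothetical watch export: date -> average sleeping heart rate (bpm)
watch_sleep_hr = {
    "2022-01-01": 62,
    "2022-01-02": 66,
    "2022-01-03": 61,
    "2022-01-04": 70,
    "2022-01-05": 63,
    "2022-01-06": 68,
    "2022-01-08": 64,  # no log entry for this day
}

# Inner join on date: keep only days present in both data sets
joined = [
    (day, daily_log[day], watch_sleep_hr[day])
    for day in sorted(daily_log)
    if day in watch_sleep_hr
]

# Split heart rates by the alcohol flag, which is what a boxplot groups on
by_group = {True: [], False: []}
for _day, had_alcohol, bpm in joined:
    by_group[had_alcohol].append(bpm)

for had_alcohol, rates in by_group.items():
    label = "alcohol" if had_alcohol else "no alcohol"
    print(f"{label}: n={len(rates)}, median sleeping HR={median(rates)}")
```

The grouped lists in `by_group` are what you would hand to a plotting library to draw the boxplot.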

That said, like everything else in the world, it is not binary. Go back to the time series and see – I’ve had alcohol fairly often in this time period but my heart rate hasn’t spiked as much on all days. This is where quantity of alcohol comes in.

Most days when I drink, it’s largely by myself at home. A glass or two of either single malt or wine. And the impact on sleep is only marginal. So far so good.

On 26th, a few colleagues had come home. We all drank Talisker. I had far more than I normally have. And so my heart rate spiked (79). And then on June 1st, I took my team out to Arbor. Pretty much for the first time in 2022 I was drinking beer. I drank a fair bit. 84.

And then on Saturday I went for a colleague’s birthday party. There were only cocktails. I drank lots of rum and coke (I almost never drink rum). 89.

My usual drinking, if you see, doesn’t impact my health that much. But big drinking is big problem, especially if it’s a kind of alcohol I don’t normally drink.

Now, in the interest of experimentation, one of these days I need to have lots of wine and see how I sleep!

PS: FWIW Sleeping heart rate is uncorrelated with how much coffee I have

PS2: Another time I wrote about alcohol

PS3: Maybe in my daily log I need to convert the alcohol column from binary to numeric (and record the number of units of alcohol I drink)

 

Structures of professions and returns to experience

I’ve written here a few times about the concept of “returns to experience”. Basically, in some fields such as finance, the “returns to experience” is rather high. Irrespective of what you have studied or where, how long you have continuously been in the industry and what you have been doing has a bigger impact on your performance than your way of thinking or education.

In other domains, returns to experience is far less. After a few years in the profession, you would have learnt all you had to, and working longer in the job will not necessarily make you better at it. And so you see that the average 15 years experience people are not that much better than the average 10 years experience people, and so you see salaries stagnating as careers progress.

While I have spoken about returns to experience, till date, I hadn’t bothered to figure out why returns to experience is a thing in some, and only some, professions. And then I came across this tweetstorm that seeks to explain it.

Now, normally I have a policy of not reading tweetstorms longer than six tweets, but here it was well worth it.

It draws upon a concept called “cognitive flexibility theory”.

Basically, there are two kinds of professions – well-structured and ill-structured. To quickly summarise the tweetstorm, well-structured professions have the same problems again and again, and there are clear patterns. And in these professions, first principles are good to reason out most things, and solve most problems. And so the way you learn it is by learning concepts and theories and solving a few problems.

In ill-structured domains (e.g. business or medicine), the concepts are largely the same but the way the concepts manifest in different cases is vastly different. As a consequence, just knowing the theories or fundamentals is not sufficient in being able to understand most cases, each of which is idiosyncratic.

Instead, study in these professions comes from “studying cases”. Business and medicine schools are classic examples of this. The idea with solving lots of cases is NOT that you can see the same patterns in a new case that you see, but that having seen lots of cases, you might be able to reason HOW to approach a new case that comes your way (and the way you approach it is very likely novel).

Picking up from the tweetstorm once again:

 

It is not hard to see that when the problems are ill-structured or “wicked”, the more cases you have seen in your life, the better placed you are to attack the problem. Naturally, assuming you continue to learn from each incremental case you see, the returns to experience in such professions are high.

In securities trading, for example, the market takes very many forms, and irrespective of what chartists will tell you, patterns seldom repeat. The concepts are the same, however. Hence, you treat each new trade as a “case” and try to learn from it. So returns to experience are high. And so when I tried to reenter the industry after 5 years away, I found it incredibly hard.

Chess, on the other hand, is well-structured. Yes, AlphaZero might come and go, but a lot of the general principles simply remain.

Having read this tweetstorm, gobbled a large glass of wine and written this blogpost (so far), I’ve been thinking about my own profession – data science. My sense is that data science is an ill-structured profession where most practitioners pretend it is well-structured. And this is possibly because a significant proportion of practitioners come from academia.

I keep telling people about my first brush with what can now be called data science – I was asked to build a model to forecast demand for air cargo (2006-7). The said demand being both intermittent (one order every few days for a particular flight) and lumpy (a single order could fill up a flight, for example), it was an incredibly wicked problem.

Having had a rather unique career path in this “industry” I have, over the years, been exposed to a large number of unique “cases”. In 2012, I’d set about trying to identify patterns so that I could “productise” some of my work, but the ill-structured nature of problems I was taking up meant this simply wasn’t forthcoming. And I realise (after having read the above-linked tweetstorm) that I continue to learn from cases, and that I’m a much better data scientist than I was a year back, and much much better than I was two years back.

On the other hand, because data science attracts a lot of people from pure science and engineering (classically well-structured fields), you see a lot of people trying to apply overly academic or textbook approaches to problems that they see. As they try to divine problem patterns that don’t really exist, they fail to recognise novel “cases”. And so they don’t really learn from their experience.

Maybe this is why I keep saying that “in data science, years of experience and competence are not correlated”. However, fundamentally, that ought NOT to be the case.

This is also perhaps why a lot of data scientists, irrespective of their years of experience, continue to remain “junior” in their thinking.

PS: The last few paragraphs apply equally well to quantitative finance and economics. They are ill-structured professions that some practitioners (thanks to well-structured backgrounds) assume are well-structured.

Christian Rudder and Corporate Ratings

One of the studdest book chapters I’ve read is from Christian Rudder’s Dataclysm. Rudder is a cofounder of OkCupid, now part of the match.com portfolio of matchmakers. In this book, he has used OkCupid’s own data to draw insights about human life and behaviour.

It is a typical non-fiction book, with a studmax first chapter, after which it gets progressively weaker. And it is the first chapter (which I’ve written about before) that I’m going to talk about here. There is a nice write-up and extract on Maria Popova’s website (which used to be called BrainPickings) here.

Quoting Maria Popova:

What Rudder and his team found was that not all averages are created equal in terms of actual romantic opportunities — greater variance means greater opportunity. Based on the data on heterosexual females, women who were rated average overall but arrived there via polarizing rankings — lots of 1’s, lots of 5’s — got exponentially more messages (“the precursor to outcomes like in-depth conversations, the exchange of contact information, and eventually in-person meetings”) than women whom most men rated a 3.

In one-hit markets like love (you only need to love and be loved by one person to be “successful” in this), high volatility is an asset. It is like option pricing if you think about it – higher volatility means greater chance of being in the money, and that is all you care about here. How deep out of the money you are just doesn’t matter.
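To make the one-hit logic concrete, here is a minimal sketch. The ratings, the "only a 5 sends a message" threshold and the number of viewers are all assumptions I have made up for illustration, not anything from Rudder's data. It compares two profiles with the same average rating and asks how likely each is to get at least one "hit".

```python
from statistics import mean

def p_at_least_one_hit(ratings, threshold, viewers):
    """Chance that at least one of `viewers` independent raters, each drawn
    uniformly from `ratings`, scores the profile at or above `threshold`."""
    p_hit = sum(r >= threshold for r in ratings) / len(ratings)
    return 1 - (1 - p_hit) ** viewers

# Hypothetical profiles: both average exactly 3.
polarising = [1, 5] * 5   # lots of 1s, lots of 5s
consensus = [3] * 10      # everyone says 3

assert mean(polarising) == mean(consensus) == 3

# Assumed rule: a viewer sends a message only on a rating of 5.
print(p_at_least_one_hit(polarising, threshold=5, viewers=10))
print(p_at_least_one_hit(consensus, threshold=5, viewers=10))
```

Under these assumptions the polarising profile is almost certain to get at least one message, while the consensus profile gets none, even though the two averages are identical.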

I was thinking about this in some random context this morning when I was also thinking of the corporate appraisal process. Now, the difference between dating and appraisals is that on OkCupid you might get several ratings on a 5-point scale, but in your office you only get one rating each year on a 5-point scale. However, if you are a manager, and especially if you are managing a large team, you will GIVE out lots of ratings each year.

And so I was wondering – what does the variance of ratings you give out tell about you as a manager? Assuming that HR doesn’t impose any “grading on curve” thing, what does it say if you are a manager who gave out an average rating of 3, with standard deviation 0.5, versus a manager who gave an average of 3, with all employees receiving 1s and 5s?

From a corporate perspective, would you rather want a team full of 3s, or a team with a few 5s and a few 1s (who, it is likely, will leave)? Once again, if you think about it, it depends on your Vega (returns to volatility). In some sense, it depends on whether you are running a stud or a fighter team.

If you are running a fighter team, where there is no real “spectacular performance” but you need your people to grind it out, not make mistakes, pay attention to detail and do their jobs, you want a team full of 3s. The 5s in this team don’t contribute that much more than a 3. And 1s can seriously hurt your performance.

On the other hand, if you’re running a stud team, you will want high variance. Because by the sheer nature of work, in a stud team, the 5s will add significantly more value than the 1s might cause damage. When you are running a stud team, a team full of 3s doesn’t work – you are running far below potential in that case.
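Here is a rough sketch of the stud versus fighter argument in numbers. The two payoff tables are entirely hypothetical (pulled out of thin air); they are only shaped to match the description above: in fighter work a 5 adds little over a 3 and a 1 hurts a lot, while in stud work a 5 adds far more than a 1 destroys.

```python
# Hypothetical value each rating level contributes, per person.
FIGHTER_VALUE = {1: -4.0, 2: 0.0, 3: 2.0, 4: 2.5, 5: 3.0}   # flat top, steep downside
STUD_VALUE = {1: -1.0, 2: 0.0, 3: 1.0, 4: 3.0, 5: 10.0}     # big upside, small downside

def team_value(ratings, value_table):
    """Total value of a team, given each member's rating."""
    return sum(value_table[r] for r in ratings)

all_threes = [3] * 6        # low-variance team, mean rating 3
ones_and_fives = [1, 5] * 3  # high-variance team, mean rating 3

for name, table in (("fighter", FIGHTER_VALUE), ("stud", STUD_VALUE)):
    print(name,
          "| all 3s:", team_value(all_threes, table),
          "| 1s and 5s:", team_value(ones_and_fives, table))
```

With these made-up payoffs, the team of 3s wins in fighter work and the team of 1s and 5s wins in stud work, which is just the positive or negative Vega restated.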

Assuming that your team has delivered, then maybe the distribution of ratings across the team is a function of whether it does more stud or fighter work? Or am I force fitting my pet theory a bit too much here?

Conductors and CAPM

For a long time I used to wonder why orchestras have conductors. I possibly first noticed the presence of the conductor sometime in the 1990s when Zubin Mehta was in the news. And then I always wondered why this person, who didn’t play anything but stood there waving a stick, needed to exist. Couldn’t the orchestra coordinate itself like rockstars or practitioners of Indian music forms do?

And then I came across this video a year or two back.

And then the computer science training I’d gone through two decades back kicked in – the job of an orchestra conductor is to reduce an O(n^2) problem to an O(n) problem.

For a group of musicians to make music, they need to coordinate with each other. Yes, they have the staff notation and all that, but still they need to know when to speed up or slow down, when to make what transitions, etc. They may have practiced together but the professional performance needs to be flawless. And so they need to constantly take cues from each other.

When you have n musicians who need to coordinate, you have n(n-1)/2 pairs of people who need to coordinate. When n is small, this is trivial, and so you see that small ensembles or rock bands can easily coordinate. However, as n gets large, the number of pairs grows as n^2, far faster than n itself. And that is a problem, and a risk.

Enter the conductor. Rather than taking cues from one another, the musicians now simply need to take cues from this one person. And so there are now only n pairs that need to coordinate – each musician in the band with the conductor. Or an O(n^2) problem has become an O(n) problem!
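A quick sketch of the counting, for anyone who wants to see how fast the gap opens up. The function names and the ensemble sizes are mine, chosen purely for illustration.

```python
def pairwise_links(n):
    """Pairs that must coordinate when every musician cues off every other."""
    return n * (n - 1) // 2

def conductor_links(n):
    """Pairs that must coordinate when every musician cues off one conductor."""
    return n

# Illustrative sizes: a quartet, a chamber group, a large orchestra.
for n in (4, 10, 100):
    print(n, "musicians:", pairwise_links(n), "pairs without a conductor,",
          conductor_links(n), "with one")
```

A quartet has 6 pairs to manage, which is fine without a conductor; at 100 musicians it is 4950 pairs against 100.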

For whatever reason, while I was thinking about this yesterday, I got reminded of legendary finance professor R Vaidya’s class on the capital asset pricing model (CAPM), or as he put it “Sharpe single index model” (surprisingly all the links I find for this are from Indian test prep sites, so not linking).

We had just learnt portfolio theory, and how using the expected returns, variances and correlations between a set of securities we could construct an “efficient frontier” of portfolios that could give us the best risk-adjusted return. Seemed very mathematically elegant, except that in case you needed to construct a portfolio of n stocks, you needed n(n-1)/2 pairwise correlations. In other words, an O(n^2) problem.

And then Vaidya introduced CAPM, which magically reduced the problem to an O(n) problem. By introducing the concept of an index, all that mattered for each stock now was its beta – the sensitivity of its returns to the index returns. You didn’t need to care about how stocks moved with each other any more – all you needed was the relationship with the index.
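To see where the O(n) comes from, here is a minimal sketch under the usual single-index assumption that stock-specific (residual) returns are uncorrelated with each other. In that case the covariance between two different stocks is just beta_i × beta_j × variance of the index, so the whole covariance matrix can be rebuilt from per-stock numbers. The betas and variances below are made up for illustration.

```python
def pairwise_inputs(n):
    """Distinct pairwise correlations needed without an index."""
    return n * (n - 1) // 2

def single_index_inputs(n):
    """Inputs needed with an index: n betas, n residual variances, 1 index variance."""
    return 2 * n + 1

def single_index_cov(betas, resid_vars, index_var):
    """Covariance matrix implied by the single-index assumption:
    cov(i, j) = beta_i * beta_j * index_var, plus residual variance on the diagonal."""
    n = len(betas)
    return [
        [betas[i] * betas[j] * index_var + (resid_vars[i] if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]

# Hypothetical numbers for three stocks.
betas = [0.8, 1.0, 1.5]
resid_vars = [0.02, 0.01, 0.05]
index_var = 0.04

for row in single_index_cov(betas, resid_vars, index_var):
    print(row)

print(pairwise_inputs(500), "correlations vs", single_index_inputs(500), "index inputs")
```

For 500 stocks that is 124750 pairwise correlations against 1001 index-based inputs, which is the conductor trick all over again.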

In a sense, if you think about it, the index in CAPM is like the conductor of an orchestra. If only all O(n^2) problems could be reduced to O(n) problems this elegantly!