Speed, Accuracy and Shannon’s Channel Coding Theorem

I was probably the CAT topper in my year (2004). They don’t give out ranks, only percentiles (to two digits of precision), so this is a stochastic measure. I was also perhaps the only person (or one of the very few) to get into the IIMs that year despite getting 20 questions wrong.

It just happened that I had attempted far more questions than most other people. And so even though my accuracy was rather poor, my speed more than made up for it, and I ended up doing rather well.

I remember a time during my CAT prep when the guy who ran my CAT factory suggested that I was making too many errors, and that I should possibly slow down and make fewer mistakes. I tried that in a few mock exams. I ended up attempting far fewer questions, and my accuracy (measured as the percentage of answers I got wrong) didn’t change by much. So it was an easy decision to forget about accuracy and focus on speed, and that served me well.

However, what serves you well in an entrance exam need not necessarily serve you well in life. An exam is, by definition, an artificial space. It is usually bounded by certain norms (of the format). And so, you can make blanket decisions such as “let me just go for speed”, and you can get away with it. In a way, an exam is a predictable space. It is a caricature of the world. So your learnings from there don’t extend to life.

In real life, you can’t “get away with 20 wrong answers”. If you have done something wrong, you are (most likely) expected to correct it. Which means, in real life, if you are inaccurate in your work, you will end up making further iterations.

Observing myself, and people around me (literally and figuratively at work), I sometimes wonder if there is a sort of efficient frontier in terms of speed and accuracy. For a given level of speed and accuracy, can we determine an “ideal gradient” – the direction in which a person needs to move in order to make the maximum impact?

Once in a while, I take book recommendations from academics, and end up reading (rather, trying to read) academic books. Recently, someone had recommended a book that combined information theory and machine learning, and I started reading it. Needless to say, within half a chapter, I was lost, and I had abandoned the book. Yet, the little I read served the useful purpose of reminding me of Shannon’s channel coding theorem.

Paraphrasing, what it states is that irrespective of how noisy a channel is, with the right kind of encoding and redundancy, we can reliably send information across it at any rate up to a certain maximum (the channel’s capacity). The noisier the channel, the more redundancy we need, and the lower this maximum speed of transmission.
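As a toy illustration of the theorem’s flavour (my own example, not from the book): for a binary symmetric channel that flips each bit with probability p, the capacity works out to 1 − H(p) bits per channel use, where H is the binary entropy. A few lines of Python show how capacity falls as the channel gets noisier:

```python
import math

def binary_entropy(p):
    """Entropy H(p) in bits of a biased coin with heads probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity (bits per channel use) of a binary symmetric channel
    that flips each transmitted bit with probability p."""
    return 1 - binary_entropy(p)

for p in (0.0, 0.01, 0.1, 0.25):
    print(f"error rate {p:>5}: capacity {bsc_capacity(p):.3f} bits/use")
```

At p = 0 you get the full 1 bit per use; as p approaches 0.5 (pure noise), reliable throughput drops to zero – which is the precise version of “noisier channel, lower speed”.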

In my opinion (and in the opinions of several others, I’m sure), this is a rather profound observation, and has significant impact on various aspects of life. In fact, I’m prone to abusing it in inexact manners (no wonder I never tried to become an academic).

So while thinking of the tradeoff between speed and accuracy, I started thinking of the channel coding theorem. You can think of a person’s work (or “working mind”) as a communication channel. The speed is the raw speed of transmission. The accuracy (rather, the lack of it) is a measure of noise in the channel.

So the less accurate someone is, the more the redundancy they require in communication (or in work). For example, if you are especially prone to mistakes (like I am sometimes), you might need to redo your work (or at least a part of it) several times. If you are the more accurate types, you need to redo less often.

And different people have different speed-accuracy trade-offs.

I don’t have a perfect way to quantify this, but maybe we can think of “true speed of work” as the actual speed at which someone does a piece of work divided by the number of iterations they need to get it right. OK, it is not so straightforward (there might be other ways to build redundancy – like getting two independent people to do the same thing and then tallying the numbers), but I suppose you get the drift.
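To make that arithmetic concrete (the function and the numbers below are entirely made up for illustration):

```python
def effective_speed(raw_speed, expected_iterations):
    """Crude 'true speed of work': raw speed divided by the number of
    iterations (including redos) needed to get the work right."""
    return raw_speed / expected_iterations

# A fast-but-sloppy worker vs a slow-but-accurate one (illustrative numbers)
fast_sloppy = effective_speed(raw_speed=10, expected_iterations=4)
slow_accurate = effective_speed(raw_speed=4, expected_iterations=1)
print(fast_sloppy, slow_accurate)  # the slower worker is "truly" faster here
```

Double the raw speed with four times the redos, and you have actually gone backwards.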

The interesting thing here is that speed and accuracy depend not only on the person but also on the nature of the work itself. For me, a piece of work that on average takes an hour has a different speed-accuracy tradeoff compared to a piece of work that on average takes a day (usually, the more complicated and involved a piece of analysis, the higher my error rate).

In any case, the point to be noted is that the speed-accuracy tradeoff is different for different people, and in different contexts. For some people, in some contexts, there is no point at all in expecting highly accurate work – you know they will make mistakes anyways, so you might as well get the work done quickly (to allow for more time to iterate).

And in a way, figuring out speed-accuracy tradeoffs of the people who work for you is an important step in getting the best out of them.

 

Christian Rudder and Corporate Ratings

One of the studdest book chapters I’ve read is from Christian Rudder’s Dataclysm. Rudder is a cofounder of OkCupid, now part of the match.com portfolio of matchmakers. In this book, he has used OkCupid’s own data to draw insights about human life and behaviour.

It is a typical non-fiction book – a studmax first chapter, and then it gets progressively weaker. And it is the first chapter (which I’ve written about before) that I’m going to talk about here. There is a nice write-up and extract on Maria Popova’s website (which used to be called BrainPickings) here.

Quoting Maria Popova:

What Rudder and his team found was that not all averages are created equal in terms of actual romantic opportunities — greater variance means greater opportunity. Based on the data on heterosexual females, women who were rated average overall but arrived there via polarizing rankings — lots of 1’s, lots of 5’s — got exponentially more messages (“the precursor to outcomes like in-depth conversations, the exchange of contact information, and eventually in-person meetings”) than women whom most men rated a 3.

In one-hit markets like love (you only need to love and be loved by one person to be “successful” in this), high volatility is an asset. It is like option pricing if you think about it – higher volatility means greater chance of being in the money, and that is all you care about here. How deep out of the money you are just doesn’t matter.
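A quick simulation illustrates the option-pricing analogy (entirely my own sketch: a one-period lognormal price move, zero interest rates, arbitrary numbers):

```python
import math
import random

def call_value_mc(spot, strike, vol, n=50_000, seed=42):
    """Monte Carlo value of a one-period call option under a lognormal
    price move (zero rates; purely illustrative, not a pricing library)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        # drift-adjusted lognormal move so the expected price stays at spot
        terminal = spot * math.exp(-0.5 * vol * vol + vol * z)
        total += max(terminal - strike, 0.0)  # payoff is floored at zero
    return total / n

low_vol = call_value_mc(100, 100, 0.1)
high_vol = call_value_mc(100, 100, 0.5)
print(low_vol, high_vol)  # higher volatility -> higher option value
```

The payoff only cares about the upside; the downside is floored at zero, so more volatility can only help – exactly the “how deep out of the money you are doesn’t matter” point.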

I was thinking about this in some random context this morning when I was also thinking of the corporate appraisal process. Now, the difference between dating and appraisals is that on OkCupid you might get several ratings on a 5-point scale, but in your office you only get one rating each year on a 5-point scale. However, if you are a manager, and especially if you are managing a large team, you will GIVE out lots of ratings each year.

And so I was wondering – what does the variance of the ratings you give out tell about you as a manager? Assuming that HR doesn’t impose any “grading on a curve”, what does it say if you are a manager who gave out an average rating of 3 with a standard deviation of 0.5, versus a manager who gave an average of 3 with all employees receiving 1s and 5s?

From a corporate perspective, would you rather want a team full of 3s, or a team with a few 5s and a few 1s (who, it is likely, will leave)? Once again, if you think about it, it depends on your Vega (returns to volatility). In some sense, it depends on whether you are running a stud or a fighter team.

If you are running a fighter team, where there is no real “spectacular performance” but you need your people to grind it out, not make mistakes, pay attention to detail and do their jobs, you want a team full of 3s. The 5s in this team don’t contribute that much more than a 3. And 1s can seriously hurt your performance.

On the other hand, if you’re running a stud team, you will want high variance. Because by the sheer nature of work, in a stud team, the 5s will add significantly more value than the 1s might cause damage. When you are running a stud team, a team full of 3s doesn’t work – you are running far below potential in that case.
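One way to caricature this in code (the payoff functions below are entirely made up, and only meant to capture “5s add little, 1s hurt a lot” versus “5s dominate”):

```python
steady = [3, 3, 3, 3]
spiky = [1, 5, 1, 5]   # same average rating, much higher variance

def fighter_payoff(ratings):
    """Grind-it-out team: a 5 adds little over a 3, a 1 hurts a lot."""
    return sum(min(r, 4) - (3 if r == 1 else 0) for r in ratings)

def stud_payoff(ratings):
    """Hit-driven team: payoff dominated by the occasional 5."""
    return sum(max(r - 3, 0) * 4 + min(r, 3) for r in ratings)

print(fighter_payoff(steady), fighter_payoff(spiky))  # steady team wins
print(stud_payoff(steady), stud_payoff(spiky))        # spiky team wins
```

Same mean, opposite rankings: the team you want depends entirely on the shape of the payoff, i.e. on your Vega.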

Assuming that your team has delivered, then maybe the distribution of ratings across the team is a function of whether it does more stud or fighter work? Or am I force fitting my pet theory a bit too much here?

Studs and Fighters and Attack and Defence

The general impression in sport is that attack is “stud” and defence is “fighter”. This is mainly because defence (in any game, pretty much) is primarily about not making errors, and being disciplined. Flamboyance can pay off in attack, where you only need to strike occasionally, but not in defence, where the real payoff comes from being consistent and excellent.

However, attack need not always be stud, and defence need not always be fighter. This is especially true in team sports such as football, where there can be a fair degree of organisation and coaching to get players to coordinate.

This piece in The Athletic (paywalled) gives an interesting instance of how attacking can be fighter, and how modern football is all about fighter attacking. It takes the instance of this weekend’s game between Tottenham Hotspur and Liverpool F.C., which the latter won.

Jack Pitt-Brooke, the author, talks about how Liverpool is fighter in attack because the players are well-drilled in attacking, and practice combination play, or what are known in football as “automisations”.

But in modern football, the opposite is true. The best football, the type played by Pep Guardiola’s Manchester City or Jurgen Klopp’s Liverpool, is the most rigorously planned, drilled and co-ordinated. Those two managers have spent years teaching their players the complex attacking patterns and synchronised movements that allow them to cut through every team in the country. That is why they can never be frustrated by opponents who just sit in and defend, why they are racking up points totals beyond the reach of anyone else.

Jose Mourinho, on the other hand, might be fighter in the way he sets up his defence, but not so when it comes to attacking. He steadfastly refuses to have his teams train attacking automisations. While his defences are extremely well drilled, and know exactly how to coordinate, his attackers are left to their own devices and creativity. What Mourinho does is to identify a handful of attackers (usually the centre forward and the guy just behind him) who are given “free roles” and are expected to use their own creativity in leading their team’s attacks.

As Pitt-Brooke went on to write in his article,

That, more than anything else, explains the difference between Klopp and Mourinho. Klopp wants to plan his way out of the randomness of football. Mourinho is more willing to accept it as a fact and work around it. So while the modern manager — Klopp, Guardiola, Antonio Conte — coaches players in ‘automisations’, pre-planned moves and patterns, Mourinho does not.

Jurgen Klopp the fighter, and Jose Mourinho the stud. That actually makes sense when you think of how their teams attack. It may not be intuitive, but upon some thought it makes sense.

Yes, attack is also being fighterised in modern sport.

Studs and fighters: Origin

As far as this blog is concerned, the concept of studs and fighters began sometime in 2007, when I wrote the canonical blog post on the topic. Since then the topic has been much used and abused.

Recently, though, I remembered when I had first come across the concept of studs and fighters. This goes way back to 1999, and has its origins in a conversation with two people who I consider as among the studdest people I’ve ever met (they’re both now professors at highly reputed universities).

We were on a day-long train journey, and were discussing people we had spent a considerable amount of time with over the previous one month. It was a general gossip session, the sort that was common to train journeys in the days before smartphones made people insular.

While discussing about one guy we had met, one of us (it wasn’t me for sure. It was one of the other two but I now can’t recall which of them it was) said “well, he isn’t particularly clever, but he is a very hard worker for sure”.

And so over time this distinction got institutionalised, first in my head and then in the heads of all my readers. There were two ways to be good at something – by either being clever or by being a very hard worker.

Thinking about it now, it seems rather inevitable that the concept that would become studs and fighters came about in the middle of a conversation among studs.

10X Studs and Fighters

Tech twitter, for the last week, has been inundated with unending debate on this tweetstorm by a VC about “10X engineers”. The tweetstorm was engineered by Shekhar Kirani, a Partner at Accel Partners.

I have friends and twitter-followees on both sides of the debate. There isn’t much more to describe about the “paksh” (pro) side of the debate. Read Shekhar’s tweetstorm I’ve put above, and you’ll know all there is to this side.

The “vipaksh” (anti) side argues that this normalises “toxicity” and “bad behaviour” among engineers (the “10X engineers’” hatred for meetings, their not adhering to processes, and so on). Someone I follow went to the extent of saying that this kind of behaviour among engineers is a sign of privilege and lack of empathy.

This is just the gist of the argument. You can just do a search of “10X engineer”, ignore the jokes (most of them are pretty bad) and read people’s actual arguments for and against “10X engineers”.

Regular readers of this blog might be familiar with the “studs and fighters” framework, which I used so often in the 2007-9 period that several people threatened to stop reading me unless I stopped using the framework. I put it on a temporary hiatus and then revived it a couple of years back because I decided it’s too useful a framework to ignore.

One of the fundamental features of the studs and fighters framework is that studs and fighters respectively think that everyone else is like themselves. And this can create problems at the organisational level. I’d spoken about this in the introductory post on the framework.

To me this debate about 10X engineers and whether they are good or bad reminds me of the conflict between studs and fighters. Studs want to work their way. They are really good at what they’re competent at, and absolutely suck at pretty much everything else. So they try to avoid things they’re bad at, can sometimes be individualistic and prefer to work alone, and hope that how good they are at the things they’re good at will compensate for all that they suck elsewhere.

Fighters, on the other hand, are process driven, methodical, patient and sticklers for rules. They believe that output is proportional to input, and that it is impossible for anyone to have a 10X impact, even 1/10th of the time (:P). They believe that everyone needs to “come together as a group and go through a process”.

I can go on but won’t.

So should your organisation employ 10X engineers or not? Do you tolerate the odd “10X engineer” who may not follow company policy and all that, in return for their superior contributions? There is no easy answer to this, but overall I think companies will collectively follow a “mixed strategy”.

Some companies will be encouraging of 10X behaviour, and you will see 10X people gravitating towards such companies. Others will dissuade such behaviour, and the 10X people there, not seeing any upside, will leave to join the 10X companies (again, I’ve written about how you can have “stud organisations” and “fighter organisations”).

Note that it’s difficult to run an organisation with solely 10X people (they’re bad at managing stuff), so organisations that engage 10X people will also employ “fighters” who are cognisant that 10X people exist and know how they should be managed. In fact, being a fighter while recognising and being able to manage 10X behaviour is, I think, an important skill.

As for myself, I don’t like one part of Shekhar Kirani’s definition – that he restricts it to “engineers”. I think the sort of behaviour he describes is present in other fields and skills as well. Some people see the point in that. Others don’t.

Life is a mixed strategy.

AlphaZero Revisited

It’s been over a year since Google’s DeepMind first made its splash with AlphaZero, its reinforcement-learning based chess engine. The first anniversary of AlphaZero’s release coincided with the publication of the peer-reviewed paper.

To go with the peer-reviewed paper, DeepMind has released a further 200 games played between AlphaZero and the conventional chess engine Stockfish. The set is again heavily loaded in favour of AlphaZero wins, but also contains six games that AlphaZero lost. I’ve been following these games on GM Daniel King’s excellent Powerplaychess channel, and want to revise my opinion on AlphaZero.

Back then, I had looked at AlphaZero’s play through my favourite studs and fighters framework, which in hindsight doesn’t do full justice to AlphaZero. From the games I’ve seen in this newly released set, AlphaZero’s play isn’t exactly “stud”. It’s just that it’s much more “human”. And the reason AlphaZero’s play seems more human is the way it “learns”.

Conventional chess engines evaluate a position by considering all possible paths (ok not really, they use an intelligent method called Alpha-Beta Pruning to limit their search size), and then play the move that leads to the best position at the end of the search. These engines use “pre-learnt human concepts” such as point count for different pieces, which are used to evaluate positions. And this leads to a certain kind of play.
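For the curious, alpha-beta pruning is a small idea: do a minimax search, but stop exploring any branch that is already provably worse than an alternative the opponent would allow. A generic sketch (the toy game tree and scores are made up for illustration; real engines add far more machinery):

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax search with alpha-beta pruning: cut off branches that
    cannot change the final decision."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:   # opponent would never allow this line
                break
        return value
    value = float("inf")
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:       # we already have a better line elsewhere
            break
    return value

# Toy two-ply tree: we move (maximising), the opponent replies (minimising)
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
scores = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}
best = alphabeta("root", 2, float("-inf"), float("inf"), True,
                 lambda n: tree.get(n, []), lambda n: scores.get(n, 0))
print(best)  # prints 3; note that leaf b2 is never evaluated (pruned)
```

The evaluation function at the leaves is where the “pre-learnt human concepts” such as piece point counts come in.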

AlphaZero’s learning process, however, involves playing zillions of games against itself (since I wrote that previous post, I’ve come back up to speed with reinforcement learning). Based on the results of these games, it evaluates, in hindsight, the positions it reached in the course of play. On top of this, it builds a deep learning model to identify the goodness of positions.

Given my limited knowledge of how deep learning works, this process involves AlphaZero learning about “features” of games that have more often than not enabled it to win. So somewhere in the network there will be a node that represents “control of centre”. Another node deep in the network might represent “safety of king”. Yet another might perhaps involve “open A file”.

Of course, none of these features have been pre-specified to AlphaZero. It has simply learnt them by training its neural network on zillions of games it has played against itself. And while deep learning is hard to “explain”, it appears that the features of the game that AlphaZero has learnt are remarkably similar to the “features” that human players have learnt over the centuries. And it is because of the commonality in these features that we find AlphaZero’s play so “human”.

Another way to look at it is through the concept of “10000 hours” that Malcolm Gladwell spoke about in his book Outliers. As I had written in my review of the book, the concept of 10000 hours can be thought of as “putting fight until you get enough intuition to become stud”. AlphaZero, thanks to its large number of processors, has effectively spent much more than “10000 hours” playing against itself, with its neural network constantly “learning” from the positions faced and the outcomes of the games reached. And this way, it has “gained intuition” over features of the game that lead to wins, giving it an air of “studness”.

The interesting thing to me about AlphaZero’s play is that thanks to its “independent development” (in a way, like the finches of the Galapagos), it has not been burdened by human intuition on what is good or bad, and has learnt its own heuristics. And along the way, it has come up with a bunch of heuristics that have not commonly been used by human players.

Keeping bishops on the back rank (once the rooks have been connected), for example. A stronger preference for bishops over knights than human players have. Suddenly simplifying from a terrifying-looking attack into a winning endgame (machines are generally good at endgames, so this is not that surprising). Temporary pawn and piece sacrifices. And all that.

Thanks to engines such as LeelaZero, we can soon see the results of these learnings being applied to human chess as well. And human chess can only become better!

AlphaZero defeats Stockfish: Quick thoughts

The big news of the day, as far as I’m concerned, is the victory of Google Deepmind’s AlphaZero over Stockfish, currently the highest rated chess engine. This comes barely months after Deepmind’s AlphaGo Zero had bested the earlier avatar of AlphaGo in the game of Go.

Like its Go version, the chess-playing AlphaZero learnt using reinforcement learning (I remember doing a term paper on the concept back in 2003 but have mostly forgotten). Basically it wasn’t given any “training data”; instead, the machine trained itself by continuously playing against itself, with feedback at each stage of learning helping it learn better.

After only about four hours of “training” (basically playing against itself and discovering moves), AlphaZero managed to record this victory in a 100-game match, winning 28 and losing none (the rest of the games were drawn).

There’s a sample game here on the Chess.com website and while this might be a biased sample (it’s likely that the AlphaZero engineers included the most spectacular games in their paper, from which this is taken), the way AlphaZero plays is vastly different from the way engines such as Stockfish have been playing.

I’m not that much of a chess expert (I “retired” from my playing career back in 1994), but the striking things for me from this game were

  • the move 7. d5 against the Queen’s Indian
  • The piece sacrifice a few moves later that was hard to see
  • AlphaZero’s consistent attempts until late in the game to avoid trading queens
  • The move Qh1 somewhere in the middle of the game

In a way (and being consistent with some of the themes of this blog), AlphaZero can be described as a “stud” chess machine, having taught itself to play based on feedback from games it’s already played (the way reinforcement learning broadly works is that actions that led to “good rewards” are incentivised in the next iteration, while those that led to “poor rewards” are penalised. The challenge in this case is to set up chess in a way that is conducive for a reinforcement learning system).
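The core loop of reinforcement learning is surprisingly simple to sketch. The snippet below is my own toy (nothing like AlphaZero’s actual neural-network setup); it shows only the basic update rule: nudge a value estimate towards each observed reward, so that actions leading to good rewards get valued more highly over time:

```python
def update_value(value, reward, learning_rate=0.1):
    """One reinforcement-learning style update: move the current value
    estimate a fraction of the way towards the observed reward."""
    return value + learning_rate * (reward - value)

# Illustration: an estimate of a position's worth drifting towards its
# observed win rate over a sequence of self-play outcomes (1 = win, 0 = loss)
v = 0.0
for outcome in [1, 1, 0, 1, 1, 1, 0, 1]:
    v = update_value(v, outcome)
print(round(v, 3))
```

Wins pull the estimate up, losses pull it down, and the learning rate controls how quickly old evidence is forgotten.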

Engines such as StockFish, on the other hand, are absolute “fighters”. They get their “power” by brute force, by going down nearly all possible paths in the game several moves down. This is supplemented by analysis of millions of existing games of various levels which the engine “learns” from – among other things, it learns how to prune and prioritise the paths it searches on. StockFish is also fed a database of chess openings which it remembers and tries to play.

What is interesting is that AlphaZero “discovered” some popular chess openings through the course of its self-learning. Notably, some popular openings such as the King’s Indian or the French find little favour with this engine, while others such as the Queen’s Gambit or the Queen’s Indian find favour. This is a very interesting development in terms of opening theory itself.

Frequency of openings over time employed by AlphaZero in its “learning” phase. Image sourced from AlphaZero research paper.

In any case, my immediate concern from this development is how it will affect human chess. Over the last decade or two, engines such as stockfish have played a profound role in the development of chess, with current top players such as Magnus Carlsen or Sergey Karjakin having trained extensively with these engines.

The way top grandmasters play has seen a steady change in these years as they have ingested ideas from engines such as Stockfish. The game has become far quieter and more positional, as players seek to gain small advantages that steadily improve over the course of (long) games. This is consistent with the way the engines that players learn from play.

Based on the evidence of the one game I’ve seen of AlphaZero, it plays very differently from the existing engines. Based on this, it will be interesting to see how human players who train with AlphaZero based engines (or their clones) will change their game.

Maybe chess will turn back to being a bit more tactical than it’s been in the last decade? It’s hard to say right now!

The skill in making coffee

Perhaps for the first time ever in life, I’m working in an office without a coffee machine. I don’t mind that so much for two reasons – firstly, having to go down 27 floors and then pay explicitly for a coffee means that my coffee consumption has come down drastically. Secondly, there is a rather liquid market of coffee shops around my office.

As you might have expected, there is one particular coffee shop close to my office that has become my favourite. And while walking back with my flat white on Wednesday afternoon, I noticed that the coffee tasted different from the flat white I’d had at the same place that morning.

Assuming that even artisanal coffee shops like that one are unlikely to change beans midway through the day, I’m guessing that the difference in taste came down to the way the coffee was prepared. A flat white involves some effort on the part of the barista – milk needs to be steamed, frothed and poured in a particular manner. And this can vary by barista.

So this got me thinking about whether making coffee is a skilled task. And this might explain the quality of coffee at various establishments in Bangalore.

When the coffee bar is equipped with an espresso machine, the job of making an espresso involves less of a skill since all that the barista needs to do is to weigh out the appropriate quantity of beans, press it down to the right extent and then pop it into the espresso maker (I know these tasks themselves involve some skill, but it’s less compared to using a South Indian style filter, for example).

When you want milk coffee, though, there is a dramatic increase in skill requirement. Even in South Indian coffee, the way you boil and froth the milk makes a huge difference in the taste of the coffee. In Brahmin’s Coffee Bar in Shankarpuram, Bangalore, for example, the barista explicitly adds a measure of milk foam to the top of the coffee lending it a special taste.

And when it comes to “European” coffee, with its multiple variants involving milk, the skill required to make good milk coffee is massive. How much milk do you add? How hot do you steam it? Do you add foam or not? These are all important decisions that the barista needs to make, and there is a lot of value a good barista can add to a cup of coffee.

One of my biggest cribs about chain coffee shops in India is that the taste of the coffee isn’t particularly good, with hot milk coffees being especially bad. Based on my analysis so far, I think this could be largely a result of unskilled (or semi-skilled) and inexperienced baristas – something these chains have had to employ in order to scale rapidly.

The cold coffees in these places are relatively much better since the process of making them can be “fighterised” – for each unit, add X shots of espresso to Y ml of milk, Z ice cubes and W spoons of sugar and blend. The only skill involved there is in getting the proportions right, and that can be easily taught, or looked up from a table.

The problem with hot coffees is that this process cannot be fighterised – the precise way in which you pour the milk so that there is a heart shape on top of the cappuccino foam, for example, is a skill that comes only with significant practice. Even the way in which the milk is to be foamed is not an easily teachable task.

And that is the problem with chain coffee shops in India – lack of skilled labour combined with the need to scale rapidly has meant that people have tried to use processes to compensate for skills, and in most parts of coffee making, that’s not necessarily a good way to go.

How power(law)ful is your job?

A long time back I’d written about how different jobs are sigmoidal to different extents – the most fighter jobs, I’d argued, have linear curves – the amount you achieve is proportional to the amount of effort you put in. 

And similarly I’d argued that the studdest jobs have a near vertical line in the middle of the sigmoid – indicating the point when insight happens. 

However what I’d ignored while building that model was that different people can have different working styles – some work like Sri Lanka in 1996 – get off to a blazing start and finish most of the work in the first few days. 

Others work like Pakistan in 1992 – put NED for most of the time and then suddenly finish the job at the last minute. Assuming a sigmoid does injustice to both these strategies, since neither of these curves can easily be described using a sigmoidal function.

So I revise my definition, and in order to do so, I use a concept from the 1992 World Cup – highest scoring overs. Basically take the amount of work you’ve done in each period of time (period can be an hour or day or week or whatever) and sort it in descending order. Take the cumulative sum. 

Now make a plot with the index on the X axis and the cumulative sum on the Y axis. The curve will look like that of a Pareto (80-20) distribution. Now you can estimate the power law exponent; curves that are steeper in the beginning (a greater amount of work done in fewer periods) will have a lower power law exponent.

And this power law exponent can tell you how stud or fighter the job is – the lower the exponent the more stud the job!! 
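A rough sketch of this calculation (the exponent convention and the least-squares fit are my choices, and the work profiles are made up; I fit the sorted per-period values directly rather than the cumulative sum, which gives the same qualitative ordering):

```python
import math

def pareto_exponent(work_per_period):
    """Sort per-period output in descending order, fit value ~ rank^(-1/alpha)
    via a log-log least-squares slope, and return alpha. Front-loaded
    ('stud') work profiles give a lower alpha."""
    values = sorted((w for w in work_per_period if w > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -1.0 / (cov / var)  # slope is negative; alpha = -1/slope

steady_worker = [5, 5, 5, 5, 5, 4, 4, 4, 4, 4]      # fighter-style grind
bursty_worker = [40, 10, 4, 2, 1, 1, 1, 1, 1, 1]    # stud-style burst
print(pareto_exponent(bursty_worker), pareto_exponent(steady_worker))
```

The bursty profile comes out with a much lower exponent than the steady one, matching the “lower exponent, more stud” reading.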

Studs, fighters and spikes

In a blog post yesterday I talked about the marriage and dating markets and how people with spikes which can be evaluated either highly positively or highly negatively were more likely to get dates, while in the arranged marriage market, you were better off being a solid CMP (common minimum program).

The question is how this applies to jobs. Are you better off being a solid performer, or being someone with a quirky CV – one with features that can be evaluated either highly positively or highly negatively by different people? How will the market evaluate you, and which profile is more likely to find you a job?

The answer lies in whether the job that you are applying for is predominantly stud or fighter (apologies to those to whom I mentioned I was retiring this framework – I find it way too useful to ditch). If it is a predominantly fighter job – one that requires a steady output and little creativity or volatility, you are better off having a solid CV – being a consistent 3 rather than having lots of 5s and 1s in your rating chart. When the job is inherently fighter, what they are looking for is consistent output, and what they don’t look for is the occasional 1 – a situation where you are likely to underperform for whatever reason. Fighter jobs don’t necessarily care for the occasional spike in the CV – for there is no use of being extraordinary for such jobs. Thus, you are better off being a consistent 3.

If it is a stud job, though, one where you are likely to show some occasional creativity, you are more likely to get hired if you have a few 5s and a few 1s rather than if you have all 3s. If the job requires creativity and volatility, what the employer wants to know is that you are occasionally capable of delivering a 5 – which is what they are essentially hiring you for. Knowing that people who are good at stud jobs have the occasional off day, employers of stud jobs are okay with someone with a few 1s, as long as they have 5s.
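A toy formalisation of the two hiring rules (illustrative numbers; the scoring functions are deliberately extreme caricatures):

```python
consistent = [3, 3, 3, 3, 3]
spiky = [5, 1, 5, 1, 3]          # same average rating of 3

stud_score = max                 # a stud job pays for your best day
fighter_score = min              # a fighter job pays for your worst day

print(stud_score(spiky), stud_score(consistent))        # spiky wins the stud job
print(fighter_score(consistent), fighter_score(spiky))  # consistent wins the fighter job
```

Same average, opposite hires: the stud employer only asks “can you ever deliver a 5?”, the fighter employer only asks “will you ever drop to a 1?”.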

So whether you should be looking for a stud or a fighter job depends on what kind of a professional career that you’ve had so far – if you’ve had a volatile career with a few spikes and a few troughs, you are much better off applying for stud jobs. If you’ve been a steady consistent performer you are better suited for a fighter job!

Of course you need to remember that this ranking as a function of your volatility is valid only if you were to hold your “average rating” constant!