Big and fast

In football, normally we see two kinds of strikers – small and quick or big and slow. About twenty years ago, when 4-4-2 was the dominant formation, it was common for teams to deploy a strike partnership with one of each. Liverpool, for example, played with Michael Owen (small and quick) and Emile Heskey (big and slow).

While strike partnerships have gone out of fashion, you still see these two kinds of strikers in modern football. The small and quick striker usually “plays on the shoulder of the last defender”, looking to beat the offside trap and score. The big and slow striker holds up the ball in an advanced position, waiting for teammates to go past, so that the team can then attack in numbers. The big and slow striker is also usually good in the air and can convert crosses.

For a long time, I was wondering why there were no “big and fast” strikers in football. It isn’t as if bulk / size is negatively correlated with speed – there surely must exist big guys who are also quick, and I was wondering why there weren’t so many strikers like this.

That, of course, changed last year, with the arrival of Erling Haaland, a striker who is both incredibly quick and incredibly big, and who has dominated the Premier League like nobody’s business. Similarly, there is also Darwin Núñez, who can play off the last defender, head crosses towards goal, and hold up the ball. Then again, I can’t think of too many others in contemporary football.

This morning, I got a hypothesis on why this is so – the big and fast guys are all in rugby! I was watching highlights of the quarter finals (England beating Fiji and South Africa beating France), and what I noticed was that the rugby guys are all both big and fast.

You need to be fast (and agile) to skip past the opponents and score a try. And then you need tremendous upper body strength to be able to take down an opponent, or resist when an opponent tries to take you down. From that perspective, being big and being fast are both non-negotiable for you to be a top rugby player.

I know there is a class difference in places like England between those who take up football and those who take up rugby (football is working class, rugby is upper class), but could it be that most people who are big and fast, and want to take up professional sport, choose rugby rather than football? And is this why you find few big and fast players from countries traditionally good at both games – such as England and France (and maybe Argentina)?

Haaland is from Norway, which doesn’t really play rugby (again, his father was a footballer). Núñez is from Uruguay, which is a massive football nation, but not much of a rugby one (they made their rugby debut at this world cup, I think). And so despite their physique and speed, they chose football.

Had they been from England or France, it’s likely they would’ve played rugby instead!

Channel Coding Theorem in Real Life

One of my favourite concepts in Computer Science is Shannon’s Channel Coding Theorem. This theorem is basically about the efficiency of communication over a noisy channel. And as I was thinking a few minutes back, this has interesting implications in real life as well, well away from the theory of communication.

I don’t have that much understanding of the rigorous explanation of the theorem. However, I absolutely love the central idea of it – that the noisier a channel is, the more the redundancy you need in your communication, and thus the slower is your communication. A corollary of this is that every channel has a “natural maximum speed”, and as long as you try to communicate within that speed, you can communicate reliably.

I won’t go into the technical details here – that involves assuming that the channel loses (or garbles) X% of bits, and then constructing a redundant code that shows that even with this loss, you can communicate effectively.
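
For the curious, here is a minimal sketch of that idea, under assumptions that are mine and not part of the theorem itself: the channel flips each bit independently with probability `flip_prob`, and the "redundant code" is the crudest one possible, repeating every bit `n` times and taking a majority vote at the other end. The function names are made up for illustration. Real codes are far cleverer than this, but the tradeoff shows up even here: more repetition means fewer errors, and a throughput of only 1/n.

```python
import random


def send_through_noisy_channel(bits, flip_prob, rng):
    """Flip each bit independently with probability flip_prob."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


def encode_repetition(bits, n):
    """Add redundancy by repeating every bit n times."""
    return [b for b in bits for _ in range(n)]


def decode_repetition(received, n):
    """Majority vote over each group of n received bits (n should be odd)."""
    return [
        1 if sum(received[i:i + n]) * 2 > n else 0
        for i in range(0, len(received), n)
    ]


def error_rate(message, flip_prob, n, seed=0):
    """Fraction of message bits decoded wrongly when each bit is sent n times."""
    rng = random.Random(seed)
    sent = encode_repetition(message, n)
    received = send_through_noisy_channel(sent, flip_prob, rng)
    decoded = decode_repetition(received, n)
    return sum(a != b for a, b in zip(message, decoded)) / len(message)
```

Calling `error_rate(message, 0.1, 1)` and then `error_rate(message, 0.1, 5)` on the same long message should show the error rate dropping sharply, at the price of sending five times as many bits.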

Anyway, let’s leave behind the theory of communication and go on to real life.

I’ve found that I communicate badly when I’m not sure what language to talk in. If I’m talking in English with someone who I know knows good English, I communicate rather well (like my writing 😛 ) . However, if I’m not sure about the quality of language of the other person, I hesitate. I try to force myself to find simpler / more obvious words, and that disturbs my flow of thought, and I stammer.

Similarly, when I’m not sure whether to talk in Kannada or English (the two languages I’m very comfortable in), I stammer heavily. Again, because I’m not sure if the words I would naturally use will be understood by the other person (the counterparty’s comprehension being the “noise in the channel” here), I slow down, get jittery, and speak badly.

Then of course, there is the very literal interpretation of the channel coding theorem – when your internet connection (or call quality in general) is bad, you end up having to speak slower. When I was hunting for a job in 2020, I remember doing badly in a few interviews because of the quality (or lack thereof) of the internet connection (this was before I had discovered that Google Meet performs badly on Safari).

Similarly, sometime last month, I thought I had prepared well for what I expected to be a key conversation at work. The internet was bad, we couldn’t hear each other and kept repeating ourselves (redundancy is how you overcome the noise in the channel), and that diminished throughput massively. Given the added difficulty in communication, I didn’t bring up the key points I had prepared for. It was a damp squib.

Related to this is when you aren’t sure if the person you are speaking to can hear clearly. This disability again clouds the communication channel, meaning you need to build in redundancy, and thus suffer a reduction in throughput.

When you are uncertain of yourself, or underconfident, you end up tending to do badly. That is because when you are uncertain, you aren’t sure if the other person will fully understand what you are going to say. Consequently, you end up talking slower, building redundancy in your speech, etc. You are more doubtful of what you are going to say, and don’t take risks, since your lack of confidence has clouded the “communication channel”, thus depressing your throughput.

Again a lot of this might apply to me alone – I function best when I’m talking / writing at a certain minimum throughput, and operating at anywhere below that makes me jittery and underconfident and a bad communicator. It is no surprise that my writing really took off once I got a computer of my own.

That was in the beginning of July 2004, and within a month, I had started (the predecessor of) this blog. I’ve been blogging for 19 years now.

That aside aside, the channel coding theorem works in non-verbal contexts as well. Back in 2016, before my daughter was born, I remember reading somewhere that tentative mothers lead to cranky babies. The theory was that if the mum was anxious or afraid while handling her baby, the baby wouldn’t perceive the signals of touch sufficiently, and being devoid of communication, become cranky.

We had seen a few examples of this among relatives and friends (and this possibly applies to me as well – my mother had told me that I was the first newborn she ever handled, and so she was a bit tentative in handling me). This again can be explained using the Channel Coding Theorem.

When the mother’s touch is tentative, it is as if the touch channel between mother and child has some “noise”. The tentativeness of the touch means the baby is not really sure of what the mother is “saying”. With touch, unlike language or bits, redundancy is harder. And so the child grows up insufficiently connected to its mother.

Conversely, later on in life, these tentative mothers tend to bring in redundancy in their communications with their (now jittery) children, and end up holding them too hard, and not letting them go (and some of these children go to therapists, who inevitably blame it on the mothers 😛 ). Ultimately, all of this stems from the noise in the initial communication channel (thanks to the tentativeness of the source).

Ok I’ve rambled on here, so will stop now. However, now that I’ve seeded this thought in you, you too will start seeing the channel coding theorem everywhere (oh – if you think this post is badly written, then that is again like reading this over a noisy channel. And you will get irritated with the lack of throughput and pack up).

Algo trading and ice cream

I refuse to share ice cream with my daughter, just like I used to refuse to share peanuts with my father. This refusal to share in both cases primarily has to do with the differential speed of consumption.

With my father and peanuts, it was a matter of ability – as someone who had grown up on a peanut farm (and thus he was a fan of Jimmy Carter), he was an expert at shelling peanuts. The Bangalore-born me was much less expert, and so before I knew it he would have finished the lot of it.

With my daughter and ice cream, it is a matter of willingness – she likes to finish it quickly, in big spoons. I like to savour it over a long time – at home, I use a rather small spoon and eat it slowly. Nowadays I’ve been trying to cut down sugars and so when I eat them I try to get the maximum benefit out of them and thus eat slowly. However, even as a child I would eat my desserts slowly, trying to “extract maximum benefits”.

So last night we were having ice cream (individual small tubs of course). Daughter finished hers quickly and came to me, to see that my tub was still half full (and I was blogging as I was eating it).

“Appa, why do you like to turn your ice cream into milkshake?”, she asked.

“I don’t”, I said, “I just try to get the maximum value out of it, and thus I eat it slowly”.

“But then if you take too long to eat, then it turns into milkshake which is much less enjoyable than ice cream”, she countered. She had a valid point.

And then I realised this is exactly the problem I worked on during my stint as an investment banking quant in 2009-11. I was working on algo trading, specifically execution of large block deals.

The tradeoff there was that if you traded too quickly, you would end up moving the market and thus trading at an unfavourable price. On the other hand, if you traded too slowly, the natural volatility of the stock would mean that the market might move against you. And so you had to balance the two and trade.
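
To make the shape of this tradeoff concrete, here is a toy sketch. It is emphatically not what we did at the bank; it is a made-up cost model with two hypothetical knobs, `impact_coeff` and `risk_coeff`. It assumes the order is split into `n_slices` equal pieces, that market impact cost falls as you spread the order over more slices, and that volatility risk grows in proportion to how long you take.

```python
def expected_cost(total_shares, n_slices, impact_coeff, risk_coeff):
    """Toy cost of executing total_shares in n_slices equal pieces.

    Assumed model (illustrative only):
      - impact cost shrinks as the order is spread over more slices
      - risk cost grows linearly with the number of slices (time taken)
    """
    impact = impact_coeff * total_shares ** 2 / n_slices
    risk = risk_coeff * n_slices
    return impact + risk


def best_number_of_slices(total_shares, impact_coeff, risk_coeff, max_slices=1000):
    """Brute-force search for the slice count with the lowest toy cost."""
    return min(
        range(1, max_slices + 1),
        key=lambda n: expected_cost(total_shares, n, impact_coeff, risk_coeff),
    )
```

In this toy model, trading everything in one slice is expensive because of impact, trading in a thousand slices is expensive because of risk, and the cheapest answer sits somewhere in between.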

I won’t go into the details on how we solved it (my erstwhile bank might not like it), but it suffices to say here that it is similar to eating ice cream.

If you eat too quickly, you run the risk of not getting sufficient “benefit” out of the ice cream at hand. If you eat too slowly, then there is the risk that the ice cream itself will melt and thus be less enjoyable for you.

I tried explaining this analogy to my daughter last night, but she didn’t get it. I guess she is too young to understand risk, volatility, market impact and the like.

And so I’m inflicting this on you!

Speed, Accuracy and Shannon’s Channel Coding Theorem

I was probably the CAT topper in my year (2004) (they don’t give out ranks, only percentiles (to two digits of precision), so this is a stochastic measure). I was also perhaps the only (or one of the very few) person to get into IIMs that year despite getting 20 questions wrong.

It had just happened that I had attempted far more questions than most other people. And so even though my accuracy was rather poor, my speed more than made up for it, and I ended up doing rather well.

I remember this time during my CAT prep, where the guy who was leading my CAT factory once suggested that I was making too many errors so I should possibly slow down and make fewer mistakes. I did that in a few mock exams. I ended up attempting far fewer questions. My accuracy (measured as % of answers I got right) didn’t change by much. So it was an easy decision to forget about accuracy and focus on speed and that served me well.

However, what serves you well in an entrance exam need not necessarily serve you well in life. An exam is, by definition, an artificial space. It is usually bounded by certain norms (of the format). And so, you can make blanket decisions such as “let me just go for speed”, and you can get away with it. In a way, an exam is a predictable space. It is a caricature of the world. So your learnings from there don’t extend to life.

In real life, you can’t “get away with 20 wrong answers”. If you have done something wrong, you are (most likely) expected to correct it. Which means, in real life, if you are inaccurate in your work, you will end up making further iterations.

Observing myself, and people around me (literally and figuratively at work), I sometimes wonder if there is a sort of efficient frontier in terms of speed and accuracy. For a given level of speed and accuracy, can we determine an “ideal gradient” – on which way a person needs to move in order to make the maximum impact?

Once in a while, I take book recommendations from academics, and end up reading (rather, trying to read) academic books. Recently, someone had recommended a book that combined information theory and machine learning, and I started reading it. Needless to say, within half a chapter, I was lost, and I had abandoned the book. Yet, the little I read performed the useful purpose of reminding me of Shannon’s channel coding theorem.

Paraphrasing, what it states is that irrespective of how noisy a channel is, using the right kind of encoding and redundancy, we will be able to predictably send across information at a certain maximum speed. The noisier the channel, the more the redundancy we will need, and the lower the speed of transmission.
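
For one textbook case this "maximum speed" has a simple formula. Assuming a binary symmetric channel (my choice of example, where each bit is flipped independently with probability `p`), the capacity is 1 − H(p) bits per bit sent, where H is the binary entropy function. A small sketch:

```python
import math


def binary_entropy(p):
    """Entropy in bits of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(p):
    """Capacity of a binary symmetric channel with bit-flip probability p."""
    return 1.0 - binary_entropy(p)
```

A noiseless channel (`p = 0`) has capacity 1, the capacity falls as `p` rises, and at `p = 0.5` it hits zero: the output is pure coin tossing, and no amount of redundancy helps.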

In my opinion (and in the opinions of several others, I’m sure), this is a rather profound observation, and has significant impact on various aspects of life. In fact, I’m prone to abusing it in inexact manners (no wonder I never tried to become an academic).

So while thinking of the tradeoff between speed and accuracy, I started thinking of the channel coding theorem. You can think of a person’s work (or “working mind”) as a communication channel. The speed is the raw speed of transmission. The accuracy (rather, the lack of it) is a measure of noise in the channel.

So the less accurate someone is, the more the redundancy they require in communication (or in work). For example, if you are especially prone to mistakes (like I am sometimes), you might need to redo your work (or at least a part of it) several times. If you are the more accurate types, you need to redo less often.

And different people have different speed-accuracy trade-offs.

I don’t have a perfect way to quantify this, but maybe we can think of “true speed of work” by dividing the actual speed at which someone does a piece of work by the number of iterations they need to get it right. OK it is not so straightforward (there might be other ways to build redundancy – like getting two independent people to do the same thing and then tally the numbers), but I suppose you get the drift.
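
A back-of-the-envelope sketch of that "true speed" idea follows. The division is the one described above. The `expected_iterations` helper is an extra assumption of mine: it supposes each attempt goes wrong independently with the same probability `error_rate`, so the number of attempts needed is geometric with mean 1 / (1 − error_rate).

```python
def true_speed(raw_speed, iterations):
    """Raw speed divided by the number of iterations needed to get it right."""
    if iterations < 1:
        raise ValueError("need at least one iteration")
    return raw_speed / iterations


def expected_iterations(error_rate):
    """Mean number of attempts if each attempt fails independently
    with probability error_rate (an illustrative assumption)."""
    if not 0.0 <= error_rate < 1.0:
        raise ValueError("error_rate must be in [0, 1)")
    return 1.0 / (1.0 - error_rate)
```

Under these assumptions, someone who does 10 pieces of work a day but gets half of them wrong has a true speed of 5 a day, which still beats someone who does 4 a day flawlessly.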

The interesting thing here is that speed and accuracy depend not only on the person but also on the nature of the work itself. For me, a piece of work that on average takes 1 hour has a different speed-accuracy tradeoff compared to a piece of work that on average takes a day (usually, the more complicated and involved a piece of analysis, the more the error rate for me).

In any case, the point to be noted is that the speed-accuracy tradeoff is different for different people, and in different contexts. For some people, in some contexts, there is no point at all in expecting highly accurate work – you know they will make mistakes anyways, so you might as well get the work done quickly (to allow for more time to iterate).

And in a way, figuring out speed-accuracy tradeoffs of the people who work for you is an important step in getting the best out of them.