May 2023 – Pertinent Observations

New blood joins this team

I intended to write this a year ago, when Sadio Mane left Liverpool after six brilliant years at the club. There was much heartbreak among the club fan base about Mane leaving, and a lot of people saw it as a failure on the part of the management and ownership in terms of not being able to keep him.

Now, a year on, I admit that Darwin Nunez hasn’t quite set the club on fire (though I personally quite like him), but as a general principle, this kind of “freshening up” is a highly necessary process in a team if you want to avoid stagnation.

A month or two back, I was watching some YouTube video on “Liverpool’s greatest Premier League goals against Manchester City” (this was just before the 4-1 hammering at the Etihad). As the goals were shown one by one, I kept trying to guess which season and game it was in.

There were important clues – whether Firmino wore 9 or 11, whether Mane wore 19 or 10, the identity of some players, the length of Trent Alexander Arnold’s hair, my memory of the scoreline from that game, etc. (Liverpool always wear the home Red at the Etihad, so the colour of the away kit wasn’t a clue).

However, for one goal I simply wasn’t able to figure out which season it was. There was TAA wearing 66, Fabinho, Henderson, the fab front three (Firmino-Mane-Salah, wearing 9-10-11 respectively) and Robertson. That’s when it hit me that for a fairly long time, a large part of Liverpool’s team had stayed constant! There was very little change at the club.

Now, there are benefits to having a consistently settled team (as the fabulous 2021-22 season showed), but there is also the danger of stasis. In something like football where careers are short, you don’t want the whole team “getting old together”. In the corporate world, people can get into too much of a comfort zone. And cynicism can set in.

Good new employees are always buzzing with ideas, fearless about what has been rejected before and who thinks how. As people spend longer in the organisation, though, colleagues become predictable and certain ways of doing things become institutionalised. Sooner than you know it, you would have become a “company man”, (figuratively) wearing the same white shirt and blue suits as your fellow company men, and socialising with your colleagues at the (figurative) company club.

There can be different kinds of companies here – some companies allow people to retain a lot of their individuality; and there the “decay” into company-manhood is slower. In this kind of a place, the same set of people can stay together for longer and still continue to innovate and add significant value to one another.

Other companies are less forgiving, and you very quickly assimilate, and lose part of your idiosyncrasy. Insofar as innovation comes out of fresh ideas and thinking and unusual connections, these companies are not very good at it. And in such companies, pretty much the only way to keep the innovative wheel going and continue to add value is by bringing in fresh blood at a faster rate.

Putting it another way, if you are a cohesive kind of company, some attrition may not actually be a bad thing (unless you are growing rapidly enough to expand your team rapidly). To grow and innovate, you need people to think different.

And you get there either by having the sort of superior culture where existing employees continue to think different long after they’ve been exposed to one another’s thoughts; or by continuing to bring in fresh employees.

There is no other way.

Jordan “visa interview on arrival”

The peak-end hypothesis means that we’ve come back from our trip to Jordan really happy. It was a brilliant and diverse experience, involving Roman History (Jerash, Amman Citadel), Christian Theology (Mount Nebo, Madaba), hill climbing (at Petra – more on that later), wilderness (Wadi Rum) and a resort and floating on water (Dead Sea).

However, preceding all this was an absolutely atrocious “process” that we had to go through at the Amman airport. I waited to return to India to write this.

Nominally Jordan has “visa on arrival” for Indians. This means you don’t need to get a visa before you travel. However, what they don’t really tell you is that it doesn’t work the same way as visas on arrival in other countries – such as Hong Kong or Thailand or Maldives (based on my limited experience), where you enter the passport control, get your passport stamped, maybe pay a fee and move on.

In Jordan that’s not the way it works. We had pre-bought a “Jordan Pass” that includes fees for the visa and to some of the historic attractions in the country. Upon landing at Amman Airport, we encountered a line saying “for jordan pass / visa on arrival”. And that’s where the arbitrariness started.

Firstly, it is the “border police” who man this, unlike India where it’s bureaucrats from the external affairs ministry. More importantly, there is no “process”. You go to the window where the person there leafs through the passport looking for active visas – if you have a valid US or UK or Schengen or even Saudi visa, your visa gets printed on a paper and you get waved on to passport control. In the absence of all this, you are asked to “wait there”, without any further direction.

Then we were asked to go to “police in room 1”, which was some 200m away. This is where we had our first cultural shock of the trip – there was a heavy smell of cigarettes there, and we entered to see cops smoking there as they were talking to us.

The same process repeated – the cops leafing through the passport to see if there are any other valid visas, and then when not finding anything, asking us to “wait”. Again there was no definite timeline or process. We waited for a bit (during which the cops did namaz, and presumably stopped smoking while doing so), and then went in again and asked. Again we were asked to “wait”.

The cops all had identical uniforms so it was impossible for us to know who was “superior” or to escalate. After a few rounds of such waiting, my wife finally put senti saying we have a small child who is hungry (thankfully our daughter managed to produce a reasonably sad face at that time, though she was unable to cry), and finally they started considering our application.

We had printed out all our hotel reservations (I’d read on some forum that it might be required at “immigration” – though those fora didn’t mention how arbitrary the process is) and handed them over to the police, who went through them. One cop got convinced (I don’t know if it helped that we had booked in a few expensive hotels; he even asked us for our salaries and what work we do, etc.) and we got sent to another one. Yet again, and this was not the first time we were encountering him, he started the process all from the beginning, looking for valid visa stamps in our passports!

And then he started filling out some application. It was the first time I had seen someone actually write right to left, so it was mildly amusing (and it’s interesting that finally he stapled all our documents at the top RIGHT corner). He asked for our return tickets, which we hadn’t printed out, so I showed him on the phone. He took the phone and put it on the xerox machine and took a “copy” of the tickets! And then he stapled everything together and asked us to “wait”. Apparently his “boss” was supposed to call him (this guy took a picture of the application he had written and sent it to someone).

Then five minutes later, he gave us a small chit of paper and asked us to go back to the Visa On Arrival counter. I assumed we were almost through and messaged our driver that “we should be out soon”.

I don’t know if the guy at the visa on arrival counter was incompetent, but it’s not funny how many times he entered details of the same passports. In the middle of this, one lady walked near his counter, and he got busy talking to her while “processing” our stuff. And entered details many more times.

He got thoroughly confused because we had two Jordan Passes, and had to pay for our daughter’s visa (since she didn’t need a ticket to see the monuments this made more sense). In the middle he suddenly picked up all our passports and walked over to the police room. By now I was thoroughly psyched and had already swallowed my panic attack pill.

After yet another inordinate delay, he printed out our visas and sent us to passport control (a few metres away). Again we thought we were done, only to be told he had printed out my visa wrong (remember I said he entered details multiple times). Since the distance there was short, the passport control officer called the visa on arrival guy over and he took my passport YET AGAIN, and started entering details on his computer.

Another ten minutes later, he brought over my passport and visa to the passport control, where my passport was duly stamped and we were sent on our way.

Our bag was there in one corner, and we picked it up and walked out, feeling glad that we had booked a driver for the length of the trip who would be available for any further interfacing with Jordanian cops.

Overall, the whole process was rather bizarre. I’ve waited hours in line at Heathrow to be let in. I’ve visited the US, again waiting for a long time at JFK and even being pulled over for a customs check. None of that was even remotely comparable to our experience at Queen Alia International Airport last Tuesday.

If Jordan wants to outsource its visa process to more developed countries, that is fine, but they need to make it explicit. Turkey, for example, offers visa on arrival to Indians with a valid US or Schengen visa, but everyone else is expected to apply for a visa before travel.

Jordan says no such thing, and instead subjects people to arbitrary waits without any process in a smoky police station in the airport. Which is really really bizarre.

Round Tables

One of the “features” of being in a job is that you get invited to conferences and “industry events”. I’ve written extensively about one of them in the past – the primary purpose of these events is for people to be able to sell their companies’ products, their services and even themselves (job-hunting) to other attendees.

Now, everyone knows that this is the purpose of these events, but it is one of those things that is hard to admit. “I’m going to this hotel to get pitched to by 20 vendors” is not usually a good enough reason to bunk work. So there is always a “front” – an agenda that makes it seemingly worthy for people to attend these events.

The most common one is to have talks. This can help attract people at two levels. There are some people who won’t attend talks unless they have also been asked to talk, and so they get invited to talk. And then there are others who are happy to just attend and try to get “gyaan”, and they get invited as the audience. The other side of the market soon appears, paying generous dollars to hold the event at a nice venue, and to be able to sell to all the speakers and the audience.

Similarly, you have panel discussions. Organisers in general think this is one level better than talks – instead of the audience being bored by ONE person for half an hour, they are bored by about 4-5 people (and one moderator) for an hour. Again there is the hierarchy here – some people won’t want to attend unless they have been put on the panel. And who gets to be on the panel is a function of how desperate one or more sponsors is to sell to the potential panelists.

The one thing most of these events get right is to have sufficient lunch and tea breaks for people to talk to each other. Then again, these are brilliant times for sponsors to be able to sell their wares to the attendees. And it has the positive externality that people can meet and “network” and talk among themselves – which is the best value you can get out of an event like this one.

However, there is one kind of event that I’ve attended a few times, but I can’t understand how they work. This is the “round table”. It is basically a closed room discussion with a large number of invited “panellists”, where everyone just talks past each other.

Now, at one level I understand this – this is a good way to get a large number of people to sell to without necessarily putting a hierarchy in terms of “speakers” / “panellists” and “audience”. The problem is that what they do with these people is beyond my imagination.

I’ve attended two of these events – one online and one offline. The format is the same. There is a moderator who goes around the table (not necessarily in any particular order), with one question to each participant (the better moderators would have prepared well for this). And then the participant gives a long-winded answer to that question, and the answer is not necessarily addressed at any of the other participants.

The average length of each answer and the number of participants means that each participant gets to speak exactly once. And then it is over.

The online version of this was the most underwhelming event I have ever attended – I didn’t remember anything of what anyone said, and assumed that the feeling was mutual. I didn’t even bother checking out these people on LinkedIn after the event was over.

The offline version I attended was better in that at least we could get to talk to each other after the event. But the event itself was rather boring – I’m pretty sure I bored everyone with my monologue when it was my turn, and I don’t remember anything that anyone else said in this event. The funny thing was – the event wasn’t recorded, and there was hardly anyone from the organising team at the discussion. There was just no point in all of us talking for so long. It was like people who organise Satyanarayana Poojas to get an excuse to have a party at home.

I’m wondering how this kind of event can be structured better. I fully appreciate the sponsors and their need to sell to the lot of us. And I fully appreciate that it gives them more bang for the buck to have 20 people of roughly equal standing to sell to – with talks or panels, the “potential high value customers” can be fewer.

However – wouldn’t it be far more profitable to them to be able to spend more time actually talking to the lot of us and selling, rather than getting all of us to waste time talking nonsense to each other? Like – maybe just a party or a “lunch” would be better?

Then again – if you want people to travel inter-city to attend this, a party is not a good enough excuse for people to get their employers to sponsor their time and travel. And so something inane like the “round table” has to be invented.

PS: There is this school of thought that temperatures in offices and events are set at a level that is comfortable for men but not for women. After one recent conference I attended I have a theory on why this is the case. It is because of what is “acceptable formal wear” for men and women.

Western formal wear for men is mostly the suit, which means dressing up in lots of layers, and maybe even constraining your neck with a tie. And when you are wearing so many clothes, the environment better be cool else you’ll be sweating.

For women, however, formal wear need not be so constraining – it is perfectly acceptable to wear sleeveless tops, or dresses, for formal events. And the temperatures required to “air” the suit-wearers can be too cold for women.

At a recent conference I was wearing a thin cotton shirt and could thus empathise with the women.

Shrinking deadlines

I’m reminded of this old joke/riddle, which also happened to feature in Gowri Ganesha. “If a 1 metre long sari takes 1 hour to dry in the sun, how long will an 8 metre long sari take to dry?”.

The instinctive answer, of course, is 8 hours, while if you think about it (and assume that you have enough clothesline space to not need to fold), the correct answer is likely to be 1 hour.

Now this riddle is completely unconnected to the point of the post, except that both have to do with time.

And then one day you find, ten years have got behind you.
No one told you when to run. You missed the starting gun.

Ok enough distractions. I’m now home, home again.

Modern workspaces are synonymous with tight deadlines. Even when you give a conservative estimate on how long something will take, you get asked to compress the timelines further. If you protest too much and say that there is a lot to be done, sometimes you might get asked to “put one more person on the job and get it done quickly”.

This might work for routine, or “fighter” jobs – for example, if your job is to enter and copy data for (let’s say) 1000 records, you can easily put another person on the job, and the entire job will be done in about half the time (allowing for a little time for the new person to learn the job and for coordination).

The more complex the job gets, the harder this becomes. At one level, the new person coming into the job needs to spend more time getting up to speed. Then, as the job gets more complex, it gets harder to divide and conquer, or to “specialise”. This means the new person coming in has less impact.

And then when you get closer and closer to the stud end of the spectrum, the advantage of putting more people on the work to get it done faster gets smaller and smaller. There comes a point when the extra person actively becomes a liability. Again – I’m reminded of my childhood when occasionally I would ask my mother if she needed help in cooking. “Yes, the best way for you to help is for you to stay out of the kitchen”, she would say.

And then when the job gets really creative, there is a further limit on compression – a lot of the work is done “offline”. I keep telling people about how I finally discovered the proof for the Ramsey number R(3,3) while playing table tennis in my hostel, or how I had solved a tough assignment problem while taking a friend’s new motorcycle for a ride.

When you want to solve problems “offline” (to let the insight come to you rather than going hunting for it – I had once written about this) – there is no way to shorten the process. You need to let the problem stew in your head, and hope that some time it will get solved.

There is nothing that can be done here. The more you hurry up, the less the chances you give yourself of solving the problem. Everything needs to take its natural course.

I got reminded of it when we missed a deadline last Friday, and I decided to not think about it through the weekend. And then, an hour before I got to work on Monday, an idea occurred in the shower which fixed the problem. Even if I’d stressed myself (and my team) out on Friday, or done somersaults, the problem would not have been solved.

As I’d said in 2004, quality takes time.

Pre-trained models

On Sunday evening, we were driving to a relative’s place in Mahalakshmi Layout when I almost missed a turn. And then I was about to miss another turn and my wife said “how bad are you with directions? You don’t even know where to turn!”.

“Well, this is your area”, I told her (she grew up in Rajajinagar). “I had very little clue of this part of town till I married you, so it’s no surprise I don’t know how to go to your cousin’s place”.

“But they moved into this house like six months ago, and every time we’ve gone there together. So if I know the route, why can’t you”, she retorted.

This gave me a trigger to go off on a rant on pre-trained models, and I’m going to inflict that on you now.

For a long time, I didn’t understand what the big deal was on pre-trained machine learning models. “If it’s trained on some other data, how will it even work with my data”, I wondered. And then recently I started using GPT4 and other similar large language models. And I started reading blogposts on how with very little finetuning these models can do “gymnastics”.

Having grown up in North Bangalore, my wife has a “pretrained model” of that part of town in her head. This means she has sufficient domain knowledge, even if she doesn’t have any specific knowledge. Now, with a small amount of new specific information (the way to her cousin’s new house, for example), it is easy for her to fit the specific information into her generic knowledge and get a clear idea of how to get there.

(PS: I’m not at all suggesting that my wife’s intelligence is artificial here)

On the other hand, my domain knowledge of North Bangalore is rather weak, despite having lived there for two years. For the longest time, Mallewaram was a Chakravyuha – I would know how to go there, but not how to get back. Given this lack of domain knowledge, the little information on the way to my wife’s cousin’s new house is not sufficient for me to find my way there.

It is similar with machines. LLMs and other pre-trained models have sufficient “generic domain knowledge” in lots of things, thanks to the large amounts of data they’ve been trained on. As a consequence, if you can train them on fairly small samples of specific data, they are able to generalise around this specific data and learn around them.

More pertinently, in real life, depending upon our “generic domain knowledge” of different domains, the amount of information that you and I will need to learn a certain amount about a certain domain can be very very different.

Everything is context-sensitive!

Channelling

I’m writing this five minutes after making my wife’s “coffee decoction” using the Bialetti Moka pot. I don’t like chicory coffee early in the morning, and I’m trying to not have coffee soon after I wake up, so I haven’t made mine yet.

While I was filling the coffee into the Moka Pot, I was thinking of the concept of channelling. Basically, if you try to pack the moka pot too tight with coffee powder, then the steam (that goes through the beans, thus extracting the caffeine) takes the easy way out – it tries to create a coffee-less channel to pass through, rather than do the hard work of extracting coffee as it passes through the layer of coffee.

I’m talking about steam here – water vapour, to be precise. It is as lifeless as it could get. It is the gaseous form of a colourless odourless shapeless liquid. Yet, it shows the seeming “intelligence” of taking the easy way out. Fundamentally this is just physics.

This is not an isolated case. Last week, at work, I was wondering why some algorithm was returning a “negative cost” (I’m using local search for that, and after a few iterations, I found that the algorithm is rapidly taking the cost – which is supposed to be strictly positive – into deep negative territory). Upon careful investigation (thankfully it didn’t take too long), it transpired that there was a penalty cost which increased non-linearly with some parameter. And the algo had “figured” that if this parameter went really high, the penalty cost would go negative (basically I hadn’t done a good job of defining the penalty well). And so would take this channel.

Again, this algorithm has none of the supposedly scary “AI” or “ML” in it. It is a good old rule-based system, where I’ve defined all the parameters and only the hard work of finding the optimal solution is left to the algo. And yet, it “channelled”.
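To make this concrete, here is a minimal Python sketch of the same failure mode. This is not the actual code from work: the `penalty` formula, the base cost, the bounds and the starting points are all made up for illustration. The only thing it shares with the real case is the shape of the bug, a penalty that rises at first and then turns over and goes negative when the parameter gets large.

```python
def penalty(x):
    # Hypothetical, badly defined penalty: it grows with x at first,
    # but the quadratic term takes over and drags it below zero for x > 30.
    return 3.0 * x - 0.1 * x * x


def cost(x):
    # The base cost is strictly positive; only the penalty can break that.
    base = 5.0 + 0.5 * x
    return base + penalty(x)


def local_search(x, lo=0.0, hi=100.0, step=1.0, max_iters=1000):
    # Plain greedy local search: look one step either side (within bounds)
    # and move only if the best neighbour strictly lowers the cost.
    for _ in range(max_iters):
        neighbours = [max(lo, x - step), min(hi, x + step)]
        best = min(neighbours, key=cost)
        if cost(best) >= cost(x):
            break
        x = best
    return x
```

Starting from `x = 20.0`, the search walks all the way up to the upper bound, where `cost` is negative; starting from `x = 0.0` it stays put. Nothing clever is happening. One possible fix, again only as an assumption about what a sensible penalty looks like, is to clamp it with `max(0.0, ...)` so that the downhill channel no longer exists.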

Basically, you don’t need to have got a good reason for taking the easy way out now. It is not even human, or “animal” to do that – it is simply a physical fact. When there exists an easier path, you simply take that – whether you are an “AI” or an algorithm or just steam!

I’ll leave you with this algo that decided to recognise sheep by looking for meadows (this is rather old stuff).

Order of guests’ arrival

When I’m visiting someone’s house and they have an accessible bookshelf, one of the things I do is to go check out the books they have. There is no particular motivation, but it’s just become a habit. Sometimes it serves as conversation starters (or digressors). Sometimes it helps me understand them better. Most of the time it’s just entertaining.

So at a friend’s party last night, I found this book on Graph Theory. I just asked my hosts whose book it was, got the answer and put it back.

As many of you know, whenever we host a party, we use graph theory to prepare the guest list. My learning from last night’s party, though, is that you should not only use graph theory to decide WHO to invite, but also to adjust the times you tell people so that the party has the best outcome possible for most people.

With the full benefit of hindsight, the social network at last night’s party looked approximately like this. Rather, this is my interpretation of the social network based on my knowledge of people’s affiliation networks.

This is approximate, and I’ve collapsed each family to one dot. Basically it was one very large clique, and two or three other families (I told you this was approximate) who were largely only known to the hosts. We were one of the families that were not part of the large clique.

This was not the first such party I was attending, btw. I remember this other party from 2018 or so which was almost identical in terms of the social network – one very large clique, and then a handful of families only known to the hosts. In fact, as it happens, the large clique from the 2018 party and from yesterday’s party were from the same affiliation network, but that is only a coincidence.

Thinking about it, we ended up rather enjoying ourselves at last night’s party. I remember getting comfortable fairly quickly, and that mood carrying on through the evening. Conversations were mostly fun, and I found myself connecting adequately with most other guests. There was no need to get drunk. As we drove back peacefully in the night, my wife and daughter echoed my sentiments about the party – they had enjoyed themselves as well.

This was in marked contrast with the 2018 party with the largely similar social network structure (and dominant affiliation network). There we had found ourselves rather disconnected, unable to make conversation with anyone. Again, all three of us had felt similarly. So what was different yesterday compared to the 2018 party?

I think it had to do with the order of arrival. Yesterday, we were the second family to arrive at the party, and from a strict affiliation group perspective, the family that had preceded us at the party wasn’t part of the large clique affiliation network (though they knew most of the clique from beforehand). In that sense, we started the party on an equal footing – us, the hosts and this other family, with no subgroup dominating.

The conversation had already started flowing among the adults (the kids were in a separate room) when the next set of guests (some of them from the large clique) arrived, and the assimilation was seamless. Soon everyone else arrived as well.

The point I’m trying to make here is that because the non-large-clique guests had arrived first, they had had a chance to settle into the party before the clique came in. This meant that they (non-clique) had had a chance to settle down without letting the party get too cliquey. That worked out brilliantly.

In contrast, in the 2018 party, we had ended up going rather late which meant that the clique was already in action, and a lot of the conversation had been clique-specific. This meant that we had struggled to fit in and never really settled, and just went through the motions and returned.

I’m reminded of another party WE had hosted back in 2012, where there was a large clique and a small clique. The small clique had arrived first, and by the theory in this post, should have assimilated well into the party. However, as the large clique came in, the small clique had sort of ended up withdrawing into itself, and I remember having had to make an effort to balance the conversation between all guests, and it not being particularly stress-free for me.

The difference there was that there were TWO cliques with me as cut-vertex. Yesterday, if you took out the hosts (cut-vertex), you would largely have one large clique and a few isolated nodes. And the isolated nodes coming in first meant they assimilated both with one another and with the party overall, and the party went well!
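The cut-vertex idea can be sketched in a few lines of Python. This is a simplified, hypothetical version of last night's network: the family names are placeholders, the clique is assumed to be fully connected, and the outsiders are assumed to know only the hosts (in reality one of them knew most of the clique). The sketch builds the graph, removes the hosts, and lists the connected components that remain.

```python
from itertools import combinations


def build_party_graph(clique, outsiders, host="hosts"):
    # Adjacency sets; every guest knows the host, and clique members
    # all know one another. Outsiders know nobody but the host.
    graph = {name: set() for name in [host, *clique, *outsiders]}

    def connect(a, b):
        graph[a].add(b)
        graph[b].add(a)

    for a, b in combinations(clique, 2):
        connect(a, b)
    for guest in [*clique, *outsiders]:
        connect(host, guest)
    return graph


def components_without(graph, removed):
    # Connected components of the graph after deleting one node,
    # found with a simple depth-first traversal.
    remaining = set(graph) - {removed}
    seen, components = set(), []
    for start in sorted(remaining):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(n for n in graph[node]
                         if n in remaining and n not in comp)
        seen |= comp
        components.append(comp)
    return components
```

With a four-family clique and two outsider families, `components_without(graph, "hosts")` gives one component of size four and two singletons, which is the "one large clique and a few isolated nodes" picture. The 2012 party would instead give two components that are both cliques.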

And now that I’ve figured out this principle, I might break my head further at the next party I host – in terms of what time I tell to different guests!

Sierpinski Triangles

On Saturday morning, my daughter had made some nice art with sketch pen on an A4 paper. It was rather “geometric” consisting of repeating patterns across the page. My wife took one look at it and said, “do you know that you can make such art with computers also? Your father has made some”.

Some drawings I had made using code, back in 2016

“Really?”, piped the daughter. I had been intending for a while to start teaching her to code (she is six), and figured this was the perfect trigger, and said I would teach her.

A quick search revealed that there is an “ACS Logo” for Mac (Logo was the first “programming language” I had learnt, when I was nine). I quickly downloaded it on her computer (my wife’s old Macbook Air) and figured I remembered most of the commands.

And then I started typing, and showed her what they had showed me back in a “computer class” behind my house in 1992 – FD for “forward”. RT for right turn. HT for hide turtle. Etc. Etc.

Soon she was engrossed in it. Thankfully she has learnt angles in her school, though it took her some trial and error to figure out how much to turn by for different shapes (later I was thinking this can also serve as a good “angles revision” for her during her ongoing summer holidays).

With my wife having reminded me that I could produce images through code, I realised that as my daughter was engrossed in her “coding”, I should do some “coding art” on my own. All she needed was some occasional input, and for me to sit right next to her.

Last Monday I had got a bit of a scare – at work, I needed to generate randomly distributed points in a regular hexagon. A lookup online told me that I could just get a larger number of randomly distributed points in a bounding rectangle, and then only pick points within the hexagon. And then take a random sample of those.

This had meant that I needed to write equations for whether a point lay inside a hexagon. And I realised I’d forgotten ALL my coordinate geometry. It took me over half an hour to get the equation for the sides of the hexagon right – I’m clearly rusty.
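For reference, here is a minimal Python sketch of that approach (illustrative only, not the code from work). It assumes a regular hexagon centred at the origin with two vertices on the x-axis and circumradius `r`; the function names are made up. It also differs slightly from the recipe above: instead of oversampling and then taking a random sample, it keeps drawing from the bounding rectangle until enough points have been accepted.

```python
import math
import random


def in_hexagon(x, y, r=1.0):
    # Regular hexagon with vertices at (+-r, 0) and (+-r/2, +-r*sqrt(3)/2).
    # By symmetry it is enough to test |x| and |y| against the flat
    # top edge and the slanted edge sqrt(3)*x + y = sqrt(3)*r.
    half_height = r * math.sqrt(3) / 2
    return (abs(y) <= half_height
            and math.sqrt(3) * abs(x) + abs(y) <= math.sqrt(3) * r)


def sample_hexagon(n, r=1.0, rng=None):
    # Rejection sampling: draw uniformly from the bounding rectangle
    # and keep only the points that land inside the hexagon.
    rng = rng or random.Random()
    half_height = r * math.sqrt(3) / 2
    points = []
    while len(points) < n:
        x = rng.uniform(-r, r)
        y = rng.uniform(-half_height, half_height)
        if in_hexagon(x, y, r):
            points.append((x, y))
    return points
```

The hexagon covers three quarters of its bounding rectangle, so on average about one draw in four is thrown away.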

And on Saturday, as I sat down to make some “computer art”, I decided I’ll make some fractals. Why don’t I make some Sierpinski Triangles, I thought. I started breaking down what code I needed to write.

First, given an equilateral triangle, I had to return three similar equilateral triangles, each of half the side length of the original triangle.

Then, given the centroid of an equilateral triangle and the length of each side, I had to return the vertices.

Once these two functions had been written, I could just chain them (after running the first one recursively) and then had to just plot to get the Sierpinski triangle.
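Those two functions and the chaining step can be sketched as below. This is a Python illustration rather than the R code I actually wrote, and it makes some assumptions of its own: a triangle is represented as a `(cx, cy, side)` tuple (centroid plus side length), it always points upwards, and the plotting step is left out.

```python
import math


def vertices(cx, cy, side):
    # Vertices of an upward-pointing equilateral triangle, given its
    # centroid and side length. The centroid-to-vertex distance is
    # side / sqrt(3); the base sits half that distance below the centroid.
    r = side / math.sqrt(3)
    return [(cx, cy + r),
            (cx - side / 2, cy - r / 2),
            (cx + side / 2, cy - r / 2)]


def split(cx, cy, side):
    # Three corner triangles of half the side length. Each one's centroid
    # is the midpoint between the parent's centroid and one parent vertex.
    return [((cx + vx) / 2, (cy + vy) / 2, side / 2)
            for vx, vy in vertices(cx, cy, side)]


def sierpinski(cx, cy, side, depth):
    # Apply split recursively; returns the 3**depth small triangles
    # that remain filled at the given depth.
    if depth == 0:
        return [(cx, cy, side)]
    return [t for child in split(cx, cy, side)
            for t in sierpinski(*child, depth - 1)]
```

Calling `vertices(*t)` on each tuple returned by `sierpinski` gives the corner points to hand to whatever plotting routine is being used.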

And then I had my second scare of the week – not only had I forgotten my coordinate geometry – I had forgotten my trigonometry as well. Again I messed up a few times, but the good thing about programming with a computer is that I could do trial and error. Soon I had it right, and started producing Sierpinski triangles.

Then, there was another problem – my code was really inefficient. If I went beyond depth 4 or 5, the figures would take inordinately long to render. Since I was coding in R, I set about vectorising all my code. In R you don’t write loops if you can help it – instead, you apply functions on entire vectors. This again took some time, and then I had the triangles ready. I proudly showed them off to my daughter.

“Appa, why is it that as you increase the number it becomes greyer”, she asked. I explained how with each step, you were taking away more of the filled areas from the triangles. Then I figured this wasn’t that good-looking – maybe I should colour it.

And so I wrote code to colour the triangles. Basically, I started recursively colouring them – the top third green, left third red and right third blue (starting with a red base). This is what I ended up producing:

And this is what my daughter produced at the same time, using Logo:

I forgot to “HT” before taking the screenshot. This is a “lollipop”