New blood joins this team

I intended to write this a year ago, when Sadio Mane left Liverpool after six brilliant years at the club. There was much heartbreak among the club fan base about Mane leaving, and a lot of people saw it as a failure on the part of the management and ownership in terms of not being able to keep him.

Now, a year on, I admit that Darwin Nunez hasn’t quite set the club on fire (though I personally quite like him). But as a general principle, this kind of “freshening up” is a necessary process in a team if you want to avoid stagnation.

A month or two back, I was watching some YouTube video on “Liverpool’s greatest Premier League goals against Manchester City” (this was just before the 4-1 hammering at the Etihad). As the goals were shown one by one, I kept trying to guess which season and game it was in.

There were important clues – whether Firmino wore 9 or 11, whether Mane wore 19 or 10, the identity of some players, the length of Trent Alexander-Arnold’s hair, my memory of the scoreline from that game, and so on. (Liverpool always wear the home red at the Etihad, so the colour of the away kit wasn’t a clue.)

However, for one goal I simply wasn’t able to figure out which season it was. There was TAA wearing 66, Fabinho, Henderson, the fab front three (Firmino-Mane-Salah, wearing 9-10-11 respectively) and Robertson. That’s when it hit me that for a fairly long time, a large part of Liverpool’s team had stayed constant! There was very little change at the club.

Now, there are benefits to having a consistently settled team (as the fabulous 2021-22 season showed), but there is also the danger of stasis. In something like football where careers are short, you don’t want the whole team “getting old together”. In the corporate world, people can get into too much of a comfort zone. And cynicism can set in.

Good new employees come in buzzing with ideas, fearless about what has been rejected before and about who thinks what. As people spend longer in the organisation, though, colleagues become predictable and certain ways of doing things become institutionalised. Sooner than you know it, you have become a “company man”, (figuratively) wearing the same white shirt and blue suit as your fellow company men, and socialising with your colleagues at the (figurative) company club.

Companies differ here – some allow people to retain a lot of their individuality, and there the “decay” into company-manhood is slower. In this kind of place, the same set of people can stay together for longer and still continue to innovate and add significant value to one another.

Other companies are less forgiving: you assimilate very quickly, and lose part of your idiosyncrasy. Insofar as innovation comes out of fresh ideas and unusual connections, these companies are not very good at it. And in such companies, pretty much the only way to keep the innovation wheel going and continue to add value is to bring in fresh blood at a faster rate.

Putting it another way, if you are a cohesive kind of company, some attrition may not actually be a bad thing (unless you are growing rapidly enough that you need to be expanding your team anyway). To grow and innovate, you need people to think different.

And you get there either by having the sort of superior culture where existing employees continue to think different long after they’ve been exposed to one another’s thoughts; or by continuing to bring in fresh employees.

There is no other way.

Round Tables

One of the “features” of being in a job is that you get invited to conferences and “industry events”. I’ve written extensively about one of them in the past – the primary purpose of these events is for people to be able to sell their companies’ products, their services and even themselves (job-hunting) to other attendees.

Now, everyone knows that this is the purpose of these events, but it is one of those things that is hard to admit. “I’m going to this hotel to get pitched to by 20 vendors” is not usually a good enough reason to bunk work. So there is always a “front” – an agenda that makes it seemingly worthy for people to attend these events.

The most common one is to have talks. This can help attract people at two levels. There are some people who won’t attend talks unless they have also been asked to talk, and so they get invited to talk. And then there are others who are happy to just attend and try to get “gyaan”, and they get invited as the audience. The other side of the market soon appears, paying generous dollars to hold the event at a nice venue, and to be able to sell to all the speakers and the audience.

Similarly, you have panel discussions. Organisers generally think this is one level better than talks – instead of the audience being bored by ONE person for half an hour, they are bored by 4-5 people (and one moderator) for an hour. Again there is a hierarchy here – some people won’t attend unless they have been put on the panel. And who gets to be on the panel is a function of how desperate one or more sponsors are to sell to the potential panellists.

The one thing most of these events get right is to have sufficient lunch and tea breaks for people to talk to each other. Then again, these are brilliant times for sponsors to be able to sell their wares to the attendees. And it has the positive externality that people can meet and “network” and talk among themselves – which is the best value you can get out of an event like this one.

However, there is one kind of event that I’ve attended a few times, but I can’t understand how they work. This is the “round table”. It is basically a closed room discussion with a large number of invited “panellists”, where everyone just talks past each other.

Now, at one level I understand this – this is a good way to get a large number of people to sell to without necessarily putting a hierarchy in terms of “speakers” / “panellists” and “audience”. The problem is that what they do with these people is beyond my imagination.

I’ve attended two of these events – one online and one offline. The format is the same. There is a moderator who goes around the table (not necessarily in any particular order), with one question to each participant (the better moderators would have prepared well for this). And then the participant gives a long-winded answer to that question, and the answer is not necessarily addressed at any of the other participants.

The average length of each answer and the number of participants means that each participant gets to speak exactly once. And then it is over.

The online version of this was the most underwhelming event I ever attended – I didn’t remember anything of what anyone said, and assumed that the feeling was mutual. I didn’t even bother checking out these people on LinkedIn after the event was over.

The offline version I attended was better in that at least we could talk to each other after the event. But the event itself was rather boring – I’m pretty sure I bored everyone with my monologue when it was my turn, and I don’t remember anything that anyone else said. The funny thing was that the event wasn’t recorded, and there was hardly anyone from the organising team at the discussion. There was just no point to all of us talking for so long. It was like people who organise Satyanarayana Poojes as an excuse to have a party at home.

I’m wondering how this kind of event can be structured better. I fully appreciate the sponsors and their need to sell to the lot of us. And I fully appreciate that it gives them more bang for the buck to have 20 people of roughly equal standing to sell to – with talks or panels, the “potential high value customers” can be fewer.

However – wouldn’t it be far more profitable to them to be able to spend more time actually talking to the lot of us and selling, rather than getting all of us to waste time talking nonsense to each other? Like – maybe just a party or a “lunch” would be better?

Then again – if you want people to travel inter-city to attend this, a party is not a good enough excuse for people to get their employers to sponsor their time and travel. And so something inane like the “round table” has to be invented.

PS: There is this school of thought that temperatures in offices and events are set at a level that is comfortable for men but not for women. After one recent conference I attended I have a theory on why this is the case. It is because of what is “acceptable formal wear” for men and women.

Western formal wear for men is mostly the suit, which means dressing up in lots of layers, and maybe even constraining your neck with a tie. And when you are wearing so many clothes, the environment better be cool else you’ll be sweating.

For women, however, formal wear need not be so constraining – it is perfectly acceptable to wear sleeveless tops, or dresses, for formal events. And the temperatures required to “air” the suit-wearers can be too cold for women.

At a recent conference I was wearing a thin cotton shirt and could thus empathise with the women.


Shrinking deadlines

I’m reminded of this old joke/riddle, which also happened to feature in Gowri Ganesha. “If a 1 metre long sari takes 1 hour to dry in the sun, how long will an 8 metre long sari take to dry?”

The instinctive answer, of course, is 8 hours, while if you think about it (and assume that you have enough clothesline space to not need to fold), the correct answer is likely to be 1 hour.

Now this riddle is completely unconnected to the point of this post, except that both have to do with time.

And then one day you find, ten years have got behind you.
No one told you when to run. You missed the starting gun. 

Ok enough distractions. I’m now home, home again.

Modern workspaces are synonymous with tight deadlines. Even when you give a conservative estimate on how long something will take, you get asked to compress the timelines further. If you protest too much and say that there is a lot to be done, sometimes you might get asked to “put one more person on the job and get it done quickly”.

This might work for routine, or “fighter” jobs – for example, if your job is to enter and copy data for (let’s say) 1000 records, you can easily put another person on the job, and the entire job will be done in about half the time (allowing for a little time for the new person to learn the job and for coordination).

As the job gets more complex, this gets harder. For one, the new person needs to spend more time learning the job. And as the job gets more complex, it also gets harder to divide and conquer, or to “specialise”. This means the new person coming in has less impact.

And as you get closer and closer to the stud end of the spectrum, the advantage of putting more people on the job to get it done faster gets smaller and smaller. There comes a point when the extra person actively becomes a liability. Again – I’m reminded of my childhood, when I would occasionally ask my mother if she needed help in cooking. “Yes, the best way for you to help is to stay out of the kitchen”, she would say.
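The spectrum above can be put into a toy model (all numbers here are made up, purely for illustration): each job has a part that parallelises cleanly and a coordination overhead that grows with every pair of people who must stay in sync.

```python
# A toy model of how adding people changes completion time.
# Assumption: a job splits into a divisible part (parallelises cleanly)
# and an indivisible part, plus a per-pair coordination overhead.

def completion_time(work, divisible_fraction, n_people, sync_cost=1.0):
    serial = work * (1 - divisible_fraction)         # cannot be split
    parallel = work * divisible_fraction / n_people  # splits cleanly
    coordination = sync_cost * n_people * (n_people - 1) / 2
    return serial + parallel + coordination

# A routine ("fighter") job is almost fully divisible: a second person
# nearly halves the time.
print([completion_time(100, 0.95, n) for n in (1, 2, 4)])

# A complex ("stud") job is mostly indivisible: extra people barely help,
# and soon the coordination overhead makes things actively worse.
print([completion_time(100, 0.1, n) for n in (1, 2, 4)])
```

The exact functional forms are my assumption; the point is only the shape of the curves – for the routine job time keeps falling as people are added, while for the complex job it stalls quickly and then rises.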

And then when the job gets really creative, there is a further limit on compression – a lot of the work is done “offline”. I keep telling people about how I finally discovered the proof for the Ramsey number R(3,3) while playing table tennis in my hostel, or how I solved a tough assignment problem while taking a friend’s new motorcycle for a ride.
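For the curious, the Ramsey result mentioned here is the classic R(3,3) = 6. It can be verified by brute force (this is just a mechanical check, not the proof referred to above): every red/blue colouring of the edges of K6 contains a monochromatic triangle, while K5 admits a colouring that doesn’t (colour the edges of a 5-cycle red and its diagonals blue).

```python
# Brute-force verification that R(3,3) = 6.
from itertools import combinations, product

def has_mono_triangle(n, colouring):
    # colouring maps each edge (i, j) with i < j to one of two colours
    return any(
        colouring[(a, b)] == colouring[(b, c)] == colouring[(a, c)]
        for a, b, c in combinations(range(n), 3)
    )

def every_colouring_has_triangle(n):
    edges = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(edges, colours)))
        for colours in product((0, 1), repeat=len(edges))
    )

print(every_colouring_has_triangle(6))  # True: K6 always has one
print(every_colouring_has_triangle(5))  # False: K5 can avoid it
```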

When you want to solve problems “offline” (to let the insight come to you rather than going hunting for it – I had once written about this) – there is no way to shorten the process. You need to let the problem stew in your head, and hope that some time it will get solved.

There is nothing that can be done here. The more you hurry up, the less the chances you give yourself of solving the problem. Everything needs to take its natural course.

I got reminded of it when we missed a deadline last Friday, and I decided to not think about it through the weekend. And then, an hour before I got to work on Monday, an idea occurred to me in the shower that fixed the problem. Even if I’d stressed myself (and my team) out on Friday, or done somersaults, the problem would not have been solved.

As I’d said in 2004, quality takes time.

Pre-trained models

On Sunday evening, we were driving to a relative’s place in Mahalakshmi Layout when I almost missed a turn. And then I was about to miss another turn and my wife said “how bad are you with directions? You don’t even know where to turn!”.

“Well, this is your area”, I told her (she grew up in Rajajinagar). “I had very little clue of this part of town till I married you, so it’s no surprise I don’t know how to go to your cousin’s place”.

“But they moved into this house like six months ago, and every time we’ve gone there together. So if I know the route, why can’t you?”, she retorted.

This gave me a trigger to go off on a rant on pre-trained models, and I’m going to inflict that on you now.

For a long time, I didn’t understand what the big deal was about pre-trained machine learning models. “If it’s trained on some other data, how will it even work with my data?”, I wondered. And then recently I started using GPT-4 and other similar large language models, and started reading blog posts on how, with very little fine-tuning, these models can do “gymnastics”.

Having grown up in North Bangalore, my wife has a “pretrained model” of that part of town in her head. This means she has sufficient domain knowledge, even if she doesn’t have any specific knowledge. Now, with a small amount of new specific information (the way to her cousin’s new house, for example), it is easy for her to fit the specific information into her generic knowledge and get a clear idea of how to get there.

(PS: I’m not at all suggesting that my wife’s intelligence is artificial here)

On the other hand, my domain knowledge of North Bangalore is rather weak, despite my having lived there for two years. For the longest time, Malleswaram was a Chakravyuha – I would know how to go there, but not how to get back. Given this lack of domain knowledge, the little information on the way to my wife’s cousin’s new house is not sufficient for me to find my way there.

It is similar with machines. LLMs and other pre-trained models have sufficient “generic domain knowledge” of lots of things, thanks to the large amounts of data they’ve been trained on. As a consequence, if you train them on a fairly small sample of specific data, they are able to fit that specific data into their generic knowledge and generalise from it.

More pertinently, in real life, depending on our “generic domain knowledge” of different domains, the amount of information that you and I need in order to learn the same thing about a given domain can be very, very different.

Everything is context-sensitive!

Channelling

I’m writing this five minutes after making my wife’s “coffee decoction” using the Bialetti Moka pot. I don’t like chicory coffee early in the morning, and I’m trying to not have coffee soon after I wake up, so I haven’t made mine yet.

While I was filling the coffee into the Moka pot, I was thinking of the concept of channelling. Basically, if you try to pack the Moka pot too tight with coffee powder, the steam (which goes through the grounds, extracting the coffee) takes the easy way out – it creates a coffee-less channel to pass through, rather than doing the hard work of extracting coffee as it passes through the layer of grounds.

I’m talking about steam here – water vapour, to be precise. It is as lifeless as it could get. It is the gaseous form of a colourless odourless shapeless liquid. Yet, it shows the seeming “intelligence” of taking the easy way out. Fundamentally this is just physics.

This is not an isolated case. Last week, at work, I was wondering why some algorithm was returning a “negative cost” (I’m using local search there, and after a few iterations I found that the algorithm was rapidly taking the cost – which is supposed to be strictly positive – into deep negative territory). Upon careful investigation (thankfully it didn’t take too long), it transpired that there was a penalty cost which increased non-linearly with some parameter. The algo had “figured” that if this parameter went really high, the penalty cost would go negative (basically I hadn’t done a good job of defining the penalty). And so it would take this channel.
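The bug can be sketched in miniature (this is my own hypothetical construction – the cost functions, numbers, and penalty shape are invented, not those of the actual system): a penalty that is supposed to grow with a parameter instead flips negative beyond a point, and a greedy local search drives the cost deep below zero through that channel. Redefining the penalty to be genuinely increasing closes it.

```python
# Hypothetical illustration of "channelling" in local search. The buggy
# penalty was intended to discourage large x, but its quadratic term
# makes it negative beyond x = 20 – a channel the optimiser exploits.

def base_cost(x):
    # the "real" objective: strictly positive, minimised near x = 3
    return abs(x - 3) + 5

def penalty_buggy(x):
    return 10 * x - 0.5 * x ** 2   # goes negative for x > 20

def penalty_fixed(x):
    return 0.1 * x ** 2            # the intent: always grows with x

def local_search(cost, x=0.0, steps=(1.0, 5.0, 25.0), iters=60):
    """Greedy search: jump to the best neighbour while it improves."""
    for _ in range(iters):
        neighbours = [x + s for s in steps] + [max(0.0, x - s) for s in steps]
        best = min(neighbours, key=cost)
        if cost(best) >= cost(x):
            break
        x = best
    return x, cost(x)

x_bug, c_bug = local_search(lambda x: base_cost(x) + penalty_buggy(x))
x_fix, c_fix = local_search(lambda x: base_cost(x) + penalty_fixed(x))
print(x_bug, c_bug)  # x blows up, cost deeply negative: the channel
print(x_fix, c_fix)  # settles at x = 3 with a sensible positive cost
```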

Again, this algorithm has none of the supposedly scary “AI” or “ML” in it. It is a good old rule-based system, where I’ve defined all the parameters and only the hard work of finding the optimal solution is left to the algo. And yet, it “channelled”.

Basically, nothing needs a good reason to take the easy way out. It is not even a human, or “animal”, trait – it is simply a physical fact. When there exists an easier path, you simply take it – whether you are an “AI” or an algorithm or just steam!

I’ll leave you with this algo that decided to recognise sheep by looking for meadows (this is rather old stuff).

Order of guests’ arrival

When I’m visiting someone’s house and they have an accessible bookshelf, one of the things I do is to go check out the books they have. There is no particular motivation, but it’s just become a habit. Sometimes it serves as conversation starters (or digressors). Sometimes it helps me understand them better. Most of the time it’s just entertaining.

So at a friend’s party last night, I found this book on Graph Theory. I just asked my hosts whose book it was, got the answer and put it back.

As many of you know, whenever we host a party, we use graph theory to prepare the guest list. My learning from last night’s party, though, is that you should not only use graph theory to decide WHO to invite, but also to adjust the times you tell people so that the party has the best outcome possible for most people.

With the full benefit of hindsight, the social network at last night’s party looked approximately like this. Rather, this is my interpretation of the social network based on my knowledge of people’s affiliation networks.

This is approximate, and I’ve collapsed each family to one dot. Basically it was one very large clique, and two or three other families (I told you this was approximate) who were largely only known to the hosts. We were one of the families that were not part of the large clique.

This was not the first such party I was attending, btw. I remember this other party from 2018 or so which was almost identical in terms of the social network – one very large clique, and then a handful of families only known to the hosts. In fact, as it happens, the large clique from the 2018 party and from yesterday’s party were from the same affiliation network, but that is only a coincidence.

Thinking about it, we ended up rather enjoying ourselves at last night’s party. I remember getting comfortable fairly quickly, and that mood carrying on through the evening. Conversations were mostly fun, and I found myself connecting adequately with most other guests. There was no need to get drunk. As we drove back peacefully in the night, my wife and daughter echoed my sentiments about the party – they had enjoyed themselves as well.

This was in marked contrast with the 2018 party with the largely similar social network structure (and dominant affiliation network). There we had found ourselves rather disconnected, unable to make conversation with anyone. Again, all three of us had felt similarly. So what was different yesterday compared to the 2018 party?

I think it had to do with the order of arrival. Yesterday, we were the second family to arrive at the party, and from a strict affiliation group perspective, the family that had preceded us at the party wasn’t part of the large clique affiliation network (though they knew most of the clique from beforehand). In that sense, we started the party on an equal footing – us, the hosts and this other family, with no subgroup dominating.

The conversation had already started flowing among the adults (the kids were in a separate room) when the next set of guests arrived (some of them from the large clique), and the assimilation was seamless. Soon everyone else arrived as well.

The point I’m trying to make here is that because the non-large-clique guests had arrived first, they had a chance to settle into the party before the clique came in, without the party getting too cliquey. That worked out brilliantly.

In contrast, in the 2018 party, we had ended up going rather late which meant that the clique was already in action, and a lot of the conversation had been clique-specific. This meant that we had struggled to fit in and never really settled, and just went through the motions and returned.

I’m reminded of another party WE had hosted back in 2012, where there was a large clique and a small clique. The small clique had arrived first, and by the theory in this post, should have assimilated well into the party. However, as the large clique came in, the small clique had sort of ended up withdrawing into itself, and I remember having had to make an effort to balance the conversation between all guests, and it not being particularly stress-free for me.

The difference there was that there were TWO cliques, with me as the cut-vertex. Yesterday, if you took out the hosts (the cut-vertex), you would largely have one large clique and a few isolated nodes. And the isolated nodes coming in first meant they assimilated both with one another and with the party overall, and the party went well!
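The cut-vertex observation can be sketched with a hypothetical guest list (the names and the graph are made up for illustration; standard library only): removing the hosts disconnects the party graph into the clique plus the isolated families, which is exactly what makes the hosts an articulation point.

```python
# Sketch of the party graph: one large clique, two families known only
# to the hosts. Counting connected components after deleting a node
# tells us whether that node is a cut-vertex (articulation point).
from collections import deque

def components_without(graph, removed):
    remaining = set(graph) - {removed}
    seen, count = set(), 0
    for start in remaining:
        if start in seen:
            continue
        count += 1                       # new component found; BFS it
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            queue.extend(n for n in graph[node]
                         if n in remaining and n not in seen)
    return count

clique = {"A", "B", "C", "D"}            # the large affiliation clique
isolated = {"us", "family2"}             # known only to the hosts
party = {}
for g in clique:
    party[g] = (clique - {g}) | {"hosts"}
for g in isolated:
    party[g] = {"hosts"}
party["hosts"] = clique | isolated

print(components_without(party, "hosts"))  # 3: the hosts are a cut-vertex
print(components_without(party, "A"))      # 1: a clique member is not
```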

And now that I’ve figured out this principle, I might break my head further at the next party I host – in terms of what time I tell different guests!

Optimal quality of beer

Last evening I went for drinks with a few colleagues. We didn’t think or do much in terms of where to go – we just minimised transaction costs by going to the microbrewery on the top floor of our office building. This meant that after the session those of us who were able (and willing) to drive back could just go down to the basement and drive back. No “intermediate driving”.

Of course, if you want to drive back after you’ve gone for drinks, you need to keep your alcohol consumption in check. And when you know you are going for a longish session, that is tricky. And that’s where the quality of beer matters.

In a place like Arbor, which makes absolutely excellent beer, “one beer” is a hard thing to pull off (though I exercised great willpower in doing just that the last time I’d gone for drinks with colleagues – back in February). And after a few recent experiences, I’ve concluded that beer is the best “networking drink” – it offers the optimal amount of “alcohol per unit time” (wine and whisky I tend to consume at a faster rate, and end up getting too drunk too quickly). So if you go to a place that serves bad beer, that isn’t great either.

This is where the quality of beer at a middling (for a Bangalore microbrewery) place like Bangalore Brewworks works perfectly – it’s decent enough that you are able to drink it (and not something that delivers more ethanol per unit time), but also not so good that you gulp it down (like I do with the Beach Shack at Arbor).

And this means that you can get through a large part of the session (where the counterparties down several drinks) on your one beer – you stay within reasonable alcohol limits, are not buzzed at all, and are easily able to drive. Then you down a few glasses of iced water and you’re good to go!

Then again, when I think about it, nowadays I go out for drinks so seldom that maybe this strategy is not so optimal at all – next time I might as well go to Arbor and take a taxi home.

The Law Of Comparative Advantage and Priorities

Over a decade ago I had written about two kinds of employees – those who offer “competitive advantage” and those who offer “comparative advantage”.

Quoting myself:

So in a “comparative advantage” job, you keep the job only because you make it easier for one or more colleagues to do more. You are clearly inferior to these colleagues in all the “components” of your job, but you don’t get fired only because you increase their productivity. You become the Friday to their Crusoe.

On the other hand, you can keep a job for “competitive advantage”. You are paid because there are one or more skills that the job demands in which you are better than your colleagues.

Now, one issue with “comparative advantage” jobs is that sometimes it can lead to people being played out of position. And that can reduce the overall productivity of the team, especially when priorities change.

Let’s say you have 2 employees A and B, and 2 high-priority tasks X and Y. A dominates B – she is better and faster than B in both X and Y. In fact, B cannot do X at all, and is inferior to A when it comes to Y. Given these tasks and employees, the theory of comparative advantage says that A should do X and B should do Y. And that’s how you split it.

In the real world, though, there can be a few issues – A might be better at X than B, but she just doesn’t want to do X. Secondly, by putting the slower B on Y, there is a floor on how soon Y can be delivered.

And if for some reason Y becomes high priority for the team, with the current work allocation there is no option other than to wait for B to finish Y, or to get A to work on Y as well (thus leaving X in the lurch, and the otherwise good A unhappy). A sort of no-win situation.
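The toy example above can be put in numbers (the figures are mine, purely for illustration):

```python
# A dominates B: she is faster at everything, and B cannot do X at all.
import math

days = {
    ("A", "X"): 2, ("A", "Y"): 3,
    ("B", "X"): math.inf,  # B cannot do X
    ("B", "Y"): 6,
}

# Comparative advantage: A takes X (only she can do it), B takes Y.
allocation = {"X": "A", "Y": "B"}
finish = {task: days[(person, task)] for task, person in allocation.items()}
print(finish)  # X done in 2 days, but Y is floored at B's 6 days

# If Y suddenly becomes top priority, both options are bad:
wait_for_b = finish["Y"]                           # still 6 days for Y
a_does_both = days[("A", "Y")] + days[("A", "X")]  # Y by day 3, but X slips to day 5
print(wait_for_b, a_does_both)
```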

The whole team ends up depending on the otherwise weak B, a sort of version of this:

A corollary is that if you have been given what seems like a major responsibility it need not be because you are good at the task you’ve been given responsibility for. It could also be because you are “less worse” than your colleagues at this particular thing than you are at other things.


Lifting and arithmetic

At a party we hosted recently, we ended up talking a lot about lifting heavy weights in the gym. In the middle of the conversation, my wife wondered loudly as to why “so many intelligent people are into weightlifting nowadays”. A few theories got postulated in the following few minutes but I’m not going to talk about that here.

Anecdotally, this is true. The two people I hold responsible for getting me to lift heavy weights are both people I consider rather intelligent. I discuss weights and lifting with quite a few other friends as well. Nassim Taleb, for a long time, kept tweeting about deadlifts, though now he has dialled back on strength training.

In 2012 or 2013 I had written about how hard it was to maintain a good diet and exercise regime. While I had stopped being really fat in 2009, my weight had started creeping up again and my triglyceride numbers hadn’t been good. I had found it hard to stick to a diet, and found the gym rather boring.

In response, one old friend (one of the intelligent people I mentioned above) sent me Mark Rippetoe’s Starting Strength (and a few other articles on cutting carbs, and high-fat diets). Starting Strength, in a way, brought back geekery into the gym, which had until then been taken over by “gym bros” doing bicep curls and staring into mirrors.

It’s been a long time since I read it, but it’s fascinating – I remember reading it and thinking it reminded me of IIT-JEE physics. He draws free body diagrams to explain why you should maintain a straight bar path. He talks about “moment arms” to explain why the bar should be over your mid-foot while deadlifting (ok this book we did discuss at the party in response to my wife’s question).

However, two incidents that happened last week gave me an idea on why “intelligent people” are drawn to lifting heavy barbells. It’s about challenging yourself to the right extent.

The gym that I go to (a rather kickass gym) has regular classes that most members attend. These classes focus on functional fitness (among other things, everyone is made to squat and press and deadlift), but I’ve for long found that these classes bore me so I just do my own thing (squats, press / bench and deadlift, on most days). Occasionally, though, like last Friday, I decide to “do the class”. And on these occasions, I remember why I don’t like the class.

The problem with the gym class is that I get bored. Most of the time, the exercises are of the sort where you lift well below capacity on each lift, but do a lot of lifts. They train you not just for strength but also for endurance and metabolic conditioning. The problem for me is that because every single repetition is not challenging, I get bored. “Why do I need to do so much?”, I think. Last Friday I exited the class midway, bored.

My daughter is having her school holidays, and one of the things we have figured out is that while she has grasped all her maths concepts rather soundly (the Montessori system does a good job of that), she has completely failed to mug up her tables. If I ask her what “7 times 4” is (for example), she takes half a minute, adds 7 four times, and tells me.

Last Monday, I printed out (using Excel) all combinations of single-digit multiplications and told her she “better mug it up by Friday”. She completely refused. There was no headway in her “learning”. So I resorted to occasionally asking her simple arithmetic questions and making her answer immediately. While waiting to cross the road on a walk: “what is six times eight?”. While waiting for the baker to give us bread: “you gave him ₹100 and the bread costs ₹40. How much change should he give you?”. And so on.
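(For what it’s worth, the drill sheet – the post used Excel – is a couple of lines of Python; this is just a trivial sketch of the same table.)

```python
# Every single-digit multiplication combination, as a printable drill.
from itertools import product

table = [f"{a} x {b} = {a * b}" for a, b in product(range(1, 10), repeat=2)]
print(len(table))  # 81
print(table[:3])   # ['1 x 1 = 1', '1 x 2 = 2', '1 x 3 = 3']
```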

She would occasionally answer, but her boredom was evident. The concept learning had been challenging for her, and she had learnt it. But this “repetitive practice” was boring, and she would refuse to do it.

Then, last Friday, I decided to take it up a notch. I suddenly asked “what is four and a half times eight?” (she’s done fractions in school). This was a gamechanger.

Suddenly, by dialling up the challenge, she got interested, and with some prodding gave me the correct answer. An hour earlier, she had struggled for a minute to tell me what 8 times 7 is. However, when I asked her “what is eight times seven and a half?” she responded in a few seconds, “eight times seven is fifty six..” (and then proceeded to complete the solution).

Having exited my gym class midway just that morning, I was now able to make sense of everything. Practicing simple arithmetic for her is like light weight lifting for me. “Each rep” is not challenging in either case, and so we get bored and don’t want to do it. Dial up the challenge a little bit, such as bringing in fractions or making the weights very heavy, and now every rep is a challenge. The whole thing becomes more fun.

And if you are the type that easily gets bored and wants each unit of work to be challenging, barbell training is an obvious way to exercise. And “intelligent” people are more likely to get bored of routine stuff. And so they are taking to lifting heavy weights.

I guess my wife has her answer now.


The Second Great Wall (of programming)

Back in 2000, I entered the Computer Science undergrad program at IIT Madras thinking I was a fairly competent coder. In my high school, I had a pretty good reputation in terms of my programming skills and had built a whole bunch of games.

By the time half the course was done I had completely fallen out of love with programming, deciding a career in Computer Science was not for me. I even ignored Kama (current diro)’s advice and went on to do an MBA.

What had happened? Basically it was a sudden increase in the steepness of the learning curve. Or that I’m a massive sucker for user experience, which the Computer Science program didn’t care for.

Back in school, my IDE of choice (rather the only one available) was TurboC, a DOS-based program. You would write your code, and then hit Ctrl+F9 to run the program. And it would just run. I didn’t have to deal with any technical issues. Looking back, we had built some fairly complex programs just using TurboC.

And then I went to IIT and found that there was no TurboC, no DOS. Most computers there had an ancient version of Unix (or worse, Solaris). These didn’t come with nice IDEs such as TurboC. Instead, you had to use vi (some of the computers were so old they didn’t even have vim) to write the code, and then compile it from outside.

Difficulties in coming to terms with vi meant that my typing speed dropped. I couldn’t “code at the speed of thought” any more. This was the first giving-up moment.

Then, I discovered that C++ now had this new “standard template library” (STL), with vectors and stuff. This was very alien to the way I had learnt C++ in school. I also found that some of my classmates were very proficient with it, and I just couldn’t keep up. The effort seemed too much (and the general workload of the program was so high that I couldn’t get much time for “learning by myself”), so I gave up once again.

Next, I figured that a lot of my professors were suckers for graphical UIs (even as they denied us good UX by denying us good computers). This, circa 2001-02, meant programming in Java and writing applets. It was a massive step up in complexity (and “boringness”) compared to the crisp C/C++ code I was used to writing. I gave up yet again.

I wasn’t done with giving up yet. Beyond all of this, there was “systems programming”. You had to write some network layers and stuff. You had to go deep into the workings of the computer system to get your code to run. This came rather intuitively to most of my engineering-minded classmates. It didn’t to me (programming in C was the “deepest” I could grok). And I gave up even more.

A decade after I graduated from IIT Madras, I visited IIM Calcutta to deliver a lecture. And got educated.

I did my B.Tech. project in “theoretical computer science”, managed to graduate and went on to do an MBA. Just before my MBA, I was helping my father with some work, and he figured I sucked at Excel. “What is the use of completing a B.Tech. in computer science if you can’t even do simple Excel stuff?”, he thundered.

In IIMB, all of us bought computers with pirated Windows and Office. I started using Excel. It was an absolute joy. It was a decade before I started using Apple products, but the UX of Windows was such a massive upgrade compared to what I’d seen in my more technical life.

In my first job (where I didn’t last long) I learnt the absolute joy of Visual Basic macros for Excel. This was another level of unlock. I did some insane gymnastics in that. I pissed off a lot of people in my second job by replicating what they thought was a complex model on an Excel sheet. In my third job, I replaced a guy on my team with an Excel macro. My programming mojo was back.

Goldman Sachs’s SLANG was even better. By the time I left there, I had learnt R as well. And then I became a “data scientist”. People asked me to use Python. I struggled with it. After the user experience of R, it was too complex, and it brought back bad memories of all the systems programming and bad UX I had encountered in my undergrad. This time I was in control (I was a freelancer), so I didn’t need to give up – I was able to get all my work done in R.

The second giving up

I’ve happily used R for most of my data work in the last decade. Recently at work I started using Databricks (I still write my code in R there, using sparklyr), and I’m quite liking that as well. However, in the last 3-4 months there have been a lot of developments in “AI” that I’ve wanted to explore.

The unfortunate thing is that most of this is available only in Python. And the bad UX problem is back again.

Initially I got excited, and managed to install Stable Diffusion on my personal Mac. I started writing some OpenAI code as well (largely using R). I started tracking developments in artificial intelligence, and trying some of them out.

And now, in the last 2-3 weeks, I’ve been struggling with “virtual environments”. Each newfangled open-source AI that is released comes with its own codebase and package requirements, and they are all mutually incompatible. You install one package, and you break another.

The “solution” to this, from what I could gather, is to use virtual environments – basically a sandbox for each of these things that I’ve downloaded. That, I find, is grossly inadequate. One of the points of using open source software is to experiment with connecting up two or more of them. And if each needs to be in its own sandbox, how is one supposed to do this? And how are all other data scientists and software engineers okay with this?
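(For anyone unfamiliar with the workflow being complained about, here is a sketch of it using Python’s built-in `venv` module – the paths and the `requirements.txt` file are illustrative, and other tools like conda follow the same one-sandbox-per-project pattern:)

```shell
# Create an isolated environment for one AI project
python3 -m venv ~/envs/some-ai-project

# Activate it; pip now installs into this sandbox, not system-wide
source ~/envs/some-ai-project/bin/activate

# Install that project's pinned dependencies without
# touching any other project's packages
pip install -r requirements.txt

# Leave the sandbox when done
deactivate
```

The catch is exactly as described above: two tools living in two separate sandboxes cannot directly import each other, so wiring them together means constructing yet another environment that satisfies both sets of pinned requirements – if such an environment exists at all.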

This whole virtual environment mess means that I’m giving up on programming once again. I won’t give up fully – I’ll continue to use R for most of my data work (including my job) – but I’m on the verge of giving up on these “complex AI” tools.

It’s the UX thing all over again. I simply can’t handle bad UX. Maybe it’s my ADHD. But when something is hard to use, I simply don’t want to use it.

And so I’m giving up once again. Tagore was right.