Chat GPT and Information Content

For the last week or so I’ve been fooling around with Chat GPT. It’s been a lot of fun, if you can endure the verbosity of its answers. Yesterday, at work, I used it for tech support when some people I was talking to were using too much jargon.

Before I came home from work yesterday, my last conversation was with a couple of recruiters, about a role that I’m hiring for. So when I got home, on a whim, I decided to check if Chat GPT could write job descriptions for me.

I asked:

write a job description for a senior data scientist at India’s largest logistics company. 4-7 years exp. Needs to have experience in ML, OR and statistics, and be able to work close to business and on ambiguous problems

It answered (pasting in full since the screenshot above is limited):

Job Title: Senior Data Scientist

Company: India’s Largest Logistics Company

Location: India

Experience: 4-7 years

We are seeking an experienced and highly skilled Senior Data Scientist to join our team at India’s largest logistics company. In this role, you will be responsible for using machine learning, operations research, and statistics to drive business insights and decision making.

As a Senior Data Scientist, you will have a deep understanding of data science techniques and technologies, and will be able to apply them to complex and ambiguous problems. You will be able to work closely with business teams to understand their needs and to develop solutions that drive value for the company.

To be successful in this role, you should have at least 4-7 years of experience in data science, and should have a strong background in machine learning, operations research, and statistics. You should be able to work independently and be comfortable working on complex and ambiguous problems.

Responsibilities:

  • Use machine learning, operations research, and statistics to drive business insights and decision making
  • Work closely with business teams to understand their needs and develop solutions
  • Develop and maintain data pipelines and systems
  • Collaborate with other data scientists and cross-functional teams to develop and implement solutions

Qualifications:

  • Bachelor’s or Master’s degree in a quantitative field, such as computer science, statistics, or engineering
  • At least 4-7 years of experience in data science
  • Strong background in machine learning, operations research, and statistics
  • Ability to work independently and on complex and ambiguous problems
  • Excellent communication and collaboration skills
  • Experience with Python, R, or other programming languages is a plus

We offer a competitive salary and benefits package, as well as opportunities for growth and advancement. If you are an experienced data scientist with a passion for driving business insights and decision making, we encourage you to apply.

In the past when I’ve been hiring, I’ve been a bit annoyed by people asking for job descriptions. I’ve largely found them to be fluff and boilerplate, and I’ve tried to get away with a sentence or two. However, due to the insistence of recruiters and candidates alike, I’ve ended up writing elaborate job descriptions. I don’t need to any more.

This post, however, is not about hiring (but yeah – I am hiring, and the above job description is reasonably accurate). It’s about information content in general.

Traditionally, information content has been measured in bits. A bit is the amount of information contained in an equi-probable binary event.
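To make the definition concrete, here is a minimal sketch of the standard formula for the information carried by an event of probability p, which is -log2(p) bits. The function name is made up for illustration.

```python
import math


def information_bits(probability: float) -> float:
    """Self-information, in bits, of an event with the given probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


# An equi-probable binary event (a fair coin toss) carries exactly one bit.
print(information_bits(0.5))   # 1.0
# One of four equally likely outcomes carries two bits.
print(information_bits(0.25))  # 2.0
```

The rarer the event, the more bits it carries, which is why predictable text feels like it says so little.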

Sometimes when we find that someone is too verbose and using too many words when fewer would suffice, we say that their bit rate is low. We also use “low bit rate” to describe people such as former Prime Minister Atal Behari Vajpayee, who would speak incredibly slowly.

However, beyond the bit, which is a fairly technical concept, it has been difficult to quantify information content. Sometimes you read an article or a story and find that there is nothing much to it. But given the natural language, and the context of various words, it is impossible to quantify the information content.

Now, with Chat GPT, maybe it becomes a bit easier (though one would need a “reverse chat GPT algo”, to find the set of prompts required for Chat GPT to churn out a particular essay). Above, for example, I’ve shown how much fluff there generally is to the average job description – a fairly short prompt generated this longish description that is fairly accurate.

So you can define the information content of a piece or essay in terms of the number of words in the minimum set of prompts required for Chat GPT (or something like it) to come up with it. If you are a boring stereotypical writer, the set of prompts required will be smaller. If you are highly idiosyncratic, then you will need to give a larger number of prompts for Chat GPT to write like you. You know where I’m going.
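The prompt-based measure can’t be computed without that “reverse Chat GPT”, but a crude stand-in is easy to try. The sketch below uses ordinary zlib compression instead: it assumes that how well a text compresses is a rough proxy for how much fluff it has. This is a swapped-in technique, not the prompt-counting idea itself, and the sample texts are made up.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size; lower means more redundancy."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical samples: repeated boilerplate versus one idiosyncratic sentence.
boilerplate = "We offer a competitive salary and benefits package. " * 20
varied = (
    "I threw three quarters of my pretzel in a dustbin on the way to the "
    "museum, then watched a toddler wave goodbye to a Zeppelin."
)

print(f"boilerplate: {compression_ratio(boilerplate):.2f}")
print(f"varied:      {compression_ratio(varied):.2f}")
```

The repeated boilerplate compresses far better than the one-off sentence. Compression only catches literal repetition, though, so it would miss the kind of stereotyped-but-reworded fluff that a language model can reproduce from a short prompt.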

This evening, in office, a colleague commented that now it will be rather easy to generate marketing material. “Even blogs might become dead, since with a few prompts you can get that content”, he said (it can be a legit service to build off the Chat GPT API to take a tweet and convert it into an essay).

I didn’t tell him then but I have decided to take it up as a challenge. I consider myself to be a fairly idiosyncratic writer, which means I THINK there is a fair bit of information content in what I write, and so this blog will stay relevant. Let’s see how it goes.

PS: I still want to train a GAN on my blog (well over a million words, at last count) and see how it goes. If you know of any tools I can use for this, let me know!

 

Open and closed platforms

This is a blogpost that I had planned a very long time (4-5 weeks) ago, and I’m only getting down to write it now. So my apologies if the quality is not as good as my blogposts usually are. 

Many of you would have looked at the title of this blogpost and assumed that the trigger for this was the “acquisition” of Joe Rogan’s podcast by Spotify. For a large sum of money, Spotify is “taking his podcast private”, making it exclusive to Spotify subscribers.

However, this is only an “immediate trigger” for writing this post. I’d planned this post way back in April when I’d written one of my Covid-19 related blogposts – maybe it was this one.

I had joked the post needed to be on Medium for it to be taken seriously (a lot of covid related analysis was appearing on Medium around that time). Someone suggested I actually put it on Medium. I copied and pasted it there. Medium promptly took down my post.

I got pissed off and swore to never post on Medium again. I got reminded of the time last year when Youtube randomly pulled down one of my cricket videos when someone (an IP troll, I later learnt) wrongly claimed that I’d used copyrighted sounds in my video (the only sound in that video was my own voice).  I had lodged a complaint with Youtube, and my video was resurrected, but it was off air for a month (I think).

Medium and Youtube are both examples of closed platforms. All content posted on these platforms is “native to the platform”. These platforms provide a means of distributing (and sometimes even marketing) the content, and all content posted there essentially belongs to the platform. Yes, you get paid a cut of the ad fee (in case your Youtube channel becomes super powerful, for example), but Youtube decides whether your video deserves to be there at all, and whose homepages to put it on.

The main feature of a closed platform is that any content created on the platform needs to be consumed on the same platform. A video I’ve uploaded on Youtube is only accessible on Youtube. A Medium post can only be read on Medium. A tweet can only be read on Twitter. A Facebook post only on Facebook.

The advantage with closed platforms is that by submitting your content to the platform, you are hoping to leverage some benefits the platform might offer, like additional marketing and distribution, and discovery.

This blog doesn’t work that way. Blogposts work through this technology called “RSS”, and to read what I’m writing here you don’t need to necessarily visit noenthuda.com. You can read it on the feed reader of your choice (Feedly is what I use). Of course there is the danger that one feed reader can have overwhelming market share, and the destruction of that feed reader can kill the ecosystem itself (like it happened with Google Reader in 2013). Yet, RSS being an open platform means that this blog still exists, and you can continue to receive it on the RSS reader of your choice. If Medium were to shut down tomorrow, all Medium posts might be lost.
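To show why any reader can consume a feed, here is a minimal sketch of what a feed reader does with RSS: the feed is just an XML document listing items, and anyone can parse it. The sample feed, its titles and the example.com URLs are made up for illustration; a real reader would first download the XML from the blog’s feed address.

```python
import xml.etree.ElementTree as ET

# A hypothetical, hand-written RSS 2.0 feed (not this blog's actual feed).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>Open and closed platforms</title>
      <link>https://example.com/open-and-closed-platforms</link>
    </item>
    <item>
      <title>A day at the museum</title>
      <link>https://example.com/a-day-at-the-museum</link>
    </item>
  </channel>
</rss>"""


def list_posts(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every item in an RSS feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


for title, link in list_posts(SAMPLE_FEED):
    print(f"{title} -> {link}")
```

Nothing here depends on who publishes the feed or which app reads it, which is the whole point of an open platform.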

Another example of an open platform is email – it doesn’t matter what email service or app you use, my email and yours are interoperable. India’s Unified Payments Interface (UPI) is another open platform – the sender and receiver can use apps of their choice and still transact.

And yet another open platform (which a lot of people didn’t really realise is an open platform) is podcasting. Podcasts run on the RSS protocol. So when you subscribe to a podcast using Apple Podcasts, it is similar to adding a blog to your Feedly. This thread by Ben Thompson of Stratechery (that I just stumbled upon when I started writing this post) sums it up well:

What Spotify is trying to do (with the Joe Rogan and Ringer deals) is to take this content off open platforms and put it on its own closed platform. Some people (like Rogan) will take the bait since they’re getting paid for it. However, this comes at the cost of control – like I’m not sure if we’ll have another episode of Rogan’s podcast where host and guest light up a joint.

Following my experiences with Medium and Youtube, when my content was yanked off for no reason (or for flimsy reasons), I’m not sure I like closed platforms any more. Rather, someone needs to pay me a lot of money to take my content to a closed platform (speaking of which, do you know that all my writing for Mint (written in 2013-18) is behind their newly erected paywall now?).

In closing I must mention that platforms being “open” and platforms being “free” are orthogonal. A paid podcast or newsletter is still on an open platform (see Ben Thompson tweetstorm above), since it can be consumed on a medium independent of the one where it was produced – essentially a different feed is generated depending on what the customer has paid for.

Now that I’ve written this post, I don’t know what the point of this is. Maybe it’s just for collecting and crystallising my own thoughts, which is the point behind most of my blogposts anyway.

PS: We have RSS feeds for text and podcasts for audio. I wonder why we don’t have a popular and open protocol for video.

A day at the museum

I still haven’t learnt on the food front – in my effort to optimise for both the daughter and myself this morning, I got her excellent breakfast and myself a terrible one. Actually I blame decision fatigue – there were so many stalls at the Munich Hauptbahnhof (central railway station, which is across the road from our hotel) selling what we wanted that I got confused on where to buy.

I wanted to buy croissant for her, and pretzel with Bavarian cheese for myself. After going round a zillion stalls, I bought them from the same stall I had bought croissant at last night (which the wife had for breakfast today and said was good). The croissant turned out to be excellent and was duly polished off by the daughter. I threw 3/4th of my pretzel in a dustbin on our way to the museum.

So our agenda for today was to visit the Deutsches Museum, reputed to be the largest science museum in the world. Now, science museums are the best museums in my opinion, since you generally have “something to do”.

The first museum I ever went to was the Visvesvaraya Industrial and Technological Museum in Bangalore, where there are lots of fun activities, such as the chair on which you can rotate (and change speed by pulling in some discs). So the second museum I went to (the adjacent Government Museum in Bangalore) was a massive disappointment, as I tried pressing on the labels on the displays, imagining something might happen.

And despite not being the best maintained museum in Europe (it seemed rather “sarkari” to me), the Deutsches Museum didn’t disappoint. There were plenty of buttons to be pressed and pulleys to be pulled, especially in the physics section (I wished then that I had taken my daughter there when she was older, when I could have actually explained some of the science to her).

There were massive rooms full of boats and aeroplanes (the latter being Berry’s favourite room at the museum). She kept screaming “airplane” “airplane” there, and had great fun “navigating” a toy plane (see picture above). I tried hard to explain to her that some of the early aeroplanes (one of the Wright Brothers’ planes is on display at the museum, along with a few World War I planes) were actually aeroplanes. She recognised the Zeppelins as “airplane”, though!

We saw stars and planets, and telescopes and yachts of different kinds. In the middle, we went to the museum cafe (which looked and felt like a sarkari canteen) and had excellent cheesecake. And I took her to the Kinderreich (kids’ kingdom), a play area for kids.

As we were going through the last rooms of the museum, she started getting cranky. I took her once again to the aeroplane room, and she said goodbye to her airplanes. By the time we had walked to the metro station she had fallen asleep.

So there wasn’t so much of flaneuring on this second day, but I managed to see everything I wanted to see. For the most part, I had put her on her “leash” (to make sure she doesn’t run away too far), but then in the last part when she started tiring I put her in the baby carrier.

The first part of the “training” in travelling with me ends today. And I’m hopeful that I’ll have a proper flaneuring partner soon!

More on interactive graphics

So for a while now I’ve been building this cricket visualisation thingy. Basically it’s what I think is a pseudo-innovative way of describing a cricket match, by showing how the game ebbs and flows, and marking off the key events.

Here’s a sample, from the ongoing game between Chennai Super Kings and Kolkata Knight Riders.

As you might appreciate, this is a bit cluttered. One “brilliant” idea I had to declutter this was to create an interactive version, using Plotly and D3.js. It’s the same graphic, but instead of all those annotations appearing, they’ll appear when you hover on those boxes (the boxes are still there). Also, when you hover over the line you can see the score and what happened on that ball.

When I came up with this version two weeks back, I sent it to a few friends. Nobody responded. I checked back with them a few days later. Nobody had seen it. They’d all opened it on their mobile devices, and interactive graphics are ill-defined for mobile!

Because on mobile there’s no concept of “hover”. Even “click” is badly defined because fingers are much fatter than mouse pointers.

And nowadays everyone uses mobile – even in corporate settings. People who spend most time in meetings only have access to their phones while in there, and consume all their information through that.

Yet, you have visualisation “experts” who insist on the joys of tools such as Tableau, or other things that produce nice-looking interactive graphics. People go ga-ga over motion charts (they’re slightly better in that they can communicate more without input from the user).

In my opinion, the lack of use on mobile is the last nail in the coffin of interactive graphics. It is not like they didn’t have their problems already – the biggest problem for me is that it takes too much effort on the part of the user to understand the message that is being sent out. Interactive graphics are also harder to do well, since the users might use them in ways not intended – hovering and clicking on the “wrong” places, making it harder to communicate the message you want to communicate.

As a visualiser, one thing I’m particular about is being in control of the message. As a rule, a good visualisation contains one overarching message, and a good visualisation is one in which the user gets the message as soon as she sees the chart. And in an interactive chart which the user has to control, there is no way for the designer to control the message!

Hopefully this difficulty with seeing interactive charts on mobile will mean that my clients will start demanding them less (at least that’s the direction in which I’ve been educating them all along!). “Controlling the narrative” and “too much work for consumer” might seem like esoteric problems with something, but “can’t be consumed on mobile” is surely a winning argument!

 

 

FaceTime Baby

My nephew Samvit, born in 2011, doesn’t talk much on the phone. It’s possibly because he didn’t talk much on the phone as a baby, but I’ve never been able to have a decent phone conversation with him (we get along really well when we meet, though). He talks a couple of lines and hands over the phone to his mother and runs off. If it’s a video call, he appears, says hi and disappears.

Berry (born in 2016), on the other hand, seems to have in a way “leapfrogged” the phone. We moved to London when she was five and a half months old, and since then we’ve kept in touch with my in-laws and other relatives primarily through video chat (FaceTime etc.). And so Berry has gotten used to seeing all these people on video, and has become extremely comfortable with the medium.

For example, when we were returning from our last Bangalore trip in December, we were worried that Berry would miss her grandparents tremendously. As it turned out, we landed in London and video called my in-laws, and Berry was babbling away as if there was no change in scene!

Berry has gotten so used to video calling that she doesn’t seem to get the “normal” voice call. Sure enough, she loves picking up the phone and holding it against her ear and saying “hello” and making pretend conversations (apparently she learnt this at her day care). But give her a phone and ask her to talk, and she goes quiet unless there’s another person appearing on screen.

Like there’s this one aunt of mine who is so tech-phobic that she doesn’t use video calls. And every time I call her she wants to hear Berry speak, except that Berry won’t speak because there is nobody on the screen! I’m now trying to figure out how to get this aunt to get comfortable with video calling just so that Berry can talk to her!

 

In that sense, Berry is a “video call” native. And I wouldn’t be surprised if it turns out that she’ll find it really hard to get comfortable with audio calls later on in life.

I’ll turn into one uncle now and say “kids nowadays…”

Is TripAdvisor killing Expedia?

The coming of the internet has led to one round of disintermediation in the travel market, and I hypothesize that review websites such as TripAdvisor are going to lead to another. Let me explain.

In the “good old days” if you wanted to travel there was no option but to reach out to the neighbourhood travel agent who would give you options of a few airlines and hotels. The best you could do to figure out if you were being taken for a ride was to check across multiple agents, but even then the only thing you could compare was price. It was impossible to compare hotels in terms of quality and you would take the word of the travel agent.

And then the internet happened.

Now, with sites such as Expedia or Travelocity, you got more transparency in pricing – especially when it came to airline ticketing. The travel agent could no longer take you for a ride when it came to the air fares – you could cross check online and bypass the agent if he wasn’t offering you a good deal (of course some things such as flexible schedules were best booked via agents, and they continue to hold sway in the corporate segment for that reason). Simultaneously airlines started selling tickets direct, via their own websites (this was led in part by “low cost carriers” who saw this as a good way of saving cost by cutting out agent fees).

This was the first round of disintermediation in the travel industry. Airlines selling tickets direct and customers being able to book directly online meant the overall business of travel agents reduced. Some of them were cut out completely while others were replaced by large-scale technology enabled agents such as Expedia or Travelocity. Those that survived either have corporate clients (who need flexible schedules and have little time to book online) or have resorted to packages – where they arrange for flights, accommodation and cars, and quote you a consolidated fee – in which there are margins to be made.

The move to large-scale technology-enabled agents meant that some of these agents were now large-scale aggregators. This gave them significant bargaining power vis-a-vis hotels and this allowed them to bargain for deep discounts. While earlier conventional wisdom was that “travel agents” could get you “good deals”, now these large online aggregators were the ones providing the “best deals”. Thus it made eminent sense to book via these aggregators.

Simultaneously most hotels also started direct booking on their own websites. However, the problem was that the hotels themselves did not have the technological capability to implement good revenue management practices on their own websites. They also did not have the technological capability to offer a seamless and smooth booking experience. Thus, large online agents such as booking.com and Agoda prospered.

There are two functions that a travel agent performs – helping customers discover hotels and then actually executing the booking. In the traditional model, agents don’t charge for the discovery process. That service is instead cross-subsidized by the fees they make on the actual booking process. The first level of disintermediation in the travel agency business (which we’ve seen above) has chipped away at this model, however. What do I, a travel agent, have to gain if I put in painstaking research and find you a hotel, only for you to find that you can book it for a lower price online? Agents, however, have not figured out a way to charge for the discovery process.

However, it is unlikely that they need to. For you now have websites such as TripAdvisor which have user-generated reviews and ratings for a large number of hotels, and which rank hotels in each city by type and user ratings. TripAdvisor has become so ubiquitous for user-generated ratings for hotels that nowadays travel agents add links to TripAdvisor profiles of hotels that they are recommending. Thus, we can see that the hotel discovery process can exist independently of travel agents.

What of the booking itself? Don’t we need travel agents for that? Note that irrespective of whether a travel agent is online or offline, the hotel has to pay them a commission for selling their inventory. In the past, given their size, hotels (unlike airlines) were unable to effectively sell rooms on their own websites and thus resorted to paying travel agents. However, advances in technology now mean that it is easy for a hotel to adopt third-party software to effectively manage their inventory and sell rooms on their own website, and at a fraction of the cost they need to pay travel agents.

So, if TripAdvisor helps you discover hotels and then you can book hotels directly through their own websites, who needs travel agents? For now, most large online aggregators have a price matching policy and thus match the prices that hotels quote on their own websites. However, in order to save booking fees (rumoured to be of the order of 17% of the total booking value) hotels are trying to innovate and add freebies to their offering.

For example, a hotel in Cambodia I stayed in last week offered a free massage to guests who had booked through their own website (unfortunately I booked via Agoda and couldn’t avail of this offer). The Bangkok hotel I stayed in last week offered a 10% discount on payments made via American Express on their own website (again we discovered this after we had booked on Agoda, using an AmEx. To their credit, Agoda gave us a refund to the extent of the discount we would have got on the hotel website).
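A quick back-of-the-envelope sketch shows why such offers make sense for the hotel. It assumes the rumoured 17% commission and the 10% direct-booking discount mentioned above; the room price of 100 is made up, and the sketch ignores whatever the hotel pays for its own booking software.

```python
def hotel_net_revenue(
    room_price: float,
    commission_rate: float = 0.0,
    discount_rate: float = 0.0,
) -> float:
    """What the hotel keeps after the guest's discount and the agent's cut."""
    price_paid = room_price * (1 - discount_rate)
    return price_paid * (1 - commission_rate)


ROOM_PRICE = 100.0  # hypothetical list price, in any currency

# Booked through an aggregator charging the rumoured 17% commission.
via_aggregator = hotel_net_revenue(ROOM_PRICE, commission_rate=0.17)
# Booked on the hotel's own website with a 10% discount and no commission.
direct_with_discount = hotel_net_revenue(ROOM_PRICE, discount_rate=0.10)

print(f"via aggregator:       {via_aggregator:.2f}")        # 83.00
print(f"direct, 10% discount: {direct_with_discount:.2f}")  # 90.00
```

Under these assumptions the hotel keeps more from a discounted direct booking than from a full-price booking through an aggregator, so any freebie costing less than the commission is worth offering.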

Essentially hotels have figured that with the growing popularity of platforms such as TripAdvisor, they don’t really need travel agents, small or large. As TripAdvisor gets more popular and third-party hotel booking software gains traction, we are likely to see the decline of large travel aggregators such as Expedia, Travelocity and Agoda.

In essence, the growth of TripAdvisor is going to lead to the partial downfall of its erstwhile parent Expedia.

Missed opportunities in cross-selling

Talk to any analytics or “business intelligence” provider – be it a large commoditized outsourcing firm or a rather niche consultant – and one thing they all claim to advise their clients on is strategies for “cross sell”. However, my personal experience suggests that implementation of cross-sell strategies among retailers I encounter is extremely poor. I will illustrate this with two examples in this post.

Jet Airways and American Express together have come up with this “Jet Airways American Express Platinum Credit Card”. Like any other co-branded credit card, it offers you additional benefits on Jet Airways flights booked with this card (in terms of higher points) as well as some other benefits such as lounge access for economy travel. Given that I’m a consultant and travel frequently, this is something I think is good to have, and have attempted to purchase it a few times. And got discouraged by the purchase process each time and backed out.

Now, I’m a customer of both Jet Airways and American Express. I hold an American Express Gold Card (perhaps one of the few people to have an individual AmEx card), and have a Jet Privilege account. Yet, neither Jet nor Amex seems remotely interested in selling to me. I remember once applying for this card through the Amex call centre. The person at the other end of the line wanted me to fill up the entire form once again – despite me being already a cardholder. This I would ascribe to messed up incentive structures where the salesperson at the other end gets higher benefits for acquiring a new customer rather than upgrading an existing one. I’ve mentioned I want this card to the Amex call centre several times, yet no one has called me back.

However, these are not the missed cross-sell opportunities I’m talking about in this post. Three times in the last three months (maybe more, but I cannot recollect) I’ve booked an air ticket to fly on Jet Airways from the Jet Airways website, having logged into my Jet Privilege account and paying with my American Express card. Each time I’ve waited hopefully that some system at either the Jet or the Amex end will make the connection and offer me this Platinum card, but so far there has been no response. It is perhaps the case that for some reason they do not want to upgrade existing customers to this card (in which case the entire discussion is moot) but not offering me a card here is simply a case of a blatant missed opportunity – in cricketing terms you can think of this as an easy dropped catch.
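The connection I was hoping some system would make is not complicated. Here is a minimal sketch of such a rule; the field names and the threshold of two bookings are made up for illustration, and I have no idea what data either company’s systems actually hold.

```python
from dataclasses import dataclass


@dataclass
class Booking:
    logged_in_frequent_flyer: bool  # booked while logged in to the loyalty account
    paid_with_partner_card: bool    # paid with the partner bank's card
    holds_cobranded_card: bool      # customer already has the co-branded card


def should_offer_cobranded_card(bookings: list[Booking], min_bookings: int = 2) -> bool:
    """Flag a customer who keeps combining both brands without the joint card."""
    if any(b.holds_cobranded_card for b in bookings):
        return False  # nothing to cross-sell
    qualifying = sum(
        1 for b in bookings
        if b.logged_in_frequent_flyer and b.paid_with_partner_card
    )
    return qualifying >= min_bookings


# Three bookings like the ones described above: logged in, paid by partner card.
my_bookings = [Booking(True, True, False) for _ in range(3)]
print(should_offer_cobranded_card(my_bookings))  # True
```

A rule this simple only needs the booking record itself, which the airline’s website already has at the moment of payment.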

The other case has to do with banking. I’m in the process of purchasing a house, and over the last few months have been transferring large amounts of money to the seller in order to make my down payments (which I’m meeting through my savings). Now, I’ve had my account with Citibank for over seven years and have never withdrawn such large amounts – except maybe to make some fixed deposits. One time, I got a call from the bank’s call centre, confirming if it was indeed I who had made the transfer. Why did the bank not think of finding out (in a discreet manner) why all of a sudden so much money had moved out of my account, and if I was up to purchasing something and if the bank could help? Of course, later, during a recent visit to the local Citibank branch, I found I wouldn’t have got a loan from them anyway since they don’t finance apartments built by no-name builders that are still under construction (which fits the bill of the property I’m purchasing). Nevertheless – the large money transferred out of my account could have been for buying a property that the bank could have financed. Missed opportunity there?

My understanding of the situation is that in several “analytics” offerings there is a disconnect between the tech and the business sides. Somewhere along the chain of implementation there is one hand-off where one party knows only the business aspects and the other knows only technology, and thus the two are unable to converse, leading to suboptimal decisions. One kind of value I offer (hint! hint!!) is that I understand both tech and business, and I can ensure a much smoother hand-off between the technical and business aspects, thus leading to superior solution design.