compression – Pertinent Observations

One of the most mind-blowing things I learnt while I was doing my undergrad in Computer Science and Engineering was Lempel-Ziv-Welch (LZW) compression. It’s one of the classic compression algorithms, and it is still used in formats such as GIF.

The reason I remember this is twofold – firstly, I remember implementing this as part of an assignment (our CSE program at IITM was full of those), and feeling happy to be coding in C rather than in the dreaded Java (which we had to use for most other assignments).

The other is that this is one of those algorithms that I “internalised” while doing something totally different – in this case I was having coffee/tea with a classmate in our hostel mess.

I won’t go into the algorithm here. However, the basic concept is that as and when we see a new pattern, we give it a code, and every subsequent occurrence of that pattern is replaced by its corresponding code. And the beauty of it is that you don’t need to ship a separate dictionary – the compressed code itself encapsulates it.
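For the curious, here is a minimal sketch of that idea in Python. It is a textbook-style version and not the code from my assignment: the function names `lzw_compress` and `lzw_decompress` are made up for illustration, it assumes every character in the input has a code point below 256, and it emits a plain list of integers rather than a packed bit stream.

```python
def lzw_compress(text: str) -> list[int]:
    """Encode text as a list of integer codes (assumes code points < 256)."""
    # Start with one code for every single character.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Known pattern: keep extending it.
            current = candidate
        else:
            # New pattern: emit the code for the known prefix,
            # then give the new pattern a code of its own.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Rebuild the text, reconstructing the dictionary on the fly."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The encoder used a pattern it had only just defined.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        pieces.append(entry)
        # Mirror the encoder: previous pattern plus first char of this one.
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)


message = "TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(message)
restored = lzw_decompress(codes)
```

Note that `lzw_decompress` is handed only the codes. It rebuilds the same dictionary as it goes along, which is what I mean by not having to ship a separate dictionary.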

Anyway, in practical terms, the more often the same patterns repeat in the original file, the more the file can be compressed. In some sense, the more repetition there is, the less overall “information” the original file can carry – but that discussion is for another day.
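A quick way to see this for yourself is to compress a highly repetitive input and a random one of the same length. The sketch below uses Python's built-in `zlib`, which implements DEFLATE rather than LZW, so treat it as an illustration of the general principle and not of LZW specifically. The helper name `compressed_size` is just something I made up.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the number of bytes after zlib (DEFLATE) compression."""
    return len(zlib.compress(data, 9))


# 3000 bytes made of one short pattern repeated over and over.
repetitive = b"abc" * 1000
# 3000 bytes with no repeating patterns to exploit.
random_like = os.urandom(3000)

print("repetitive:", len(repetitive), "->", compressed_size(repetitive))
print("random:    ", len(random_like), "->", compressed_size(random_like))
```

The repetitive input should shrink to a tiny fraction of its size, while the random one should barely shrink at all (it may even grow by a few bytes).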

I’ve been thinking of compression in general and LZW compression in particular when I think of stereotyping. The whole idea of stereotyping is that we are fundamentally lazy, and want to “classify” or categorise or pigeon-hole people using the fewest number of bits necessary.

And so, we use lazy heuristics – gender, caste, race, degrees, employers, height, even names, etc. to make our assumptions of what people are going to be like. This is fundamentally lazy, but also effective – in a sense, we have evolved to stereotype people (and objects and animals) because that allows our brain to be efficient; to internalise more data by using fewer bits. And for this precise reason, to some extent, stereotyping is rational.

However, the problem with stereotypes is that they can frequently be wrong. We might see a name and assume something about a person, and they might turn out to be completely different. The rational response to this is not to beat oneself up for stereotyping in the first place – it is to update one’s priors with the new information that one has learnt about this person.

So, you might have used a combination of pre-known features of a person to categorise him/her. The moment you realise that this categorisation is wrong, you ought to invest additional bits in your brain to classify this person so that the stereotype doesn’t remain any more.

The more idiosyncratic and interesting you are, the more bits will be required to describe you. You are very very different from any of the stereotypes that can possibly be used to describe you, and this means people will need to make that effort to try and understand you.

One of the downsides of being idiosyncratic, though, is that most people are lazy and won’t make the effort to use the additional bits required to know you, and so will grossly mischaracterise you using one of the standard stereotypes.

On yet another tangential note, getting to know someone is a Bayesian process. You make your first impressions of them based on whatever you find out about them, and go on building a picture of them incrementally based on the information you find out about them. It is like loading a picture on a website using a bad internet connection – first the picture appears grainy, and then the more idiosyncratic features can be seen.

The problem with refusing to use stereotypes, or demonising stereotypes, is that you fail to use the grainy pictures when that is the best available, and instead wait indefinitely to get better pictures. On the other hand, failing to see beyond stereotypes means that you end up using grainy pictures when clearer ones are available.

And both of these approaches are suboptimal.

PS: I’ve sometimes wondered why I find it so hard to remember certain people’s faces. And I realise that it’s usually because they are highly idiosyncratic and not easy to stereotype / compress (both are the same thing). And so it takes more effort to remember them, and if I don’t really need to remember them so much, I just don’t bother.

A long long time ago I had installed the Jio Cinema app on my Fire TV Stick. I had perhaps watched two movies on that, and then completely forgotten about it. This evening, I had to look for a movie to watch with the wife, and having exhausted most of the “compatible content” (stuff we can watch together on Netflix) and been exhausted by the user experience on Prime Video, I decided to give this app a try.

I ended up selecting a movie, which I later found out has a 4.5 IMDb rating and doesn’t even have a Wikipedia page. Needless to say, we abandoned the movie midway. That’s when the wife went in to put the daughter to bed and my fun began.

So Jio Cinema follows what I call the “Amazon paradigm for product management”. Since Amazon tries to sell every product (or service) as if it is a physical book, it has one single mantra for product management. “Improve selection and they will come”.

The user experience doesn’t matter. How easy the product is to use, and how pleasing it looks on the eye, and whether it has occasional bugs, is all secondary. All that matters is selection. Given that the company built its business on the back of selling “long tail” books, this is not so surprising, except that it doesn’t necessarily work in other categories.

I’ve written about Amazon’s ineptitude in product management before, in the context of that atrocity of an app called Sony Liv. The funny thing is that the Jio Cinema app (on Fire TV Stick) looks and feels pretty much exactly like Sony Liv. Maybe there is an open source shitty Fire TV app that these guys have based themselves on?

In any case, I started browsing the Jio Cinema app, and I found something called “movies in 15 minutes”. Initially I thought it was a parody. The first few movies I noticed there were things I had never heard of. “This is perhaps for bad movies”, I reasoned. I kept scrolling, and more recognisable names popped up.

I decided to watch Deewana, which was released just before the start of my optimal age of movie appreciation, and which, for some reason, we didn’t get home a video cassette of.

It’s basically a collage of scenes from the movie. It’s like someone has put together a “highlights package”, taking all the important scenes and then putting them together.

And for a movie like Deewana it works. The 15 minute version had all the necessary plot elements to fully follow the movie. It is a great movie, for 15 minutes. Maybe at 30 minutes as well it might be a great movie. However, I can’t imagine having watched it in the full version.

That was two hours back. I’ve since gone crazy watching 15 minute versions of many other movies (mostly from the 70s and 80s, though they have movies as recent as Jab We Met). It’s been fantastic.

However, I have one crib. This has to do with information content. Essentially, the premise behind “movies in 15 minutes” is that the information content in these movies is so little that the whole thing can be compressed into 15 minutes. The problem is that not every movie has the same amount of information.

15 minutes was perfect for Deewana. It was also appropriate for Kasam Paida Karne Waali Ki, which I watched only because it gets referenced in Gangs of Wasseypur. Between these two, I “watched” Namak Halaal, and I couldn’t make head or tail of it. I had to go to Wikipedia to understand the plot.

Essentially the plot of Namak Halaal is complex enough, I imagine, that compressing it into 15 minutes is impossible without significant information loss. And the loss of information was so much that I couldn’t understand the summary at all. Maybe I’ll watch the movie in full some day.

I’m writing this blogpost after watching the 15 minute version of Don. I guess whoever made the summary realised that the movie is so complex that it can’t really be compressed into 15 minutes – and so they have added a voiceover to narrate the key elements.

In any case, I’m feeling super thrilled. I normally don’t watch movies because the bit rate in most movies is too low. Compression means that I can happily watch the movies without ever getting bored.

I wish they made these 15 minute versions of all movies. Jio, all (your Amazon-style product management) is forgiven.

Now on to Amar Akbar Anthony.
