Sierpinski Triangles

On Saturday morning, my daughter made some nice art with a sketch pen on an A4 sheet. It was rather “geometric”, consisting of repeating patterns across the page. My wife took one look at it and said, “Do you know that you can make such art with computers also? Your father has made some”.

Some drawings I had made using code, back in 2016

“Reallly?”, piped the daughter. I had been intending for a while to start teaching her to code (she is six), figured this was the perfect trigger, and said I would teach her.

A quick search revealed that there is an “ACS Logo” for the Mac (Logo was the first “programming language” I had learnt, when I was nine). I quickly downloaded it onto her computer (my wife’s old MacBook Air) and figured I remembered most of the commands.

And then I started typing, and showed her what they had shown me back in a “computer class” behind my house in 1992 – FD for “forward”, RT for “right turn”, HT for “hide turtle”, and so on.

Soon she was engrossed in it. Thankfully she has learnt angles in school, though it took her some trial and error to figure out how much to turn by for different shapes (later it struck me that this could also serve as a good “angles revision” for her during her ongoing summer holidays).

With my wife having reminded me that I could produce images through code, I realised that while my daughter was engrossed in her “coding”, I should do some “coding art” of my own. All she needed was some occasional input, and for me to sit right next to her.

Last Monday I had got a bit of a scare – at work, I needed to generate randomly distributed points in a regular hexagon. A quick online search told me that I could generate a larger number of randomly distributed points in a bounding rectangle, keep only the points that fell within the hexagon, and then take a random sample of those.

This meant that I needed to write the equations that determine whether a point lies inside a hexagon. And I realised I’d forgotten ALL my coordinate geometry. It took me over half an hour to get the equations for the sides of the hexagon right – I’m clearly rusty.
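
For the record, here is the gist of that approach in R – a minimal sketch, assuming a hexagon centred at the origin with two vertices on the x-axis (the function name and the orientation are my choices for illustration, not the code I wrote at work):

```r
# Rejection sampling: oversample a bounding rectangle, keep points inside
# the hexagon (which fills 3/4 of the rectangle), then sample down to n.
points_in_hexagon <- function(n, r = 1) {
  pts <- data.frame(x = numeric(0), y = numeric(0))
  while (nrow(pts) < n) {
    m <- 2 * n  # oversample; expect ~3/4 of these to land inside
    x <- runif(m, -r, r)
    y <- runif(m, -r * sqrt(3) / 2, r * sqrt(3) / 2)
    # The four slanted sides give the condition |y| <= sqrt(3) * (r - |x|);
    # the horizontal top and bottom sides are handled by the range of y
    inside <- abs(y) <= sqrt(3) * (r - abs(x))
    pts <- rbind(pts, data.frame(x = x[inside], y = y[inside]))
  }
  pts[sample(nrow(pts), n), ]
}
```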

And on Saturday, as I sat down to make some “computer art”, I decided I would make some fractals. Why don’t I make some Sierpinski triangles, I thought. I started breaking down what code I needed to write.

First, given an equilateral triangle, I had to return three similar equilateral triangles, each with half the side length of the original triangle.

Then, given the centroid of an equilateral triangle and the length of each side, I had to return the vertices.

Once these two functions had been written, I could chain them (running the first one recursively), and then simply plot the result to get the Sierpinski triangle.
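
In R, those two functions and the recursive chaining look roughly like this – a simple sketch using a centroid-plus-side-length representation (the names and the representation are assumptions for illustration; this is not my original code):

```r
# Given the centroid (cx, cy) and side s of an upward-pointing equilateral
# triangle, return its three vertices (the circumradius is s / sqrt(3))
vertices <- function(cx, cy, s) {
  data.frame(
    x = c(cx, cx - s / 2, cx + s / 2),
    y = c(cy + s / sqrt(3), cy - s / (2 * sqrt(3)), cy - s / (2 * sqrt(3)))
  )
}

# Given one triangle, return the three corner triangles of half the side
# length. Each child is the parent shrunk by half about one vertex, so its
# centroid is the midpoint of that vertex and the parent's centroid.
subdivide <- function(cx, cy, s) {
  v <- vertices(cx, cy, s)
  data.frame(cx = (v$x + cx) / 2, cy = (v$y + cy) / 2, s = s / 2)
}

# Chain: subdivide recursively to the desired depth, then plot
sierpinski <- function(cx, cy, s, depth) {
  tris <- data.frame(cx = cx, cy = cy, s = s)
  for (d in seq_len(depth)) {
    tris <- do.call(rbind, Map(subdivide, tris$cx, tris$cy, tris$s))
  }
  tris
}

tris <- sierpinski(0, 0, 1, 5)
plot(NULL, xlim = c(-0.6, 0.6), ylim = c(-0.4, 0.7), asp = 1,
     axes = FALSE, xlab = "", ylab = "")
for (i in seq_len(nrow(tris))) {
  v <- vertices(tris$cx[i], tris$cy[i], tris$s[i])
  polygon(v$x, v$y, col = "black", border = NA)
}
```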

And then I had my second scare of the week – not only had I forgotten my coordinate geometry, I had forgotten my trigonometry as well. Again I messed up a few times, but the good thing about programming with a computer is that I could do trial and error. Soon I had it right, and started producing Sierpinski triangles.

Then there was another problem – my code was really inefficient. If I went beyond depth 4 or 5, the figures would take inordinately long to render. Since I was coding in R, I set about vectorising all my code. In R you don’t write loops if you can help it – instead, you apply functions on entire vectors. This again took some time, and then I had the triangles ready.
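
For those unfamiliar with R, here is a toy example of the difference (this is not the triangle code itself):

```r
xs <- runif(1e6)

# Loop version: one element at a time
out <- numeric(length(xs))
for (i in seq_along(xs)) out[i] <- xs[i]^2 + 1

# Vectorised version: one operation over the whole vector, much faster
out <- xs^2 + 1
```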

I proudly showed them off to my daughter. “Appa, why is it that as you increase the number it becomes greyer?”, she asked. I explained how with each step, you take away more of the filled area from the triangles: each subdivision keeps only three of the four half-size triangles, so after n steps only (3/4)^n of the original area remains filled. Then I figured this wasn’t that good-looking – maybe I should colour it.

And so I wrote code to colour the triangles. Basically, I coloured them recursively – the top third green, the left third red and the right third blue (starting with a red base). This is what I ended up producing:

And this is what my daughter produced at the same time, using Logo:

I forgot to “HT” before taking the screenshot. This is a “lollipop”.

Code Density

As many of the regular readers of this blog know, I use R for most of my data work. There have been a few occasions when I’ve tried to use Python, but I’ve found that I’m far less efficient with it than I am with R, and so I abandoned it, despite the relative ease of putting Python into production.

Now in my company, like in most companies, people use both Python and R (the team that reports to me largely uses R; everyone else largely uses Python). And while till recently I used to claim that I’m multilingual, in the sense that I can read Python code fairly competently, of late I’m not sure I am. I find it increasingly difficult to parse and read production-grade Python code.

And now, after some experiments with ChatGPT, and exploring other people’s code, I have an idea of why I’m finding it hard to read production-grade Python code. It has to do with “code density”.

Of late I’ve been experimenting with Spark (finally, in this job I do a lot of “big data” work – something I never had to do in my consulting career prior to this). Related to this, I was reading someone’s PySpark code.

And then it hit me – the problem (rather, my problem) with Python is that it is far more verbose than R. The number of characters or lines of code required to do something in Python is far more than what you need in R (especially if you are using the tidyverse family of packages, which I do, including sparklyr for Spark).
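
To illustrate the kind of density I mean, here is a typical tidyverse pipeline – the sales data frame and its columns are made up for the example:

```r
library(dplyr)

# A made-up data frame, just so the example runs
sales <- data.frame(
  year     = c(2023, 2023, 2022, 2023),
  region   = c("N", "S", "N", "N"),
  product  = c("a", "a", "b", "b"),
  price    = c(10, 12, 9, 11),
  quantity = c(3, 5, 2, 4)
)

# Filter, aggregate and sort: five short lines of logic in one pipeline
sales %>%
  filter(year == 2023) %>%
  group_by(region, product) %>%
  summarise(revenue = sum(price * quantity), .groups = "drop") %>%
  arrange(desc(revenue))
```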

Why does the density of code matter? It has to do with aesthetics, modularity and ease of understanding.

Yesterday I was writing some code that I plan to put into production. After a few hours of coding, I looked at it and felt disgusted with myself – it was one overly long, monolithic block of code. It might have made sense while I was writing it, but I knew that if I were to revisit it in a week or two, I wouldn’t be able to understand what the hell was happening there.

I’ve never worked as a professional software engineer, but with the amount of coding I’ve done, I’ve worked out what is a “reasonable length for a code block”. It’s like that apocryphal story of Indian public examiners for high school exams who evaluate history answers based on how long they are – “if they were to place an ordinary Reynolds 045 pen vertically on the sheet, the answer should be longer than that for the student to get five marks”.

An answer in a high school history exam needs to be longer than this. A code block or function should be shorter than this.

It’s the reverse here. Approximately speaking, if you were to place a Reynolds pen vertically on the screen (at your normal font size), no function (or code block) should be longer than the pen.

This roughly approximates how much the eye can take in on one normal MacBook screen (I use a massive external monitor at work, and a less massive, but equally wide, one at home). If you have to keep scrolling up and down to understand the full logic, there is a higher chance you’ll make mistakes, and it is harder for someone else to understand the code.

Till recently (as in earlier this week) I would crib like crazy that people coding in Python made their code “too modular”. That I would have to keep switching back and forth between different functions (and classes!!) to make sense of some logic, and that this made the code hard to debug (I still think there is a limit to how modular you can make your ETL code).

Now, however (I’m writing this on a Saturday – I’m not working today), from the code density perspective, it all makes sense to me.

The advantage of R is that, because the code is far denser, you can pack a far greater amount of logic into a Reynolds-pen length of code. So over the years I’ve gotten used to having that much logic presented to me in one chunk (without having to scroll or switch functions).

The relatively lower density of Python, however, means that the same amount of logic that would be one function in R is now split across several different functions. It is not that the writer of the code is “making this too modular” or “writing functions just for the heck of it”. It is just that their “mental Reynolds pen” doesn’t allow them to pack more lines into a chunk or function, and Python’s lower density means there is only so much logic that can go in there.

As part of my undergrad, I did a course on Software Engineering (and the one thing the professor told us then was that we should NOT take up software engineering as a career – “it’s a boring job”, he had said). In that, one of the things we learnt was that in conventional software services contexts, billing would happen as a (nonlinear) function of “kilo lines of code” (this was in early 2003).

Now, looking back, one thing I can say is that the rate per kilo line of R code ought to be much higher than the rate per kilo line of Python code.

Cross-posted on my now-largely-dormant Art of Data Science newsletter.

Cookbooks, Code and College

Why Is This Interesting, a fascinating daily newsletter I subscribe to, has this edition on code and cookbooks. The basic crib there is that most coding books teach you to code as if you were trying to become a professional coder, rather than teaching you to code as an additional life skill.

This, the author Noah Brier remarks, is quite unlike how most cookbooks teach you to cook, where there is absolutely no pretence of trying to turn you into a professional cook. Cookbooks know that most people who want to learn to cook simply want to cook for themselves or their families, so professional-level instruction is not required. This, however, is not the case with books on coding.

In fact, this pretty much explains why I completely fell out of love with coding during my undergrad in Computer Science. I remember being rather excited in 2000, when I got an entrance exam score good enough to get admitted to the Computer Science program at IIT Madras. I had learnt to code only two years before, but I’d taken to it rather well, and had quickly built a reputation as one of the better coders in school.

And then the four-year program in Computer Science sucked out all the love I had for coding. This cooking-code post reminded me why – basically most professors in my department assumed that all of us wanted to be academics, and taught us that way. This wasn’t an unfair assumption, since 17 of the 22 of us who graduated in 2004 went to grad school, either immediately or within a couple of years.

However, an approach to teaching that assumed you would become an expert or an academic made it incredibly hard to learn unless you were insanely motivated.

For example, the fourth-year B.Tech. project was almost always supposed to be a “work of research” that would turn into a paper (or a dozen). There was a lot of theory all round (I didn’t mind parts of it, like some bits of algorithm analysis, but most of it was boring). The course was heavy in terms of assignments, which you could argue was practical, but the way most people did the assignments meant that the bar remained rather academic.

And that meant that someone like me, who hadn’t wanted to be “an engineer” to begin with but had entered with a love for coding, quickly fell out of love with the field itself. In hindsight, given the way I was taught, I’m not surprised that my first option upon exit was to go to business school, and that it took me at least five more years to begin to appreciate that I had an aptitude for code.

(Interestingly, business school was different. Nobody assumed anybody would become an academic, so the teaching was far more palatable.)

Programming back to the 1970s

I learnt to write computer code circa 1998, at a time when resources were plentiful. I had a computer of my own – an assembled desktop with a 386 processor and RAM measured in MBs. It wasn’t particularly powerful, but it was more than adequate to handle the programs I was trying to write.

I wasn’t trying to process large amounts of data. Even when the algorithms were complex, they weren’t that complex. Most code ran in a matter of minutes, which meant that I didn’t need to get the code right the first time round – except for examination purposes. I could iterate and slowly get things right.

This was markedly different from how people programmed back in the 1970s, when computing resources were scarce and people mostly had to write code on paper. Time had to be booked at computer terminals, where the code would be copied onto the computer and run. The time it took for the code to run meant that you had to get it right the first time round. Any mistake meant standing in line at the terminal again, and yet more time to run the code.

The problem was particularly dire in the USSR, where the planned economy meant that shortages of computing resources were even more acute. This has been cited as a reason why Russian programmers who migrated to the US were prized – they had practice writing code that worked the first time.

Anyway, the point of this post is that coding became progressively easier through the second half of the 20th century, when Moore’s Law was in operation, and computers became faster, smaller and significantly more abundant.

This process continues – computers keep getting better and more abundant; smartphones are nothing but computers. On the other side, however, as storage has gotten cheap and data capture has gotten easier, data sources are significantly larger now than they were a decade or two ago.

So if you are trying to write code that uses a large amount of data, each run can take a significant amount of time. When the data reaches big data proportions (when it can’t all be processed on a single computer), the problem becomes even more complex.

And so, every time you want to run a piece of code, however simple it is, execution takes a long time. This has made bugs much more expensive again – the time programs take to run means you lose a lot of time debugging and rewriting your code.

It’s like being in the 1970s all over again!

Simulating segregation

Back in the 1970s, the economist Thomas Schelling proposed a model to explain why cities are segregated. Individual people choosing to live with others like themselves would have the macroscopic effect of segregating the city, he explained.

Think of the city as organised in a grid. Each person has 8 neighbours (including the diagonals). If a person has fewer than 3 neighbours who are like himself (whether “like” means race, religion, caste or football fandom doesn’t matter), he decides to relocate, and moves to an arbitrary empty spot where at least 3 of the new neighbours are like himself. Repeat this a sufficient number of times and the city will be segregated, he said.
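
Here is a minimal sketch of the model in R – the function and variable names, the update order, and the cap on relocation attempts are my simplifications, not part of Schelling’s specification (or of the code linked below):

```r
schelling <- function(n = 50, n_red = 900, n_blue = 900, steps = 50) {
  grid <- matrix(0L, n, n)              # 0 = empty, 1 = red, 2 = blue
  grid[sample(n * n, n_red + n_blue)] <- rep(c(1L, 2L), c(n_red, n_blue))

  # Count same-type neighbours among the (up to) 8 cells around (i, j)
  likes <- function(g, i, j) {
    rows <- max(1, i - 1):min(n, i + 1)
    cols <- max(1, j - 1):min(n, j + 1)
    sum(g[rows, cols] == g[i, j]) - 1   # subtract the cell itself
  }

  for (step in seq_len(steps)) {
    for (i in 1:n) for (j in 1:n) {
      type <- grid[i, j]
      if (type == 0L || likes(grid, i, j) >= 3) next
      # Unhappy: vacate, then try random empty cells until one has at
      # least 3 like neighbours (capped at 50 tries to keep this fast)
      grid[i, j] <- 0L
      empties <- which(grid == 0L)
      tries <- empties[sample.int(length(empties), min(50, length(empties)))]
      moved <- FALSE
      for (e in tries) {
        ei <- (e - 1) %% n + 1
        ej <- (e - 1) %/% n + 1
        grid[ei, ej] <- type
        if (likes(grid, ei, ej) >= 3) { moved <- TRUE; break }
        grid[ei, ej] <- 0L              # undo and try the next cell
      }
      if (!moved) grid[i, j] <- type    # nowhere better to go; stay put
    }
  }
  grid
}

# 50 x 50 grid, 900 red, 900 blue, 700 empty cells
image(schelling(), col = c("white", "red", "blue"), axes = FALSE, asp = 1)
```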

Rediscovering this concept while reading this wonderful book on Networks, Crowds and Markets yesterday, I decided to code it up on a whim. It’s nothing that hasn’t been done before – all you need to do is search around and you’ll find plenty of code for these simulations. I just decided to code it myself from first principles as a challenge.

You can find the (rather badly written) code here. Here is some sample output:

Sample output

As you can see, people belong to two types – red and blue. Initially they start out randomly distributed (white spaces show empty areas). Then people start moving based on Schelling’s rule – if fewer than 3 of your neighbours are of your kind, you move to a new empty place (if one is available) that is friendlier to you. Over time, you see that you get a segregated city, with large-ish patches of reds and blues.

The interesting thing to note is that there is no “complete segregation” – there is no one large red patch and one large blue patch. Secondly, segregation seems rather slow at first, but soon picks up pace. You might also notice that the white spaces expand over time.

This is for one specific input, with 2500 cells (a 50 by 50 grid), starting with 900 red and 900 blue people (meaning 700 cells are empty). If you change these numbers, the pattern of segregation changes. When there are too few empty cells, for example, the city remains mixed – people unhappy with their neighbourhood have nowhere to go. When there are too many empty cells, you’ll see that the city contracts. And so forth.
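
With the sketch above (my hypothetical schelling function, not the linked code), these variations are one-liners:

```r
# Few empty cells: unhappy people have nowhere to go, the city stays mixed
image(schelling(n_red = 1150, n_blue = 1150), col = c("white", "red", "blue"),
      axes = FALSE, asp = 1)

# Many empty cells: the occupied part of the city contracts
image(schelling(n_red = 400, n_blue = 400), col = c("white", "red", "blue"),
      axes = FALSE, asp = 1)
```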

Play around with the code (I admit I haven’t written sufficient documentation), and you can figure out some more interesting patterns by yourself!

Making coding cool again

I learnt to code back in 1998. My aunt taught me the basics of C++, and I was fascinated by all that I could make my bad old 386 computer do. Soon enough I was solving complex math problems, and using special ASCII characters to create interesting patterns on screen. It wasn’t long before I wrote the code for two players sitting at the same machine to play Pong. And that made me a star.

I was in a rather stud class back then (the school I went to in class XI had a reputation for attracting toppers), and after a while I think I had used my coding skills to build a reasonable reputation. In other words, coding was cool. And all the other kids also looked up to coding as a special skill.

Somewhere down the line, though I don’t remember when, coding became uncool. Despite graduating with a degree in Computer Science from IIT Madras, I didn’t want a “coding job”. I ended up with an offer for one, but didn’t want to take it, and so I wrote some MBA entrance exams and made my escape that way.

By the time I graduated from my MBA, coding had become even more uncool. If you were in a job that required you to code, it was an indication that you were on the lowest rung, and thus not in a “management job”. Perhaps even worse, if your job required you to code, you were probably in an “IT job”, something that was back then considered a “dead end” and thus not a preferred job.

Thus, even if you coded in your job, you tended to downplay it. You didn’t want your peers to think you were either in a “bottom rung” job or in an “IT job”. So I wrote fairly studmax code (mostly using VB on Excel) but didn’t particularly talk about it when I met my MBA friends. As I moved jobs (they became progressively studder) my coding actually increased, but I continued to downplay the coding bit.

And I don’t think it’s just me. Thanks to the reasons explained above, coding is considered uncool among most MBA graduates. Even most engineering graduates from good colleges don’t find coding cool, for that is what their peers in big-name, big-size, dead-end-job software services companies do. And if people consider coding uncool, it has a dampening impact on the quality of talent that goes into jobs that involve coding. And that means code becomes less smart. And so forth.

So the question is how we can make coding cool again. I’m not saying it’s totally uncool – there are plenty of talented people who want to code, and who think it’s cool. The problem, though, is that the marginal potential coder is not taking to coding because he thinks it is not cool enough. And making coding cool again will lead to a greater number of people taking up this vocation!

Any ideas?