Communicating Numbers

Earlier this week I read this masterful blogpost on Andrew Gelman’s blog (though the post itself is not written by Andrew Gelman – it’s written by Phil Price) about communicating numbers.

Basically the way you communicate a number can give a lot more information “between the lines”. Take the example at the top of the article:

“At the New York Marathon, three of the five fastest runners were wearing our shoes.” I’m sure I’m not the first or last person to have realized that there’s more information there than it seems at first. For one thing, you can be sure that one of those three runners finished fifth: otherwise the ad would have said “three of the four fastest.” Also, it seems almost certain that the two fastest runners were not wearing the shoes, and indeed it probably wasn’t 1-3 or 2-3 either: “The two fastest” and “two of the three fastest” both seem better than “three of the top five.” The principle here is that if you’re trying to make the result sound as impressive as possible, an unintended consequence is that you’re revealing the upper limit.

Incredible. So “three of the five fastest” means one of them certainly finished fifth, and another very likely finished fourth. Similarly, if a company calls itself a “Fortune 500 company”, it is likely ranked closer to 500 than to 100.
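The inference can be sketched as a small enumeration. This is only an illustration under my own simplifying assumptions, none of which come from the original post: the advertiser only considers claims of the form “k of the top n” for n up to 5, and picks the claim with the highest share k/n (ties going to the larger k). The function name best_claim is made up for this sketch.

```python
from itertools import combinations

def best_claim(positions, top=5):
    """Return (k, n) for the most impressive "k of the top n" claim.

    Assumption: impressiveness is the share k/n, ties broken by larger k.
    """
    claims = []
    for n in range(1, top + 1):
        k = sum(1 for p in positions if p <= n)  # wearers among the top n
        claims.append((k / n, k, n))
    share, k, n = max(claims)
    return k, n

# Which sets of three finishing positions would lead an advertiser
# to settle on "three of the five fastest"?
consistent = [s for s in combinations(range(1, 6), 3) if best_claim(s) == (3, 5)]
print(consistent)
```

Under these assumptions only the finishing sets (2, 4, 5) and (3, 4, 5) survive: every other combination allows a better-sounding claim, such as “the fastest runner” or “two of the three fastest”.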

The other, slightly unrelated, example quoted in the article is about Covid-19 spread in outdoor conditions. Another article says that “less than 10% of covid-19 transmission happens outdoors”. This is misleading because if you say “less than 10%”, people will assume it’s around 9%! The number, apparently, is closer to 0.1%.

There are many more such examples that we encounter in real life. If you write on LinkedIn that you went to a “top 10 ranked B-school”, it means you DID NOT go to a “top 5 ranked B-school”.

Loosely related to this, over the last year or so I’ve become rather irritated by imprecise numerical reporting in the media (related to covid-19). I won’t provide links or quotes here, since the ones I can remember are mostly by one person and I don’t want to single her out (it’s a systemic problem, not unique to her).

You see reports saying “20000 new cases in Karnataka. A majority of them are from Bangalore”. I’ve seen this kind of report even when 90% of the cases have been from Bangalore, and that is misleading – when you say “majority”, you instinctively think of “50% + 1”. Another report said “as many as 10000 cases”. Now, the “as many as” phrasing makes it sound like a very large number, but put in context, this 10000 wasn’t really very high.

Communicating numbers is an art that is not widely practised. Nowadays we see lots of courses on “telling stories with data”, “data visualisation”, graphics, etc., but none on communicating plain numbers.

Maybe I should record an episode about this in my forthcoming podcast. If you know who might be a good guest for it, AND can make an introduction, let me know.

More issues with Slack

A long time back I’d written about how Slack in some ways was like the old DBabble messaging and discussion group platform, except for one small difference – Slack didn’t have threaded conversations which meant that it was only possible to hold one thread of thought in a channel, significantly limiting discussion.

Since then, Slack has introduced threaded conversations, but done it in an atrocious manner. The same linear feed in each channel remains, but there’s now a way to reply to specific messages. However, even in this little implementation Slack has done worse than even WhatsApp – by default, unless you check one little checkbox, your reply is only sent to the person who originally posted the message, and is not posted to the channel.

And if you click the checkbox, the message is displayed in the feed, but in a rather ungainly manner. And threads are only one level deep (this was one reason I used to prefer LiveJournal over blogspot back in the day – comments could be nested in the former, allowing for significantly superior discussions).

Anyway, this post is not about threads. It’s about another bug/feature of Slack which makes it an extremely difficult tool to use, especially for people like me.

The problem with Slack is that it nudges you towards sending shorter messages rather than longer ones. In fact, there’s no facility at all to send a long, well-constructed argument unless you hold down Shift+Enter every time you need a new line. There is an “insert text snippet” feature, but that lacks richness of any kind – like bullet points, for example.

What this does is to force you to use Slack for quick messages only, or only share summaries. It’s possible that this is a design feature, intended to capture the lack of attention span of the “twitter generation”, but it makes it an incredibly hard platform to use to have real discussions.

And when Slack is the primary mode of communication in your company (some organisations have effectively done away with email for internal communications, preferring to put everything on Slack), there is no way at all to communicate nuance.

PS: It’s possible that the metric for someone at Slack is “number of messages sent”. And nudging users towards writing shorter messages can mean more messages are sent!

PS2: DBabble allowed for plenty of nuance, with plenty of space to write your messages and arguments.

 

Teaching and research

My mind goes back to a debate organised by the Civil Engineering department at IIT Madras back in the early 2000s. A bunch of students argued that IIT Madras was “not a world class institution”. A bunch of professors argued otherwise.

I don’t remember too much of the debate but I remember one line that one of the students said. “How does one become a professor at IIT Madras? By writing a hundred papers. Whether one can teach is immaterial”.

The issue of an academic’s responsibilities has been a long-standing one. One accusation against the IITs (ironic in the context of the bit of debate I’ve quoted above) is that they’re too focussed on undergraduate teaching and not enough on research – despite only hiring PhDs as faculty. From time to time the Indian government issues diktats on minimum hours that a professor must teach, and each time it is met with disapproval from the professors.

The reason this debate on an academic’s ability to teach came to my mind is that I’ve been trying to read some books and papers recently (such as this one), and they’re mostly unreadable.

They start with some basic introductory statements and before you know it you are caught up in a slew of jargon, symbols and Greek letters. Basically for anyone who’s not an insider in the field, this represents a near-insurmountable barrier to learning.

And this is where undergraduate teaching comes in. By definition, undergrads are non-specialists and not insiders in any particular specialisation. Even if they were to partly specialise (such as in a branch of engineering), the degree of specialisation is far less than that of a professor.

Thus, in order to communicate effectively with undergrads, the professor needs to change the way they communicate: get rid of the jargon and the sudden introduction of Greek letters, and introduce subjects more gently. Of course plenty of professors simply fail to do that, but if the university has a good feedback mechanism in place this won’t last.

And once the professor is used to communicating to undergrads, communicating with the wider world becomes a breeze, since the same formula works. And that vastly improves the impact of their work, since so many more people can now follow it.

Portfolio communication

I just got a promotional message from my broker (ICICI Direct). The intention of the email is possibly to get me to log back on to the website and do some transactions – remember that the broker makes money when I transact, and buy-and-hold investors don’t make much money for them.

So the mail, which I’m sure has been crafted after getting some “data insight”, goes like this:

Here is a quick update on what is happening in the world of investments since you last visited your ICICIdirect.com investment account.
1. Your total portfolio size is INR [xxxxxx]*
2. Sensex moved up by 8.36% during this period#
3. To know more about the top performing stocks and mutual funds, click here.

While this information might be considered useful, it simply doesn’t tell me enough about my portfolio to take any action.

It’s great to know what my portfolio value is, and what the Sensex moved by in this period (“since my last logon”). A simple additional piece of information would be how much my portfolio has gone up by in this period – to know how I’m performing relative to the market.
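The missing piece of information is simple arithmetic. Here is a minimal sketch, assuming the broker knows the portfolio value at my last logon and today; the portfolio values and function names below are hypothetical, and only the 8.36% Sensex move comes from the email.

```python
def period_return(start_value, end_value, net_deposits=0.0):
    """Simple return over the period, ignoring the timing of any deposits."""
    return (end_value - start_value - net_deposits) / start_value

def excess_return(portfolio_return, benchmark_return):
    """How far the portfolio is ahead of (or behind) the benchmark."""
    return portfolio_return - benchmark_return

# Hypothetical portfolio values at last logon and today.
portfolio = period_return(start_value=500_000, end_value=530_000)
sensex = 0.0836  # the Sensex move quoted in the email

print(f"Portfolio: {portfolio:.2%}, Sensex: {sensex:.2%}, "
      f"difference: {excess_return(portfolio, sensex):+.2%}")
```

With these made-up values the portfolio gained 6% and so lagged the index by a little over two percentage points, which is exactly the kind of one-line message that might prompt me to act.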

And right in my email, they could’ve suggested some mutual funds and stock portfolios that I should move my money to – and given me an easy way to click through to the website/app and trade into these new portfolios using a couple of clicks.

There’s so much that can be done in the field of personal finance, in terms of how brokers and advisors can help clients invest better. And a lot of it is simple formula-based, which means it can be automated and hence done at a fairly low cost.

But then as long as the amount of money brokers make is proportional to the amount the client trades, there will always be conflicts of interest.

Black Box Models

A few years ago, Felix Salmon wrote this article in Wired called “The Formula That Killed Wall Street”. It was about the “Gaussian Copula”, a formula for estimating the joint probability of a set of events happening, if you knew the individual probabilities (together with an assumed correlation between them). It was a mathematical breakthrough.

Unfortunately, it fell into the hands of quants and traders who didn’t fully understand it, and they used it to derive joint probabilities of a large number of instruments put together. What they did not realize was that there was an error in the model (as there is in all models), and when they used the formula to tie up a large number of instruments, this error cascaded, resulting in an extremely inaccurate model, and subsequent massive losses (the last paragraph is based on my reading of the situation. Your mileage might vary).
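To get a feel for how sensitive such a model is to its inputs, here is a rough two-event sketch of the idea behind a Gaussian copula. It is my own simplified illustration, not David Li’s full formula and not what any bank actually ran: the function name, the Monte Carlo approach and the example probabilities are all assumptions made for this sketch.

```python
import math
import random
from statistics import NormalDist

def joint_probability(p1, p2, rho, trials=200_000, seed=42):
    """Estimate P(both events occur) under a two-variable Gaussian copula.

    p1, p2: individual probabilities of the two events.
    rho:    assumed correlation between the underlying normal variables.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    t1 = std_normal.inv_cdf(p1)  # event 1 occurs when z1 < t1
    t2 = std_normal.inv_cdf(p2)  # event 2 occurs when z2 < t2
    hits = 0
    for _ in range(trials):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal with correlation rho to z1
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        if z1 < t1 and z2 < t2:
            hits += 1
    return hits / trials

# Same individual probabilities (5% each), different assumed correlations.
for rho in (0.0, 0.3, 0.6, 0.9):
    print(rho, joint_probability(0.05, 0.05, rho))
```

The individual probabilities never change in this loop, yet the estimated joint probability rises steeply as the assumed correlation goes up. A small error in that one assumption therefore moves the answer a lot, and more so once many instruments are tied together.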

In a blog post earlier this week at Reuters, Salmon returned to this article. He said:

 And you can’t take technology further than its natural limits, either. It wasn’t really the Gaussian copula function which killed Wall Street, nor was it the quants who wielded it. Rather, it was the quants’ managers — the people whose limited understanding of copula functions and value-at-risk calculations allowed far too much risk to be pushed out into the tails. On Wall Street, just as in the rest of industry, a little bit of common sense can go a very long way.

I’m completely with him on this one. This blog post was in reference to Salmon’s latest article in Wired, which is about the four stages in which quants disrupt industries. You are encouraged to read both the Wired article and the blog post about it.

The essence is that it is easy to over-do analytics. Once you have a model that works in a few cases, you will end up putting too much faith into the model, and soon the model will become gospel, and you will build the rest of the organization around the model (this is Stage Three that Salmon talks about). For example, a friend who is a management consultant once mentioned how bank lending practices are now increasingly formula driven. He mentioned reading a manager’s report that said “I know the applicant well, and am confident that he will repay the loan. However, our scoring system ranks him too low, hence I’m unable to offer the loan”.

The key issue, as Salmon mentions in his blog post, is that managers need to have at least a basic understanding of analytics (I had touched upon this issue in an earlier blog post). As I had written in that blog post, there can be two ways in which the analytics team can end up not contributing to the firm – firstly, people think they are geeks who nobody understands, and ignore them. Secondly, and perhaps more dangerously, people think of the analytics guys as gods, and fail to challenge them sufficiently, thus putting too much faith in models.

From this perspective, it is important for the analytics team to communicate well with the other managers – to explain the basic logic behind the models, so that the managers can understand the assumptions and limitations, and can use the models in the intended manner. What usually happens, though, is that after a few attempts when management doesn’t “get” the models, the analytics people resign themselves to using technical jargon and three-letter acronyms to bulldoze their models past the managers.

The point of this post, however, is about black box models. Sometimes, you can have people (either analytics professionals or managers) using models without fully understanding them, and their assumptions. This inevitably leads to disaster. A good example of this is the traders and quants who used David Li’s Gaussian Copula, and ended up with horribly wrong models.

In order to prevent this, a good practice would be for the analytics people to be able to explain the model in an intuitive fashion (without using jargon) to the managers, so that they all understand the essence and nuances of the model in question. This, of course, means that you need to employ analytics people who are capable of effectively communicating their ideas, and employ managers who are able to at least understand some basic quant.

Long mails

As you might have noticed from my blog posts over the years, I like writing long essays. By long, I mean blog post long. Somewhere around 800-1000 words. I can’t write longer than that, which is why my attempts to write a book have come to nought.

Now, thanks to regular blogging for over nine years, I think I’ve become better at writing than at speaking when I have to explain a complicated concept. Writing allows me to structure my thoughts better, whereas while speaking I sometimes tend to think ahead of what I’m saying, and end up making a mess of it (I had a major stammer when I was in school, by the way).

Given that I like explaining concepts in writing rather than in speech, I write long mails even when it comes to work. Writing long emails is like writing blog posts – you have the time and space to structure your thought well and present it to your readers. This especially helps if the thoughts you are to communicate are complex.

The problem, however, is that most people are not used to reading long emails in a work context. People prefer to do meetings instead. Or they just call you up. For whatever reason, the art of long emails has never really taken off in the corporate sphere. Maybe people just want to talk too much.

This, of course, has never deterred me from using my favourite means of communication. It didn’t stop me when I was an employee and the people I wrote to were colleagues. It still doesn’t stop me now, when I’m a consultant, writing to people who are paying me for a piece of work. If they are paying me, I should communicate things to them in a form they are most comfortable with, you might argue. If they are paying me, I should communicate things as well as I can, I argue back, and my best means of communication is writing long emails.

The problem with long emails, however, is that, like long-form articles you send to Pocket or Instapaper, recipients tend to bookmark them for later, intending to read and digest them when they have the time. So, when you send a long email, you are unlikely to get a quick response (note that you can sometimes use this to your advantage). This means that when you write long mails, you might have to follow them up with an SMS or a phone call to the effect of “read and digest and let me know if you have any questions”.

In my last organization, I worked with a number of technical people, some of whom had PhDs. It was interesting to contrast the way they communicated with my long emails. They too would put complex thoughts in writing, except that they would use LaTeX and make a PDF out of it. It would be littered with equations and Greek symbols, in a way that is extremely intuitive for an academic to read.

And here I was, eschewing all that Greek, preferring to write in plain text in the body of emails. No wonder some of my colleagues started terming my emails “blogposts”.