I can retire

I was playing with the “custom GPT” feature of ChatGPT. I asked it to build a bot to answer questions based on content in this blog. And then I started trying it out. Here is a sample:

You
explain the studs and fighters theory
NED Bot

The “Studs and Fighters” theory is a framework used to describe two distinct types of competitive individuals or entities, often in the context of sports, business, or any field involving competition. Here’s a casual explanation:

  • Studs: These are the naturally talented. They’ve got the skills, often inherent, that make them stand out with seemingly less effort. In sports, think of the athlete who executes plays with grace and makes it all look easy. In business, it could be that entrepreneur whose ventures just always seem to bloom.
  • Fighters: These folks might not have the natural flair of the studs, but boy do they have grit. They’re the ones who compensate for any lack of natural talent with sheer hard work and perseverance. They fight for every point, work relentlessly, and often overcome obstacles through tenacity. Think of the athlete who may not be the most skilled on the field but is always the last to leave practice.

The theory goes that in the long run, the Fighters tend to outperform the Studs because their success is not reliant on inherent talent, which can plateau, but on continuous improvement and adaptation. However, when a Stud also has the mindset of a Fighter, that’s when you get those once-in-a-generation superstars. It’s not just about what you’re born with or what you’re willing to do; it’s about combining the best of both worlds if you can.

You
who came up with this theory?
NED Bot

The “Studs and Fighters” theory was popularized by the blog “noenthuda.com,” which is run by Karthik Shashidhar. He formulated this concept based on observations in various competitive fields, and it has since been referenced in discussions about performance, talent, and success. The theory encapsulates the dichotomy between natural talent and hard-earned skill, offering a lens through which to view competition and development in individuals.

Not bad, I must say! This took about five minutes to program.

I’ve published the bot. You can find it here. According to ChatGPT itself, you don’t need a premium ChatGPT account to USE this. Have fun, and stress test!

LLMs and Software Margins

A few months back, I came across this article that talked about margins in the software industry. Long ago, computer software was well known to be an insanely high gross margin industry. However, that is no longer the case.

If you look at SaaS (software as a service) companies, a lot of them barely make any profit any more. So what changed?

The answer is infrastructure. In the olden days, when all hardware was “on premise”, software would be a bunch of lines of code that would get sold, and then run on the client’s on-premise hardware. Thus, once the code had been written and tested and perfected, the only cost that the vendor faced was to install the code on the client’s hardware (including the cost of engineers involved in the installation). And the margins soared.

Then (I’m still paraphrasing the article that I had read, and now can’t find), the cloud happened. Hardware wasn’t all on-premise any more. People figured out that software could be sold “as a service” (hence SaaS). Which means, instead of charging for installing some code on a computer, you could charge for API hits, or function calls. Everything became smooth.

The catch, though, was that the software would now have to be hosted on hardware maintained (in the cloud) by the vendor. Which meant the marginal cost of delivery suddenly became non-zero. Rather, it went from O(1) (a one-time installation cost) to O(n) (a cost incurred each time the service gets hit, or for as long as it is maintained). And this had a material impact on software margins.
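A toy, back-of-the-envelope version of that argument, with entirely made-up numbers (the helper functions here are hypothetical, purely for illustration):

```python
# Toy numbers (entirely made up) to illustrate O(1) vs O(n) delivery cost.

def on_prem_gross_margin(licence_fee, one_time_install_cost):
    # On-premise: the vendor's delivery cost is a one-time install, O(1)
    return (licence_fee - one_time_install_cost) / licence_fee

def saas_gross_margin(price_per_call, hosting_cost_per_call, n_calls):
    # SaaS: the vendor pays to serve every single call, O(n)
    revenue = price_per_call * n_calls
    cost = hosting_cost_per_call * n_calls
    return (revenue - cost) / revenue

print(on_prem_gross_margin(100_000, 5_000))       # 0.95 -- classic software margin
print(saas_gross_margin(0.01, 0.004, 1_000_000))  # 0.60 -- SaaS-style margin
```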

I’m thinking of this now in the wake of new-fangled open source LLMs that keep getting announced every day. Every new LLM that comes out gets compared with ChatGPT, and people tell you that this new LLM is “open source”. And you get excited that you can get for free what you would have to pay for with ChatGPT.

Of course, the catch here is that ChatGPT is like SaaS – not only does it provide you the “LLM service”, it also hosts the service for you and answers your questions, for a fee.

These open source models are like the traditional “on-premise” computer software industry – they have good code, but the issue, of course, is that you need to supply your own hardware. Add in the cost of maintaining the said hardware, and you see where the costs of open source LLMs lie.

That said, Free != Open Source. The Open Source LLMs are not only free, but also open source – and so, the real value in them is that you can actually build on the existing algorithms and not have to pay a fee (except for your own infrastructure).

And from that perspective, it’s exciting that so many new tools are coming along.

Stable Diffusion and Chat GPT and Logistic Regression

For a long time I have had this shibboleth for telling whether someone is a “statistics person” or a “machine learning person”: what they call a regression where the dependent variable is binary. Machine learning people call it “logistic regression”, while statisticians simply call it “logit” (there is also a “probit”).

Now, in terms of implementation as well, there is one big difference between the way “logit” is modelled versus “logistic regression”. For a logit model (if you are using python, you need the “statsmodels” package for this, not scikit-learn), the number of observations needs to far exceed the number of independent variables.

Otherwise, a matrix that needs to be inverted as part of the solution turns out to be singular, and there is no solution. I guess I betrayed my greater background in statistics than in machine learning when, in 2018, I wrote this blogpost on machine learning being a “process to tie down coefficients in maths models”.
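A quick illustration of that failure mode, with made-up data (assuming the statsmodels API; in the versions I am aware of, this surfaces as a singular-matrix error):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)

# 50 observations but 200 independent variables: far too few data points
n_obs, n_vars = 50, 200
X = sm.add_constant(rng.normal(size=(n_obs, n_vars)))
y = (rng.random(n_obs) > 0.5).astype(int)

# The classical "logit" fit inverts a (n_vars + 1) x (n_vars + 1) matrix
# whose rank is at most n_obs -- singular here, so the fit blows up
try:
    sm.Logit(y, X).fit(disp=0)
except Exception as e:
    print("Logit failed:", type(e).__name__)  # typically a LinAlgError
```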

“Logistic regression” (as opposed to “logit”), on the other hand, puts no such constraint on any matrix being invertible. Instead of actually inverting a matrix, machine learning approaches simply learn the coefficients directly using gradient descent (basically the opposite of hill climbing), so mathematical inconveniences such as matrices that cannot be inverted are moot there.
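To make that concrete, here is a minimal sketch (not any library’s actual implementation) of fitting the same kind of underdetermined problem by gradient descent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_gd(X, y, lr=0.1, steps=5000):
    """Logistic regression by plain gradient descent: nothing is inverted."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # Gradient of the mean logistic loss -- only matrix-vector products
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

# The same p >> n setup that broke "logit" above fits without complaint
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 200))
y = (rng.random(50) > 0.5).astype(float)
w = fit_logistic_gd(X, y)
print(((sigmoid(X @ w) > 0.5) == (y > 0.5)).mean())  # training accuracy
```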

And so you have logistic regression models with thousands of variables, often calibrated with far fewer data points. To be honest, I can’t fully understand this – without sufficient information (data points) to calibrate the coefficients, there will always be a sense of randomness in the output. The model has too many degrees of freedom, and so it ends up supplying additional information of its own (apart from what was supplied in the training data!).
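A toy demonstration of that extra, random “information” (all data here is made up): two gradient descent runs that differ only in their random starting point can both fit the training data, yet disagree on a new input, because the degrees of freedom the data never touches retain whatever the initialisation put there.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_gd(X, y, w0, lr=0.5, steps=3000):
    # Plain gradient descent on the mean logistic loss, from a given start
    w = w0.copy()
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))            # 20 observations, 500 coefficients
y = (rng.random(20) > 0.5).astype(float)

# Two different random initialisations: both typically end up
# classifying all 20 training points correctly...
w_a = fit_gd(X, y, 0.1 * rng.normal(size=500))
w_b = fit_gd(X, y, 0.1 * rng.normal(size=500))

# ...but they can disagree on a brand-new point, because the training
# data never pinned the coefficients down in the first place
x_new = rng.normal(size=500)
print(sigmoid(x_new @ w_a), sigmoid(x_new @ w_b))
```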

Of late I have been playing a fair bit with generative AI (primarily ChatGPT and Stable Diffusion). The other day, my daughter and I were alone in my in-laws’ house, and I told her, “look, I’ve brought my personal laptop along, if you want we can play with it”. And she demanded that she “play with stable diffusion”. This is the image she got for “tiger chasing deer”.

I have written earlier here about how the likes of ChatGPT and Stable Diffusion in a way redefine “information content”.


And if you think about it, almost by definition, “generative AI” creates information (and hallucinates, like in the above pic). Traditionally speaking, “a picture is worth a thousand words”, but if you can generate a picture with just a few words of prompt, the information content in it is far less than a thousand words.

In some sense, this reminds me of “logistic regression” once again. By definition (because it is generative), there is insufficient “tying down of coefficients”, because of which the AI inevitably ends up “adding value of its own”, which by definition is random.

So, you will end up getting arbitrary results. ChatGPT often gives you wrong answers to questions. Dall-E and Midjourney and Stable Diffusion will return nonsense images such as the above. Because a “generative AI” needs to create information, by definition, all the coefficients of the model cannot be well calibrated. 

And the consequence of this is that however good these AIs get, however much data is used to train them, there will always be an element of randomness to them. There will always be test cases where they give funny results.

No, AGI is not here yet.