What's happening at the intersection of ML and Engineering.
A few days ago, I shared a LinkedIn post reflecting on the overwhelming hype surrounding AI that has captivated the tech world, and beyond, over the past two years. Since then, I've had the pleasure of engaging with a diverse group of tech founders, tech executives, VCs, and employees across various tech companies. That gave me the topic for this post. It's important to note that I have mostly worked in Applied ML and Platform Engineering in the industry, my posts are largely opinion-based, and while I strive for insight, there's always the chance I could be off the mark. Let's dive in.
Noise-to-signal ratio - It is extremely high. This is, however, true for all transformative technology. I wasn't old enough to witness and understand the dot-com boom, but I assume it was similar. The thing to remember about the dot-com boom is that a lot of winners were created and the world changed for good. Filtering out the noise will be difficult, but that is true for everything in life. Generative AI will also produce winners, but it will not be as easy as it was initially made out to be.
A lot of money is being wasted on consultants and strategy - When consultants are the ones making the most money from the AI boom, you know something is going wrong. The sort of people who will tell you the difference between Data Science, Machine Learning, and Artificial Intelligence, get nothing done, and charge you $10,000 for telling you something that doesn't matter - this is exactly where value is lost.
Winners - I am expecting three to four kinds of winners in the Gen AI space:
LLM providers - This seems obvious at this point in time. OpenAI, Google, Anthropic, SSI, X, and many others are competing in this space. At this point, OpenAI with their Apple partnership and Google with their Android distribution seem to have distribution sorted, but this space continues to evolve. Distribution and the feedback loop will be the keys to success, and hence Google and OpenAI (via the Apple partnership) have an inherent advantage. Meta's play seems very similar to their cloud play: Meta bet on the Open Compute Project when everyone else was getting into cloud services. I don't know if their Llama initiative will face a similar not-so-great fate. While the traction for Llama is a lot stronger, and Meta too has immense distribution via Facebook, Instagram, and WhatsApp, I don't know what their final bet is. A mentor and manager of mine once shared that one of the key reasons Google dominated the search landscape and maintained its leadership was their unparalleled expertise in managing large-scale search infrastructure. The same will be true for the LLM providers.
ML (calling it LLM at this point seems to do the magic) infra providers - The ML infra space is broken. Azure, GCP, and AWS have their offerings on training, inference, monitoring, and everything else, but none of them provide what is needed at a cost that is affordable. While there has been some headway in pipelining and training, inference and monitoring remain broken. All the cloud providers seem to have everything built, yet every small-to-medium tech company seems to either build something on top of Kubernetes or use multiple vendors to do the job for them. This space will definitely have a winner, and it is wide open.
Enterprise applications - A couple of years back, when the LLM race started, a lot of folks jumped on the bandwagon of building RAG-based applications for the enterprise. Some of them were already building something in the space and got a leg up from the LLM advances, while others thought building an enterprise application on top of LLMs was going to be a walk in the park. While most of those applications have since realised it is not, there will be some winners in the space: developer productivity, PM productivity, and what not. But I am yet to see any one of them clearly winning.
B2C applications - While most folks have pegged LLM providers, ML infra, and enterprise tech as the big winners, I would bet on some B2C application somewhere that can take attention away from TikTok, Instagram, and the likes. It is not going to be easy: the work these companies have done on recommendation systems is hard to replicate. While character.ai has had some initial success with time spent on the platform, I still await the biggest winner in this space. I have a feeling that, amongst all the categories, this one will be the biggest winner.
Minimum iterable product - While most of us are aware of the terms Minimum viable product and Minimum lovable product, I propose a term called Minimum iterable product. Folks who have worked in the AI space for a while will know what I mean. Google's developer portal has a great set of rules for machine learning that touches on related ideas. A minimum iterable product is a product that has some intelligence, the ability to take feedback and learn quickly, and underlying infrastructure that lets you change the intelligence algorithm without much effort. The cost of experimentation has to go down for any company that has a user base. A lot of companies that jumped into the AI space in the last one to two years don't understand this point well enough. To build an ML product, build something via rules first, ship it, and then iterate on the components via different algorithms. Even mature companies don't seem to understand that in order to create an AI/ML-driven product you need to collect good-quality, diverse data, and that can come by design or by collecting a lot of it. Collecting a lot of it has its own distribution constraints, but the quality and design of what you collect, and how you collect it, is in your hands.
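The "change the intelligence algorithm without much effort" idea can be sketched as a simple interface boundary. A minimal sketch, assuming a toy ranking product; all the class names, rules, and weights below are illustrative, not any specific system:

```python
from typing import Protocol


class Ranker(Protocol):
    """Any 'intelligence' component: rules today, a learned model tomorrow."""
    def score(self, item: dict) -> float: ...


class RuleBasedRanker:
    """v0: ship simple rules first and start collecting feedback."""
    def score(self, item: dict) -> float:
        # Hypothetical rule: prefer items in a 'preferred' category.
        return 1.0 if item.get("category") == "preferred" else 0.0


class ModelRanker:
    """v1: a (stand-in) learned model, swapped in later."""
    def __init__(self, weights: dict[str, float]):
        self.weights = weights

    def score(self, item: dict) -> float:
        # Linear score over item features; weights would come from training.
        return sum(self.weights.get(k, 0.0) * v
                   for k, v in item.get("features", {}).items())


def rank(items: list[dict], ranker: Ranker) -> list[dict]:
    # Product code depends only on the interface, so swapping the
    # algorithm is a configuration change, not a rewrite.
    return sorted(items, key=ranker.score, reverse=True)
```

The point is the boundary, not the algorithms: once the product only sees `Ranker`, iterating on the intelligence becomes cheap.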
Gen AI applications have a very long-tail problem - While building applications on top of LLMs looks quick and viable at first, such applications have a never-ending long tail. Hallucination is just one part of the problem, where one in a hundred times your LLM hallucinates and leads to a weird application flow. There is a bigger, hidden human need for determinism, which is a problem for LLM apps. Grounding your LLM app in your own data is being solved by many as we speak, but there is a lot to be done there.
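For readers unfamiliar with "grounding": the shape of it is retrieving relevant snippets from your own data and putting them in the prompt. A minimal sketch using word overlap as a stand-in for the embedding search real systems use; the documents and function names are made up:

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]


def grounded_prompt(query: str, documents: list[str]) -> str:
    """Build a prompt that asks the model to stick to retrieved context."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = ["Refunds are processed within 5 business days.",
        "Our office is closed on public holidays.",
        "Refund requests require an order number."]
```

Even with real embeddings and a vector store, the long tail lives in the same places this sketch glosses over: what to retrieve, how much, and what the model does when the context doesn't contain the answer.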
Customer care seems to be one of the most obvious use cases where LLMs are being tried out. I have a feeling we are far from solving this use case completely. We call customer care when all obvious ways of solving the problem have been exhausted, or when we are in a hurry to solve it. If I get stuck in some kind of hallucinatory loop, I am done. Even if that happens once in a hundred times with a valuable consumer, I would not bet on it.
With LLM-based apps, you fix one thing and another breaks. While continuous evaluation (like continuous unit testing) is one way to deal with the problem, more and more developers are realising the truth about the stochastic world of ML, and that there is a never-ending long-tail problem in developing LLM apps.
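The continuous-evaluation idea can be sketched as a tiny harness. Unlike unit tests, the checks are predicates and the result is a pass rate over repeated trials, because stochastic outputs rarely string-match exactly. Everything here is illustrative; `call_llm` is a stub standing in for whatever model call your app makes:

```python
def call_llm(prompt: str) -> str:
    # Stub: in a real app this would hit your model endpoint.
    return "Paris" if "capital of France" in prompt else "I don't know"


EVAL_CASES = [
    # (prompt, check) pairs: checks are predicates, not exact matches.
    ("What is the capital of France?", lambda out: "Paris" in out),
    ("What is the capital of Atlantis?", lambda out: "don't know" in out.lower()),
]


def run_evals(cases, n_trials: int = 3) -> float:
    """Run each case several times and return the overall pass rate."""
    passed = total = 0
    for prompt, check in cases:
        for _ in range(n_trials):
            total += 1
            if check(call_llm(prompt)):
                passed += 1
    return passed / total
```

Run on every change, a dropping pass rate is how you notice that fixing one thing broke another.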
Fear about Software Engineering roles - Somehow, I do feel that Karpathy added fuel to the fire. There is a bunch of companies trying to build AI software engineers, and then you have the likes of Copilot and what not. So, is the fear real? It is very similar to the claim that AutoML would take over all ML roles. What needs to be understood is that writing code is a very small part of the SDE job description. With all the code generation, what might happen is that entry-level software jobs will be impacted, but not the senior levels. So, if you are already an SDE who writes neat code and builds neat systems, you are probably fine for now. However, the near elimination of entry-level roles does mean that the pipeline for senior-level roles becomes thin, and that is a problem: the demand for senior roles keeps growing, but there is barely any pipeline. It is a weird conundrum, but for now, if you are good with your software development skills, you are gold.
Fear about ML roles - I have also met a bunch of software engineers and product managers who think Gen AI will lead to the elimination of ML roles, because Gen AI does everything magically. The argument goes: Gen AI roles are all engineering or prompt engineering and don't require you to know any ML. This is flawed at the same level as the SDE argument. If the claim is about entry-level folks, I can see the point. But calling a model to infer something is a very small part of the ML (Scientist/Engineer) job description. ML folks are used to the ideas of stochasticity and the minimum iterable product, which are somewhat foreign to the deterministic world. A lot of magic happens in pre-processing, pipelines, metrics, and post-processing, which no one talks about. So again, if you are an entry-level ML person, you might have a hard time, but not if you have worked with ML for a bit. In fact, AI engineer salaries are going through the roof as we speak.
On the flip side, if, as an ML engineer, you are asking people not to do a POC using Gen AI, you are being foolish. A Gen AI-based POC is better seen as the logistic regression baseline, which you then aim to beat with your work, using Gen AI or otherwise.
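The baseline discipline is the same one classic ML has always used: measure the cheap system first, then only ship what beats it on the same eval set. A toy sketch with made-up support-ticket routing data; the predictors stand in for "quick POC" and "the system you actually built":

```python
def accuracy(predict, dataset):
    """Fraction of (text, label) pairs the predictor gets right."""
    return sum(predict(text) == label for text, label in dataset) / len(dataset)


dataset = [("refund request", "billing"), ("app crashes", "bug"),
           ("charge twice", "billing"), ("login fails", "bug")]


def poc_predict(text):
    # The quick one-rule POC: crude, but it is a real baseline.
    return "billing" if "refund" in text else "bug"


def candidate_predict(text):
    # The system you want to ship has to beat the POC, not just exist.
    return "billing" if ("refund" in text or "charge" in text) else "bug"


baseline = accuracy(poc_predict, dataset)      # misses "charge twice"
candidate = accuracy(candidate_predict, dataset)
```

If `candidate` does not beat `baseline` on the shared eval set, the extra complexity is not paying for itself, which is exactly the question the Gen AI POC should force you to answer.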
Advice for folks who want to get into ML, or are in ML already but struggling with the recent set of changes:
Get breadth first before getting into depth. If you don't understand the basics of SWE or the basics of ML, you are going to struggle to build anything of substance in the AI space, be it Gen AI or otherwise.
Entry-level jobs are hard to get currently, and I don't exactly know the solution to this.
Be ready for the iterable paradigm of development as far as any intelligent software product goes.
Collecting good-quality data (think of the explore-exploit trade-off) is more important than just getting data.
Getting a Masters or a PhD is great for learning, and that should be the motive for doing a PhD in ML or a related field. If the motive is entry into AI or career advancement, it isn't a great one. ML is getting democratised, and execs are looking at who delivers what, not who comes with what degree. The best way to learn ML is to build something and ship it; no degree can compare to that experience.
At the risk of going off topic: ex-physicists are doing very well in Machine Learning. Just wanted to point out this correlation.
Bye for Now.
Nice writeup of the scene. The physicists part reminds me of:
“Physicists, it turns out, are almost perfectly suited to invading other people’s disciplines, being not only extremely clever but also generally much less fussy than most about the problems they choose to study. Physicists tend to see themselves as the lords of the academic jungle, loftily regarding their own methods as above the ken of anybody else and jealously guarding their own terrain. But their alter egos are closer to scavengers, happy to borrow ideas and technologies from anywhere if they seem like they might be useful, and delighted to stomp all over someone else’s problem. As irritating as this attitude can be to everybody else, the arrival of the physicists into a previously non-physics area of research often presages a period of great discovery and excitement. Mathematicians do the same thing occasionally, but no one descends with such fury and in so great a number as a pack of hungry physicists, adrenalized by the scent of a new problem.”
Six Degrees: The Science Of A Connected Age, 2003, Duncan J. Watts; pp. 61-62