Generative AI is already making a real impact in the enterprise, but bad processes may create a false ceiling that holds back progress. That’s how Exponential View founder and esteemed independent AI researcher Azeem Azhar reads recently emerging research, which suggests that processes and data structures designed around GenAI can reap exponential rewards.
In a fireside conversation with HFS Research CEO and Chief Analyst Phil Fersht, Azeem offered evidence that LLMs are already helping people do their work faster, at higher performance levels – and with greater employee satisfaction. So without further ado, here are Azeem’s keen insights…
LLMs boost performance among the majority of skilled workers
Azeem Azhar: “Phil, thank you so much for the opportunity to talk to you and the audience here. There was a Brynjolfsson study that looked at call center workers using LLMs before GPT-3.5 or 4, and it identified that these call center workers were 14% more productive – and that after two months of using an LLM, a new worker was as skilled as long-term employees who had not used one.”
He explained that even uncodified knowledge – know-how never written down in any document – was transmitted to these new workers over six months. In another study, by Noy and Zhang of MIT, higher-paid white-collar workers were provided with ChatGPT.
Azeem Azhar: “These were grant writers, or in HR, or marcomms, with an average salary of over $100,000 a year. They had a 40% improvement in the time taken and – I think – a 15 to 20% improvement in the quality of the work.”
Poor processes constrain top performers and will limit the benefits GenAI can bring
Recently, Azeem’s colleague Karim Lakhani at HBS and his friend François Candelon at BCG studied 800 Boston Consulting Group strategy consultants who were supported in their work by a GPT-4 application. Tasks were completed faster and to a higher standard, and below-median employees improved the most. Azeem says these three studies show LLMs can be productivity enhancers across the board – but improvement at the top level is constrained.
Azeem Azhar: “I think, Phil, this falls right in the realm of the kind of strategic transformation work you have done with clients for years and years. The fact that the bottom three quartiles of the employees improve the most speaks to the limitations of the existing process flow.”
He says LLMs reveal the limitations that poor processes place on top performers.
Azeem Azhar: “It’s as if we have a high jump, and the bar never goes above 1.7m, and for me, that’s a stretch; for you, that’s easy, you could jump 1.9m, but we never raised it to 1.9m. And that’s the kind of thing that HFS helps firms think about. LLMs have shown that you must rethink your internal dynamics to get more performance.”
We must get to grips with a new ‘jagged frontier’ where performance can go either way
Azeem Azhar: “Where you (Phil) identify we may hit limits, that is what the AI researchers call the jagged frontier. On one side of the frontier, the LLM does better; on the other, it worsens things. The problem is we don’t know what that frontier looks like. It’s also a shifting frontier. It varies from task to task.”
Azeem thinks the arrival of LLMs triggers a moment to rethink how work is done.
Azeem Azhar: “You must be alert to where your existing systems or processes are so constrained they don’t allow you to perform at a GPA of 4.0 because you’ve never thought it was possible, and also how you manage for those tasks where working with an LLM might give you a worse result.”
Don’t blame the LLMs – it’s the shareholders and finance directors who are likely to be swinging the job cuts axe
Phil Fersht: So, do you feel white-collar jobs are under threat, Azeem? Or do you think this is another evolution, a new technology, and new jobs will be created…
Azeem Azhar: I think we can be reasonably expectant that new jobs will get created. I don’t believe the threat necessarily comes from LLMs. It probably comes from shareholders and finance directors, more than anything else, because there will be a lot of pressure for cost savings and, you know, “Can we deliver the same experience to our customers at a lower cost? And if we can, let’s do that.”
“There will undoubtedly be many processes that will be as efficient with fewer people. What a firm chooses to do at that point will depend a lot on its relationship with its workers, its territory and employee rights, and the kind of direction, mission, and depth of capacity of the firm. Some big IT consulting firms managed to upskill and retrain hundreds of thousands of people in the face of automation. But they’ve done that against the backdrop of growing businesses.”
Jobs created out of technical debt will be among the first to go
Phil Fersht: Yeah. Yes, very well put. And, you know, it’s interesting, the conversation I had yesterday with the academic was very much, “We need to keep reminding ourselves that AI is about ultimately improving human intelligence.” So…
Azeem Azhar: Yeah. But I think we have to bear in mind that there was an assumption, when tasks were designed and processes were created, about who would do that job. Many jobs were framed so that they didn’t need to be done by humans; humans did them because it was a bit too complicated to get a computer to do them. Data entry is one. The whole of RPA exists because of poorly architected, monolithic computing frameworks, which meant we couldn’t move the ledger entry from the mainframe system into the minicomputer system, the client-server system, or the web-based system, so we had to do screen scraping and things like that. Now, that person has a job, but that job exists only because of technical debt.
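The screen-scraping glue Azeem describes can be pictured as parsing a fixed-width terminal line into structured fields so it can be re-keyed into a downstream system. This is a minimal, hypothetical sketch – the field names and column offsets are invented for illustration, not taken from any real mainframe layout:

```python
# Hypothetical sketch of RPA-style screen scraping: a ledger entry rendered
# as a fixed-width terminal line is re-parsed into a named record. Field
# names and (start, end) column offsets are made-up assumptions.
FIELDS = {
    "account": (0, 10),
    "date":    (10, 20),
    "amount":  (20, 32),
}

def scrape_ledger_line(screen_line):
    """Slice a fixed-width screen line into a dict of named, stripped fields."""
    return {name: screen_line[a:b].strip() for name, (a, b) in FIELDS.items()}

line = "ACC-00421 2023-11-01      145.20"
record = scrape_ledger_line(line)
# record -> {"account": "ACC-00421", "date": "2023-11-01", "amount": "145.20"}
```

The fragility is the point: a one-column shift in the screen layout breaks every downstream consumer, which is why such jobs only exist as a workaround for systems that never exposed the data properly.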
They can’t reason and they can’t plan – but LLMs can overcome their limitations
Phil Fersht: Yeah. So, final question. You said it’s not all going to end here with LLMs. If you could look back in three years’ time, what do you think the world of enterprise tech will look like then, based on how fast things are moving now?
Azeem Azhar: LLMs are good at a bunch of things. But they can’t reason, they can’t reliably plan complex actions, and they’re not great at learning representations of the world. This is really about how they’re designed, so it does appear new science may be needed. However, the limitations we see – hallucinations, for example – may get tackled through continual improvement in the LLMs themselves (GPT-4 is much less hallucinatory than GPT-3.5), but also through how they get productized with other tools, like vector databases, or RAG – retrieval-augmented generation – which is meant to anchor an LLM’s output to verifiable, certified facts that it might find elsewhere. Because you’re starting to see technologies and techniques like that wrapped around the science, I think you’ll see many companies building SaaS and enterprise software on top of such solutions, with an LLM as the underlying model.
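The RAG idea Azeem mentions – anchoring an LLM’s output to retrieved facts – can be sketched in a few lines. This is a toy illustration only: it swaps a real vector database for a keyword-overlap retriever and stops at building the grounded prompt, with no actual model call:

```python
# Toy sketch of retrieval-augmented generation (RAG): retrieve the documents
# most relevant to a query, then stuff them into the prompt so the model is
# anchored to those facts. The word-overlap "retriever" stands in for a real
# vector database; all documents below are invented examples.
def embed(text):
    """Toy 'embedding': the set of lowercase words in the text."""
    return set(text.lower().split())

def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query; return the top k."""
    scored = sorted(documents,
                    key=lambda d: len(embed(query) & embed(d)),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Anchor the answer by injecting retrieved facts into the prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only these facts:\n{context}\n\nQuestion: {query}"

docs = [
    "GPT-4 hallucinates less than GPT-3.5.",
    "Vector databases store document embeddings.",
    "The high jump bar is set at 1.7m.",
]
prompt = build_prompt("What do vector databases store?", docs)
```

A production system would replace `embed` with a learned embedding model and `retrieve` with a nearest-neighbour search over a vector store, but the shape – retrieve, then ground the generation in what was retrieved – is the same.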
The Bottom-Line: Prepare to be surprised – just as you were surprised by ChatGPT
Azeem Azhar: I would expect the use of more and more open source, more sparse, more efficient models that are tuned to specific sub-verticals within industries. But at the same time, there will still be a constraint because if you are a customer of Salesforce, and you have the Salesforce GenAI chatbot helping you with this or that, there will still be things that it can’t see in your worldview, and you will then start to think about, “How do I bring that in, with my internal system?”
It’s an exciting time. We should be prepared to be surprised in the same way that we were surprised by ChatGPT. But I think there’s quite a lot of momentum in building these systems around LLMs as a core orchestrator and reasoning engine – even though the LLM doesn’t do any of that stuff particularly well, it does it well enough, and that looks like the framing for the next few years.
Phil Fersht: Very well put, Azeem, and thank you very much for your time today – always good to hear from you.