
Agents: A New Paradigm or a Passing Phase?

Daniel Lenton
April 30, 2024

Agents are all the rage right now, but are they actually legit, or just hype? There are arguments on both sides, and in this post I'll do my best to unpack them and share some perspectives.

Emergent Intelligence

If the last century has taught us anything about intelligence, it's that general intelligence is always an emergent property of an optimization algorithm. It is not hand-crafted or hand-engineered; it simply pops out from a simple set of rules mixed with a lot of data and compute. A good early example is Charles Darwin's "On the Origin of Species" in 1859, arguing that biological complexity emerges from random mutations and simple competition. Over a century later, the same concept was playing out in a different domain, this time the capabilities of neural networks. It's argued that Minsky's advocacy for symbolism and criticism of neural networks in his 1969 book "Perceptrons" could have set progress back by decades. The same debate continues to this day. I remember in 2011 when CVPR reviewers were rejecting papers using CNNs because the convolution filters were "learned" and therefore could not be explained, and were not "true computer vision". This prompted an open letter from Yann LeCun. Imagine what CVPR would look like in 2024 if neural networks were still not allowed, and the papers were still about hand-designing vision filters rather than allowing them to be learned. The same holds true for robotics conferences in the early 2010s, where rigid, hand-engineered, step-by-step pipelines of vision-model-plan-execute were favoured over the end-to-end systems pioneered by Sergey Levine, Pieter Abbeel, Chelsea Finn, and others.

"Agents", as they are today, feel much the same as symbolism did in the 1960s, and vision filters and robotics did in the early 2010s. The data buses are very sparse, with a very restrictive data format between nodes of the system (natural language), with a lot of human engineering involved. It's no wonder that these systems are often very brittle. If we've learned anything from the past few decades, it's that we should "let the data speak for itself" where possible, and avoid distorting systems with our own biases for how the reasoning "should" be performed. It's also taught us that we should take as much inspiration from mother nature as possible, and the different regions of our brain do not talk to one another via a singular neurons transmitting english language back and forth.

However, unfortunately the last few years have seen a dramatic shift away from open-source models to black-box ones. From this perspective, agents are simply "the best we can do" to push the performance of these black-box systems, given that their only input-output representation is natural language. To achieve truly capable systems, what we really need are systems trained end-to-end on the tasks at hand, with multi-step reasoning, planning, and reflection also being emergent properties. We need much higher bandwidth between the different regions involved in the reasoning steps (a neural network), with representations far more expressive than English (latent vector representations). Manually constructed multi-step reasoning is destined to fail in exactly the same way that hand-engineered symbolic reasoning, vision filters, and robotics pipelines failed before it.
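To make this contrast concrete, here is a toy sketch in Python. Everything in it is hypothetical: the plan/execute functions and the random weight matrices are purely illustrative, not a real agent framework or a trained model. The first style wires modules together with lossy English strings, while the second has modules exchange dense latent vectors whose format could in principle be learned end-to-end.

```python
# Toy, hypothetical sketch: a hand-wired agent pipeline vs. end-to-end
# latent communication. Nothing here is a real framework or trained model.
import numpy as np

# --- Style 1: hand-wired "agent" pipeline --------------------------------
# Every node speaks natural language, so the bus between nodes is a single
# lossy string, and the decomposition into steps is fixed by the engineer.
def plan(task: str) -> str:
    return f"1. search for '{task}'; 2. summarize the results"

def execute(plan_text: str) -> str:
    return f"done: {plan_text}"

print(execute(plan("quarterly report")))

# --- Style 2: end-to-end latent communication ----------------------------
# Modules exchange dense vectors instead of strings. The "format" on the
# bus is learned rather than dictated, and gradients can flow end to end.
rng = np.random.default_rng(0)
W_encode = rng.normal(size=(8, 16))  # task embedding -> latent message
W_decode = rng.normal(size=(16, 4))  # latent message -> action logits

task_embedding = rng.normal(size=8)                  # stand-in for a learned embedding
latent_message = np.tanh(task_embedding @ W_encode)  # high-bandwidth bus
action_logits = latent_message @ W_decode            # downstream module
print(action_logits)
```

The point of the sketch is the bus, not the modules: a 16-dimensional float vector already carries far more information per hop than a sentence, and unlike the string interface, nothing about its format was decided by a human.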

Society

A slight counter-argument to the above is that human civilization is itself an example of collective emergent intelligence, where relatively sparse data buses (our five senses) and rigid hierarchies are sufficient. Humanity is able to achieve amazing feats via specialization and collaboration, with very sparsely connected nodes (maybe a few hundred direct acquaintances per person). Agentic systems could therefore have a role to play at the very highest levels of collective intelligence, in the same way that individual people have a role to play in the collective intelligence of society.

However, this is generally not how agents are being deployed today. In the office, we do not need a whole team of 20 people to search a file directory, retrieve a file, and summarize its key points, with each person doing a hyper-specific and simple task. One person can handle all of this, planning and performing actions based on internal hierarchies and representations.

Further, with regards to society, who is to say that humanity wouldn't function much better if we were all more deeply connected? The limitations of the physical world make evolution a very restrictive algorithm, and humanity is full of conflict and disarray. Thinking of humanity as a functioning system, you could argue that it is incredibly brittle and chaotic. Perhaps the roles, hierarchies, and communication channels that humanity has organized itself into would function much more smoothly if they were more densely connected, and could also be learned end-to-end directly from data. Maybe Unity from Rick and Morty knows what she's talking about.

A New Programming Language

Taking a more favourable view of agentic systems, one clear benefit is control. This is the same argument that was made for symbolic reasoning, hand-engineered vision filters, and multi-step robotics pipelines. In general, when it comes to intelligence, there is always a trade-off between capability and explainability. It's for this reason that symbolic reasoning is still used in calculators, hand-engineered edge detectors and morphological operators are still used in medical imaging, and observe-plan-execute systems are still deployed in factory robotics. Similarly, while hand-engineered agent flows will never reach the same levels as implicit, emergent planning and reasoning skills, they will always give us more explainability. As in the prior examples, there will be some applications where explainability is more important than capability, perhaps mandating the use of "agents" as they are today.
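To illustrate the explainability side of this trade-off, here is a minimal sketch of a hand-engineered Sobel edge detector (the toy image is purely illustrative). Every number in the kernels was chosen by a human, so its behaviour can be fully explained, in exactly the sense that a learned filter bank cannot.

```python
# Minimal sketch of a hand-engineered Sobel edge detector. Every number in
# the kernels is human-chosen, so the behaviour is fully explainable.
import numpy as np
from scipy.ndimage import convolve

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
sobel_y = sobel_x.T

image = np.zeros((8, 8))
image[:, 4:] = 1.0                # toy image with a single vertical edge

gx = convolve(image, sobel_x)     # response to horizontal intensity change
gy = convolve(image, sobel_y)     # response to vertical intensity change
edges = np.hypot(gx, gy)          # gradient magnitude = edge strength
print(edges.round(1))
```

If this detector misfires, you can point at the exact kernel weight responsible; that is the kind of guarantee that keeps hand-engineered components alive in safety-critical settings, and the same logic applies to hand-wired agent flows.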

Thoughts on the Future

Firstly, I think it's only a matter of time until we get much more capable, emergent multi-step reasoning, all learned from data. Presumably GPT-5 will have sophisticated multi-step reflection, reasoning, and planning behind the API, but I'm more excited by what the open community can contribute here, which is increasingly possible with the rise of powerful open-source LLMs such as Llama 3.

With regards to agents as they are today, I think of them in the same way I think of the vision filters of 2010. They certainly do not lie on the path to AGI, but they might have an ongoing role to play in future applications where explainability is more important than system capability.

Do you agree? Let me know what you think!

About the Author
Daniel Lenton
Unify | Founder and CEO

Prior to founding Unify, Dan was a PhD student in the Dyson Robotics Lab and also worked at Amazon. He completed his Master's in Mechanical Engineering at Imperial College, with Dean's List recognition.

