
Agents, worlds and hands

by Nicklas Berild Lundblad
22. Sep 2025

Note 5 from the research journal by Nicklas Lundblad: Through the hand, the brain learns the affordances of the world—how objects resist, yield, or break—and refines its models accordingly.

Note 5.

In a recent paper from Google DeepMind, the authors note that any sufficiently advanced agent (or delegate, as we noted before) needs to have some kind of world model. They write: 

“Are world models a necessary ingredient for flexible, goal-directed behaviour, or is model-free learning sufficient? We provide a formal answer to this question, showing that any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment. We show that this model can be extracted from the agent’s policy, and that increasing the agent’s performance or the complexity of the goals it can achieve requires learning increasingly accurate world models.”

There is no such thing as agency outside of a world. This sounds like a trivial observation, but it might actually be much more than that: if this is true, the agency we have is also shaped by the world in ways that subtly limit what we can and cannot do, and what we can want and not want. 

This in turn is interesting, because it suggests a definition of intelligence that is operationalizable: agency/world-fit.

Something that has a great fit between what it can want and the world that it is in, and knows the full range of agency options afforded by a world, would be very intelligent relative to that world. And this is an axis of intelligence that is quite interesting to think through. Often when we think about intelligence we think about the ability to solve problems, and often those problems are mathematical-logical problems. But what we are looking for here is different - and it is not a new distinction. 
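The fit could even be given a toy numerical form. Below is a minimal sketch; the function name, the goal sets, and the overlap score are all invented for illustration and come from neither this note nor the DeepMind paper:

```python
# Toy operationalization of intelligence as agency/world-fit.
# The scoring rule (share of afforded goals the agent can achieve)
# is an assumption made purely for illustration.

def agency_world_fit(achievable_goals, afforded_goals):
    """Score an agent against a world: the share of goals the world
    affords that the agent can actually achieve."""
    afforded = set(afforded_goals)
    if not afforded:
        return 1.0  # a world that affords nothing is trivially matched
    return len(afforded & set(achievable_goals)) / len(afforded)

world = {"grasp", "stack", "open-door", "pour"}   # what this world affords
agent = {"grasp", "stack", "navigate"}            # what this agent can do
print(agency_world_fit(agent, world))  # → 0.5
```

On this score an agent that can achieve everything its world affords scores 1.0, however simple that world is - which is exactly the point of measuring intelligence relative to a world.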

We find this distinction in Aristotle - who speaks of theoria and phronesis. Aristotle’s distinction between theoria and phronesis captures two complementary dimensions of intelligence that map directly onto the idea of agency/world-fit. Theoria refers to the contemplative or theoretical intelligence that seeks truth through understanding the stable structures of the world—what an agent might call its world model. It is the capacity to grasp the underlying patterns that make prediction and explanation possible. Phronesis, by contrast, is practical intelligence: the situated ability to act well within the contingencies of a specific world. It governs choice, timing, and adaptation, ensuring that theoretical insight translates into effective and appropriate action. In the context of artificial apprentices, theoria corresponds to an agent’s learned model of its environment, while phronesis expresses how well that model is put to use—how skillfully the agent adjusts its behavior to maintain a high degree of agency/world-fit.

So, then, it is not enough to understand the world - you also have to be able to act in it.

Now, there may be an interesting trade-off here: the complexity of your world model may actually limit your agency/world-fit, since you have to act on a specific timescale in order to respond to the world. Beyond a certain threshold, epistemic richness eats into pragmatic capability. Evolution has tuned that trade-off over time, and our fitness could be thought of as our ability to combine a world model and agency/world-fit in ways that allow us to act effectively. If there is a Pareto frontier here, that would be interesting - since we might then have to give up some world-model complexity for better agency/world-fit. 
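One way to make the trade-off concrete is a toy Pareto computation. Everything below - the richness scale, the latency model `1 / (1 + 0.5 * richness)`, and the numbers - is invented for illustration:

```python
# Toy Pareto frontier over (model richness, responsiveness).
# Richer world models take longer to consult, so responsiveness
# falls as richness rises; the exact curve is an assumption.

def pareto_frontier(agents):
    """Return agents that no other agent dominates on both axes."""
    frontier = []
    for a in agents:
        dominated = any(
            b["richness"] >= a["richness"] and b["speed"] >= a["speed"]
            and (b["richness"] > a["richness"] or b["speed"] > a["speed"])
            for b in agents
        )
        if not dominated:
            frontier.append(a)
    return frontier

# Well-tuned agents: every unit of richness costs some speed.
agents = [{"richness": r, "speed": round(1 / (1 + 0.5 * r), 2)}
          for r in range(6)]
# A badly tuned agent: a rich model that is slower than it needs to be.
agents.append({"richness": 2, "speed": 0.3})

for a in pareto_frontier(agents):
    print(a)
```

Every well-tuned agent sits on the frontier, because more richness always buys less speed; only the badly tuned one is dominated and drops out. The open research question in the text is what would push the whole computational frontier outward, past the biological one.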

In an image:

[Figure omitted: the trade-off between world-model complexity and agency/world-fit, with a biological and a computational Pareto frontier.]

What would allow us to get to a place where the computational Pareto frontier moves ahead of the biological one, then? That is a really interesting research question! Evolution achieves the trade-off over time, selecting for a world understanding that optimizes the use of energy and the agency/world-fit in particular environments. The technology we currently use is heavy on theoria but thin on phronesis.

The ultimate example of this may be the hand. As often noted by Elon Musk and others - hands are hard to build! But the hand is what combines a world model with agency/world-fit. The hand embodies a synthesis of theoria and phronesis that machines still struggle to achieve: it is both an instrument of understanding and an instrument of action. Through the hand, the brain learns the affordances of the world—how objects resist, yield, or break—and refines its models accordingly. Every movement is a small hypothesis tested against reality. This continuous calibration between perception and manipulation means that the hand is not merely an actuator but a feedback system that fuses knowing and doing. The philosopher Maurice Merleau-Ponty might say that the hand is where consciousness meets the world, where the abstract model of space becomes tangible through grasp and touch.
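The calibration loop the hand performs can be caricatured in a few lines. The resistance values and the learning rate below are invented; only the loop's shape matters: predict, act, observe the error, update.

```python
# 'Every movement is a small hypothesis tested against reality':
# a caricature of the hand's calibration loop. All numbers are
# assumptions made for illustration.

def grasp(true_resistance, estimate, learning_rate=0.5):
    """One movement: predict, act, observe the error, update the model."""
    error = true_resistance - estimate       # the world pushing back
    return estimate + learning_rate * error  # the refined world model

estimate = 0.0          # the agent's prior model of the object
true_resistance = 1.0   # how the object actually resists
for _ in range(5):
    estimate = grasp(true_resistance, estimate)
print(round(estimate, 3))  # → 0.969, closing in on the true value
```

Each grasp halves the gap between model and world: the hand is the channel through which reality corrects the brain's predictions, movement by movement.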

For artificial agents, developing an equivalent capacity would mean achieving not only fine motor control but also the cognitive plasticity that arises from embodied interaction. Artificial phronesis would require that the system’s world model be dynamically reshaped by its encounters, not simply referenced as a static database of predictions. The goal is not perfect accuracy but adaptive coherence—knowing enough about the world to act meaningfully within it and to revise one’s own model as the world pushes back. As we move toward artificial delegates, the challenge is to bridge this gap: to design systems that not only map the world but live in it, navigating the same tension between complexity and responsiveness that evolution solved through billions of small, embodied experiments.

And maybe this gives us a new Turing test of sorts: when an artificial intelligence can take us by the hand and lead us into a larger, more complex and deeper world. 



Author: Nicklas Berild Lundblad, fellow of practice