Legg, Chapter 4

Universal Intelligence Measure

Turning the informal definition of intelligence into a precise mathematical equation.

Prerequisites: Chapters 1-3.

Chapter 0: The Goal

We have an informal definition of intelligence: an agent's ability to achieve goals in a wide range of environments. We have AIXI, a theoretical model of optimal intelligence. Now we flip the idea on its head: instead of using universal AI theory to build intelligent agents, we use it to measure intelligence.

The vision: A single equation that takes any agent π as input and outputs a real number Υ(π) representing its intelligence. An equation so general it applies to humans, animals, algorithms, robots — anything that interacts with environments. The holy grail of intelligence measurement.
Check: What is the goal of this chapter?

Chapter 1: Formalising the Definition

Our informal definition has three pieces: agents, environments, and goals. Let's formalise each one.

Agents and environments communicate through the agent-environment model from Chapter 2. The agent sends actions, the environment returns observations and rewards. The reward signal implicitly defines the goal: the agent tries to maximise cumulative reward.

"Wide range of environments" means we consider the space of all computable environments with bounded reward sum (the set E). We require environments to be computable because incomputable environments cannot be simulated or tested. The bounded reward sum condition keeps every value function finite and lets each environment express its own temporal preference through how it spreads reward over time, so no separate discounting scheme is needed.

"Ability to achieve" means expected performance: the value function V_μ^π, the expected total reward that agent π collects in environment μ.

The remaining question: how do we combine performance across infinitely many environments into a single number? We cannot use a uniform distribution, because none exists over a countably infinite set. Instead, we weight each environment by 2^{-K(μ)}, where K(μ) is the Kolmogorov complexity of μ: the length in bits of the shortest program that computes it. Simple environments count more.

Why weight by complexity? Occam's razor. A simple environment that always gives maximal reward is more likely than a complex one that does the same. An agent that solves simple problems but fails at complex ones should get more credit than one that solves only complex problems — because the simple problems are more probable.
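To make the weighting concrete, here is a minimal sketch. True Kolmogorov complexity is incomputable, so the description lengths below are invented stand-ins, and the environment names are hypothetical.

```python
# Illustrative only: K(μ) is incomputable, so these description
# lengths (in bits) are invented stand-ins, not measured values.
assumed_complexity_bits = {
    "always_reward_1": 3,
    "repeat_last_action": 5,
    "alternate_actions": 6,
    "chess": 40,
}


def occam_weight(k_bits: int) -> float:
    """Weight 2^{-K} given to an environment whose complexity is K bits."""
    return 2.0 ** (-k_bits)


weights = {name: occam_weight(k) for name, k in assumed_complexity_bits.items()}

# Print the environments from heaviest to lightest weight.
for name, w in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name:20s} K={assumed_complexity_bits[name]:2d}  weight={w:.3e}")

total_weight = sum(weights.values())
```

Each extra bit of description length halves the weight, so the 40-bit stand-in for chess contributes almost nothing next to the 3-bit trivial environment.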
Check: Why do we weight environments by 2^{-K(μ)} instead of uniformly?

Chapter 2: The Equation

Bringing everything together:

Universal Intelligence: The universal intelligence of an agent π is its expected performance with respect to the universal distribution 2^{-K(μ)} over the space of all computable reward-summable environments E:
Υ(π) := ∑_{μ ∈ E} 2^{-K(μ)} V_μ^π = V_ξ^π

The final equality is remarkable: because the value function is linear in the environment, the weighted sum collapses to the agent's expected performance in a single environment, the universal mixture ξ. This is the very quantity that AIXI is defined to maximise, so measuring intelligence and defining the optimal agent rest on the same function.
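The step behind the final equality can be spelled out. This is a sketch, assuming ξ is the mixture ξ := ∑_{μ ∈ E} 2^{-K(μ)} μ from the earlier chapters and that the value function is the expected total reward:

```latex
\begin{align*}
V_{\xi}^{\pi}
  &= \mathbb{E}_{\xi}^{\pi}\!\left[\sum_{i=1}^{\infty} r_i\right]
   && \text{definition of the value function} \\
  &= \sum_{\mu \in E} 2^{-K(\mu)}\,
     \mathbb{E}_{\mu}^{\pi}\!\left[\sum_{i=1}^{\infty} r_i\right]
   && \text{since } \xi = \sum_{\mu \in E} 2^{-K(\mu)} \mu
      \text{ and expectation is linear in the measure} \\
  &= \sum_{\mu \in E} 2^{-K(\mu)}\, V_{\mu}^{\pi}
   = \Upsilon(\pi).
\end{align*}
```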

Let's unpack what each part captures from our informal definition:

| Informal | Formal |
| --- | --- |
| "Agent" | π: any function from histories to actions |
| "Environments" | μ ∈ E: all computable reward-summable environments |
| "Goals" | Implicit in the reward structure of each μ |
| "Ability to achieve" | V_μ^π: expected total reward |
| "Wide range" | ∑_{μ ∈ E} 2^{-K(μ)}: weighted sum over all environments |
Check: What does the equation Υ(π) = V_ξ^π tell us?

Chapter 3: The Random Agent

A random agent π_rand chooses uniformly random actions. In most environments, it will fail to exploit any regularities, so V_μ^{π_rand} will be low compared to other agents. Therefore Υ(π_rand) is low.

But wait: some environments give high reward no matter what the agent does (imagine an environment that always gives reward 1 regardless of actions). For these, even the random agent scores well. Such trivial environments are simple (short programs), so their weight 2^{-K(μ)} is relatively large. But every agent collects the same reward there, so these environments raise all scores equally and give the random agent no advantage over anyone else.
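A toy calculation shows why action-independent environments cannot separate agents. All complexities and values below are invented for illustration, and the agent and environment names are hypothetical.

```python
# Hypothetical numbers: complexities (bits) and expected total rewards
# are invented purely to illustrate the argument.
environments = {
    # name: (assumed complexity in bits, {agent: expected total reward})
    "always_reward": (3, {"random": 1.0, "learner": 1.0}),
    "needs_pattern": (6, {"random": 0.5, "learner": 0.9}),
}


def toy_upsilon(agent: str) -> float:
    """Complexity-weighted sum of the agent's values over the toy set."""
    return sum(2.0 ** (-k) * values[agent] for k, values in environments.values())


def contribution_gap(env_name: str) -> float:
    """How much one environment adds to (learner score - random score)."""
    k, values = environments[env_name]
    return 2.0 ** (-k) * (values["learner"] - values["random"])


gap_trivial = contribution_gap("always_reward")
gap_pattern = contribution_gap("needs_pattern")
score_gap = toy_upsilon("learner") - toy_upsilon("random")
```

The trivial environment carries the larger weight, yet its contribution to the gap between the two agents is exactly zero. The whole difference in score comes from the environment where behaviour matters.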

Check: Why does a random agent have low universal intelligence?

Chapter 4: Specialist Agents

IBM's Deep Blue plays chess at superhuman level. Its value function V_{μ_chess}^{π_dblue} is extremely high. But 2^{-K(μ_chess)} is small (chess is complex), and for all other environments μ its value V_μ^{π_dblue} is very low. So Υ(π_dblue) is very low.

The counter-intuitive result: A very simple agent π_simple that can only predict trivial sequences (0000... and 1111...) has higher universal intelligence than Deep Blue. Why? Because the environments where 0000... appears have very short programs, so their weight 2^{-K(μ)} is very high. Deep Blue fails at these trivial tasks because it only plays chess. Universal intelligence strongly emphasises the ability to solve simple problems.
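The arithmetic behind this ranking can be sketched with made-up numbers. The complexities and values below are assumptions chosen only to show the effect. They are not estimates for the real Deep Blue.

```python
# All numbers are invented for illustration; real K values are incomputable.
assumed_complexity_bits = {"all_zeros": 4, "all_ones": 4, "chess": 40}

# Hypothetical expected total reward (between 0 and 1) per agent and environment.
values = {
    "simple_predictor": {"all_zeros": 1.0, "all_ones": 1.0, "chess": 0.0},
    "chess_specialist": {"all_zeros": 0.0, "all_ones": 0.0, "chess": 1.0},
}


def toy_upsilon(agent: str) -> float:
    """Complexity-weighted sum of the agent's values over the three environments."""
    return sum(
        2.0 ** (-k) * values[agent][env]
        for env, k in assumed_complexity_bits.items()
    )


simple_score = toy_upsilon("simple_predictor")      # 2 * 2^-4 = 0.125
specialist_score = toy_upsilon("chess_specialist")  # 2^-40
```

Even with a perfect chess score, the specialist's total is 2^-40, many orders of magnitude below the 0.125 earned on two trivial environments.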

This tells us something about current AI: by focusing on increasingly specialised systems, we have in some sense been going backwards in terms of universal intelligence. Under Υ, a system that handles basic pattern recognition across many domains scores higher than one that dominates a single complex domain.

Check: Why does universal intelligence rank Deep Blue lower than a simple general-purpose learner?

Chapter 5: Simple Agents

A general but simple agent π_basic builds a table of observation-action pairs and keeps statistics on the rewards that follow. It takes the best known action 90% of the time and explores the other 10%. In most environments it will find some structure to exploit, so V_μ^{π_basic} > V_μ^{π_rand} there. Thus Υ(π_basic) > Υ(π_rand).
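A minimal sketch of such a table-based agent, compared with a random agent, might look as follows. The toy environment (reward 1 when the action copies the current observation), the finite horizon, and all function names are assumptions made for this illustration.

```python
import random
from collections import defaultdict


def toy_environment(observation: int, action: int) -> float:
    """Hypothetical environment: reward 1 when the action copies the observation."""
    return 1.0 if action == observation else 0.0


def run_random_agent(steps: int, rng: random.Random) -> float:
    """Average reward of an agent that picks actions uniformly at random."""
    total = 0.0
    for _ in range(steps):
        observation = rng.randint(0, 1)
        action = rng.randint(0, 1)
        total += toy_environment(observation, action)
    return total / steps


def run_basic_agent(steps: int, rng: random.Random, explore_prob: float = 0.1) -> float:
    """Average reward of a table-based agent that explores 10% of the time."""
    # Statistics table: (observation, action) -> [reward sum, visit count].
    stats = defaultdict(lambda: [0.0, 0])
    total = 0.0
    for _ in range(steps):
        observation = rng.randint(0, 1)

        def mean_reward(a: int) -> float:
            reward_sum, count = stats[(observation, a)]
            return reward_sum / count if count else 0.0

        if rng.random() < explore_prob:
            action = rng.randint(0, 1)             # explore
        else:
            action = max((0, 1), key=mean_reward)  # exploit best known action
        reward = toy_environment(observation, action)
        stats[(observation, action)][0] += reward
        stats[(observation, action)][1] += 1
        total += reward
    return total / steps


avg_random = run_random_agent(2000, random.Random(0))
avg_basic = run_basic_agent(2000, random.Random(0))
print(f"random agent: {avg_random:.3f}   basic agent: {avg_basic:.3f}")
```

The random agent hovers around one half, while the table-based agent quickly learns the copy rule and loses reward only on its exploration steps.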

Extending π_basic to use more history improves it further. An agent π_2back that conditions on the last two observations finds patterns that π_basic misses, like the alternating-action environment.

An agent π_2forward that looks one step into the future (maximising not just immediate reward but also next-step reward) is even more powerful. It can see that climbing a hill (zero immediate reward) leads to sliding down (high reward next step), a pattern that greedy agents miss.

The playground slide: An agent at the bottom of a slide can rest (reward 2^{-k-4}) or climb (reward 0). At the top, it slides down (reward 2^{-k}). Here k is the current interaction cycle, so rewards shrink over time and the total stays bounded. A greedy agent always rests. A forward-looking agent climbs, getting higher total reward. The more history and lookahead an agent uses, the more environments it can master, and the higher its universal intelligence.
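The slide example can be simulated directly. This sketch assumes that k counts interaction cycles from 1, that the agent slides from the top whatever action it picks, and that the agent knows the environment's rules and plans by brute force. The function names are hypothetical.

```python
from itertools import product

ACTIONS = ("rest", "climb")


def step(position: str, action: str, k: int):
    """Toy slide environment. Returns (reward, next_position) for cycle k."""
    if position == "top":
        return 2.0 ** (-k), "bottom"       # sliding down pays 2^{-k}
    if action == "rest":
        return 2.0 ** (-k - 4), "bottom"   # resting pays a small reward
    return 0.0, "top"                      # climbing pays nothing now


def plan_return(position: str, k: int, plan) -> float:
    """Total reward from following a fixed action sequence starting at cycle k."""
    total = 0.0
    for offset, action in enumerate(plan):
        reward, position = step(position, action, k + offset)
        total += reward
    return total


def run(lookahead: int, cycles: int = 40) -> float:
    """Agent that executes the first action of the best plan of the given length."""
    position, total = "bottom", 0.0
    for k in range(1, cycles + 1):
        best_plan = max(
            product(ACTIONS, repeat=lookahead),
            key=lambda plan: plan_return(position, k, plan),
        )
        reward, position = step(position, best_plan[0], k)
        total += reward
    return total


greedy_total = run(lookahead=1)    # only immediate reward: always rests
forward_total = run(lookahead=2)   # also counts next-step reward: climbs, then slides
```

Under these assumptions the greedy agent's total approaches 1/16, while the two-step planner alternates climbing and sliding and approaches 1/3.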
Check: What is the key difference between a greedy agent and a forward-looking agent?

Chapter 6: The AIXI Upper Bound

By construction, AIXI maximises Υ. No agent can have higher universal intelligence. This gives us the upper bound on intelligence:

Υ̂ := max_π Υ(π) = Υ(π^ξ)

This upper bounds the intelligence of all future machines, no matter how powerful their hardware and algorithms. Of course, AIXI is not computable, so no real machine can achieve this bound. But it tells us the theoretical ceiling.
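The "by construction" step can be written out. This is a sketch, assuming the usual definition of AIXI as the policy π^ξ that maximises expected total reward under the mixture ξ:

```latex
\begin{align*}
\pi^{\xi} &:= \arg\max_{\pi} V_{\xi}^{\pi}, \\
\Upsilon(\pi^{\xi}) &= V_{\xi}^{\pi^{\xi}}
  = \max_{\pi} V_{\xi}^{\pi}
  = \max_{\pi} \Upsilon(\pi)
  = \hat{\Upsilon}.
\end{align*}
```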

Where would a human fall? For simple environments, a human should identify structure and exploit it. For complex environments (say, one that involves processing sensory data in formats the brain was not designed for), a human might perform poorly compared to a specialised algorithm. Perhaps the universal intelligence of a human is not that high compared to some machine learning algorithms? We genuinely don't know.

Check: What does the upper bound Υ̂ = Υ(π^ξ) represent?

Chapter 7: Properties

How does universal intelligence compare to the desirable properties of an intelligence measure?

| Property | Status |
| --- | --- |
| Valid | Yes: derived step by step from mainstream definitions of intelligence |
| Informative | Yes: assigns a real number, enabling comparison of any two agents |
| Wide range | Yes: spans from π_rand to AIXI |
| General | Yes: hard to imagine a broader metric without contradicting the Church-Turing thesis |
| Dynamic | Yes: measures learning and adaptation over time, not one-shot problems |
| Unbiased | Yes: grounded in universal Turing computation, not any particular culture |
| Fundamental | Yes: based on computation and complexity, unlikely to change with technology |
| Formal | Yes: a mathematical equation |
| Practical | No: Kolmogorov complexity is not computable |

The one weakness: impracticality. But this mirrors the definition of randomness — incomputable to verify, yet theoretically fundamental. Future work aims to approximate Υ using computable complexity measures like Levin's Kt complexity.
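For reference, Levin's Kt trades program length against running time, which is what makes it computable. Below is the standard definition, with U a fixed universal Turing machine, ℓ(p) the length of program p in bits, and t_U(p) its running time. The second line only illustrates how such a measure could stand in for K. It is a hypothetical variant, not a formula taken from the thesis.

```latex
\begin{align*}
Kt(x) &:= \min_{p} \left\{ \ell(p) + \log_2 t_U(p) \;:\; U(p) = x \right\}, \\
\Upsilon_{Kt}(\pi) &:= \sum_{\mu \in E} 2^{-Kt(\mu)}\, V_{\mu}^{\pi}
  \qquad \text{(hypothetical variant, for illustration only)}.
\end{align*}
```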

Check: What is the main practical limitation of universal intelligence?

Chapter 8: Criticisms

Legg addresses common criticisms head-on:

"It's just a few equations." Yes, but so is E = mc^2. The work is in showing that the equation correctly captures the concept. That required surveying 70+ definitions, building the agent-environment framework, and connecting it to universal AI theory.

"It's just reinforcement learning." The equation goes far beyond RL. It uses universal Occam-weighted priors, considers all computable environments, and produces an absolute measure. Simply writing down the RL framework does not give you universal intelligence.

"The universe might not be computable." There is no evidence of incomputable physical processes. Even if some exist, computable approximations would still work extremely well given that all known physics is computable.

"What about consciousness/creativity/soul?" These matter only insofar as they measurably affect performance. If understanding has a measurable impact on an agent's performance, then Υ is partly a measure of understanding. If not, it is irrelevant to intelligence in any practical sense.

"No Free Lunch theorem makes this impossible." NFL applies to uniform distributions over problems. Universal intelligence uses a highly non-uniform distribution (Occam's razor). The NFL theorem does not apply.

Check: Why doesn't the No Free Lunch theorem undermine universal intelligence?
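The contrast between uniform and Occam-style weighting can be checked by brute force on short binary sequences. In this sketch the two predictors and the weighting that halves for every bit change are hypothetical stand-ins. The halving rule is a crude proxy for 2^{-K}, not Kolmogorov complexity itself.

```python
from fractions import Fraction
from itertools import product

N = 6  # sequence length; small enough to enumerate all 2^N sequences


def repeat_last(prefix):
    """Predict that the next bit equals the previous one (0 at the start)."""
    return prefix[-1] if prefix else 0


def flip_last(prefix):
    """Predict that the next bit differs from the previous one (0 at the start)."""
    return 1 - prefix[-1] if prefix else 0


def uniform(seq):
    """No Free Lunch setting: every sequence is equally likely."""
    return Fraction(1)


def favours_few_switches(seq):
    """Crude stand-in for 2^{-K}: halve the weight for every bit change."""
    switches = sum(seq[i] != seq[i + 1] for i in range(N - 1))
    return Fraction(1, 2 ** switches)


def expected_accuracy(predictor, weight):
    """Weighted fraction of bits predicted correctly over all length-N sequences."""
    score, total_weight = Fraction(0), Fraction(0)
    for seq in product((0, 1), repeat=N):
        w = weight(seq)
        correct = sum(predictor(seq[:i]) == seq[i] for i in range(N))
        score += w * Fraction(correct, N)
        total_weight += w
    return score / total_weight


uniform_repeat = expected_accuracy(repeat_last, uniform)             # 1/2
uniform_flip = expected_accuracy(flip_last, uniform)                 # 1/2
occam_repeat = expected_accuracy(repeat_last, favours_few_switches)  # 23/36
occam_flip = expected_accuracy(flip_last, favours_few_switches)      # 13/36
```

Under the uniform weighting both predictors score exactly one half, which is the No Free Lunch situation. Once simple sequences count more, the predictor that bets on regularity pulls ahead.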

Chapter 9: Summary

Informal Definition
An agent's ability to achieve goals in a wide range of environments
↓ formalise each piece
Υ(π) = ∑_{μ ∈ E} 2^{-K(μ)} V_μ^π
Weighted sum of performance across all computable environments
↓ equals
V_ξ^π
Performance under the universal prior = the value function that AIXI maximises

This equation is the central contribution of the thesis. It turns the age-old question "what is intelligence?" into a mathematical statement. It correctly ranks agents from random to optimal, emphasises generality over specialisation, and connects to the deepest ideas in theoretical computer science.

The next challenge: can we approximate this measure? Chapter 5 of the thesis will show that fundamental limits on computation constrain how closely any real algorithm can approach AIXI.

Check: What is the central equation of the thesis?