Legg, Chapter 5

Limits of Computational Agents

Fundamental barriers to building powerful AI: prediction complexity, hard sequences, and Godel's shadow.

Prerequisites: Chapter 2 (Kolmogorov complexity, Solomonoff induction).
9
Chapters
1
Simulations
9
Quizzes

Chapter 0: The Question

AIXI is the theoretical ideal, but it requires infinite computation. Can we build practical agents that approximate its power? Or are there fundamental barriers?

This chapter delivers sobering news. We study computable sequence predictors — the simplest possible version of the problem (no actions, just prediction). Even here, fundamental limitations emerge. Since sequence prediction lies at the heart of reinforcement learning, these limits apply to all computable AI agents.

The three results: (1) Very powerful predictors must themselves be very complex. No simple algorithm can be truly general. (2) Some computable sequences are extremely hard to predict, even with unlimited computation time and data. (3) Beyond a certain complexity, mathematical analysis itself breaks down — Godel incompleteness prevents us from even proving that powerful algorithms work.
Check: Why do limits on sequence prediction matter for all of AI?

Chapter 1: Setup

We consider computable binary sequences: infinite sequences that can be generated by some program on a universal Turing machine. The Kolmogorov complexity K(ω) is the length of the shortest such program.

A computable predictor p takes the sequence so far and outputs its prediction for the next bit. We say p learns to predict ω if there exists a point after which p always predicts correctly: ∀n ≥ m : p(x1:n) = xn+1.

An important distinction: we give the predictor unlimited computation time and memory. The only constraint is that it must be a computable function. Even with these generous assumptions, fundamental limitations arise.

Check: What does it mean for a predictor to "learn to predict" a sequence?

Chapter 2: Predicting Any Sequence

Good news first: every computable sequence can be predicted by at least one predictor, and that predictor need not be significantly more complex than the sequence.

Lemma: ∀ω ∈ C, ∃p ∈ P(ω) : K(p) ≤ K(ω) + c. Every computable sequence has a predictor no more complex than the sequence itself (up to a constant).

The proof is constructive: take the program q that generates ω, build a predictor that runs q to generate x1:n+1 and returns xn+1. Simple.

Remarkably, even very simple predictors can learn arbitrarily complex sequences. A predictor that searches for repeating patterns (an "eventually periodic" predictor) can learn sequences of any Kolmogorov complexity, as long as the sequence eventually becomes periodic.

But the reverse is also true: for every predictor, there exists a computable sequence it cannot predict.

The adversary lemma: For any computable predictor p, there constructively exists a computable sequence ω such that p never learns to predict ω, and K(ω) ≤ K(p) + c. The adversary simply does the opposite of whatever p predicts.

This means no single predictor can handle all computable sequences. Universal prediction is impossible for computable predictors — unlike for the incomputable Solomonoff predictor, which handles all of them.

Check: Can a single computable predictor learn to predict all computable sequences?

Chapter 3: Predicting Simple Sequences

Since predicting all computable sequences is impossible, what about predicting all simple ones — those with Kolmogorov complexity at most n?

Define Cn = {ω : K(ω) ≤ n} and Pn = the set of predictors that can learn to predict every sequence in Cn.

The complexity barrier (Theorem 5.3.3): ∀n : p ∈ Pn ⇒ K(p) ≥ n. To predict all sequences of complexity at most n, the predictor must itself have complexity at least n. There are no simple but powerful predictors.

This is a deeply negative result. It rules out the upper-left corner of the "complexity vs. power" diagram: you cannot have a simple algorithm that is also powerful. The most general prediction algorithms must themselves be very complex.

Check: What does the complexity barrier tell us about the design of general AI?

Chapter 4: Prediction Complexity

Legg introduces a new complexity measure: the prediction complexity K̇(ω) of a sequence ω is the length of the shortest predictor that can learn to predict it:

K̇(ω) := min { ℓ(p) : p ∈ P(ω) }

This is different from Kolmogorov complexity K(ω), which measures the length of the shortest generator. A sequence might be hard to generate (high K) but easy to predict (low K̇), or vice versa.

From the results above: 0 ≤ K̇(ω) ≤ K(ω) + c. The prediction complexity never exceeds the Kolmogorov complexity by more than a constant. But it can be much lower: a repeating string has bounded K̇ but potentially unbounded K.

Check: How does prediction complexity K̇ relate to Kolmogorov complexity K?

Chapter 5: Hard Sequences

Do sequences with high prediction complexity actually exist? Yes. For every n, there exist computable sequences where K̇(ω) is close to K(ω), and K(ω) > n.

Furthermore, sequences with high K̇ are extremely expensive to compute. If a sequence can be generated quickly (in time polynomial in n), then K̇ is essentially 0 — a simple predictor that simulates all fast programs will learn it.

Only sequences that require exponential (or worse) computation time to generate can have high prediction complexity. These sequences exist, but they push the limits of what can be practically computed, even with unlimited time.

Check: What kind of sequences have high prediction complexity?

Chapter 6: Godel's Limit

The most devastating result: even if powerful prediction algorithms exist, we may not be able to find them or prove they work.

Theorem 5.6.1: In any consistent formal axiomatic system F, there exists a threshold m such that for all n > m and all predictors p ∈ Pn, the statement "p ∈ Pn" cannot be proven within F.

In other words: beyond a certain complexity, you cannot mathematically prove that a predictor is powerful. This is a direct analogue of Godel's incompleteness theorem, applied to prediction algorithms.

The practical implications are profound. Even if we write a very powerful AI algorithm, we may not be able to prove it works. Mathematical analysis, our most powerful tool for understanding algorithms, has a hard ceiling. Beyond that ceiling, algorithm development must become mainly experimental.

The triple bind: (1) Powerful predictors must be complex (Theorem 5.3.3). (2) Some sequences can only be predicted by complex predictors (hard sequences). (3) Beyond moderate complexity, we cannot prove that algorithms work (Godel). The region of "simple, powerful, provable" algorithms is fundamentally empty.
Check: What does Godel incompleteness imply for powerful AI algorithms?

Chapter 7: Implications

These results carry over directly to reinforcement learning. At the heart of RL lies prediction: an agent must predict future rewards to choose good actions. The limits on predictors are therefore limits on RL agents.

Specifically: very powerful RL algorithms must be complex (no simple general learner), some environments require complex agents to learn (no universally easy problems), and beyond moderate power, we cannot prove convergence (the theory breaks down).

What can we do? Several paths forward:

Check: Given the fundamental limits, what is the most promising path forward for practical AI?

Chapter 8: Summary

No Universal Predictor
For every predictor, there is a sequence that defeats it
Complexity Barrier
Powerful predictors must be complex — no simple shortcuts
Hard Sequences
Some sequences require complex predictors — no universal easy problems
Godel's Ceiling
Beyond moderate complexity, proofs of correctness are impossible

These results constrain our ambitions but do not destroy them. Practical AI can and does make progress — just not through simple, provably optimal algorithms. The path forward is through clever, statistically principled heuristics. Chapter 6 provides one example.

Check: What is the fundamental tension revealed by this chapter?