The formal definition, the counting argument, incompressible strings, and the algorithmic properties that make C(x) a powerful proof tool.
In the Introduction, we established three foundational ideas:
Chapter 1 of the book formalizes this further, establishing the precise definition (the "plain" version, to distinguish it from prefix and monotone versions that come later) and deriving its key properties.
Let Σ = {0, 1} be the binary alphabet and Σ* the set of all finite binary strings, including the empty string λ. A description mode (decompressor) is a partial computable function D: Σ* → Σ*.
"Partial" means D may not halt on some inputs — and that's fine. A description mode doesn't need to be defined everywhere. "Computable" means there's an algorithm that, given y in the domain of D, eventually halts and returns D(y).
Here l(y) denotes the length of y. If no y maps to x under D, we set CD(x) = +∞.
D1 is not worse than D2 if CD1(x) ≤ CD2(x) + O(1). D is optimal if it is not worse than every other description mode.
From now on, we fix an optimal D and write C(x) instead of CD(x). Everything is "up to O(1)."
This is the most frequently used tool in Kolmogorov complexity. It's beautifully simple.
Proof: A description of length less than n is a binary string of length 0, 1, ..., or n−1. The total number of such strings is 1 + 2 + 4 + ... + 2n−1 = 2n − 1 < 2n. Each description maps to at most one string (D is a function), so at most 2n − 1 strings can have complexity less than n.
Note that this bound has no hidden constant — it holds for any optimal description mode.
Among all 2n strings of length n:
The counting argument also gives us C(x) ≤ l(x) + O(1), since the identity is a valid description mode. Combining: for most n-bit strings, C(x) is close to n.
A string x is called incompressible if C(x) ≥ l(x). These are the strings that contain "maximal information" — no program can describe them more concisely than the string itself.
From the counting argument, we know incompressible strings of every length exist. In fact, most strings are incompressible (or nearly so). But which specific strings are incompressible?
Here's the catch: you can never prove a specific string is incompressible (in general). Remember, C(x) is not computable. Even worse, no consistent formal system F can prove "C(x) ≥ n" for n larger than C(F) + O(1) — the complexity of the formal system itself. This is essentially Gödel's incompleteness theorem, rephrased in terms of Kolmogorov complexity.
The average complexity of strings of length n is n + O(1). Here's why: let ak be the fraction of n-bit strings with complexity n − k. The average deficit from n is ∑ k · ak. Since ak ≤ 2−k (counting argument), this sum is bounded by ∑ k/2k, which converges. So the average complexity is n minus a bounded constant.
Kolmogorov complexity is defined for binary strings, but we often want to talk about the complexity of other objects: natural numbers, graphs, tuples, sets. The non-growth theorem makes this easy.
A natural number n can be encoded as a binary string in several ways:
| Encoding | Example (n=5) | Length |
|---|---|---|
| Binary representation | 101 | log n + O(1) |
| Lexicographic enumeration | 110 (6th string) | log n + O(1) |
| Unary (n ones) | 11111 | n |
Since any two of these encodings are related by computable functions, the complexities they induce differ by at most O(1). So we can speak of C(n) without specifying the encoding.
The key bound: C(n) ≤ log n + O(1). The binary representation of n has length log n + O(1), and C(x) ≤ l(x) + O(1).
But deleting an arbitrary bit (not the last one) can change complexity dramatically. Consider x = 02n (2n zeros). Its complexity is at most C(n) + O(1) ≈ log n. Now flip one bit at some position. There are 2n possible results, and at least one must have complexity ≥ n (by counting). So flipping one bit can increase complexity from log n to n.
Incrementing a number by 1 changes C(n) by at most O(1). This gives a Lipschitz property: |C(m) − C(n)| ≤ c|m − n| for some constant c. In fact, the bound is tighter: |C(m) − C(n)| ≤ 2 log|m − n| + O(1) when m ≠ n.
While C(x) is not computable, it is upper semicomputable (also called enumerable from above). This means there exists a computable function F(x, k) such that:
The sequence F(x, k) decreases monotonically and converges to C(x). You get progressively better upper bounds, but you never know when you've reached the true value.
The algorithm is simple: run all possible programs in parallel. Whenever program y halts and outputs x, you have a new candidate description. Update your best bound: C(x) ≤ l(y). Over time, you discover shorter and shorter programs for x. In the limit, you find the shortest — but the algorithm never announces "I'm done."
Formally, a function f: Σ* → N is upper semicomputable if the set {(x, n) : f(x) < n} is computably enumerable. Equivalently, if there is a decreasing computable sequence converging to f(x) for each x.
The set Sn = {x : C(x) < n} is also computably enumerable (you enumerate it by running all programs and outputting x whenever you find a description of length less than n). But the set {x : C(x) ≥ n} is not computably enumerable in general — this is another face of non-computability.
Theorem 7 is one of the most useful structural results in Kolmogorov complexity theory. To state it, we need the notion of an enumerable family of sets.
A set V of strings is computably enumerable (c.e.) if there is an algorithm that lists all elements of V (possibly with repetitions, possibly never halting if V is infinite). Think of a printer that spits out strings one by one — eventually every element of V appears.
A family of sets Vn (indexed by n) is enumerable if the set {(n, x) : x ∈ Vn} is c.e. This is stronger than each Vn being c.e. individually — we need a uniform enumeration algorithm that works for all n simultaneously.
The key subtlety: knowing that each Vn is c.e. does not make the family enumerable. Consider Vn = {0} if n is in some non-c.e. set S, and Vn = ∅ otherwise. Each Vn is finite (hence c.e.), but the family is not enumerable because enumerating it would solve the membership problem for S.
This theorem says that "having small cardinality" and "having low complexity" are the same thing, for enumerable families.
Part (a) we already know. Part (b) is the powerful direction: any enumerable family of "small" sets is automatically a family of "simple" strings.
Proof of (b): Define a new description mode D'. For strings of length n, use them as descriptions of elements of Vn. Specifically, let xk be the kth element of Vn as enumerated. Let yk be the kth binary string of length n. Set D'(yk) = xk. Since |Vn| < 2n, every element of Vn gets a description of length n. Since D' is computable, there exists c with C(x) ≤ n + c for all x ∈ Vn.
The theorem generalizes: a function f(x) is upper semicomputable iff f(x) ≥ C(x) − O(1) whenever the level sets {x : f(x) < n} form an enumerable family with |{x : f(x) < n}| < 2n. In other words, C(x) is the smallest upper semicomputable function (up to O(1)) with the counting property.
Let's build intuition about how Kolmogorov complexity behaves. The simulation below lets you experiment with strings and see estimates of their complexity.
Of course, we can't compute C(x) exactly. But we can compute rough upper bounds by trying simple compression strategies: run-length encoding, repetition detection, and pattern matching. The simulation uses these heuristics to give a (loose) upper bound.
Enter a binary string or use the generators. The bar shows the ratio of estimated complexity to string length. Truly random strings hover near 100%; patterned strings compress significantly.
| Result | Statement |
|---|---|
| Upper bound | C(x) ≤ l(x) + O(1) |
| Non-growth | C(A(x)) ≤ C(x) + O(1) for computable A |
| Counting | |{x : C(x) < n}| < 2n |
| Existence of incompressibles | For every n, some x of length n has C(x) ≥ n |
| Average complexity | Average C(x) over n-bit strings is n + O(1) |
| Non-computability | C(x) is not computable |
| Upper semicomputability | C(x) is enumerable from above |
| Complexity of numbers | C(n) ≤ log n + O(1) |
| Theorem 7 | Small enumerable families = low complexity |