Kochenderfer et al., Chapter 10

Discrete Reachability

When states are countable, reachability becomes graph search — powerful, elegant, and combinatorially explosive.

Prerequisites: Chapter 8 (Linear Reachability). That's it.
10
Chapters
4+
Simulations
10
Quizzes

Chapter 0: Why Discrete Reachability?

Imagine a self-driving car navigating an intersection. The car has a finite set of decisions: go straight, turn left, turn right, stop. The traffic light has three states: green, yellow, red. Other vehicles occupy discrete lanes. At this level of abstraction, the world is not a continuous differential equation — it is a graph.

Chapters 8 and 9 handled continuous dynamics (differential equations, neural networks). But many systems are naturally discrete, or are best analyzed at a discrete level. Digital controllers execute finite state machines. Protocol specifications describe message sequences. Even continuous systems can be abstracted into discrete graphs by partitioning the state space into cells.

The core shift: In continuous reachability, the tools are linear algebra and calculus (zonotopes, Taylor models). In discrete reachability, the tools are graph algorithms (BFS, DFS, shortest paths) and logic (SAT, model checking). The question is the same — "where can the system go?" — but the mathematical universe changes completely.

Discrete reachability has a major advantage: it is exact. There is no overapproximation. If a state is reachable in the graph, it is truly reachable. If it is not reachable, it is truly unreachable. The challenge is not approximation quality but scalability: a system with n binary state variables has 2n possible states. Exploring them all is infeasible for large n.

Continuous reachability
Tools: calculus, linear algebra. Challenge: overapproximation.
↓ discretize
Discrete reachability
Tools: graph search, logic. Challenge: state space explosion.
ContinuousDiscrete
States: RnStates: finite set S
Dynamics: x' = f(x)Transitions: edges in a graph
Reachable set: curved blobReachable set: subset of nodes
OverapproximateExact (but may be huge)
What is the main advantage of discrete reachability over continuous reachability?

Chapter 1: Graph Formulation

A discrete system is a directed graph G = (V, E). The vertices V are the system states. An edge (u, v) ∈ E means the system can transition from state u to state v in one step. Some edges may carry weights (probabilities, costs).

Notation: We write outneighbors(u) for the set of states reachable from u in one step: {v : (u,v) ∈ E}. We write inneighbors(v) for the set of states that can reach v in one step: {u : (u,v) ∈ E}. These two concepts are the workhorses of forward and backward reachability.

Adjacency Representations

How you store the graph determines the cost of reachability queries:

RepresentationSpaceOutneighbor queryInneighbor query
Adjacency matrixO(|V|2)O(|V|)O(|V|)
Adjacency listO(|V| + |E|)O(degree)O(|V| + |E|) unless stored separately
Both directionsO(|V| + |E|)O(degree)O(degree)

For reachability analysis, we almost always want both forward and backward edges, so we store two adjacency lists: one for outneighbors, one for inneighbors. This doubles the storage but makes both directions equally fast.

Transition Systems vs. Graphs

In verification, the graph often represents a transition system: a tuple (S, S0, T) where S is the state set, S0 ⊆ S is the set of initial states, and T ⊆ S × S is the transition relation. States may have labels (e.g., "safe", "failure"). The reachability question becomes: starting from S0, can the system reach any state labeled "failure"?

Key insight: Reachability on a graph is equivalent to connectivity. The forward reachable set from S0 is exactly the set of vertices reachable from any vertex in S0 via directed paths. This is a foundational graph theory problem, solvable by BFS or DFS in O(|V| + |E|) time.
python
# Simple directed graph using adjacency lists
class DiGraph:
    def __init__(self, n):
        self.n = n
        self.out_adj = [[] for _ in range(n)]
        self.in_adj  = [[] for _ in range(n)]

    def add_edge(self, u, v):
        self.out_adj[u].append(v)
        self.in_adj[v].append(u)
What does outneighbors(u) represent in a directed graph?

Chapter 2: Forward Reachability

The most natural reachability question: starting from a set of initial states S0, which states can the system eventually reach? This is forward reachability, and it is solved by breadth-first search (BFS).

BFS explores states layer by layer. Layer 0 is S0 itself. Layer 1 is everything reachable in one step from S0. Layer 2 is everything reachable in two steps. And so on, until no new states are discovered.

R0 = S0
Rk+1 = Rk ∪ { v : (u, v) ∈ E, u ∈ Rk }

When Rk+1 = Rk, we have found the full forward reachable set. The algorithm terminates in at most |V| steps (since each step adds at least one new state, or we stop).

Key insight: Forward reachability is just BFS with a set of starting nodes instead of one. The total cost is O(|V| + |E|) — linear in the graph size. Each vertex and each edge is examined at most once. This is as efficient as possible for an algorithm that must read the entire reachable portion of the graph.
Grid World Reachability: BFS Explorer

A 7×7 grid with walls (dark), a start cell (teal), and a goal cell (orange). Click Step to expand one BFS layer. Toggle Forward/Backward to switch direction. In Probabilistic mode, cells are colored by occupancy probability.

Layer 0
Speed200ms
What to notice: In forward mode, BFS expands outward from the start cell like a wavefront. Walls block propagation. The BFS layer number equals the shortest path distance from the start. States behind walls may be unreachable entirely. Switch to backward mode (Chapter 3) to see propagation from the goal.
python
from collections import deque

def forward_reachable(graph, initial_states):
    """BFS from a set of initial states. Returns reachable set."""
    visited = set(initial_states)
    queue = deque(initial_states)
    while queue:
        u = queue.popleft()
        for v in graph.out_adj[u]:
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return visited
What determines the BFS layer number of a state in forward reachability?

Chapter 3: Backward Reachability

Forward reachability answers: "what can the system reach?" Backward reachability answers the dual question: "what can reach the target?" Given a set of target states T (perhaps failure states), backward reachability computes the set of all states from which T is eventually reachable.

B0 = T
Bk+1 = Bk ∪ { u : (u, v) ∈ E, v ∈ Bk }

This is BFS on the reverse graph — following edges backward (using inneighbors instead of outneighbors). The set Bk contains all states that can reach T in at most k steps.

Why backward? Often we care about a specific dangerous state ("the car hits a pedestrian") and want to know: from which initial conditions is this danger reachable? Forward search from every possible initial state is wasteful. Backward search from the danger set directly identifies the backward reachable set — the set of states we must avoid or handle with care.

Combining Forward and Backward

The most powerful analysis combines both directions:

Forward from S0
Compute R = states reachable from initial conditions
Backward from T
Compute B = states that can reach the target/failure
Intersection R ∩ B
States that are both reachable AND can lead to failure

If R ∩ B is empty, the system is safe: no initial condition can lead to the failure state. If R ∩ B is non-empty, every state in the intersection lies on a path from S0 to T — a potential failure trajectory.

Key insight: Forward reachability tells you where you can go. Backward reachability tells you where danger comes from. Their intersection tells you what you need to worry about. This bidirectional analysis is more informative than either direction alone and costs only twice the BFS time: O(2(|V| + |E|)).

Try the simulation in Chapter 2: switch to "Backward" mode and watch BFS expand from the goal. The backward reachable set at layer k contains all cells that can reach the goal in exactly k steps. Compare the forward and backward wavefronts — they meet where paths connect start to goal.

python
def backward_reachable(graph, target_states):
    """BFS on reverse graph from target states."""
    visited = set(target_states)
    queue = deque(target_states)
    while queue:
        v = queue.popleft()
        for u in graph.in_adj[v]:  # inneighbors!
            if u not in visited:
                visited.add(u)
                queue.append(u)
    return visited

# Safety check
R = forward_reachable(G, S0)
B = backward_reachable(G, T_fail)
if R & B:
    print("UNSAFE: failure is reachable")
else:
    print("SAFE: no path from S0 to failure")
If the forward reachable set R from initial states and the backward reachable set B from failure states have an empty intersection, what can we conclude?

Chapter 4: Counterexample Search

Knowing that a failure state is reachable is only half the story. We also want a counterexample: a concrete path from an initial state to the failure state. This path explains how the system can fail and is essential for debugging.

A counterexample is a sequence of states s0, s1, ..., sk where s0 ∈ S0, sk ∈ T, and each (si, si+1) ∈ E. This is simply a path in the graph from the initial set to the target set.

Search Strategies

AlgorithmFindsTimeBest for
BFSShortest counterexampleO(|V| + |E|)Minimum-step failures
DFSAny counterexampleO(|V| + |E|)Quick detection (any path)
A*Lowest-cost counterexampleO(|E| + |V| log |V|)Weighted graphs (most likely failure)

BFS finds the shortest counterexample (fewest transitions). This is valuable because shorter failure paths are often more plausible and easier to understand.

DFS uses less memory (O(depth) vs O(branching factordepth)) and may find a counterexample faster if one exists deep in the graph. But the path it finds may not be the shortest.

A* is optimal for weighted graphs. If edges have costs (e.g., negative log-probabilities), A* with an admissible heuristic finds the minimum-cost path — the most likely failure trajectory.

Counterexamples in model checking: When a model checker like SPIN or NuSMV declares a property violated, it produces a counterexample — a trace from the initial state to the violating state. This trace is the most valuable output of the tool: it tells the engineer exactly what sequence of events leads to failure, enabling targeted fixes.

Extracting the Path

During BFS or DFS, maintain a parent pointer for each discovered state: parent[v] = u means "v was first reached from u." Once the search reaches a target state t, trace back from t through the parent pointers to reconstruct the full path.

python
def find_counterexample(graph, s0, target):
    """BFS shortest path from s0 to any state in target."""
    parent = {s0: None}
    queue = deque([s0])
    while queue:
        u = queue.popleft()
        if u in target:
            # Trace back
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in graph.out_adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None  # no path exists
Why is BFS preferred over DFS when we need the counterexample to be as short as possible?

Chapter 5: Boolean Satisfiability (SAT)

For small graphs, BFS and DFS work perfectly. But what about systems with 2100 states? Explicit enumeration is hopeless. SAT-based methods exploit the fact that many large systems have compact symbolic descriptions — the state can be encoded as a vector of Boolean variables, and the transition relation as a logical formula.

The Encoding

Suppose the system has n Boolean state variables x1, ..., xn. A state is a truth assignment to these variables (2n possible states). The transition relation T(x, x') is a Boolean formula relating the current state x to the next state x'.

Example: A 2-bit counter has state variables x1, x2. The transition adds 1 modulo 4. The formula is: x'1 = x1 ⊕ x2, x'2 = ¬x2. The entire 4-state system is encoded in two simple equations.

The k-step reachability question becomes: is there a sequence of states x(0), x(1), ..., x(k) such that x(0) ∈ S0, x(k) ∈ T, and T(x(i), x(i+1)) holds for all i? This is a satisfiability problem:

S0(x(0)) ∧ T(x(0), x(1)) ∧ T(x(1), x(2)) ∧ … ∧ T(x(k−1), x(k)) ∧ Tfail(x(k))

This formula, called the bounded model checking (BMC) formula, is satisfiable if and only if there exists a failure path of length k. A SAT solver determines satisfiability and, if satisfiable, returns a satisfying assignment — which is exactly the counterexample.

Why SAT beats explicit search: The BMC formula has O(n · k) variables. Modern SAT solvers (MiniSat, Glucose, CaDiCaL) can handle millions of variables. This means we can analyze systems with millions of state bits — far beyond the reach of explicit BFS. The solver's internal search (DPLL with clause learning) prunes the 2nk search space by exploiting the structure of the formula.

Incremental Deepening

We don't know the minimum failure path length in advance. So we try k = 0, 1, 2, ... increasing the bound until either the formula is satisfiable (failure found) or we reach a completeness threshold. The completeness threshold is the maximum k such that if no failure exists at depth k, none exists at any depth. For simple reachability, this is the diameter of the graph (longest shortest path).

kFormula sizeSAT resultMeaning
0n variablesUNSATNo initial state is a failure
12n variablesUNSATNo 1-step path to failure
............
k*(k*+1)n variablesSATFound shortest failure path!
What does it mean when the bounded model checking formula is satisfiable for bound k?

Chapter 6: Probabilistic Occupancy

So far we have asked binary questions: is a state reachable or not? But if transitions have probabilities, we can ask a richer question: what is the probability of being in each state at time k?

A Markov chain over n states is described by a transition matrix P, where Pij = Pr(next state = j | current state = i). The occupancy distribution μk is a row vector giving the probability of being in each state at time k:

μk+1 = μk P

Starting from μ0 (which puts all probability on the initial states), each multiplication by P propagates the distribution one time step. After k steps, μk = μ0 Pk.

Key insight: Probabilistic occupancy is the "soft" version of forward reachability. BFS says "this state is reachable (yes/no)." Occupancy says "this state has probability 0.003 of being occupied at time k." The math is identical in structure — matrix-vector multiplication instead of set union — but the information is richer.

Worked Example

Consider a 3-state system with transition matrix:

P = [0.5, 0.5, 0.0]
    [0.0, 0.3, 0.7]
    [0.0, 0.0, 1.0]

Starting from state 0: μ0 = [1, 0, 0].

μ1 = [1, 0, 0] · P = [0.5, 0.5, 0.0]

μ2 = [0.5, 0.5, 0.0] · P = [0.25, 0.40, 0.35]

State 2 is an absorbing state (P22 = 1). Its probability grows monotonically. Eventually, all probability mass concentrates there: μ = [0, 0, 1].

Connection to reachability probability: The probability that the system ever reaches state j, starting from state i, is related to the absorbing Markov chain theory. If j is absorbing, it equals 1 minus the probability of being absorbed elsewhere. The matrix (I − Q)−1, where Q is the transient part of P, gives the expected number of visits to each transient state.
Probabilistic Occupancy Propagation

A 5-state Markov chain. Click Step to propagate the distribution one time step (μk+1 = μkP). Watch probability flow from the initial state (leftmost) toward the absorbing state (rightmost).

k = 0
How is the occupancy distribution at time k+1 computed from the distribution at time k?

Chapter 7: Finite & Infinite Horizon

The reachability probability question comes in two flavors, depending on whether we impose a time limit.

Finite Horizon

Finite-horizon reachability: what is the probability of reaching a target set T within k time steps? Define:

pk(s) = Pr(reach T within k steps | start in s)

This satisfies a recursive formula. If s ∈ T, then pk(s) = 1 for all k ≥ 0. Otherwise:

pk+1(s) = ∑s' P(s, s') · pk(s')     for s ∉ T

Starting from p0(s) = 1 if s ∈ T, 0 otherwise, we compute p1, p2, ... by propagating backward through the transition matrix. After k iterations, pk(s) gives the exact k-step reachability probability for each state.

Note the direction: This looks like backward reachability, and it is. We start from the target set and propagate backward through the transition probabilities. This is the probabilistic analogue of the backward BFS in Chapter 3.

Infinite Horizon

Infinite-horizon reachability: what is the probability of ever reaching T? Define:

p(s) = Pr(reach T eventually | start in s)

This is the limit of pk(s) as k → ∞. If T is reachable with positive probability at each step, the sequence pk(s) is non-decreasing and bounded by 1, so it converges. In practice, we iterate the recursive formula until convergence (|pk+1 − pk| < ε).

Matrix Formulation

Partition the states into target states T and non-target states S \ T. The sub-matrix Q = P restricted to transitions among non-target states governs the dynamics before reaching T. The infinite-horizon reachability probability for non-target states is:

p = (I − Q)−1 · r

where rs = ∑t ∈ T P(s, t) is the one-step probability of transitioning into T from state s. The matrix (I − Q)−1 is the fundamental matrix of the absorbing Markov chain — its (i,j) entry counts the expected number of times the process visits state j starting from state i before being absorbed.

Solving linear systems vs. iterating: The matrix formula gives the exact answer in one shot (O(n3) for matrix inversion). The iterative approach converges geometrically and may be faster for sparse matrices. Both give the same answer. For large systems, iterative methods with sparse matrix-vector products (O(|E|) per step) are preferred.
HorizonQuestionMethod
k stepsPr(reach T in ≤ k steps)k matrix-vector products
InfinitePr(ever reach T)Solve (I − Q)p = r
What does the fundamental matrix (I − Q)−1 of an absorbing Markov chain represent?

Chapter 8: Discrete State Abstractions

What if the system is continuous but we want to use discrete reachability tools? The answer: discretize. Partition the continuous state space into finitely many cells. Build a graph where each cell is a node, and there is an edge from cell A to cell B if the continuous dynamics can take any state in A to some state in B.

The abstraction idea: A discrete abstraction overapproximates the continuous system. If the abstract graph says cell B is not reachable from cell A, then no continuous state in A can reach any continuous state in B. The guarantee is one-directional: the abstraction may say "reachable" when the true system cannot reach (spurious paths), but it will never miss a real path.

Constructing the Abstract Graph

1. Partition
Divide state space into cells C1, ..., CN
2. Reachability per cell
For each cell Ci, compute overapprox of f(Ci) using Ch 9 methods
3. Build edges
Edge Ci → Cj if f(Ci) ∩ Cj ≠ ∅
4. Graph analysis
BFS/DFS/SAT on the abstract graph

Step 2 is where the nonlinear reachability methods from Chapter 9 enter. For each cell, we use interval arithmetic (or Taylor models) to overapproximate the image under the dynamics. The edges are conservative: we add an edge whenever the overapproximated image might overlap a neighboring cell.

Refinement

If the abstract graph says a failure state is reachable but we suspect this is spurious, we can refine: split cells in the abstract path into smaller cells and re-check. If the path survives refinement, it is likely genuine. If it disappears, it was spurious. This counterexample-guided abstraction refinement (CEGAR) loop is the backbone of many verification tools.

Partition typeCellsBest for
Uniform gridHyper-rectangles of equal sizeSimple, low dimension
Octree/quadtreeAdaptive rectangles (finer near boundaries)Non-uniform importance
SimplexTriangulation of state spacePolyhedral dynamics
Key insight: Discretization converts a continuous reachability problem into a graph problem, letting us use the full toolkit of Chapters 1-7. The cost is overapproximation (spurious edges) and the curse of dimensionality (the number of cells is exponential in the state dimension). This is why discrete abstractions are practical mainly for low-dimensional systems (n ≤ 4 or 5).
Why does a discrete abstraction of a continuous system always overapproximate reachability?

Chapter 9: Resolution Tradeoffs

Every method in this chapter lives on a spectrum: finer resolution = tighter results but more computation. This final chapter maps out the tradeoffs and points forward.

The Resolution Ladder

ApproachResolutionCostTightness
Explicit BFSExact (each state)O(|V| + |E|)Exact
SAT/BMCSymbolic (bit-level)NP (but often fast)Exact
Coarse abstractionFew large cellsLowVery loose
Fine abstractionMany small cellsHighTight
Adaptive (CEGAR)Fine where neededVariableGood

When to Use What

Small state space (|V| < 106): Use explicit BFS. It is simple, exact, and fast. Store the graph in memory and run forward/backward reachability directly.

Large structured state space (|V| > 106, compact description): Use SAT/BMC. Encode the transitions as Boolean formulas and let the SAT solver handle the search. This is the sweet spot for hardware and protocol verification.

Continuous system, low dimension (n ≤ 5): Use discrete abstraction with adaptive refinement. The cell count is manageable, and the graph tools from this chapter apply directly.

Continuous system, high dimension (n > 10): Return to Chapter 9's nonlinear methods (interval arithmetic, Taylor models) or use sampling-based falsification (Chapters 4-5). Discrete abstraction is impractical due to the curse of dimensionality.

The verification engineer's dilemma: Exact methods are complete but expensive. Approximate methods are cheap but may give false alarms (or miss real bugs). The art of verification is choosing the right level of abstraction for the problem at hand — coarse enough to be tractable, fine enough to be useful.

Connections

This chapter closes the reachability trilogy (Chapters 8-10). Each chapter handles a different flavor of dynamics:

What we covered:

Ch 8: Linear systems (zonotopes, polytopes)
Ch 9: Nonlinear systems (intervals, Taylor models)
Ch 10: Discrete systems (graphs, SAT, Markov chains)

What comes next:

Ch 11: Explainability (why did the system fail?)
Ch 12: Runtime monitoring (detecting failures in real time)

The reachability methods provide pre-deployment guarantees: before the system runs, we prove it cannot reach unsafe states. Chapters 11 and 12 address what happens during deployment: monitoring the system in real time and explaining its decisions when things go wrong.

"Testing shows the presence, not the absence of bugs."
— Edsger Dijkstra
For a continuous system with 20 state dimensions, why is discrete abstraction impractical?