Systems Programming

C/C++ Programming II

Dynamic memory, bitwise ops, recursion, data structures, state machines — the toolkit that separates embedded engineers from script writers.

Prerequisites: Basic C syntax + Arrays & functions. That's it.
12
Chapters
9+
Simulations
0
Assumed Knowledge

Chapter 0: Why Low-Level C?

You write x = [1, 2, 3] in Python and never think about where those bytes live. Python's runtime allocates memory for you, resizes it when you append, and garbage-collects it when you're done. It's wonderful — until you're programming a microcontroller with 8 KB of RAM.

In embedded systems — pacemakers, antilock brakes, drone flight controllers — there is no operating system. No garbage collector. No virtual memory. You have a fixed chunk of Flash (for code) and a fixed chunk of SRAM (for data). Every byte is accounted for. One dangling pointer (pointing at freed memory) and the device crashes. One stack overflow (stack grows into the heap) and the motor controller sends full throttle.

This is why C exists. It gives you total control over memory layout, allocation timing, and data representation. The tradeoff: you're responsible for everything Python hides from you.

The contract: C trusts you completely. It will let you write past array bounds, dereference null, and corrupt your own stack. In return, you get predictable performance, zero overhead, and code that runs on a $0.30 chip with 2KB of RAM.

Let's look at a real embedded memory map. A typical STM32F0 microcontroller has:

RegionSizeContents
Flash32 KBYour compiled code + constants
SRAM8 KBStack + heap + static variables

Inside that 8 KB of SRAM, multiple regions compete for space. The stack grows downward from the top. The heap grows upward from the bottom. If they collide, your program crashes with no warning.

Embedded Memory Layout

Drag the slider to increase call depth. Watch the stack grow down toward the heap. When they collide — crash.

Call Depth 3
Key insight: In embedded C, memory is a zero-sum game. Every byte the stack uses is a byte the heap can't have. Every malloc without a free is a permanent loss. This chapter's lesson teaches the tools to manage this scarce resource.
c
// Typical embedded memory layout (linker script)
// Address 0x20000000: start of SRAM (8192 bytes)
//   [static data] [heap -->    <-- stack] [top]
//   0x20000000                            0x20002000

#define SRAM_START  0x20000000
#define SRAM_SIZE   8192
#define STACK_TOP   (SRAM_START + SRAM_SIZE)  // 0x20002000

static int sensor_buffer[64];  // 256 bytes of static
static int motor_state;          // 4 bytes of static
// Remaining ~7932 bytes split between stack and heap
In an embedded system with 8KB SRAM, what happens if your stack grows too large?

Chapter 1: Bitwise Operations

In embedded C, you rarely work with whole integers. You work with individual bits. A hardware register is a 32-bit value where each bit controls something different: bit 5 turns on an LED, bit 3 enables a timer, bit 12 selects the clock source. You need surgical tools to flip one bit without disturbing the others.

C gives you six bitwise operators that work on each bit independently:

OperatorNameWhat it doesExample (8-bit)
&AND1 only if BOTH bits are 10b11001010 & 0b11110000 = 0b11000000
|OR1 if EITHER bit is 10b11001010 | 0b00000101 = 0b11001111
^XOR1 if bits DIFFER0b11001010 ^ 0b11110000 = 0b00111010
~NOTFlip all bits~0b11001010 = 0b00110101
<<Left shiftShift bits left, fill with 00b00000001 << 5 = 0b00100000
>>Right shiftShift bits right0b00100000 >> 5 = 0b00000001
The three sacred patterns:
Set bit N: reg |= (1 << N) — OR with a mask that has only bit N set.
Clear bit N: reg &= ~(1 << N) — AND with a mask that has every bit set EXCEPT N.
Toggle bit N: reg ^= (1 << N) — XOR flips just that one bit.

Here's a real embedded example. On an STM32, GPIO Port A's Output Data Register (GPIOA->ODR) is at address 0x40020014. Bit 5 controls the onboard LED. To turn it on:

c
// Turn on LED (bit 5 of GPIOA->ODR)
GPIOA->ODR |= (1 << 5);    // Set bit 5: xxxx_xxxx |= 0010_0000

// Turn off LED
GPIOA->ODR &= ~(1 << 5);   // Clear bit 5: xxxx_xxxx &= 1101_1111

// Toggle LED
GPIOA->ODR ^= (1 << 5);    // Flip bit 5 only

// Check if bit 5 is set
if (GPIOA->IDR & (1 << 5)) {
    // Button is pressed
}

Bit fields let you pack multiple flags into a single byte. Instead of wasting 4 bytes (an int) for a boolean flag, you use 1 bit:

c
struct SensorFlags {
    uint8_t ready    : 1;  // 1 bit: sensor data ready
    uint8_t overflow : 1;  // 1 bit: buffer overflow
    uint8_t channel  : 3;  // 3 bits: channel 0-7
    uint8_t gain     : 2;  // 2 bits: gain 0-3
    uint8_t reserved : 1;  // 1 bit: padding to 8 bits total
};  // sizeof = 1 byte! Not 4 ints = 16 bytes.
Interactive 8-Bit Register

Click any bit to toggle it. Choose an operation and a mask below to see the result.

Mask bit 5
To clear bit 3 of a register without affecting other bits, which expression do you use?

Chapter 2: Pointers Deep Dive

You already know a pointer holds a memory address. But pointer arithmetic is where C reveals its true power — and its danger. When you write p + 1, C doesn't add 1 byte. It adds sizeof(*p) bytes. This is how arrays work under the hood: arr[i] is exactly *(arr + i).

The fundamental rule: Pointer arithmetic scales by the size of the pointed-to type. If int *p = 0x1000 and sizeof(int) = 4, then p + 1 = 0x1004, p + 2 = 0x1008. The compiler does the multiplication for you.

Let's trace through memory. Suppose we have an array of 4-byte integers starting at address 0x2000:

c
int arr[4] = {10, 20, 30, 40};
int *p = arr;           // p = 0x2000

// Pointer arithmetic:
*p        // value at 0x2000 = 10
*(p + 1)  // value at 0x2004 = 20  (NOT 0x2001!)
*(p + 2)  // value at 0x2008 = 30
*(p + 3)  // value at 0x200C = 40

// These are identical:
arr[2]       // = 30
*(arr + 2)  // = 30
*(p + 2)    // = 30

Double pointers (char **) hold the address of a pointer. Why? Two common reasons: (1) you need a function to modify a pointer (pass pointer-to-pointer), and (2) arrays of strings (argv in main).

c
// Function that allocates memory and returns it via double pointer
void create_buffer(int **out, int size) {
    *out = (int *)malloc(size * sizeof(int));
}

int *buf = NULL;
create_buffer(&buf, 100);  // buf now points to allocated memory
// Without **, the function would modify a LOCAL copy of buf

Function pointers store the address of a function. This enables callbacks — you pass behavior as an argument:

c
// Declare a function pointer type
typedef void (*callback_t)(int event_code);

// A function matching that signature
void on_button_press(int code) {
    printf("Button %d pressed\n", code);
}

// Register the callback
callback_t handler = on_button_press;
handler(5);  // calls on_button_press(5)
Pointer Arithmetic Visualizer

An array of structs in memory. Each struct is 12 bytes. Watch how p+1 jumps 12 bytes, not 1.

Pointer offset (p + N) 0
Why this matters: In embedded, you often have a buffer of sensor readings (structs). You traverse it with a pointer that advances by sizeof(SensorReading) each step — no array indexing, no bounds checking, just raw pointer arithmetic. Fast. Dangerous.
If int *p = 0x3000 and sizeof(int) = 4, what is the value of p + 3?

Chapter 3: Dynamic Memory

Static arrays have a problem: you must decide their size at compile time. But what if you don't know how many sensor readings you'll collect? You need memory allocated at runtime. In C, that means asking the heap for space using malloc.

The heap is a region of memory managed by a runtime allocator. Unlike the stack (which is automatic), heap memory persists until you explicitly free it. Forget to free? That's a memory leak. Free it twice? That's undefined behavior (usually a crash).

The four functions:
malloc(size) — allocate size bytes, uninitialized.
calloc(n, size) — allocate n * size bytes, zeroed.
realloc(ptr, new_size) — resize existing allocation (may move it!).
free(ptr) — return memory to the heap.
c
// Allocate array of 100 ints (400 bytes on 32-bit system)
int *data = (int *)malloc(100 * sizeof(int));
if (data == NULL) {
    // malloc failed! Out of memory. Handle gracefully.
    return -1;
}

// Use it
data[0] = 42;
data[99] = 7;

// Need more space? Resize to 200 ints
int *bigger = (int *)realloc(data, 200 * sizeof(int));
if (bigger == NULL) {
    // realloc failed, but data is still valid!
    free(data);
    return -1;
}
data = bigger;  // ptr may have moved

// Done. Free it.
free(data);
data = NULL;  // ALWAYS null after free to prevent dangling pointer

Fragmentation is the silent killer of embedded systems. After many malloc/free cycles, the heap becomes a patchwork of used and unused blocks. You might have 1 KB free total, but split into 10 non-contiguous 100-byte fragments. A request for 200 bytes fails even though "enough" memory exists.

malloc(100)
Allocator finds 100 free bytes, marks them used, returns pointer
Use memory
Write sensor data, process it
free(ptr)
Marks those 100 bytes as available again
Repeat
Over time, free blocks become scattered = fragmentation
Heap Visualizer

Click malloc to allocate blocks. Click free to release a random block. Watch fragmentation grow.

Heap: 0 / 1024 bytes used. 0 fragments.

Rule of thumb: In embedded systems, many teams ban malloc entirely. They pre-allocate all memory as static arrays and use pool allocators (fixed-size blocks). No fragmentation. Deterministic timing. More code, but zero surprises at runtime.
After free(ptr), what should you immediately do with ptr?

Chapter 4: Recursion

A recursive function calls itself. That sounds circular, but it works because each call handles a smaller version of the problem, and there's a base case that stops the chain. Without a base case, you recurse forever and overflow the stack.

Every recursive call creates a new stack frame — a block of memory on the stack holding that call's local variables, parameters, and return address. When the function returns, its frame is popped. This is why deep recursion is dangerous in embedded: each frame costs memory.

The recipe: Every recursive function needs: (1) a base case (when to stop), and (2) a recursive case that makes the problem smaller and calls itself. If the problem isn't getting smaller, you have infinite recursion.

Let's trace factorial(4) step by step:

c
int factorial(int n) {
    if (n <= 1) return 1;       // Base case
    return n * factorial(n - 1);  // Recursive case
}

// Trace of factorial(4):
// factorial(4): return 4 * factorial(3)  [frame 4 pushed]
//   factorial(3): return 3 * factorial(2)  [frame 3 pushed]
//     factorial(2): return 2 * factorial(1)  [frame 2 pushed]
//       factorial(1): return 1  (BASE CASE)   [frame 1 pushed]
//     factorial(2): return 2 * 1 = 2          [frame 1 popped]
//   factorial(3): return 3 * 2 = 6            [frame 2 popped]
// factorial(4): return 4 * 6 = 24             [frame 3 popped]

Binary search is another natural fit for recursion. Each call halves the search space:

c
int binary_search(int *arr, int lo, int hi, int target) {
    if (lo > hi) return -1;          // Base: not found
    int mid = lo + (hi - lo) / 2;    // Avoid overflow vs (lo+hi)/2
    if (arr[mid] == target) return mid;
    if (arr[mid] < target)
        return binary_search(arr, mid + 1, hi, target);
    return binary_search(arr, lo, mid - 1, target);
}
Tail recursion: If the recursive call is the LAST thing the function does (no multiplication after), the compiler can optimize it into a loop — no extra stack frames. factorial above is NOT tail recursive (it does n * result after). A tail-recursive version passes an accumulator: fact_tail(n, acc) = fact_tail(n-1, n*acc).
Stack Frame Visualizer

Step through factorial(N). Watch stack frames push and pop. Each frame holds its own n value.

N 4
What happens if a recursive function has no base case?

Chapter 5: Data Structures

Arrays are fast for random access (arr[i] is O(1)) but terrible for insertion in the middle (you must shift everything). Different problems demand different structures. The big three in C: linked lists, binary search trees, and hash tables.

Linked Lists

A linked list is a chain of nodes, each containing data and a pointer to the next. Insertion anywhere is O(1) (just rewire pointers), but finding the i-th element is O(n) (walk the chain).

c
typedef struct Node {
    int data;
    struct Node *next;
} Node;

// Insert at head: O(1)
void push(Node **head, int val) {
    Node *n = (Node *)malloc(sizeof(Node));
    n->data = val;
    n->next = *head;   // new node points to old head
    *head = n;          // head now points to new node
}

// Free entire list
void free_list(Node *head) {
    while (head) {
        Node *tmp = head;
        head = head->next;
        free(tmp);
    }
}

Binary Search Trees

A BST stores values such that for every node: all left descendants are smaller, all right descendants are larger. Search, insert, and delete are all O(log n) on average — but O(n) worst case if the tree becomes a straight line (unbalanced).

c
typedef struct TreeNode {
    int key;
    struct TreeNode *left, *right;
} TreeNode;

TreeNode* insert(TreeNode *root, int key) {
    if (!root) {
        TreeNode *n = (TreeNode*)malloc(sizeof(TreeNode));
        n->key = key; n->left = n->right = NULL;
        return n;
    }
    if (key < root->key) root->left = insert(root->left, key);
    else root->right = insert(root->right, key);
    return root;
}

Hash Tables

A hash table maps keys to values in O(1) average time. You hash the key to get an array index. Collisions (two keys mapping to the same index) are handled by chaining (linked list at each slot) or open addressing (probe for next empty slot).

StructureSearchInsertDeleteBest for
ArrayO(n) / O(1) indexO(n) middleO(n)Random access by index
Linked listO(n)O(1) at headO(1) if have ptrFrequent insert/delete
BSTO(log n) avgO(log n) avgO(log n) avgSorted data, range queries
Hash tableO(1) avgO(1) avgO(1) avgKey-value lookup
Interactive BST

Type a number and insert it. Watch the tree grow. Notice how sorted input creates a degenerate (linear) tree.

What is the average-case search time for a hash table with a good hash function?

Chapter 6: Sorting & Searching

Sorting is the most studied problem in computer science. Why? Because searching a sorted array is O(log n) instead of O(n). That's the difference between finding a record in 20 comparisons vs 1,000,000. The key divide: O(n²) algorithms (bubble, insertion) vs O(n log n) algorithms (quicksort, mergesort).

Quicksort (The King)

Quicksort picks a pivot, partitions the array so everything smaller is left and everything larger is right, then recurses on each half. Average case: O(n log n). Worst case (already sorted): O(n²). In practice, fastest for random data.

c
void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

int partition(int *arr, int lo, int hi) {
    int pivot = arr[hi];  // last element as pivot
    int i = lo - 1;
    for (int j = lo; j < hi; j++) {
        if (arr[j] <= pivot) {
            i++;
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[hi]);
    return i + 1;
}

void quicksort(int *arr, int lo, int hi) {
    if (lo < hi) {
        int p = partition(arr, lo, hi);
        quicksort(arr, lo, p - 1);
        quicksort(arr, p + 1, hi);
    }
}

Let's trace quicksort on [38, 27, 43, 3, 9, 82]:

Pivot = 82
[38, 27, 43, 3, 9 | 82] — everything is ≤ 82, pivot goes to end
↓ recurse left [38, 27, 43, 3, 9]
Pivot = 9
[3 | 9 | 38, 27, 43] — partition around 9
↓ recurse right [38, 27, 43]
Pivot = 43
[38, 27 | 43] — both ≤ 43, pivot in place
↓ recurse [38, 27]
Pivot = 27
[27 | 38] — done!

Final sorted: [3, 9, 27, 38, 43, 82].

AlgorithmBestAverageWorstStable?Space
Bubble sortO(n)O(n²)O(n²)YesO(1)
Insertion sortO(n)O(n²)O(n²)YesO(1)
QuicksortO(n log n)O(n log n)O(n²)NoO(log n)
MergesortO(n log n)O(n log n)O(n log n)YesO(n)
Sorting Visualizer

Watch bars get sorted in real time. Compare algorithm speeds. Red = comparing, green = sorted.

Speed 20
Quicksort's worst case O(n²) occurs when:

Chapter 7: State Machines

A Finite State Machine (FSM) is the most important design pattern in embedded C. Your microcontroller is always in exactly one state. When an event occurs, it transitions to a new state and may produce an action. That's it. No complex logic trees. No tangled if-else chains. Just: current state + event → action + next state.

Why FSMs dominate embedded? Because they're provably correct. You can draw every possible state and transition on paper. If you can't reach a dangerous state from the diagram, it can't happen in code. Compare that to 500 lines of nested if-statements where one forgotten condition means the motor never stops.

FSM implementation in C:
1. Define states as an enum.
2. Define events as an enum.
3. Write a switch(state) with nested switch(event) inside each case.
4. Each transition: perform action, set new state.
c
typedef enum { IDLE, COIN_IN, SELECTING, DISPENSING, CHANGE } State;
typedef enum { EVT_COIN, EVT_SELECT, EVT_DONE, EVT_TIMEOUT } Event;

State current = IDLE;

void fsm_step(Event evt) {
    switch (current) {
    case IDLE:
        if (evt == EVT_COIN) {
            display("Coin accepted");
            current = COIN_IN;
        }
        break;
    case COIN_IN:
        if (evt == EVT_SELECT) {
            start_motor();
            current = DISPENSING;
        } else if (evt == EVT_TIMEOUT) {
            return_coin();
            current = IDLE;
        }
        break;
    case DISPENSING:
        if (evt == EVT_DONE) {
            give_change();
            current = CHANGE;
        }
        break;
    case CHANGE:
        current = IDLE;  // auto-transition back
        break;
    }
}

The transition table is the formal specification:

Current StateEventActionNext State
IDLECOINDisplay "accepted"COIN_IN
COIN_INSELECTStart motorDISPENSING
COIN_INTIMEOUTReturn coinIDLE
DISPENSINGDONEGive changeCHANGE
CHANGE(auto)IDLE
Vending Machine FSM — Interactive

Click events to drive the state machine. Watch the current state highlight, transitions animate, and actions fire.

State: IDLE. Waiting for coin...

Why not just if-else? With 5 states and 4 events, you have 20 possible combinations. An FSM handles each explicitly. Nested if-else invariably misses edge cases. Plus, an FSM is trivially testable: feed every event in every state, verify the output. Try that with spaghetti logic.
In the vending machine FSM, what happens if a TIMEOUT event fires while in the DISPENSING state?

Chapter 8: Storage Maps & Data Representation

When you define a struct in C, the compiler doesn't always pack the fields tightly. It inserts padding bytes to ensure each field starts at an aligned address. Why? Because most CPUs can only read a 4-byte int from an address that's a multiple of 4. Unaligned access is either slow (ARM) or illegal (older ARM, some MIPS).

Storage map equation:
address_of(field) = base_address + offset

The offset depends on alignment rules. For a field of size S, its offset must be a multiple of S (or the platform's maximum alignment, whichever is smaller).

Let's trace a real struct:

c
struct Example {
    char  a;    // 1 byte, offset 0
                 // 3 bytes PADDING (next field needs 4-byte alignment)
    int   b;    // 4 bytes, offset 4
    short c;    // 2 bytes, offset 8
                 // 2 bytes PADDING (struct size must be multiple of largest align)
};
// sizeof(struct Example) = 12, NOT 7!

// Memory layout (byte by byte):
// [a][P][P][P][b][b][b][b][c][c][P][P]
//  0  1  2  3  4  5  6  7  8  9  10 11

You can reduce wasted space by reordering fields from largest to smallest:

c
struct Packed {
    int   b;    // 4 bytes, offset 0
    short c;    // 2 bytes, offset 4
    char  a;    // 1 byte,  offset 6
                 // 1 byte PADDING (struct align to 4)
};
// sizeof(struct Packed) = 8 — saved 4 bytes!

Endianness matters for data sent between machines. Little-endian (x86, ARM default): least significant byte at lowest address. Big-endian (network byte order): most significant byte first. The value 0x12345678:

AddressLittle-endianBig-endian
0x000x78 (LSB)0x12 (MSB)
0x010x560x34
0x020x340x56
0x030x12 (MSB)0x78 (LSB)
Struct Layout Visualizer

See how char, int, and short lay out in memory with alignment padding. Toggle packed mode to remove padding.

sizeof = 12 bytes (5 wasted on padding)

Pragmatic rule: In cross-platform communication (networking, file formats), NEVER send raw structs. The padding, alignment, and endianness may differ between sender and receiver. Instead, serialize each field explicitly using fixed-width types (uint32_t) in a known byte order.
A struct has fields: char a; int b; char c;. With 4-byte alignment, sizeof is:

Chapter 9: File I/O & Portability

In C, files are accessed through file pointers (FILE *). You open a file, read or write, then close it. The key insight: C treats files as a stream of bytes. You maintain a file position indicator that advances as you read/write. You can jump to any position with fseek.

c
// Write 100 records to a binary file
typedef struct { uint32_t id; float value; } Record;

FILE *f = fopen("data.bin", "wb");  // "wb" = write binary
if (!f) { perror("fopen"); return 1; }

Record rec = { .id = 42, .value = 3.14f };
fwrite(&rec, sizeof(Record), 1, f);  // write 1 record
fclose(f);

// Read record N directly (random access)
f = fopen("data.bin", "rb");
int record_num = 5;
fseek(f, record_num * sizeof(Record), SEEK_SET);
fread(&rec, sizeof(Record), 1, f);
// rec now contains record #5
fclose(f);
Random access formula: To read the N-th record in a file of fixed-size records:
offset = N × sizeof(Record)
fseek(file, offset, SEEK_SET)

Portability Landmines

C's primitive types have implementation-defined sizes. int could be 2 bytes on a 16-bit micro or 4 bytes on a PC. If you write raw int values to a file on one machine and read them on another, you'll get garbage. The solution: fixed-width types from <stdint.h>:

TypeGuaranteed sizeUse when
uint8_t1 byte, unsignedRaw bytes, characters
int16_t2 bytes, signedAudio samples, sensors
uint32_t4 bytes, unsignedTimestamps, IDs, registers
int64_t8 bytes, signedLarge counters, file sizes
c
// Portable serialization: write uint32 in little-endian
void write_le32(FILE *f, uint32_t val) {
    uint8_t buf[4];
    buf[0] = val & 0xFF;
    buf[1] = (val >> 8) & 0xFF;
    buf[2] = (val >> 16) & 0xFF;
    buf[3] = (val >> 24) & 0xFF;
    fwrite(buf, 4, 1, f);
}

// Portable deserialization: read uint32 from little-endian
uint32_t read_le32(FILE *f) {
    uint8_t buf[4];
    fread(buf, 4, 1, f);
    return buf[0] | (buf[1] << 8) | (buf[2] << 16) | (buf[3] << 24);
}
Rule: If your data ever crosses a machine boundary (network, file, different compiler), serialize manually. Never fwrite(&my_struct, sizeof(my_struct), 1, f) for portable formats. The padding, endianness, and sizes may all differ on the reader's side.
Why should you use uint32_t instead of unsigned int in portable binary file formats?

Chapter 10: Variable Arguments & Non-Local Gotos

How does printf accept any number of arguments? Through C's variable argument (varargs) mechanism. The function declares a minimum set of fixed parameters, then uses va_list to iterate over the rest. This is how you write your own printf-like logging function in embedded code.

c
#include <stdarg.h>

// Custom debug logger: logs to UART in embedded, stdout in testing
void debug_log(const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);    // Initialize: 'fmt' is the last fixed param

    // Extract arguments one by one based on format string
    // (simplified: real impl would parse fmt)
    vprintf(fmt, args);     // printf that takes a va_list

    va_end(args);           // MUST call when done. Cleans up.
}

// Usage:
debug_log("Sensor %d: value = %f\n", 3, 27.5);
debug_log("Error at addr 0x%08X\n", 0xDEADBEEF);
The varargs contract:
va_start(args, last_fixed) — initialize, naming the last fixed parameter.
va_arg(args, type) — extract next argument as type. YOU must know the type.
va_end(args) — cleanup. Always call this. Always.

The compiler cannot check types for you. Pass a float where int is expected? Undefined behavior. This is why printf format strings exist — they encode the expected types.

setjmp / longjmp: C's Exception Handling

C has no try/catch. But embedded code often has deeply nested function calls that can fail. Unwinding the stack manually (checking return codes at every level) is tedious. setjmp/longjmp provides non-local jumps — you mark a return point, and can jump back to it from any depth.

c
#include <setjmp.h>

jmp_buf error_recovery;

void deep_function() {
    // ... 5 levels deep in call stack ...
    if (hardware_fault()) {
        longjmp(error_recovery, 1);  // Jump back to setjmp site!
        // Code below here NEVER executes
    }
}

void main_loop() {
    if (setjmp(error_recovery) != 0) {
        // We got here via longjmp. Handle the error.
        reset_hardware();
        printf("Recovered from fault\n");
        return;
    }
    // Normal execution path:
    init_sensors();
    calibrate();       // calls deep_function somewhere inside
    run_control_loop();
}
When to use longjmp: Only for fatal error recovery in embedded systems without an OS exception mechanism. It's C's equivalent of "throw" — but more dangerous because it doesn't call destructors, doesn't free memory, and can leave resources in undefined states. Use sparingly. Always pair with cleanup code at the setjmp site.

Warning: longjmp unwinds the stack without calling any cleanup code in intermediate functions. Any malloc'd memory in those functions is leaked. Any open file handles are leaked. Any hardware left in a half-configured state stays that way. This is why embedded teams use it ONLY at the top-level error boundary.

In a varargs function, what does va_arg(args, int) do?

Chapter 11: Mastery & Connections

You now have the complete low-level C toolkit: bit manipulation for hardware registers, pointer arithmetic for efficient traversal, dynamic memory (and why embedded avoids it), recursion, data structures, sorting, state machines for control flow, and storage maps for understanding what the compiler actually does with your structs.

Memory management rules:
1. Every malloc must have a matching free. No exceptions.
2. Set pointers to NULL after freeing. Always.
3. Check malloc's return value. It CAN return NULL.
4. Never free the same pointer twice.
5. Never use a pointer after freeing it.
6. In embedded: prefer static allocation. If you must use heap, use pool allocators.

Complexity Cheat Sheet

OperationArrayLinked ListBST (balanced)Hash Table
Access by indexO(1)O(n)O(log n)N/A
Search by valueO(n)O(n)O(log n)O(1)
Insert at endO(1) amortizedO(1) w/ tail ptrO(log n)O(1)
Insert at middleO(n)O(1) if have ptrO(log n)N/A
DeleteO(n)O(1) if have ptrO(log n)O(1)

State Machine Design Template

c
// 1. Define states and events as enums
typedef enum { ST_INIT, ST_RUNNING, ST_ERROR, ST_COUNT } State;
typedef enum { EV_START, EV_FAULT, EV_RESET, EV_COUNT } Event;

// 2. Define action type
typedef void (*Action)(void);

// 3. Transition table (table-driven FSM)
typedef struct { State next; Action action; } Transition;

Transition table[ST_COUNT][EV_COUNT] = {
    /*           EV_START          EV_FAULT          EV_RESET */
    /* INIT  */ {{ST_RUNNING, go}, {ST_ERROR, log},  {ST_INIT, nop}},
    /* RUN   */ {{ST_RUNNING,nop}, {ST_ERROR, stop}, {ST_INIT, reset}},
    /* ERROR */ {{ST_ERROR, nop},  {ST_ERROR, nop},  {ST_INIT, reset}},
};

// 4. Step function: O(1), no switch-case needed
void fsm_dispatch(Event e) {
    Transition t = table[state][e];
    if (t.action) t.action();
    state = t.next;
}

Bitwise Operations Quick Reference

TaskExpressionExample
Set bit Nreg |= (1 << N)Turn on LED
Clear bit Nreg &= ~(1 << N)Turn off LED
Toggle bit Nreg ^= (1 << N)Blink LED
Check bit Nif (reg & (1 << N))Read button
Set bits 3-5reg |= (0x7 << 3)Set a 3-bit field
Clear bits 3-5reg &= ~(0x7 << 3)Clear a 3-bit field
Extract bits 3-5(reg >> 3) & 0x7Read a 3-bit field

Design Challenge: Pool Allocator

Implement a memory pool allocator for fixed-size blocks. This eliminates fragmentation entirely: all blocks are the same size, so any free block fits any allocation request.

c
// Pool allocator: zero fragmentation, O(1) alloc and free
#define POOL_SIZE    32
#define BLOCK_SIZE   64   // bytes per block

static uint8_t pool_memory[POOL_SIZE * BLOCK_SIZE];
static uint8_t pool_used[POOL_SIZE];  // 0 = free, 1 = used

void* pool_alloc(void) {
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool_used[i]) {
            pool_used[i] = 1;
            return &pool_memory[i * BLOCK_SIZE];
        }
    }
    return NULL;  // pool exhausted
}

void pool_free(void *ptr) {
    int idx = ((uint8_t*)ptr - pool_memory) / BLOCK_SIZE;
    pool_used[idx] = 0;
}
What's next: This lesson gave you the C programming toolkit. To go deeper: study RTOS concepts (tasks, semaphores, mutexes), interrupt-driven I/O (ISRs, volatile, atomic operations), and DMA (direct memory access for high-throughput data transfer without CPU intervention).

"C is quirky, flawed, and an enormous success." — Dennis Ritchie

A pool allocator eliminates fragmentation because: