Embedded Linux — From Boot to Driver

Chapter 0: Why Embedded Linux?

Your embedded system needs: TCP/IP networking, USB host support, a filesystem, multitasking, and driver support for hundreds of peripherals. Writing all of that from scratch would take decades. Embedded Linux gives you all of it — for free.

But Linux isn't always the right answer. A blinking LED on a $0.50 microcontroller doesn't need a kernel. A hard real-time motor controller running at 100kHz can't tolerate Linux's scheduling jitter. The question is: when do you reach for Linux?

The decision rule: If your system needs a filesystem, a networking stack, USB host, or must run multiple concurrent processes — and has at least 32MB of RAM — use Linux. If it needs hard real-time guarantees at microsecond resolution on a tiny MCU, use bare metal or an RTOS.
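
The rule above can be written down as a small C function. This is only a sketch of the stated rule: the names (struct requirements, choose_platform) are made up for illustration, and checking the hard real-time requirement first is an assumption, since the rule does not say what to do when both conditions hold.

```c
#include <stdbool.h>

/* Possible outcomes of the decision rule (names are illustrative). */
enum platform {
    PLATFORM_LINUX,
    PLATFORM_RTOS_OR_BARE_METAL,
    PLATFORM_UNDECIDED          /* the rule does not cover this case */
};

/* Project requirements, one field per condition named in the rule. */
struct requirements {
    bool needs_filesystem;
    bool needs_networking;
    bool needs_usb_host;
    bool needs_multiple_processes;
    bool needs_hard_realtime;   /* microsecond-level guarantees */
    unsigned ram_mb;            /* RAM available on the target, in MB */
};

enum platform choose_platform(const struct requirements *r)
{
    /* Assumption: a hard real-time requirement overrides everything else. */
    if (r->needs_hard_realtime)
        return PLATFORM_RTOS_OR_BARE_METAL;

    bool wants_os_services = r->needs_filesystem || r->needs_networking ||
                             r->needs_usb_host || r->needs_multiple_processes;

    /* Linux needs both a reason to use it and at least 32 MB of RAM. */
    if (wants_os_services && r->ram_mb >= 32)
        return PLATFORM_LINUX;

    return PLATFORM_UNDECIDED;
}
```

A device that needs a filesystem, networking, and multitasking with 512 MB of RAM comes out as PLATFORM_LINUX; a motor controller with a hard real-time requirement comes out as PLATFORM_RTOS_OR_BARE_METAL.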

The boards that run embedded Linux are everywhere: Raspberry Pi 4 (Broadcom BCM2711, quad-core ARM Cortex-A72), BeagleBone Black (TI AM335x, Cortex-A8), NXP i.MX8 (used in automotive), NVIDIA Jetson (GPU-accelerated AI at the edge). All of these boot a Linux kernel, mount a root filesystem, and run userspace applications, just like your laptop, but on a board roughly the size of a credit card.

Why "embedded" Linux is still Linux: The kernel is the same source tree (kernel.org). The difference is configuration: you enable only the drivers and subsystems your board needs, cross-compile for ARM/RISC-V instead of x86, and strip userspace down to the minimum. Same code, different .config file.

Real-world examples of each path:

Platform	RAM	Example	Use Case
Bare Metal	<64KB	STM32F103	Motor PWM, sensor sampling
RTOS	64KB–4MB	ESP32 + FreeRTOS	IoT device, WiFi sensor node
Embedded Linux	>32MB	Raspberry Pi, Jetson	Camera processing, web server, AI

You're building a device that streams video over WiFi, runs a web interface, and logs data to an SD card. Which platform?

Bare metal: just bit-bang the protocols
RTOS: FreeRTOS can handle it
Embedded Linux: needs filesystem, networking, and multitasking

Chapter 1: Host & Target

Here's the fundamental reality of embedded development: you write and compile code on one machine (the host) but run it on a completely different machine (the target). Your laptop is x86_64. Your embedded board is ARM. Both use ELF executables, but the instruction sets are incompatible: an ARM CPU cannot execute x86 machine code. You can't just copy your laptop's compiled programs to the board.

This is why we need cross-compilation: compiling on the host architecture (x86) to produce binaries for the target architecture (ARM). The tool that does this is the cross-compiler toolchain.

The toolchain naming convention: arm-linux-gnueabihf-gcc breaks down as: arm (target CPU), linux (target OS), gnueabi (ABI — calling conventions), hf (hardware floating point), gcc (the compiler). Every tool in the chain follows this prefix: arm-linux-gnueabihf-ld, arm-linux-gnueabihf-objdump, etc.
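
To make the naming convention concrete, here is a small sketch that splits a tool name into the fields described above. It assumes the plain three-part prefix used in this chapter (CPU, OS, ABI) followed by the tool; toolchains with an extra vendor field or a version suffix are deliberately rejected. The names struct tool_name and parse_tool_name are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Fields of a name such as arm-linux-gnueabihf-gcc. */
struct tool_name {
    char arch[32];    /* target CPU, e.g. "arm"       */
    char os[32];      /* target OS,  e.g. "linux"     */
    char abi[32];     /* ABI,        e.g. "gnueabihf" */
    char tool[32];    /* tool,       e.g. "gcc"       */
    int  hard_float;  /* 1 if the ABI field ends in "hf" */
};

/* Split "arch-os-abi-tool" into its fields.
 * Returns 0 on success, -1 if the name does not consist of exactly
 * four non-empty, dash-separated fields. */
int parse_tool_name(const char *name, struct tool_name *out)
{
    int consumed = 0;
    int fields = sscanf(name, "%31[^-]-%31[^-]-%31[^-]-%31[^-]%n",
                        out->arch, out->os, out->abi, out->tool, &consumed);

    /* All four fields must match and nothing may be left over. */
    if (fields != 4 || name[consumed] != '\0')
        return -1;

    size_t len = strlen(out->abi);
    out->hard_float = len >= 2 && strcmp(out->abi + len - 2, "hf") == 0;
    return 0;
}
```

Calling parse_tool_name("arm-linux-gnueabihf-objdump", &t) fills in arch "arm", os "linux", abi "gnueabihf", and tool "objdump", and sets hard_float to 1.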

But the compiler alone isn't enough. When you #include <stdio.h>, that header comes from the target's C library, not your host's. When you link against -lpthread, that's the target's libpthread. The sysroot is a directory on your host that mirrors the target's filesystem — it contains the target's headers and libraries so the cross-compiler can find them.

bash
# Typical sysroot directory structure
/opt/sysroot/
  usr/
    include/      # Target headers (stdio.h, linux/gpio.h, ...)
    lib/          # Target shared libraries (libc.so, libpthread.so)
  lib/            # More target libraries

# Tell the compiler where to find target headers/libs
arm-linux-gnueabihf-gcc --sysroot=/opt/sysroot -o hello hello.c

Once you've compiled the binary, you need to get it onto the target. Common transfer methods:

SCP

scp hello pi@192.168.1.50:/home/pi/ — simple, needs SSH on target

↓

NFS Mount

Target mounts host directory over network — instant updates, no copy needed

↓

SD Card

Remove SD, mount on host, copy binary, reinsert — slow but always works

↓

TFTP + NFS Boot

Bootloader loads kernel via TFTP, mounts rootfs via NFS — no SD card at all

The professional workflow: NFS root is king for active development. Your target boots directly from a directory on your host. Edit, recompile, and it's instantly available — no copy step, no SD card swapping. Every time you restart a program on the target, it picks up the latest binary.

The full development cycle looks like this:

1. Edit

Write C code on host in your favorite editor/IDE

↓

2. Cross-compile

arm-linux-gnueabihf-gcc produces ARM binary on host

↓

3. Deploy

SCP/NFS/SD card → binary arrives on target

↓

4. Test

SSH to target, run binary, check output

↓

5. Debug

gdbserver on target, arm-linux-gnueabihf-gdb (or gdb-multiarch on Debian/Ubuntu hosts) on host, remote debug

↻ repeat

bash
# Install the cross-compiler (Ubuntu/Debian host)
sudo apt install gcc-arm-linux-gnueabihf

# Verify it works
arm-linux-gnueabihf-gcc --version

# Cross-compile a simple program
arm-linux-gnueabihf-gcc -o hello hello.c

# Check the binary type — should say ARM, not x86!
file hello
# hello: ELF 32-bit LSB executable, ARM, EABI5, ...

# Deploy via SCP
scp hello pi@192.168.1.50:/home/pi/

# SSH and run
ssh pi@192.168.1.50 ./hello
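
The architecture that file reports comes from a field in the ELF header. As a rough sketch of that check, the function below looks at the first 20 bytes of a file and names the target CPU. The function name and the short list of machine IDs are choices made for this example; the offsets and ID values are taken from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* e_machine values from the ELF specification (only a few of many). */
#define ELF_MACHINE_386      3
#define ELF_MACHINE_ARM      40
#define ELF_MACHINE_X86_64   62
#define ELF_MACHINE_AARCH64  183
#define ELF_MACHINE_RISCV    243

/* Given the first bytes of a file, return a name for the CPU it targets. */
const char *elf_machine_name(const unsigned char *hdr, size_t len)
{
    /* An ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (len < 20 || memcmp(hdr, "\x7f" "ELF", 4) != 0)
        return "not ELF";

    /* Byte 5 gives the byte order: 1 = little-endian, 2 = big-endian.
     * e_machine is the 16-bit field at offset 18. */
    uint16_t machine = (hdr[5] == 2)
        ? (uint16_t)((hdr[18] << 8) | hdr[19])
        : (uint16_t)(hdr[18] | (hdr[19] << 8));

    switch (machine) {
    case ELF_MACHINE_386:     return "x86";
    case ELF_MACHINE_ARM:     return "ARM";
    case ELF_MACHINE_X86_64:  return "x86-64";
    case ELF_MACHINE_AARCH64: return "AArch64";
    case ELF_MACHINE_RISCV:   return "RISC-V";
    default:                  return "unknown";
    }
}
```

A binary built with arm-linux-gnueabihf-gcc carries machine ID 40 (ARM), while one built with the host gcc on an x86_64 laptop carries 62. The kernel on the target refuses to execute a binary whose machine ID does not match its own CPU.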

Why can't you just compile with regular gcc on your x86 laptop and copy the binary to an ARM board?

Because gcc is too old
Because gcc produces x86 machine code, but ARM CPUs execute ARM instructions
Because Linux doesn't allow file transfers between architectures

Chapter 2: Building Programs

You know you need a cross-compiler. But real programs aren't single files — they have dependencies, multiple source files, and build configurations. Let's trace the complete journey from source code to running binary on your target.

First, the compilation pipeline. When you run arm-linux-gnueabihf-gcc -o hello hello.c, four separate stages happen:

bash
# See each stage individually:

# 1. Preprocess only (-E): expand #includes, #defines
arm-linux-gnueabihf-gcc -E hello.c -o hello.i

# 2. Compile to assembly (-S): C → ARM assembly
arm-linux-gnueabihf-gcc -S hello.c -o hello.s

# 3. Assemble (-c): assembly → object file (machine code, not linked)
arm-linux-gnueabihf-gcc -c hello.c -o hello.o

# 4. Link: combine object files + libraries → final ELF executable
arm-linux-gnueabihf-gcc hello.o -o hello

Static vs dynamic linking: When your program calls printf(), that code lives in libc. You have two choices:

Linking	Flag	Binary Size	Dependencies	Use When
Dynamic	(default)	Small (~10KB)	Needs libc.so on target	Normal development
Static	`-static`	Large (~800KB+)	Self-contained	Minimal rootfs, single binary deploy

Rule of thumb: Use dynamic linking during development (faster compile, smaller binaries). Use static linking for production deployment on minimal systems where you control exactly what's on the target.

For real projects, you need a build system. Here's a Makefile for cross-compilation:

makefile
# Cross-compilation Makefile
CROSS_COMPILE := arm-linux-gnueabihf-
CC            := $(CROSS_COMPILE)gcc
CFLAGS        := -Wall -O2 --sysroot=/opt/sysroot
LDFLAGS       := --sysroot=/opt/sysroot
TARGET        := gpio_reader
SRCS          := main.c gpio.c
OBJS          := $(SRCS:.c=.o)

.PHONY: all clean deploy

all: $(TARGET)

$(TARGET): $(OBJS)
	$(CC) $(LDFLAGS) -o $@ $^

%.o: %.c
	$(CC) $(CFLAGS) -c -o $@ $<

clean:
	rm -f $(OBJS) $(TARGET)

deploy: $(TARGET)
	scp $(TARGET) pi@192.168.1.50:/home/pi/

For larger projects, CMake with a toolchain file is standard:

cmake
# toolchain-arm.cmake
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR arm)

set(CMAKE_C_COMPILER   arm-linux-gnueabihf-gcc)
set(CMAKE_CXX_COMPILER arm-linux-gnueabihf-g++)

set(CMAKE_SYSROOT /opt/sysroot)
set(CMAKE_FIND_ROOT_PATH /opt/sysroot)

# Search headers/libs only in sysroot, programs only on host
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)

bash
# Build with the toolchain file
mkdir build && cd build
cmake -DCMAKE_TOOLCHAIN_FILE=../toolchain-arm.cmake ..
make -j$(nproc)

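Worked example: A program that reads a GPIO pin via the legacy sysfs interface (deprecated since kernel 4.8 in favor of the /dev/gpiochipN character device, but still available on many boards when CONFIG_GPIO_SYSFS is enabled):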

c
/* gpio_read.c: Read GPIO pin state via sysfs */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

#define GPIO_PIN  "17"
#define GPIO_PATH "/sys/class/gpio/"

/* Write a short string to a sysfs file; returns 0 on success, -1 on error */
static int write_str(const char *path, const char *s) {
    int fd = open(path, O_WRONLY);
    if (fd < 0) { perror(path); return -1; }
    ssize_t n = write(fd, s, strlen(s));
    if (n < 0) perror(path);
    close(fd);
    return n < 0 ? -1 : 0;
}

int main(void) {
    int fd, status = 1;
    char buf[64];
    char val;

    /* Export the GPIO pin */
    if (write_str(GPIO_PATH "export", GPIO_PIN) < 0) return 1;

    usleep(100000); /* Wait for sysfs to create entries */

    /* Set direction to input */
    snprintf(buf, sizeof(buf), GPIO_PATH "gpio%s/direction", GPIO_PIN);
    if (write_str(buf, "in") < 0) goto unexport;

    /* Read value */
    snprintf(buf, sizeof(buf), GPIO_PATH "gpio%s/value", GPIO_PIN);
    fd = open(buf, O_RDONLY);
    if (fd < 0) { perror(buf); goto unexport; }
    if (read(fd, &val, 1) == 1) {
        printf("GPIO %s = %c\n", GPIO_PIN, val);
        status = 0;
    } else {
        fprintf(stderr, "Failed to read %s\n", buf);
    }
    close(fd);

unexport:
    /* Unexport, even if an earlier step failed */
    write_str(GPIO_PATH "unexport", GPIO_PIN);
    return status;
}

bash
# Cross-compile and deploy
arm-linux-gnueabihf-gcc -Wall -O2 -o gpio_read gpio_read.c
scp gpio_read pi@192.168.1.50:/home/pi/
ssh pi@192.168.1.50 sudo ./gpio_read
# GPIO 17 = 1

What does the --sysroot flag tell the cross-compiler?

Where to install the compiled binary
Where to find the target's headers and libraries for compiling and linking
The IP address of the target board

Chapter 3: Services & Daemons

Your embedded device boots up. No one is logged in. No terminal is open. Yet your temperature monitoring program needs to be running, logging data, and raising alarms. That's what a daemon is: a background process with no controlling terminal that starts at boot and runs forever.

The word "daemon" comes from Greek mythology — a background spirit doing useful work. In Unix, a daemon is a process that has detached from any terminal and runs independently in the background. Think of it as a service: it starts, it does its job, it doesn't need human interaction.

The classic daemon recipe (pre-systemd): fork() to create a child and let the parent exit (so the shell thinks the command finished), call setsid() in the child to become session leader (detach from the terminal), fork() a second time so the daemon can never re-acquire a terminal, reset the umask and working directory, and redirect stdin/stdout/stderr to /dev/null (no terminal I/O). Then enter the main loop. This five-step ritual has been the Unix way for decades.

c
/* Classic daemon creation (the traditional Unix way) */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <syslog.h>

void daemonize(void) {
    pid_t pid;
    int fd;

    /* Step 1: Fork (parent exits, child continues) */
    pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);  /* Parent exits */

    /* Step 2: Create new session (detach from terminal) */
    if (setsid() < 0) exit(EXIT_FAILURE);

    /* Step 3: Fork again (ensure we can't re-acquire a terminal) */
    pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);

    /* Step 4: Reset the file mode mask, change working dir */
    umask(0);
    if (chdir("/") < 0) exit(EXIT_FAILURE);

    /* Step 5: Point the standard descriptors at /dev/null.
       Just closing them would let the next open() reuse 0, 1 or 2. */
    fd = open("/dev/null", O_RDWR);
    if (fd < 0) exit(EXIT_FAILURE);
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO) close(fd);
}

int main(void) {
    daemonize();
    openlog("tempmon", LOG_PID, LOG_DAEMON);
    syslog(LOG_INFO, "Temperature monitor started");

    while (1) {
        /* Read temperature, log it, check threshold */
        syslog(LOG_INFO, "Temperature: 42.3C");
        sleep(10);
    }

    closelog();
    return 0;
}

The modern way: systemd. Instead of hand-coding the fork/setsid ritual, you write your program as a simple foreground process and let systemd handle daemonization, logging, restart-on-crash, and dependency ordering. A unit file tells systemd how to manage your service:

ini
# /etc/systemd/system/tempmon.service
[Unit]
Description=Temperature Monitor Daemon
After=network.target

[Service]
Type=notify
ExecStart=/usr/bin/tempmon
Restart=always
RestartSec=5
WatchdogSec=30
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target

Why systemd wins for embedded: Restart=always with RestartSec=5 means that if your daemon crashes, systemd restarts it after 5 seconds. WatchdogSec=30 means that if your daemon doesn't ping systemd at least once every 30 seconds, systemd assumes it's hung, kills it, and (because of Restart=always) starts it again. This is critical for unattended devices in the field.
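
How often should the daemon ping? When WatchdogSec is set, systemd passes the timeout to the service in the WATCHDOG_USEC environment variable (in microseconds), and the sd_watchdog_enabled() documentation recommends pinging at half that interval. The helper below is a small sketch of that calculation; the function name is made up for this example, and treating anything that is not a plain decimal number as "watchdog disabled" is an assumption.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Given the value of WATCHDOG_USEC, return the suggested delay between
 * WATCHDOG=1 pings in microseconds (half the timeout).
 * Returns 0 if the variable is unset or not a plain decimal number,
 * which the caller should treat as "watchdog disabled". */
unsigned long long watchdog_ping_interval_usec(const char *watchdog_usec)
{
    char *end;
    unsigned long long timeout;

    /* Reject NULL, empty strings, signs and leading whitespace. */
    if (watchdog_usec == NULL || !isdigit((unsigned char)watchdog_usec[0]))
        return 0;

    errno = 0;
    timeout = strtoull(watchdog_usec, &end, 10);
    if (errno != 0 || *end != '\0')
        return 0;

    return timeout / 2;
}
```

A daemon would call watchdog_ping_interval_usec(getenv("WATCHDOG_USEC")) once at startup. With WatchdogSec=30 the variable holds 30000000, so the suggested ping interval is 15 seconds, and a 10-second loop stays comfortably inside it.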

Watchdog integration in your code:

c
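#include <unistd.h>
#include <systemd/sd-daemon.h>  /* link with -lsystemd */

/* Application-specific work, defined elsewhere */
void read_temperature(void);
void log_data(void);

int main(void) {
    sd_notify(0, "READY=1");  /* Tell systemd we're up */

    while (1) {
        /* Do work... */
        read_temperature();
        log_data();

        /* Kick the watchdog: "I'm still alive" */
        sd_notify(0, "WATCHDOG=1");
        sleep(10);
    }
}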

bash
# Service management commands
sudo systemctl start tempmon      # Start the service
sudo systemctl stop tempmon       # Stop it
sudo systemctl enable tempmon     # Start at boot
sudo systemctl status tempmon     # Check status
journalctl -u tempmon -f          # Follow logs in real-time

Worked example: A complete temperature monitoring daemon:

c
/* tempmon.c — Systemd-friendly temperature monitor */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <syslog.h>
#include <systemd/sd-daemon.h>

#define TEMP_PATH "/sys/class/thermal/thermal_zone0/temp"
#define THRESHOLD 70000  /* 70°C in millidegrees */
#define INTERVAL  10     /* seconds */

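int read_temp(void) {
    int fd;
    ssize_t n;
    char buf[16];

    fd = open(TEMP_PATH, O_RDONLY);
    if (fd < 0) return -1;
    n = read(fd, buf, sizeof(buf) - 1);  /* leave room for the terminator */
    close(fd);
    if (n <= 0) return -1;
    buf[n] = '\0';                       /* read() does not NUL-terminate */
    return atoi(buf);
}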

int main(void) {
    int temp;

    openlog("tempmon", LOG_PID, LOG_DAEMON);
    sd_notify(0, "READY=1");
    syslog(LOG_INFO, "Temperature monitor started");

    while (1) {
        temp = read_temp();
        if (temp < 0) {
            syslog(LOG_ERR, "Failed to read temperature");
        } else {
            syslog(LOG_INFO, "Temp: %d.%dC", temp/1000, (temp%1000)/100);
            if (temp > THRESHOLD)
                syslog(LOG_WARNING, "ALERT: Over %dC!", THRESHOLD/1000);
        }

        sd_notify(0, "WATCHDOG=1");
        sleep(INTERVAL);
    }

    closelog();
    return 0;
}

In a systemd service file, what does Restart=always do?

Restarts the service automatically if it exits for any reason
Restarts the entire system on crash
Prevents the service from ever stopping

Chapter 4: Kernel Modules

The Linux kernel is monolithic — all kernel code runs in the same address space, sharing the same memory. But you don't want to rebuild the entire kernel every time you add a new driver. Kernel modules solve this: they're chunks of code you can load into the running kernel and unload without rebooting.

Think of modules as plugins for the kernel. Your WiFi driver? A module. Your filesystem driver? A module. Your custom hardware driver? You guessed it — a module. The command lsmod on any Linux system shows dozens of loaded modules.

The fundamental difference: Userspace code runs in a sandbox — if it crashes, only that process dies. Kernel code (including modules) runs with full hardware access. If your module has a bug, it can crash the entire system. There is no protection. You are operating without a safety net.

The minimal kernel module has exactly two functions: one called when the module loads (init), one called when it unloads (exit):

c
/* hello_module.c — The minimal kernel module */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("A minimal hello world kernel module");

static int __init hello_init(void) {
    printk(KERN_INFO "hello: Module loaded!\n");
    return 0;  /* 0 = success, negative = error */
}

static void __exit hello_exit(void) {
    printk(KERN_INFO "hello: Module unloaded.\n");
}

module_init(hello_init);
module_exit(hello_exit);

printk, not printf: You're in kernel space now. There's no libc, no printf, no malloc. Instead: printk() for logging (output goes to the kernel ring buffer, viewable with dmesg), kmalloc() for memory allocation, and kernel-specific APIs for everything else. You're writing code for a different world.

The Makefile for kernel modules is unique — it delegates to the kernel's own build system (kbuild):

makefile
# Makefile for out-of-tree kernel module
obj-m += hello_module.o

# For cross-compilation:
KDIR  := /path/to/target/kernel/source
ARCH  := arm
CROSS := arm-linux-gnueabihf-

all:
	$(MAKE) -C $(KDIR) M=$(CURDIR) ARCH=$(ARCH) CROSS_COMPILE=$(CROSS) modules

clean:
	$(MAKE) -C $(KDIR) M=$(CURDIR) ARCH=$(ARCH) CROSS_COMPILE=$(CROSS) clean

bash
# Build the module
make

# Copy to target
scp hello_module.ko pi@192.168.1.50:/home/pi/

# On the target:
sudo insmod hello_module.ko    # Load
dmesg | tail -1                # See: "hello: Module loaded!"
lsmod | grep hello             # Verify it's loaded
sudo rmmod hello_module        # Unload
dmesg | tail -1                # See: "hello: Module unloaded."

Module parameters let users configure your module at load time without recompiling:

c
static int interval = 5;
static char *name = "world";

module_param(interval, int, 0644);
MODULE_PARM_DESC(interval, "Polling interval in seconds");

module_param(name, charp, 0644);
MODULE_PARM_DESC(name, "Name to greet");

/* Usage: sudo insmod hello_module.ko interval=10 name="Linux" */

Worked example: A module that creates a /proc entry for reading:

c
/* proc_module.c — Creates /proc/myinfo */
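#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/jiffies.h>
#include <linux/math64.h>

MODULE_LICENSE("GPL");

static int myinfo_show(struct seq_file *m, void *v) {
    /* jiffies_64 starts at INITIAL_JIFFIES rather than 0, so subtract it.
       get_jiffies_64() reads the counter safely on 32-bit CPUs. */
    u64 ticks = get_jiffies_64() - INITIAL_JIFFIES;

    seq_printf(m, "Uptime ticks: %llu\n", ticks);
    seq_printf(m, "HZ: %d\n", HZ);
    /* div_u64(): a plain 64-bit '/' may fail to link on 32-bit ARM */
    seq_printf(m, "Seconds: %llu\n", div_u64(ticks, HZ));
    return 0;
}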

static int myinfo_open(struct inode *inode, struct file *file) {
    return single_open(file, myinfo_show, NULL);
}

static const struct proc_ops myinfo_ops = {
    .proc_open    = myinfo_open,
    .proc_read    = seq_read,
    .proc_lseek   = seq_lseek,
    .proc_release = single_release,
};

static int __init myinfo_init(void) {
    if (!proc_create("myinfo", 0444, NULL, &myinfo_ops))
        return -ENOMEM;
    printk(KERN_INFO "myinfo: /proc/myinfo created\n");
    return 0;
}

static void __exit myinfo_exit(void) {
    remove_proc_entry("myinfo", NULL);
    printk(KERN_INFO "myinfo: /proc/myinfo removed\n");
}

module_init(myinfo_init);
module_exit(myinfo_exit);

bash
# After loading the module:
cat /proc/myinfo
# Uptime ticks: 4523890
# HZ: 100
# Seconds: 45238

What happens if a kernel module dereferences a NULL pointer?

The module process receives SIGSEGV
Nothing: the kernel handles it gracefully
Kernel panic or oops: the whole system may crash

Chapter 5: Character Device Drivers

In Unix, everything is a file. Your serial port is /dev/ttyS0. Your webcam is /dev/video0. Your random number generator is /dev/urandom. When a userspace program calls open("/dev/mydevice"), that call travels through the kernel's Virtual Filesystem (VFS) layer and arrives at YOUR driver's open() function. You control what happens next.

A character device driver presents hardware as a stream of bytes. Userspace reads and writes bytes; the driver translates those bytes into hardware operations. The interface is defined by the struct file_operations — a table of function pointers that the kernel calls when userspace does I/O:

The mental model: A character driver is a translator. Userspace speaks "read/write bytes." Hardware speaks "toggle GPIO pins" or "read ADC registers." Your driver translates between these two languages. The file_operations struct is the dictionary.

c
/* The file_operations struct — your driver's API */
static struct file_operations mydev_fops = {
    .owner   = THIS_MODULE,
    .open    = mydev_open,      /* Called on open("/dev/mydev") */
    .release = mydev_release,   /* Called on close() */
    .read    = mydev_read,      /* Called on read() */
    .write   = mydev_write,     /* Called on write() */
    .unlocked_ioctl = mydev_ioctl, /* Called on ioctl() */
};

Device files are identified by major and minor numbers. The major number identifies the driver. The minor number identifies which device instance (if the driver handles multiple devices). When you run mknod /dev/myled c 240 0, you're creating a character device (c) with major 240, minor 0. Majors 240-254 are reserved for local and experimental use, which makes them a safe choice for experiments.
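
From userspace you can inspect these numbers with stat() and the major()/minor() macros. The sketch below assumes a Linux C library that provides sys/sysmacros.h (glibc and musl both do); the two helper names are made up for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* makedev(), major(), minor() */

/* Format a device number as "major:minor". Returns snprintf()'s result. */
int format_dev(dev_t dev, char *out, size_t n)
{
    return snprintf(out, n, "%u:%u", major(dev), minor(dev));
}

/* Look up the device number behind a character device file.
 * Returns 0 on success, -1 if the path cannot be stat()ed or is not
 * a character device. */
int char_dev_number(const char *path, dev_t *dev)
{
    struct stat st;

    if (stat(path, &st) != 0 || !S_ISCHR(st.st_mode))
        return -1;

    *dev = st.st_rdev;   /* st_rdev holds the major/minor pair */
    return 0;
}
```

For a node created with mknod /dev/myled c 240 0, char_dev_number("/dev/myled", &dev) followed by format_dev() would yield the string "240:0".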

Worked example: A complete LED driver. Writing "1" turns the LED on, writing "0" turns it off, reading returns the current state:

c
/* led_driver.c — Character device driver for an LED */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/uaccess.h>
#include <linux/gpio.h>

#define DEVICE_NAME "myled"
#define LED_GPIO    17

MODULE_LICENSE("GPL");

static dev_t dev_num;
static struct cdev myled_cdev;
static struct class *myled_class;
static int led_state = 0;

static int myled_open(struct inode *i, struct file *f) {
    printk(KERN_INFO "myled: opened\n");
    return 0;
}

static int myled_release(struct inode *i, struct file *f) {
    printk(KERN_INFO "myled: closed\n");
    return 0;
}

static ssize_t myled_read(struct file *f, char __user *buf,
                          size_t len, loff_t *off) {
    char val = led_state ? '1' : '0';

    if (*off > 0) return 0;  /* EOF on second read */
    if (copy_to_user(buf, &val, 1)) return -EFAULT;
    *off += 1;
    return 1;
}

static ssize_t myled_write(struct file *f, const char __user *buf,
                           size_t len, loff_t *off) {
    char val;

    if (len < 1) return -EINVAL;
    if (copy_from_user(&val, buf, 1)) return -EFAULT;

    if (val == '1') {
        gpio_set_value(LED_GPIO, 1);
        led_state = 1;
    } else if (val == '0') {
        gpio_set_value(LED_GPIO, 0);
        led_state = 0;
    }

    return len;
}

static struct file_operations myled_fops = {
    .owner   = THIS_MODULE,
    .open    = myled_open,
    .release = myled_release,
    .read    = myled_read,
    .write   = myled_write,
};

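static int __init myled_init(void) {
    struct device *dev;
    int ret;

    /* 1. Allocate device number */
    ret = alloc_chrdev_region(&dev_num, 0, 1, DEVICE_NAME);
    if (ret) return ret;

    /* 2. Initialize cdev and add to kernel */
    cdev_init(&myled_cdev, &myled_fops);
    ret = cdev_add(&myled_cdev, dev_num, 1);
    if (ret) goto err_region;

    /* 3. Create device class and device file automatically.
       One-argument class_create() needs kernel 6.4+; older kernels
       take class_create(THIS_MODULE, DEVICE_NAME). */
    myled_class = class_create(DEVICE_NAME);
    if (IS_ERR(myled_class)) { ret = PTR_ERR(myled_class); goto err_cdev; }
    dev = device_create(myled_class, NULL, dev_num, NULL, DEVICE_NAME);
    if (IS_ERR(dev)) { ret = PTR_ERR(dev); goto err_class; }

    /* 4. Request and configure GPIO */
    ret = gpio_request(LED_GPIO, "led");
    if (ret) goto err_device;
    ret = gpio_direction_output(LED_GPIO, 0);
    if (ret) goto err_gpio;

    printk(KERN_INFO "myled: registered at major %d\n", MAJOR(dev_num));
    return 0;

    /* Undo completed steps in reverse order */
err_gpio:
    gpio_free(LED_GPIO);
err_device:
    device_destroy(myled_class, dev_num);
err_class:
    class_destroy(myled_class);
err_cdev:
    cdev_del(&myled_cdev);
err_region:
    unregister_chrdev_region(dev_num, 1);
    return ret;
}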

static void __exit myled_exit(void) {
    gpio_set_value(LED_GPIO, 0);
    gpio_free(LED_GPIO);
    device_destroy(myled_class, dev_num);
    class_destroy(myled_class);
    cdev_del(&myled_cdev);
    unregister_chrdev_region(dev_num, 1);
    printk(KERN_INFO "myled: unregistered\n");
}

module_init(myled_init);
module_exit(myled_exit);

bash
# Usage from userspace:
sudo insmod led_driver.ko
ls -la /dev/myled                  # Device file auto-created
echo "1" | sudo tee /dev/myled    # LED ON
cat /dev/myled                     # Read state: "1"
echo "0" | sudo tee /dev/myled    # LED OFF

copy_to_user / copy_from_user: You can NEVER directly dereference a userspace pointer from kernel space. The user's memory might be paged out, or the pointer might be invalid. These functions safely copy data across the kernel/user boundary and handle page faults. They return the number of bytes that could not be copied (0 on success), which is why the driver returns -EFAULT whenever the result is nonzero.

Why must you use copy_from_user() instead of directly reading a userspace pointer in kernel code?

Because the compiler won't allow it
Because userspace memory may be paged out or the pointer may be invalid; direct access could crash the kernel
Because userspace uses different byte ordering

Chapter 6: Platform Drivers & Device Tree

The LED driver in Chapter 5 has a problem: the GPIO number (17) is hardcoded. If you move to a different board where the LED is on GPIO 22, you'd have to modify the source and recompile. That's terrible engineering. Modern Linux solves this with the Device Tree: a data structure that describes hardware separately from the driver code.

The Device Tree (DT) is a hierarchical description of the hardware on your board. It tells the kernel: "at address 0x3F200000 there's a GPIO controller" and "connected to GPIO 17 there's an LED." The driver code says: "I know how to drive LEDs" but never mentions specific addresses or pin numbers. The kernel matches them together.

The separation principle: Driver code describes HOW to talk to hardware (the algorithm). Device Tree describes WHICH hardware exists on THIS specific board (the wiring). The same driver binary works on any board of the same architecture; you only change the .dtb file. This is why one kernel image can boot on hundreds of different ARM boards.

A Device Tree snippet (a .dts file) for our LED:

dts
/* Device Tree Source for our LED */
/ {
    my_led: my-led {   /* no @address: the node has no reg property */
        compatible = "mycompany,my-led";   /* Matching string */
        gpios = <&gpio 17 GPIO_ACTIVE_HIGH>; /* Which GPIO */
        label = "status-led";
        default-state = "off";
        status = "okay";
    };
};

The platform driver registers itself with a compatible string. When the kernel boots and parses the Device Tree, it finds a node with compatible = "mycompany,my-led" and calls your driver's probe() function, passing all the information from the DT node:

c
/* led_platform.c — Platform driver matching Device Tree */
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/of.h>
#include <linux/gpio/consumer.h>

MODULE_LICENSE("GPL");

struct my_led_data {
    struct gpio_desc *gpio;
    int state;
};

static int my_led_probe(struct platform_device *pdev) {
    struct my_led_data *led;
    struct device *dev = &pdev->dev;

    /* Allocate driver-private data */
    led = devm_kzalloc(dev, sizeof(*led), GFP_KERNEL);
    if (!led) return -ENOMEM;

    /* Get GPIO from Device Tree — no hardcoded numbers! */
    led->gpio = devm_gpiod_get(dev, NULL, GPIOD_OUT_LOW);
    if (IS_ERR(led->gpio)) {
        dev_err(dev, "Failed to get GPIO\n");
        return PTR_ERR(led->gpio);
    }

    platform_set_drvdata(pdev, led);
    dev_info(dev, "LED driver probed successfully\n");
    return 0;
}

/* Note: in recent kernels (around 6.11 and later) .remove returns void */
static int my_led_remove(struct platform_device *pdev) {
    struct my_led_data *led = platform_get_drvdata(pdev);
    gpiod_set_value(led->gpio, 0);  /* Turn off on remove */
    dev_info(&pdev->dev, "LED driver removed\n");
    return 0;
}

/* Match table: links this driver to Device Tree nodes */
static const struct of_device_id my_led_of_match[] = {
    { .compatible = "mycompany,my-led" },
    { }  /* Sentinel — marks end of table */
};
MODULE_DEVICE_TABLE(of, my_led_of_match);

static struct platform_driver my_led_driver = {
    .probe  = my_led_probe,
    .remove = my_led_remove,
    .driver = {
        .name = "my-led",
        .of_match_table = my_led_of_match,
    },
};

module_platform_driver(my_led_driver);

devm_ functions (managed resources): Any resource obtained through a devm_ function is released automatically when the driver is unbound from the device, or when probe() fails. Memory from devm_kzalloc is freed and the GPIO from devm_gpiod_get is released. No manual cleanup is needed because the kernel tracks it for you. This eliminates an entire class of resource leak bugs.

The matching process works like this:

1. Kernel boots

Parses the Device Tree Blob (.dtb) loaded by bootloader

↓

2. Creates platform devices

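Top-level DT nodes (and children of simple-bus nodes) with a "compatible" string become platform_devices; devices on buses such as I2C or SPI are created by their bus controller driver instead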

↓

3. Driver loads

Module registers its of_match_table with the platform bus

↓

4. Bus matches

Kernel compares driver's compatible strings vs device's compatible strings

↓

5. probe() called

Driver's probe function runs — hardware is now initialized
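
Step 4 of the flow above boils down to string comparison. The following userspace sketch models it: a device offers one or more compatible strings, the driver offers a sentinel-terminated table, and the first hit wins. This is a simplified model, not the kernel's implementation (the real matcher also ranks matches by specificity), and struct match_entry with a NULL sentinel is a stand-in for struct of_device_id.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct of_device_id. */
struct match_entry {
    const char *compatible;   /* NULL marks the end of the table */
};

/* Compare a device's compatible strings against a driver's match table.
 * Returns the index of the matching table entry, or -1 if the driver
 * does not handle this device. */
int match_compatible(const char *const *dev_compat, size_t n_dev,
                     const struct match_entry *table)
{
    /* Device strings are listed most specific first, so try them in order. */
    for (size_t d = 0; d < n_dev; d++) {
        for (int t = 0; table[t].compatible != NULL; t++) {
            if (strcmp(dev_compat[d], table[t].compatible) == 0)
                return t;
        }
    }
    return -1;
}
```

With a table containing only "mycompany,my-led", a device node whose compatible property is "mycompany,my-led" matches entry 0, while a node with "ti,tmp102" yields -1 and the driver's probe() is never called for it.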

Key Device Tree syntax patterns:

dts
/* Node naming: name@address */
i2c_sensor: sensor@48 {
    compatible = "ti,tmp102";    /* vendor,device */
    reg = <0x48>;                  /* I2C address */
    interrupt-parent = <&gpio1>;
    interrupts = <7 IRQ_TYPE_EDGE_FALLING>;
};

/* Referencing other nodes (phandles) */
spi_display: display@0 {
    compatible = "sitronix,st7789";
    reg = <0>;                     /* SPI chip select */
    spi-max-frequency = <40000000>;
    dc-gpios = <&gpio 25 GPIO_ACTIVE_HIGH>;
    reset-gpios = <&gpio 24 GPIO_ACTIVE_LOW>;
};

What is the purpose of the "compatible" string in a Device Tree node?

It specifies the device's memory address
It sets the device's operating voltage
It's the key the kernel uses to match a device to its driver

Chapter 7: Building the Kernel (SHOWCASE)

The Linux kernel is one of the largest collaborative software projects in history: 30+ million lines of code, 15,000+ configuration options, support for dozens of architectures. Building it from source gives you complete control over what runs on your embedded device: every driver included, every feature enabled or disabled, every byte of the final image justified.

The kernel build process has four stages:

1. Get Source

git clone or download tarball from kernel.org

↓

2. Configure

make menuconfig → .config file with 15,000+ options

↓

3. Compile

make -j$(nproc) → zImage/Image + .ko modules + .dtb files

↓

4. Install

Copy kernel + dtb to boot partition, modules to rootfs

bash
# Full kernel build for ARM (e.g., Raspberry Pi)

# 1. Get source
git clone --depth=1 https://github.com/raspberrypi/linux.git
cd linux

# 2. Configure (start from default config for this board)
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- bcm2711_defconfig

# 2b. Customize (interactive menu)
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- menuconfig

# 3. Compile (all cores)
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j$(nproc) zImage modules dtbs

# 4. Install modules to staging directory
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- \
     INSTALL_MOD_PATH=/tmp/rootfs modules_install

# 4b. Copy kernel and device tree to SD card boot partition
# (kernels 6.5 and later keep the .dtb under arch/arm/boot/dts/broadcom/)
cp arch/arm/boot/zImage /mnt/boot/kernel7l.img
cp arch/arm/boot/dts/bcm2711-rpi-4-b.dtb /mnt/boot/

The .config file is where all 15,000+ decisions are recorded. Each option can be:

Value	Meaning	Effect on Size
`y`	Built into kernel image	Kernel gets bigger, always available
`m`	Built as loadable module (.ko)	Kernel stays small, load on demand
`n` (or absent)	Not built at all; .config records it as `# CONFIG_FOO is not set`	Code excluded entirely

The key tradeoff: Built-in (y) means it's always available instantly at boot — essential for root filesystem drivers. Module (m) means it loads on demand — saves RAM for rarely-used drivers. Disabled (n) means the code doesn't exist in your build — smaller image, less attack surface. For embedded: disable everything you don't need.
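
To see the three states in a real .config, note that y and m appear as CONFIG_FOO=y and CONFIG_FOO=m, while a disabled option is written as the comment line "# CONFIG_FOO is not set". The function below is a small sketch that classifies one line accordingly. Its name and return convention are made up for this example, and it ignores string and numeric options.

```c
#include <stdio.h>
#include <string.h>

/* Classify one line of a kernel .config file.
 * Returns 'y' (built in), 'm' (module), 'n' (explicitly not set),
 * or 0 for anything else (comments, blank lines, string/number options).
 * On 'y', 'm' and 'n' the option name without the CONFIG_ prefix is
 * stored in name, which must hold at least 64 bytes. */
char parse_config_line(const char *line, char *name)
{
    char value[8];
    int end = 0;

    /* Form 1: CONFIG_FOO=<value> */
    if (sscanf(line, "CONFIG_%63[A-Za-z0-9_]=%7[^\n]", name, value) == 2) {
        if (strcmp(value, "y") == 0) return 'y';
        if (strcmp(value, "m") == 0) return 'm';
        return 0;   /* e.g. CONFIG_HZ=100 */
    }

    /* Form 2: # CONFIG_FOO is not set
     * %n is only stored if the whole pattern matched. */
    if (sscanf(line, "# CONFIG_%63[A-Za-z0-9_] is not set%n", name, &end) == 1
        && end > 0)
        return 'n';

    return 0;
}
```

Feeding it "CONFIG_EXT4_FS=y" returns 'y' with name set to "EXT4_FS", and "# CONFIG_BTRFS_FS is not set" returns 'n' with name set to "BTRFS_FS". Looping over every line of a .config and counting the results gives a quick picture of how much is built in versus built as modules.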

Critical kernel configuration categories:

kconfig
# Preemption Model (General Setup → Preemption Model)
# These three are mutually exclusive: set exactly one of them to y.
CONFIG_PREEMPT_NONE=y      # Server: max throughput, poor latency
CONFIG_PREEMPT_VOLUNTARY=y # Desktop: good balance
CONFIG_PREEMPT=y           # Embedded/RT: low latency, some overhead

# Filesystem Support
CONFIG_EXT4_FS=y           # Built-in (need this to mount root!)
CONFIG_VFAT_FS=m           # Module (for SD card FAT partitions)
# CONFIG_BTRFS_FS is not set  # Disabled (we don't use Btrfs); this comment form is how .config records "n"

# Networking
CONFIG_NET=y               # Built-in (needed for basically everything)
CONFIG_INET=y              # TCP/IP
CONFIG_WIRELESS=y          # Wireless menu (a yes/no option, cannot be a module)
CONFIG_CFG80211=m          # WiFi configuration stack as module
CONFIG_BT=m                # Bluetooth as module

# Device Drivers
CONFIG_I2C=y               # I2C bus support
CONFIG_SPI=y               # SPI bus support
CONFIG_GPIO_SYSFS=y        # Legacy sysfs GPIO access from userspace (deprecated; CONFIG_GPIO_CDEV is the current interface)
CONFIG_USB_SUPPORT=y       # USB host support

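Dependencies: Options depend on each other, and Kconfig expresses this in two ways. A "depends on" relation hides an option until its prerequisite is set: CONFIG_MAC80211 cannot be enabled until CONFIG_CFG80211 is, and no networking option is selectable without CONFIG_NET. A "select" relation works in the other direction and forces a helper option on: enabling CONFIG_MAC80211 pulls in CONFIG_CRYPTO automatically. Disabling CONFIG_NET therefore forces all networking options off. The config system enforces these constraints automatically.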
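
One way to see what the dependency resolver changed is to compare the enabled options before and after an edit. The kernel tree ships scripts/diffconfig for this; the sketch below is a simplified stand-in written for illustration, and the function names and sample files are hypothetical.

```shell
#!/bin/sh
# config_enabled FILE: print the names of options set to y or m, sorted.
config_enabled() {
    # "|| true" keeps an empty result from counting as a failure
    { grep -E '^CONFIG_[A-Za-z0-9_]+=(y|m)$' "$1" || true; } | cut -d= -f1 | LC_ALL=C sort
}

# config_dropped OLD NEW: options enabled in OLD that are no longer enabled in NEW.
config_dropped() {
    old_list=$(mktemp); new_list=$(mktemp)
    config_enabled "$1" > "$old_list"
    config_enabled "$2" > "$new_list"
    LC_ALL=C comm -23 "$old_list" "$new_list"   # lines present only in the old list
    rm -f "$old_list" "$new_list"
}

# Demo: hand-written samples imitating a config before and after CONFIG_NET was turned off.
before=$(mktemp); after=$(mktemp)
cat > "$before" <<'EOF'
CONFIG_NET=y
CONFIG_INET=y
CONFIG_BT=m
CONFIG_EXT4_FS=y
EOF
cat > "$after" <<'EOF'
# CONFIG_NET is not set
CONFIG_EXT4_FS=y
EOF
config_dropped "$before" "$after"
rm -f "$before" "$after"
```

The demo prints CONFIG_BT, CONFIG_INET, and CONFIG_NET: the one option that was switched off by hand plus the two that could not survive without it.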

Build output: After make completes, you get:

File	Location	Purpose
`zImage`	arch/arm/boot/	Compressed kernel (for ARM32)
`Image`	arch/arm64/boot/	Uncompressed kernel (ARM64)
`*.dtb`	arch/arm/boot/dts/ (vendor subdirectories since 6.5)	Compiled Device Tree Blob
`*.ko`	throughout tree	Loadable kernel modules
`vmlinux`	root of tree	Uncompressed ELF (for debugging)

bash
# Check the size of your kernel
ls -lh arch/arm/boot/zImage
# -rw-r--r-- 1 user user 6.2M  zImage

# Count modules built
find . -name "*.ko" | wc -l
# 247

# See what config produced this
grep -c '=y$' .config   # Built-in options: ~2100 (anchored so string values containing "=y" are not counted)
grep -c '=m$' .config   # Module options: ~250

Chapter 8: Root Filesystem & Boot

You've built the kernel. But the kernel alone is useless — it needs a root filesystem (rootfs) to mount. The rootfs contains everything userspace needs: the init system, shell, libraries, utilities, and your application. Without it, the kernel panics: Kernel panic - not syncing: VFS: Unable to mount root fs.

The complete boot sequence on an embedded ARM system:

1. Power On

CPU begins executing from a fixed ROM address (SoC-specific)

↓

2. Boot ROM

SoC's built-in code loads first-stage bootloader from storage

↓

3. U-Boot (SPL)

Secondary Program Loader: initializes DRAM, loads full U-Boot

↓

4. U-Boot

Full bootloader: loads kernel (zImage) + DTB from storage/network

↓

5. Kernel

Decompresses, initializes hardware, mounts rootfs

↓

6. Init (PID 1)

First userspace process: systemd, BusyBox init, or custom

↓

7. Services

Init starts your daemons, network, login shells

U-Boot is the embedded bootloader: Like GRUB for desktops but far more capable for embedded. It has a shell, can boot over TFTP/NFS, flash firmware, and is configured via environment variables. The kernel command line (e.g., root=/dev/mmcblk0p2 rootfstype=ext4) tells the kernel which partition to mount as root.
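
The kernel command line is a flat list of space-separated key=value tokens, which makes it easy to inspect from a script. The sketch below is a hypothetical helper (`cmdline_get` is not a standard tool) that pulls one value out of such a string. It does not handle quoted values containing spaces.

```shell
#!/bin/sh
# cmdline_get KEY "COMMAND LINE": print the value of the first KEY=... token.
cmdline_get() {
    key="$1"
    # Unquoted $2 is intentional: the shell splits it into tokens on whitespace.
    for tok in $2; do
        case "$tok" in
            "$key"=*)
                echo "${tok#*=}"   # strip everything up to and including the first "="
                return 0
                ;;
        esac
    done
    return 1
}

cmdline_get root "console=ttyS0,115200 root=/dev/mmcblk0p2 rootfstype=ext4"
```

On a running target the same function can read the live value with `cmdline_get root "$(cat /proc/cmdline)"`.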

Building a root filesystem from scratch is complex — you need cross-compiled busybox/coreutils, glibc, device nodes, init scripts. Two tools automate this entirely:

Tool	Philosophy	Config Time	Use When
Buildroot	Simple, Makefile-based	Minutes	Simple systems, learning, fast iteration
Yocto/OpenEmbedded	Layer-based, industrial	Hours	Production products, long-term maintenance, BSP vendors

bash
# Buildroot: build a complete rootfs in one command
git clone https://github.com/buildroot/buildroot.git
cd buildroot

# Start with a board-specific config
make raspberrypi4_defconfig

# Customize (add packages, configure init system)
make menuconfig

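# Build everything: toolchain + kernel + rootfs + bootloader
# (plain "make": Buildroot parallelizes inside each package via BR2_JLEVEL,
#  and top-level "make -jN" is not supported by default)
make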

# Output: a complete SD card image
ls output/images/
# sdcard.img  rootfs.ext4  zImage  bcm2711-rpi-4-b.dtb

# Flash to SD card
sudo dd if=output/images/sdcard.img of=/dev/sdX bs=4M status=progress

initramfs (initial RAM filesystem): A small filesystem loaded into RAM by the bootloader alongside the kernel. The kernel mounts it as a temporary root to run early-boot tasks (load storage drivers, decrypt disks, find the real rootfs) before switching to the final root filesystem.
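
The early-boot tasks described above usually live in a single script named /init at the top of the initramfs. The sketch below generates a minimal one, assuming a BusyBox-style userspace that provides mount and switch_root, and assuming the initramfs already contains empty /proc, /sys, and /dev directories. The helper name `write_init` and the root device are illustrative.

```shell
#!/bin/sh
# write_init DIR ROOTDEV: create DIR/init, the first program an initramfs runs.
write_init() {
    dir="$1"; rootdev="$2"
    mkdir -p "$dir"
    cat > "$dir/init" <<EOF
#!/bin/sh
# Early-boot script, run as PID 1 from the initramfs
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev
mkdir -p /newroot
mount -o ro $rootdev /newroot
# Hand over to the real init on the final root filesystem
exec switch_root /newroot /sbin/init
EOF
    chmod +x "$dir/init"
}

# Demo: generate the script in a scratch directory and show it.
stage=$(mktemp -d)
write_init "$stage" /dev/mmcblk0p2
cat "$stage/init"
rm -rf "$stage"
```

A real initramfs would add whatever comes before the mount (loading storage modules, unlocking an encrypted volume) and is then typically packed as a newc-format cpio archive.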

bash
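# U-Boot environment for network boot (TFTP + NFS)
# Entered at the U-Boot prompt, not in a Linux shell; the load addresses are board-specific examples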
setenv serverip 192.168.1.100
setenv ipaddr 192.168.1.50
setenv bootcmd 'tftp 0x80000000 zImage; tftp 0x82000000 board.dtb; bootz 0x80000000 - 0x82000000'
setenv bootargs 'console=ttyS0,115200 root=/dev/nfs nfsroot=192.168.1.100:/srv/nfs/rootfs ip=dhcp'
saveenv
boot

Development vs Production boot: During development, boot via TFTP+NFS — instant updates, no SD card swapping. For production, everything goes on eMMC/NAND: bootloader in a dedicated partition, kernel+dtb in a boot partition, rootfs in the main partition. Add an A/B partition scheme for safe OTA updates.
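
The A/B idea is easier to reason about once the slot logic is written down. The sketch below only prints a plan and touches no device. The partition numbers are a hypothetical layout (slot a on /dev/mmcblk0p2, slot b on /dev/mmcblk0p3), and how the boot slot is actually switched depends on your bootloader setup.

```shell
#!/bin/sh
# Hypothetical layout: slot a = /dev/mmcblk0p2, slot b = /dev/mmcblk0p3.
slot_device() {
    case "$1" in
        a) echo /dev/mmcblk0p2 ;;
        b) echo /dev/mmcblk0p3 ;;
        *) return 1 ;;
    esac
}

other_slot() {
    case "$1" in
        a) echo b ;;
        b) echo a ;;
        *) return 1 ;;
    esac
}

# plan_update ACTIVE_SLOT IMAGE: print the update steps without executing anything.
plan_update() {
    target=$(other_slot "$1") || return 1
    dev=$(slot_device "$target") || return 1
    echo "write $2 to $dev"
    echo "set boot slot to $target"
    echo "reboot; fall back to slot $1 if the new system fails its health check"
}

plan_update a rootfs.ext4
```

The key property is that the running slot is never written: a failed or interrupted update leaves a known-good system to boot back into.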

A minimal rootfs directory structure:

text
/
├── bin/          busybox, sh, ls, cat, ...
├── sbin/         init, mount, ifconfig, ...
├── lib/          libc.so, ld-linux.so (target libraries)
├── etc/
│   ├── init.d/   Boot scripts (if using BusyBox init)
│   ├── fstab     Filesystem mount table
│   └── inittab   BusyBox init configuration
├── dev/          Device nodes (/dev/console, /dev/null)
├── proc/         Mount point for procfs
├── sys/          Mount point for sysfs
├── tmp/          Temporary files (tmpfs)
└── usr/
    ├── bin/      Additional utilities
    └── lib/      Additional libraries
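
The layout above can be created in a staging directory with a few lines of shell. This is a minimal sketch that creates only the empty directories. The helper name `make_skeleton` is illustrative, and it leaves out the binaries, libraries, and device nodes (a kernel built with devtmpfs can populate /dev itself).

```shell
#!/bin/sh
# make_skeleton DIR: create the empty directory layout of a minimal rootfs.
make_skeleton() {
    root="$1"
    for d in bin sbin lib etc/init.d dev proc sys tmp usr/bin usr/lib; do
        mkdir -p "$root/$d"
    done
    chmod 1777 "$root/tmp"   # world-writable with the sticky bit, as /tmp usually is
}

# Demo: build the skeleton in a scratch directory and list it.
staging=$(mktemp -d)
make_skeleton "$staging"
(cd "$staging" && find . -type d | LC_ALL=C sort)
rm -rf "$staging"
```

Buildroot and Yocto do the same job with far more care (ownership, permissions, device tables), so treat this as a way to understand the structure rather than a replacement for them.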

What happens if the kernel can't mount a root filesystem?

Kernel panic: it prints "Unable to mount root fs" and halts

It creates a blank filesystem automatically

It drops to a U-Boot shell

Chapter 9: Mastery & Connections

You now understand the complete embedded Linux stack: from deciding whether to use Linux at all, through cross-compilation, daemon creation, kernel module development, character device drivers, platform drivers with Device Tree, kernel building, and the boot process. Let's consolidate with templates and cheat sheets you can use in real projects.

The full picture: Embedded Linux is a layered system. Hardware ↔ Device Tree ↔ Platform Driver ↔ Character Device ↔ Userspace Application. Each layer communicates through well-defined interfaces. Mastering these interfaces is mastering embedded Linux.

Kernel Module Template — starting point for any new module:

c
// SPDX-License-Identifier: GPL-2.0
/* my_module.c — Template */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("Description");
MODULE_VERSION("1.0");

static int __init my_init(void) {
    pr_info("Module loaded\n");
    return 0;
}

static void __exit my_exit(void) {
    pr_info("Module unloaded\n");
}

module_init(my_init);
module_exit(my_exit);
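
To build the template out of tree, Kbuild needs a one-line Makefile next to the source and a make invocation that points at a configured kernel tree. The sketch below writes that Makefile and prints the command rather than running it. The helper names are illustrative and /path/to/linux is a placeholder for your own tree.

```shell
#!/bin/sh
# write_kbuild DIR MODULE: create the one-line Makefile Kbuild needs for an out-of-tree module.
write_kbuild() {
    printf 'obj-m := %s.o\n' "$2" > "$1/Makefile"
}

# module_build_cmd KERNEL_SRC MODULE_DIR ARCH CROSS_COMPILE: print the cross-build command.
module_build_cmd() {
    echo "make -C $1 M=$2 ARCH=$3 CROSS_COMPILE=$4 modules"
}

# Demo in a scratch directory; my_module.c itself is assumed to be copied there by you.
work=$(mktemp -d)
write_kbuild "$work" my_module
cat "$work/Makefile"
module_build_cmd /path/to/linux "$work" arm arm-linux-gnueabihf-
rm -rf "$work"
```

Running the printed command against a tree built with the same configuration as the kernel on your target produces my_module.ko, which you can then load with insmod and remove with rmmod.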

Character Device Driver Skeleton with automatic /dev creation:

c
// SPDX-License-Identifier: GPL-2.0
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/uaccess.h>

#define DEV_NAME "mychardev"
MODULE_LICENSE("GPL");

static dev_t devno;
static struct cdev cdev;
static struct class *cls;

static int     dev_open(struct inode *i, struct file *f) { return 0; }
static int     dev_release(struct inode *i, struct file *f) { return 0; }
static ssize_t dev_read(struct file *f, char __user *b, size_t n, loff_t *o) { return 0; }
static ssize_t dev_write(struct file *f, const char __user *b, size_t n, loff_t *o) { return n; }

static const struct file_operations fops = {
    .owner = THIS_MODULE, .open = dev_open, .release = dev_release,
    .read = dev_read, .write = dev_write,
};

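static int __init chrdev_init(void) {
    struct device *dev;
    int ret;

    /* Dynamically allocate one major/minor pair */
    ret = alloc_chrdev_region(&devno, 0, 1, DEV_NAME);
    if (ret)
        return ret;

    cdev_init(&cdev, &fops);
    ret = cdev_add(&cdev, devno, 1);
    if (ret)
        goto err_region;

    /* One-argument class_create() is the 6.4+ API; older kernels take (THIS_MODULE, name) */
    cls = class_create(DEV_NAME);
    if (IS_ERR(cls)) {
        ret = PTR_ERR(cls);
        goto err_cdev;
    }

    /* Triggers creation of /dev/mychardev */
    dev = device_create(cls, NULL, devno, NULL, DEV_NAME);
    if (IS_ERR(dev)) {
        ret = PTR_ERR(dev);
        goto err_class;
    }
    return 0;

    /* Unwind in reverse order so a failed load leaves nothing registered */
err_class:
    class_destroy(cls);
err_cdev:
    cdev_del(&cdev);
err_region:
    unregister_chrdev_region(devno, 1);
    return ret;
}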

static void __exit chrdev_exit(void) {
    device_destroy(cls, devno);
    class_destroy(cls);
    cdev_del(&cdev);
    unregister_chrdev_region(devno, 1);
}

module_init(chrdev_init);
module_exit(chrdev_exit);
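
Once the module is loaded, a quick smoke test from userspace confirms the node exists and the file operations are wired up. With the skeleton as written, write accepts every byte and read returns 0 (end of file) immediately. The sketch below checks exactly that. The function name is illustrative, and the demo runs against /dev/null, which happens to behave the same way, so it can be tried on a host without the module.

```shell
#!/bin/sh
# chardev_smoke [DEVICE]: write 5 bytes to the device, then count what can be read back.
chardev_smoke() {
    dev="${1:-/dev/mychardev}"
    if [ ! -c "$dev" ]; then
        echo "no character device at $dev" >&2
        return 1
    fi
    printf 'hello' > "$dev" || return 1
    n=$(wc -c < "$dev" | tr -d ' ')   # reads until the driver reports end of file
    echo "wrote 5 bytes, read back $n bytes"
}

# Stand-in device for the demo; on the target, call chardev_smoke with no argument.
chardev_smoke /dev/null
```

Once dev_read is filled in to return real data, the byte count reported here becomes a quick regression check.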

Device Tree syntax cheat sheet:

dts
/* Basic node */
node_label: node-name@address {
    compatible = "vendor,device";
    reg = <base_address size>;
    status = "okay";
};

/* GPIO reference */
gpios = <&gpio_controller pin_number flags>;

/* Interrupt */
interrupt-parent = <&intc>;
interrupts = <irq_number trigger_flags>;  /* cell count is set by the controller's #interrupt-cells */

/* Clock */
clocks = <&clock_controller clock_id>;
clock-names = "main";

/* DMA: one specifier per name, in the same order */
dmas = <&dma_controller tx_channel>, <&dma_controller rx_channel>;
dma-names = "tx", "rx";

Cross-compilation cheat sheet:

bash
# ARM 32-bit (Cortex-A7/A8/A9/A53 in 32-bit mode)
export CROSS_COMPILE=arm-linux-gnueabihf-
export ARCH=arm

# ARM 64-bit (Cortex-A53/A72 in 64-bit mode)
export CROSS_COMPILE=aarch64-linux-gnu-
export ARCH=arm64

# RISC-V
export CROSS_COMPILE=riscv64-linux-gnu-
export ARCH=riscv

# Common commands pattern
${CROSS_COMPILE}gcc -o app app.c       # Compile userspace
${CROSS_COMPILE}objdump -d app          # Disassemble
${CROSS_COMPILE}readelf -h app          # Check ELF header
${CROSS_COMPILE}strip app               # Remove debug symbols (smaller binary)
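
The ARCH-to-prefix pairs above are easy to mistype, so build scripts often centralize them. The sketch below is one hypothetical way to do that. The prefixes are the ones listed in the cheat sheet, and your distribution's toolchain packages may use different names.

```shell
#!/bin/sh
# toolchain_prefix ARCH: print the CROSS_COMPILE prefix for a kernel ARCH value.
toolchain_prefix() {
    case "$1" in
        arm)   echo arm-linux-gnueabihf- ;;
        arm64) echo aarch64-linux-gnu- ;;
        riscv) echo riscv64-linux-gnu- ;;
        *)     return 1 ;;
    esac
}

# Report which of the cross-compilers are installed on this host.
for arch in arm arm64 riscv; do
    prefix=$(toolchain_prefix "$arch")
    if command -v "${prefix}gcc" >/dev/null 2>&1; then
        echo "$arch: ${prefix}gcc found"
    else
        echo "$arch: ${prefix}gcc not installed"
    fi
done
```

A build script can then set both variables from one input, for example `ARCH=arm64; CROSS_COMPILE=$(toolchain_prefix "$ARCH")`.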

Build system comparison:

System	Produces	Learning Curve	Rebuild Time	Best For
Manual	Individual programs	High	Fast	Understanding, tiny systems
Buildroot	Complete SD image	Low	~30min full	Prototypes, simple products
Yocto	Complete SD image	Very high	~2hr full	Production, long-term support
Debian/Ubuntu	Package-based rootfs	Low	Minutes (apt)	Development boards, quick hacks

Design challenge: Write a complete I2C temperature sensor driver. The device is at I2C address 0x48. Reading register 0x00 gives a 16-bit word, MSB first, whose upper 12 bits are a signed temperature value at 0.0625°C per LSB. Expose temperature via a sysfs attribute. Device Tree binding: compatible = "ti,tmp102". You'll need: an I2C driver (struct i2c_driver) with an of_match table, the I2C client handed to probe, a sysfs attribute group, and proper cleanup on removal (the devm_ helpers used here handle it automatically).

c
// SPDX-License-Identifier: GPL-2.0
/* tmp102_driver.c — I2C temperature sensor driver skeleton */
#include <linux/module.h>
#include <linux/i2c.h>
#include <linux/hwmon.h>
#include <linux/hwmon-sysfs.h>

MODULE_LICENSE("GPL");

struct tmp102_data {
    struct i2c_client *client;
    struct mutex lock;
};

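/* Reads the temperature register and stores millidegrees C in *mdeg.
 * Returns 0 on success or a negative errno. */
static int tmp102_read_temp(struct tmp102_data *data, int *mdeg) {
    int ret;
    s16 raw;

    mutex_lock(&data->lock);
    ret = i2c_smbus_read_word_swapped(data->client, 0x00);
    mutex_unlock(&data->lock);

    if (ret < 0) return ret;

    raw = (s16)ret >> 4;      /* 12-bit value, sign-extended by the arithmetic shift */
    *mdeg = raw * 625 / 10;   /* 0.0625 C per LSB = 62.5 millidegrees C */
    return 0;
}

static ssize_t temp_show(struct device *dev,
                         struct device_attribute *attr, char *buf) {
    struct tmp102_data *data = dev_get_drvdata(dev);
    int mdeg;
    int ret = tmp102_read_temp(data, &mdeg);

    /* Sub-zero readings are valid, so the error code travels separately from the value */
    if (ret) return ret;
    return sysfs_emit(buf, "%d\n", mdeg);
}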

static DEVICE_ATTR_RO(temp);

static struct attribute *tmp102_attrs[] = {
    &dev_attr_temp.attr,
    NULL,
};
ATTRIBUTE_GROUPS(tmp102);

static int tmp102_probe(struct i2c_client *client) {
    struct tmp102_data *data;
    struct device *hwmon_dev;

    data = devm_kzalloc(&client->dev, sizeof(*data), GFP_KERNEL);
    if (!data) return -ENOMEM;

    data->client = client;
    mutex_init(&data->lock);

    hwmon_dev = devm_hwmon_device_register_with_groups(
        &client->dev, "tmp102", data, tmp102_groups);

    return PTR_ERR_OR_ZERO(hwmon_dev);
}

static const struct of_device_id tmp102_of_match[] = {
    { .compatible = "ti,tmp102" },
    { },
};
MODULE_DEVICE_TABLE(of, tmp102_of_match);

static const struct i2c_device_id tmp102_id[] = {
    { "tmp102", 0 },
    { },
};
MODULE_DEVICE_TABLE(i2c, tmp102_id);

static struct i2c_driver tmp102_driver = {
    .driver = {
        .name = "tmp102",
        .of_match_table = tmp102_of_match,
    },
    .probe = tmp102_probe,
    .id_table = tmp102_id,
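};

/* Generates the module init/exit functions that register and unregister the driver */
module_i2c_driver(tmp102_driver);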

dts
/* Device Tree binding for the TMP102 */
&i2c1 {
    tmp102@48 {
        compatible = "ti,tmp102";
        reg = <0x48>;
    };
};
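
The register-to-temperature conversion from the challenge statement (16-bit word, 12 significant bits, 0.0625°C per LSB) is worth checking by hand before trusting it in a driver. The sketch below redoes the arithmetic in shell. The function name is illustrative and the sample words are chosen so the results are easy to verify.

```shell
#!/bin/sh
# tmp102_raw_to_mc WORD: convert a 16-bit register word to millidegrees C.
tmp102_raw_to_mc() {
    word=$(( $1 & 0xFFFF ))
    raw=$(( word >> 4 ))              # keep the upper 12 bits
    if [ "$raw" -ge 2048 ]; then      # bit 11 set: negative value in two's complement
        raw=$(( raw - 4096 ))
    fi
    echo $(( raw * 625 / 10 ))        # 0.0625 C per LSB = 62.5 millidegrees C
}

tmp102_raw_to_mc 0x1900   # 0x190 = 400 LSB  -> 25000 (25 C)
tmp102_raw_to_mc 0xE700   # 0xE70 = -400 LSB -> -25000 (-25 C)
tmp102_raw_to_mc 0x0000   # 0
```

The second case is the one to watch: without the sign step, 0xE70 would be read as 3696 LSB, or 231 °C.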

Connections to other lessons:

If you want to learn...	See
Real-time guarantees on embedded	RTOS lesson
How processors execute instructions	Computer Architecture
Networking protocols on embedded	Networking lesson
State machines for firmware logic	MDPs & Decision Making

You've learned: The decision of when to use Linux vs RTOS/bare-metal. Cross-compilation and the host/target workflow. Building programs with Makefiles and CMake. Writing systemd daemons with watchdog support. Kernel modules and the kernel/userspace boundary. Character device drivers with file_operations. Platform drivers with Device Tree matching. Building and configuring the Linux kernel. The complete boot sequence from power-on to userspace. You can now build a complete embedded Linux system from scratch.

"The art of programming is the art of organizing complexity." — Edsger Dijkstra

A platform driver's probe() function is called when:

The user runs insmod

The kernel finds a Device Tree node whose compatible string matches the driver's of_match_table

The device file is opened by userspace
