Build kernel modules, write device drivers, cross-compile for ARM, and understand every stage from power-on to userspace.
Your embedded system needs: TCP/IP networking, USB host support, a filesystem, multitasking, and driver support for hundreds of peripherals. Writing all of that from scratch would take decades. Embedded Linux gives you all of it — for free.
But Linux isn't always the right answer. A blinking LED on a $0.50 microcontroller doesn't need a kernel. A hard real-time motor controller running at 100kHz can't tolerate Linux's scheduling jitter. The question is: when do you reach for Linux?
The boards that run embedded Linux are everywhere: Raspberry Pi (Broadcom BCM2711, quad-core ARM Cortex-A72), BeagleBone Black (TI AM335x, Cortex-A8), NXP i.MX8 (used in automotive), NVIDIA Jetson (GPU-accelerated AI at the edge). All of these boot a Linux kernel, mount a root filesystem, and run userspace applications — just like your laptop, but on a board the size of a credit card.
There are three paths to choose from: bare metal, an RTOS, or embedded Linux. Which one fits depends mostly on how much RAM you have and how strict your timing requirements are.
Real-world examples of each path:
| Platform | RAM | Example | Use Case |
|---|---|---|---|
| Bare Metal | <64KB | STM32F103 | Motor PWM, sensor sampling |
| RTOS | 64KB–4MB | ESP32 + FreeRTOS | IoT device, WiFi sensor node |
| Embedded Linux | >32MB | Raspberry Pi, Jetson | Camera processing, web server, AI |
Here's the fundamental reality of embedded development: you write and compile code on one machine (the host) but run it on a completely different machine (the target). Your laptop is x86_64. Your embedded board is ARM. The binary formats are incompatible. You can't just copy your laptop's compiled programs to the board.
This is why we need cross-compilation: compiling on the host architecture (x86) to produce binaries for the target architecture (ARM). The tool that does this is the cross-compiler toolchain.
arm-linux-gnueabihf-gcc breaks down as: arm (target CPU), linux (target OS), gnueabi (the ABI, i.e. calling conventions), hf (hardware floating point), gcc (the compiler). Every tool in the chain follows this prefix: arm-linux-gnueabihf-ld, arm-linux-gnueabihf-objdump, etc.

But the compiler alone isn't enough. When you #include <stdio.h>, that header comes from the target's C library, not your host's. When you link against -lpthread, that's the target's libpthread. The sysroot is a directory on your host that mirrors the target's filesystem: it contains the target's headers and libraries so the cross-compiler can find them.
```bash
# Typical sysroot directory structure
# /opt/sysroot/
#   usr/
#     include/   Target headers (stdio.h, linux/gpio.h, ...)
#     lib/       Target shared libraries (libc.so, libpthread.so)
#   lib/         More target libraries

# Tell the compiler where to find target headers/libs
arm-linux-gnueabihf-gcc --sysroot=/opt/sysroot -o hello hello.c
```
Once you've compiled the binary, you need to get it onto the target. This lesson copies it over the network with scp; later chapters also write SD card images and boot over TFTP and NFS.

The full development cycle: edit and compile the source on the host, transfer the binary to the target, and open a debug connection from the host back to the target.
```bash
# Install the cross-compiler (Ubuntu/Debian host)
sudo apt install gcc-arm-linux-gnueabihf

# Verify it works
arm-linux-gnueabihf-gcc --version

# Cross-compile a simple program
arm-linux-gnueabihf-gcc -o hello hello.c

# Check the binary type: should say ARM, not x86!
file hello
# hello: ELF 32-bit LSB executable, ARM, EABI5, ...

# Deploy via SCP
scp hello pi@192.168.1.50:/home/pi/

# SSH and run
ssh pi@192.168.1.50 ./hello
```
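The `file` check works because every ELF binary records its target architecture in a fixed header field (`e_machine`). As a rough illustration, the sketch below decodes that field from the first 20 bytes of a file. The function name `elf_arch_name` and the `MACH_*` constants are made up for this example; the numeric codes are the ones assigned by the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

/* ELF e_machine codes (values from the ELF specification).
 * The MACH_* names are local to this sketch. */
#define MACH_X86      3
#define MACH_ARM      40
#define MACH_X86_64   62
#define MACH_AARCH64  183
#define MACH_RISCV    243

/*
 * Return a short architecture name for an ELF image, or NULL if the
 * buffer does not start with a usable ELF header.
 * hdr: first bytes of the file, len: number of bytes available.
 */
const char *elf_arch_name(const unsigned char *hdr, size_t len)
{
    uint16_t machine;

    /* Need e_ident (16 bytes) + e_type (2 bytes) + e_machine (2 bytes). */
    if (len < 20)
        return NULL;

    /* Magic number: 0x7f 'E' 'L' 'F' */
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return NULL;

    /* e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian. */
    if (hdr[5] == 1)
        machine = (uint16_t)(hdr[18] | (hdr[19] << 8));
    else if (hdr[5] == 2)
        machine = (uint16_t)((hdr[18] << 8) | hdr[19]);
    else
        return NULL;

    switch (machine) {
    case MACH_X86:     return "x86";
    case MACH_ARM:     return "ARM";
    case MACH_X86_64:  return "x86-64";
    case MACH_AARCH64: return "AArch64";
    case MACH_RISCV:   return "RISC-V";
    default:           return "unknown";
    }
}
```

To use it, read the first 20 bytes of a binary with `fread()` and pass the buffer in; a cross-compiled `hello` should come back as "ARM", while a program built with the host `gcc` on a typical laptop comes back as "x86-64".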
You know you need a cross-compiler. But real programs aren't single files: they have dependencies, multiple source files, and build configurations. Let's trace the complete journey from source code to running binary on your target.
First, the compilation pipeline. When you run arm-linux-gnueabihf-gcc -o hello hello.c, four separate stages happen:
Source code passes through four stages to become an executable ELF binary.
```bash
# See each stage individually:

# 1. Preprocess only (-E): expand #includes, #defines
arm-linux-gnueabihf-gcc -E hello.c -o hello.i

# 2. Compile to assembly (-S): C → ARM assembly
arm-linux-gnueabihf-gcc -S hello.c -o hello.s

# 3. Assemble (-c): assembly → object file (machine code, not linked)
arm-linux-gnueabihf-gcc -c hello.s -o hello.o

# 4. Link: combine object files + libraries → final ELF executable
arm-linux-gnueabihf-gcc hello.o -o hello
```
Static vs dynamic linking: When your program calls printf(), that code lives in libc. You have two choices:
| Linking | Flag | Binary Size | Dependencies | Use When |
|---|---|---|---|---|
| Dynamic | (default) | Small (~10KB) | Needs libc.so on target | Normal development |
| Static | -static | Large (~800KB+) | Self-contained | Minimal rootfs, single binary deploy |
For real projects, you need a build system. Here's a Makefile for cross-compilation:
```makefile
# Cross-compilation Makefile
CROSS_COMPILE := arm-linux-gnueabihf-
CC            := $(CROSS_COMPILE)gcc
CFLAGS        := -Wall -O2 --sysroot=/opt/sysroot
LDFLAGS       := --sysroot=/opt/sysroot

TARGET := gpio_reader
SRCS   := main.c gpio.c
OBJS   := $(SRCS:.c=.o)

.PHONY: all clean deploy

all: $(TARGET)

$(TARGET): $(OBJS)
	$(CC) $(LDFLAGS) -o $@ $^

%.o: %.c
	$(CC) $(CFLAGS) -c -o $@ $<

clean:
	rm -f $(OBJS) $(TARGET)

deploy: $(TARGET)
	scp $(TARGET) pi@192.168.1.50:/home/pi/
```
For larger projects, CMake with a toolchain file is standard:
```cmake
# toolchain-arm.cmake
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR arm)

set(CMAKE_C_COMPILER arm-linux-gnueabihf-gcc)
set(CMAKE_CXX_COMPILER arm-linux-gnueabihf-g++)

set(CMAKE_SYSROOT /opt/sysroot)
set(CMAKE_FIND_ROOT_PATH /opt/sysroot)

# Search headers/libs only in sysroot, programs only on host
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
```

```bash
# Build with the toolchain file
mkdir build && cd build
cmake -DCMAKE_TOOLCHAIN_FILE=../toolchain-arm.cmake ..
make -j$(nproc)
```
Worked example: A program that reads a GPIO pin via the sysfs interface:
```c
/* gpio_read.c: read GPIO pin state via the legacy sysfs interface.
 * /sys/class/gpio is deprecated in favor of the GPIO character device
 * (/dev/gpiochipN) and is only present with CONFIG_GPIO_SYSFS=y. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

#define GPIO_PIN  "17"
#define GPIO_PATH "/sys/class/gpio/"

/* Write a string to a sysfs file; returns 0 on success, -1 on error */
static int sysfs_write(const char *path, const char *s)
{
    int fd = open(path, O_WRONLY);

    if (fd < 0) { perror(path); return -1; }
    if (write(fd, s, strlen(s)) < 0) { perror(path); close(fd); return -1; }
    close(fd);
    return 0;
}

int main(void)
{
    int fd;
    int ret = 1;
    char buf[64];
    char val;

    /* Export the GPIO pin */
    if (sysfs_write(GPIO_PATH "export", GPIO_PIN) < 0)
        return 1;
    usleep(100000);  /* Wait for sysfs to create entries */

    /* Set direction to input */
    snprintf(buf, sizeof(buf), GPIO_PATH "gpio%s/direction", GPIO_PIN);
    if (sysfs_write(buf, "in") < 0)
        goto unexport;

    /* Read value */
    snprintf(buf, sizeof(buf), GPIO_PATH "gpio%s/value", GPIO_PIN);
    fd = open(buf, O_RDONLY);
    if (fd < 0) { perror(buf); goto unexport; }
    if (read(fd, &val, 1) == 1) {
        printf("GPIO %s = %c\n", GPIO_PIN, val);
        ret = 0;
    } else {
        perror("read");
    }
    close(fd);

unexport:
    /* Unexport, even if an earlier step failed */
    sysfs_write(GPIO_PATH "unexport", GPIO_PIN);
    return ret;
}
```

```bash
# Cross-compile and deploy
arm-linux-gnueabihf-gcc -Wall -O2 -o gpio_read gpio_read.c
scp gpio_read pi@192.168.1.50:/home/pi/
ssh pi@192.168.1.50 sudo ./gpio_read
# GPIO 17 = 1
```
Your embedded device boots up. No one is logged in. No terminal is open. Yet your temperature monitoring program needs to be running, logging data, and raising alarms. That's what a daemon is: a background process with no controlling terminal that starts at boot and runs forever.
The word "daemon" comes from Greek mythology — a background spirit doing useful work. In Unix, a daemon is a process that has detached from any terminal and runs independently in the background. Think of it as a service: it starts, it does its job, it doesn't need human interaction.
```c
/* Classic daemon creation (the traditional Unix way) */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <syslog.h>

void daemonize(void)
{
    pid_t pid;
    int fd;

    /* Step 1: Fork. The parent exits, the child continues */
    pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);   /* Parent exits */

    /* Step 2: Create new session to detach from the terminal */
    if (setsid() < 0) exit(EXIT_FAILURE);

    /* Step 3: Fork again to ensure we can't re-acquire a terminal */
    pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);

    /* Step 4: Set file permissions, change working dir */
    umask(0);
    if (chdir("/") < 0) exit(EXIT_FAILURE);

    /* Step 5: Point stdin/stdout/stderr at /dev/null so that
     * descriptors 0-2 are not reused by files opened later */
    fd = open("/dev/null", O_RDWR);
    if (fd < 0) exit(EXIT_FAILURE);
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO) close(fd);
}

int main(void)
{
    daemonize();

    openlog("tempmon", LOG_PID, LOG_DAEMON);
    syslog(LOG_INFO, "Temperature monitor started");

    while (1) {
        /* Read temperature, log it, check threshold */
        syslog(LOG_INFO, "Temperature: 42.3C");
        sleep(10);
    }

    closelog();   /* Not reached: the loop above never exits */
    return 0;
}
```
The modern way: systemd. Instead of hand-coding the fork/setsid ritual, you write your program as a simple foreground process and let systemd handle daemonization, logging, restart-on-crash, and dependency ordering. A unit file tells systemd how to manage your service:
```ini
# /etc/systemd/system/tempmon.service
[Unit]
Description=Temperature Monitor Daemon
After=network.target

[Service]
# Type=notify: systemd waits for the READY=1 message sent via sd_notify()
Type=notify
ExecStart=/usr/bin/tempmon
Restart=always
RestartSec=5
WatchdogSec=30
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
```
Restart=always means that if your daemon crashes, systemd restarts it after 5 seconds (RestartSec=5). WatchdogSec=30 means that if your daemon doesn't ping systemd at least every 30 seconds, systemd assumes it's hung, kills it, and restarts it. This is critical for unattended devices in the field.

Watchdog integration in your code:
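```c
#include <unistd.h>
#include <systemd/sd-daemon.h>   /* Link with -lsystemd */

/* Application-specific work, defined elsewhere */
void read_temperature(void);
void log_data(void);

int main(void)
{
    sd_notify(0, "READY=1");   /* Tell systemd we're up */

    while (1) {
        /* Do work... */
        read_temperature();
        log_data();

        /* Kick the watchdog: "I'm still alive" */
        sd_notify(0, "WATCHDOG=1");
        sleep(10);
    }
}
```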
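The 10-second sleep above is hardcoded, while the timeout lives in the unit file. When WatchdogSec= is set, systemd exports the timeout to the service in the WATCHDOG_USEC environment variable (in microseconds), and the usual advice is to ping at about half that interval. The helper below is a minimal sketch of that calculation; the function name is hypothetical, and in real code libsystemd's `sd_watchdog_enabled()` is the more complete option.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Given the text of WATCHDOG_USEC, return how often to ping the
 * watchdog, in microseconds (half the timeout). Returns 0 if the
 * watchdog is disabled or the value cannot be parsed.
 */
unsigned long long watchdog_ping_interval_usec(const char *watchdog_usec)
{
    char *end;
    unsigned long long timeout;

    /* Unset or empty variable: no watchdog configured. */
    if (watchdog_usec == NULL || *watchdog_usec == '\0')
        return 0;

    /* strtoull() would accept a sign or leading spaces; reject them. */
    if (*watchdog_usec < '0' || *watchdog_usec > '9')
        return 0;

    errno = 0;
    timeout = strtoull(watchdog_usec, &end, 10);
    if (errno != 0 || *end != '\0')
        return 0;

    return timeout / 2;
}
```

A daemon could call `watchdog_ping_interval_usec(getenv("WATCHDOG_USEC"))` once at startup and use the result for its sleep; with WatchdogSec=30 that yields 15 seconds between pings.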
```bash
# Service management commands
sudo systemctl start tempmon     # Start the service
sudo systemctl stop tempmon      # Stop it
sudo systemctl enable tempmon    # Start at boot
sudo systemctl status tempmon    # Check status
journalctl -u tempmon -f         # Follow logs in real-time
```
Worked example: A complete temperature monitoring daemon:
```c
/* tempmon.c: systemd-friendly temperature monitor (link with -lsystemd) */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <syslog.h>
#include <systemd/sd-daemon.h>

#define TEMP_PATH "/sys/class/thermal/thermal_zone0/temp"
#define THRESHOLD 70000   /* 70°C in millidegrees */
#define INTERVAL  10      /* seconds */

/* Returns the temperature in millidegrees C, or -1 on error */
int read_temp(void)
{
    int fd;
    ssize_t n;
    char buf[16];

    fd = open(TEMP_PATH, O_RDONLY);
    if (fd < 0) return -1;
    n = read(fd, buf, sizeof(buf) - 1);
    close(fd);
    if (n <= 0) return -1;
    buf[n] = '\0';   /* read() does not terminate the string; atoi() needs it */
    return atoi(buf);
}

int main(void)
{
    int temp;

    openlog("tempmon", LOG_PID, LOG_DAEMON);
    sd_notify(0, "READY=1");
    syslog(LOG_INFO, "Temperature monitor started");

    while (1) {
        temp = read_temp();
        if (temp < 0) {
            syslog(LOG_ERR, "Failed to read temperature");
        } else {
            syslog(LOG_INFO, "Temp: %d.%dC", temp / 1000, (temp % 1000) / 100);
            if (temp > THRESHOLD)
                syslog(LOG_WARNING, "ALERT: Over %dC!", THRESHOLD / 1000);
        }
        sd_notify(0, "WATCHDOG=1");
        sleep(INTERVAL);
    }

    closelog();   /* Not reached */
    return 0;
}
```
The Linux kernel is monolithic: all kernel code runs in the same address space, sharing the same memory. But you don't want to rebuild the entire kernel every time you add a new driver. Kernel modules solve this: they're chunks of code you can load into the running kernel and unload without rebooting.
Think of modules as plugins for the kernel. Your WiFi driver? A module. Your filesystem driver? A module. Your custom hardware driver? You guessed it — a module. The command lsmod on any Linux system shows dozens of loaded modules.
The minimal kernel module has exactly two functions: one called when the module loads (init), one called when it unloads (exit):
```c
/* hello_module.c: the minimal kernel module */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("A minimal hello world kernel module");

static int __init hello_init(void)
{
    printk(KERN_INFO "hello: Module loaded!\n");
    return 0;   /* 0 = success, negative = error */
}

static void __exit hello_exit(void)
{
    printk(KERN_INFO "hello: Module unloaded.\n");
}

module_init(hello_init);
module_exit(hello_exit);
```
Kernel code cannot call the C standard library. Instead you use printk() for logging (output goes to the kernel ring buffer, viewable with dmesg), kmalloc() for memory allocation, and kernel-specific APIs for everything else. You're writing code for a different world.

The Makefile for kernel modules is unique: it delegates to the kernel's own build system (kbuild):
```makefile
# Makefile for out-of-tree kernel module
obj-m += hello_module.o

# For cross-compilation:
KDIR  := /path/to/target/kernel/source
ARCH  := arm
CROSS := arm-linux-gnueabihf-

all:
	$(MAKE) -C $(KDIR) M=$(PWD) ARCH=$(ARCH) CROSS_COMPILE=$(CROSS) modules

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean
```

```bash
# Build the module
make

# Copy to target
scp hello_module.ko pi@192.168.1.50:/home/pi/

# On the target:
sudo insmod hello_module.ko    # Load
dmesg | tail -1                # See: "hello: Module loaded!"
lsmod | grep hello             # Verify it's loaded
sudo rmmod hello_module        # Unload
dmesg | tail -1                # See: "hello: Module unloaded."
```
Module parameters let users configure your module at load time without recompiling:
```c
static int interval = 5;
static char *name = "world";

module_param(interval, int, 0644);
MODULE_PARM_DESC(interval, "Polling interval in seconds");
module_param(name, charp, 0644);
MODULE_PARM_DESC(name, "Name to greet");

/* Usage: sudo insmod hello.ko interval=10 name="Linux" */
```
Worked example: A module that creates a /proc entry for reading:
```c
/* proc_module.c: creates /proc/myinfo (struct proc_ops needs kernel 5.6+) */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/jiffies.h>
#include <linux/math64.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>

MODULE_LICENSE("GPL");

static int myinfo_show(struct seq_file *m, void *v)
{
    /* get_jiffies_64() reads the counter safely on 32-bit CPUs.
     * The counter starts at INITIAL_JIFFIES, not at 0. */
    u64 ticks = get_jiffies_64() - INITIAL_JIFFIES;

    seq_printf(m, "Uptime ticks: %llu\n", ticks);
    seq_printf(m, "HZ: %d\n", HZ);
    /* A plain 64-bit '/' does not link on 32-bit ARM; use div_u64() */
    seq_printf(m, "Seconds: %llu\n", div_u64(ticks, HZ));
    return 0;
}

static int myinfo_open(struct inode *inode, struct file *file)
{
    return single_open(file, myinfo_show, NULL);
}

static const struct proc_ops myinfo_ops = {
    .proc_open    = myinfo_open,
    .proc_read    = seq_read,
    .proc_lseek   = seq_lseek,
    .proc_release = single_release,
};

static int __init myinfo_init(void)
{
    if (!proc_create("myinfo", 0444, NULL, &myinfo_ops))
        return -ENOMEM;
    printk(KERN_INFO "myinfo: /proc/myinfo created\n");
    return 0;
}

static void __exit myinfo_exit(void)
{
    remove_proc_entry("myinfo", NULL);
    printk(KERN_INFO "myinfo: /proc/myinfo removed\n");
}

module_init(myinfo_init);
module_exit(myinfo_exit);
```

```bash
# After loading the module:
cat /proc/myinfo
# Uptime ticks: 4523890
# HZ: 100
# Seconds: 45238
```
In Unix, everything is a file. Your serial port is /dev/ttyS0. Your webcam is /dev/video0. Your random number generator is /dev/urandom. When a userspace program calls open("/dev/mydevice"), that call travels through the kernel's Virtual Filesystem (VFS) layer and arrives at YOUR driver's open() function. You control what happens next.
A character device driver presents hardware as a stream of bytes. Userspace reads and writes bytes; the driver translates those bytes into hardware operations. The interface is defined by the struct file_operations — a table of function pointers that the kernel calls when userspace does I/O:
```c
/* The file_operations struct: your driver's API */
static struct file_operations mydev_fops = {
    .owner          = THIS_MODULE,
    .open           = mydev_open,      /* Called on open("/dev/mydev") */
    .release        = mydev_release,   /* Called on close() */
    .read           = mydev_read,      /* Called on read() */
    .write          = mydev_write,     /* Called on write() */
    .unlocked_ioctl = mydev_ioctl,     /* Called on ioctl() */
};
```
Device files are identified by major and minor numbers. The major number identifies the driver. The minor number identifies which device instance (if the driver handles multiple devices). When you mknod /dev/myled c 240 0, you're creating a character device (c) with major 240, minor 0.
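Inside the kernel, the major and minor numbers are packed into a single 32-bit `dev_t`: the top 12 bits hold the major and the low 20 bits hold the minor. The sketch below mirrors the kernel's `MKDEV`, `MAJOR`, and `MINOR` macros from `<linux/kdev_t.h>` as plain functions so the layout can be tried in userspace. The `k_` prefixed names are invented for this example, and userspace `dev_t` (handled by glibc's `makedev()`) uses a different encoding.

```c
#include <stdint.h>

/* Kernel-internal layout: 12-bit major, 20-bit minor. */
#define K_MINORBITS 20
#define K_MINORMASK ((1u << K_MINORBITS) - 1)

/* Pack a major/minor pair, like the kernel's MKDEV(). */
uint32_t k_mkdev(uint32_t major, uint32_t minor)
{
    return (major << K_MINORBITS) | (minor & K_MINORMASK);
}

/* Extract the major number, like MAJOR(). */
uint32_t k_major(uint32_t dev)
{
    return dev >> K_MINORBITS;
}

/* Extract the minor number, like MINOR(). */
uint32_t k_minor(uint32_t dev)
{
    return dev & K_MINORMASK;
}
```

For the `mknod /dev/myled c 240 0` example, `k_mkdev(240, 0)` gives 0x0F000000, and `k_major()` and `k_minor()` recover 240 and 0 from it.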
Worked example: A complete LED driver. Writing "1" turns the LED on, writing "0" turns it off, reading returns the current state:
```c
/* led_driver.c: character device driver for an LED.
 * Uses the legacy integer GPIO API for brevity; new code should prefer
 * the descriptor API in <linux/gpio/consumer.h>.
 * The one-argument class_create() requires kernel 6.4 or later. */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/err.h>
#include <linux/uaccess.h>
#include <linux/gpio.h>

#define DEVICE_NAME "myled"
#define LED_GPIO    17

MODULE_LICENSE("GPL");

static dev_t dev_num;
static struct cdev myled_cdev;
static struct class *myled_class;
static int led_state;

static int myled_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "myled: opened\n");
    return 0;
}

static int myled_release(struct inode *i, struct file *f)
{
    printk(KERN_INFO "myled: closed\n");
    return 0;
}

static ssize_t myled_read(struct file *f, char __user *buf,
                          size_t len, loff_t *off)
{
    char val = led_state ? '1' : '0';

    if (*off > 0 || len < 1)
        return 0;                      /* EOF on second read */
    if (copy_to_user(buf, &val, 1))
        return -EFAULT;
    *off += 1;
    return 1;
}

static ssize_t myled_write(struct file *f, const char __user *buf,
                           size_t len, loff_t *off)
{
    char val;

    if (len < 1)
        return -EINVAL;
    if (copy_from_user(&val, buf, 1))
        return -EFAULT;
    if (val == '1') {
        gpio_set_value(LED_GPIO, 1);
        led_state = 1;
    } else if (val == '0') {
        gpio_set_value(LED_GPIO, 0);
        led_state = 0;
    }
    return len;   /* Consume the whole write, including a trailing newline */
}

static struct file_operations myled_fops = {
    .owner   = THIS_MODULE,
    .open    = myled_open,
    .release = myled_release,
    .read    = myled_read,
    .write   = myled_write,
};

static int __init myled_init(void)
{
    struct device *dev;
    int ret;

    /* 1. Allocate device number */
    ret = alloc_chrdev_region(&dev_num, 0, 1, DEVICE_NAME);
    if (ret)
        return ret;

    /* 2. Initialize cdev and add to kernel */
    cdev_init(&myled_cdev, &myled_fops);
    ret = cdev_add(&myled_cdev, dev_num, 1);
    if (ret)
        goto err_region;

    /* 3. Create device class and device file automatically */
    myled_class = class_create(DEVICE_NAME);
    if (IS_ERR(myled_class)) {
        ret = PTR_ERR(myled_class);
        goto err_cdev;
    }
    dev = device_create(myled_class, NULL, dev_num, NULL, DEVICE_NAME);
    if (IS_ERR(dev)) {
        ret = PTR_ERR(dev);
        goto err_class;
    }

    /* 4. Request and configure GPIO */
    ret = gpio_request(LED_GPIO, "led");
    if (ret)
        goto err_device;
    ret = gpio_direction_output(LED_GPIO, 0);
    if (ret)
        goto err_gpio;

    printk(KERN_INFO "myled: registered at major %d\n", MAJOR(dev_num));
    return 0;

    /* Undo the completed steps in reverse order */
err_gpio:
    gpio_free(LED_GPIO);
err_device:
    device_destroy(myled_class, dev_num);
err_class:
    class_destroy(myled_class);
err_cdev:
    cdev_del(&myled_cdev);
err_region:
    unregister_chrdev_region(dev_num, 1);
    return ret;
}

static void __exit myled_exit(void)
{
    gpio_set_value(LED_GPIO, 0);
    gpio_free(LED_GPIO);
    device_destroy(myled_class, dev_num);
    class_destroy(myled_class);
    cdev_del(&myled_cdev);
    unregister_chrdev_region(dev_num, 1);
    printk(KERN_INFO "myled: unregistered\n");
}

module_init(myled_init);
module_exit(myled_exit);
```

```bash
# Usage from userspace:
sudo insmod led_driver.ko
ls -la /dev/myled                 # Device file auto-created (root-only by default)
echo "1" | sudo tee /dev/myled    # LED ON
sudo cat /dev/myled               # Read state: "1"
echo "0" | sudo tee /dev/myled    # LED OFF
```
The LED driver in Chapter 5 has a problem: the GPIO number (17) is hardcoded. If you move to a different board where the LED is on GPIO 22, you'd have to modify the source and recompile. That's terrible engineering. Modern Linux solves this with the Device Tree: a data structure that describes hardware separately from the driver code.
The Device Tree (DT) is a hierarchical description of the hardware on your board. It tells the kernel: "at address 0x3F200000 there's a GPIO controller" and "connected to GPIO 17 there's an LED." The driver code says: "I know how to drive LEDs" but never mentions specific addresses or pin numbers. The kernel matches them together.
A Device Tree snippet (a .dts file) for our LED:
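```dts
/* Device Tree Source for our LED */
#include <dt-bindings/gpio/gpio.h>   /* Defines GPIO_ACTIVE_HIGH */

/ {
    /* No @address in the node name: this node has no reg property */
    my_led: led {
        compatible = "mycompany,my-led";       /* Matching string */
        gpios = <&gpio 17 GPIO_ACTIVE_HIGH>;   /* Which GPIO */
        label = "status-led";
        default-state = "off";
        status = "okay";
    };
};
```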
The platform driver registers itself with a compatible string. When the kernel boots and parses the Device Tree, it finds a node with compatible = "mycompany,my-led" and calls your driver's probe() function, passing all the information from the DT node:
```c
/* led_platform.c: platform driver matching Device Tree */
#include <linux/module.h>
#include <linux/mod_devicetable.h>
#include <linux/platform_device.h>
#include <linux/of.h>
#include <linux/err.h>
#include <linux/gpio/consumer.h>

MODULE_LICENSE("GPL");

struct my_led_data {
    struct gpio_desc *gpio;
    int state;
};

static int my_led_probe(struct platform_device *pdev)
{
    struct my_led_data *led;
    struct device *dev = &pdev->dev;

    /* Allocate driver-private data */
    led = devm_kzalloc(dev, sizeof(*led), GFP_KERNEL);
    if (!led)
        return -ENOMEM;

    /* Get GPIO from Device Tree: no hardcoded numbers! */
    led->gpio = devm_gpiod_get(dev, NULL, GPIOD_OUT_LOW);
    if (IS_ERR(led->gpio)) {
        dev_err(dev, "Failed to get GPIO\n");
        return PTR_ERR(led->gpio);
    }

    platform_set_drvdata(pdev, led);
    dev_info(dev, "LED driver probed successfully\n");
    return 0;
}

/* Note: on kernel 6.11 and later, .remove returns void instead of int */
static int my_led_remove(struct platform_device *pdev)
{
    struct my_led_data *led = platform_get_drvdata(pdev);

    gpiod_set_value(led->gpio, 0);   /* Turn off on remove */
    dev_info(&pdev->dev, "LED driver removed\n");
    return 0;
}

/* Match table: links this driver to Device Tree nodes */
static const struct of_device_id my_led_of_match[] = {
    { .compatible = "mycompany,my-led" },
    { }   /* Sentinel: marks end of table */
};
MODULE_DEVICE_TABLE(of, my_led_of_match);

static struct platform_driver my_led_driver = {
    .probe  = my_led_probe,
    .remove = my_led_remove,
    .driver = {
        .name           = "my-led",
        .of_match_table = my_led_of_match,
    },
};
module_platform_driver(my_led_driver);
```
Any function prefixed with devm_ automatically frees its resource when the device is removed. devm_kzalloc frees memory, devm_gpiod_get releases the GPIO. No manual cleanup is needed because the kernel tracks it for you. This eliminates an entire class of resource leak bugs.

The matching process works like this: when a driver registers, the kernel compares each Device Tree node's compatible property against the driver's of_match_table, and for every match it calls the driver's probe() function.
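The matching step is essentially a string comparison between two tables. The following userspace sketch imitates it under simplifying assumptions: the `struct driver` and `struct dt_node` types and the `match_driver()` function are invented for illustration, and the real kernel logic (in the OF matching code) also considers node name and type and keeps per-bus driver lists.

```c
#include <stddef.h>
#include <string.h>

/* A driver advertises the compatible strings it can handle. */
struct driver {
    const char *name;
    const char *const *compatible;   /* NULL-terminated list */
};

/* A Device Tree node lists its compatible strings, most specific first. */
struct dt_node {
    const char *name;
    const char *const *compatible;   /* NULL-terminated list */
};

/* Return 1 if the driver's match table contains this string. */
static int driver_handles(const struct driver *drv, const char *compat)
{
    size_t i;

    for (i = 0; drv->compatible[i] != NULL; i++)
        if (strcmp(drv->compatible[i], compat) == 0)
            return 1;
    return 0;
}

/*
 * Return the driver whose probe() would be called for this node, or
 * NULL if no registered driver matches. The node's strings are tried
 * in order, so a more specific compatible wins over a generic one.
 */
const struct driver *match_driver(const struct dt_node *node,
                                  const struct driver *drivers, size_t n)
{
    size_t c, d;

    for (c = 0; node->compatible[c] != NULL; c++)
        for (d = 0; d < n; d++)
            if (driver_handles(&drivers[d], node->compatible[c]))
                return &drivers[d];
    return NULL;
}
```

With a driver table containing "mycompany,my-led" and "ti,tmp102", a node whose compatible is "mycompany,my-led" resolves to the LED driver, while a node with an unknown string returns NULL and simply stays unbound until a matching driver is loaded.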
Key Device Tree syntax patterns:
```dts
/* Node naming: name@address */
i2c_sensor: sensor@48 {
    compatible = "ti,tmp102";          /* vendor,device */
    reg = <0x48>;                      /* I2C address */
    interrupt-parent = <&gpio1>;
    interrupts = <7 IRQ_TYPE_EDGE_FALLING>;
};

/* Referencing other nodes (phandles) */
spi_display: display@0 {
    compatible = "sitronix,st7789";
    reg = <0>;                         /* SPI chip select */
    spi-max-frequency = <40000000>;
    dc-gpios = <&gpio 25 GPIO_ACTIVE_HIGH>;
    reset-gpios = <&gpio 24 GPIO_ACTIVE_LOW>;
};
```
The Linux kernel is the largest collaborative software project in human history: 30+ million lines of code, 15,000+ configuration options, support for dozens of architectures. Building it from source gives you complete control over what runs on your embedded device — every driver included, every feature enabled or disabled, every byte of the final image justified.
The kernel build process has four stages:
```bash
# Full kernel build for ARM (e.g., Raspberry Pi)

# 1. Get source
git clone --depth=1 https://github.com/raspberrypi/linux.git
cd linux

# 2. Configure (start from default config for this board)
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- bcm2711_defconfig

# 2b. Customize (interactive menu)
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- menuconfig

# 3. Compile (all cores)
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j$(nproc) zImage modules dtbs

# 4. Install modules to staging directory
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- \
    INSTALL_MOD_PATH=/tmp/rootfs modules_install

# 4b. Copy kernel and device tree to SD card boot partition
# (on kernels 6.5 and later the .dtb is under arch/arm/boot/dts/broadcom/)
cp arch/arm/boot/zImage /mnt/boot/kernel7l.img
cp arch/arm/boot/dts/bcm2711-rpi-4-b.dtb /mnt/boot/
```
The .config file is where all 15,000+ decisions are recorded. Each option can be:
| Value | Meaning | Effect on Size |
|---|---|---|
| y | Built into kernel image | Kernel gets bigger, always available |
| m | Built as loadable module (.ko) | Kernel stays small, load on demand |
| n (or absent) | Not built at all | Code excluded entirely |
Critical kernel configuration categories:
```kconfig
# Preemption Model (General Setup → Preemption Model)
# These three are mutually exclusive: exactly one is set to y.
# CONFIG_PREEMPT_NONE=y        # Server: max throughput, poor latency
# CONFIG_PREEMPT_VOLUNTARY=y   # Desktop: good balance
CONFIG_PREEMPT=y               # Embedded/RT: low latency, some overhead

# Filesystem Support
CONFIG_EXT4_FS=y               # Built-in (need this to mount root!)
CONFIG_VFAT_FS=m               # Module (for SD card FAT partitions)
# CONFIG_BTRFS_FS is not set   # Disabled (this is how .config records "n")

# Networking
CONFIG_NET=y                   # Built-in (needed for basically everything)
CONFIG_INET=y                  # TCP/IP
CONFIG_WIRELESS=y              # Wireless menu (a yes/no option, cannot be m)
CONFIG_CFG80211=m              # WiFi stack as module
CONFIG_BT=m                    # Bluetooth as module

# Device Drivers
CONFIG_I2C=y                   # I2C bus support
CONFIG_SPI=y                   # SPI bus support
CONFIG_GPIO_SYSFS=y            # GPIO access from userspace
CONFIG_USB_SUPPORT=y           # USB host support
```
Kconfig options have dependencies. CONFIG_CFG80211, the WiFi configuration stack, can only be enabled once CONFIG_WIRELESS is on, and CONFIG_WIRELESS in turn requires CONFIG_NET. Disabling CONFIG_NET forces all networking options off. The config system enforces these constraints automatically.
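One way to picture how a dependency constrains a tristate option: order the values as n < m < y, and treat the dependency's value as an upper limit on what the dependent option can be. The sketch below models only that single rule. The type and function names are invented for this example, and real Kconfig also handles select, choice groups, and boolean options, which behave differently.

```c
/* Tristate values ordered so that n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/*
 * Effective value of a tristate option: the value the user asked for,
 * capped by the value of the option it depends on.
 */
enum tristate kconfig_effective(enum tristate requested, enum tristate dep)
{
    return requested < dep ? requested : dep;
}

/* Character used for this value in .config-style output. */
char tristate_char(enum tristate v)
{
    switch (v) {
    case TRI_Y: return 'y';
    case TRI_M: return 'm';
    default:    return 'n';
    }
}
```

Under this model, requesting y for an option whose dependency is n yields n (the option disappears), and requesting y when the dependency is only m yields m, since built-in code cannot rely on something that lives in a module.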
Build output: After make completes, you get:
| File | Location | Purpose |
|---|---|---|
| zImage | arch/arm/boot/ | Compressed kernel (for ARM32) |
| Image | arch/arm64/boot/ | Uncompressed kernel (ARM64) |
| *.dtb | arch/arm/boot/dts/ | Compiled Device Tree Blob |
| *.ko | throughout tree | Loadable kernel modules |
| vmlinux | root of tree | Uncompressed ELF (for debugging) |
```bash
# Check the size of your kernel
ls -lh arch/arm/boot/zImage
# -rw-r--r-- 1 user user 6.2M zImage

# Count modules built
find . -name "*.ko" | wc -l
# 247

# See what config produced this
grep "=y" .config | wc -l    # Built-in options: ~2100
grep "=m" .config | wc -l    # Module options: ~250
```
You've built the kernel. But the kernel alone is useless — it needs a root filesystem (rootfs) to mount. The rootfs contains everything userspace needs: the init system, shell, libraries, utilities, and your application. Without it, the kernel panics: Kernel panic - not syncing: VFS: Unable to mount root fs.
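The complete boot sequence on an embedded ARM system runs from power-on through the bootloader (U-Boot in this lesson) and the kernel, which mounts the root filesystem and starts init, to your userspace applications.

The kernel command line (for example root=/dev/mmcblk0p2 rootfstype=ext4) tells the kernel which partition to mount as root.

Building a root filesystem from scratch is complex: you need cross-compiled busybox/coreutils, glibc, device nodes, and init scripts. Two tools automate this entirely: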
| Tool | Philosophy | Config Time | Use When |
|---|---|---|---|
| Buildroot | Simple, Makefile-based | Minutes | Simple systems, learning, fast iteration |
| Yocto/OpenEmbedded | Layer-based, industrial | Hours | Production products, long-term maintenance, BSP vendors |
```bash
# Buildroot: build a complete rootfs in one command
git clone https://github.com/buildroot/buildroot.git
cd buildroot

# Start with a board-specific config
make raspberrypi4_defconfig

# Customize (add packages, configure init system)
make menuconfig

# Build everything: toolchain + kernel + rootfs + bootloader
# (plain "make": Buildroot parallelizes each package build on its own)
make

# Output: a complete SD card image
ls output/images/
# sdcard.img rootfs.ext4 zImage bcm2711-rpi-4-b.dtb

# Flash to SD card
sudo dd if=output/images/sdcard.img of=/dev/sdX bs=4M status=progress
```
initramfs (initial RAM filesystem): A small filesystem loaded into RAM by the bootloader alongside the kernel. The kernel mounts it as a temporary root to run early-boot tasks (load storage drivers, decrypt disks, find the real rootfs) before switching to the final root filesystem.
```bash
# U-Boot environment for network boot (TFTP + NFS)
setenv serverip 192.168.1.100
setenv ipaddr 192.168.1.50
setenv bootcmd 'tftp 0x80000000 zImage; tftp 0x82000000 board.dtb; bootz 0x80000000 - 0x82000000'
setenv bootargs 'console=ttyS0,115200 root=/dev/nfs nfsroot=192.168.1.100:/srv/nfs/rootfs ip=dhcp'
saveenv
boot
```
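The bootargs string is passed to the kernel verbatim and is visible on a running system in /proc/cmdline as space-separated key=value tokens. As an illustration of that format, here is a small helper that pulls one value out of such a string. The function name `cmdline_get` is hypothetical, and the sketch deliberately ignores quoted values containing spaces.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find "key=value" in a whitespace-separated kernel command line and
 * copy the value into out (capacity outsz, including the terminator).
 * Returns 0 on success, -1 if the key is missing or the value does
 * not fit.
 */
int cmdline_get(const char *cmdline, const char *key, char *out, size_t outsz)
{
    size_t keylen = strlen(key);
    const char *p = cmdline;

    while (*p != '\0') {
        size_t toklen;

        /* Skip separators, then measure the next token. */
        p += strspn(p, " \t\n");
        toklen = strcspn(p, " \t\n");

        /* The token must start with the key followed directly by '='. */
        if (toklen > keylen && strncmp(p, key, keylen) == 0 &&
            p[keylen] == '=') {
            size_t vallen = toklen - keylen - 1;

            if (vallen >= outsz)
                return -1;
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 0;
        }
        p += toklen;
    }
    return -1;
}
```

Applied to the bootargs above, asking for "root" yields "/dev/nfs" and asking for "nfsroot" yields "192.168.1.100:/srv/nfs/rootfs". Because the key must sit at the start of a token, "root" does not accidentally match inside "nfsroot".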
A minimal rootfs directory structure:
```text
/
├── bin/          busybox, sh, ls, cat, ...
├── sbin/         init, mount, ifconfig, ...
├── lib/          libc.so, ld-linux.so (target libraries)
├── etc/
│   ├── init.d/   Boot scripts (if using BusyBox init)
│   ├── fstab     Filesystem mount table
│   └── inittab   BusyBox init configuration
├── dev/          Device nodes (/dev/console, /dev/null)
├── proc/         Mount point for procfs
├── sys/          Mount point for sysfs
├── tmp/          Temporary files (tmpfs)
└── usr/
    ├── bin/      Additional utilities
    └── lib/      Additional libraries
```
You now understand the complete embedded Linux stack: from deciding whether to use Linux at all, through cross-compilation, daemon creation, kernel module development, character device drivers, platform drivers with Device Tree, kernel building, and the boot process. Let's consolidate with templates and cheat sheets you can use in real projects.
Kernel Module Template — starting point for any new module:
```c
// SPDX-License-Identifier: GPL-2.0
/* my_module.c: template */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("Description");
MODULE_VERSION("1.0");

static int __init my_init(void)
{
    pr_info("Module loaded\n");
    return 0;
}

static void __exit my_exit(void)
{
    pr_info("Module unloaded\n");
}

module_init(my_init);
module_exit(my_exit);
```
Character Device Driver Skeleton with automatic /dev creation:
```c
// SPDX-License-Identifier: GPL-2.0
/* One-argument class_create() requires kernel 6.4 or later */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/err.h>
#include <linux/uaccess.h>

#define DEV_NAME "mychardev"

MODULE_LICENSE("GPL");

static dev_t devno;
static struct cdev cdev;
static struct class *cls;

static int dev_open(struct inode *i, struct file *f) { return 0; }
static int dev_release(struct inode *i, struct file *f) { return 0; }

static ssize_t dev_read(struct file *f, char __user *b, size_t n, loff_t *o)
{
    return 0;   /* Nothing to read yet: report EOF */
}

static ssize_t dev_write(struct file *f, const char __user *b, size_t n, loff_t *o)
{
    return n;   /* Accept and discard everything */
}

static struct file_operations fops = {
    .owner   = THIS_MODULE,
    .open    = dev_open,
    .release = dev_release,
    .read    = dev_read,
    .write   = dev_write,
};

static int __init chrdev_init(void)
{
    struct device *dev;
    int ret;

    ret = alloc_chrdev_region(&devno, 0, 1, DEV_NAME);
    if (ret)
        return ret;

    cdev_init(&cdev, &fops);
    ret = cdev_add(&cdev, devno, 1);
    if (ret)
        goto err_region;

    cls = class_create(DEV_NAME);
    if (IS_ERR(cls)) {
        ret = PTR_ERR(cls);
        goto err_cdev;
    }

    dev = device_create(cls, NULL, devno, NULL, DEV_NAME);
    if (IS_ERR(dev)) {
        ret = PTR_ERR(dev);
        goto err_class;
    }
    return 0;

err_class:
    class_destroy(cls);
err_cdev:
    cdev_del(&cdev);
err_region:
    unregister_chrdev_region(devno, 1);
    return ret;
}

static void __exit chrdev_exit(void)
{
    device_destroy(cls, devno);
    class_destroy(cls);
    cdev_del(&cdev);
    unregister_chrdev_region(devno, 1);
}

module_init(chrdev_init);
module_exit(chrdev_exit);
```
Device Tree syntax cheat sheet:
```dts
/* Basic node */
node_label: node-name@address {
    compatible = "vendor,device";
    reg = <address size>;
    status = "okay";
};

/* GPIO reference */
gpios = <&gpio_controller pin_number flags>;

/* Interrupt */
interrupt-parent = <&intc>;
interrupts = <irq_number flags>;

/* Clock */
clocks = <&clock_controller clock_id>;
clock-names = "main";

/* DMA (one entry per name) */
dmas = <&dma_controller tx_channel>, <&dma_controller rx_channel>;
dma-names = "tx", "rx";
```
Cross-compilation cheat sheet:
```bash
# ARM 32-bit (Cortex-A7/A8/A9/A53 in 32-bit mode)
export CROSS_COMPILE=arm-linux-gnueabihf-
export ARCH=arm

# ARM 64-bit (Cortex-A53/A72 in 64-bit mode)
export CROSS_COMPILE=aarch64-linux-gnu-
export ARCH=arm64

# RISC-V
export CROSS_COMPILE=riscv64-linux-gnu-
export ARCH=riscv

# Common commands pattern
${CROSS_COMPILE}gcc -o app app.c    # Compile userspace
${CROSS_COMPILE}objdump -d app      # Disassemble
${CROSS_COMPILE}readelf -h app      # Check ELF header
${CROSS_COMPILE}strip app           # Remove debug symbols (smaller binary)
```
Build system comparison:
| System | Produces | Learning Curve | Rebuild Time | Best For |
|---|---|---|---|---|
| Manual | Individual programs | High | Fast | Understanding, tiny systems |
| Buildroot | Complete SD image | Low | ~30min full | Prototypes, simple products |
| Yocto | Complete SD image | Very high | ~2hr full | Production, long-term support |
| Debian/Ubuntu | Package-based rootfs | Low | Minutes (apt) | Development boards, quick hacks |
Putting it together: a driver that matches compatible = "ti,tmp102". You'll need: an I2C driver with an of_match table, the I2C client saved in probe, a sysfs attribute group, and proper cleanup (handled here by the devm_ helpers, so no remove callback is required).

```c
// SPDX-License-Identifier: GPL-2.0
/* tmp102_driver.c: I2C temperature sensor driver skeleton */
#include <linux/module.h>
#include <linux/mod_devicetable.h>
#include <linux/mutex.h>
#include <linux/i2c.h>
#include <linux/hwmon.h>
#include <linux/hwmon-sysfs.h>

MODULE_LICENSE("GPL");

struct tmp102_data {
    struct i2c_client *client;
    struct mutex lock;
};

/* Read the temperature register; stores millidegrees C in *temp.
 * The value is returned separately from the error code because
 * temperatures below 0 C are valid readings. */
static int tmp102_read_temp(struct tmp102_data *data, int *temp)
{
    int ret;
    s16 raw;

    mutex_lock(&data->lock);
    ret = i2c_smbus_read_word_swapped(data->client, 0x00);
    mutex_unlock(&data->lock);
    if (ret < 0)
        return ret;

    raw = (s16)ret >> 4;      /* 12-bit resolution, sign-extended */
    *temp = raw * 625 / 10;   /* 0.0625 C per count, in millidegrees */
    return 0;
}

static ssize_t temp_show(struct device *dev,
                         struct device_attribute *attr, char *buf)
{
    struct tmp102_data *data = dev_get_drvdata(dev);
    int temp, ret;

    ret = tmp102_read_temp(data, &temp);
    if (ret < 0)
        return ret;
    return sysfs_emit(buf, "%d\n", temp);
}
static DEVICE_ATTR_RO(temp);

static struct attribute *tmp102_attrs[] = {
    &dev_attr_temp.attr,
    NULL,
};
ATTRIBUTE_GROUPS(tmp102);

static int tmp102_probe(struct i2c_client *client)
{
    struct tmp102_data *data;
    struct device *hwmon_dev;

    data = devm_kzalloc(&client->dev, sizeof(*data), GFP_KERNEL);
    if (!data)
        return -ENOMEM;

    data->client = client;
    mutex_init(&data->lock);

    hwmon_dev = devm_hwmon_device_register_with_groups(
        &client->dev, "tmp102", data, tmp102_groups);
    return PTR_ERR_OR_ZERO(hwmon_dev);
}

static const struct of_device_id tmp102_of_match[] = {
    { .compatible = "ti,tmp102" },
    { },
};
MODULE_DEVICE_TABLE(of, tmp102_of_match);

static const struct i2c_device_id tmp102_id[] = {
    { "tmp102", 0 },
    { },
};
MODULE_DEVICE_TABLE(i2c, tmp102_id);

static struct i2c_driver tmp102_driver = {
    .driver = {
        .name           = "tmp102",
        .of_match_table = tmp102_of_match,
    },
    .probe    = tmp102_probe,
    .id_table = tmp102_id,
};
module_i2c_driver(tmp102_driver);   /* Registers the driver on load */
```

```dts
/* Device Tree binding for the TMP102 */
&i2c1 {
    tmp102@48 {
        compatible = "ti,tmp102";
        reg = <0x48>;
    };
};
```
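The register-to-temperature arithmetic in the driver is easy to get wrong for negative readings, so it is worth checking in userspace first. The sketch below reproduces the conversion as a standalone function, assuming the sensor's default 12-bit mode where the reading occupies the top 12 bits of the 16-bit register and each count is 0.0625 °C. The function name is made up for this example.

```c
#include <stdint.h>

/*
 * Convert a 16-bit TMP102 temperature register value (host byte
 * order, 12-bit mode) to millidegrees Celsius.
 */
int tmp102_reg_to_millicelsius(uint16_t reg)
{
    /* Interpret the register as a signed two's-complement value. */
    int raw = (reg >= 0x8000) ? (int)reg - 0x10000 : (int)reg;

    /* The reading is in the top 12 bits; the low 4 bits are zero in
     * 12-bit mode, so dividing by 16 removes them. */
    int counts = raw / 16;

    /* One count is 0.0625 degC, i.e. 62.5 millidegrees. */
    return counts * 625 / 10;
}
```

Two data points make handy sanity checks: a register value of 0x1900 corresponds to 25 °C (25000 millidegrees), and 0xE700 corresponds to -25 °C.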
Connections to other lessons:
| If you want to learn... | See |
|---|---|
| Real-time guarantees on embedded | RTOS lesson |
| How processors execute instructions | Computer Architecture |
| Networking protocols on embedded | Networking lesson |
| State machines for firmware logic | MDPs & Decision Making |
"The art of programming is the art of organizing complexity." — Edsger Dijkstra