Kernel Fuzzing in Userspace with LKL

In this article we'll explore the Linux Kernel Library (LKL), diving into its functionality, how it operates, and its use in fuzzing kernel components from userspace with tools like libFuzzer. We'll examine how LKL emulates a kernel environment and enables access to kernel functions from a userspace application.

Jul 10

LKL

The Linux Kernel Library (LKL) transforms the Linux kernel into a userspace library, enabling kernel code such as drivers, filesystems, or networking stacks to run as part of a standard process without requiring a full operating system or hardware. By compiling the kernel into a shared object (e.g., liblkl.so), LKL allows developers to call kernel functions directly from userspace, bypassing the need for root access, virtual machines, or complex kernel module loading. This makes it an ideal tool for testing, debugging, and fuzzing kernel components in a controlled, lightweight environment.

How LKL Works

LKL emulates a virtual kernel environment by providing a minimal, self-contained runtime that mimics core kernel subsystems. Here's how it operates at a high level:

Initialization: The lkl_init function loads the LKL library, and lkl_start_kernel sets up a virtual kernel with a specified memory limit (e.g., 50MB). This initializes essential components like memory management, interrupt handling, and a simplified scheduler.

Syscall Mapping: LKL provides wrappers (e.g., lkl_sys_open, lkl_sys_read) that mirror POSIX syscalls and route them to their kernel equivalents (e.g., sys_open, sys_read) through the library's syscall entry point. The lkl_host_ops structure works in the other direction: it supplies the host primitives (threads, memory, timers, console output) that the kernel needs in order to run inside a process.

Virtual Subsystems: LKL supports virtual filesystems (e.g., sysfs, proc, dev) and basic kernel facilities like virtual IRQs and timers. This creates a sandboxed environment where kernel objects, such as device nodes or files, can be accessed.

Component Integration: Kernel components (e.g., drivers or filesystems) are compiled into the LKL image and initialized during boot, behaving as they would in a real kernel but within the userspace process.

Single-Threaded Execution: Unlike a full kernel, LKL typically runs in a single-threaded context, relying on a custom scheduler to manage tasks, which simplifies execution but limits concurrency compared to a real kernel.

LKL's versatility supports a wide range of applications:

Driver Testing: Test kernel drivers (e.g., char devices, block devices) in isolation, debugging without risking a live kernel crash.

Fuzzing: Feed inputs to kernel components to uncover vulnerabilities like buffer overflows or memory errors, leveraging sanitizers like ASan or UBSan.

Filesystem Development: Test new filesystems or modifications in userspace before deploying to a real kernel.

Cross-Platform Kernel Code: Run Linux kernel code on non-Linux systems (e.g., Windows, with tweaks).

Accessing Syscalls and Kernel Space from Userspace with LKL

LKL bridges userspace and kernel space by providing a syscall interface that mimics standard POSIX calls but executes within the virtual kernel. Here's how it works:

Syscall Wrappers: Userspace programs call functions like lkl_sys_open or lkl_sys_write, declared in lkl.h. These wrappers invoke the corresponding kernel syscall handlers (e.g., sys_open, sys_write) inside the library. The lkl_host_ops structure is separate: it is how the kernel reaches host primitives such as threads, memory, and timers.

Kernel Space Access: By initializing LKL, the program creates a virtual kernel environment where kernel objects (e.g., device nodes like /dev/mydevice or files in /proc) are accessible. For example, opening /dev/mydevice with lkl_sys_open triggers the driver's open function in kernel space.

Virtual Filesystems: LKL's lkl_mount_fs function mounts sysfs, proc, or dev, allowing access to kernel data structures (e.g., /proc/devices lists registered drivers' major numbers).

Direct Interaction: Userspace programs can manipulate kernel objects by passing data through syscalls, which LKL routes to the appropriate kernel functions, such as a driver's read or write handlers.
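The path from a typed wrapper to a kernel handler can be pictured with a toy dispatcher. This is a simplified model for illustration only, not LKL's implementation: every toy_* name is invented here, and a fixed in-memory buffer stands in for driver state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: none of these names belong to LKL. */
enum { TOY_NR_WRITE = 0, TOY_NR_READ = 1, TOY_NR_MAX };

static char toy_device[64];     /* stands in for driver state */
static size_t toy_device_len;

/* "Kernel-side" handlers, analogous to a driver's write/read callbacks. */
static long toy_handle_write(const intptr_t *params)
{
    const char *src = (const char *)params[0];
    size_t len = (size_t)params[1];

    if (len > sizeof(toy_device))
        return -1;              /* reject oversized requests */
    memcpy(toy_device, src, len);
    toy_device_len = len;
    return (long)len;
}

static long toy_handle_read(const intptr_t *params)
{
    char *dst = (char *)params[0];
    size_t len = (size_t)params[1];

    if (len > toy_device_len)
        len = toy_device_len;   /* return only what was stored */
    memcpy(dst, toy_device, len);
    return (long)len;
}

typedef long (*toy_handler_t)(const intptr_t *params);

static const toy_handler_t toy_table[TOY_NR_MAX] = {
    [TOY_NR_WRITE] = toy_handle_write,
    [TOY_NR_READ]  = toy_handle_read,
};

/* Single entry point: look up the handler by syscall number. */
long toy_syscall(long nr, const intptr_t *params)
{
    if (nr < 0 || nr >= TOY_NR_MAX)
        return -1;
    return toy_table[nr](params);
}

/* Wrappers marshal typed arguments into the generic parameter array. */
long toy_sys_write(const void *buf, size_t len)
{
    intptr_t params[2] = { (intptr_t)buf, (intptr_t)len };
    return toy_syscall(TOY_NR_WRITE, params);
}

long toy_sys_read(void *buf, size_t len)
{
    intptr_t params[2] = { (intptr_t)buf, (intptr_t)len };
    return toy_syscall(TOY_NR_READ, params);
}
```

Calling toy_sys_write("abc", 3) followed by toy_sys_read(out, sizeof(out)) returns the three stored bytes. The point of the shape is that the caller only ever sees ordinary function calls, while the number-to-handler lookup stays hidden behind them.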

For example, to interact with a char device driver:

Initialize LKL with lkl_init and lkl_start_kernel.
Mount proc and read /proc/devices to find the driver's major number.
Create a device node (e.g., /dev/mydevice) with lkl_sys_mknodat.
Use lkl_sys_open, lkl_sys_read, or lkl_sys_write to call driver functions.

This approach allows precise, controlled access to kernel space, making LKL ideal for targeted testing and fuzzing.
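The second step, finding the major number, is plain text parsing once the contents of /proc/devices are in a buffer. Below is a minimal sketch of that parsing, assuming the usual layout of a "Character devices:" section followed by a "Block devices:" section. The function name find_char_major is hypothetical, and reading the file through the LKL syscalls is left out so the example stays self-contained.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the major number registered for `name` in the "Character devices"
 * section of a /proc/devices dump, or -1 if it is not listed there.
 */
int find_char_major(const char *devices_text, const char *name)
{
    int in_char_section = 0;
    const char *line = devices_text;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        char buf[128];

        /* Lines too long for the scratch buffer are skipped. */
        if (len < sizeof(buf)) {
            int major;
            char dev[64];

            memcpy(buf, line, len);
            buf[len] = '\0';

            if (strncmp(buf, "Character devices:", 18) == 0)
                in_char_section = 1;
            else if (strncmp(buf, "Block devices:", 14) == 0)
                in_char_section = 0;
            else if (in_char_section &&
                     sscanf(buf, "%d %63s", &major, dev) == 2 &&
                     strcmp(dev, name) == 0)
                return major;
        }

        if (end == NULL)
            break;
        line = end + 1;
    }
    return -1;
}
```

Restricting the match to the character-device section matters because a block driver can register the same name or number, and the node created afterwards must be a char device.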

Let's apply LKL to a specific case: a custom char device driver (driver.c) named mobilehackinglab with intentional vulnerabilities to demonstrate fuzzing potential.
The driver has:

Write Vulnerability: Copies user data to a 512-byte stack buffer via copy_from_user without size checks, so inputs larger than 512 bytes overflow the stack.

Read Vulnerability: Copies from a 16-byte tmp buffer to a 500-byte buffer via memcpy, then on to userspace. Lengths over 16 bytes read past the end of the source buffer, and lengths over 500 bytes also overflow the destination buffer.
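To make the two bugs concrete, here is a userspace model of the length checks that the handlers leave out. It is a sketch under assumptions and not the article's driver: the model_* names are invented, memcpy stands in for copy_from_user and copy_to_user, and the 'A' fill mirrors the output the read path is described as returning.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_WRITE_BUF 512   /* stack buffer size on the write path */
#define MODEL_TMP_BUF    16   /* source buffer on the read path */
#define MODEL_READ_BUF  500   /* intermediate buffer on the read path */

/* Write path: reject lengths that would overrun the 512-byte buffer. */
long model_write(const char *src, size_t len)
{
    char kbuf[MODEL_WRITE_BUF];

    if (len > sizeof(kbuf))
        return -EINVAL;         /* the vulnerable driver has no such check */
    memcpy(kbuf, src, len);     /* stands in for copy_from_user() */
    return (long)len;
}

/* Read path: clamp to the 16-byte source before any copy happens. */
long model_read(char *dst, size_t len)
{
    char tmp[MODEL_TMP_BUF];
    char out[MODEL_READ_BUF];

    memset(tmp, 'A', sizeof(tmp));
    if (len > sizeof(tmp))
        len = sizeof(tmp);      /* never read past the end of tmp */
    memcpy(out, tmp, len);
    memcpy(dst, out, len);      /* stands in for copy_to_user() */
    return (long)len;
}
```

With the checks in place, a 513-byte write is refused and a 64-byte read is shortened to 16 bytes. Removing either check reproduces the condition a fuzzer is expected to find.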

Here's the driver code, built into an LKL-enabled Android kernel (add to drivers/char/Kconfig and Makefile as built-in):

Accessing the custom kernel with LKL

To interact with our mobilehackinglab driver, we use a userspace program (mhl_driver.c) that initializes LKL, starts the virtual kernel, mounts sysfs, proc, and dev, finds the driver's major number (e.g., 254) in /proc/devices, creates /dev/mobilehackinglab, and performs open, write, read, and close. We use small lengths (35 bytes for write, 10 bytes for read) to avoid crashes, but you can increase them to demonstrate the vulnerabilities (e.g., write_len > 512 for stack overflow, read_len > 16 for memcpy overflow).

The initialize_lkl function sets up the LKL environment to enable interaction with the mobilehackinglab driver. It starts by calling lkl_init to load the LKL library, followed by lkl_start_kernel with parameters like mem=50M and kasan.fault=panic to create a virtual kernel with 50MB of memory and strict error handling. It then mounts virtual filesystems (sysfs, proc, dev) to provide access to kernel structures. The function uses LKL syscalls (LKL_CALL(open), LKL_CALL(read), LKL_CALL(close)) to read /proc/devices, parsing it to find the major number (e.g., 254) for mobilehackinglab. With this number, it creates the device node /dev/mobilehackinglab using LKL_CALL(mknodat), enabling subsequent syscalls to access the driver.
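The mknodat syscall takes the device number as one packed integer rather than a separate major and minor. How the article's program builds that value is not shown, so the sketch below assumes it is packed by hand in the layout used by the kernel's new_encode_dev helper. The function names are illustrative.

```c
#include <stdint.h>

/* Pack major/minor in the layout the mknod family of syscalls decodes. */
uint32_t encode_dev(uint32_t major, uint32_t minor)
{
    return (minor & 0xffu) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Inverse operations, useful for checking a value before using it. */
uint32_t decode_major(uint32_t dev)
{
    return (dev & 0xfff00u) >> 8;
}

uint32_t decode_minor(uint32_t dev)
{
    return (dev & 0xffu) | ((dev >> 12) & 0xfff00u);
}
```

For the example major 254 and minor 0, encode_dev returns 0xfe00. For minors below 256 this matches the simpler (major << 8) | minor form.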

This setup is crucial, as it establishes the virtual kernel environment where the driver's functions (open, write, read, close) operate, allowing our program to interact with the driver's vulnerabilities (e.g., stack overflow in write for large inputs) in a controlled userspace context.

The main function orchestrates the interaction with the mobilehackinglab driver, serving as the entry point for our userspace program. It begins by calling initialize_lkl to set up the LKL environment and create the /dev/mobilehackinglab node.

It then opens the device using LKL_CALL(open) with O_RDWR | O_CLOEXEC flags, triggering the driver's driver_open function in the virtual kernel.

For the write operation, it prepares a 35-byte test string ("This is a test write to the driver."), safely under the 512-byte limit to avoid the stack overflow vulnerability, and sends it via LKL_CALL(write), which invokes the driver's driver_write function.

The dump_bytes function displays the sent data, and the program checks the write's success, printing the result. For the read operation, it requests 10 bytes (safe, under the 16-byte tmp buffer limit) using LKL_CALL(read), triggering driver_read, and displays the returned data (10 'A's) via dump_bytes.

Finally, it closes the device with LKL_CALL(close) and cleans up with lkl_cleanup. This function demonstrates how LKL syscalls access the driver's kernel-space functions, setting the stage for fuzzing by replacing hardcoded inputs with fuzzer-generated ones to exploit the vulnerabilities.
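The walkthrough relies on a dump_bytes helper to show what was sent and received. Its source is not reproduced in this text, so the version below is a hypothetical reimplementation with an assumed signature: format_bytes renders bytes as hex into a caller-supplied buffer, and dump_bytes prints them 16 per line.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format bytes from `data` as space-separated hex into `out`.
 * Stops early if `out` is too small. Returns the number of bytes formatted.
 */
size_t format_bytes(const unsigned char *data, size_t len,
                    char *out, size_t out_size)
{
    size_t i, used = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < len; i++) {
        /* "xx" for the first byte, " xx" afterwards, plus the NUL. */
        size_t need = (i ? 3u : 2u) + 1u;

        if (used + need > out_size)
            break;
        used += (size_t)snprintf(out + used, out_size - used,
                                 i ? " %02x" : "%02x", data[i]);
    }
    return i;
}

/* Print a labelled hex dump, 16 bytes per line. */
void dump_bytes(const char *label, const void *data, size_t len)
{
    const unsigned char *p = data;
    char line[16 * 3 + 1];
    size_t off = 0;

    printf("%s (%zu bytes)\n", label, len);
    while (off < len) {
        size_t chunk = len - off > 16 ? 16 : len - off;

        off += format_bytes(p + off, chunk, line, sizeof(line));
        printf("  %s\n", line);
    }
}
```

Keeping the formatting separate from the printing makes the helper easy to check in isolation, and it lets a fuzzing harness drop the printf calls entirely when output would slow down each iteration.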

How LKL Can Be Used for Fuzzing

Fuzz Logic: In LLVMFuzzerTestOneInput (for libFuzzer) or a persistent loop (for AFL++), parse inputs, open the target device or subsystem, perform operations (e.g., read/write), and close.

Sanitizers: Use AddressSanitizer (ASan) to catch memory errors (e.g., buffer overflows) and UndefinedBehaviorSanitizer (UBSan) for issues like invalid casts or out-of-bounds accesses.

Coverage Feedback: libFuzzer or AFL++ tracks coverage, mutating inputs to explore new paths.

Alternative Fuzzers: AFL++ persistent mode can wrap the same logic for faster iteration, ideal for high-volume testing.

For our mobilehackinglab driver, fuzzing targets the stack overflow in write (len > 512) and memcpy overflow in read (len > 16), using ASan to catch crashes.

LKL Fuzzing vs. Other Kernel Fuzzers

LKL Fuzzing

Mechanism: Runs kernel as a userspace library, targeting specific components via syscalls.
Use Case: Ideal for isolated drivers or subsystems (e.g., char devices, filesystems).
Integration: Works with libFuzzer or AFL++ for coverage-guided fuzzing.

kAFL

Mechanism: Uses hardware-assisted virtualization (KVM with Intel PT) to fuzz full kernels in a VM, capturing coverage at the hardware level.
Use Case: Broad kernel fuzzing, including syscalls, drivers, and core logic.
Integration: Pairs with AFL-style fuzzers, using VM snapshots.

Syzkaller

Mechanism: Generates syscalls on a live kernel (VM or hardware), using syscall descriptions to explore paths.
Use Case: Comprehensive fuzzing of syscalls, drivers, and interactions.
Integration: Custom fuzzer with KCOV coverage.

Pros and Cons of LKL-Based Fuzzing vs. kAFL and Syzkaller

LKL with libFuzzer/AFL++

Pros:

Speed: Userspace execution enables thousands of iterations per second, with no VM overhead.
Simplicity: Compile the kernel as a library, link, and run, with no VMs or root required.
Safety: Crashes are contained in the process, protecting the host.
Sanitizer Support: Integrates with ASan/UBSan to catch memory errors and undefined behavior.
Targeted Testing: Ideal for specific drivers or subsystems, like our mobilehackinglab.
Flexible Fuzzers: Supports libFuzzer for tight integration or AFL++ for high-throughput persistent mode.

Cons:

Limited Scope: Best for isolated components; less suited for complex kernel interactions (e.g., scheduling, interrupts).
LKL Quirks: Requires config tweaks and static driver integration.
No Dynamic Modules: Components must be built into the LKL image (no insmod).
Single-Threaded: Limited concurrency compared to full kernel environments.

kAFL

Pros:

Precise Coverage: Hardware-based coverage (Intel PT) finds deep kernel paths.
Full Kernel Context: Tests interrupts, scheduling, and drivers in a VM.
Snapshots: VM resets speed up fuzzing cycles.

Cons:

Complex Setup: Requires KVM, Intel PT, and VM configuration.
Hardware Dependency: Limited to Intel CPUs with PT support.
Overhead: Slower than LKL due to VM execution.

Syzkaller

Pros:

Comprehensive: Tests syscalls, drivers, and races with detailed syscall specs.
Community: Large corpus and active maintenance.
Coverage: KCOV provides robust kernel coverage.

Cons:

Complexity: Needs syscall descriptions, VM setup, and root access.
Speed: Slower due to full kernel runs and reboots on crashes.
Resource Intensive: Requires significant CPU/memory for VMs.

Strategy for Fuzzing the Custom Driver

To fuzz our mobilehackinglab driver, extend the access program into a libFuzzer or AFL++ harness:

Input Parsing: Use fuzz_data_t to specify mode (read/write), length, and data.

Fuzz Loop: In LLVMFuzzerTestOneInput (libFuzzer) or a persistent loop (AFL++), open /dev/mobilehackinglab, perform read/write, and close.

Sanitizers: ASan catches both the stack overflow (write len > 512) and the out-of-bounds memcpy read (read len > 16); UBSan adds checks for undefined behavior such as invalid casts or out-of-range indexing.

Coverage: libFuzzer or AFL++ mutates inputs based on coverage to hit new paths.
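The strategy above can be sketched as a libFuzzer entry point. The article names fuzz_data_t but does not give its layout here, so the one-byte mode plus two-byte little-endian length header is an assumption, as are the field names. target_write and target_read are stubs standing in for the LKL open, write, read, and close calls on /dev/mobilehackinglab, which keeps the sketch self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum fuzz_mode { FUZZ_READ = 0, FUZZ_WRITE = 1 };

typedef struct {
    enum fuzz_mode mode;    /* which driver entry point to exercise */
    uint16_t length;        /* length argument passed to read/write */
    const uint8_t *data;    /* payload used in write mode */
    size_t data_len;        /* bytes actually available in `data` */
} fuzz_data_t;

/* Assumed layout: [mode:1][length:2, little-endian][payload...] */
int parse_fuzz_input(const uint8_t *buf, size_t size, fuzz_data_t *out)
{
    if (size < 3)
        return -1;
    out->mode = (buf[0] & 1) ? FUZZ_WRITE : FUZZ_READ;
    out->length = (uint16_t)(buf[1] | (buf[2] << 8));
    out->data = buf + 3;
    out->data_len = size - 3;
    return 0;
}

/* Stubs: a real harness would call into the driver through LKL here. */
unsigned long writes_seen, reads_seen;

static long target_write(const uint8_t *buf, size_t len)
{
    (void)buf;
    writes_seen++;
    return (long)len;
}

static long target_read(uint8_t *buf, size_t len)
{
    memset(buf, 'A', len);
    reads_seen++;
    return (long)len;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    fuzz_data_t in;
    uint8_t *scratch;

    if (parse_fuzz_input(data, size, &in) != 0)
        return 0;           /* input too short to decode */

    /*
     * Own a buffer of exactly `length` bytes, so that any overflow the
     * sanitizer reports comes from the target and not from the harness
     * reading past the end of the fuzzer's input.
     */
    scratch = calloc(in.length ? in.length : 1, 1);
    if (scratch == NULL)
        return 0;

    if (in.mode == FUZZ_WRITE) {
        size_t n = in.data_len < in.length ? in.data_len : in.length;

        memcpy(scratch, in.data, n);
        target_write(scratch, in.length);
    } else {
        target_read(scratch, in.length);
    }

    free(scratch);
    return 0;
}
```

Letting the fuzzer control the length separately from the payload size is the design choice that matters: a two-byte length lets a very short input request a 513-byte write or a 17-byte read, which is how the mutator reaches both bugs quickly.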
