From Bytes to Bugs: Fuzzing Android Libraries

In this post, we'll walk through fuzzing from the ground up: what it is, the main styles (black-, grey-, and white-box), and how to pick the right fuzzer for your target. Then we'll build practical fuzzing harnesses, point them at real CVE-affected libraries, and reproduce the exact crashes behind those bugs. Along the way, you'll use ASan/UBSan to turn silent memory errors into immediate, reportable crashes.

Oct 20

Fuzzing is automated test generation that feeds lots of weird inputs into your code to shake out bugs. Modern fuzzers are coverage-guided: they learn which inputs open new code paths and then mutate those winners to go deeper. Pair that with runtime "oracles" like ASan (AddressSanitizer) and UBSan (UndefinedBehaviorSanitizer), and you catch memory corruption, overflows, UAFs, and UB quickly.
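To make that concrete, here is a minimal sketch of what a coverage-guided fuzzer actually calls. The target, parse_record, is a made-up length-prefixed parser invented for this illustration (it is not from any real library); only the LLVMFuzzerTestOneInput entry point is the real libFuzzer interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical target for illustration: parses a record laid out as
 * [1-byte length N][N payload bytes] and copies the payload out.
 * Returns the payload length, or -1 if the input is malformed. */
static int parse_record(const uint8_t *data, size_t size,
                        uint8_t *out, size_t out_cap) {
    if (size < 1) return -1;       /* need at least the length byte */
    size_t n = data[0];
    if (n > size - 1) return -1;   /* payload must fit inside the input */
    if (n > out_cap) return -1;    /* ...and inside the output buffer */
    memcpy(out, data + 1, n);
    return (int)n;
}

/* libFuzzer calls this once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    uint8_t out[64];
    parse_record(data, size, out, sizeof(out));
    return 0;  /* the result is ignored; bugs surface as sanitizer crashes */
}
```

Built with something like clang -g -O1 -fsanitize=fuzzer,address,undefined, the sanitizers act as the oracle: if you deleted either bounds check in parse_record, ASan would likely flag the out-of-bounds memcpy as soon as the fuzzer produced a length byte larger than the data that follows it.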

In practice, many fuzzing workflows combine elements of black-box, grey-box, and white-box fuzzing to maximize effectiveness. For example:

Grey-box + White-box: Tools like Driller or QSYM integrate coverage-guided fuzzing with selective symbolic (concolic) execution to balance speed and depth.
Black-box + Grey-box: Start with black-box fuzzing to quickly identify low-hanging bugs, then switch to grey-box for deeper exploration.
Custom workflows: Security researchers often chain multiple fuzzers, using black-box to generate initial inputs, grey-box to refine them, and white-box to target specific vulnerabilities.
Input strategies:
Mutation-based (common): start from valid seeds, mutate them. Great for binary formats, images, parsers.
Generation/grammar-based: synthesize inputs from a spec/grammar (e.g., libprotobuf-mutator). Higher validity, excellent for deeply structured formats.
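As a rough illustration of the mutation-based strategy, the sketch below flips a single random bit in a seed buffer. Real fuzzers stack many such operators (byte inserts, splices, dictionary tokens); the helper names xorshift32 and mutate_bitflip are invented for this example and do not come from any fuzzer's codebase.

```c
#include <stddef.h>
#include <stdint.h>

/* Tiny xorshift32 PRNG so the sketch is deterministic for a given seed.
 * The state must be non-zero. */
static uint32_t xorshift32(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Flip one randomly chosen bit in buf. Returns the index of the byte
 * that was changed, or (size_t)-1 if the buffer is empty. */
static size_t mutate_bitflip(uint8_t *buf, size_t len, uint32_t *rng) {
    if (len == 0) return (size_t)-1;
    size_t byte = xorshift32(rng) % len;
    unsigned bit = xorshift32(rng) % 8u;
    buf[byte] ^= (uint8_t)(1u << bit);
    return byte;
}
```

A coverage-guided fuzzer would apply a mutation like this to a corpus entry, run the target, and keep the mutated input only if it reached code the corpus had not covered before.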

Fuzzer	Style	Coverage Source	How You Drive It	Strengths	Gotchas
libFuzzer	In-process, mutation-based (grey-box)	LLVM SanitizerCoverage (edge counters)	Implement `int LLVMFuzzerTestOneInput(const uint8_t*, size_t)`	Very fast; unit-test style harness, great with ASan/UBSan	Single target per binary, in-process requires fuzzer-safe code (no global leaks/stateful crashes)
AFL++	Fork-server (grey-box), plus QEMU/Frida modes	Compiler RT or binary instr., QEMU DBT, Frida hooks	Feed files/stdin to `main()`, use persistent_mode for speed	Handles big apps/whole programs, rich corpus tools, many modes	Slower per-exec than in-process, best perf needs harnessing/persistent loop
LibAFL	Modular framework: in-process or fork-server; QEMU/Frida	Compiler instr., QEMU DBT, Frida hooks, custom observers	Write a harness (Rust or C/C++ via bindings); plug-and-play mutators/schedulers, supports persistent loops	Highly customizable, advanced schedulers, scale-out/distributed friendly, mobile (Frida) support	Steeper learning curve, more engineering to assemble a pipeline, fewer drop-in harnesses
Honggfuzz	Fork-server (grey-box)	Compiler instrumentation (hfuzz-cc) or hardware feedback (Intel BTS/PT via perf)	Similar to AFL++ (files/stdin), persistent mode available	Simple setup; solid performance, handy features (perf counters, sanitizers)	Smaller ecosystem vs. AFL++, fewer modes/integration options

Start with your target & constraints.

Have source and a callable parser/decoder? Use libFuzzer.

In-process, very fast, great with ASan/UBSan, ideal for small, deterministic harnesses.

Fuzzing a whole program or binary-only target? Use AFL++.

Prefer compiler instrumentation; switch to QEMU mode for closed binaries, or Frida mode to hook functions in mobile apps/.so files.

Need a customizable, scalable pipeline? Use LibAFL.

Mix-and-match schedulers/mutators, supports distributed runs and deep mobile hooking (QEMU/Frida); more engineering effort up front.

Want a lean fork-server with solid performance? Use Honggfuzz.

Simpler setup, fewer moving parts.

Rule of thumb:

libFuzzer for fast, source-instrumented libraries; AFL++ for programs and binaries (QEMU/Frida as needed); LibAFL for advanced/bespoke pipelines; Honggfuzz for a straightforward fork-server.

CVE-2023-4863 lives in the lossless WebP (VP8L) decoder, specifically in how it builds Huffman coding tables. A maliciously crafted WebP can cause BuildHuffmanTable to construct tables larger than the allocated buffer, triggering a heap buffer overflow. (See the WebP API docs: http://developers.google.com/speed/webp/docs/api)

To build the fuzzing harness, we consult the libwebp documentation to choose the right decode APIs and call sequence: what a valid invocation looks like, the required initialization (WebPDecoderConfig, colorspace), incremental vs. one-shot decode, and proper cleanup (WebPFreeDecBuffer, WebPIDelete). This ensures the harness drives the vulnerable code paths deterministically while staying stable and leak-free.

While fuzzing, we hit frequent OOMs and slow-path timeouts. Our initial generic WebP harness did reach the Huffman code path, but crashes were slow to surface. So we refined it into a VP8L-focused libFuzzer harness that adds strict checks on mutated inputs and tight resource caps. This version is designed for stability while reliably driving the lossless WebP (VP8L) Huffman path, the area implicated in CVE-2023-4863.

You can run the fuzzer with flags like -ignore_ooms=1 (and even -ignore_timeouts=1), which take effect in libFuzzer's -fork mode, to keep the campaign moving past OOMs and hangs, but they are not a silver bullet. Some mutated images can drive resident memory (RSS) so high that the process still runs out of memory despite the ignore setting. In practice, you'll want to combine those flags with tighter resource limits (e.g., -rss_limit_mb and -max_len) and with dimension/pixel caps in the harness, so pathological inputs can't blow up memory in the first place.
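A dimension/pixel cap inside the harness can be sketched as a small helper like the one below. The limits MAX_DIM and MAX_PIXELS and the function name rgba_size_within_caps are assumptions chosen for illustration, not values from the original harness; in a libwebp harness the width and height would typically come from a WebPGetFeatures probe.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative limits; tune them for your own campaign. */
#define MAX_DIM    1024u          /* max width or height, in pixels */
#define MAX_PIXELS (256u * 256u)  /* max width * height */

/* Returns 1 and stores the RGBA buffer size in *out_bytes if the image
 * is within the caps; returns 0 if the harness should skip this input. */
static int rgba_size_within_caps(int width, int height, size_t *out_bytes) {
    if (width <= 0 || height <= 0) return 0;
    if ((unsigned)width > MAX_DIM || (unsigned)height > MAX_DIM) return 0;
    /* Multiply in 64 bits so the product cannot wrap before the check. */
    uint64_t pixels = (uint64_t)width * (uint64_t)height;
    if (pixels > MAX_PIXELS) return 0;
    *out_bytes = (size_t)(pixels * 4u);  /* 4 bytes per RGBA pixel */
    return 1;
}
```

When the helper returns 0, the harness simply returns early, so oversized images never reach the decoder's allocation paths and the fuzzer spends its time on inputs that can actually finish decoding.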

The refined harness:

Pre-parses RIFF/WEBP: verifies RIFF/WEBP magic, iterates chunks, requires a VP8L chunk.
Caps chunk sizes: rejects chunks larger than a small threshold (e.g., 32 KiB) to avoid OOM/slow paths.
Uses WebPGetFeatures: quickly probes width/height/format, then clamps dimensions/pixel count.
Decodes into a caller buffer with WebPDecodeRGBAInto (small, fixed size) to prevent huge internal allocations.
Keeps fuzzing deterministic and fast (size limits, no threads), making it ideal when you're seeing OOMs/timeouts but still want to reach the Huffman table build code.
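The first two items (the RIFF/WEBP pre-parse and the chunk-size cap) can be sketched roughly as follows. This is not the original harness code: the function name looks_like_small_vp8l is made up, the 32 KiB cap just mirrors the example threshold above, and rejecting truncated chunks outright is a choice made here for simplicity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CHUNK_SIZE (32u * 1024u)  /* illustrative cap, 32 KiB */

/* Read a 32-bit little-endian integer, as used by RIFF size fields. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if data is a RIFF/WEBP container holding at least one VP8L
 * chunk and no chunk larger than MAX_CHUNK_SIZE; otherwise returns 0. */
static int looks_like_small_vp8l(const uint8_t *data, size_t size) {
    if (size < 12) return 0;
    if (memcmp(data, "RIFF", 4) != 0 || memcmp(data + 8, "WEBP", 4) != 0)
        return 0;

    size_t pos = 12;  /* first chunk starts after the 12-byte RIFF header */
    int has_vp8l = 0;
    while (size - pos >= 8) {  /* each chunk: 4-byte FourCC + 4-byte size */
        uint32_t chunk_size = read_le32(data + pos + 4);
        if (chunk_size > MAX_CHUNK_SIZE) return 0;  /* too big: skip input */
        if (chunk_size > size - pos - 8) return 0;  /* truncated payload */
        if (memcmp(data + pos, "VP8L", 4) == 0) has_vp8l = 1;
        pos += 8;
        /* RIFF pads odd-sized payloads with one extra byte. */
        size_t padded = (size_t)chunk_size + (chunk_size & 1u);
        if (padded > size - pos) break;  /* last chunk lacks its pad byte */
        pos += padded;
    }
    return has_vp8l;
}
```

In the harness, this would sit at the very top of LLVMFuzzerTestOneInput, returning 0 when the check fails, so that only small, VP8L-bearing inputs go on to the WebPGetFeatures probe and the decode call.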

In this blog post, we started with a simple goal: explain what fuzzing is and its main types, then apply coverage-guided, source-based fuzzing to Android-relevant libraries and turn it into a repeatable workflow that surfaces real bugs. Using libFuzzer with ASan/UBSan, we built two harnesses that balance depth and stability: a straightforward decoder harness for libpng (reproducing the CVE-2019-7317 use-after-free on 1.6.36) and a refined, VP8L-focused harness for libwebp (triggering the CVE-2023-4863 Huffman overflow on 1.3.1). Along the way, we handled the practical realities of image-codec fuzzing (OOMs, slow units, and timeouts) by adding size/dimension caps, caller-allocated buffers, and input pre-parsing (RIFF/VP8L checks).

The result is more than two PoCs: it's a portable methodology you can apply to other native libraries used in Android apps. Pick a vulnerable or high-risk parser, study its public API, design a deterministic, leak-free harness that exercises deep code paths, and add sensible validations so the fuzzer explores productively instead of drowning in pathological inputs. We also showed how to reproduce crashes deterministically and minimize crashing inputs. With these patterns in hand, you can iterate across additional file formats and media stacks common on Android, confident your fuzzing will be fast, stable, and capable of rediscovering (and preventing) real-world vulnerabilities.

Now that we've covered fuzzing with source available, do you want to learn advanced binary-only fuzzing? That means working without source: you pull a library from a mobile device, reverse it to find target functions, fuzz it by emulating with QEMU or on-device with Frida mode, and then move on to exploitation.

Check out our courses that cover full-chain exploitation:

Android Userland Fuzzing and Exploitation
https://www.mobilehackinglab.com/afe-promo

Android Kernel Fuzzing and Exploitation
https://www.mobilehackinglab.com/course/android-kernel-fuzzing-and-exploitation

Get both together in a special bundle sale at a 60% discount!
https://www.mobilehackinglab.com/bundles?bundle_id=android-kernel-userland

Want to learn to chain AppSec bugs to get remote code execution?

Advanced Android Hacking - Road to Pwn2Own
https://www.mobilehackinglab.com/course/advanced-android-hacking

