Fuzzing is automated test generation that feeds lots of weird inputs into your code to shake out bugs. Modern fuzzers are coverage-guided: they learn which inputs open new code paths and then mutate those winners to go deeper. Pair that with runtime "oracles" like ASan (AddressSanitizer) and UBSan (UndefinedBehaviorSanitizer), and you quickly catch memory corruption, buffer overflows, use-after-frees (UAFs), and undefined behavior (UB).
Black-box fuzzing involves testing a program without any knowledge of its internal structure, code, or logic. The fuzzer generates inputs (e.g., random strings, malformed files, or network packets) and feeds them to the target, observing only external behavior such as crashes, errors, or unexpected outputs.
Black-box fuzzing is ideal for quick, initial testing of closed-source applications, legacy systems, or scenarios where instrumentation is impractical. For example, you might fuzz a proprietary PDF reader by feeding it thousands of malformed PDF files to detect crashes.
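To make the black-box idea concrete, here is a minimal sketch in C of a "dumb" mutator that flips random bits in a seed input without any feedback from the target. The helper name `flip_random_bits` and the xorshift generator are illustrative assumptions, not taken from any particular fuzzer.

```c
#include <stddef.h>
#include <stdint.h>

/* Tiny xorshift32 PRNG so the sketch is deterministic for a given seed.
   The state must be non-zero. */
static uint32_t xorshift32(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Hypothetical helper: flip `flips` randomly chosen bits of buf in place. */
void flip_random_bits(uint8_t *buf, size_t len, unsigned flips, uint32_t *rng) {
    if (len == 0) return;                      /* nothing to mutate */
    for (unsigned i = 0; i < flips; i++) {
        size_t byte = xorshift32(rng) % len;   /* which byte to touch */
        unsigned bit = xorshift32(rng) % 8;    /* which bit inside it */
        buf[byte] ^= (uint8_t)(1u << bit);
    }
}
```

A black-box campaign would copy a seed file into `buf`, call the mutator, hand the result to the target, and only watch for crashes or hangs.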
Grey-box fuzzing strikes a balance between black-box and white-box approaches by incorporating lightweight instrumentation to gain partial visibility into the program's execution. It uses feedback, such as code coverage (e.g., which code branches or edges are executed), to guide input generation and prioritize inputs that explore new paths.
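The feedback described above boils down to one question per execution: did this input touch anything new? Here is a minimal sketch, assuming a fixed-size byte map in which each slot stands for one edge. The names `MAP_SIZE` and `merge_new_coverage` are hypothetical, and real fuzzers track hit-count buckets in much larger maps.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 64  /* hypothetical number of tracked edges */

/* Merge one execution's edge map into the campaign-wide map.
   Returns 1 if the run hit at least one edge never seen before,
   which is the signal to keep the input in the corpus. */
int merge_new_coverage(const uint8_t run_map[MAP_SIZE],
                       uint8_t global_map[MAP_SIZE]) {
    int found_new = 0;
    for (size_t i = 0; i < MAP_SIZE; i++) {
        if (run_map[i] && !global_map[i]) {
            global_map[i] = 1;   /* remember the edge for later runs */
            found_new = 1;
        }
    }
    return found_new;
}
```

Inputs for which this returns 1 are saved and mutated further; inputs that add nothing are discarded.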
White-box fuzzing leverages detailed knowledge of the program's internal structure, often using advanced techniques like symbolic execution or concolic execution to systematically explore code paths. It analyzes the program's control flow and constraints to generate inputs that target specific execution paths.
White-box fuzzing is best suited for critical, well-defined components where deep analysis is justified, such as cryptographic libraries or protocol parsers. For example, you could use KLEE to test a compression library for edge-case bugs by solving constraints to reach rare code paths.
In practice, many fuzzing workflows combine elements of black-box, grey-box, and white-box fuzzing to maximize effectiveness. For example:
| Fuzzer | Style | Coverage Source | How You Drive It | Strengths | Gotchas |
|---|---|---|---|---|---|
| libFuzzer | In-process, mutation-based (grey-box) | LLVM SanitizerCoverage (edge counters) | Implement `int LLVMFuzzerTestOneInput(const uint8_t*, size_t)` | Very fast; unit-test style harness, great with ASan/UBSan | Single target per binary, in-process requires fuzzer-safe code (no global leaks/stateful crashes) |
| AFL++ | Fork-server (grey-box), plus QEMU/Frida modes | Compiler RT or binary instr., QEMU DBT, Frida hooks | Feed files/stdin to `main()`, use persistent_mode for speed | Handles big apps/whole programs, rich corpus tools, many modes | Slower per-exec than in-process, best perf needs harnessing/persistent loop |
| LibAFL | Modular framework: in-process or fork-server; QEMU/Frida | Compiler instr., QEMU DBT, Frida hooks, custom observers | Write a harness (Rust or C/C++ via bindings); plug-and-play mutators/schedulers, supports persistent loops | Highly customizable, advanced schedulers, scale-out/distributed friendly, mobile (Frida) support | Steeper learning curve, more engineering to assemble a pipeline, fewer drop-in harnesses |
| Honggfuzz | Fork-server (grey-box) | Compiler instrumentation or hardware-based feedback (Intel BTS/PT, perf counters) | Similar to AFL++ (files/stdin), persistent mode available | Simple setup; solid performance, handy features (perf counters, sanitizers) | Smaller ecosystem vs. AFL++, fewer modes/integration options |
Start with your target & constraints.
Have source and a callable parser/decoder? Use libFuzzer.
The harness is the adapter between the fuzzer and your code under test. For libFuzzer, you implement:
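The entry point has a fixed signature defined by libFuzzer. Below is a minimal sketch of one. `parse_record` is a hypothetical stand-in for your real parser, and the 4 KiB cap is an arbitrary assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_INPUT_LEN 4096  /* arbitrary cap chosen for this sketch */

/* Hypothetical code under test: accepts a 1-byte length prefix
   followed by exactly that many payload bytes. */
static int parse_record(const uint8_t *data, size_t size) {
    if (size < 1) return -1;
    size_t declared = data[0];
    if (declared != size - 1) return -1;   /* reject inconsistent length */
    return 0;
}

/* libFuzzer calls this once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size > MAX_INPUT_LEN) return 0;    /* skip oversized inputs */
    (void)parse_record(data, size);        /* crashes, not return codes, are the signal */
    return 0;
}
```

With clang, a build line such as `clang -g -O1 -fsanitize=fuzzer,address,undefined harness.c -o harness` links in the libFuzzer driver and both sanitizers; `harness.c` is a placeholder file name.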
Let's put this into practice by building fuzzing harnesses for libraries used on Android. We'll focus on two real vulnerabilities: libpng (CVE-2019-7317, use-after-free) and libwebp (CVE-2023-4863, heap overflow).
The use-after-free occurs because the `png_image_free_function` is incorrectly called under `png_safe_execute`. This can cause the program to reuse memory that has already been freed, leading to a crash.
We're using libFuzzer as our fuzzer of choice because it runs in-process (in-memory) and is fast. To build a good harness, we first need to understand how to call the target APIs correctly: what a valid call looks like, which initialization is required, and how to clean up. So we start by reading the libpng documentation and selecting the right functions to drive from our harness (see the libpng manual: https://www.libpng.org/pub/png/libpng-manual.txt).
We include the libpng header to access the functions the harness needs.
To use libFuzzer, our harness must implement its entry point.
This is the standard libFuzzer hook. The in-process fuzzer calls this function many times with mutated inputs.
We guard against oversized inputs by checking the mutated buffer's length.
Zero-init is required; version must be set per the simplified API contract.
Caller-allocated buffer keeps control over memory use (vs. letting the library allocate huge blocks).
To prevent per-iteration leaks, ensure cleanup runs on every path.
You can copy the PNG corpus files from the libpng directory and run the fuzzer.
Within a minute, the fuzzer should crash and ASan will produce a report.
You'll find the crash file in the same directory as your harness. Use it for deeper analysis, such as building an exploit or PoC. You can reproduce the crash by running the fuzzer binary directly on that file.
The bug lives in the lossless WebP (VP8L) decoder, specifically in how it builds Huffman coding tables. A maliciously crafted WebP can cause BuildHuffmanTable to construct tables larger than the allocated buffer, triggering a heap buffer overflow. (See the WebP API docs: http://developers.google.com/speed/webp/docs/api)
To build the fuzzing harness, we consult the libwebp documentation to choose the right decode APIs and call sequence: what a valid invocation looks like, the required initialization (WebPDecoderConfig, colorspace), incremental vs. one-shot decode, and proper cleanup (WebPFreeDecBuffer, WebPIDelete). This ensures the harness drives the vulnerable code paths deterministically while staying stable and leak-free.
webp/decode.h exposes the decoder (incremental + one-shot) and config structs.
The function libFuzzer repeatedly calls with mutated inputs (in-process).
It initializes the decoder configuration with safe defaults, forces a specific colorspace so the decoder exercises real pixel pipelines, and disables threading to avoid nondeterministic races or masking crashes.
The code first computes a halfway point in the input (mid = size / 2, clamped to at least 4 bytes) so the initial chunk isn't small. It then creates an incremental decoder with WebPIDecode(NULL, 0, &config) to consume the data in chunks. If the decoder allocation fails, it frees the output buffer stored in config and returns immediately.
The harness feeds the first half of the input to the incremental decoder with WebPIUpdate(idec, data, mid), which typically parses the RIFF/WEBP header and discovers basic features. If this call returns VP8_STATUS_OUT_OF_MEMORY or VP8_STATUS_USER_ABORT, it cleans up by deleting the decoder and freeing the output buffer in config, then returns immediately to abort the current iteration.
It then feeds the second half of the input with WebPIUpdate(idec, data + mid, size - mid) to continue the bitstream, which drives the decoder into deeper stages such as the lossless VP8L or lossy VP8 paths, including entropy/Huffman table construction, alpha processing, and other decode logic.
Cleanup prevents leaks between iterations (critical for in-process fuzzing).
The harness also exercises the one-shot decode path by calling WebPDecode(data, size, &config), which traverses slightly different glue and validation code than the incremental flow. After this call, it explicitly frees the output buffer stored in config (which libwebp allocated) via WebPFreeDecBuffer(&config.output), and then returns.
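The two-step feed can be sketched independently of libwebp. In this sketch, `chunk_sink` is a hypothetical callback standing in for the incremental decoder's update call, `count_bytes` is a toy sink used only for illustration, and the extra clamp to `size` is an assumption added so that inputs shorter than 4 bytes never read out of bounds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an incremental decoder update call.
   Returns non-zero to abort the current iteration. */
typedef int (*chunk_sink)(void *ctx, const uint8_t *chunk, size_t len);

/* Split point: half the input, at least 4 bytes, never past the end. */
size_t split_point(size_t size) {
    size_t mid = size / 2;
    if (mid < 4) mid = 4;
    if (mid > size) mid = size;
    return mid;
}

/* Feed the input as two consecutive chunks; stop early if the sink aborts. */
int feed_in_two_chunks(const uint8_t *data, size_t size,
                       chunk_sink sink, void *ctx) {
    size_t mid = split_point(size);
    int rc = sink(ctx, data, mid);
    if (rc != 0) return rc;
    return sink(ctx, data + mid, size - mid);
}

/* Toy sink: adds up the bytes it receives (ctx points to a size_t). */
int count_bytes(void *ctx, const uint8_t *chunk, size_t len) {
    (void)chunk;
    *(size_t *)ctx += len;
    return 0;
}
```

Splitting the input this way exercises the decoder's "not enough data yet" handling as well as its resume logic, which a single one-shot call would skip.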
After 8 to 10 hours of running the fuzzer, we got the crash.
While fuzzing, we hit frequent OOMs and slow-path timeouts. Our initial generic WebP harness did reach the Huffman code path, but crashes were slow to surface. So we need to refine it into a VP8L-focused libFuzzer harness that adds strict checks on mutated inputs and tight resource caps. This version is designed for stability while reliably driving the lossless WebP (VP8L) Huffman path, the area implicated in CVE-2023-4863.
You can run the fuzzer with flags like -ignore_ooms=1 (and even -ignore_timeouts=1) to keep the campaign moving past OOMs and hangs, but it's not a silver bullet. Some mutated images can drive resident memory (RSS) so high that the process still runs out of memory despite the ignore setting. In practice, you'll want to combine those flags with tighter resource limits (e.g., -rss_limit_mb, -max_len, and dimension/pixel caps in the harness) to prevent pathological inputs from blowing up memory in the first place.
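The kind of input pre-check and resource cap described here could look like the sketch below. It assumes the simple (non-extended) WebP container layout, where the `VP8L` chunk directly follows the 12-byte RIFF header. The function name `vp8l_input_ok` and all three limits are hypothetical values picked for illustration, not the ones used in our harness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_INPUT_LEN (1u << 20)  /* hypothetical 1 MiB input cap */
#define MAX_DIMENSION 4096u       /* hypothetical per-side cap */
#define MAX_PIXELS    (1u << 22)  /* hypothetical total pixel cap */

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the input looks like a simple-layout lossless WebP whose
   declared dimensions stay within the caps, 0 otherwise. */
int vp8l_input_ok(const uint8_t *data, size_t size) {
    /* 12-byte RIFF header + 8-byte chunk header + 5-byte VP8L header */
    if (size < 25 || size > MAX_INPUT_LEN) return 0;
    if (memcmp(data, "RIFF", 4) != 0) return 0;
    if (memcmp(data + 8, "WEBP", 4) != 0) return 0;
    if (memcmp(data + 12, "VP8L", 4) != 0) return 0;
    if (data[20] != 0x2F) return 0;            /* VP8L signature byte */

    /* 14 bits of (width - 1), then 14 bits of (height - 1), LSB first */
    uint32_t bits = read_le32(data + 21);
    uint32_t width = (bits & 0x3FFF) + 1;
    uint32_t height = ((bits >> 14) & 0x3FFF) + 1;
    if (width > MAX_DIMENSION || height > MAX_DIMENSION) return 0;
    if (width * height > MAX_PIXELS) return 0; /* cannot overflow after the caps */
    return 1;
}
```

Calling a check like this at the top of `LLVMFuzzerTestOneInput` and returning 0 on failure keeps the fuzzer's time on inputs that reach the lossless decoder, at the cost of never exercising the lossy or extended-format paths.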

With the refined harness, execution was smoother and consistently reached the lossless (VP8L) code path without OOMs or timeouts. We were also able to reproduce the crash within 2-3 hours.

In this blog post, we started with a simple goal: explain what fuzzing is and its main types, then apply source-based, coverage-guided fuzzing to Android-relevant libraries and turn it into a repeatable workflow that surfaces real bugs. Using libFuzzer with ASan/UBSan, we built two harnesses that balance depth and stability: a straightforward decoder harness for libpng (reproducing the CVE-2019-7317 use-after-free on 1.6.36) and a refined, VP8L-focused harness for libwebp (triggering the CVE-2023-4863 Huffman overflow on 1.3.1). Along the way, we handled the practical realities of image-codec fuzzing (OOMs, slow units, and timeouts) by adding size/dimension caps, caller-allocated buffers, and input pre-parsing (RIFF/VP8L checks).
The result is more than two PoCs: it's a portable methodology you can apply to other native libraries used in Android apps. Pick a vulnerable or high-risk parser, study its public API, design a deterministic, leak-free harness that exercises deep code paths, and add sensible validations so the fuzzer explores productively instead of drowning in pathological inputs. We also showed how to reproduce deterministically and minimize crashing inputs. With these patterns in hand, you can iterate across additional file formats and media stacks common on Android, confident your fuzzing will be fast, stable, and capable of rediscovering (and preventing) real-world vulnerabilities.
Now that we've covered fuzzing with source available, do you want to learn advanced binary-only fuzzing? That is where you don't have the source: you pull a library from a mobile device, reverse it to find target functions, fuzz it by emulating with QEMU or on-device with Frida mode, and then move on to exploitation.
Check out our courses that cover full-chain exploitation:
Android Userland Fuzzing and Exploitation
https://www.mobilehackinglab.com/afe-promo
Android Kernel Fuzzing and Exploitation
https://www.mobilehackinglab.com/courses/kernel-fuzzing.html
Get both together with a special bundle sale with 60% discount!
https://www.mobilehackinglab.com/bundles?bundle_id=android-kernel-userland
Want to learn to chain AppSec bugs to get remote code execution?
Advanced Android Hacking - Road to Pwn2Own
https://www.mobilehackinglab.com/course/advanced-android-hacking
Copyright © 2024