LibFuzzer Integration
The C++ backend of Grammarinator provides seamless integration with
libFuzzer via a custom mutator interface. This allows Grammarinator
to be used not only as a blackbox test case generator, but also as an
in-process input synthesizer, where its internal derivation trees are
evolved and mutated during fuzzing runs. The mutator operates on serialized
.grt* trees and performs grammar-aware transformations based on the compiled
ANTLR grammar.
Overview
The integration supports the LLVMFuzzerCustomMutator and
LLVMFuzzerCustomCrossOver libFuzzer hooks. These override the default
libFuzzer mutation logic with tree-based logic derived from the grammar.
This enables grammar-aware, structure-preserving mutation and recombination of test cases at runtime, which can improve coverage and syntactic correctness compared to purely byte-level fuzzing.
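To make the two hooks concrete, the following is a minimal sketch of their shape as libFuzzer defines them. The function bodies are toy byte-level stand-ins written for illustration only; they are not Grammarinator's tree-based logic, which is supplied by the library itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy stand-in for a custom mutator (NOT Grammarinator's implementation).
// Contract: mutate Data in place and return the new size (<= MaxSize).
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
  if (Size == 0 || MaxSize == 0) return 0;
  size_t n = std::min(Size, MaxSize);
  // Rotate the buffer left by a seed-dependent offset.
  std::rotate(Data, Data + (Seed % n), Data + n);
  return n;
}

// Toy stand-in for a custom crossover (NOT Grammarinator's implementation).
// Contract: combine the two inputs into Out and return the number of
// bytes written (<= MaxOutSize).
extern "C" size_t LLVMFuzzerCustomCrossOver(const uint8_t *Data1, size_t Size1,
                                            const uint8_t *Data2, size_t Size2,
                                            uint8_t *Out, size_t MaxOutSize,
                                            unsigned int Seed) {
  // Take a seed-dependent prefix of the first input ...
  size_t cut = std::min<size_t>(Seed % (Size1 + 1), MaxOutSize);
  // ... followed by as much of the tail of the second input as fits.
  size_t rest = std::min(Size2, MaxOutSize - cut);
  if (cut) std::memcpy(Out, Data1, cut);
  if (rest) std::memcpy(Out + cut, Data2 + (Size2 - rest), rest);
  return cut + rest;
}
```

In the real integration, the same entry points would operate on serialized trees rather than raw bytes, so every result remains derivable from the grammar.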
Building the libFuzzer-Compatible Mutator
To enable this integration in a real libFuzzer-based fuzzing binary, a
specialized static library must be generated. This is done using the --grlf
flag (short for grammarinator-libfuzzer) in the build.py utility script.
Example using the HTML grammar:
python3 grammarinator-cxx/dev/build.py --clean \
--generator HTMLGenerator \
--includedir examples/fuzzer/ \
--grlf
This command produces a static library:
grammarinator-cxx/build/Release/lib/libgrlf-html.a
This .a file can be linked into a standard libFuzzer fuzz target:
clang++ <fuzz_target.cpp> -fsanitize=fuzzer grammarinator-cxx/build/Release/lib/libgrlf-html.a
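For readers unfamiliar with libFuzzer, a fuzz target file such as the <fuzz_target.cpp> placeholder above could look roughly like the sketch below. MaxTagDepth is a hypothetical stand-in for the code under test, and this section does not describe how the tree-encoded input is presented to the target, so the sketch simply treats the received bytes as opaque text.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical code under test: reports the deepest tag nesting level.
static int MaxTagDepth(const std::string &text) {
  int depth = 0, max_depth = 0;
  for (size_t i = 0; i + 1 < text.size(); ++i) {
    if (text[i] != '<') continue;
    if (text[i + 1] == '/') {
      if (depth > 0) --depth;                    // closing tag
    } else {
      max_depth = std::max(max_depth, ++depth);  // opening tag
    }
  }
  return max_depth;
}

// Standard libFuzzer entry point: called once per test input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  std::string input(reinterpret_cast<const char *>(Data), Size);
  MaxTagDepth(input);  // exercise the code under test
  return 0;            // 0 tells libFuzzer the input was processed normally
}
```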
After linking, Grammarinator will handle all mutation logic via its custom
mutator. Test inputs are expected to be serialized .grt* trees
(e.g., FlatBuffer-encoded). During fuzzing, mutations will occur in a
grammar-aware manner, resulting in:
higher syntactic validity of inputs,
better exploration of the structured input space,
and potentially deeper semantic bugs found in the target.
Note that only .grt*-style inputs (e.g., .grtf for FlatBuffer-encoded
trees) are supported by the libFuzzer integration.
Fuzzing Configuration
The libFuzzer mutator integration can be configured through command-line
options, similarly to grammarinator-generate.
These arguments must be passed after the -ignore_remaining_args=1 flag,
so that libFuzzer forwards them to Grammarinator.
The following options are supported:
-max_depth: Equivalent to --max-depth (integer)
-max_tokens: Equivalent to --max-tokens (integer)
-memo_size: Equivalent to --memo-size (integer)
-random_mutators: Enable random mutators; equivalent to the inverse of --disable-random-mutators (0 or 1)
-weights: Equivalent to --weights (path to a JSON file)
-allowlist: Equivalent to --allowlist (comma-separated list of enabled creators)
-blocklist: Equivalent to --blocklist (comma-separated list of disabled creators)
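The role of -ignore_remaining_args=1 can be illustrated with a small sketch: libFuzzer ignores everything after that flag, which leaves those arguments free for other code in the binary to interpret, for example from the LLVMFuzzerInitialize hook. ParseForwardedArgs below is a hypothetical helper, not Grammarinator's parser, and the -name=value syntax is an assumption based on libFuzzer's own flag style.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: collects -name=value options that appear after
// -ignore_remaining_args=1 on the command line.
// Example command line (binary and corpus names are placeholders):
//   ./fuzz_target corpus/ -ignore_remaining_args=1 -max_depth=20 -random_mutators=0
std::map<std::string, std::string> ParseForwardedArgs(int argc,
                                                      const char *const *argv) {
  std::map<std::string, std::string> opts;
  bool forwarded = false;
  for (int i = 1; i < argc; ++i) {
    std::string arg = argv[i];
    if (!forwarded) {
      // Everything up to and including the marker belongs to libFuzzer.
      forwarded = (arg == "-ignore_remaining_args=1");
      continue;
    }
    size_t eq = arg.find('=');
    if (arg.size() < 2 || arg[0] != '-' || eq == std::string::npos) continue;
    // Store "max_depth" -> "20" for an argument like "-max_depth=20".
    opts[arg.substr(1, eq - 1)] = arg.substr(eq + 1);
  }
  return opts;
}
```

Options placed before the marker are left to libFuzzer, which is why the ordering on the command line matters.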
Verifying the Setup
If you want to test the integration without a real target, you can build a
dummy binary by adding --fuzznull:
CXX=clang++ python3 grammarinator-cxx/dev/build.py --clean \
--generator HTMLGenerator \
--includedir examples/fuzzer/ \
--fuzznull
This will create a fuzznull-html binary under
grammarinator-cxx/build/Release/bin/, which can be invoked directly to
verify the setup and test input processing.
Note 1: clang++ must be used in this case, since other compilers don’t support libFuzzer.
Note 2: When using libFuzzer with the Grammarinator integration, both the input and output corpora must be in tree format. Any existing input corpus must therefore first be converted into trees using the grammarinator-parse utility. After the fuzzing session, the resulting tree corpus can be converted back into source-level test cases using the grammarinator-decode utility.