AFL++ Integration
The C++ backend of Grammarinator provides seamless integration with
AFL++ via its custom mutator interface. This allows Grammarinator
to be used not only as a blackbox test case generator, but also as an
in-process input synthesizer, where its internal derivation trees are
evolved and mutated during fuzzing runs. The mutator operates on serialized
.grt* trees and performs grammar-aware transformations based on the compiled
ANTLR grammar.
Overview
The integration builds on the custom mutator hooks of the AFL++ custom mutator API. AFL++ loads a shared library implementing these hooks and delegates mutation and related workflow operations to Grammarinator.
This enables grammar-aware, structure-preserving mutation and recombination of test cases at runtime – improving coverage and syntactic correctness compared to purely byte-level fuzzing.
Building the AFL++-Compatible Mutator
To enable this integration in a real AFL++ fuzzing setup, a specialized
shared library must be generated from the C++ generator class produced by
grammarinator-process. The build script compiles this library
when the --grafl flag (short for grammarinator-afl) is given.
Example using the HTML grammar:
python3 grammarinator-cxx/dev/build.py --clean \
--generator HTMLGenerator \
--includedir <dir-to-HTMLGenerator> \
--afl-includedir <AFLplusplus-root>/include \
--serializer SimpleSpaceSerializer \
--grafl
This command produces a shared library:
grammarinator-cxx/build/lib/libgrafl-html.so
AFL++ will load this .so as the custom mutator library through the
AFL_CUSTOM_MUTATOR_LIBRARY environment variable.
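Before pointing AFL++ at the library, it can be useful to confirm that it exports custom mutator hooks at all. The following is a minimal sketch, assuming GNU nm is available and that the hooks follow the afl_custom_ naming used by the AFL++ custom mutator API; the helper name list_afl_hooks is hypothetical, and the exact set of hooks exported by the library is not stated here.

```shell
# list_afl_hooks LIB (hypothetical helper)
# Prints the afl_custom_* symbols defined in the dynamic symbol table of LIB.
# Returns non-zero if LIB does not exist or exports no such symbol.
list_afl_hooks() {
    lib=$1
    if [ ! -f "$lib" ]; then
        echo "no such library: $lib" >&2
        return 1
    fi
    # -D reads the dynamic symbol table; --defined-only skips imported symbols.
    nm -D --defined-only "$lib" | grep 'afl_custom_'
}

# Example call (path taken from the build output above):
#   list_afl_hooks grammarinator-cxx/build/lib/libgrafl-html.so
```

An empty result would suggest that the library was built without the --grafl flag or that a different file was picked up.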
Test inputs are expected to be encoded as .grt* trees
(e.g., FlatBuffer-encoded). During fuzzing, mutations will occur in a
grammar-aware manner, resulting in:
higher syntactic validity of inputs,
better exploration of the structured input space,
and potentially deeper semantic bugs found in the target.
Note that only .grt*-style inputs (e.g., .grtf for FlatBuffer-encoded
trees) are supported by the AFL++ integration.
Fuzzing Configuration
Unlike the grammarinator-generate utility, the
AFL++ custom mutator integration cannot be configured through command-line
arguments. Instead, the behavior of the mutator can be controlled via
environment variables prefixed with GRAFL_.
The following options are currently supported:
GRAFL_MAX_DEPTH: Equivalent to --max-depth (integer)
GRAFL_MAX_TOKENS: Equivalent to --max-tokens (integer)
GRAFL_MEMO_SIZE: Equivalent to --memo-size (integer)
GRAFL_RANDOM_MUTATORS: Enables random mutators; inverse of --disable-random-mutators (boolean; accepts 1, true, or yes case-insensitively)
GRAFL_WEIGHTS: Equivalent to --weights (path to a JSON file)
GRAFL_MAX_TRIM_STEPS: Maximum number of mutation steps performed during trimming of a single test input (integer)
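As a minimal sketch, the options could be exported in the shell that later starts afl-fuzz, so that the mutator library inherits them. All values below, including the weights.json path, are illustrative assumptions and not recommended defaults.

```shell
# Illustrative values only; tune them for the grammar and target at hand.
export GRAFL_MAX_DEPTH=20                  # counterpart of --max-depth
export GRAFL_MAX_TOKENS=500                # counterpart of --max-tokens
export GRAFL_MEMO_SIZE=10                  # counterpart of --memo-size
export GRAFL_RANDOM_MUTATORS=yes           # 1, true, or yes (case-insensitive)
export GRAFL_WEIGHTS="$PWD/weights.json"   # hypothetical path to a JSON weights file
export GRAFL_MAX_TRIM_STEPS=100            # cap on mutation steps while trimming one input

# List the GRAFL_ settings that child processes such as afl-fuzz will inherit.
env | grep '^GRAFL_' | sort
```

Because the settings are plain environment variables, they can also be placed on the same command line as AFL_CUSTOM_MUTATOR_LIBRARY instead of being exported.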
Verifying the Setup
To run a fuzzing session with AFL++ equipped with Grammarinator, a compiler
wrapper (e.g., afl-clang-fast) and the afl-fuzz utility must first be
obtained. Both can be installed or built by following the instructions in the
official AFL++ documentation.
Once the target application is compiled with the AFL++ compiler wrapper, the
required instrumentation is automatically injected into the binary. This
instrumentation is later used by afl-fuzz to guide the fuzzing process.
Next, select or create a grammar that describes the expected input format (e.g.,
the HTML grammar), then build the required binaries with the
--grafl flag, and optionally also with the --generate and --decode flags.
The next step is to prepare an initial tree corpus that serves as the starting point for the fuzzing session. One option is to generate this corpus from scratch using the grammarinator-generate utility. For example:
grammarinator-generate-html \
-n 100 \
-o html-src/%d.html \
--population html-trees/ \
--keep-trees
Alternatively, an initial tree corpus can be created by converting existing source files (e.g., HTML documents) into tree format using the grammarinator-parse utility. For example:
grammarinator-parse html-src \
-o html-trees \
-g HTMLLexer.g4 HTMLParser.g4 \
--tree-format flatbuffers
To test the integration, run AFL++ in custom-mutator-only mode and point it to the generated shared library:
AFL_CUSTOM_MUTATOR_ONLY=1 \
AFL_CUSTOM_MUTATOR_LIBRARY=grammarinator-cxx/build/lib/libgrafl-html.so \
afl-fuzz -i html-trees -o outdir -- ./target_app @@
Setting AFL_CUSTOM_MUTATOR_ONLY=1 is mandatory. Without this flag,
AFL++ would apply its built-in byte-level mutators to the test cases, which
would corrupt the encoded tree representation used by Grammarinator.
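The prerequisites described above (a built mutator library, a non-empty tree corpus, and custom-mutator-only mode) can be checked before a long fuzzing run is started. The following is a minimal sketch of such a pre-flight check; the function name check_grafl_setup is hypothetical and not part of Grammarinator or AFL++.

```shell
# check_grafl_setup LIB CORPUS_DIR (hypothetical helper)
# Returns 0 when LIB is an existing file, CORPUS_DIR contains at least one
# entry, and AFL_CUSTOM_MUTATOR_ONLY is set to a non-empty value.
check_grafl_setup() {
    lib=$1
    corpus=$2
    if [ ! -f "$lib" ]; then
        echo "mutator library not found: $lib" >&2
        return 1
    fi
    if [ ! -d "$corpus" ] || [ -z "$(ls -A "$corpus" 2>/dev/null)" ]; then
        echo "tree corpus is missing or empty: $corpus" >&2
        return 1
    fi
    if [ -z "${AFL_CUSTOM_MUTATOR_ONLY:-}" ]; then
        echo "AFL_CUSTOM_MUTATOR_ONLY is not set" >&2
        return 1
    fi
    return 0
}
```

With AFL_CUSTOM_MUTATOR_ONLY and AFL_CUSTOM_MUTATOR_LIBRARY exported beforehand, the check could be chained in front of the fuzzer, for example as check_grafl_setup "$AFL_CUSTOM_MUTATOR_LIBRARY" html-trees && afl-fuzz -i html-trees -o outdir -- ./target_app @@. The check does not verify that the corpus files are valid encoded trees.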
Note 1: When using AFL++ with the Grammarinator integration, both the input and output corpora must be in tree format. Therefore, any existing input corpus must first be converted into trees using the grammarinator-parse utility. After the fuzzing session, the resulting tree corpus can be converted back into source-level test cases using the grammarinator-decode utility.
Note 2: The items of a tree corpus can be minimized using the afl-tmin
tool in a grammar-aware manner by providing the appropriate custom
mutator-related environment variables. For example:
AFL_CUSTOM_MUTATOR_ONLY=1 \
AFL_CUSTOM_MUTATOR_LIBRARY=grammarinator-cxx/build/lib/libgrafl-html.so \
afl-tmin -i html-trees -o html-trimmed -e -- ./target_app @@