Corpus management for parallel fuzzing with coverage-guided mutation.
This module implements a corpus-based fuzzing system that stores, mutates, and shares transaction sequences across multiple fuzzing workers. Each corpus entry represents a sequence of transactions that has produced interesting coverage, and can be mutated to discover new execution paths.
§File System Structure
The corpus is organized on disk as follows:
<corpus_dir>/
├── worker0/                             # Master (worker 0) directory
│   ├── corpus/                          # Master's corpus entries
│   │   ├── <uuid>-<timestamp>.json      # Corpus entry (if small)
│   │   └── <uuid>-<timestamp>.json.gz   # Corpus entry (if large, compressed)
│   └── sync/                            # Directory where other workers export new findings
│       └── <uuid>-<timestamp>.json      # New entries from other workers
└── workerN/                             # Worker N's directory
    ├── corpus/                          # Worker N's local corpus
    │   └── ...
    └── sync/                            # Worker N's sync directory
        └── ...
§Workflow
- Each worker maintains its own local corpus with entries stored as JSON files
- Workers export new interesting entries to the master’s sync directory via hard links
- The master (worker0) imports new entries from its sync directory and exports them to all the other workers
- Workers sync with the master to receive new corpus entries from other workers
- Syncing happens periodically. Workers export and import entries in no particular order, which is fine as long as the corpus eventually syncs across all workers
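The export step of this workflow can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: `corpus_dir`, `sync_dir`, and `export_to_master` are hypothetical helper names, and the only things taken from the description above are the `worker<N>/corpus` and `worker<N>/sync` layout and the use of hard links.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: `<corpus_dir>/worker<id>/corpus`.
fn corpus_dir(root: &Path, worker_id: usize) -> PathBuf {
    root.join(format!("worker{worker_id}")).join("corpus")
}

/// Hypothetical helper: `<corpus_dir>/worker<id>/sync`.
fn sync_dir(root: &Path, worker_id: usize) -> PathBuf {
    root.join(format!("worker{worker_id}")).join("sync")
}

/// Exports one entry from a worker's local corpus into the master's
/// (worker 0) sync directory by creating a hard link, so no data is copied.
fn export_to_master(root: &Path, worker_id: usize, file_name: &str) -> io::Result<PathBuf> {
    let src = corpus_dir(root, worker_id).join(file_name);
    let dst_dir = sync_dir(root, 0);
    fs::create_dir_all(&dst_dir)?;
    let dst = dst_dir.join(file_name);

    match fs::hard_link(&src, &dst) {
        Ok(()) => Ok(dst),
        // Assumption: an entry that was already exported counts as success.
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(dst),
        Err(e) => Err(e),
    }
}
```

Hard links require the source and destination to live on the same file system, which holds here because every worker directory sits under the same `<corpus_dir>`.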
Structs§
- CorpusDirEntry 🔒
- CorpusEntry 🔒 - Holds Corpus information.
- CorpusMetrics 🔒
- GlobalCorpusMetrics 🔒
- WorkerCorpus - Per-worker corpus manager.
Enums§
- MutationType 🔒 - Possible mutation strategies to apply on a call sequence.
Constants§
- CORPUS_DIR 🔒
- COVERAGE_MAP_SIZE 🔒
- FAVORABILITY_THRESHOLD 🔒
- GZIP_THRESHOLD 🔒 - Threshold for compressing corpus entries. 4 KiB is usually the minimum file size on popular file systems.
- SYNC_DIR 🔒
- WORKER 🔒
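As a rough illustration of how a compression threshold might select between the `.json` and `.json.gz` forms shown in the file system structure, consider the sketch below. The value `4 * 1024` and the strict `>` comparison are assumptions made for the example (the real constant is private to the module), and `entry_file_name` is a hypothetical helper. No compression is performed here; only the naming decision is shown.

```rust
/// Assumed value for illustration only; the module's actual constant may differ.
const GZIP_THRESHOLD: usize = 4 * 1024;

/// Hypothetical helper: picks the on-disk file name for a corpus entry
/// based on the size of its serialized JSON.
fn entry_file_name(uuid: &str, timestamp: u64, serialized_len: usize) -> String {
    // Assumption: entries strictly larger than the threshold are compressed.
    let ext = if serialized_len > GZIP_THRESHOLD {
        "json.gz"
    } else {
        "json"
    };
    format!("{uuid}-{timestamp}.{ext}")
}
```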
Functions§
- parse_corpus_filename 🔒 - Parses the corpus filename and returns the uuid and timestamp associated with it.
- read_corpus_dir 🔒
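Parsing a `<uuid>-<timestamp>.json[.gz]` file name could look roughly like the following. This is a sketch under stated assumptions, not the signature of the private `parse_corpus_filename`: the function name `parse_corpus_filename_sketch` is hypothetical, the UUID is returned as a plain `String` rather than a dedicated UUID type, and the timestamp is assumed to be an unsigned decimal integer.

```rust
/// Hypothetical sketch: splits `<uuid>-<timestamp>.json` or
/// `<uuid>-<timestamp>.json.gz` into its UUID and timestamp parts.
/// Returns `None` if the name does not match that shape.
fn parse_corpus_filename_sketch(name: &str) -> Option<(String, u64)> {
    // Strip the longer extension first so ".json.gz" is handled as a unit.
    let stem = name
        .strip_suffix(".json.gz")
        .or_else(|| name.strip_suffix(".json"))?;

    // UUIDs contain hyphens themselves, so split on the LAST hyphen.
    let (uuid, timestamp) = stem.rsplit_once('-')?;
    if uuid.is_empty() {
        return None;
    }

    // Assumption: the timestamp is a plain unsigned decimal number.
    let timestamp = timestamp.parse::<u64>().ok()?;
    Some((uuid.to_string(), timestamp))
}
```

Splitting from the right matters: a left-to-right split on `-` would cut the UUID at its first internal hyphen.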