Expand description
Corpus management for parallel fuzzing with coverage-guided mutation.
This module implements a corpus-based fuzzing system that stores, mutates, and shares transaction sequences across multiple fuzzing workers. Each corpus entry represents a sequence of transactions that has produced interesting coverage, and can be mutated to discover new execution paths.
§File System Structure
The corpus is organized on disk as follows:
<corpus_dir>/
├── worker0/ # Master (worker 0) directory
│ ├── corpus/ # Master's corpus entries
│ │ ├── <uuid>-<timestamp>.json # Corpus entry (if small)
│ │ ├── <uuid>-<timestamp>.json.gz # Corpus entry (if large, compressed)
│ └── sync/ # Directory where other workers export new findings
│ └── <uuid>-<timestamp>.json # New entries from other workers
└── workerN/ # Worker N's directory
├── corpus/ # Worker N's local corpus
│ └── ...
└── sync/ # Worker 2's sync directory
└── ...§Workflow
- Each worker maintains its own local corpus with entries stored as JSON files
- Workers export new interesting entries to the master’s sync directory via hard links
- The master (worker0) imports new entries from its sync directory and exports them to all the other workers
- Workers sync with the master to receive new corpus entries from other workers
- This all happens periodically, there is no clear order in which workers export or import entries since it doesn’t matter as long as the corpus eventually syncs across all workers
Structs§
- Campaign
Corpus 🔒Entry - Corpus entry selected by a worker and returned for logical-campaign persistence.
- Corpus
Entry 🔒 - Holds Corpus information.
- Corpus
Metrics 🔒 - Dynamic
Target Ctx - Refs used during corpus replay to register contracts deployed mid-sequence as fuzz targets,
mirroring the campaign loop so follow-up calls into them aren’t dropped by
can_replay_tx. - Global
Corpus 🔒Metrics - Optimization
State 🔒 - Persisted optimization state: the best value found and the sequence that produced it.
- Replay
Coverage 🔒 - Replay
Outcome 🔒 - Replay
Target 🔒 - Worker
Corpus - Per-worker corpus manager.
- Worker
Corpus 🔒Seed - Campaign-level corpus state produced by replaying persisted corpus entries once.
Enums§
- Mutation
Type 🔒 - Possible mutation strategies to apply on a call sequence.
Constants§
- CORPUS_
DIR 🔒 - FAVORABILITY_
THRESHOLD 🔒 - GZIP_
THRESHOLD 🔒 - Threshold for compressing corpus entries. 4KiB is usually the minimum file size on popular file systems.
- OPTIMIZATION_
BEST_ 🔒FILE - SYNC_
DIR 🔒 - WORKER 🔒
Functions§
- has_
legacy_ 🔒invariant_ corpus_ dirs - load_
optimization_ 🔒state - persist_
campaign_ 🔒entry - persist_
optimization_ 🔒output - prepare_
campaign_ 🔒output_ dir - register_
replay_ 🔒created - Registers contracts created by the last tx so subsequent txs in the same replayed sequence can target them.
- replay_
corpus_ 🔒sequence - replay_
corpus_ 🔒sequence_ with_ executor - rollback_
replay_ 🔒created - Clears dynamic targets added during a replayed entry so they don’t leak into the next one.
- unique_
corpus_ 🔒entries