
Module corpus


Corpus management for parallel fuzzing with coverage-guided mutation.

This module implements a corpus-based fuzzing system that stores, mutates, and shares transaction sequences across multiple fuzzing workers. Each corpus entry represents a sequence of transactions that has produced interesting coverage, and can be mutated to discover new execution paths.

§File System Structure

The corpus is organized on disk as follows:

<corpus_dir>/
├── worker0/                  # Master (worker 0) directory
│   ├── corpus/               # Master's corpus entries
│   │   ├── <uuid>-<timestamp>.json          # Corpus entry (if small)
│   │   └── <uuid>-<timestamp>.json.gz       # Corpus entry (if large, compressed)
│   └── sync/                 # Directory where other workers export new findings
│       └── <uuid>-<timestamp>.json          # New entries from other workers
└── workerN/                  # Worker N's directory
    ├── corpus/               # Worker N's local corpus
    │   └── ...
    └── sync/                 # Worker N's sync directory
        └── ...
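
The layout above can be sketched as a few path helpers. This is a minimal illustration, not the module's implementation: the constant values are inferred from the directory tree (the real `WORKER`, `CORPUS_DIR`, `SYNC_DIR` and `GZIP_THRESHOLD` are private and their values are not shown here), the helper names are hypothetical, and the assumption that entries strictly larger than the threshold get the `.json.gz` extension is a guess at the comparison used.

```rust
use std::path::{Path, PathBuf};

// Assumed values, inferred from the directory tree above.
const WORKER: &str = "worker";
const CORPUS_DIR: &str = "corpus";
const SYNC_DIR: &str = "sync";
// 4 KiB, following the GZIP_THRESHOLD description (value assumed).
const GZIP_THRESHOLD: usize = 4 * 1024;

/// Hypothetical helper: `<corpus_dir>/workerN/corpus`.
fn worker_corpus_dir(root: &Path, worker_id: usize) -> PathBuf {
    root.join(format!("{}{}", WORKER, worker_id)).join(CORPUS_DIR)
}

/// Hypothetical helper: `<corpus_dir>/workerN/sync`.
fn worker_sync_dir(root: &Path, worker_id: usize) -> PathBuf {
    root.join(format!("{}{}", WORKER, worker_id)).join(SYNC_DIR)
}

/// Hypothetical helper: `<uuid>-<timestamp>.json` for small entries,
/// `<uuid>-<timestamp>.json.gz` for entries above the threshold.
/// Only the file name is chosen here; no compression is performed.
fn entry_file_name(uuid: &str, timestamp: u64, serialized_len: usize) -> String {
    let ext = if serialized_len > GZIP_THRESHOLD { "json.gz" } else { "json" };
    format!("{}-{}.{}", uuid, timestamp, ext)
}
```

Under these assumptions, the master's directories are simply `worker_corpus_dir(root, 0)` and `worker_sync_dir(root, 0)`.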

§Workflow

  • Each worker maintains its own local corpus with entries stored as JSON files
  • Workers export new interesting entries to the master’s sync directory via hard links
  • The master (worker0) imports new entries from its sync directory and exports them to all the other workers
  • Workers sync with the master to receive new corpus entries from other workers
  • All of this happens periodically and in no fixed order. The order in which workers export or import entries does not matter, as long as the corpus eventually converges across all workers
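
The export and import steps above can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the module's code: `export_to_master_sync` and `pending_imports` are hypothetical names, and treating an already existing link as success is a design choice made here so that repeated periodic syncs stay idempotent. Hard links require both directories to live on the same file system.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: hard-links `entry` into `master_sync`,
/// keeping the same file name, and returns the link's path.
fn export_to_master_sync(entry: &Path, master_sync: &Path) -> io::Result<PathBuf> {
    let name = entry.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "entry path has no file name")
    })?;
    fs::create_dir_all(master_sync)?;
    let target = master_sync.join(name);
    match fs::hard_link(entry, &target) {
        Ok(()) => Ok(target),
        // Exported during an earlier sync round: nothing left to do.
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(target),
        Err(e) => Err(e),
    }
}

/// Hypothetical helper: lists the files waiting in a sync directory,
/// sorted by path so that imports are processed deterministically.
fn pending_imports(sync_dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut pending = Vec::new();
    for item in fs::read_dir(sync_dir)? {
        let path = item?.path();
        if path.is_file() {
            pending.push(path);
        }
    }
    pending.sort();
    Ok(pending)
}
```

A hard link shares the underlying file data with the worker's local entry, so exporting costs no extra disk space and the exported entry survives even if the worker later removes its own copy.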

Structs§

CampaignCorpusEntry 🔒
Corpus entry selected by a worker and returned for logical-campaign persistence.
CorpusEntry 🔒
Holds Corpus information.
CorpusMetrics 🔒
DynamicTargetCtx
Refs used during corpus replay to register contracts deployed mid-sequence as fuzz targets, mirroring the campaign loop so follow-up calls into them aren’t dropped by can_replay_tx.
GlobalCorpusMetrics 🔒
OptimizationState 🔒
Persisted optimization state: the best value found and the sequence that produced it.
ReplayCoverage 🔒
ReplayOutcome 🔒
ReplayTarget 🔒
WorkerCorpus
Per-worker corpus manager.
WorkerCorpusSeed 🔒
Campaign-level corpus state produced by replaying persisted corpus entries once.

Enums§

MutationType 🔒
Possible mutation strategies to apply on a call sequence.

Constants§

CORPUS_DIR 🔒
FAVORABILITY_THRESHOLD 🔒
GZIP_THRESHOLD 🔒
Threshold for compressing corpus entries. 4 KiB is the typical block size, and therefore the minimum on-disk size of a file, on popular file systems.
OPTIMIZATION_BEST_FILE 🔒
SYNC_DIR 🔒
WORKER 🔒

Functions§

has_legacy_invariant_corpus_dirs 🔒
load_optimization_state 🔒
persist_campaign_entry 🔒
persist_optimization_output 🔒
prepare_campaign_output_dir 🔒
register_replay_created 🔒
Registers contracts created by the last tx so subsequent txs in the same replayed sequence can target them.
replay_corpus_sequence 🔒
replay_corpus_sequence_with_executor 🔒
rollback_replay_created 🔒
Clears dynamic targets added during a replayed entry so they don’t leak into the next one.
unique_corpus_entries 🔒