Module corpus


Corpus management for parallel fuzzing with coverage-guided mutation.

This module implements a corpus-based fuzzing system that stores, mutates, and shares transaction sequences across multiple fuzzing workers. Each corpus entry represents a sequence of transactions that has produced interesting coverage, and can be mutated to discover new execution paths.

§File System Structure

The corpus is organized on disk as follows:

<corpus_dir>/
├── worker0/                  # Master (worker 0) directory
│   ├── corpus/               # Master's corpus entries
│   │   ├── <uuid>-<timestamp>.json          # Corpus entry (if small)
│   │   └── <uuid>-<timestamp>.json.gz       # Corpus entry (if large, compressed)
│   └── sync/                 # Directory where other workers export new findings
│       └── <uuid>-<timestamp>.json          # New entries from other workers
└── workerN/                  # Worker N's directory
    ├── corpus/               # Worker N's local corpus
    │   └── ...
    └── sync/                 # Worker N's sync directory
        └── ...
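
The layout above can be expressed as a few path helpers. This is only a sketch: the names `WORKER_PREFIX`, `CORPUS_SUBDIR`, `SYNC_SUBDIR` and the three functions are hypothetical, and the string values are read off the tree rather than taken from the module's private `WORKER`, `CORPUS_DIR` and `SYNC_DIR` constants.

```rust
use std::path::{Path, PathBuf};

// Hypothetical values inferred from the directory tree; the module's own
// private constants may differ.
const WORKER_PREFIX: &str = "worker";
const CORPUS_SUBDIR: &str = "corpus";
const SYNC_SUBDIR: &str = "sync";

/// Root directory of one worker, e.g. `<corpus_dir>/worker3`.
fn worker_dir(corpus_root: &Path, worker_id: usize) -> PathBuf {
    corpus_root.join(format!("{WORKER_PREFIX}{worker_id}"))
}

/// Directory holding the worker's local corpus entries.
fn worker_corpus_dir(corpus_root: &Path, worker_id: usize) -> PathBuf {
    worker_dir(corpus_root, worker_id).join(CORPUS_SUBDIR)
}

/// Directory into which other workers place entries for this worker.
fn worker_sync_dir(corpus_root: &Path, worker_id: usize) -> PathBuf {
    worker_dir(corpus_root, worker_id).join(SYNC_SUBDIR)
}
```

Under these assumptions, worker 0 (the master) receives new findings in `<corpus_dir>/worker0/sync`.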

§Workflow

  • Each worker maintains its own local corpus with entries stored as JSON files
  • Workers export new interesting entries to the master’s sync directory via hard links
  • The master (worker0) imports new entries from its sync directory and exports them to all the other workers
  • Workers sync with the master to receive new corpus entries from other workers
  • All of this happens periodically and in no fixed order: it does not matter when a given worker exports or imports, as long as the corpus eventually syncs across all workers
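
The export step in the workflow above might look roughly like the following. It is a minimal sketch using `std::fs::hard_link`; the function name `export_entry` and the choice to treat an already-existing link as success are assumptions for illustration, not the module's actual behavior.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hard-links `entry` into `sync_dir` under the same file name and returns
/// the path of the link. (Hypothetical helper, not part of the module's API.)
fn export_entry(entry: &Path, sync_dir: &Path) -> io::Result<PathBuf> {
    let file_name = entry.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "entry has no file name")
    })?;
    // Make sure the destination directory exists before linking.
    fs::create_dir_all(sync_dir)?;
    let target = sync_dir.join(file_name);
    match fs::hard_link(entry, &target) {
        Ok(()) => Ok(target),
        // Assumption: an entry that was exported earlier is not an error.
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(target),
        Err(e) => Err(e),
    }
}
```

A hard link shares the file's data with the original instead of copying it, but it only works when both directories live on the same file system.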

Structs§

CorpusDirEntry 🔒
CorpusEntry 🔒
Holds Corpus information.
CorpusMetrics 🔒
GlobalCorpusMetrics 🔒
WorkerCorpus
Per-worker corpus manager.

Enums§

MutationType 🔒
Possible mutation strategies to apply on a call sequence.

Constants§

CORPUS_DIR 🔒
COVERAGE_MAP_SIZE 🔒
FAVORABILITY_THRESHOLD 🔒
GZIP_THRESHOLD 🔒
Threshold for compressing corpus entries. 4 KiB is the usual block size (the minimum on-disk allocation per file) on popular file systems, so compressing smaller entries saves no disk space.
SYNC_DIR 🔒
WORKER 🔒
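
To illustrate how a threshold such as `GZIP_THRESHOLD` could decide between the `.json` and `.json.gz` forms shown in the directory tree, here is a small sketch. The constant value (4 KiB, taken from the description), the strict `>` comparison and the helper name `entry_file_name` are all assumptions; the actual compression step is omitted.

```rust
// Assumed value, based on the 4 KiB figure in the constant's description.
const GZIP_THRESHOLD_SKETCH: usize = 4 * 1024;

/// Builds `<uuid>-<timestamp>.json` for small entries and
/// `<uuid>-<timestamp>.json.gz` for entries above the threshold.
/// (Hypothetical helper; only the naming decision is shown.)
fn entry_file_name(uuid: &str, timestamp: u64, serialized_len: usize) -> String {
    let ext = if serialized_len > GZIP_THRESHOLD_SKETCH {
        "json.gz"
    } else {
        "json"
    };
    format!("{uuid}-{timestamp}.{ext}")
}
```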

Functions§

parse_corpus_filename 🔒
Parses the corpus filename and returns the uuid and timestamp associated with it.
read_corpus_dir 🔒
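
For orientation, parsing a name of the form `<uuid>-<timestamp>.json` or `<uuid>-<timestamp>.json.gz` could be sketched as below. The function `parse_name` is hypothetical and returns the UUID as a plain string slice and the timestamp as `u64`; the real `parse_corpus_filename` signature and return types are not shown on this page and may differ.

```rust
/// Splits a corpus file name into its UUID part and numeric timestamp.
/// Returns `None` if the extension or the timestamp does not match.
/// (Hypothetical sketch, not the module's implementation.)
fn parse_name(file_name: &str) -> Option<(&str, u64)> {
    // Accept both the plain and the compressed form.
    let stem = file_name
        .strip_suffix(".json.gz")
        .or_else(|| file_name.strip_suffix(".json"))?;
    // A UUID contains hyphens itself, so split on the last hyphen.
    let (uuid, timestamp) = stem.rsplit_once('-')?;
    if uuid.is_empty() {
        return None;
    }
    Some((uuid, timestamp.parse::<u64>().ok()?))
}
```

Splitting on the last hyphen is the key design choice here: splitting on the first one would cut the UUID apart.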