Project Context Primer

This book focuses on the Nomos Testing Framework. It assumes familiarity with the Nomos architecture, but for completeness, here is a short primer.

  • Nomos is a modular blockchain protocol composed of validators, executors, and a data-availability (DA) subsystem.
  • Validators participate in consensus and produce blocks.
  • Executors are validators with the DA dispersal service enabled. They perform all validator functions plus submit blob data to the DA network.
  • Data Availability (DA) ensures that blob data submitted via channel operations in transactions is published and retrievable by the network.

These roles interact tightly, which is why meaningful testing must be performed in multi-node environments that include real networking, timing, and DA interaction.

What You Will Learn

This book gives you a clear mental model for Nomos multi-node testing, shows how to author scenarios that pair realistic workloads with explicit expectations, and guides you to run them across local, containerized, and cluster environments without changing the plan.

Part I — Foundations

Conceptual chapters that establish the mental model for the framework and how it approaches multi-node testing.

Introduction

The Nomos Testing Framework is a purpose-built toolkit for exercising Nomos in realistic, multi-node environments. It closes the gap between small, isolated tests and full-system validation by letting teams describe a cluster layout, drive meaningful traffic, and assert the outcomes in one coherent plan.

It is for protocol engineers, infrastructure operators, and QA teams who need repeatable confidence that validators, executors, and data-availability components work together under network and timing constraints.

Multi-node integration testing is required because many Nomos behaviors—block progress, data availability, liveness under churn—only emerge when several roles interact over real networking and time. This framework makes those checks declarative, observable, and portable across environments.

Architecture Overview

The framework follows a clear flow: Topology → Scenario → Deployer → Runner → Workloads → Expectations.

Core Flow

flowchart LR
    A(Topology<br/>shape cluster) --> B(Scenario<br/>plan)
    B --> C(Deployer<br/>provision & readiness)
    C --> D(Runner<br/>orchestrate execution)
    D --> E(Workloads<br/>drive traffic)
    E --> F(Expectations<br/>verify outcomes)

Components

  • Topology describes the cluster: how many nodes, their roles, and the high-level network and data-availability parameters they should follow.
  • Scenario combines that topology with the activities to run and the checks to perform, forming a single plan.
  • Deployer provisions infrastructure on the chosen backend (local processes, Docker Compose, or Kubernetes), waits for readiness, and returns a Runner.
  • Runner orchestrates scenario execution: starts workloads, observes signals, evaluates expectations, and triggers cleanup.
  • Workloads generate traffic and conditions that exercise the system.
  • Expectations observe the run and judge success or failure once activity completes.

Each layer has a narrow responsibility so that cluster shape, deployment choice, traffic generation, and health checks can evolve independently while fitting together predictably.

Entry Points

The framework is consumed via runnable example binaries in examples/src/bin/:

  • local_runner.rs — Spawns nodes as local processes
  • compose_runner.rs — Deploys via Docker Compose (requires NOMOS_TESTNET_IMAGE built)
  • k8s_runner.rs — Deploys via Kubernetes Helm (requires cluster + image)

Run with: POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin <name>

Important: All runners require POL_PROOF_DEV_MODE=true to avoid expensive Groth16 proof generation that causes timeouts.

These binaries use the framework API (ScenarioBuilder) to construct and execute scenarios.

Builder API

Scenarios are defined using a fluent builder pattern:

#![allow(unused)]
fn main() {
let mut plan = ScenarioBuilder::topology_with(|t| {
        t.network_star()      // Topology configuration
            .validators(3)
            .executors(2)
    })
    .wallets(50)             // Wallet seeding
    .transactions_with(|txs| {
        txs.rate(5)
            .users(20)
    })
    .da_with(|da| {
        da.channel_rate(1)
            .blob_rate(2)
    })
    .expect_consensus_liveness()  // Expectations
    .with_run_duration(Duration::from_secs(90))
    .build();
}

Key API Points:

  • Topology uses .topology_with(|t| { t.validators(N).executors(M) }) closure pattern
  • Workloads are configured via _with closures (transactions_with, da_with, chaos_with)
  • Chaos workloads require .enable_node_control() and a compatible runner

Deployers

Three deployer implementations:

| Deployer | Backend | Prerequisites | Node Control |
|---|---|---|---|
| LocalDeployer | Local processes | Binaries in sibling checkout | No |
| ComposeDeployer | Docker Compose | NOMOS_TESTNET_IMAGE built | Yes |
| K8sDeployer | Kubernetes Helm | Cluster + image loaded | Not yet |

Compose-specific features:

  • Includes Prometheus at http://localhost:9090 (override via TEST_FRAMEWORK_PROMETHEUS_PORT)
  • Optional OTLP trace/metrics endpoints (NOMOS_OTLP_ENDPOINT, NOMOS_OTLP_METRICS_ENDPOINT)
  • Node control for chaos testing (restart validators/executors)

Assets and Images

Docker Image

Built via testing-framework/assets/stack/scripts/build_test_image.sh:

  • Embeds KZG circuit parameters from testing-framework/assets/stack/kzgrs_test_params/
  • Includes runner scripts: run_nomos_node.sh, run_nomos_executor.sh
  • Tagged as NOMOS_TESTNET_IMAGE (default: nomos-testnet:local)

Circuit Assets

KZG parameters required for DA workloads:

  • Default path: testing-framework/assets/stack/kzgrs_test_params/
  • Override: NOMOS_KZGRS_PARAMS_PATH=/custom/path
  • Fetch via: scripts/setup-nomos-circuits.sh v0.3.1 /tmp/circuits

Compose Stack

Templates and configs in testing-framework/runners/compose/assets/:

  • docker-compose.yml.tera — Stack template (validators, executors, Prometheus)
  • Cfgsync config: testing-framework/assets/stack/cfgsync.yaml
  • Monitoring: testing-framework/assets/stack/monitoring/prometheus.yml

Logging Architecture

Two separate logging pipelines:

| Component | Configuration | Output |
|---|---|---|
| Runner binaries | RUST_LOG | Framework orchestration logs |
| Node processes | NOMOS_LOG_LEVEL, NOMOS_LOG_FILTER, NOMOS_LOG_DIR | Consensus, DA, mempool logs |

Node logging:

  • Local runner: Writes to temporary directories by default (cleaned up). Set NOMOS_TESTS_TRACING=true + NOMOS_LOG_DIR for persistent files.
  • Compose runner: Default logs to container stdout/stderr (docker logs). Optional per-node files if NOMOS_LOG_DIR is set and mounted.
  • K8s runner: Logs to pod stdout/stderr (kubectl logs). Optional per-node files if NOMOS_LOG_DIR is set and mounted.

File naming: Per-node files use prefix nomos-node-{index} or nomos-executor-{index} (may include timestamps).

Observability

Prometheus (Compose only):

  • Exposed at http://localhost:9090 (configurable)
  • Scrapes all validator and executor metrics
  • Accessible in expectations: ctx.telemetry().prometheus_endpoint()

Node APIs:

  • HTTP endpoints per node for consensus info, network status, DA membership
  • Accessible in expectations: ctx.node_clients().validators().get(0)

OTLP (optional):

  • Trace endpoint: NOMOS_OTLP_ENDPOINT=http://localhost:4317
  • Metrics endpoint: NOMOS_OTLP_METRICS_ENDPOINT=http://localhost:4318
  • Disabled by default (no noise if unset)

For detailed logging configuration, see Logging and Observability.

Testing Philosophy

This framework embodies specific principles that shape how you author and run scenarios. Understanding these principles helps you write effective tests and interpret results correctly.

Declarative over Imperative

Describe what you want to test, not how to orchestrate it:

#![allow(unused)]
fn main() {
// Good: declarative
ScenarioBuilder::topology_with(|t| {
        t.network_star()
            .validators(2)
            .executors(1)
    })
    .transactions_with(|txs| {
        txs.rate(5)             // 5 transactions per block
    })
    .expect_consensus_liveness()
    .build();

// Bad: imperative (framework doesn't work this way)
// spawn_validator(); spawn_executor(); 
// loop { submit_tx(); check_block(); }
}

Why it matters: The framework handles deployment, readiness, and cleanup. You focus on test intent, not infrastructure orchestration.

Protocol Time, Not Wall Time

Reason in blocks and consensus intervals, not wall-clock seconds.

Consensus defaults:

  • Slot duration: 2 seconds (NTP-synchronized, configurable via CONSENSUS_SLOT_TIME)
  • Active slot coefficient: 0.9 (90% block probability per slot)
  • Expected rate: ~27 blocks per minute
#![allow(unused)]
fn main() {
// Good: protocol-oriented thinking
let plan = ScenarioBuilder::topology_with(|t| {
        t.network_star()
            .validators(2)
            .executors(1)
    })
    .transactions_with(|txs| {
        txs.rate(5)             // 5 transactions per block
    })
    .with_run_duration(Duration::from_secs(60))  // Let framework calculate expected blocks
    .expect_consensus_liveness()  // "Did we produce the expected blocks?"
    .build();

// Bad: wall-clock assumptions
// "I expect exactly 30 blocks in 60 seconds"
// This breaks on slow CI where slot timing might drift
}

Why it matters: Slot timing is fixed (2s by default, NTP-synchronized), so the expected number of blocks is predictable: ~27 blocks in 60s with the default 0.9 active slot coefficient. The framework calculates expected blocks from slot duration and run window, making assertions protocol-based rather than tied to specific wall-clock expectations. Assert on "blocks produced relative to slots" not "blocks produced in exact wall-clock seconds".

Determinism First, Chaos When Needed

Default scenarios are repeatable:

  • Fixed topology
  • Predictable traffic rates
  • Deterministic checks

Chaos is opt-in:

#![allow(unused)]
fn main() {
// Separate: functional test (deterministic)
let plan = ScenarioBuilder::topology_with(|t| {
        t.network_star()
            .validators(2)
            .executors(1)
    })
    .transactions_with(|txs| {
        txs.rate(5)             // 5 transactions per block
    })
    .expect_consensus_liveness()
    .build();

// Separate: chaos test (introduces randomness)
let chaos_plan = ScenarioBuilder::topology_with(|t| {
        t.network_star()
            .validators(3)
            .executors(2)
    })
    .enable_node_control()
    .chaos_with(|c| {
        c.restart()
            .min_delay(Duration::from_secs(30))
            .max_delay(Duration::from_secs(60))
            .target_cooldown(Duration::from_secs(45))
            .apply()
    })
    .transactions_with(|txs| {
        txs.rate(5)             // 5 transactions per block
    })
    .expect_consensus_liveness()
    .build();
}

Why it matters: Mixing determinism with chaos creates noisy, hard-to-debug failures. Separate concerns make failures actionable.

Observable Health Signals

Prefer user-facing signals over internal state:

Good checks:

  • Blocks progressing at expected rate (liveness)
  • Transactions included within N blocks (inclusion)
  • DA blobs retrievable (availability)

Avoid internal checks:

  • Memory pool size
  • Internal service state
  • Cache hit rates

Why it matters: User-facing signals reflect actual system health. Internal state can be "healthy" while the system is broken from a user perspective.

Minimum Run Windows

Always run long enough for meaningful block production:

#![allow(unused)]
fn main() {
// Bad: too short
.with_run_duration(Duration::from_secs(5))  // ~2 blocks (with default 2s slots, 0.9 coeff)

// Good: enough blocks for assertions
.with_run_duration(Duration::from_secs(60))  // ~27 blocks (with default 2s slots, 0.9 coeff)
}

Note: Block counts assume default consensus parameters:

  • Slot duration: 2 seconds (configurable via CONSENSUS_SLOT_TIME)
  • Active slot coefficient: 0.9 (90% block probability per slot)
  • Formula: blocks ≈ (duration / slot_duration) × active_slot_coeff

If upstream changes these parameters, adjust your duration expectations accordingly.
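
To make the arithmetic concrete, here is a small illustrative sketch that simply evaluates the formula above. The helper function is hypothetical and not part of the framework API; it only mirrors the calculation.

#![allow(unused)]
fn main() {
use std::time::Duration;

// Hypothetical helper mirroring the formula above; not part of the framework API.
fn expected_blocks(run: Duration, slot: Duration, active_slot_coeff: f64) -> f64 {
    (run.as_secs_f64() / slot.as_secs_f64()) * active_slot_coeff
}

// With the defaults (2s slots, 0.9 active slot coefficient):
let blocks = expected_blocks(Duration::from_secs(60), Duration::from_secs(2), 0.9);
assert_eq!(blocks.round() as u64, 27); // ~27 blocks in a 60-second run
}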

The framework enforces minimum durations (at least 2× slot duration), but be explicit. Very short runs risk false confidence—one lucky block doesn't prove liveness.

Summary

These principles keep scenarios:

  • Portable across environments (protocol time, declarative)
  • Debuggable (determinism, separation of concerns)
  • Meaningful (observable signals, sufficient duration)

When authoring scenarios, ask: "Does this test the protocol behavior or my local environment quirks?"

Scenario Lifecycle

  1. Build the plan: Declare a topology, attach workloads and expectations, and set the run window. The plan is the single source of truth for what will happen.
  2. Deploy: Hand the plan to a deployer. It provisions the environment on the chosen backend, waits for nodes to signal readiness, and returns a runner.
  3. Drive workloads: The runner starts traffic and behaviors (transactions, data-availability activity, restarts) for the planned duration.
  4. Observe blocks and signals: Track block progression and other high-level metrics during or after the run window to ground assertions in protocol time.
  5. Evaluate expectations: Once activity stops (and optional cooldown completes), the runner checks liveness and workload-specific outcomes to decide pass or fail.
  6. Cleanup: Tear down resources so successive runs start fresh and do not inherit leaked state.
flowchart LR
    P[Plan<br/>topology + workloads + expectations] --> D[Deploy<br/>deployer provisions]
    D --> R[Runner<br/>orchestrates execution]
    R --> W[Drive Workloads]
    W --> O[Observe<br/>blocks/metrics]
    O --> E[Evaluate Expectations]
    E --> C[Cleanup]

Design Rationale

  • Modular crates keep configuration, orchestration, workloads, and runners decoupled so each can evolve without breaking the others.
  • Pluggable runners let the same scenario run on a laptop, a Docker host, or a Kubernetes cluster, making validation portable across environments.
  • Separated workloads and expectations clarify intent: what traffic to generate versus how to judge success. This simplifies review and reuse.
  • Declarative topology makes cluster shape explicit and repeatable, reducing surprise when moving between CI and developer machines.
  • Maintainability through predictability: a clear flow from plan to deployment to verification lowers the cost of extending the framework and interpreting failures.

Part II — User Guide

Practical guidance for shaping scenarios, combining workloads and expectations, and running them across different environments.

Workspace Layout

The workspace focuses on multi-node integration testing and sits alongside a nomos-node checkout. Its crates separate concerns to keep scenarios repeatable and portable:

  • Configs: prepares high-level node, network, tracing, and wallet settings used across test environments.
  • Core scenario orchestration: the engine that holds topology descriptions, scenario plans, runtimes, workloads, and expectations.
  • Workflows: ready-made workloads (transactions, data-availability, chaos) and reusable expectations assembled into a user-facing DSL.
  • Runners: deployment backends for local processes, Docker Compose, and Kubernetes, all consuming the same scenario plan.
  • Runner Examples (examples/, package runner-examples): runnable binaries (local_runner.rs, compose_runner.rs, k8s_runner.rs) that demonstrate complete scenario execution with each deployer.

This split keeps configuration, orchestration, reusable traffic patterns, and deployment adapters loosely coupled while sharing one mental model for tests.

Annotated Tree

Directory structure with key paths annotated:

nomos-testing/
├─ testing-framework/           # Core library crates
│  ├─ configs/                  # Node config builders, topology generation, tracing/logging config
│  ├─ core/                     # Scenario model (ScenarioBuilder), runtime (Runner, Deployer), topology, node spawning
│  ├─ workflows/                # Workloads (transactions, DA, chaos), expectations (liveness), builder DSL extensions
│  ├─ runners/                  # Deployment backends
│  │  ├─ local/                 # LocalDeployer (spawns local processes)
│  │  ├─ compose/               # ComposeDeployer (Docker Compose + Prometheus)
│  │  └─ k8s/                   # K8sDeployer (Kubernetes Helm)
│  └─ assets/                   # Docker/K8s stack assets
│     └─ stack/
│        ├─ kzgrs_test_params/  # KZG circuit parameters (fetch via setup-nomos-circuits.sh)
│        ├─ monitoring/         # Prometheus config
│        ├─ scripts/            # Container entrypoints, image builder
│        └─ cfgsync.yaml        # Config sync server template
│
├─ examples/                    # PRIMARY ENTRY POINT: runnable binaries
│  └─ src/bin/
│     ├─ local_runner.rs        # Local processes demo (POL_PROOF_DEV_MODE=true)
│     ├─ compose_runner.rs      # Docker Compose demo (requires image)
│     └─ k8s_runner.rs          # Kubernetes demo (requires cluster + image)
│
├─ scripts/                     # Helper utilities
│  └─ setup-nomos-circuits.sh   # Fetch KZG circuit parameters
│
└─ book/                        # This documentation (mdBook)

Key Directories Explained

testing-framework/

Core library crates providing the testing API.

| Crate | Purpose | Key Exports |
|---|---|---|
| configs | Node configuration builders | Topology generation, tracing config |
| core | Scenario model & runtime | ScenarioBuilder, Deployer, Runner |
| workflows | Workloads & expectations | ScenarioBuilderExt, ChaosBuilderExt |
| runners/local | Local process deployer | LocalDeployer |
| runners/compose | Docker Compose deployer | ComposeDeployer |
| runners/k8s | Kubernetes deployer | K8sDeployer |

testing-framework/assets/stack/

Docker/K8s deployment assets:

  • kzgrs_test_params/: Circuit parameters (override via NOMOS_KZGRS_PARAMS_PATH)
  • monitoring/: Prometheus config
  • scripts/: Container entrypoints and image builder
  • cfgsync.yaml: Configuration sync server template

examples/ (Start Here!)

Runnable binaries demonstrating framework usage:

  • local_runner.rs — Local processes
  • compose_runner.rs — Docker Compose (requires NOMOS_TESTNET_IMAGE built)
  • k8s_runner.rs — Kubernetes (requires cluster + image)

Run with: POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin <name>

All runners require POL_PROOF_DEV_MODE=true to avoid expensive proof generation.

scripts/

Helper utilities:

  • setup-nomos-circuits.sh: Fetch KZG parameters from releases

Observability

Compose runner includes:

  • Prometheus at http://localhost:9090 (metrics scraping)
  • Node metrics exposed per validator/executor
  • Access in expectations: ctx.telemetry().prometheus_endpoint()

Logging controlled by:

  • NOMOS_LOG_DIR — Write per-node log files
  • NOMOS_LOG_LEVEL — Global log level (error/warn/info/debug/trace)
  • NOMOS_LOG_FILTER — Target-specific filtering (e.g., consensus=trace,da=debug)
  • NOMOS_TESTS_TRACING — Enable file logging for local runner

See Logging and Observability for details.

| To Do This | Go Here |
|---|---|
| Run an example | examples/src/bin/ → cargo run -p runner-examples --bin <name> |
| Write a custom scenario | testing-framework/core/ → Implement using ScenarioBuilder |
| Add a new workload | testing-framework/workflows/src/workloads/ → Implement Workload trait |
| Add a new expectation | testing-framework/workflows/src/expectations/ → Implement Expectation trait |
| Modify node configs | testing-framework/configs/src/topology/configs/ |
| Extend builder DSL | testing-framework/workflows/src/builder/ → Add trait methods |
| Add a new deployer | testing-framework/runners/ → Implement Deployer trait |
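
To give a feel for what the "Add a new expectation" row involves, here is a hypothetical sketch of a custom expectation. The trait below is a local stand-in defined only so the example is self-contained; the real Expectation trait in testing-framework/workflows/src/expectations/ has its own signature and evaluation context, so treat this purely as an illustration of separating observation from judgment.

#![allow(unused)]
fn main() {
// Stand-in trait: the real `Expectation` trait in testing-framework/workflows differs;
// it is defined here only so the sketch compiles on its own.
trait Expectation {
    fn name(&self) -> &str;
    fn evaluate(&self, blocks_observed: u64) -> Result<(), String>;
}

// Hypothetical check: at least `min_blocks` blocks were produced during the run.
struct MinBlocksProduced {
    min_blocks: u64,
}

impl Expectation for MinBlocksProduced {
    fn name(&self) -> &str {
        "min-blocks-produced"
    }

    fn evaluate(&self, blocks_observed: u64) -> Result<(), String> {
        if blocks_observed >= self.min_blocks {
            Ok(())
        } else {
            Err(format!(
                "expected at least {} blocks, observed {}",
                self.min_blocks, blocks_observed
            ))
        }
    }
}
}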

For detailed guidance, see Internal Crate Reference.

Authoring Scenarios

Creating a scenario is a declarative exercise:

  1. Shape the topology: decide how many validators and executors to run, and what high-level network and data-availability characteristics matter for the test.
  2. Attach workloads: pick traffic generators that align with your goals (transactions, data-availability blobs, or chaos for resilience probes).
  3. Define expectations: specify the health signals that must hold when the run finishes (e.g., consensus liveness, inclusion of submitted activity; see Core Content: Workloads & Expectations).
  4. Set duration: choose a run window long enough to observe meaningful block progression and the effects of your workloads.
  5. Choose a runner: target local processes for fast iteration, Docker Compose for reproducible multi-node stacks, or Kubernetes for cluster-grade validation. For environment considerations, see Operations.

Keep scenarios small and explicit: make the intended behavior and the success criteria clear so failures are easy to interpret and act upon.
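
Put together, the five steps above map onto the builder roughly like this (a compact sketch reusing the API shown in the Builder API and Examples chapters; the counts, rates, and duration are arbitrary):

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

async fn authored_scenario() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()          // 1. Shape the topology
                .validators(2)
                .executors(1)
        })
        .wallets(10)
        .transactions_with(|txs| {    // 2. Attach workloads
            txs.rate(3)
                .users(5)
        })
        .expect_consensus_liveness()  // 3. Define expectations
        .with_run_duration(Duration::from_secs(60))  // 4. Set duration
        .build();

    let deployer = LocalDeployer::default();  // 5. Choose a runner
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    Ok(())
}
}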

Core Content: Workloads & Expectations

Workloads describe the activity a scenario generates; expectations describe the signals that must hold when that activity completes. Both are pluggable so scenarios stay readable and purpose-driven.

Workloads

  • Transaction workload: submits user-level transactions at a configurable rate and can limit how many distinct actors participate.
  • Data-availability workload: drives blob and channel activity to exercise data-availability paths.
  • Chaos workload: triggers controlled node restarts to test resilience and recovery behaviors (requires a runner that can control nodes).

Expectations

  • Consensus liveness: verifies the system continues to produce blocks in line with the planned workload and timing window.
  • Workload-specific checks: each workload can attach its own success criteria (e.g., inclusion of submitted activity) so scenarios remain concise.

Together, workloads and expectations let you express both the pressure applied to the system and the definition of “healthy” for that run.

flowchart TD
    I[Inputs<br/>topology + wallets + rates] --> Init[Workload init]
    Init --> Drive[Drive traffic]
    Drive --> Collect[Collect signals]
    Collect --> Eval[Expectations evaluate]

Core Content: ScenarioBuilderExt Patterns

Patterns that keep scenarios readable and reusable:

  • Topology-first: start by shaping the cluster (counts, layout) so later steps inherit a clear foundation.
  • Bundle defaults: use the DSL helpers to attach common expectations (like liveness) whenever you add a matching workload, reducing forgotten checks.
  • Intentional rates: express traffic in per-block terms to align with protocol timing rather than wall-clock assumptions.
  • Opt-in chaos: enable restart patterns only in scenarios meant to probe resilience; keep functional smoke tests deterministic.
  • Wallet clarity: seed only the number of actors you need; it keeps transaction scenarios deterministic and interpretable.

These patterns make scenario definitions self-explanatory while staying aligned with the framework’s block-oriented timing model.

Best Practices

  • State your intent: document the goal of each scenario (throughput, DA validation, resilience) so expectation choices are obvious.
  • Keep runs meaningful: choose durations that allow multiple blocks and make timing-based assertions trustworthy.
  • Separate concerns: start with deterministic workloads for functional checks; add chaos in dedicated resilience scenarios to avoid noisy failures.
  • Reuse patterns: standardize on shared topology and workload presets so results are comparable across environments and teams.
  • Observe first, tune second: rely on liveness and inclusion signals to interpret outcomes before tweaking rates or topology.
  • Environment fit: pick runners that match the feedback loop you need—local for speed (including fast CI smoke tests), compose for reproducible stacks (recommended for CI), k8s for cluster-grade fidelity.
  • Minimal surprises: seed only necessary wallets and keep configuration deltas explicit when moving between CI and developer machines.

Examples

Concrete scenario shapes that illustrate how to combine topologies, workloads, and expectations.

Runnable examples: The repo includes complete binaries in examples/src/bin/:

  • local_runner.rs — Local processes
  • compose_runner.rs — Docker Compose (requires NOMOS_TESTNET_IMAGE built)
  • k8s_runner.rs — Kubernetes (requires cluster access and image loaded)

Run with: POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin <name>

All runners require POL_PROOF_DEV_MODE=true to avoid expensive proof generation.

Code patterns below show how to build scenarios. Wrap these in #[tokio::test] functions for integration tests, or #[tokio::main] for binaries.
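
A minimal sketch of the test wrapping (assuming tokio is available as a dev-dependency; the body mirrors the "Simple consensus liveness" example below):

use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

#[tokio::test]
async fn consensus_liveness_smoke() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    // Build the plan, deploy with LocalDeployer, then run it.
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(0))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(30))
        .build();
    let runner = LocalDeployer::default().deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    Ok(())
}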

Simple consensus liveness

Minimal test that validates basic block production:

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

async fn simple_consensus() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(3)
                .executors(0)
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(30))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    
    Ok(())
}
}

When to use: smoke tests for consensus on minimal hardware.

Transaction workload

Test consensus under transaction load:

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

async fn transaction_workload() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(2)
                .executors(0)
        })
        .wallets(20)
        .transactions_with(|txs| {
            txs.rate(5)
                .users(10)
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(60))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    
    Ok(())
}
}

When to use: validate transaction submission and inclusion.

DA + transaction workload

Combined test stressing both transaction and DA layers:

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

async fn da_and_transactions() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(3)
                .executors(2)
        })
        .wallets(30)
        .transactions_with(|txs| {
            txs.rate(5)
                .users(15)
        })
        .da_with(|da| {
            da.channel_rate(1)
                .blob_rate(2)
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(90))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    
    Ok(())
}
}

When to use: end-to-end coverage of transaction and DA layers.

Chaos resilience

Test system resilience under node restarts:

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_workflows::{ScenarioBuilderExt, ChaosBuilderExt};
use std::time::Duration;

async fn chaos_resilience() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(4)
                .executors(2)
        })
        .enable_node_control()
        .wallets(20)
        .transactions_with(|txs| {
            txs.rate(3)
                .users(10)
        })
        .chaos_with(|c| {
            c.restart()
                .min_delay(Duration::from_secs(20))
                .max_delay(Duration::from_secs(40))
                .target_cooldown(Duration::from_secs(30))
                .apply()
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(120))
        .build();

    let deployer = ComposeDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    
    Ok(())
}
}

When to use: resilience validation and operational readiness drills.

Note: Chaos tests require ComposeDeployer or another runner with node control support.

Advanced Examples

Realistic advanced scenarios demonstrating framework capabilities for production testing.

Summary

| Example | Topology | Workloads | Deployer | Key Feature |
|---|---|---|---|---|
| Load Progression | 3 validators + 2 executors | Increasing tx rate | Compose | Dynamic load testing |
| Sustained Load | 4 validators + 2 executors | High tx + DA rate | Compose | Stress testing |
| Aggressive Chaos | 4 validators + 2 executors | Frequent restarts + traffic | Compose | Resilience validation |

Load Progression Test

Test consensus under progressively increasing transaction load:

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

async fn load_progression_test() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    for rate in [5, 10, 20, 30] {
        println!("Testing with rate: {}", rate);
        
        let mut plan = ScenarioBuilder::topology_with(|t| {
                t.network_star()
                    .validators(3)
                    .executors(2)
            })
            .wallets(50)
            .transactions_with(|txs| {
                txs.rate(rate)
                    .users(20)
            })
            .expect_consensus_liveness()
            .with_run_duration(Duration::from_secs(60))
            .build();

        let deployer = ComposeDeployer::default();
        let runner = deployer.deploy(&plan).await?;
        let _handle = runner.run(&mut plan).await?;
    }
    
    Ok(())
}
}

When to use: Finding the maximum sustainable transaction rate for a given topology.

Sustained Load Test

Run high transaction and DA load for extended duration:

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

async fn sustained_load_test() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(4)
                .executors(2)
        })
        .wallets(100)
        .transactions_with(|txs| {
            txs.rate(15)
                .users(50)
        })
        .da_with(|da| {
            da.channel_rate(2)
                .blob_rate(3)
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(300))
        .build();

    let deployer = ComposeDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    
    Ok(())
}
}

When to use: Validating stability under continuous high load over extended periods.

Aggressive Chaos Test

Frequent node restarts with active traffic:

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_workflows::{ScenarioBuilderExt, ChaosBuilderExt};
use std::time::Duration;

async fn aggressive_chaos_test() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(4)
                .executors(2)
        })
        .enable_node_control()
        .wallets(50)
        .transactions_with(|txs| {
            txs.rate(10)
                .users(20)
        })
        .chaos_with(|c| {
            c.restart()
                .min_delay(Duration::from_secs(10))
                .max_delay(Duration::from_secs(20))
                .target_cooldown(Duration::from_secs(15))
                .apply()
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(180))
        .build();

    let deployer = ComposeDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    
    Ok(())
}
}

When to use: Validating recovery and liveness under aggressive failure conditions.

Note: Requires ComposeDeployer for node control support.

Extension Ideas

These scenarios require custom implementations but demonstrate framework extensibility:

Mempool & Transaction Handling

Transaction Propagation & Inclusion Test

Concept: Submit the same batch of independent transactions to different nodes in randomized order/offsets, then verify all transactions are included and final state matches across nodes.

Requirements:

  • Custom workload: Generates a fixed batch of transactions and submits the same set to different nodes via ctx.node_clients(), with randomized submission order and timing offsets per node
  • Custom expectation: Verifies all transactions appear in blocks (order may vary), final state matches across all nodes (compare balances or state roots), and no transactions are dropped

Why useful: Exercises mempool propagation, proposer fairness, and transaction inclusion guarantees under realistic race conditions. Tests that the protocol maintains consistency regardless of which node receives transactions first.

Implementation notes: Requires both a custom Workload implementation (to submit same transactions to multiple nodes with jitter) and a custom Expectation implementation (to verify inclusion and state consistency).

Cross-Validator Mempool Divergence & Convergence

Concept: Drive different transaction subsets into different validators (or differing arrival orders) to create temporary mempool divergence, then verify mempools/blocks converge to contain the union (no permanent divergence).

Requirements:

  • Custom workload: Targets specific nodes via ctx.node_clients() with disjoint or jittered transaction batches
  • Custom expectation: After a convergence window, verifies that all transactions appear in blocks (order may vary) or that mempool contents converge across nodes
  • Run normal workloads during convergence period

Expectations:

  • Temporary mempool divergence is acceptable (different nodes see different transactions initially)
  • After convergence window, all transactions appear in blocks or mempools converge
  • No transactions are permanently dropped despite initial divergence
  • Mempool gossip/reconciliation mechanisms work correctly

Why useful: Exercises mempool gossip and reconciliation under uneven input or latency. Ensures no node "drops" transactions seen elsewhere, validating that mempool synchronization mechanisms correctly propagate transactions across the network even when they arrive at different nodes in different orders.

Implementation notes: Requires both a custom Workload implementation (to inject disjoint/jittered batches per node) and a custom Expectation implementation (to verify mempool convergence or block inclusion). Uses existing ctx.node_clients() capability—no new infrastructure needed.

Adaptive Mempool Pressure Test

Concept: Ramp transaction load over time to observe mempool growth, fee prioritization/eviction, and block saturation behavior, detecting performance regressions and ensuring backpressure/eviction work under increasing load.

Requirements:

  • Custom workload: Steadily increases transaction rate over time (optional: use fee tiers)
  • Custom expectation: Monitors mempool size, evictions, and throughput (blocks/txs per slot), flagging runaway growth or stalls
  • Run for extended duration to observe pressure buildup

Expectations:

  • Mempool size grows predictably with load (not runaway growth)
  • Fee prioritization/eviction mechanisms activate under pressure
  • Block saturation behavior is acceptable (blocks fill appropriately)
  • Throughput (blocks/txs per slot) remains stable or degrades gracefully
  • No stalls or unbounded mempool growth

Why useful: Detects performance regressions in mempool management. Ensures backpressure and eviction mechanisms work correctly under increasing load, preventing memory exhaustion or unbounded growth. Validates that fee prioritization correctly selects high-value transactions when mempool is full.

Implementation notes: Can be built with current workload model (ramping rate). Requires custom Expectation implementation that reads mempool metrics (via node HTTP APIs or Prometheus) and monitors throughput to judge behavior. No new infrastructure needed—uses existing observability capabilities.

Invalid Transaction Fuzzing

Concept: Submit malformed transactions and verify they're rejected properly.

Implementation approach:

  • Custom workload that generates invalid transactions (bad signatures, insufficient funds, malformed structure)
  • Expectation verifies mempool rejects them and they never appear in blocks
  • Test mempool resilience and filtering

Why useful: Ensures mempool doesn't crash or include invalid transactions under fuzzing.

Network & Gossip

Gossip Latency Gradient Scenario

Concept: Test consensus robustness under skewed gossip delays by partitioning nodes into latency tiers (tier A ≈10ms, tier B ≈100ms, tier C ≈300ms) and observing propagation lag, fork rate, and eventual convergence.

Requirements:

  • Partition nodes into three groups (tiers)
  • Apply per-group network delay via chaos: netem/iptables in compose; NetworkPolicy + netem sidecar in k8s
  • Run standard workload (transactions/block production)
  • Optional: Remove delays at end to check recovery

Expectations:

  • Propagation: Messages reach all tiers within acceptable bounds
  • Safety: No divergent finalized heads; fork rate stays within tolerance
  • Liveness: Chain keeps advancing; convergence after delays relaxed (if healed)

Why useful: Real networks have heterogeneous latency. This stress-tests proposer selection and fork resolution when some peers are "far" (high latency), validating that consensus remains safe and live under realistic network conditions.

Current blocker: Runner support for per-group delay injection (network delay via netem/iptables) is not present today. Would require new chaos plumbing in compose/k8s deployers to inject network delays per node group.

Byzantine Gossip Flooding (libp2p Peer)

Concept: Spin up a custom workload/sidecar that runs a libp2p host, joins the cluster's gossip mesh, and publishes a high rate of syntactically valid but useless/stale messages to selected topics, testing gossip backpressure, scoring, and queue handling under a "malicious" peer.

Requirements:

  • Custom workload/sidecar that implements a libp2p host
  • Join the cluster's gossip mesh as a peer
  • Publish high-rate syntactically valid but useless/stale messages to selected gossip topics
  • Run alongside normal workloads (transactions/block production)

Expectations:

  • Gossip backpressure mechanisms prevent message flooding from overwhelming nodes
  • Peer scoring correctly identifies and penalizes the malicious peer
  • Queue handling remains stable under flood conditions
  • Normal consensus operation continues despite malicious peer

Why useful: Tests Byzantine behavior (malicious peer) which is critical for consensus protocol robustness. More realistic than RPC spam since it uses the actual gossip protocol. Validates that gossip backpressure, peer scoring, and queue management correctly handle adversarial peers without disrupting consensus.

Current blocker: Requires adding gossip-capable helper (libp2p integration) to the framework. Would need a custom workload/sidecar implementation that can join the gossip mesh and inject messages. The rest of the scenario can use existing runners/workloads.

Network Partition Recovery

Concept: Test consensus recovery after network partitions.

Requirements:

  • Needs block_peer() / unblock_peer() methods in NodeControlHandle
  • Partition subsets of validators, wait, then restore connectivity
  • Verify chain convergence after partition heals

Why useful: Tests the most realistic failure mode in distributed systems.

Current blocker: Node control doesn't yet support network-level actions (only process restarts).

Time & Timing

Time-Shifted Blocks (Clock Skew Test)

Concept: Test consensus and timestamp handling when nodes run with skewed clocks (e.g., +1s, −1s, +200ms jitter) to surface timestamp validation issues, reorg sensitivity, and clock drift handling.

Requirements:

  • Assign per-node time offsets (e.g., +1s, −1s, +200ms jitter)
  • Run normal workload (transactions/block production)
  • Observe whether blocks are accepted/propagated and the chain stays consistent

Expectations:

  • Blocks with skewed timestamps are handled correctly (accepted or rejected per protocol rules)
  • Chain remains consistent across nodes despite clock differences
  • No unexpected reorgs or chain splits due to timestamp validation issues

Why useful: Clock skew is a common real-world issue in distributed systems. This validates that consensus correctly handles timestamp validation and maintains safety/liveness when nodes have different clock offsets, preventing timestamp-based attacks or failures.

Current blocker: Runner ability to skew per-node clocks (e.g., privileged containers with libfaketime/chrony or time-offset netns) is not available today. Would require a new chaos/time-skew hook in deployers to inject clock offsets per node.

Block Timing Consistency

Concept: Verify block production intervals stay within expected bounds.

Implementation approach:

  • Custom expectation that consumes BlockFeed
  • Collect block timestamps during run
  • Assert intervals are within (slot_duration / active_slot_coeff) ± tolerance

Why useful: Validates consensus timing under various loads.
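
The interval check itself is plain arithmetic; the sketch below shows only that part, operating on a list of block timestamps. It is not wired into the framework's BlockFeed (whose API is not shown here), and the helper name and values are illustrative.

#![allow(unused)]
fn main() {
use std::time::Duration;

// Hypothetical check: every gap between consecutive block timestamps stays within
// the expected interval plus/minus a tolerance.
fn intervals_within_bounds(timestamps: &[Duration], expected: Duration, tolerance: Duration) -> bool {
    timestamps.windows(2).all(|pair| {
        let gap = pair[1].saturating_sub(pair[0]);
        gap >= expected.saturating_sub(tolerance) && gap <= expected + tolerance
    })
}

// With default parameters the mean inter-block interval is roughly
// slot_duration / active_slot_coeff (≈ 2.2s); the tolerance absorbs jitter.
let expected = Duration::from_millis(2_222);
let tolerance = Duration::from_millis(1_500);
let stamps = [
    Duration::from_secs(0),
    Duration::from_secs(2),
    Duration::from_secs(4),
];
assert!(intervals_within_bounds(&stamps, expected, tolerance));
}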

Topology & Membership

Dynamic Topology (Churn) Scenario

Concept: Nodes join and leave mid-run (new identities/addresses added; some nodes permanently removed) to exercise peer discovery, bootstrapping, reputation, and load balancing under churn.

Requirements:

  • Runner must be able to spin up new nodes with fresh keys/addresses at runtime
  • Update peer lists and bootstraps dynamically as nodes join/leave
  • Optionally tear down nodes permanently (not just restart)
  • Run normal workloads (transactions/block production) during churn

Expectations:

  • New nodes successfully discover and join the network
  • Peer discovery mechanisms correctly handle dynamic topology changes
  • Reputation systems adapt to new/removed peers
  • Load balancing adjusts to changing node set
  • Consensus remains safe and live despite topology churn

Why useful: Real networks experience churn (nodes joining/leaving). Unlike restarts (which preserve topology), churn changes the actual topology size and peer set, testing how the protocol handles dynamic membership. This exercises peer discovery, bootstrapping, reputation systems, and load balancing under realistic conditions.

Current blocker: Runner support for dynamic node addition/removal at runtime is not available today. Chaos today only restarts existing nodes; churn would require the ability to spin up new nodes with fresh identities/addresses, update peer lists/bootstraps dynamically, and permanently remove nodes. Would need new topology management capabilities in deployers.

API & External Interfaces

API DoS/Stress Test

Concept: Adversarial workload floods node HTTP/WS APIs with high QPS and malformed/bursty requests; expectation checks nodes remain responsive or rate-limit without harming consensus.

Requirements:

  • Custom workload: Targets node HTTP/WS API endpoints with mixed valid/invalid requests at high rate
  • Custom expectation: Monitors error rates, latency, and confirms block production/liveness unaffected
  • Run alongside normal workloads (transactions/block production)

Expectations:

  • Nodes remain responsive or correctly rate-limit under API flood
  • Error rates/latency are acceptable (rate limiting works)
  • Block production/liveness unaffected by API abuse
  • Consensus continues normally despite API stress

Why useful: Validates API hardening under abuse and ensures control/telemetry endpoints don't destabilize the node. Tests that API abuse is properly isolated from consensus operations, preventing DoS attacks on API endpoints from affecting blockchain functionality.

Implementation notes: Requires custom Workload implementation that directs high-QPS traffic to node APIs (via ctx.node_clients() or direct HTTP clients) and custom Expectation implementation that monitors API responsiveness metrics and consensus liveness. Uses existing node API access—no new infrastructure needed.

State & Correctness

Wallet Balance Verification

Concept: Track wallet balances and verify state consistency.

Description: After transaction workload completes, query all wallet balances via node API and verify total supply is conserved. Requires tracking initial state, submitted transactions, and final balances. Validates that the ledger maintains correctness under load (no funds lost or created). This is a state assertion expectation that checks correctness, not just liveness.
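
The conservation check at the heart of this idea is simple arithmetic; a minimal sketch, independent of the framework's wallet and node APIs (which are not shown here):

#![allow(unused)]
fn main() {
// Hypothetical conservation check: total supply before equals total supply after.
fn supply_conserved(initial_balances: &[u64], final_balances: &[u64]) -> bool {
    let initial: u64 = initial_balances.iter().sum();
    let final_total: u64 = final_balances.iter().sum();
    initial == final_total
}

// Transfers move value between wallets but never create or destroy it.
assert!(supply_conserved(&[100, 50, 0], &[80, 55, 15]));
}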

Running Scenarios

Running a scenario follows the same conceptual flow regardless of environment:

  1. Select or author a scenario plan that pairs a topology with workloads, expectations, and a suitable run window.
  2. Choose a deployer aligned with your environment (local, compose, or k8s) and ensure its prerequisites are available.
  3. Deploy the plan through the deployer, which provisions infrastructure and returns a runner.
  4. The runner orchestrates workload execution for the planned duration; keep observability signals visible so you can correlate outcomes.
  5. The runner evaluates expectations and captures results as the primary pass/fail signal.

Use the same plan across different deployers to compare behavior between local development and CI or cluster settings. For environment prerequisites and flags, see Operations.
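
As an illustration, the sketch below hands one unchanged plan to either the local or the Compose deployer. It reuses only API calls shown in the Examples chapter; the boolean toggle is an arbitrary device for the sketch.

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

async fn run_everywhere(use_compose: bool) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    // One plan, unchanged across environments.
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(2)
                .executors(1)
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(60))
        .build();

    // Only the deployer choice differs between environments.
    if use_compose {
        let runner = ComposeDeployer::default().deploy(&plan).await?;
        let _handle = runner.run(&mut plan).await?;
    } else {
        let runner = LocalDeployer::default().deploy(&plan).await?;
        let _handle = runner.run(&mut plan).await?;
    }

    Ok(())
}
}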

Runners

Runners turn a scenario plan into a live environment while keeping the plan unchanged. Choose based on feedback speed, reproducibility, and fidelity. For environment and operational considerations, see Operations.

Important: All runners require POL_PROOF_DEV_MODE=true to avoid expensive Groth16 proof generation that causes timeouts.

Local runner

  • Launches node processes directly on the host.
  • Fastest feedback loop and minimal orchestration overhead.
  • Best for development-time iteration and debugging.
  • Can run in CI for fast smoke tests.
  • Node control: Not supported (chaos workloads not available)

Docker Compose runner

  • Starts nodes in containers to provide a reproducible multi-node stack on a single machine.
  • Discovers service ports and wires observability for convenient inspection.
  • Good balance between fidelity and ease of setup.
  • Recommended for CI pipelines (isolated environment, reproducible).
  • Node control: Supported (can restart nodes for chaos testing)

Kubernetes runner

  • Deploys nodes onto a cluster for higher-fidelity, longer-running scenarios.
  • Suits CI with cluster access or shared test environments where cluster behavior and scheduling matter.
  • Node control: Not supported yet (chaos workloads not available)

Common expectations

  • All runners require at least one validator and, for transaction scenarios, access to seeded wallets.
  • Readiness probes gate workload start so traffic begins only after nodes are reachable.
  • Environment flags can relax timeouts or increase tracing when diagnostics are needed.
flowchart TD
    Plan[Scenario Plan] --> RunSel{"Runner<br/>(local | compose | k8s)"}
    RunSel --> Provision[Provision & readiness]
    Provision --> Runtime[Runtime + observability]
    Runtime --> Exec[Workloads & Expectations execute]

Operations

Operational readiness focuses on prerequisites, environment fit, and clear signals:

  • Prerequisites: keep a sibling nomos-node checkout available; ensure the chosen runner’s platform needs are met (local binaries for host runs, Docker for compose, cluster access for k8s).
  • Artifacts: DA scenarios require KZG parameters (circuit assets) located at testing-framework/assets/stack/kzgrs_test_params. Fetch them via scripts/setup-nomos-circuits.sh or override the path with NOMOS_KZGRS_PARAMS_PATH.
  • Environment flags: POL_PROOF_DEV_MODE=true is required for all runners (local, compose, k8s); without it, expensive Groth16 proof generation will cause tests to time out. Configure logging via NOMOS_LOG_DIR, NOMOS_LOG_LEVEL, and NOMOS_LOG_FILTER (see Logging and Observability for details). Note that nodes ignore RUST_LOG and only respond to NOMOS_* variables.
  • Readiness checks: verify runners report node readiness before starting workloads; this avoids false negatives from starting too early.
  • Failure triage: map failures to missing prerequisites (wallet seeding, node control availability), runner platform issues, or unmet expectations. Start with liveness signals, then dive into workload-specific assertions.

Treat operational hygiene—assets present, prerequisites satisfied, observability reachable—as the first step to reliable scenario outcomes.

CI Usage

Both LocalDeployer and ComposeDeployer work in CI environments:

LocalDeployer in CI:

  • Faster (no Docker overhead)
  • Good for quick smoke tests
  • Trade-off: Less isolation (processes share host)

ComposeDeployer in CI (recommended):

  • Better isolation (containerized)
  • Reproducible environment
  • Includes Prometheus/observability
  • Trade-off: Slower startup (Docker image build)
  • Trade-off: Requires Docker daemon

See .github/workflows/compose-mixed.yml for a complete CI example using ComposeDeployer.

Running Examples

Local Runner

POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner

Optional environment variables:

  • LOCAL_DEMO_VALIDATORS=3 — Number of validators (default: 1)
  • LOCAL_DEMO_EXECUTORS=2 — Number of executors (default: 1)
  • LOCAL_DEMO_RUN_SECS=120 — Run duration in seconds (default: 60)
  • NOMOS_TESTS_TRACING=true — Enable persistent file logging (required with NOMOS_LOG_DIR)
  • NOMOS_LOG_DIR=/tmp/logs — Directory for per-node log files (only with NOMOS_TESTS_TRACING=true)
  • NOMOS_LOG_LEVEL=debug — Set log level (default: info)
  • NOMOS_LOG_FILTER=consensus=trace,da=debug — Fine-grained module filtering

Note: The default local_runner example includes DA workload, so circuit assets in testing-framework/assets/stack/kzgrs_test_params/ are required (fetch via scripts/setup-nomos-circuits.sh).

Compose Runner

Prerequisites:

  1. Docker daemon running
  2. Circuit assets in testing-framework/assets/stack/kzgrs_test_params (fetched via scripts/setup-nomos-circuits.sh)
  3. Test image built (see below)

Build the test image:

# Fetch circuit assets first
chmod +x scripts/setup-nomos-circuits.sh
scripts/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits
cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/

# Build image (embeds assets)
chmod +x testing-framework/assets/stack/scripts/build_test_image.sh
testing-framework/assets/stack/scripts/build_test_image.sh

Run the example:

NOMOS_TESTNET_IMAGE=nomos-testnet:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner

Required environment variables:

  • NOMOS_TESTNET_IMAGE=nomos-testnet:local — Image tag (must match built image)
  • POL_PROOF_DEV_MODE=true — Critical: without this, proof generation is CPU-intensive and tests will time out

Optional environment variables:

  • COMPOSE_NODE_PAIRS=1x1 — Topology: "validators×executors" (default varies by example)
  • TEST_FRAMEWORK_PROMETHEUS_PORT=9091 — Override Prometheus port (default: 9090)
  • COMPOSE_RUNNER_HOST=127.0.0.1 — Host address for port mappings (default: 127.0.0.1)
  • COMPOSE_RUNNER_PRESERVE=1 — Keep containers running after test (for debugging)
  • NOMOS_LOG_DIR=/tmp/compose-logs — Write logs to files inside containers (requires copy-out or volume mount)
  • NOMOS_LOG_LEVEL=debug — Set log level

Compose-specific features:

  • Node control support: Only runner that supports chaos testing (.enable_node_control() + .chaos() workloads)
  • Prometheus observability: Metrics at http://localhost:9090

Important: Chaos workloads (random restarts) only work with ComposeDeployer. LocalDeployer and K8sDeployer do not support node control.

K8s Runner

Prerequisites:

  1. Kubernetes cluster with kubectl configured and working
  2. Circuit assets in testing-framework/assets/stack/kzgrs_test_params
  3. Test image built (same as Compose: testing-framework/assets/stack/scripts/build_test_image.sh)
  4. Image available in cluster (loaded via kind, minikube, or pushed to registry)
  5. POL_PROOF_DEV_MODE=true environment variable set

Load image into cluster:

# For kind clusters
export NOMOS_TESTNET_IMAGE=nomos-testnet:local
kind load docker-image nomos-testnet:local

# For minikube
minikube image load nomos-testnet:local

# For remote clusters (push to registry)
docker tag nomos-testnet:local your-registry/nomos-testnet:local
docker push your-registry/nomos-testnet:local
export NOMOS_TESTNET_IMAGE=your-registry/nomos-testnet:local

Run the example:

export NOMOS_TESTNET_IMAGE=nomos-testnet:local
export POL_PROOF_DEV_MODE=true
cargo run -p runner-examples --bin k8s_runner

Important:

  • K8s runner mounts testing-framework/assets/stack/kzgrs_test_params as a hostPath volume. Ensure this directory exists and contains circuit assets on the node where pods will be scheduled.
  • No node control support yet: Chaos workloads (.enable_node_control()) will fail. Use ComposeDeployer for chaos testing.

Circuit Assets (KZG Parameters)

DA workloads require KZG cryptographic parameters for polynomial commitment schemes.

Asset Location

Default path: testing-framework/assets/stack/kzgrs_test_params

Override: Set NOMOS_KZGRS_PARAMS_PATH to use a custom location:

NOMOS_KZGRS_PARAMS_PATH=/path/to/custom/params cargo run -p runner-examples --bin local_runner

Getting Circuit Assets

Option 1: Use helper script (recommended):

# From the repository root
chmod +x scripts/setup-nomos-circuits.sh
scripts/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits

# Copy to default location
cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/

Option 2: Build locally (advanced):

# Requires Go, Rust, and circuit build tools
make kzgrs_test_params

CI Workflow

The CI automatically fetches and places assets:

- name: Install circuits for host build
  run: |
    scripts/setup-nomos-circuits.sh v0.3.1 "$TMPDIR/nomos-circuits"
    cp -a "$TMPDIR/nomos-circuits"/. testing-framework/assets/stack/kzgrs_test_params/

When Are Assets Needed?

| Runner | When Required |
|---|---|
| Local | Always (for DA workloads) |
| Compose | During image build (baked into NOMOS_TESTNET_IMAGE) |
| K8s | During image build + deployed to cluster via hostPath volume |

Error without assets:

Error: missing KZG parameters at testing-framework/assets/stack/kzgrs_test_params

Logging and Observability

Node Logging vs Framework Logging

Critical distinction: Node logs and framework logs use different configuration mechanisms.

| Component | Controlled By | Purpose |
|---|---|---|
| Framework binaries (cargo run -p runner-examples --bin local_runner) | RUST_LOG | Runner orchestration, deployment logs |
| Node processes (validators, executors spawned by runner) | NOMOS_LOG_LEVEL, NOMOS_LOG_FILTER, NOMOS_LOG_DIR | Consensus, DA, mempool, network logs |

Common mistake: Setting RUST_LOG=debug only increases verbosity of the runner binary itself. Node logs remain at their default level unless you also set NOMOS_LOG_LEVEL=debug.

Example:

# This only makes the RUNNER verbose, not the nodes:
RUST_LOG=debug cargo run -p runner-examples --bin local_runner

# This makes the NODES verbose:
NOMOS_LOG_LEVEL=debug cargo run -p runner-examples --bin local_runner

# Both verbose (typically not needed):
RUST_LOG=debug NOMOS_LOG_LEVEL=debug cargo run -p runner-examples --bin local_runner

Logging Environment Variables

  • NOMOS_LOG_DIR (default: unset, console only): directory for per-node log files; if unset, logs go to stdout/stderr.
  • NOMOS_LOG_LEVEL (default: info): global log level: error, warn, info, debug, or trace.
  • NOMOS_LOG_FILTER (default: unset): fine-grained target filtering (e.g., consensus=trace,da=debug).
  • NOMOS_TESTS_TRACING (default: false): enables the tracing subscriber for local-runner file logging.
  • NOMOS_OTLP_ENDPOINT (default: unset): OTLP trace endpoint; optional, and leaving it unset avoids OTLP export noise.
  • NOMOS_OTLP_METRICS_ENDPOINT (default: unset): OTLP metrics endpoint (optional).

Example: Full debug logging to files:

NOMOS_TESTS_TRACING=true \
NOMOS_LOG_DIR=/tmp/test-logs \
NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="nomos_consensus=trace,nomos_da_sampling=debug" \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

Per-Node Log Files

When NOMOS_LOG_DIR is set, each node writes logs to separate files:

File naming pattern:

  • Validators: Prefix nomos-node-0, nomos-node-1, etc. (may include timestamp suffix)
  • Executors: Prefix nomos-executor-0, nomos-executor-1, etc. (may include timestamp suffix)

Local runner caveat: By default, the local runner writes logs to temporary directories that are automatically cleaned up after tests complete. To preserve logs, you MUST set both NOMOS_TESTS_TRACING=true AND NOMOS_LOG_DIR=/path/to/logs.

Filter Target Names

Common target prefixes for NOMOS_LOG_FILTER:

  • nomos_consensus: consensus (Cryptarchia)
  • nomos_da_sampling: DA sampling service
  • nomos_da_dispersal: DA dispersal service
  • nomos_da_verifier: DA verification
  • nomos_mempool: transaction mempool
  • nomos_blend: mix network/privacy layer
  • chain_network: P2P networking
  • chain_leader: leader election

Example filter:

NOMOS_LOG_FILTER="nomos_consensus=trace,nomos_da_sampling=debug,chain_network=info"

Accessing Logs Per Runner

Local Runner

Default (temporary directories, auto-cleanup):

POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
# Logs written to temporary directories
# Automatically cleaned up after test completes

Persistent file output:

NOMOS_TESTS_TRACING=true \
NOMOS_LOG_DIR=/tmp/local-logs \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

# After test completes:
ls /tmp/local-logs/
# Files with prefix: nomos-node-0*, nomos-node-1*, nomos-executor-0*
# May include timestamps in filename

Both flags required: You MUST set both NOMOS_TESTS_TRACING=true (enables tracing file sink) AND NOMOS_LOG_DIR (specifies directory) to get persistent logs.

Compose Runner

Via Docker logs (default, recommended):

# List containers (note the UUID prefix in names)
docker ps --filter "name=nomos-compose-"

# Stream logs from specific container
docker logs -f <container-id-or-name>

# Or use name pattern matching:
docker logs -f $(docker ps --filter "name=nomos-compose-.*-validator-0" -q | head -1)

Via file collection (advanced):

Setting NOMOS_LOG_DIR writes files inside the container. To access them, you must either:

  1. Copy files out after the run:
NOMOS_LOG_DIR=/logs \
NOMOS_TESTNET_IMAGE=nomos-testnet:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner

# After test, copy files from containers:
docker ps --filter "name=nomos-compose-"
docker cp <container-id>:/logs/nomos-node-0* /tmp/
  2. Mount a host volume (requires modifying the compose template):
volumes:
  - /tmp/host-logs:/logs  # Add to docker-compose.yml.tera

Recommendation: Use docker logs by default. File collection inside containers is complex and rarely needed.

Keep containers for debugging:

COMPOSE_RUNNER_PRESERVE=1 \
NOMOS_TESTNET_IMAGE=nomos-testnet:local \
cargo run -p runner-examples --bin compose_runner
# Containers remain running after test—inspect with docker logs or docker exec

Note: Container names follow pattern nomos-compose-{uuid}-validator-{index}-1 where {uuid} changes per run.

K8s Runner

Via kubectl logs (use label selectors):

# List pods
kubectl get pods

# Stream logs using label selectors (recommended)
kubectl logs -l app=nomos-validator -f
kubectl logs -l app=nomos-executor -f

# Stream logs from specific pod
kubectl logs -f nomos-validator-0

# Previous logs from crashed pods
kubectl logs --previous -l app=nomos-validator

Download logs for offline analysis:

# Using label selectors
kubectl logs -l app=nomos-validator --tail=1000 > all-validators.log
kubectl logs -l app=nomos-executor --tail=1000 > all-executors.log

# Specific pods
kubectl logs nomos-validator-0 > validator-0.log
kubectl logs nomos-executor-1 > executor-1.log

Specify namespace (if not using default):

kubectl logs -n my-namespace -l app=nomos-validator -f

OTLP and Telemetry

OTLP exporters are optional. If you see errors about unreachable OTLP endpoints, it's safe to ignore them unless you're actively collecting traces/metrics.

To enable OTLP:

NOMOS_OTLP_ENDPOINT=http://localhost:4317 \
NOMOS_OTLP_METRICS_ENDPOINT=http://localhost:4318 \
cargo run -p runner-examples --bin local_runner

To silence OTLP errors: Simply leave these variables unset (the default).

Observability: Prometheus and Node APIs

Runners expose metrics and node HTTP endpoints for expectation code and debugging:

Prometheus (Compose only):

  • Default: http://localhost:9090
  • Override: TEST_FRAMEWORK_PROMETHEUS_PORT=9091
  • Access from expectations: ctx.telemetry().prometheus_endpoint()

Node APIs:

  • Access from expectations: ctx.node_clients().validators().get(0)
  • Endpoints: consensus info, network info, DA membership, etc.
  • See testing-framework/core/src/nodes/api_client.rs for available methods
flowchart TD
    Expose[Runner exposes endpoints/ports] --> Collect[Runtime collects block/health signals]
    Collect --> Consume[Expectations consume signals<br/>decide pass/fail]
    Consume --> Inspect[Operators inspect logs/metrics<br/>when failures arise]
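
To make this concrete, here is a minimal sketch of an expectation that consumes those endpoints. The consensus_info() call is a hypothetical placeholder; consult testing-framework/core/src/nodes/api_client.rs for the methods that actually exist:

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

pub struct MetricsReachableExpectation;

#[async_trait]
impl Expectation for MetricsReachableExpectation {
    fn name(&self) -> &str {
        "metrics_reachable"
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        // Prometheus endpoint exposed by the Compose runner (see above); query it
        // with your preferred HTTP client if the expectation needs raw metrics.
        let _prometheus = ctx.telemetry().prometheus_endpoint();

        // First validator's HTTP API client, as exposed to expectation code.
        let validator = ctx
            .node_clients()
            .validators()
            .get(0)
            .ok_or("no validator client available")?;

        // Hypothetical call for illustration only.
        let _info = validator.consensus_info().await?;
        Ok(())
    }
}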

Part III — Developer Reference

Deep dives for contributors who extend the framework, evolve its abstractions, or maintain the crate set.

Scenario Model (Developer Level)

The scenario model defines clear, composable responsibilities:

  • Topology: a declarative description of the cluster—how many nodes, their roles, and the broad network and data-availability characteristics. It represents the intended shape of the system under test.
  • Scenario: a plan combining topology, workloads, expectations, and a run window. Building a scenario validates prerequisites (like seeded wallets) and ensures the run lasts long enough to observe meaningful block progression.
  • Workloads: asynchronous tasks that generate traffic or conditions. They use shared context to interact with the deployed cluster and may bundle default expectations.
  • Expectations: post-run assertions. They can capture baselines before workloads start and evaluate success once activity stops.
  • Runtime: coordinates workloads and expectations for the configured duration, enforces cooldowns when control actions occur, and ensures cleanup so runs do not leak resources.

Developers extending the model should keep these boundaries strict: topology describes, scenarios assemble, deployers provision, runners orchestrate, workloads drive, and expectations judge outcomes. For guidance on adding new capabilities, see Extending the Framework.

Extending the Framework

Adding a workload

  1. Implement testing_framework_core::scenario::Workload:
    • Provide a name and any bundled expectations.
    • In init, derive inputs from GeneratedTopology and RunMetrics; fail fast if prerequisites are missing (e.g., wallet data, node addresses).
    • In start, drive async traffic using the RunContext clients.
  2. Expose the workload from a module under testing-framework/workflows and consider adding a DSL helper for ergonomic wiring.

Adding an expectation

  1. Implement testing_framework_core::scenario::Expectation:
    • Use start_capture to snapshot baseline metrics.
    • Use evaluate to assert outcomes after workloads finish; return all errors so the runner can aggregate them.
  2. Export it from testing-framework/workflows if it is reusable.

Adding a runner

  1. Implement testing_framework_core::scenario::Deployer for your backend.
    • Produce a RunContext with NodeClients, metrics endpoints, and optional NodeControlHandle.
    • Guard cleanup with CleanupGuard to reclaim resources even on failures.
  2. Mirror the readiness and block-feed probes used by the existing runners so workloads can rely on consistent signals.

Adding topology helpers

  • Extend testing_framework_core::topology::TopologyBuilder with new layouts or configuration presets (e.g., specialized DA parameters). Keep defaults safe: ensure at least one participant and clamp dispersal factors as the current helpers do.
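
For illustration, such a preset can also be expressed one level up as a small helper that wraps the closure-based DSL used throughout this book; the helper name and return type below are assumptions, not part of the current API:

use testing_framework_core::scenario::ScenarioBuilder;

// Hypothetical preset: a small star cluster reused across scenarios.
// The concrete type returned by topology_with is assumed here; adjust it to
// whatever the builder yields in your version of the framework.
fn small_star_topology() -> ScenarioBuilder {
    ScenarioBuilder::topology_with(|t| {
        t.network_star()      // all nodes connect to the seed node
            .validators(3)    // keep at least one participant
            .executors(1)
    })
}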

Example: New Workload & Expectation (Rust)

A minimal, end-to-end illustration of adding a custom workload and matching expectation. This shows the shape of the traits and where to plug into the framework; expand the logic to fit your real test.

Workload: simple reachability probe

Key ideas:

  • name: identifies the workload in logs.
  • expectations: workloads can bundle defaults so callers don’t forget checks.
  • init: derive inputs from the generated topology (e.g., pick a target node).
  • start: drive async activity using the shared RunContext.
#![allow(unused)]
fn main() {
use std::sync::Arc;
use async_trait::async_trait;
use testing_framework_core::scenario::{
    DynError, Expectation, RunContext, RunMetrics, Workload,
};
use testing_framework_core::topology::GeneratedTopology;

pub struct ReachabilityWorkload {
    target_idx: usize,
    bundled: Vec<Box<dyn Expectation>>,
}

impl ReachabilityWorkload {
    pub fn new(target_idx: usize) -> Self {
        Self {
            target_idx,
            bundled: vec![Box::new(ReachabilityExpectation::new(target_idx))],
        }
    }
}

#[async_trait]
impl Workload for ReachabilityWorkload {
    fn name(&self) -> &'static str {
        "reachability_workload"
    }

    fn expectations(&self) -> Vec<Box<dyn Expectation>> {
        self.bundled.clone()
    }

    fn init(
        &mut self,
        topology: &GeneratedTopology,
        _metrics: &RunMetrics,
    ) -> Result<(), DynError> {
        if topology.validators().get(self.target_idx).is_none() {
            return Err("no validator at requested index".into());
        }
        Ok(())
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .clients()
            .validators()
            .get(self.target_idx)
            .ok_or("missing target client")?;

        // Pseudo-action: issue a lightweight RPC to prove reachability.
        client.health_check().await.map_err(|e| e.into())
    }
}
}

Expectation: confirm the target stayed reachable

Key ideas:

  • start_capture: snapshot baseline if needed (not used here).
  • evaluate: assert the condition after workloads finish.
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

pub struct ReachabilityExpectation {
    target_idx: usize,
}

impl ReachabilityExpectation {
    pub fn new(target_idx: usize) -> Self {
        Self { target_idx }
    }
}

#[async_trait]
impl Expectation for ReachabilityExpectation {
    fn name(&self) -> &str {
        "target_reachable"
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .clients()
            .validators()
            .get(self.target_idx)
            .ok_or("missing target client")?;

        client.health_check().await.map_err(|e| {
            format!("target became unreachable during run: {e}").into()
        })
    }
}
}

How to wire it

  • Build your scenario as usual and call .with_workload(ReachabilityWorkload::new(0)).
  • The bundled expectation is attached automatically; you can add more with .with_expectation(...) if needed.
  • Keep the logic minimal and fast for smoke tests; grow it into richer probes for deeper scenarios.
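
Putting it together, a wiring sketch (it assumes ReachabilityWorkload from above is in scope and uses the builder and deployer calls shown elsewhere in this book):

use std::time::Duration;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;

async fn run_reachability_smoke() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(2)
                .executors(0)
        })
        // Attach the custom workload; its bundled expectation is included automatically.
        .with_workload(ReachabilityWorkload::new(0))
        .with_run_duration(Duration::from_secs(60))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    Ok(())
}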

Internal Crate Reference

High-level roles of the crates that make up the framework:

  • Configs (testing-framework/configs/): Prepares reusable configuration primitives for nodes, networking, tracing, data availability, and wallets, shared by all scenarios and runners. Includes topology generation and circuit asset resolution.

  • Core scenario orchestration (testing-framework/core/): Houses the topology and scenario model, runtime coordination, node clients, and readiness/health probes. Defines Deployer and Runner traits, ScenarioBuilder, and RunContext.

  • Workflows (testing-framework/workflows/): Packages workloads (transaction, DA, chaos) and expectations (consensus liveness) into reusable building blocks. Offers fluent DSL extensions (ScenarioBuilderExt, ChaosBuilderExt).

  • Runners (testing-framework/runners/{local,compose,k8s}/): Implements deployment backends (local host, Docker Compose, Kubernetes) that all consume the same scenario plan. Each provides a Deployer implementation (LocalDeployer, ComposeDeployer, K8sDeployer).

  • Runner Examples (examples/runner-examples): Runnable binaries demonstrating framework usage and serving as living documentation. These are the primary entry point for running scenarios (local_runner.rs, compose_runner.rs, k8s_runner.rs).

Where to Add New Capabilities

  • Node config parameter → testing-framework/configs/src/topology/configs/ (e.g., slot duration, log levels, DA params)
  • Topology feature → testing-framework/core/src/topology/ (new network layouts, node roles)
  • Scenario capability → testing-framework/core/src/scenario/ (new capabilities, context methods)
  • Workload → testing-framework/workflows/src/workloads/ (new traffic generators)
  • Expectation → testing-framework/workflows/src/expectations/ (new success criteria)
  • Builder API → testing-framework/workflows/src/builder/ (DSL extensions, fluent methods)
  • Deployer → testing-framework/runners/ (new deployment backends)
  • Example scenario → examples/src/bin/ (demonstration binaries)

Extension Workflow

Adding a New Workload

  1. Define the workload in testing-framework/workflows/src/workloads/your_workload.rs:

    #![allow(unused)]
    fn main() {
    use async_trait::async_trait;
    use testing_framework_core::scenario::{Workload, RunContext, DynError};
    
    pub struct YourWorkload {
        // config fields
    }
    
    #[async_trait]
    impl Workload for YourWorkload {
        fn name(&self) -> &'static str { "your_workload" }
        async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
            // implementation
            Ok(())
        }
    }
    }
  2. Add builder extension in testing-framework/workflows/src/builder/mod.rs:

    #![allow(unused)]
    fn main() {
    pub trait ScenarioBuilderExt {
        // Closure-based variant, matching the `_with` naming used by the built-in DSL helpers
        fn your_workload_with(self, configure: impl FnOnce(YourWorkloadBuilder) -> YourWorkloadBuilder) -> Self;
    }
    }
  3. Use in examples in examples/src/bin/your_scenario.rs:

    #![allow(unused)]
    fn main() {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(3)
                .executors(0)
        })
        .your_workload_with(|w| {  // Your new DSL method with closure
            w.some_config()
        })
        .build();
    }

Adding a New Expectation

  1. Define the expectation in testing-framework/workflows/src/expectations/your_expectation.rs:

    #![allow(unused)]
    fn main() {
    use async_trait::async_trait;
    use testing_framework_core::scenario::{Expectation, RunContext, DynError};
    
    pub struct YourExpectation {
        // config fields
    }
    
    #[async_trait]
    impl Expectation for YourExpectation {
        fn name(&self) -> &str { "your_expectation" }
        async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
            // implementation
            Ok(())
        }
    }
    }
  2. Add builder extension in testing-framework/workflows/src/builder/mod.rs:

    #![allow(unused)]
    fn main() {
    pub trait ScenarioBuilderExt {
        fn expect_your_condition(self) -> Self;
    }
    }

Adding a New Deployer

  1. Implement Deployer trait in testing-framework/runners/your_runner/src/deployer.rs:

    #![allow(unused)]
    fn main() {
    use async_trait::async_trait;
    use testing_framework_core::scenario::{Deployer, Runner, Scenario};
    
    pub struct YourDeployer;
    
    #[async_trait]
    impl Deployer for YourDeployer {
        type Error = YourError;
        
        async fn deploy(&self, scenario: &Scenario) -> Result<Runner, Self::Error> {
            // Provision infrastructure
            // Wait for readiness
            // Return Runner
        }
    }
    }
  2. Provide cleanup and handle node control if supported.

  3. Add example in examples/src/bin/your_runner.rs.

For detailed examples, see Extending the Framework and Custom Workload Example.

Part IV — Appendix

Quick-reference material and supporting guidance to keep scenarios discoverable, debuggable, and consistent.

Builder API Quick Reference

Quick reference for the scenario builder DSL. All methods are chainable.

Imports

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_runner_k8s::K8sDeployer;
use testing_framework_workflows::{ScenarioBuilderExt, ChaosBuilderExt};
use std::time::Duration;
}

Topology

#![allow(unused)]
fn main() {
ScenarioBuilder::topology_with(|t| {
        t.network_star()      // Star topology (all connect to seed node)
            .validators(3)    // Number of validator nodes
            .executors(2)     // Number of executor nodes
    })                        // Finish topology configuration
}

Wallets

#![allow(unused)]
fn main() {
.wallets(50)                 // Seed 50 funded wallet accounts
}

Transaction Workload

#![allow(unused)]
fn main() {
.transactions_with(|txs| {
    txs.rate(5)              // 5 transactions per block
        .users(20)           // Use 20 of the seeded wallets
})                           // Finish transaction workload config
}

DA Workload

#![allow(unused)]
fn main() {
.da_with(|da| {
    da.channel_rate(1)       // 1 channel operation per block
        .blob_rate(2)        // 2 blob dispersals per block
})                           // Finish DA workload config
}

Chaos Workload (Requires enable_node_control())

#![allow(unused)]
fn main() {
.enable_node_control()       // Enable node control capability
.chaos_with(|c| {
    c.restart()              // Random restart chaos
        .min_delay(Duration::from_secs(30))     // Min time between restarts
        .max_delay(Duration::from_secs(60))     // Max time between restarts
        .target_cooldown(Duration::from_secs(45))  // Cooldown after restart
        .apply()             // Required for chaos configuration
})
}

Expectations

#![allow(unused)]
fn main() {
.expect_consensus_liveness() // Assert blocks are produced continuously
}

Run Duration

#![allow(unused)]
fn main() {
.with_run_duration(Duration::from_secs(120))  // Run for 120 seconds
}

Build

#![allow(unused)]
fn main() {
.build()                     // Construct the final Scenario
}

Deployers

#![allow(unused)]
fn main() {
// Local processes
let deployer = LocalDeployer::default();

// Docker Compose
let deployer = ComposeDeployer::default();

// Kubernetes
let deployer = K8sDeployer::default();
}

Execution

#![allow(unused)]
fn main() {
let runner = deployer.deploy(&plan).await?;
let _handle = runner.run(&mut plan).await?;
}

Complete Example

#![allow(unused)]
fn main() {
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;
use std::time::Duration;

async fn run_test() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(3)
                .executors(2)
        })
        .wallets(50)
        .transactions_with(|txs| {
            txs.rate(5)                     // 5 transactions per block
                .users(20)
        })
        .da_with(|da| {
            da.channel_rate(1)             // 1 channel operation per block
                .blob_rate(2)              // 2 blob dispersals per block
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(90))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    
    Ok(())
}
}

Troubleshooting Scenarios

Prerequisites for All Runners:

  • POL_PROOF_DEV_MODE=true MUST be set for all runners (local, compose, k8s) to avoid expensive Groth16 proof generation that causes timeouts
  • KZG circuit assets must be present at testing-framework/assets/stack/kzgrs_test_params/ for DA workloads (fetch via scripts/setup-nomos-circuits.sh)

Quick Symptom Guide

Common symptoms and likely causes:

  • No or slow block progression: missing POL_PROOF_DEV_MODE=true, missing KZG circuit assets for DA workloads, too-short run window, port conflicts, or resource exhaustion—set required env vars, verify assets, extend duration, check node logs for startup errors.
  • Transactions not included: unfunded or misconfigured wallets (check .wallets(N) vs .users(M)) or a transaction rate that exceeds block capacity or production speed—reduce the rate, increase the wallet count, and verify wallet setup in the logs.
  • Chaos stalls the run: chaos (node control) only works with ComposeDeployer; LocalDeployer and K8sDeployer don't support it, so chaos workloads simply cannot execute there (the run itself does not stall). With Compose, an aggressive restart cadence can prevent consensus recovery—widen the restart intervals.
  • Observability gaps: metrics or logs unreachable because ports clash or services are not exposed—adjust observability ports and confirm runner wiring.
  • Flaky behavior across runs: mixing chaos with functional smoke tests or inconsistent topology between environments—separate deterministic and chaos scenarios and standardize topology presets.

Where to Find Logs

Log Location Quick Reference

  • Local: default output goes to temporary directories (cleaned up after the run). With NOMOS_TESTS_TRACING=true and NOMOS_LOG_DIR set, per-node files with prefix nomos-node-{index} are written; access them with cat $NOMOS_LOG_DIR/nomos-node-0*.
  • Compose: default output is the Docker containers' stdout/stderr. Per-node files exist inside containers only if the path is mounted; use docker ps, then docker logs <container-id>.
  • K8s: default output is pod stdout/stderr. Per-node files exist inside pods only if the path is mounted; use kubectl logs -l app=nomos-validator.

Important Notes:

  • Local runner: Logs go to system temporary directories (NOT in working directory) by default and are automatically cleaned up after tests. To persist logs, you MUST set both NOMOS_TESTS_TRACING=true AND NOMOS_LOG_DIR=/path/to/logs.
  • Compose/K8s: Per-node log files only exist inside containers/pods if NOMOS_LOG_DIR is set AND the path is writable inside the container/pod. By default, rely on docker logs or kubectl logs.
  • File naming: Log files use the prefix nomos-node-{index} or nomos-executor-{index} followed by a timestamp, e.g., nomos-node-0.2024-12-01T10-30-45.log; match them with a wildcard rather than expecting a bare nomos-node-0.log.
  • Container names: Compose containers include the per-run project UUID, e.g., nomos-compose-<uuid>-validator-0-1, where <uuid> is randomly generated per run.

Accessing Node Logs by Runner

Local Runner

Console output (default):

POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner 2>&1 | tee test.log

Persistent file output:

NOMOS_TESTS_TRACING=true \
NOMOS_LOG_DIR=/tmp/debug-logs \
NOMOS_LOG_LEVEL=debug \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

# Inspect logs (note: filenames include timestamps):
ls /tmp/debug-logs/
# Example: nomos-node-0.2024-12-01T10-30-45.log
tail -f /tmp/debug-logs/nomos-node-0*  # Use wildcard to match timestamp

Compose Runner

Stream live logs:

# List running containers (note the UUID prefix in names)
docker ps --filter "name=nomos-compose-"

# Find your container ID or name from the list, then:
docker logs -f <container-id>

# Or filter by name pattern:
docker logs -f $(docker ps --filter "name=nomos-compose-.*-validator-0" -q | head -1)

# Show last 100 lines
docker logs --tail 100 <container-id>

Keep containers for post-mortem debugging:

COMPOSE_RUNNER_PRESERVE=1 \
NOMOS_TESTNET_IMAGE=nomos-testnet:local \
cargo run -p runner-examples --bin compose_runner

# After test failure, containers remain running:
docker ps --filter "name=nomos-compose-"
docker exec -it <container-id> /bin/sh
docker logs <container-id> > debug.log

Note: Container names follow the pattern nomos-compose-{uuid}-validator-{index}-1 or nomos-compose-{uuid}-executor-{index}-1, where {uuid} is randomly generated per run.

K8s Runner

Important: Always verify your namespace and use label selectors instead of assuming pod names.

Stream pod logs (use label selectors):

# Check your namespace first
kubectl config view --minify | grep namespace

# All validator pods (add -n <namespace> if not using default)
kubectl logs -l app=nomos-validator -f

# All executor pods
kubectl logs -l app=nomos-executor -f

# Specific pod by name (find exact name first)
kubectl get pods -l app=nomos-validator  # Find the exact pod name
kubectl logs -f <actual-pod-name>        # Then use it

# With explicit namespace
kubectl logs -n my-namespace -l app=nomos-validator -f

Download logs from crashed pods:

# Previous logs from crashed pod
kubectl get pods -l app=nomos-validator  # Find crashed pod name first
kubectl logs --previous <actual-pod-name> > crashed-validator.log

# Or use label selector for all crashed validators
for pod in $(kubectl get pods -l app=nomos-validator -o name); do
  kubectl logs --previous $pod > $(basename $pod)-previous.log 2>&1
done

Access logs from all pods:

# All pods in current namespace
for pod in $(kubectl get pods -o name); do
  echo "=== $pod ==="
  kubectl logs $pod
done > all-logs.txt

# Or use label selectors (recommended)
kubectl logs -l app=nomos-validator --tail=500 > validators.log
kubectl logs -l app=nomos-executor --tail=500 > executors.log

# With explicit namespace
kubectl logs -n my-namespace -l app=nomos-validator --tail=500 > validators.log

Debugging Workflow

When a test fails, follow this sequence:

1. Check Framework Output

Start with the test harness output—did expectations fail? Was there a deployment error?

Look for:

  • Expectation failure messages
  • Timeout errors
  • Deployment/readiness failures

2. Verify Node Readiness

Ensure all nodes started successfully and became ready before workloads began.

Commands:

# Local: check process list
ps aux | grep nomos

# Compose: check container status (note UUID in names)
docker ps -a --filter "name=nomos-compose-"

# K8s: check pod status (use label selectors, add -n <namespace> if needed)
kubectl get pods -l app=nomos-validator
kubectl get pods -l app=nomos-executor
kubectl describe pod <actual-pod-name>  # Get name from above first

3. Inspect Node Logs

Focus on the first node that exhibited problems or the node with the highest index (often the last to start).

Common error patterns:

  • "Failed to bind address" → port conflict
  • "Connection refused" → peer not ready or network issue
  • "Proof verification failed" or "Proof generation timeout" → missing POL_PROOF_DEV_MODE=true (REQUIRED for all runners)
  • "Failed to load KZG parameters" or "Circuit file not found" → missing KZG circuit assets at testing-framework/assets/stack/kzgrs_test_params/
  • "Insufficient funds" → wallet seeding issue (increase .wallets(N) or reduce .users(M))

4. Check Log Levels

If logs are too sparse, increase verbosity:

NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="nomos_consensus=trace,nomos_da_sampling=debug" \
cargo run -p runner-examples --bin local_runner

5. Verify Observability Endpoints

If expectations report observability issues:

Prometheus (Compose):

curl http://localhost:9090/-/healthy

Node HTTP APIs:

curl http://localhost:18080/consensus/info  # Adjust port per node

6. Compare with Known-Good Scenario

Run a minimal baseline test (e.g., 2 validators, consensus liveness only). If it passes, the issue is in your workload or topology configuration.
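
A minimal baseline sketch, using the same builder calls as the appendix quick reference:

use std::time::Duration;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

async fn baseline_liveness() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    // Two validators, no executors, no workloads: only consensus liveness is checked.
    let mut plan = ScenarioBuilder::topology_with(|t| {
            t.network_star()
                .validators(2)
                .executors(0)
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(60))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;
    Ok(())
}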

Common Error Messages

"Consensus liveness expectation failed"

  • Cause: Not enough blocks produced during run window, missing POL_PROOF_DEV_MODE=true (causes slow proof generation), or missing KZG assets for DA workloads
  • Fix:
    1. Verify POL_PROOF_DEV_MODE=true is set (REQUIRED for all runners)
    2. Verify KZG assets exist at testing-framework/assets/stack/kzgrs_test_params/ (for DA workloads)
    3. Extend with_run_duration() to allow more blocks
    4. Check node logs for proof generation or DA errors
    5. Reduce transaction/DA rate if nodes are overwhelmed

"Wallet seeding failed"

  • Cause: Topology doesn't have enough funded wallets for the workload
  • Fix: Increase .wallets(N) count or reduce .users(M) in transaction workload (ensure N ≥ M)

"Node control not available"

  • Cause: Runner doesn't support node control (only ComposeDeployer does), or enable_node_control() wasn't called
  • Fix:
    1. Use ComposeDeployer for chaos tests (LocalDeployer and K8sDeployer don't support node control)
    2. Ensure .enable_node_control() is called in scenario before .chaos()

"Readiness timeout"

  • Cause: Nodes didn't become responsive within expected time (often due to missing prerequisites)
  • Fix:
    1. Verify POL_PROOF_DEV_MODE=true is set (REQUIRED for all runners—without it, proof generation is too slow)
    2. Check node logs for startup errors (port conflicts, missing assets)
    3. Verify network connectivity between nodes
    4. For DA workloads, ensure KZG circuit assets are present

"Port already in use"

  • Cause: Previous test didn't clean up, or another process holds the port
  • Fix: Kill orphaned processes (pkill nomos-node), wait for Docker cleanup (docker compose down), or restart Docker

"Image not found: nomos-testnet:local"

  • Cause: Docker image not built for Compose/K8s runners, or KZG assets not baked into image
  • Fix:
    1. Fetch KZG assets: scripts/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits
    2. Copy to assets: cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/
    3. Build image: testing-framework/assets/stack/scripts/build_test_image.sh

"Failed to load KZG parameters" or "Circuit file not found"

  • Cause: DA workload requires KZG circuit assets that aren't present
  • Fix:
    1. Fetch assets: scripts/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits
    2. Copy to expected path: cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/
    3. For Compose/K8s: rebuild image with assets baked in

For detailed logging configuration and observability setup, see Operations.

FAQ

Why block-oriented timing?
Slots advance at a fixed rate (NTP-synchronized, 2s by default), so reasoning about blocks and consensus intervals keeps assertions aligned with protocol behavior rather than arbitrary wall-clock durations.

Can I reuse the same scenario across runners?
Yes. The plan stays the same; swap runners (local, compose, k8s) to target different environments.

When should I enable chaos workloads?
Only when testing resilience or operational recovery; keep functional smoke tests deterministic.

How long should runs be?
The framework enforces a minimum of 2× the slot duration (4 seconds with the default 2s slots), but the practical recommendations are:

  • Smoke tests: 30s minimum (~14 blocks with default 2s slots, 0.9 coefficient)
  • Transaction workloads: 60s+ (~27 blocks) to observe inclusion patterns
  • DA workloads: 90s+ (~40 blocks) to account for dispersal and sampling
  • Chaos tests: 120s+ (~54 blocks) to allow recovery after restarts

Very short runs (< 30s) risk false confidence—one or two lucky blocks don't prove liveness.
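
Those block counts come from a simple estimate; here is a sketch, assuming the default 2s slots and the 0.9 active-slot coefficient mentioned above:

/// Rough expected block count: run duration divided by slot duration,
/// scaled by the active-slot coefficient (not every slot yields a block).
fn expected_blocks(run_secs: f64, slot_secs: f64, active_slot_coefficient: f64) -> f64 {
    run_secs / slot_secs * active_slot_coefficient
}

fn main() {
    // A 30 s smoke test: 30 / 2 * 0.9 = 13.5, i.e. roughly 14 blocks.
    println!("{:.1}", expected_blocks(30.0, 2.0, 0.9));
}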

Do I always need seeded wallets?
Only for transaction scenarios. Data-availability or pure chaos scenarios may not require them, but liveness checks still need validators producing blocks.

What if expectations fail but workloads “look fine”?
Trust expectations first—they capture the intended success criteria. Use the observability signals and runner logs to pinpoint why the system missed the target.

Glossary

  • Validator: node role responsible for participating in consensus and block production.
  • Executor: a validator node with the DA dispersal service enabled. Executors can submit transactions and disperse blob data to the DA network, in addition to performing all validator functions.
  • DA (Data Availability): subsystem ensuring blobs or channel data are published and retrievable for validation.
  • Deployer: component that provisions infrastructure (spawns processes, creates containers, or launches pods), waits for readiness, and returns a Runner. Examples: LocalDeployer, ComposeDeployer, K8sDeployer.
  • Runner: component returned by deployers that orchestrates scenario execution—starts workloads, observes signals, evaluates expectations, and triggers cleanup.
  • Workload: traffic or behavior generator that exercises the system during a scenario run.
  • Expectation: post-run assertion that judges whether the system met the intended success criteria.
  • Topology: declarative description of the cluster shape, roles, and high-level parameters for a scenario.
  • Scenario: immutable plan combining topology, workloads, expectations, and run duration.
  • Blockfeed: stream of block observations used for liveness or inclusion signals during a run.
  • Control capability: the ability for a runner to start, stop, or restart nodes, used by chaos workloads.
  • Slot duration: time interval between consensus rounds in Cryptarchia. Blocks are produced at multiples of the slot duration based on lottery outcomes.
  • Block cadence: observed rate of block production in a live network, measured in blocks per second or seconds per block.
  • Cooldown: waiting period after a chaos action (e.g., node restart) before triggering the next action, allowing the system to stabilize.
  • Run window: total duration a scenario executes, specified via with_run_duration(). Framework auto-extends to at least 2× slot duration.
  • Readiness probe: health check performed by runners to ensure nodes are reachable and responsive before starting workloads. Prevents false negatives from premature traffic.
  • Liveness: property that the system continues making progress (producing blocks) under specified conditions. Contrasts with safety/correctness which verifies that state transitions are accurate.
  • State assertion: expectation that verifies specific values in the system state (e.g., wallet balances, UTXO sets) rather than just progress signals. Also called "correctness expectations."
  • Mantle transaction: transaction type in Nomos that can contain UTXO transfers (LedgerTx) and operations (Op), including channel data (ChannelBlob).
  • Channel: logical grouping for DA blobs; each blob belongs to a channel and references a parent blob in the same channel, creating a chain of related data.
  • POL_PROOF_DEV_MODE: environment variable that disables expensive Groth16 zero-knowledge proof generation for leader election. Required for all runners (local, compose, k8s) for practical testing—without it, proof generation causes timeouts. Should never be used in production environments.