Release Notes — v0.4.1

v0.4.1 is the current released tag for the 0.4.x line.

Release Context

The CI publish workflow auto-bumps the patch version on every push to main. As a result, the feature work summarized below landed just before the automated v0.3.6 publish, so it is already part of v0.3.6 and carries forward unchanged into v0.4.1.

If you compare Git tags strictly:

  • v0.3.6 -> v0.4.1 changes only version metadata

If you want the user-facing capabilities currently available on the 0.4.x line, the items below are the meaningful ones.

Highlights

Scenario-backed environment benchmarks

Vauban now ships named indirect prompt-injection benchmarks for the environment harness:

  • data_exfil
  • fedex_phishing
  • garage_door
  • ignore_email
  • salesforce_admin
  • share_doc

These scenarios are fixed EnvironmentConfig snapshots designed for reproducible agent-security benchmarking.

TOML-native environment scenarios

Environment scenarios are now first-class TOML configuration, not just a library feature.

You can scaffold a benchmark directly:

vauban init --scenario share_doc --output share_doc.toml

Or declare it directly in TOML:

[environment]
scenario = "share_doc"
max_turns = 5

Explicit TOML fields override the scenario defaults, so you can keep the benchmark shape while customizing rollout parameters.
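The override rule can be pictured as a dictionary merge in which scenario defaults form the base and explicitly set fields replace them. The following is a minimal sketch of that idea, not Vauban's actual loader: the default values, the `some_other_field` key, and the `resolve_environment` helper are all hypothetical, and the real share_doc snapshot may define different fields.

```python
# Hypothetical scenario defaults, for illustration only.
# The real share_doc snapshot shipped with Vauban may use other fields and values.
SCENARIO_DEFAULTS = {
    "share_doc": {"max_turns": 3, "some_other_field": "default"},
}

# Mirrors the [environment] table shown above:
#   scenario = "share_doc"
#   max_turns = 5
EXPLICIT_FIELDS = {"scenario": "share_doc", "max_turns": 5}


def resolve_environment(explicit: dict) -> dict:
    """Overlay explicitly set fields on top of the named scenario's defaults."""
    defaults = SCENARIO_DEFAULTS[explicit["scenario"]]
    # Later keys win in a dict merge, so explicit fields replace the defaults.
    return {**defaults, **explicit}


resolved = resolve_environment(EXPLICIT_FIELDS)
print(resolved)
```

In this sketch max_turns takes the explicit value while the untouched hypothetical field keeps its default, which is the behavior the paragraph above describes.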

The repository also ships canonical checked-in examples under examples/benchmarks/ for direct validation and repeatable runs.
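To scaffold every named scenario in one pass, the `vauban init` invocation shown earlier can be generated per scenario. This is a sketch under the assumption that each listed scenario name is accepted by `--scenario` the same way share_doc is; the `init_command` helper and the `<name>.toml` output naming are illustrative choices, not part of Vauban.

```python
import shlex

# Scenario names as listed in these release notes.
SCENARIOS = [
    "data_exfil",
    "fedex_phishing",
    "garage_door",
    "ignore_email",
    "salesforce_admin",
    "share_doc",
]


def init_command(scenario: str) -> list[str]:
    """Build the `vauban init` argument list for one scenario (hypothetical helper)."""
    return ["vauban", "init", "--scenario", scenario, "--output", f"{scenario}.toml"]


commands = [init_command(name) for name in SCENARIOS]
for argv in commands:
    # Printed rather than executed, so the sketch runs without Vauban installed.
    print(shlex.join(argv))
```

To actually run the commands, each argument list could be passed to `subprocess.run(argv, check=True)` in an environment where the `vauban` CLI is installed.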

More realistic flywheel skeletons

The flywheel world generator now includes three additional skeleton domains:

  • home_assistant
  • drive_share
  • landing_review

These bring generated worlds closer to real indirect prompt-injection settings, where previously they covered only generic email/doc/code/search tasks.

Linux developer bootstrap

A Linux setup helper is now included at scripts/setup_linux_env.sh for development environments that need a faster path to a working local install.

Documentation

The new benchmark workflow is documented in:

  • README.md
  • examples/benchmarks/README.md
  • docs/config.md
  • docs/environment-benchmarks.md
  • the built-in runtime manual via vauban man

Quality

This release line also ships with stronger test and validation coverage for:

  • environment config parsing
  • scenario scaffolding and roundtrips
  • flywheel world generation
  • runtime manual and config schema consistency

Upgrade

Install or upgrade with:

uv tool install --upgrade vauban