Torch Surface Matrix¶

Vauban's default runtime surface is Torch-first. The reason is portability: the same behavior-auditing config should run on CPU, CUDA, and MPS without changing product semantics. MLX remains a legacy/reference backend, but new default-facing examples and report workflows should not require MLX.

This matrix is a contract, not a performance claim. A row marked full means the Torch runtime declares and tests that the primitive can produce the same kind of Vauban evidence as the rest of the system expects. Performance claims still require device-specific measurement.

Primitive	Support	Evidence artifact	Required tests	Runtime proof
`load_model`	full	`LoadedModel` with Torch capabilities	`tests/test_model_io.py`, `tests/test_backend.py`	CPU/CUDA smoke; MPS pending host
`tokenize`	full	`TokenizeResult`	`tests/test_runtime_primitives.py`	CPU/CUDA smoke; MPS pending host
`forward`	full	`ForwardResult`	`tests/test_runtime_primitives.py`	CPU/CUDA smoke; MPS pending host
`runtime_trace`	full	`ForwardTrace` and `TraceArtifact` records	`tests/test_runtime_trace_types.py`, `tests/test_runtime_primitives.py`	CPU/CUDA smoke; MPS pending host
`generation`	full	generated text observations	`tests/test_direct_generation_cuda.py`, behavior trace tests	CPU/CUDA smoke; MPS pending host
`logits`	full	runtime logits evidence	`tests/test_runtime_primitives.py`	CPU/CUDA smoke; MPS pending host
`logprobs`	full	runtime token logprobs evidence	`tests/test_runtime_primitives.py`	CPU/CUDA smoke; MPS pending host
`activations`	full	activation trace and projection artifacts	`tests/test_runtime_trace_types.py`, `tests/test_runtime_primitives.py`, `tests/test_runtime_activation_primitive.py`, `tests/test_behavior_trace_toml.py`	CPU/CUDA smoke; MPS pending host
`interventions`	full	intervention records with primitive metadata	`tests/test_runtime_primitives.py`, `tests/test_runtime_activation_primitive.py`, `tests/test_behavior_trace_toml.py`	CPU/CUDA smoke; MPS pending host
`kv_cache`	full	declared runtime capability	`tests/test_torch_surface_docs.py`	CUDA source/runtime path; MPS pending host
`weight_access`	full	declared runtime capability	`tests/test_torch_surface_docs.py`	CPU/CUDA smoke; MPS pending host
`mutable_weights`	full	declared runtime capability	`tests/test_torch_surface_docs.py`	CPU/CUDA smoke; MPS pending host
`safetensors_io`	full	weight export/load artifacts	export and LoRA tests	CPU/CUDA source path; MPS device-neutral
`peft_lora_export`	full	PEFT-format adapter artifacts	LoRA export tests	CPU/CUDA source path; MPS device-neutral
`device_profile_cuda`	full	`StageProfile` with CUDA sync/memory fields	CUDA smoke and alignment tests	validated on RTX 4070 Ti
`device_profile_mps`	full	`StageProfile` with MPS sync/memory fields	`tests/test_torch_surface_docs.py`, profiling tests	source contract; MPS hardware validation pending
`profile_sweep`	full	behavior trace runtime profile sidecars	behavior trace/diff tests	CPU/CUDA smoke; MPS pending host

Enforcement¶

The matrix is enforced by tests/test_torch_surface_docs.py:

every primitive above must remain present in this document;
Torch capabilities must declare full for the evidence-producing runtime capabilities used by reports;
Torch device kinds must include cpu, cuda, and mps;
default-facing docs and examples must not reintroduce MLX-only model IDs, imports, or MLX runtime defaults.
[behavior_trace.activation_primitive] must keep projection evidence explicit in TOML and in emitted trace artifacts.

Claim Boundary¶

This document does not claim Torch is always faster than MLX. It claims Torch is the portable default surface for Vauban evidence collection. Any future MPS custom kernel must preserve the same trace artifacts and report evidence before it can replace the reference Torch implementation.