ovm.sh / atlas cli research

Research-grade version management for agent CLIs

OVM tracks Codex, Claude Code, and Pi releases, installs exact versions, watches upstream updates, and turns benchmark runs into publishable Atlas research evidence.

Latest research snapshot

Codex pays a one-time cold reindex on large stores, then launches faster than Claude when warm in this run. Claude currently wins the trivial one-token response prompt, subject to sample-size caveats.

research.json
install
curl -fsSL https://ovm.sh/install | sh
homebrew
brew tap ovm-sh/ovm && brew install ovm
npm
npm install -g @ovm-sh/ovm

research ledger

OVM is part product, part instrument. Release Radar, benchmark runs, and migration audits should produce publishable evidence, not disappear into terminal scrollback.

ovm, and the bug it taught us to catch

Running old and new Codex builds side by side exposed a silent shared-state migration failure. OVM turns that kind of boundary into a visible warning and a research artifact.

01

release radar → benchmark queue

Verified Claude/Codex releases now queue benchmark work only after install evidence exists.

2026 · tool evidence
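
A minimal sketch of the gating rule in the entry above, assuming "install evidence" means a recorded install command with a zero exit code. The names Release, InstallEvidence, and BenchmarkQueue are hypothetical illustrations, not OVM's actual internals.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Release:
    """A tool/version pair, e.g. a Codex or Claude Code release (hypothetical type)."""
    tool: str
    version: str


@dataclass
class InstallEvidence:
    """What was run to install a release and how it exited (assumed shape)."""
    command: str
    exit_code: int


@dataclass
class BenchmarkQueue:
    evidence: dict = field(default_factory=dict)
    pending: deque = field(default_factory=deque)

    def record_install(self, release: Release, command: str, exit_code: int) -> None:
        # Keep the latest install attempt for this release.
        self.evidence[release] = InstallEvidence(command, exit_code)

    def enqueue(self, release: Release) -> bool:
        # Refuse to queue benchmark work unless a successful install is on record.
        proof = self.evidence.get(release)
        if proof is None or proof.exit_code != 0:
            return False
        if release not in self.pending:
            self.pending.append(release)
        return True
```

In this sketch a release with no recorded install, or with a failed one, is never queued, so every benchmark row can be traced back to an install that is known to have worked.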
01

watch upstream

Release Radar detects package updates, verifies OVM can install them, and keeps command evidence.

02

measure locally

Benchmarks run where real auth and session state live, then append durable history for reports.

03

publish deliberately

Clean public commits carry curated code, release notes, benchmark data, and research write-ups.
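
The "append durable history" idea in step 02 can be sketched as an append-only JSON Lines file. This is an illustration under stated assumptions: the file format, the function names append_run and load_history, and the record fields are hypothetical and are not taken from OVM itself.

```python
import json
from pathlib import Path


def append_run(history_path: Path, record: dict) -> None:
    """Append one benchmark record as a single JSON line; earlier lines are never rewritten."""
    with history_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")


def load_history(history_path: Path) -> list:
    """Read every recorded run back in the order it was appended."""
    if not history_path.exists():
        return []
    with history_path.open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Appending rather than overwriting means a later report can be regenerated from the full run history, and a bad run stays visible instead of being silently replaced.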