Cosmos Reason 2 (CR2) is the system: one model covering multiple safety-reasoning tasks on video, delivered as structured JSON plus `<think>` reasoning traces.
This repo is driven by:

- `warehouse-ops-center-spec.md` (project spec)
- `lessons_learned.md` (Nebius/vLLM operational runbook and gotchas)
Requirements:

- Python 3.11+
- a Nebius-managed vLLM endpoint serving `nvidia/Cosmos-Reason2-8B`
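Since the endpoint is an OpenAI-compatible vLLM server, a request can be made with plain stdlib HTTP. A minimal sketch, assuming that compatibility; the env var names `NEBIUS_BASE_URL` and `NEBIUS_API_KEY` are illustrative, not this repo's actual config keys:

```python
import json
import os
import urllib.request

def build_payload(messages, model="nvidia/Cosmos-Reason2-8B", temperature=0.2):
    """Assemble the chat-completions body for an OpenAI-compatible server."""
    return {"model": model, "messages": messages, "temperature": temperature}

def chat_completion(messages, base_url=None, api_key=None, **kwargs):
    """POST to the server's /v1/chat/completions route and return parsed JSON."""
    # Env var names below are placeholders; use whatever .env.example defines.
    base_url = base_url or os.environ.get("NEBIUS_BASE_URL", "http://127.0.0.1:8000")
    api_key = api_key or os.environ.get("NEBIUS_API_KEY", "")
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(messages, **kwargs)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The payload builder is kept separate from the network call so the request shape can be inspected (and unit-tested) offline, in the same spirit as the `render` command below.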
Setup:

```powershell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
copy .env.example .env
```

Then edit `.env` with your Nebius endpoint IP and API key.
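For orientation, a filled-in `.env` might look like the fragment below. The variable names here are guesses for illustration only; the repo's `.env.example` is authoritative.

```ini
# Illustrative names and values only - check .env.example for the real keys.
NEBIUS_ENDPOINT_IP=203.0.113.10
NEBIUS_API_KEY=sk-...
```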
Analyze a single video:

```powershell
python -m src.cli analyze --mode load --video .\data\videos\clip.mp4
```

Modes: `load`, `safety`, `security`, `timeline`, `full`
Run a batch manifest:

```powershell
python -m src.cli batch --manifest .\batch\batch_manifest_example.yaml
```

Render the exact request payload (offline; no Nebius calls):

```powershell
python -m src.cli render --mode load --video .\data\videos\clip.mp4
```

Parse a saved raw output into JSON + `<think>` (offline):

```powershell
python -m src.cli parse --raw .\outputs\clip.mp4__load.raw.txt
```

Run evaluation (offline, against hand-labeled ground truth):

```powershell
python -m src.cli eval --results .\outputs --ground-truth .\data\ground_truth --out .\outputs\eval_report.json
```

Run tests:
```powershell
python -m unittest discover -s tests
```

Notes:

- Assume `reasoning_content` is not available. We parse `<think>...</think>` from the raw assistant message.
- We do not assume we can change vLLM startup flags on Nebius-managed deployments.
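The `<think>` split described above can be sketched roughly as follows. The function name and return shape are my own for illustration, not the internals of `src.cli parse`:

```python
import json
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_think(raw: str) -> tuple[str, dict]:
    """Separate the <think>...</think> trace from the structured JSON answer.

    Assumes the raw assistant message is a <think> block followed by a JSON
    object; this mirrors the parsing behavior described above, not the repo's
    actual parser.
    """
    m = THINK_RE.search(raw)
    think = m.group(1).strip() if m else ""
    remainder = THINK_RE.sub("", raw, count=1).strip()
    # The model may wrap the JSON in a markdown fence; strip it if present.
    remainder = re.sub(r"^```(?:json)?\s*|\s*```$", "", remainder)
    return think, json.loads(remainder)
```

Parsing from the raw message rather than `reasoning_content` keeps the pipeline independent of how the managed vLLM deployment happens to be configured.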
We follow the NVIDIA Cosmos Reason prompt guide (see `warehouse-ops-center-spec.md`, Section 3B): media-first ordering, with the standard reasoning suffix appended to the user prompt.
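Media-first ordering means the video content part precedes the text part in the user message. A sketch, assuming vLLM's OpenAI-compatible multimodal schema (`video_url` content parts); the suffix wording below is a placeholder, and the exact string lives in the spec:

```python
# Placeholder wording - use the standard reasoning suffix from
# warehouse-ops-center-spec.md Section 3B.
REASONING_SUFFIX = (
    "First reason inside <think>...</think>, then output the final JSON."
)

def build_user_message(video_url: str, prompt: str) -> dict:
    """Build a multimodal user message: video part first, then the text part
    with the reasoning suffix appended (media-first ordering)."""
    return {
        "role": "user",
        "content": [
            {"type": "video_url", "video_url": {"url": video_url}},  # media first
            {"type": "text", "text": f"{prompt}\n{REASONING_SUFFIX}"},
        ],
    }
```

Verify the `video_url` content type against your deployment; the key point illustrated here is the ordering of the parts and the suffix placement, not the exact schema.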