From 58465bf8c0c034d079a903ab497320c6d0625289 Mon Sep 17 00:00:00 2001
From: Yvette Carlisle
Date: Sat, 11 Apr 2026 12:45:30 +0800
Subject: [PATCH 1/3] {"schema":"delivery/1","type":"docs","scope":"voxit","summary":"normalize repository docs taxonomy","intent":"Replace the legacy governance guide and plans docs structure with the current taxonomy and tighter topic names","impact":"README and docs now route through spec runbook reference and decisions lanes while removing deprecated docs/plans artifacts","breaking":false,"risk":"low","authority":"repo","delivery_mode":"status-only","refs":[]}

---
 README.md                                     |  16 +-
 docs/decisions/index.md                       |  28 +++
 docs/governance.md                            |  94 -------
 docs/guide/index.md                           |  55 -----
 docs/index.md                                 |  35 +--
 docs/plans/.gitkeep                           |   0
 ...-01-audio-input-device-selection-design.md |  51 ----
 ...2026-03-01-audio-input-device-selection.md |  56 -----
 docs/policy.md                                |  96 ++++++++
 docs/reference/index.md                       |  23 ++
 docs/reference/repository-layout.md           |  39 +++
 docs/runbook/first-run.md                     |  76 ++++++
 docs/runbook/index.md                         |  30 +++
 docs/spec/index.md                            |  16 +-
 docs/spec/runtime.md                          | 232 ++++++++++++++++++
 docs/spec/system_voxit_v1.md                  | 211 ----------------
 16 files changed, 570 insertions(+), 488 deletions(-)
 create mode 100644 docs/decisions/index.md
 delete mode 100644 docs/governance.md
 delete mode 100644 docs/guide/index.md
 delete mode 100644 docs/plans/.gitkeep
 delete mode 100644 docs/plans/2026-03-01-audio-input-device-selection-design.md
 delete mode 100644 docs/plans/2026-03-01-audio-input-device-selection.md
 create mode 100644 docs/policy.md
 create mode 100644 docs/reference/index.md
 create mode 100644 docs/reference/repository-layout.md
 create mode 100644 docs/runbook/first-run.md
 create mode 100644 docs/runbook/index.md
 create mode 100644 docs/spec/runtime.md
 delete mode 100644 docs/spec/system_voxit_v1.md

diff --git a/README.md b/README.md
index c181c46..4bee064 100644
--- a/README.md
+++ b/README.md
@@ -26,7 +26,8 @@ AI dictation App for macOS (MVP
scaffold). - Auto-paste into the app that was frontmost when recording began. - Configurable behavior and models via `config.toml`. -For the normative product contract, constraints, and gaps, see [System Spec v1](docs/spec/system_voxit_v1.md). +For the normative product contract, constraints, and gaps, see the +[Runtime Spec](docs/spec/runtime.md). ## Status @@ -35,7 +36,9 @@ V1 target is **macOS-first** and aligned to the English-only voice input design. - Status: ✅ Core MVP loop is implemented (record → stream preview → finalize → optional rewrite → paste). - Scope: ✅ Native macOS mic capture + OpenAI model pipeline only. - Limitation: ✅ Linux/Windows build is intentionally disabled. -- Limitation: ⚠️ Known gaps vs full spec are documented in [System Spec v1](docs/spec/system_voxit_v1.md) (hotkey configurability, tray menu behavior, CPAL fallback robustness, and rollout cleanup items). +- Limitation: ⚠️ Known gaps are documented in the + [Runtime Spec](docs/spec/runtime.md) (hotkey configurability, tray menu behavior, + CPAL fallback robustness, and rollout cleanup items). ## Usage @@ -133,6 +136,8 @@ First-run onboarding checklist: - Voxit uses request buttons to guide you through the permission prompts in sequence (Microphone → Accessibility → Input Monitoring); grant each permission and re-check when prompted. - Verify paste flow after permission grant and restart the app if needed. +For the full guided sequence, see [First Run](docs/runbook/first-run.md). + The app saves updates to the same `config.toml` path when settings are changed. ### Interaction @@ -165,6 +170,13 @@ The app saves updates to the same `config.toml` path when settings are changed. - Dedicated auth/session/config/rewrite/paste pipeline and typed application state. - macOS frontmost-app capture + clipboard/command-paste integration. +### Docs + +- [Documentation Index](docs/index.md) routes to spec, runbook, reference, and decision docs. 
+- [Runtime Spec](docs/spec/runtime.md) is the normative runtime contract. +- [First Run](docs/runbook/first-run.md) covers sign-in, permission grants, and paste validation. +- [Repository Layout](docs/reference/repository-layout.md) maps the current repo surfaces. + ## Support Me If you find this project helpful and would like to support its development, you can buy me a coffee! diff --git a/docs/decisions/index.md b/docs/decisions/index.md new file mode 100644 index 0000000..3bd3dcc --- /dev/null +++ b/docs/decisions/index.md @@ -0,0 +1,28 @@ +# Decision Index + +Purpose: Route agents to durable rationale documents that explain why the repository is +shaped the way it is. + +Question this index answers: "why is it shaped this way?" + +## Use this index when + +- You need the accepted tradeoff behind a current behavior or UX choice. +- You need consequences and rationale that should survive implementation churn. +- You need to understand why a contract or surface exists before changing it. + +## Do not use this index when + +- You need the final runtime contract or schema. +- You need an operator procedure or validation sequence. +- You need a current-state layout map rather than design rationale. + +## What belongs in `docs/decisions/` + +- Durable rationale and tradeoffs. +- Consequences that affect future changes. +- Accepted choices that shape the governing spec or current repository structure. + +## Current decisions + +- No accepted decision records are currently checked in under this lane. diff --git a/docs/governance.md b/docs/governance.md deleted file mode 100644 index ced9007..0000000 --- a/docs/governance.md +++ /dev/null @@ -1,94 +0,0 @@ -# Documentation Governance - -Purpose: Define how agent-facing documentation is organized, updated, and kept consistent -across this repository. - -Audience: All documentation under `docs/` is written for AI agents and LLM workflows. -The split between `spec` and `guide` is by task shape, not by reader type. 
- -## Principles - -- Optimize for retrieval, routing, and execution. -- Keep one authoritative document per topic. -- Separate normative truth from procedural steps. -- Prefer explicit section labels and stable links over prose-heavy narrative. -- Let structure emerge from real topics. Avoid premature folder taxonomies. - -## Document classes - -| Class | Location | Answers | Source of truth for | Update trigger | -| --- | --- | --- | --- | --- | -| Spec | `docs/spec/` | What must be true? | Contracts, schemas, invariants, required behavior | Any behavior or schema change | -| Guide | `docs/guide/` | What should I do? | Runbooks, migrations, validation, troubleshooting | Any procedure or operational change | -| Plan artifacts | `docs/plans/` | Which saved plan artifact should a planning tool or execution workflow use? | Tool-managed planning outputs | As emitted or updated by the relevant tool | - -## Placement rules - -- If a document defines correctness, it belongs in `docs/spec/`. -- If a document defines actions, it belongs in `docs/guide/`. -- Do not treat `docs/plans/` as a general-purpose docs bucket. -- Use `docs/plans/` only for artifacts produced or consumed by planning tools or - workflows that explicitly depend on saved plan files. -- Do not duplicate the same authoritative content across documents. Link to the source - of truth instead. -- A guide may summarize why a step exists, but normative statements still live in the - governing spec. - -## Document contracts - -Every document should start with a short routing header. - -Spec header: - -- `Purpose` -- `Status: normative` -- `Read this when` -- `Not this document` -- `Defines` - -Guide header: - -- `Goal` -- `Read this when` -- `Inputs` or `Preconditions` -- `Depends on` -- `Outputs` or `Verification` - -## Structure rules - -- Prefer shallow paths by default. -- Add subfolders only when they mirror stable system boundaries or improve retrieval. -- Use descriptive `snake_case` file names. 
-- Do not require fixed filename prefixes unless a real ambiguity appears. -- Do not create empty folders, empty indexes, or placeholder documents to satisfy a - taxonomy. - -## Canonical entry points - -- Unified documentation router: `docs/index.md` -- Normative router: `docs/spec/index.md` -- Procedural router: `docs/guide/index.md` -- Repo task and automation entrypoints: `Makefile.toml` - -## LLM reading guidance - -When answering a repository question: - -1. Read `docs/index.md` for routing. -2. Route by question type: - - "What must be true?" -> `docs/spec/index.md` - - "What should I do?" -> `docs/guide/index.md` -3. Read `Makefile.toml` when the task depends on repository automation or named tasks. -4. Use `docs/plans/` only when the task explicitly concerns a saved plan artifact used by - a planning tool or execution workflow. - -## Update workflow - -- Behavior or schema change: update the relevant spec. -- Procedure change: update the relevant guide. -- If a change touches both truth and procedure, update both documents and keep their - boundary explicit. -- When a guide starts carrying normative content, move that content into spec and link - to it. -- Do not impose local document-header requirements on files under `docs/plans/`; those - files are owned by the planning tool or workflow that created them. diff --git a/docs/guide/index.md b/docs/guide/index.md deleted file mode 100644 index 9c1e247..0000000 --- a/docs/guide/index.md +++ /dev/null @@ -1,55 +0,0 @@ -# Guide Index - -Purpose: Route agents to procedural documents that tell them how to execute work safely -and repeatably. - -Question this index answers: "what should I do?" - -## Use this index when - -- You need a runbook, how-to, migration sequence, validation flow, troubleshooting - path, or maintenance procedure. -- You already know the relevant spec and need the operational steps. -- You need a bounded sequence with prerequisites and verification. 
- -## Do not use this index when - -- You need the authoritative contract, schema, or invariant. -- You need a planning-tool artifact or a saved execution plan under `docs/plans/`. -- You need broad documentation policy or repo task-entrypoint rules; read - `docs/governance.md` or `Makefile.toml` instead. - -## What belongs in `docs/guide/` - -- Task-oriented runbooks. -- Validation and test procedures. -- Migration, rollout, rollback, and recovery sequences. -- Troubleshooting flows and operator checklists. -- Short implementation recipes that depend on a governing spec. - -## Guide document contract - -Start each guide with a compact routing header: - -- `Goal` -- `Read this when` -- `Inputs` or `Preconditions` -- `Depends on` -- `Outputs` or `Verification` - -Then structure the body for execution: - -- Write steps in the order an agent should perform them. -- Keep commands, checks, and rollback points explicit. -- Link to specs for normative truth instead of restating contracts. -- Include failure branches only when they change the next action. -- End with verification so an agent can tell whether the guide succeeded. - -## Structure policy - -- Group guides by workflow or subsystem only when multiple guides exist and the grouping - improves retrieval. -- Do not create empty category folders or placeholder section headings. -- Prefer titles that encode the task or outcome, such as `validate_release.md` or - `rerun_ingest_job.md`. -- Keep the guide index as a router, not a dumping ground for long explanations. diff --git a/docs/index.md b/docs/index.md index 57b1214..a1b4096 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,36 +1,43 @@ # Documentation Index -Purpose: Route agents to the smallest correct document set for the current task. +Purpose: Route agents to the smallest correct repository surface for the current task. -Audience: All documentation in this repository is written for AI agents and LLM workflows. 
-The split below is by question type, not by human-versus-agent audience. +Audience: All documentation in this repository is written for AI agents and LLM +workflows. The split below is by question type, not by human-versus-agent audience. ## Read order -- Read `docs/governance.md` for document contracts and placement rules. -- Read `Makefile.toml` when the task depends on repo task names or execution entrypoints. +- Read `README.md` first when you need the repository scope, platform target, or + top-level runtime summary. +- Read `docs/policy.md` for document contracts, placement rules, and naming rules. +- Use `cargo make` whenever an equivalent repo task exists. Inspect `Makefile.toml` + directly when task names or execution entrypoints matter. - Then choose one primary lane: - `docs/spec/index.md` when the question is "what must be true?" - - `docs/guide/index.md` when the question is "what should I do?" -- Use `docs/plans/` only when a planning tool or execution workflow explicitly points to - a saved plan artifact there. + - `docs/runbook/index.md` when the question is "which sequence should I execute?" + - `docs/reference/index.md` when the question is "how is it currently organized or + implemented?" + - `docs/decisions/index.md` when the question is "why is it shaped this way?" 
## Routing matrix - Need contracts, invariants, schemas, enums, state machines, or required behavior -> `docs/spec/` -- Need runbooks, migrations, validation steps, troubleshooting, or operational sequences -> - `docs/guide/` +- Need runbooks, onboarding, validation steps, troubleshooting, or operational sequences + -> `docs/runbook/` +- Need current repository layout, ownership boundaries, or implementation surface maps -> + `docs/reference/` +- Need durable rationale, tradeoffs, or historical consequences -> `docs/decisions/` - Need repo task names or automation entrypoints -> `Makefile.toml` -- Need documentation placement or authoring rules -> `docs/governance.md` -- Need a planning-tool artifact or saved execution plan -> `docs/plans/` +- Need documentation placement or authoring rules -> `docs/policy.md` ## Retrieval rules - Optimize for agent routing and execution, not narrative flow. - Keep one authoritative document per topic. Link instead of copying. +- Runtime and behavior authority lives in code plus `docs/spec/`. Runbook, reference, + and decision docs explain usage, current state, and rationale, but do not override the + governing spec. - Start each document with a short routing header that says what the document is for, when to read it, and what it does not cover. - Keep links explicit and stable. -- Let structure emerge from real topics. Do not create empty folders, empty indexes, or - naming schemes that are stricter than the current corpus needs. 
diff --git a/docs/plans/.gitkeep b/docs/plans/.gitkeep deleted file mode 100644 index e69de29..0000000 diff --git a/docs/plans/2026-03-01-audio-input-device-selection-design.md b/docs/plans/2026-03-01-audio-input-device-selection-design.md deleted file mode 100644 index ed15d0a..0000000 --- a/docs/plans/2026-03-01-audio-input-device-selection-design.md +++ /dev/null @@ -1,51 +0,0 @@ -# Audio Input Device Selection Design - -## Scope - -Document the implemented behavior for choosing and persisting the microphone used for recording, including fallback and UX constraints. - -## UX - -- The runtime control panel shows a microphone section with: - - A **Refresh microphones** button to re-enumerate available input devices. - - A **Input device** combo box rendered from discovered devices and a **System default** option. -- Combo text conventions: - - `System default` corresponds to no explicit device override. - - Discovered item labels follow `name (id)`. - - If the selected ID is no longer in the list, the fallback label uses `Device #` or the persisted `audio.input_device_name`. -- Changing selection updates config immediately and persists it. -- Recording status should expose fallback when it happens: - - e.g., `Selected microphone unavailable. Falling back to default: .` - -## Config contract - -- Keys under `[audio]`: - - `audio.input_device_id` (number, `0` => use system default). - - `audio.input_device_name` (string, best-effort human-readable label). -- Default state: - - `audio.input_device_id = 0`. - - `audio.input_device_name = ""`. -- Persistence: - - Both keys are serialized in config writes. - - On load, missing/invalid keys fall back to defaults. -- Resolution rules: - - If `audio.input_device_id == 0`, recording uses the platform default microphone. - - If non-zero, app attempts that ID. 
- -## Fallback and constraints - -- If configured `audio.input_device_id` is invalid, disconnected, or lacks input scope at session start: - - selection falls back to default input device. - - recording proceeds with `fallback_to_default = true`. - - status/logging reports the fallback. -- If the device enumeration call fails or returns empty: - - combo still supports **System default** path. - - no devices can be shown/selected from the list. -- Non-macOS paths currently do not support mic capture and are not in-scope for picker functionality. - -## Acceptance criteria - -- Picker always presents **System default** and any available input-capable device list. -- Selection persists and survives restart via `audio.input_device_name` + `audio.input_device_id`. -- Session start is deterministic when configured devices are unavailable. -- Fallback behavior is transparent in status/log output. diff --git a/docs/plans/2026-03-01-audio-input-device-selection.md b/docs/plans/2026-03-01-audio-input-device-selection.md deleted file mode 100644 index 12ddd59..0000000 --- a/docs/plans/2026-03-01-audio-input-device-selection.md +++ /dev/null @@ -1,56 +0,0 @@ -# Audio Input Device Selection Implementation Plan - -## Goal - -Deliver and document the implemented microphone picker behavior, aligned to current code paths and config contract. - -## High-level execution steps - -1. Confirm configuration parsing and serialization for audio keys - - Ensure `audio.input_device_id` and `audio.input_device_name` are preserved in `AudioConfig`. - - Keep defaults as: - - `input_device_id = 0` - - `input_device_name = ""` - - Keep parse/serialize behavior unchanged except for explicit persistence of these keys. - -2. Confirm audio module selection path - - Keep `list_input_devices()` returning all input-capable devices (sorted for deterministic order). - - Keep `resolve_input_device()` behavior: - - `None` => default input. - - explicit ID => selected if still input-capable. 
- - explicit invalid/missing ID => system default with `fallback_to_default = true`. - - Keep `start_recording_with_stream()` returning `InputDeviceSelection`. - -3. Wire app startup and picker state sync - - Refresh microphone list on startup and before user interaction via `refresh_input_devices()`. - - Keep `sync_input_device_name()` behavior so persisted names are repaired to current label when possible. - - Keep `selected_input_device_label()` as canonical display formatting. - -4. Implement picker UI behavior - - Keep `Refresh microphones` action tied to `refresh_input_devices()`. - - Keep combo options: - - `System default` mapped to `0`. - - discovered device ids mapped to list entries. - - On selection change: - - write `config.audio.input_device_id`. - - write `config.audio.input_device_name` for UI readability. - - call `persist_config()`. - -5. Apply startup-time fallback into recording flow - - In `start_recording()`, pass optional configured id using `configured_input_device_id()`. - - When `InputDeviceSelection` reports fallback: - - prepend fallback notice in status text. - - keep recorder + realtime session on fallback path. - - Continue to proceed with Pass2/Pass3 flow once recorder starts. - -6. Preserve diagnostics and constraints - - Keep user-facing status updates for: - - empty/failed refreshes, - - fallback-to-default behavior, - - non-macOS unsupported recording path. - - Keep behavior aligned with non-breaking UI contract and existing restart/reload behavior. - -## Validation scope (manual, no test rewrite) - -- Update docs/plans only; no code edits in this slice. -- Verify the two key names are referenced as `audio.input_device_name` and `audio.input_device_id`. 
diff --git a/docs/policy.md b/docs/policy.md new file mode 100644 index 0000000..b7c66d6 --- /dev/null +++ b/docs/policy.md @@ -0,0 +1,96 @@ +# Documentation Policy + +Purpose: Define the repository-wide documentation taxonomy, naming rules, and placement +rules for durable agent-facing content. + +Audience: All documentation under `docs/` is written for AI agents and LLM workflows. +The split below is by question type, not by reader type. + +## Primary taxonomy + +| Lane | Location | Answers | Holds | +| --- | --- | --- | --- | +| Spec | `docs/spec/` | What must be true? | Contracts, schemas, invariants, required behavior | +| Runbook | `docs/runbook/` | Which sequence should I execute? | Operational procedures, onboarding steps, validation flows, recovery steps | +| Reference | `docs/reference/` | How is it currently organized or implemented? | Repository layout, surface maps, current implementation boundaries | +| Decisions | `docs/decisions/` | Why is it shaped this way? | Durable rationale, tradeoffs, and consequences | + +## Placement rules + +- If a document defines correctness, it belongs in `docs/spec/`. +- If a document defines operator actions, it belongs in `docs/runbook/`. +- If a document describes current structure, ownership, or implementation boundaries, it + belongs in `docs/reference/`. +- If a document records durable rationale or tradeoffs, it belongs in + `docs/decisions/`. +- If a document drifts across lanes, split it instead of stretching one file to answer + several question types. +- Do not duplicate authoritative content across lanes. Link to the source of truth. +- Do not add `docs/plans/` back. Transient planning artifacts are not part of the + durable docs tree in this repository. + +## Naming rules + +- Directory names express document lane. +- File names express stable topic. +- Use lowercase kebab-case for document file names. +- Keep primary-lane file names short and topic-first. 
+- Do not encode temporary versions such as `v1`, `draft2`, or dates into primary-lane + file names. +- Do not repeat the directory class in the file name when the topic is already clear. + Prefer `runtime.md` under `docs/spec/` over `runtime-spec.md`. +- Prefer names like `runtime.md`, `first-run.md`, and `repository-layout.md`. +- Keep `index.md` reserved for lane routers. + +## Document headers + +Every primary-lane document should start with a short routing header. + +Spec header: + +- `Purpose` +- `Status: normative` +- `Read this when` +- `Not this document` +- `Defines` + +Runbook header: + +- `Goal` +- `Read this when` +- `Inputs` or `Preconditions` +- `Depends on` +- `Outputs` or `Verification` + +Reference header: + +- `Purpose` +- `Read this when` +- `Not this document` +- `Covers` + +Decision header: + +- `Status` +- `Date` +- `Question` +- `Decision` +- `Consequences` + +## Canonical entry points + +- Unified router: `docs/index.md` +- Normative router: `docs/spec/index.md` +- Procedural router: `docs/runbook/index.md` +- Current-state router: `docs/reference/index.md` +- Rationale router: `docs/decisions/index.md` +- Repo task and automation entrypoints: `Makefile.toml` + +## Update workflow + +- Behavior or schema change: update the relevant spec. +- Procedure change: update the relevant runbook. +- Structural or ownership change: update the relevant reference doc. +- Tradeoff or rationale change: update the relevant decision doc. +- If a document starts carrying normative content from another lane, move that content + into the authoritative lane and link to it. diff --git a/docs/reference/index.md b/docs/reference/index.md new file mode 100644 index 0000000..81cddb3 --- /dev/null +++ b/docs/reference/index.md @@ -0,0 +1,23 @@ +# Reference Index + +Purpose: Route agents to descriptive documents that explain the repository's current +structure and implementation surfaces. + +Question this index answers: "how is it currently organized or implemented?" 
+ +## Use this index when + +- You need the current repository layout, ownership boundaries, or where a topic lives. +- You need to know which directory or file surface is authoritative for a class of work. +- You need a map of the app, package, script, and documentation surfaces before editing. + +## Do not use this index when + +- You need a normative contract. +- You need an execution sequence or operator runbook. +- You need durable design rationale rather than current-state description. + +## Current reference docs + +- [`repository-layout.md`](./repository-layout.md) for the repository surface map and + directory ownership boundaries. diff --git a/docs/reference/repository-layout.md b/docs/reference/repository-layout.md new file mode 100644 index 0000000..cd517ad --- /dev/null +++ b/docs/reference/repository-layout.md @@ -0,0 +1,39 @@ +# Repository Layout + +Purpose: Describe the current top-level repository surfaces and which concerns each one +owns. + +Read this when: You need to know where the app entrypoint, shared packages, repo task +definitions, or documentation topics currently live. + +Not this document: The normative runtime contract, the first-run operator sequence, or +the design rationale behind specific product choices. + +Covers: The repository surface map, ownership boundaries, and the role of `apps/`, +`packages/`, `docs/`, `scripts/`, and repository root policy files. + +## Top-level surfaces + +- `apps/voxit/` holds the application crate and packaging-facing entrypoint for the + macOS app. +- `packages/voxit-core/` holds the shared runtime logic, auth, OpenAI integration, and + dictation pipeline code. +- `packages/voxit-audio/` holds audio-capture specific functionality. +- `packages/voxit-macos/` holds macOS-specific integration surfaces. +- `docs/spec/` holds normative runtime and behavior contracts. +- `docs/runbook/` holds operator procedures such as onboarding and validation flows. 
+- `docs/reference/` holds current repository and implementation surface maps. +- `docs/decisions/` holds durable rationale and tradeoffs behind current design choices. +- `Makefile.toml` holds repo-native task names for lint, test, format, and checks. +- `scripts/` holds repository helper scripts such as local macOS packaging helpers. +- `.github/workflows/` holds CI and release automation. + +## Boundary notes + +- Runtime authority stays in the application and package crates plus the governing specs + under `docs/spec/`. +- `docs/runbook/`, `docs/reference/`, and `docs/decisions/` must not override runtime + or configuration authority. +- `Makefile.toml` is the source of truth for named repository tasks. +- Decision docs explain why the system is shaped a certain way; the spec still defines + what must be true at runtime. diff --git a/docs/runbook/first-run.md b/docs/runbook/first-run.md new file mode 100644 index 0000000..4382e42 --- /dev/null +++ b/docs/runbook/first-run.md @@ -0,0 +1,76 @@ +# First Run + +Goal: Bring a fresh macOS Voxit install to the point where sign-in, permissions, and +paste work end to end. + +Read this when: You are launching Voxit for the first time, validating a fresh install, +or re-checking onboarding after macOS permission resets. + +Preconditions: + +- A macOS machine with Voxit built or installed. +- Network access for ChatGPT sign-in. +- Access to macOS System Settings. + +Depends on: + +- [`../spec/runtime.md`](../spec/runtime.md) for the normative auth, permissions, and + paste contract. +- `Makefile.toml` when you need the repository task entrypoints for formatting, linting, + or tests before packaging. + +Verification: + +- Voxit shows signed-in status. +- Microphone, Accessibility, and Input Monitoring permissions are granted. +- A short dictation run pastes text back into the app that was frontmost at start. + +## 1. Launch Voxit + +- Start the app from `Voxit.app` or a local debug build. 
+- If you are building from source, use the workspace manifests under `apps/` and + `packages/` rather than an old single-crate entrypoint. + +## 2. Sign in + +- Open the auth controls in the panel. +- Use the default browser OAuth flow first. +- Complete the callback flow and return to the app. +- If browser OAuth is unavailable, use the device-code fallback path. + +## 3. Grant required macOS permissions + +- Open the onboarding or preferences surface in Voxit. +- Grant permissions in this order: + 1. Microphone + 2. Accessibility + 3. Input Monitoring +- After each grant, re-check the status in Voxit before moving to the next permission. + +## 4. Confirm runtime configuration + +- Check the config file at: + +```text +$HOME/Library/Application Support/voxit/config.toml +``` + +- Confirm the default hotkey and audio device settings look reasonable for the machine. +- If you need an explicit microphone, refresh the device list and select it before the + first real dictation run. + +## 5. Verify paste flow + +- Put focus on a target app that accepts text input. +- Start a short dictation run. +- Stop recording and wait for finalize and optional rewrite to finish. +- Confirm the result pastes back into the same app that was frontmost when recording + started. + +## 6. Failure handling + +- If sign-in stalls, reopen the auth surface and retry with the visible-window path. +- If a permission does not update, grant it in macOS System Settings and then re-check + from Voxit. +- If paste fails, verify Accessibility and Input Monitoring first before debugging the + clipboard or target-app path. diff --git a/docs/runbook/index.md b/docs/runbook/index.md new file mode 100644 index 0000000..b6412ef --- /dev/null +++ b/docs/runbook/index.md @@ -0,0 +1,30 @@ +# Runbook Index + +Purpose: Route agents to procedural documents that tell them which sequence to execute. + +Question this index answers: "which sequence should I execute?" 
+ +## Use this index when + +- You need a runbook, how-to, validation flow, troubleshooting path, or maintenance + procedure. +- You already know the relevant spec and need the operational steps. +- You need explicit prerequisites, commands, checkpoints, or verification. + +## Do not use this index when + +- You need the authoritative contract, schema, or invariant. +- You need the current repository layout or implementation boundaries. +- You need durable rationale rather than operator steps. + +## What belongs in `docs/runbook/` + +- Task-oriented operator procedures. +- Validation and inspection sequences. +- Rollout, rollback, and recovery flows. +- Bounded recipes that depend on a governing spec. + +## Current runbooks + +- [`first-run.md`](./first-run.md) for first sign-in, permission grants, and paste-path + verification on macOS. diff --git a/docs/spec/index.md b/docs/spec/index.md index 701e2c0..60156d0 100644 --- a/docs/spec/index.md +++ b/docs/spec/index.md @@ -9,14 +9,16 @@ Question this index answers: "what must remain true?" - You need an invariant, contract, schema, enum, state model, interface, or required behavior. - You are deciding whether code or data is correct. -- A guide says "see the governing spec" and you need the authoritative source. +- A runbook says "see the governing spec" and you need the authoritative source. ## Do not use this index when - You need step-by-step instructions, maintenance actions, migrations, or incident response. -- You need a planning-tool artifact or a saved execution plan under `docs/plans/`. +- You need durable rationale rather than the final contract; read + `docs/decisions/index.md`. - You want rationale only, without an authoritative contract. +- You need current layout or implementation boundaries; read `docs/reference/index.md`. ## What belongs in `docs/spec/` @@ -41,13 +43,17 @@ Then keep the body explicit: - Separate facts from rationale. - Include canonical names exactly as code or data uses them. 
- Include a small example when it removes ambiguity. -- Link to related guides instead of embedding procedures. +- Link to related runbooks instead of embedding procedures. ## Structure policy - Prefer shallow paths while the spec set is small. - Add subfolders only when they mirror stable system boundaries or materially reduce ambiguity. -- Do not require fixed filename prefixes up front. - Choose names for topic clarity and retrieval quality, not visual uniformity. -- If a guide depends on a spec, the guide links back to the governing spec. +- If a runbook depends on a spec, the runbook links back to the governing spec. + +## Current governing specs + +- [`runtime.md`](./runtime.md) defines the runtime scope, auth flow, audio capture, + transcript pipeline, paste behavior, configuration keys, and known gaps. diff --git a/docs/spec/runtime.md b/docs/spec/runtime.md new file mode 100644 index 0000000..368b6ef --- /dev/null +++ b/docs/spec/runtime.md @@ -0,0 +1,232 @@ +# Voxit Runtime Specification (macOS, English) + +Purpose: Define the normative runtime, auth, capture, paste, configuration, and release +contract for Voxit in this repository. + +Status: normative + +Read this when: You need the authoritative contract for Voxit runtime behavior, state +transitions, authentication, audio capture, paste flow, configuration keys, or release +scope. + +Not this document: Step-by-step operational guidance, design rationale, or workflow +instructions. + +Defines: + +- macOS-first runtime scope and platform boundaries +- user-visible state machine and transcript lifecycle +- authentication, storage, audio capture, finalize, rewrite, and paste contracts +- onboarding, configuration, CI, release, observability, and known-gap expectations + +## 1) Runtime Scope and Boundaries + +- Build entrypoint uses `eframe::run_native` and renders an always-on-top panel + controlled by system tray visibility and a global hotkey. 
+- The app supports English-first behavior and configuration defaults (`language = "en"`). +- No speech is injected into target apps while Pass1 is running; text is only pasted + after Pass2 or Pass3 completion. + +## 2) State Machine + +The runtime state is user-visible in `self.state` and UI status labels: + +- `Ready to listen.` +- `Listening` +- `Stopped` +- `FinalizingPass2` +- `RewritingPass3` +- `Done` + +State transitions: + +- `Start Dictation` or hotkey start in **toggle** mode -> `Listening`. +- `Stop Dictation` or hotkey release in **hold** mode -> stop capture, encode WAV, then + `FinalizingPass2`. +- Pass2 completion: + - if auto rewrite is enabled -> `RewritingPass3` + - else -> paste raw final transcript and `Done` +- Pass3 completion: + - if guard passes -> paste rewritten result and `Done` + - if skipped or rejected -> paste raw result and `Done` +- `Paste raw now (skip rewrite)` during Pass2 or Pass3 forces raw paste and sets `Done`. + +## 3) Authentication Contract + +- Default login is browser OAuth flow to ChatGPT and must open a browser callback page + first. +- Device-code path is available and should be used as fallback or manual fallback. +- Token acquisition flow: + - open browser login via OAuth + - exchange authorization token and persist auth locally + - fallback path uses `OPENAI_API_KEY` only when no OAuth token exists +- Storage: + - preferred: keyring + - fallback: local `auth.json` +- On startup: + - read status as "signed in" when unexpired token or session metadata exists + - otherwise show "Not signed in." + +## 4) Audio Capture and Streaming Contract + +### 4.1 Capture + +- Default recorder is macOS CoreAudio VoiceProcessingIO. +- The active recorder input is resolved at session start from `audio.input_device_id`. + - `0` means system default. + - non-zero uses the requested CoreAudio input device id from config. + - if the requested device is missing or unusable, Voxit falls back to system default + before capture starts. 
+- Capture should be continuous while in `Listening`, producing in-memory PCM sample + buffers and metadata (`sample_rate`, `channels`, `frames`). +- Raw audio must not be persisted by default. + +### 4.2 Device picker lifecycle + +- On startup, the app refreshes available input-capable devices and caches the result. +- A manual **Refresh microphones** action is available in the UI to repopulate the + picker. +- Picker values map to: + - **System default** (`audio.input_device_id = 0`) + - an explicit input device id and name pair from a discovered device list +- Selection changes persist `audio.input_device_name` and `audio.input_device_id` to + config. +- If a configured device id is invalid or stale when starting recording, the runtime + falls back to system default and reports fallback in status or logs. + +### 4.3 Pass1 transport + +- For each chunk, send `input_audio_buffer.append` payload frames to OpenAI Realtime. +- Realtime session must be configured with: + - `audio.input.format`: `audio/pcm` with sample rate from config (default `24000`) + - `audio.input.noise_reduction`: configured profile (default `near_field`) + - `audio.input.transcription.model`: Pass1 model + - `audio.input.turn_detection.type`: `server_vad` +- Realtime events consumed by the UI: + - `conversation.item.input_audio_transcription.delta` (draft) + - `conversation.item.input_audio_transcription.completed` (committed) + +### 4.4 Transcript composition + +- Draft and committed must be separated in UI: + - committed = finalized turns from completed events + - draft = latest in-flight text fragment +- Ordering for committed text is deterministic by `item_id` and `previous_item_id` + chain; out-of-order completed events must still render in chain order. + +## 5) Pass2 Finalization Contract + +- On stop, stop capture and upload full WAV to `/v1/audio/transcriptions`. +- Use the configured finalize model. 
+- Final transcript (`Pass2`) becomes baseline output for: + - paste when rewrite is disabled or skipped + - rewrite input when enabled + - final output display + +## 6) Pass3 Rewrite Contract + +- Auto-run rewrite only when: + - raw Pass2 transcript exists + - rewrite is enabled in runtime preference + - rewrite auto flag is enabled +- If disabled for this run, skip and paste raw final transcript. +- Rewriter output contract: + - keep meaning + - preserve numeric, date, and currency tokens + - reject rewrite when the protected token multiset changes +- Guarded outcomes: + - `Applied`: paste rewritten text + - `Rejected`: fallback to raw Pass2 and paste raw + - `Skipped`: fallback to raw Pass2 and paste raw + +## 7) Target App Capture and Paste + +- Before starting recording, capture frontmost app metadata (pid, bundle id, name) if + `lock_frontmost_app = true`. +- On paste: + - attempt to reactivate captured target app with retries + - copy to clipboard + - dispatch `Cmd+V` (`Meta+V`) to simulate paste +- A dedicated test-paste action should validate the clipboard and paste injection path. + +## 8) Hotkey and Tray Behavior + +- Hotkey chord handling: + - supported mode switch: toggle or hold + - currently recognized physical combo: `Ctrl+Shift+Space` + - configuration exposes `hotkey.chord` for future use +- Tray behavior: + - left click toggles panel visibility + - no menu-driven start, stop, rewrite, or quit actions are implemented in this + version + +## 9) UI and Onboarding Contract + +- Panel contains: + - auth status and sign-in actions + - runtime controls (start/stop, rewrite toggle, hotkey mode) + - live stream sections (committed plus draft) + - final transcript sections + - onboarding checklist statuses for microphone, accessibility, and input monitoring +- Onboarding checklist provides request actions for required macOS permissions. 
The UI + prompts permission requests in order: + - Microphone: probe-based request and retry loop when denied + - Accessibility: system prompt request plus re-check + - Input Monitoring: system prompt request plus re-check +- Grant each permission in macOS Privacy & Security settings when prompted, then + re-check in Voxit before continuing. +- "Paste raw now" is always available when finalization or rewrite is active and should + bypass Pass3. + +## 10) Configuration Contract + +Config file location: + +- `${Application Support}/voxit/config.toml` via `ProjectDirs` + +Supported sections and keys: + +- `ui.start_hidden`, `ui.panel_width_px`, `ui.panel_height_px` +- `hotkey.chord`, `hotkey.mode` (`toggle` or `hold`) +- `audio.backend`, `audio.input_sample_rate_hz`, `audio.input_device_name`, + `audio.input_device_id`, `audio.realtime_target_rate_hz` +- `openai.api_base_url`, `openai.realtime_model`, `openai.finalize_model`, + `openai.rewrite_model`, `openai.language` +- `openai.realtime.noise_reduction` +- `rewrite.enabled`, `rewrite.auto`, `rewrite.guard_numbers`, + `rewrite.max_output_chars`, `rewrite.style` +- `paste.lock_frontmost_app`, `paste.method` + +On load: + +- parse file when present +- defaults are used when missing or invalid entries are encountered +- `audio.input_device_id = 0` is treated as system default +- non-zero `audio.input_device_id` requests that device; if unavailable at startup, + Voxit falls back to default input + +## 11) CI and Release + +- `language.yml` is macOS-only for lint, format, and test checks. +- Release packaging matrix is restricted to `aarch64-apple-darwin` and comments out + Linux and Windows jobs. +- Packaging uses `cargo bundle --manifest-path apps/voxit/Cargo.toml` (or `cargo bundle + -p voxit`) and zips `target//bundle/osx/Voxit.app` as `voxit-.zip`. + +## 12) Observability and Logs + +- Runtime logs are written via rotating file appender under the data directory. 
+- User-facing state is mirrored by status strings for troubleshooting. +- Error states must avoid hard-crash behavior and should return to a user-actionable + status. + +## 13) Known Gaps + +- Tray menu controls are not implemented; only click-to-toggle panel exists. +- Configured hotkey chord string is not yet mapped; current hardcoded gesture is + `Ctrl+Shift+Space` only. +- CPAL fallback capture is not implemented despite a configuration option; only the + VoiceProcessingIO path is active. +- `rewrite.max_output_chars` and `rewrite.style` are persisted but not strictly + enforced in the rewrite prompt yet. +- No explicit audio resampling step to 24 kHz is implemented in the current path. diff --git a/docs/spec/system_voxit_v1.md b/docs/spec/system_voxit_v1.md deleted file mode 100644 index 9af9a68..0000000 --- a/docs/spec/system_voxit_v1.md +++ /dev/null @@ -1,211 +0,0 @@ -# Voxit v1 System Specification (macOS, English) - -Purpose: Define the normative runtime, auth, capture, paste, configuration, and release -contract for Voxit v1 in this repository. - -Status: normative - -Read this when: You need the authoritative contract for Voxit v1 behavior, state -transitions, authentication, audio capture, paste flow, configuration keys, or release -scope. - -Not this document: Step-by-step operational guidance, planning artifacts, exploratory -design notes, or workflow instructions. - -Defines: - -- macOS-first runtime scope and platform boundaries -- user-visible state machine and transcript lifecycle -- authentication, storage, audio capture, finalize, rewrite, and paste contracts -- onboarding, configuration, CI, release, observability, and known-gap expectations - -## 1) Runtime Scope and Boundaries - -- Build entrypoint uses `eframe::run_native` and renders an always-on-top panel controlled by system tray visibility and a global hotkey. -- The app supports English-first behavior and configuration defaults (`language = "en"`). 
-- No speech is injected into target apps while Pass1 is running; text is only pasted after Pass2/Pass3 completion. - -## 2) State Machine - -The runtime state is user-visible in `self.state` and UI status labels: - -- `Ready to listen.` -- `Listening` -- `Stopped` -- `FinalizingPass2` -- `RewritingPass3` -- `Done` - -State transitions: - -- `Start Dictation` / hotkey start in **toggle** mode -> `Listening`. -- `Stop Dictation` / hotkey release in **hold** mode -> stop capture, encode WAV, then `FinalizingPass2`. -- Pass2 completion: - - if auto rewrite is enabled -> `RewritingPass3`; - - else -> paste raw final transcript and `Done`. -- Pass3 completion: - - if guard passes -> paste rewritten result and `Done`; - - if skipped/rejected -> paste raw result and `Done`. -- `Paste raw now (skip rewrite)` during Pass2 or Pass3 forces raw paste and sets `Done`. - -## 3) Authentication Contract - -- Default login is browser OAuth flow to ChatGPT and must open a browser callback page first. -- Device-code path is available and should be used as fallback/manual fallback. -- Token acquisition flow: - - Open browser login via OAuth. - - Exchange authorization token and persist auth locally. - - Fallback path uses `OPENAI_API_KEY` only when no OAuth token exists. -- Storage: - - Preferred: keyring. - - Fallback: local `auth.json`. -- On startup: - - read status as “signed in” when unexpired token/session metadata exists; - - otherwise show “Not signed in.” - -## 4) Audio Capture and Streaming Contract - -### 4.1 Capture - -- Default recorder is macOS CoreAudio VoiceProcessingIO. -- The active recorder input is resolved at session start from `audio.input_device_id`. - - `0` means "system default". - - non-zero uses the requested CoreAudio input device id from config. - - if the requested device is missing or unusable, Voxit falls back to system default before capture starts. 
-- Capture should be continuous while in `Listening`, producing in-memory PCM sample buffers and metadata (`sample_rate`, `channels`, `frames`). -- Raw audio must not be persisted by default. - -### 4.2 Device picker lifecycle - -- On startup, the app refreshes available input-capable devices and caches the result. -- A manual **Refresh microphones** action is available in the UI to repopulate the picker. -- Picker values map to: - - **System default** (`audio.input_device_id = 0`), - - an explicit input device id and name pair from a discovered device list. -- Selection changes persist `audio.input_device_name` and `audio.input_device_id` to config. -- If a configured device id is invalid or stale when starting recording, the runtime falls back to system default and reports fallback in status/log. - -### 4.3 Pass1 transport - -- For each chunk, send `input_audio_buffer.append` payload frames to OpenAI Realtime. -- Realtime session must be configured with: - - `audio.input.format`: `audio/pcm` with sample rate from config (default `24000`) - - `audio.input.noise_reduction`: configured profile (default `near_field`) - - `audio.input.transcription.model`: Pass1 model - - `audio.input.turn_detection.type`: `server_vad` -- Realtime events consumed by the UI: - - `conversation.item.input_audio_transcription.delta` (draft) - - `conversation.item.input_audio_transcription.completed` (committed) - -### 4.4 Transcript composition - -- Draft and committed must be separated in UI: - - committed = finalized turns from completed events - - draft = latest in-flight text fragment -- Ordering for committed text is deterministic by `item_id` and `previous_item_id` chain; out-of-order completed events must still render in chain order. - -## 5) Pass2 Finalization Contract - -- On stop, stop capture and upload full WAV to `/v1/audio/transcriptions`. -- Use configured finalize model. 
-- Final transcript (`Pass2`) becomes baseline output for: - - paste (when rewrite disabled/skipped), - - rewrite input (if enabled), - - final output display. - -## 6) Pass3 Rewrite Contract - -- Auto-run rewrite only when: - - raw Pass2 transcript exists, - - rewrite is enabled in runtime preference, - - rewrite auto flag is enabled. -- If disabled for this run, skip and paste raw final transcript. -- Rewriter output contract (current implementation): - - keep meaning, - - preserve numeric/date/currency tokens, - - reject rewrite when protected token multiset changes. -- Guarded outcomes: - - `Applied`: paste rewritten text, - - `Rejected`: fallback to raw Pass2 and paste raw, - - `Skipped`: fallback to raw Pass2 and paste raw. - -## 7) Target App Capture and Paste - -- Before starting recording, capture frontmost app metadata (pid, bundle id, name) if `lock_frontmost_app = true`. -- On paste: - - attempt to reactivate captured target app (retry with exponential delay), - - copy to clipboard, - - dispatch `Cmd+V` (macOS `Meta+V`) to simulate paste. -- A dedicated “Test Paste” action should validate the clipboard + paste injection path. - -## 8) Hotkey and Tray Behavior - -- Hotkey chord handling: - - supported mode switch: toggle or hold. - - currently recognized physical combo: `Ctrl + Shift + Space`. - - configuration exposes `hotkey.chord` for future use. -- Tray behavior: - - left click toggles panel visibility. - - no menu-driven start/stop/rewrite/quit actions are implemented in this version. - -## 9) UI and Onboarding Contract - -- Panel contains: - - Auth status and sign-in actions, - - Runtime controls (start/stop, rewrite toggle, hotkey mode), - - Live stream sections (committed + draft), - - Final transcript sections, - - onboarding checklist statuses: - - microphone, - - accessibility, - - input monitoring. -- Onboarding checklist provides request actions for required macOS permissions. 
The UI prompts permission requests in order: - - Microphone: probe-based request and retry loop when denied. - - Accessibility: system prompt request + re-check. - - Input Monitoring: system prompt request + re-check. -- Grant each permission in macOS Privacy & Security settings when prompted, then re-check in Voxit before continuing. -- “Paste raw now” is always available when finalization/rewrite is active and should bypass Pass3. - -## 10) Configuration Contract - -Config file location: - -- `${Application Support}/voxit/config.toml` via `ProjectDirs`. - -Supported sections and keys: - -- `ui.start_hidden`, `ui.panel_width_px`, `ui.panel_height_px` -- `hotkey.chord`, `hotkey.mode` (`toggle` / `hold`) - - `audio.backend`, `audio.input_sample_rate_hz`, `audio.input_device_name`, `audio.input_device_id`, - `audio.realtime_target_rate_hz` -- `openai.api_base_url`, `openai.realtime_model`, `openai.finalize_model`, - `openai.rewrite_model`, `openai.language` -- `openai.realtime.noise_reduction` -- `rewrite.enabled`, `rewrite.auto`, `rewrite.guard_numbers`, `rewrite.max_output_chars`, `rewrite.style` -- `paste.lock_frontmost_app`, `paste.method` - -On load: -- parse file when present; -- defaults are used when missing/invalid entries are encountered; -- `audio.input_device_id = 0` is treated as system default; -- non-zero `audio.input_device_id` requests that device; if unavailable at startup, Voxit falls back to default input. - -## 11) CI and Release - -- `language.yml` is macOS-only for lint/format/test checks. -- Release packaging matrix is restricted to `aarch64-apple-darwin` and comments out Linux/Windows jobs. -- Packaging uses `cargo bundle --manifest-path apps/voxit/Cargo.toml` (or `cargo bundle -p voxit`) and zips `target//bundle/osx/Voxit.app` as `voxit-.zip`. - -## 12) Observability and Logs - -- Runtime logs are written via rotating file appender under data directory. -- User-facing state is mirrored by `status` strings for troubleshooting. 
-- Error states must avoid hard-crash behavior and should return to a user-actionable status. - -## 13) Known Gaps vs Original Plan - -- Tray menu controls are not implemented (no menu items for start/stop/rewrite/quit; only click-to-toggle panel exists). -- Configured hotkey chord string is not yet mapped; current hardcoded gesture is `Ctrl+Shift+Space` only. -- CPAL fallback capture is not implemented despite a configuration option; only VoiceProcessingIO path is active. -- `rewrite.max_output_chars` and `rewrite.style` are persisted but not strictly enforced/applied in rewrite prompt yet. -- No explicit audio resampling step to 24 kHz is implemented in the current path. From d0666358d5bf2dabb01df99470938ea6f95dd3e1 Mon Sep 17 00:00:00 2001 From: Yvette Carlisle Date: Tue, 5 May 2026 11:21:29 +0800 Subject: [PATCH 2/3] {"schema":"maestro/commit/1","summary":"align docs taxonomy cleanup","authority":"manual"} --- README.md | 8 +-- docs/decisions/index.md | 28 ---------- docs/index.md | 15 ++--- docs/plans/.gitkeep | 0 ...-01-audio-input-device-selection-design.md | 51 +++++++++++++++++ ...2026-03-01-audio-input-device-selection.md | 56 +++++++++++++++++++ docs/policy.md | 45 ++++++--------- docs/reference/index.md | 4 +- ...pository-layout.md => workspace-layout.md} | 17 +++--- .../{first-run.md => first-run-onboarding.md} | 4 +- docs/runbook/index.md | 12 ++-- docs/spec/index.md | 7 +-- docs/spec/runtime.md | 10 ++-- 13 files changed, 162 insertions(+), 95 deletions(-) delete mode 100644 docs/decisions/index.md create mode 100644 docs/plans/.gitkeep create mode 100644 docs/plans/2026-03-01-audio-input-device-selection-design.md create mode 100644 docs/plans/2026-03-01-audio-input-device-selection.md rename docs/reference/{repository-layout.md => workspace-layout.md} (70%) rename docs/runbook/{first-run.md => first-run-onboarding.md} (98%) diff --git a/README.md b/README.md index 4bee064..416ccc6 100644 --- a/README.md +++ b/README.md @@ -136,7 +136,8 @@ First-run 
onboarding checklist: - Voxit uses request buttons to guide you through the permission prompts in sequence (Microphone → Accessibility → Input Monitoring); grant each permission and re-check when prompted. - Verify paste flow after permission grant and restart the app if needed. -For the full guided sequence, see [First Run](docs/runbook/first-run.md). +For the full guided sequence, see +[First-Run Onboarding](docs/runbook/first-run-onboarding.md). The app saves updates to the same `config.toml` path when settings are changed. @@ -172,10 +173,9 @@ The app saves updates to the same `config.toml` path when settings are changed. ### Docs -- [Documentation Index](docs/index.md) routes to spec, runbook, reference, and decision docs. +- [Documentation Index](docs/index.md) routes to spec, runbook, and reference docs. - [Runtime Spec](docs/spec/runtime.md) is the normative runtime contract. -- [First Run](docs/runbook/first-run.md) covers sign-in, permission grants, and paste validation. -- [Repository Layout](docs/reference/repository-layout.md) maps the current repo surfaces. +- [Workspace Layout](docs/reference/workspace-layout.md) maps the current repo surfaces. ## Support Me diff --git a/docs/decisions/index.md b/docs/decisions/index.md deleted file mode 100644 index 3bd3dcc..0000000 --- a/docs/decisions/index.md +++ /dev/null @@ -1,28 +0,0 @@ -# Decision Index - -Purpose: Route agents to durable rationale documents that explain why the repository is -shaped the way it is. - -Question this index answers: "why is it shaped this way?" - -## Use this index when - -- You need the accepted tradeoff behind a current behavior or UX choice. -- You need consequences and rationale that should survive implementation churn. -- You need to understand why a contract or surface exists before changing it. - -## Do not use this index when - -- You need the final runtime contract or schema. -- You need an operator procedure or validation sequence. 
-- You need a current-state layout map rather than design rationale. - -## What belongs in `docs/decisions/` - -- Durable rationale and tradeoffs. -- Consequences that affect future changes. -- Accepted choices that shape the governing spec or current repository structure. - -## Current decisions - -- No accepted decision records are currently checked in under this lane. diff --git a/docs/index.md b/docs/index.md index a1b4096..998ffdb 100644 --- a/docs/index.md +++ b/docs/index.md @@ -9,15 +9,17 @@ workflows. The split below is by question type, not by human-versus-agent audien - Read `README.md` first when you need the repository scope, platform target, or top-level runtime summary. +- Use `cargo make` whenever an equivalent repo task exists. When task details matter, + inspect `Makefile.toml` directly. - Read `docs/policy.md` for document contracts, placement rules, and naming rules. -- Use `cargo make` whenever an equivalent repo task exists. Inspect `Makefile.toml` - directly when task names or execution entrypoints matter. +- Read `Makefile.toml` when the task depends on repo task names or execution entrypoints. - Then choose one primary lane: - `docs/spec/index.md` when the question is "what must be true?" - `docs/runbook/index.md` when the question is "which sequence should I execute?" - `docs/reference/index.md` when the question is "how is it currently organized or implemented?" - - `docs/decisions/index.md` when the question is "why is it shaped this way?" +- Use `docs/plans/` only when a planning tool or execution workflow explicitly points to + a saved plan artifact there. ## Routing matrix @@ -27,17 +29,16 @@ workflows. 
The split below is by question type, not by human-versus-agent audien -> `docs/runbook/` - Need current repository layout, ownership boundaries, or implementation surface maps -> `docs/reference/` -- Need durable rationale, tradeoffs, or historical consequences -> `docs/decisions/` - Need repo task names or automation entrypoints -> `Makefile.toml` - Need documentation placement or authoring rules -> `docs/policy.md` +- Need a planning-tool artifact or saved execution plan -> `docs/plans/` ## Retrieval rules - Optimize for agent routing and execution, not narrative flow. - Keep one authoritative document per topic. Link instead of copying. -- Runtime and behavior authority lives in code plus `docs/spec/`. Runbook, reference, - and decision docs explain usage, current state, and rationale, but do not override the - governing spec. +- Keep runtime authority explicit: application and package crates plus `docs/spec/` + outrank runbook, reference, and plan artifacts. - Start each document with a short routing header that says what the document is for, when to read it, and what it does not cover. - Keep links explicit and stable. diff --git a/docs/plans/.gitkeep b/docs/plans/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/docs/plans/2026-03-01-audio-input-device-selection-design.md b/docs/plans/2026-03-01-audio-input-device-selection-design.md new file mode 100644 index 0000000..ed15d0a --- /dev/null +++ b/docs/plans/2026-03-01-audio-input-device-selection-design.md @@ -0,0 +1,51 @@ +# Audio Input Device Selection Design + +## Scope + +Document the implemented behavior for choosing and persisting the microphone used for recording, including fallback and UX constraints. + +## UX + +- The runtime control panel shows a microphone section with: + - A **Refresh microphones** button to re-enumerate available input devices. + - An **Input device** combo box rendered from discovered devices and a **System default** option.
+- Combo text conventions: + - `System default` corresponds to no explicit device override. + - Discovered item labels follow `name (id)`. + - If the selected ID is no longer in the list, the fallback label uses `Device #` or the persisted `audio.input_device_name`. +- Changing selection updates config immediately and persists it. +- Recording status should expose fallback when it happens: + - e.g., `Selected microphone unavailable. Falling back to default: .` + +## Config contract + +- Keys under `[audio]`: + - `audio.input_device_id` (number, `0` => use system default). + - `audio.input_device_name` (string, best-effort human-readable label). +- Default state: + - `audio.input_device_id = 0`. + - `audio.input_device_name = ""`. +- Persistence: + - Both keys are serialized in config writes. + - On load, missing/invalid keys fall back to defaults. +- Resolution rules: + - If `audio.input_device_id == 0`, recording uses the platform default microphone. + - If non-zero, app attempts that ID. + +## Fallback and constraints + +- If configured `audio.input_device_id` is invalid, disconnected, or lacks input scope at session start: + - selection falls back to default input device. + - recording proceeds with `fallback_to_default = true`. + - status/logging reports the fallback. +- If the device enumeration call fails or returns empty: + - combo still supports **System default** path. + - no devices can be shown/selected from the list. +- Non-macOS paths currently do not support mic capture and are not in-scope for picker functionality. + +## Acceptance criteria + +- Picker always presents **System default** and any available input-capable device list. +- Selection persists and survives restart via `audio.input_device_name` + `audio.input_device_id`. +- Session start is deterministic when configured devices are unavailable. +- Fallback behavior is transparent in status/log output. 
diff --git a/docs/plans/2026-03-01-audio-input-device-selection.md b/docs/plans/2026-03-01-audio-input-device-selection.md new file mode 100644 index 0000000..12ddd59 --- /dev/null +++ b/docs/plans/2026-03-01-audio-input-device-selection.md @@ -0,0 +1,56 @@ +# Audio Input Device Selection Implementation Plan + +## Goal + +Deliver and document the implemented microphone picker behavior, aligned to current code paths and config contract. + +## High-level execution steps + +1. Confirm configuration parsing and serialization for audio keys + - Ensure `audio.input_device_id` and `audio.input_device_name` are preserved in `AudioConfig`. + - Keep defaults as: + - `input_device_id = 0` + - `input_device_name = ""` + - Keep parse/serialize behavior unchanged except for explicit persistence of these keys. + +2. Confirm audio module selection path + - Keep `list_input_devices()` returning all input-capable devices (sorted for deterministic order). + - Keep `resolve_input_device()` behavior: + - `None` => default input. + - explicit ID => selected if still input-capable. + - explicit invalid/missing ID => system default with `fallback_to_default = true`. + - Keep `start_recording_with_stream()` returning `InputDeviceSelection`. + +3. Wire app startup and picker state sync + - Refresh microphone list on startup and before user interaction via `refresh_input_devices()`. + - Keep `sync_input_device_name()` behavior so persisted names are repaired to current label when possible. + - Keep `selected_input_device_label()` as canonical display formatting. + +4. Implement picker UI behavior + - Keep `Refresh microphones` action tied to `refresh_input_devices()`. + - Keep combo options: + - `System default` mapped to `0`. + - discovered device ids mapped to list entries. + - On selection change: + - write `config.audio.input_device_id`. + - write `config.audio.input_device_name` for UI readability. + - call `persist_config()`. + +5. 
Apply startup-time fallback into recording flow + - In `start_recording()`, pass optional configured id using `configured_input_device_id()`. + - When `InputDeviceSelection` reports fallback: + - prepend fallback notice in status text. + - keep recorder + realtime session on fallback path. + - Continue to proceed with Pass2/Pass3 flow once recorder starts. + +6. Preserve diagnostics and constraints + - Keep user-facing status updates for: + - empty/failed refreshes, + - fallback-to-default behavior, + - non-macOS unsupported recording path. + - Keep behavior aligned with non-breaking UI contract and existing restart/reload behavior. + +## Validation scope (manual, no test rewrite) + +- Update docs/plans only; no code edits in this slice. +- Verify the two key names are referenced as `audio.input_device_name` and `audio.input_device_id`. diff --git a/docs/policy.md b/docs/policy.md index b7c66d6..6b6c0fd 100644 --- a/docs/policy.md +++ b/docs/policy.md @@ -8,12 +8,20 @@ The split below is by question type, not by reader type. ## Primary taxonomy +This repository standardizes on three primary documentation lanes: + | Lane | Location | Answers | Holds | | --- | --- | --- | --- | | Spec | `docs/spec/` | What must be true? | Contracts, schemas, invariants, required behavior | | Runbook | `docs/runbook/` | Which sequence should I execute? | Operational procedures, onboarding steps, validation flows, recovery steps | | Reference | `docs/reference/` | How is it currently organized or implemented? | Repository layout, surface maps, current implementation boundaries | -| Decisions | `docs/decisions/` | Why is it shaped this way? | Durable rationale, tradeoffs, and consequences | + +## Artifact lanes + +- `docs/plans/` is allowed for plan artifacts that are explicitly produced or consumed by + a planning workflow. +- `docs/plans/` is not a primary documentation lane and is not authoritative for runtime + behavior, repository policy, or operator procedures. 
## Placement rules @@ -21,30 +29,21 @@ The split below is by question type, not by reader type. - If a document defines operator actions, it belongs in `docs/runbook/`. - If a document describes current structure, ownership, or implementation boundaries, it belongs in `docs/reference/`. -- If a document records durable rationale or tradeoffs, it belongs in - `docs/decisions/`. -- If a document drifts across lanes, split it instead of stretching one file to answer - several question types. - Do not duplicate authoritative content across lanes. Link to the source of truth. -- Do not add `docs/plans/` back. Transient planning artifacts are not part of the - durable docs tree in this repository. ## Naming rules -- Directory names express document lane. +- Directory names express document type. - File names express stable topic. - Use lowercase kebab-case for document file names. -- Keep primary-lane file names short and topic-first. -- Do not encode temporary versions such as `v1`, `draft2`, or dates into primary-lane - file names. +- Do not encode temporary versions such as `v0`, `v1`, or `draft2` into stable file + names. - Do not repeat the directory class in the file name when the topic is already clear. Prefer `runtime.md` under `docs/spec/` over `runtime-spec.md`. -- Prefer names like `runtime.md`, `first-run.md`, and `repository-layout.md`. -- Keep `index.md` reserved for lane routers. ## Document headers -Every primary-lane document should start with a short routing header. +Every document should start with a short routing header. 
Spec header: @@ -58,9 +57,9 @@ Runbook header: - `Goal` - `Read this when` -- `Inputs` or `Preconditions` +- `Preconditions` or `Inputs` - `Depends on` -- `Outputs` or `Verification` +- `Verification` or `Outputs` Reference header: @@ -69,21 +68,12 @@ Reference header: - `Not this document` - `Covers` -Decision header: - -- `Status` -- `Date` -- `Question` -- `Decision` -- `Consequences` - ## Canonical entry points - Unified router: `docs/index.md` - Normative router: `docs/spec/index.md` - Procedural router: `docs/runbook/index.md` - Current-state router: `docs/reference/index.md` -- Rationale router: `docs/decisions/index.md` - Repo task and automation entrypoints: `Makefile.toml` ## Update workflow @@ -91,6 +81,5 @@ Decision header: - Behavior or schema change: update the relevant spec. - Procedure change: update the relevant runbook. - Structural or ownership change: update the relevant reference doc. -- Tradeoff or rationale change: update the relevant decision doc. -- If a document starts carrying normative content from another lane, move that content - into the authoritative lane and link to it. +- If a document drifts across lanes, split it instead of stretching one document to do + several jobs. diff --git a/docs/reference/index.md b/docs/reference/index.md index 81cddb3..d78da16 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -15,9 +15,9 @@ Question this index answers: "how is it currently organized or implemented?" - You need a normative contract. - You need an execution sequence or operator runbook. -- You need durable design rationale rather than current-state description. +- You need saved plan artifacts. ## Current reference docs -- [`repository-layout.md`](./repository-layout.md) for the repository surface map and +- [`workspace-layout.md`](./workspace-layout.md) for the repository surface map and directory ownership boundaries. 
diff --git a/docs/reference/repository-layout.md b/docs/reference/workspace-layout.md similarity index 70% rename from docs/reference/repository-layout.md rename to docs/reference/workspace-layout.md index cd517ad..c03557c 100644 --- a/docs/reference/repository-layout.md +++ b/docs/reference/workspace-layout.md @@ -1,4 +1,4 @@ -# Repository Layout +# Workspace Layout Purpose: Describe the current top-level repository surfaces and which concerns each one owns. @@ -6,10 +6,9 @@ owns. Read this when: You need to know where the app entrypoint, shared packages, repo task definitions, or documentation topics currently live. -Not this document: The normative runtime contract, the first-run operator sequence, or -the design rationale behind specific product choices. +Not this document: The normative runtime contract or the onboarding sequence. -Covers: The repository surface map, ownership boundaries, and the role of `apps/`, +Covers: The workspace surface map, ownership boundaries, and the role of `apps/`, `packages/`, `docs/`, `scripts/`, and repository root policy files. ## Top-level surfaces @@ -23,7 +22,7 @@ Covers: The repository surface map, ownership boundaries, and the role of `apps/ - `docs/spec/` holds normative runtime and behavior contracts. - `docs/runbook/` holds operator procedures such as onboarding and validation flows. - `docs/reference/` holds current repository and implementation surface maps. -- `docs/decisions/` holds durable rationale and tradeoffs behind current design choices. +- `docs/plans/` holds saved plan artifacts rather than governing repository policy. - `Makefile.toml` holds repo-native task names for lint, test, format, and checks. - `scripts/` holds repository helper scripts such as local macOS packaging helpers. - `.github/workflows/` holds CI and release automation. 
@@ -32,8 +31,8 @@ Covers: The repository surface map, ownership boundaries, and the role of `apps/ - Runtime authority stays in the application and package crates plus the governing specs under `docs/spec/`. -- `docs/runbook/`, `docs/reference/`, and `docs/decisions/` must not override runtime - or configuration authority. +- `docs/runbook/` and `docs/reference/` must not override runtime or configuration + authority. - `Makefile.toml` is the source of truth for named repository tasks. -- Decision docs explain why the system is shaped a certain way; the spec still defines - what must be true at runtime. +- `docs/plans/` can capture design or execution artifacts, but those files do not become + policy until their conclusions are promoted into spec, runbook, or reference docs. diff --git a/docs/runbook/first-run.md b/docs/runbook/first-run-onboarding.md similarity index 98% rename from docs/runbook/first-run.md rename to docs/runbook/first-run-onboarding.md index 4382e42..e41e2c3 100644 --- a/docs/runbook/first-run.md +++ b/docs/runbook/first-run-onboarding.md @@ -1,4 +1,4 @@ -# First Run +# First-Run Onboarding Goal: Bring a fresh macOS Voxit install to the point where sign-in, permissions, and paste work end to end. @@ -69,7 +69,7 @@ $HOME/Library/Application Support/voxit/config.toml ## 6. Failure handling -- If sign-in stalls, reopen the auth surface and retry with the visible-window path. +- If sign-in stalls, reopen the auth surface and retry with the visible window path. - If a permission does not update, grant it in macOS System Settings and then re-check from Voxit. - If paste fails, verify Accessibility and Input Monitoring first before debugging the diff --git a/docs/runbook/index.md b/docs/runbook/index.md index b6412ef..67afab3 100644 --- a/docs/runbook/index.md +++ b/docs/runbook/index.md @@ -6,16 +6,16 @@ Question this index answers: "which sequence should I execute?" 
## Use this index when -- You need a runbook, how-to, validation flow, troubleshooting path, or maintenance - procedure. +- You need a runbook, how-to, migration sequence, validation flow, troubleshooting path, + or maintenance procedure. - You already know the relevant spec and need the operational steps. - You need explicit prerequisites, commands, checkpoints, or verification. ## Do not use this index when - You need the authoritative contract, schema, or invariant. -- You need the current repository layout or implementation boundaries. -- You need durable rationale rather than operator steps. +- You need current repository layout or implementation boundaries. +- You need durable design rationale rather than operator steps. ## What belongs in `docs/runbook/` @@ -26,5 +26,5 @@ Question this index answers: "which sequence should I execute?" ## Current runbooks -- [`first-run.md`](./first-run.md) for first sign-in, permission grants, and paste-path - verification on macOS. +- [`first-run-onboarding.md`](./first-run-onboarding.md) for first sign-in, permission + grant, and paste-path verification on macOS. diff --git a/docs/spec/index.md b/docs/spec/index.md index 60156d0..032c4b4 100644 --- a/docs/spec/index.md +++ b/docs/spec/index.md @@ -9,14 +9,13 @@ Question this index answers: "what must remain true?" - You need an invariant, contract, schema, enum, state model, interface, or required behavior. - You are deciding whether code or data is correct. -- A runbook says "see the governing spec" and you need the authoritative source. +- A guide says "see the governing spec" and you need the authoritative source. ## Do not use this index when - You need step-by-step instructions, maintenance actions, migrations, or incident response. -- You need durable rationale rather than the final contract; read - `docs/decisions/index.md`. +- You need a planning-tool artifact or a saved execution plan under `docs/plans/`. 
- You want rationale only, without an authoritative contract. - You need current layout or implementation boundaries; read `docs/reference/index.md`. @@ -43,7 +42,7 @@ Then keep the body explicit: - Separate facts from rationale. - Include canonical names exactly as code or data uses them. - Include a small example when it removes ambiguity. -- Link to related runbooks instead of embedding procedures. +- Link to related guides instead of embedding procedures. ## Structure policy diff --git a/docs/spec/runtime.md b/docs/spec/runtime.md index 368b6ef..a72133a 100644 --- a/docs/spec/runtime.md +++ b/docs/spec/runtime.md @@ -9,8 +9,8 @@ Read this when: You need the authoritative contract for Voxit runtime behavior, transitions, authentication, audio capture, paste flow, configuration keys, or release scope. -Not this document: Step-by-step operational guidance, design rationale, or workflow -instructions. +Not this document: Step-by-step operational guidance, planning artifacts, exploratory +design notes, or workflow instructions. Defines: @@ -64,8 +64,8 @@ State transitions: - preferred: keyring - fallback: local `auth.json` - On startup: - - read status as "signed in" when unexpired token or session metadata exists - - otherwise show "Not signed in." + - read status as “signed in” when unexpired token or session metadata exists + - otherwise show “Not signed in.” ## 4) Audio Capture and Streaming Contract @@ -175,7 +175,7 @@ State transitions: - Input Monitoring: system prompt request plus re-check - Grant each permission in macOS Privacy & Security settings when prompted, then re-check in Voxit before continuing. -- "Paste raw now" is always available when finalization or rewrite is active and should +- “Paste raw now” is always available when finalization or rewrite is active and should bypass Pass3. 
## 10) Configuration Contract From 90e8249e68cb17e1556be90498e9157f4781abbf Mon Sep 17 00:00:00 2001 From: Yvette Carlisle Date: Tue, 5 May 2026 11:25:30 +0800 Subject: [PATCH 3/3] {"schema":"maestro/commit/1","summary":"restore docs taxonomy normalization","authority":"manual"} --- README.md | 8 +-- docs/decisions/index.md | 28 ++++++++++ docs/index.md | 15 +++-- docs/plans/.gitkeep | 0 ...-01-audio-input-device-selection-design.md | 51 ----------------- ...2026-03-01-audio-input-device-selection.md | 56 ------------------- docs/policy.md | 45 +++++++++------ docs/reference/index.md | 4 +- ...rkspace-layout.md => repository-layout.md} | 17 +++--- .../{first-run-onboarding.md => first-run.md} | 4 +- docs/runbook/index.md | 12 ++-- docs/spec/index.md | 7 ++- docs/spec/runtime.md | 10 ++-- 13 files changed, 95 insertions(+), 162 deletions(-) create mode 100644 docs/decisions/index.md delete mode 100644 docs/plans/.gitkeep delete mode 100644 docs/plans/2026-03-01-audio-input-device-selection-design.md delete mode 100644 docs/plans/2026-03-01-audio-input-device-selection.md rename docs/reference/{workspace-layout.md => repository-layout.md} (70%) rename docs/runbook/{first-run-onboarding.md => first-run.md} (98%) diff --git a/README.md b/README.md index 416ccc6..4bee064 100644 --- a/README.md +++ b/README.md @@ -136,8 +136,7 @@ First-run onboarding checklist: - Voxit uses request buttons to guide you through the permission prompts in sequence (Microphone → Accessibility → Input Monitoring); grant each permission and re-check when prompted. - Verify paste flow after permission grant and restart the app if needed. -For the full guided sequence, see -[First-Run Onboarding](docs/runbook/first-run-onboarding.md). +For the full guided sequence, see [First Run](docs/runbook/first-run.md). The app saves updates to the same `config.toml` path when settings are changed. @@ -173,9 +172,10 @@ The app saves updates to the same `config.toml` path when settings are changed. 
### Docs -- [Documentation Index](docs/index.md) routes to spec, runbook, and reference docs. +- [Documentation Index](docs/index.md) routes to spec, runbook, reference, and decision docs. - [Runtime Spec](docs/spec/runtime.md) is the normative runtime contract. -- [Workspace Layout](docs/reference/workspace-layout.md) maps the current repo surfaces. +- [First Run](docs/runbook/first-run.md) covers sign-in, permission grants, and paste validation. +- [Repository Layout](docs/reference/repository-layout.md) maps the current repo surfaces. ## Support Me diff --git a/docs/decisions/index.md b/docs/decisions/index.md new file mode 100644 index 0000000..3bd3dcc --- /dev/null +++ b/docs/decisions/index.md @@ -0,0 +1,28 @@ +# Decision Index + +Purpose: Route agents to durable rationale documents that explain why the repository is +shaped the way it is. + +Question this index answers: "why is it shaped this way?" + +## Use this index when + +- You need the accepted tradeoff behind a current behavior or UX choice. +- You need consequences and rationale that should survive implementation churn. +- You need to understand why a contract or surface exists before changing it. + +## Do not use this index when + +- You need the final runtime contract or schema. +- You need an operator procedure or validation sequence. +- You need a current-state layout map rather than design rationale. + +## What belongs in `docs/decisions/` + +- Durable rationale and tradeoffs. +- Consequences that affect future changes. +- Accepted choices that shape the governing spec or current repository structure. + +## Current decisions + +- No accepted decision records are currently checked in under this lane. diff --git a/docs/index.md b/docs/index.md index 998ffdb..a1b4096 100644 --- a/docs/index.md +++ b/docs/index.md @@ -9,17 +9,15 @@ workflows. 
The split below is by question type, not by human-versus-agent audien - Read `README.md` first when you need the repository scope, platform target, or top-level runtime summary. -- Use `cargo make` whenever an equivalent repo task exists. When task details matter, - inspect `Makefile.toml` directly. - Read `docs/policy.md` for document contracts, placement rules, and naming rules. -- Read `Makefile.toml` when the task depends on repo task names or execution entrypoints. +- Use `cargo make` whenever an equivalent repo task exists. Inspect `Makefile.toml` + directly when task names or execution entrypoints matter. - Then choose one primary lane: - `docs/spec/index.md` when the question is "what must be true?" - `docs/runbook/index.md` when the question is "which sequence should I execute?" - `docs/reference/index.md` when the question is "how is it currently organized or implemented?" -- Use `docs/plans/` only when a planning tool or execution workflow explicitly points to - a saved plan artifact there. + - `docs/decisions/index.md` when the question is "why is it shaped this way?" ## Routing matrix @@ -29,16 +27,17 @@ workflows. The split below is by question type, not by human-versus-agent audien -> `docs/runbook/` - Need current repository layout, ownership boundaries, or implementation surface maps -> `docs/reference/` +- Need durable rationale, tradeoffs, or historical consequences -> `docs/decisions/` - Need repo task names or automation entrypoints -> `Makefile.toml` - Need documentation placement or authoring rules -> `docs/policy.md` -- Need a planning-tool artifact or saved execution plan -> `docs/plans/` ## Retrieval rules - Optimize for agent routing and execution, not narrative flow. - Keep one authoritative document per topic. Link instead of copying. -- Keep runtime authority explicit: application and package crates plus `docs/spec/` - outrank runbook, reference, and plan artifacts. +- Runtime and behavior authority lives in code plus `docs/spec/`. 
Runbook, reference, + and decision docs explain usage, current state, and rationale, but do not override the + governing spec. - Start each document with a short routing header that says what the document is for, when to read it, and what it does not cover. - Keep links explicit and stable. diff --git a/docs/plans/.gitkeep b/docs/plans/.gitkeep deleted file mode 100644 index e69de29..0000000 diff --git a/docs/plans/2026-03-01-audio-input-device-selection-design.md b/docs/plans/2026-03-01-audio-input-device-selection-design.md deleted file mode 100644 index ed15d0a..0000000 --- a/docs/plans/2026-03-01-audio-input-device-selection-design.md +++ /dev/null @@ -1,51 +0,0 @@ -# Audio Input Device Selection Design - -## Scope - -Document the implemented behavior for choosing and persisting the microphone used for recording, including fallback and UX constraints. - -## UX - -- The runtime control panel shows a microphone section with: - - A **Refresh microphones** button to re-enumerate available input devices. - - A **Input device** combo box rendered from discovered devices and a **System default** option. -- Combo text conventions: - - `System default` corresponds to no explicit device override. - - Discovered item labels follow `name (id)`. - - If the selected ID is no longer in the list, the fallback label uses `Device #` or the persisted `audio.input_device_name`. -- Changing selection updates config immediately and persists it. -- Recording status should expose fallback when it happens: - - e.g., `Selected microphone unavailable. Falling back to default: .` - -## Config contract - -- Keys under `[audio]`: - - `audio.input_device_id` (number, `0` => use system default). - - `audio.input_device_name` (string, best-effort human-readable label). -- Default state: - - `audio.input_device_id = 0`. - - `audio.input_device_name = ""`. -- Persistence: - - Both keys are serialized in config writes. - - On load, missing/invalid keys fall back to defaults. 
-- Resolution rules: - - If `audio.input_device_id == 0`, recording uses the platform default microphone. - - If non-zero, app attempts that ID. - -## Fallback and constraints - -- If configured `audio.input_device_id` is invalid, disconnected, or lacks input scope at session start: - - selection falls back to default input device. - - recording proceeds with `fallback_to_default = true`. - - status/logging reports the fallback. -- If the device enumeration call fails or returns empty: - - combo still supports **System default** path. - - no devices can be shown/selected from the list. -- Non-macOS paths currently do not support mic capture and are not in-scope for picker functionality. - -## Acceptance criteria - -- Picker always presents **System default** and any available input-capable device list. -- Selection persists and survives restart via `audio.input_device_name` + `audio.input_device_id`. -- Session start is deterministic when configured devices are unavailable. -- Fallback behavior is transparent in status/log output. diff --git a/docs/plans/2026-03-01-audio-input-device-selection.md b/docs/plans/2026-03-01-audio-input-device-selection.md deleted file mode 100644 index 12ddd59..0000000 --- a/docs/plans/2026-03-01-audio-input-device-selection.md +++ /dev/null @@ -1,56 +0,0 @@ -# Audio Input Device Selection Implementation Plan - -## Goal - -Deliver and document the implemented microphone picker behavior, aligned to current code paths and config contract. - -## High-level execution steps - -1. Confirm configuration parsing and serialization for audio keys - - Ensure `audio.input_device_id` and `audio.input_device_name` are preserved in `AudioConfig`. - - Keep defaults as: - - `input_device_id = 0` - - `input_device_name = ""` - - Keep parse/serialize behavior unchanged except for explicit persistence of these keys. - -2. 
Confirm audio module selection path - - Keep `list_input_devices()` returning all input-capable devices (sorted for deterministic order). - - Keep `resolve_input_device()` behavior: - - `None` => default input. - - explicit ID => selected if still input-capable. - - explicit invalid/missing ID => system default with `fallback_to_default = true`. - - Keep `start_recording_with_stream()` returning `InputDeviceSelection`. - -3. Wire app startup and picker state sync - - Refresh microphone list on startup and before user interaction via `refresh_input_devices()`. - - Keep `sync_input_device_name()` behavior so persisted names are repaired to current label when possible. - - Keep `selected_input_device_label()` as canonical display formatting. - -4. Implement picker UI behavior - - Keep `Refresh microphones` action tied to `refresh_input_devices()`. - - Keep combo options: - - `System default` mapped to `0`. - - discovered device ids mapped to list entries. - - On selection change: - - write `config.audio.input_device_id`. - - write `config.audio.input_device_name` for UI readability. - - call `persist_config()`. - -5. Apply startup-time fallback into recording flow - - In `start_recording()`, pass optional configured id using `configured_input_device_id()`. - - When `InputDeviceSelection` reports fallback: - - prepend fallback notice in status text. - - keep recorder + realtime session on fallback path. - - Continue to proceed with Pass2/Pass3 flow once recorder starts. - -6. Preserve diagnostics and constraints - - Keep user-facing status updates for: - - empty/failed refreshes, - - fallback-to-default behavior, - - non-macOS unsupported recording path. - - Keep behavior aligned with non-breaking UI contract and existing restart/reload behavior. - -## Validation scope (manual, no test rewrite) - -- Update docs/plans only; no code edits in this slice. -- Verify the two key names are referenced as `audio.input_device_name` and `audio.input_device_id`. 
diff --git a/docs/policy.md b/docs/policy.md index 6b6c0fd..b7c66d6 100644 --- a/docs/policy.md +++ b/docs/policy.md @@ -8,20 +8,12 @@ The split below is by question type, not by reader type. ## Primary taxonomy -This repository standardizes on three primary documentation lanes: - | Lane | Location | Answers | Holds | | --- | --- | --- | --- | | Spec | `docs/spec/` | What must be true? | Contracts, schemas, invariants, required behavior | | Runbook | `docs/runbook/` | Which sequence should I execute? | Operational procedures, onboarding steps, validation flows, recovery steps | | Reference | `docs/reference/` | How is it currently organized or implemented? | Repository layout, surface maps, current implementation boundaries | - -## Artifact lanes - -- `docs/plans/` is allowed for plan artifacts that are explicitly produced or consumed by - a planning workflow. -- `docs/plans/` is not a primary documentation lane and is not authoritative for runtime - behavior, repository policy, or operator procedures. +| Decisions | `docs/decisions/` | Why is it shaped this way? | Durable rationale, tradeoffs, and consequences | ## Placement rules @@ -29,21 +21,30 @@ This repository standardizes on three primary documentation lanes: - If a document defines operator actions, it belongs in `docs/runbook/`. - If a document describes current structure, ownership, or implementation boundaries, it belongs in `docs/reference/`. +- If a document records durable rationale or tradeoffs, it belongs in + `docs/decisions/`. +- If a document drifts across lanes, split it instead of stretching one file to answer + several question types. - Do not duplicate authoritative content across lanes. Link to the source of truth. +- Do not add `docs/plans/` back. Transient planning artifacts are not part of the + durable docs tree in this repository. ## Naming rules -- Directory names express document type. +- Directory names express document lane. - File names express stable topic. 
- Use lowercase kebab-case for document file names. -- Do not encode temporary versions such as `v0`, `v1`, or `draft2` into stable file - names. +- Keep primary-lane file names short and topic-first. +- Do not encode temporary versions such as `v1`, `draft2`, or dates into primary-lane + file names. - Do not repeat the directory class in the file name when the topic is already clear. Prefer `runtime.md` under `docs/spec/` over `runtime-spec.md`. +- Prefer names like `runtime.md`, `first-run.md`, and `repository-layout.md`. +- Keep `index.md` reserved for lane routers. ## Document headers -Every document should start with a short routing header. +Every primary-lane document should start with a short routing header. Spec header: @@ -57,9 +58,9 @@ Runbook header: - `Goal` - `Read this when` -- `Preconditions` or `Inputs` +- `Inputs` or `Preconditions` - `Depends on` -- `Verification` or `Outputs` +- `Outputs` or `Verification` Reference header: @@ -68,12 +69,21 @@ Reference header: - `Not this document` - `Covers` +Decision header: + +- `Status` +- `Date` +- `Question` +- `Decision` +- `Consequences` + ## Canonical entry points - Unified router: `docs/index.md` - Normative router: `docs/spec/index.md` - Procedural router: `docs/runbook/index.md` - Current-state router: `docs/reference/index.md` +- Rationale router: `docs/decisions/index.md` - Repo task and automation entrypoints: `Makefile.toml` ## Update workflow @@ -81,5 +91,6 @@ Reference header: - Behavior or schema change: update the relevant spec. - Procedure change: update the relevant runbook. - Structural or ownership change: update the relevant reference doc. -- If a document drifts across lanes, split it instead of stretching one document to do - several jobs. +- Tradeoff or rationale change: update the relevant decision doc. +- If a document starts carrying normative content from another lane, move that content + into the authoritative lane and link to it. 
diff --git a/docs/reference/index.md b/docs/reference/index.md index d78da16..81cddb3 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -15,9 +15,9 @@ Question this index answers: "how is it currently organized or implemented?" - You need a normative contract. - You need an execution sequence or operator runbook. -- You need saved plan artifacts. +- You need durable design rationale rather than current-state description. ## Current reference docs -- [`workspace-layout.md`](./workspace-layout.md) for the repository surface map and +- [`repository-layout.md`](./repository-layout.md) for the repository surface map and directory ownership boundaries. diff --git a/docs/reference/workspace-layout.md b/docs/reference/repository-layout.md similarity index 70% rename from docs/reference/workspace-layout.md rename to docs/reference/repository-layout.md index c03557c..cd517ad 100644 --- a/docs/reference/workspace-layout.md +++ b/docs/reference/repository-layout.md @@ -1,4 +1,4 @@ -# Workspace Layout +# Repository Layout Purpose: Describe the current top-level repository surfaces and which concerns each one owns. @@ -6,9 +6,10 @@ owns. Read this when: You need to know where the app entrypoint, shared packages, repo task definitions, or documentation topics currently live. -Not this document: The normative runtime contract or the onboarding sequence. +Not this document: The normative runtime contract, the first-run operator sequence, or +the design rationale behind specific product choices. -Covers: The workspace surface map, ownership boundaries, and the role of `apps/`, +Covers: The repository surface map, ownership boundaries, and the role of `apps/`, `packages/`, `docs/`, `scripts/`, and repository root policy files. ## Top-level surfaces @@ -22,7 +23,7 @@ Covers: The workspace surface map, ownership boundaries, and the role of `apps/` - `docs/spec/` holds normative runtime and behavior contracts. 
- `docs/runbook/` holds operator procedures such as onboarding and validation flows. - `docs/reference/` holds current repository and implementation surface maps. -- `docs/plans/` holds saved plan artifacts rather than governing repository policy. +- `docs/decisions/` holds durable rationale and tradeoffs behind current design choices. - `Makefile.toml` holds repo-native task names for lint, test, format, and checks. - `scripts/` holds repository helper scripts such as local macOS packaging helpers. - `.github/workflows/` holds CI and release automation. @@ -31,8 +32,8 @@ Covers: The workspace surface map, ownership boundaries, and the role of `apps/` - Runtime authority stays in the application and package crates plus the governing specs under `docs/spec/`. -- `docs/runbook/` and `docs/reference/` must not override runtime or configuration - authority. +- `docs/runbook/`, `docs/reference/`, and `docs/decisions/` must not override runtime + or configuration authority. - `Makefile.toml` is the source of truth for named repository tasks. -- `docs/plans/` can capture design or execution artifacts, but those files do not become - policy until their conclusions are promoted into spec, runbook, or reference docs. +- Decision docs explain why the system is shaped a certain way; the spec still defines + what must be true at runtime. diff --git a/docs/runbook/first-run-onboarding.md b/docs/runbook/first-run.md similarity index 98% rename from docs/runbook/first-run-onboarding.md rename to docs/runbook/first-run.md index e41e2c3..4382e42 100644 --- a/docs/runbook/first-run-onboarding.md +++ b/docs/runbook/first-run.md @@ -1,4 +1,4 @@ -# First-Run Onboarding +# First Run Goal: Bring a fresh macOS Voxit install to the point where sign-in, permissions, and paste work end to end. @@ -69,7 +69,7 @@ $HOME/Library/Application Support/voxit/config.toml ## 6. Failure handling -- If sign-in stalls, reopen the auth surface and retry with the visible window path. 
+- If sign-in stalls, reopen the auth surface and retry with the visible-window path. - If a permission does not update, grant it in macOS System Settings and then re-check from Voxit. - If paste fails, verify Accessibility and Input Monitoring first before debugging the diff --git a/docs/runbook/index.md b/docs/runbook/index.md index 67afab3..b6412ef 100644 --- a/docs/runbook/index.md +++ b/docs/runbook/index.md @@ -6,16 +6,16 @@ Question this index answers: "which sequence should I execute?" ## Use this index when -- You need a runbook, how-to, migration sequence, validation flow, troubleshooting path, - or maintenance procedure. +- You need a runbook, how-to, validation flow, troubleshooting path, or maintenance + procedure. - You already know the relevant spec and need the operational steps. - You need explicit prerequisites, commands, checkpoints, or verification. ## Do not use this index when - You need the authoritative contract, schema, or invariant. -- You need current repository layout or implementation boundaries. -- You need durable design rationale rather than operator steps. +- You need the current repository layout or implementation boundaries. +- You need durable rationale rather than operator steps. ## What belongs in `docs/runbook/` @@ -26,5 +26,5 @@ Question this index answers: "which sequence should I execute?" ## Current runbooks -- [`first-run-onboarding.md`](./first-run-onboarding.md) for first sign-in, permission - grant, and paste-path verification on macOS. +- [`first-run.md`](./first-run.md) for first sign-in, permission grants, and paste-path + verification on macOS. diff --git a/docs/spec/index.md b/docs/spec/index.md index 032c4b4..60156d0 100644 --- a/docs/spec/index.md +++ b/docs/spec/index.md @@ -9,13 +9,14 @@ Question this index answers: "what must remain true?" - You need an invariant, contract, schema, enum, state model, interface, or required behavior. - You are deciding whether code or data is correct. 
-- A guide says "see the governing spec" and you need the authoritative source. +- A runbook says "see the governing spec" and you need the authoritative source. ## Do not use this index when - You need step-by-step instructions, maintenance actions, migrations, or incident response. -- You need a planning-tool artifact or a saved execution plan under `docs/plans/`. +- You need durable rationale rather than the final contract; read + `docs/decisions/index.md`. - You want rationale only, without an authoritative contract. - You need current layout or implementation boundaries; read `docs/reference/index.md`. @@ -42,7 +43,7 @@ Then keep the body explicit: - Separate facts from rationale. - Include canonical names exactly as code or data uses them. - Include a small example when it removes ambiguity. -- Link to related guides instead of embedding procedures. +- Link to related runbooks instead of embedding procedures. ## Structure policy diff --git a/docs/spec/runtime.md b/docs/spec/runtime.md index a72133a..368b6ef 100644 --- a/docs/spec/runtime.md +++ b/docs/spec/runtime.md @@ -9,8 +9,8 @@ Read this when: You need the authoritative contract for Voxit runtime behavior, transitions, authentication, audio capture, paste flow, configuration keys, or release scope. -Not this document: Step-by-step operational guidance, planning artifacts, exploratory -design notes, or workflow instructions. +Not this document: Step-by-step operational guidance, design rationale, or workflow +instructions. Defines: @@ -64,8 +64,8 @@ State transitions: - preferred: keyring - fallback: local `auth.json` - On startup: - - read status as “signed in” when unexpired token or session metadata exists - - otherwise show “Not signed in.” + - read status as "signed in" when unexpired token or session metadata exists + - otherwise show "Not signed in." 
## 4) Audio Capture and Streaming Contract @@ -175,7 +175,7 @@ State transitions: - Input Monitoring: system prompt request plus re-check - Grant each permission in macOS Privacy & Security settings when prompted, then re-check in Voxit before continuing. -- “Paste raw now” is always available when finalization or rewrite is active and should +- "Paste raw now" is always available when finalization or rewrite is active and should bypass Pass3. ## 10) Configuration Contract