[components] VoiceRecorder + Whisper transcribe dev harness (refs #4) by d3v07 · Pull Request #18 · d3v07/cbc

d3v07 · 2026-04-26T16:41:06Z

Summary

First slice of d3v07's deferred client deliverables for issue #4. Pure browser, no model dependency for the recorder itself; the dev page also exercises the existing /api/transcribe route end-to-end.

What's in

- components/VoiceRecorder.tsx: MediaRecorder + canvas waveform; states idle/recording/done/error; clean unmount; specific copy for permission-denied and unsupported-MIME paths
- lib/audio/recorder-mime.ts: priority list (webm;opus → webm → mp4;aac → mp4) with Safari fallback
- app/dev/voice/page.tsx: dev harness mounting the recorder + audio playback + Whisper "Transcribe" button
- vitest.config.ts: @vitejs/plugin-react + environmentMatchGlobs so tests/components/** runs under happy-dom while route tests stay on node
- DevDeps: @testing-library/react, @testing-library/dom, happy-dom, @vitejs/plugin-react@^4
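
For readers who have not seen the helper, the MIME priority list above might be probed roughly as follows. This is a minimal sketch, not the contents of lib/audio/recorder-mime.ts: the exact MIME strings, the name `pickRecorderMime`, and the choice to return `undefined` as the last-resort fallback are all assumptions. The support check is passed in so the function can run outside a browser.

```typescript
// Candidate MIME types in the priority order named in the PR
// (webm;opus -> webm -> mp4;aac -> mp4). The exact strings are assumed.
const RECORDER_MIME_PRIORITY: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4;codecs=mp4a.40.2", // AAC-LC in an MP4 container
  "audio/mp4",
];

type IsSupported = (mime: string) => boolean;

// Returns the first candidate the probe accepts, or undefined so the caller
// can construct the recorder without an explicit mimeType (assumed fallback).
function pickRecorderMime(isSupported: IsSupported): string | undefined {
  for (const mime of RECORDER_MIME_PRIORITY) {
    try {
      if (isSupported(mime)) return mime;
    } catch {
      // A throwing probe is treated the same as "unsupported".
    }
  }
  return undefined;
}
```

In a browser the probe would typically be `(m) => MediaRecorder.isTypeSupported(m)`.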

Verification

Manual: visited /dev/voice in Chrome, recorded ~5s of speech, Whisper returned the correct transcript with segment-level timestamps. Live verified on localhost:3000.

Gates

Gate	Result
`tsc --noEmit`	clean
`vitest`	67/67 (5 new component tests + 6 new MIME tests + 56 carryover)
`next lint`	clean
`next build`	`/dev/voice` registered (2.13 kB)

Out of scope (next slices)

- ReelRenderer (canvas 1080×1920 @ 30fps + Web Animations API)
- ffmpeg.wasm WebM → MP4 transcode wrapper
- /dev/reel integration harness for the full pipeline

Notes

- app/dev/* routes are dev-only; can be guarded with NEXT_PUBLIC_ENABLE_DEMO=1 later if we want to block them in production builds.
- VoiceRecorder is provider-agnostic: it just hands a Blob to its caller; downstream is whoever consumes that blob (transcribe + curator path).

- components/VoiceRecorder.tsx: MediaRecorder + canvas waveform; states idle/recording/done/error; props { onComplete(blob, durationMs), maxDurationMs }; cleans up MediaStream + AudioContext on unmount; specific copy for permission-denied and unsupported-MIME paths.
- lib/audio/recorder-mime.ts: priority list (webm;opus -> webm -> mp4;aac -> mp4) with MediaRecorder.isTypeSupported probing + Safari fallback.
- app/dev/voice/page.tsx: dev-only harness mounting the recorder, an audio playback element, and a Whisper "Transcribe" button that exercises /api/transcribe end-to-end on the captured blob.
- vitest.config.ts: adds @vitejs/plugin-react and environmentMatchGlobs so tests/components/** runs under happy-dom while route tests stay on node.
- 11 new tests across the slice; all 67 in suite pass.

Refs #4 (audio-video). UI for the photo-reader/curator and ffmpeg.wasm transcode land in follow-up slices.

Copilot

Pull request overview

Adds an in-browser voice recording component plus a dev harness page that records audio and exercises the existing /api/transcribe endpoint end-to-end, along with Vitest configuration updates to support React component tests under happy-dom.

Changes:

- Introduces VoiceRecorder (MediaRecorder + canvas waveform) and a MIME selection helper for browser compatibility.
- Adds /dev/voice dev harness to record, playback, and send audio to /api/transcribe.
- Extends Vitest setup for .tsx tests, React plugin support, and mixed test environments (node vs happy-dom).

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.


File	Description
`components/VoiceRecorder.tsx`	New client component for recording audio, rendering a live waveform, and emitting a `Blob` + duration.
`lib/audio/recorder-mime.ts`	New helper to pick the best supported recorder MIME type with Safari-friendly fallbacks.
`app/dev/voice/page.tsx`	New dev-only UI to record audio, playback via blob URL, and call `/api/transcribe`.
`vitest.config.ts`	Enables React plugin, adds `.test.tsx`, and maps component tests to `happy-dom`.
`tests/components/voice-recorder.test.tsx`	Adds component tests with stubs for missing media/audio Web APIs.
`tests/lib/audio/recorder-mime.test.ts`	Adds unit tests for MIME selection fallthrough behavior.
`package.json`	Adds testing/dev dependencies for React component testing and `happy-dom`.
`package-lock.json`	Locks added dependency tree for new dev dependencies.


Copilot · 2026-04-26T16:44:34Z

+          onComplete={(blob, durationMs) => {
+            setLast({
+              blob,
+              size: blob.size,
+              durationMs,
+              mime: blob.type || "unknown",
+              url: URL.createObjectURL(blob),
+            });


URL.createObjectURL(blob) is created for each completed recording but never revoked. Re-recording multiple times (or navigating away) will leak object URLs in the tab. Track the previous URL and call URL.revokeObjectURL() when replacing last and in an unmount cleanup effect.
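
One way to act on this comment is to keep at most one live object URL and revoke it whenever it is replaced or the page unmounts. The sketch below is an illustration only: `ObjectUrlSlot` and `ObjectUrlApi` are hypothetical names, and the URL API is injected so the logic can be exercised without a browser.

```typescript
// The subset of the browser URL API this helper needs. In the page, the
// global `URL` object would be passed in (assumption about how it is wired).
interface ObjectUrlApi<T> {
  createObjectURL(source: T): string;
  revokeObjectURL(url: string): void;
}

// Holds at most one object URL and revokes the previous one on replacement.
class ObjectUrlSlot<T> {
  private current: string | null = null;
  private readonly api: ObjectUrlApi<T>;

  constructor(api: ObjectUrlApi<T>) {
    this.api = api;
  }

  // Revoke the old URL (if any) and create a new one for `source`.
  replace(source: T): string {
    this.release();
    this.current = this.api.createObjectURL(source);
    return this.current;
  }

  // Safe to call repeatedly, for example from an unmount cleanup.
  release(): void {
    if (this.current !== null) {
      this.api.revokeObjectURL(this.current);
      this.current = null;
    }
  }
}
```

In the harness, `onComplete` could call `slot.replace(blob)` in place of the bare `URL.createObjectURL(blob)`, and a `useEffect` cleanup could call `slot.release()`.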

Copilot · 2026-04-26T16:44:34Z

+    <main style={{ maxWidth: 720, margin: "40px auto", padding: 24, fontFamily: "system-ui" }}>
+      <h1 style={{ marginBottom: 4 }}>VoiceRecorder · /dev/voice</h1>
+      <p style={{ color: "#666", marginTop: 0, fontSize: 14 }}>
+        Dev harness. Grant microphone, record, stop, retake, optionally transcribe.
+      </p>


This /dev/voice page makes it trivial for any visitor in a deployed environment to hit /api/transcribe from the browser, which can translate into unbounded third‑party API usage/cost if the app is publicly accessible. If this route is intended to be dev-only, consider gating it behind an environment flag (e.g., NEXT_PUBLIC_ENABLE_DEMO) or otherwise preventing it from being served in production builds.
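
A gate along the lines suggested here could be a small predicate over the environment. This is a sketch under assumptions: the function name `isDevHarnessEnabled` is hypothetical, and treating `NEXT_PUBLIC_ENABLE_DEMO=1` as an explicit opt-in for production builds is one possible reading of the PR notes, not a confirmed design.

```typescript
type Env = Record<string, string | undefined>;

// True outside production, or in production when the demo flag is set to "1".
// The flag name comes from the PR notes; the opt-in semantics are assumed.
function isDevHarnessEnabled(env: Env): boolean {
  if (env.NODE_ENV !== "production") return true;
  return env.NEXT_PUBLIC_ENABLE_DEMO === "1";
}
```

A server component for /dev/voice could call `notFound()` from `next/navigation` when this returns false. Note that `NEXT_PUBLIC_*` values are inlined at build time, and that gating the page alone does not stop direct requests to /api/transcribe.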

Copilot · 2026-04-26T16:44:35Z

+      const fd = new FormData();
+      fd.set("file", last.blob, "voice.webm");
+      const res = await fetch("/api/transcribe", { method: "POST", body: fd });


The uploaded filename is hard-coded as voice.webm, but the recorder may produce audio/mp4 (Safari fallback) or other types. Even if the backend primarily checks file.type, a mismatched extension can make debugging harder and may affect downstream services that infer format from the filename. Consider deriving the filename/extension from last.mime (or using a generic name without a misleading extension).
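
Deriving the upload name from the recorded MIME type could look like the sketch below. The mapping table and the name `uploadFilenameForMime` are illustrative assumptions; unknown types fall back to a name with no extension, as the comment suggests.

```typescript
// Container type -> file extension. Only types the recorder is likely to
// produce are listed; this table is an assumption, not taken from the PR.
const EXTENSION_BY_CONTAINER: Record<string, string> = {
  "audio/webm": "webm",
  "audio/mp4": "mp4",
  "audio/ogg": "ogg",
  "audio/wav": "wav",
};

// Strips codec parameters ("audio/webm;codecs=opus" -> "audio/webm") and
// returns "<base>.<ext>", or just "<base>" when the type is unrecognized.
function uploadFilenameForMime(mime: string, base: string = "voice"): string {
  const container = (mime.split(";")[0] ?? "").trim().toLowerCase();
  const ext = EXTENSION_BY_CONTAINER[container];
  return ext ? `${base}.${ext}` : base;
}
```

The harness could then call `fd.set("file", last.blob, uploadFilenameForMime(last.mime))`.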

Copilot · 2026-04-26T16:44:35Z

+    if (!s) return;
+    sessionRef.current = null;
+    cancelAnimationFrame(s.rafId);
+    if (s.autoStop) clearTimeout(s.autoStop);


tearDown() stops tracks and closes the AudioContext but never stops the active MediaRecorder or detaches its event handlers. If the component unmounts mid-recording, the recorder can still fire onstop/ondataavailable later (e.g., due to tracks ending), which will call setState/onComplete after unmount and can also treat an aborted recording as "done". Consider explicitly stopping the recorder during teardown and/or guarding onstop with an "aborted" flag/session id so late events are ignored.

Suggested change

-    if (s.autoStop) clearTimeout(s.autoStop);
+    if (s.autoStop) clearTimeout(s.autoStop);
+    s.recorder.ondataavailable = null;
+    s.recorder.onstop = null;
+    s.recorder.onerror = null;
+    s.recorder.onstart = null;
+    if (s.recorder.state !== "inactive") {
+      try {
+        s.recorder.stop();
+      } catch {
+        // Best-effort cleanup during teardown.
+      }
+    }
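
The alternative mentioned in the comment, guarding handlers with a session id so late events are ignored, might be sketched as follows. `SessionGuard` is a hypothetical helper, not code from the PR; it only shows the idea of invalidating handlers that belong to a torn-down session.

```typescript
// Tracks which recording session is current. Handlers wrapped for an older
// session become no-ops once that session has ended.
class SessionGuard {
  private activeId = 0;

  // Start a new session and return its id.
  begin(): number {
    this.activeId += 1;
    return this.activeId;
  }

  // Invalidate the current session (call this from teardown).
  end(): void {
    this.activeId += 1;
  }

  isCurrent(id: number): boolean {
    return id === this.activeId;
  }

  // Wrap an event handler so it only runs while session `id` is current.
  wrap<A extends unknown[]>(id: number, handler: (...args: A) => void): (...args: A) => void {
    return (...args: A) => {
      if (this.isCurrent(id)) handler(...args);
    };
  }
}
```

With this shape, `recorder.onstop = guard.wrap(id, handleStop)` would drop any `stop` event that arrives after `guard.end()` has run, so an aborted recording is not reported as "done".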

- lib/video/transcode.ts: lazy-loaded FFmpeg singleton; transcodeWebmToMp4 with hasVideo flag (libx264+aac for video; aac-only for audio-only WebM produced by VoiceRecorder); ultrafast preset + faststart for streaming; guards for missing browser env / SharedArrayBuffer; cleanup on every path.
- next.config.mjs: COEP require-corp + COOP same-origin scoped to /dev/* (required for SharedArrayBuffer; intentionally not site-wide).
- app/dev/transcode/page.tsx: drop a .webm, transcode, preview MP4, download. Toggle for audio-only inputs.
- 5 new tests (instance reuse, MP4 mime out, hasVideo flag, audio-only flag, failure propagation). 72 in suite total, all passing.
- Deps: @ffmpeg/ffmpeg ^0.12.15, @ffmpeg/util ^0.12.

Live verified: WebM (audio-only) -> MP4 round-trips on /dev/transcode with COOP/COEP headers confirmed in response.
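
To make the hasVideo switch concrete, the ffmpeg argument list described in this commit message could be assembled roughly like this. It is a sketch, not the contents of lib/video/transcode.ts: the function name `buildTranscodeArgs` and the exact argument order are assumptions, while the flags themselves are standard ffmpeg options.

```typescript
// Builds ffmpeg CLI-style arguments for a WebM -> MP4 transcode.
// hasVideo = true:  H.264 video (ultrafast preset) plus AAC audio.
// hasVideo = false: drop any video stream (-vn) and encode AAC audio only.
function buildTranscodeArgs(input: string, output: string, hasVideo: boolean): string[] {
  const codecArgs = hasVideo
    ? ["-c:v", "libx264", "-preset", "ultrafast", "-c:a", "aac"]
    : ["-vn", "-c:a", "aac"];
  // +faststart moves the moov atom to the front so playback can begin
  // before the whole file has been downloaded.
  return ["-i", input, ...codecArgs, "-movflags", "+faststart", output];
}
```

With @ffmpeg/ffmpeg 0.12, an array like this would be handed to `ffmpeg.exec(args)` after the input has been written to the in-memory filesystem.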

Copilot AI review requested due to automatic review settings April 26, 2026 16:41

Copilot started reviewing on behalf of d3v07 April 26, 2026 16:41

This was referenced Apr 26, 2026

[S4][audio-video] Curator + canvas reel renderer + ffmpeg.wasm transcode + download #10

Closed

[P0] d3v07 — Real Photo Reader / Whisper / TTS / Curator + Reel renderer (canvas + ffmpeg.wasm) + modal pack #27

Open

d3v07 wants to merge 2 commits into persona-core-dev from components


2 participants

