Play dictation start cue after capture is ready by freshyjmp · Pull Request #478 · altic-dev/FluidVoice

freshyjmp · 2026-06-29T15:17:37Z

Description

Moves the dictation start sound from the hotkey/overlay moment to the point where the audio capture path is actually ready, and adds a conservative short-lived engine reuse window for repeated dictations.

The start cue now waits until:

pending startup audio-route recovery is idle
the engine has had a short stability window after engine start or startup recovery
at least a small number of captured samples are buffered
the same recording session that requested the cue is still active
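
The four conditions above can be read as a single gate. The following is a minimal sketch of that gate, not the code from this PR: `CaptureSnapshot`, `ReadinessPolicy`, `isReadyForStartCue`, and both threshold values are hypothetical names and numbers chosen for illustration.

```swift
import Foundation

/// Hypothetical snapshot of capture state; none of these names come from the PR.
struct CaptureSnapshot {
    var startupRecoveryPending: Bool       // startup audio-route recovery still in flight
    var secondsSinceEngineSettled: Double  // time since engine start or startup recovery
    var bufferedSampleCount: Int           // samples captured since that point
    var activeSessionID: UUID?             // session currently recording, if any
}

/// Hypothetical thresholds; the PR description does not state the real values.
struct ReadinessPolicy {
    var stabilityWindow: Double = 0.05
    var minimumSamples: Int = 512
}

/// Returns true only when all four listed conditions hold.
func isReadyForStartCue(
    _ snapshot: CaptureSnapshot,
    requestedBy sessionID: UUID,
    policy: ReadinessPolicy = ReadinessPolicy()
) -> Bool {
    guard !snapshot.startupRecoveryPending else { return false }
    guard snapshot.secondsSinceEngineSettled >= policy.stabilityWindow else { return false }
    guard snapshot.bufferedSampleCount >= policy.minimumSamples else { return false }
    // A stale waiter from an earlier session must not trigger the cue.
    return snapshot.activeSessionID == sessionID
}
```

Checking the session ID last keeps the cheaper state checks first; the order does not change the result.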

Startup AVAudioEngineConfigurationChange recovery is also treated separately from later route changes by using a shorter startup recovery delay. Later route changes keep the existing longer recovery delay and clear any startup-only readiness state if they replace startup recovery.
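
As a rough illustration of the startup versus later-route-change split, here is a sketch under stated assumptions: `RecoveryKind`, `RecoveryDelays`, `RecoveryState`, and the delay values are hypothetical, since the description only says the startup delay is shorter.

```swift
import Foundation

/// Hypothetical classification of a configuration-change recovery.
enum RecoveryKind { case startup, laterRouteChange }

/// Hypothetical delays; only their ordering is taken from the description.
struct RecoveryDelays {
    var startup: TimeInterval = 0.1
    var laterRouteChange: TimeInterval = 0.5
}

struct RecoveryState {
    private(set) var pending: RecoveryKind?
    private(set) var startupReadinessArmed = false

    /// Records the pending recovery and returns how long to wait before running it.
    mutating func schedule(_ kind: RecoveryKind, delays: RecoveryDelays = RecoveryDelays()) -> TimeInterval {
        switch kind {
        case .startup:
            pending = .startup
            startupReadinessArmed = true
            return delays.startup
        case .laterRouteChange:
            // A later route change that replaces startup recovery clears startup-only state.
            if pending == .startup { startupReadinessArmed = false }
            pending = .laterRouteChange
            return delays.laterRouteChange
        }
    }
}
```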

For responsiveness, normal successful stops can retain the stopped AVAudioEngine for 20 seconds. If another dictation starts during that grace window, it reuses the retained engine instance; otherwise the engine is released when the grace window expires. Reuse is skipped and the engine is released immediately for Bluetooth routes or independent device binding, so headphones can return to their normal high-quality mode after dictation.
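
The reuse rule can be sketched as a small pool. This is an illustrative sketch only: `StoppedEngine` is a hypothetical stand-in for the stopped AVAudioEngine, `EngineReusePool` and its method names are assumptions, and expiry is checked lazily at the next start rather than by a timer as the description implies.

```swift
import Foundation

/// Hypothetical stand-in for a stopped AVAudioEngine; the real type is not used here.
final class StoppedEngine {}

/// Retains a stopped engine for a short grace window.
struct EngineReusePool {
    let graceWindow: TimeInterval = 20   // 20 seconds, per the description
    private var retained: (engine: StoppedEngine, stoppedAt: Date)?

    /// Called after a normal successful stop.
    /// Returns false when the engine should be released immediately instead.
    mutating func retain(
        _ engine: StoppedEngine,
        isBluetoothRoute: Bool,
        usesIndependentDeviceBinding: Bool,
        now: Date = Date()
    ) -> Bool {
        guard !isBluetoothRoute, !usesIndependentDeviceBinding else {
            retained = nil
            return false
        }
        retained = (engine: engine, stoppedAt: now)
        return true
    }

    /// Called when a dictation starts. Returns the engine on a reuse hit,
    /// or nil on a miss or after the grace window has passed.
    mutating func takeForReuse(now: Date = Date()) -> StoppedEngine? {
        defer { retained = nil }
        guard let entry = retained,
              now.timeIntervalSince(entry.stoppedAt) <= graceWindow
        else { return nil }
        return entry.engine
    }
}
```

Passing `now` explicitly keeps the sketch testable without waiting for real time to elapse.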

This also adds explicit start/stop sound and engine-reuse logging so future debugging can distinguish sound playback, readiness wait, engine reuse hits, reuse skips, and reuse expiry.

A local debug retest also exposed that immediate event-tap recovery after a timeout could interfere with keyboard input. Event-tap timeout recovery now yields back to the system event callback, recreates the tap asynchronously, and uses the existing retry loop for transient tap-creation failures.
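
A sketch of the "yield, then recreate asynchronously with retries" shape, under stated assumptions: `TapRecovery`, `createTap`, `schedule`, and `maxAttempts` are hypothetical names, and `createTap` stands in for whatever call actually recreates the event tap.

```swift
import Foundation

/// Hypothetical recovery helper; it does not touch any real event-tap API.
final class TapRecovery {
    private let createTap: () -> Bool                      // true when the tap was recreated
    private let schedule: (@escaping () -> Void) -> Void   // defers work off the event callback
    private let maxAttempts: Int
    private(set) var attempts = 0
    private(set) var recovered = false

    init(
        maxAttempts: Int = 3,
        schedule: @escaping (@escaping () -> Void) -> Void = { work in
            DispatchQueue.main.async(execute: work)
        },
        createTap: @escaping () -> Bool
    ) {
        self.maxAttempts = maxAttempts
        self.schedule = schedule
        self.createTap = createTap
    }

    /// Called from the event callback when the tap times out.
    /// It only enqueues work, so the callback returns to the system promptly.
    func handleTimeout() {
        attempts = 0
        recovered = false
        schedule { [weak self] in self?.attempt() }
    }

    private func attempt() {
        attempts += 1
        if createTap() {
            recovered = true
            return
        }
        // Retry transient failures until the attempt budget is used up.
        guard attempts < maxAttempts else { return }
        schedule { [weak self] in self?.attempt() }
    }
}
```

Injecting `schedule` lets a test drive the retries synchronously; the default simply hops to the main queue.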

Type of Change

🐞 Bug fix (non-breaking change which fixes an issue)
✨ New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
🧹 Chore
📝 Documentation update

Related Issues

Refs [🐞 BUG] Start overlay/chime can indicate recording before audio capture is ready #477

Testing

Tested on Intel Mac
Tested on Apple Silicon Mac
Tested on macOS 26.5.1
Ran linter locally: swiftlint --strict --config .swiftlint.yml Sources (SwiftLint is not installed locally)
Ran formatter locally: swiftformat --config .swiftformat Sources (SwiftFormat is not installed locally)
Ran git diff --check
Ran xcrun swiftc -frontend -parse Sources/Fluid/Services/AudioDeviceService.swift Sources/Fluid/Services/GlobalHotkeyManager.swift Sources/Fluid/Services/ASRService.swift Sources/Fluid/ContentView.swift Sources/Fluid/Services/TranscriptionSoundPlayer.swift Tests/FluidDictationIntegrationTests/StartCueCaptureReadinessTests.swift
Ran xcodebuild test -project Fluid.xcodeproj -scheme Fluid -configuration Debug -destination 'platform=macOS,arch=arm64' -only-testing:FluidDictationIntegrationTests/StartCueCaptureReadinessTests PRODUCT_BUNDLE_IDENTIFIER=com.FluidApp.debug CODE_SIGN_STYLE=Manual DEVELOPMENT_TEAM= CODE_SIGN_IDENTITY=- CODE_SIGNING_ALLOWED=YES CODE_SIGNING_REQUIRED=YES (6 tests, 0 failures)
Ran xcodebuild build -project Fluid.xcodeproj -scheme Fluid -configuration Debug -destination 'platform=macOS,arch=arm64' PRODUCT_BUNDLE_IDENTIFIER=com.FluidApp.debug CODE_SIGN_STYLE=Manual DEVELOPMENT_TEAM= CODE_SIGN_IDENTITY=- CODE_SIGNING_ALLOWED=YES CODE_SIGNING_REQUIRED=YES

Runtime debug testing showed:

cold/new-engine start: engine_reuse_start hit=false, asr_start_return elapsedMs=105, start cue after capture-ready wait
reuse starts: engine_reuse_start hit=true, retained engine age around 1.4-2.6s, asr_start_return elapsedMs=36-40
after no restart, the retained engine released on schedule: engine_reuse_release reason=reuse_grace_expired hadEngine=true

Notes

This intentionally makes the audible start cue represent actual capture readiness instead of the instant hotkey/overlay transition. On routes that need startup recovery, the cue may therefore happen later, but it should be a truer signal to begin speaking.
The 20-second engine reuse window is deliberately short so FluidVoice does not become an always-hot audio app. The engine is stopped during this window; the retained object avoids rebuilding the graph for rapid repeated dictations only on routes where retention is safe.
This PR does not change the visual overlay state yet. A follow-up could show a distinct "starting" state before capture is ready, then switch to "listening" when this readiness check passes.
Potential follow-up performance work: avoid unnecessary output-node/device work for capture-only dictation and see whether the startup format/configuration change can be avoided.

Screenshots / Video

Not included; this change is timing/audio-cue behavior and logging rather than a visual UI change.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4e9fa709ce

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8890dcb2a

altic-dev · 2026-06-30T04:36:45Z

Good fix! But I was wondering if you could optimize the engine to open faster, rather than the other way around. Thoughts?

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0c81e63b85

freshyjmp · 2026-06-30T05:47:30Z

Replying to @altic-dev: agreed that making engine startup faster is the better direction where we can do it safely.

I kept this PR split between correctness and a conservative startup optimization:

The cue change makes the UI/audio cue mean “capture is actually ready,” not just “the hotkey was accepted.”
The performance piece keeps a stopped AVAudioEngine around briefly after normal stops, so rapid repeat dictations can reuse the engine instead of paying full teardown/recreate cost each time.
I added targeted regression coverage in 311a3cf for stale cue waiters and the post-recovery sample baseline.

I would be cautious about a permanently prewarmed/running engine because it can keep mic/CoreAudio/Bluetooth resources active. For a follow-up, I think the next lower-risk cold-start optimization is to measure and trim the capture graph setup, especially avoiding unnecessary output-node/device-binding work when sync mode is forced and dictation only needs input capture.

Happy to split the engine reuse out if you prefer this PR to stay cue/readiness-only.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 311a3cf3b4

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53da73873b

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 814e2a6842

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 21a82d1ab1

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 84115b771b

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa95c81f50

chatgpt-codex-connector · 2026-06-30T20:50:35Z

+            return
+        }
+
+        TranscriptionSoundPlayer.shared.playStartSound()


Mute capture while playing the start cue

When transcription sounds are enabled and the output is audible to the microphone (built-in speakers, loopback/monitoring, or loud headphones), this plays the start cue while ASRService is still running and its tap is appending to the same buffer that stop() later transcribes and saves. That means the cue itself can be captured at the beginning of the dictation and can pollute the final/streaming transcription; consider temporarily gating or clearing capture around the cue playback after readiness has been established.
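
One way to read the "gating or clearing" suggestion is a buffer that discards samples until the cue has finished. This is a minimal sketch, not code from the PR: `GatedCaptureBuffer`, its methods, and the idea of passing a known cue duration are all assumptions for illustration.

```swift
import Foundation

/// Hypothetical capture buffer that drops samples while the start cue is audible.
struct GatedCaptureBuffer {
    private(set) var samples: [Float] = []
    private var gateOpensAt: TimeInterval = 0   // chunks before this time are discarded

    /// Clears anything captured so far and closes the gate for the cue's duration.
    mutating func beginCue(at now: TimeInterval, cueDuration: TimeInterval) {
        samples.removeAll(keepingCapacity: true)
        gateOpensAt = now + cueDuration
    }

    /// Called from the capture tap; appends only once the cue has finished.
    mutating func append(_ chunk: [Float], at now: TimeInterval) {
        guard now >= gateOpensAt else { return }
        samples.append(contentsOf: chunk)
    }
}
```

The trade-off is that any speech overlapping the cue is dropped too, so the gate would need to stay as short as the cue itself.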

altic-dev · 2026-06-30T20:53:10Z

This PR is maxing out GitHub Actions for some reason. I would appreciate it if you could reduce the PR commits and make a final commit with the changes :) Thanks. Also, I am working on this optimization on the side, so I'd ask you to wait a little before making more changes; if I am able to figure this out in a better way, we don't have to waste your efforts. Thanks for understanding!

On Tue, Jun 30, 2026 at 1:50 PM chatgpt-codex-connector[bot] wrote:

In Sources/Fluid/ContentView.swift:

> +        let ready = await self.asr.waitForCaptureReadyForStartCue(sessionID: sessionID)
> +        DebugLogger.shared.benchmark(
> +            "APP_BENCH",
> +            message: "start_cue_ready ready=\(ready) elapsedMs=\(Int(((ProcessInfo.processInfo.systemUptime - cueWaitStartedAt) * 1000).rounded()))",
> +            source: "AppBenchmark"
> +        )
> +
> +        guard ready,
> +              self.asr.isRunning,
> +              self.asr.currentRecordingSessionID == sessionID
> +        else {
> +            DebugLogger.shared.debug("Start cue skipped because capture is no longer active", source: "ContentView")
> +            return
> +        }
> +
> +        TranscriptionSoundPlayer.shared.playStartSound()

[P2] Mute capture while playing the start cue

fix: play start cue after capture is ready

4e9fa70

chatgpt-codex-connector Bot reviewed Jun 29, 2026

Comment thread Sources/Fluid/Services/ASRService.swift Outdated

Comment thread Sources/Fluid/Services/ASRService.swift Outdated

perf: reuse stopped audio engine briefly

a8890dc

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread Sources/Fluid/ContentView.swift Outdated

fix: address capture readiness review feedback

33bae45

freshyjmp added 2 commits June 30, 2026 00:29

fix: recover hotkey event tap asynchronously

73cf1e4

fix: tie start cue to recording session

0c81e63

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread Sources/Fluid/Services/ASRService.swift

Comment thread Sources/Fluid/Services/GlobalHotkeyManager.swift Outdated

test: cover start cue readiness gating

311a3cf

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread Sources/Fluid/Services/ASRService.swift Outdated

fix: address audio readiness review follow-ups

53da738

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread Sources/Fluid/Services/ASRService.swift

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread Sources/Fluid/Services/GlobalHotkeyManager.swift

fix: require fresh samples after route recovery

21a82d1

freshyjmp force-pushed the fix/start-cue-after-capture-ready branch from 814e2a6 to 21a82d1 Compare June 30, 2026 19:32

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread Sources/Fluid/Services/ASRService.swift Outdated

fix: avoid stale start cue recovery races

84115b7

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread Sources/Fluid/Services/ASRService.swift Outdated

fix: reuse cached startup recovery decision

fa95c81

chatgpt-codex-connector Bot reviewed Jun 30, 2026
