Switch dictation to tap-to-toggle by default #48

Open
SHAREN wants to merge 1 commit into friuns2:main from SHAREN:codex/dictation-toggle-upstream-pr

SHAREN (Contributor) commented Apr 12, 2026

Summary

This PR updates voice dictation to use the interaction pattern most users already expect from Codex and modern chat apps.

Instead of holding the microphone button while speaking, dictation now works as a tap-to-toggle flow:

  • Tap the microphone once to start recording.
  • Tap it again to stop recording and insert the transcript into the composer.
  • Edit the transcribed text if needed, or record again.
  • While recording, the send button can be used as a transcribe-and-send action, so the message is transcribed and submitted immediately.

This makes dictation much more comfortable to use because recording becomes hands-free.

What changed

  • Enabled click-to-toggle dictation by default.
  • Disabled auto-send dictation by default.
  • Added explicit dictation stop modes for insert vs send.
  • Updated the composer send button so it works as Transcribe and send while recording.
  • Added a loading spinner for the microphone button while transcription is in progress.
  • Added manual verification steps to tests.md.

Behavior

New default dictation flow:

  1. Tap the microphone to start recording.
  2. Speak without holding the button.
  3. Tap the microphone again to stop and insert the transcript into the composer.
  4. Optionally edit the text before sending.

Alternative fast path:

  1. Tap the microphone to start recording.
  2. Speak.
  3. Tap the send button instead of tapping the microphone again.
  4. The app transcribes the recording and sends it immediately.
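
The two flows above can be read as a small state machine. The sketch below is illustrative only: `nextStep`, `UserAction`, and `Effect` are hypothetical names that do not appear in this PR, and treating both buttons as inert while transcribing is an assumption based on the spinner behavior described here.

```typescript
// Hypothetical model of the two dictation flows; not code from the PR.
type DictationState = 'idle' | 'recording' | 'transcribing'
type UserAction = 'tapMic' | 'tapSend'
type Effect = 'startRecording' | 'stopAndInsert' | 'stopAndSend' | 'submitText' | 'none'

interface Step {
  state: DictationState
  effect: Effect
}

function nextStep(state: DictationState, action: UserAction): Step {
  if (state === 'idle') {
    // With no recording in progress, the send button is a plain submit.
    return action === 'tapMic'
      ? { state: 'recording', effect: 'startRecording' }
      : { state: 'idle', effect: 'submitText' }
  }
  if (state === 'recording') {
    // Default flow: mic tap stops and inserts. Fast path: send tap stops and sends.
    return action === 'tapMic'
      ? { state: 'transcribing', effect: 'stopAndInsert' }
      : { state: 'transcribing', effect: 'stopAndSend' }
  }
  // Assumption: while transcription is pending, further taps do nothing.
  return { state: 'transcribing', effect: 'none' }
}
```

Both paths pass through the same `transcribing` state, which is why a single spinner covers insert and send.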

Why

This matches the standard interaction pattern users are already familiar with from Codex and common chat applications, and removes the need to keep the microphone button pressed during the whole recording.

Testing

  • pnpm run build
  • Headless Playwright verification on mobile (375x812) and tablet (768x1024)
  • Verified tap -> stop -> insert
  • Verified recording -> send
  • Verified transcribing spinner state while /codex-api/transcribe is pending

Review Summary by Qodo

Switch dictation to tap-to-toggle by default with send-on-stop support

✨ Enhancement

Walkthroughs

Description
• Switch dictation to tap-to-toggle interaction by default
• Add explicit dictation stop modes for insert vs send
• Update send button to transcribe-and-send while recording
• Add loading spinner for microphone button during transcription
• Add manual verification steps to tests documentation
Diagram
flowchart LR
  A["User taps microphone"] --> B["Recording starts hands-free"]
  B --> C{"User action"}
  C -->|"Tap microphone"| D["Stop and insert transcript"]
  C -->|"Tap send button"| E["Transcribe and send immediately"]
  D --> F["Transcribing spinner shown"]
  E --> F
  F --> G["Message ready or sent"]
File Changes

1. src/composables/useDictation.ts ✨ Enhancement +11/-5

Add stop mode tracking to dictation composable

• Add DictationStopMode type to distinguish between insert and send stop modes
• Update onTranscript callback to receive stop mode parameter
• Track pendingStopMode state to preserve stop mode through recording lifecycle
• Pass stop mode through stopRecording() and transcribe() functions
• Preserve stop mode when recording is stopped before fully started

src/composables/useDictation.ts
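
A minimal sketch of how a stop mode might travel from the stop request to the transcript callback. Only `DictationStopMode`, `pendingStopMode`, `onTranscript`, and the `'insert'` / `'send'` values come from this PR; `createDictationSketch`, `start`, `stop`, and `deliver` are hypothetical stand-ins, and the real composable works with `MediaRecorder` and a transcription request rather than a manual `deliver` call.

```typescript
type DictationStopMode = 'insert' | 'send'

interface DictationOptions {
  // The callback receives the mode that was requested when recording stopped.
  onTranscript: (text: string, mode: DictationStopMode) => void
}

// Hypothetical, heavily simplified stand-in for the composable.
function createDictationSketch(options: DictationOptions) {
  let recording = false
  let pendingStopMode: DictationStopMode = 'insert'

  return {
    start(): void {
      recording = true
    },
    stop(mode: DictationStopMode = 'insert'): void {
      if (!recording) return
      recording = false
      pendingStopMode = mode // remember how the user asked to stop
    },
    // Stands in for the transcription request resolving.
    deliver(text: string): void {
      const mode = pendingStopMode
      pendingStopMode = 'insert' // reset so the next recording starts from the default
      options.onTranscript(text, mode)
    },
  }
}
```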


2. src/App.vue ⚙️ Configuration changes +2/-2

Update dictation preference defaults

• Change DICTATION_CLICK_TO_TOGGLE_KEY default from false to true
• Change DICTATION_AUTO_SEND_KEY default from true to false

src/App.vue
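
The practical effect of a default change depends on how `loadBoolPref` treats stored values. The sketch below assumes a loader that falls back to the default only when nothing is stored. `loadBoolPrefSketch`, the `KeyValueStore` interface, and the `'true'` / `'false'` encoding are assumptions for illustration, not the implementation in `src/App.vue`.

```typescript
// Minimal storage interface so the sketch runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null
}

// Hypothetical loader: the fallback applies only when no value has been stored.
function loadBoolPrefSketch(store: KeyValueStore, key: string, fallback: boolean): boolean {
  const raw = store.getItem(key)
  if (raw === null) return fallback
  return raw === 'true' // assumed encoding
}
```

Under that assumption, users who already saved a dictation preference keep it, and only users without a stored value pick up the new defaults.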


3. src/components/content/ThreadComposer.vue ✨ Enhancement +53/-9

Implement transcribe-and-send flow with UI updates

• Add thread-composer-mic--transcribing CSS class for transcribing state styling
• Add loading spinner element that displays during transcription
• Disable microphone button during transcription state
• Update microphone button label to show "Transcribing dictation" state
• Add canRequestDictationSend computed property to check if send-on-stop is allowed
• Create submitButtonAriaLabel, submitButtonTitle, and isSubmitButtonDisabled computed properties for dynamic send button behavior
• Add onPrimarySubmit() function to handle send button click during recording as transcribe-and-send action
• Update send button to call onPrimarySubmit() instead of onSubmit()
• Update onTranscript callback to check stop mode and auto-send only when mode is 'send' or legacy auto-send is enabled

src/components/content/ThreadComposer.vue
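
The send-or-insert decision in the last bullet can be sketched as a pure function. `applyTranscript` and `ComposerResult` are hypothetical names, and merging the transcript into the existing draft with a single space is an assumption; the PR only states that the message is sent when the mode is `'send'` or legacy auto-send is enabled.

```typescript
type DictationStopMode = 'insert' | 'send'

interface ComposerResult {
  draft: string // text left in the composer
  sent: string | null // message submitted, or null if nothing was sent
}

// Hypothetical helper: decides whether a finished transcript is inserted or sent.
function applyTranscript(
  draft: string,
  transcript: string,
  mode: DictationStopMode,
  legacyAutoSend: boolean,
): ComposerResult {
  // Assumption: the transcript is appended to any existing draft text.
  const merged = [draft.trim(), transcript.trim()].filter(Boolean).join(' ')
  const shouldSend = mode === 'send' || legacyAutoSend
  if (shouldSend && merged.length > 0) {
    return { draft: '', sent: merged }
  }
  return { draft: merged, sent: null }
}
```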


4. tests.md 🧪 Tests +24/-0

Add manual verification tests for dictation feature

• Add comprehensive test feature for tap-to-toggle dictation workflow
• Include prerequisites, step-by-step instructions, and expected results
• Verify default settings, recording behavior, transcription spinner, and send-on-stop functionality
• Add rollback/cleanup instructions

tests.md



qodo-code-review bot commented Apr 12, 2026

Code Review by Qodo

🐞 Bugs (2)   📘 Rule violations (0)   📎 Requirement gaps (0)   🎨 UX Issues (0)
🐞 ≡ Correctness (2)


Action required

1. Second tap ignored during startup 🐞
Description
With click-to-toggle now default, a second mic tap during startRecording() (while
isStartingRecording is true and state is still idle) is ignored, so recording may still begin
even though the user tried to stop it. This breaks the expected tap-to-toggle behavior and can leave
users recording unintentionally until they notice and stop later.
Code

src/App.vue[R1017-1018]

+const dictationClickToToggle = ref(loadBoolPref(DICTATION_CLICK_TO_TOGGLE_KEY, true))
+const dictationAutoSend = ref(loadBoolPref(DICTATION_AUTO_SEND_KEY, false))
Evidence
dictationClickToToggle is now loaded with default true, which routes mic interaction through
onDictationToggle()/toggleRecording(). In useDictation, startRecording() sets
isStartingRecording = true while leaving state as 'idle' until after getUserMedia()
resolves; toggleRecording() does not handle this “starting” window, so a second tap calls
startRecording() again, which returns early due to the isStartingRecording guard, and no stop is
queued.

src/App.vue[1013-1019]
src/components/content/ThreadComposer.vue[1089-1095]
src/composables/useDictation.ts[147-176]
src/composables/useDictation.ts[293-301]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance to implement a solution.

### Issue description
With tap-to-toggle dictation enabled by default, users can tap the mic twice quickly. During the `getUserMedia()` startup window, `useDictation.toggleRecording()` ignores the second tap (because `startRecording()` bails out when `isStartingRecording` is true), so recording still starts even though the user intended to stop.

### Issue Context
`useDictation.stopRecording()` already supports a “stop requested before start” flow (`isStartingRecording && state === 'idle'`), but `toggleRecording()` never calls `stopRecording()` in that state.

### Fix Focus Areas
- src/composables/useDictation.ts[147-176]
- src/composables/useDictation.ts[185-202]
- src/composables/useDictation.ts[293-301]

### Suggested change
Update `toggleRecording()` to treat `isStartingRecording` as a stoppable state, e.g.:
- If `isStartingRecording` is true (and `state === 'idle'`), call `stopRecording()` to set `stopRequestedBeforeStart` and return.
- Optionally consider introducing an explicit `'starting'` state to simplify UI/logic, but a minimal fix is to route the second tap into `stopRecording()`.
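
A minimal sketch of the suggested change, assuming the flag names quoted in this review (`isStartingRecording`, `stopRequestedBeforeStart`) and a simplified three-value state. `createToggleSketch` and the injected `acquireMicrophone` callback (standing in for `getUserMedia()`) are hypothetical, and the sketch ends at `'transcribing'` instead of running a real `MediaRecorder`.

```typescript
type State = 'idle' | 'recording' | 'transcribing'

// Hypothetical stand-in for the composable; acquireMicrophone replaces getUserMedia().
function createToggleSketch(acquireMicrophone: () => Promise<void>) {
  let state: State = 'idle'
  let isStartingRecording = false
  let stopRequestedBeforeStart = false

  function stopRecording(): void {
    if (isStartingRecording && state === 'idle') {
      stopRequestedBeforeStart = true // queue the stop until startup finishes
      return
    }
    if (state !== 'recording') return
    state = 'transcribing'
  }

  async function startRecording(): Promise<void> {
    if (isStartingRecording || state !== 'idle') return
    isStartingRecording = true
    try {
      await acquireMicrophone()
    } catch (error) {
      stopRequestedBeforeStart = false // nothing left to stop if startup failed
      throw error
    } finally {
      isStartingRecording = false
    }
    state = 'recording'
    if (stopRequestedBeforeStart) {
      stopRequestedBeforeStart = false
      stopRecording()
    }
  }

  function toggleRecording(): Promise<void> {
    // The fix: a tap during startup is routed to stopRecording() instead of startRecording().
    if (isStartingRecording || state === 'recording') {
      stopRecording()
      return Promise.resolve()
    }
    if (state === 'idle') return startRecording()
    return Promise.resolve()
  }

  return { toggleRecording, getState: (): State => state }
}
```

With this routing, two quick taps end with the recording stopped as soon as the microphone is acquired, instead of leaving it running.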


Remediation recommended

2. Stop mode overwritten 🐞
Description
When a stop is requested before recording fully starts, pendingStopMode can be overwritten back to
'insert' because startRecording() triggers stopRecording() without passing the captured mode.
This can cause a 'send' stop request to be treated as 'insert' in the resulting transcript
handling.
Code

src/composables/useDictation.ts[R185-195]

+  function stopRecording(mode: DictationStopMode = 'insert') {
    if (isStartingRecording && state.value === 'idle') {
      stopRequestedBeforeStart = true
+      pendingStopMode = mode
      return
    }
    if (state.value !== 'recording' || !mediaRecorder) return
    if (mediaRecorder.state !== 'inactive') {
+      pendingStopMode = mode
      state.value = 'transcribing'
      try {
Evidence
The new DictationStopMode is stored in pendingStopMode when stopRecording(mode) is called
during startup. However, after startRecording() completes it checks stopRequestedBeforeStart and
calls stopRecording() with the default parameter ('insert'), which sets `pendingStopMode =
'insert'` before stopping, losing the original requested mode.

src/composables/useDictation.ts[29-33]
src/composables/useDictation.ts[147-176]
src/composables/useDictation.ts[185-202]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance to implement a solution.

### Issue description
`useDictation` introduces `DictationStopMode` and `pendingStopMode`, but in the `stopRequestedBeforeStart` path the mode can be overwritten back to the default `'insert'`.

### Issue Context
- `stopRecording(mode)` during startup sets `pendingStopMode = mode`.
- `startRecording()` later calls `stopRecording()` with no args when `stopRequestedBeforeStart` is true, which defaults to `'insert'` and overwrites `pendingStopMode`.

### Fix Focus Areas
- src/composables/useDictation.ts[147-176]
- src/composables/useDictation.ts[185-202]

### Suggested change
In `startRecording()`, when handling `stopRequestedBeforeStart`, call `stopRecording(pendingStopMode)` (or copy `pendingStopMode` to a local const and pass it) so the originally requested stop mode is preserved.

Also ensure the `pendingStopMode` reset behavior remains correct (currently reset to `'insert'` in `mediaRecorder.onstop`).
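
A minimal sketch of the mode-preserving path, assuming the names quoted in this review (`pendingStopMode`, `stopRequestedBeforeStart`, `isStartingRecording`). `createStopModeSketch` and the injected `acquireMicrophone` callback (standing in for `getUserMedia()`) are hypothetical, and the reset in `mediaRecorder.onstop` is left out.

```typescript
type DictationStopMode = 'insert' | 'send'
type RecorderState = 'idle' | 'recording' | 'transcribing'

// Hypothetical stand-in for the composable; acquireMicrophone replaces getUserMedia().
function createStopModeSketch(acquireMicrophone: () => Promise<void>) {
  let state: RecorderState = 'idle'
  let isStartingRecording = false
  let stopRequestedBeforeStart = false
  let pendingStopMode: DictationStopMode = 'insert'

  function stopRecording(mode: DictationStopMode = 'insert'): void {
    if (isStartingRecording && state === 'idle') {
      stopRequestedBeforeStart = true
      pendingStopMode = mode // captured while startup is still pending
      return
    }
    if (state !== 'recording') return
    pendingStopMode = mode
    state = 'transcribing'
  }

  async function startRecording(): Promise<void> {
    if (isStartingRecording || state !== 'idle') return
    isStartingRecording = true
    try {
      await acquireMicrophone()
    } catch (error) {
      // Startup failed: drop the queued stop and its mode.
      stopRequestedBeforeStart = false
      pendingStopMode = 'insert'
      throw error
    } finally {
      isStartingRecording = false
    }
    state = 'recording'
    if (stopRequestedBeforeStart) {
      stopRequestedBeforeStart = false
      // Pass the captured mode; calling stopRecording() with no argument would reset it to 'insert'.
      stopRecording(pendingStopMode)
    }
  }

  return {
    startRecording,
    stopRecording,
    getState: (): RecorderState => state,
    getPendingStopMode: (): DictationStopMode => pendingStopMode,
  }
}
```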

