4 changes: 2 additions & 2 deletions src/App.vue
@@ -1014,8 +1014,8 @@ const sendWithEnter = ref(loadBoolPref(SEND_WITH_ENTER_KEY, true))
const inProgressSendMode = ref<'steer' | 'queue'>(loadInProgressSendModePref())
const darkMode = ref<'system' | 'light' | 'dark'>(loadDarkModePref())
const chatWidth = ref<ChatWidthMode>(loadChatWidthPref())
-const dictationClickToToggle = ref(loadBoolPref(DICTATION_CLICK_TO_TOGGLE_KEY, false))
-const dictationAutoSend = ref(loadBoolPref(DICTATION_AUTO_SEND_KEY, true))
+const dictationClickToToggle = ref(loadBoolPref(DICTATION_CLICK_TO_TOGGLE_KEY, true))
+const dictationAutoSend = ref(loadBoolPref(DICTATION_AUTO_SEND_KEY, false))
Comment on lines +1017 to +1018
Action required

1. Second tap ignored during startup 🐞 Bug ≡ Correctness

With click-to-toggle now default, a second mic tap during startRecording() (while
isStartingRecording is true and state is still idle) is ignored, so recording may still begin
even though the user tried to stop it. This breaks the expected tap-to-toggle behavior and can leave
users recording unintentionally until they notice and stop later.
Agent Prompt
### Issue description
With tap-to-toggle dictation enabled by default, users can tap the mic twice quickly. During the `getUserMedia()` startup window, `useDictation.toggleRecording()` ignores the second tap (because `startRecording()` bails out when `isStartingRecording` is true), so recording still starts even though the user intended to stop.

### Issue Context
`useDictation.stopRecording()` already supports a “stop requested before start” flow (`isStartingRecording && state === 'idle'`), but `toggleRecording()` never calls `stopRecording()` in that state.

### Fix Focus Areas
- src/composables/useDictation.ts[147-176]
- src/composables/useDictation.ts[185-202]
- src/composables/useDictation.ts[293-301]

### Suggested change
Update `toggleRecording()` to treat `isStartingRecording` as a stoppable state, e.g.:
- If `isStartingRecording` is true while `state === 'idle'`, call `stopRecording()`, which sets `stopRequestedBeforeStart` and returns.
- Optionally consider introducing an explicit `'starting'` state to simplify UI/logic, but a minimal fix is to route the second tap into `stopRecording()`.
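
A minimal sketch of the suggested routing, assuming a simplified model of the composable. `createToggleModel`, `beginStart`, `finishStart`, and `getState` are hypothetical stand-ins for the real `getUserMedia()` startup path and are not taken from `useDictation.ts`; only `isStartingRecording`, `stopRequestedBeforeStart`, `stopRecording()`, and `toggleRecording()` mirror names used in this comment.

```typescript
type DictationState = 'idle' | 'recording' | 'transcribing'

// Hypothetical, simplified model of the control flow. No MediaRecorder or
// getUserMedia() is involved; beginStart/finishStart only simulate the
// asynchronous startup window.
function createToggleModel() {
  let state: DictationState = 'idle'
  let isStartingRecording = false
  let stopRequestedBeforeStart = false

  // Simulates entering the startup window (before the media stream resolves).
  function beginStart(): void {
    if (isStartingRecording || state !== 'idle') return
    isStartingRecording = true
    stopRequestedBeforeStart = false
  }

  // Simulates the media stream becoming available.
  function finishStart(): void {
    if (!isStartingRecording) return
    isStartingRecording = false
    if (stopRequestedBeforeStart) {
      // A stop arrived during startup: never enter 'recording'.
      stopRequestedBeforeStart = false
      return
    }
    state = 'recording'
  }

  function stopRecording(): void {
    if (isStartingRecording && state === 'idle') {
      stopRequestedBeforeStart = true
      return
    }
    if (state === 'recording') state = 'transcribing'
  }

  // The suggested change: a tap during startup is routed to stopRecording()
  // instead of being dropped.
  function toggleRecording(): void {
    if (isStartingRecording || state === 'recording') {
      stopRecording()
      return
    }
    if (state === 'idle') beginStart()
  }

  return { toggleRecording, finishStart, getState: () => state }
}
```

In this model, two quick taps followed by `finishStart()` leave the state at `'idle'`, which is the behavior the second tap is meant to produce.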

const dictationLanguage = ref(loadDictationLanguagePref())
const dictationLanguageOptions = computed(() => buildDictationLanguageOptions())

62 changes: 53 additions & 9 deletions src/components/content/ThreadComposer.vue
@@ -292,11 +292,12 @@
class="thread-composer-mic"
:class="{
'thread-composer-mic--active': dictationState === 'recording',
+'thread-composer-mic--transcribing': dictationState === 'transcribing',
}"
type="button"
:aria-label="dictationButtonLabel"
:title="dictationButtonLabel"
-:disabled="isInteractionDisabled"
+:disabled="isInteractionDisabled || dictationState === 'transcribing'"
@click="onDictationToggle"
@pointerdown="onDictationPressStart"
@pointerup="onDictationPressEnd"
@@ -306,6 +307,7 @@
v-if="dictationState === 'recording'"
class="thread-composer-mic-icon thread-composer-mic-icon--stop"
/>
+<span v-else-if="dictationState === 'transcribing'" class="thread-composer-mic-spinner" aria-hidden="true" />
<IconTablerMicrophone v-else class="thread-composer-mic-icon" />
</button>

@@ -326,10 +328,10 @@
class="thread-composer-submit"
:class="{ 'thread-composer-submit--queue': isTurnInProgress && activeInProgressMode === 'queue' }"
type="button"
-:aria-label="isTurnInProgress && activeInProgressMode === 'queue' ? 'Queue message' : 'Send message'"
-:title="isTurnInProgress ? `Send as ${activeInProgressMode}` : 'Send'"
-:disabled="!canSubmit"
-@click="onSubmit(isTurnInProgress ? activeInProgressMode : 'steer')"
+:aria-label="submitButtonAriaLabel"
+:title="submitButtonTitle"
+:disabled="isSubmitButtonDisabled"
+@click="onPrimarySubmit(isTurnInProgress ? activeInProgressMode : 'steer')"
>
<IconTablerArrowUp class="thread-composer-submit-icon" />
</button>
@@ -496,12 +498,12 @@ const {
cancel: cancelDictation,
} = useDictation({
getLanguage: () => props.dictationLanguage ?? 'auto',
-onTranscript: (text) => {
+onTranscript: (text, mode) => {
draft.value = draft.value ? `${draft.value}\n${text}` : text
dictationFeedback.value = ''
-if (props.dictationAutoSend !== false) {
-const mode = props.isTurnInProgress ? activeInProgressMode.value : 'steer'
-onSubmit(mode)
+if (mode === 'send' || props.dictationAutoSend !== false) {
+const submitMode = props.isTurnInProgress ? activeInProgressMode.value : 'steer'
+onSubmit(submitMode)
return
}
nextTick(() => inputRef.value?.focus())
@@ -581,6 +583,14 @@ const canSubmit = computed(() => {
if (pendingAttachmentCount.value > 0) return false
return draft.value.trim().length > 0 || selectedImages.value.length > 0 || fileAttachments.value.length > 0
})
+const canRequestDictationSend = computed(() =>
+dictationState.value === 'recording'
+&& !props.disabled
+&& !props.isUpdatingSpeedMode
+&& !!props.activeThreadId
+&& !isPlanModeWaitingForModel.value
+&& pendingAttachmentCount.value <= 0,
+)
const hasUnsavedDraft = computed(() =>
draft.value.trim().length > 0
|| selectedImages.value.length > 0
@@ -618,6 +628,7 @@ const activeInProgressMode = ref<'steer' | 'queue'>(inProgressMode.value)
const isDictationRecording = computed(() => dictationState.value === 'recording')
const dictationButtonLabel = computed(() => {
if (dictationState.value === 'recording') return 'Stop dictation'
+if (dictationState.value === 'transcribing') return 'Transcribing dictation'
return props.dictationClickToToggle ? 'Click to dictate' : 'Hold to dictate'
})
const dictationErrorText = computed(() =>
@@ -652,6 +663,21 @@ const dictationDurationLabel = computed(() => {
const seconds = totalSeconds % 60
return `${minutes}:${String(seconds).padStart(2, '0')}`
})
+const submitButtonAriaLabel = computed(() => {
+if (dictationState.value === 'recording') return 'Transcribe and send'
+if (dictationState.value === 'transcribing') return 'Transcribing dictation'
+return props.isTurnInProgress && activeInProgressMode.value === 'queue' ? 'Queue message' : 'Send message'
+})
+const submitButtonTitle = computed(() => {
+if (dictationState.value === 'recording') return 'Transcribe and send'
+if (dictationState.value === 'transcribing') return 'Transcribing dictation'
+return props.isTurnInProgress ? `Send as ${activeInProgressMode.value}` : 'Send'
+})
+const isSubmitButtonDisabled = computed(() => {
+if (dictationState.value === 'transcribing') return true
+if (canRequestDictationSend.value) return false
+return !canSubmit.value
+})

const placeholderText = computed(() =>
!props.activeThreadId
@@ -1033,6 +1059,16 @@ function onInterrupt(): void {
emit('interrupt')
}

+function onPrimarySubmit(mode: 'steer' | 'queue' = 'steer'): void {
+if (dictationState.value === 'recording') {
+if (!canRequestDictationSend.value) return
+stopRecording('send')
+return
+}
+if (dictationState.value === 'transcribing') return
+onSubmit(mode)
+}

function onModelSelect(value: string): void {
emit('update:selected-model', value)
}
@@ -2107,10 +2143,18 @@ watch(
@apply bg-red-100 text-red-600 hover:bg-red-200 hover:text-red-700;
}

+.thread-composer-mic--transcribing {
+@apply bg-zinc-200 text-zinc-600;
+}

.thread-composer-mic-icon {
@apply h-5 w-5;
}

+.thread-composer-mic-spinner {
+@apply block h-4 w-4 rounded-full border-2 border-current border-t-transparent animate-spin;
+}

.thread-composer-dictation-waveform-wrap {
@apply min-w-0 flex-1;
}
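
The submit-button rules added in `ThreadComposer.vue` can be read as two pure functions. The sketch below is a restatement under assumptions, not code from the PR: `SubmitInputs`, `isSubmitDisabled`, `primaryAction`, and the `PrimaryAction` labels are hypothetical names, and the reactive `computed` values are flattened into plain booleans.

```typescript
type DictationState = 'idle' | 'recording' | 'transcribing'

// Hypothetical flattened view of the reactive values used by the component.
interface SubmitInputs {
  dictationState: DictationState
  canRequestDictationSend: boolean
  canSubmit: boolean
}

// Mirrors the order of checks in isSubmitButtonDisabled.
function isSubmitDisabled(inputs: SubmitInputs): boolean {
  if (inputs.dictationState === 'transcribing') return true
  if (inputs.canRequestDictationSend) return false
  return !inputs.canSubmit
}

// Hypothetical labels for the three outcomes of onPrimarySubmit.
type PrimaryAction = 'stop-and-send' | 'submit' | 'ignore'

// Mirrors the branching in onPrimarySubmit. 'submit' means the call is handed
// to onSubmit(mode); any further validation there is outside this sketch.
function primaryAction(inputs: SubmitInputs): PrimaryAction {
  if (inputs.dictationState === 'recording') {
    return inputs.canRequestDictationSend ? 'stop-and-send' : 'ignore'
  }
  if (inputs.dictationState === 'transcribing') return 'ignore'
  return 'submit'
}
```

Keeping the disabled check and the click handler in the same order of precedence means the button is never clickable while a transcription request is in flight.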
16 changes: 11 additions & 5 deletions src/composables/useDictation.ts
@@ -1,13 +1,14 @@
import { onBeforeUnmount, ref } from 'vue'

export type DictationState = 'idle' | 'recording' | 'transcribing'
+export type DictationStopMode = 'insert' | 'send'
const DICTATION_SILENCE_THRESHOLD = 0.0025
const DICTATION_BAR_WIDTH = 3
const DICTATION_BAR_GAP = 2
const MAX_WAVEFORM_SAMPLES = 256

export function useDictation(options: {
-onTranscript: (text: string) => void
+onTranscript: (text: string, mode: DictationStopMode) => void
getLanguage?: () => string
onEmpty?: () => void
onError?: (error: unknown) => void
@@ -28,6 +29,7 @@ export function useDictation(options: {
let isStartingRecording = false
let stopRequestedBeforeStart = false
let transcribeAbortController: AbortController | null = null
+let pendingStopMode: DictationStopMode = 'insert'

function cancelTranscription(): void {
if (transcribeAbortController) {
@@ -160,8 +162,10 @@ export function useDictation(options: {
mediaRecorder.onstop = () => {
const recordedChunks = chunks
const recordedMimeType = mediaRecorder?.mimeType || recordedChunks[0]?.type || 'audio/webm'
+const stopMode = pendingStopMode
+pendingStopMode = 'insert'
cleanup()
-void transcribe(recordedChunks, recordedMimeType)
+void transcribe(recordedChunks, recordedMimeType, stopMode)
}
startWaveformCapture(mediaStream)
mediaRecorder.start(250)
@@ -178,13 +182,15 @@ export function useDictation(options: {
}
}

-function stopRecording() {
+function stopRecording(mode: DictationStopMode = 'insert') {
if (isStartingRecording && state.value === 'idle') {
stopRequestedBeforeStart = true
+pendingStopMode = mode
return
}
if (state.value !== 'recording' || !mediaRecorder) return
if (mediaRecorder.state !== 'inactive') {
+pendingStopMode = mode
state.value = 'transcribing'
try {
mediaRecorder.requestData()
@@ -202,7 +208,7 @@ export function useDictation(options: {
state.value = 'idle'
}

-async function transcribe(recordedChunks: Blob[], mimeType: string) {
+async function transcribe(recordedChunks: Blob[], mimeType: string, mode: DictationStopMode) {
if (recordedChunks.length === 0) {
options.onEmpty?.()
state.value = 'idle'
@@ -246,7 +252,7 @@ export function useDictation(options: {

const text = (data?.text ?? '').trim()
if (text.length > 0) {
-options.onTranscript(text)
+options.onTranscript(text, mode)
} else {
options.onEmpty?.()
}
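
The `pendingStopMode` handling in this file follows a snapshot-and-reset pattern: `onstop` copies the mode, resets the shared variable to `'insert'`, and passes the copy on to `transcribe()`. A minimal sketch of that pattern in isolation, where `createStopModeLatch`, `request`, and `consume` are hypothetical names that do not exist in the composable:

```typescript
type DictationStopMode = 'insert' | 'send'

// Hypothetical helper isolating the snapshot-and-reset behavior of
// pendingStopMode; the real composable keeps this as a plain closure variable.
function createStopModeLatch() {
  let pendingStopMode: DictationStopMode = 'insert'

  return {
    // Corresponds to stopRecording(mode) recording the caller's intent.
    request(mode: DictationStopMode = 'insert'): void {
      pendingStopMode = mode
    },
    // Corresponds to the onstop handler: read once, then fall back to the
    // default so a later recording does not inherit a stale 'send'.
    consume(): DictationStopMode {
      const mode = pendingStopMode
      pendingStopMode = 'insert'
      return mode
    },
  }
}
```

Resetting at the point of consumption, rather than at the next start, keeps the default correct even if a later stop is triggered without an explicit mode.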
24 changes: 24 additions & 0 deletions tests.md
@@ -2232,3 +2232,27 @@ Toggle "Free mode" in settings to use free OpenRouter models without an OpenAI A

#### Rollback/Cleanup
- Run `codexui login` to restore Codex authentication if needed.

### Feature: Tap-to-toggle dictation with optional send-on-stop flow

#### Prerequisites
- App is running from this repository.
- The browser has microphone permission.
- A thread is selected so the composer is enabled.

#### Steps
1. Open Settings and confirm `Click to toggle dictation` is enabled by default and `Auto send dictation` is disabled by default.
2. In the composer, click the microphone once and verify recording starts without holding the button.
3. Speak a short phrase, then click the microphone again to stop recording.
4. While transcription is processing, confirm the microphone button shows a loading spinner.
5. Confirm the transcript is inserted into the composer without being auto-sent.
6. Start dictation again, speak another short phrase, and click the send arrow instead of the microphone stop button; confirm the audio is transcribed and the message is sent.

#### Expected Results
- Dictation starts on first click and keeps recording hands-free until the next click.
- Stopping with the microphone inserts the transcript into the composer and leaves it editable.
- The microphone shows a transcribing spinner while `/codex-api/transcribe` is still in progress.
- Clicking the send arrow during recording transcribes the audio and submits the message automatically.

#### Rollback/Cleanup
- Clear the draft or delete the test thread message if needed.
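
The expected results above depend on how `onTranscript` combines the stop mode with the `dictationAutoSend` prop. A small sketch of that decision, assuming it can be lifted into pure helpers: `shouldSubmitTranscript` and `appendTranscript` are hypothetical names, while the conditions themselves are copied from the `ThreadComposer.vue` diff.

```typescript
type DictationStopMode = 'insert' | 'send'

// Hypothetical helper: true when the transcript should be submitted right away.
// The condition matches `mode === 'send' || props.dictationAutoSend !== false`.
function shouldSubmitTranscript(mode: DictationStopMode, dictationAutoSend?: boolean): boolean {
  return mode === 'send' || dictationAutoSend !== false
}

// Hypothetical helper: appends the transcript to an existing draft on a new
// line, or uses it as the whole draft when the draft is empty.
function appendTranscript(draft: string, text: string): string {
  return draft ? `${draft}\n${text}` : text
}
```

As written, the `!== false` comparison treats an unset prop as auto-send enabled, so the insert-only flow in step 5 relies on the parent passing `false` explicitly.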