fix: improve follow-up prompt timeout and abort handling#6430
fix: improve follow-up prompt timeout and abort handling#6430Abhiiishek44 wants to merge 7 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a robust retry mechanism with exponential backoff and timeout handling for generating follow-up prompts across various LLM providers. It adds a new executeWithRetry utility and updates provider-specific logic to utilize this wrapper. Feedback highlights unreachable code at the end of the retry function and redundant timeout logic within the Ollama provider implementation, suggesting a cleaner approach using the existing wrapper's capabilities.
| } | ||
| } | ||
|
|
||
| throw new Error('Follow-up prompt retry attempts exhausted') |
| // Ollama client does not accept AbortSignal directly in this SDK call. | ||
| // Wrap the chat promise in a race so the caller-provided signal can | ||
| // abort the request and trigger our retry/timeout behavior. | ||
| const chatPromise = ollamaClient.chat({ | ||
| model: providerConfig.modelName, | ||
| messages: [ | ||
| { | ||
| role: 'user', | ||
| content: followUpPromptsPrompt | ||
| } | ||
| ], | ||
| format: { | ||
| type: 'object', | ||
| properties: { | ||
| questions: { | ||
| type: 'array', | ||
| items: { | ||
| type: 'string' | ||
| }, | ||
| minItems: 3, | ||
| maxItems: 3, | ||
| description: 'Three follow-up questions based on the conversation history' | ||
| } | ||
| }, | ||
| minItems: 3, | ||
| maxItems: 3, | ||
| description: 'Three follow-up questions based on the conversation history' | ||
| required: ['questions'], | ||
| additionalProperties: false | ||
| }, | ||
| options: { | ||
| temperature: parseFloat(`${providerConfig.temperature}`) | ||
| } | ||
| }, | ||
| required: ['questions'], | ||
| additionalProperties: false | ||
| }) | ||
|
|
||
| let abortHandler: (() => void) | undefined | ||
| const abortPromise = new Promise<never>((_, reject) => { | ||
| abortHandler = () => { | ||
| const timeoutError = new Error('Follow-up prompt request timed out') | ||
| timeoutError.name = 'TimeoutError' | ||
| reject(timeoutError) | ||
| } | ||
| signal.addEventListener('abort', abortHandler, { once: true }) | ||
| }) | ||
|
|
||
| try { | ||
| const response = await Promise.race([chatPromise, abortPromise]) | ||
| const result = FollowUpPromptType.parse(JSON.parse((response as any).message.content)) | ||
| if (!result.questions) { | ||
| throw new Error('Follow-up prompt response missing questions') | ||
| } | ||
| return { questions: result.questions } | ||
| } finally { | ||
| if (abortHandler) signal.removeEventListener('abort', abortHandler) | ||
| } |
There was a problem hiding this comment.
The manual Promise.race and AbortSignal listener inside the Ollama provider are redundant because executeWithRetry already wraps the execution in a race against a timeout-driven abort signal. Additionally, the check for result.questions is unnecessary as FollowUpPromptType.parse already validates the presence and type of the field, ensuring fail-fast behavior for external data.
// Ollama client does not accept AbortSignal directly in this SDK call.
// However, executeWithRetry already handles the timeout and abort logic
// via Promise.race at the outer level.
const response = await ollamaClient.chat({
model: providerConfig.modelName,
messages: [
{
role: 'user',
content: followUpPromptsPrompt
}
],
format: {
type: 'object',
properties: {
questions: {
type: 'array',
items: {
type: 'string'
},
minItems: 3,
maxItems: 3,
description: 'Three follow-up questions based on the conversation history'
}
},
required: ['questions'],
additionalProperties: false
},
options: {
temperature: parseFloat(`${providerConfig.temperature}`)
}
})
const result = FollowUpPromptType.parse(JSON.parse((response as any).message.content))
return { questions: result.questions }References
- Use a default (fallback) implementation unless the specific implementation has meaningfully different behavior or provides better error messages.
- When handling potentially invalid data from external sources, prefer throwing an error for invalid input types rather than silently returning a default or empty value.
|
Hi @jimjimovich, just checking in on this PR. |
|
|
||
| const isRetryableError = (error: any): boolean => { | ||
| const status = getErrorStatus(error) | ||
| if (status && [429, 500, 502, 503].includes(status)) return true |
There was a problem hiding this comment.
This is a great improvement, but I think the retry classifier should also include timeout HTTP statuses here, especially 504.
Right now 429, 500, 502, and 503 are retryable, and timeout-like messages/codes are retryable, but a provider/proxy error like as Axios' common Request failed with status code 504 can have status = 504 without a timeout-shaped code or message. In that case this would not retry even though 504 Gateway Timeout is one of the potential failures this PR is trying to handle.
Could you, please, include 504 in this list, and maybe 408 as well, with a focused test for at least the 504 path?
There was a problem hiding this comment.
Thanks for the review. I’ve added 504 and 408 to the retryable HTTP status list and included a focused test for the Axios-style 504 status path.
|
Hi @sviat-barbutsa, just following up on this PR. I’ve addressed your review feedback by adding 504 and 408 to the retryable HTTP status list, along with a focused test for the Axios-style 504 path. Could you please re-review it when you get time? Please let me know if any further changes are needed. Thanks! |
|
@Abhiiishek44 Hey! I was looking into issue #6421 and tested your branch on my machine. The retry logic for 504/408 works perfectly! I noticed the PR might be stuck on that automated bot feedback, so I tried applying it on my end to see what happens: Ignore the bot about executeWithRetry: The bot tells you to delete the throw new Error at the bottom, but if you do, TypeScript gets mad and won't compile. Your original code was right. Fix the Ollama block: The bot is right about the Ollama part. You can safely delete all that Promise.race and abortPromise code and just await ollamaClient.chat(...). Since your executeWithRetry function is already handling the timeout, you don't need to do it twice. I tested that Ollama cleanup and the build passes perfectly. Let me know if you want me to paste the exact code snippet here for you to copy/paste. Awesome work getting this built! |
|
@AbhilashG12 Thanks for checking this and testing the build locally. Yes, please paste the snippet here. I’ll review it with my current Ollama block and update the PR accordingly. |
|
@Abhiiishek44 No problem at all! Here is the cleaned-up block for the OLLAMA case. You can replace your entire current const chatPromise = ... block (including the try/finally logic and the Promise.race) with this: // Ollama client does not accept AbortSignal directly in this SDK call. const result = FollowUpPromptType.parse(JSON.parse((response as any).message.content)) This effectively removes the redundant nested timeout logic while keeping the actual API call logic exactly as you had it. Once you swap that in, your build should pass the linter and the TS compiler perfectly. Good luck with the PR! |
|
Hi @sviat-barbutsa, In the earlier update, I added I also kept the fallback Could you please re-review it when you get time? Thanks! |
Summary
This PR improves follow-up prompt reliability for Ollama-based requests by making the request flow abort-aware and ensuring cleanup across retry attempts.
Changes included:
AbortSignallisteners are always removed after execution to avoid listener accumulation across retriesNotes
packages/components/src/followUpPrompts.tsRelated: #6421