
Support OpenAI Responses API #231

@gerv94

Description

Both the vLLM and OpenAI documentation describe how to use vLLM's support for the Responses API. However, when I tried to connect a client to my RunPod serverless endpoint, I hit an error because the worker doesn't support `/v1/responses`. A sketch of the failing call follows, and the relevant documentation is excerpted below.
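
For context, here is a minimal sketch of the failing call, assuming RunPod's OpenAI-compatible route (`https://api.runpod.ai/v2/<endpoint_id>/openai/v1`); the endpoint ID, API key, and model name are placeholders, not my actual configuration:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/<endpoint_id>/openai/v1",  # placeholder endpoint ID
    api_key="<runpod_api_key>",  # placeholder key
)

# Works today: the worker routes Chat Completions requests.
chat = client.chat.completions.create(
    model="openai/gpt-oss-20b",  # example model; substitute whatever the worker serves
    messages=[{"role": "user", "content": "Hello"}],
)
print(chat.choices[0].message.content)

# Fails today: the worker has no /v1/responses route, so this call errors out.
resp = client.responses.create(
    model="openai/gpt-oss-20b",
    input="Hello",
)
print(resp.output_text)
```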

Source: https://docs.vllm.ai/projects/recipes/en/latest/OpenAI/GPT-OSS.html
Fragment:

Usage

Once the vllm serve runs and INFO: Application startup complete has been displayed, you can send requests using HTTP request or OpenAI SDK to the following endpoints:

    /v1/responses endpoint can perform tool use (browsing, python, mcp) in between chain-of-thought and deliver a final response. This endpoint leverages the openai-harmony library for input rendering and output parsing. Stateful operation and full streaming API are work in progress. Responses API is recommended by OpenAI as the way to interact with this model.
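
(Not part of the quoted fragment.) Against a plain `vllm serve` instance the endpoint does work as documented; a minimal raw-HTTP sketch, assuming the default port 8000 and an example model name:

```python
import requests

# Assumes `vllm serve openai/gpt-oss-20b` is running locally on the default port.
r = requests.post(
    "http://localhost:8000/v1/responses",
    json={
        "model": "openai/gpt-oss-20b",  # example model name
        "input": "Say hello",
    },
)
r.raise_for_status()
print(r.json())
```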

Source: https://cookbook.openai.com/articles/gpt-oss/run-vllm
Fragment:

Create a model response
POST https://api.openai.com/v1/responses

Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code or use built-in tools like web search or file search to use your own data as input for the model's response.
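
To illustrate the tool-use capability this fragment mentions, a hedged SDK sketch against api.openai.com; the model name and the built-in tool's type string are assumptions on my part, not taken from the worker or the quoted docs:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.responses.create(
    model="gpt-4.1",  # example model name
    tools=[{"type": "web_search_preview"}],  # built-in web search tool (assumed type name)
    input="What's new in the latest vLLM release?",
)
print(resp.output_text)
```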
