Commit fd2a475
chore: add example on structure output
1 parent 1b0e0f8 commit fd2a475

1 file changed: +226 additions, −1 deletion
---
title: Structured Outputs
---

# Structured Outputs

The Structured Outputs/Response Format feature in [OpenAI](https://platform.openai.com/docs/guides/structured-outputs) is fundamentally a prompt engineering challenge. While its goal is to use system prompts to generate JSON output matching a specific schema, popular open-source models like Llama 3.1 and Mistral Nemo struggle to consistently generate exact JSON output that matches the requirements. An easy way to directly guide the model to respond in JSON format is to describe the format in the system message:

```
from openai import OpenAI

ENDPOINT = "http://localhost:39281/v1"
MODEL = "llama3.1:8b-gguf-q4-km"

# Point the OpenAI client at the local server; no API key is required.
client = OpenAI(
    base_url=ENDPOINT,
    api_key="not-needed"
)

# Describe the desired JSON shape and embed it in the system prompt.
format = {
    "steps": [
        {
            "explanation": "string",
            "output": "string"
        }
    ],
    "final_output": "string"
}

completion_payload = {
    "messages": [
        {"role": "system", "content": f"You are a helpful math tutor. Guide the user through the solution step by step. You have to respond in this json format {format}\n"},
        {"role": "user", "content": "how can I solve 8x + 7 = -23"}
    ]
}

response = client.chat.completions.create(
    top_p=0.9,
    temperature=0.6,
    model=MODEL,
    messages=completion_payload["messages"]
)

print(response)
```

The output of the model looks like this:

```
ChatCompletion(
    id='OZI0q8hghjYQY7NXlLId',
    choices=[
        Choice(
            finish_reason=None,
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content='''Here's how you can solve it:

{
    "steps": [
        {
            "explanation": "First, we need to isolate the variable x. To do this, subtract 7 from both sides of the equation.",
            "output": "8x + 7 - 7 = -23 - 7"
        },
        {
            "explanation": "This simplifies to 8x = -30",
            "output": "8x = -30"
        },
        {
            "explanation": "Next, divide both sides of the equation by 8 to solve for x.",
            "output": "(8x) / 8 = -30 / 8"
        },
        {
            "explanation": "This simplifies to x = -3.75",
            "output": "x = -3.75"
        }
    ],
    "final_output": "-3.75"
}''',
                refusal=None,
                role='assistant',
                audio=None,
                function_call=None,
                tool_calls=None
            )
        )
    ],
    created=1730645716,
    model='_',
    object='chat.completion',
    service_tier=None,
    system_fingerprint='_',
    usage=CompletionUsage(
        completion_tokens=190,
        prompt_tokens=78,
        total_tokens=268,
        completion_tokens_details=None,
        prompt_tokens_details=None
    )
)
```

From the output, you can easily parse the response to get the correct JSON, since you guided the model toward that format in the system prompt.
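
As a minimal sketch (assuming the reply contains exactly one JSON object, as above, and reusing the `response` object from the example), you could extract and load it like this:

```
import json
import re

# The model's reply may mix prose with the JSON object, so pull out the
# first {...} span and parse it. This regex-based helper is illustrative,
# not part of the OpenAI SDK.
content = response.choices[0].message.content
match = re.search(r"\{.*\}", content, re.DOTALL)
if match is None:
    raise ValueError("no JSON object found in the model output")

result = json.loads(match.group(0))
print(result["final_output"])  # "-3.75"
```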

However, open-source models like `llama3.1` or `mistral-nemo` still struggle to mimic the newest OpenAI API response format. For example, consider this request created with the OpenAI library, following the simple [chain-of-thought example from OpenAI](https://platform.openai.com/docs/guides/structured-outputs#chain-of-thought):

```
from typing import List

from openai import OpenAI
from pydantic import BaseModel

ENDPOINT = "http://localhost:39281/v1"
MODEL = "llama3.1:8b-gguf-q4-km"

client = OpenAI(
    base_url=ENDPOINT,
    api_key="not-needed"
)


class Step(BaseModel):
    explanation: str
    output: str


class MathReasoning(BaseModel):
    steps: List[Step]
    final_answer: str


completion_payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step.\n"},
        {"role": "user", "content": "how can I solve 8x + 7 = -23"}
    ]
}

# parse() converts the Pydantic model into a JSON schema and sends it
# as the response_format field of the request.
response = client.beta.chat.completions.parse(
    top_p=0.9,
    temperature=0.6,
    model=MODEL,
    messages=completion_payload["messages"],
    response_format=MathReasoning
)
```
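
With a model that fully supports this feature (the GPT models in the OpenAI tutorial), the SDK would expose the validated object directly:

```
# On a compliant model, parse() returns the schema-validated object:
math_reasoning = response.choices[0].message.parsed  # a MathReasoning instance
```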

The response format that the OpenAI client builds from the `MathReasoning` schema before sending the request is quite complex, as shown below. Unlike GPT models, Llama 3.1 and Mistral Nemo cannot reliably generate responses that can be parsed as shown in the [OpenAI tutorial](https://platform.openai.com/docs/guides/structured-outputs/example-response). This may be due to these models not being trained on similar structured-output tasks.

```
"response_format": {
    "json_schema": {
        "name": "MathReasoning",
        "schema": {
            "$defs": {
                "Step": {
                    "additionalProperties": false,
                    "properties": {
                        "explanation": {
                            "title": "Explanation",
                            "type": "string"
                        },
                        "output": {
                            "title": "Output",
                            "type": "string"
                        }
                    },
                    "required": ["explanation", "output"],
                    "title": "Step",
                    "type": "object"
                }
            },
            "additionalProperties": false,
            "properties": {
                "final_answer": {
                    "title": "Final Answer",
                    "type": "string"
                },
                "steps": {
                    "items": {"$ref": "#/$defs/Step"},
                    "title": "Steps",
                    "type": "array"
                }
            },
            "required": ["steps", "final_answer"],
            "title": "MathReasoning",
            "type": "object"
        },
        "strict": true
    },
    "type": "json_schema"
}
```

The response to this request from `mistral-nemo` or `llama3.1` cannot be parsed into the schema as in the [original tutorial by OpenAI](https://platform.openai.com/docs/guides/structured-outputs/example-response); instead of a bare JSON object, the model echoes the schema and answers in free-form Markdown, likely because it was not trained on this kind of data:

```
Response: {
    "choices": [
        {
            "finish_reason": null,
            "index": 0,
            "message": {
                "content": "Here's a step-by-step guide to solving the equation 8x + 7 = -23:\n\n```json\n{\n \"name\": \"MathReasoning\",\n \"schema\": {\n \"$defs\": {\n \"Step\": {\n \"additionalProperties\": false,\n \"properties\": {\n \"explanation\": {\"title\": \"Explanation\", \"type\": \"string\"},\n \"output\": {\"title\": \"Output\", \"type\": \"string\"}\n },\n \"required\": [\"explanation\", \"output\"],\n \"title\": \"Step\",\n \"type\": \"object\"\n }\n },\n \"additionalProperties\": false,\n \"properties\": {\n \"final_answer\": {\"title\": \"Final Answer\", \"type\": \"string\"},\n \"steps\": {\n \"items\": {\"$ref\": \"#/$defs/Step\"},\n \"title\": \"Steps\",\n \"type\": \"array\"\n }\n },\n \"required\": [\"steps\", \"final_answer\"],\n \"title\": \"MathReasoning\",\n \"type\": \"object\"\n },\n \"strict\": true\n}\n```\n\n1. **Subtract 7 from both sides** to isolate the term with x:\n\n - Explanation: To get rid of the +7 on the left side, we add -7 to both sides of the equation.\n - Output: `8x + 7 - 7 = -23 - 7`\n\n This simplifies to:\n ```\n 8x = -30\n ```\n\n2. **Divide both sides by 8** to solve for x:\n\n - Explanation: To get rid of the 8 on the left side, we multiply both sides of the equation by the reciprocal of 8, which is 1/8.\n - Output: `8x / 8 = -30 / 8`\n\n This simplifies to:\n ```\n x = -3.75\n ```\n\nSo, the final answer is:\n\n- Final Answer: `x = -3.75`",
                "role": "assistant"
            }
        }
    ],
```

This feature currently works reliably only with GPT models, not with open-source models. Given these limitations, we suggest using the Response Format feature only as in the first example (guiding the model toward the desired JSON format in the system prompt). Note also that this feature is still in beta on the OpenAI client: you have to call `client.beta.chat.completions.parse` to create the chat completion instead of `client.chat.completions.create`.
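
If you still need schema-validated objects from these open-source models, one pragmatic pattern (a sketch, assuming Pydantic v2 and the prompt-guided approach from the first example) is to validate the extracted JSON yourself and let the caller retry when validation fails:

```
import re
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Step(BaseModel):
    explanation: str
    output: str


class MathReasoning(BaseModel):
    steps: List[Step]
    final_output: str


def parse_reasoning(content: str) -> Optional[MathReasoning]:
    """Validate the JSON embedded in a model reply; None means retry."""
    match = re.search(r"\{.*\}", content, re.DOTALL)
    if match is None:
        return None
    try:
        return MathReasoning.model_validate_json(match.group(0))
    except ValidationError:
        return None
```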
