Commit a236198

openAI compatibility for springAI export
1 parent 801818c commit a236198

11 files changed (+447 −41 lines changed)

src/client/spring_ai/README.md

Lines changed: 37 additions & 19 deletions
@@ -1,7 +1,7 @@
 # Spring AI template
 
 ## How to run:
-Prepare two configurations in the `ai-optimizer`, based on vector stores created using this kind of configuration:
+Prepare two configurations in the `Oracle ai optimizer and toolkit`, based on vector stores created using this kind of configuration:
 
 * OLLAMA:
 * Embbeding model: mxbai-embed-large
@@ -54,19 +54,23 @@ Start with:
 
 This project contains a web service that will accept HTTP GET requests at
 
-* `http://localhost:8080/v1/chat/completions`: to use RAG via OpenAI REST API
+* `http://localhost:9090/v1/chat/completions`: to use RAG via OpenAI REST API
 
-* `http://localhost:8080/v1/service/llm` : to chat straight with the LLM used
-* `http://localhost:8080/v1/service/search/`: to search for document similar to the message provided
+* `http://localhost:9090/v1/service/llm` : to chat straight with the LLM used
+* `http://localhost:9090/v1/service/search/`: to search for document similar to the message provided
 
 
-RAG call example with `openai` build profile:
+RAG call example with `openai` build profile with no-stream:
 
 ```
-curl -X POST "localhost:8080/v1/chat/completions" \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer your_api_key" \
-  -d '{"message": "Can I use any kind of development environment to run the example?"}' | jq .
+curl -N http://localhost:9090/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer your_api_key" \
+  -d '{
+    "model": "server",
+    "messages": [{"role": "user", "content": "Can I use any kind of development environment to run the example?"}],
+    "stream": false
+  }'
 ```
 
 the response with RAG:
@@ -82,10 +86,20 @@ the response with RAG:
 ]
 }
 ```
-
+with stream output:
+```
+curl -N http://localhost:9090/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer your_api_key" \
+  -d '{
+    "model": "server",
+    "messages": [{"role": "user", "content": "Can I use any kind of development environment to run the example?"}],
+    "stream": true
+  }'
+```
 or the request without RAG:
 ```
-curl --get --data-urlencode 'message=Can I use any kind of development environment to run the example?' localhost:8080/v1/service/llm | jq .
+curl --get --data-urlencode 'message=Can I use any kind of development environment to run the example?' localhost:9090/v1/service/llm | jq .
 ```
 
 response not grounded:
@@ -150,10 +164,10 @@ llama3.1:latest a80c4f17acd5 2.0 GB 3 minutes ago
 kubectl -n ollama exec svc/ollama -- ollama run "llama3.1" "what is spring boot?"
 ```
 
-* **NOTE**: The Microservices will access to the ADB23ai on which the vector store table should be created as done in the local desktop example shown before. To access the ai-optimizer running on **Oracle Backend for Microservices and AI** and create the same configuration, let's do:
+* **NOTE**: The Microservices will access to the ADB23ai on which the vector store table should be created as done in the local desktop example shown before. To access the ai-explorer running on **Oracle Backend for Microservices and AI** and create the same configuration, let's do:
 * tunnel:
 ```
-kubectl -n ai-optimizer port-forward svc/ai-optimizer 8181:8501
+kubectl -n ai-explorer port-forward svc/ai-explorer 8181:8501
 ```
 * on localhost:
 ```
@@ -173,25 +187,29 @@ kubectl -n ollama exec svc/ollama -- ollama run "llama3.1" "what is spring boot?
 ```
 
 
-* the `bind` will create the new user, if not exists, but to have the `<VECTOR_STORE>_SPRINGAI` table compatible with SpringAI Oracle vector store adapter, the microservices need to access to the vector store table created by the ai-optimizer with user ADMIN on ADB:
+* the `bind` will create the new user, if not exists, but to have the `<VECTOR_STORE>_SPRINGAI` table compatible with SpringAI Oracle vector store adapter, the microservices need to access to the vector store table created by the ai-explorer with user ADMIN on ADB:
 
 ```
 GRANT SELECT ON ADMIN.<VECTOR_STORE> TO vector;
 ```
 * then deploy:
 ```
-deploy --app-name rag --service-name myspringai --artifact-path <ProjectDir>/target/myspringai-1.0.0-SNAPSHOT.jar --image-version 1.0.0 --java-version ghcr.io/oracle/graalvm-native-image-obaas:21 --service-profile obaas
+deploy --app-name rag --service-name myspringai --artifact-path <ProjectDir>/target/myspringai-0.0.1-SNAPSHOT.jar --image-version 0.0.1 --java-version ghcr.io/oracle/graalvm-native-image-obaas:21 --service-profile obaas
```
 * test:
 ```
 kubectl -n rag port-forward svc/myspringai 9090:8080
 ```
 * from shell:
 ```
-curl -X POST "http://localhost:9090/v1/chat/completions" \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer your_api_key" \
-  -d '{"message": "Can I use any kind of development environment to run the example?"}' | jq .
+curl -N http://localhost:9090/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer your_api_key" \
+  -d '{
+    "model": "server",
+    "messages": [{"role": "user", "content": "Can I use any kind of development environment to run the example?"}],
+    "stream": false
+  }'
 ```
 it should return:
 ```
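Any OpenAI-style HTTP client can call the new endpoint, not just curl. The sketch below is a minimal, hypothetical Java client (not part of this commit): it posts the same JSON body as the curl examples above and prints each server-sent `data:` chunk until the `data: [DONE]` marker. The URL, headers, payload fields, and SSE framing are taken from the README and controller changes in this commit; the class name and error handling are illustrative only.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.stream.Stream;

// Hypothetical client sketch for the OpenAI-compatible endpoint added in this commit.
// Assumes the service is reachable on localhost:9090 as in the README examples.
public class StreamingChatClientSketch {

    public static void main(String[] args) throws Exception {
        String body = """
                {
                  "model": "server",
                  "messages": [{"role": "user", "content": "Can I use any kind of development environment to run the example?"}],
                  "stream": true
                }""";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9090/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer your_api_key")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // Read the SSE body line by line; each chunk arrives as "data: {...}"
        // and the stream is terminated by "data: [DONE]".
        HttpResponse<Stream<String>> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofLines());

        response.body()
                .filter(line -> line.startsWith("data: "))
                .map(line -> line.substring("data: ".length()))
                .takeWhile(payload -> !"[DONE]".equals(payload))
                .forEach(System.out::println); // each payload is a chat.completion.chunk JSON object
    }
}
```

Setting `"stream": false` in the same request body returns a single `chat.completion` JSON object instead of a sequence of chunks.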

src/client/spring_ai/src/main/java/org/springframework/ai/openai/samples/helloworld/AIController.java

Lines changed: 159 additions & 20 deletions
@@ -6,11 +6,12 @@
 package org.springframework.ai.openai.samples.helloworld;
 
 import org.springframework.ai.chat.client.ChatClient;
-import org.springframework.ai.chat.model.ChatResponse;
+import org.springframework.ai.chat.client.ChatClient.ChatClientRequestSpec;
 import org.springframework.ai.chat.prompt.Prompt;
 import org.springframework.ai.chat.prompt.PromptTemplate;
 import org.springframework.ai.document.Document;
 import org.springframework.ai.embedding.EmbeddingModel;
+//import org.springframework.ai.openai.api.OpenAiApi.ChatCompletionRequest;
 import org.springframework.ai.reader.ExtractedTextFormatter;
 import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
 import org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig;
@@ -25,11 +26,16 @@
 import org.springframework.web.bind.annotation.RequestBody;
 import org.springframework.web.bind.annotation.RequestParam;
 import org.springframework.web.bind.annotation.RestController;
+import org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitter;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+
 import org.springframework.ai.vectorstore.oracle.OracleVectorStore;
 
 import jakarta.annotation.PostConstruct;
 
 import org.springframework.core.io.Resource;
+import org.springframework.http.MediaType;
 import org.springframework.jdbc.core.JdbcTemplate;
 
 import java.io.IOException;
@@ -38,14 +44,26 @@
 import java.util.ArrayList;
 import java.util.Map;
 import java.util.HashMap;
+import java.security.SecureRandom;
+import java.time.Instant;
+
 
 import java.util.Iterator;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+
+import org.springframework.model.*;
+
 @RestController
 class AIController {
 
+    @Value("${spring.ai.openai.chat.options.model}")
+    private String modelOpenAI;
+
+    @Value("${spring.ai.ollama.chat.options.model}")
+    private String modelOllamaAI;
+
     @Autowired
     private final OracleVectorStore vectorStore;
 
@@ -71,7 +89,8 @@ class AIController {
     private JdbcTemplate jdbcTemplate;
 
     private static final Logger logger = LoggerFactory.getLogger(AIController.class);
-
+    private static final int SLEEP = 50; // Wait in streaming between chunks
+    private static final int STREAM_SIZE = 5; // chars in each chunk
     AIController(ChatClient chatClient, EmbeddingModel embeddingModel, OracleVectorStore vectorStore) {
 
         this.chatClient = chatClient;
@@ -169,14 +188,16 @@ public Prompt promptEngineering(String message, String contextInstr) {
             INSTRUCTIONS:""";
 
         String default_Instr = """
-            Answer the users question using the DOCUMENTS text above.
+            Answer the users question using the DOCUMENTS text above.
             Keep your answer ground in the facts of the DOCUMENTS.
             If the DOCUMENTS doesn’t contain the facts to answer the QUESTION, return:
             I'm sorry but I haven't enough information to answer.
             """;
 
-        //This template doesn't work with agent pattern, but only via RAG
-        //The contextInstr coming from AI Optimizer can't be used here: default only
+        //This template doesn't work with re-phrasing/grading pattern, but only via RAG
+        //The contextInstr coming from Oracle ai optimizer and toolkit can't be used here: default only
+        //Modifiy it to include re-phrasing/grading if you wish.
+
         template = template + "\n" + default_Instr;
 
         List<Document> similarDocuments = this.vectorStore.similaritySearch(
@@ -208,25 +229,70 @@ StringBuilder createContext(List<Document> similarDocuments) {
         return context;
     }
 
-    @PostMapping("/chat/completions")
-    Map<String, Object> completionRag(@RequestBody Map<String, String> requestBody) {
-
-        String message = requestBody.getOrDefault("message", "Tell me a joke");
-        Prompt prompt = promptEngineering(message, contextInstr);
-        logger.info(prompt.getContents());
-        try {
-            String content = chatClient.prompt(prompt).call().content();
-            Map<String, Object> messageMap = Map.of("content", content);
-            Map<String, Object> choicesMap = Map.of("message", messageMap);
-            List<Map<String, Object>> choicesList = List.of(choicesMap);
 
-            return Map.of("choices", choicesList);
+    @PostMapping(value = "/chat/completions", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
+    public ResponseBodyEmitter streamCompletions(@RequestBody ChatRequest request) {
+        ResponseBodyEmitter bodyEmitter = new ResponseBodyEmitter();
+        String userMessageContent;
 
-        } catch (Exception e) {
-            logger.error("Error while fetching completion", e);
-            return Map.of("error", "Failed to fetch completion");
+        for (Map<String, String> message : request.getMessages()) {
+            if ("user".equals(message.get("role"))) {
+
+                String content = message.get("content");
+                if (content != null && !content.trim().isEmpty()) {
+                    userMessageContent = content;
+                    logger.info("user message: "+userMessageContent);
+                    Prompt prompt = promptEngineering(userMessageContent, contextInstr);
+                    logger.info("prompt message: "+prompt.getContents());
+                    String contentResponse = chatClient.prompt(prompt).call().content();
+                    logger.info("-------------------------------------------------------");
+                    logger.info("- RAG RETURN -");
+                    logger.info("-------------------------------------------------------");
+                    logger.info(contentResponse);
+                    new Thread(() -> {
+                        try {
+                            ObjectMapper mapper = new ObjectMapper();
+
+                            if (request.isStream()) {
+                                logger.info("Request is a Stream");
+                                List<String> chunks= chunkString(contentResponse);
+                                for (String token : chunks) {
+
+                                    ChatMessage messageAnswer = new ChatMessage("assistant", token);
+                                    ChatChoice choice = new ChatChoice(messageAnswer);
+                                    ChatStreamResponse chunk = new ChatStreamResponse("chat.completion.chunk", new ChatChoice[]{choice});
+
+                                    bodyEmitter.send("data: " + mapper.writeValueAsString(chunk) + "\n\n");
+                                    Thread.sleep(SLEEP);
+                                }
+
+                                bodyEmitter.send("data: [DONE]\n\n");
+                            } else {
+                                logger.info("Request isn't a Stream");
+                                String id="chatcmpl-"+generateRandomToken(28);
+                                String object="chat.completion";
+                                String created=String.valueOf(Instant.now().getEpochSecond());
+                                String model=getModel();
+                                ChatMessage messageAnswer = new ChatMessage("assistant", contentResponse);
+                                List<ChatChoice> choices = List.of(new ChatChoice(messageAnswer));
+                                bodyEmitter.send(new ChatResponse(id, object,created, model, choices));
+                            }
+                            bodyEmitter.complete();
+                        } catch (Exception e) {
+                            bodyEmitter.completeWithError(e);
+                        }
+                    }).start();
+
+                    return bodyEmitter;
+
+                }
+                break;
             }
         }
+
+
+        return bodyEmitter;
+    }
 
     @GetMapping("/service/search")
     List<Map<String, Object>> search(@RequestParam(value = "message", defaultValue = "Tell me a joke") String query,
@@ -247,4 +313,77 @@ List<Map<String, Object>> search(@RequestParam(value = "message", defaultValue =
         ;
         return resultList;
     }
+
+    @GetMapping("/models")
+    Map<String, Object> models(@RequestBody (required = false) Map<String, String> requestBody) {
+        String modelId = "custom";
+        logger.info("models request");
+        if (!"".equals(modelOpenAI)) {
+            modelId = modelOpenAI;
+        } else if (!"".equals(modelOllamaAI)) {
+            modelId = modelOllamaAI;
+        }
+        logger.info("model");
+
+
+        logger.info(chatClient.prompt().toString());
+        try {
+            Map<String, Object> model = new HashMap<>();
+            model.put("id", modelId);
+            model.put("object", "model");
+            model.put("created", 0000000000L);
+            model.put("owned_by", "no-info");
+
+            List<Map<String, Object>> dataList = new ArrayList<>();
+            dataList.add(model);
+
+            Map<String, Object> response = new HashMap<>();
+            response.put("object", "list");
+            response.put("data", dataList);
+
+            return response;
+
+        } catch (Exception e) {
+            logger.error("Error while fetching completion", e);
+            return Map.of("error", "Failed to fetch completion");
+        }
+    }
+
+
+    public List<String> chunkString(String input) {
+        List<String> chunks = new ArrayList<>();
+        int chunkSize = STREAM_SIZE;
+
+        for (int i = 0; i < input.length(); i += chunkSize) {
+            int end = Math.min(input.length(), i + chunkSize);
+            chunks.add(input.substring(i, end));
+        }
+
+        return chunks;
+    }
+
+    public String generateRandomToken(int length) {
+        String CHARACTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
+        SecureRandom random = new SecureRandom();
+        StringBuilder sb = new StringBuilder(length);
+        for (int i = 0; i < length; i++) {
+            int index = random.nextInt(CHARACTERS.length());
+            sb.append(CHARACTERS.charAt(index));
+        }
+        return sb.toString();
+    }
+
+    public String getModel(){
+        String modelId="custom";
+        if (!"".equals(modelOpenAI)) {
+            modelId = modelOpenAI;
+        } else if (!"".equals(modelOllamaAI)) {
+            modelId = modelOllamaAI;
+        }
+        return modelId;
+    }
 }
+
+
+
+
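The rewritten controller relies on request/response types — `ChatRequest`, `ChatMessage`, `ChatChoice`, `ChatStreamResponse`, `ChatResponse` — presumably pulled in via the new `import org.springframework.model.*;` line; their definitions are elsewhere in the commit and not shown in this excerpt. The sketch below gives plausible shapes inferred purely from how the controller uses them; any field or accessor beyond that usage is an assumption, not the committed code.

```java
// Sketch only: hypothetical DTO shapes inferred from their usage in AIController.
// The classes actually added by this commit may differ in fields, naming, and package.
import java.util.List;
import java.util.Map;

class ChatRequest {
    private String model;
    private List<Map<String, String>> messages; // [{"role": "...", "content": "..."}]
    private boolean stream;

    public String getModel() { return model; }
    public void setModel(String model) { this.model = model; }
    public List<Map<String, String>> getMessages() { return messages; }   // iterated in streamCompletions(...)
    public void setMessages(List<Map<String, String>> messages) { this.messages = messages; }
    public boolean isStream() { return stream; }                          // selects SSE vs. single response
    public void setStream(boolean stream) { this.stream = stream; }
}

record ChatMessage(String role, String content) {}

record ChatChoice(ChatMessage message) {}

// Streamed chunks are serialized with Jackson and emitted as "data: {...}" events.
record ChatStreamResponse(String object, ChatChoice[] choices) {}

// Non-streaming response mirroring the OpenAI chat.completion envelope,
// matching the constructor call new ChatResponse(id, object, created, model, choices).
record ChatResponse(String id, String object, String created, String model,
                    List<ChatChoice> choices) {}
```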

src/client/spring_ai/src/main/java/org/springframework/ai/openai/samples/helloworld/Config.java

Lines changed: 3 additions & 0 deletions
@@ -18,3 +18,6 @@ ChatClient chatClient(ChatClient.Builder builder) {
         return builder.build();
     }
 }
+
+
+