| title | API reference |
|---|---|
| description | REST API endpoints for RAG, Crawler, and Platform services. |
Each Tale service has its own REST API. These are used internally between services but are also available for direct integration with external systems.
All Python-based services have a Swagger UI for exploring and testing the API:
| Service | Swagger UI URL | OpenAPI JSON |
|---|---|---|
| RAG | http://localhost:8001/docs | http://localhost:8001/openapi.json |
| Crawler | http://localhost:8002/docs | http://localhost:8002/openapi.json |
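Because each service publishes an OpenAPI document, its endpoints can also be enumerated from a script. The sketch below assumes the JSON follows the standard OpenAPI layout (a top-level paths object keyed by path, then by HTTP method). The names list_operations and fetch_spec are illustrative and not part of Tale, and the sample spec is hand-written for the demonstration.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI document, sorted by path."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


def fetch_spec(url: str = "http://localhost:8001/openapi.json") -> dict:
    """Download an OpenAPI document (requires the service to be running)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


# Offline demonstration with a hand-written spec fragment.
sample = {
    "paths": {
        "/api/v1/search": {"post": {}},
        "/api/v1/documents/{file_id}": {"delete": {}, "parameters": []},
    }
}
for method, path in list_operations(sample):
    print(method, path)
```

Against a running service, list_operations(fetch_spec()) would list whatever the RAG service actually exposes.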
The RAG API handles document indexing and search. It is the engine behind the knowledge base.
POST /api/v1/documents/upload
Content-Type: multipart/form-data
file: <binary file data>
file_id: "unique-file-id"
sync: "true" (optional, wait for indexing to complete)
metadata: '{"source": "upload"}' (optional JSON string)
Document indexing runs in the background by default. Set sync=true to wait for indexing to complete before the response returns.
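As a rough illustration of the upload request, the following sketch assembles the multipart body with only the standard library. It assumes the RAG API is served from the same host and port as its Swagger UI (http://localhost:8001). The helper name build_upload_request, the file name policy.txt, and the file content are made up for the example.

```python
import json
import uuid
from urllib.request import Request


def build_upload_request(base_url, file_id, filename, content, sync=False, metadata=None):
    """Build a multipart/form-data POST for /api/v1/documents/upload."""
    boundary = uuid.uuid4().hex
    fields = {"file_id": file_id}
    if sync:
        fields["sync"] = "true"  # wait for indexing before the response returns
    if metadata is not None:
        fields["metadata"] = json.dumps(metadata)  # sent as a JSON string

    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode()
        + content
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())

    return Request(
        f"{base_url}/api/v1/documents/upload",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


request = build_upload_request(
    "http://localhost:8001",  # assumed RAG base URL
    "unique-file-id",
    "policy.txt",  # hypothetical file
    b"Returns are accepted within 30 days.",
    sync=True,
    metadata={"source": "upload"},
)
# Send with urllib.request.urlopen(request) once the RAG service is running.
```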
POST /api/v1/documents/statuses
{
  "file_ids": ["file-id-1", "file-id-2"]
}

Returns the indexing status for each document. States: queued, running, completed, failed.
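When uploads are not sent with sync=true, a client typically polls this endpoint until every document reaches completed or failed. The sketch below shows one way to structure that loop. The exact response shape is not documented here, so it assumes a caller-supplied fetch_statuses function that performs the POST and returns a {file_id: state} mapping; wait_for_indexing is an illustrative name.

```python
import time

TERMINAL_STATES = {"completed", "failed"}


def wait_for_indexing(fetch_statuses, file_ids, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until every file reaches a terminal state or the timeout elapses.

    fetch_statuses is a caller-supplied function that POSTs file_ids to
    /api/v1/documents/statuses and returns {file_id: state}. That return
    shape is an assumption made for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        statuses = fetch_statuses(file_ids)
        if all(statuses.get(fid) in TERMINAL_STATES for fid in file_ids):
            return statuses
        if time.monotonic() >= deadline:
            raise TimeoutError(f"indexing still in progress: {statuses}")
        sleep(interval)


# Offline demonstration: a fake status source that finishes on the third poll.
responses = iter([
    {"file-id-1": "queued", "file-id-2": "running"},
    {"file-id-1": "running", "file-id-2": "completed"},
    {"file-id-1": "completed", "file-id-2": "failed"},
])
result = wait_for_indexing(
    lambda ids: next(responses),
    ["file-id-1", "file-id-2"],
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(result)
```

Treating failed as terminal keeps the loop from spinning forever on a document that will never finish; the caller can then inspect the returned mapping and decide whether to re-upload.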
POST /api/v1/search
{
  "query": "What is our return policy?",
  "file_ids": ["file-id-1", "file-id-2"],
  "top_k": 5,
  "similarity_threshold": 0.0,
  "include_metadata": true
}

The file_ids parameter is required and scopes the search to specific documents.
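A small request builder can enforce the file_ids requirement before anything is sent. This is a sketch only: build_search_request is an illustrative name, the base URL is assumed to match the RAG Swagger host, and the default values simply mirror the example body above rather than documented server defaults.

```python
import json
from urllib.request import Request


def build_search_request(base_url, query, file_ids, top_k=5,
                         similarity_threshold=0.0, include_metadata=True):
    """Build a JSON POST for /api/v1/search scoped to the given documents."""
    if not file_ids:
        # Client-side guard: the API requires file_ids to scope the search.
        raise ValueError("file_ids is required")
    body = {
        "query": query,
        "file_ids": list(file_ids),
        "top_k": top_k,
        "similarity_threshold": similarity_threshold,
        "include_metadata": include_metadata,
    }
    return Request(
        f"{base_url}/api/v1/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "http://localhost:8001",  # assumed RAG base URL
    "What is our return policy?",
    ["file-id-1", "file-id-2"],
)
print(json.loads(request.data))
# Send with urllib.request.urlopen(request) once the RAG service is running.
```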
DELETE /api/v1/documents/{file_id}

GET /api/v1/documents/{file_id}/content

Returns the full extracted text of an indexed document.
POST /api/v1/documents/compare
{
  "file_id_a": "file-id-1",
  "file_id_b": "file-id-2"
}

POST /api/v1/websites
{
  "domain": "https://docs.example.com",
  "scan_interval": 21600
}

scan_interval is in seconds. Minimum value is 60.
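Since scan_interval is expressed in seconds, it can help to compute it from a more readable unit and check the 60-second minimum before sending the request. The helper below is a hypothetical convenience, not part of the Crawler API; it only builds the JSON body.

```python
MIN_SCAN_INTERVAL_SECONDS = 60  # minimum stated for scan_interval


def website_payload(domain: str, hours: float) -> dict:
    """Build the POST /api/v1/websites body from an interval given in hours."""
    scan_interval = int(hours * 3600)
    if scan_interval < MIN_SCAN_INTERVAL_SECONDS:
        raise ValueError(
            f"scan_interval must be at least {MIN_SCAN_INTERVAL_SECONDS} seconds, got {scan_interval}"
        )
    return {"domain": domain, "scan_interval": scan_interval}


payload = website_payload("https://docs.example.com", hours=6)
print(payload)  # {'domain': 'https://docs.example.com', 'scan_interval': 21600}
```

The example value 21600 therefore corresponds to a rescan every six hours.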
POST /api/v1/urls/fetch
{
  "urls": ["https://docs.example.com/guide"],
  "word_count_threshold": 100
}

Returns cached content when available, or fetches it live if not.
GET /api/v1/websites/{domain}

DELETE /api/v1/websites/{domain}

GET /api/v1/websites/{domain}/urls

The Platform service exposes a public API at /api/v1/* for programmatic access to your data. Authenticate using an API key from Settings > API Keys.
The platform provides an interface fully compatible with the OpenAI Chat Completions API. Any client or SDK that supports OpenAI (Python, Node, curl, LiteLLM, etc.) can connect by pointing base_url to your Tale instance.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-tale-instance.com/api/v1",
    api_key="tale_...",  # from Settings > API Keys
    default_headers={"X-Organization-Slug": "default"},
)
response = client.chat.completions.create(
    model="chat-agent",  # agent slug from your Agents page
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://your-tale-instance.com/api/v1',
  apiKey: 'tale_...',
  defaultHeaders: { 'X-Organization-Slug': 'default' },
});
const response = await client.chat.completions.create({
  model: 'chat-agent',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);

curl https://your-tale-instance.com/api/v1/chat/completions \
  -H "Authorization: Bearer tale_..." \
  -H "X-Organization-Slug: default" \
  -H "Content-Type: application/json" \
  -d '{"model":"chat-agent","messages":[{"role":"user","content":"Hello!"}]}'

All requests require a Bearer token in the Authorization header:
Authorization: Bearer tale_...
Create API keys in Settings > API Keys in the platform UI.
| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer <api-key> |
| X-Organization-Slug | No | Organization slug. Auto-resolved if the user belongs to one organization. |
| X-Thread-Id | No | Reuse a conversation thread across requests. |
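For clients that do not use an OpenAI SDK, the headers in the table can be assembled by a small helper. This is a minimal sketch; build_headers is an illustrative name and the thread ID value is a placeholder, since how thread IDs are obtained is not described here.

```python
def build_headers(api_key, organization_slug=None, thread_id=None):
    """Return request headers for the Platform API; optional ones are omitted when unset."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if organization_slug:
        headers["X-Organization-Slug"] = organization_slug
    if thread_id:
        headers["X-Thread-Id"] = thread_id  # reuse an existing conversation thread
    return headers


minimal = build_headers("tale_...")
full = build_headers("tale_...", organization_slug="default", thread_id="thread-123")  # placeholder ID
print(full)
```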
POST /api/v1/chat/completions

Send a chat message and receive a response. Supports streaming and tool calling.
Request body:
| Field | Type | Description |
|---|---|---|
| model | string | Required. Agent slug (e.g., chat-agent). |
| messages | array | Required. Conversation messages with role and content. |
| stream | boolean | Enable SSE streaming. Default: false. |
| temperature | number | Sampling temperature (0–2). |
| max_tokens | number | Maximum tokens to generate. |
| top_p | number | Nucleus sampling parameter. |
| frequency_penalty | number | Penalize repeated tokens. |
| presence_penalty | number | Penalize tokens already present. |
| stop | string or array | Stop sequences. |
| response_format | object | Set {"type": "json_object"} for JSON mode. |
| tools | array | Tool definitions for client-side tool calling. |
| tool_choice | string or object | "auto", "required", "none", or {"type":"function","function":{"name":"..."}}. |
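With stream set to true, the response arrives as server-sent events. The sketch below shows how a client without an SDK might extract the text, assuming the stream follows the OpenAI chunk format (data: lines carrying choices[].delta.content, terminated by data: [DONE]). The sample lines are hand-written for the demonstration and iter_stream_text is an illustrative name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample stream for an offline demonstration.
sample_stream = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample_stream)))  # Hello!
```

SDK users do not need this: passing stream=True to client.chat.completions.create returns an iterator of already-parsed chunks.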
Two modes:
- Agent mode (no tools): The agent uses its pre-configured server-side tools (RAG, web search, etc.) and auto-executes them. The response contains the final text.
- Client tool mode (tools provided): Only the client-defined tools are available. The model returns tool_calls for the client to execute. Send results back with role: "tool" messages.
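In client tool mode, the client has to map each returned tool call to a local function and wrap the result in a role: "tool" message. One possible shape for that step is sketched below. It assumes tool calls arrive as OpenAI-style dictionaries (id, function.name, and function.arguments as a JSON string); get_weather, TOOL_REGISTRY, and run_tool_calls are stand-ins for illustration.

```python
import json


def get_weather(city: str) -> dict:
    """Stand-in tool; a real implementation would call a weather service."""
    return {"city": city, "temp": 20}


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls, registry=TOOL_REGISTRY):
    """Execute each requested tool and return the matching role "tool" messages."""
    messages = []
    for call in tool_calls:
        name = call["function"]["name"]
        arguments = json.loads(call["function"]["arguments"] or "{}")
        if name in registry:
            result = registry[name](**arguments)
        else:
            result = {"error": f"unknown tool: {name}"}
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the model's request
            "content": json.dumps(result),
        })
    return messages


# Hand-written tool call in the assumed OpenAI-style shape.
sample_calls = [{
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Berlin"}'},
}]
tool_messages = run_tool_calls(sample_calls)
print(tool_messages)
```

With the OpenAI Python SDK, dictionaries of this shape can be obtained from message.model_dump()["tool_calls"]. The returned messages are appended after the assistant message before the follow-up request is sent.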
Tool calling example:
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Step 1: send tools
response = client.chat.completions.create(
    model="chat-agent",
    messages=[{"role": "user", "content": "What's the weather?"}],
    tools=tools,
    tool_choice="required",
)

# Step 2: execute tool and send result
tc = response.choices[0].message.tool_calls[0]
messages = [
    {"role": "user", "content": "What's the weather?"},
    response.choices[0].message.model_dump(),
    {"role": "tool", "tool_call_id": tc.id, "content": '{"temp": 20}'},
]
final = client.chat.completions.create(
    model="chat-agent", messages=messages, tools=tools
)
print(final.choices[0].message.content)

List available agents (models).
{
"object": "list",
"data": [
{ "id": "chat-agent", "object": "model", "owned_by": "default" },
{ "id": "workflow-assistant", "object": "model", "owned_by": "default" }
]
}
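The id values in this response are the agent slugs that the model field of a chat request expects. A short sketch of extracting them, using the example response above as input (the variable names are illustrative):

```python
import json

# The example response from this section, used as offline input.
models_response = json.loads("""
{
  "object": "list",
  "data": [
    { "id": "chat-agent", "object": "model", "owned_by": "default" },
    { "id": "workflow-assistant", "object": "model", "owned_by": "default" }
  ]
}
""")

# Collect the slugs that can be passed as the model field.
agent_slugs = [entry["id"] for entry in models_response["data"] if entry.get("object") == "model"]
print(agent_slugs)  # ['chat-agent', 'workflow-assistant']
```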