8 changes: 0 additions & 8 deletions backend/app/api/docs/collections/README.md

This file was deleted.

17 changes: 6 additions & 11 deletions backend/app/api/docs/collections/create.md
@@ -1,4 +1,4 @@
Setup and configure the document store that is pertinent to the RAG
Set up and configure the Vector store that is pertinent to the File search
pipeline:

* Create a vector store from the document IDs you received after uploading your
@@ -8,23 +8,18 @@ pipeline:
the cumulative size of the documents given to upload to a vector store reaches 30 MB
or the document count reaches 200 files in a batch, whichever limit is hit first.
* [Deprecated] Attach the Vector Store to an OpenAI
[Assistant](https://platform.openai.com/docs/api-reference/assistants). Use
[Assistant]. Use
parameters in the request body relevant to an Assistant to flesh out
its configuration. Note that an assistant will only be created when you pass both
"model" and "instructions" in the request body; otherwise, only a vector store will be
created from the given documents.

If any one of the LLM service interactions fail, all service resources are
cleaned up. If an OpenAI vector Store is unable to be created, for example,
all file(s) that were uploaded to OpenAI are removed from
OpenAI. Failure can occur from OpenAI being down, or some parameter
value being invalid. It can also fail due to document types not being
accepted. This is especially true for PDFs that may not be parseable.
If any step in the LLM service interaction fails, all previously created resources are cleaned up automatically. For example, if the vector store creation fails, any files already uploaded to OpenAI are removed. Failures can be caused by service downtime, invalid parameter values, or unsupported document types — the latter is especially common with PDFs that cannot be parsed.

In the case of Openai, Vector store/assistant will be created asynchronously.
The immediate response from this endpoint is `collection_job` object which is
The Vector store/assistant will be created asynchronously.
The immediate response from this endpoint will
contain the collection `job_id` and status. Once the collection has
been created, information about the collection will be returned to the user via
the callback URL. If a callback URL is not provided, clients can check the
`collection job info` endpoint with the `job_id` to retrieve
information about the creation of collection.
information about the created collection.
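The asynchronous flow described above can be sketched as a small polling helper. This is a minimal illustration, not part of the Kaapi client: `fetch_job` stands in for whatever call hits the `collection job info` endpoint, and the field names (`status`, `collection`, `error`) are assumptions to be checked against the actual response schema.

```python
import time


def wait_for_collection(fetch_job, job_id, interval=2.0, max_attempts=30):
    """Poll the collection job info endpoint until the creation job finishes.

    `fetch_job` is any callable mapping a job_id to the job-info payload,
    e.g. a thin wrapper around an HTTP GET against the Kaapi API.
    The payload field names used here are illustrative assumptions.
    """
    for _ in range(max_attempts):
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "successful":
            # On success, the job info carries the created collection.
            return job.get("collection")
        if status == "failed":
            raise RuntimeError(f"collection job {job_id} failed: {job.get('error')}")
        time.sleep(interval)  # still pending; try again
    raise TimeoutError(f"collection job {job_id} did not finish in time")
```

If a `callback_url` was supplied in the create request, the same information arrives at the callback instead and no polling is needed.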
8 changes: 4 additions & 4 deletions backend/app/api/docs/collections/delete.md
@@ -1,14 +1,14 @@
Remove a collection from the platform. This is a two step process:
Remove a collection from the platform.

This is a two-step process:

1. Delete all resources that were allocated: file(s), the Vector
Store, and the Assistant.
2. Delete the collection entry from the kaapi database.

No action is taken on the documents themselves: the contents of the
documents that were part of the collection remain unchanged, and those
documents can still be accessed via the documents endpoints. The response from this
endpoint will be a `collection_job` object which will contain the collection `job_id` and
status. When you take the id returned and use the `collection job info` endpoint,
documents can still be accessed via the documents endpoints. The endpoint returns the job ID and status of the collection delete operation. Pass the returned job ID to the `collection job info` endpoint to check on it:
if the job completed, the status will be reported as successful.
Additionally, if a `callback_url` was provided in the request body,
you will receive a message indicating whether the deletion was successful or if it failed.
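When a `callback_url` is used, the outcome arrives as a message rather than a polled status. A minimal handler for that message might look as follows; the payload shape (a `status` field plus an optional `error`) is an illustrative assumption, so confirm the exact field names against the actual callback message.

```python
def handle_delete_callback(payload):
    """Interpret a collection-delete callback message.

    The field names used here are assumptions for illustration;
    the real callback payload may differ.
    """
    status = payload.get("status")
    if status == "successful":
        return True
    if status == "failed":
        raise RuntimeError(f"collection deletion failed: {payload.get('error')}")
    raise ValueError(f"unexpected callback status: {status!r}")
```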
6 changes: 3 additions & 3 deletions backend/app/api/docs/collections/info.md
@@ -1,8 +1,8 @@
Retrieve detailed information about a specific collection by its collection id. This endpoint returns the collection object including its project, organization, timestamps, and service-specific details.
Retrieve detailed information about a specific collection by its collection id.

**Response Fields:**

**Note:** While the API schema shows both `llm_service_id`/`llm_service_name` AND `knowledge_base_id`/`knowledge_base_provider`, the actual response will only include the fields relevant to what was created:
**Note:** While the example response shows both `llm_service_id`/`llm_service_name` AND `knowledge_base_id`/`knowledge_base_provider`, the actual response will only include the fields relevant to what was created:

- **If an Assistant was created** (with model + instructions): The response will only include `llm_service_id` and `llm_service_name`
- **If only a Vector Store was created** (without model/instructions): The response will only include `knowledge_base_id` and `knowledge_base_provider`
@@ -11,4 +11,4 @@ Retrieve detailed information about a specific collection by its collection id.

If the `include_docs` parameter is true, the response will also include a list of document IDs associated with the given collection. Note that the documents returned are stored not only by Kaapi but also by the vector store provider.

Additionally, if you set the `include_url` parameter to true, a signed URL will be included in the response, which is a clickable link to access the retrieved document. If you don't set it to true, the URL will not be included in the response.
Additionally, if you set the `include_url` parameter to true, a signed URL will be included in the response for each retrieved document, giving a clickable link to access it. If you don't set it to true, the URL will not be included in the response.
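The either/or note above (assistant fields versus vector-store fields) can be handled with a small discriminator when consuming the response. This is a sketch over a plain dict; the key names follow the documentation, but the return shape is an arbitrary choice for illustration.

```python
def service_identity(collection):
    """Return which backing resource a collection exposes.

    Per the docs, a collection response carries either the assistant
    fields or the vector-store fields, never both.
    """
    if "llm_service_id" in collection:
        # An Assistant was created (model + instructions were passed).
        return ("assistant", collection["llm_service_id"], collection.get("llm_service_name"))
    if "knowledge_base_id" in collection:
        # Only a Vector Store was created.
        return ("vector_store", collection["knowledge_base_id"], collection.get("knowledge_base_provider"))
    raise ValueError("collection has neither assistant nor vector-store fields")
```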
6 changes: 4 additions & 2 deletions backend/app/api/docs/collections/job_info.md
@@ -1,6 +1,8 @@
Retrieve information about a collection job by the collection job ID. This endpoint provides detailed status and metadata for a specific collection job in Kaapi. It is especially useful for:
Retrieve information about a collection job by the collection job ID.

* Fetching the collection job object, including the collection job ID, the current status, and the associated collection details.
This endpoint is especially useful for:

* Fetching the collection job information, including the collection job ID, the current status, and the associated collection details.

* If the job has finished successfully and it was a collection creation job, this endpoint will also fetch the associated collection details.

6 changes: 3 additions & 3 deletions backend/app/api/docs/collections/list.md
@@ -2,7 +2,7 @@ List all _active_ collections that have been created and are not deleted.

**Response Fields:**

**Note:** While the API schema shows both `llm_service_id`/`llm_service_name` AND `knowledge_base_id`/`knowledge_base_provider`, each collection in the response will only include the fields relevant to what was created:
**Note:** While the example response shows both `llm_service_id`/`llm_service_name` AND `knowledge_base_id`/`knowledge_base_provider`, each collection in the response will only include the fields relevant to what was created:

- **If an Assistant was created** (with model + instructions): The response will only include `llm_service_id` and `llm_service_name` (e.g., `llm_service_name: "gpt-4o"` and the assistant ID)
- **If only a Vector Store was created** (without model/instructions): The response will only include `knowledge_base_id` and `knowledge_base_provider` (e.g., `knowledge_base_provider: "openai vector store"` and the vector store ID)
- **If an Assistant was created** (with model + instructions): The response will only include `llm_service_id` (the assistant ID) and `llm_service_name` (e.g., `llm_service_name: "gpt-4o"`)
- **If only a Vector Store was created** (without model/instructions): The response will only include `knowledge_base_id` (the vector store ID) and `knowledge_base_provider` (e.g., `knowledge_base_provider: "openai vector store"`)
11 changes: 6 additions & 5 deletions backend/app/api/docs/credentials/create.md
@@ -43,17 +43,18 @@ Credentials are encrypted and stored securely for provider integrations (OpenAI,
"host": "https://cloud.langfuse.com"
},
"webhook_secret": {
"webhook_secret: "webhook_secret"
},
"webhook_secret": "webhook_secret"
}
}
}
```
#### For registering Webhook Secret
```json
{
"credential":{
"webhook_secret":"your-webhook-secret"
"credential": {
"webhook_secret": {
"webhook_secret": "webhook_secret"
}
}
}
```
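The nested shape of the webhook-secret credential is easy to get wrong (note the repeated `webhook_secret` key). A tiny builder keeps the structure in one place; this is a sketch mirroring the JSON example above, not an official client helper.

```python
import json


def webhook_secret_credential(secret):
    """Build the request body for registering a webhook secret,
    mirroring the nested shape shown in the JSON example above."""
    return {
        "credential": {
            "webhook_secret": {
                "webhook_secret": secret,
            }
        }
    }


# Serialize for an HTTP request body.
payload = json.dumps(webhook_secret_credential("your-webhook-secret"))
```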
3 changes: 0 additions & 3 deletions backend/app/api/docs/documents/README.md

This file was deleted.

12 changes: 5 additions & 7 deletions backend/app/api/docs/documents/delete.md
@@ -1,8 +1,6 @@
Perform a delete of the document. This makes the
document invisible. It does not delete the document from cloud storage
or its information from the database.
Perform a soft delete of the document.

If the document is part of an active collection, those collections
will be deleted using the collections delete interface. Noteably, this
means all OpenAI Vector Store's and Assistant's to which this document
belongs will be deleted.
This makes the document invisible. It does not delete the document
from cloud storage or its information from the database.

If the document belongs to any active collections, those collections will also be deleted. This includes all associated knowledge bases — for example, any OpenAI vector stores that were created through this platform with this document.
7 changes: 3 additions & 4 deletions backend/app/api/docs/documents/permanent_delete.md
@@ -1,8 +1,7 @@
Permanently delete a document from cloud storage.

This operation marks the document as deleted in the database while retaining its metadata. However, the actual file is
permanently deleted from cloud storage (e.g., S3) and cannot be recovered. Only the database record remains for reference
purposes.

If the document is part of an active collection, those collections
will be deleted using the collections delete interface. Noteably, this
means all OpenAI Vector Store's and Assistant's to which this document
belongs will be deleted.
If the document belongs to any active collections, those collections will also be deleted. This includes all associated knowledge bases — for example, any OpenAI vector stores that were created through this platform with this document.
10 changes: 5 additions & 5 deletions backend/app/api/docs/documents/upload.md
@@ -1,18 +1,18 @@
Upload a document to Kaapi.
Upload a document to Kaapi and optionally transform it as well.

- If only a file is provided, the document will be uploaded and stored, and its ID will be returned.
- If a target format is specified, a transformation job will also be created to transform the document into the target format in the background. The response will include both the uploaded document details and information about the transformation job.
- If a callback URL is provided, you will receive a notification at that URL once the document transformation job is completed.

### Supported Transformations
### Supported Transformations:

The following (source_format → target_format) transformations are supported:
The following (source_format → target_format) transformations are supported for now:

- pdf → markdown
- zerox

### Transformers
### Transformers:

Available transformer names and their implementations, default transformer is zerox:
Available transformer names and their implementations; the default transformer is zerox for now:

- `zerox`
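The supported pairs and transformer names above can be validated client-side before submitting an upload. This is a hypothetical convenience sketch, not part of the API; the constants simply restate the lists from this document.

```python
# Supported (source_format, target_format) pairs and transformer names,
# restated from the lists above so validation stays in one place.
SUPPORTED_TRANSFORMATIONS = {("pdf", "markdown")}
TRANSFORMERS = {"zerox"}
DEFAULT_TRANSFORMER = "zerox"


def validate_transform_request(source_format, target_format, transformer=None):
    """Check an upload's optional transformation settings before submitting."""
    if (source_format, target_format) not in SUPPORTED_TRANSFORMATIONS:
        raise ValueError(f"unsupported transformation: {source_format} -> {target_format}")
    transformer = transformer or DEFAULT_TRANSFORMER
    if transformer not in TRANSFORMERS:
        raise ValueError(f"unknown transformer: {transformer}")
    return transformer
```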