From f686331660b7ffad84737b6eeefd0cb645714cc7 Mon Sep 17 00:00:00 2001 From: nishika26 Date: Wed, 6 May 2026 17:52:16 +0530 Subject: [PATCH 1/3] fixing documentation, fixing merge conflict --- backend/app/api/docs/collections/create.md | 15 +++++---------- backend/app/api/docs/collections/delete.md | 8 ++++---- backend/app/api/docs/collections/info.md | 6 +++--- backend/app/api/docs/collections/job_info.md | 6 ++++-- backend/app/api/docs/collections/list.md | 6 +++--- backend/app/api/docs/credentials/create.md | 7 ++++--- backend/app/api/docs/documents/delete.md | 12 +++++------- .../app/api/docs/documents/permanent_delete.md | 7 +++---- backend/app/api/docs/documents/upload.md | 8 ++++---- 9 files changed, 35 insertions(+), 40 deletions(-) diff --git a/backend/app/api/docs/collections/create.md b/backend/app/api/docs/collections/create.md index bc94e7c45..a17ff1f1a 100644 --- a/backend/app/api/docs/collections/create.md +++ b/backend/app/api/docs/collections/create.md @@ -1,4 +1,4 @@ -Setup and configure the document store that is pertinent to the RAG +Set up and configure the Vector store that is pertinent to the File search pipeline: * Create a vector store from the document IDs you received after uploading your @@ -8,23 +8,18 @@ pipeline: the cumulative size of documents uploaded to a vector store reaches 30 MB or the batch reaches 200 files, whichever limit is hit first. * [Deprecated] Attach the Vector Store to an OpenAI - [Assistant](https://platform.openai.com/docs/api-reference/assistants). Use + Assistant. Use parameters in the request body relevant to an Assistant to flesh out its configuration. Note that an assistant will only be created when you pass both "model" and "instruction" in the request body; otherwise only a vector store will be created from the documents given. -If any one of the LLM service interactions fail, all service resources are -cleaned up.
If an OpenAI vector Store is unable to be created, for example, -all file(s) that were uploaded to OpenAI are removed from -OpenAI. Failure can occur from OpenAI being down, or some parameter -value being invalid. It can also fail due to document types not being -accepted. This is especially true for PDFs that may not be parseable. +If any step in the LLM service interaction fails, all previously created resources are cleaned up automatically. For example, if the vector store creation fails, any files already uploaded to OpenAI are removed. Failures can be caused by service downtime, invalid parameter values, or unsupported document types — the latter is especially common with PDFs that cannot be parsed. -In the case of Openai, Vector store/assistant will be created asynchronously. +The Vector store/assistant will be created asynchronously. The immediate response from this endpoint is `collection_job` object which is going to contain the collection "job ID" and status. Once the collection has been created, information about the collection will be returned to the user via the callback URL. If a callback URL is not provided, clients can check the `collection job info` endpoint with the `job_id`, to retrieve -information about the creation of collection. +information about the created collection. diff --git a/backend/app/api/docs/collections/delete.md b/backend/app/api/docs/collections/delete.md index 8cb213d51..97858d7fd 100644 --- a/backend/app/api/docs/collections/delete.md +++ b/backend/app/api/docs/collections/delete.md @@ -1,4 +1,6 @@ -Remove a collection from the platform. This is a two step process: +Remove a collection from the platform. + +This is a two step process: 1. Delete all resources that were allocated: file(s), the Vector Store, and the Assistant. @@ -6,9 +8,7 @@ Remove a collection from the platform. 
This is a two step process: No action is taken on the documents themselves: the contents of the documents that were a part of the collection remain unchanged, those -documents can still be accessed via the documents endpoints. The response from this -endpoint will be a `collection_job` object which will contain the collection `job_id` and -status. When you take the id returned and use the `collection job info` endpoint, +documents can still be accessed via the documents endpoints. The endpoint returns the job ID and status of the collection delete operation. Pass the returned ID to the `collection job info` endpoint to check whether the job completed successfully. Additionally, if a `callback_url` was provided in the request body, you will receive a message indicating whether the deletion succeeded or failed. diff --git a/backend/app/api/docs/collections/info.md b/backend/app/api/docs/collections/info.md index 5703366ab..c34d5dd36 100644 --- a/backend/app/api/docs/collections/info.md +++ b/backend/app/api/docs/collections/info.md @@ -1,8 +1,8 @@ -Retrieve detailed information about a specific collection by its collection id. This endpoint returns the collection object including its project, organization, timestamps, and service-specific details. +Retrieve detailed information about a specific collection by its collection id.
**Response Fields:**

**Note:** While the example response shows both `llm_service_id`/`llm_service_name` AND `knowledge_base_id`/`knowledge_base_provider`, the actual response will only include the fields relevant to what was created:

- **If an Assistant was created** (with model + instructions): The response will only include `llm_service_id` and `llm_service_name`
- **If only a Vector Store was created** (without model/instructions): The response will only include `knowledge_base_id` and `knowledge_base_provider`

@@ -11,4 +11,4 @@ Retrieve detailed information about a specific collection by its collection id. If the `include_docs` parameter is true, the response will also include a list of document IDs associated with the collection. Note that the documents returned are stored not only by Kaapi but also by the vector store provider. -Additionally, if you set the `include_url` parameter to true, a signed URL will be included in the response, which is a clickable link to access the retrieved document. If you don't set it to true, the URL will not be included in the response. +Additionally, if you set the `include_url` parameter to true, the response will include a signed URL for each retrieved document in the collection: a clickable link for accessing it. If it is not set to true, no URL is included in the response. diff --git a/backend/app/api/docs/collections/job_info.md b/backend/app/api/docs/collections/job_info.md index 8ca288b7e..57d1f55f8 100644 --- a/backend/app/api/docs/collections/job_info.md +++ b/backend/app/api/docs/collections/job_info.md @@ -1,6 +1,8 @@ -Retrieve information about a collection job by the collection job ID.
This endpoint provides detailed status and metadata for a specific collection job in Kaapi. It is especially useful for: +Retrieve information about a collection job by the collection job ID. -* Fetching the collection job object, including the collection job ID, the current status, and the associated collection details. +This endpoint is especially useful for: + +* Fetching the collection job information, including the collection job ID, the current status, and the associated collection details. * If the job has finished successfully and it was a collection creation job, this endpoint will also fetch the associated collection details. diff --git a/backend/app/api/docs/collections/list.md b/backend/app/api/docs/collections/list.md index ae5ad46e3..a81e984a2 100644 --- a/backend/app/api/docs/collections/list.md +++ b/backend/app/api/docs/collections/list.md @@ -2,7 +2,7 @@ List all _active_ collections that have been created and are not deleted. **Response Fields:** -**Note:** While the API schema shows both `llm_service_id`/`llm_service_name` AND `knowledge_base_id`/`knowledge_base_provider`, each collection in the response will only include the fields relevant to what was created: +**Note:** While the example response shows both `llm_service_id`/`llm_service_name` AND `knowledge_base_id`/`knowledge_base_provider`, each collection in the response will only include the fields relevant to what was created: -- **If an Assistant was created** (with model + instructions): The response will only include `llm_service_id` and `llm_service_name` (e.g., `llm_service_name: "gpt-4o"` and the assistant ID) -- **If only a Vector Store was created** (without model/instructions): The response will only include `knowledge_base_id` and `knowledge_base_provider` (e.g., `knowledge_base_provider: "openai vector store"` and the vector store ID) +- **If an Assistant was created** (with model + instructions): The response will only include `llm_service_id`(the assistant ID)
and `llm_service_name` (e.g., `llm_service_name: "gpt-4o"` and the assistant ID) +- **If only a Vector Store was created** (without model/instructions): The response will only include `knowledge_base_id`(the vector store ID) and `knowledge_base_provider` (e.g., `knowledge_base_provider: "openai vector store"`) diff --git a/backend/app/api/docs/credentials/create.md b/backend/app/api/docs/credentials/create.md index 31fb87980..9b317d8ae 100644 --- a/backend/app/api/docs/credentials/create.md +++ b/backend/app/api/docs/credentials/create.md @@ -51,9 +51,10 @@ Credentials are encrypted and stored securely for provider integrations (OpenAI, #### For registering Webhook Secret ```json { - "credential":{ - "webhook_secret":"your-webhook-secret" + "credential": { + "webhook_": { + "webhook_secret: "webhook_secret" + } } - } ``` diff --git a/backend/app/api/docs/documents/delete.md b/backend/app/api/docs/documents/delete.md index ff7af99b6..734810664 100644 --- a/backend/app/api/docs/documents/delete.md +++ b/backend/app/api/docs/documents/delete.md @@ -1,8 +1,6 @@ -Perform a delete of the document. This makes the -document invisible. It does not delete the document from cloud storage -or its information from the database. +Perform a delete of the document. -If the document is part of an active collection, those collections -will be deleted using the collections delete interface. Noteably, this -means all OpenAI Vector Store's and Assistant's to which this document -belongs will be deleted. +This makes the document invisible. It does not delete the document +from cloud storage or its information from the database. + +If the document belongs to any active collections, those collections will also be deleted. This includes all associated knowledge bases — for example, any OpenAI vector stores that were created through this platform with this document. 
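The soft-delete semantics described above (document hidden, storage and database record untouched, active collections and their knowledge bases torn down) can be sketched as a small helper. This is an illustrative sketch only, not the platform's code; the `Collection` class and all field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Collection:
    # Hypothetical stand-in for a platform collection record.
    id: str
    document_ids: set = field(default_factory=set)
    active: bool = True

def cascade_for_document_delete(document_id, collections, permanent=False):
    """Summarize what the docs say a (soft or permanent) delete removes."""
    affected = [c.id for c in collections
                if c.active and document_id in c.document_ids]
    return {
        "document_hidden": True,                 # both delete flavours hide the document
        "file_removed_from_storage": permanent,  # only permanent delete touches cloud storage
        "db_record_kept": True,                  # the database record is retained either way
        "collections_deleted": affected,         # active collections (and their vector stores) go too
    }
```

For example, `cascade_for_document_delete("doc-1", [Collection("col-1", {"doc-1"})])` reports that `col-1` would be deleted while the stored file survives.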
diff --git a/backend/app/api/docs/documents/permanent_delete.md b/backend/app/api/docs/documents/permanent_delete.md index 2a6479803..1944b7e48 100644 --- a/backend/app/api/docs/documents/permanent_delete.md +++ b/backend/app/api/docs/documents/permanent_delete.md @@ -1,8 +1,7 @@ +Permanently delete a document from cloud storage. + This operation marks the document as deleted in the database while retaining its metadata. However, the actual file is permanently deleted from cloud storage (e.g., S3) and cannot be recovered. Only the database record remains for reference purposes. -If the document is part of an active collection, those collections -will be deleted using the collections delete interface. Noteably, this -means all OpenAI Vector Store's and Assistant's to which this document -belongs will be deleted. +If the document belongs to any active collections, those collections will also be deleted. This includes all associated knowledge bases — for example, any OpenAI vector stores that were created through this platform with this document. diff --git a/backend/app/api/docs/documents/upload.md b/backend/app/api/docs/documents/upload.md index e667015f5..e7ce57148 100644 --- a/backend/app/api/docs/documents/upload.md +++ b/backend/app/api/docs/documents/upload.md @@ -4,15 +4,15 @@ Upload a document to Kaapi. - If a target format is specified, a transformation job will also be created to transform document into target format in the background. The response will include both the uploaded document details and information about the transformation job. - If a callback URL is provided, you will receive a notification at that URL once the document transformation job is completed. 
-### Supported Transformations +### Supported Transformations: -The following (source_format → target_format) transformations are supported: +The following (source_format → target_format) transformations are currently supported: - pdf → markdown - zerox -### Transformers +### Transformers: -Available transformer names and their implementations, default transformer is zerox: +Available transformer names and their implementations; the default transformer is zerox: - `zerox` From 63398b09bef0189dc64ffc8d460a53ae1b1c744f Mon Sep 17 00:00:00 2001 From: nishika26 Date: Thu, 7 May 2026 09:32:55 +0530 Subject: [PATCH 2/3] adding a few more changes to docs --- backend/app/api/docs/collections/README.md | 8 -------- backend/app/api/docs/collections/create.md | 2 +- backend/app/api/docs/collections/list.md | 2 +- backend/app/api/docs/documents/README.md | 3 --- backend/app/api/docs/documents/upload.md | 2 +- 5 files changed, 3 insertions(+), 14 deletions(-) delete mode 100644 backend/app/api/docs/collections/README.md delete mode 100644 backend/app/api/docs/documents/README.md diff --git a/backend/app/api/docs/collections/README.md b/backend/app/api/docs/collections/README.md deleted file mode 100644 index 38e51d735..000000000 --- a/backend/app/api/docs/collections/README.md +++ /dev/null @@ -1,8 +0,0 @@ -The collections interface is designed to manage document relationships -with RAG pipelines; where a RAG pipeline is any framework that aligns -LLM responses with information from a focused corpus of documents. - -Right now this endpoint tightly coupled with OpenAI [File -Search](https://platform.openai.com/docs/assistants/tools/file-search). Its -functionality, along with descriptions in this section, are therefore -centered around that.
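The supported-transformation table in the upload docs earlier in this series (currently just pdf → markdown, handled by the zerox transformer) can be sketched as a validation helper. The table contents come from the docs themselves; the function and constant names below are hypothetical, not part of the actual API:

```python
# (source_format, target_format) -> available transformer names,
# per the upload documentation; `zerox` is the stated default.
SUPPORTED_TRANSFORMATIONS = {
    ("pdf", "markdown"): ["zerox"],
}

def pick_transformer(source_format, target_format, name=None):
    """Return a valid transformer name for the requested conversion."""
    available = SUPPORTED_TRANSFORMATIONS.get((source_format, target_format))
    if available is None:
        raise ValueError(
            f"unsupported transformation: {source_format} -> {target_format}")
    chosen = name or available[0]  # fall back to the default transformer
    if chosen not in available:
        raise ValueError(f"unknown transformer: {chosen}")
    return chosen
```

A caller would run this check before submitting an upload with a `target_format`, so an unsupported pair fails fast instead of creating a doomed transformation job.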
diff --git a/backend/app/api/docs/collections/create.md b/backend/app/api/docs/collections/create.md index a17ff1f1a..8900027fd 100644 --- a/backend/app/api/docs/collections/create.md +++ b/backend/app/api/docs/collections/create.md @@ -17,7 +17,7 @@ pipeline: If any step in the LLM service interaction fails, all previously created resources are cleaned up automatically. For example, if the vector store creation fails, any files already uploaded to OpenAI are removed. Failures can be caused by service downtime, invalid parameter values, or unsupported document types — the latter is especially common with PDFs that cannot be parsed. The Vector store/assistant will be created asynchronously. -The immediate response from this endpoint is `collection_job` object which is +The immediate response from this endpoint will contain the collection job ID and status. Once the collection has been created, information about the collection will be returned to the user via the callback URL. If a callback URL is not provided, clients can check the diff --git a/backend/app/api/docs/collections/list.md b/backend/app/api/docs/collections/list.md index a81e984a2..a29edcf2f 100644 --- a/backend/app/api/docs/collections/list.md +++ b/backend/app/api/docs/collections/list.md @@ -4,5 +4,5 @@ List all _active_ collections that have been created and are not deleted.
**Note:** While the example response shows both `llm_service_id`/`llm_service_name` AND `knowledge_base_id`/`knowledge_base_provider`, each collection in the response will only include the fields relevant to what was created: -- **If an Assistant was created** (with model + instructions): The response will only include `llm_service_id`(the assistant ID) and `llm_service_name` (e.g., `llm_service_name: "gpt-4o"` and the assistant ID) +- **If an Assistant was created** (with model + instructions): The response will only include `llm_service_id` (the assistant ID) and `llm_service_name` (e.g., `llm_service_name: "gpt-4o"`) - **If only a Vector Store was created** (without model/instructions): The response will only include `knowledge_base_id` (the vector store ID) and `knowledge_base_provider` (e.g., `knowledge_base_provider: "openai vector store"`) diff --git a/backend/app/api/docs/documents/README.md b/backend/app/api/docs/documents/README.md deleted file mode 100644 index dcdab1673..000000000 --- a/backend/app/api/docs/documents/README.md +++ /dev/null @@ -1,3 +0,0 @@ -The documents interface manages client documents intended to drive LLM -chat interactions. The platform stores documents in AWS S3, but may -put copies in other databases to facilitate RAG pipelines. diff --git a/backend/app/api/docs/documents/upload.md b/backend/app/api/docs/documents/upload.md index e7ce57148..e7175b404 100644 --- a/backend/app/api/docs/documents/upload.md +++ b/backend/app/api/docs/documents/upload.md @@ -1,4 +1,4 @@ -Upload a document to Kaapi. +Upload a document to Kaapi and optionally transform it. - If only a file is provided, the document will be uploaded and stored, and its ID will be returned. - If a target format is specified, a transformation job will also be created to transform document into target format in the background. The response will include both the uploaded document details and information about the transformation job.
From 9a4ccb90e3afc565576106059306bf2e10b91688 Mon Sep 17 00:00:00 2001 From: nishika26 Date: Thu, 7 May 2026 09:46:15 +0530 Subject: [PATCH 3/3] coderabbit reviews --- backend/app/api/docs/collections/delete.md | 2 +- backend/app/api/docs/credentials/create.md | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/backend/app/api/docs/collections/delete.md b/backend/app/api/docs/collections/delete.md index 97858d7fd..c6ffeabb2 100644 --- a/backend/app/api/docs/collections/delete.md +++ b/backend/app/api/docs/collections/delete.md @@ -1,6 +1,6 @@ Remove a collection from the platform. -This is a two step process: +This is a two-step process: 1. Delete all resources that were allocated: file(s), the Vector Store, and the Assistant. diff --git a/backend/app/api/docs/credentials/create.md b/backend/app/api/docs/credentials/create.md index 9b317d8ae..f01065f4d 100644 --- a/backend/app/api/docs/credentials/create.md +++ b/backend/app/api/docs/credentials/create.md @@ -43,8 +43,8 @@ Credentials are encrypted and stored securely for provider integrations (OpenAI, "host": "https://cloud.langfuse.com" }, "webhook_secret": { - "webhook_secret: "webhook_secret" - }, + "webhook_secret": "webhook_secret" + } } } ``` @@ -52,8 +52,8 @@ Credentials are encrypted and stored securely for provider integrations (OpenAI, ```json { "credential": { - "webhook_": { - "webhook_secret: "webhook_secret" + "webhook_secret": { + "webhook_secret": "webhook_secret" } } }
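Since collection creation and deletion are asynchronous, a client that does not supply a `callback_url` typically polls the `collection job info` endpoint with the returned job ID. A minimal polling sketch, assuming a caller-supplied `fetch_job_info` function that GETs that endpoint and returns the parsed JSON; the terminal status values `"successful"` and `"failed"` are assumptions based on the wording in these docs, not a confirmed schema:

```python
import time

def wait_for_collection_job(fetch_job_info, job_id,
                            interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll a collection job until it reaches a terminal status.

    fetch_job_info(job_id) must return the job object as a dict; it is
    injected here so the sketch stays independent of any HTTP client.
    """
    for _ in range(max_attempts):
        job = fetch_job_info(job_id)
        if job.get("status") in ("successful", "failed"):  # assumed terminal states
            return job
        sleep(interval)
    raise TimeoutError(f"collection job {job_id} did not finish in time")
```

Injecting both the fetch function and the sleep function keeps the loop trivially testable and lets callers plug in whatever HTTP client and back-off policy they use.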