@@ -68,7 +68,7 @@ RAG is the architectural foundation of OpenCodeIntel.
The knowledge base is built by parsing source code at function granularity using tree-sitter (an incremental parsing library that produces ASTs for Python, JavaScript, TypeScript, and TSX). Each parsed function is converted into rich embedding text:
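As a rough illustration of function-granularity parsing, the sketch below walks a Python file and emits one text per function. It is not OCI's implementation: it swaps in Python's standard-library `ast` module for tree-sitter so that it runs without extra dependencies, and the helper name `function_embedding_texts` and the `file:` / `function:` / `docstring:` layout are hypothetical. The actual embedding text format is not shown in this excerpt.

```python
import ast


def function_embedding_texts(source: str, file_path: str) -> list[str]:
    """Return one embedding-ready text per function found in `source`.

    Stand-in for tree-sitter: uses the stdlib `ast` parser, so it only
    handles Python. The output layout is an assumption for illustration.
    """
    texts = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Positional parameter names only; enough for a readable signature.
            params = ", ".join(arg.arg for arg in node.args.args)
            doc = ast.get_docstring(node) or ""
            # Hypothetical field names, not the project's real format.
            texts.append(
                f"file: {file_path}\n"
                f"function: {node.name}({params})\n"
                f"docstring: {doc}"
            )
    return texts


sample = '''
def login(user, password):
    """Authenticate a user."""
    return True
'''

texts = function_embedding_texts(sample, "auth/session.py")
```

Each resulting string would then be passed to the embedding model, so that a query matches on the function's name, signature, and documentation rather than on raw file contents.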
The `get_context_for_task` tool is the primary prompt engineering component. It solves a specific problem: even with the right files retrieved, an AI assistant still needs to know the project's conventions (what exception class to use, what auth pattern to follow, where to put new files). Without this, the AI generates correct-looking but wrong code.
The context assembler reads rule files in priority order:
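The idea of reading rule files in priority order can be sketched as follows. This is a minimal illustration under stated assumptions: the function `read_rules_in_priority_order` and the file names `team_rules.md` and `project_rules.md` are placeholders invented for the example, not the names OCI actually looks for, and the real priority list is not reproduced here.

```python
import tempfile
from pathlib import Path


def read_rules_in_priority_order(
    repo_root: Path, candidates: list[str]
) -> list[tuple[str, str]]:
    """Return (name, contents) for each candidate rule file that exists.

    `candidates` is listed highest priority first, and the result keeps
    that order so earlier entries can take precedence downstream.
    """
    found = []
    for name in candidates:
        path = repo_root / name
        if path.is_file():
            found.append((name, path.read_text(encoding="utf-8")))
    return found


# Demo with placeholder file names (hypothetical, for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "project_rules.md").write_text("Use AppError.", encoding="utf-8")
    (root / "team_rules.md").write_text("Put routes in api/.", encoding="utf-8")

    rules = read_rules_in_priority_order(
        root,
        ["team_rules.md", "missing_rules.md", "project_rules.md"],
    )
```

Missing files are skipped silently, so a repository that provides only some of the rule files still yields a usable, ordered set of conventions.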
@@ -269,7 +269,7 @@ The context assembly feature could theoretically be used to extract sensitive pa
The embedding model (OpenAI `text-embedding-3-small`) may perform better on English-language identifiers and comments than on other languages. Codebases with non-English naming conventions may see lower retrieval recall. This is a known limitation.
**Copyright**
-OCI does not reproduce or redistribute source code. It stores vector embeddings (real-valued floating point arrays) which cannot be reverse-engineered to reconstruct source code. Retrieval returns file paths and function signatures to help the AI locate relevant code — not the code itself verbatim (unless the user has authorized access to that repo).
+OCI does not reproduce or redistribute source code. It stores vector embeddings (real-valued floating point arrays) which significantly reduces the risk of reconstructing original source code. Retrieval returns file paths and function signatures to help the AI locate relevant code — not the code itself verbatim (unless the user has authorized access to that repo).
**Content Filtering**
The system does not filter for malicious code patterns. It indexes whatever the user points it at. Users are responsible for ensuring they have authorization to index the repositories they connect.
@@ -289,7 +289,7 @@ The system does not filter for malicious code patterns. It indexes whatever the