Skip to content

Comments

feat(memory): add knowledgebase and memory management from LocalRecall#424

Merged
mudler merged 2 commits intomainfrom
feat/localrecall
Feb 21, 2026
Merged

feat(memory): add knowledgebase and memory management from LocalRecall#424
mudler merged 2 commits intomainfrom
feat/localrecall

Conversation

@mudler
Copy link
Owner

@mudler mudler commented Feb 21, 2026

This PR merges LocalRecall functionalities in LocalAGI by importing it as a package, and adding it directly to the WebUI and exposing the relevant REST API endpoints.

The main change is that the knowledge base functionality is now built-in, removing the need for a separate LocalRecall service. The agent pool and compaction logic have been updated to support both embedded and HTTP-based knowledge base providers via a new abstraction. Documentation and Docker Compose files have been updated to reflect this new architecture.

Knowledge base integration refactor:

  • Introduced the RAGProvider abstraction, allowing agent pools to use either an embedded or HTTP-based knowledge base, and updated agent startup logic to use this provider. (core/state/pool.go) [1] [2] [3] [4] [5] [6] [7] [8]
  • Updated compaction logic to use a new KBCompactionClient interface, supporting both HTTP and embedded knowledge base clients. (core/state/compaction.go) [1] [2] [3] [4]
  • Modified agent startup to use the new provider for knowledge base actions and compaction, removing direct dependency on LocalRecall. (core/state/pool.go)

Documentation updates:

  • Revised the README.md to clarify that the knowledge base is now built-in, updated environment variable descriptions, and improved instructions for agent pool setup and configuration. (README.md) [1] [2] [3] [4] [5] [6] [7]

Docker Compose simplification:

  • Removed LocalRecall-related services from all Compose files and renamed the localrecall-postgres service to postgres, reflecting the new architecture. (docker-compose.yaml, docker-compose.amd.yaml, docker-compose.intel.yaml, docker-compose.nvidia.yaml) [1] [2] [3] [4]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Copilot AI review requested due to automatic review settings February 21, 2026 22:17
@mudler mudler merged commit c56cc43 into main Feb 21, 2026
4 checks passed
@mudler mudler deleted the feat/localrecall branch February 21, 2026 22:17
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR integrates LocalRecall knowledge base functionality directly into LocalAGI by embedding it as a library, eliminating the need for a separate LocalRecall service. The integration introduces a new abstraction layer (RAGProvider) that supports both embedded and HTTP-based knowledge base providers, allowing flexible deployment options.

Changes:

  • Added embedded knowledge base with REST API endpoints (/api/collections/*) compatible with LocalRecall, including collection management, file uploads, semantic search, and external source syncing
  • Refactored agent pool and compaction logic to use RAGProvider abstraction, supporting both embedded (internalRAGAdapter) and HTTP-based (wrappedClientCompactionAdapter) knowledge base implementations
  • Added comprehensive Web UI for knowledge base management with tabs for Search, Collections, Upload, Sources, and Entries
  • Simplified Docker Compose architecture by removing separate LocalRecall services, with PostgreSQL now used directly by LocalAGI for vector storage
  • Updated documentation to reflect built-in knowledge base capabilities and new environment variables

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 21 comments.

Show a summary per file
File Description
webui/collections_handlers.go New file implementing LocalRecall-compatible REST API endpoints for collection management, file uploads, search, and external sources
webui/collections_internal.go New file providing internal adapters (internalRAGAdapter, internalCompactionAdapter) to bridge in-process collections with agent RAGDB interface
webui/routes.go Registers collection API routes via RegisterCollectionRoutes
webui/options.go Adds configuration options for collection database path, vector engine, embedding model, chunking settings, and PostgreSQL URL
webui/app.go Adds collectionsState field to App for in-process RAG state management
main.go Initializes collection configuration from environment variables, creates RAG provider (HTTP or embedded), and sets it on the agent pool
core/state/pool.go Introduces RAGProvider abstraction and NewHTTPRAGProvider, removes direct LocalRAG URL/key fields, adds SetRAGProvider method for configuring KB access
core/state/compaction.go Defines KBCompactionClient interface and adapts HTTP client to it via wrappedClientCompactionAdapter
webui/react-ui/src/pages/Knowledge.jsx New comprehensive UI component with tabs for searching, managing collections, uploading files, syncing external sources, and viewing entries
webui/react-ui/src/utils/api.js Adds collectionsApi with methods for all collection operations and handleCollectionsResponse for standardized response handling
webui/react-ui/src/utils/config.js Defines API endpoint configurations for collections
webui/react-ui/src/router.jsx Adds /knowledge route for the knowledge base UI
webui/react-ui/src/App.jsx Adds "Knowledge base" navigation menu item
webui/react-ui/src/App.css Adds comprehensive styling for knowledge base UI components
docker-compose.yaml Removes separate LocalRecall service and healthcheck, renames localrecall-postgres to postgres, removes localrag_assets volume, updates LocalAGI environment to use embedded KB with PostgreSQL
docker-compose.nvidia.yaml Updates service references to reflect removal of LocalRecall services
docker-compose.intel.yaml Updates service references to reflect removal of LocalRecall services
docker-compose.amd.yaml Updates service references to reflect removal of LocalRecall services
go.mod Adds github.com/mudler/localrecall v0.5.4 and related dependencies (PostgreSQL driver, PDF parser, sitemap parser), updates xlog to v0.0.5
go.sum Updates checksums for new and updated dependencies
README.md Updates documentation to clarify built-in knowledge base, new environment variables, and removal of separate LocalRecall service requirement

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

}
defer f.Close()

filePath := filepath.Join(cfg.FileAssets, file.Filename)
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file is saved to cfg.FileAssets with the original filename from the upload. This could lead to path traversal vulnerabilities if the filename contains ".." or absolute paths. Consider sanitizing the filename using filepath.Base or similar to ensure it's just a filename without directory components.

Copilot uses AI. Check for mistakes.
Comment on lines +254 to +257
if err := collection.Store(filePath, map[string]string{"created_at": now}); err != nil {
xlog.Error("Failed to store file", err)
return c.Status(fiber.StatusInternalServerError).JSON(collectionsErrorResponse(errCodeInternalError, "Failed to store file", err.Error()))
}
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The created file is not cleaned up if collection.Store fails. The deferred out.Close() will close the file, but the file will remain in FileAssets directory. Consider removing the file if Store fails to avoid orphaned files.

Copilot uses AI. Check for mistakes.
const handleCollectionsResponse = async (response) => {
const data = await response.json().catch(() => ({}));
if (!response.ok || data.success === false) {
const msg = data.error?.message || data.error?.details || data.message || `API error: ${response.status}`;
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message construction could be improved for clarity. When response.ok is false but data.success is undefined, the error message will be "API error: [status]" which may not be as helpful. Consider checking for data.error first, then data.message, and finally fall back to a generic message with more detail about what failed.

Suggested change
const msg = data.error?.message || data.error?.details || data.message || `API error: ${response.status}`;
const errorMessageFromError =
typeof data.error === 'string'
? data.error
: data.error?.message || data.error?.details;
const msg =
errorMessageFromError ||
data.message ||
`Collections API error: ${response.status}${response.statusText ? ' ' + response.statusText : ''}`;

Copilot uses AI. Check for mistakes.
Comment on lines +41 to +42
hash := md5.Sum([]byte(s))
fileName := fmt.Sprintf("%s-%s.txt", dateTime, hex.EncodeToString(hash[:]))
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash is computed from the entire content but only the first few bytes are used in the filename. This creates a potential for hash collisions when the filename is truncated. Consider using the full hash string or at least a longer prefix (e.g., 16 characters instead of the current hex encoding which gives 32 characters) to reduce collision probability.

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +37
a.mu.RLock()
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return fmt.Errorf("collection not available")
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a race condition here. The read lock is released at line 35, but the check for kb == nil at line 36 uses the value read under the lock. If another goroutine modifies the collection between the unlock and the check, this could cause issues. Consider moving the nil check inside the read lock or restructuring to avoid the race.

Copilot uses AI. Check for mistakes.
Comment on lines +497 to +512
if (!confirm(`Delete "${entry}" from collection?`)) return;
setDeleting(entry);
try {
await collectionsApi.deleteEntry(selectedCollection, entry);
showToast('Entry deleted', 'success');
setEntries(await collectionsApi.listEntries(selectedCollection));
} catch (err) {
showToast(err.message || 'Failed to delete', 'error');
} finally {
setDeleting(null);
}
};

const handleReset = async () => {
if (!selectedCollection) return;
if (!confirm(`Reset collection "${selectedCollection}"? This removes all entries.`)) return;
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'confirm' function is used for destructive operations without a proper accessible alternative. This appears multiple times in this file (lines 223, 497, 512). Consider creating a reusable confirmation modal component that is accessible and provides a consistent user experience.

Copilot uses AI. Check for mistakes.
pool.SetRAGProvider(state.NewHTTPRAGProvider(localRAG, apiKey))
} else {
embedded := app.CollectionsRAGProvider()
pool.SetRAGProvider(func(collectionName, _, _ string) (agent.RAGDB, state.KBCompactionClient, bool) {
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RAG provider lambda ignores the second and third parameters (effectiveRAGURL and effectiveRAGKey) when calling the embedded provider. This means agent-specific LocalRAGURL and LocalRAGAPIKey configuration will be ignored when using the embedded mode. Consider documenting this behavior clearly or warn users that these agent-level settings are only used with external LocalRAG URLs.

Suggested change
pool.SetRAGProvider(func(collectionName, _, _ string) (agent.RAGDB, state.KBCompactionClient, bool) {
pool.SetRAGProvider(func(collectionName, effectiveRAGURL, effectiveRAGKey string) (agent.RAGDB, state.KBCompactionClient, bool) {
// Agent-specific LocalRAGURL/LocalRAGAPIKey are only used with external HTTP RAG providers.
// When using the embedded in-process provider, these settings are ignored.
if effectiveRAGURL != "" || effectiveRAGKey != "" {
log.Printf("warning: agent-specific LocalRAGURL / LocalRAGAPIKey are ignored when using embedded RAG provider")
}

Copilot uses AI. Check for mistakes.
Comment on lines +87 to +91
if chunkOverlapEnv != "" {
if n, err := strconv.Atoi(chunkOverlapEnv); err == nil {
chunkOverlap = n
}
}
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above - invalid CHUNK_OVERLAP values are silently ignored. Consider adding a warning log.

Copilot uses AI. Check for mistakes.
Comment on lines +109 to +145
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return nil, fmt.Errorf("collection not available")
}
return kb.ListDocuments(), nil
}

func (a *internalCompactionAdapter) GetEntryContent(entry string) (content string, chunkCount int, err error) {
a.mu.RLock()
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return "", 0, fmt.Errorf("collection not available")
}
return kb.GetEntryFileContent(entry)
}

func (a *internalCompactionAdapter) Store(filePath string) error {
a.mu.RLock()
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return fmt.Errorf("collection not available")
}
meta := map[string]string{"created_at": time.Now().Format(time.RFC3339)}
return kb.Store(filePath, meta)
}

func (a *internalCompactionAdapter) DeleteEntry(entry string) error {
a.mu.RLock()
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return fmt.Errorf("collection not available")
}
return kb.RemoveEntry(entry)
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same race condition pattern exists in the internalCompactionAdapter methods (ListEntries, GetEntryContent, Store, DeleteEntry). Consider applying the same fix as suggested for the internalRAGAdapter methods.

Suggested change
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return nil, fmt.Errorf("collection not available")
}
return kb.ListDocuments(), nil
}
func (a *internalCompactionAdapter) GetEntryContent(entry string) (content string, chunkCount int, err error) {
a.mu.RLock()
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return "", 0, fmt.Errorf("collection not available")
}
return kb.GetEntryFileContent(entry)
}
func (a *internalCompactionAdapter) Store(filePath string) error {
a.mu.RLock()
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return fmt.Errorf("collection not available")
}
meta := map[string]string{"created_at": time.Now().Format(time.RFC3339)}
return kb.Store(filePath, meta)
}
func (a *internalCompactionAdapter) DeleteEntry(entry string) error {
a.mu.RLock()
kb := a.kb
a.mu.RUnlock()
if kb == nil {
return fmt.Errorf("collection not available")
}
return kb.RemoveEntry(entry)
defer a.mu.RUnlock()
if a.kb == nil {
return nil, fmt.Errorf("collection not available")
}
return a.kb.ListDocuments(), nil
}
func (a *internalCompactionAdapter) GetEntryContent(entry string) (content string, chunkCount int, err error) {
a.mu.RLock()
defer a.mu.RUnlock()
if a.kb == nil {
return "", 0, fmt.Errorf("collection not available")
}
return a.kb.GetEntryFileContent(entry)
}
func (a *internalCompactionAdapter) Store(filePath string) error {
a.mu.RLock()
defer a.mu.RUnlock()
if a.kb == nil {
return fmt.Errorf("collection not available")
}
meta := map[string]string{"created_at": time.Now().Format(time.RFC3339)}
return a.kb.Store(filePath, meta)
}
func (a *internalCompactionAdapter) DeleteEntry(entry string) error {
a.mu.RLock()
defer a.mu.RUnlock()
if a.kb == nil {
return fmt.Errorf("collection not available")
}
return a.kb.RemoveEntry(entry)

Copilot uses AI. Check for mistakes.
Comment on lines +110 to +111
os.MkdirAll(cfg.CollectionDBPath, 0755)
os.MkdirAll(cfg.FileAssets, 0755)
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing error handling for os.MkdirAll. If directory creation fails, the subsequent ListAllCollections call may fail or behave unexpectedly. Consider logging an error or handling the failure appropriately.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant