feat: allow to manage external localrecall instances#425
Conversation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
There was a problem hiding this comment.
Pull request overview
This PR adds a pluggable collections/knowledge-base backend so the WebUI/REST API can manage collections either in-process or via an external LocalRAG/LocalRecall-compatible HTTP service, selected by configuration.
Changes:
- Introduces a
CollectionsBackendinterface and implements both in-process and HTTP-backed variants. - Updates collections REST handlers to depend on
CollectionsBackendinstead of directly managing in-memory state. - Adds
LocalRAGURLconfiguration + wiring to select the HTTP backend when set.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
webui/routes.go |
Chooses in-process vs HTTP collections backend based on LocalRAGURL and registers routes accordingly. |
webui/options.go |
Adds LocalRAGURL config field and option setter. |
webui/collections_handlers.go |
Refactors handlers to use the new CollectionsBackend abstraction. |
webui/collections_backend.go |
Defines CollectionsBackend interface and shared response DTOs. |
webui/collections_backend_inprocess.go |
New in-process backend implementing collections operations and state initialization. |
webui/collections_backend_http.go |
New HTTP backend delegating to pkg/localrag client. |
pkg/localrag/client.go |
Adds client methods for external source management (add/remove/list). |
main.go |
Passes LocalRAGURL into the web UI config. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if strings.Contains(err.Error(), "collection not found") { | ||
| return fiber.StatusNotFound | ||
| } | ||
| if strings.Contains(err.Error(), "entry not found") { |
There was a problem hiding this comment.
collectionErrStatus matches error strings using case-sensitive strings.Contains. Errors coming from the HTTP client (parseAPIError returns the server's wrap.Error.Message, which is typically capitalized like "Collection not found") may not match, causing 404 cases to be incorrectly returned as 500. Consider normalizing err.Error() to lower-case before matching, or better: have CollectionsBackend return typed/sentinel errors (e.g., ErrCollectionNotFound, ErrEntryNotFound) so handlers can map to status codes reliably.
| if strings.Contains(err.Error(), "collection not found") { | |
| return fiber.StatusNotFound | |
| } | |
| if strings.Contains(err.Error(), "entry not found") { | |
| msg := strings.ToLower(err.Error()) | |
| if strings.Contains(msg, "collection not found") { | |
| return fiber.StatusNotFound | |
| } | |
| if strings.Contains(msg, "entry not found") { |
| if !exists { | ||
| return fmt.Errorf("collection not found: %s", collection) | ||
| } | ||
| filePath := filepath.Join(b.cfg.FileAssets, filename) |
There was a problem hiding this comment.
Upload writes the user-supplied filename using filepath.Join(b.cfg.FileAssets, filename). If filename is absolute or contains ../path separators, this can write outside FileAssets (path traversal) and/or overwrite arbitrary files. Please sanitize the upload name (e.g., enforce filename == filepath.Base(filename) and reject absolute/parent paths) and consider scoping filenames by collection to avoid cross-collection collisions.
| filePath := filepath.Join(b.cfg.FileAssets, filename) | |
| // Sanitize filename to prevent path traversal and disallow path separators. | |
| sanitized := filepath.Base(filename) | |
| if sanitized != filename || sanitized == "." || sanitized == "" { | |
| return fmt.Errorf("invalid filename") | |
| } | |
| // Scope uploads to a per-collection subdirectory under FileAssets. | |
| collectionDir := filepath.Join(b.cfg.FileAssets, collection) | |
| if err := os.MkdirAll(collectionDir, 0o755); err != nil { | |
| return fmt.Errorf("failed to create collection directory: %w", err) | |
| } | |
| filePath := filepath.Join(collectionDir, sanitized) |
| tmpPath := filepath.Join(tmpDir, filename) | ||
| out, err := os.Create(tmpPath) | ||
| if err != nil { | ||
| return err |
There was a problem hiding this comment.
Upload builds tmpPath := filepath.Join(tmpDir, filename) from a user-controlled filename. If filename is absolute or contains ../path separators, Join can escape tmpDir and write to arbitrary locations before the deferred cleanup runs. Sanitize the name (e.g., filepath.Base + reject path traversal/absolute paths) before constructing tmpPath.
| func (b *collectionsBackendHTTP) EntryExists(collection, entry string) bool { | ||
| entries, err := b.client.ListEntries(collection) | ||
| if err != nil { | ||
| return false | ||
| } | ||
| for _, e := range entries { | ||
| if e == entry { | ||
| return true | ||
| } | ||
| } | ||
| return false |
There was a problem hiding this comment.
EntryExists for the HTTP backend calls ListEntries and scans the full list on every upload. For large collections this adds an extra network round-trip and O(n) work per upload. Prefer relying on Store/Upload returning a conflict error (and mapping it to 409) or add a dedicated endpoint to check existence (or cache entries) to avoid listing the whole collection each time.
| for _, s := range srcs { | ||
| var lastUpdate time.Time | ||
| if s.LastUpdate != "" { | ||
| lastUpdate, _ = time.Parse(time.RFC3339, s.LastUpdate) |
There was a problem hiding this comment.
ListSources ignores time.Parse errors (lastUpdate, _ = time.Parse(...)). If the server ever returns a malformed timestamp, this silently becomes the zero time and will serialize as 0001-01-01T00:00:00Z in responses. Consider handling the parse error (return it, or at least keep LastUpdate unset/zero only when the input is empty) so clients don't get misleading timestamps.
| lastUpdate, _ = time.Parse(time.RFC3339, s.LastUpdate) | |
| lastUpdate, err = time.Parse(time.RFC3339, s.LastUpdate) | |
| if err != nil { | |
| return nil, err | |
| } |
this PR adds an interface such as both http and in-process collections are managed seamelessly from the webui/rest api