fix(memory): escape user input in LanceDBStorage SQL filters (#5728)#5729
Open
devin-ai-integration[bot] wants to merge 1 commit intomainfrom
Open
fix(memory): escape user input in LanceDBStorage SQL filters (#5728)#5729devin-ai-integration[bot] wants to merge 1 commit intomainfrom
devin-ai-integration[bot] wants to merge 1 commit intomainfrom
Conversation
LanceDBStorage interpolated caller-supplied scope paths and record IDs directly into the WHERE clauses passed to LanceDB's where(), which accepts a raw DataFusion SQL expression and does not support parameterized queries. A malicious or unprivileged caller could escape the configured scope sandbox -- for example, calling delete(scope_prefix="/alpha' OR scope LIKE '/%") would wipe every record in the table instead of just the /alpha subtree -- and ordinary strings containing apostrophes (e.g. 'O''Brien') could crash the SQL parser. Add _escape_sql_str() and _escape_like() helpers and route every user-controlled value through them in search(), delete(), reset(), and the shared _scan_rows() reader. The LIKE clauses now also use ESCAPE '\\' so % and _ in caller-supplied prefixes are treated as literals instead of wildcards. Adds tests/memory/test_lancedb_storage_security.py covering each sink (search, delete by scope, delete by id, reset, scan-based readers) with both injection payloads and legitimate apostrophe- containing scopes/IDs.
Contributor
Author
|
Prompt hidden (unlisted session) |
Contributor
Author
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #5728.
LanceDBStorageinterpolated caller-supplied scope paths and record IDs directly into the WHERE clauses passed to LanceDB'swhere(), which accepts a raw Apache DataFusion SQL expression and does not support parameterized queries. This allowed:/alpha's subtree, because the resulting WHERE evaluated toscope LIKE '/alpha' OR scope LIKE '/%' OR scope = '/'."/O'Brien","O'Reilly-42") raisedRuntimeError: Unterminated string literalinlist_records,list_scopes,get_scope_info,list_categories,delete(record_ids=…), andreset(scope_prefix=…).The 4 reported sinks all routed through
lib/crewai/src/crewai/memory/storage/lancedb_storage.py:search(scope_prefix=…)delete(scope_prefix=…)delete(record_ids=…)reset(scope_prefix=…)(plus everything that goes through_scan_rows:list_records,list_scopes,get_scope_info,list_categories,count).Fix
Added two private helpers on
LanceDBStorage:_escape_sql_str(value)— doubles single quotes for string literals (O'Brien→O''Brien)._escape_like(value)— additionally escapes the SQLLIKEmetacharacters%,_, and\, so that a caller-supplied prefix is matched as a literal, not as a glob.Every user-controlled value in
search,delete,reset,_scan_rows,update,get_record, andtouch_recordsis now routed through the appropriate helper.LIKEclauses now useESCAPE '\\'so%/_in scope paths are treated as literals.Note:
update,get_record, andtouch_recordsalready escaped IDs inline viareplace("'", "''"). Those callsites were switched to the shared helper for consistency, but their behaviour is unchanged.Tests
Added
lib/crewai/tests/memory/test_lancedb_storage_security.pywith 12 regression tests covering every sink:_escape_sql_strand_escape_like,search,delete(scope_prefix),delete(record_ids), andreset— including thedelete(scope_prefix="/alpha' OR scope LIKE '/%")payload from the report — asserting that no unintended record is touched,%in a caller-suppliedscope_prefixis treated as a literal, not a wildcard.All existing memory tests in
lib/crewai/tests/memory/continue to pass.Review & Testing Checklist for Human
ESCAPE '\\'clause is supported by the minimum LanceDB version pinned inlib/crewai/pyproject.toml. DataFusion has supported it for a long time, but worth a quick sanity check against the lockfile.%or_in ascope_prefixactually behaving as a wildcard (this PR turns them into literals — which matches the documented "scope path" semantics, but is a behaviour change for any code that was using LIKE-style globs).lancedb_storage.pyto confirm no other f-string-built WHERE clause was missed (I searched for all of them; the placeholder-row delete on line 210 is the only remaining one and uses a hard-coded literal).Notes
Reproduction script (pre-fix output) and a per-sink demonstration are included in the linked session. The 4 sinks called out in the issue all flow through this file; no other module in
crewai(and no module increwai-tools) builds a LanceDBwhere()from caller-supplied input.Reporter (
@ibondarenko1) does not need their CVSS-9.6 PoC to verify — the deterministicdelete(scope_prefix="/alpha' OR scope LIKE '/%")test in this PR is the same class of bug.Link to Devin session: https://app.devin.ai/sessions/75235e5dd2a74ca4a347a558ef7cc052