Skip to content

fix: more filter_id issues#1730

Merged
edwinjosechittilappilly merged 6 commits into
mainfrom
pr-1666-filter-id-main
Jun 3, 2026
Merged

fix: more filter_id issues#1730
edwinjosechittilappilly merged 6 commits into
mainfrom
pr-1666-filter-id-main

Conversation

@phact
Copy link
Copy Markdown
Collaborator

@phact phact commented Jun 2, 2026

Summary by CodeRabbit

  • New Features

    • Added filter ID-based document deletion option
    • Extended search and chat endpoints with filter ID resolution
    • Improved request-level filter override handling
  • Bug Fixes

    • Enhanced delete operation success reporting accuracy
    • Clarified wildcard filter handling in filter dimensions
  • Tests

    • Strengthened integration test determinism
    • Added comprehensive unit tests for filter overrides

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

This PR extends filter resolution across v1 endpoints by centralizing a new merge_filter_overrides helper, applying it in chat and search endpoints, and implementing filter-id-based document deletion with per-file JWT forwarding and aggregated success status.

Changes

Filter-based deletion and filter-scoped retrieval

Layer / File(s) Summary
Shared filter resolution helpers and unit tests
src/api/v1/_filter_resolution.py, tests/unit/api/test_v1_filter_resolution.py
Updated wildcard-handling documentation; _strip_wildcards keeps concrete dimensions; new merge_filter_overrides function merges resolved defaults with request-body overrides using Pydantic field presence; three unit tests validate merge behaviors for absent fields, explicit defaults, and partial filter merging.
Chat endpoint integration
src/api/v1/chat.py
Chat endpoint imports merge_filter_overrides and replaces inline per-field fallback logic with single merged call when body.filter_id is provided.
Search endpoint integration
src/api/v1/search.py
Search endpoint imports merge_filter_overrides, sets authentication context before filter resolution, resolves filter_id with user JWT, merges request-body overrides, and passes resolved parameters and jwt_token to search service.
Document deletion by filename or filter id
src/api/v1/documents.py
DELETE /v1/documents accepts filename or filter_id (mutually exclusive); filter_id resolves to filenames, empty/wildcard lists rejected, per-file deletions receive user.jwt_token, overall success aggregates per-file 2xx statuses; ingest endpoint docstring clarified.
Integration test helpers and improvements
tests/integration/sdk/test_documents.py, tests/integration/sdk/test_filters.py
Helper _create_filter_for asserts created filter result.id is non-empty; integration test filter deletion in finally blocks conditional on successful creation; deletion-verification checks updated to use filter-scoped searches; chat/search test docstrings revised.
Python SDK documentation clarification
sdks/python/openrag_sdk/documents.py
SDK DocumentsClient.delete() comment clarifies that 404 "filter not found" errors for filter_id are still raised.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • langflow-ai/openrag#1666: Implements v1 filter_id support with overlapping resolve_filter_id in src/api/v1/_filter_resolution.py; this PR extends it with merge_filter_overrides and applies it across chat/search/documents endpoints.

Suggested reviewers

  • mfortman11
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'fix: more filter_id issues' is vague and generic, using 'more issues' without specifying what filter_id problems are being addressed or what the main fix entails. Replace with a more specific title describing the core fix, such as 'fix: implement filter_id resolution with override merging' or 'fix: add filter_id support to search and delete endpoints'.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch pr-1666-filter-id-main

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) tests labels Jun 2, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/integration/sdk/test_filters.py (1)

193-199: ⚡ Quick win

Make the not-found checks assert the actual contract.

These two tests currently pass on any OpenRAGError, including unrelated auth or transport failures. Please assert the 404/not-found details from the exception so they really validate the new filter_id path.

Also applies to: 249-255

tests/unit/api/test_v1_filter_resolution.py (1)

14-55: ⚡ Quick win

Add direct tests for resolve_filter_id.

This file only locks down merge_filter_overrides. The new shared path that loads a saved filter, strips wildcard dimensions, and returns 404s is still untested, so regressions there will only show up through higher-level integration failures.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/api/test_v1_filter_resolution.py` around lines 14 - 55, Add direct
unit coverage for resolve_filter_id in this test module, since it is the shared
path that loads a saved filter, removes wildcard dimensions, and handles missing
filters with 404 behavior. Create focused tests around resolve_filter_id’s
successful lookup and normalization behavior, plus the not-found case, using the
existing filter-loading and merge helpers only as needed to keep the assertions
at the resolve_filter_id level.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/api/v1/_filter_resolution.py`:
- Around line 55-59: The current logic swallows JSONDecodeError for
query_data_raw and falls back to an empty dict, allowing a malformed saved
filter to produce an unfiltered/default search; instead, detect invalid JSON and
fail closed: on json.JSONDecodeError for query_data_raw (the block handling
query_data_raw → query_data) raise/return an explicit error (e.g., propagate the
exception or return a validation/HTTP error) so the caller handling filter
resolution knows the filter is invalid rather than treating it as {}. Locate the
code that reads query_data_raw and ensure the exception is not converted into an
empty dict (adjust the except branch in the query_data parsing logic used by the
filter resolution routine).

---

Nitpick comments:
In `@tests/unit/api/test_v1_filter_resolution.py`:
- Around line 14-55: Add direct unit coverage for resolve_filter_id in this test
module, since it is the shared path that loads a saved filter, removes wildcard
dimensions, and handles missing filters with 404 behavior. Create focused tests
around resolve_filter_id’s successful lookup and normalization behavior, plus
the not-found case, using the existing filter-loading and merge helpers only as
needed to keep the assertions at the resolve_filter_id level.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 52c1ae51-99be-463a-9ddd-48d3790a3fa4

📥 Commits

Reviewing files that changed from the base of the PR and between 17e8a19 and b9a93ae.

📒 Files selected for processing (12)
  • sdks/python/openrag_sdk/documents.py
  • sdks/python/openrag_sdk/models.py
  • sdks/typescript/src/documents.ts
  • sdks/typescript/src/types.ts
  • sdks/typescript/tests/integration.test.ts
  • src/api/v1/_filter_resolution.py
  • src/api/v1/chat.py
  • src/api/v1/documents.py
  • src/api/v1/search.py
  • tests/integration/sdk/test_documents.py
  • tests/integration/sdk/test_filters.py
  • tests/unit/api/test_v1_filter_resolution.py

Comment on lines +55 to +59
if isinstance(query_data_raw, str):
try:
query_data = json.loads(query_data_raw)
except json.JSONDecodeError:
query_data = {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't fail open on malformed saved filter JSON.

If query_data is invalid JSON here, the code silently resolves it to {}, which becomes an unfiltered/default search/chat request. That turns a corrupted filter_id into broader access than intended instead of failing closed.

Proposed fix
     if isinstance(query_data_raw, str):
         try:
             query_data = json.loads(query_data_raw)
         except json.JSONDecodeError:
-            query_data = {}
+            raise HTTPException(
+                status_code=500,
+                detail={"error": f"Filter {filter_id} has invalid query_data"},
+            )
     else:
         query_data = query_data_raw or {}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if isinstance(query_data_raw, str):
try:
query_data = json.loads(query_data_raw)
except json.JSONDecodeError:
query_data = {}
if isinstance(query_data_raw, str):
try:
query_data = json.loads(query_data_raw)
except json.JSONDecodeError:
raise HTTPException(
status_code=500,
detail={"error": f"Filter {filter_id} has invalid query_data"},
)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/api/v1/_filter_resolution.py` around lines 55 - 59, The current logic
swallows JSONDecodeError for query_data_raw and falls back to an empty dict,
allowing a malformed saved filter to produce an unfiltered/default search;
instead, detect invalid JSON and fail closed: on json.JSONDecodeError for
query_data_raw (the block handling query_data_raw → query_data) raise/return an
explicit error (e.g., propagate the exception or return a validation/HTTP error)
so the caller handling filter resolution knows the filter is invalid rather than
treating it as {}. Locate the code that reads query_data_raw and ensure the
exception is not converted into an empty dict (adjust the except branch in the
query_data parsing logic used by the filter resolution routine).

@phact phact force-pushed the pr-1666-filter-id-main branch from b9a93ae to 5d8d20d Compare June 2, 2026 14:45
@phact phact changed the title Pr 1666 filter id main fix: more filter_id issues Jun 2, 2026
@github-actions github-actions Bot added the bug 🔴 Something isn't working. label Jun 2, 2026
Copy link
Copy Markdown
Collaborator

@edwinjosechittilappilly edwinjosechittilappilly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions Bot added lgtm bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Jun 2, 2026
phact and others added 5 commits June 2, 2026 20:16
Change assignments to the request body to use bracket notation (body['filename'], body['filter_id']) instead of dot notation. This ensures the string keys are set explicitly on the Record<string, string> and avoids relying on declared property names.
@github-actions github-actions Bot added bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Jun 3, 2026
@github-actions github-actions Bot added bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Jun 3, 2026
@edwinjosechittilappilly
Copy link
Copy Markdown
Collaborator

@phact I have pushed fixes to main, ideally the test should pass now

@github-actions github-actions Bot removed the bug 🔴 Something isn't working. label Jun 3, 2026
@github-actions github-actions Bot added the bug 🔴 Something isn't working. label Jun 3, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/api/v1/_filter_resolution.py (1)

57-61: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail closed on malformed query_data instead of silently broadening scope.

json.JSONDecodeError is still converted to {}, which can turn a bad saved filter into default/unfiltered behavior. This should return an explicit error.

Proposed fix
     if isinstance(query_data_raw, str):
         try:
             query_data = json.loads(query_data_raw)
         except json.JSONDecodeError:
-            query_data = {}
+            raise HTTPException(
+                status_code=500,
+                detail={"error": f"Filter {filter_id} has invalid query_data"},
+            )
     else:
         query_data = query_data_raw or {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/api/v1/_filter_resolution.py` around lines 57 - 61, The current handling
of a malformed query_data_raw silently falls back to an empty dict; change this
to fail closed by catching json.JSONDecodeError and returning an explicit client
error instead of setting query_data = {}. In the block that checks
isinstance(query_data_raw, str) replace the silent fallback with raising an HTTP
error (e.g. HTTPException with status 400 and a clear detail like "malformed
query_data JSON") so callers get a validation error; reference the variables
query_data_raw and query_data and the exception json.JSONDecodeError to locate
and update the logic accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/api/v1/_filter_resolution.py`:
- Around line 57-61: The current handling of a malformed query_data_raw silently
falls back to an empty dict; change this to fail closed by catching
json.JSONDecodeError and returning an explicit client error instead of setting
query_data = {}. In the block that checks isinstance(query_data_raw, str)
replace the silent fallback with raising an HTTP error (e.g. HTTPException with
status 400 and a clear detail like "malformed query_data JSON") so callers get a
validation error; reference the variables query_data_raw and query_data and the
exception json.JSONDecodeError to locate and update the logic accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8bb983e9-7f29-47ec-bab6-a9aa49800787

📥 Commits

Reviewing files that changed from the base of the PR and between a0d4ac4 and 5cfe3be.

📒 Files selected for processing (8)
  • sdks/python/openrag_sdk/documents.py
  • src/api/v1/_filter_resolution.py
  • src/api/v1/chat.py
  • src/api/v1/documents.py
  • src/api/v1/search.py
  • tests/integration/sdk/test_documents.py
  • tests/integration/sdk/test_filters.py
  • tests/unit/api/test_v1_filter_resolution.py
🚧 Files skipped from review as they are similar to previous changes (7)
  • tests/integration/sdk/test_documents.py
  • tests/integration/sdk/test_filters.py
  • sdks/python/openrag_sdk/documents.py
  • tests/unit/api/test_v1_filter_resolution.py
  • src/api/v1/documents.py
  • src/api/v1/chat.py
  • src/api/v1/search.py

@edwinjosechittilappilly edwinjosechittilappilly merged commit db9ae7f into main Jun 3, 2026
22 of 24 checks passed
@github-actions github-actions Bot deleted the pr-1666-filter-id-main branch June 3, 2026 05:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) bug 🔴 Something isn't working. lgtm tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants