feat(webui): Make max search results configurable via settings.json#2246
feat(webui): Make max search results configurable via settings.json#2246goynam wants to merge 11 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThe pull request makes search result limits configurable across Web UI client and server by replacing hardcoded 1000 values with a configurable ChangesConfigurable search results limit
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/deployment/package-helm/templates/webui-deployment.yaml`:
- Around line 58-59: The YAML linter flags the Helm template expression on the
SEARCH_MAX_NUM_RESULTS line for the "braces" rule; suppress it by adding an
inline yamllint disable comment for the braces rule on that line in
tools/deployment/package-helm/templates/webui-deployment.yaml (e.g., add a "#
yamllint disable-line braces" style comment right after or on the same line as
the SEARCH_MAX_NUM_RESULTS mapping) so the diff-scoped linter won't fail while
preserving the existing quoted Helm template expression.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 06efef7b-5726-44af-a977-25b87ffe04b3
📥 Commits
Reviewing files that changed from the base of the PR and between 351dd0e and b42745f7a1836faed6a9e5f026a5e08ad07e4813.
📒 Files selected for processing (5)
components/webui/server/src/plugins/external/env.tscomponents/webui/server/src/routes/api/search/index.tscomponents/webui/server/src/routes/api/search/typings.tstools/deployment/package-helm/templates/webui-deployment.yamltools/deployment/package-helm/values.yaml
c52cca1 to
e7f85dc
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@components/webui/server/src/routes/api/search/index.ts`:
- Line 19: The import statement bringing in SEARCH_MAX_NUM_RESULTS and
setSearchMaxNumResults from "./typings.js" should be split across multiple lines
to satisfy import-newlines/enforce and `@stylistic/object-curly-newline`; update
the import for the module "./typings.js" so the named specifiers
(SEARCH_MAX_NUM_RESULTS and setSearchMaxNumResults) each appear on their own
line (or otherwise broken across lines) inside the braces while preserving the
same identifiers and relative order.
In `@components/webui/server/src/routes/api/search/typings.ts`:
- Around line 16-24: Replace the magic numeric literal by introducing a named
constant and use it to initialize the mutable variable: add a const like
DEFAULT_SEARCH_MAX_NUM_RESULTS (value 1000) and change the mutable
SEARCH_MAX_NUM_RESULTS initialization to use that constant; keep the setter
setSearchMaxNumResults as-is so the runtime mutability remains but the literal
is no longer inline.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d3762c92-35f9-49a6-987f-c940371f7f4f
📥 Commits
Reviewing files that changed from the base of the PR and between c52cca1bbb945973a560f8511c0548485bece8dc and e7f85dc64a24fc73c2d4d6ed0f053b2e3492f4be.
📒 Files selected for processing (6)
components/webui/server/src/plugins/external/env.tscomponents/webui/server/src/routes/api/search/index.tscomponents/webui/server/src/routes/api/search/typings.tstools/deployment/package-helm/Chart.yamltools/deployment/package-helm/templates/webui-deployment.yamltools/deployment/package-helm/values.yaml
|
Thanks for working on this! To give some context: the max result count isn't really a proper mechanism for scalability - we'll likely drop it entirely in the future - but in the meantime, making it configurable is the right step, and it's great to see someone take it on. I've been thinking through the same problem and wanted to share some design considerations for the approach. The core difference is around how the configuration flows through the system and how complete the coverage is. 1. Configuration mechanism:
|
| Step | File | Change |
|---|---|---|
| Config definition | clp_config.py |
Add max_search_results: PositiveInt = 1000 to WebUi |
| Templates | client/public/settings.json, server/settings.json |
Add "MaxSearchResults": 1000 |
| Propagation | controller.py |
Add to both client & server settings_json_updates |
| Client types | client/src/settings.ts |
Add MaxSearchResults: number to Settings |
| Client config | client/src/config/index.ts |
Export SETTINGS_MAX_SEARCH_RESULTS |
| Client usage (3 sites) | typings.ts, useSearchResults.ts, usePrestoSearchResults.ts |
Replace hardcoded constant with SETTINGS_MAX_SEARCH_RESULTS |
| Presto guided | presto-guided-search-requests.ts |
Replace DEFAULT_SEARCH_LIMIT with SETTINGS_MAX_SEARCH_RESULTS |
| Server usage (3 sites) | search/typings.ts, search/index.ts, search/utils.ts, presto-search/typings.ts, presto-search/index.ts |
Use settings.MaxSearchResults instead of hardcoded/mutable constants |
| Package template | clp-config.template.json.yaml |
Add commented max_search_results: 1000 |
| Helm | values.yaml, configmap.yaml |
Add max_search_results to webui section and both settings JSON blocks |
This covers all search paths end-to-end, uses the existing config flow, and avoids introducing mutable global state.
Happy to help if you'd like to iterate on any of this!
Make the maximum number of search results configurable end-to-end using the existing settings.json pattern (following the MaxDatasetsPerQuery precedent). The setting flows through: - clp-config.yaml (webui.max_search_results) with default of 1000 - controller.py propagates to both client and server settings.json - Client reads via settings.ts -> config/index.ts -> all search paths - Server reads directly from settings.json in both search and presto-search routes All hardcoded 1000 constants across server (search + presto-search) and client (native search + presto guided) now read from configuration. Backward compatible: default remains 1000. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
e7f85dc to
6a85e83
Compare
Add a "Limit" input in the search controls area allowing users to override the max results per query. The value is sent to the server which clamps it to the system-wide MaxSearchResults ceiling. - Add maxNumResults (optional) to QueryJobCreationSchema - Add MaxResultsInput component with Ant Design InputNumber - Add maxNumResults to SearchState (Zustand store) - Server uses client-provided value clamped to settings.MaxSearchResults - Default is the system max (1000) for no behavior change Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace deprecated `addonBefore` with `Space.Compact` + Typography - Ensure JSX props are sorted alphabetically - Bump Helm chart version from 0.3.2-dev.0 to 0.3.2-dev.1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| /** | ||
| * The maximum number of results to retrieve for a search. | ||
| */ | ||
| const SEARCH_MAX_NUM_RESULTS = 1000; | ||
| const SEARCH_MAX_NUM_RESULTS = SETTINGS_MAX_SEARCH_RESULTS; |
There was a problem hiding this comment.
this is super confusing and defeats the purpose of defaulting to a lower limit on the client side, and is potentially incorrect.
as a result of this modification, now:
- when we submit a query, we use the MaxResultsInput-controlled Zustand state store SearchStore 's variable
maxNumResults(A), which is correct. - as the search performs, different workers dump maximum
maxNumResults(A) into the result cache. assume the query needs to scan results from 10 archives -> 10 workers can dump up to (A)*10 results into the result cache. - we always subscribe to the result cache with
limit: SEARCH_MAX_NUM_RESULTS(B), pulling thisSEARCH_MAX_NUM_RESULTS(B) number of results from the server and display in the client
now suppose A (client-session-only, requested) is 1,000 and B (server-enforced-maximum) is 10,000 , the effective max number of results to be displayed in the client is 10,000. which doesn't correctly match the client requested number.
i think we can define a default value here to be used in SEARCH_STATE_DEFAULT.
| /** | |
| * The maximum number of results to retrieve for a search. | |
| */ | |
| const SEARCH_MAX_NUM_RESULTS = 1000; | |
| const SEARCH_MAX_NUM_RESULTS = SETTINGS_MAX_SEARCH_RESULTS; | |
| /** | |
| * The default maximum number of results to retrieve for a search. | |
| */ | |
| const DEFAULT_SEARCH_MAX_NUM_RESULTS = Math.min(1000, SETTINGS_MAX_SEARCH_RESULTS); |
There was a problem hiding this comment.
Good point. The client cursor limit (SEARCH_MAX_NUM_RESULTS) and the per-query submit limit (maxNumResults in the store) serve different roles: the cursor limit caps what MongoDB returns to the UI, while the per-query limit caps what the server tells workers to produce. Both are set to
SETTINGS_MAX_SEARCH_RESULTS so the cursor can display all results the server produces. If we want a lower default for the per-query input, we can change the store default independently — happy to adjust if you have a preferred value.
There was a problem hiding this comment.
sorry i think my previous message has confused you -
i mean to say this shall be renamed to DEFAULT_SEARCH_MAX_NUM_RESULTS to correctly reflect the purpose, as it's only used as an init value in SEARCH_STATE_DEFAULT.maxNumResults
then we also want to clamp this default value to a safer number at 1000, so that most queries won't blow up the server even when the ClpConfig.webui.max_search_results has been (incorrectly) configured with an aggressive value.
for example, if ClpConfig.webui.max_search_results is configured as 10,000 in the cluster, we might not want every query to return 10,000 results. The above change proposes a default of 1,000; if an end user ever needs more than 1,000 results, they can always change the MaxResultsInput to a higher number in the browser for that specific query session, without affecting most other queries.
overall:
- we propose a default of max 1000 results for all queries
- the intention to make the max number results configurable is to give end users freedom to temporarily limit the number to a higher value, for specific investigations
- ClpConfig.webui.max_search_results serves as a safety net to disallow crazy values from an end user
| "ArchiveOutputTargetSegmentSize": self._clp_config.archive_output.target_segment_size, | ||
| "ClpQueryEngine": self._clp_config.package.query_engine, | ||
| "ClpStorageEngine": self._clp_config.package.storage_engine, | ||
| "MaxSearchResults": self._clp_config.webui.max_search_results, |
There was a problem hiding this comment.
this definition should follow the order in components/webui/server/settings.json , after the order is corrected there
| */ | ||
| const SEARCH_STATE_DEFAULT = Object.freeze({ | ||
| aggregationJobId: null, | ||
| maxNumResults: SETTINGS_MAX_SEARCH_RESULTS, |
There was a problem hiding this comment.
see the comment in https://github.com/y-scope/clp/pull/2246/changes#r3214638646
There was a problem hiding this comment.
| maxNumResults: SETTINGS_MAX_SEARCH_RESULTS, | |
| maxNumResults: DEFAULT_SEARCH_MAX_NUM_RESULTS, |
| const QueryJobCreationSchema = Type.Object({ | ||
| datasets: Type.Array(Type.String()), | ||
| ignoreCase: Type.Boolean(), | ||
| maxNumResults: Type.Optional(Type.Integer({minimum: 1})), |
There was a problem hiding this comment.
since this API is not public, i think we should not make it optional.
| queryString, | ||
| } = request.body; | ||
|
|
||
| const effectiveMaxResults = "undefined" === typeof maxNumResults ? |
There was a problem hiding this comment.
following the comment at https://github.com/y-scope/clp/pull/2246/changes#r3214643549 , we don't have to do this undefined check here if we don't make maxNumResults Optional
- Make maxNumResults required (not optional) in QueryJobCreationSchema - Rename effectiveMaxResults to clampedMaxNumResults in server handler - Remove undefined check (no longer needed since field is required) - Move MaxSearchResults to correct position in server/settings.json (before MongoDb group, alphabetically sorted) - Fix controller.py MaxSearchResults ordering to match settings.json - Extract inline styles into index.module.css for MaxResultsInput - Use useCallback with useSearchStore.getState() pattern in handler Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Presto guided search path was using the system-wide SETTINGS_MAX_SEARCH_RESULTS constant, ignoring the user's per-query limit from the MaxResultsInput. Now reads from the store to respect the user's chosen limit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
@hoophalab please help confirm the interactions and visuals match your expectations
There was a problem hiding this comment.
following the suggestion at https://github.com/y-scope/clp/pull/2246/changes#r3214638646 , i believe we shall update this file too.
@hoophalab please help confirm the exact changes needed
There was a problem hiding this comment.
and whether the presto flow is correctly updated
| */ | ||
| const SEARCH_STATE_DEFAULT = Object.freeze({ | ||
| aggregationJobId: null, | ||
| maxNumResults: SETTINGS_MAX_SEARCH_RESULTS, |
There was a problem hiding this comment.
sorry for missing this in the previous round of review - there is a naming mismatch "max num results" vs "max search results"
i believe running this should be sufficient to globally rename all variables / constants:
git grep -Ilz -e 'MaxSearchResults' -e 'max_search_results' -e 'SETTINGS_MAX_SEARCH_RESULTS' -e 'SEARCH_MAX_NUM_RESULTS' -e 'MAX_PRESTO_SEARCH_RESULTS' -e 'maxNumResults' -e 'updateMaxNumResults' -e 'clampedMaxNumResults' -- . \
| xargs -0 -r perl -0pi -e 's/SETTINGS_MAX_SEARCH_RESULTS/SETTINGS_MAX_NUM_SEARCH_RESULTS/g; s/SEARCH_MAX_NUM_RESULTS/SEARCH_MAX_NUM_SEARCH_RESULTS/g; s/MAX_PRESTO_SEARCH_RESULTS/MAX_NUM_PRESTO_SEARCH_RESULTS/g; s/MaxSearchResults/MaxNumSearchResults/g; s/max_search_results/max_num_search_results/g; s/updateMaxNumResults/updateMaxNumSearchResults/g; s/clampedMaxNumResults/clampedMaxNumSearchResults/g; s/maxNumResults/maxNumSearchResults/g'Per junhaoliao's feedback: - Rename SEARCH_MAX_NUM_RESULTS to DEFAULT_SEARCH_MAX_NUM_RESULTS to reflect its purpose as a default/init value - Clamp to Math.min(1000, SETTINGS_MAX_SEARCH_RESULTS) so that even if the system-wide max is configured aggressively (e.g., 100000), the default per-query limit stays safe at 1000 - Update SearchState to use the clamped default - Update cursor limit consumers (useSearchResults, usePrestoSearchResults) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ary ?? - Split antd import across multiple lines (import-newlines/enforce) - Extract magic number 1000 to SAFE_DEFAULT_LIMIT constant - Remove unnecessary ?? operator (maxNumResults is always a number) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- MaxResultsInput: CSS module values can be undefined, use ?? "" fallback - NativeResultsTimeline: Include required maxNumResults in query payload (QueryJobCreation now requires it since we made it non-optional) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Makes the maximum number of search results configurable end-to-end using the existing
settings.jsonpattern (following theMaxDatasetsPerQueryprecedent), addressing all feedback from the initial review.Changes
Configuration Flow
clp-config.yaml→controller.py→settings.json(client + server) → application codeclp_config.pymax_search_results: PositiveInt = 1000toWebUiclp-config.template.json.yamlmax_search_results: 1000controller.pyMaxSearchResultsto both client & serversettings_json_updatesclient/public/settings.json"MaxSearchResults": 1000client/src/settings.tsMaxSearchResults: numbertoSettingstypeclient/src/config/index.tsSETTINGS_MAX_SEARCH_RESULTSSearchResultsTable/typings.tsSETTINGS_MAX_SEARCH_RESULTSinstead of hardcoded 1000presto-guided-search-requests.tsSETTINGS_MAX_SEARCH_RESULTSinstead of hardcoded 1000server/settings.json"MaxSearchResults": 1000search/typings.tssettings.MaxSearchResultsinstead of hardcoded 1000presto-search/typings.tssettings.MaxSearchResultsinstead of hardcoded 1000values.yamlmax_search_results: 1000towebuisectionconfigmap.yamlMaxSearchResultsto both client and server settings JSON blocksDesign Decisions
settings.jsonover env vars: Environment variables are reserved for credentials; general config usessettings.json(per existing pattern)Checklist
Validation performed
1000constants across search paths are replaced with the configurable valuemax_search_resultsuse the default of 1000MaxDatasetsPerQueryis followed throughoutSummary by CodeRabbit
New Features
Chores