
docs: add aggregation docs #90

Merged
moflotas merged 8 commits into main from 89-aggregation-docs
Sep 22, 2025

Conversation

@moflotas
Contributor

@moflotas moflotas commented Aug 20, 2025

Fixes #89

Summary by CodeRabbit

  • Documentation
    • Added comprehensive Aggregations guide (functions, histograms, timeseries) in English and Russian with usage scenarios and SQL-style translations.
    • Clarified indexed-field requirements and enumerated supported functions (SUM, AVG, MIN, MAX, QUANTILE, UNIQUE, COUNT) and timeseries constraints.
    • Updated Public API examples (EN/RU) to use inline grpcurl -d payloads for aggregation requests.
    • Cross-referenced public API sections; no API behavior changes.

@coderabbitai

coderabbitai bot commented Aug 20, 2025

📝 Walkthrough

Walkthrough

Documentation updates: inline grpcurl payloads replace piped file usage in English and Russian public API pages. New aggregation documentation pages are added in both languages describing function aggregations, histograms, and timeseries with examples and public-API references. No code or API behavior changes.

Changes

| Cohort / File(s) | Change summary |
| --- | --- |
| Public API grpcurl example format: `docs/en/10-public-api.md`, `docs/ru/10-public-api.md` | Replaced piped JSON payload examples (`-d @`) with inline payloads (`-d '...'`) in GetAggregation grpcurl examples. JSON content unchanged; only invocation formatting updated. |
| New aggregation docs (EN/RU): `docs/en/14-aggregations.md`, `docs/ru/14-aggregations.md` | Added comprehensive aggregation guides covering functional aggregations, histograms, and timeseries; lists supported functions, input shapes, SQL-like translations, and grpcurl examples; references public API examples. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10–15 minutes

Possibly related PRs

Suggested reviewers

  • dkharms
  • forshev
  • ssnd

Pre-merge checks (5 passed)

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "docs: add aggregation docs" accurately and concisely captures the primary change in the PR (adding aggregation documentation and updating related public-API examples); it is specific, relevant, and suitable for repository history. |
| Linked Issues Check | ✅ Passed | Linked issue #89 requests adding aggregation documentation; this PR adds docs/en/14-aggregations.md and docs/ru/14-aggregations.md and updates the public API examples in docs/en/10-public-api.md and docs/ru/10-public-api.md, which directly implements the issue's documentation objectives without requiring code changes. |
| Out of Scope Changes Check | ✅ Passed | All changes are documentation-only (new aggregation docs and formatting tweaks to grpcurl examples) and there are no modifications to code, API signatures, or unrelated files; the only other artifact is an automated Codecov comment noting a minor coverage delta. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8fefdd and b41bbf8.

📒 Files selected for processing (2)
  • docs/en/14-aggregations.md (1 hunks)
  • docs/ru/14-aggregations.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/en/14-aggregations.md
  • docs/ru/14-aggregations.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
docs/en/13-aggregations.md (2)

16-16: Fix phrasing: “timeseries” → “time series”.

Use the conventional term.

Apply this diff:

-- `AGG_FUNC_UNIQUE` — computation of unique field values (not supported in timeseries)
+- `AGG_FUNC_UNIQUE` — computation of unique field values (not supported in time series)

17-17: Improve grammar in COUNT description.

Noun phrase to align with the other bullets.

-- `AGG_FUNC_COUNT` — count number of documents per group
+- `AGG_FUNC_COUNT` — count of documents per group
docs/ru/13-aggregations.md (1)

16-16: Grammar: “в временных” → “во временных”.

Correct preposition before “временных”.

-- `AGG_FUNC_UNIQUE` — вычисление уникальных значений поля (не поддерживается в временных рядах)
+- `AGG_FUNC_UNIQUE` — вычисление уникальных значений поля (не поддерживается во временных рядах)
📜 Review details

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 0b406d7 and a5f38d0.

📒 Files selected for processing (2)
  • docs/en/13-aggregations.md (1 hunks)
  • docs/ru/13-aggregations.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/ru/13-aggregations.md

[uncategorized] ~16-~16: Correct form of the preposition: «во временных».
Context: ...альных значений поля (не поддерживается в временных рядах) - AGG_FUNC_COUNT — подсчет кол...

(V_K_IS)

🔇 Additional comments (3)
docs/en/13-aggregations.md (1)

5-18: Clear and useful addition.

Good, concise overview and alignment with the API section.

docs/ru/13-aggregations.md (2)

19-19: Check that the anchor exists in the public API.

Make sure 10-public-api.md contains the heading «Примеры агрегаций» so that the link works.

Use the script shared in the English file comment to validate both locales.


7-18: Good, concise introduction.

The list of functions and the caveat about time series are spot on.

@codecov-commenter

codecov-commenter commented Aug 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.28%. Comparing base (b46ee46) to head (2997217).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #90      +/-   ##
==========================================
- Coverage   71.49%   71.28%   -0.22%     
==========================================
  Files         200      200              
  Lines       18143    18143              
==========================================
- Hits        12972    12933      -39     
- Misses       4453     4493      +40     
+ Partials      718      717       -1     


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (1)
docs/en/13-aggregations.md (1)

26-26: Verify anchor “#aggregation-examples” exists in public API docs

This link depends on a heading that was missing in earlier commits. Please confirm it now exists in both English and Russian public API docs; add the heading if absent.

#!/bin/bash
set -euo pipefail

# Check English anchor
en=$(fd -a '^10-public-api\.md$' docs/en || true)
if [ -n "$en" ]; then
  echo "Checking English: $en"
  rg -nP '^\s{0,3}#{1,6}\s+Aggregation examples\b' "$en" || { echo "Missing 'Aggregation examples' heading in $en"; exit 1; }
else
  echo "English 10-public-api.md not found under docs/en"; exit 1
fi

# Check Russian anchor (parallel file referenced in RU docs)
ru=$(fd -a '^10-public-api\.md$' docs/ru || true)
if [ -n "$ru" ]; then
  echo "Checking Russian: $ru"
  rg -nP '^\s{0,3}#{1,6}\s+Примеры агрегаций\b' "$ru" || { echo "Missing 'Примеры агрегаций' heading in $ru"; exit 1; }
else
  echo "Russian 10-public-api.md not found under docs/ru"; exit 1
fi

# Also verify histogram anchor used below exists
rg -nP '^\s{0,3}#{1,6}\s+GetHistogram\b' "$en"
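
The check above depends on how a rendered heading maps to a link fragment such as `#aggregation-examples`. A minimal Python sketch of that mapping follows, assuming a simplified GitHub-style rule (lowercase, drop punctuation, turn spaces into hyphens). `heading_to_anchor` is a hypothetical helper, not part of the repository, and the actual renderer may handle edge cases (duplicate headings, emoji) differently.

```python
import re


def heading_to_anchor(heading: str) -> str:
    """Approximate the anchor a Markdown renderer derives from a heading.

    Simplified assumption: lowercase the text, drop everything except
    word characters, spaces, and hyphens, then turn spaces into hyphens.
    """
    text = heading.strip().lower()
    # \w is Unicode-aware in Python 3, so Cyrillic letters are kept.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


print(heading_to_anchor("Aggregation examples"))
print(heading_to_anchor("GetHistogram"))
```

Under this assumption, a link to `10-public-api.md#aggregation-examples` resolves only if a heading with the text "Aggregation examples" exists in that file, which is exactly what the `rg` patterns look for.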
🧹 Nitpick comments (5)
docs/en/13-aggregations.md (5)

7-10: Fix grammar and tighten intro wording

  • “seq-db support” → “seq-db supports”
  • Prefer “an inverted index” over “the inverted-index”
  • Remove “very” per style guidance; the sentence reads better without it.
-seq-db support various types of aggregations: functional aggregations, histograms and timeseries. Each of the types
-relies on the usage of the inverted-index, therefore to calculate aggregations for the fields, the field must be
-indexed. However, because of that, seq-db can very quickly retrieve and aggregate data.
+seq-db supports various types of aggregations: function aggregations, histograms, and time series. Each type
+relies on an inverted index; therefore, to compute aggregations on a field, that field must be
+indexed. Thanks to this, seq-db can quickly retrieve and aggregate data.

23-23: Clarify time-series limitation phrasing

Use “time series” and clarify scope.

-- `AGG_FUNC_UNIQUE` — computation of unique field values (not supported in timeseries)
+- `AGG_FUNC_UNIQUE` — unique values of a field (not supported for time‑series aggregations)

32-39: Polish parameter bullet list and terminology

  • Use consistent punctuation (colons).
  • Capitalize “SQL”.
  • Minor wording tweaks.
-- `AGG_FUNC` which is one of `AGG_FUNC_SUM`, `AGG_FUNC_AVG`, `AGG_FUNC_MIN`, `AGG_FUNC_MAX`, `AGG_FUNC_QUANTILE`,
-- `aggregate_by_field` - the field on which aggregation will be applied
-- `group_by_field` - the field by which values will be grouped
-- `filtering_query`- query to filter only relevant logs for the aggregation
-- `quantile` - only for the `AGG_FUNC_QUANTILE`
+- `AGG_FUNC`: one of `AGG_FUNC_SUM`, `AGG_FUNC_AVG`, `AGG_FUNC_MIN`, `AGG_FUNC_MAX`, `AGG_FUNC_QUANTILE`
+- `aggregate_by_field`: the field to aggregate
+- `group_by_field`: the field to group by
+- `filtering_query`: a query that filters logs included in the aggregation
+- `quantile`: required only for `AGG_FUNC_QUANTILE`

60-63: Tighten wording for Count/Unique explanation

Small grammar/clarity pass.

-Count and unique aggregations are very similar to the above examples, except for those aggregation there is no need to
-have an
-additional `group_by_field`, since we are already grouping by `aggregate_by_field`.
+Count and Unique are similar to the above, except there is no need to specify an
+additional `group_by_field`, because we already group by `aggregate_by_field`.

85-89: Improve histogram description and verify anchor

  • Grammar/style tweaks.
  • Also ensure the GetHistogram anchor exists in 10-public-api.md (script in earlier comment checks this).
-Histograms allow users to visually understand amount of logs in each sub-interval. E.g. visualize number of logs
-particular service for the given interval of time
-
-For the API of the functions, please refer to [public API](10-public-api.md#gethistogram)
+Histograms help visualize the number of logs in each sub‑interval (e.g., for a particular service over a time range).
+
+For the API, see the [public API](10-public-api.md#gethistogram).
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between a5f38d0 and fb3a0b4.

📒 Files selected for processing (1)
  • docs/en/13-aggregations.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/en/13-aggregations.md

[style] ~9-~9: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...d. However, because of that, seq-db can very quickly retrieve and aggregate data. ## Functi...

(EN_WEAK_ADJECTIVE)

@moflotas moflotas force-pushed the 89-aggregation-docs branch from fb3a0b4 to 0a82d3a Compare August 27, 2025 11:11

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (5)
docs/ru/13-aggregations.md (1)

5-5: Fix typo in the Russian title (“Аггрегации” → “Агрегации”).

Single “г”.

-# Аггрегации
+# Агрегации
docs/en/13-aggregations.md (4)

42-47: Fix SQL: remove trailing comma; WHERE before GROUP BY.

-SELECT <group_by_field>, AGG_FUNC(<aggregate_by_field>),
-FROM db
-GROUP BY <group_by_field>
-WHERE <filtering_query>
+SELECT <group_by_field>, AGG_FUNC(<aggregate_by_field>)
+FROM db
+WHERE <filtering_query>
+GROUP BY <group_by_field>

52-56: Fix example SQL order (WHERE before GROUP BY) and comment.

 SELECT service, AVG(response_time)
 FROM db
-GROUP BY service WHERE response_time:* -- meaning that `response_time` field exists in logs
+WHERE response_time:*  -- field exists
+GROUP BY service

66-71: Fix SQL: COUNT(*) spacing; WHERE before GROUP BY.

-SELECT <aggregate_by_field>, COUNT (*)
-FROM db
-GROUP BY <aggregate_by_field>
-WHERE <filtering_query>
+SELECT <aggregate_by_field>, COUNT(*)
+FROM db
+WHERE <filtering_query>
+GROUP BY <aggregate_by_field>

77-81: Fix example SQL order (WHERE before GROUP BY).

 SELECT level, COUNT(*)
 FROM db
-GROUP BY level WHERE service:seq-db
+WHERE service:seq-db
+GROUP BY level
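
The clause-order fixes above can be checked against a real SQL engine. The sketch below uses Python's built-in `sqlite3` module with a small in-memory table; the table contents are made up for illustration, and the seq-db filter `service:seq-db` is replaced by the plain SQL predicate `service = 'seq-db'`, since the documented queries are SQL-style translations rather than executable SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (service TEXT, level TEXT)")
conn.executemany(
    "INSERT INTO db VALUES (?, ?)",
    [
        ("seq-db", "info"),
        ("seq-db", "info"),
        ("seq-db", "error"),
        ("other", "info"),
    ],
)


def run(sql):
    """Return sorted rows, or an error string if the statement is rejected."""
    try:
        return sorted(conn.execute(sql).fetchall())
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


# GROUP BY placed before WHERE: rejected by the parser.
wrong = run("SELECT level, COUNT(*) FROM db GROUP BY level WHERE service = 'seq-db'")
# WHERE before GROUP BY: filters rows first, then groups them.
right = run("SELECT level, COUNT(*) FROM db WHERE service = 'seq-db' GROUP BY level")

print(wrong)
print(right)
```

The first statement fails with a syntax error, while the second returns one row per log level for the selected service, which is why the documentation examples should list `WHERE` before `GROUP BY`.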
🧹 Nitpick comments (5)
docs/ru/13-aggregations.md (1)

16-16: Grammar: use “во временных рядах”.

-- `AGG_FUNC_UNIQUE` — вычисление уникальных значений поля (не поддерживается в временных рядах)
+- `AGG_FUNC_UNIQUE` — вычисление уникальных значений поля (не поддерживается во временных рядах)
docs/en/13-aggregations.md (4)

7-10: Tighten intro wording and fix subject-verb agreement.

-seq-db support various types of aggregations: functional aggregations, histograms and timeseries. Each of the types
-relies on the usage of the inverted-index, therefore to calculate aggregations for the fields, the field must be
-indexed. However, because of that, seq-db can very quickly retrieve and aggregate data.
+seq-db supports various types of aggregations: functional aggregations, histograms, and timeseries.
+Each type relies on the inverted index; to calculate aggregations, the field must be indexed.
+This enables fast retrieval and aggregation.

37-37: Punctuation: add space/em dash after code term.

-- `filtering_query`- query to filter only relevant logs for the aggregation
+- `filtering_query` — query to filter only relevant logs for the aggregation

60-63: Grammar: “these aggregations”; tighten phrasing.

-Count and unique aggregations are very similar to the above examples, except for those aggregation there is no need to
-have an
-additional `group_by_field`, since we are already grouping by `aggregate_by_field`.
+Count and unique aggregations are similar to the above, except for these aggregations there is no need
+for an additional `group_by_field`, since we already group by `aggregate_by_field`.

85-86: Clarify histogram phrasing.

-Histograms allow users to visually understand amount of logs in each sub-interval. E.g. visualize number of logs
-particular service for the given interval of time
+Histograms help visualize the number of logs in each sub-interval.
+For example, visualize the number of logs for a particular service over a given time interval.
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between fb3a0b4 and 0a82d3a.

📒 Files selected for processing (2)
  • docs/en/13-aggregations.md (1 hunks)
  • docs/ru/13-aggregations.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/en/13-aggregations.md

[grammar] ~7-~7: There might be a mistake here.
Context: ...ograms and timeseries. Each of the types relies on the usage of the inverted-inde...

(QB_NEW_EN)


[grammar] ~8-~8: There might be a mistake here.
Context: ...ations for the fields, the field must be indexed. However, because of that, seq-d...

(QB_NEW_EN)


[style] ~9-~9: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...d. However, because of that, seq-db can very quickly retrieve and aggregate data. ## Functi...

(EN_WEAK_ADJECTIVE)


[grammar] ~13-~13: There might be a mistake here.
Context: ...s that match the query. E.g. calculating number of logs written by each service i...

(QB_NEW_EN)


[grammar] ~37-~37: There might be a mistake here.
Context: ...lues will be grouped - filtering_query- query to filter only relevant logs for ...

(QB_NEW_EN)


[grammar] ~49-~49: There might be a mistake here.
Context: ...ring real-world example, we may want to calculate average response time for services havi...

(QB_NEW_EN)


[grammar] ~49-~49: There might be a mistake here.
Context: ...time for services having response_time field, then we will write the following ...

(QB_NEW_EN)


[grammar] ~60-~60: There might be a mistake here.
Context: ...or those aggregation there is no need to have an additional group_by_field, sin...

(QB_NEW_EN)


[grammar] ~61-~61: There might be a mistake here.
Context: ... aggregation there is no need to have an additional group_by_field, since we ar...

(QB_NEW_EN)


[grammar] ~73-~73: There might be a mistake here.
Context: ...ring real-world example, we may want to calculate number of logs for each logging level (...

(QB_NEW_EN)


[grammar] ~74-~74: There might be a mistake here.
Context: ...ogging level (debug, info, etc.) for the particular service, e.g. seq-db, t...

(QB_NEW_EN)


[grammar] ~85-~85: There might be a mistake here.
Context: ...ams Histograms allow users to visually understand amount of logs in each sub-interval. E....

(QB_NEW_EN)


[grammar] ~85-~85: There might be a mistake here.
Context: ...-interval. E.g. visualize number of logs particular service for the given interva...

(QB_NEW_EN)

docs/ru/13-aggregations.md

[uncategorized] ~16-~16: Correct form of the preposition: «во временных».
Context: ...альных значений поля (не поддерживается в временных рядах) - AGG_FUNC_COUNT — подсчет кол...

(V_K_IS)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test
🔇 Additional comments (1)
docs/ru/13-aggregations.md (1)

19-19: Anchor verification passed

The heading “Примеры агрегаций” is present in docs/ru/10-public-api.md at line 202. No further action needed.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/en/13-aggregations.md (1)

42-47: SQL clause order and trailing comma — fixed

The example now uses SELECT → FROM → WHERE → GROUP BY with no trailing comma. Looks good.

🧹 Nitpick comments (8)
docs/en/13-aggregations.md (8)

7-10: Tighten intro: agreement, hyphenation, and concision

Fix subject–verb agreement, drop the hyphen in “inverted index,” and trim adverbs.

-seq-db support various types of aggregations: functional aggregations, histograms and timeseries. Each of the types
-relies on the usage of the inverted-index, therefore to calculate aggregations for the fields, the field must be
-indexed. However, because of that, seq-db can very quickly retrieve and aggregate data.
+seq-db supports multiple types of aggregations: functional aggregations, histograms, and timeseries. These
+rely on an inverted index. To compute aggregations for a field, the field must be
+indexed. As a result, seq-db can quickly retrieve and aggregate data.

11-16: Align heading with terminology and polish wording

Use “Functional aggregations” to match the earlier phrase; tighten example sentence.

-## Function aggregations
+## Functional aggregations
@@
-Aggregations allow the computation of statistical values over document fields that match the query. E.g. calculating
-number of logs written by each service in the given interval, or all unique values of the field.
+Aggregations compute statistical values over document fields that match a query—for example, the number of logs
+written by each service in a given interval, or all unique values of a field.

34-39: Bullet formatting consistency and minor fixes

Use “: ” after parameter names, fix spacing around hyphen, and slightly clarify quantile note.

-- `AGG_FUNC` which is one of `AGG_FUNC_SUM`, `AGG_FUNC_AVG`, `AGG_FUNC_MIN`, `AGG_FUNC_MAX`, `AGG_FUNC_QUANTILE`,
-- `aggregate_by_field` - the field on which aggregation will be applied
-- `group_by_field` - the field by which values will be grouped
-- `filtering_query`- query to filter only relevant logs for the aggregation
-- `quantile` - only for the `AGG_FUNC_QUANTILE`
+- `AGG_FUNC`: one of `AGG_FUNC_SUM`, `AGG_FUNC_AVG`, `AGG_FUNC_MIN`, `AGG_FUNC_MAX`, `AGG_FUNC_QUANTILE`
+- `aggregate_by_field`: the field to aggregate
+- `group_by_field`: the field to group by
+- `filtering_query`: query to filter only relevant logs for the aggregation
+- `quantile`: only when `AGG_FUNC_QUANTILE` is selected
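
To make the relationship between these parameters and the SQL-style translation concrete, here is a small Python sketch. `sql_like` is a hypothetical helper written for this illustration only; it follows the `SELECT ... FROM db WHERE ... GROUP BY ...` template quoted in the review and does not cover the `quantile` parameter.

```python
PREFIX = "AGG_FUNC_"


def sql_like(agg_func, aggregate_by_field, group_by_field, filtering_query):
    """Render aggregation parameters as the SQL-style translation from the docs."""
    if not agg_func.startswith(PREFIX):
        raise ValueError(f"expected a name starting with {PREFIX!r}: {agg_func!r}")
    sql_name = agg_func[len(PREFIX):]  # e.g. AGG_FUNC_AVG -> AVG
    return "\n".join(
        [
            f"SELECT {group_by_field}, {sql_name}({aggregate_by_field})",
            "FROM db",
            f"WHERE {filtering_query}",
            f"GROUP BY {group_by_field}",
        ]
    )


example = sql_like("AGG_FUNC_AVG", "response_time", "service", "response_time:*")
print(example)
```

With these inputs the helper reproduces the average-response-time example from the guide, with `WHERE` placed before `GROUP BY`.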

49-51: Polish example lead-in

Improve flow and grammar.

-Considering real-world example, we may want to calculate average response time for services having `response_time`
-field, then we will write the following query:
+As a real-world example, to calculate the average response time for services where the `response_time`
+field exists, use:

61-72: Grammar and COUNT formatting

Pluralize “aggregations” and remove the space in COUNT(*).

-Count and unique aggregations are very similar to the above examples, except for those aggregation there is no need to
-have an
-additional `group_by_field`, since we are already grouping by `aggregate_by_field`.
+Count and unique aggregations are very similar to the above examples, except that for these aggregations there is no need
+to have an additional `group_by_field`, since we are already grouping by `aggregate_by_field`.
@@
-SELECT <aggregate_by_field>, COUNT (*)
+SELECT <aggregate_by_field>, COUNT(*)

74-77: Tighten prose before the COUNT example

Shorten and clarify.

-Considering real-world example, we may want to calculate number of logs for each logging level (`debug`, `info`, etc.)
-for
-the particular service, e.g. `seq-db`, then we can write the following query:
+As a real-world example, to calculate the number of logs per level (`debug`, `info`, etc.)
+for a particular service (e.g., `seq-db`), use:

87-94: Histogram section: grammar + anchor check

Minor grammar tweaks; also double-check the gethistogram/complexsearch anchors exist (script in earlier comment).

-Histograms allow users to visually interpret the distribution of logs satisfying given query. E.g. number of logs of the
-particular service for the given interval of time.
+Histograms let users visually interpret the distribution of logs satisfying a given query—e.g., the number of logs for a
+particular service over a given time interval.
@@
-Histograms can be queried separately, using [GetHistogram](10-public-api.md#gethistogram) or with documents and
-functional aggregations using [ComplexSearch](10-public-api.md#complexsearch) gRPC handlers.
+You can query histograms separately via [GetHistogram](10-public-api.md#gethistogram), or together with documents and
+functional aggregations via [ComplexSearch](10-public-api.md#complexsearch) gRPC handlers.
@@
-For the detailed API and examples, please refer to [public API](10-public-api.md#gethistogram)
+For detailed API docs and examples, see the [public API](10-public-api.md#gethistogram).

97-112: Timeseries section: grammar and clarity

Improve readability and tighten wording.

-Timeseries allow to calculate aggregations for intervals and visualize them. They are something in between histograms
-and functional aggregations: they allow to simultaneously calculate multiple histograms for the given aggregate
-functions.
+Timeseries enable calculating aggregations over intervals and visualizing them. They are a middle ground between histograms
+and functional aggregations: they let you compute multiple histograms for the given aggregate functions at once.
@@
-Consider the previous example of histograms, where we visualized number of logs over time only for one service at a
-time. Using the power of timeseries, we can calculate number of logs for each service simultaneously, using the
-`AGG_FUNC_COUNT` over `service` field.
+Consider the previous histogram example, where we visualized the number of logs over time for a single service.
+With timeseries, you can compute the number of logs for each service simultaneously using
+`AGG_FUNC_COUNT` over the `service` field.
@@
-Another example of using timeseries is visualizing number of logs for each log-level over time. It may be exceptionally
-useful, when there is a need to debug real-time problems. We can simply visualize number of logs for each level and find
-unusual spikes and logs associated with them.
+Another example is visualizing the number of logs for each log level over time. This is especially useful for debugging
+real-time issues—spikes become obvious, along with the associated logs.
@@
-Because timeseries are basically aggregations, they have the same API as aggregations, except a new `interval` field is
-present to calculate number of buckets to calculate aggregation on. For the details, please refer
-to [public API](10-public-api.md#aggregation-examples)
+Because timeseries are aggregations, they use the same API, plus an `interval` parameter that determines the number of buckets.
+For details, see the [public API](10-public-api.md#aggregation-examples).
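
As an illustration of how an `interval` can determine the number of buckets, the Python sketch below counts logs per service in fixed-width time buckets. It assumes buckets are aligned to the start of the queried range and that timestamps are plain seconds; `bucket_counts` and the sample events are hypothetical, and seq-db's actual bucketing may differ.

```python
import math
from collections import Counter


def bucket_counts(events, start, end, interval):
    """Count events per (service, bucket index) over [start, end).

    events: iterable of (timestamp_seconds, service) pairs.
    Returns the number of buckets and a dict of counts.
    """
    n_buckets = math.ceil((end - start) / interval)
    counts = Counter()
    for ts, service in events:
        if start <= ts < end:
            counts[(service, (ts - start) // interval)] += 1
    return n_buckets, dict(counts)


events = [(0, "a"), (30, "a"), (70, "b"), (130, "a"), (175, "b")]
n, counts = bucket_counts(events, start=0, end=180, interval=60)
print(n)
print(counts)
```

Each service gets its own series of bucket counts, which is the "multiple histograms at once" behaviour the timeseries section describes for `AGG_FUNC_COUNT` over the `service` field.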
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 0a82d3a and 614cc27.

📒 Files selected for processing (1)
  • docs/en/13-aggregations.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/en/13-aggregations.md

[grammar] ~7-~7: There might be a mistake here.
Context: ...ograms and timeseries. Each of the types relies on the usage of the inverted-inde...

(QB_NEW_EN)


[grammar] ~8-~8: There might be a mistake here.
Context: ...ations for the fields, the field must be indexed. However, because of that, seq-d...

(QB_NEW_EN)


[style] ~9-~9: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...d. However, because of that, seq-db can very quickly retrieve and aggregate data. ## Functi...

(EN_WEAK_ADJECTIVE)


[grammar] ~13-~13: There might be a mistake here.
Context: ...s that match the query. E.g. calculating number of logs written by each service i...

(QB_NEW_EN)


[grammar] ~49-~49: There might be a mistake here.
Context: ...ring real-world example, we may want to calculate average response time for services havi...

(QB_NEW_EN)


[grammar] ~49-~49: There might be a mistake here.
Context: ...time for services having response_time field, then we will write the following ...

(QB_NEW_EN)


[grammar] ~61-~61: There might be a mistake here.
Context: ...or those aggregation there is no need to have an additional group_by_field, sin...

(QB_NEW_EN)


[grammar] ~62-~62: There might be a mistake here.
Context: ... aggregation there is no need to have an additional group_by_field, since we ar...

(QB_NEW_EN)


[grammar] ~74-~74: There might be a mistake here.
Context: ...ring real-world example, we may want to calculate number of logs for each logging level (...

(QB_NEW_EN)


[grammar] ~75-~75: There might be a mistake here.
Context: ...ogging level (debug, info, etc.) for the particular service, e.g. seq-db, t...

(QB_NEW_EN)


[grammar] ~87-~87: There might be a mistake here.
Context: ...ally interpret the distribution of logs satisfying given query. E.g. number of logs of the...

(QB_NEW_EN)


[grammar] ~87-~87: There might be a mistake here.
Context: ... given query. E.g. number of logs of the particular service for the given interva...

(QB_NEW_EN)


[grammar] ~90-~90: There might be a mistake here.
Context: ...i.md#gethistogram) or with documents and functional aggregations using [ComplexSe...

(QB_NEW_EN)


[grammar] ~97-~97: There might be a mistake here.
Context: ...They are something in between histograms and functional aggregations: they allow ...

(QB_NEW_EN)


[grammar] ~98-~98: There might be a mistake here.
Context: ...tiple histograms for the given aggregate functions. Consider the previous exampl...

(QB_NEW_EN)


[grammar] ~101-~101: There might be a mistake here.
Context: ...revious example of histograms, where we visualized number of logs over time only for one s...

(QB_NEW_EN)


[grammar] ~101-~101: There might be a mistake here.
Context: ...logs over time only for one service at a time. Using the power of timeseries, we ...

(QB_NEW_EN)


[grammar] ~102-~102: There might be a mistake here.
Context: .... Using the power of timeseries, we can calculate number of logs for each service simulta...

(QB_NEW_EN)


[grammar] ~102-~102: There might be a mistake here.
Context: ...r each service simultaneously, using the AGG_FUNC_COUNT over service field. ...

(QB_NEW_EN)


[grammar] ~105-~105: There might be a mistake here.
Context: ... Another example of using timeseries is visualizing number of logs for each log-level over ...

(QB_NEW_EN)


[grammar] ~105-~105: There might be a mistake here.
Context: ...level over time. It may be exceptionally useful, when there is a need to debug re...

(QB_NEW_EN)


[grammar] ~106-~106: There might be a mistake here.
Context: ...debug real-time problems. We can simply visualize number of logs for each level and find ...

(QB_NEW_EN)


[grammar] ~106-~106: There might be a mistake here.
Context: ...e number of logs for each level and find unusual spikes and logs associated with ...

(QB_NEW_EN)


[grammar] ~109-~109: There might be a mistake here.
Context: ...ations, except a new interval field is present to calculate number of buckets t...

(QB_NEW_EN)


[grammar] ~110-~110: There might be a mistake here.
Context: ...pt a new interval field is present to calculate number of buckets to calculate aggregat...

(QB_NEW_EN)


[grammar] ~110-~110: There might be a mistake here.
Context: ...gation on. For the details, please refer to [public API](10-public-api.md#aggrega...

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test

@forshev forshev self-requested a review September 1, 2025 11:42

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (5)
docs/ru/14-aggregations.md (5)

7-9: Grammar/clarity: “по полям поле должно быть проиндексировано”.

Singular/plural mismatch and redundant wording.

-seq-db поддерживает различные типы агрегаций: функциональные агрегации, гистограммы и таймсерии (временные ряды). Каждый
-из этих типов опирается на использование обратного индекса, поэтому для вычисления агрегаций по полям поле должно быть
-проиндексировано. Однако благодаря этому seq-db может очень быстро извлекать и агрегировать данные.
+seq-db поддерживает различные типы агрегаций: функциональные агрегации, гистограммы и таймсерии (временные ряды). Каждый
+из этих типов опирается на обратный индекс, поэтому для вычисления агрегаций по полям соответствующие поля должны быть
+проиндексированы. Благодаря этому seq-db может очень быстро извлекать и агрегировать данные.

37-38: Punctuation and style around filtering_query.

Add em dash and comma.

-- `filtering_query`- запрос чтобы отфильтровать только интересующие нас логи
+- `filtering_query` — запрос, чтобы отфильтровать только интересующие нас логи

75-76: Translate the inline SQL comment.

The Russian doc should keep comments in Russian.

-WHERE response_time:* -- meaning that `response_time` field exists in logs
+WHERE response_time:* -- означает, что поле `response_time` присутствует в логах

99-101: Localize/format the heading “Count, unique”.

Use the Russian conjunction and keep casing consistent with the function names.

-### Count, unique
+### Count и Unique

185-187: Clarify time-series ‘interval’ description and add code formatting.

Make purpose explicit and format the field name.

-Поскольку временные ряды по сути являются агрегациями, они имеют тот же API, что и агрегации, за исключением того, что
-присутствует новое поле interval для вычисления количества интервалов для расчета агрегации. Для получения подробной
-информации обратитесь к [публичному API](10-public-api.md#count-с-указанием-интервала).
+Поскольку временные ряды по сути являются агрегациями, они используют тот же API, за исключением нового поля `interval`,
+которое задаёт шаг временного окна (разбиение на интервалы) для расчёта агрегации. Для получения подробной информации
+обратитесь к [публичному API](10-public-api.md#count-с-указанием-интервала).
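
For illustration only, the `interval` semantics proposed in the suggestion above (a fixed time-window step that splits the queried range into buckets) can be sketched in Python. This is not seq-db's implementation or API: the function `count_per_bucket`, the event tuples, and the `gateway` service name are hypothetical, and the sketch only mimics a count grouped by the `service` field per interval.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_per_bucket(events, start, interval):
    """Count events per (bucket_start, service) pair.

    events   -- iterable of (timestamp, service) tuples (hypothetical shape)
    start    -- beginning of the queried time range
    interval -- fixed bucket width as a timedelta
    """
    counts = Counter()
    for ts, service in events:
        # Integer index of the fixed-width bucket this timestamp falls into.
        index = (ts - start) // interval
        bucket_start = start + index * interval
        counts[(bucket_start, service)] += 1
    return counts


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
minute = timedelta(minutes=1)
events = [
    (start + timedelta(seconds=5), "seq-db"),
    (start + timedelta(seconds=50), "seq-db"),
    (start + timedelta(seconds=70), "seq-db"),
    (start + timedelta(seconds=75), "gateway"),  # hypothetical service name
]
counts = count_per_bucket(events, start, minute)
for (bucket_start, service), n in sorted(counts.items()):
    print(bucket_start.isoformat(), service, n)
```

A smaller `interval` yields more buckets over the same range, which is the relationship the reworded paragraph tries to make explicit.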
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 614cc27 and 6d3c31c.

📒 Files selected for processing (4)
  • docs/en/10-public-api.md (10 hunks)
  • docs/en/14-aggregations.md (1 hunks)
  • docs/ru/10-public-api.md (10 hunks)
  • docs/ru/14-aggregations.md (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • docs/en/14-aggregations.md
  • docs/en/10-public-api.md
  • docs/ru/10-public-api.md

moflotas and others added 2 commits September 11, 2025 14:27
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@eguguchkin eguguchkin added this to the v0.62.0 milestone Sep 22, 2025
@moflotas moflotas merged commit 8d4e256 into main Sep 22, 2025
7 checks passed
@moflotas moflotas deleted the 89-aggregation-docs branch September 22, 2025 10:55

Development

Successfully merging this pull request may close these issues.

Add documentation for aggregations

5 participants