docs: translate mappings and index types docs to russian #124
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #124 +/- ##
==========================================
+ Coverage 71.31% 71.50% +0.19%
==========================================
Files 201 201
Lines 18167 18167
==========================================
+ Hits 12955 12991 +36
+ Misses 4494 4456 -38
- Partials 718 720 +2
📝 Walkthrough
Refined English indexing docs with an auto-indexing option, renamed indexing configuration keys, expanded mapping type explanations, added indexing internals and object/nested indexing notes; Russian docs rewritten/translated with matching structure, examples, and configuration key updates.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
🧹 Nitpick comments (9)
docs/en/03-index-types.md (2)
20-24: Tighten wording around high-cardinality fields.
“might blow up the index” is colloquial; suggest more precise language.
Apply:
-... as the indexing of these fields might blow up the index.
+... as indexing these fields may drastically increase index size and memory usage.
132-140: Unify YAML style for field-with-dot example.
Mixed forms can confuse readers. Prefer consistent `name`/`type` objects.
Apply:
-  - user.name: keyword
+  - name: user.name
+    type: keyword
docs/ru/03-index-types.md (7)
20-25: Style and grammar fixes for `keyword`.
- “все значение поля это один токен” → “всё значение поля — один токен” (“the entire field value is one token”).
- Clarify the note about memory and index growth.
Apply:
-Тип поля `keyword` считает, что все значение поля это один токен, значение поля никак не разбивается на части.
+Тип поля `keyword` рассматривает всё значение как один токен; оно не разбивается на части.
-... способствует росту индекса.
+... может существенно увеличить размер индекса и потребление памяти.
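The distinction this comment describes can be sketched in Python. This is a minimal illustration, not seq-db's tokenizer: splitting on whitespace is an assumption, and `tokens_for` is a hypothetical helper. The sample values are taken from the token example that appears later in this review.

```python
def tokens_for(field: str, value: str, index_type: str) -> list[str]:
    """Return illustrative index tokens for one field value.

    Hypothetical helper: a real `text` tokenizer does more than split
    on whitespace; this only contrasts the two index types.
    """
    if index_type == "keyword":
        # keyword: the whole value is a single token, never split
        return [f"{field}:{value}"]
    if index_type == "text":
        # text: the value is broken into separate tokens
        return [f"{field}:{word}" for word in value.split()]
    raise ValueError(f"unsupported index type: {index_type}")


keyword_tokens = tokens_for("message.keyword", "hello world", "keyword")
text_tokens = tokens_for("message", "hello world", "text")
```

Under this model a high-cardinality `keyword` field yields one distinct token per distinct value, which is why the comment asks the docs to mention index and memory growth.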
36-39: Terminology and capitalization.
“url” → “URL”, “elasticsearch” → “Elasticsearch”.
Apply:
-... пути в файловой системе или url.
+... пути в файловой системе или URL.
-Похож на ... в elasticsearch.
+Похож на ... в Elasticsearch.
91-97: Units of measurement, terminology, and a missing space.
- “constant” → “константа”.
- Add a space after the comma.
- State that sizes are in bytes.
Apply:
-* `indexing.max_token_size` - максимальный размер токена,по умолчанию 72.
+* `indexing.max_token_size` - максимальный размер токена (в байтах), по умолчанию 72.
-* constant `consts.MaxTextFieldValueLength` - ограничивает максимальную длину текстового поля, текущий порог 32768 байт.
+* Константа `consts.MaxTextFieldValueLength` - ограничивает максимальную длину текстового поля (32768 байт).
Also consider clarifying `indexing.partial_field_indexing` similarly to EN.
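The unit matters because byte length and character length diverge for non-ASCII text. The sketch below assumes the limit is counted in UTF-8 bytes, as the suggested wording states; `MAX_TOKEN_SIZE` is an illustrative name, not a seq-db identifier.

```python
# Default of indexing.max_token_size quoted above; the unit is assumed to be bytes.
MAX_TOKEN_SIZE = 72


def byte_len(token: str) -> int:
    """Length of the token in UTF-8 bytes (what Go's len(s) reports for a string)."""
    return len(token.encode("utf-8"))


ascii_token = "a" * 40      # 40 characters, 1 byte each
cyrillic_token = "я" * 40   # 40 characters, 2 bytes each in UTF-8

ascii_fits = byte_len(ascii_token) <= MAX_TOKEN_SIZE
cyrillic_fits = byte_len(cyrillic_token) <= MAX_TOKEN_SIZE
```

With a byte-based limit, a 40-character Cyrillic token exceeds the default while a 40-character ASCII token does not. For Russian-language logs, stating the unit in the docs is therefore more than a style fix.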
100-115: Grammar and terminology fixes in the object example.
- “название поле” → “название поля”.
- “json данные” → “JSON-данные”.
Apply:
-  - name: "myobject" # название поле, которое содержит вложенные json данные
+  - name: "myobject" # название поля, которое содержит вложенные JSON-данные
-    mapping-list: # маппинг для вложенных полей
+    mapping-list: # маппинг для вложенных полей
158-164: Fix number agreement: “Название ... состоит”.
Apply:
-Названия "неявного" поля состоит из значений `name` и `title`, соединенных точкой между ними.
+Название "неявного" поля состоит из значений `name` и `title`, соединённых точкой.
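The corrected sentence says that the name of an "implicit" field consists of the `name` and `title` values joined by a dot. A one-line sketch of that rule follows. The function name is hypothetical, and the `message`/`keyword` pair is taken from the token example elsewhere in this review.

```python
def implicit_field_name(name: str, title: str) -> str:
    """Join the parent field `name` and the additional type's `title` with a dot."""
    return f"{name}.{title}"


field = implicit_field_name("message", "keyword")
```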
190-192: Clarify what `size` is and its units.
Add “(в байтах)” (“in bytes”) and state that this is a limit on the input field value.
Apply:
-Также есть `size`, которое позволяет указать максимальный размер значения поля.
-Если поле `size` не указано, будет использовано значение [по умолчанию](#параметры-конфигурации).
+Параметр `size` задаёт максимальный размер значения поля (в байтах).
+Если `size` не указан, используется значение [по умолчанию](#параметры-конфигурации).
195-206: Punctuation and ID formatting fixes.
- A comma is needed: “документ, используя”.
- Single backticks are enough for inline code.
Apply:
-Запишем документ используя [маппинг](#пример-маппинга) (ID документа будет следующим: ```id = N```):
+Запишем документ, используя [маппинг](#пример-маппинга) (ID документа: `id = N`):
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/en/03-index-types.md (5 hunks)
docs/ru/03-index-types.md (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (4)
docs/en/03-index-types.md (3)
200-246: I don’t see any issues beyond the JSON formatting fix you already applied. The `_exists_` tokens are always emitted by the bulk indexer (see `proxy/bulk/indexer.go` comments and mapping), regardless of explicit mapping entries, and tests confirm it. Everything else is correct.
41-50: Clarify trailing slash and case handling in path matching.
Add a note after the example stating that trailing-slash normalization and case sensitivity adhere to `indexing.case_sensitive`; confirm your deployment’s defaults.
96-101: Clarify units and scope for indexing config keys.
I didn’t find any code paths using these settings. Please confirm whether `max_token_size` is measured in bytes (i.e. Go’s `len(s)`) or runes/characters, and enumerate exactly which analyzers or mapping types respect `case_sensitive`, then update the docs to state those details explicitly.
docs/ru/03-index-types.md (1)
210-229: Fix the JSON example: quote the `id` key and remove the trailing comma.
Apply:
-{
-  id: N,
-  tokens: [
+{
+  "id": "N",
+  "tokens": [
     "_exists_:message",
     "message:hello",
     "message:world",
     "_exists_:message.keyword",
     "message.keyword:hello world",
     "_exists_:level",
     "level:info",
     "_exists_:foo",
     "_exists_:bar",
     "bar:ddd eee fff",
     "_exists_:uri",
     "uri:/my",
-    "uri:/my/path",
+    "uri:/my/path"
   ]
 }
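A quick check with Python's standard `json` module shows why both changes are needed. The documents below are shortened versions of the example; only the quoting and the trailing comma differ.

```python
import json

# Shape of the original example: unquoted keys and a trailing comma
invalid_doc = '{ id: N, tokens: ["uri:/my", "uri:/my/path", ] }'

# Shape of the corrected example: quoted keys and values, no trailing comma
valid_doc = '{ "id": "N", "tokens": ["uri:/my", "uri:/my/path"] }'


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


before_ok = is_valid_json(invalid_doc)
after_ok = is_valid_json(valid_doc)
```

Note that the placeholder `N` also has to be quoted (or replaced with a number) for the block to parse.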
fa01427 to 62be887
62be887 to 85424d5
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/en/03-index-types.md (1)
11-13: Fix grammar + strengthen production warning (duplicate of prior comment).
Previous review already proposed the exact diff to remove “this this”, clarify behavior, and add a stronger warning. Please apply it verbatim.
🧹 Nitpick comments (3)
docs/en/03-index-types.md (3)
78-83: Tighten wording for exists semantics.
Minor grammar and clarity tweak.
-Used when the **presence** of the field is important and not the value.
+Used when the field’s **presence**—not its value—matters.
105-110: Clarify that `object` is a container mapping, not a searchable index itself.
Add a sentence to avoid confusion.
 seq-db can also index logs containing nested structured data. In this case, the parent field should have the `object` index type, and contain a `mapping-list` object inside, that would specify how exactly its nested fields should be indexed.
+The `object` type is a container: it isn’t queried directly, but defines how its nested fields are indexed and searched.
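To illustrate the container role, here is a sketch that flattens a nested document into dotted field names, so that only leaf fields carry values to index. Dotted naming is an assumption based on the `user.name` example in this review; `flatten` is a hypothetical helper, not seq-db's implementation.

```python
def flatten(doc: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # container: contributes no value itself, only its nested fields
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


log = {"level": "info", "user": {"name": "alice", "geo": {"city": "Oslo"}}}
fields = flatten(log)
```

In this model `user` never appears as a field of its own, which mirrors the proposed sentence: the container is not queried directly, only its nested fields are.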
200-236: Make the “Indexing internals” JSON example valid JSON and disambiguate tokens with spaces.
Quote keys, remove trailing commas, and keep a consistent token representation. If tokens with spaces are possible, show them clearly.
-```json
-{
-  id: N,
-  tokens: [
-    "_exists_:message",
-    "message:hello",
-    "message:world",
-    "_exists_:message.keyword",
-    "message.keyword:hello world",
-    "_exists_:level",
-    "level:info",
-    "_exists_:foo",
-    "_exists_:bar",
-    "bar:ddd eee fff",
-    "_exists_:uri",
-    "uri:/my",
-    "uri:/my/path",
-  ]
-}
-```
+```json
+{
+  "id": "N",
+  "tokens": [
+    "_exists_:message",
+    "message:hello",
+    "message:world",
+    "_exists_:message.keyword",
+    "message.keyword:hello world",
+    "_exists_:level",
+    "level:info",
+    "_exists_:foo",
+    "_exists_:bar",
+    "bar:\"ddd eee fff\"",
+    "_exists_:uri",
+    "uri:/my",
+    "uri:/my/path"
+  ]
+}
+```
If the internal token format never uses quotes, consider using a keyword example without spaces to avoid ambiguity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📥 Commits
Reviewing files that changed from the base of the PR and between fa01427b3135503b0ad80af0d7ddd37beee31a28 and 85424d5509369f177dae5e5dba8a064088da02e8.
📒 Files selected for processing (2)
docs/en/03-index-types.md (5 hunks)
docs/ru/03-index-types.md (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
docs/ru/03-index-types.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (1)
docs/en/03-index-types.md (1)
41-42: Clarify path matching and normalization semantics.
State explicitly whether matching is prefix-only, how trailing slashes are handled (/my/path vs /my/path/), case sensitivity, and percent-encoding normalization. This avoids surprises for users.
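The questions in the last comment are easier to answer against a concrete model. The sketch below emits one token per path prefix, which matches the `uri:/my` and `uri:/my/path` tokens in the example. Dropping empty segments (so a trailing slash changes nothing) is an assumption made here, not confirmed seq-db behavior, and `path_tokens` is a hypothetical helper.

```python
def path_tokens(field: str, path: str) -> list[str]:
    """Emit one token per prefix of a slash-separated path."""
    # Assumption: empty segments are dropped, so "/my/path/" equals "/my/path"
    segments = [s for s in path.split("/") if s]
    tokens = []
    prefix = ""
    for segment in segments:
        prefix += "/" + segment
        tokens.append(f"{field}:{prefix}")
    return tokens


plain = path_tokens("uri", "/my/path")
trailing = path_tokens("uri", "/my/path/")
```

Case folding and percent-encoding are left out on purpose: the sketch performs neither, and the docs would need to state what the real indexer does for both.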
Description
Translate mappings and index types docs to Russian.
If you have used LLM/AI assistance please provide model name and full prompt:
Summary by CodeRabbit