lake: incremental updates 0607 #23019
Add a (Beta) label to the page front matter title and the main heading in tidb-cloud-lake/guides/integrate-with-amazon-sqs-s3.md to indicate the integration task is in beta.
Code Review
This pull request updates the TiDB Cloud Lake documentation to mark the Amazon SQS (S3) integration task and IAM role data source as Beta features. It also documents the new NDJSON support for schema evolution, detailing the workflow, privilege requirements, sampling options, and inference rules. Additionally, the system settings reference table has been completely updated with more comprehensive settings and descriptions. The review feedback focuses on enforcing sentence case for headings across several files in accordance with the style guide, and correcting grammatical errors in the system settings descriptions.
| Confirm that `QueueArn` points to the target SQS queue, `Events` includes `s3:ObjectCreated:*`, and `FilterRules` matches the `Object Key Prefix` / `Object Key Suffix` configured in the {{{ .lake }}} data source. | ||
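As a rough illustration of the check described in this line, the sketch below validates a bucket notification configuration held in a Python dict. It assumes the dict has the shape returned by the S3 `GetBucketNotificationConfiguration` API (`QueueConfigurations`, `QueueArn`, `Events`, `Filter.Key.FilterRules`). The function name `queue_config_matches` and the sample ARN, prefix, and suffix are hypothetical and are not part of the documentation under review.

```python
def queue_config_matches(config, queue_arn, prefix="", suffix=""):
    """Return True if any queue configuration targets queue_arn, includes
    s3:ObjectCreated:*, and has the expected key prefix and suffix filters."""
    for qc in config.get("QueueConfigurations", []):
        if qc.get("QueueArn") != queue_arn:
            continue
        if "s3:ObjectCreated:*" not in qc.get("Events", []):
            continue
        rules = qc.get("Filter", {}).get("Key", {}).get("FilterRules", [])
        # Rule names may come back as "Prefix"/"Suffix" or "prefix"/"suffix",
        # so compare them case-insensitively.
        found = {rule["Name"].lower(): rule["Value"] for rule in rules}
        if found.get("prefix", "") == prefix and found.get("suffix", "") == suffix:
            return True
    return False


# Hypothetical sample values for illustration only.
sample_arn = "arn:aws:sqs:us-east-1:123456789012:lake-events"
sample_config = {
    "QueueConfigurations": [
        {
            "QueueArn": sample_arn,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {
                "Key": {
                    "FilterRules": [
                        {"Name": "Prefix", "Value": "raw/"},
                        {"Name": "Suffix", "Value": ".ndjson"},
                    ]
                }
            },
        }
    ]
}
```

The dict could be obtained, for example, from `aws s3api get-bucket-notification-configuration`; the sketch only inspects it and makes no AWS calls itself.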
| ## Step 4: Create an IAM Role for Platform to Assume | ||
| ## Step 4: Create an IAM Role for {{{ .lake }}} to Assume |
According to the style guide, headings should use sentence case. Please update this heading to use sentence case.
| ## Step 4: Create an IAM Role for {{{ .lake }}} to Assume | |
| ## Step 4: Create an IAM role for {{{ .lake }}} to assume |
References
- Use sentence case for headings (e.g., `## Configure the cluster`). (link)
| Query-based COPY is not affected. For example, `COPY INTO <table> FROM (SELECT ... FROM @stage)` keeps the existing privilege requirements. | ||
| ## Parquet Example |
According to the style guide, headings should use sentence case. Please update this heading to use sentence case.
| ## Parquet Example | |
| ## Parquet example |
References
- Use sentence case for headings (e.g., `## Configure the cluster`). (link)
| Row 3 has `currency = NULL` because its source file did not contain that column. | ||
| ## NDJSON Example |
According to the style guide, headings should use sentence case. Please update this heading to use sentence case.
| ## NDJSON Example | |
| ## NDJSON example |
References
- Use sentence case for headings (e.g., `## Configure the cluster`). (link)
| {{{ .lake }}} loads NDJSON files with `TYPE = ndjson`. NDJSON files do not have an embedded columnar schema like Parquet files, so {{{ .lake }}} samples file content, infers fields that are missing from the target table, and appends them as nullable columns. | ||
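The sampling-and-inference idea in this paragraph can be sketched in a few lines of Python. This is not the product's implementation: the function name `infer_missing_columns`, the `sample_size` parameter, the exact-match comparison of field names, and the type mapping (including the `VARIANT` and `VARCHAR` fallbacks) are all assumptions made for illustration. The actual inference rules are documented elsewhere in the page under review.

```python
import json

# Hypothetical mapping from JSON value types to column type names.
_TYPE_NAMES = {bool: "BOOLEAN", int: "BIGINT", float: "DOUBLE", str: "VARCHAR"}


def infer_missing_columns(ndjson_lines, existing_columns, sample_size=100):
    """Sample NDJSON lines and return {field: type_name} for fields that
    are not already present in existing_columns."""
    existing = set(existing_columns)
    inferred = {}
    for line in ndjson_lines[:sample_size]:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for key, value in record.items():
            if key in existing or value is None:
                continue  # known column, or null carries no type information
            type_name = _TYPE_NAMES.get(type(value), "VARIANT")
            # Assumed rule: conflicting types across samples fall back to VARCHAR.
            if key in inferred and inferred[key] != type_name:
                type_name = "VARCHAR"
            inferred[key] = type_name
    return inferred


sample_lines = [
    '{"id": 1, "amount": 9.5}',
    '{"id": 2, "currency": "USD"}',
]
new_columns = infer_missing_columns(sample_lines, ["id"])
```

Each entry in `new_columns` corresponds to a column that would be appended as nullable, so rows from files that lack the field read back as `NULL`.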
| ### Step 1: Create a Table and Stage |
According to the style guide, headings should use sentence case. Please update this heading to use sentence case.
| ### Step 1: Create a Table and Stage | |
| ### Step 1: Create a table and stage |
References
- Use sentence case for headings (e.g., `## Configure the cluster`). (link)
| CREATE OR REPLACE STAGE events_stage; | ||
| ``` | ||
| ### Step 2: Generate NDJSON Files with Different Fields |
According to the style guide, headings should use sentence case. Please update this heading to use sentence case.
| ### Step 2: Generate NDJSON Files with Different Fields | |
| ### Step 2: Generate NDJSON files with different fields |
References
- Use sentence case for headings (e.g., `## Configure the cluster`). (link)
| | COLUMN_MATCH_MODE | For Parquet: column name matching mode | `case-insensitive` | | ||
| | SCHEMA_EVOLUTION | For NDJSON: sampling options used to infer columns that are missing from the target table. Requires `ENABLE_SCHEMA_EVOLUTION = true` and the `ALTER` privilege on the target table. | `AUTO` sampling | | ||
| ### SCHEMA_EVOLUTION Options |
According to the style guide, headings should use sentence case. Please update this heading to use sentence case.
| ### SCHEMA_EVOLUTION Options | |
| ### SCHEMA_EVOLUTION options |
References
- Use sentence case for headings (e.g., `## Configure the cluster`). (link)
Adjust the heading level of 'NDJSON Inference Rules' from ### to #### to improve document structure and nesting consistency. No content changes were made beyond the heading level.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: lilin90
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
b7fa8dd into pingcap:feature/preview-cloud-lake
What is changed, added or deleted? (Required)
Incremental docs updates for TiDB Cloud Lake through June 7, 2026.
Which TiDB version(s) do your changes apply to? (Required)
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?