Skip to content

Conversation

@Dwrite
Copy link

@Dwrite Dwrite commented Dec 17, 2025

ClickHouse does not support standard nested JOIN syntax and has strict scoping rules for identifiers in subqueries. When a JOIN's right side is another JOIN, it must be wrapped in a subquery.

This commit ensures that the ClickHouse dialect:

Wraps nested JOINs in a subquery to satisfy syntax requirements.

Generates explicit column aliases (e.g., AS col_name) for all projected columns within the subquery. This prevents "Unknown Identifier" errors by ensuring internal table qualifiers are flattened and resolvable by the outer query block.

Avoids redundant subquery nesting to maintain optimal SQL structure.

.withClickHouse()
.exec();

assertThat(sql, containsString("LEFT JOIN (SELECT *"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think use following is fine. Just using a expected string.

    sql(query)
        .schema(CalciteAssert.SchemaSpec.JDBC_SCOTT)
        .withClickHouse()
        .ok(expected);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. I agree that using ok(expected) is generally preferred
and is what we should use whenever possible.

In this case, however, the generated SQL contains auto-generated subquery aliases
(e.g. t220414 / t220415) coming from Calcite’s alias generation, and these IDs are
not stable across different runs (for example, gradlew test vs IDE runs).
This makes a strict ok(expected) assertion flaky.

To avoid introducing non-deterministic tests, this test verifies the structural
properties of the generated SQL instead (i.e. whether the required wrapper
SELECT * FROM (...) is present or absent), rather than asserting exact alias
names.

I’ve added a comment in the test to document this reasoning. If alias generation
becomes stable in the future, we can switch this back to ok(expected).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that if a fixed SQL statement is processed in a specific way, the alias should be stable. Can you explain why it might be unstable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this issue remains to be discussed and has not been resolved.

@xiedeyantu
Copy link
Member

xiedeyantu commented Dec 18, 2025

Tip: The commit message, Jira title and PR title should be consistent. I think the commit message title is fine. PR title may lose something.

@Dwrite Dwrite changed the title [CALCITE-7279] Add subquery wrappers for nested JOINs in ClickHouse d… [CALCITE-7279] Add subquery wrappers for nested JOINs in ClickHouse dialect Dec 19, 2025
String targetName = fieldNames.get(i);

// If the expression is already aliased, strip the AS to get the raw expression.
if (expr.getKind() == SqlKind.AS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is quite strange. What if there are ambiguous fields?
What will happen if I want to have something like t1.id as customer_id, t2.id as product_id?
It will become t1.id as id, t2.id as id

Also, can you point me to ClickHouse documentation that shows this limitation?
Have you tried creating an issue in the ClickHouse repository as well ?

Copy link
Author

@Dwrite Dwrite Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the PR description with the specific error log from ClickHouse 25.12.1.

To answer your concern about ambiguity: Calcite's RelDataType already ensures that all field names in the row type are unique. My implementation uses these unique names for explicit aliasing (e.g., t1.id AS id, t2.id AS id_0). This guarantees that the subquery 'exports' a clean and deterministic schema to the outer query, resolving the ClickHouse scoping limitation without risking field collisions.

I also renamed the methods to shouldWrapNestedJoin and wrapNestedJoinWithAliases to better reflect their purpose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test might be a good way to illustrate this.

writer.endList(frame);
}

@Override public boolean supportWrapNestedJoin(RelNode rel) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why dialect itself detect if join is nested ? Why dialect just don't have:

boolean supportWrapNestedJoin() {
    return true //  just true for clickhouse without logic
}

Similar to requiresAliasForFromItems

Then RelToSqlConverter detect whether the join is nested or not, instead of moving such logic on dialect level

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a great point regarding architectural consistency. The reason I passed the RelNode to the dialect instead of using a simple boolean flag is that not all nested JOINs require wrapping in ClickHouse.

  1. Context-Specific Wrapping: ClickHouse's syntax restriction primarily targets cases where the right side of a JOIN is another JOIN. If we used a global supportWrapNestedJoin() boolean, RelToSqlConverter might end up wrapping every single JOIN or every nested structure indiscriminately, which would lead to redundant subqueries and potential performance regressions for simple linear joins (e.g., A JOIN B JOIN C).

  2. Scoping & Aliasing Complexity: As discussed, the fix isn't just about adding a SELECT * wrapper; it's about explicitly aliasing the projected columns based on the specific RelDataType of that join node. By passing the RelNode, the dialect (or the converter guided by the dialect) can inspect the row type to ensure identifiers are flattened correctly.

  3. Flexibility for Future Dialects: Some dialects might only need wrapping for specific types of JOINs (e.g., RIGHT JOIN or OUTER JOIN). Passing the RelNode provides the necessary metadata (like JoinRelType) to make an informed decision, whereas a simple boolean flag like requiresAliasForFromItems is too "blunt" for this specific scoping issue.

However, I agree we should keep the common logic in RelToSqlConverter. My implementation aims to keep the "When to wrap" decision in the Dialect (since it's a dialect-specific quirk) while keeping the "How to wrap" logic in the Converter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @dssysolyatin makes sense. Are there other databases that have similar problems?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have verified this against PostgreSQL, and as expected, PostgreSQL handles nested join identifiers natively without requiring any subquery wrapping or explicit aliasing. It can correctly resolve identifiers like d2.loc across nested join boundaries.

This confirms that the issue is specific to certain dialects like ClickHouse, which enforce stricter subquery isolation and scope boundaries. In ClickHouse, the internal table qualifiers are effectively "hidden" from the outer join scope unless they are flattened via an explicit AS alias in a subquery.

By introducing dialect.shouldWrapNestedJoin(), we can keep the generated SQL clean and standard for databases like PostgreSQL, while providing the necessary "scoping fix" for dialects with these specific constraints.

@Dwrite Dwrite changed the title [CALCITE-7279] Add subquery wrappers for nested JOINs in ClickHouse dialect [CALCITE-7279] Resolve ClickHouse identifier resolution error by aliasing nested JOIN projections Dec 22, 2025
@Dwrite Dwrite requested a review from mihaibudiu December 22, 2025 15:12
Result rightResult = visitInput(e, 1).resetAlias();

if (dialect.shouldWrapNestedJoin(e)) {
String newAlias = "t" + e.getId(); // alias
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain why this identifier is safe and won't for example hide another identifier with the same name that happens to exist?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the safety of the identifier t + e.getId():

In the RelToSqlConverter phase, original SQL aliases (like 'j') are often not preserved or can become ambiguous after multiple logical transformations. Using rel.getId() is a standard practice in Calcite for generating synthetic aliases because:

Global Uniqueness: RelNode IDs are globally unique within a single planning session. This ensures that the generated alias t{id} will not collide with any other synthetic aliases generated by the converter for different nodes in the same query tree.

Deterministic Scoping: Since we are creating a new subquery scope to resolve the ClickHouse scoping issue, this unique identifier provides a clean boundary. Even if a user-defined table named t123 exists, the risk of collision is significantly lower than attempting to reuse or guess original aliases which might have been pruned or renamed by optimization rules.

Avoids Identifier Shadowing: By forcing this unique alias at the join boundary, we ensure that the outer scope has a deterministic way to reference the flattened columns without worrying about shadowing identifiers from deeper nested structures.

While using SqlValidatorUtil.uniquify is another option, it requires maintaining a usedNames set across the entire implementation. Given Calcite's internal conventions, t + id provides the best balance between implementation robustness and identifier safety for this dialect-specific fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Significantly lower != 0.
A compiler has to work for all user programs.
The worst case is that some programs are compiled to produce incorrect results. These kinds of bugs are also very hard to diagnose when they happen. So why not avoid them?
I can easily see a program that has many tables named T0, T1, etc. where such a collision can happen.
For column names Calcite automatically generates fresh names if there is a collision.
If that's not true for relation names (I don't know myself), I think the implementation has to make the extra effort to generate correct code, for example by computing the used names before generating a fresh identifier. If uniquify does what you need, why not use it?

@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants