
Add Join operator support for Generic JDBC platform #708

Open
mohit-devlogs wants to merge 4 commits into apache:main from mohit-devlogs:jdbc-join-operator

Conversation

@mohit-devlogs
Contributor

This PR adds JOIN operator support for the Generic JDBC platform.

The JOIN operator is currently only implemented for wayang-postgres. This PR extends it so that it works with any JDBC database through the Generic JDBC platform.

Implementation:

  • Implemented GenericJdbcJoinOperator
  • Implemented JoinMapping
  • Added the mapping to Mappings.java

This PR extends the feature set of JOIN operators.

Closes #707

Contributor

@zkaoudi zkaoudi left a comment

Thank you, @mohit-devlogs. That was fast :) Could you please also add a unit test for the operator to make sure it works as intended?

@mohit-devlogs
Contributor Author

> Thank you, @mohit-devlogs. That was fast :) Could you please also add a unit test for the operator to make sure it works as intended?

Thank you! I will add a unit test for the operator and update the PR shortly.

@mspruc
Contributor

mspruc commented Mar 2, 2026

Can you expand the unit test to include the operator being executed?

@mohit-devlogs
Contributor Author

> Can you expand the unit test to include the operator being executed?

Thank you for the suggestion.
I will extend the unit test to execute the operator within a minimal plan and verify the join behavior.

@mohit-devlogs
Contributor Author

I have extended the unit test to execute the GenericJdbcJoinOperator via JdbcExecutor against an in-memory HSQLDB instance. However, I am facing the following issue when I run the code:

java.sql.SQLSyntaxErrorException: user lacks privilege or object not found: ID1

The error is raised when the GenericSqlToStreamOperator executes the query. I have checked the schema initialization and tried different identifier strategies, but the error persists. I suspect a mismatch between the aliases produced during SQL generation and the way HSQLDB resolves them. Since this failure prevents the build from passing, I have kept the execution-level test local. Would you like me to commit it so that you can look at the execution code, or should we take a different approach to testing operator execution in this module?

@mspruc

@zkaoudi
Contributor

zkaoudi commented Mar 3, 2026

@mohit-devlogs
Contributor Author

@zkaoudi I adapted the test to execute the operator using an in-memory HSQLDB instance.
The test now reaches the SQL execution stage, but the generated SQL fails with:

java.sql.SQLSyntaxErrorException: user lacks privilege or object not found: A

This appears to come from the SQL generated by GenericJdbcJoinOperator, where the column reference becomes detached from its table alias and HSQLDB interprets it as a table identifier.
Since the execution pipeline is now working and the failure occurs inside the generated SQL, could you confirm whether the expected approach is to adjust the SQL implementation mapping for GenericJdbcJoinOperator or whether the operator itself needs modification?
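To make the failure mode concrete, here is a minimal sketch of a join query in which every column reference stays attached to its table name. The class `JoinClauseSketch`, its methods, and the names `B` and `ID2` are hypothetical and are not taken from Wayang or from this PR; only `A` and `ID1` appear in the error messages above. It illustrates the idea rather than the SQL that GenericJdbcJoinOperator actually emits.

```java
import java.util.Objects;

/** Hypothetical helper, not part of Wayang: builds a join query with table-qualified columns. */
final class JoinClauseSketch {

    private JoinClauseSketch() {}

    /** Qualifies a column with its table name, e.g. ("A", "ID1") becomes "A.ID1". */
    static String qualify(String table, String column) {
        Objects.requireNonNull(table, "table");
        Objects.requireNonNull(column, "column");
        return table + "." + column;
    }

    /** Builds a two-table equi-join in which both join keys are qualified. */
    static String joinQuery(String leftTable, String leftKey,
                            String rightTable, String rightKey) {
        return "SELECT * FROM " + leftTable
                + " JOIN " + rightTable
                + " ON " + qualify(leftTable, leftKey)
                + " = " + qualify(rightTable, rightKey);
    }

    public static void main(String[] args) {
        // Table and column names here are placeholders for illustration only.
        System.out.println(joinQuery("A", "ID1", "B", "ID2"));
        // prints: SELECT * FROM A JOIN B ON A.ID1 = B.ID2
    }
}
```

If a qualifier is dropped or split from its column during generation, the database may try to resolve the leftover token as a separate object, which is consistent with the "object not found" messages reported here.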

@zkaoudi
Contributor

zkaoudi commented Mar 4, 2026

Yes, there must be something wrong with the Generic JDBC implementation. Since JdbcJoinOperatorTest.java works correctly, it has to be a bug in the generic JDBC module. You can go ahead and fix it in this PR. Then I can check the changes.

@mohit-devlogs
Contributor Author

> Yes, there must be something wrong with the Generic JDBC implementation. Since JdbcJoinOperatorTest.java works correctly, it has to be a bug in the generic JDBC module. You can go ahead and fix it in this PR. Then I can check the changes.

Thanks for the hint. The issue was caused by GenericJdbcTableSource interpreting the first constructor argument as jdbcName, which led to configuration lookups like wayang.A.jdbc.url.
I fixed the implementation so it correctly uses the Generic JDBC configuration and the join test now executes successfully.
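The lookup problem described above can be sketched as follows. The key template `wayang.%s.jdbc.url` is inferred from the single example `wayang.A.jdbc.url` in this comment, and the class `JdbcConfigKeySketch` and the name `mydb` are hypothetical placeholders, not Wayang code or configuration.

```java
/** Hypothetical illustration of how a JDBC name ends up inside a configuration key. */
final class JdbcConfigKeySketch {

    private JdbcConfigKeySketch() {}

    /** Assumed key template, inferred from the example "wayang.A.jdbc.url". */
    static String urlKey(String jdbcName) {
        return String.format("wayang.%s.jdbc.url", jdbcName);
    }

    public static void main(String[] args) {
        // Passing the table name "A" where a JDBC name is expected produces a key
        // that no configuration file is likely to define.
        System.out.println(urlKey("A"));     // prints: wayang.A.jdbc.url

        // "mydb" is a placeholder for whatever JDBC name the configuration declares.
        System.out.println(urlKey("mydb"));  // prints: wayang.mydb.jdbc.url
    }
}
```

Under this assumption, swapping the table name and the JDBC name in a constructor call silently changes which configuration entry is read, which would explain why the connection settings were not found.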


source1.addTargetPlatform(GenericJdbcPlatform.getInstance());
source2.addTargetPlatform(GenericJdbcPlatform.getInstance());
join.addTargetPlatform(JavaPlatform.getInstance());
Contributor

The join happens in Java for this test, so the GenericJdbcJoinOperator is not what is being tested.

}

long cardinality = resultSet.getLong(1);

Contributor

No need to introduce all these line breaks.

@zkaoudi
Contributor

zkaoudi commented Mar 4, 2026

I think there is a deeper issue with the JDBC executor, so it would be enough if your test just checks that the right SQL statement is created, just like JdbcJoinOperatorTest does.
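A test that only verifies SQL generation can be sketched as a string comparison that tolerates differences in whitespace and a trailing semicolon. The class `SqlAssertionSketch` and the sample query below are hypothetical; this is not the code in JdbcJoinOperatorTest, only one possible shape for such a check.

```java
/** Hypothetical helper for comparing generated SQL against an expected statement. */
final class SqlAssertionSketch {

    private SqlAssertionSketch() {}

    /** Collapses runs of whitespace and drops one trailing semicolon. */
    static String normalize(String sql) {
        String collapsed = sql.trim().replaceAll("\\s+", " ");
        return collapsed.endsWith(";")
                ? collapsed.substring(0, collapsed.length() - 1).trim()
                : collapsed;
    }

    /** Throws AssertionError when the two statements differ after normalization. */
    static void assertSameSql(String expected, String actual) {
        if (!normalize(expected).equals(normalize(actual))) {
            throw new AssertionError(
                    "Expected SQL <" + expected + "> but got <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Stand-in for the string an operator would generate; the names are placeholders.
        String generated = "SELECT * FROM A  JOIN B ON A.ID1 = B.ID2;";
        assertSameSql("SELECT * FROM A JOIN B ON A.ID1 = B.ID2", generated);
        System.out.println("SQL matches");
    }
}
```

The comparison stays case-sensitive on purpose, since identifier case can matter to the target database.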

@mohit-devlogs
Contributor Author

> I think there is a deeper issue with the JDBC executor, so it would be enough if your test just checks that the right SQL statement is created, just like JdbcJoinOperatorTest does.

Thanks for the clarification. I understand that the test should only verify SQL generation, similar to JdbcJoinOperatorTest, rather than execute the query, and I am adjusting the test accordingly. While doing so, I also noticed the issue in the JDBC executor with stages that have multiple sources, which affects JOIN operations. I am investigating whether it needs to be addressed here or whether the test should simply focus on SQL generation as suggested.

I will update the test and push the changes shortly.

@mohit-devlogs
Contributor Author

I have updated the tests and pushed the changes. The GenericJdbcJoinOperatorTest now verifies SQL generation correctly, and I also adjusted the GenericJdbcFilterOperatorTest to include the downstream SqlToStreamOperator stage so that the JdbcExecutor can build the SQL query pipeline properly.

Both tests pass locally in the wayang-generic-jdbc module.

Please let me know if any further changes are required.

Development

Successfully merging this pull request may close these issues.

Join operator in jdbc-generic platform