Skip to content

Fix supportsBatchUpdates() to honor EnableBatchedInserts#1496

Open
sreekanth-db wants to merge 1 commit into
databricks:mainfrom
sreekanth-db:fix/supports-batch-updates
Open

Fix supportsBatchUpdates() to honor EnableBatchedInserts#1496
sreekanth-db wants to merge 1 commit into
databricks:mainfrom
sreekanth-db:fix/supports-batch-updates

Conversation

@sreekanth-db

Copy link
Copy Markdown
Collaborator

Summary

DatabricksDatabaseMetaData.supportsBatchUpdates() was hard-coded to return false. Batch-aware JDBC clients (e.g. Apache Hop) call this method before batching; when it returns false they never call executeBatch() and instead fall back to one executeUpdate() per row. As a result, the multi-row INSERT optimization gated by EnableBatchedInserts (added in #944) never runs — even when the customer explicitly sets EnableBatchedInserts=1.

This PR makes supportsBatchUpdates() return true when EnableBatchedInserts=1, so batch-aware clients take the optimized executeBatch() path.

Fixes ES-1963495.

Why gate on EnableBatchedInserts rather than return true unconditionally

  • Zero change for existing users. With the flag off (the default), the method still returns false, so default behavior is untouched. This preserves the deliberate "safe rollout" posture chosen in Implement multi-row INSERT batching for PreparedStatement #944.
  • Advertises support exactly when batching actually engages. With the flag on, executeBatch() routes INSERTs through the optimized multi-row path; the metadata answer now matches real behavior.
  • Consistent with existing code. supportsTransactions() in the same class is already derived from a connection parameter (IgnoreTransactions), so a context-driven capability method is an established pattern here.

Spec / conformance notes

  • The JDBC spec states batch support is optional, but if a driver supports it, supportsBatchUpdates() must return true. The driver does implement a working executeBatch() (it handles all DML types; only INSERTs are optimized, others fall back to per-row), so true is the conformant value.
  • The spec imposes no per-statement-type requirement and does not require atomicity — non-atomic, app-managed (auto-commit) batches are the expected model. INSERT-only optimization does not preclude returning true.
  • This matches MySQL Connector/J and PostgreSQL pgjdbc, which both return true while gating their multi-row INSERT rewrite behind a separate flag (rewriteBatchedStatements / reWriteBatchedInserts, both default-off).

Changes

  • DatabricksDatabaseMetaData.supportsBatchUpdates() returns isBatchedInsertsEnabled() instead of false.
  • Added test supportsBatchUpdates_returnsTrueWhenBatchedInsertsEnabled; renamed the existing case to supportsBatchUpdates_returnsFalseByDefault.
  • NEXT_CHANGELOG.md entry under ### Fixed.

Testing

mvn test -pl jdbc-core -Dtest=DatabricksDatabaseMetaDataTest → 249 run, 0 failures, 0 skipped (both new/updated cases included).

DatabricksDatabaseMetaData.supportsBatchUpdates() was hard-coded to
return false. Batch-aware JDBC clients (e.g. Apache Hop) call this
method before batching; seeing false, they never call executeBatch()
and fall back to one executeUpdate() per row, so the multi-row INSERT
optimization gated by EnableBatchedInserts never runs.

Return true when EnableBatchedInserts=1 so those clients use the
optimized executeBatch() path. Default behavior is unchanged (flag
off -> false). Mirrors the existing context-driven supportsTransactions().

Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant