Skip to content

Use async wait timeout for SEA when direct results disabled (fix hybrid-path truncation)#1476

Merged
sreekanth-db merged 2 commits into
mainfrom
repro/sea-dr-truncation
Jun 4, 2026
Merged

Use async wait timeout for SEA when direct results disabled (fix hybrid-path truncation)#1476
sreekanth-db merged 2 commits into
mainfrom
repro/sea-dr-truncation

Conversation

@sreekanth-db

@sreekanth-db sreekanth-db commented Jun 1, 2026

Copy link
Copy Markdown
Collaborator

Description

When direct results are disabled on the SEA (SQL Execution API) path, the driver
built the ExecuteStatement request with waitTimeout=10s (SYNC_TIMEOUT_VALUE)
and onWaitTimeout=CONTINUE — the old SEA hybrid direct-results path.

For results that span multiple chunks but compress small (highly-compressible
payloads), the server returns only the first chunk inline with no external
links
(hasInlineAttachment=true, numExternalLinks=0). The driver's inline path
returns just that first chunk and never fetches the rest, so the result is
silently truncated.

This change sets waitTimeout=0s (ASYNC_TIMEOUT_VALUE) when direct results are
disabled, avoiding the hybrid inline path. The server then delivers results via
external links, which the driver downloads in full.

Resulting contract (SEA):

  • DirectResults = 0WaitTimeout = 0 (async)
  • DirectResults = 1WaitTimeout unset (true direct results)

Thrift is unaffected — it has no waitTimeout and paginates correctly.

Related: ES-1714092 (server-side compressed-vs-uncompressed byte limit on the
hybrid path). This driver change stops using that path; the server-side fix is
tracked separately.

Requirement / Motivation

Reported via Slack: SEA with direct results disabled silently truncated a large,
highly-compressible result (200MB logical → ~0.8MB compressed) to a handful of
rows, while Thrift returned all rows.

Testing done

Repro (manual, against a serverless SQL warehouse) — query
SELECT repeat('A', 1024 * 1024) AS payload FROM range(200) with direct results
disabled:

Before After
SEA 20 / 200 rows, isCloudFetchUsed=false (inline) ❌ 200 / 200 rows, isCloudFetchUsed=true (external links) ✅
Thrift 200 / 200 ✅ 200 / 200 ✅

Additional disambiguation runs (repeating the real table via UNION ALL up to 64×,
9.6M rows, 40s execution) confirmed the trigger is the inline-vs-external-links
delivery (compressed size)
— not query time or polling: a fast no-poll query
truncated while a slow 6-poll query did not.

Regression — SEA unit + fakeservice suites (all pass, 0 failures):

  • DatabricksSdkClientTest (42)
  • SqlExecApiHybridResultsIntegrationTests (2)
  • DatabricksMetadataQueryClientTest (57)
  • CommandBuilderTest (21)
  • DatabricksEmptyMetadataClientTest (11)
  • SeaCircuitBreakerManagerTest (13)

Notes / caveats

  • Latency: with direct results disabled, queries now always
    execute → poll → fetch (one extra round-trip for small results) instead of
    returning inline on the first response.
  • The underlying server-side bug (ES-1714092) is separate; this change avoids the
    affected path rather than fixing the server.

When direct results are disabled, the SEA execute request used
waitTimeout=10s (the old hybrid path) with onWaitTimeout=CONTINUE. For
results that span multiple chunks but compress small (highly-compressible
payloads), the server returns only the first chunk inline with no external
links, and the driver returns just that chunk -- silently truncating the
result.

Switch to waitTimeout=0s (async) when direct results are disabled so the
server delivers results via external links, which the driver fetches in
full. This matches the intended contract: DirectResults=0 -> WaitTimeout=0
(async); DirectResults=1 -> WaitTimeout unset. Thrift is unaffected (no
waitTimeout; it paginates).

Related: ES-1714092

Co-authored-by: Isaac
Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
if (executeAsync) {
request.setWaitTimeout(ASYNC_TIMEOUT_VALUE);
} else {
// Only set timeout if direct results mode is not enabled

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we add a test?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the unit tests

request.setWaitTimeout(SYNC_TIMEOUT_VALUE);
request.setWaitTimeout(ASYNC_TIMEOUT_VALUE);
}
request.setOnWaitTimeout(ExecuteStatementRequestOnWaitTimeout.CONTINUE);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed for async case?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pre existing line - unchanged by this PR. It might not be required for async mode, but its better to be explicit than missing it out for some edge case (in any future changes)

Asserts getRequest() sets waitTimeout=0s when direct results are disabled
and leaves it unset when enabled. Addresses review feedback on #1476.

Co-authored-by: Isaac
Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
@sreekanth-db sreekanth-db requested a review from gopalldb June 3, 2026 08:18
@sreekanth-db sreekanth-db merged commit a8f94f3 into main Jun 4, 2026
21 checks passed
@sreekanth-db sreekanth-db deleted the repro/sea-dr-truncation branch June 4, 2026 07:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants