
Fix BladeBridge crash on large DataStage XML files (>100MB) #2364

Open
xsergiolpx wants to merge 1 commit into databrickslabs:main from xsergiolpx:fix/bladebridge-large-xml

Conversation

@xsergiolpx

Summary

  • Add patch script and documentation for BladeBridge crash on large DataStage XML exports
  • The dbxconv binary receives the output directory as a relative path via the -n flag. For large XML files with hundreds of jobs, the binary recursively nests that directory name (transpiled/transpiled/transpiled/...) until the path exceeds the OS limit (OSError: [Errno 63] File name too long)
  • Fix: pass an absolute path to -n instead of a relative path
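The difference between the two values passed to -n can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the function names, the `/tmp/bb_work` layout, and the variable names are hypothetical, and only the two `pathlib` expressions are taken from this PR description.

```python
from pathlib import Path


def output_arg_relative(transpiled_dir: Path, workdir: Path) -> str:
    """Pre-patch behaviour: the value for -n is relative to workdir."""
    return str(transpiled_dir.relative_to(workdir))


def output_arg_absolute(transpiled_dir: Path) -> str:
    """Patched behaviour: the value for -n is an absolute path."""
    return str(transpiled_dir.absolute())


# Hypothetical layout, for illustration only.
workdir = Path("/tmp/bb_work")
transpiled_dir = workdir / "transpiled"

relative_value = output_arg_relative(transpiled_dir, workdir)
absolute_value = output_arg_absolute(transpiled_dir)
```

With the relative form the binary only sees the bare directory name and has to decide what to resolve it against; with the absolute form there is nothing left to resolve.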

Files Added

  • scripts/patch_bladebridge_large_xml.sh: Idempotent patch script that modifies the installed BladeBridge plugin (transpiler.py line 203: str(transpiled_dir.relative_to(workdir)) → str(transpiled_dir.absolute()))
  • docs/bladebridge_large_xml_fix.md: Root cause analysis, manual fix instructions, and validation results
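The idempotent, backup-first behaviour of the patch can be sketched in Python. This is not the shell script shipped in this PR: `patch_file`, the `.bak` suffix, and the returned status strings are assumptions made for illustration, and the sketch matches on the expression text rather than on line 203.

```python
import shutil
from pathlib import Path

OLD = "str(transpiled_dir.relative_to(workdir))"
NEW = "str(transpiled_dir.absolute())"


def patch_file(target: Path) -> str:
    """Patch target in place.

    Returns 'patched', 'already patched', or 'not found'.
    """
    source = target.read_text(encoding="utf-8")
    if OLD not in source:
        # Either the patch was applied earlier or the plugin source changed.
        return "already patched" if NEW in source else "not found"
    backup = target.with_name(target.name + ".bak")  # assumed backup name
    if not backup.exists():
        shutil.copy2(target, backup)  # keep a copy of the unmodified file
    target.write_text(source.replace(OLD, NEW), encoding="utf-8")
    return "patched"
```

Running `patch_file` a second time on the same file returns `'already patched'` and leaves both the file and the backup untouched, which is the property the test plan checks for the real script.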

Why a patch script (not a direct code change)

The fix is in databricks/labs/bladebridge/transpiler.py, which lives in the BladeBridge plugin (databricks-bb-plugin PyPI package, source: databrickslabs/bladerunner). Since that repo is private and the plugin is distributed as a compiled wheel, this PR provides a post-install patch script. The proper fix should be applied in the bladerunner repo.

Root cause

The dbxconv binary receives the output directory via -n as a relative path (e.g., -n transpiled). For large XML files, the binary resolves this path relative to its own output on every internal write, creating recursive nesting. We verified this is not a name collision — renaming the directory to bb_output produced bb_output/bb_output/bb_output/... instead. Passing an absolute path prevents the recursion entirely.
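A toy model of this behaviour, assuming the binary re-resolves the -n value against its previous output directory on each write. The internals of dbxconv are not visible, so the loop below only mimics the observed symptom; `writes_until_too_long`, the `/tmp/bb_work` prefix, and the use of macOS's 1024-byte `PATH_MAX` are assumptions of this sketch.

```python
from pathlib import PurePosixPath

PATH_MAX = 1024  # macOS limit on a full path, in bytes; assumed for this sketch


def writes_until_too_long(start: PurePosixPath, n_arg: str, max_writes: int = 1000):
    """Toy model: each write re-resolves n_arg against the previous output dir.

    Returns the number of writes after which the path exceeds PATH_MAX,
    or None if that never happens within max_writes.
    """
    current = start
    for write in range(1, max_writes + 1):
        # Joining with an absolute path discards the left-hand side, so an
        # absolute n_arg always resolves to the same directory.
        current = current / n_arg
        if len(str(current)) > PATH_MAX:
            return write
    return None


workdir = PurePosixPath("/tmp/bb_work")  # hypothetical working directory
relative_result = writes_until_too_long(workdir, "transpiled")
absolute_result = writes_until_too_long(workdir, "/tmp/bb_work/transpiled")
```

In this model the relative value grows by one `transpiled/` segment per write and overflows after roughly ninety levels with the prefix used here, while the absolute value never grows and `absolute_result` stays `None`. The exact depth at which a real run fails depends on the length of the working directory path.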

Validation

Tested on a DataStage XML export (119 MB, 2.2M lines, DataStage 11.5):

| Metric | Without patch | With patch |
| --- | --- | --- |
| Result | Crashes after ~69 min | Completes in ~3 hours |
| Output files | 0 | 425 (379 notebooks + 46 workflow JSONs) |
| Recursive nesting | 87+ levels | 0 |
| Errors | Fatal OSError: [Errno 63] | 0 |

Test plan

  • Applied patch on macOS (Apple Silicon, Darwin 25.3.0)
  • Ran BladeBridge transpile on 119 MB DataStage XML → 425 files, 0 errors
  • Verified patch script is idempotent (re-run detects already patched)
  • Verified patch script creates backup before modifying

Resolves #2097

Add patch script and documentation for issue databrickslabs#2097.

The BladeBridge plugin's dbxconv binary receives the output directory
as a relative path via the -n flag. For large XML files with hundreds
of jobs, the binary recursively nests that directory name, creating
transpiled/transpiled/transpiled/... until the path exceeds the OS
limit (errno 63: File name too long).

Fix: pass an absolute path to -n instead of a relative path.

Validated on a 119 MB DataStage XML export (2.2M lines):
- Without patch: crashes after ~69 min
- With patch: completes successfully, 425 files generated, 0 errors

Relates to databrickslabs#2097

Co-authored-by: Isaac
@xsergiolpx xsergiolpx requested a review from a team as a code owner April 6, 2026 13:32

Development

Successfully merging this pull request may close these issues.

[BUG]: Issue converting code from files >100 MB
