
[BUG]: Issue converting code from files >100 MB #2097

@xsergiolpx

Is there an existing issue for this?

  • I have searched the existing issues

Category of Bug / Issue

Converter bug

Current Behavior

Hi,

Running the code conversion on a 125 MB DataStage XML file crashes. After talking with Databricks, it turns out that there are known issues with files larger than 100 MB.

This is a blocker for large migrations. Would it be possible to add support for files > 100MB?
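
As a possible stopgap until large files are supported, the export could be split into smaller XML files before conversion. The following is a minimal sketch, not part of Lakebridge or Bladebridge. It assumes that the direct children of the root element (for example one element per job) can be converted independently, and that elements named in `shared_tags` (for example a header) appear before the other children and should be copied into every part. The function name `split_xml` and its parameters are hypothetical, and whether the converter accepts the resulting parts has not been verified.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from pathlib import Path


def split_xml(src: str, out_dir: str, max_bytes: int,
              shared_tags: tuple[str, ...] = ()) -> list[Path]:
    """Split the direct children of the root element into several smaller files.

    Assumption: top-level children are independent of each other. Elements
    whose tag is listed in shared_tags are copied into every part.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    shared: list[ET.Element] = []  # elements repeated in every part
    chunk: list[ET.Element] = []   # elements waiting to be written
    chunk_size = 0
    root = None
    root_tag, root_attrib = "", {}
    depth = 0

    def flush() -> None:
        """Write the pending elements under a copy of the original root."""
        nonlocal chunk, chunk_size
        if not chunk:
            return
        new_root = ET.Element(root_tag, root_attrib)
        new_root.extend(shared)
        new_root.extend(chunk)
        path = out / f"{Path(src).stem}_part{len(written) + 1:03d}.xml"
        ET.ElementTree(new_root).write(str(path), encoding="utf-8", xml_declaration=True)
        written.append(path)
        chunk, chunk_size = [], 0

    # Stream the file so the whole document is never held in memory at once.
    for event, elem in ET.iterparse(src, events=("start", "end")):
        if event == "start":
            depth += 1
            if depth == 1:
                root, root_tag, root_attrib = elem, elem.tag, dict(elem.attrib)
            continue
        if depth == 2:  # a direct child of the root has just been closed
            size = len(ET.tostring(elem, encoding="utf-8"))  # approximate size
            root.remove(elem)  # detach so the parsed tree does not keep growing
            if elem.tag in shared_tags:
                shared.append(elem)
            else:
                if chunk and chunk_size + size > max_bytes:
                    flush()
                chunk.append(elem)
                chunk_size += size
        depth -= 1
    flush()
    return written
```

A call such as `split_xml("FinancialCredit.xml", "parts", max_bytes=50 * 1024 * 1024, shared_tags=("Header",))` would write `FinancialCredit_part001.xml`, `FinancialCredit_part002.xml`, and so on, and each part could then be passed to `--input-source` separately. The tag name `Header` and the 50 MB threshold are illustrative guesses. A single child larger than `max_bytes` still ends up in a part of its own.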

Expected Behavior

The code is converted correctly.

Steps To Reproduce

databricks labs lakebridge transpile datastage --input-source /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml --output-folder /Users/sergio.ballesteros/Code/lakebridge/output/ --debug

Relevant log output or Exception details

sergio.ballesteros@VKMGW54Q6V lakebridge % databricks labs lakebridge transpile datastage --input-source /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml --output-folder /Users/sergio.ballesteros/Code/lakebridge/output/ --debug
10:30:13 Info: start pid=5143 version=0.271.0 args="databricks, labs, lakebridge, transpile, datastage, --input-source, /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml, --output-folder, /Users/sergio.ballesteros/Code/lakebridge/output/, --debug"
10:30:13 Debug: Fetching latest releases for databrickslabs/lakebridge from GitHub API pid=5143
10:30:13 Debug: Loading installed version info from: /Users/sergio.ballesteros/.databricks/labs/lakebridge/state/version.json pid=5143
10:30:13 Debug: Loading login configuration from: /Users/sergio.ballesteros/.databricks/labs/lakebridge/config/login.json pid=5143
10:30:13 Debug: Using workspace-level login profile: DEFAULT pid=5143
10:30:13 Debug: Loading DEFAULT profile from /Users/sergio.ballesteros/.databrickscfg pid=5143 sdk=true
10:30:13 Debug: Resolved login: Config: host=https://adb-2690017451936431.11.azuredatabricks.net, token=***, profile=DEFAULT, config_file=/Users/sergio.ballesteros/.databrickscfg pid=5143 sdk=true
10:30:13 Debug: Passing down environment variables: DATABRICKS_HOST, DATABRICKS_TOKEN pid=5143
10:30:13 Debug: Forwarding subprocess: /Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/bin/python3 /Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/cli.py {"command":"transpile","flags":{"catalog-name":"","error-file-path":"","input-source":"/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml","log_level":"debug","output-folder":"/Users/sergio.ballesteros/Code/lakebridge/output/","schema-name":"","skip-validation":"true","source-dialect":"","transpiler-config-path":""},"output_type":""} pid=5143
10:30:13 Debug: starting: /Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/bin/python3 /Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/cli.py {"command":"transpile","flags":{"catalog-name":"","error-file-path":"","input-source":"/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml","log_level":"debug","output-folder":"/Users/sergio.ballesteros/Code/lakebridge/output/","schema-name":"","skip-validation":"true","source-dialect":"","transpiler-config-path":""},"output_type":""} pid=5143
10:30:16    DEBUG [d.labs.lakebridge] Leaving DATABRICKS_HOST as-is: https://adb-2690017451936431.11.azuredatabricks.net
10:30:16    DEBUG [databricks.sdk] Loaded from environment
10:30:16    DEBUG [databricks.sdk] Attempting to configure auth: pat
10:30:17    DEBUG [databricks.sdk] GET /api/2.0/preview/scim/v2/Me
< 200 OK
< {
<   "active": true,
<   "displayName": "Sergio Ballesteros Solanas",
<   "emails": [
<     {
<       "primary": true,
<       "type": "work",
<       "value": "**REDACTED**"
<     }
<   ],
<   "entitlements": [
<     {
<       "value": "**REDACTED**"
<     },
<     "... (1 additional elements)"
<   ],
<   "externalId": "4601b55a-daa0-4daa-aabc-ab7580d6a240",
<   "groups": [
<     {
<       "$ref": "Groups/1025253049853392",
<       "display": "admins",
<       "type": "direct",
<       "value": "**REDACTED**"
<     }
<   ],
<   "id": "7978958767567807",
<   "name": {
<     "familyName": "Solanas",
<     "givenName": "Sergio Ballesteros"
<   },
<   "schemas": [
<     "urn:ietf:params:scim:schemas:core:2.0:User",
<     "... (1 additional elements)"
<   ],
<   "userName": "sergio.ballesteros@databricks.com"
< }
10:30:17    DEBUG [d.l.blueprint.installation] Loading TranspileConfig from config.yml
10:30:18    DEBUG [databricks.sdk] GET /api/2.0/workspace/export?path=/Users/sergio.ballesteros@databricks.com/.lakebridge/config.yml&direct_download=true
< 200 OK
< [raw stream]
10:30:18    DEBUG [d.labs.lakebridge] Preconfigured transpiler config: TranspileConfig(transpiler_config_path='/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml', source_dialect='datastage', input_source=None, output_folder='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/transpiled', error_file_path='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/errors.log', sdk_config=None, skip_validation=True, catalog_name='remorph', schema_name='transpiler', transpiler_options={'overrides-file': None, 'target-tech': 'SPARKSQL'})
10:30:18    DEBUG [d.l.l.contexts.application] Added User-Agent extra cmd=execute-transpile
10:30:18    DEBUG [d.labs.lakebridge] Setting input_source to: '/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml'
10:30:18    DEBUG [d.labs.lakebridge] Setting output_folder to: '/Users/sergio.ballesteros/Code/lakebridge/output/'
10:30:18    DEBUG [d.labs.lakebridge] Setting skip_validation to: True
10:30:18    DEBUG [d.labs.lakebridge] Checking config: TranspileConfig(transpiler_config_path='/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml', source_dialect='datastage', input_source='/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml', output_folder='/Users/sergio.ballesteros/Code/lakebridge/output/', error_file_path='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/errors.log', sdk_config=None, skip_validation=True, catalog_name='remorph', schema_name='transpiler', transpiler_options={'overrides-file': None, 'target-tech': 'SPARKSQL'})
10:30:18    DEBUG [d.labs.lakebridge] Using configured source_dialect: 'datastage'
10:30:18    DEBUG [d.labs.lakebridge] Validated config: TranspileConfig(transpiler_config_path='/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml', source_dialect='datastage', input_source='/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml', output_folder='/Users/sergio.ballesteros/Code/lakebridge/output/', error_file_path='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/errors.log', sdk_config=None, skip_validation=True, catalog_name='remorph', schema_name='transpiler', transpiler_options={'overrides-file': None, 'target-tech': 'SPARKSQL'})
10:30:18    DEBUG [d.labs.lakebridge] Final configuration for transpilation: TranspileConfig(transpiler_config_path='/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml', source_dialect='datastage', input_source='/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml', output_folder='/Users/sergio.ballesteros/Code/lakebridge/output/', error_file_path='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/errors.log', sdk_config=None, skip_validation=True, catalog_name='remorph', schema_name='transpiler', transpiler_options={'overrides-file': None, 'target-tech': 'SPARKSQL'})
10:30:18    DEBUG [d.l.l.contexts.application] Added User-Agent extra transpiler_source_tech=datastage
10:30:18    DEBUG [d.l.l.contexts.application] Added User-Agent extra transpiler_plugin_name=Bladebridge
10:30:19    DEBUG [databricks.sdk] GET /api/2.0/preview/scim/v2/Me
< 200 OK
< {
<   "active": true,
<   "displayName": "Sergio Ballesteros Solanas",
<   "emails": [
<     {
<       "primary": true,
<       "type": "work",
<       "value": "**REDACTED**"
<     }
<   ],
<   "entitlements": [
<     {
<       "value": "**REDACTED**"
<     },
<     "... (1 additional elements)"
<   ],
<   "externalId": "4601b55a-daa0-4daa-aabc-ab7580d6a240",
<   "groups": [
<     {
<       "$ref": "Groups/1025253049853392",
<       "display": "admins",
<       "type": "direct",
<       "value": "**REDACTED**"
<     }
<   ],
<   "id": "7978958767567807",
<   "name": {
<     "familyName": "Solanas",
<     "givenName": "Sergio Ballesteros"
<   },
<   "schemas": [
<     "urn:ietf:params:scim:schemas:core:2.0:User",
<     "... (1 additional elements)"
<   ],
<   "userName": "sergio.ballesteros@databricks.com"
< }
10:30:19    DEBUG [d.labs.lakebridge] User: User(active=True, display_name='Sergio Ballesteros Solanas', emails=[ComplexValue(display=None, primary=True, ref=None, type='work', value='sergio.ballesteros@databricks.com')], entitlements=[ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-cluster-create'), ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-instance-pool-create')], external_id='4601b55a-daa0-4daa-aabc-ab7580d6a240', groups=[ComplexValue(display='admins', primary=None, ref='Groups/1025253049853392', type='direct', value='1025253049853392')], id='7978958767567807', name=Name(family_name='Solanas', given_name='Sergio Ballesteros'), roles=[], schemas=[<UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_USER: 'urn:ietf:params:scim:schemas:core:2.0:User'>, <UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_EXTENSION_WORKSPACE_2_0_USER: 'urn:ietf:params:scim:schemas:extension:workspace:2.0:User'>], user_name='sergio.ballesteros@databricks.com')
10:30:19    DEBUG [d.l.l.contexts.application] Added User-Agent extra cmd=execute-transpile
10:30:19    DEBUG [d.labs.lakebridge] User: User(active=True, display_name='Sergio Ballesteros Solanas', emails=[ComplexValue(display=None, primary=True, ref=None, type='work', value='sergio.ballesteros@databricks.com')], entitlements=[ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-cluster-create'), ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-instance-pool-create')], external_id='4601b55a-daa0-4daa-aabc-ab7580d6a240', groups=[ComplexValue(display='admins', primary=None, ref='Groups/1025253049853392', type='direct', value='1025253049853392')], id='7978958767567807', name=Name(family_name='Solanas', given_name='Sergio Ballesteros'), roles=[], schemas=[<UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_USER: 'urn:ietf:params:scim:schemas:core:2.0:User'>, <UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_EXTENSION_WORKSPACE_2_0_USER: 'urn:ietf:params:scim:schemas:extension:workspace:2.0:User'>], user_name='sergio.ballesteros@databricks.com')
10:30:19    DEBUG [d.l.l.t.lsp.lsp_engine] Detected virtual environment to use at: /Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/.venv
10:30:19    DEBUG [d.l.l.t.lsp.lsp_engine] Using PATH for launching LSP server: /Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/.venv/bin:/Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/bin:/opt/homebrew/opt/openjdk@17/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Applications/VMware Fusion.app/Contents/Public:/Applications/iTerm.app/Contents/Resources/utilities
10:30:19    DEBUG [d.l.l.t.lsp.lsp_engine] Starting LSP engine: /Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/.venv/bin/python3 ['-m', 'databricks.labs.bladebridge.server', '--log_level=DEBUG'] (cwd=/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib)
10:30:19    DEBUG [d.l.l.t.lsp.lsp_engine] LSP init params: InitializeParams(capabilities=ClientCapabilities(workspace=None, text_document=None, notebook_document=None, window=None, general=None, experimental=None), process_id=5145, client_info=ClientInfo(name='lakebridge', version='0.10.11'), locale=None, root_path=None, root_uri='file:///Users/sergio.ballesteros/Code/lakebridge/input', initialization_options={'remorph': {'source-dialect': 'datastage'}, 'options': {'overrides-file': None, 'target-tech': 'SPARKSQL'}, 'custom': {}}, trace=None, work_done_token=None, workspace_folders=None)
10:30:20    DEBUG [d.l.l.t.lsp.lsp_engine] Registered capability: document/transpileToDatabricks
10:30:20    DEBUG [d.l.l.transpiler.execute] Starting to process input file: /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml
10:30:20     INFO [d.l.l.transpiler.execute] Transpiling file: /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml
10:30:20    DEBUG [d.l.l.transpiler.execute] Started processing file: /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml
10:30:20    DEBUG [d.l.blueprint.paths] XML declaration detected, sniffing further with encoding: us-ascii
10:30:20    DEBUG [d.l.blueprint.paths] XML declaration encoding detected: UTF-8
^C
sergio.ballesteros@VKMGW54Q6V lakebridge % Traceback (most recent call last):
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/cli.py", line 719, in <module>
    lakebridge()
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/cli.py", line 187, in __call__
    run_main(self._route)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/entrypoint.py", line 35, in run_main
    main(*sys.argv[1:])
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/cli.py", line 118, in _route
    cmd.fn(**kwargs)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/cli.py", line 126, in transpile
    result = asyncio.run(_transpile(ctx, config, engine))
  File "/opt/homebrew/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/homebrew/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/opt/homebrew/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/opt/homebrew/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 1871, in _run_once
    event_list = self._selector.select(timeout)
  File "/opt/homebrew/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/selectors.py", line 562, in select
    kev_list = self._selector.control(None, max_ev, timeout)
KeyboardInterrupt

sergio.ballesteros@VKMGW54Q6V lakebridge %
sergio.ballesteros@VKMGW54Q6V lakebridge %
sergio.ballesteros@VKMGW54Q6V lakebridge % databricks labs lakebridge transpile datastage --input-source /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml --output-folder /Users/sergio.ballesteros/Code/lakebridge/output/ --debug > logs.txt
10:30:36 Info: start pid=5174 version=0.271.0 args="databricks, labs, lakebridge, transpile, datastage, --input-source, /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml, --output-folder, /Users/sergio.ballesteros/Code/lakebridge/output/, --debug"
10:30:36 Debug: Loading installed version info from: /Users/sergio.ballesteros/.databricks/labs/lakebridge/state/version.json pid=5174
10:30:36 Debug: Loading login configuration from: /Users/sergio.ballesteros/.databricks/labs/lakebridge/config/login.json pid=5174
10:30:36 Debug: Using workspace-level login profile: DEFAULT pid=5174
10:30:36 Debug: Loading DEFAULT profile from /Users/sergio.ballesteros/.databrickscfg pid=5174 sdk=true
10:30:36 Debug: Resolved login: Config: host=https://adb-2690017451936431.11.azuredatabricks.net, token=***, profile=DEFAULT, config_file=/Users/sergio.ballesteros/.databrickscfg pid=5174 sdk=true
10:30:36 Debug: Passing down environment variables: DATABRICKS_HOST, DATABRICKS_TOKEN pid=5174
10:30:36 Debug: Forwarding subprocess: /Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/bin/python3 /Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/cli.py {"command":"transpile","flags":{"catalog-name":"","error-file-path":"","input-source":"/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml","log_level":"debug","output-folder":"/Users/sergio.ballesteros/Code/lakebridge/output/","schema-name":"","skip-validation":"true","source-dialect":"","transpiler-config-path":""},"output_type":""} pid=5174
10:30:36 Debug: starting: /Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/bin/python3 /Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/cli.py {"command":"transpile","flags":{"catalog-name":"","error-file-path":"","input-source":"/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml","log_level":"debug","output-folder":"/Users/sergio.ballesteros/Code/lakebridge/output/","schema-name":"","skip-validation":"true","source-dialect":"","transpiler-config-path":""},"output_type":""} pid=5174
10:30:37    DEBUG [d.labs.lakebridge] Leaving DATABRICKS_HOST as-is: https://adb-2690017451936431.11.azuredatabricks.net
10:30:37    DEBUG [databricks.sdk] Loaded from environment
10:30:37    DEBUG [databricks.sdk] Attempting to configure auth: pat
10:30:38    DEBUG [databricks.sdk] GET /api/2.0/preview/scim/v2/Me
< 200 OK
< {
<   "active": true,
<   "displayName": "Sergio Ballesteros Solanas",
<   "emails": [
<     {
<       "primary": true,
<       "type": "work",
<       "value": "**REDACTED**"
<     }
<   ],
<   "entitlements": [
<     {
<       "value": "**REDACTED**"
<     },
<     "... (1 additional elements)"
<   ],
<   "externalId": "4601b55a-daa0-4daa-aabc-ab7580d6a240",
<   "groups": [
<     {
<       "$ref": "Groups/1025253049853392",
<       "display": "admins",
<       "type": "direct",
<       "value": "**REDACTED**"
<     }
<   ],
<   "id": "7978958767567807",
<   "name": {
<     "familyName": "Solanas",
<     "givenName": "Sergio Ballesteros"
<   },
<   "schemas": [
<     "urn:ietf:params:scim:schemas:core:2.0:User",
<     "... (1 additional elements)"
<   ],
<   "userName": "sergio.ballesteros@databricks.com"
< }
10:30:38    DEBUG [d.l.blueprint.installation] Loading TranspileConfig from config.yml
10:30:39    DEBUG [databricks.sdk] GET /api/2.0/workspace/export?path=/Users/sergio.ballesteros@databricks.com/.lakebridge/config.yml&direct_download=true
< 200 OK
< [raw stream]
10:30:39    DEBUG [d.labs.lakebridge] Preconfigured transpiler config: TranspileConfig(transpiler_config_path='/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml', source_dialect='datastage', input_source=None, output_folder='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/transpiled', error_file_path='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/errors.log', sdk_config=None, skip_validation=True, catalog_name='remorph', schema_name='transpiler', transpiler_options={'overrides-file': None, 'target-tech': 'SPARKSQL'})
10:30:39    DEBUG [d.l.l.contexts.application] Added User-Agent extra cmd=execute-transpile
10:30:39    DEBUG [d.labs.lakebridge] Setting input_source to: '/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml'
10:30:39    DEBUG [d.labs.lakebridge] Setting output_folder to: '/Users/sergio.ballesteros/Code/lakebridge/output/'
10:30:39    DEBUG [d.labs.lakebridge] Setting skip_validation to: True
10:30:39    DEBUG [d.labs.lakebridge] Checking config: TranspileConfig(transpiler_config_path='/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml', source_dialect='datastage', input_source='/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml', output_folder='/Users/sergio.ballesteros/Code/lakebridge/output/', error_file_path='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/errors.log', sdk_config=None, skip_validation=True, catalog_name='remorph', schema_name='transpiler', transpiler_options={'overrides-file': None, 'target-tech': 'SPARKSQL'})
10:30:39    DEBUG [d.labs.lakebridge] Using configured source_dialect: 'datastage'
10:30:39    DEBUG [d.labs.lakebridge] Validated config: TranspileConfig(transpiler_config_path='/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml', source_dialect='datastage', input_source='/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml', output_folder='/Users/sergio.ballesteros/Code/lakebridge/output/', error_file_path='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/errors.log', sdk_config=None, skip_validation=True, catalog_name='remorph', schema_name='transpiler', transpiler_options={'overrides-file': None, 'target-tech': 'SPARKSQL'})
10:30:39    DEBUG [d.labs.lakebridge] Final configuration for transpilation: TranspileConfig(transpiler_config_path='/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml', source_dialect='datastage', input_source='/Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml', output_folder='/Users/sergio.ballesteros/Code/lakebridge/output/', error_file_path='/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/errors.log', sdk_config=None, skip_validation=True, catalog_name='remorph', schema_name='transpiler', transpiler_options={'overrides-file': None, 'target-tech': 'SPARKSQL'})
10:30:39    DEBUG [d.l.l.contexts.application] Added User-Agent extra transpiler_source_tech=datastage
10:30:39    DEBUG [d.l.l.contexts.application] Added User-Agent extra transpiler_plugin_name=Bladebridge
10:30:40    DEBUG [databricks.sdk] GET /api/2.0/preview/scim/v2/Me
< 200 OK
< {
<   "active": true,
<   "displayName": "Sergio Ballesteros Solanas",
<   "emails": [
<     {
<       "primary": true,
<       "type": "work",
<       "value": "**REDACTED**"
<     }
<   ],
<   "entitlements": [
<     {
<       "value": "**REDACTED**"
<     },
<     "... (1 additional elements)"
<   ],
<   "externalId": "4601b55a-daa0-4daa-aabc-ab7580d6a240",
<   "groups": [
<     {
<       "$ref": "Groups/1025253049853392",
<       "display": "admins",
<       "type": "direct",
<       "value": "**REDACTED**"
<     }
<   ],
<   "id": "7978958767567807",
<   "name": {
<     "familyName": "Solanas",
<     "givenName": "Sergio Ballesteros"
<   },
<   "schemas": [
<     "urn:ietf:params:scim:schemas:core:2.0:User",
<     "... (1 additional elements)"
<   ],
<   "userName": "sergio.ballesteros@databricks.com"
< }
10:30:40    DEBUG [d.labs.lakebridge] User: User(active=True, display_name='Sergio Ballesteros Solanas', emails=[ComplexValue(display=None, primary=True, ref=None, type='work', value='sergio.ballesteros@databricks.com')], entitlements=[ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-cluster-create'), ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-instance-pool-create')], external_id='4601b55a-daa0-4daa-aabc-ab7580d6a240', groups=[ComplexValue(display='admins', primary=None, ref='Groups/1025253049853392', type='direct', value='1025253049853392')], id='7978958767567807', name=Name(family_name='Solanas', given_name='Sergio Ballesteros'), roles=[], schemas=[<UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_USER: 'urn:ietf:params:scim:schemas:core:2.0:User'>, <UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_EXTENSION_WORKSPACE_2_0_USER: 'urn:ietf:params:scim:schemas:extension:workspace:2.0:User'>], user_name='sergio.ballesteros@databricks.com')
10:30:40    DEBUG [d.l.l.contexts.application] Added User-Agent extra cmd=execute-transpile
10:30:40    DEBUG [d.labs.lakebridge] User: User(active=True, display_name='Sergio Ballesteros Solanas', emails=[ComplexValue(display=None, primary=True, ref=None, type='work', value='sergio.ballesteros@databricks.com')], entitlements=[ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-cluster-create'), ComplexValue(display=None, primary=None, ref=None, type=None, value='allow-instance-pool-create')], external_id='4601b55a-daa0-4daa-aabc-ab7580d6a240', groups=[ComplexValue(display='admins', primary=None, ref='Groups/1025253049853392', type='direct', value='1025253049853392')], id='7978958767567807', name=Name(family_name='Solanas', given_name='Sergio Ballesteros'), roles=[], schemas=[<UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_CORE_2_0_USER: 'urn:ietf:params:scim:schemas:core:2.0:User'>, <UserSchema.URN_IETF_PARAMS_SCIM_SCHEMAS_EXTENSION_WORKSPACE_2_0_USER: 'urn:ietf:params:scim:schemas:extension:workspace:2.0:User'>], user_name='sergio.ballesteros@databricks.com')
10:30:40    DEBUG [d.l.l.t.lsp.lsp_engine] Detected virtual environment to use at: /Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/.venv
10:30:40    DEBUG [d.l.l.t.lsp.lsp_engine] Using PATH for launching LSP server: /Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/.venv/bin:/Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/bin:/opt/homebrew/opt/openjdk@17/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Applications/VMware Fusion.app/Contents/Public:/Applications/iTerm.app/Contents/Resources/utilities
10:30:40    DEBUG [d.l.l.t.lsp.lsp_engine] Starting LSP engine: /Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib/.venv/bin/python3 ['-m', 'databricks.labs.bladebridge.server', '--log_level=DEBUG'] (cwd=/Users/sergio.ballesteros/.databricks/labs/remorph-transpilers/bladebridge/lib)
10:30:40    DEBUG [d.l.l.t.lsp.lsp_engine] LSP init params: InitializeParams(capabilities=ClientCapabilities(workspace=None, text_document=None, notebook_document=None, window=None, general=None, experimental=None), process_id=5175, client_info=ClientInfo(name='lakebridge', version='0.10.11'), locale=None, root_path=None, root_uri='file:///Users/sergio.ballesteros/Code/lakebridge/input', initialization_options={'remorph': {'source-dialect': 'datastage'}, 'options': {'overrides-file': None, 'target-tech': 'SPARKSQL'}, 'custom': {}}, trace=None, work_done_token=None, workspace_folders=None)
10:30:40    DEBUG [d.l.l.t.lsp.lsp_engine] Registered capability: document/transpileToDatabricks
10:30:40    DEBUG [d.l.l.transpiler.execute] Starting to process input file: /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml
10:30:40     INFO [d.l.l.transpiler.execute] Transpiling file: /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml
10:30:40    DEBUG [d.l.l.transpiler.execute] Started processing file: /Users/sergio.ballesteros/Code/lakebridge/input/FinancialCredit.xml
10:30:40    DEBUG [d.l.blueprint.paths] XML declaration detected, sniffing further with encoding: us-ascii
10:30:40    DEBUG [d.l.blueprint.paths] XML declaration encoding detected: UTF-8
11:30:03    ERROR [d.l.lakebridge.transpile] Failed to call transpile
Traceback (most recent call last):
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/cli.py", line 118, in _route
    cmd.fn(**kwargs)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/cli.py", line 126, in transpile
    result = asyncio.run(_transpile(ctx, config, engine))
  File "/opt/homebrew/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/homebrew/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/cli.py", line 492, in _transpile
    status, errors = await do_transpile(ctx.workspace_client, engine, config)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/transpiler/execute.py", line 336, in transpile
    status, errors = await _do_transpile(workspace_client, engine, config)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/transpiler/execute.py", line 368, in _do_transpile
    result = await _process_input_file(config, validator, engine)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/transpiler/execute.py", line 328, in _process_input_file
    no_of_sqls, error_list = await _process_one_file(context)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/transpiler/execute.py", line 107, in _process_one_file
    transpile_result = await _transpile(
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/transpiler/execute.py", line 417, in _transpile
    return await engine.transpile(from_dialect, to_dialect, source_code, input_path)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/transpiler/lsp/lsp_engine.py", line 590, in transpile
    response = await self.transpile_document(file_path)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/transpiler/lsp/lsp_engine.py", line 608, in transpile_document
    result = await self._client.transpile_document_async(params)
  File "/Users/sergio.ballesteros/.databricks/labs/lakebridge/lib/src/databricks/labs/lakebridge/transpiler/lsp/lsp_engine.py", line 298, in transpile_document_async
    return await self.protocol.send_request_async(TRANSPILE_TO_DATABRICKS_METHOD, params)
pygls.exceptions.JsonRpcInternalError: OSError: [Errno 24] Too many open files: '/var/folders/km/xwj6s6cn3cb_j60j0slv82xh0000gp/T/bladerunner__j0hdtqf/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/t
ranspiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled/transpiled'
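
The final error is `[Errno 24] Too many open files`, so it may help triage to know the open-file limit of the environment. The following is a minimal sketch using Python's standard `resource` module (Unix only); the helper names `open_file_limits` and `raise_soft_limit` are hypothetical and nothing here is part of Lakebridge. The limit is per process and is inherited by child processes, so to affect the converter it would have to be raised in the shell that launches the CLI (for example with `ulimit -n`). If the repeated `transpiled/` nesting in the path above is unbounded, a higher limit would probably only delay the error rather than avoid it.

```python
import resource


def open_file_limits() -> tuple[int, int]:
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target: int) -> int:
    """Raise the soft limit towards target, never above the hard limit.

    Returns the soft limit that is in effect afterwards. Only the current
    process and the children it starts later are affected.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open-file limit: soft={soft} hard={hard}")
```

Note that some systems cap the usable value below the reported hard limit, in which case `setrlimit` can raise `ValueError`.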

Logs Confirmation

  • I ran the command line with --debug
  • I have attached the lsp-server.log under USER_HOME/.databricks/labs/remorph-transpilers/<converter_name>/lib/lsp-server.log

Sample Query

Operating System

macOS

Version

latest via Databricks CLI
