@spbnick (Collaborator) commented Apr 17, 2025

Sorry, I couldn't finish this before leaving. Basically, this adds support for streaming loads into BigQuery via the Storage Write API.

Reference: https://cloud.google.com/bigquery/docs/write-api-batch

I never got loading to actually work.

Here's my testing procedure so far, briefly:

# Specifying service account credentials
export GOOGLE_APPLICATION_CREDENTIALS=~/.kernelci-staging-admin.json
# Making a partial deployment with "bq_stream" namespace
./cloud deploy 'kernelci-staging' bq_stream --test --smtp-mocked -v -s '@(psql|bigquery|secrets|iam)'
# Entering the shell with environment setup and PostgreSQL proxy
./cloud shell 'kernelci-staging' bq_stream --test --smtp-mocked
# Running basic DB tests
pytest --tb=native -vv -k 'test_db'
# The above fails in multiple ways
# Trying a simple load
kcidb-db-load -lDEBUG -d 'bigquery:kernelci-staging.bq_stream_kcidb_empty_test' < ../kcidb-trash/comprehensive.json

# Getting a lovely, very specific exception :D
2025-04-17 21:41:39,785:DEBUG:kcidb.misc:Traceback (most recent call last):
  File "/home/nkondras/.local/bin/kcidb-db-load", line 33, in <module>
    sys.exit(load_entry_point('kcidb', 'console_scripts', 'kcidb-db-load')())
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/nkondras/projects/github.com/kernelci/kcidb/kcidb/db/__init__.py", line 1010, in load_main
    client.load(data, with_metadata=args.with_metadata, copy=False)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nkondras/projects/github.com/kernelci/kcidb/kcidb/db/__init__.py", line 671, in load
    self.load_iter([data], with_metadata=with_metadata, copy=copy)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nkondras/projects/github.com/kernelci/kcidb/kcidb/db/__init__.py", line 646, in load_iter
    self.driver.load_iter(data_iter,
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
                          with_metadata=with_metadata, copy=copy)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nkondras/projects/github.com/kernelci/kcidb/kcidb/db/schematic.py", line 802, in load_iter
    self.schema.load_iter(data_iter,
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
                          with_metadata=with_metadata, copy=copy)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nkondras/projects/github.com/kernelci/kcidb/kcidb/db/bigquery/v04_00.py", line 1320, in load_iter
    futures.append(write_setup.manager.send(AppendRowsRequest(
                   ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
        proto_rows=AppendRowsRequest.ProtoData(rows=rows)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )))
    ^^
  File "/home/nkondras/.local/lib/python3.13/site-packages/google/cloud/bigquery_storage_v1/writer.py", line 167, in send
    return self._connection.send(request)
           ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/home/nkondras/.local/lib/python3.13/site-packages/google/cloud/bigquery_storage_v1/writer.py", line 425, in send
    return self.open(request)
           ~~~~~~~~~^^^^^^^^^
  File "/home/nkondras/.local/lib/python3.13/site-packages/google/cloud/bigquery_storage_v1/writer.py", line 297, in open
    return self._open(initial_request, timeout)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nkondras/.local/lib/python3.13/site-packages/google/cloud/bigquery_storage_v1/writer.py", line 380, in _open
    raise request_exception
google.api_core.exceptions.Unknown: None There was a problem opening the stream. Try turning on DEBUG level logs to see the error.
Unknown: None There was a problem opening the stream. Try turning on DEBUG level logs to see the error.

# Exiting the shell (kinda important)
exit
# Cleaning up - withdrawing the partial deployment
./cloud withdraw 'kernelci-staging' bq_stream --test --smtp-mocked -v
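
The exception asks for DEBUG-level logs, and it is unclear whether `-lDEBUG` reaches the library's own loggers (the traceback above is logged by `kcidb.misc`). A minimal sketch for forcing DEBUG on the storage library's logger tree, assuming its modules log under their module names (e.g. `google.cloud.bigquery_storage_v1.writer`); the `enable_bq_storage_debug` helper is hypothetical and not part of kcidb:

```python
import logging


def enable_bq_storage_debug(logger_name="google.cloud.bigquery_storage_v1"):
    """Set DEBUG on the (assumed) logger tree of the BigQuery Storage library."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    # Attach a stderr handler only if nothing would otherwise emit the records
    if not logger.handlers and not logging.getLogger().handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s:%(levelname)s:%(name)s:%(message)s"
        ))
        logger.addHandler(handler)
    return logger


bq_logger = enable_bq_storage_debug()
```

Calling this before the first `send()` should make child loggers inherit the DEBUG level, so whatever the library logs while opening the stream would show up.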

A likely reason for the failure is that I was bold and hopeful and passed parameters to all the myriad API classes via constructor arguments, instead of taking the very verbose approach of creating each object and setting its properties separately, as the sample in the reference above does 🙈 I was just trying to maintain my sanity. So the first thing to try would be to mimic exactly the way they create everything.

Anyway, hit me with questions if anyone ever picks this up 👍
