Skip to content

feat: add bigquery-alloydb-insights setup scripts and readme#1290

Merged
jeffonelson merged 1 commit into
GoogleCloudPlatform:mainfrom
jeffonelson:add-bq-alloydb-insights
Jun 5, 2026
Merged

feat: add bigquery-alloydb-insights setup scripts and readme#1290
jeffonelson merged 1 commit into
GoogleCloudPlatform:mainfrom
jeffonelson:add-bq-alloydb-insights

Conversation

@jeffonelson
Copy link
Copy Markdown
Contributor

No description provided.

@jeffonelson jeffonelson requested a review from Lsubatin June 5, 2026 00:15
@jeffonelson jeffonelson merged commit 3bc4ea3 into GoogleCloudPlatform:main Jun 5, 2026
5 checks passed
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds setup scripts and documentation for Lab 2 of the Informed AI Roadshow, which involves provisioning AlloyDB, configuring BigQuery datasets and connections, and seeding telemetry and thermal data. The review feedback focuses on improving the robustness of the shell scripts, including enabling pipefail to catch pipeline failures, validating critical variables like PROJECT_ID, CURRENT_USER, and SA_EMAIL to prevent cascading failures, ensuring curl fails on HTTP errors, and using consistent authentication commands.

# Designed to be kicked off early in the lab and run in the background
# while attendees work through BigQuery tasks (~10-15 min to complete).
# ---------------------------------------------------------------------------
set -e
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using only set -e does not enable pipefail. In this script, the output of deploy_alloydb.sh is piped to tee (line 50). Without pipefail, if deploy_alloydb.sh fails, the pipeline's exit status will still be 0 (from tee), causing the script to continue running on a non-existent or failed cluster. Enabling set -euo pipefail ensures the script exits immediately on any command or pipeline failure, and also catches unbound variables.

Suggested change
set -e
set -euo pipefail

echo " You can re-run this step later with: bq mk --connection ..."
# Note: BigQuery treats AlloyDB as a PostgreSQL Cloud SQL connection under the hood via the REST API.
curl -s -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using gcloud auth application-default print-access-token requires the user to have set up Application Default Credentials (ADC) locally or in Cloud Shell, which is typically not done in standard lab environments. Since the script already verifies active user credentials on line 28 using gcloud auth print-access-token, we should use the same command here to ensure consistency and avoid authentication failures.

Suggested change
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \

# --- Lab-specific configuration ---
export CLUSTER_NAME="lost-cargo-cluster"
export INSTANCE_NAME="lost-cargo-instance"
export PROJECT_ID="${PROJECT_ID:-$(gcloud config get-value project)}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If PROJECT_ID is not set in the environment and gcloud config get-value project returns empty, the script will proceed with an empty PROJECT_ID. This will cause subsequent commands (like GCS bucket IAM binding and API calls) to fail with cryptic errors. We should validate that PROJECT_ID is non-empty and fail explicitly if it is missing, matching the behavior in setup_lab.sh.

Suggested change
export PROJECT_ID="${PROJECT_ID:-$(gcloud config get-value project)}"
export PROJECT_ID="${PROJECT_ID:-$(gcloud config get-value project 2>/dev/null)}"
if [[ -z "${PROJECT_ID}" ]]; then
echo "ERROR: No project set. Run 'gcloud config set project YOUR_PROJECT_ID' first." >&2
exit 1
fi
References
  1. When an environment variable is required for the application to function correctly and has no sensible default, do not provide a default value. Instead, enforce that the user must set it, allowing the application to fail explicitly if it's missing.

Comment on lines +62 to +66
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://alloydb.googleapis.com/v1beta/projects/${PROJECT_ID}/locations/${REGION}/clusters/${CLUSTER_NAME}/instances/${INSTANCE_NAME}?updateMask=dataApiAccess" \
-d '{ "dataApiAccess": "ENABLED" }' || echo "⚠️ Warning: Failed to enable Data API access."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

By default, curl does not return a non-zero exit code when the server returns an HTTP error (e.g., 4xx or 5xx). As a result, the || echo "⚠️ Warning: ..." fallback will never be triggered if the API call fails. Adding the -f / --fail flag ensures curl fails on HTTP errors so the warning is correctly printed.

Suggested change
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://alloydb.googleapis.com/v1beta/projects/${PROJECT_ID}/locations/${REGION}/clusters/${CLUSTER_NAME}/instances/${INSTANCE_NAME}?updateMask=dataApiAccess" \
-d '{ "dataApiAccess": "ENABLED" }' || echo "⚠️ Warning: Failed to enable Data API access."
curl -s -f -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://alloydb.googleapis.com/v1beta/projects/${PROJECT_ID}/locations/${REGION}/clusters/${CLUSTER_NAME}/instances/${INSTANCE_NAME}?updateMask=dataApiAccess" \
-d '{ "dataApiAccess": "ENABLED" }' || echo "⚠️ Warning: Failed to enable Data API access."

Comment on lines +81 to +87
CURRENT_USER=$(gcloud config get-value account)
echo "[4/5] Creating database user for current IAM principal: ${CURRENT_USER}..."
gcloud alloydb users create "${CURRENT_USER}" \
--cluster="${CLUSTER_NAME}" \
--region="${REGION}" \
--type=IAM_BASED \
--superuser=true || echo "⚠️ Warning: Failed to create database user for ${CURRENT_USER}. You may need to create it manually."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If gcloud config get-value account returns empty or fails, CURRENT_USER will be empty, leading to an invalid gcloud alloydb users create command. We should check if CURRENT_USER is non-empty before attempting to create the database user.

Suggested change
CURRENT_USER=$(gcloud config get-value account)
echo "[4/5] Creating database user for current IAM principal: ${CURRENT_USER}..."
gcloud alloydb users create "${CURRENT_USER}" \
--cluster="${CLUSTER_NAME}" \
--region="${REGION}" \
--type=IAM_BASED \
--superuser=true || echo "⚠️ Warning: Failed to create database user for ${CURRENT_USER}. You may need to create it manually."
CURRENT_USER=$(gcloud config get-value account 2>/dev/null || echo "")
if [[ -n "$CURRENT_USER" ]]; then
echo "[4/5] Creating database user for current IAM principal: ${CURRENT_USER}..."
gcloud alloydb users create "${CURRENT_USER}" \
--cluster="${CLUSTER_NAME}" \
--region="${REGION}" \
--type=IAM_BASED \
--superuser=true || echo "⚠️ Warning: Failed to create database user for ${CURRENT_USER}. You may need to create it manually."
else
echo "⚠️ Warning: Could not retrieve current IAM principal. Skipping database user creation."
fi

Comment on lines +60 to +64
SA_EMAIL=$(bq show --format=prettyjson --connection us.lost_cargo_conn \
| grep "serviceAccountId" | cut -d '"' -f 4)
echo " Connection service account: $SA_EMAIL"
echo " Waiting 10s for service account to propagate..."
sleep 10
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the connection creation failed or if bq show fails to retrieve the service account email, SA_EMAIL will be empty. This will cause the subsequent gcloud projects add-iam-policy-binding commands to fail with invalid member errors. We should check if SA_EMAIL is non-empty and fail explicitly if it is missing, similar to how SA_EMAIL_ALLOYDB is handled on line 101.

Suggested change
SA_EMAIL=$(bq show --format=prettyjson --connection us.lost_cargo_conn \
| grep "serviceAccountId" | cut -d '"' -f 4)
echo " Connection service account: $SA_EMAIL"
echo " Waiting 10s for service account to propagate..."
sleep 10
SA_EMAIL=$(bq show --format=prettyjson --connection us.lost_cargo_conn \
| grep "serviceAccountId" | cut -d '"' -f 4)
if [[ -z "$SA_EMAIL" ]]; then
echo "ERROR: Failed to retrieve service account for connection us.lost_cargo_conn." >&2
exit 1
fi
echo " Connection service account: $SA_EMAIL"
echo " Waiting 10s for service account to propagate..."
sleep 10
References
  1. In shell scripts, prefer using standard tools like grep and cut for parsing command output over tools like jq if the latter's availability or functionality in the execution environment is not guaranteed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant