fix(bakery): fix demographics table column shift and define dataset description by n-issei-777 · Pull Request #52 · google/mcp

n-issei-777 · 2026-05-28T06:07:03Z

Description

This PR fixes two problems in the launchmybakery BigQuery setup script: a column misalignment when loading the demographics table, and a shell variable that is used but never defined.

The Issue:
The source CSV file data/demographics.csv contains 8 columns in the following order:
zip_code,city,neighborhood,median_household_income,total_population,median_age,bachelors_degree_pct,foot_traffic_index
However, the CREATE TABLE schema in setup_bigquery.sh only defined 7 columns, completely omitting the 4th column median_household_income.
Because bq load maps CSV columns to schema columns by position, and --ignore_unknown_values=true makes it drop extra trailing values instead of rejecting the row, every value from the 4th CSV column onward was silently shifted left by one column:
median_household_income (4th CSV column) was loaded into total_population (4th schema column).
total_population was loaded into median_age (causing values like 33626 to be stored as the age).
median_age was loaded into bachelors_degree_pct.
bachelors_degree_pct was loaded into foot_traffic_index.
The actual foot_traffic_index (8th CSV column) was entirely ignored and lost.
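
The shift can be reproduced without BigQuery. The following is a minimal sketch, not the real loader: map_positionally is a hypothetical helper that pairs schema columns with CSV values by index, and the sample row is made up for illustration (only the column names and the 33626 population value come from this description).

```shell
#!/bin/sh
# Sketch: mimic positional CSV-to-schema mapping with the old 7-column schema.

# Old schema column order (median_household_income is missing).
old_schema="zip_code,city,neighborhood,total_population,median_age,bachelors_degree_pct,foot_traffic_index"

# Hypothetical 8-value row in the CSV's column order.
sample_row="00000,Example City,Example Hood,75000,33626,38.5,45.2,8.1"

map_positionally() {
  # $1 = comma-separated schema columns, $2 = comma-separated CSV row.
  # Values beyond the last schema column are dropped, which is how
  # trailing extra values behave when unknown values are ignored.
  awk -v cols="$1" -v row="$2" 'BEGIN {
    n = split(cols, c, ",")
    split(row, v, ",")
    for (i = 1; i <= n; i++) printf "%s=%s\n", c[i], v[i]
  }'
}

shifted=$(map_positionally "$old_schema" "$sample_row")
printf '%s\n' "$shifted"
```

In the printed mapping, the income value lands in total_population, the population value 33626 lands in median_age, and the final value of the row does not appear at all.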
The Fix:
Added median_household_income INT64 in the 4th position of the CREATE TABLE query inside setup_bigquery.sh, so the schema now has the same 8 columns in the same order as the CSV.
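
A mismatch like this could also be caught before loading by comparing the CSV header with the column list the schema expects. The sketch below is one possible guard, not part of this PR: check_header and expected_columns are hypothetical names, and the real script defines its schema inside a CREATE TABLE query rather than in a variable.

```shell
#!/bin/sh
# Sketch: fail fast when the CSV header and the expected columns differ.

# Hypothetical variable holding the column order the table schema expects.
expected_columns="zip_code,city,neighborhood,median_household_income,total_population,median_age,bachelors_degree_pct,foot_traffic_index"

check_header() {
  # $1 = path to the CSV file, $2 = expected comma-separated column list.
  # Strips a trailing carriage return in case the file has CRLF line endings.
  header=$(head -n 1 "$1" | tr -d '\r')
  if [ "$header" != "$2" ]; then
    echo "Header mismatch in $1" >&2
    echo "  expected: $2" >&2
    echo "  found:    $header" >&2
    return 1
  fi
}

# Demo against a temporary file that contains only the header line.
tmp_csv=$(mktemp)
printf '%s\n' "$expected_columns" > "$tmp_csv"
if check_header "$tmp_csv" "$expected_columns"; then
  header_status="ok"
else
  header_status="mismatch"
fi
rm -f "$tmp_csv"
echo "header check: $header_status"
```

In the setup script, a call such as check_header data/demographics.csv "$expected_columns" placed before the load step would stop the run instead of loading shifted data.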

The Issue:
The bq mk command uses --description "$DATASET_DESCRIPTION", but the variable $DATASET_DESCRIPTION was never initialized anywhere in the script, resulting in an empty dataset description.
The Fix:
Defined DATASET_DESCRIPTION="Dataset for MCP Bakery Demo" at the top of the script.
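
The behavior behind this bug is standard shell semantics: without set -u, an unset variable expands to an empty string rather than raising an error. The sketch below illustrates that and shows an optional guard; it assumes the script runs without set -u, and the guard line is a suggestion rather than something this PR adds.

```shell
#!/bin/sh
# Sketch: an unset variable expands to "" unless the script guards against it.

set +u                                   # default behavior: unset variables are not errors
unset DATASET_DESCRIPTION
before="${DATASET_DESCRIPTION}"          # empty, which is what --description received

# The definition added at the top of the script.
DATASET_DESCRIPTION="Dataset for MCP Bakery Demo"
after="${DATASET_DESCRIPTION}"

# Optional guard: abort with a message if the variable is unset or empty.
: "${DATASET_DESCRIPTION:?DATASET_DESCRIPTION must be set}"

printf 'before=[%s] after=[%s]\n' "$before" "$after"
```

Enabling set -u at the top of a script is another common way to turn this class of mistake into an immediate error.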

n-issei-777 · 2026-05-28T11:41:58Z

Resolved by #53, so this PR is no longer needed.
Closing this PR.

fix(bakery): define DATASET_DESCRIPTION and add median_household_inco…

0c33eb5

…me to demographics table

n-issei-777 closed this May 28, 2026