[PULL REQUEST] Upgrades to debug mode #207

Open
Eric-Liu-SANDAG wants to merge 4 commits into main from debug-updgrade

@Eric-Liu-SANDAG

Describe this pull request. What changes are being made?

Upgrades to debug mode to make it not completely useless

What issues does this pull request address?

Additional context

N/A

@Eric-Liu-SANDAG Eric-Liu-SANDAG self-assigned this Mar 17, 2026

Copilot AI left a comment

Pull request overview

This PR upgrades the project’s debug mode to support running a single module for a single year by re-using a pre-existing complete run_id, while avoiding writes to the production database and saving outputs locally instead.

Changes:

  • Add a debug flag to runtime configuration/parsing and propagate it through module entrypoints from main.py.
  • When debug is enabled, skip database inserts/updates in modules and write inputs/outputs to a local debug_output/ folder.
  • Simplify debug configuration to {run_id, year, module} and validate that the referenced run is complete.
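The skip-the-database, write-locally behaviour summarized above could look roughly like the sketch below. It is not the PR's code: `store_rows` and `insert_fn` are hypothetical names, rows are plain dicts rather than whatever table type the modules actually use, and only the `debug_output/` folder name comes from the summary.

```python
import csv
from pathlib import Path
from typing import Callable, List, Optional

DEBUG_DIR = Path("debug_output")  # folder name taken from the PR summary


def store_rows(
    name: str,
    rows: List[dict],
    debug: bool,
    insert_fn: Callable[[str, List[dict]], None],
    out_dir: Path = DEBUG_DIR,
) -> Optional[Path]:
    """Insert rows via insert_fn, or export them to a CSV when debug is True.

    Hypothetical helper: insert_fn stands in for a module's real DB insert.
    Returns the CSV path in debug mode, otherwise None.
    """
    if not debug:
        insert_fn(name, rows)
        return None

    # Debug mode: never touch the database, write a local file instead
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.csv"
    with path.open("w", newline="") as f:
        # Column order follows the keys of the first row
        writer = csv.DictWriter(f, fieldnames=list(rows[0]) if rows else [])
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Routing every write through one helper like this would keep the `if debug:` branch in a single place instead of repeating it in each module's insert function.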

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| python/utils.py | Exposes DEBUG from parser, adds debug_output/ folder creation, expands runtime logging. |
| python/parsers.py | Reworks debug config to single {run_id, year, module}, validates run completeness, adds module list constant. |
| main.py | Threads debug=utils.DEBUG into all module orchestrators. |
| python/startup.py | Adds debug parameter and skips DB insertion in debug mode. |
| python/staging.py | Adds debug parameter and skips metadata update in debug mode. |
| python/hs_hh.py | Adds debug parameter; writes CSVs instead of DB inserts in debug mode. |
| python/pop_type.py | Adds debug parameter; writes CSVs instead of DB inserts in debug mode. |
| python/ase.py | Adds debug parameter; writes CSVs instead of DB inserts in debug mode (including BULK INSERT bypass). |
| python/hh_characteristics.py | Adds debug parameter; writes CSVs instead of DB inserts in debug mode. |
| python/employment.py | Adds debug parameter; writes CSVs instead of DB inserts in debug mode. |
| config.toml | Updates debug configuration shape to {run_id, year, module} with new comments. |
| README.md | Minor TOML example quoting change for sql.staging. |
| .gitignore | Adds debug_output/ and replaces the previous contents with a more complete Python .gitignore template. |

Copilot AI left a comment

Pull request overview

This PR upgrades the project’s debug mode so a single module can be run against a previously completed run_id, while avoiding writes to the production database by exporting outputs locally.

Changes:

  • Reworks debug configuration parsing/validation to target exactly one module + one year for an existing complete run_id.
  • Threads a debug flag through module entrypoints to skip DB inserts/updates and write outputs to debug_output/ instead.
  • Updates documentation/config examples and expands .gitignore to ignore debug_output/ (and adopts a standard Python ignore template).
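As a rough illustration of validating a debug configuration of exactly one module and one year against a completed run, here is a minimal sketch. Everything in it is assumed: `DebugConfig`, `parse_debug_config`, the `MODULES` tuple (guessed from the file names in this PR), and the `complete_run_ids` set that stands in for a database lookup. The real logic lives in python/parsers.py and may differ.

```python
from dataclasses import dataclass
from typing import Set

# Hypothetical module names, guessed from the changed files; the real list
# is the module list constant in python/parsers.py
MODULES = (
    "startup",
    "hs_hh",
    "pop_type",
    "ase",
    "hh_characteristics",
    "employment",
    "staging",
)


@dataclass(frozen=True)
class DebugConfig:
    run_id: int
    year: int
    module: str


def parse_debug_config(raw: dict, complete_run_ids: Set[int]) -> DebugConfig:
    """Validate a {run_id, year, module} mapping and return it as a DebugConfig.

    complete_run_ids is a stand-in for querying which runs finished.
    """
    expected = {"run_id", "year", "module"}
    if set(raw) != expected:
        raise ValueError(f"debug config keys must be exactly {sorted(expected)}")
    if not isinstance(raw["run_id"], int) or not isinstance(raw["year"], int):
        raise ValueError("run_id and year must be integers")
    if raw["module"] not in MODULES:
        raise ValueError(f"unknown module: {raw['module']!r}")
    if raw["run_id"] not in complete_run_ids:
        raise ValueError(f"run_id {raw['run_id']} is not a complete run")
    return DebugConfig(raw["run_id"], raw["year"], raw["module"])
```

Failing fast with a `ValueError` before any module runs means a bad `run_id` or a misspelled module name is reported at startup rather than partway through a run.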

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| python/utils.py | Exposes DEBUG, expands startup logging, creates debug_output/ folder in debug mode |
| python/parsers.py | Implements new debug config schema/validation and module selection logic |
| python/startup.py | Adds debug flag and skips startup DB insertion in debug mode |
| python/staging.py | Adds debug flag and skips metadata UPDATE in debug mode |
| python/hs_hh.py | Adds debug flag; exports inputs/outputs to CSV instead of DB in debug mode |
| python/pop_type.py | Adds debug flag; exports inputs/outputs to CSV instead of DB in debug mode |
| python/ase.py | Adds debug flag; exports controls/results to CSV instead of DB/BULK INSERT in debug mode |
| python/hh_characteristics.py | Adds debug flag; exports inputs/outputs to CSV instead of DB in debug mode |
| python/employment.py | Adds debug flag; exports inputs/outputs to CSV instead of DB in debug mode |
| main.py | Passes utils.DEBUG into module entrypoints |
| config.toml | Updates debug configuration format (run_id, year, module) |
| README.md | Updates user-facing configuration documentation for new debug mode |
| .gitignore | Ignores debug_output/ and replaces the previous contents with a fuller Python .gitignore template |
Comments suppressed due to low confidence (5)

python/hs_hh.py:41

  • The function signature now includes debug, but the docstring Args: section only documents year. Please document what debug changes (e.g., skip DB writes and export to debug_output/).
def run_hs_hh(year: int, debug: bool) -> None:
    """Orchestrator function to calculate and insert housing stock and households.

    Inserts housing stock by MGRA from SANDAG's LUDU database for a given year
    into the production database. Then calculates households by MGRA using
    the housing stock by MGRA, applying both Census tract and jurisdiction-level
    occupancy controls, and then runs an integerization and reallocation
    procedure to produce total households by MGRA. Results are inserted into
    the production database.

    Functionality is segmented into functions for code encapsulation:
        _get_hs_hh_inputs - Get housing stock and occupancy controls
        _validate_hs_hh_inputs - Validate the households input data from the above
            function
        _create_hs_hh - Calculate households by MGRA applying occupancy
            controls, integerization, and reallocation
        _validate_hs_hh_outputs - Validate the households output data from the above
            function
        _insert_hs_hh - Insert occupancy controls and households by MGRA to
            production database

    A single utility function is also defined:
        _calculate_hh_adjustment - Calculate adjustments to make to households

    Args:
        year (int): estimates year
    """
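
One possible shape for the requested `Args:` entry is sketched below, with the long description elided for brevity. The wording of the `debug` line is an assumption based on the review summary, not text from the PR, and the same entry would presumably fit the other orchestrators flagged in this review.

```python
def run_hs_hh(year: int, debug: bool) -> None:
    """Orchestrator function to calculate and insert housing stock and households.

    Args:
        year (int): estimates year
        debug (bool): if True, skip writes to the production database and
            export inputs/outputs to the local debug_output/ folder instead
    """
```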

python/employment.py:32

  • The function signature now includes debug, but the docstring Args: section only documents year. Please document the debug parameter so its effect on DB writes / local exports is clear.
def run_employment(year: int, debug: bool):
    """Control function to create jobs data by naics_code (NAICS) at the MGRA level.

    Get the LEHD LODES data, aggregate to the MGRA level using the block to MGRA
    crosswalk, then apply control totals from QCEW using integerization.

    Functionality is split apart for code encapsulation (function inputs not included):
        _get_jobs_inputs - Get all input data related to jobs, including LODES data,
            block to MGRA crosswalk, and control totals from QCEW. Then process the
            LODES data to the MGRA level by naics_code.
        _validate_jobs_inputs - Validate the input tables from the above function
        _create_jobs_output - Apply control totals to employment data using
            utils.integerize_1d() and create output table
        _validate_jobs_outputs - Validate the output table from the above function
        _insert_jobs - Store input and output data related to jobs to the database.

    Args:
        year: estimates year
    """

python/pop_type.py:46

  • The function signature now includes debug, but the docstring Args: section only documents year. Please document what debug controls (e.g., skip DB writes and export to debug_output/).
def run_pop(year: int, debug: bool):
    """Control function to create population by type (GQ and HHP) data

    Get MGRA group quarters input data, create the output data, then load both into the
    production database. Also get MGRA household population input data, create the
    output data, then load both into the production database. See the wiki linked at the
    top of this file for additional details.

    Functionality is split apart for code encapsulation (function inputs not included):
        _get_gq_inputs - Get city level group quarter controls (DOF E-5) and GQ point
            data pre-aggregated into MGRAs
        _validate_gq_inputs - Validate the data from the above function
        _create_gq_outputs - Control MGRA level GQ data to the city level group
            quarter controls
        _validate_gq_outputs - Validate the data from the above function
        _insert_gq - Store both the city level control data and controlled
            MGRA level GQ data into the production database
        _get_hhp_inputs - Get city level household population controls (DOF E-5),
            MGRA level households, and tract level household size
        _validate_hhp_inputs - Validate the data from the above function
        _create_hhp_outputs - Compute MGRA household population, then control to
            city level household population
        _validate_hhp_outputs - Validate the data from the above function
        _insert_hhp - Store certain household population input/output data to
            the production database

    A single utility function is also defined:
        _calculate_hhp_adjustment - Calculate adjustments to make to household
            population

    Args:
        year (int): estimates year
    """

python/hh_characteristics.py:46

  • The function signature now includes debug, but the docstring Args: section only documents year. Please add documentation for the debug flag so callers know how outputs are handled in debug mode.
def run_hh_characteristics(year: int, debug: bool) -> None:
    """Orchestrator function to calculate and insert household characteristics.

    The exact household characteristics created are:
    1. Households split by household income category
    2. Households split by number of people in each household

    Both characteristics are generated by applying ACS data to MGRA level households,
    which are created by the HS/HH module.

    Functionality is segmented into functions for code encapsulation. The following are
    used for households split by income category:
        _get_hh_income_inputs - Get MGRA households and ACS tract distributions for
            income
        _validate_hh_income_inputs - Validate the hh income input data
        _create_hh_income - Calculate the hh income, control to MGRA households
        _validate_hh_income_outputs - Validate the hh income output data
        _insert_hh_income - Insert hh income and tract level income distributions to
            database

    The following functions are used for households split by size
        _get_hh_size_inputs - Get MGRA households, MGRA household population, and ACS
            tract distributions for size
        _validate_hh_size_inputs - Validate the hh size input data
        _create_hh_size - Calculate the hh size, control to MGRA households and MGRA
            household population
        _validate_hh_size_outputs - Validate the hh size output data
        _insert_hh_size - Insert hh size and tract level size distributions to database

    Args:
        year (int): estimates year
    """

python/ase.py:64

  • The function signature now includes debug, but the docstring Args: section only documents year. Please document debug (e.g., controls whether outputs are written to DB vs exported locally).
def run_ase(year: int, debug: bool) -> None:
    """Orchestrator function for age/sex/ethnicity population by type.

    Creates regional age/sex/ethnicity controls by population type. Then
    calculates MGRA level age/sex/ethnicity population by type using these
    regional controls, synthesized census tract level seed data, and MGRA
    level population by type generated by the Population by Type module.
    Results are inserted into the production database along with the regional
    controls.

    Functionality is segmented into functions for code encapsulation:
        _get_controls_inputs - Get regional age/sex/ethnicity controls from
            CA DOF for total population, regional age/sex/ethnicity group
            quarters by type distributions from the 5-year ACS PUMS, and
            regional population by type generated by the Population by Type
            module
        _validate_controls_inputs - Validate the controls input data from the
            above function
        _create_controls - Calculate regional age/sex/ethnicity controls by
            population type
        _validate_controls_outputs - Validate the controls output data from
            the above function
        _insert_controls - Insert regional age/sex/ethnicity controls by
            population type to production database
        _get_seed_inputs - Get 5-year ACS Detailed Tables B010001, B03002, and
            B01001(B-I)
        _create_seed - Calculate census tract level age/sex/ethnicity seed
            data for total population
        _get_ase_inputs - Get MGRA population by type generated by the
            Population by Type module, special MGRAs with age/sex restrictions
            by population type, regional age/sex/ethnicity controls by
            population type, and census tract level age/sex/ethnicity seed
            data for total population
        _validate_ase_inputs - Validate the age/sex/ethnicity input data from
            the above function
        _create_ase - Calculate MGRA level age/sex/ethnicity population by
            population type
        _validate_ase_outputs - Validate the age/sex/ethnicity output data from
            the above function
        _insert_ase - Insert MGRA level age/sex/ethnicity population by
            population type to production database

    Args:
        year (int): estimates year
    """

Development

Successfully merging this pull request may close these issues.

[FEATURE] Upgrade Debug mode

2 participants