A clean-main workstation smoke run failed before writing any checkpoint under artifacts/local_us_microplex_smoke/local-smoke-v1/. The pipeline was downloading a SIPP donor asset from Hugging Face via Xet when the local disk filled.
Command shape:
python -m microplex_us.pipelines.pe_us_data_rebuild_checkpoint \
--output-root artifacts/local_us_microplex_smoke \
--version-id local-smoke-v1 \
--baseline-dataset /Users/administrator/Documents/PolicyEngine/policyengine-us-data/policyengine_us_data/storage/enhanced_cps_2024.h5 \
--targets-db /Users/administrator/Documents/PolicyEngine/calibration-diagnostics/.artifacts/policy_data.db \
--policyengine-us-data-repo /Users/administrator/Documents/PolicyEngine/policyengine-us-data \
--policyengine-us-data-python /Users/administrator/Documents/PolicyEngine/microplex-us/.venv/bin/python \
--calibration-backend microcalibrate \
--donor-imputer-backend zi_qrf \
--policyengine-materialize-batch-size 100000 \
--cps-sample-n 1000 \
--puf-sample-n 1000 \
--donor-sample-n 1000 \
--n-synthetic 1000 \
--no-include-acs \
--defer-policyengine-harness \
--defer-policyengine-native-score \
--defer-native-audit \
--defer-imputation-ablation
The run required --no-include-acs because the local policyengine-us-data checkout does not contain storage/acs_2022.h5 and the ACS source has no download URL.
Failure excerpt:
RuntimeError: Data processing error: File reconstruction error: IO Error: No space left on device (os error 28)
...
File "microplex_us/data_sources/donor_surveys.py", line 676, in _download_policyengine_us_data_file
downloaded = hf_hub_download(
...
Loading processed CPS ASEC 2023 from /Users/administrator/.cache/microplex/cps_asec_2023_processed_v20260601_ecps_spm_takeup_inputs.parquet
Loading PUF from /Users/administrator/.cache/microplex/puf_2015.csv...
Raw records: 207,692
Loading demographics from /Users/administrator/.cache/microplex/demographics_2015.csv...
After demographics merge: 207,692
Expanded 1,000 tax units to 1,921 persons
Observed behavior:
- The failure occurs before any durable smoke output/checkpoint appears under
artifacts/local_us_microplex_smoke/local-smoke-v1/.
- The Xet log showed one donor file reconstructed successfully shortly before the failure; the next donor download then exhausted the disk.
- Because no checkpoint exists, the next retry cannot resume from a completed stage and must restart the pre-checkpoint source-loading/imputation work.
Potential improvements:
- Preflight available disk space for Hugging Face/cache/download directories before beginning source loading.
- Emit which donor source/file is being downloaded before invoking
hf_hub_download.
- Consider making source-loading checkpointing more granular so large donor downloads do not require restarting the whole pre-checkpoint phase after local environment failures.
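As a rough illustration of the first two suggestions, here is a minimal sketch of a disk-space preflight plus a log line emitted before each download. The names check_free_space and download_with_logging and the required_bytes threshold are hypothetical and are not taken from microplex_us. The download argument is a stand-in for hf_hub_download, passed in as a callable so the sketch stays self-contained; it assumes the callable accepts repo_id, filename, and cache_dir keyword arguments.

```python
import logging
import shutil
from pathlib import Path

logger = logging.getLogger(__name__)


def check_free_space(directory, required_bytes):
    """Raise early if the filesystem holding `directory` has too little free space.

    Hypothetical helper: `required_bytes` is a caller-chosen threshold,
    not a value known from the pipeline.
    """
    path = Path(directory).expanduser().resolve()
    # Walk up to the nearest existing ancestor so the check also works
    # before the cache directory has been created.
    while not path.exists() and path != path.parent:
        path = path.parent
    free = shutil.disk_usage(path).free
    if free < required_bytes:
        raise RuntimeError(
            f"Only {free / 1e9:.1f} GB free under {path}; "
            f"need at least {required_bytes / 1e9:.1f} GB for donor downloads."
        )
    return free


def download_with_logging(download, repo_id, filename, cache_dir, required_bytes):
    """Preflight the cache directory, log the target file, then call `download`.

    `download` is assumed to behave like hf_hub_download for these three
    keyword arguments.
    """
    check_free_space(cache_dir, required_bytes)
    logger.info(
        "Downloading donor file %s from %s into %s", filename, repo_id, cache_dir
    )
    return download(repo_id=repo_id, filename=filename, cache_dir=cache_dir)
```

With something like this in place, a full disk would surface as an explicit preflight error naming the directory, and the log would identify the donor file being fetched when a download does fail. The right threshold depends on the donor asset sizes, which this report does not state.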