OSF backed data#11

Merged
himjl merged 5 commits into v2 from osf-upload on Mar 27, 2026

@himjl himjl commented Mar 27, 2026

PR Notes
This branch updates hobj’s packaged-data flow to use OSF instead of the old S3 tarball, and makes the cache layout safer for installed-package use.

Key changes relative to v2:

  • Switched dataset download in hobj/data/download.py from the S3 .tar.gz to the OSF root zip for node pj6wm.
  • Added support for the new archive layout: the outer OSF zip contains data/, and the nested data/images.tar.gz is unpacked into data/images and then deleted.
  • Moved the default cache out of the repo root into a versioned user cache path: ~/.hobj_cache/pj6wm-v2/data.
  • Added the hobj-download-data console script in pyproject.toml.
  • Expanded cache/download coverage in tests/test_data_loader_cache.py.
  • Added new tests for dataset-shaped statistics outputs in tests/test_statistics_builders.py and benchmark usage in tests/test_statistics_dataset_usage.py.
  • Refreshed the README and example notebooks to reflect the updated data flow and outputs.
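
For readers who have not opened the diff, the cache layout and nested extraction described above can be sketched roughly as follows. This is an illustrative sketch, not the code in hobj/data/download.py: the function names `default_cache_dir` and `unpack_osf_archive` and the constants are hypothetical, the download step itself is omitted, and it assumes that images.tar.gz holds a top-level images/ directory.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import Optional

# Hypothetical constants; the values come from the PR notes.
OSF_NODE = "pj6wm"
DATA_VERSION = "v2"


def default_cache_dir(home: Optional[Path] = None) -> Path:
    """Return the versioned user cache path, e.g. ~/.hobj_cache/pj6wm-v2/data."""
    home = Path.home() if home is None else home
    return home / ".hobj_cache" / f"{OSF_NODE}-{DATA_VERSION}" / "data"


def unpack_osf_archive(outer_zip: Path, cache_dir: Path) -> Path:
    """Extract the outer zip, then unpack and delete the nested image tarball.

    Assumes the outer zip has a top-level data/ directory and that
    data/images.tar.gz contains a top-level images/ directory.
    """
    root = cache_dir.parent
    root.mkdir(parents=True, exist_ok=True)

    # The outer zip contains data/, so extracting into the parent
    # directory yields <root>/data == cache_dir.
    with zipfile.ZipFile(outer_zip) as zf:
        zf.extractall(root)

    images_tar = cache_dir / "images.tar.gz"
    if images_tar.exists():
        with tarfile.open(images_tar, "r:gz") as tf:
            # Use the safer "data" extraction filter where the running
            # Python version provides it.
            if hasattr(tarfile, "data_filter"):
                tf.extractall(cache_dir, filter="data")
            else:
                tf.extractall(cache_dir)
        # The tarball is no longer needed once data/images exists.
        images_tar.unlink()
    return cache_dir
```

Keeping the node id and version in the directory name means a future dataset revision would land in a separate cache directory instead of overwriting this one.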

Suggested PR summary:
“Replace packaged dataset download flow with OSF-backed versioned cache, add nested image archive extraction, and extend dataset/statistics test coverage.”

Validation run:

  • make lint
  • make check
  • make test

@himjl himjl changed the base branch from master to v2 March 27, 2026 04:03
@himjl himjl merged commit 8d7730d into v2 Mar 27, 2026
1 check passed