Add paths.buffer config key to redirect buffer snapshot storage #71

Open. oleg-savchenko wants to merge 1 commit into cweniger:main from oleg-savchenko:feature/paths-buffer
@oleg-savchenko

Mirrors the existing paths.samples pattern: if paths.buffer is set in the config (or passed as a Hydra override), buffer snapshots are written to {paths.buffer}/snapshots/ instead of the default {run_dir}/buffer/snapshots/.

Useful when run_dir is on a fast local filesystem but buffer snapshots (which can be large) should land on a separate scratch volume.

Motivation

In HPC workflows, adaptive falcon training produces two categories of output with
very different storage requirements:

| Output | Typical size | Storage preference |
|---|---|---|
| Buffer snapshots (buffer/snapshots/) | Large; grows with max_samples × field_size × adaptive_rounds | Fast scratch (high I/O, temporary) |
| Posterior samples (samples/posterior/) | Large | Fast scratch |
| Model checkpoints (graph/) | Small | Persistent home/project dir |
| Configs, logs, plots | Small | Persistent home/project dir |

paths.samples already allows redirecting posterior samples to a separate volume.
However, buffer snapshots were hardcoded to {run_dir}/buffer/snapshots/ with no
override, so they always landed alongside the model and config files — filling up
persistent storage with large temporary simulation data.

This PR adds a paths.buffer config key — mirroring the existing paths.samples
pattern exactly — so buffer snapshots can be independently routed to scratch while
keeping run_dir (model checkpoints, configs, logs, plots) on persistent storage.

Note: paths.buffer was previously present as a dead/unused key and removed in
75ab086. This re-adds it as a working, documented feature.

Change

falcon/cli.py — one new variable, mirrors the paths.samples pattern:

# before
snapshots_path=str(Path(cfg.run_dir) / "buffer" / "snapshots"),

# after
buffer_base = cfg.paths.get("buffer", str(Path(cfg.run_dir) / "buffer"))
snapshots_path=str(Path(buffer_base) / "snapshots"),
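
The fallback above can be exercised in isolation. The following is a minimal sketch, assuming cfg.paths behaves like a plain mapping with a .get method; resolve_snapshots_path is a hypothetical helper name used only for illustration and is not part of falcon, and the directory names are placeholders.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_snapshots_path(
    run_dir: str, paths: Optional[Mapping[str, str]] = None
) -> str:
    """Return the snapshot directory, honouring an optional paths["buffer"] override.

    Hypothetical helper: a plain mapping stands in for cfg.paths here.
    """
    paths = paths or {}
    # Fall back to {run_dir}/buffer when no "buffer" key is present.
    buffer_base = paths.get("buffer", str(Path(run_dir) / "buffer"))
    return str(Path(buffer_base) / "snapshots")


# No override: snapshots stay under run_dir.
default_path = resolve_snapshots_path("/home/user/run_01")

# Override present: snapshots move to the scratch volume, run_dir is untouched.
redirected_path = resolve_snapshots_path(
    "/home/user/run_01", {"buffer": "/scratch/user/run_01"}
)
```

Because the default is computed from run_dir, existing configs that never set paths.buffer keep writing to {run_dir}/buffer/snapshots/ as before.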

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cweniger
Owner

There are merge conflicts. Can you please merge with the latest master?
