Conversation

@ColmTalbot (Collaborator)

I noticed that it was difficult to pass a user-specified pool through run_sampler, and the pool isn't used at all in the post processing. This PR:

  • unifies how to create a pool for Bilby with a context manager (a la with multiprocessing.Pool() ...); see the sketch below
  • explicitly passes the user-supplied pool around, including for post processing
  • moves logic out of Sampler._setup_pool and Sampler._close_pool
  • supports MPI pools using schwimmbad; this change will effectively make parallel_bilby moot
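
For illustration, a minimal sketch of the unified pattern, assuming the create_pool helper visible later in this diff; the import path, the toy likelihood, and the pool= keyword to run_sampler are taken or inferred from the diff and may differ from the final API:

import bilby
from bilby.core.sampler import create_pool  # import path illustrative

class ToyLikelihood(bilby.Likelihood):
    """Trivial likelihood so the sketch is self-contained."""

    def __init__(self):
        super().__init__(parameters={"x": None})

    def log_likelihood(self):
        return -0.5 * self.parameters["x"] ** 2

priors = dict(x=bilby.core.prior.Uniform(-5, 5, "x"))

# One context manager covers multiprocessing, MPI (via schwimmbad), and
# serial execution; the same pool is then reused for post processing.
with create_pool(likelihood=ToyLikelihood(), npool=4) as pool:
    result = bilby.run_sampler(
        likelihood=ToyLikelihood(),
        priors=priors,
        sampler="dynesty",
        pool=pool,
    )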

@ColmTalbot marked this pull request as draft on October 30, 2025 at 14:22
@ColmTalbot marked this pull request as ready for review on November 3, 2025 at 16:03
@ColmTalbot added the >100 lines, refactoring, and sampling labels on November 4, 2025
@mj-will (Collaborator) left a comment

Looks great, I look forward to trying it! I've added some initial comments but might need to have a second look.

Comment on lines +301 to +307
result = apply_conversion_function(
result=result,
likelihood=likelihood,
conversion_function=conversion_function,
npool=npool,
pool=pool,
)
@mj-will (Collaborator):

Will this always reapply the conversion function? If so, I'm not sure that's desirable: rather than bilby quickly exiting when a run is already done, it will spend time redoing the conversion. That said, I don't feel strongly about this, so I'm happy to keep it. Thoughts?

@ColmTalbot (Author):

It will reapply it; this maintains the current behaviour. I'd be open to changing that behaviour, maybe by adding a flag to the result file recording whether the conversion has been applied, but I would say to do that as a separate change.
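
For concreteness, a hypothetical sketch of that flag; the key name and the use of result.meta_data are illustrative, not something this PR implements:

# Hypothetical: record in the result's metadata whether the conversion
# has already been applied, and skip it on subsequent runs.
if not result.meta_data.get("conversion_function_applied", False):
    result = apply_conversion_function(
        result=result,
        likelihood=likelihood,
        conversion_function=conversion_function,
        npool=npool,
        pool=pool,
    )
    result.meta_data["conversion_function_applied"] = True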

parameters=priors.sample(),
) as _pool:
start_time = datetime.datetime.now()
sampler.pool = _pool
@mj-will (Collaborator):

Is this safe to do? Depending on how the sampler uses the pool, are there settings where the pool the sampler has stored is not updated?

For example, if the sampler has constructed a likelihood using the pool.map from the initial input pool, will this break?

@ColmTalbot (Author):

I'm trying to work through what this would look like.

If the initial input pool is not None, all of the pool objects referenced here should be the same, so I don't think that will make a difference.

One potential issue is that if a specific sampler implementation handles the pool internally by itself, this will create two pools, and then, in the best case, we have a bunch of extra processes we don't need. I think this is what nessai does, so maybe we should game out that specific case.
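
To make that failure mode concrete, a hypothetical illustration using plain multiprocessing (not bilby or nessai code):

import multiprocessing

# run_sampler opens a pool via the new context manager...
with multiprocessing.Pool(processes=4) as outer_pool:
    # ...and a sampler that manages parallelism internally opens its own,
    # so twice the intended number of worker processes exist at once.
    inner_pool = multiprocessing.Pool(processes=4)
    try:
        pass  # sampling would run here with 8 workers instead of 4
    finally:
        inner_pool.close()
        inner_pool.join()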

if (
os.path.isfile(self.resume_file)
and os.path.getsize(self.resume_file)
and os.path.getsize(self.resume_file) > 5
@mj-will (Collaborator):

Why is this needed?

@ColmTalbot (Author):

Empty pickle files still have nonzero size (5 bytes), so currently the code treats them as valid resume files and then fails. This should probably be fixed on main too.

npool=None,
pool=None,
parameters=None,
):
@mj-will (Collaborator):

I think this would benefit from a docstring with example usage.

npool=None,
pool=None,
parameters=None,
):
@mj-will (Collaborator):

This would benefit from a docstring. In particular, I think we should make clear what happens if npool is None or 1.
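
A hypothetical sketch of the kind of docstring being requested, assuming this is the create_pool helper seen elsewhere in the diff; the semantics described for npool and pool are assumptions to be confirmed, not documented behaviour:

def create_pool(likelihood=None, npool=None, pool=None, parameters=None):
    """Create a pool for parallel likelihood evaluation.

    Parameters
    ----------
    likelihood: bilby.core.likelihood.Likelihood
        The likelihood that will be evaluated by the workers.
    npool: int, optional
        Number of worker processes. If None or 1, no pool is created and
        evaluation falls back to the serial built-in map (assumed).
    pool: pool-like, optional
        A user-supplied pool; if given, it is used as-is and npool is
        ignored (assumed).
    parameters: dict, optional
        Parameters used to initialise the likelihood in each worker.
    """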

Comment on lines +246 to +258
my_pool = create_pool(likelihood=this_logl, npool=npool)
if my_pool is None:
map_fn = map
else:
map_fn = partial(my_pool.imap, chunksize=chunksize)
likelihood_fn = partial(_safe_likelihood_call, this_logl)

log_l = list(tqdm(
map_fn(likelihood_fn, dict_samples[starting_index:]),
desc='Computing likelihoods',
total=n,
))
close_pool(my_pool)
@mj-will (Collaborator):

Any reason to use open/close over the context manager?

@ColmTalbot (Author):

It seems not.
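
For reference, roughly what the context-manager form would look like, assuming create_pool supports the with protocol and can yield None for serial runs, as the "with create_pool(...) as _pool" usage earlier in this PR suggests:

likelihood_fn = partial(_safe_likelihood_call, this_logl)

with create_pool(likelihood=this_logl, npool=npool) as my_pool:
    # A None pool falls back to the built-in serial map.
    if my_pool is None:
        map_fn = map
    else:
        map_fn = partial(my_pool.imap, chunksize=chunksize)

    log_l = list(tqdm(
        map_fn(likelihood_fn, dict_samples[starting_index:]),
        desc='Computing likelihoods',
        total=n,
    ))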

Comment on lines +2332 to +2343
if _pool is not None:
subset_samples = _pool.map(
_compute_per_detector_log_likelihoods,
fill_args[ii: ii + block]
)
else:
from ..core.sampler.base_sampler import _sampling_convenience_dump
_sampling_convenience_dump.likelihood = likelihood
subset_samples = [
list(_compute_per_detector_log_likelihoods(xx))
for xx in fill_args[ii: ii + block]
]
@mj-will (Collaborator):

Is it worth simplifying this with something like this?

if _pool is None:
    map_fn = map
    _sampling_convenience_dump.likelihood = likelihood
else:
    map_fn = _pool.map

subset_samples = map_fn(
    _compute_per_detector_log_likelihoods,
    fill_args[ii: ii + block]
)

@ColmTalbot (Author):

Sure

Comment on lines +2462 to +2465
if _pool is not None:
subset_samples = _pool.map(fill_sample, fill_args[ii: ii + block])
else:
subset_samples = [list(fill_sample(xx)) for xx in fill_args[ii: ii + block]]
@mj-will (Collaborator):

Same comment about map, but I'm not sure it's worth it here. Thoughts?

@ColmTalbot (Author):

I modified it to look closer to that, but I agree it probably isn't worth it in this case.

@ColmTalbot (Author) left a comment

I don't remember exactly what the behaviour is for some of these questions, so I'll go back and check, and ideally write docstrings covering them.

