Avoid retesting non-viable chunk sizes in tuner by GMNGeoffrey · Pull Request #212 · aqlaboratory/openfold-3

GMNGeoffrey · 2026-05-05T02:05:11Z

Summary
The chunk size tuner performs a binary search over chunk sizes, but the existing algorithm only tracked the lower bound of the search, so it unnecessarily re-tested chunk sizes that had already been proven non-viable.

Changes

This commit adds more conventional hi/lo tracking to the chunk size search as well as tests for this failure mode.
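For readers unfamiliar with the pattern, here is a minimal sketch of hi/lo tracking in a search over candidate chunk sizes. It is not the code from this PR: `largest_viable_chunk_size`, `candidates`, and `is_viable` are hypothetical names, and the sketch assumes that viability is monotone (if a size fails, every larger size fails too) and that the smallest candidate is viable.

```python
from typing import Callable, List, Sequence


def largest_viable_chunk_size(
    candidates: Sequence[int],
    is_viable: Callable[[int], bool],
) -> int:
    """Return the largest viable candidate, testing each size at most once.

    Assumptions (hypothetical, for illustration only):
      * `candidates` is sorted ascending,
      * `candidates[0]` is viable and is therefore never tested,
      * viability is monotone: once a size fails, all larger sizes fail.
    """
    lo = 0                # index of the largest size known (or assumed) viable
    hi = len(candidates)  # index of the smallest size known non-viable;
                          # len(candidates) means "none found yet"
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_viable(candidates[mid]):
            lo = mid      # everything at or below mid is viable
        else:
            hi = mid      # everything at or above mid is non-viable
    return candidates[lo]


# Usage: record every probe to confirm no size is tested twice.
tested: List[int] = []


def fits_in_memory(chunk_size: int) -> bool:
    tested.append(chunk_size)
    return chunk_size <= 48  # stand-in for "ran without an out-of-memory error"


sizes = [2**k for k in range(10)]  # 1, 2, 4, ..., 512
best = largest_viable_chunk_size(sizes, fits_in_memory)
```

Because both bounds move, every probe falls strictly between a known-viable and a known-non-viable index, so a size that has already failed can never be selected again.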

Related Issues
Fixes #211

Testing
Added unit tests for this case and confirmed they failed with the old implementation.

Other Notes
Based on #207 (to avoid issues with the +4 on chunk sizes). Only the last commit is part of this PR.

- Avoid weird addition of 4 to power-of-two chunk sizes. This was added in aqlaboratory@a9a12890d without explanation. We can hypothesize that it was related to adding 4 to an input dimension in trace_utils.py (trying to get a test case to fit in one chunk?), but that file was deleted long ago. This just looks like a bug and makes us hit unhappy paths all over the place. Fixes aqlaboratory#203
- Enable chunking for the AuxiliaryHeadsAllAtom pairformer embedding when using optimized kernels. Without chunking, this is the first call to cause OOMs because of its `diffusion_samples*sequence_length` batches. Chunking gets turned off in prediction_heads.py when the batch size is > 1 and optimized kernels are in use, because cross-sample chunking requires expanding out the pair bias, and the optimized kernels all require it to have size 1 in the second dimension with implicit broadcasting. So we turn on `apply_per_sample` when optimized kernels are in use. This splits the > 1 batch dimension, which avoids the problematic path, and then we can do normal chunking for the rest if it's still too large. We could do something more elaborate (see suggestions in the linked issue), but this is an improvement for now. Fixes aqlaboratory#206
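As a rough illustration of the second commit's idea (split the batch dimension first, then chunk each sample normally), here is a pure-Python sketch. It is not OpenFold code: `run_chunked`, `run_per_sample`, and the list-of-lists stand-in for a batched tensor are all hypothetical, and the sketch ignores the pair-bias broadcasting details described above.

```python
from typing import Callable, List

Rows = List[float]  # stand-in for one sample's rows along the chunked dimension


def run_chunked(fn: Callable[[Rows], Rows], rows: Rows, chunk_size: int) -> Rows:
    """Apply `fn` to `rows` in slices of at most `chunk_size` rows."""
    out: Rows = []
    for start in range(0, len(rows), chunk_size):
        out.extend(fn(rows[start:start + chunk_size]))
    return out


def run_per_sample(
    fn: Callable[[Rows], Rows], batch: List[Rows], chunk_size: int
) -> List[Rows]:
    """Split the leading batch dimension, then chunk each sample on its own.

    `fn` only ever sees rows from a single sample, so it never has to
    handle a batch dimension larger than 1.
    """
    return [run_chunked(fn, sample, chunk_size) for sample in batch]


# Usage: track the largest slice `fn` receives to show the per-call bound.
peak = [0]


def double(rows: Rows) -> Rows:
    peak[0] = max(peak[0], len(rows))
    return [2 * r for r in rows]


batch = [[1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0, 9.0, 10.0]]
result = run_per_sample(double, batch, chunk_size=2)
```

The trade-off in this toy version is more calls to `fn` in exchange for a bounded amount of work per call.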


GMNGeoffrey · 2026-05-08T22:59:45Z

@christinaflo PTAL :-)

christinaflo · 2026-05-13T15:09:30Z

Hi @GMNGeoffrey, sorry I was out the last week, I'll take a look at these chunking PRs this week!

GMNGeoffrey added 2 commits May 1, 2026 16:47

GMNGeoffrey mentioned this pull request May 5, 2026

[BUG] Chunk size tuner re-tests non-viable chunk sizes #211

Open

GMNGeoffrey wants to merge 2 commits into aqlaboratory:main from GMNGeoffrey:chunk-tuner-fix-bin-search
