Short sequence prefix-invariant evo2 implementation by jstjohn · Pull Request #1580 · NVIDIA-BioNeMo/bionemo-framework

jstjohn · 2026-05-22T18:54:30Z

Description

Changes:

Add codex to the top-level devcontainer
Bump causal-conv1d, megatron-bridge, and associated dependencies
Add test coverage for prefix invariance when running evo2 on very short sequences through inference and training
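
Prefix invariance here means that, for a causal model, the outputs at the first n positions should not change when more tokens are appended. Below is a minimal sketch of such a check, using a toy causal convolution in place of evo2. The names `causal_conv` and `is_prefix_invariant` are illustrative assumptions and are not taken from this PR's tests.

```python
def causal_conv(u, k):
    """Direct causal convolution: y[t] = sum_j k[j] * u[t - j] for j <= t."""
    return [
        sum(k[j] * u[t - j] for j in range(min(t + 1, len(k))))
        for t in range(len(u))
    ]


def is_prefix_invariant(model, seq, tol=1e-9):
    """Check that model(seq[:n]) equals model(seq)[:n] for every prefix length n."""
    full = model(seq)
    for n in range(1, len(seq) + 1):
        prefix_out = model(seq[:n])
        if any(abs(a - b) > tol for a, b in zip(prefix_out, full[:n])):
            return False
    return True


filt = [0.5 ** j for j in range(8)]  # filter longer than the sequence itself
seq = [1.0, -2.0, 3.0, 0.5, 4.0]
causal_ok = is_prefix_invariant(lambda s: causal_conv(s, filt), seq)

# A non-causal toy "model" (every output depends on the whole input) fails the check.
noncausal_ok = is_prefix_invariant(lambda s: [x - sum(s) / len(s) for x in s], seq)
```

The filter is deliberately longer than the sequence, since that is the short-sequence regime this PR is concerned with.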

Summary by CodeRabbit

Chores
- Updated development container configuration for improved build environment setup.
- Added build dependencies to support enhanced model training infrastructure.
Bug Fixes
- Improved robustness of GPU kernel operations with enhanced validation checks.
- Enhanced model compatibility across different system configurations.
Tests
- Expanded test coverage for model inference, training, and GPU kernel correctness.

Signed-off-by: John St. John <jstjohn@nvidia.com>

coderabbitai · 2026-05-22T18:54:36Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0297d2f6-ffb6-4aa1-9279-ab710dccfe99

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


farhadrgh

Two bugs, two questions, and the last section flags that this PR regresses subq-ops inference support that already landed in #1565.

Bugs

1. The hyena_utils.fftconv_func fix is incomplete: the bidirectional path is still broken.

The fix lands only inside the else: # causal branch:

if use_subquadratic_ops:
    y = fft_causal_conv1d(u, k.squeeze(0))
else:
    fft_size = max(fft_size, 2 * k.shape[-1])   # <-- only here
    k_f = torch.fft.rfft(k, n=fft_size) / fft_size

The if bidirectional: branch immediately above still does torch.fft.rfft(k, n=fft_size) with the original fft_size = 2 * seqlen. Same truncation bug if anyone runs the bidirectional path with seqlen < K. Suggest hoisting the max(...) line to right after fft_size = 2 * seqlen so both branches benefit.
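
To illustrate why the FFT size has to cover the filter, here is a small sketch in pure Python. It is not the code from hyena_utils.py: it uses a naive full complex DFT instead of torch.fft.rfft, and the helper names are assumptions made for the example. The one behavior it copies is that a transform of size n first trims or zero-pads its input to n samples.

```python
import cmath


def dft(x, n):
    """Length-n DFT. Like the n= argument of torch.fft.rfft, the input is
    trimmed or zero-padded to n samples first."""
    x = (list(x) + [0.0] * n)[:n]
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]


def idft(spec):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [
        (sum(spec[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n).real
        for t in range(n)
    ]


def fft_causal_conv(u, k, fft_size):
    """Causal convolution via the frequency domain, keeping the first len(u) outputs."""
    prod = [a * b for a, b in zip(dft(u, fft_size), dft(k, fft_size))]
    return idft(prod)[: len(u)]


def direct_causal_conv(u, k):
    """Reference: y[t] = sum_j k[j] * u[t - j] for j <= t."""
    return [
        sum(k[j] * u[t - j] for j in range(min(t + 1, len(k))))
        for t in range(len(u))
    ]


u = [1.0, 2.0, 3.0]  # seqlen = 3
k = [1.0] * 8        # filter length K = 8 > seqlen
seqlen = len(u)

reference = direct_causal_conv(u, k)
# fft_size = 2 * seqlen = 6 < K: the filter is trimmed and the result wraps around.
too_small = fft_causal_conv(u, k, 2 * seqlen)
# fft_size = max(2 * seqlen, 2 * K) = 16: matches the direct reference.
hoisted = fft_causal_conv(u, k, max(2 * seqlen, 2 * len(k)))
```

With the smaller size the filter loses its last taps and the circular convolution aliases into the first outputs, so `too_small` disagrees with `reference`, while `hoisted` agrees up to floating-point error.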

2. The short-filter causal_conv1d subq path was reverted, but the xfail only covers the fused B2B path.

Two separate code paths got removed in this PR, but the xfail (test_b2b_causal_conv1d_module_matches_sequential_reference) only documents one:

engine.parallel_fir lost its if use_subquadratic_ops: _subq_causal_conv1d(...) arm in the < 128 branch.
ParallelCausalDepthwiseConv1d.forward now always uses causal_conv1d_fn instead of dispatching to subq when use_subquadratic_ops=True.

Neither of those is the fused B2B kernel; they're plain depthwise short-filter convolutions. Is the issue actually with subq's causal_conv1d under causal-conv1d 1.6+, or did these get caught in the same revert? If it's the latter, they're worth keeping: they're the easier speedup with no fusion semantics to verify.

Questions

3. @torch.compile removed from ImplicitModalFilter.filter: does the comment refer to a specific reproducer? A pointer in the comment would help future readers, and if the bad-interaction scope is narrow we may be able to keep @torch.compile with dynamic=False or wrap the offending call site in torch.compiler.disable instead of dropping it altogether.

4. hyena_block.py variable-arity get_cpu_offload_context call: clean fix for the 6-vs-7-arg drift, but len(inspect.signature(...).parameters) is a brittle proxy (it counts a *args parameter as 1, which would silently break the slice). Worth a # tied to MCore <= 0.x note so future readers know to revisit if MCore changes the signature again.
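
The brittleness in item 4 can be reproduced with the standard library alone. In this sketch the three stand-in functions are hypothetical and do not mirror the real get_cpu_offload_context signature. `positional_arity` is one possible stricter alternative, not what the PR implements.

```python
import inspect


def six_params(a, b, c, d, e, f):
    """Stand-in for an older 6-argument signature."""


def seven_params(a, b, c, d, e, f, g):
    """Stand-in for a newer 7-argument signature."""


def wrapped(a, *args, **kwargs):
    """Stand-in for a wrapper that forwards everything."""


def naive_arity(fn):
    """Count every parameter, including *args and **kwargs, as one each."""
    return len(inspect.signature(fn).parameters)


def positional_arity(fn):
    """Return the number of positional parameters, or None if fn accepts *args
    (in which case the count says nothing about how many arguments to pass)."""
    params = list(inspect.signature(fn).parameters.values())
    if any(p.kind is inspect.Parameter.VAR_POSITIONAL for p in params):
        return None
    positional = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return sum(1 for p in params if p.kind in positional)
```

Here `naive_arity(wrapped)` is 3, which would make an argument slice based on it far too short, whereas `positional_arity(wrapped)` returns None so the caller can fall back explicitly.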

Regression of #1565 (already on `main`)

This PR removes the two inference subq-ops code paths that landed in #1565 (merged 2026-04-30):

engine.parallel_fir short branch: the if use_subquadratic_ops: _subq_causal_conv1d(...) arm from #1565 is removed (item 2 above).
HyenaMixer.forward prefill: #1565 added _populate_b2b_inference_state and gated the fused b2b kernel on use_subquadratic_ops. This PR forces the gate off via self.use_fused_b2b_causal_conv1d = False (hardcoded), so the fused path can never fire even when the user passes --use-subquadratic-ops. This also disables the original training and predict_evo2 b2b path that predates #1565.

Net effect for infer_evo2 --use-subquadratic-ops after this PR lands:

The flag still routes long-filter FFT convs through subq-ops (_subq_fft_causal_conv1d), so the existing test_subquadratic_ops_matches_baseline correctness test will still pass.
But the short-filter and fused-B2B prefill paths are gone, so the measured ~15% prefill speedup at 8K prompt on the 1B model (single A6000) goes back to zero. Users get the CLI flag without the performance it was added for.

I get why this is happening: the xfail in test_hyena_utils.py shows the fused B2B kernel doesn't match the reference under causal-conv1d 1.6+. That's a real kernel-side bug. But two things:

(a) The fix for the fused-B2B mismatch shouldn't take out the short-filter causal_conv1d path too. They're independent (see item 2 above). If the subq short-filter kernel is also broken under 1.6+, a passing/failing test would clarify; if it isn't broken, please keep that path.

(b) Disabling the fused B2B path is reasonable as a temporary measure, but hardcoding the flag to False makes the regression permanent until someone re-edits the file. Please make it a real config attribute so it can be flipped back on once subquadratic-ops ships the 1.6+ fix, without another PR. Suggested:

self.use_fused_b2b_causal_conv1d = getattr(
    transformer_config, "use_fused_b2b_causal_conv1d", False
)

That way #1565's runtime behavior is recoverable via config, and we don't lose the speedup permanently. (And anyone hitting a predict_evo2 perf regression after this lands can re-enable it for the training/predict path independently.)
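
As a quick illustration of how the suggested getattr default behaves, here is a sketch using types.SimpleNamespace as a stand-in for the real transformer config. The `hidden_size` field and the `resolve_fused_b2b` helper are assumptions made for the example.

```python
from types import SimpleNamespace


def resolve_fused_b2b(transformer_config):
    """Read the flag from the config, defaulting to False when it is absent."""
    return getattr(transformer_config, "use_fused_b2b_causal_conv1d", False)


# A config that predates the attribute keeps the safe default.
legacy_config = SimpleNamespace(hidden_size=1024)

# A config that opts in re-enables the fused path without a code change.
opted_in_config = SimpleNamespace(hidden_size=1024, use_fused_b2b_causal_conv1d=True)

legacy_flag = resolve_fused_b2b(legacy_config)
opted_in_flag = resolve_fused_b2b(opted_in_config)
```

Configs that never set the attribute get False, matching the temporary disablement, and setting it to True is enough to turn the path back on.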


copy-pr-bot · 2026-05-29T16:17:22Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

jstjohn · 2026-06-01T16:48:32Z

/ok to test 121c57e

copy-pr-bot · 2026-06-01T16:48:35Z

/ok to test 121c57e

@jstjohn, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

jstjohn · 2026-06-01T16:50:03Z

/ok to test

copy-pr-bot · 2026-06-01T16:50:07Z

/ok to test

@jstjohn, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.devcontainer/postCreateCommand.sh:
- Around line 4-7: The current install silently ignores failures because of the
"|| true" after the curl/install pipeline; update the block in
.devcontainer/postCreateCommand.sh that runs "curl -fsSL
https://chatgpt.com/codex/install.sh | sh || true" to instead capture the
installer exit status, remove the "|| true", and if the install fails (non-zero
exit) or "command -v codex" still does not find the binary, emit a clear warning
to stderr (e.g., echo to >&2) describing the failure and that codex is not
available; then re-check "command -v codex" after the install attempt and log
the warning if missing so the failure is visible.

In `@bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/run/predict.py`:
- Around line 1287-1292: The call to register_allowed_target_prefix("bionemo.")
is only in main(), but predict() calls instantiate(run_config["model"]) and may
run before the allowlist is registered; add a small helper (e.g.,
ensure_bionemo_allowed()) that wraps the try/except import of
megatron.bridge.utils.instantiate_utils and calls
register_allowed_target_prefix("bionemo.") if available, then invoke that helper
at the very start of predict() (in addition to leaving the existing call in
main()) so imports of predict() or direct predict() calls always register the
prefix before instantiate() is used.
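
A minimal sketch of the helper described in the second inline comment might look like the following. The module path and function name come from the comment above. The module-level guard flag, the boolean return value, and catching ImportError specifically are assumptions, and whether repeated registration is harmless in megatron-bridge is not verified here.

```python
_BIONEMO_PREFIX_REGISTERED = False  # assumption: guard so registration runs once


def ensure_bionemo_allowed() -> bool:
    """Register the "bionemo." target prefix if megatron-bridge provides the hook.

    Returns True once the prefix is registered, False if the hook is unavailable.
    """
    global _BIONEMO_PREFIX_REGISTERED
    if _BIONEMO_PREFIX_REGISTERED:
        return True
    try:
        from megatron.bridge.utils.instantiate_utils import (
            register_allowed_target_prefix,
        )
    except ImportError:
        # megatron-bridge is missing or too old to expose the allowlist hook.
        return False
    register_allowed_target_prefix("bionemo.")
    _BIONEMO_PREFIX_REGISTERED = True
    return True
```

Calling `ensure_bionemo_allowed()` as the first statement of predict(), and keeping the existing call in main(), would cover both the CLI entry point and direct imports.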


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3644a2b5-83a5-4215-bebe-b26858ce3cd1

📥 Commits

Reviewing files that changed from the base of the PR and between 0a594f5 and 9d2f598.

📒 Files selected for processing (26)

.devcontainer/Dockerfile
.devcontainer/devcontainer.json
.devcontainer/initializeCommand.sh
.devcontainer/postCreateCommand.sh
.devcontainer/start.sh
bionemo-recipes/recipes/evo2_megatron/.ci_build.sh
bionemo-recipes/recipes/evo2_megatron/build_requirements.txt
bionemo-recipes/recipes/evo2_megatron/pyproject.toml
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/evo2_provider.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/megatron/hyena/engine.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/megatron/hyena/fft_utils.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/megatron/hyena/hyena_block.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/megatron/hyena/hyena_layer.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/megatron/hyena/hyena_mixer.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/megatron/hyena/hyena_utils.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/megatron/hyena/subquadratic_safety.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/run/infer.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/run/predict.py
bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/run/train.py
bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/conftest.py
bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/models/megatron/hyena/test_engine.py
bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/models/megatron/hyena/test_hyena_mixer_kernel.py
bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/models/megatron/hyena/test_hyena_utils.py
bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/run/test_infer.py
bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/run/test_predict.py
bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/test_model_providers.py

💤 Files with no reviewable changes (1)

bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/conftest.py

pstjohn · 2026-06-01T21:18:47Z

/ok to test 9d2f598

jstjohn added 2 commits May 22, 2026 11:38

Initial commit of short sequence prefix-invariant evo2 implementation

121c57e

Signed-off-by: John St. John <jstjohn@nvidia.com>

Bump megatron bridge dep

f2cd6a1

Signed-off-by: John St. John <jstjohn@nvidia.com>

jstjohn requested review from cspades, dorotat-nv, jomitchellnv, jwilber, pstjohn, savitha-eng and trvachov as code owners May 22, 2026 18:54

jstjohn requested review from farhadrgh and moradza May 22, 2026 18:54

jstjohn commented May 22, 2026

Comment thread bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/megatron/hyena/engine.py Outdated

farhadrgh requested changes May 22, 2026

jstjohn added 11 commits May 22, 2026 12:51

Go back to the original subq version, assume it works on other gpus a…

49be647

…nd fail loudly if the CUDA_ERROR_UNSUPPORTED_PTX_VERSION error comes up Signed-off-by: John St. John <jstjohn@nvidia.com>

Roll back hyena_mixer diffs

9532cba

Signed-off-by: John St. John <jstjohn@nvidia.com>

Roll back more variable renamings

2be5190

Signed-off-by: John St. John <jstjohn@nvidia.com>

Remove overly granular checks on compatability

4ed5d0d

Signed-off-by: John St. John <jstjohn@nvidia.com>

Add back more of the removed subq ops calls

5736e04

Signed-off-by: John St. John <jstjohn@nvidia.com>

Remove conftest diffs

4f57ec2

Signed-off-by: John St. John <jstjohn@nvidia.com>

Reduce the number of inner loop checks for compatibility

c372637

Signed-off-by: John St. John <jstjohn@nvidia.com>

Address PR feedback

5411e09

Signed-off-by: John St. John <jstjohn@nvidia.com>

Update default fp32 residual

1fa7260

Signed-off-by: John St. John <jstjohn@nvidia.com>

Add missing pytest dep

e8fb3c3

Signed-off-by: John St. John <jstjohn@nvidia.com>

Fix changed import in infer.py

53fdc45

Signed-off-by: John St. John <jstjohn@nvidia.com>

farhadrgh approved these changes May 22, 2026

Register allowed prefix

3b8ebc2

Signed-off-by: John St. John <jstjohn@nvidia.com>

moradza approved these changes May 26, 2026

Attempt to address failing CI

74ac48f

Signed-off-by: John St. John <jstjohn@nvidia.com>

savitha-eng approved these changes May 29, 2026

jstjohn added 2 commits May 29, 2026 00:00

Address PR feedback

ef06aeb

Signed-off-by: John St John <jstjohn@nvidia.com>

Merge branch 'main' of github.com:NVIDIA-BioNeMo/bionemo-framework in…

927c93a

…to jstjohn/prefix_invariance_evo2

pstjohn approved these changes May 29, 2026

Comment thread .devcontainer/Dockerfile Outdated

Use new method of installing codex

2a592d9

Signed-off-by: John St John <jstjohn@nvidia.com>

jstjohn enabled auto-merge May 29, 2026 02:54

Do not fail on URL resolution issues with claude/codex install

9d2f598

Signed-off-by: John St John <jstjohn@nvidia.com>

coderabbitai Bot reviewed Jun 1, 2026

Comment thread .devcontainer/postCreateCommand.sh

Comment thread bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/run/predict.py

jstjohn added this pull request to the merge queue Jun 2, 2026

Merged via the queue into main with commit 85a8dcc Jun 2, 2026
17 checks passed

jstjohn deleted the jstjohn/prefix_invariance_evo2 branch June 2, 2026 00:23
