@jbloom jbloom commented Oct 16, 2025

display deep mutational scanning escape data from clinically relevant anti-F antibodies

@rneher, we are opening this as a draft pull request to see if you would be interested in integrating this into your Nextstrain RSV builds. If so, that would be awesome. If not (e.g., if this is too many changes), we can also make our own separate builds with this, but we wanted to know which approach you prefer.

As background, @CSimonich and @tmcmaho in our group have done pseudovirus deep mutational scanning to measure how mutations to RSV F affect neutralization by the two clinically approved anti-RSV antibodies (Nirsevimab and Clesrovimab), measuring against both the antibody IgG and Fab (both measurements are informative for reasons we will explain in the paper associated with this). The data have not yet gone through final QC, but we have draft data that are close to what we will have after final QC.

We think it would be useful to display these data on the RSV trees to see when sequences have mutations that likely affect escape.

In this pull request, we do several things:

1. Enable trees to be colored by total escape and max escape mutation for each antibody

This coloring is now enabled for the genome, F and F-antibody-escape builds using a viridis color scale.
This change involved some relatively modest additions to the configuration and snakemake rules, as well as a new script that does the scoring from the DMS data.
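As a rough illustration of the two metrics, here is a minimal sketch of how a sequence's mutations could be scored against a per-mutation escape table. It is not the script in this pull request: the function name `score_escape`, the placeholder mutation labels, the made-up escape values, and the choice to floor negative values at zero are all assumptions for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple


def score_escape(
    mutations: Iterable[str],
    escape_by_mutation: Dict[str, float],
    floor: float = 0.0,
) -> Tuple[float, float, Optional[str]]:
    """Return (total_escape, max_escape, max_escape_mutation) for one sequence.

    Mutations with no DMS measurement are skipped. Flooring at `floor` is an
    assumption of this sketch, so that negative values do not offset escape.
    """
    total = 0.0
    max_escape = 0.0
    max_mutation = None
    for mutation in mutations:
        value = escape_by_mutation.get(mutation)
        if value is None:
            continue  # mutation was not measured in the DMS data
        value = max(value, floor)
        total += value
        if value > max_escape:
            max_escape, max_mutation = value, mutation
    return total, max_escape, max_mutation


# Placeholder mutation labels and invented escape values, not measured data.
example_dms = {"A10T": 1.5, "G20S": 0.75, "L30F": -0.25}
example_score = score_escape(["A10T", "G20S", "L30F", "P40Q"], example_dms)
```

In this sketch the total escape is the sum over measured mutations and the max escape mutation is the single mutation with the largest value, which is one plausible reading of the two colorings described above.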

2. Add a new F-antibody-escape build that subsamples to ensure high escape sequences are included

The existing F build (for which sequence sampling is unchanged) samples by year and geography.
I also added a new F-antibody-escape build that samples in such a way that sequences with high escape from the antibodies are included too.
This build is designed to make sure the trees show high-escape sequences.
Most of the changes to the snakemake rules relate to this aspect: adding the new build requires scoring all sequences before the tree build and then using those scores in an additional custom subsampling rule.
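The idea of scoring first and then subsampling can be sketched roughly as follows. This is only an illustration under assumptions, not the rule added in this pull request: the record fields (`strain`, `year`, `region`, `escape`), the function name, the fixed threshold, and the per-group sample size are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List


def subsample_with_high_escape(
    records: List[Dict],
    per_group: int,
    escape_threshold: float,
    seed: int = 0,
) -> List[Dict]:
    """Keep every high-escape record, then sample the rest by (year, region)."""
    rng = random.Random(seed)

    # Records at or above the threshold are always kept.
    forced = [r for r in records if r["escape"] >= escape_threshold]
    forced_names = {r["strain"] for r in forced}

    # Group the remaining records by year and region.
    groups = defaultdict(list)
    for r in records:
        if r["strain"] not in forced_names:
            groups[(r["year"], r["region"])].append(r)

    # Draw up to `per_group` records from each group.
    sampled = []
    for key in sorted(groups):
        members = groups[key]
        sampled.extend(rng.sample(members, min(per_group, len(members))))
    return forced + sampled


# Invented records for illustration only.
example_records = [
    {"strain": "s1", "year": 2023, "region": "Europe", "escape": 0.0},
    {"strain": "s2", "year": 2023, "region": "Europe", "escape": 0.1},
    {"strain": "s3", "year": 2023, "region": "Europe", "escape": 3.2},
    {"strain": "s4", "year": 2024, "region": "Asia", "escape": 0.0},
]
example_subset = subsample_with_high_escape(
    example_records, per_group=1, escape_threshold=1.0
)
```

With these invented records, `s3` is always retained because its score exceeds the threshold, while the other records compete for one slot per year-region group.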

3. Partially successful effort to change background minimum date

To me the 3y and 6y builds are actually sort of hard to look at as the background sequences on those builds go all the way back to 1975.
I'm not sure if you had a good reason for doing things that way, but I tried reducing background_min_date in the config for those builds so that less of the depth of the tree is spent on very old sequences and more on recent sequences.
This only sort of worked: it was better for some builds than others.
This change could be reverted, or further improved.
I do think that the depth of background sequence dates here does obscure the goal of the 3y and 6y builds to show more recent sequences.

How the new trees look

Here is how those new trees look on data from a few weeks ago (I haven't re-run on today's data)

Those are two of the builds; the rest are at similar URLs for the different subtypes / builds / date ranges on the jbloomlab Nextstrain group.

Next steps

Can you comment on whether this is something that you might want to merge into the main Nextstrain build? If YES, we can start working on any changes you would require. If NO, we can try to set up a build with this on our own.

Checklist

At this point no additions have been made to the tests, and I'm not sure if they still pass.

The CHANGELOG has also not been updated.

  • Checks pass
  • Update changelog

@jbloom jbloom marked this pull request as draft October 16, 2025 21:02
jbloom commented Oct 21, 2025

See this pull request from @rneher re the point of setting background sequence minimum date: #110
