add clinical antibody deep mutational scanning escape data #109

jbloom · 2025-10-16T21:01:54Z

display deep mutational scanning escape data from clinically relevant anti-F antibodies

@rneher, we are opening this as a draft pull request to see if you would be interested in integrating this into your Nextstrain RSV builds. If so, that would be awesome. If not (e.g., if this is too many changes), we can also make our own separate builds with this, but we wanted to know which approach you prefer.

As background, @CSimonich and @tmcmaho in our group have done pseudovirus deep mutational scanning to measure how mutations to RSV F affect neutralization by the two clinically approved anti-RSV antibodies (Nirsevimab and Clesrovimab), measuring against both the antibody IgG and Fab (both measurements are informative for reasons we will explain in the paper associated with this work). The data have not yet been through final QC, but the draft data are close to what we expect after final QC.

We think it would be useful to display these data on the RSV trees to see when sequences have mutations that likely affect escape.

In this pull request, we do several things:

1. Enable trees to be colored by total escape and max escape mutation for each antibody

This coloring is now enabled for the genome, F and F-antibody-escape builds using a viridis color scale.
This change involved some relatively modest additions to the configuration and snakemake rules, as well as a new script that does the scoring from the DMS data.
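As a rough illustration of what "total escape" and "max escape mutation" mean here, the sketch below scores one sequence's F mutations against a lookup table. It is not the script in this pull request: the function name, the table layout, the choice to sum per-mutation values, the clipping of negative values to zero, and the example mutations and numbers are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical escape table: (site, mutant amino acid) -> escape value.
# The values below are made up for illustration; they are not the DMS data.
EscapeTable = Dict[Tuple[int, str], float]


def score_escape(
    mutations: List[str], escape: EscapeTable
) -> Tuple[float, Optional[str]]:
    """Return (total escape, mutation with the largest escape).

    Mutations are strings such as "A10C" (wildtype, site, mutant).
    Mutations absent from the table contribute 0, and negative values are
    clipped to 0 so they cannot cancel positive escape (an assumption of
    this sketch, not necessarily what the real script does).
    """
    total = 0.0
    max_mut: Optional[str] = None
    max_val = 0.0
    for mut in mutations:
        site, mutant = int(mut[1:-1]), mut[-1]
        value = max(0.0, escape.get((site, mutant), 0.0))
        total += value
        if value > max_val:
            max_val, max_mut = value, mut
    return total, max_mut


example_table: EscapeTable = {(10, "C"): 2.5, (20, "E"): 0.75}
total, top = score_escape(["A10C", "D20E", "G30H"], example_table)
print(total, top)
```

A sequence with no scored mutations gets a total of 0 and no max escape mutation, which is what a tree coloring would treat as the baseline.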

2. Add a new `F-antibody-escape` build that subsamples to ensure high escape sequences are included

The existing F build (for which sequence sampling is unchanged) samples by year-geography.
I also added a new F-antibody-escape build that samples so that sequences with high escape from the antibodies are included too, which makes sure the trees show high-escape sequences.
Most of the changes to the snakemake rules relate to this new build, which requires scoring all sequences before the tree build and then using those scores in an additional custom subsampling rule.
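The idea behind the extra subsampling step can be sketched as follows: start from whatever the regular sampling picked, then add the highest-scoring strains that are not already included. This is only a minimal sketch under assumptions; the function name, the `threshold` and `max_extra` parameters, and the strain names are hypothetical and do not come from the rules in this pull request.

```python
from typing import Dict, List, Set


def add_high_escape(
    base_sample: List[str],
    escape_scores: Dict[str, float],
    threshold: float,
    max_extra: int,
) -> List[str]:
    """Extend a sample with the highest-scoring strains above a threshold."""
    already: Set[str] = set(base_sample)
    candidates = [
        (score, strain)
        for strain, score in escape_scores.items()
        if score >= threshold and strain not in already
    ]
    # Highest escape first; ties are broken by strain name for reproducibility.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    extra = [strain for _, strain in candidates[:max_extra]]
    return base_sample + extra


# Hypothetical strains and scores, for illustration only.
base = ["s1", "s2"]
scores = {"s1": 3.0, "s2": 0.1, "s3": 2.0, "s4": 0.2, "s5": 4.0, "s6": 1.5}
sample = add_high_escape(base, scores, threshold=1.0, max_extra=2)
print(sample)
```

Capping the number of added strains keeps the tree size predictable, at the cost of dropping some above-threshold sequences when there are many.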

3. Partially successful effort to change background minimum date

To me the 3y and 6y builds are hard to look at, as the background sequences on those builds go all the way back to 1975.
I'm not sure if you had a good reason for doing things that way, but I tried reducing `background_min_date` in the config for those builds so that less of the tree depth is used on very old sequences and more on recent sequences.
This only partly worked: better for some builds than others.
This change could be reverted or further improved.
I do think that the depth of background sequence dates here obscures the goal of the 3y and 6y builds, which is to show more recent sequences.
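The commit list for this pull request mentions keeping background sequences going back only 12 years. How the config actually stores that cutoff is not shown here, so the helper below is purely hypothetical: it just shows one way to turn "N years back" into an ISO date string of the kind a minimum-date setting could take.

```python
from datetime import date


def min_date_years_back(reference: date, years_back: int) -> str:
    """Return the ISO date `years_back` years before `reference`."""
    target_year = reference.year - years_back
    try:
        return reference.replace(year=target_year).isoformat()
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return reference.replace(year=target_year, day=28).isoformat()


cutoff = min_date_years_back(date(2025, 10, 27), 12)
print(cutoff)
```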

How the new trees look

Here is how those new trees look on data from a few weeks ago (I haven't re-run on today's data).

Those are two of the builds; the rest are at similar URLs for the different subtypes / builds / date ranges on the jbloomlab Nextstrain group.

Next steps

Can you comment on whether this is something that you might want to merge into the main Nextstrain build? If yes, we can start working on any changes you would require. If no, we can try to set up a build with this on our own.

Checklist

At this point no additions have been made to the tests, and I'm not sure if they still pass.

The CHANGELOG has also not been updated.

Checks pass
Update changelog

jbloom · 2025-10-21T19:05:34Z

See this pull request from @rneher re the point of setting background sequence minimum date: #110

jbloom marked this pull request as draft October 16, 2025 21:02

jbloom added 7 commits October 27, 2025 14:18

add F DMS data (prelim) and score all sequences pre-subsampling

dbc18be

compute F escape scores for all nodes and use to color tree

baf70f1

modify snakemake rules to eliminate .get for dicts as it makes hard to see when config missing keys, and to enable build names like `F-antibody-escape`

df915db

add F-antibody-escape build which enriches sequence set for sequences with substantial antibody escape

64d9484

keep background sequences only going back 12Y, for some trees this substantially reduces depth and makes looking at recent sequences easier

572574e

add some details re DMS data antibody escape to README and config / description

23adc59

update to new QC-ed DMS data

71fdfe1

jbloom force-pushed the DMS-data-for-F branch from 10ff910 to 71fdfe1 Compare October 27, 2025 23:00
