Merge VastDB input mode into rna_maps.py with minus-strand fix #12
Open
Bear-Bee89 wants to merge 5 commits into ulelab:main from
New standalone script that accepts VastDB EVENT ID lists with pre-assigned categories (enhanced, silenced, control, constitutive) instead of rMATS output, parsing coordinates directly from a VastDB EVENT_INFO annotation file. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ranch Adds dedicated VastDB section at the top covering inputs, usage, outputs, and a comparison table vs the original rMATS-based script. Original rna_maps.py docs preserved below. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Merges the standalone rna_maps_vastdb_only.py into rna_maps.py as a second input mode, so a single script handles both rMATS and VastDB inputs with a shared downstream pipeline.
What changed
Two input tracks, one shared pipeline
- rMATS mode (-i): unchanged interface, all original arguments preserved
- VastDB mode (--vastdb_mode): takes --vastdb_enhanced, --vastdb_silenced, --vastdb_control, --vastdb_constitutive ID list files + --vastdb_annotation
- Both modes: shared BED creation, coverage calculation, plotting, and heatmap generation
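The two input tracks above can be sketched as a single argument parser that routes to one of two loaders. This is a minimal illustration, not the code in this PR: the flag names come from the description, while `build_parser`, `resolve_mode`, the `rmats_input` destination name, and the rule that VastDB mode requires `--vastdb_annotation` are assumptions made for the example.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags named in this PR description."""
    parser = argparse.ArgumentParser(description="Two input modes, one pipeline (sketch)")
    # rMATS track: the original -i input.
    parser.add_argument("-i", dest="rmats_input", help="rMATS output file")
    # VastDB track: a mode switch plus per-category EVENT ID lists.
    parser.add_argument("--vastdb_mode", action="store_true")
    for category in ("enhanced", "silenced", "control", "constitutive"):
        parser.add_argument(f"--vastdb_{category}", help=f"ID list file ({category})")
    parser.add_argument("--vastdb_annotation", help="VastDB EVENT_INFO annotation file")
    # Seed for reproducible control/constitutive subsetting (default 42 per the PR).
    parser.add_argument("--seed", type=int, default=42)
    return parser


def resolve_mode(args):
    """Return 'vastdb' or 'rmats'; the validation rules here are assumptions."""
    if args.vastdb_mode:
        if args.vastdb_annotation is None:
            raise ValueError("--vastdb_mode requires --vastdb_annotation")
        return "vastdb"
    if args.rmats_input is None:
        raise ValueError("rMATS mode requires -i")
    return "rmats"
```

After the mode is resolved, each loader would hand a table with the same columns to the shared BED, coverage, and plotting steps, which is what lets the downstream code stay mode-agnostic.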
Minus-strand fix for rMATS flanking exon coordinates
rMATS labels upstreamES/EE and downstreamES/EE by genomic position (lower coords = "upstream"), which is inverted for minus-strand genes. The original script compensated with cross-exon column pairing in get_ss_bed() calls.
This merge instead swaps the columns at load time in load_rmats_data() so that upstream/downstream always mean transcript direction after loading. Both modes then use identical same-exon get_ss_bed() calls.
New features
- --seed flag (default 42) for reproducible control/constitutive subsetting
- tables (these were missing from the standalone script)
- _RMATS_with_categories.tsv vs _VastDB_with_categories.tsv
Files changed
- rna_maps.py: replaced with merged version
- README.md: rewritten to document both modes
- rna_maps_vastdb_only.py: superseded (can be removed)
Testing
Tested on KCL CREATE HPC:
- ran to completion, outputs match standalone script
- ran to completion with correct category counts (4448 enhanced, 2256 silenced, 52158 constitutive, 8354 control)
Breaking changes
- minus-strand genes compared to the original script. This is a bug fix: the original cross-exon pairing produced correct results through compensating logic, but the new approach is clearer and consistent with VastDB mode.
- rna_maps_vastdb_only.py is no longer needed as a separate script.
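For reviewers checking the minus-strand change, the load-time swap described under "Minus-strand fix" can be sketched as follows. This is an illustration under stated assumptions, not the body of load_rmats_data(): the column names follow the rMATS labels quoted above plus a `strand` column, a plain dictionary stands in for one table row, and `orient_flanking_exons` is a hypothetical helper name.

```python
def orient_flanking_exons(row):
    """Return a copy of an rMATS-style row whose upstream/downstream labels
    follow transcript direction instead of genomic position.

    Hypothetical helper: on the minus strand the exon with the higher genomic
    coordinates is transcribed first, so the two flanking exons trade labels.
    Start/end within each exon stay in genomic order.
    """
    fixed = dict(row)
    if row["strand"] == "-":
        fixed["upstreamES"], fixed["downstreamES"] = row["downstreamES"], row["upstreamES"]
        fixed["upstreamEE"], fixed["downstreamEE"] = row["downstreamEE"], row["upstreamEE"]
    return fixed


# Made-up coordinates for a minus-strand event as rMATS would label it:
# the genomically lower exon is called "upstream".
event = {
    "strand": "-",
    "upstreamES": 100, "upstreamEE": 200,
    "downstreamES": 900, "downstreamEE": 1000,
}
oriented = orient_flanking_exons(event)
```

With the swap done once at load time, later code can pair upstreamES with upstreamEE on either strand, which is the "same-exon" pairing the description refers to. Plus-strand rows pass through unchanged.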