
Add filtering by total length in download_uniprot rule #33

@DimaMolod


Change the download_uniprot rule (https://github.com/KosinskiLab/AlphaPulldownSnakemake/blob/main/workflow/Snakefile#L128-L133) to add default filtering by sequence length (>2000?).
Below is a bash command that reports the length of a protein, for reference:

uniprot_id="Q5S007"
len=$(curl -fsS "https://rest.uniprot.org/uniprotkb/${uniprot_id}.fasta" \
  | tail -n +2 | tr -d '\n\r' | wc -m)
echo -e "${uniprot_id}\t${len}"

Also add the threshold to config.yaml.
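
A minimal sketch of how the length check could be wrapped around the command above, so the idea can be tried offline before the rule is changed. The helper names `fasta_length` and `within_length` are hypothetical, and the default of 2000 only echoes the value floated in this issue. The sketch also assumes that sequences longer than the threshold are the ones to skip and that the check is per sequence; neither is settled here. In the workflow the threshold would presumably be read from config.yaml rather than hard-coded.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the existing Snakefile.

# Print the number of residues in the FASTA read from stdin
# (header lines starting with '>' are dropped, line breaks removed).
fasta_length() {
  grep -v '^>' | tr -d '\n\r' | wc -m | tr -d ' '
}

# Return success if a length is at or below the threshold.
# $1 = sequence length, $2 = maximum allowed length (assumed default: 2000).
within_length() {
  local len="$1" max_len="${2:-2000}"
  [ "$len" -le "$max_len" ]
}

# Demo on an inline record instead of a network call; the threshold of 5
# is chosen only so that the "skip" branch is exercised.
len=$(printf '>demo\nMKT\nAYI\n' | fasta_length)
if within_length "$len" 5; then
  echo "keep ${len}"
else
  echo "skip ${len}"
fi
```

To check a real entry, the output of the curl call above can be piped into `fasta_length` in place of the inline record.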
