Large intermediate files in work/ directory, profiler setup challenges, and need for preprocessing skip option #29

@tarshad805

Description of feature

Hi Muneeb,
First of all, thank you for providing such a powerful and well-designed pipeline.
Here are some issues I faced:

  1. While running the pipeline, I noticed that large files such as aligned .bam files and trimmed FASTQs are stored at full size (GB-scale) in the work/ directory, and duplicates exist in both the work/ and results/ folders. This can quickly fill storage, especially for large sample sets.
  2. Setting up the profiler took a lot of time and was challenging for me as a new user. More detailed guidance or examples for different environments (HPC, Conda, Docker) would make onboarding much easier.
  3. It would be very useful to have a parameter (e.g., --skip_preprocessing true) that allows skipping read trimming and QC steps for users who already have preprocessed reads.
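
To help quantify point 1, here is a minimal sketch of how the duplication could be measured. It is not part of the pipeline. It assumes the two directories are named work/ and results/ as above, and the function name `find_duplicated_outputs` is hypothetical. Matching is by file name and size only, which is a heuristic and not a content comparison. Symlinks and hard links are skipped because they do not take extra disk space.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicated_outputs(work_dir, results_dir):
    """Return (results_path, work_path, size) for files in results_dir that
    have a same-name, same-size copy in work_dir using separate disk space.

    Heuristic sketch: file contents are not compared.
    """
    # Index regular (non-symlink) files under work/ by (name, size).
    by_key = defaultdict(list)
    for path in Path(work_dir).rglob("*"):
        if path.is_file() and not path.is_symlink():
            by_key[(path.name, path.stat().st_size)].append(path)

    duplicates = []
    for path in Path(results_dir).rglob("*"):
        if path.is_symlink() or not path.is_file():
            continue  # symlinked outputs take no extra space
        stat = path.stat()
        for candidate in by_key.get((path.name, stat.st_size), []):
            cstat = candidate.stat()
            # Hard links share an inode, so they are not real duplicates.
            if (cstat.st_dev, cstat.st_ino) == (stat.st_dev, stat.st_ino):
                continue
            duplicates.append((path, candidate, stat.st_size))
            break  # count each results file at most once
    return duplicates
```

Calling `find_duplicated_outputs("work", "results")` from the launch directory and summing the third element of each tuple gives a rough estimate of the bytes stored twice.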

Metadata

Labels

documentation, enhancement

Projects

Status

Todo

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests