
modkit questions about filtering #582

@weishwu

Hi @ArtRand

I have a couple of questions about modkit (v0.6.1):

  1. I used to run modkit pileup with --ignore h because I only need 5mC at CpG sites. This argument is gone in the latest modkit version (v0.6.1), which has --modified-bases instead. Can I use --modified-bases 5mC to get the same result as the previous --ignore h?

  2. When I run modkit pileup with the default settings, I get this in the log: "Threshold of 0.6621094 for base C is low. Consider increasing the filter-percentile or specifying a higher threshold." My ONT sequencing was targeted to a 34 Mb panel of regions of interest (ROI) enriched for imprinted regions, so I wonder whether this could make the default threshold estimate unstable, which could in turn affect comparisons between samples. Is it OK to set an explicit filter threshold like this:

modkit pileup \
      -t {threads} \
      --cpg \
      --modified-bases 5mC \
      --filter-threshold C:0.8 \
      --ref {params.genome_fasta} \
      --combine-strands \
      --include-bed {params.roi_full} \
      --bgzf \
      --phased \
      --prefix {params.out_prefix} \
      {input} \
      {params.out_dir}
  3. When I run modkit dmr pair with the outputs from above, I get this error:

batch failed: invalid data, valid coverage (51) is not equal to the sum of canonical and modified counts (49), [BedMethylLine { chrom: "chr1", interval: Interval { start: 1008712, stop: 1008713, val: () }, raw_mod_code: Code('m'), strand: Both, count_methylated: 31, valid_coverage: 51, count_canonical: 18, count_other: 2, count_delete: 0, count_fail: 1, count_diff: 0, count_nocall: 0 }] chrom: chr1 starting at 1008712, stopping > 6 batches processed > Error! invalid-bedmethyl-data

This happens because some lines have reads counted as N_other_mod, so N_mod + N_canonical != N_valid_cov. In the line from the error, 31 + 18 = 49, and the 2 other-mod calls account for the difference from 51. Is this caused by the --filter-threshold C:0.8 parameter I used in modkit pileup?
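
To see how many rows are affected, the bedMethyl output can be scanned for lines where the modified and canonical counts do not add up to the valid coverage. This is only a rough sketch: the column positions assume the usual 18-column bedMethyl layout (valid coverage in column 10, then modified, canonical, and other-mod counts in columns 12 to 14), and `find_other_mod_rows` is a made-up helper name, not part of modkit. If the --phased output uses a different layout, the indices would need adjusting.

```python
# 0-based column positions, ASSUMED from the 18-column bedMethyl layout.
COL_VALID_COV, COL_N_MOD, COL_N_CANONICAL, COL_N_OTHER = 9, 11, 12, 13


def find_other_mod_rows(lines):
    """Yield (chrom, start, n_valid, n_mod, n_canonical, n_other) for rows
    where N_mod + N_canonical does not equal N_valid_cov."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()  # tolerates tab- or space-separated columns
        n_valid = int(fields[COL_VALID_COV])
        n_mod = int(fields[COL_N_MOD])
        n_canonical = int(fields[COL_N_CANONICAL])
        n_other = int(fields[COL_N_OTHER])
        if n_mod + n_canonical != n_valid:
            yield (fields[0], int(fields[1]), n_valid, n_mod, n_canonical, n_other)


# Row rebuilt from the counts in the error message; the score, color, and
# percent-modified fields are filler values for illustration only.
example = (
    "chr1\t1008712\t1008713\tm\t51\t.\t1008712\t1008713\t255,0,0"
    "\t51\t60.78\t31\t18\t2\t0\t1\t0\t0"
)
for row in find_other_mod_rows([example]):
    print(row)
```

For a real file, the same function can be fed the lines of the decompressed pileup output, for example via `gzip.open(path, "rt")` for the bgzf-compressed files.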

Labels: troubleshooting (workflow and data preparation questions)