- tidy_names : Read a FASTA file containing isoforms whose headers (previously taken from a GFF file) include the protein, mRNA, and gene IDs. The script writes a new FASTA file with cleaned headers that show only the gene ID. When a gene has several isoforms, only the longest sequence is kept.
- family_expansion : Read a FASTA file of coding sequences (CDS), tidy the sequence names for pairwise alignment (e.g., BLAST), and identify family members at different identity and coverage thresholds.
- read_coverage : Compute the read coverage at each position in the genome from aligned reads (a BAM file). Plot the read coverage and gene annotations within a specified genomic region of interest. Obtain the read counts (i.e., the number of reads assigned to each gene) for all genes in the provided GFF file.
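
As an illustration of the longest-isoform step described for tidy_names, here is a minimal Python sketch. It is not the repository's script: the header layout (`>protein_id mRNA=... gene=...`), the `gene=` key, and the function names `parse_fasta`, `longest_isoform_per_gene`, and `to_fasta` are all assumptions made for this example, and the real headers may differ.

```python
import re

# Assumed header layout (hypothetical): ">prot1 mRNA=mrna1 gene=geneA"
GENE_PATTERN = re.compile(r"gene=([^\s;]+)")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_isoform_per_gene(text):
    """Map each gene ID to its longest sequence.

    Records without a recognisable gene ID are skipped. If two isoforms
    have the same length, the first one encountered is kept.
    """
    longest = {}
    for header, seq in parse_fasta(text):
        match = GENE_PATTERN.search(header)
        if match is None:
            continue
        gene_id = match.group(1)
        if gene_id not in longest or len(seq) > len(longest[gene_id]):
            longest[gene_id] = seq
    return longest


def to_fasta(records, width=60):
    """Format a {gene_id: sequence} mapping as FASTA text, headers = gene ID only."""
    lines = []
    for gene_id, seq in records.items():
        lines.append(">" + gene_id)
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


example = (
    ">prot1 mRNA=mrna1 gene=geneA\nMKT\nLLV\n"
    ">prot2 mRNA=mrna2 gene=geneA\nMK\n"
    ">prot3 mRNA=mrna3 gene=geneB\nMSS\n"
)
cleaned = to_fasta(longest_isoform_per_gene(example))
print(cleaned)
```

In this toy input, geneA has two isoforms and only the longer one is written out under the cleaned `>geneA` header. Adjust `GENE_PATTERN` to match the actual header format of your FASTA file.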