Commit 076958e

Added optional chapter 3
1 parent ccba59c commit 076958e

11 files changed

+578741
-9
lines changed

_optionals/03-extra_tools.md

Lines changed: 85 additions & 9 deletions
@@ -1,14 +1,90 @@
11
---
2-
title: "More bioinformatic tools"
3-
teaching:
4-
exercises:
5-
questions:
6-
- dsfgdf
2+
title: "Another bioinformatic tool: PyANI"
73
objectives:
8-
- dfgdf
9-
keypoints:
10-
- dfgdf
4+
- To calculate the average nucleotide identity of a genome with other related genomes.
115

126
---
137

14-
WIP
8+
In the regular lessons, we implemented three bioinformatic tools:
9+
`blast` for homology search,
10+
`mafft` for sequence alignment, and
11+
`raxml` for phylogenetic analysis.
12+
13+
In this section, we will discuss one more tool, PyANI.
14+
15+
## PyANI
16+
PyANI is an open-source, Python-based tool for calculating
17+
Average Nucleotide Identity (ANI) between two or more sequences.
18+
When comparing two genomes, syntenic regions are first identified
19+
using tools such as `mummer` or `blast`.
20+
Then the nucleotide identity is calculated in the syntenic regions.
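
As a rough illustration of the second step, the sketch below computes a length-weighted percentage identity over regions that have already been aligned. The function name `percent_identity` and the toy sequences are hypothetical, and counting gap columns as mismatches is an assumption. This is not PyANI's implementation, which derives its values from `mummer` or `blast` output.

```python
def percent_identity(aligned_pairs):
    """Return the percentage of identical positions across aligned regions.

    aligned_pairs: iterable of (seq_a, seq_b) strings of equal length,
    where '-' marks a gap. Gap columns count as mismatches (an assumption).
    """
    matches = 0
    total = 0
    for seq_a, seq_b in aligned_pairs:
        if len(seq_a) != len(seq_b):
            raise ValueError("aligned sequences must have equal length")
        for a, b in zip(seq_a.upper(), seq_b.upper()):
            total += 1
            if a == b and a != "-":
                matches += 1
    if total == 0:
        raise ValueError("no aligned positions")
    return 100.0 * matches / total


# Toy aligned regions, invented for illustration only.
regions = [("ACGTACGT", "ACGTACGA"), ("GGCC", "GGCC")]
print(round(percent_identity(regions), 2))  # 11 of 12 positions match -> 91.67
```

Longer regions contribute more positions, so they weigh more heavily in the final value than short ones.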
21+
22+
The source code for **PyANI** is available at
23+
[widdowquinn/pyani](https://github.com/widdowquinn/pyani){: target="_blank"}.
24+
The documentation for basic usage is available
25+
[here](https://github.com/widdowquinn/pyani/blob/master/README_v_0_2_x.md){: target="_blank"}.
26+
27+
**PyANI** v0.2 is available on HiPerGator, but its module has to be loaded first.
28+
The dependencies, `mummer` and `blast+`, will be loaded together with `pyani`.
29+
30+
~~~
31+
$ ml pyani
32+
~~~
33+
{: .language-bash}
34+
35+
~~~
36+
Lmod is automatically replacing "python/3.8" with "pyani/0.2.10".
37+
~~~
38+
{: .output}
39+
40+
We will be using the genomes present in `files/ani` for computing ANI.
41+
The file `UXhortspp.fasta` contains the genome of an unknown *X. hortorum* strain.
42+
The other sequences are genomes of several *X. hortorum* pathovars
43+
downloaded from NCBI.
44+
45+
> ## Getting genome sequences from NCBI
46+
> PyANI has a script called `genbank_get_genomes_by_taxon.py` to download
47+
> all genomes for a taxon from NCBI.
48+
> For usage, check the documentation linked above.
49+
{: .tips}
50+
51+
The objective now is to perform pairwise comparisons of all reference genomes
52+
and calculate ANI. This can be performed with the following command.
53+
54+
~~~
55+
$ average_nucleotide_identity.py -i files/ani -o ani -m ANIm -g --gformat png,pdf
56+
~~~
57+
{: .language-bash}
58+
59+
> - `average_nucleotide_identity.py` is the name of the script
60+
> - `-i` is used to specify the directory containing the input genomes/sequences.
61+
> - `-o` is used to specify output directory.
62+
> Note that the program will exit if this directory already exists.
63+
> - `-m` is used to specify the method for aligning syntenic regions.
64+
> `ANIm` specifies `mummer` and `ANIb` specifies `blast+`.
65+
> - `-g` is used to generate graphic output, i.e., heatmap.
66+
> `--gformat` specifies the graphic output formats.
67+
{: .notes}
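
Because the program exits when the output directory already exists, it can help to check for that before launching a long run. The sketch below is one hedged way to do so: the helper name `build_ani_command` is hypothetical, it uses only the flags described above, and it only builds the argument list without running the script.

```python
from pathlib import Path


def build_ani_command(input_dir, output_dir, method="ANIm", formats=("png", "pdf")):
    """Build the argument list for average_nucleotide_identity.py.

    Raises FileExistsError if output_dir is already present, so the
    problem is caught before the script itself is started.
    """
    if Path(output_dir).exists():
        raise FileExistsError(f"output directory already exists: {output_dir}")
    return [
        "average_nucleotide_identity.py",
        "-i", str(input_dir),
        "-o", str(output_dir),
        "-m", method,
        "-g",
        "--gformat", ",".join(formats),
    ]


if __name__ == "__main__":
    try:
        print(" ".join(build_ani_command("files/ani", "ani")))
    except FileExistsError as err:
        print(err)
```

The returned list could then be passed to `subprocess.run`, assuming the `pyani` module has been loaded in the same environment.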
68+
69+
~~~
70+
$ ls ani
71+
~~~
72+
{: .language-bash}
73+
74+
~~~
75+
ANIm_alignment_coverage.pdf ANIm_hadamard.pdf ANIm_similarity_errors.pdf
76+
ANIm_alignment_coverage.png ANIm_hadamard.png ANIm_similarity_errors.png
77+
ANIm_alignment_coverage.tab ANIm_hadamard.tab ANIm_similarity_errors.tab
78+
ANIm_alignment_lengths.pdf ANIm_percentage_identity.pdf nucmer_output.tar.gz
79+
ANIm_alignment_lengths.png ANIm_percentage_identity.png
80+
ANIm_alignment_lengths.tab ANIm_percentage_identity.tab
81+
~~~
82+
{: .output}
83+
84+
You can now transfer `ANIm_percentage_identity.png`
85+
to your computer to view the heatmap.
86+
For numeric values, you can use the `ANIm_percentage_identity.tab` table.
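
To read the numeric values programmatically, a minimal sketch is shown below. It assumes the table is a tab-separated square matrix whose header row and first column both hold genome labels; check the actual file before relying on this. The function name `closest_reference`, the labels `query`, `refA`, `refB`, and the numbers in `sample` are all invented for illustration.

```python
import csv
import io


def closest_reference(tab_text, query):
    """Return (label, value) of the genome most similar to `query`.

    tab_text: contents of a tab-separated identity matrix (assumed layout:
    empty first header cell, then one column per genome; one row per genome).
    """
    rows = list(csv.reader(io.StringIO(tab_text), delimiter="\t"))
    names = rows[0][1:]
    for row in rows[1:]:
        if row[0] == query:
            # Skip the self-comparison on the diagonal.
            scores = {
                name: float(value)
                for name, value in zip(names, row[1:])
                if name != query
            }
            best = max(scores, key=scores.get)
            return best, scores[best]
    raise KeyError(query)


# Made-up matrix, not real ANI results.
sample = (
    "\tquery\trefA\trefB\n"
    "query\t1.0\t0.97\t0.95\n"
    "refA\t0.97\t1.0\t0.96\n"
    "refB\t0.95\t0.96\t1.0\n"
)
print(closest_reference(sample, "query"))  # ('refA', 0.97)
```

For the real output, `tab_text` would come from reading `ani/ANIm_percentage_identity.tab`, and `query` would be whatever label the table uses for the unknown genome.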
87+
88+
<img src="/fig/ANIm_percentage_identity.png" height="500px">
89+
90+
Based on ANI, the unknown strain seems to be *X. hortorum* pv. *gardneri*.

fig/ANIm_percentage_identity.png

46 KB

files/ani/UXhortspp.fasta

Lines changed: 68018 additions & 0 deletions
Large diffs are not rendered by default.

files/ani/Xhortcaro1.fasta

Lines changed: 63156 additions & 0 deletions
Large diffs are not rendered by default.

files/ani/Xhortcyna1.fasta

Lines changed: 63993 additions & 0 deletions
Large diffs are not rendered by default.

files/ani/Xhortgard2.fasta

Lines changed: 67707 additions & 0 deletions
Large diffs are not rendered by default.

files/ani/Xhorthede1.fasta

Lines changed: 66997 additions & 0 deletions
Large diffs are not rendered by default.

files/ani/Xhortpela.fasta

Lines changed: 65196 additions & 0 deletions
Large diffs are not rendered by default.

files/ani/Xhorttara1.fasta

Lines changed: 62867 additions & 0 deletions
Large diffs are not rendered by default.

files/ani/Xhortviti1.fasta

Lines changed: 65885 additions & 0 deletions
Large diffs are not rendered by default.
