Phylogenetic Trees
OpenGenomeBrowser can generate and display phylogenetic trees using three distinct algorithms:
- TaxId-based tree: Short calculation time but based on taxid-annotations only (NOT on sequence similarity).
- Genome-similarity-based tree: Short calculation time, based on pairwise comparisons of assemblies (default: GenDisCal with the approxANI preset, see footnotes; Goussarov et al., Bioinformatics, 2020).
- Single-copy-ortholog-based tree (core-genome-based tree): Long calculation time. Based on the OrthoFinder consensus tree of all single-copy ortholog alignments (Emms et al., Genome Biology, 2019).
How to get there
In the genome table, select multiple genomes (using Shift and Ctrl) and open the context menu with a right click. Then, click on Show phylogenetic trees.
Advanced usage
Through the settings sidebar, it is possible to apply custom colors to the genomes in the tree.
Click on the Newick string to copy the dendrogram, for example to use it in other tools (e.g. phylo.io or iTOL).
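Once the Newick string is copied, it can also be inspected programmatically. The following is a minimal Python sketch, not part of OpenGenomeBrowser, that lists the leaf labels of a Newick string. The function name `newick_leaf_names` and the genome identifiers are hypothetical, and the parser assumes simple, unquoted labels without bracketed comments.

```python
import re


def newick_leaf_names(newick: str) -> list[str]:
    """Return the leaf labels of a Newick tree in left-to-right order.

    Simplified on purpose: assumes unquoted labels, no bracketed comments,
    and at least one pair of parentheses in the tree.
    """
    # A leaf label directly follows "(" or "," and ends at ":", ",", ")" or ";".
    # Internal node labels follow ")" and are therefore skipped.
    return re.findall(r"(?<=[(,])\s*([^\s(),:;]+)", newick)


# Hypothetical Newick string, standing in for one copied from the tree view.
example = "((genome_A:0.1,genome_B:0.2):0.05,genome_C:0.3);"
print(newick_leaf_names(example))
```

For anything beyond a quick check, a dedicated phylogenetics library is the more robust choice, since real Newick files may contain quoted labels and comments.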
Additional downloads:
- For genome-similarity-based trees, download the distance matrix that was used to calculate the tree by clicking on Download distance matrix (or Copy distance matrix).
- For single-copy-ortholog-based trees, download the full OrthoFinder output by clicking on Download as .tar.xz. If this option is not available, click on Reload OrthoFinder to regenerate the file.
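To check what the downloaded OrthoFinder archive contains without unpacking it, a short Python sketch using the standard-library `tarfile` module may help. The function name `list_archive_members` and the file name in the usage comment are hypothetical. The actual name and layout of the archive depend on your OpenGenomeBrowser instance.

```python
import tarfile


def list_archive_members(archive_path: str) -> list[str]:
    """Return the names of all entries in an xz-compressed tar archive."""
    # Mode "r:xz" opens the archive for reading with xz (LZMA) decompression.
    with tarfile.open(archive_path, mode="r:xz") as archive:
        return archive.getnames()


# Hypothetical usage; replace the path with the file you downloaded:
# for name in list_archive_members("orthofinder_output.tar.xz"):
#     print(name)
```

Listing the members first is a cautious default, because it shows the directory structure before anything is written to disk.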
Footnotes
GenDisCal, preset approxANI
The distance matrix is based on whole-genome nucleotide similarity calculated using GenDisCal (Goussarov et al., Bioinformatics, 2020) with the approxANI preset.
This preset is documented as follows:
approxANI is a minhash-based approximation of Average Nucleotide Identity (ANI). This method is similar to the one used by the Mash software (Ondov et al., Gen. biol., 2016). The ambiguity region for species is [0.04-0.06].
Thus, GenDisCal should produce similarity values that correlate strongly with the original ANI while being much faster to compute.
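To give an intuition for how a minhash-based distance works, here is a minimal Python sketch of the Mash-style approach (bottom-s sketches, a Jaccard estimate, and the Mash distance formula from Ondov et al.). It is an illustration only and not GenDisCal's implementation: the function names, the BLAKE2 hash, and the tiny `k` and sketch size are assumptions chosen for readability, and real tools additionally use canonical (strand-independent) k-mers.

```python
import hashlib
import math


def sketch(seq: str, k: int, size: int) -> list[int]:
    """Bottom-`size` MinHash sketch: the `size` smallest distinct k-mer hashes."""
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    hashes = {
        int.from_bytes(hashlib.blake2b(km.encode(), digest_size=8).digest(), "big")
        for km in kmers
    }
    return sorted(hashes)[:size]


def jaccard_estimate(a: list[int], b: list[int], size: int) -> float:
    """Estimate the Jaccard index of two k-mer sets from their sketches."""
    # Take the `size` smallest hashes of the merged sketches and count
    # how many of them occur in both sketches.
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(jaccard: float, k: int) -> float:
    """Mash distance D = -(1/k) * ln(2j / (1 + j)); 1.0 if nothing is shared."""
    if jaccard == 0.0:
        return 1.0
    return -math.log(2 * jaccard / (1 + jaccard)) / k


# Toy sequences (hypothetical); real genomes use a larger k and sketch size.
K, SIZE = 5, 100
s1 = sketch("ACGTTGCAAGGCTTACGATCGATCGGATCCA", K, SIZE)
s2 = sketch("ACGTTGCAAGGCTAACGATCGATCGGATCCA", K, SIZE)
d = mash_distance(jaccard_estimate(s1, s2, SIZE), K)
print(f"distance: {d:.4f}, approximate identity: {1 - d:.4f}")
```

Because only the small sketches are compared instead of the full sequences, the comparison cost is independent of genome length, which is where the speed advantage over alignment-based ANI comes from.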