BMC Bioinformatics. 2011 Feb 27;12:63.
doi: 10.1186/1471-2105-12-63.

Bio::Phylo - phyloinformatic analysis using Perl

Rutger A Vos et al. BMC Bioinformatics. 2011.

Abstract

Background: Phyloinformatic analyses involve large amounts of data and metadata of complex structure. Collecting, processing, analyzing, visualizing and summarizing these data and metadata should be done in steps that can be automated and reproduced. This requires flexible, modular toolkits that can represent, manipulate and persist phylogenetic data and metadata as objects with programmable interfaces.

Results: This paper presents Bio::Phylo, a Perl5 toolkit for phyloinformatic analysis. Its classes and methods are compatible with the well-known BioPerl toolkit, but it does not depend on BioPerl, which makes it easy to install. It features a richer API and a data model that is better able to manage the complex relationships between the fundamental data and metadata objects in phylogenetics. It supports commonly used file formats for phylogenetic data, including the novel NeXML standard, which allows rich annotations of phylogenetic data to be stored and shared. Bio::Phylo can also interact with BioPerl, which gives access to the file formats that BioPerl supports. It provides many methods for data simulation, transformation and manipulation, for the analysis of tree shape, and for tree visualization.

Conclusions: Bio::Phylo is composed of 59 richly documented Perl5 modules. It has been deployed successfully on a variety of computer architectures (including various Linux distributions, Mac OS X versions, Windows, Cygwin and UNIX-like systems). It is available as open source (GPL) software from http://search.cpan.org/dist/Bio-Phylo.

Figures

Figure 1
Data model of core Bio::Phylo objects. Cardinality relationships between the objects are shown as "crow's feet" notation; for example, a Bio::Phylo::Project has references to zero or more Bio::Phylo::Forest objects.
Figure 2
Visualization example: mammal rates. This tree was generated using Bio::Phylo by (i) reading a 3185 taxon phylogeny; (ii) collapsing the major clades; (iii) converting branch-specific speciation rates (read from a separate file, blue indicates low rates, purple and burgundy indicate higher rates) to RGB color codes; (iv) applying these colors to the branches and the average color within each collapsed clade to the triangles (data and use case kindly provided by Mark Pagel and Chris Venditti).
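
Steps (iii) and (iv) of this caption can be illustrated with a minimal sketch. It is written in plain Python rather than with Bio::Phylo's own Perl API, and it is not the authors' code. The color stops in `STOPS`, the linear scaling between `lo` and `hi`, and the function names `rate_to_rgb` and `average_color` are all assumptions made for illustration; the caption only states that blue indicates low rates and that purple and burgundy indicate higher rates.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]

# Hypothetical color stops (assumed values): blue for low rates,
# purple for intermediate rates, burgundy for high rates.
STOPS: List[RGB] = [(0, 0, 255), (128, 0, 128), (128, 0, 32)]


def rate_to_rgb(rate: float, lo: float, hi: float) -> RGB:
    """Map a rate onto the color stops by piecewise-linear interpolation."""
    if hi <= lo:
        return STOPS[0]
    # Scale the rate to [0, 1], clamping values outside [lo, hi].
    t = min(max((rate - lo) / (hi - lo), 0.0), 1.0)
    # Locate the pair of neighboring stops that brackets t.
    pos = t * (len(STOPS) - 1)
    i = min(int(pos), len(STOPS) - 2)
    frac = pos - i
    a, b = STOPS[i], STOPS[i + 1]
    return tuple(round(x + (y - x) * frac) for x, y in zip(a, b))


def average_color(colors: Sequence[RGB]) -> RGB:
    """Channel-wise mean color, e.g. for the triangle of a collapsed clade."""
    n = len(colors)
    return tuple(round(sum(c[k] for c in colors) / n) for k in range(3))
```

Under these assumptions, each branch would be colored with `rate_to_rgb` applied to its rate, and each collapsed clade with `average_color` applied to the colors of the branches it contains.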
Figure 3
Code sample: parsing OTUs from Newick tree descriptions and annotating them with NCBI taxonomy database record identifiers using the SKOS vocabulary to describe the relationship between the OTU and the database record (i.e. a close string match).
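
The code sample itself is not reproduced on this page. As a rough, language-neutral illustration of the idea in the caption, the following Python sketch extracts tip labels from a Newick string and pairs each one with a taxonomy identifier. It does not use Bio::Phylo and is not the authors' code. The regular expression handles only simple, unquoted labels without comments. The `TAXON_IDS` table is a hard-coded stand-in for a real query against the NCBI taxonomy database, and the choice of the `skos:closeMatch` predicate is an assumption based on the caption's wording ("a close string match").

```python
import re
from typing import Dict, Iterable, List, Tuple

# Illustrative stand-in for a taxonomy lookup (not a live NCBI query).
# Keys are normalized names; values are NCBI taxonomy identifiers.
TAXON_IDS: Dict[str, int] = {
    "homo sapiens": 9606,
    "pan troglodytes": 9598,
    "mus musculus": 10090,
}


def tip_labels(newick: str) -> List[str]:
    """Return tip labels from a simple Newick string.

    A tip label directly follows '(' or ','; labels that follow ')'
    belong to internal nodes and are skipped. Quoted labels and
    bracketed comments are not supported in this sketch.
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)


def annotate(labels: Iterable[str]) -> List[Tuple[str, str, int]]:
    """Build (OTU label, predicate, taxon ID) triples for matched labels."""
    triples = []
    for label in labels:
        # Normalize so that 'Homo_sapiens' matches 'homo sapiens'.
        key = label.replace("_", " ").lower()
        if key in TAXON_IDS:
            triples.append((label, "skos:closeMatch", TAXON_IDS[key]))
    return triples


if __name__ == "__main__":
    tree = "((Homo_sapiens:0.1,Pan_troglodytes:0.1):0.2,Mus_musculus:0.9);"
    for triple in annotate(tip_labels(tree)):
        print(triple)
```

Labels with no entry in the lookup table are left unannotated rather than guessed.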
