Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2014 Jul 2:15:230.
doi: 10.1186/1471-2105-15-230.

Osiris: accessible and reproducible phylogenetic and phylogenomic analyses within the Galaxy workflow management system

Affiliations

Osiris: accessible and reproducible phylogenetic and phylogenomic analyses within the Galaxy workflow management system

Todd H Oakley et al. BMC Bioinformatics. .

Abstract

Background: Phylogenetic tools and 'tree-thinking' approaches increasingly permeate all biological research. At the same time, phylogenetic data sets are expanding at breakneck pace, facilitated by increasingly economical sequencing technologies. Therefore, there is an urgent need for accessible, modular, and sharable tools for phylogenetic analysis.

Results: We developed a suite of wrappers for new and existing phylogenetics tools for the Galaxy workflow management system that we call Osiris. Osiris and Galaxy provide a sharable, standardized, modular user interface, and the ability to easily create complex workflows using a graphical interface. Osiris enables all aspects of phylogenetic analysis within Galaxy, including de novo assembly of high throughput sequencing reads, ortholog identification, multiple sequence alignment, concatenation, phylogenetic tree estimation, and post-tree comparative analysis. The open source files are available on in the Bitbucket public repository and many of the tools are demonstrated on a public web server (http://galaxy-dev.cnsi.ucsb.edu/osiris/).

Conclusions: Osiris can serve as a foundation for other phylogenomic and phylogenetic tool development within the Galaxy platform.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Phylocatenator matrix. Part of the output from Phylocatenator is shown as an html table representing gene coverage across species. The table contains the gene name, the model assigned to each partition (if that information is provided by the user prior to while running Phylocatenator), and the presence (black) or absence (white) of the gene for each taxon.
Figure 2
Figure 2
Workflow. Here we show a workflow constructed in Galaxy’s workflow editor. The analysis starts with the input dataset (1), which in this instance would be an unaligned tab-delimited, four-column phytab file. The raw phytab file then gets aligned (2) using phytab-MUSCLE, which will implement a multiple sequence alignment on each gene individually. After alignment, we implement a masking step (3) using phytab-aliscorecut, which will remove ambiguous regions separately from each gene. The data are now ready to be concatenated using phylocatenator (4A), and used to reconstruct a phylogeny with RAxML (5A). Alternatively or simultaneously, the data from step 3 can be used to estimate a separate phylogeny for each gene using phytab-RAxML (4B), which stores all gene trees in tabular format. Subsequently, gene trees can be used to estimate a species tree using NJst (5B), and/or all gene trees can be plotted individually for visual inspection using tab2trees (5C). Finally, any resulting tree file in Newick format can be plotted using TreeVector (6).
Figure 3
Figure 3
All tools. Here we show the various analyses Osiris in Galaxy is capable of. Different tools depicted in this figure can be combined to create complex phylogenetic workflows starting with a wide range of input files. Tools for which we created wrappers as part of this publication are in italics, while existing tools already available in Galaxy are denoted with an asterisk. Each box depicts an analysis category or different stage in a potential workflow. Lines connecting the boxes show different ways these tools can be combined based on input and output formats.

Similar articles

Cited by

References

    1. Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012;29:1969–1973. - PMC - PubMed
    1. Han MV, Zmasek CM. phyloXML: XML for evolutionary biology and comparative genomics. BMC Bionf. 2009;10:356. - PMC - PubMed
    1. Vos RA, Balhoff JP, Caravas JA, Holder MT, Lapp H, Maddison WP, Midford PE, Priyam A, Sukumaran J, Xia XH, Stoltzfus A. NeXML: rich, extensible, and verifiable representation of comparative data and metadata. Syst Biol. 2012;61(4):675–689. - PMC - PubMed
    1. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28(10):2731–2739. - PMC - PubMed
    1. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton A, Cooper A, Markowitz S, Duran C, Thierer T, Ashton B, Meintjes P, Drummond A. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. pp. 1647–1649. - PMC - PubMed

Publication types

LinkOut - more resources