PeerJ. 2013 Sep 17;1:e167.
doi: 10.7717/peerj.167. eCollection 2013.

Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology

Peter J A Cock et al. PeerJ. 2013.

Abstract

The Galaxy Project offers the popular web browser-based platform Galaxy for running bioinformatics tools and constructing simple workflows. Here, we present a broad collection of additional Galaxy tools for large scale analysis of gene and protein sequences. The motivating research theme is the identification of specific genes of interest in a range of non-model organisms, and our central example is the identification and prediction of "effector" proteins produced by plant pathogens in order to manipulate their host plant. This functional annotation of a pathogen's predicted capacity for virulence is a key step in translating sequence data into potential applications in plant pathology. This collection includes novel tools, and widely-used third-party tools such as NCBI BLAST+ wrapped for use within Galaxy. Individual bioinformatics software tools are typically available separately as standalone packages, or in online browser-based form. The Galaxy framework enables the user to combine these and other tools to automate organism scale analyses as workflows, without demanding familiarity with command line tools and scripting. Workflows created using Galaxy can be saved and are reusable, so may be distributed within and between research groups, facilitating the construction of a set of standardised, reusable bioinformatic protocols. The Galaxy tools and workflows described in this manuscript are open source and freely available from the Galaxy Tool Shed (http://usegalaxy.org/toolshed or http://toolshed.g2.bx.psu.edu).

Keywords: Accessibility; Annotation; Effector proteins; Galaxy; Genomics; Pipeline; Reproducibility; Sequence analysis; Workflow.

Figures

Figure 1. Schematic of effector infiltration into a host plant cell.
Plant pests and pathogens introduce effector proteins into the host plant cell (green), where they can target and manipulate plant biochemistry to the benefit of the pathogen (Dodds & Rathjen, 2010). Effectors may be delivered by haustorial ingression from a fungus or oomycete such as P. infestans (orange), via the bacterial Type III secretion system mechanism (blue), by nematodes via injection into the plant cell through a needle-like stylet (red), or by many other processes (not illustrated). Where effectors can be identified by sequence properties, candidate effector proteins can be computationally predicted using Galaxy.
Figure 2. Screenshot of the Galaxy PSORTb v3.0 wrapper.
The left hand pane (A) holds a menu of tools which is configurable by the Galaxy administrator. The central pane (B) shows the currently selected tool or dataset, here PSORTb. The right hand pane (C) holds the current datasets, and is empty in this example. The tool interface presents the user with familiar drop down list controls (in this example a file selector and other parameters), option radio-selectors, check boxes, or text boxes as defined in the tool configuration file. For PSORTb, text input boxes are restricted to only accept numeric values. Tool input parameters are followed by a blue “Execute” button that runs the tool when clicked. Below this, user documentation and citation information are provided.
Figure 3. Screenshot from Galaxy workflow editor illustrating the Glimmer3 gene finding example discussed in “Basic assembly and gene calling”.
Figure 4. Screenshot from Galaxy workflow editor illustrating the simple RXLR and Venn Diagram example discussed in “Simple comparison of RXLR predictions”.
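The comparison behind a two-set Venn diagram, as in the Figure 4 example, can be sketched in a few lines of Python. This is only an illustration of the set arithmetic, not the Galaxy tool described in the paper: the function name `venn_regions`, the labels `method_a` and `method_b`, and the sequence identifiers are all hypothetical.

```python
def venn_regions(a_ids, b_ids):
    """Split two collections of sequence IDs into the three regions
    of a two-set Venn diagram."""
    a, b = set(a_ids), set(b_ids)
    return {
        "only_a": a - b,  # predicted by the first method only
        "both": a & b,    # predicted by both methods
        "only_b": b - a,  # predicted by the second method only
    }


# Hypothetical identifiers standing in for the candidate lists
# produced by two different prediction methods.
method_a = ["seq1", "seq2", "seq3"]
method_b = ["seq2", "seq3", "seq4"]

regions = venn_regions(method_a, method_b)
for name, ids in regions.items():
    print(name, len(ids), sorted(ids))
```

The sizes of the three regions are the counts a Venn diagram would display. Candidates in the `both` region are supported by both methods, which is one simple way to prioritise predictions for follow-up.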

References

    Afgan E, Baker D, Coraor N, Chapman B, Nekrutenko A, Taylor J. Galaxy CloudMan: delivering cloud compute clusters. BMC Bioinformatics. 2010;11(Suppl 12):S4. doi: 10.1186/1471-2105-11-S12-S4.
    Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. Journal of Molecular Biology. 1990;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2.
    Arnold R, Brandmaier S, Kleine F, Tischler P, Heinz E, Behrens S, Niinikoski A, Mewes H-W, Horn M, Rattei T. Sequence-based prediction of type III secreted proteins. PLoS Pathogens. 2009;5(4):e1000376. doi: 10.1371/journal.ppat.1000376.
    Baltrus DA, Nishimura MT, Romanchuk A, Chang JH, Mukhtar MS, Cherkis K, Roach J, Grant SR, Jones CD, Dangl JL. Dynamic evolution of pathogenicity revealed by sequencing and comparative genomics of 19 Pseudomonas syringae isolates. PLoS Pathogens. 2011;7(7):e1002132. doi: 10.1371/journal.ppat.1002132.
    Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. Journal of Molecular Biology. 2004;340(4):783–795. doi: 10.1016/j.jmb.2004.05.028.