Source Code Biol Med. 2012 Feb 15;7(1):1.
doi: 10.1186/1751-0473-7-1.

Yabi: An online research environment for grid, high performance and cloud computing

Adam A Hunter et al. Source Code Biol Med.

Abstract

Background: The life sciences have a significant demand for pipelines, or workflows, that chain discrete compute- and data-intensive analysis tasks into sophisticated analysis procedures. This need has led to the development of general and domain-specific workflow environments, delivered either as complex desktop applications or as Internet-based applications. Complexities can arise when configuring these applications in heterogeneous compute and storage environments if the execution and data access models are not designed appropriately. These complexities manifest as limited access to available HPC resources, significant overhead in configuring tools, and users being unable to manage files easily across heterogeneous HPC storage infrastructure.

Results: In this paper, we describe the architecture of a software system that adapts to a range of pluggable execution and data backends, in an open source implementation called Yabi. With seamless and transparent access to heterogeneous HPC environments at its core, Yabi provides an analysis workflow environment that can create and reuse workflows and manage large amounts of both raw and processed data securely and flexibly across geographically distributed compute resources. Yabi can be used via a web-based environment in which users drag and drop tools to create sophisticated workflows. Yabi can also be accessed through the Yabi command line, which is designed for users who are more comfortable writing scripts and for external workflow environments that want to leverage the features in Yabi. Configuring tools can be a significant overhead in workflow environments. Yabi greatly simplifies this task by enabling system administrators to configure and manage running tools via a web-based environment, without the need to write or edit software programs or scripts. We highlight Yabi's capabilities through a range of bioinformatics use cases that arise from large-scale biomedical data analysis.

Conclusion: The Yabi system encapsulates a considered design of both execution and data models while abstracting technical details away from users who are not skilled in HPC. It provides an intuitive, scalable, drag-and-drop web-based workflow environment in which the same tools can also be accessed via a command line. Yabi is currently deployed and in use at multiple institutions and is available at http://ccg.murdoch.edu.au/yabi.
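
The idea of pluggable execution backends described in the abstract can be illustrated with a minimal sketch. This is not Yabi's actual API: every name here (ExecutionBackend, LocalBackend, RecordingBackend, Workflow) is hypothetical, and the backends only build strings rather than running real tools. It shows one plausible way a workflow can chain tools while leaving the choice of where each step runs to an interchangeable backend object.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Tuple


class ExecutionBackend(ABC):
    """Hypothetical interface: anything that can run a named tool on inputs."""

    @abstractmethod
    def run(self, tool: str, inputs: List[str]) -> List[str]:
        ...


class LocalBackend(ExecutionBackend):
    """Pretends to run the tool in-process; returns one output label per input."""

    def run(self, tool: str, inputs: List[str]) -> List[str]:
        return [f"{tool}({name})" for name in inputs]


class RecordingBackend(ExecutionBackend):
    """Stand-in for a remote scheduler: records each job it is asked to run."""

    def __init__(self) -> None:
        self.submitted: List[Tuple[str, Tuple[str, ...]]] = []

    def run(self, tool: str, inputs: List[str]) -> List[str]:
        self.submitted.append((tool, tuple(inputs)))
        return [f"{tool}({name})" for name in inputs]


@dataclass
class Step:
    tool: str
    backend: ExecutionBackend


@dataclass
class Workflow:
    """An ordered chain of steps; each step's outputs feed the next step."""

    steps: List[Step] = field(default_factory=list)

    def add(self, tool: str, backend: ExecutionBackend) -> "Workflow":
        self.steps.append(Step(tool, backend))
        return self

    def execute(self, inputs: List[str]) -> List[str]:
        data = list(inputs)
        for step in self.steps:
            data = step.backend.run(step.tool, data)
        return data


if __name__ == "__main__":
    remote = RecordingBackend()
    workflow = Workflow().add("align", LocalBackend()).add("annotate", remote)
    print(workflow.execute(["seq1.fa", "seq2.fa"]))
```

Because each step holds a reference to a backend rather than to a specific machine or scheduler, the same workflow definition could in principle be pointed at a different resource by swapping the backend object, which is the property the abstract attributes to the execution model.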

Figures

Figure 1. Yabi Architecture. The Yabi architecture outlining the key components including Frontend Application, Middleware Appliance and the Resource Manager.
Figure 2. Yabi web-based client. Screenshot of the Yabi web-based client in the "design view" accessing a tool available from the European Bioinformatics Institute via a webservice (http://www.ebi.ac.uk/Tools/webservices/).
Figure 3. Yabi Admin. Screenshot of a completed workflow from a system administration perspective.
Figure 4. HTP Genomic and Automated annotation workflows. Screenshots of (a) a high throughput genomic analysis workflow; and (b) a bioinformatics workflow to predict candidate G-Protein coupling receptor proteins batched over 14,000 molecular sequences.
Figure 5. Proteomics Analysis Workflow. Screenshot of Proteomics workflow combining tools from TPP and Mascot.
