Microb Genom. 2016 Sep 20;2(9):e000086.
doi: 10.1099/mgen.0.000086. eCollection 2016 Sep.

CLIMB (the Cloud Infrastructure for Microbial Bioinformatics): an online resource for the medical microbiology community

Thomas R Connor et al. Microb Genom.

Abstract

The increasing availability and decreasing cost of high-throughput sequencing have transformed academic medical microbiology, delivering an explosion in available genomes while also driving advances in bioinformatics. However, many microbiologists are unable to exploit the resulting large genomics datasets because they lack access to relevant computational resources and to an appropriate bioinformatics infrastructure. Here, we present the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) facility, a shared computing infrastructure that has been designed from the ground up to provide an environment where microbiologists can share and reuse methods and data.

Keywords: bioinformatics; cloud computing; infrastructure; metagenomics; population genomics; virtual laboratory.

Figures

Fig. 1.
Overview of the system. (a) The sites where the computational hardware is based. (b) High-level overview of the system and how the different software components connect to one another. (c) Compute hardware present at each of the four sites. (d) Hardware comprising the Ceph storage system at each site. (e) Type and role of network hardware used at each site.
Fig. 2.
Relative performance of virtual machines running on cloud services, compared to the Cardiff University HPC system, Raven. (a) Values for each package are the mean of the wall time taken for 10 runs performed on Raven, divided by the mean wall time of 40 runs undertaken on the virtual machine on the named service. Values greater than 1 indicate that the service was faster than Raven; values less than 1 indicate that it was slower. (b) The raw wall time values for the named software on each of the systems. The data generated as part of the benchmarking exercise are included in Supplementary File 1.
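The ratio described in panel (a) can be illustrated with a minimal sketch. The function name `relative_performance` and the wall times below are hypothetical and are not taken from Supplementary File 1; the sketch only assumes that the ratio is the mean Raven wall time divided by the mean virtual machine wall time, as the caption states.

```python
from statistics import mean


def relative_performance(raven_wall_times, vm_wall_times):
    """Return mean Raven wall time divided by mean VM wall time.

    A result above 1 means the virtual machine finished faster than Raven;
    a result below 1 means it was slower.
    """
    if not raven_wall_times or not vm_wall_times:
        raise ValueError("both systems need at least one timed run")
    return mean(raven_wall_times) / mean(vm_wall_times)


# Made-up wall times in seconds, for illustration only.
# The caption describes 10 runs on Raven and 40 on each virtual machine;
# fewer values are used here to keep the example short.
raven_runs = [100.0, 110.0, 90.0]
vm_runs = [50.0, 50.0, 50.0, 50.0]

ratio = relative_performance(raven_runs, vm_runs)
print(ratio)
```

Because the ratio places Raven in the numerator, a shorter wall time on the virtual machine gives a larger value, which is why values above 1 are read as faster.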
Fig. 3.
CLIMB virtual machine launch workflow. A user, on logging in to the Bryn launcher interface, is presented with a list of the virtual machines they are running and is able to stop, reboot or terminate them (a). Users launch a new Genomics Virtual Laboratory (GVL) server with a minimal interface, specifying a name, the server ‘flavour’ (user or group) and an access password (b). On booting, the user accesses a webserver running on the GVL instance, which gives access to various services that are started automatically (c). The GVL provides access to Cloudman, a Galaxy server, an administration interface, Jupyter notebook and RStudio (d, top to bottom).
