AMIA Jt Summits Transl Sci Proc. 2012:2012:1-10. Epub 2012 Mar 19.

From sequencer to supercomputer: an automatic pipeline for managing and processing next generation sequencing data

Terry Camerlengo et al. AMIA Jt Summits Transl Sci Proc. 2012.

Abstract

Next Generation Sequencing (NGS) is highly resource intensive. NGS tasks related to data processing, management, and analysis require high-end computing servers or even clusters. Additionally, processing NGS experiments requires suitable storage space and significant manual interaction. At The Ohio State University's Biomedical Informatics Shared Resource, we designed and implemented a scalable architecture to address the challenges associated with the resource-intensive nature of NGS secondary analysis, built around Illumina Genome Analyzer II sequencers and Illumina's Gerald data processing pipeline. The software infrastructure includes a distributed computing platform consisting of a LIMS called QUEST (http://bisr.osumc.edu), an Automation Server, a computer cluster for processing NGS pipelines, and a network attached storage device expandable up to 40TB. The system has been architected to scale to multiple sequencers without requiring additional computing or labor resources. This platform demonstrates how to manage and automate NGS experiments in an institutional or core facility setting.
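
The abstract does not describe how the Automation Server decides when a run is ready for processing. As a rough illustration only, and not the authors' implementation, the sketch below shows one common way such automation can work: poll the shared storage for run folders that carry a completion marker and hand each one to a submission callback exactly once. The marker file names (`RUN_COMPLETE`, `DISPATCHED`), the function names, and the folder layout are all hypothetical assumptions made for this example.

```python
from pathlib import Path
from typing import Callable, List

# Hypothetical marker file; the signal a real sequencer writes may differ.
COMPLETION_MARKER = "RUN_COMPLETE"
# Hypothetical marker written by this sketch so a run is dispatched only once.
DISPATCHED_MARKER = "DISPATCHED"


def find_ready_runs(storage_root: Path) -> List[Path]:
    """Return run folders that are complete but not yet dispatched, sorted by name."""
    ready = []
    for run_dir in sorted(p for p in storage_root.iterdir() if p.is_dir()):
        is_complete = (run_dir / COMPLETION_MARKER).exists()
        already_sent = (run_dir / DISPATCHED_MARKER).exists()
        if is_complete and not already_sent:
            ready.append(run_dir)
    return ready


def dispatch_runs(storage_root: Path, submit: Callable[[Path], None]) -> List[str]:
    """Pass each ready run folder to `submit` and record that it was dispatched."""
    dispatched = []
    for run_dir in find_ready_runs(storage_root):
        submit(run_dir)  # placeholder for a cluster job submission step
        (run_dir / DISPATCHED_MARKER).touch()
        dispatched.append(run_dir.name)
    return dispatched
```

Because the function that finds runs and the `submit` callback are separate, adding a second sequencer in this sketch only means more folders appear under the same storage root. That matches the abstract's claim about scaling in spirit, though the real system may achieve it differently.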

Figures

Figure 1. An overview of the QUEST system and its use cases.
Figure 2. The NGS data processing and automation pipeline.
Figure 3. Execution of the configuration file.
Figure 4. Main page for listing the studies.
Figure 5. NGS/GAII run history and comments/analyzing instructions.
Figure 6. Flow cell properties header.
Figure 7. An example of the lane properties panel for a flow cell (showing the first 4 lanes).
Figure 8. Configuration Manager.

