Nucleic Acids Res. 2011 Jan;39(Database issue):D15-8.
doi: 10.1093/nar/gkq1150. Epub 2010 Nov 23.

The International Nucleotide Sequence Database Collaboration


Guy Cochrane et al. Nucleic Acids Res. 2011 Jan.

Abstract

Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes comes no longer as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.


Figures

Figure 1.
(a) Base pairs in INSDC over time, excluding the Trace Archive (raw data from capillary sequencing platforms). Cumulative data volume in base pairs over time. (b) Base pairs in INSDC over time since 2002, broken down into selected data components. Cumulative data volume in base pairs broken down into assembled sequence (whole genome shotgun methods and others) and raw next-generation-sequence data.
Figure 2.
Growth in complete genomes. The layered chart shows the number of complete genomes available from INSDC databases over time. The end of 2010 time point is conservatively (linearly) extrapolated from October 2010 figures, which are the latest available at the time of submission.
Figure 3.
Taxonomic coverage. Growth in the number of taxa with associated sequence (or with subordinate taxa with associated sequence) over time.

