Nucleic Acids Res. 2022 Jan 7;50(D1):D387-D390.
doi: 10.1093/nar/gkab1053.

The Sequence Read Archive: a decade more of explosive growth

Kenneth Katz et al. Nucleic Acids Res. 2022.

Abstract

The Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra/) stores raw sequencing data and alignment information to enhance reproducibility and facilitate new discoveries through data analysis. Here we note changes in storage designed to increase access and highlight analyses that augment metadata with taxonomic insight to help users select data. In addition, we present three unanticipated applications of taxonomic analysis.

Figures

Figure 1.
Growth of SRA over the last decade. Through September 2021, Public Access contains approximately 25.6 petabase pairs (Pbp) originating from over 14.8 million publicly available runs, averaging 1.7 Gbp per run, 0.83 GB per run, 9.6 million spots per run, and 187 bp per spot.
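
The averages in the caption can be cross-checked against each other with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, the inputs are the rounded figures quoted in the caption, and the bytes-per-base estimate assumes 1 GB = 10^9 bytes, which the source does not state.

```python
# Rounded figures taken from the Figure 1 caption.
TOTAL_BASES = 25.6e15      # ~25.6 petabase pairs in Public Access
TOTAL_RUNS = 14.8e6        # "over 14.8 million" public runs (a lower bound)
SPOTS_PER_RUN = 9.6e6      # average spots per run
BP_PER_SPOT = 187          # average base pairs per spot
GB_PER_RUN = 0.83          # average stored size per run


def summarize():
    """Derive per-run averages two ways and estimate storage per base."""
    # Total bases divided by number of runs.
    gbp_per_run = TOTAL_BASES / TOTAL_RUNS / 1e9
    # Independent estimate: spots per run times bases per spot.
    gbp_from_spots = SPOTS_PER_RUN * BP_PER_SPOT / 1e9
    # Assumption: 1 GB = 1e9 bytes (the caption does not define GB).
    bytes_per_base = GB_PER_RUN / gbp_per_run
    return {
        "gbp_per_run": gbp_per_run,
        "gbp_from_spots": gbp_from_spots,
        "bytes_per_base": bytes_per_base,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Both routes land near the caption's 1.7 Gbp per run; the small gap between them is what one would expect from rounded inputs and a run count given as a lower bound. The storage estimate works out to roughly half a byte per base under the stated assumption.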
