Nat Biotechnol. 2014 Mar;32(3):246-51.
doi: 10.1038/nbt.2835. Epub 2014 Feb 16.

Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls

Justin M Zook et al. Nat Biotechnol. 2014 Mar.

Abstract

Clinical adoption of human genome sequencing requires methods that output genotypes with known accuracy at millions or billions of positions across a genome. Because of substantial discordance among calls made by existing sequencing methods and algorithms, there is a need for a highly accurate set of genotypes across a genome that can be used as a benchmark. Here we present methods to make high-confidence, single-nucleotide polymorphism (SNP), indel and homozygous reference genotype calls for NA12878, the pilot genome for the Genome in a Bottle Consortium. We minimize bias toward any method by integrating and arbitrating between 14 data sets from five sequencing technologies, seven read mappers and three variant callers. We identify regions for which no confident genotype call could be made, and classify them into different categories based on reasons for uncertainty. Our genotype calls are publicly available on the Genome Comparison and Analytic Testing website to enable real-time benchmarking of any method.
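The abstract does not spell out how a method is scored against the benchmark, so the following is only a minimal sketch of one plausible comparison, not the authors' procedure. It assumes hypothetical in-memory inputs: call sets keyed by (chromosome, position, ref, alt) with a genotype string, and confident regions given as sorted, non-overlapping half-open intervals per chromosome. Calls outside the confident regions are ignored, mirroring the idea that no confident genotype call could be made there. The function names and data layout are illustrative assumptions.

```python
from bisect import bisect_right


def normalize_genotype(gt):
    """Treat '0/1', '1/0' and '1|0' as the same unphased genotype."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def in_confident_region(regions, chrom, pos):
    """regions: dict chrom -> sorted list of non-overlapping (start, end),
    half-open intervals (hypothetical layout, similar to BED coordinates)."""
    intervals = regions.get(chrom, [])
    # Index of the last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def compare_calls(benchmark, test, regions):
    """Count agreement between a test call set and benchmark calls,
    restricted to confident regions.

    benchmark, test: dict (chrom, pos, ref, alt) -> genotype string.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "genotype_mismatch": 0}
    for key, gt in benchmark.items():
        chrom, pos = key[0], key[1]
        if not in_confident_region(regions, chrom, pos):
            continue  # no confident call here, so do not score it
        if key not in test:
            counts["fn"] += 1
        elif normalize_genotype(test[key]) == normalize_genotype(gt):
            counts["tp"] += 1
        else:
            counts["genotype_mismatch"] += 1
    for key in test:
        chrom, pos = key[0], key[1]
        if key not in benchmark and in_confident_region(regions, chrom, pos):
            counts["fp"] += 1
    return counts


# Toy data, invented purely for illustration.
regions = {"chr1": [(100, 200), (300, 400)]}
benchmark = {
    ("chr1", 120, "T", "C"): "1/1",  # missing from test
    ("chr1", 150, "A", "G"): "0/1",  # matches test
    ("chr1", 250, "G", "A"): "0/1",  # outside confident regions
    ("chr1", 350, "C", "T"): "1/1",  # genotype differs in test
}
test = {
    ("chr1", 150, "A", "G"): "1|0",
    ("chr1", 350, "C", "T"): "0/1",
    ("chr1", 360, "T", "C"): "0/1",  # not in benchmark
    ("chr1", 500, "A", "T"): "0/1",  # outside confident regions
}
result = compare_calls(benchmark, test, regions)
```

A real comparison would also need to handle differing representations of the same indel and homozygous reference calls, which this sketch leaves out.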

Comment in

  • Extensive sequencing of seven human genomes to characterize benchmark reference materials.
    Zook JM, Catoe D, McDaniel J, Vang L, Spies N, Sidow A, Weng Z, Liu Y, Mason CE, Alexander N, Henaff E, McIntyre AB, Chandramohan D, Chen F, Jaeger E, Moshrefi A, Pham K, Stedman W, Liang T, Saghbini M, Dzakula Z, Hastie A, Cao H, Deikus G, Schadt E, Sebra R, Bashir A, Truty RM, Chang CC, Gulbahce N, Zhao K, Ghosh S, Hyland F, Fu Y, Chaisson M, Xiao C, Trow J, Sherry ST, Zaranek AW, Ball M, Bobe J, Estep P, Church GM, Marks P, Kyriazopoulou-Panagiotopoulou S, Zheng GX, Schnall-Levin M, Ordonez HS, Mudivarti PA, Giorda K, Sheng Y, Rypdal KB, Salit M. Sci Data. 2016 Jun 7;3:160025. doi: 10.1038/sdata.2016.25. PMID: 27271295.

Publication types