Nucleic Acids Res. 2012 Jan;40(1):e2.

doi: 10.1093/nar/gkr861. Epub 2011 Oct 19.

Ultrasensitive detection of rare mutations using next-generation targeted resequencing

Patrick Flaherty¹, Georges Natsoulis, Omkar Muralidharan, Mark Winters, Jason Buenrostro, John Bell, Sheldon Brown, Mark Holodniy, Nancy Zhang, Hanlee P Ji

PMID: 22013163
PMCID: PMC3245950
DOI: 10.1093/nar/gkr861

Patrick Flaherty et al. Nucleic Acids Res. 2012 Jan.

Affiliation

¹ Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94304, USA.

Abstract

With next-generation DNA sequencing technologies, one can interrogate a specific genomic region of interest at very high depth of coverage and identify less prevalent, rare mutations in heterogeneous clinical samples. However, the mutation detection levels are limited by the error rate of the sequencing technology as well as by the availability of variant-calling algorithms with high statistical power and low false positive rates. We demonstrate that we can robustly detect mutations at 0.1% fractional representation. This represents accurate detection of one mutant per every 1000 wild-type alleles. To achieve this sensitive level of mutation detection, we integrate a high accuracy indexing strategy and reference replication for estimating sequencing error variance. We employ a statistical model to estimate the error rate at each position of the reference and to quantify the fraction of variant base in the sample. Our method is highly specific (99%) and sensitive (100%) when applied to a known 0.1% sample fraction admixture of two synthetic DNA samples to validate our method. As a clinical application of this method, we analyzed nine clinical samples of H1N1 influenza A and detected an oseltamivir (antiviral therapy) resistance mutation in the H1N1 neuraminidase gene at a sample fraction of 0.18%.

Figures

**Figure 1.**
Method flowchart. The method for detecting rare variants compares the baseline error rate from multiple reference replicates to the sample error rate at each position. Sample and reference DNA are independently prepared and tagged with indexed adapters. The reference and sample libraries are pooled and sequenced on the same lane. The reads are aligned and preprocessed to filter out strand-specific errors. The parameters of a Beta-Binomial model are fit to the reference sequence data to obtain a null hypothesis error rate distribution for each position. Finally, the error rate of the sample sequencing data is compared to the null distribution to call rare variants.
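
The comparison described in this caption can be illustrated with a minimal sketch. It assumes a Beta-Binomial null parameterized by a mean error rate `mu` and a precision `M = a + b`, fitted by a simple method of moments. The abstract does not specify the authors' estimation procedure, so the fitting step, the function names, and all counts below are illustrative assumptions rather than the published method.

```python
import math


def betabinom_logpmf(k, n, a, b):
    """Log-probability of k error reads out of n under Beta-Binomial(n, a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_num = math.lgamma(k + a) + math.lgamma(n - k + b) - math.lgamma(n + a + b)
    log_den = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return log_choose + log_num - log_den


def fit_reference(error_counts, depths, max_precision=1e9):
    """Method-of-moments estimate of (mu, M) from reference replicates.

    Assumption: replicates have similar depth, so the mean depth stands in for n.
    """
    rates = [k / n for k, n in zip(error_counts, depths)]
    n_bar = sum(depths) / len(depths)
    mu = sum(rates) / len(rates)
    if mu <= 0:
        raise ValueError("no reference errors observed at this position")
    var = sum((r - mu) ** 2 for r in rates) / (len(rates) - 1)
    binom_var = mu * (1 - mu)
    # Var(K/n) = mu(1-mu)(M+n) / (n(M+1)); solve for M when overdispersed.
    if var * n_bar <= binom_var:
        return mu, max_precision  # no detectable overdispersion
    precision = n_bar * (binom_var - var) / (var * n_bar - binom_var)
    if precision <= 0:
        raise ValueError("replicate variance too large for this estimator")
    return mu, min(precision, max_precision)


def upper_tail_pvalue(k_obs, n, mu, precision):
    """P(K >= k_obs) under the reference-derived null distribution."""
    a, b = mu * precision, (1 - mu) * precision
    lower = sum(math.exp(betabinom_logpmf(k, n, a, b)) for k in range(k_obs))
    return max(0.0, 1.0 - lower)


# Hypothetical error-read counts at one position in six reference replicates.
ref_errors = [52, 48, 61, 40, 55, 45]
ref_depths = [100_000] * 6
mu, precision = fit_reference(ref_errors, ref_depths)

# Hypothetical sample counts at the same position and depth.
p_variant = upper_tail_pvalue(150, 100_000, mu, precision)  # elevated error rate
p_clean = upper_tail_pvalue(50, 100_000, mu, precision)     # baseline error rate
print(f"mu={mu:.2e}  M={precision:.3g}  p_variant={p_variant:.2e}  p_clean={p_clean:.2f}")
```

A position would be called as a variant when its p-value falls below a chosen threshold. Because the tail is computed as one minus a partial sum, very small p-values are only resolved to roughly floating-point precision.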

**Figure 2.**
Position-specific error rate distribution. The average sequence error rate variance across positions is significantly greater than the average variability at each position. The across-position distribution is shown on the right side in dark blue and a sample of five within-position density estimates is shown below it. The empirical within-position and across-position distribution estimates show that a small number of outlying positions contribute to the excessive variance in the across-position distribution.

**Figure 3.**
Variant positions in the 0.1% mixture sample of synthetic DNA are identified by the statistical model. The x-axis is the reference error rate as estimated by the model and the y-axis is the sample error rate (error read depth/total read depth). True negatives (black), true positives (blue) and false positives (red) for three replicates are identified in both samples. For each of the three replicates, the model finds 14 of 14 true positives; 5, 4 and 1 additional calls (false positives), respectively, are made. Requiring a consensus call of all three replicates eliminates these false positives.
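
The consensus requirement in this caption amounts to intersecting the per-replicate call sets. The sketch below assumes each replicate's calls are available as a collection of positions; the function name and the position values are hypothetical and are not taken from the paper's data.

```python
def consensus_calls(replicate_calls):
    """Return the positions called as variant in every replicate."""
    call_sets = [set(calls) for calls in replicate_calls]
    if not call_sets:
        return set()
    return set.intersection(*call_sets)


# Hypothetical positions: three shared calls plus replicate-specific extras.
shared = {105, 230, 412}
replicates = [
    shared | {17, 88},   # replicate 1 with two extra calls
    shared | {301},      # replicate 2 with one extra call
    shared | {17},       # replicate 3 with one extra call
]
consensus = consensus_calls(replicates)
print(sorted(consensus))
```

An extra call survives the intersection only if it recurs in every replicate, which is why independent false positives tend to drop out.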

**Figure 4.**
Detection power depends on both read depth and experimental precision. The statistical power of the model, that is, the probability of detecting a true positive at a given effect size (level of prevalence), increases with read depth and sample preparation precision, up to asymptotic limits. (a) With read depth (n) held constant at an example level of 10 000, power increases with experimental precision up to a limit of approximately 0.4 for an effect size of 0.1%. (b) With experimental precision held constant at 10 000, power increases with read depth (n) up to a limit of approximately 0.4 for an effect size of 0.1%. (c) For a fixed false positive and false negative rate, the detectable effect size decreases with both increasing sample preparation precision and read depth (n). A greater gain is achieved by improving sample preparation precision than by increasing read depth if the experimental variation is large. (d) The ROC curve for a fixed effect size and sample preparation precision improves rapidly as the read depth increases. When read depth is low, it limits the sensitivity at all false positive rates; when read depth is high, the ROC curve approaches an asymptotic curve controlled by the experimental variation.
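
The saturation of power with read depth can be explored numerically. The sketch below assumes a Beta-Binomial model in which the null has mean error rate `mu0` and the alternative has mean `mu0 + effect`, both with the same precision. The baseline error rate of 0.1%, the false positive rate `alpha`, and the function names are illustrative assumptions, so the numbers it prints are not expected to reproduce the figure.

```python
import math


def betabinom_logpmf(k, n, a, b):
    """Log-probability of k error reads out of n under Beta-Binomial(n, a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_num = math.lgamma(k + a) + math.lgamma(n - k + b) - math.lgamma(n + a + b)
    log_den = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return log_choose + log_num - log_den


def critical_count(n, mu, precision, alpha):
    """Smallest count k such that P(K >= k) <= alpha under the null."""
    a, b = mu * precision, (1 - mu) * precision
    cdf = 0.0  # holds P(K < k) at the top of each iteration
    for k in range(n + 1):
        if 1.0 - cdf <= alpha:
            return k
        cdf += math.exp(betabinom_logpmf(k, n, a, b))
    return n + 1


def power(n, mu0, effect, precision, alpha=0.01):
    """Probability that a sample with error rate mu0 + effect reaches the critical count."""
    k_crit = critical_count(n, mu0, precision, alpha)
    mu1 = mu0 + effect
    a, b = mu1 * precision, (1 - mu1) * precision
    lower = sum(math.exp(betabinom_logpmf(k, n, a, b)) for k in range(k_crit))
    return max(0.0, 1.0 - lower)


# Assumed settings: baseline error 0.1%, effect size 0.1%, precision 10 000.
for depth in (1_000, 100_000, 1_000_000):
    print(depth, round(power(depth, mu0=0.001, effect=0.001, precision=10_000), 3))
```

With the precision held fixed, the variation between library preparations does not shrink as depth grows, so additional reads eventually stop improving power in this model.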

**Figure 5.**
Sequencing results of clinical samples of H1N1 influenza A. (a) A red dot indicates a position that is called as a mutant and has a sample fraction >0.1%; green dots indicate an estimated sample fraction >1%. (b) A detail display of 10 positions in sample BN3 shows the difference between the reference and sample sequencing error rates for called mutations in two replicate lanes. The non-reference base composition for both lanes (in sequence logo format) shows that the three mutations are T to C pyrimidine transitions. (c) We identified the H275Y mutation responsible for oseltamivir resistance in one clinical sample (BN9). Across all of the H1N1 clinical samples, we display a breakdown of the individual sequencing error rate for the non-reference bases at codon position 1. The mutation in sample BN9 is readily apparent. The dotted line indicates the expected base error rate from a uniform distribution across bases using the total sequencing error rate.

Cited by

Implications of genetic heterogeneity in cancer.
Schmitt MW, Prindle MJ, Loeb LA. Ann N Y Acad Sci. 2012 Sep;1267:110-6. doi: 10.1111/j.1749-6632.2012.06590.x. PMID: 22954224. Free PMC article.
RVD2: an ultra-sensitive variant detection model for low-depth heterogeneous next-generation sequencing data.
He Y, Zhang F, Flaherty P. Bioinformatics. 2015 Sep 1;31(17):2785-93. doi: 10.1093/bioinformatics/btv275. Epub 2015 Apr 29. PMID: 25931517. Free PMC article.
A new approach for detecting low-level mutations in next-generation sequence data.
Li M, Stoneking M. Genome Biol. 2012 May 23;13(5):R34. doi: 10.1186/gb-2012-13-5-r34. PMID: 22621726. Free PMC article.
Limited Practical Utility of Liquid Biopsy in the Treated Patients with Advanced Breast Cancer.
Niwinska A, Bałabas A, Kulecka M, Kluska A, Piątkowska M, Paziewska A, Pyśniak K, Olszewski W, Mikula M, Ostrowski J. Diagnostics (Basel). 2020 Jul 28;10(8):523. doi: 10.3390/diagnostics10080523. PMID: 32731384. Free PMC article.
Accuracy of Next Generation Sequencing Platforms.
Fox EJ, Reid-Bayliss KS, Emond MJ, Loeb LA. Next Gener Seq Appl. 2014;1:1000106. doi: 10.4172/jngsa.1000106. PMID: 25699289. Free PMC article.

Grants and funding

R01 HG006137/HG/NHGRI NIH HHS/United States
