Int J Mol Sci. 2022 Feb 17;23(4):2244.
doi: 10.3390/ijms23042244.

The Value of Whole-Genome Sequencing for Mitochondrial DNA Population Studies: Strategies and Criteria for Extracting High-Quality Mitogenome Haplotypes

Kimberly Sturk-Andreaggi et al. Int J Mol Sci. 2022.

Abstract

Whole-genome sequencing (WGS) data present a readily available resource for mitochondrial genome (mitogenome) haplotypes that can be utilized for genetics research, including population studies. However, the reconstruction of the mitogenome is complicated by nuclear mitochondrial DNA (mtDNA) segments (NUMTs) that co-align with the mtDNA sequences and mimic authentic heteroplasmy. Two minimum variant detection thresholds, 5% and 10%, were assessed for the ability to produce authentic mitogenome haplotypes from a previously generated WGS dataset. Variants associated with NUMTs were detected in the mtDNA alignments for 91 of 917 (~10%) Swedish samples when the 5% frequency threshold was applied. The 413 observed NUMT variants were predominantly detected in two regions (nps 12,612-13,105 and 16,390-16,527), which were consistent with previously documented NUMTs. The number of NUMT variants was reduced by ~97% (400) using a 10% frequency threshold. Furthermore, the 5% frequency data were inconsistent with a platinum-quality mitogenome dataset with respect to observed heteroplasmy. These analyses illustrate that a 10% variant detection threshold may be necessary to ensure the generation of reliable mitogenome haplotypes from WGS data resources.
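
As a rough illustration of what a minimum variant detection threshold does, the sketch below filters candidate mixed positions by minor nucleotide frequency at 5% and at 10%, and recomputes the two proportions quoted in the abstract from their raw counts. The `MixedPosition` record, the `call_mixed_positions` helper, and the four example candidates are hypothetical and are not taken from the study or its pipeline; only the counts 91, 917, 400, and 413 come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MixedPosition:
    """One candidate mixed position in a sample's mtDNA alignment (hypothetical record)."""
    sample_id: str
    position: int           # nucleotide position (np) on the mtDNA reference
    minor_frequency: float  # minor nucleotide frequency, 0.0 to 1.0


def call_mixed_positions(candidates, threshold):
    """Keep only candidates whose minor nucleotide frequency meets the threshold."""
    return [c for c in candidates if c.minor_frequency >= threshold]


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Made-up candidates for illustration only; these are NOT data from the study.
candidates = [
    MixedPosition("S1", 12705, 0.06),
    MixedPosition("S1", 16519, 0.08),
    MixedPosition("S2", 152, 0.35),
    MixedPosition("S3", 13095, 0.12),
]

at_5 = call_mixed_positions(candidates, 0.05)   # all four candidates pass
at_10 = call_mixed_positions(candidates, 0.10)  # only the 35% and 12% candidates pass

# Proportions recomputed from the counts reported in the abstract.
samples_with_numts = percent(91, 917)  # samples with NUMT variants at the 5% threshold
numt_reduction = percent(400, 413)     # NUMT variants removed by the 10% threshold

print(len(at_5), len(at_10))
print(round(samples_with_numts, 1), round(numt_reduction, 1))
```

Raising the threshold drops low-frequency candidates regardless of their origin, so in this toy setup it cannot tell a NUMT variant from a genuine low-level heteroplasmy; that classification would need the separate assessments described in the paper.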

Keywords: NUMTs; heteroplasmy; massively parallel sequencing; mitochondrial DNA; next-generation sequencing; nuclear elements of mtDNA; whole-genome sequencing.

Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure 1
Scatterplots of the correlation between (a) total reads (billions) and (b) the proportion of mapped reads that mapped to the mtDNA reference genome and the average mtDNA read depth. Samples are plotted based on the category and subclassification: green = complete with less than 1.15 billion total reads (n = 785), dark green = complete with more than 1.15 billion total reads (n = 73), light green = nearly complete (n = 59), gray = incomplete (n = 17), yellow = mixed (n = 7), and blue = related (n = 1).
Figure 2
Classification of mixed positions that were observed in the mitochondrial genome (mitogenome) haplotypes from the SweGen whole-genome sequencing data. Mixed positions (white) were identified during initial variant detection using a 5% minimum minor nucleotide frequency threshold (light gray). The 833 mixed positions detected at this threshold were classified as either nuclear mtDNA segment (NUMT) variants or point heteroplasmies (PHPs) during multiple assessments (gray gradient). The 413 NUMT variants are shown in the inner plot (orange; scale 0–60 observations) and the 420 PHPs are displayed in the outer plot (green; scale 0–35 observations).
Figure 3
Distribution of mixed positions across the circular mitogenome, including the hypervariable segments 1 and 2 (HVS1 and HVS2, respectively; dark gray) of the mtDNA control region and the entire mtDNA coding region (light gray). These 833 mixed positions were detected with a 5% minor nucleotide frequency threshold and classified as either a NUMT variant or PHP. The 413 NUMT variants are shown in the inner plot (orange; scale 0–60 observations) and the 420 PHPs are displayed in the outer plot (green; scale 0–35 observations).
Figure 4
Distribution of the frequency (>5%) of the minor nucleotide for the 420 PHPs (green) included in the SweGen haplotypes and the 413 NUMT variants (orange) identified in the mitogenomes. The average for each classification is shown as an “×” and outliers are shown as single data points.
Figure 5
Distribution of average read depth based on the detection of NUMT interference at either a 2% frequency threshold or 5% frequency threshold. The sample count per NUMT variant observation category is noted in parentheses for each category. The average for each classification is shown as an “×” and outliers are shown as single data points.
