[Preprint]. 2023 Jan 19:2023.01.19.23284696.
doi: 10.1101/2023.01.19.23284696.

Nuclear genetic control of mtDNA copy number and heteroplasmy in humans

Rahul Gupta et al. medRxiv.

Abstract

Human mitochondria contain a high copy number, maternally transmitted genome (mtDNA) that encodes 13 proteins required for oxidative phosphorylation. Heteroplasmy arises when multiple mtDNA variants co-exist in an individual and can exhibit complex dynamics in disease and in aging. As all proteins involved in mtDNA replication and maintenance are nuclear-encoded, heteroplasmy levels can, in principle, be under nuclear genetic control; however, this has never been shown in humans. Here, we develop algorithms to quantify mtDNA copy number (mtCN) and heteroplasmy levels using blood-derived whole genome sequences from 274,832 individuals of diverse ancestry and perform GWAS to identify nuclear loci controlling these traits. After careful correction for blood cell composition, we observe that mtCN declines linearly with age and is associated with 92 independent nuclear genetic loci. We find that nearly every individual carries heteroplasmic variants that obey two key patterns: (1) heteroplasmic single nucleotide variants are somatic mutations that accumulate sharply after age 70, while (2) heteroplasmic indels are maternally transmitted as mtDNA mixtures with resulting levels influenced by 42 independent nuclear loci involved in mtDNA replication, maintenance, and novel pathways. These nuclear loci do not appear to act by mtDNA mutagenesis but rather likely act by conferring a replicative advantage to specific mtDNA molecules. As an illustrative example, the most common heteroplasmy we identify is a length variant carried by >50% of humans at position m.302 within a G-quadruplex known to serve as a replication switch. We find that this heteroplasmic variant exerts cis-acting genetic control over mtDNA abundance and is itself under trans-acting genetic control of nuclear loci encoding protein components of this regulatory switch. Our study showcases how nuclear haplotype can privilege the replication of specific mtDNA molecules to shape mtCN and heteroplasmy dynamics in the human population.
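
The abstract does not spell out the quantification algorithms. As a rough illustration only, the sketch below uses two commonly used read-based estimators, which are assumptions here and not necessarily the authors' method: mtCN as the ploidy-scaled ratio of mean mtDNA coverage to mean autosomal coverage, and heteroplasmy as the fraction of reads at a site that support the alternate allele. The function and parameter names are hypothetical.

```python
def mtcn_from_coverage(mt_cov: float, autosomal_cov: float, ploidy: int = 2) -> float:
    """Estimate mtDNA copies per cell from sequencing depth.

    Assumption: autosomes are present at `ploidy` copies per cell, so the
    coverage ratio scaled by ploidy approximates mtDNA copies per cell.
    """
    if autosomal_cov <= 0:
        raise ValueError("autosomal coverage must be positive")
    return ploidy * mt_cov / autosomal_cov


def heteroplasmy_level(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at one mtDNA site that carry the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total read count must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


if __name__ == "__main__":
    # Hypothetical depths: 3000x over mtDNA, 30x over the autosomes.
    print(mtcn_from_coverage(3000.0, 30.0))
    # Hypothetical site: 25 of 100 reads support the alternate allele.
    print(heteroplasmy_level(25, 100))
```

A raw estimate of this kind would still need the blood cell composition, technical, and demographic corrections that the abstract describes before it is comparable across individuals.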

Conflict of interest statement

COMPETING INTERESTS

VKM is a paid advisor to 5am Ventures and Janssen Pharmaceuticals. BMN is a member of the scientific advisory board at Deep Genomics and Neumora, consultant of the scientific advisory board for Camp4 Therapeutics and consultant for Merck. KJK is a consultant for Vor Biopharma.

Figures

Figure 1. Genetic and phenotypic determinants of mtDNA copy number in UK Biobank.
A. Variance explained in mtCN by blood composition, technical, and demographic correction models. Relationship of B. mtCNraw and C. mtCNcorr with age and genetic sex. D. GWAS Manhattan plot from cross-ancestry meta-analysis in UKB. Labeled genes were obtained either via fine-mapping or, if a credible set (CS) could not be constructed, by mapping to the nearest gene. Red genes are mitochondrial or are implicated in mtDNA disease; † = CS variants proximal to the gene with posterior probability of inclusion (PIP) > 0.1; ‡ = CS variants with PIP > 0.9; "c" = coding variant in the CS; underline = eQTL colocalization PIP > 0.1. Asterisks above peaks on chromosomes 19 and 21 correspond to GP6 and RUNX1, respectively. E. Table of variants in the 95% CS with PIP > 0.1 causing a protein-altering change. Red indicates mitochondria-relevant. F. Standardized odds ratios for log mtCNraw, log mtCNcorr, and major blood composition phenotypes in predicting risk of selected common diseases in UKB. Inset numbers are p-values; error bars are 95% CI. HTN = hypertension; MI = myocardial infarction; T2D = type 2 diabetes. Correlation of effect sizes for genome-wide significant neutrophil count lead SNPs between neutrophil count and G. mtCNraw and H. mtCNcorr. Error bars represent 1 SE; the dotted line is the weighted least squares regression line; the inset shows the regression p-value.
Figure 2. Nuclear genetic control of relative mtDNA coverage in the non-coding region.
A. Mean per-base coverage across the mtDNA in UKB. Zoomed dropdown highlights coverage depression in the mtDNA non-coding region. Arrows correspond to stages of replication: red dashed arrow = RNA primer; black dashed arrow = transient DNA "primer" flap; black solid arrow = retained replicated DNA. Grey ribbon is +/− 1 standard deviation. CSB = conserved sequence box. B. GWAS Manhattan plot of the residual of the regression of mtDNA median DNA primer coverage on median RNA primer coverage. C. GWAS Manhattan plot of the residual of the regression of mtDNA median DNA primer coverage on median 7S DNA region coverage. Insets for B and C show 2D histograms of the correlation between the respective quantities across all individuals in UKB. Red genes are mitochondrial or are implicated in mtDNA disease; † corresponds to CS variants proximal to the gene with posterior probability of inclusion (PIP) > 0.1; ‡ corresponds to CS variants with PIP > 0.9; "c" corresponds to a missense variant in the CS; underline corresponds to eQTL colocalization PIP > 0.1. D. Structure of MGME1 (5ZYV) shown with bound ssDNA in dark blue, the 3₁₀ helix in pink, and the T265 alpha carbon as a red sphere. Inset shows the hydrogen bond between T265 and I262.
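
The GWAS phenotypes in panels B and C are described as residuals from regressing one coverage measure on another. The sketch below shows that general idea with an ordinary least squares fit of a single predictor plus intercept. It is an illustration under assumptions, not the authors' pipeline: the function name and the toy coverage values are hypothetical, and any covariates or transformations used in the actual analysis are not represented.

```python
from typing import List, Sequence


def regression_residuals(y: Sequence[float], x: Sequence[float]) -> List[float]:
    """Residuals of y after an ordinary least squares fit on x (with intercept).

    Here y could stand for per-individual median DNA primer coverage and x for
    median RNA primer coverage; the residual is the part of y not explained by x.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


if __name__ == "__main__":
    # Hypothetical per-individual median coverages.
    rna_primer_cov = [1.0, 2.0, 3.0]
    dna_primer_cov = [1.0, 3.0, 2.0]
    print(regression_residuals(dna_primer_cov, rna_primer_cov))
```

Using the residual as the trait targets variation in DNA primer coverage that is not simply tracking the other region's coverage.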
Figure 3. Evidence of intermediate phenotypes among carriers of the MELAS variant in UKB.
Table shows carrier frequencies for 10 known pathogenic mutations in UKB, including chrM:3243:A,G (pathogenic for MELAS), with heteroplasmy distributions plotted as jittered points. Panels show mean Hemoglobin A1c, triglyceride levels, auditory threshold (via speech recognition threshold test), and visual impairment (via vision test measured as logMAR) among mtDNA pathogenic variant carriers. Only points corresponding to more than 10 measurements are shown. Vertical lines represent per-trait means among individuals with none of the 10 pathogenic mutations detected.
Figure 4. Pervasive nuclear genetic control over the most common mitochondrial DNA heteroplasmies.
A. mtDNA heteroplasmies passing QC in UKB and AoU. Data tracks show, starting from the inside: positions of poly-C tracts; mtDNA genomic annotations (orange = HVR; yellow = rRNA genes; blue = tRNA genes; purple = coding genes); counts of heteroplasmic SNVs (red); counts of heteroplasmic indels (black). Teal arc corresponds to region highlighted in Figure 5. Light line in outermost track is a reference line at 100. B. Selected heteroplasmy distributions across UKB and AoU in individuals carrying the allele. C. Mean count of heteroplasmies per individual across age groups in AoU. Error bars are 1 SE. D. Relationship between heteroplasmy levels in mother-offspring (left), father-offspring (middle), and sibling-sibling (right) pairs for all heteroplasmies found in >5 individuals. E. GWAS lead SNPs from all common heteroplasmies with genome-wide significant signals. Point size corresponds to lead SNP p-value; dark points are genome-wide significant. Vertical lines correspond to SNPs near genes of interest and/or loci found across multiple mtDNA variants. Green corresponds to genes nominated for mtCN; † = CS variants with PIP > 0.1; ‡ = CS variants with PIP > 0.9; "c" = coding variant in CS; underline = eQTL colocalization PIP > 0.1. F. mtDNA dynamics pathway showing genes highlighted in heteroplasmy GWAS. G. chrM:16183:AC,A heteroplasmy as a function of lead SNP genotype in DGUOK. H. Structure of DGUOK (2OCP) with amino acid Q170 in red and nearby residues participating in hydrogen bonds or stacking interactions in pink. dATP shown as black sticks. I. chrM:16183:A,AC heteroplasmy as a function of lead SNP genotype in POLG2. J. Structure of polymerase gamma enzyme (4ZTU) with POLG in light blue and POLG2 subunits in green and yellow. Bound DNA is in dark blue and the POLG2 residue G416 is shown as red spheres. In panels G and I, red lines correspond to medians.
Figure 5. chrM:302 length heteroplasmies are inherited maternally as mixtures, co-exist in single cells, and are under the influence of the nuclear genome.
A. Scheme showing chrM:302 region inside CSBII responsible for forming a G-quadruplex structure along with length heteroplasmy GmAGn nomenclature. B. Sibling-sibling transmission of length heteroplasmies at chrM:302. C. Length heteroplasmy composition across all UKB individuals. D. Length heteroplasmy composition in UKB in select mtDNA haplogroups. E. Length heteroplasmy composition across 171 single cells in whole blood. Each vertical bar corresponds to a single individual (C, D) or cell (E). Colors for panels B-E correspond to legend between panels B and D. F. Effect of length of major allele at chrM:302 (red line) and TFAM fine-mapped variant (black dot) on mtCN. Error bars are 1 SE. G. Case-only mtDNA heteroplasmy GWAS Manhattan plot for chrM:302:A,AC. Red genes are mitochondrial or are implicated in mtDNA disease; † corresponds to CS variants proximal to the gene with PIP > 0.1; "c" corresponds to coding variant in CS; underline corresponds to eQTL colocalization PIP > 0.1. H. chrM:302 length heteroplasmies as a function of highest PIP SNP genotype in SSBP1 locus. Red line corresponds to per-nuclear-genotype median heteroplasmy.
