Nat Commun. 2021 Sep 2;12(1):5241.
doi: 10.1038/s41467-021-25482-x.

Cell reprogramming shapes the mitochondrial DNA landscape

Wei Wei et al. Nat Commun.

Abstract

Individual induced pluripotent stem cells (iPSCs) show considerable phenotypic heterogeneity, but the reasons for this are not fully understood. Comprehensively analysing the mitochondrial genome (mtDNA) in 146 iPSC and fibroblast lines from 151 donors, we show that most age-related fibroblast mtDNA mutations are lost during reprogramming. However, iPSC-specific mutations are seen in 76.6% (108/141) of iPSC lines at a mutation rate of 8.62 × 10⁻⁵/base pair. The mutations observed in iPSC lines affect a higher proportion of mtDNA molecules, favouring non-synonymous protein-coding and tRNA variants, including known disease-causing mutations. Analysing 11,538 single cells shows stable heteroplasmy in sub-clones derived from the original donor during differentiation, with mtDNA variants influencing the expression of key genes involved in mitochondrial metabolism and epidermal cell differentiation. Thus, the dynamic mtDNA landscape contributes to the heterogeneity of human iPSCs and should be considered when using reprogrammed cells experimentally or as a therapy.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. mtDNA heteroplasmic variants detected from whole-genome sequences.
a Summary of bulk whole-genome (WGS), bulk RNA and single-cell RNA (scRNA) sequencing data analysed in this study. Colour represents the heterogeneity of cell populations under WGS and bulk RNA-seq, and three different cell stages under scRNAseq, where mesendo = mesoderm, and defendo = definitive endoderm. b Heteroplasmic variants detected from WGS. Circos plot from outside the circle to inside: (1) mtDNA position; (2) heteroplasmic variants identified in 141 iPSC lines; (3) minor allele frequency of common variants (MAF > 1%) in European population; mtDNA genes (purple— D-loop, red—coding region, yellow—rRNAs and grey—tRNAs); (4) heteroplasmic variants identified in fibroblast cell lines; (5) blue lines pointing to the positions of variants specific in fibroblast cell lines; (6) orange lines pointing to the positions of pathogenic mutations; (2) & (4) vertical axes represent the HFs; (3) vertical axes represent the frequencies of single-nucleotide substitutions. c High resolution of mtDNA D-loop region. Plots from top to bottom: (1) mtDNA position; (2) decrease shift variants; (3) D-loop regions; (4) increase shift variants; (2) & (4) vertical axes represent heteroplasmic shifts.
Fig. 2. Characteristics of mtDNA heteroplasmic variants detected through the bulk analysis of 146 fibroblast cell lines and 141 derived iPSC lines.
a Distribution of the mean number of heteroplasmies defined in fibroblast and iPS cell lines. b Correlation of the mean number of heteroplasmies per fibroblast cell line with the donor’s age. Shaded regions show mean ± standard deviation (s.d.). c Correlation of the mean number of heteroplasmies per fibroblast cell line in each mtDNA region with the donor’s age. Shaded regions show mean ± s.d. d Distribution of the average heteroplasmy fraction (HF) in fibroblast and iPS cell lines. e Correlation of the average HF per fibroblast cell line with the donor’s age. Shaded regions show mean ± s.d. f Correlation of the average HF per fibroblast cell line in each mtDNA region with the donor’s age. Shaded regions show mean ± s.d. g Heteroplasmies defined in fibroblast (top) and iPS cell lines (bottom). HFs are shown on the left y axis. mtDNA regions are shown in different colours. The depth of the shading represents the mutation rate of each mtDNA region (shown on the right y axis). Three fibroblast-specific mutations are highlighted in red rectangles. Regions significantly more enriched for mutations than expected by chance are labelled with asterisks. h Frequency of specific mutations in fibroblast and iPS cell lines. i The fibroblast-specific mutation 414G was associated with the donor’s age. j Distribution of heteroplasmies defined in fibroblast and iPS cell lines. k Distribution of heteroplasmies defined in fibroblast and iPS cell lines; two iPSC lines derived from the same fibroblast cell line are shown separately. l Ratio of non-synonymous/synonymous variants (NS/SS) observed in fibroblasts, iPSCs and iPSC-specific variants. m Distribution of the mtDNA copy number in fibroblast and iPS cell lines. n Distribution of the heteroplasmy fraction in fibroblast and iPS cell lines. Red lines show the mean HFs within each dataset. a, d P values were calculated using a two-sided Wilcoxon test. b, c, f P values were calculated using a linear regression model. g P values were calculated using a two-sided Fisher’s exact test. Source data are provided as a Source Data file.
Fig. 3. The spectrum of mtDNA mutations changes during cell reprogramming.
a Normalised heteroplasmic shifts (HS) estimated between 83 fibroblast cell lines and 141 derived iPSC lines. Normalised heteroplasmic shifts are shown on the left side of the y axis. mtDNA regions are shown in different colours; the depth of shading represents the mutation rate of iPSC-specific variants and lost variants within each mtDNA region (shown on the right side of the y axis). Variants above the red lines were extreme shifts seen in iPSCs. Source data are provided as a Source Data file. b Distributions of the HS in different mtDNA regions. c Cumulative distributions of the HSs within each mtDNA region. P values were calculated using two-sided Fisher’s exact test. d Pathogenic mutations observed in this study. Different colours represent lost, iPSC-specific or shared mutations between fibroblast cell lines and derived iPSC lines. Blue triangles are HFs in fibroblasts and orange triangles are HFs in iPSCs. mtDNA regions are shown at the bottom. e Trinucleotide mutational signature of heteroplasmic variants observed in fibroblasts, iPSCs, lost in fibroblasts, iPSC-specific and shared variants between fibroblasts and iPSCs. The bars represent the substitution rate; mutations from the H or L-strand are shown in different colours. P values were calculated using two-sided Fisher’s exact test. f Mutational signature of six categories (C > A, C > G, C > T, T > A, T > C & T > G) within each mtDNA region. The bars represent the relative frequency of each category. Mutations from the H or L-strand are shown in different colours. g Correlation between the mutational signatures observed in this study and the cancer signatures. The gradients of circles correspond to correlation R² values. The sizes of circles correspond to the P values (larger circles have smaller P values). The names of cancer signatures are shown at the top.
Fig. 4. Overview of mtDNA variants detected by single-cell RNA sequencing.
a From top to bottom: mtDNA variants detected from single-cell RNA sequencing, with cells defined as iPSC, mesendo and defendo stages shown separately. Each dot represents the mean HF of each variant per cell line, and error bars show the 95% confidence interval; mtDNA variants observed in their fibroblast cells from bulk whole-genome sequencing (purple plot); mtDNA sequencing depth from single-cell RNA sequencing. Mean depth ± standard deviation (s.d.) is shown in the shaded area. Source data are provided as a Source Data file. b Overview of the variants detected in the single cells defined at iPSC, mesendo and defendo stages. The text labelled on mesendo stage also applies to the iPSC and defendo cell stages. c Distribution of the proportion of cells carrying the same variant from each cell line. The y axis shows the percentage of mtDNA variants. The majority of the variants were shared by a small proportion of cells from the same cell line. Cells defined as iPSCs, mesendo and defendo cells are shown in different colours. d Distribution of mean heteroplasmy fraction of each variant from each cell line. The y axis shows the percentage of mtDNA variants. The majority of the variants were low-level heteroplasmic variants (mean HF = sum HF / the number of cells carrying the same mutation). Cells defined as iPSCs, mesendo and defendo cells are shown in different colours. e Distribution of heteroplasmy fractions of pseudo-bulk variants from each cell line. The y axis shows the percentage of mtDNA variants. The majority of the variants were low-level heteroplasmic variants at the pseudo-bulk heteroplasmy level (HF = sum HF / the total number of cells within each cell line). Cells defined as iPSCs, mesendo and defendo cells are shown in different colours.
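The two heteroplasmy summaries defined in panels d and e differ only in their denominator, which the following minimal sketch illustrates. It is not the authors' code: the function names and the example values are hypothetical, and it assumes the per-cell HFs of the carrier cells are already available as a list.

```python
def mean_hf(carrier_hfs):
    """Mean HF as in panel d: sum of HFs / number of cells carrying the variant."""
    return sum(carrier_hfs) / len(carrier_hfs)


def pseudo_bulk_hf(carrier_hfs, total_cells):
    """Pseudo-bulk HF as in panel e: sum of HFs / total number of cells in the line."""
    return sum(carrier_hfs) / total_cells


# Hypothetical example: 3 of 12 cells in a cell line carry the variant.
carrier_hfs = [0.5, 0.25, 0.75]
total_cells = 12

print(mean_hf(carrier_hfs))                      # 0.5
print(pseudo_bulk_hf(carrier_hfs, total_cells))  # 0.125
```

In this made-up case the variant is at moderate heteroplasmy in the cells that carry it, yet appears low-level in the pseudo-bulk view because most cells do not carry it.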
Fig. 5. Characteristics of mtDNA heteroplasmic variants detected by single-cell RNA sequencing.
a Illustration of separate models explaining the variants with the same pseudo-bulk heteroplasmy level. A variant with 50% HF could be due to 100% of cells carrying ~50% HF heteroplasmic variants (left). Alternatively, a variant with 50% HF could be due to 50% of cells carrying homoplasmic variants in the population (right two graphs). In the middle, the variant was passed from the same ancestral cell. On the right, the variant mutated independently in a large proportion of cells. b The proportion of cells carrying the same variant from each cell line was grouped into different bins. Line plots show the distributions of the mutational rates estimated within each bin (where mutational rate = number of mutations within each bin from the same cell line divided by the number of cells from each cell line multiplied by 16569 (bp)). The mutational frequency profiles were consistent between the three cell stages. Cells defined as iPSCs, mesendo and defendo cells are shown in different colours. c Scatter plots of the log2 percentage of cells carrying the same variant from each cell line between any two of three cell stages. The HF for each individual mutation was highly correlated across all three cell stages. Source data are provided as a Source Data file. d Cumulative distribution of the heteroplasmic variants detected in each cell type. Variants shared with their matched fibroblast cell lines are shown in the upper left side, and iPSC/mesendo/defendo-specific variants are shown in the lower right side. e Violin and box plots show the percentage of the variance for gene expression explained by mtDNA variants from two independent cell lines. Cells defined as iPSCs, mesendo and defendo cells are shown in different colours.
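As a rough illustration of panels a and b, the sketch below shows how two different cell populations can give the same 50% pseudo-bulk HF, and how the per-bin mutational rate could be computed. It is an assumption-laden reading of the caption rather than the authors' implementation: the function names and input values are hypothetical, and the rate formula is interpreted as mutations / (cells × 16569 bp).

```python
MTDNA_LENGTH_BP = 16569  # mtDNA length in base pairs, as given in the caption


def population_hf(per_cell_hfs):
    """Pseudo-bulk HF over all cells (cells without the variant have HF 0)."""
    return sum(per_cell_hfs) / len(per_cell_hfs)


def mutational_rate(n_mutations_in_bin, n_cells):
    """Assumed reading of panel b: mutations in a bin / (cells x mtDNA length)."""
    return n_mutations_in_bin / (n_cells * MTDNA_LENGTH_BP)


# Panel a, left: every cell carries the variant at ~50% HF.
all_cells_heteroplasmic = [0.5] * 10
# Panel a, right: half of the cells are homoplasmic, the rest lack the variant.
half_cells_homoplasmic = [1.0] * 5 + [0.0] * 5

print(population_hf(all_cells_heteroplasmic))  # 0.5
print(population_hf(half_cells_homoplasmic))   # 0.5

# Hypothetical bin: 5 mutations observed across 1000 cells of one cell line.
print(mutational_rate(5, 1000))
```

Both populations yield the same pseudo-bulk value, which is why the per-cell distribution is needed to distinguish the models.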
Fig. 6. Lineage tracing using mtDNA variants reveals multiple sub-clones within each cell line.
a UMAP plot of mitochondrial mutation profiles, based on 11 cell lines with at least 300 cells. Single cells were separated according to their origins based solely on the mtDNA heteroplasmic variant data. Cells are coloured by cell line of origin. b An example of hierarchical clustering by mitochondrial genotype (rows) for the single cells within a single cell line. Cells are coloured by their cell stages (columns). Colour bar = heteroplasmy fraction. c, d An example UMAP plot of mtDNA mutation profiles from a single cell line, with cells coloured by the defined cluster (c) and by the heteroplasmy fractions of specific mutations observed in the cell line (d). The mutations are labelled at the top of the plots.
