Stem Cells Transl Med. 2025 Jul 24;14(8):szaf032.
doi: 10.1093/stcltm/szaf032.

Modeling rare genetic disease with patient-derived induced pluripotent stem cells: reassessment of the minimum numbers of lines needed

Ashok R Dinasarapu et al. Stem Cells Transl Med.

Abstract

Induced pluripotent stem cells (iPSCs) are widely used to model human genetic diseases. The most common strategy involves collecting cells from relevant individuals and then reprogramming them into iPSCs. This strategy is very powerful, but finding enough individuals with a specific genetic disease can be challenging, especially since most are rare. In addition, making numerous iPSC lines is time-consuming and expensive. As a result, most studies have included relatively small numbers of iPSC lines, sometimes from the same individual. Considering the experimental variability obtained using different iPSC lines, there has been great interest in delineating the most efficient number of lines needed to achieve a robust and reproducible result. Several recommendations have been published, although most conclusions have been based on methods where experimental variance from individual cases is difficult to separate from technical issues related to the preparation of iPSCs. The current study used gene expression profiles determined by RNA sequencing (RNAseq) to empirically evaluate the impact of the number of unique individuals and the number of replicate iPSC lines from each individual for modeling Lesch-Nyhan disease (LND). This disease is caused by mutations in the HPRT1 gene, which encodes the enzyme hypoxanthine-guanine phosphoribosyltransferase. Results for detecting disease-relevant changes in gene expression depended on the analytical method employed, and whether or not statistical procedures were used to address multiple iPSC lines from the same individual. In keeping with prior studies, the best results were obtained with iPSC lines from 3-4 unique individuals per group. In contrast to prior studies, results were improved with 2 lines per individual, without statistical corrections for duplicate lines from the same individual. 
In the current study where all lines were produced in parallel using the same methods, most variance in gene expression came from technical factors unrelated to the individual from whom the iPSC lines were prepared.

Keywords: HPRT1; Lesch–Nyhan disease; disease modeling; human; induced pluripotent stem cell; transcriptome.

Conflict of interest statement

The authors declare no competing interests.

Figures

Disease modeling involved 4 unrelated Lesch-Nyhan disease (LND) cases and 4 unrelated controls (CON), with 3 independently derived iPSC lines made from fibroblasts for each individual. All samples were subjected to RNAseq in one batch. The RNAseq results were used to assess pluripotency, and different methods were used to assess the disease effect on gene expression, with and without applying methods to address multiple lines from the same case. Sensitivity and specificity when applying different methods to different numbers of iPSC lines were evaluated using permutation-based analyses, and sources of variance in gene expression were evaluated using variance partitioning.
Figure 1.
Pluripotency gene expression among all iPSC lines. Panels A-E show expression levels in counts per million (CPM) for typical pluripotency genes measured via RNAseq for each of the iPSC lines. Panel F shows the “pluripotency score” from the PluriTest method applied to RNAseq-based gene expression for each line.
Figure 2.
Gene expression analyses. Panel A shows the relatedness of gene expression patterns among all 24 lines by principal component analysis (PCA; LND = filled symbols, CON = open symbols, replicate lines from different cases shown as distinct symbols). Panels B-D show differential gene expression using edgeR when all 12 LND and all 12 CON iPSC lines were treated as independent samples: panel B shows a volcano plot for differentially expressed genes (LND vs CON, horizontal line shows FDR < 0.05); panel C shows the relatedness among lines for differentially expressed genes by PCA (N = 232); panel D shows biological pathways defined by these differentially expressed genes. Panel E shows the relatedness of lines from different individuals when results from all lines from the same individual were pooled before further analyses. Panels F-H show differential gene expression using edgeR after pooling: panel F shows a volcano plot for differentially expressed genes (LND vs CON, horizontal line shows FDR < 0.05); panel G shows the relatedness among lines for differentially expressed genes by PCA (N = 2183); panel H shows the biological pathways defined by these differentially expressed genes. Results for differential gene expression with limma-voom, with and without correction for multiple lines from the same case, are shown in panels I-J. The overlap among the different methods when using a pre-defined FDR < 0.05 as the criterion for differential gene expression is shown in panels K (edgeR, with and without pooling) and L (edgeR and limma-voom without correcting for duplicates). Similarities among the different methods, based on correlations among P-values for all differentially expressed genes, are shown in panel M.
Figure 3.
Assessment of sensitivity and specificity with edgeR. Panel A shows results of all possible permutations of different numbers of individual cases and different numbers of iPSC lines for each case, with each line considered an independent sample. Panel B shows the median for the corresponding AUC (area under the curve) metric to assess overall sensitivity and specificity. Panel C shows the impact of the fold-change as the median value of all permutations. Panels D-F show parallel analyses after pooling iPSC lines from the same case.
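The subsampling behind these permutation analyses can be pictured with a short sketch. The code below is only one plausible reading of "all possible permutations", not the authors' pipeline: it enumerates, for a single group, every way to choose a given number of individuals and a given number of lines from each. The function name `enumerate_designs` and the sample labels are hypothetical.

```python
from itertools import combinations, product


def enumerate_designs(individuals, lines_per_individual, n_individuals, n_lines):
    """Yield every subsample: pick n_individuals, then n_lines from each one.

    Each design is a list of (individual, line_index) pairs.
    This is an illustrative assumption, not the published procedure.
    """
    line_ids = range(lines_per_individual)
    for chosen in combinations(individuals, n_individuals):
        # All ways to choose n_lines for each chosen individual
        per_individual = [list(combinations(line_ids, n_lines)) for _ in chosen]
        for picks in product(*per_individual):
            yield [(ind, line) for ind, lines in zip(chosen, picks) for line in lines]


# Hypothetical labels for the 4 LND cases, each with 3 iPSC lines
lnd = ["LND1", "LND2", "LND3", "LND4"]
designs = list(enumerate_designs(lnd, lines_per_individual=3, n_individuals=3, n_lines=2))
print(len(designs), "designs with", len(designs[0]), "lines each")
```

With 4 individuals and 3 lines each, choosing 3 individuals and 2 lines per individual gives C(4,3) × C(3,2)³ = 4 × 27 = 108 subsamples for one group. A full analysis would pair such subsamples from the LND and CON groups and run the differential expression test on each pair.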
Figure 4.
Assessment of sensitivity and specificity with limma-voom. Panel A shows results of all possible permutations of different numbers of individual cases and different numbers of iPSC lines for each case, with each line considered an independent sample. Panel B shows the corresponding AUC (area under the curve) metric to assess overall sensitivity and specificity. Panel C shows the impact of the fold-change as the median value of all permutations. Panels D-F show parallel analyses following application of statistical methods to address iPSC lines from the same case (DupCor).
Figure 5.
Robustness of results across two independent datasets. Panel A shows differentially expressed genes for the current dataset (232 genes) and a previous dataset (407 genes), both collected and analyzed in the same way. Although only 47 genes overlapped between these studies, there was a high correlation when considering the fold-change for either significantly differentially expressed genes (B) or all genes (C). Variance partitioning was used to disclose the main source of variance. Panel D shows sources of variance when considering expression of all genes in the current dataset, and panel E shows sources of variance when considering only differentially expressed genes. Similar analyses are also provided for a prior dataset for all genes (F) versus only differentially expressed genes (G). After combining both datasets into a single analysis, the effect of batch can also be seen (H).
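As a rough illustration of what variance partitioning reports, the sketch below splits the variance of one gene's expression into a fraction explained by the individual and a residual fraction, using a simple one-way sum-of-squares decomposition. This is an assumption made for illustration only: the abstract does not specify the method, and dedicated tools typically fit mixed models with several factors at once. The function name `variance_fractions` and the example values are hypothetical.

```python
def variance_fractions(expr, individual):
    """Split total variance of one gene into between-individual and residual parts.

    expr: expression values, one per iPSC line.
    individual: label of the donor for each line (same length as expr).
    One-way sum-of-squares decomposition; a simplification for illustration.
    """
    n = len(expr)
    grand_mean = sum(expr) / n
    ss_total = sum((x - grand_mean) ** 2 for x in expr)
    if ss_total == 0:
        return {"individual": 0.0, "residual": 0.0}

    # Group expression values by donor
    groups = {}
    for x, ind in zip(expr, individual):
        groups.setdefault(ind, []).append(x)

    # Between-individual sum of squares
    ss_between = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    frac = ss_between / ss_total
    return {"individual": frac, "residual": 1.0 - frac}


# Made-up values: two donors, three lines each
fractions = variance_fractions([1.0, 1.0, 1.0, 3.0, 3.0, 3.0], ["a", "a", "a", "b", "b", "b"])
print(fractions)
```

In this toy input every line from a donor has the same value, so all variance is attributed to the individual. When lines from the same donor differ as much as lines from different donors, the residual fraction dominates.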
