Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
[Preprint]. 2025 Apr 11:2023.11.06.23297858.
doi: 10.1101/2023.11.06.23297858.

Deciphering epistatic genetic regulation of cardiac hypertrophy

Affiliations

Deciphering epistatic genetic regulation of cardiac hypertrophy

Qianru Wang et al. medRxiv. .

Update in

  • Epistasis regulates genetic control of cardiac hypertrophy.
    Wang Q, Tang TM, Youlton M, Weldy CS, Kenney AM, Ronen O, Hughes JW, Chin ET, Sutton SC, Agarwal A, Li X, Behr M, Kumbier K, Moravec CS, Tang WHW, Margulies KB, Cappola TP, Butte AJ, Arnaout R, Brown JB, Priest JR, Parikh VN, Yu B, Ashley EA. Wang Q, et al. Nat Cardiovasc Res. 2025 Jun;4(6):740-760. doi: 10.1038/s44161-025-00656-8. Epub 2025 Jun 5. Nat Cardiovasc Res. 2025. PMID: 40473955 Free PMC article.

Abstract

Although genetic variant effects often interact non-additively, strategies to uncover epistasis remain in their infancy. Here, we develop low-signal signed iterative random forests to elucidate the complex genetic architecture of cardiac hypertrophy, using deep learning-derived left ventricular mass estimates from 29,661 UK Biobank cardiac MRIs. We report epistatic variants near CCDC141, IGF1R, TTN, and TNKS, identifying loci deemed insignificant in genome-wide association studies. Functional genomic and integrative enrichment analyses reveal that genes mapped from these loci share biological process gene ontologies and myogenic regulatory factors. Transcriptomic network analyses using 313 human hearts demonstrate strong co-expression correlations among these genes in healthy hearts, with significantly reduced connectivity in failing hearts. To assess causality, RNA silencing in human induced pluripotent stem cell-derived cardiomyocytes, combined with novel microfluidic single-cell morphology analysis, confirms that cardiomyocyte hypertrophy is non-additively modifiable by interactions between CCDC141, TTN, and IGF1R. Our results expand the scope of cardiac genetic regulation to epistasis.

PubMed Disclaimer

Conflict of interest statement

Competing interests E.A.A. is a founder of Personalis, Deepcell, Svexa, Candela, Saturnus Bio, and Parameter Health; an advisor to SequenceBio, Foresite Labs, Pacific Biosciences, and Versant Ventures; a non-executive director for AstraZeneca and Svexa; a stockholder in Pacific Biosciences and AstraZeneca; and has received in-kind collaborative support from Illumina, Pacific Biosciences, Oxford Nanopore, Cache, and Cellsonics. V.N.P. is an SAB member for and receives research support from BioMarin, Inc., an SAB member for Lexeo Therapeutics, and a consultant for Constantiam Biosciences and viz.ai. C.S.W. is a consultant for AiRNA Bio and Avidity Biosciences. While these companies may have interests in genomics or cardiovascular health, they had no direct role in the design, data presentation, analysis, or interpretation of this study. The remaining authors declare no competing interest.

Figures

Extended Data Fig. 1:
Extended Data Fig. 1:. Distribution of LVM and LVMi measurements for 29,661 UK Biobank participants.
Left ventricular mass (LVM, a) and LVM indexed to body surface area (LVMi, b) measurements were extracted from cardiac magnetic resonance imaging for 29,661 unrelated White British individuals via deep learning. c, A high Pearson correlation of 0.92 was observed between these LVM and LVMi measurements.
Extended Data Fig. 2:
Extended Data Fig. 2:. LVMi GWAS using BOLT-LMM and PLINK.
Two-sided GWAS p-values from BOLT-LMM (a) and PLINK (b) identified LVMi-associated SNVs, of which the top lead SNV rs3045696 showed the highest significance in both analyses. Other labeled lead SNVs were significant in only one of the two methods. The red dashed line indicates the genome-wide significance threshold (p < 5E-8, two-sided). Notably, rs3045696 and rs67172995 were also stably prioritized by lo-siRF as epistasis interactor variants. Further details are provided in Supplementary Data 1.
Extended Data Fig. 3:
Extended Data Fig. 3:. Differences in local stability scores between high and low LV mass highlight the importance of the lo-siRF-prioritized interactions between genetic loci.
a, Schematic of local stability importance score computation. Given a locus (light blue transcript), the local stability importance score for an individual is defined as the proportion of trees for which at least one SNV (shaded black region) in the locus is used in the individual’s decision path. This computation (top) was performed for each individual (denoted by the stacked boxes). Then, a two-sided permutation test was conducted to assess the difference in these local stability importance scores between the low and high LVMi individuals (bottom). b, Differences in the distribution of local stability importance scores suggest that the identified interactions between genetic loci are important for differentiating individuals with high (dark gray) and low (orange) LVMi in the siRF fit. This result, evaluated on the validation data, is stable across the three binarization thresholds and is quantified by a nominal two-sided p-value from a permutation test given in the top right corner of each subplot. Given the use of 10,000 permutations, p-values cannot be reported with greater precision than p = 1E-4.
Extended Data Fig. 4:
Extended Data Fig. 4:. Top SNVs from lo-siRF-prioritized loci and interactions between loci.
The most important SNVs and SNV-SNV pairs, as measured by their proportion of occurrence in the siRF fit, are annotated for the top lo-siRF-prioritized interactions between loci in a and top genetic loci in b. The y-axis shows the proportion of decision paths in siRF, for which the SNV or SNV-SNV pair occurs, averaged across all three binarization thresholds. In each of the interactions between genetic loci, rs7591091 in the CCDC141 locus appears most frequently, suggesting a key role in cardiac hypertrophy.
Extended Data Fig. 5:
Extended Data Fig. 5:. Genes mapped from epistatic loci share transcription factors and splicing regulators.
Gene pairs exhibiting strong co-association with transcription factors (TFs) and RNA-binding regulators (p < 0.05, two-sided Fisher’s exact test, Fig. 4e) are displayed. Horizontal bars indicate the number of shared TFs or RNA-binding regulators (bottom axis), with the top-ranked TF or regulator, determined by the lowest two-sided enrichment p-value (dots, top axis), labeled on each bar. Bars are color-coded by the corresponding gene set library. Detailed information can be found in Supplementary Data 7.
Extended Data Fig. 6:
Extended Data Fig. 6:. Dendrograms from WGCNA network analyses
Dendrograms from WGCNA control (a) and heart failure (b) networks show distinctive gene module structures and modular assignments for CCDC141, IGF1R, and TTN.
Extended Data Fig. 7:
Extended Data Fig. 7:. Spiral-shaped inertial microfluidic channel for cell focusing and imaging.
a, Schematic of an inertial microfluidic cell focusing device. Cell suspensions and fresh medium were introduced into the microfluidic device through the cell and sheath flow inlets, respectively, using a syringe pump and flowed down the 5-loop spiral microchannel with the same flow rate (1.2 mL/min). Inserted microscopy images show that randomly dispersed cells were separated by size and bifurcated into the top (large cells) or bottom (small cells) outlets. Scale bar, 10 μm. Outlet channels are connected to straight observation channels where flowing cells were further focused in the channel height direction and imaged using a high-speed camera for morphological feature extraction (Fig. 6c). b, Schematic of the cell focusing principle. The spiral microchannel has a cross-section with a slanted ceiling, resulting in different depths at the inner and outer side of the microchannel. This geometry induces strong Dean vortices (counter rotating vortices in the plane perpendicular to the main flow direction) in the outer half of the microchannel cross-section. The interplay between drag forces (FD) induced by Dean vortices and lift forces (FL) due to shear gradient and the channel wall drives cell transverse migration towards equilibrium positions where the net force is zero. As a result, large cells in a heterogeneous population progressively migrate closer to the inner channel wall, while smaller cells move towards the outer channel wall. Details about microchannel dimensions can be found in Methods.
Extended Data Fig. 8:
Extended Data Fig. 8:. Epistatic genes non-uniformly reshape cardiomyocyte size distributions.
a, Heatmap showing relative differences in cell sizes at various quantile levels between gene-silenced and scramble control conditions for unaffected and MYH7-R403Q variant cardiomyocytes. Larger quantiles correspond to larger cells in the cell size distribution. Dark red indicates strong size reduction at the specified quantile level in gene-silenced cells relative to scramble controls. The corresponding statistical differences (b) were evaluated by the maximum (two-sided) p-values across n = 3 (n = 4 for unaffected cells silencing CCDC141) independent cell batches using a bootstrap quantile test with 10,000 bootstrapped samples (exact p values per batch are available in Source Data). c, QQ-plots of averaged cell size quantiles across cell batches compare gene-silenced cells against scramble controls in unaffected (top row) and MYH7-R403Q variant (bottom row) cardiomyocytes, revealing a clear size-dependent effect of silencing CCDC141-IGF1R on correcting cardiomyocyte hypertrophy.
Extended Data Fig. 9:
Extended Data Fig. 9:. Effects of lo-siRF-prioritized genes and gene-gene interactions on hypertrophic and non-hypertrophic cell morphology.
Relative differences in median cell size (a), normalized peak number (b), and roundness error (c) were analyzed separately for cells size-sorted into the top (hypertrophic cells, green) and bottom (non-hypertrophic cells, orange) microchannel outlets (Extended Data Fig. 7), as well as for all cells combined (blue). For both unaffected (left) and MYH7-R403Q variant (right) cardiomyocytes, the gene-silencing induced relative differences in each morphological feature (a-c) were quantified as (msmc)/mc × 100%, where ms and mc denote median values in gene-silencing and scrambled control conditions, respectively. Squares represent mean relative differences across n = 3 independent cell batches (n = 4 for unaffected cells silencing CCDC141), with overlaid points indicating medians of individual replicates. Violins display mean relative differences from 1,000 bootstrap samples, with error bars indicating standard deviations. Asterisks indicate significant differences compared to scramble controls, based on the maximum p-values of two-sided Mann-Whitney U test across all cell batches (*p < 0.05, **p < 1E-3, ***p < 1E-4). Exact p-values and corresponding cell counts per condition per batch are provided in Source Data.
Fig. 1:
Fig. 1:. Schematic of the study workflow.
The study workflow includes four major stages: (a) derivation of left ventricular mass from cardiac magnetic resonance imaging (green boxes); (b) computational prioritization of epistatic drivers (orange); (c) functional interpretation of the hypothesized epistatic genetic loci (purple); and (d) experimental confirmation of epistasis in cardiac tissues and cells (blue). Abbreviations: MRI, magnetic resonance imaging; LV, left ventricle; LVM, left ventricular mass; LVMi, left ventricular mass indexed by body surface area; SNV, single-nucleotide variant; GWAS, genome-wide association study; PLINK and BOLT-LMM, two different GWAS software packages; lo-siRF, low-signal signed iterative random forest; ANNOVAR, a software for functional annotation of genetic variants; CADD, combined annotation dependent depletion, which scores the deleteriousness of variants; RegulomeDB, a database that scores functional regulatory variants; ChromHMM, a multivariate Hidden Markov Model for chromatin state annotation; eQTL, expression quantitative trait locus; sQTL, splicing quantitative trait locus; Hi-C, high-throughput chromosome conformation capture; PheWAS, phenome-wide association study; siRNA, small interfering RNA; hiPSC-CM, human induced pluripotent stem cell-derived cardiomyocyte; HCM, hypertrophic cardiomyopathy.
Fig. 2:
Fig. 2:. Low-signal signed iterative random forest (lo-siRF) workflow.
a, Lo-siRF took as input single-nucleotide variant (SNV) data and cardiac MRI-derived left ventricular mass indexed by body surface area (LVMi) from 29,661 UK Biobank participants. b, Two genome-wide association studies (GWAS) were applied using different software to reduce the dimensionality of SNVs. c, LVMi was binarized into high and low categories using three thresholds (shown as stacked boxes). d, For each threshold, a signed iterative random forest (siRF) was fitted using the GWAS-filtered SNVs to predict the binarized LVMi, achieving prediction accuracy better than or comparable to other methods, before interpreting the model fit. e, siRF aggregates SNVs into genetic loci using ANNOVAR and finally ranks loci and their interactions based on a stability-driven importance score averaged across the three binarization thresholds.
Fig. 3:
Fig. 3:. Lo-siRF finds epistasis between genetic risk loci for left ventricular hypertrophy.
a, Circos plot illustrating genetic risk loci identified by lo-siRF (green, circle 2) and regions after clumping FUMA-extracted SNVs in LD (r2 > 0.6) with lo-siRF-prioritized SNVs (black, circle 3). Circle 1 shows the top 300 epistatic SNV-SNV pairs ranked by occurrence frequency in lo-siRF (green), SNV-gene linkages (FDR < 0.5) from GTEx V8 cis-eQTL data of heart and skeletal muscle tissues (purple), and 3D chromatin interactions from Hi-C data of left ventricular tissue (GSE87112). Circle 5 and 6 show occurrence frequencies and the number of partner SNVs in epistasis (normalized by the maximum value per locus) identified by lo-siRF, respectively. Circle 7 displays the ChromHMM core-15 chromatin state for left ventricle (LV), right ventricle (RV), right atrium (RA), and fetal heart (Fetal). Circle 8 presents a GWAS Manhattan plot (points, two-sided p < 0.05 from PLINK), where SNVs are color-coded by their maximum r2 value relative to the 283 lo-siRF-prioritized SNVs. The outer heatmap layer represents LD-linked SNVs (r2 > 0.6) extracted from the FUMA reference panel, which lack GWAS p-values. SNVs not in LD (r2 ≤ 0.6) with any of the 283 lo-siRF-prioritized SNVs appear in gray. The dashed line indicates GWAS p = 5E-8. Circle 9 shows genes functionally mapped by FUMA. b, Pie charts depicting ANNOVAR enrichment results for the six lo-siRF loci (circle 2 in Fig. 3a and Table 1). Slice arc length represents the proportion of SNVs with a specific functional annotation. Slice radius indicates log2(E + 1), where E is the enrichment score computed as (proportion of SNVs with an annotation for a given locus)/(proportion of SNVs with an annotation relative to all available SNVs in the FUMA reference panel). The dashed circle indicates E = 1 (no enrichment). Asterisks denote two-sided Fisher’s exact test p-values for the enrichment of each annotation. Details in Supplementary Data 4 and 5.
Fig. 4:
Fig. 4:. Genes mapped from epistatic loci show strong functional co-associations.
a, UpSet plot showing the number of lo-siRF-prioritized SNVs (dark blue) and their LD-linked (r2 > 0.6) SNVs (light blue, circle 8 in Fig. 3a) functionally mapped to 20 protein-coding genes via positional, eQTL, and/or chromatin interaction (CI) mapping in FUMA. CCDC141 and IGF1R (highlighted red) are prioritized by all three mapping strategies (details in Supplementary Data 4). b, Heatmap of averaged expression (GTEx, 50% winsorization, log2(TPM + 1)) across different tissues for these functionally mapped genes. c, Co-association network built from an enrichment analysis integrating multiple annotated gene set libraries for gene ontology (GO) and pathway terms from Enrichr. Nodes represent genes (green) functionally linked from lo-siRF-prioritized epistatic and hypostatic loci (Table 1) and top enriched GO/pathway terms. Edge width indicates the co-association strength between enriched terms and genes ranked by two-sided p-values from an exhaustive permutation of the co-association score for all possible gene-gene combinations in the network (details in Methods and Supplementary Data 6). d, Comparison between the top 10 transcription factors (TFs) enriched from genes prioritized (top) and deprioritized (bottom) by lo-siRF. Prioritized genes are functionally linked to lo-siRF-prioritized SNVs (panel a), while deprioritized genes are genes linked to SNVs failed to pass the lo-siRF prioritization thresholds. Enrichment was performed against nine TF-annotated gene set libraries from ChEA3 and Enrichr, ranked by the number of significantly (p < 0.05, two-sided Fisher’s exact test) overlapped libraries (numbers in nLibraries column) and the mean scaled rank across all libraries containing that TF (colored boxes in nLibraries column). The balloon plot shows the lowest two-sided Fisher’s exact test p-values (horizontal axis) and the proportion of overlapped genes (balloon size) between the input gene set and the corresponding TF-annotated gene set. e, Heatmap showing the TF co-association strength of gene-gene combinations among lo-siRF-prioritized genes relative to randomly selected gene pairs in the co-association network. Further details in Extended Data Fig. 5 and Supplementary Data 7.
Fig. 5:
Fig. 5:. Network analysis using transcriptomic data from 313 human hearts reveals strong correlations between suggestive statistical epistasis contributors.
a, Gene co-expression networks for control (blue) and heart failure (red) conditions were established using weighted gene co-expression network analysis (WGCNA) on transcriptomic data from 313 non-failing and failing human heart tissues. b-c, The connectivity between lo-siRF-prioritized genes was assessed against the full connectivity distributions of all possible gene-gene combinations in the control (b) and heart failure (c) networks. CCDC141 showed a significant connectivity to SYNM and LYSMD4 (IGF1R lo-siRF locus) and TTN and FKBP7 (TTN lo-siRF locus) in the control network. Color scales indicate two-sided empirical p-values. d, Comparing between the control and heart failure networks indicated a significant connectivity decrease of these gene pairs during heart failure progression. e, A Sanky plot demonstrating the rewired gene modular assignments for the lo-siRF loci-associated genes (middle column) in the control vs. heart failure networks. Module names in control (left column) and heart failure (right column) networks were derived from KEGG and Reactome associations of genes within each module.
Fig. 6:
Fig. 6:. Single-cell image analysis of epistatic gene-silencing effects on cardiomyocyte hypertrophy.
a, Human induced pluripotent stem cell (iPSC)-derived cardiomyocytes, with and without hypertrophic cardiomyopathy (MYH7-R403Q mutation), were transfected with scramble siRNA or siRNAs specifically targeting single (CCDC141, IGF1R, TTN) or combined (CCDC141-IGF1R, CCDC141-TTN) loci prioritized by lo-siRF. b, Gene-silenced cardiomyocytes were bifurcated into large and small cells using a spiral microfluidic device (Extended Data Fig. 7) for high-resolution single-cell imaging. c, Image analysis workflow. Single-cell images were analyzed using a customized MATLAB program to extract cell size/shape features through a multi-step process.
Fig. 7:
Fig. 7:. CCDC141 non-additively interacts with TTN and IGF1R to modify cardiomyocyte morphology.
a, Violin plots showing total cell diameter distributions for unaffected (blue, n = 36327) and MYH7-R403Q variant (red, n = 46542) cardiomyocytes from three independent cell batches (p = 2.1E-36, two-sided Mann-Whitney U test). In all panels, boxplots display the interquartile range (IQR) with the median line, and whiskers extend to 1.5x IQR. Overlaid points represent median diameters from individual batches. b, Gene-silencing efficiency measured by RT-qPCR in unaffected (blue) and MYH7-R403Q variant (red) cells from at least n = 3 cell batches (exact n numbers provided in Source Data). Bars, overlaid points, and error bars show mean values, individual biological replicates, and standard deviations. c-h, Gene-silencing effects on single-cell morphology analyzed for n = 3 independent cell batches (n = 4 for unaffected cells silencing CCDC141), with over 33,000 cells measured per condition per batch. c, Percent change in median cell diameter for each gene-silencing condition relative to scramble controls (i.e., relative size difference). Squares represent mean relative size differences across batches, with overlaid points indicating individual batches. Violins display mean relative size differences from 1,000 bootstrap samples, with error bars indicating standard deviations. Asterisks: *p < 0.05, **p < 6E-5 (exact p-values in Source Data), two-sided Mann-Whitney U test. d, Violin plots showing non-additive interaction effects (β^12) for unaffected (blue) and MYH7-R403Q variant (red) cells compared to marginal effects (β^1 and β^2, gray), estimated via a quantile regression model across 10,000 bootstrap samples (details in Methods). e, Scatter plots showing gene-silencing effects on cellular texture (i.e. norm. peak No., x-axis) and boundary (i.e. roundness error, y-axis) features. Markers represent mean values across cell batches, with error bars indicating standard deviations from 1,000 bootstrap samples. Bootstrapped values are visualized as density contours (scattered dots represent a density level lower than 0.2). f, Cell boundary waviness and texture irregularity were measured by roundness error (top) and normalized peak number (bottom), respectively. g-h, Representative single-cell images illustrating boundary irregularity (g) and textural variation (h). Scale bars: 10 μm. Statistical details in Supplementary Data 8 and Source Data.

References

    1. Weldy C. S. & Ashley E. A. Towards precision medicine in heart failure. Nat. Rev. Cardiol. 1–18 (2021). - PubMed
    1. Bastos M. B. et al. Invasive left ventricle pressure–volume analysis: overview and practical clinical implications. Eur. Heart J. 41, 1286–1297 (2019). - PMC - PubMed
    1. Burkhoff D., Mirsky I. & Suga H. Assessment of systolic and diastolic ventricular properties via pressure-volume analysis: a guide for clinical, translational, and basic researchers. American Journal of Physiology-Heart and Circulatory Physiology 289, H501–H512 (2005). - PubMed
    1. Marian A. J. & Braunwald E. Hypertrophic Cardiomyopathy: Genetics, Pathogenesis, Clinical Manifestations, Diagnosis, and Therapy. Circ. Res. 121, 749–770 (2017). - PMC - PubMed
    1. Haider A. W., Larson M. G., Benjamin E. J. & Levy D. Increased left ventricular mass and hypertrophy are associated with increased risk for sudden death. J. Am. Coll. Cardiol. 32, 1454–1459 (1998). - PubMed

Publication types

LinkOut - more resources