Front Genet. 2025 Jun 17;16:1596049.
doi: 10.3389/fgene.2025.1596049. eCollection 2025.

Transcriptomic analysis and machine learning modeling identifies novel biomarkers and genetic characteristics of hypertrophic cardiomyopathy

Feng Zhang et al. Front Genet. 2025.

Abstract

Objective: This study aimed to leverage bioinformatics approaches to identify novel biomarkers and characterize the molecular mechanisms underlying hypertrophic cardiomyopathy (HCM).

Methods: Two RNA-sequencing datasets (GSE230585 and GSE249925) were obtained from the Gene Expression Omnibus (GEO) repository. Computational analysis was performed to compare transcriptomic profiles between normal cardiac tissues from healthy donors and myocardial tissues from HCM patients. Functional annotation of differentially expressed genes (DEGs) was performed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. Immune cell infiltration patterns were quantified via single-sample gene set enrichment analysis (ssGSEA). A predictive model for HCM was developed through systematic evaluation of 113 combinations of 12 machine-learning algorithms, employing 10-fold cross-validation on training datasets and external validation using an independent cohort (GSE180313).
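The cross-validation step described in the Methods can be illustrated with a minimal sketch. It shows stratified 10-fold splitting and a per-fold area under the ROC curve, averaged across folds. The classifier here is a toy single-feature centroid model used purely as a stand-in; it is not one of the 12 algorithms evaluated in the study, and all function names are hypothetical.

```python
import random
from statistics import mean


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread across k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled samples of this class round-robin into the folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid(X, y):
    """Toy stand-in model (assumption): class means of a single feature."""
    m0 = mean(x for x, t in zip(X, y) if t == 0)
    m1 = mean(x for x, t in zip(X, y) if t == 1)
    return m0, m1


def predict_score(model, X):
    """Higher score means closer to the case centroid than the control centroid."""
    m0, m1 = model
    return [abs(x - m0) - abs(x - m1) for x in X]


def cross_validated_auc(X, y, k=10, seed=0):
    """Fit on k-1 folds, score the held-out fold, and average the fold AUCs."""
    fold_aucs = []
    for train, test in stratified_kfold(y, k, seed):
        model = fit_centroid([X[i] for i in train], [y[i] for i in train])
        scores = predict_score(model, [X[i] for i in test])
        fold_aucs.append(auc(scores, [y[i] for i in test]))
    return mean(fold_aucs)
```

Calling `cross_validated_auc(X, y)` with a list of feature values `X` and binary labels `y` (1 for HCM, 0 for control) returns the mean held-out AUC. Each fold must contain at least one sample of each class for the AUC to be defined, which stratification guarantees when every class has at least `k` samples.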

Results: A total of 271 DEGs were identified, primarily enriched in multiple biological pathways. Immune infiltration analysis revealed distinct patterns of immune cell composition. Based on the top differentially expressed genes, a robust 12-gene diagnostic signature (COMP, SFRP4, RASD1, IL1RL1, S100A8, S100A9, ESM1, CA3, MYL1, VGLL2, MCEMP1, and MT1A) was constructed, demonstrating superior performance in both training and testing cohorts.

Conclusion: This study utilized bioinformatics approaches to analyze RNA-sequencing datasets, identifying DEGs and distinct immune infiltration patterns in HCM. These findings enabled the construction of a 12-gene diagnostic signature with robust predictive performance, thereby advancing our understanding of HCM's molecular biomarkers and pathogenic mechanisms.

Keywords: DNA repair; RNA sequencing; biomarker; gene expression; gene expression omnibus; hypertrophic cardiomyopathy; machine learning.

Conflict of interest statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Integration of datasets and differentially expressed genes (DEGs) between healthy heart donors (control) and hypertrophic cardiomyopathy (HCM) patients. (A, B) PCA of the two original HCM datasets before (A) and after (B) batch-effect correction. (C) Heatmap of DEGs between the control and HCM groups. (D) Volcano plot of the DEGs. Significant DEGs (|fold-change| > 2; false discovery rate < 0.05) are indicated in red (upregulated) or blue (downregulated).
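The significance rule in the volcano plot can be sketched as follows. This is a minimal illustration, assuming that "|fold-change| > 2" means a more than twofold change in either direction (equivalently |log2 fold-change| > 1) and that the false discovery rate is the Benjamini-Hochberg adjusted p-value; the source does not state either detail. Function and gene names are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_degs(genes, fold_changes, pvalues, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Label each gene 'up', 'down', or 'ns' from linear fold change and FDR."""
    fdr = benjamini_hochberg(pvalues)
    labels = {}
    for gene, fc, q in zip(genes, fold_changes, fdr):
        log2fc = math.log2(fc)  # fc is the HCM/control expression ratio, > 0
        if q < fdr_cutoff and abs(log2fc) > math.log2(fc_cutoff):
            labels[gene] = "up" if log2fc > 0 else "down"
        else:
            labels[gene] = "ns"
    return labels
```

Genes labeled `up` or `down` correspond to the red and blue points, respectively; `ns` genes fail at least one of the two cutoffs.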
FIGURE 2
Disease Ontology (DO), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the differentially expressed genes (DEGs). (A) Bubble plot showing the DO enrichment results. (B) Bubble plot showing that DEGs between the control and HCM groups were enriched in several biological processes (BP), cellular components (CC), and molecular functions (MF). (C) Bubble plot showing the DEG-enriched KEGG pathways. Terms are shown on the y-axis and their enrichment scores on the x-axis. The size of each bubble positively correlates with the number of associated genes, with a higher pathway enrichment P-value intensifying the pink hue of the bubble.
FIGURE 3
Immunological characteristics. Boxplots comparing immune cell abundances between HCM patients and controls. ***P < 0.001, **P < 0.01, *P < 0.05.
FIGURE 4
Gene set enrichment analysis (GSEA) of the top five differentially expressed genes (DEGs) between the control and hypertrophic cardiomyopathy (HCM) samples. (A) Significant GSEA sets of DEGs. (B) Ridge plots showing enrichment of different gene sets.
FIGURE 5
Diagnostic performance of our model. (A) 113 machine learning algorithm combinations evaluated via 10-fold cross-validation. (B, C) Receiver-operating characteristic (ROC) curves for two distinct validation cohorts (GSE230585 and GSE249925), assessing algorithmic accuracy in these datasets. (D) ROC curve for an external independent validation cohort (GSE180313), testing the model's generalizability beyond the primary datasets. (E) ROC curve for the training cohort, evaluating in-sample model fit. (F) Calibration curve assessing the agreement between predicted and observed outcomes. (G) Clinical decision-curve analysis evaluating the net clinical benefit at different threshold probabilities for the Lasso and Stepglm[both] algorithm combination within the model. The x-axis represents the threshold probability (0–1) and the y-axis represents the net benefit.
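The net benefit plotted in panel (G) can be sketched with the standard decision-curve formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. This is a minimal illustration under that assumption, not the authors' code; the function names are hypothetical.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of calling samples positive when predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    # Count true and false positives at this threshold probability.
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(labels, threshold):
    """Reference curve: net benefit of classifying every sample as positive."""
    return net_benefit([1.0] * len(labels), labels, threshold)
```

Evaluating `net_benefit` over a grid of thresholds and comparing it with `treat_all_net_benefit` and with zero (the "treat none" strategy) reproduces the three curves usually drawn in a decision-curve plot.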
