Methods Mol Biol. 2025:2952:369-410.
doi: 10.1007/978-1-0716-4690-8_21.

The Use of AI for Phenotype-Genotype Mapping

Jyoti Sharma et al. Methods Mol Biol. 2025.

Abstract

The mapping of genotypes to phenotypes is a cornerstone of genetics, critical for understanding disease mechanisms and advancing precision medicine. The advent of next-generation sequencing (NGS) technologies has enabled the generation of extensive genomic datasets, yet the complexity and scale of these data demand innovative analytical approaches. Artificial intelligence (AI) has emerged as a transformative tool, integrating genotype and phenotype data, uncovering intricate patterns, and driving advancements in diagnosis, therapy, and research.

AI applications in phenotype-genotype mapping span various machine learning and deep learning techniques. Supervised learning methods, such as Support Vector Machines (SVMs), Random Forests, and Gradient Boosting, predict variant pathogenicity and classify genetic risks by leveraging curated datasets. Unsupervised approaches, including k-Means clustering and hierarchical clustering, uncover hidden patterns in data, enabling the identification of disease subtypes and novel associations. Dimensionality reduction techniques like Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE) simplify high-dimensional genomic data for analysis and visualization. Neural networks, including Convolutional and Recurrent Neural Networks (CNNs and RNNs), excel at extracting insights from complex datasets like gene expression profiles and genomic sequences.

These methodologies have found applications in rare disease diagnosis, drug discovery, and risk assessment for complex diseases. AI tools integrate genetic and phenotypic data to prioritize pathogenic variants, significantly improving diagnostic yields for unresolved cases. Multi-omic data integration, incorporating genomics, transcriptomics, and proteomics, offers a holistic perspective on genotype-phenotype relationships. In drug discovery, AI identifies therapeutic targets and predicts drug efficacy, accelerating the development of precision treatments.

Despite its potential, challenges persist. Data heterogeneity, limited interpretability of AI models, privacy concerns, and insufficient datasets for rare diseases impede broader implementation. To address these issues, AI frameworks incorporate data standardization, explainability techniques like SHAP and LIME, federated learning for secure collaborative research, and data augmentation methods such as transfer learning and GANs.

Future directions include the integration of multi-omic data, advanced explainable AI for clinical adoption, and the expansion of federated learning to facilitate cross-institutional collaborations. By bridging the gap between genotype and phenotype, AI-driven methodologies are transforming clinical genomics and personalized medicine. This chapter explores the methodologies, applications, challenges, and future prospects of AI in phenotype-genotype mapping, highlighting its pivotal role in advancing genetic research and improving healthcare outcomes.
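
As a rough illustration of the unsupervised clustering the abstract mentions, the sketch below runs k-means on a small, made-up genotype matrix (allele counts of 0, 1, or 2 per variant). It is not code from the chapter: the data, the function names, and the farthest-point initialization (chosen here only so the result is deterministic) are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first sample, then repeatedly add the sample farthest from all chosen
    centroids."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iterations=100):
    """Plain Lloyd-style k-means: assign each sample to its nearest centroid,
    recompute centroids as cluster means, stop when assignments are stable."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical data: 6 samples x 4 variants, coded as alternate-allele counts.
genotypes = [
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [0, 0, 2, 1],
    [2, 2, 0, 0],
    [2, 1, 0, 0],
    [2, 2, 0, 1],
]

labels, centroids = kmeans(genotypes, k=2)
print(labels)
```

On this toy matrix the first three samples fall into one cluster and the last three into the other. Real genotype data would normally be filtered and reduced in dimension (for example with PCA) before clustering, and a library implementation would be preferable to this hand-written loop.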

Keywords: Artificial intelligence; Genetic disorders; Graph Neural Networks; Human Phenotype Ontology; Next-generation sequencing; Polygenic Risk Scores.
