EBioMedicine. 2025 May:115:105677.
doi: 10.1016/j.ebiom.2025.105677. Epub 2025 Apr 24.

Artificial intelligence-driven genotype-epigenotype-phenotype approaches to resolve challenges in syndrome diagnostics

Christopher C Y Mak et al. EBioMedicine. 2025 May.

Abstract

Background: Decisions to split two or more phenotypic manifestations related to genetic variations within the same gene can be challenging, especially during the early stages of syndrome discovery. Genotype-based diagnostics with artificial intelligence (AI)-driven approaches using next-generation phenotyping (NGP) and DNA methylation (DNAm) can be utilized to expedite syndrome delineation within a single gene.

Methods: We utilized an expanded cohort of 56 patients (22 previously unpublished individuals) with truncating variants in the MN1 gene and attempted different methods to assess plausible strategies to objectively delineate phenotypic differences between the C-Terminal Truncation (CTT) and N-Terminal Truncation (NTT) groups. This involved transcriptomics analysis on available patient fibroblast samples and AI-assisted approaches, including a new statistical method of GestaltMatcher on facial photos and blood DNAm analysis using a support vector machine (SVM) model.

Findings: RNA-seq analysis did not show a significant difference in transcript expression, despite our previous hypothesis that NTT variants would induce nonsense-mediated decay. DNAm analysis on nine blood DNA samples revealed an episignature for the CTT group. In parallel, the new statistical method of GestaltMatcher objectively distinguished the CTT and NTT groups while requiring only a small cohort. The approach was validated on syndromes with known DNAm signatures (SRCAP, SMARCA2 and ADNP) to demonstrate its effectiveness.

Interpretation: We demonstrate the potential of AI-based technologies that leverage genotype, phenotype and epigenetic data to facilitate splitting decisions in syndrome diagnosis with minimal sample requirements.

Funding: The specific funding of this article is provided in the acknowledgements section.

Keywords: GestaltMatcher; MCTT; MN1; Methylation; Splitting; Support vector machine.

Conflict of interest statement

Declaration of interests: The authors declare that they have no competing interests.

Figures

Fig. 1
Overview of combined GestaltMatcher and DNAm approach for splitting of gene–disease relationships according to ClinGen. Evidence for MN1, SRCAP, SMARCA2 and ADNP in relation to the ClinGen lumpers and splitters framework by Thaxton et al. The latter three are syndromes with DNAm signatures reported by Rots et al., Chater-Diehl et al. and Bend et al. Our proposed approach enhances evidence for splitting by molecular mechanism (DNAm) and phenotypic features (GestaltMatcher).
Fig. 2
a. Clinical features of individuals with previously unreported MN1 C-Terminal Truncations (CTTs). The core recognizable facial features of MCTT (midface hypoplasia, downslanting palpebral fissures, hypertelorism, exophthalmia, short, upturned nose, and small, low-set ears) are consistently observed. Individual C09 is the oldest known individual with MCTT to date. Oral photos demonstrate an abnormal palate with thickened lateral palatine ridges and severe dental malposition and crowding (Individuals C10, C32, C34). b. Clinical features of individuals with previously unreported MN1 N-Terminal Truncations (NTTs). Among all individuals with NTT, there is reduced height of the lower third of the face with micrognathia, and the nose is often prominent. Individual N03 with hypotelorism, long nose with prominent full tip, small mouth and microretrognathia; Individual N08 with high forehead, low nasal bridge, short nose, high arched palate and tented upper lip; Individual N09 with downslanting palpebral fissures and broad nasal bridge, and submucous cleft palate (not shown in photo); Individual N12 with upslanting palpebral fissures, high nasal bridge and micrognathia; Individual N13 (mother of N12) with broad-tipped nose, long palpebral fissures and thin vermilion of the upper lip.
Fig. 3
RNA-seq of NTT and CTT variants. RNA-seq data of fibroblast samples from two individuals (N02 and N13) with NTT variants and four individuals (C21, C22, C33 and C38) with previously reported CTT variants. NTT transcripts (c.880C > T and c.3417dupG) showed read counts of 40% and 22%, respectively. Read counts for samples with CTT transcripts (c.3873delC, c.3870_3879dup, c.3883C > T, and c.3903G > A) were 39.1%, 34.1%, 45%, and 52.7%, respectively. There was no significant difference between the expression of NTT and CTT transcripts.
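The comparison in this caption can be illustrated with a small worked example. The sketch below takes the six read-count percentages quoted above and runs an exact two-sided permutation test on the difference in group means. The caption does not say which statistical test the authors used, so the choice of test, the function name `exact_permutation_p`, and the tolerance are assumptions made purely for illustration.

```python
from itertools import combinations

# Read-count percentages as quoted in the Fig. 3 caption.
ntt = [40.0, 22.0]              # c.880C>T, c.3417dupG
ctt = [39.1, 34.1, 45.0, 52.7]  # c.3873delC, c.3870_3879dup, c.3883C>T, c.3903G>A


def mean(values):
    return sum(values) / len(values)


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation test on the absolute difference in means.

    Illustrative choice of test only; not stated in the source.
    """
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    hits = total = 0
    # Enumerate every way of relabelling len(group_a) samples as "group A".
    for idx in combinations(range(len(pooled)), len(group_a)):
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
    return hits / total


p_value = exact_permutation_p(ntt, ctt)
print(f"NTT mean {mean(ntt):.1f}%, CTT mean {mean(ctt):.1f}%, p = {p_value:.3f}")
```

With only two NTT samples there are just 15 possible relabellings, so the test has very little power; under these assumptions the observed difference is matched or exceeded in 4 of the 15, which is consistent with the caption's statement of no significant difference.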
Fig. 4
GestaltMatcher AI-driven facial gestalt analysis. a. Pairwise ranks of 38 images (NTT: 10 and CTT: 28) in the Clinical Face Phenotype Space (CFPS). Photos are named with the molecular location of the expected truncation followed by the patient ID. Gallery images were the images to be matched in CFPS. Each column shows the result of testing one subject, with the ranks of the other 37 photos listed in the rows. For example, when testing C41 (the third column from the right), C37 ranked first and C42 ranked third. Both X and Y axes were sorted by genomic location. The boundary between NTT and CTT separated the patients into two clear clusters. b. t-SNE visualization of the Facial Phenotypic Descriptors of the 38 images. Reducing the descriptors to two dimensions supports CTT (blue) and NTT (orange) as two distinct phenotypic entities. c. Comparison of the distance distribution between CTT and NTT patients with distributions sampled from the same syndrome and from different syndromes. The same-syndrome distribution (blue) was sampled from patients with the same syndrome, and the different-syndrome distribution (red) was sampled from patients with two different syndromes. 100% of the CTT and NTT distribution is above the threshold, indicating that patients with CTT and NTT present two different facial phenotypes. d. Downsampling analysis between the CTT and NTT groups. The X-axis is the number of samples included in each group. When both groups had at least two patients, the mean pairwise distance distribution was above the threshold.
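The pairwise ranking in panel a and the between-group distance comparison in panels c and d can be sketched in a few lines. The example below is a toy illustration only: the three-dimensional descriptor vectors, the patient IDs, and the use of cosine distance are all assumptions, and the real Facial Phenotypic Descriptors and the study's threshold are not reproduced here.

```python
from itertools import combinations, product
from math import sqrt

# Hypothetical facial descriptors (real ones are far higher-dimensional).
descriptors = {
    "C1": [1.00, 0.10, 0.00],
    "C2": [0.90, 0.20, 0.10],
    "C3": [0.95, 0.00, 0.05],
    "N1": [0.10, 1.00, 0.00],
    "N2": [0.00, 0.90, 0.20],
}
groups = {"CTT": ["C1", "C2", "C3"], "NTT": ["N1", "N2"]}


def cosine_distance(u, v):
    """1 - cosine similarity; assumed distance metric for this sketch."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_gallery(test_id):
    """Return all other subjects ordered from nearest to farthest."""
    others = [pid for pid in descriptors if pid != test_id]
    return sorted(
        others,
        key=lambda pid: cosine_distance(descriptors[test_id], descriptors[pid]),
    )


def mean_distance(pairs):
    """Mean cosine distance over an iterable of (id, id) pairs."""
    pairs = list(pairs)
    total = sum(cosine_distance(descriptors[a], descriptors[b]) for a, b in pairs)
    return total / len(pairs)


# Mean distance across groups versus within groups.
between = mean_distance(product(groups["CTT"], groups["NTT"]))
within = mean_distance(
    list(combinations(groups["CTT"], 2)) + list(combinations(groups["NTT"], 2))
)

print("Ranking for C1:", rank_gallery("C1"))
print(f"between-group mean {between:.3f}, within-group mean {within:.3f}")
```

In the caption's terms, one column of panel a corresponds to one call to `rank_gallery`, and the quantity compared against the threshold in panels c and d corresponds to `between`. How the threshold itself is derived from the same-syndrome and different-syndrome distributions is not covered by this sketch.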
Fig. 5
DNAm signatures for truncating variants in MN1. a. Principal component analysis (left) showing the separation between individuals with CTT (blue circles) and controls (black circles) from the discovery cohort at the CTT specific DNAm signature (30 CpG sites). Hierarchical clustering and heatmap (right) displaying the methylation profile between the 2 groups at the signature sites and the clustering of CTT individuals separate from controls. NTT MN1 samples (green) separate from CTT (blue) with the affected mother (triangle, N13) clustering with controls. Percentage variation from principal component analysis in brackets. b. Classification of samples using SVM machine learning models based on each DNAm signature, showing a clear separation of CTT samples (dark blue) from NTT samples (green), thus validating the distinction of CTT and NTT samples by PCA plot. Scores closer to 1 indicate that the DNAm profile is positive and likely disease causing and scores closer to 0 indicate that the classification is negative and that the DNAm profile is more similar to controls.
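To make the 0-to-1 scoring in panel b concrete, the following is a minimal sketch of a linear SVM trained on methylation beta values and then used to score new samples. Everything specific in it is an assumption: the beta values and the three CpG sites are invented, the kernel and training procedure used in the study are not given in this text, and the logistic squashing in `score` is a simple stand-in for whatever probability calibration the authors applied.

```python
from math import exp

# Hypothetical beta values (0 = unmethylated, 1 = fully methylated) at three
# made-up CpG sites. Label +1 = case, -1 = control. Not data from the study.
train_x = [
    [0.80, 0.20, 0.50], [0.75, 0.25, 0.55], [0.85, 0.15, 0.45],
    [0.30, 0.70, 0.50], [0.35, 0.65, 0.52], [0.25, 0.75, 0.48],
]
train_y = [1, 1, 1, -1, -1, -1]


def decision_value(w, b, x):
    """Signed distance-like value w.x + b of sample x from the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=500):
    """Batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(xs), len(xs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(xs, ys):
            if y * decision_value(w, b, x) < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def score(w, b, x):
    """Map the decision value into (0, 1); assumed stand-in for calibration."""
    return 1.0 / (1.0 + exp(-decision_value(w, b, x)))


w, b = train_linear_svm(train_x, train_y)
case_score = score(w, b, [0.78, 0.22, 0.50])     # hypothetical case-like sample
control_score = score(w, b, [0.30, 0.68, 0.50])  # hypothetical control-like sample
print(f"case-like sample: {case_score:.2f}, control-like sample: {control_score:.2f}")
```

Read against the caption, a sample whose score lies closer to 1 has a methylation profile resembling the signature-positive training cases, and a score closer to 0 resembles controls. In practice an established library implementation would normally be preferred over a hand-written trainer like this one.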
