Nat Genet. 2022 Mar;54(3):349-357.
doi: 10.1038/s41588-021-01010-x. Epub 2022 Feb 10.

GestaltMatcher facilitates rare disease matching using facial phenotype descriptors

Tzung-Chien Hsieh et al. Nat Genet. 2022 Mar.

Abstract

Many monogenic disorders cause a characteristic facial morphology. Artificial intelligence can support physicians in recognizing these patterns by associating facial phenotypes with the underlying syndrome through training on thousands of patient photographs. However, this 'supervised' approach means that diagnoses are only possible if the disorder was part of the training set. To improve recognition of ultra-rare disorders, we developed GestaltMatcher, an encoder for portraits that is based on a deep convolutional neural network. Photographs of 17,560 patients with 1,115 rare disorders were used to define a Clinical Face Phenotype Space, in which distances between cases define syndromic similarity. Here we show that patients can be matched to others with the same molecular diagnosis even when the disorder was not included in the training set. Together with mutation data, GestaltMatcher could not only accelerate the clinical diagnosis of patients with ultra-rare disorders and facial dysmorphism but also enable the delineation of new phenotypes.

Figures

Figure 1. Subsets of disorders supported by DeepGestalt and GestaltMatcher.
The lower x-axis shows examples of disease genes, and the upper x-axis is the cumulative number of genes. The y-axis shows the number of pathogenic submissions in ClinVar for each gene. The numbers on the curve indicate the number of submissions for each of the indicated genes. Most of the rare disorders that DeepGestalt supports have relatively high prevalence based on their ClinVar submissions; e.g., Cornelia de Lange syndrome (CdLS) is caused by a mutation in NIPBL, SMC1A, or HDAC8 (yellow), among other genes. Disease genes such as PACS1 (gray) cause highly distinctive phenotypes but are ultra-rare, representing the limit of what current technology can achieve. The first novel disease that was characterized by GestaltMatcher is caused by mutations in LEMD2 (red). A candidate disease gene associated with a characteristic phenotype that can be identified by GestaltMatcher is PSMC3.
Figure 2. Concept of GestaltMatcher.
a, Architecture of a deep convolutional neural network consisting of an encoder and a classifier. Facial dysmorphic features of 299 frequent syndromes were used for supervised learning. The last fully connected layer in the feature encoder was taken as a Facial Phenotypic Descriptor (FPD), which forms a point in the Clinical Face Phenotype Space (CFPS). b, In the CFPS, the distance between each patient’s FPD can be considered as a measure of similarity of their facial phenotypic features. The distances can be further used for classifying ultra-rare disorders or matching patients with novel phenotypes. Take the input image shown in the figure as an example: the patient’s ultra-rare disease, which is caused by mutations in LEMD2, was not in the classifier, but was matched with another patient with the same ultra-rare disorder in the CFPS.
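The matching step described in panel b can be sketched as a nearest-neighbor search over descriptors. This is a minimal illustration, not the authors' implementation: the caption only says "distance", so the use of cosine distance is an assumption, and the three-dimensional vectors, the patient labels, and the function names `cosine_distance` and `rank_gallery` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two descriptors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_gallery(query, gallery):
    """Return gallery labels ordered from most to least similar to the query."""
    scored = [(cosine_distance(query, vec), label) for label, vec in gallery]
    scored.sort()  # smallest distance first
    return [label for _, label in scored]


# Hypothetical toy descriptors; real FPDs come from the encoder's last
# fully connected layer and have far more dimensions.
gallery = [
    ("patient_A", [0.9, 0.1, 0.0]),
    ("patient_B", [0.0, 1.0, 0.2]),
    ("patient_C", [0.1, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]

ranking = rank_gallery(query, gallery)
print(ranking)
```

Because the ranking depends only on distances between descriptors, a query can be matched to a gallery patient whose disorder the classifier was never trained on, which is the point the caption makes with the LEMD2 example.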
Figure 3. Influence of the number of syndromes included in model training.
The x-axis is the number of syndromes used in model training. The y-axis shows the average top-10 accuracy of testing images in the rare set. Each line uses the same number of subjects per syndrome, which is shown in the key. For each point, we train the models five times with five different splits, and average the results. The null accuracy (the expected value if the encoder returned random predictions) is 1.2% (10/816).
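The top-10 accuracy and the null accuracy quoted in the caption can be illustrated with a short sketch. The function names and the toy rankings are hypothetical; the only figures taken from the caption are k = 10 and the denominator 816, which is treated here as the number of candidates a random predictor would choose from (an assumption about what 816 counts).

```python
def top_k_accuracy(ranked_labels_per_query, true_labels, k=10):
    """Fraction of queries whose true label appears among the first k ranked labels."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_labels_per_query, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


def null_top_k_accuracy(k, n_candidates):
    """Expected top-k accuracy when the ranking is uniformly random."""
    return k / n_candidates


# Null accuracy from the caption: 10 / 816.
null_acc = null_top_k_accuracy(10, 816)
print(f"{null_acc:.1%}")

# Hypothetical toy rankings for two queries, evaluated at k = 2.
toy_rankings = [["s1", "s2", "s3"], ["s2", "s3", "s1"]]
toy_truth = ["s2", "s1"]
toy_acc = top_k_accuracy(toy_rankings, toy_truth, k=2)
```

Averaging `top_k_accuracy` over models trained on five different splits would give one point on a curve of the kind the caption describes.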
Figure 4. Pairwise ranks of subjects with mutations in TMEM94.
Each label consists of family numbering and subject numbering, which are the same as in the original publication. For example, F-2-7 means the seventh subject in the second family. Each column is the result of testing the image indicated at the bottom of the column. The number in each box is the rank of the corresponding image in the gallery. The fourth column from the left is the result of testing F-2-5, and the fourth row from the bottom shows that F-1-1 has a rank of 2 for F-2-5. The fifth to seventh rows from the bottom show the ranks of subjects from family 2, the same family as F-2-5.
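A pairwise rank matrix of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the two-dimensional points, the labels, the Euclidean metric, and the function name `pairwise_ranks` are all hypothetical, and the gallery here contains only the other test subjects, whereas the figure's gallery may be larger.

```python
import math


def pairwise_ranks(embeddings):
    """For each query label, rank every other label by distance (1 = closest)."""
    labels = list(embeddings)
    ranks = {}
    for query in labels:
        others = [label for label in labels if label != query]
        others.sort(key=lambda label: math.dist(embeddings[query], embeddings[label]))
        ranks[query] = {label: i + 1 for i, label in enumerate(others)}
    return ranks


# Hypothetical toy embeddings; labels are placeholders, not the study's subjects.
embeddings = {
    "X-1": (0.0, 0.0),
    "X-2": (1.0, 0.0),
    "Y-1": (5.0, 5.0),
    "Y-2": (1.5, 0.0),
}

ranks = pairwise_ranks(embeddings)
# ranks[query][other] corresponds to one box: column = query, row = other.
print(ranks["X-1"]["X-2"], ranks["X-2"]["X-1"])
```

The matrix is generally not symmetric: in the toy data X-2 is the closest image to X-1, but X-1 is only the second closest to X-2, because Y-2 lies nearer to X-2.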
Figure 5. Correlation among syndrome prevalence, distinctiveness score, and top-10 accuracy.
a, Distribution of top-10 accuracy and distinctiveness score. The Spearman rank correlation coefficient was 0.400 (P = 0.004). b, Distribution of top-10 accuracy and prevalence. The Spearman rank correlation coefficient was −0.217 (P = 0.130). The details of each syndrome can be found in Supplementary Table 6 using the syndrome ID shown in the figure; syndrome 5 is Schuurs-Hoeijmakers syndrome. The y-axis shows the average top-10 accuracy of the experiments over 100 iterations.
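For readers unfamiliar with the statistic, the Spearman rank correlation coefficient is the Pearson correlation computed on ranks. The sketch below shows that calculation on hypothetical toy values; it does not reproduce the coefficients or P values reported in the caption, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values, not data from the study.
monotonic = spearman([0.1, 0.4, 0.2, 0.9], [10, 30, 20, 50])
swapped = spearman([1, 2, 3, 4], [2, 1, 4, 3])
print(monotonic, swapped)
```

Because only ranks enter the calculation, the coefficient measures monotonic association and is insensitive to the scale of accuracy, distinctiveness score, or prevalence.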
