Hum Genet. 2025 Mar;144(2-3):227-242.
doi: 10.1007/s00439-024-02722-w. Epub 2025 Jan 9.

CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs)


Maria Cristina Aspromonte et al. Hum Genet. 2025 Mar.

Abstract

The Genetics of Neurodevelopmental Disorders Lab in Padua provided a new intellectual disability (ID) Panel challenge for computational methods to predict patient phenotypes and their causal variants in the context of the Critical Assessment of Genome Interpretation, 6th edition (CAGI6). Eight research teams submitted a total of 30 models to predict phenotypes based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by Neurodevelopmental Disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions, with onset in infancy. Here, we assess the ability and accuracy of computational methods to predict comorbid phenotypes based on clinical features described in each patient and their causal variants. We also evaluated predictions for possible genetic causes in patients without a clear genetic diagnosis. As in the previous ID Panel challenge in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (Pathogenic/Likely Pathogenic, Variants of Uncertain Significance and Risk Factors) were provided. The phenotypic traits and variant data of 150 patients from the CAGI5 ID Panel Challenge were provided as a training set for predictors. The CAGI6 challenge confirms the CAGI5 finding that predicting phenotypes from gene panel data is highly challenging, with AUC values close to random and no method able to predict relevant variants with both high accuracy and precision. However, a significant improvement is noted for the best method, with recall increasing from 66% to 82%. Several groups also successfully predicted difficult-to-detect variants, emphasizing the importance of variants initially excluded by the Padua NDD Lab.
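
To make the recall and precision figures quoted above concrete, the following is a minimal sketch of how the two quantities could be computed for a set of predicted causal variants against a reference set. It is not the assessors' evaluation code. The function name `precision_recall` and the patient and variant identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted vs. reference variant calls.

    Both arguments are iterables of hashable items, here assumed to be
    (patient_id, variant_id) pairs. This representation is an assumption,
    not the challenge's submission format.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of predicted variants that are in the reference set.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of reference variants that were predicted.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data: four reference variants, three predictions.
reference = {("P1", "v1"), ("P2", "v2"), ("P4", "v3"), ("P5", "v4")}
predicted = {("P1", "v1"), ("P2", "v2"), ("P3", "v9")}
precision, recall = precision_recall(predicted, reference)
```

In this toy case two of the three predictions are correct and two of the four reference variants are recovered, so a method can score well on one measure while doing poorly on the other.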


Conflict of interest statement

Declarations. Ethical approval and consent to participate: This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of University Hospital of Padua, Italy. According to approved protocols of each referring clinical center, written informed consent was obtained from the probands or their legal representatives for specimen collection and genetic analysis. All individuals recruited provided informed consent for their participation in the study and publication of relevant findings. Conflict of interest: The authors declare no conflict of interest.

Figures

Fig. 1
Summary of CAGI-6 ID panel challenge dataset. a The number of patients where the presence or absence of the phenotype was ascertained by a clinician. b For the 415 patients included in the study, the Padua NDD lab noted at least one variant relevant to the phenotype in 43.4% of the patients
Fig. 2
Overall performance for each submission on phenotype prediction. A Each cell represents MCC values. The color scale ranges from green (+ 1, perfect correlation) to red (− 1, negative correlation). White means no better than random prediction. B Each cell represents the mean AUC values of the ROC for 1000 bootstrap iterations. The color scale ranges from dark (+ 1, perfect performance) to white (0, random performance). C Standard deviation (SD) of the bootstrapped AUC values shown in B. AUC, area under ROC curve; MCC, Matthews correlation coefficient; ROC, receiver operating characteristic
Fig. 3
Distribution of the ROC curves for all seven clinical traits. The best-performing submission for each phenotype, based on the AUC value, is shown
Fig. 4
Performance of the eight groups matching the specific phenotype in 415 patients. Colors represent the proportion and number of groups which correctly predicted the phenotype
Fig. 5
Predicted variants distribution. Category “Dataset” is the number of variants identified and classified by the Padua NDD lab. Each bar represents the number and types of variants predicted by each submission. NDD, neurodevelopmental disorder
Fig. 6
Performance of the eight groups predicting the correct variants. The number of variants was calculated for each category (P/LP, VUS, RF). Colors indicate the proportion and number of groups which correctly predicted those variants
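
The Fig. 2 caption refers to MCC values and to the mean and standard deviation of AUC over 1000 bootstrap iterations. A minimal sketch of those two metrics follows, using only the Python standard library. It illustrates the general technique and is not the evaluation pipeline used in the assessment. The function names, the fixed seed, and the choice to skip resamples that contain only one class are assumptions.

```python
import math
import random
from statistics import mean, stdev


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the coefficient is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def auc(y_true, scores):
    """ROC AUC as the probability that a random positive outscores a
    random negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(y_true, scores, n_iter=1000, seed=0):
    """Mean and standard deviation of AUC over bootstrap resamples."""
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    n = len(y_true)
    values = []
    while len(values) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        labels = [y_true[i] for i in idx]
        if len(set(labels)) < 2:
            continue  # AUC undefined for a single-class resample; draw again
        values.append(auc(labels, [scores[i] for i in idx]))
    return mean(values), stdev(values)
```

Under this definition an AUC of 0.5 corresponds to a ranking no better than chance, and an MCC of 0 to predictions uncorrelated with the true labels.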

