Hum Genet. 2025 Mar;144(2-3):227-242.

doi: 10.1007/s00439-024-02722-w. Epub 2025 Jan 9.

CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs)

Maria Cristina Aspromonte^#^{1,2}, Alessio Del Conte^#¹, Shaowen Zhu³, Wuwei Tan³, Yang Shen³, Yexian Zhang^{4,5}, Qi Li^{4,5}, Maggie Haitian Wang^{4,5}, Giulia Babbi⁶, Samuele Bovo⁷, Pier Luigi Martelli⁶, Rita Casadio⁶, Azza Althagafi^{8,9}, Sumyyah Toonsi⁸, Maxat Kulmanov⁸, Robert Hoehndorf⁸, Panagiotis Katsonis¹⁰, Amanda Williams¹⁰, Olivier Lichtarge¹⁰, Su Xian¹¹, Wesley Surento¹¹, Vikas Pejaver^{12,13}, Sean D Mooney¹¹, Uma Sunderam¹⁴, Rajgopal Srinivasan¹⁴, Alessandra Murgia², Damiano Piovesan¹, Silvio C E Tosatto^{15,16}, Emanuela Leonardi^{17,18}

Affiliations

¹ Department of Biomedical Sciences, University of Padova, Padova, Italy.
² Department of Women's and Children's Health, University of Padova, Padova, Italy.
³ Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA.
⁴ CUHK Shenzhen Research Institute, Shenzhen, China.
⁵ JC School of Public Health and Primary Care, Chinese University of Hong Kong, Hong Kong, SAR, China.
⁶ Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.
⁷ Department of Agricultural and Food Sciences, University of Bologna, Bologna, Italy.
⁸ Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
⁹ Computer Science Department, College of Computers and Information Technology, Taif University, Taif, 26571, Saudi Arabia.
¹⁰ Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
¹¹ Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98195, USA.
¹² Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
¹³ Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
¹⁴ Innovation Labs, Tata Consultancy Services, Hyderabad, India.
¹⁵ Department of Biomedical Sciences, University of Padova, Padova, Italy. silvio.tosatto@unipd.it.
¹⁶ Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy. silvio.tosatto@unipd.it.
¹⁷ Department of Biomedical Sciences, University of Padova, Padova, Italy. emanuela.leonardi@unipd.it.
¹⁸ Department of Women's and Children's Health, University of Padova, Padova, Italy. emanuela.leonardi@unipd.it.

^# Contributed equally.

PMID: 39786577
PMCID: PMC11976362
DOI: 10.1007/s00439-024-02722-w

Maria Cristina Aspromonte et al. Hum Genet. 2025 Mar.

Abstract

The Genetics of Neurodevelopmental Disorders Lab in Padua provided a new intellectual disability (ID) Panel challenge for computational methods to predict patient phenotypes and their causal variants in the context of the Critical Assessment of Genome Interpretation, 6th edition (CAGI6). Eight research teams submitted a total of 30 models to predict phenotypes based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by neurodevelopmental disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions with onset in infancy. Here, we assess the ability and accuracy of computational methods to predict comorbid phenotypes, based on the clinical features described in each patient, and their causal variants. We also evaluated predictions for possible genetic causes in patients without a clear genetic diagnosis. As in the previous ID Panel challenge in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (Pathogenic/Likely Pathogenic, Variants of Uncertain Significance and Risk Factors) were provided. The phenotypic traits and variant data of 150 patients from the CAGI5 ID Panel Challenge were provided as a training set for predictors. The CAGI6 challenge confirms the CAGI5 finding that predicting phenotypes from gene panel data is highly challenging, with AUC values close to random and no method able to predict relevant variants with both high accuracy and precision. However, a significant improvement is noted for the best method, with recall increasing from 66% to 82%. Several groups also successfully predicted difficult-to-detect variants, emphasizing the importance of variants initially excluded by the Padua NDD Lab.

Conflict of interest statement

Declarations. Ethical approval and consent to participate: This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of University Hospital of Padua, Italy. According to approved protocols of each referring clinical center, written informed consent was obtained from the probands or their legal representatives for specimen collection and genetic analysis. All individuals recruited provided informed consent for their participation in the study and publication of relevant findings. Conflict of interest: The authors declare no conflict of interest.

Figures

**Fig. 1**
Summary of CAGI-6 ID panel challenge dataset. a The number of patients where the presence or absence of the phenotype was ascertained by a clinician. b For the 415 patients included in the study, the Padua NDD lab noted at least one variant relevant to the phenotype in 43.4% of the patients

**Fig. 2**
Overall performance for each submission on phenotype prediction. A Each cell represents MCC values. The color scale ranges from green (+1, perfect correlation) to red (−1, negative correlation). White means no better than random prediction. B Each cell represents the mean AUC values of the ROC for 1000 bootstrap iterations. The color scale ranges from dark (+1, perfect performance) to white (0, random performance). C Standard deviation (SD) of the bootstrapped AUC values shown in B. AUC, area under ROC curve; MCC, Matthews correlation coefficient; ROC, receiver operating characteristic

**Fig. 3**
Distribution of the ROC curves for all seven clinical traits. The best-performing submission for each phenotype, based on the AUC value, is shown

**Fig. 4**
Performance of the eight groups matching the specific phenotype in 415 patients. Colors represent the proportion and number of groups which correctly predicted the phenotype

**Fig. 5**
Distribution of predicted variants. Category “Dataset” is the number of variants that were identified and classified by the Padua NDD lab. Each bar represents the number and types of variants predicted by each submission. NDD, neurodevelopmental disorder

**Fig. 6**
Performance of the eight groups in predicting the correct variants. The number of variants was calculated for each category (P/LP, VUS, RF). Colors indicate the proportion and number of groups that correctly predicted those variants
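
The metrics named in the Fig. 2 caption (MCC, and the mean and SD of AUC over bootstrap iterations) can be illustrated with a minimal sketch. The labels and scores below are invented toy data, not challenge results, and the function names are hypothetical; the assessors' actual implementation is not described on this page, so the resampling details (sampling patients with replacement, skipping single-class resamples) are assumptions.

```python
import math
import random
import statistics


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a margin is empty and MCC is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def auc(y_true, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly
    (ties count as half). Returns None if only one class is present."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(y_true, scores, n_iter=1000, seed=0):
    """Mean and SD of AUC over resamples drawn with replacement (assumed scheme)."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]
        value = auc([y_true[i] for i in idx], [scores[i] for i in idx])
        if value is not None:  # skip resamples containing a single class
            values.append(value)
    return statistics.mean(values), statistics.stdev(values)


# Toy example: 1 = phenotype present, 0 = absent; scores are made-up probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

point_mcc = mcc(y_true, y_pred)
point_auc = auc(y_true, scores)
mean_auc, sd_auc = bootstrap_auc(y_true, scores, n_iter=1000)
```

On this scale an AUC of 0.5 and an MCC of 0 both correspond to chance-level prediction, which is the regime the abstract reports for phenotype prediction from gene panel data.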

Update of

CAGI6 ID-Challenge: Assessment of phenotype and variant predictions in 415 children with Neurodevelopmental Disorders (NDDs).
Aspromonte MC, Conte AD, Zhu S, Tan W, Shen Y, Zhang Y, Li Q, Wang MH, Babbi G, Bovo S, Martelli PL, Casadio R, Althagafi A, Toonsi S, Kulmanov M, Hoehndorf R, Katsonis P, Williams A, Lichtarge O, Xian S, Surento W, Pejaver V, Mooney SD, Sunderam U, Srinivasan R, Murgia A, Piovesan D, Tosatto SCE, Leonardi E. Res Sq [Preprint]. 2023 Aug 2:rs.3.rs-3209168. doi: 10.21203/rs.3.rs-3209168/v1. Update in: Hum Genet. 2025 Mar;144(2-3):227-242. doi: 10.1007/s00439-024-02722-w. PMID: 37577579.
