J Allergy Clin Immunol. 2025 Jul;156(1):186-194.
doi: 10.1016/j.jaci.2025.02.032. Epub 2025 Mar 7.

A predictive model for identification of pediatric individuals with common variable immunodeficiency through electronic medical records

Nouf Alsaati et al. J Allergy Clin Immunol. 2025 Jul.

Abstract

Introduction: Common variable immunodeficiency (CVID) is characterized by recurrent sinopulmonary infections. However, in the pediatric population, recurrent sinopulmonary infections early in life are common, which can render key clinical features of CVID less distinctive. Accordingly, the diagnosis of CVID is often delayed owing to the heterogeneous nature of the presentation and the broad range of ages of onset. A 10-year lag in diagnosis has been found for CVID, and there is a critical need for improved time to diagnosis.

Objective: Our aim was to utilize machine learning techniques to identify a clinical signature of CVID in a pediatric population.

Methods: Our selected cohort included 112 individuals with CVID and 627 controls. The controls were restricted from having other medical conditions associated with infection. A machine learning data set was constructed by summing patient-level counts of clinical metrics. A total of 3 supervised machine learning classifiers were trained, tuned, and performance-tested. We validated our findings using a distinct control cohort with high medical complexity and tested a logistic regression approach.
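The step of summing patient-level counts can be pictured with a minimal Python sketch. The abstract does not describe the actual record layout or pipeline, so the `events` list, the feature names, and the `build_count_matrix` helper are all hypothetical and are not the authors' code:

```python
from collections import Counter, defaultdict

# Hypothetical event-level records: (patient_id, event_type).
# Real electronic medical record extracts will look different.
events = [
    ("p1", "chest_radiograph"),
    ("p1", "antibiotic_prescription"),
    ("p1", "antibiotic_prescription"),
    ("p2", "common_infection"),
    ("p2", "chest_radiograph"),
    ("p1", "common_infection"),
]

# Illustrative feature columns, named after metrics mentioned in the abstract.
FEATURES = ["chest_radiograph", "antibiotic_prescription", "common_infection"]


def build_count_matrix(events, features):
    """Sum event occurrences per patient into one row of counts per patient."""
    counts = defaultdict(Counter)
    for patient_id, event_type in events:
        counts[patient_id][event_type] += 1
    patient_ids = sorted(counts)
    # A Counter returns 0 for event types a patient never had.
    matrix = [[counts[pid][f] for f in features] for pid in patient_ids]
    return patient_ids, matrix


patient_ids, X = build_count_matrix(events, FEATURES)
for pid, row in zip(patient_ids, X):
    print(pid, dict(zip(FEATURES, row)))
```

Each row of `X` is one patient and each column is one count. A supervised classifier can be trained on a table of this general shape.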

Results: Key features associated with CVID were chest radiograph count, number of antibiotic prescriptions, and number of common infections. Our Extreme Gradient Boosting (XGBoost) model best predicted eventual CVID diagnosis up to 10 years before the eventual clinical diagnosis, with an F1 score of 0.77: it correctly classified 21 of 29 CVID diagnoses (8 false-negative results) and 179 of 183 patients without CVID (4 false-positive results). Key features with a robust association with pediatric CVID were the frequency of common infections and antibiotic prescriptions.
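To show how the reported counts relate to the summary metrics, the short Python sketch below recomputes precision, recall, specificity, and F1 from the confusion-matrix counts given above, using the standard definitions. The variable names are illustrative. The abstract does not say how its F1 score was rounded or averaged, so a small difference from the reported 0.77 is to be expected:

```python
# Confusion-matrix counts taken from the Results text.
tp, fn = 21, 8    # CVID cases classified correctly / missed
tn, fp = 179, 4   # non-CVID patients classified correctly / flagged in error

precision = tp / (tp + fp)      # share of flagged patients who have CVID
recall = tp / (tp + fn)         # share of CVID cases that were flagged
specificity = tn / (tn + fp)    # share of non-CVID patients left unflagged
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"specificity={specificity:.3f} f1={f1:.3f}")
```

From these counts, precision is 0.84, recall is about 0.72, and F1 is about 0.78, close to the reported 0.77.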

Conclusion: In spite of a high frequency of infections in the comparator population, the clinical signature of pediatric CVID was sufficiently distinctive to enable early identification.

Keywords: Primary immunodeficiency; antibody deficiency; artificial intelligence; inborn error of immunity; machine learning; predictive algorithm.

Conflict of interest statement

Disclosure statement: Supported by The Children's Hospital of Philadelphia and the Wallace Chair of Pediatrics (to K.S.), the National Institute for Neurological Disorders and Stroke (grants K02 NS112600, R01 NS131512, and R01 NS127830 [to I.H.]), and the Hartwell Foundation (an individual biomedical research award [to I.H.]). Disclosure of potential conflict of interest: The authors declare that they have no relevant conflicts of interest.