Data-driven consideration of genetic disorders for global genomic newborn screening programs

Thomas Minten et al. medRxiv.

Update in

  • Data-driven consideration of genetic disorders for global genomic newborn screening programs.
    Minten T, Bick S, Adelson S, Gehlenborg N, Amendola LM, Boemer F, Coffey AJ, Encina N, Ferlini A, Kirschner J, Russell BE, Servais L, Sund KL, Taft RJ, Tsipouras P, Zouk H; ICoNS Gene List Contributors; Bick D; International Consortium on Newborn Sequencing (ICoNS); Green RC, Gold NB. Genet Med. 2025 Jul;27(7):101443. doi: 10.1016/j.gim.2025.101443. Epub 2025 May 9. PMID: 40357684

Abstract

Purpose: Over 30 international studies are exploring newborn sequencing (NBSeq) to expand the range of genetic disorders included in newborn screening. Gene selection varies substantially across programs, highlighting the need for a systematic approach to prioritizing genes.

Methods: We assembled a dataset comprising 25 characteristics of each of the 4,390 genes included in 27 NBSeq programs. We used regression analysis to identify predictors of inclusion and developed a machine learning model to rank genes for public health consideration.
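The outcome underlying such an analysis, how widely each gene is included across programs, can be illustrated with a minimal sketch. The program names, gene labels, and the `inclusion_share` helper below are hypothetical placeholders, not the study's data or code; the sketch only assumes that each program supplies a list of genes.

```python
from collections import Counter


def inclusion_share(program_gene_lists):
    """Return {gene: fraction of programs whose list contains that gene}."""
    n_programs = len(program_gene_lists)
    counts = Counter()
    for genes in program_gene_lists.values():
        counts.update(set(genes))  # count each gene at most once per program
    return {gene: n / n_programs for gene, n in counts.items()}


# Hypothetical toy input: five programs, four placeholder genes.
programs = {
    "program_a": ["GENE_1", "GENE_2", "GENE_3"],
    "program_b": ["GENE_1", "GENE_2"],
    "program_c": ["GENE_1", "GENE_3", "GENE_4"],
    "program_d": ["GENE_1", "GENE_2", "GENE_4"],
    "program_e": ["GENE_1"],
}

shares = inclusion_share(programs)

# Genes included by more than 80% of programs.
widely_included = sorted(g for g, s in shares.items() if s > 0.8)
```

In this toy input only `GENE_1` clears the 80% threshold; the per-gene share is the kind of quantity that gene characteristics could then be regressed against.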

Results: Among 27 NBSeq programs, the number of genes analyzed ranged from 134 to 4,299, with only 74 (1.7%) genes included by over 80% of programs. The most significant associations with gene inclusion across programs were presence on the US Recommended Uniform Screening Panel (inclusion increase of 74.7%, CI: 71.0%-78.4%) and robust evidence on the natural history (29.5%, CI: 24.6%-34.4%) and treatment efficacy (17.0%, CI: 12.3%-21.7%) of the associated genetic disease. A boosted trees machine learning model using 13 predictors achieved high accuracy in predicting gene inclusion across programs (AUC = 0.915, R² = 84%).
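For readers unfamiliar with the AUC metric reported above, the following minimal sketch shows how it can be computed from model scores and binary labels: it is the probability that a randomly chosen positive case is scored above a randomly chosen negative case, with ties counted as half. The scores and labels are made-up values, and treating inclusion as a binary label is an assumption made here for illustration; the abstract does not say how the outcome was defined for the AUC.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical model scores and labels (1 = included, 0 = not included).
example_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
example_labels = [1, 1, 0, 1, 0]

example_auc = auc(example_scores, example_labels)
```

Here five of the six positive-negative pairs are ordered correctly, so the sketch yields an AUC of 5/6; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.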

Conclusion: The machine learning model developed here provides a ranked list of genes that can adapt to emerging evidence and regional needs, enabling more consistent and informed gene selection in NBSeq initiatives.
