Statistical learning to identify and characterise neurodevelopmental outcomes at 2 years in babies born preterm: model development and validation using population-level data from England and Wales
- PMID: 40532624
- PMCID: PMC12212173
- DOI: 10.1016/j.ebiom.2025.105811
Abstract
Background: Children born preterm face elevated risks of neurodevelopmental impairments across domains. Prior studies have relied on expert-imposed typologies within single domains. This study applies statistical learning to a national database to identify transdomain clusters and their maternal and neonatal predictors.
Methods: Latent class analysis (LCA) was used to derive transdomain clusters from parent-reported visual, auditory, neuromotor, and communication impairments in preterm-born children at two years corrected age, using data from the UK National Neonatal Research Database (N = 27,261). Replication was conducted in an independent sample from Wales (N = 975). Clusters were clinically validated using cerebral palsy diagnosis, Bayley Scales of Infant and Toddler Development (3rd edition), and global neurodevelopmental delay. Random forest identified cluster-specific and shared predictors.
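As an illustration of the LCA step named in the Methods, the sketch below fits a latent class model by expectation-maximisation. It is a minimal example under stated assumptions, not the authors' implementation: the four impairment indicators are assumed to be binary (impaired or not), the function name `fit_lca` is hypothetical, and the data are simulated rather than drawn from the study.

```python
import random


def fit_lca(data, n_classes, n_iter=100, seed=0):
    """EM for a latent class model with binary indicators (illustrative only).

    data: list of equal-length tuples of 0/1 values, one tuple per child.
    Returns (weights, item_probs, posteriors).
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random start so the classes are not identical at initialisation.
    item_probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
                  for _ in range(n_classes)]
    posteriors = []
    for _ in range(n_iter):
        # E-step: posterior class membership for each response pattern.
        posteriors = []
        for row in data:
            joint = []
            for k in range(n_classes):
                p = weights[k]
                for x, t in zip(row, item_probs[k]):
                    p *= t if x else 1.0 - t
                joint.append(p)
            total = sum(joint)
            posteriors.append([p / total for p in joint])
        # M-step: update class weights and item-endorsement probabilities.
        for k in range(n_classes):
            nk = sum(post[k] for post in posteriors)
            weights[k] = nk / len(data)
            for j in range(n_items):
                num = sum(post[k] * row[j]
                          for post, row in zip(posteriors, data))
                # Clamp so no probability becomes exactly 0 or 1.
                item_probs[k][j] = min(max(num / max(nk, 1e-12), 1e-6),
                                       1 - 1e-6)
    return weights, item_probs, posteriors


# Simulated data (an assumption for illustration): two classes, four binary
# indicators, with roughly 30% of children in the high-impairment class.
sim = random.Random(1)
true_probs = {0: [0.05, 0.05, 0.05, 0.05], 1: [0.8, 0.8, 0.8, 0.8]}
data = []
for _ in range(1000):
    k = 1 if sim.random() < 0.3 else 0
    data.append(tuple(int(sim.random() < p) for p in true_probs[k]))

weights, item_probs, posteriors = fit_lca(data, n_classes=2)
```

Each child can then be assigned to the class with the highest posterior probability. In practice the number of classes would be chosen by comparing fit across candidate models rather than fixed in advance as it is here.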
Findings: Four homogeneous clusters were derived (silhouette score = 0.71) and replicated in Wales with high balanced accuracy (93%): (1) typically developing (84.8%), (2) communication impairments (8.4%), (3) neuro-motor impairments (4.1%), and (4) multiple neuro-morbidity (2.7%). Clusters had high clinical validity and were distinguishable by shared and cluster-specific predictors. Neonatal brain injuries were most predictive of neuro-motor and multiple neuro-morbidity clusters. Birthweight, gestational age, socio-economic deprivation, and sex were stronger predictors of the communication cluster than preterm co-morbidities.
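Balanced accuracy, the metric reported for the Welsh replication, is the mean of per-class recall. It is informative here because the clusters are highly imbalanced (84.8% typically developing), so plain accuracy would be dominated by the largest cluster. The sketch below shows the calculation on invented labels; the function name and the toy data are assumptions for illustration, not values from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Predictions made for the cases whose true class is c.
        preds_for_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in preds_for_c) / len(preds_for_c))
    return sum(recalls) / len(recalls)


# Toy labels (invented): class 0 is common, classes 1 to 3 are rare.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 2, 0]

plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
```

In this toy case the single missed case in the rarest class pulls balanced accuracy well below plain accuracy, which is the behaviour that makes the metric suitable for rare clusters.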
Interpretation: This study provides the first evidence of the transdomain nature of neurodevelopmental impairments after preterm birth using LCA. The finding that socio-demographic and perinatal factors, rather than co-morbidities, increase the risk of communication impairment highlights the importance of environmental modification alongside clinical interventions. Applying data-driven approaches to routinely collected data may offer a cost-effective way to stratify at-risk children and inform targeted support strategies.
Funding: UKRI Medical Research Council.
Keywords: Birth cohorts; Machine learning; Neonatal; Neurocognitive; Neurodevelopmental impairments; Preterm.
Copyright © 2025 The Author(s). Published by Elsevier B.V. All rights reserved.
Conflict of interest statement
Declaration of interests
AT has involvement in the following grants, which are broadly in the area of digital health, although they did not support the present manuscript: EU Horizon: “EUmetriosis: transforming endometriosis care in Europe via an integrated approach addressing current knowledge, diagnosis, tailored management and patient empowerment” (Universite Catholique de Louvain, Belgium); NIHR: EQUI-RESP-AFRICA (University of Edinburgh); UKRI “AI Centre for Doctoral Training in Biomedical Innovation (AI4BI)” (University of Edinburgh); Wellcome Trust Programme grant: 226944/Z/23/Z (University of Edinburgh); Dunhill Medical Trust PhD studentship: “Preventing rehospitalizations of elderly acute care survivors using longitudinal physical and mental health monitoring with wearable sensors and smartphones” (main PhD supervisor); Exogenous sex steroid hormones and asthma phenotypes: a population-based prospective cohort study using UK-wide primary care databases, (University of Edinburgh); NES Tender, Digital Health and Care Transformation Leaders Programme in Scotland, building on the NHS Digital Academy Leads (University of Edinburgh); UKRI/Versus Arthritis APDP consortia: MR/W002426/1 (University of Cambridge); HEE for the further development of the NHS Digital Academy (renewal award)–collaborative project between Imperial College London, the University of Edinburgh, and HDRUK; Standard Life Grant, topped up by EXPPECT contribution and UoE CMVM funds, PhD studentship on Endometriosis and wearable technology. Supervisors: A. Tsanas, A. Horne, P. Saunders (University of Edinburgh); ESRC: “Beyond the 10,000 steps: Managing less visible aspects of healthy ageing at work” (Business School, University of Edinburgh); BHF: RG/20/10/34966 (University of Edinburgh); HDRUK: CFC0109 (University of Oxford); Wellcome Trust ISSF, 204826/Z/16/Z and 204826/Z/16/Z (University of Oxford); Asthma UK, Asthma: renewal funding bid (University of Edinburgh & QMUL); NHS England commissioning for the development of the NHS Digital Academy. (Imperial College London and the University of Edinburgh, with input from Harvard University); HDRUK core site award (Reg. no: Edin1), Universities of Edinburgh (coordinating), Glasgow, Dundee, Aberdeen, Strathclyde, and St Andrews. AT received consulting fees from Mirador Analytics for statistical risk disclosure and dataset certification. AT received honoraria for talks in the area of digital health (World AI conference) and Cirrus Logic.
SRC reports the following grants: US NIH Grant R01AG054628, U01AG083829, & 1RF1AG073593 (University of Edinburgh); BBSRC & ESRC Grant BB/W008793/1 (University of Edinburgh); Wellcome & Royal Society Grant 221890/Z/20/Z (University of Edinburgh).
CB is supported by National Institute for Health and Care Research (NIHR) via an Advanced Fellowship programme, and holds unpaid leadership roles at the NIHR Health Technology Assessment Prioritisation committee (Deputy Chair) and the British Association of Perinatal Medicine (Honorary Secretary).
JPB holds a MRC UKRI Programme grant: “Preterm birth as a determinant of neurodevelopment and cognition in children: mechanisms and causal evidence”, MR/X003434/1, PI: J. Boardman (University of Edinburgh); reports book royalties from Wolters Kluwer for Avery and MacDonald's Neonatology Pathophysiology and Management of the Newborn, Eighth edition. Editors: J P Boardman, A M Groves, J Ramasethu. Publisher: Lippincott Williams & Wilkins (LWW). ISBN: 978-1-97-512925-5; support for travel expenses from Perinatal Science International, International Neonatology Association, Witness to House of Lords select committee on preterm birth, and the Joint European Neonatal Societies to attend meetings; participation at Data and Safety Monitoring Committee of the Pregnancy Outcome Prediction Study 2 (POPS2); and holds leadership roles at the NHS England Maternity Neonatal Programme, Member of Clinical Outcomes Group, Scientific Advisory Panel, Action Medical Research, PREMSTEM (Brain injury in the premature born infant: stem cell regeneration research network) scientific advisory board. EU programme.
SH, GDB, RMR, HCW, and REM declare no competing interests.