J Neurotrauma. 2020 Jun 15;37(12):1431-1444.
doi: 10.1089/neu.2019.6705. Epub 2020 Mar 11.

Unsupervised Machine Learning Reveals Novel Traumatic Brain Injury Patient Phenotypes with Distinct Acute Injury Profiles and Long-Term Outcomes

Kaitlin A Folweiler et al. J Neurotrauma.

Abstract

The heterogeneity of traumatic brain injury (TBI) remains a core challenge for the success of interventional clinical trials. Data-driven approaches for patient stratification may help to identify TBI patient phenotypes during the acute injury period as well as facilitate targeted trial patient enrollment and analysis of treatment efficacy. In this study, we implemented an unsupervised machine learning approach to identify TBI subpopulations at injury baseline using data from 1213 TBI patients who participated in the Citicoline Brain Injury Treatment Trial (COBRIT). A wrapper framework utilizing generalized low-rank models automatically selected relevant clinical features that were subsequently used to cluster patients using a partitioning around medoids clustering algorithm. Using this approach, we identified three patient phenotypes with unique clinical injury profiles based on a subset of acute injury features. Phenotype-specific differences in long-term functional outcome trajectories were observed at 3 and 6 months after injury. In comparison, when patients were grouped by baseline Glasgow Coma Scale (GCS), no differences in baseline clinical feature profiles or long-term outcomes were observed. To test phenotype reproducibility in an external validation data set, we used a K-nearest neighbors algorithm to classify subjects in the Transforming Research and Clinical Knowledge in Traumatic Brain Injury (TRACK-TBI) Pilot data set into corresponding phenotypes, then measured the Gower's dissimilarities between TRACK-TBI and COBRIT subjects in each phenotype. No significant differences were found between trial subjects within two phenotypes, suggesting that these phenotypes may be generalizable within a broad range of TBI severity. Further, Extended Glasgow Outcome Scale (GOS-E) outcomes in the TRACK-TBI data set similarly demonstrated phenotype-specific differences in long-term outcomes.
Our results suggest that unsupervised machine learning is a promising and effective approach for discovery of novel injury subpopulations over the conventional GCS-based method, and may improve patient selection in future TBI clinical trials.

Keywords: GCS; TBI; clinical trial; machine learning; unsupervised clustering.

Conflict of interest statement

No competing financial interests exist.

Figures

FIG. 1.
Diagram of the hybrid generalized low-rank model and clustering approach implemented for unsupervised learning. The full feature set with n features and m observations (i.e., traumatic brain injury [TBI] patients) is decomposed into two matrices of lower rank (i.e., dimensions), k. An L1-regularization parameter, γ, is applied to the second low-rank matrix to create a feature subset, n′, of the original matrix. The n′ × m feature subset is used to calculate an m × m dissimilarity matrix of the observations, which is clustered using the partitioning around medoids (PAM) algorithm. The average silhouette width of the clusters is calculated for a range of 3–10 clusters in PAM. The γ parameter is increased if the average silhouette width is higher than in the previous iteration, and the search stops when n′ is zero. The final feature subset n′ and clustering schema are selected using the γ value that yields the highest cluster silhouette width.
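The clustering half of the loop in Fig. 1 can be illustrated with a minimal sketch. It is not the authors' implementation: the generalized low-rank model and the γ search are omitted, the medoid search is a simplified greedy-swap variant of PAM with random initialization, and the function names (`pam`, `average_silhouette_width`, `best_k`) and the toy one-dimensional data are assumptions made for illustration only.

```python
import random

def pam(dist, k, seed=0):
    """Simplified k-medoids (greedy swap) on a precomputed dissimilarity matrix."""
    n = len(dist)
    medoids = random.Random(seed).sample(range(n), k)

    def cost(meds):
        # Total dissimilarity of every observation to its closest medoid.
        return sum(min(dist[i][m] for m in meds) for i in range(n))

    best = cost(medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                c = cost(trial)
                if c < best - 1e-12:  # accept only strict improvements
                    medoids, best, improved = trial, c, True
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels

def average_silhouette_width(dist, labels):
    """Mean silhouette over all observations; singletons contribute 0."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

def best_k(dist, k_range):
    """Return (silhouette, k, labels) for the k with the highest silhouette."""
    scored = []
    for k in k_range:
        _, labels = pam(dist, k)
        scored.append((average_silhouette_width(dist, labels), k, labels))
    return max(scored, key=lambda t: t[0])

# Toy data (hypothetical): three well-separated groups on a line.
points = [0, 1, 2, 50, 51, 52, 100, 101, 102]
dist = [[abs(p - q) for q in points] for p in points]
score, k, labels = best_k(dist, range(2, 5))
print(k, round(score, 3))
```

The toy run scans 2 to 4 clusters only because the data set is tiny; the caption describes a range of 3–10. In the full procedure this scan would be repeated for each candidate γ, keeping the feature subset whose best silhouette width is highest.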
FIG. 2.
Determining the necessity of each feature in contributing to the final cluster assignment. Feature necessity: Each feature was individually replaced with a null distribution of randomly shuffled values. The remaining features plus the null feature were then clustered upon and the similarity of the clustering result was compared with the original feature set clustering solution using two different measures: (A) the Jaccard similarity coefficient and (B) the pairwise similarity index. Any feature with a Jaccard similarity coefficient >0.75 and pairwise similarity index >90% (dotted lines) when nullified was considered unnecessary. Color image is available online.
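The feature-necessity check in Fig. 2 can be sketched as follows. This is a hedged illustration rather than the authors' code: the Jaccard coefficient is computed here over pairs of observations that share a cluster, which is one common definition and may differ from the one used in the paper, and the pairwise similarity index is not reproduced. The helper names (`co_clustered_pairs`, `jaccard_clusterings`, `nullify_feature`) and the label vectors are hypothetical.

```python
import random
from itertools import combinations

def co_clustered_pairs(labels):
    """Set of index pairs (i, j), i < j, that are assigned to the same cluster."""
    return {(i, j) for i, j in combinations(range(len(labels)), 2)
            if labels[i] == labels[j]}

def jaccard_clusterings(labels_a, labels_b):
    """Pair-counting Jaccard similarity between two clusterings of the same items."""
    pairs_a = co_clustered_pairs(labels_a)
    pairs_b = co_clustered_pairs(labels_b)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0

def nullify_feature(rows, col, seed=0):
    """Return a copy of rows with column `col` randomly shuffled across rows."""
    values = [row[col] for row in rows]
    random.Random(seed).shuffle(values)
    return [row[:col] + [v] + row[col + 1:] for row, v in zip(rows, values)]

# Hypothetical cluster labels before and after nullifying one feature.
original = [0, 0, 0, 1, 1, 1]
relabeled = [1, 1, 1, 0, 0, 0]   # same partition, different label names
perturbed = [0, 0, 1, 1, 1, 1]   # one observation moved to the other cluster

same = jaccard_clusterings(original, relabeled)
moved = jaccard_clusterings(original, perturbed)
print(same, round(moved, 3))
```

In use, each feature would be passed through `nullify_feature`, the data re-clustered, and the new labels compared with the original solution; under the caption's rule, a feature whose nullified clustering still scores above both thresholds would be treated as unnecessary.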
FIG. 3.
Partitional clustering reveals distinct traumatic brain injury phenotypes. (A) T-distributed stochastic neighbor embedding (T-SNE) projection of 1213 traumatic brain injury (TBI) patients from the Citicoline Brain Injury Treatment Trial (COBRIT) study, each dot representing one patient. The partitioning around medoids (PAM) clustering solution, which yielded the maximum average silhouette width, resulted in three clusters labeled phenotype A (teal, n = 420), phenotype B (red, n = 446), and phenotype C (purple, n = 347). X and Y axes denote two-dimensional (2-D) representation of six-dimensional feature space. Novel TBI phenotypes have different recovery outcome trajectories based on the Extended Glasgow Outcome Scale (GOS-E) scores at (B) 90 days and (C) 180 days post-injury. Statistical significance was computed using the Kruskal–Wallis test with Holm's correction for multiple comparisons (asterisks represent p values: ****p < 0.0001, p > 0.05 n.s.). Color image is available online.
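The statistical comparison named in the caption, a Kruskal–Wallis test with Holm's correction, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the analysis code from the study: the p-value helper uses the asymptotic chi-square approximation and is valid only for exactly three groups (two degrees of freedom), and the example values are made up rather than taken from the trial data.

```python
import math

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted((v, g) for g, vals in enumerate(groups) for v in vals)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2       # mean of the 1-based ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12 / (n * (n + 1)) * sum(r * r / len(vals)
                                 for r, vals in zip(rank_sums, groups)) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction

def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)

def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, (m - rank) * p_values[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted

# Made-up ordinal scores for three groups (not study data).
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
p = p_value_three_groups(h)
adj = holm_adjust([0.01, 0.04, 0.03])
print(round(h, 3), round(p, 4), [round(a, 3) for a in adj])
```

Because GOS-E is an ordinal scale with few distinct values, ties are frequent, which is why the sketch includes the tie correction.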
FIG. 4.
Baseline Glasgow Coma Scale (GCS) scores do not overlap with traumatic brain injury patient phenotypes and do not correlate with long-term outcome. (A) T-distributed stochastic neighbor embedding (T-SNE) projection of patients within a reduced feature space (same as Fig. 3) labeled by injury severity based on patients' acute GCS score. Injury severity was classified as severe (GCS <8, n = 834; dark green), moderate (GCS 9–12, n = 304; orange), and mild (defined as GCS 13–15 with an abnormal computed tomography [CT] scan, n = 75; blue). Extended Glasgow Outcome Scale (GOS-E) scores at (B) 90 days and (C) 180 days post-injury by injury severity. Statistical significance was computed using the Kruskal–Wallis test with Holm's correction for multiple comparisons (asterisks represent p values: *p < 0.05, p > 0.05 n.s.). Color image is available online.
FIG. 5.
Transforming Research and Clinical Knowledge in Traumatic Brain Injury (TRACK-TBI) Pilot subjects classified into phenotypes demonstrate similar injury profiles and Extended Glasgow Outcome Scale (GOS-E) outcomes as Citicoline Brain Injury Treatment Trial (COBRIT) phenotype subjects. (A) T-distributed stochastic neighbor embedding (T-SNE) projection of the original COBRIT subject phenotypes (COBRIT Phen. A, Phen. B, Phen. C) with the addition of TRACK-TBI Pilot subjects given phenotype assignments by a K-nearest neighbors (K-NN) classifier (TRACK-TBI Phen. A, Phen. B, Phen. C). TRACK-TBI phenotype GOS-E scores at (B) 90 days and (C) 180 days post-injury. Statistical significance was computed using the Kruskal–Wallis test with Holm's correction for multiple comparisons (asterisks represent p values: **p < 0.01, p > 0.05 n.s.). Color image is available online.
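The external-validation step, K-NN assignment of new subjects followed by a Gower's dissimilarity comparison, can be sketched as below. Several points are assumptions rather than details from the source: that Gower's dissimilarity also serves as the K-NN distance, the value of k, the function names (`gower`, `knn_classify`), and the two generic toy features (one numeric, one categorical), which are not the clinical features used in the study.

```python
from collections import Counter

def gower(a, b, numeric_ranges):
    """Gower's dissimilarity between two mixed-type records.

    numeric_ranges[i] is the range (max - min) of numeric feature i,
    or None when feature i is categorical.
    """
    total = 0.0
    for x, y, rng in zip(a, b, numeric_ranges):
        if rng is None:
            total += 0.0 if x == y else 1.0   # simple matching for categories
        elif rng > 0:
            total += abs(x - y) / rng         # range-normalized absolute difference
    return total / len(numeric_ranges)

def knn_classify(query, train_rows, train_labels, numeric_ranges, k=3):
    """Assign the majority label among the k nearest training records."""
    nearest = sorted(range(len(train_rows)),
                     key=lambda i: gower(query, train_rows[i], numeric_ranges))[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy reference set (hypothetical): [numeric feature, categorical feature].
train_rows = [[20, "yes"], [25, "yes"], [22, "yes"],
              [70, "no"], [65, "no"], [75, "no"]]
train_labels = ["A", "A", "A", "B", "B", "B"]
numeric_ranges = [55, None]   # 75 - 20 for the numeric column

first = knn_classify([30, "yes"], train_rows, train_labels, numeric_ranges)
second = knn_classify([68, "no"], train_rows, train_labels, numeric_ranges)
print(first, second)
```

After assignment, the within-phenotype Gower's dissimilarities between the reference and newly classified subjects can be computed with the same `gower` function and compared statistically.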
