J Psychiatr Res. 2014 Dec:59:68-76.
doi: 10.1016/j.jpsychires.2014.08.017. Epub 2014 Sep 16.

Quantitative forecasting of PTSD from early trauma responses: a Machine Learning application

Isaac R Galatzer-Levy et al. J Psychiatr Res. 2014 Dec.

Abstract

There is broad interest in predicting the clinical course of mental disorders from early, multimodal clinical and biological information. Current computational models, however, constitute a significant barrier to realizing this goal. The early identification of trauma survivors at risk of post-traumatic stress disorder (PTSD) is plausible given the disorder's salient onset and the abundance of putative biological and clinical risk indicators. This work evaluates the ability of Machine Learning (ML) forecasting approaches to identify and integrate a panel of unique predictive characteristics and determine their accuracy in forecasting non-remitting PTSD from information collected within 10 days of a traumatic event. Data on event characteristics, emergency department observations, and early symptoms were collected in 957 trauma survivors, followed for 15 months. An ML feature selection algorithm identified a set of predictors that rendered all others redundant. Support Vector Machines (SVMs) as well as other ML classification algorithms were used to evaluate the forecasting accuracy of i) ML selected features, ii) all available features without selection, and iii) Acute Stress Disorder (ASD) symptoms alone. SVMs were also used to compare the prediction of a) PTSD diagnostic status at 15 months with b) posterior probability of membership in an empirically derived non-remitting PTSD symptom trajectory. Results are expressed as mean Area Under Receiver Operating Characteristics Curve (AUC). The feature selection algorithm identified 16 predictors, present in ≥ 95% of cross-validation trials. The accuracy of predicting non-remitting PTSD from that set (AUC = .77) did not differ from predicting from all available information (AUC = .78). Predicting from ASD symptoms was not better than chance (AUC = .60). Predicting PTSD diagnostic status (AUC = .71) was less accurate than predicting membership in a non-remitting trajectory. ML methods may fill a critical gap in forecasting PTSD. The ability to identify and integrate unique risk indicators makes this a promising approach for developing algorithms that infer probabilistic risk of chronic posttraumatic stress psychopathology based on complex sources of biological, psychological, and social information.
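
For readers unfamiliar with the metric, the mean AUC reported in the abstract can be illustrated with a minimal sketch. It computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, then averages over runs. The function names and the toy scores are assumptions made for illustration; they are not the authors' code or data.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. A value of 0.5 corresponds to chance.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(runs):
    """Average the AUC over several cross-validation runs."""
    return sum(auc(scores, labels) for scores, labels in runs) / len(runs)


# Hypothetical classifier scores and true labels for two held-out sets.
toy_runs = [
    ([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]),
    ([0.7, 0.6, 0.4, 0.1], [1, 1, 0, 0]),
]
print(mean_auc(toy_runs))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.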

Keywords: Course and prognosis; Early prediction; Forecasting; Machine Learning; Markov boundary feature selection; Posttraumatic stress disorder (PTSD); Support Vector Machines.


Figures

Figure 1. Three Trajectory Model of PTSD Symptom Severity Recovery Trajectories (n=957)
Note: x-axis indicates number of PTSD symptoms reported on the PSS-I. Y-axis represents time from ~10 days to ~15 months. Trajectories represent estimated marginal means. ‘d’ indicates days from emergency room admission. Individuals are identified as members of modeled trajectories based on their posterior probability of class membership derived using Latent Growth Mixture Modeling.
Figure 2. Machine Learning Approach for feature selection and classification
Note: 1. Unselected data are organized such that all potential predictors are normalized to ranges of 0–1 and a ‘target’ variable is specified; 2+3. The Markov Boundary Feature Selection algorithm removes redundant or uninformative variables to identify an irreducible set of predictors in a random 90% of cases. This set of predictors is then confirmed in a random 10% of cases, and the procedure is repeated 10 times; 4+5. Selected features are fed into seven different classification algorithms to determine the accuracy of selected features to classify the ‘target’ variable and to provide an accuracy estimate using area under the receiver operating characteristic curve (AUC). The classification algorithms are tested using the same cross-validation procedure as for feature selection. Additionally, when optimizing parameters of the model (for example, for the polynomial SVMs), an additional step of splitting each training set into a training and validation set is added; 6. A mean AUC across 100 cross-validation runs is provided to determine the overall accuracy of the selected and validated features for classifying the target variable (in this analysis remission vs. non-remission trajectory membership or PTSD diagnostic status). SVM = Support Vector Machine; ROC = Receiver Operating Characteristic.
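
The data preparation and resampling steps in the note above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the 100 cross-validation runs arise from 10 repetitions of a 10-fold split (each fold holding out roughly 10% of cases), and the function names `min_max_scale` and `repeated_kfold` are hypothetical.

```python
import random


def min_max_scale(column):
    """Rescale one predictor to the range 0-1 (constant columns map to 0.0)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def repeated_kfold(n_cases, k=10, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for `repeats` shuffled k-fold splits.

    Assumption: 10 repeats x 10 folds gives the 100 runs mentioned in the note.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_cases))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]  # k near-equal folds
        for held_out in folds:
            held = set(held_out)
            train = [i for i in order if i not in held]
            yield train, sorted(held_out)


print(min_max_scale([2, 4, 6]))
splits = list(repeated_kfold(n_cases=957))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

In each run, feature selection and classifier fitting would use only the training indices, with the held-out indices reserved for estimating AUC.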
Figure 3. Features Selected Using Generalized Local Learning (GLL-MB) Algorithm across 100 cross validation runs
The figure is a graphical representation of the percentage of cross-validation runs for which each feature was selected. Features that were selected at a high frequency indicate that they consistently provide unique predictive information. CGI refers to the Clinical Global Impression metric.
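
The selection percentage described in this caption can be computed with a short sketch like the one below. It counts how often each feature appears across cross-validation runs and keeps those at or above a threshold, using the 95% cut-off quoted in the abstract as the default. The feature names and the run data are hypothetical placeholders, not the study's predictors or results.

```python
from collections import Counter


def selection_frequency(selected_per_run):
    """Fraction of runs in which each feature was selected."""
    n_runs = len(selected_per_run)
    counts = Counter(f for run in selected_per_run for f in set(run))
    return {feature: count / n_runs for feature, count in counts.items()}


def stable_features(selected_per_run, threshold=0.95):
    """Features selected in at least `threshold` of the runs, sorted by name."""
    freq = selection_frequency(selected_per_run)
    return sorted(f for f, p in freq.items() if p >= threshold)


# Hypothetical output of a feature selector over 20 runs.
toy_selection = [["cgi", "pain", "age"]] * 10 + [["cgi", "pain"]] * 10
print(selection_frequency(toy_selection))
print(stable_features(toy_selection))
```

A feature that appears in only half of the runs, like the placeholder `age` here, would be plotted at 50% and excluded from the stable set.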
