Dynamic suicide topic modelling: Deriving population-specific, psychosocial and time-sensitive suicide risk variables from Electronic Health Record psychotherapy notes
- PMID: 36797651
- PMCID: PMC11172400
- DOI: 10.1002/cpp.2842
Abstract
In the machine learning subfield of natural language processing, a topic model is an unsupervised method used to uncover abstract topics within a corpus of text. Dynamic topic modelling (DTM) captures change in these topics over time. This retrospective study applies DTM to a corpus of electronic health record psychotherapy notes and examines whether DTM helps distinguish closely matched patients who did and did not die by suicide. The cohort consists of United States Department of Veterans Affairs (VA) patients diagnosed with posttraumatic stress disorder (PTSD) between 2004 and 2013. Each case (a patient who died by suicide during the year following diagnosis) was matched with five controls (patients who remained alive) who shared psychotherapists and had similar suicide risk according to the VA's suicide prediction algorithm. The cohort was restricted to patients who received psychotherapy for 9+ months after their initial PTSD diagnosis (cases = 77; controls = 362). For cases, psychotherapy notes from diagnosis until death were examined. For controls, psychotherapy notes from diagnosis until the matched case's death date were examined. A Python-based DTM algorithm was used. The derived topics identified population-specific themes, including PTSD, psychotherapy, medication, communication and relationships. Control topics changed significantly more over time than case topics. Topic differences highlighted engagement, expressivity and therapeutic alliance. This study strengthens the groundwork for deriving population-specific, psychosocial and time-sensitive suicide risk variables.
Keywords: dynamic topic models; electronic medical records; natural language processing; suicide prediction.
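The idea of language "changing over time" between groups can be made concrete with a small sketch. The example below is not the DTM algorithm used in the study, which the abstract does not specify beyond being Python-based. It is a simplified stand-in that assumes notes are already grouped into time slices, builds one smoothed word distribution per slice, and measures drift between consecutive slices with Jensen-Shannon divergence. The function names (`word_distribution`, `js_divergence`, `drift`) and the toy note text are hypothetical and purely illustrative.

```python
import math
from collections import Counter


def word_distribution(docs, vocab):
    """Return add-one-smoothed relative word frequencies over a fixed vocabulary."""
    counts = Counter(word for doc in docs for word in doc.split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return [(counts[w] + 1) / total for w in vocab]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 means identical, 1 is the maximum."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift(time_slices):
    """Divergence between each pair of consecutive time slices."""
    vocab = sorted({w for docs in time_slices for doc in docs for w in doc.split()})
    dists = [word_distribution(docs, vocab) for docs in time_slices]
    return [js_divergence(a, b) for a, b in zip(dists, dists[1:])]


# Hypothetical toy data: each inner list holds the notes for one time slice.
stable = [
    ["sleep medication sleep"],
    ["sleep medication sleep"],
    ["medication sleep sleep"],
]
changing = [
    ["sleep medication sleep"],
    ["family work sleep"],
    ["family work goals"],
]

stable_drift = drift(stable)
changing_drift = drift(changing)
print("stable:", stable_drift)
print("changing:", changing_drift)
```

In this toy setting the `stable` series shows no drift because its word counts never change, while the `changing` series shows positive drift at each step. A full dynamic topic model would instead learn several latent topics and let each topic's word distribution evolve across slices; the divergence measure here only illustrates how such evolution might be summarized as a single number per time step.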
Published 2023. This article is a U.S. Government work and is in the public domain in the USA.
Conflict of interest statement
The authors have no conflict of interest.
