Deep Learning Transformer Models for Building a Comprehensive and Real-time Trauma Observatory: Development and Validation Study

Gabrielle Chenais^#¹, Cédric Gil-Jardiné^#¹,², Hélène Touchais^#¹, Marta Avalos Fernandez^#¹,³, Benjamin Contrand^#¹, Eric Tellier^#¹,², Xavier Combes^#², Loick Bourdois^#¹, Philippe Revel^#², Emmanuel Lagarde^#¹

Affiliations

¹ Unit 1219, Bordeaux Public Health Center, Institut National de la Santé et de la Recherche Médicale, Bordeaux, France.
² Emergency Department, Bordeaux University Hospital, Bordeaux, France.
³ Statistics in Systems Biology and Translational Medicine Team, University of Bordeaux, Institut National de Recherche en Sciences et Technologies du Numérique, Talence, France.

^# Contributed equally.

PMID: 38875539
PMCID: PMC11041521
DOI: 10.2196/40843

Gabrielle Chenais et al. JMIR AI. 2023 Jan 12;2:e40843. doi: 10.2196/40843.

Abstract

Background: Public health surveillance relies on the collection of data, often in near-real time. Recent advances in natural language processing make it possible to envisage an automated system for extracting information from electronic health records.

Objective: To study the feasibility of setting up a national trauma observatory in France, we compared the performance of several natural language processing methods on a multiclass classification task over unstructured clinical notes.

Methods: A total of 69,110 free-text clinical notes related to visits to the emergency departments of the University Hospital of Bordeaux, France, between 2012 and 2019 were manually annotated. Among these clinical notes, 32.5% (22,481/69,110) were traumas. We trained 4 transformer models (deep learning models that incorporate an attention mechanism) and compared them with a term frequency-inverse document frequency representation combined with a support vector machine classifier.
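As an illustration of the term frequency-inverse document frequency weighting named above, the following is a minimal sketch in plain Python. It assumes whitespace tokenization and the basic tf × log(N/df) formula; the function name `tfidf` and the example notes are hypothetical and are not taken from the study, whose exact preprocessing and weighting variant are not described here. The support vector machine step is omitted.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    # Assumption: lowercase whitespace tokenization (the study's tokenizer is unknown).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, made-up notes used only to exercise the function.
notes = [
    "fall bike pain wrist",
    "chest pain no trauma",
    "fall stairs pain ankle",
]
vectors = tfidf(notes)
```

A term that occurs in every note (here "pain") receives a weight of zero, whereas a term found in a single note (such as "bike") receives the largest weight, which is why this representation emphasizes discriminative vocabulary before a classifier is applied.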

Results: The transformer models consistently performed better than the term frequency-inverse document frequency and support vector machine baseline. Among the transformers, the GPTanam model, pretrained on a French corpus with an additional self-supervised learning step on 306,368 unlabeled clinical notes, showed the best performance, with a micro F₁-score of 0.969.
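For readers unfamiliar with the metric, the sketch below shows how a micro-averaged F₁-score can be computed for a single-label multiclass task: true positives, false positives, and false negatives are pooled over all classes before the score is taken. The function name `micro_f1` and the class labels are hypothetical and serve only as an example; they are not the study's categories or code.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then score."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1  # predicted this class, but the truth is another class
            elif truth == label:
                fn += 1  # missed an instance of this class
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical labels for five notes (illustrative only).
y_true = ["trauma", "trauma", "other", "other", "trauma"]
y_pred = ["trauma", "other", "other", "other", "trauma"]
score = micro_f1(y_true, y_pred)
```

When every note receives exactly one label, each misclassification counts once as a false positive and once as a false negative, so the micro F₁-score equals plain accuracy (4 of 5 correct in this toy example).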

Conclusions: The transformers proved efficient at the multiclass classification of narrative and medical data. Further steps for improvement should focus on the expansion of abbreviations and multioutput multiclass classification.

Keywords: deep learning; emergencies; natural language processing; public health; transformers; trauma.

©Gabrielle Chenais, Cédric Gil-Jardiné, Hélène Touchais, Marta Avalos Fernandez, Benjamin Contrand, Eric Tellier, Xavier Combes, Loick Bourdois, Philippe Revel, Emmanuel Lagarde. Originally published in JMIR AI (https://ai.jmir.org), 12.01.2023.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

**Figure 1**
F1-score curves for CamemBERT, FlauBERT, BelGPT2 and GPTanam on the validation dataset.

**Figure 2**
Micro F1-scores of all models for each class, for both the complete test dataset (blue bars) and the test dataset without content that is potentially ambiguous with regard to its classification (pink bars). TF-IDF: term frequency–inverse document frequency.
