. 2022 Oct 24:7:1001266.
doi: 10.3389/frma.2022.1001266. eCollection 2022.

Temporal disambiguation of relative temporal expressions in clinical texts



Amy L Olex et al. Front Res Metr Anal.

Abstract

Temporal expression recognition and normalization (TERN) is the foundation for all higher-level temporal reasoning tasks in natural language processing, such as timeline extraction, so it must be performed well to limit error propagation. Advancing the state of the art for TERN in clinical texts requires knowing where current systems struggle. In this work, we summarize the results of a detailed error analysis for three top-performing state-of-the-art TERN systems that participated in the 2012 i2b2 Clinical Temporal Relation Challenge, and compare them with our own home-grown system, Chrono, to identify specific areas in need of improvement. Performance metrics and an error analysis reveal that all systems have reduced performance in the normalization of relative temporal expressions, specifically in disambiguating temporal types and in identifying the correct anchor time. To address temporal disambiguation, we developed and integrated a module into Chrono that uses temporally fine-tuned contextual word embeddings to disambiguate relative temporal expressions. Chrono now achieves state-of-the-art performance for temporal disambiguation of relative temporal expressions in clinical text, and is the only TERN system to output dual annotations into both the TimeML and SCATE schemes.
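The disambiguation step the abstract describes maps a contextual embedding of a relative temporal expression (e.g., "three days" as a DURATION vs. part of a DATE like "three days ago") to a temporal type. The sketch below is an illustrative toy, not Chrono's implementation: the paper uses BERT-derived embeddings and an SVM classifier, whereas here invented 3-dimensional vectors and a nearest-centroid rule stand in for both.

```python
# Toy sketch of temporal-type disambiguation from contextual embeddings.
# The vectors and the nearest-centroid rule are stand-ins for the paper's
# BERT embeddings and SVM classifier.

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Invented "embeddings" of training phrases, labeled with temporal type.
train = {
    "DATE":     [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],   # e.g. "two weeks ago"
    "DURATION": [[0.1, 0.9, 0.8], [0.2, 0.8, 0.9]],   # e.g. "for two weeks"
}
centroids = {label: centroid(vs) for label, vs in train.items()}

def classify(embedding):
    """Assign the temporal type whose class centroid is nearest."""
    return min(centroids, key=lambda label: sq_dist(embedding, centroids[label]))
```

In the real system, the embedding for each expression would come from a fine-tuned BERT model (the paper compares several fine-tuning strategies), and a trained SVM would replace the centroid rule.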

Keywords: BERT; clinical text; contextual word embeddings; error analysis; natural language processing; relative temporal expression; temporal expression recognition and normalization; temporal reasoning.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Overview of the fine-tuning, embedding extraction, and classification strategies examined in this work. Baseline BERT models are either the BertBase or ClinBioBert models referenced in the text. (A) No fine-tuning; (B) binary fine-tuning; (C) sequential Binary-Seq2Seq fine-tuning; (D) Seq2Seq fine-tuning.
Figure 2
Chrono's performance on the i2b2 training and evaluation data sets after conversion changes and algorithm improvements using span-based P, R, and F1 metrics.
Figure 3
Performance of top systems from the 2012 i2b2 Temporal Challenge and Chrono on (A) the full evaluation data set, and (B) the subset of poor performing files using span-based P, R, and F1 metrics. RB, Rule-Based; H, Hybrid.
Figure 4
Temporal phrases that were hard to correctly classify as a DURATION or DATE temporal type. Red text indicates an incorrect classification.
Figure 5
Temporal phrases for which it was hard to correctly identify the Anchor Time and/or Delta Value. Red text indicates an incorrect date.
Figure 6
ClinBioBert SVM performance using the Gold Standard RelIV-TIMEX Evaluation data set using class-based P, R, and F1 metrics. Scores are weighted averages across DATE and DURATION. Bold, best performance across all SVM models; orange, high; white, median; blue, low scores relative to all scores in the table.
Figure 7
BertBase SVM performance using the Gold Standard RelIV-TIMEX Evaluation data set using class-based P, R, and F1 metrics. Scores are weighted averages across DATE and DURATION. Bold, best performance across all SVM models; orange, high; white, median; blue, low scores relative to all scores in the table.
Figure 8
System performance on the RelIV-TIMEX Evaluation data set of Chrono before and after the TTD model integration, and of the three i2b2 state-of-the-art systems, using class-based P, R, and F1 metrics. Values are the weighted average bootstrap estimates across individual DATE and DURATION performance (Supplementary Table 5 contains the 95% confidence intervals). Bold, best performance; orange, high; white, median; blue, low scores, with the maximum and minimum relative to each column instead of the entire table.
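The "weighted averages across DATE and DURATION" reported in Figures 6-8 combine per-class precision, recall, and F1 weighted by class support. A minimal sketch of that computation (the per-class supports and scores below are invented for illustration):

```python
# Support-weighted averaging of per-class P, R, and F1, as used when a
# single score is reported across the DATE and DURATION classes.

def weighted_average(per_class):
    """per_class: {label: (support, precision, recall, f1)}.
    Returns (precision, recall, f1) weighted by class support."""
    total = sum(s for s, _, _, _ in per_class.values())
    p = sum(s * prec for s, prec, _, _ in per_class.values()) / total
    r = sum(s * rec for s, _, rec, _ in per_class.values()) / total
    f = sum(s * f1 for s, _, _, f1 in per_class.values()) / total
    return p, r, f

# Invented example: DATE is three times as frequent as DURATION,
# so it dominates the weighted scores.
scores = {
    "DATE":     (30, 0.90, 0.80, 0.85),
    "DURATION": (10, 0.70, 0.60, 0.65),
}
```

This is the same convention as scikit-learn's `average='weighted'` option; the paper's actual supports are in its supplementary tables.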


