Annotating risk factors for heart disease in clinical narratives for diabetic patients

Amber Stubbs¹, Özlem Uzuner²

Affiliations

¹ School of Library and Information Science, Simmons College, Boston, MA, USA. Electronic address: stubbs@simmons.edu.
² Department of Information Studies, State University of New York at Albany, Albany, NY, USA.

PMID: 26004790
PMCID: PMC4978180
DOI: 10.1016/j.jbi.2015.05.009

J Biomed Inform. 2015 Dec;58 Suppl(Suppl):S78-S91. Epub 2015 May 21.

Abstract

The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on identifying risk factors for heart disease (specifically, coronary artery disease) in clinical narratives. For this track, we used a "light" annotation paradigm to annotate a set of 1304 longitudinal medical records describing 296 patients for risk factors and the times at which they were present. We designed the annotation task to balance annotation load and time against quality, so as to generate a gold standard corpus that can benefit a clinically relevant task. We applied light annotation procedures and determined the gold standard using majority voting. On average, the agreement of annotators with the gold standard was above 0.95, indicating high reliability. The resulting document-level annotations, generated for each record in each longitudinal EMR in this corpus, provide information that can support studies of the progression of heart disease risk factors in the included patients over time. These annotations were used in the Risk Factor track of the 2014 i2b2/UTHealth shared task. Participating systems achieved a mean micro-averaged F1 measure of 0.815 and a maximum F1 measure of 0.928 for identifying these risk factors in patient records.
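As an illustration of two techniques named in the abstract, the following is a minimal Python sketch of majority voting over document-level labels and of a micro-averaged F1 measure. It is not the authors' implementation. The abstract does not state how many annotators labeled each record or how labels were encoded, so the flat label strings (for example `"hypertension"`), the function names `majority_vote` and `micro_f1`, and the record identifiers are all hypothetical, and the time attributes mentioned in the abstract are omitted for brevity.

```python
from collections import Counter


def majority_vote(annotations):
    """Return the labels chosen by more than half of the annotators.

    `annotations` is a list of label sets, one set per annotator,
    all describing the same record (hypothetical encoding).
    """
    counts = Counter(label for labels in annotations for label in set(labels))
    threshold = len(annotations) / 2
    return {label for label, n in counts.items() if n > threshold}


def micro_f1(gold_by_record, predicted_by_record):
    """Micro-averaged F1: pool TP/FP/FN over all records, then compute F1."""
    tp = fp = fn = 0
    # Visit every record that appears in either the gold or predicted data.
    for record_id in set(gold_by_record) | set(predicted_by_record):
        gold = gold_by_record.get(record_id, set())
        pred = predicted_by_record.get(record_id, set())
        tp += len(gold & pred)   # labels present in both
        fp += len(pred - gold)   # predicted but not in gold
        fn += len(gold - pred)   # in gold but not predicted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three annotators label one record.
gold_r1 = majority_vote([
    {"hypertension", "diabetes"},
    {"hypertension"},
    {"hypertension", "smoker"},
])

# Hypothetical system output scored against a two-record gold standard.
score = micro_f1(
    {"r1": {"a", "b"}, "r2": {"c"}},
    {"r1": {"a"}, "r2": {"c", "d"}},
)
```

Here only `"hypertension"` reaches a strict majority, so it is the sole gold label for the example record. Micro-averaging pools counts across records before computing precision and recall, so records with many labels weigh more than records with few.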

Keywords: Annotation; Medical records; Natural language processing.

Figures

**Figure 1**
Risk factor annotation in MAE

**Figure 2**
Example of Trial 1: very light annotation

**Figure 3**
Example of Trial 2: moderately light annotation

**Figure 4**
Example of Trial 3: exhaustive annotation
