J Am Med Inform Assoc. 2015 Apr;22(e1):e81-92.
doi: 10.1136/amiajnl-2014-003009. Epub 2014 Oct 28.

Automatic abstraction of imaging observations with their characteristics from mammography reports

Selen Bozkurt et al. J Am Med Inform Assoc. 2015 Apr.

Erratum in

  • Erratum. [No authors listed]. J Am Med Inform Assoc. 2015 Sep;22(5):1112. doi: 10.1093/jamia/ocv089. PMID: 26330470. Free PMC article. No abstract available.

Abstract

Background: Radiology reports are usually narrative, unstructured text, a format which hinders the ability to input report contents into decision support systems. In addition, reports often describe multiple lesions, and it is challenging to automatically extract information on each lesion and its relationships to characteristics, anatomic locations, and other information that describes it. The goal of our work is to develop natural language processing (NLP) methods to recognize each lesion in free-text mammography reports and to extract its corresponding relationships, producing a complete information frame for each lesion.

Materials and methods: We built an NLP information extraction pipeline in the General Architecture for Text Engineering (GATE) NLP toolkit. The pipeline runs its processing modules in sequence and produces the output information frame required by a mammography decision support system. Each lesion described in the report is identified by linking it with its anatomic location in the breast. To evaluate our system, we selected 300 mammography reports from a hospital report database.
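The idea of sequential modules that end in one information frame per lesion can be illustrated with a minimal Python sketch. This is not the authors' GATE pipeline: the tiny lexicons, the `LesionFrame` class, and the one-lesion-per-sentence rule are all assumptions made for illustration, whereas the paper relies on a BI-RADS ontology and GATE processing modules.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical mini-lexicons (assumption); the paper uses a BI-RADS ontology.
OBSERVATIONS = {"mass", "calcifications"}
MODIFIERS = {"oval", "round", "irregular", "spiculated", "circumscribed"}
LOCATION = re.compile(r"\b(left|right) breast\b")


@dataclass
class LesionFrame:
    """One information frame per lesion (illustrative structure only)."""
    observation: str
    location: Optional[str] = None
    modifiers: List[str] = field(default_factory=list)


def split_sentences(text: str) -> List[str]:
    # Module 1: naive sentence splitting on terminal punctuation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_frame(sentence: str) -> Optional[LesionFrame]:
    # Module 2: lexicon lookup for the observation and its modifiers.
    lowered = sentence.lower()
    tokens = re.findall(r"[a-z]+", lowered)
    observation = next((t for t in tokens if t in OBSERVATIONS), None)
    if observation is None:
        return None
    # Module 3: link the observation to an anatomic location in the sentence.
    match = LOCATION.search(lowered)
    location = match.group(0) if match else None
    modifiers = [t for t in tokens if t in MODIFIERS]
    return LesionFrame(observation, location, modifiers)


def extract_frames(report: str) -> List[LesionFrame]:
    # Run the modules in sequence; simplifying assumption: one lesion per sentence.
    frames = []
    for sentence in split_sentences(report):
        frame = build_frame(sentence)
        if frame is not None:
            frames.append(frame)
    return frames


report = (
    "There is an oval circumscribed mass in the left breast. "
    "An irregular spiculated mass is seen in the right breast."
)
frames = extract_frames(report)
for frame in frames:
    print(frame)
```

On the two-sentence example report, the sketch yields two frames, each tying a mass to its own breast side and descriptors. Real reports need much more than this, for example lesions described across several sentences, negation, and synonym handling.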

Results: The gold standard contained 797 lesions, and our system detected 815 lesions (780 true positives, 35 false positives, and 17 false negatives). The precision of detecting all the imaging observations with their modifiers was 94.9%, recall was 90.9%, and the F measure was 92.8%.
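As a worked check on the lesion counts, the standard definitions of precision, recall, and F measure can be applied to the true positive, false positive, and false negative numbers above. The sketch below is plain arithmetic, with a hypothetical helper name `detection_metrics`. Note that it gives lesion-level detection metrics, which are a different quantity from the reported figures for imaging observations with their modifiers.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Standard precision, recall, and balanced F measure from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Lesion-level counts from the abstract.
tp, fp, fn = 780, 35, 17

# Consistency checks: detected = TP + FP, gold standard = TP + FN.
detected = tp + fp
gold = tp + fn

metrics = detection_metrics(tp, fp, fn)
print(detected, gold)
print({name: round(value, 3) for name, value in metrics.items()})
```

The counts are internally consistent (780 + 35 = 815 detected, 780 + 17 = 797 in the gold standard) and correspond to a lesion-level precision of about 95.7%, recall of about 97.9%, and F measure of about 96.8%.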

Conclusions: Our NLP system extracts each imaging observation and its characteristics from mammography reports. Although our application focuses on the domain of mammography, we believe our approach can generalize to other domains and may narrow the gap between unstructured clinical report text and structured information extraction needed for data mining and decision support.

Keywords: Breast Imaging Reporting and Data System (BI-RADS); breast; imaging informatics; information extraction; natural language processing.

Conflict of interest statement

None.

Figures

Figure 1: Example mammography report describing two different masses and the ideal output from a natural language processing system to extract the information suitable for input to a decision support system.
Figure 2: Natural language processing information extraction processing pipeline.
Figure 3: Breast Imaging-Reporting and Data System (BI-RADS) ontology. The is-a hierarchy is shown and the entity names are the term preferred names (synonyms are not shown).
