Classification of cancer stage from free-text histology reports
- PMID: 17945879
- DOI: 10.1109/IEMBS.2006.259563
Abstract
This article investigates the classification of a patient's lung cancer stage based on analysis of their free-text medical reports. The system uses natural language processing to transform the report text, including identification of UMLS terms and detection of negated findings. The transformed report is then classified using statistical machine learning techniques. A support vector machine is trained for each stage category based on word occurrences in a corpus of histology reports for pathologically staged patients. New reports can be classified according to the most likely stage, allowing the collection of population stage data for analysis of outcomes. While the system could in principle be applied to stage different cancer types, the current work focuses on lung cancer due to data availability. The article presents initial experiments quantifying system performance for T and N staging on a corpus of histology reports from more than 700 lung cancer patients.
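The pipeline described in the abstract (negation-aware text transformation, then one classifier per stage category trained on word occurrences) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the negation cue list, the `NOT_` prefix convention, the four toy reports and their labels, and the function names. UMLS term identification is omitted, and a small linear SVM trained by subgradient descent on the hinge loss (without a bias term) stands in for whatever SVM implementation the authors used.

```python
import re

# Hypothetical negation cues; the abstract does not list the triggers actually used.
NEGATION_CUES = {"no", "not", "without"}

def tokenize(report):
    """Lowercase word tokens; words after a cue get a NOT_ prefix until the sentence ends."""
    tokens = []
    for sentence in re.split(r"[.;]", report.lower()):
        negated = False
        for word in re.findall(r"[a-z0-9]+", sentence):
            if word in NEGATION_CUES:
                negated = True
                continue
            tokens.append("NOT_" + word if negated else word)
    return tokens

def score(weights, features):
    """Linear decision value over binary word-occurrence features."""
    return sum(weights.get(f, 0.0) for f in features)

def train_linear_svm(examples, labels, epochs=50, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss (no bias term, for brevity)."""
    weights = {}
    for _ in range(epochs):
        for features, y in zip(examples, labels):
            margin = y * score(weights, features)
            for f in weights:                 # regularisation shrinks every weight
                weights[f] *= 1.0 - lr * lam
            if margin < 1.0:                  # hinge loss is active for this example
                for f in features:
                    weights[f] = weights.get(f, 0.0) + lr * y
    return weights

def train_one_vs_rest(reports, stages):
    """One binary SVM per stage category: that stage (+1) against all others (-1)."""
    examples = [set(tokenize(r)) for r in reports]
    return {
        stage: train_linear_svm(examples, [1 if s == stage else -1 for s in stages])
        for stage in sorted(set(stages))
    }

def classify(models, report):
    """Assign the stage whose classifier gives the highest decision value."""
    features = set(tokenize(report))
    return max(models, key=lambda stage: score(models[stage], features))

# Toy, invented reports and labels for illustration only (not data from the study).
REPORTS = [
    "Tumour 2 cm. No pleural invasion.",
    "Small peripheral tumour. Pleura not involved.",
    "Tumour 4 cm with visceral pleural invasion.",
    "Large tumour. Pleural invasion present.",
]
STAGES = ["T1", "T1", "T2", "T2"]

models = train_one_vs_rest(REPORTS, STAGES)
print(classify(models, "No pleural invasion seen."))
print(classify(models, "Pleural invasion identified."))
```

On this toy corpus the negated report is assigned T1 and the affirmed one T2, because the prefixed tokens (`NOT_pleural`, `NOT_invasion`) are treated as features distinct from their affirmed forms. A real system would need a far larger corpus, a proper negation detector, and held-out evaluation.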