Exploiting the systematic review protocol for classification of medical abstracts

Oana Frunza¹, Diana Inkpen, Stan Matwin, William Klement, Peter O'Blenis

PMID: 21084178
DOI: 10.1016/j.artmed.2010.10.005

Artif Intell Med. 2011 Jan;51(1):17-25. doi: 10.1016/j.artmed.2010.10.005. Epub 2010 Nov 16.

Affiliation

¹ School of Information Technology and Engineering, University of Ottawa, 800 King Edward, Ottawa, Ontario, Canada K1N 6N5. ofrunza@site.uottawa.ca

Abstract

Objective: To determine whether the automatic classification of documents can be useful in systematic reviews on medical topics, and specifically if the performance of the automatic classification can be enhanced by using the particular protocol of questions employed by the human reviewers to create multiple classifiers.

Methods and materials: The test collection is the data used in a large-scale systematic review on the topic of the dissemination strategy of health care services for elderly people. From a group of 47,274 abstracts marked by human reviewers to be included in or excluded from further screening, we randomly selected 20,000 as a training set, with the remaining 27,274 becoming a separate test set. As a machine learning algorithm we used complement naïve Bayes. We tested both a global classification method, where a single classifier is trained on instances of abstracts and their classification (i.e., included or excluded), and a novel per-question classification method that trains multiple classifiers for each abstract, exploiting the specific protocol (questions) of the systematic review. For the per-question method we tested four ways of combining the results of the classifiers trained for the individual questions. As evaluation measures, we calculated precision and recall for several settings of the two methods. It is most important not to exclude any relevant documents (i.e., to attain high recall for the class of interest) but also desirable to exclude most of the non-relevant documents (i.e., to attain high precision on the class of interest) in order to reduce human workload.
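As an illustration of the evaluation described above, the following minimal sketch combines per-question votes into one include/exclude decision and computes precision and recall on the included class. The abstract does not specify the four combination schemes, so the vote-threshold rule (`combine_votes`, `min_votes`) is a hypothetical stand-in, and the toy votes and labels are invented for the example.

```python
from typing import Sequence, Tuple


def combine_votes(question_votes: Sequence[bool], min_votes: int = 1) -> bool:
    """Hypothetical rule: include when at least `min_votes` per-question classifiers vote to include."""
    return sum(question_votes) >= min_votes


def precision_recall(predicted: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """Precision and recall for the class of interest (included abstracts)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (invented): one row per abstract, one vote per protocol question.
votes_per_abstract = [
    [True, False, False],
    [False, False, False],
    [True, True, False],
    [False, True, False],
    [False, False, False],
]
human_labels = [True, False, True, False, True]

# Lenient threshold: a single positive vote is enough to include.
lenient = [combine_votes(v, min_votes=1) for v in votes_per_abstract]
lenient_precision, lenient_recall = precision_recall(lenient, human_labels)

# Stricter threshold: two positive votes are required.
strict = [combine_votes(v, min_votes=2) for v in votes_per_abstract]
strict_precision, strict_recall = precision_recall(strict, human_labels)
```

On this toy data the lenient threshold recovers more of the included abstracts, while the stricter one raises precision at the cost of recall, which is the trade-off the paragraph describes.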

Results: For the global method, the highest recall was 67.8% and the highest precision was 37.9%. For the per-question method, the highest recall was 99.2%, and the highest precision was 63%. The human-machine workflow proposed in this paper achieved a recall value of 99.6%, and a precision value of 17.8%.

Conclusion: The per-question method that combines classifiers following the specific protocol of the review leads to better results than the global method in terms of recall. Because neither method is efficient enough to classify abstracts reliably by itself, the technology should be applied in a semi-automatic way, with a human expert still involved. When the workflow includes one human expert and the trained automatic classifier, recall improves to an acceptable level, showing that automatic classification techniques can reduce the human workload in the process of building a systematic review.
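One way to picture the semi-automatic workflow is sketched below. It assumes, purely for illustration, that an abstract moves forward when either the human expert or the classifier marks it for inclusion; the abstract does not state the exact rule, and the function name and toy labels are hypothetical.

```python
def workflow_decision(human_includes: bool, classifier_includes: bool) -> bool:
    """Assumed rule: keep an abstract if either the human or the classifier includes it."""
    return human_includes or classifier_includes


# Toy decisions (invented) for five abstracts.
human = [True, False, False, True, False]
classifier = [True, True, False, False, False]
gold = [True, True, False, True, False]  # reference inclusion labels

final = [workflow_decision(h, c) for h, c in zip(human, classifier)]

# Relevant abstracts missed by the human alone versus by the combined workflow.
missed_by_human = sum(g and not h for g, h in zip(gold, human))
missed_by_workflow = sum(g and not f for g, f in zip(gold, final))
```

Under this assumed OR rule, the combined decision can only add inclusions, so recall cannot fall below that of either reviewer alone, while precision may drop.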

Cited by

Automated confidence ranked classification of randomized controlled trial articles: an aid to evidence-based medicine.
Cohen AM, Smalheiser NR, McDonagh MS, Yu C, Adams CE, Davis JM, Yu PS. J Am Med Inform Assoc. 2015 May;22(3):707-17. doi: 10.1093/jamia/ocu025. Epub 2015 Feb 5. PMID: 25656516 Free PMC article.
Validation of a Semiautomated Natural Language Processing-Based Procedure for Meta-Analysis of Cancer Susceptibility Gene Penetrance.
Deng Z, Yin K, Bao Y, Armengol VD, Wang C, Tiwari A, Barzilay R, Parmigiani G, Braun D, Hughes KS. JCO Clin Cancer Inform. 2019 Aug;3:1-9. doi: 10.1200/CCI.19.00043. PMID: 31419182 Free PMC article.
A new iterative method to reduce workload in systematic review process.
Jonnalagadda S, Petitti D. Int J Comput Biol Drug Des. 2013;6(1-2):5-17. doi: 10.1504/IJCBDD.2013.052198. Epub 2013 Feb 21. PMID: 23428470 Free PMC article.
Machine Learning Methods for Systematic Reviews: A Rapid Scoping Review.
Roth S, Wermer-Colan A. Dela J Public Health. 2023 Nov 30;9(4):40-47. doi: 10.32481/djph.2023.11.008. eCollection 2023 Nov. PMID: 38173960 Free PMC article.
Novel text analytics approach to identify relevant literature for human health risk assessments: A pilot study with health effects of in utero exposures.
Cawley M, Beardslee R, Beverly B, Hotchkiss A, Kirrane E, Sams R 2nd, Varghese A, Wignall J, Cowden J. Environ Int. 2020 Jan;134:105228. doi: 10.1016/j.envint.2019.105228. Epub 2019 Nov 8. PMID: 31711016 Free PMC article. Review.

Grants and funding

KT62363/Canadian Institutes of Health Research/Canada
