In-depth evaluation of machine learning methods for semi-automating article screening in a systematic review of mechanistic literature

Mihiretu M Kebede¹, Charlotte Le Cornet¹, Renée Turzanski Fortner¹

PMID: 35798691
DOI: 10.1002/jrsm.1589

Mihiretu M Kebede et al. Res Synth Methods. 2023 Mar;14(2):156-172. doi: 10.1002/jrsm.1589. Epub 2022 Jul 23.

Affiliation

¹ German Cancer Research Center (DKFZ), Heidelberg, Germany.

Abstract

We aimed to evaluate the performance of supervised machine learning algorithms in predicting articles relevant for full-text review in a systematic review. Overall, 16,430 manually screened titles/abstracts, including 861 references identified as relevant for full-text review, were used for the analysis. Of these, 40% (n = 6573) were subdivided for training (70%) and testing (30%) the algorithms. The remaining 60% (n = 9857) were used as a validation set. We evaluated down- and up-sampling methods and compared unigram, bigram, and singular value decomposition (SVD) approaches. For each approach, Naïve Bayes, support vector machines (SVM), regularized logistic regression, neural networks, random forest, LogitBoost, and XGBoost were implemented using simple term frequency or Tf-Idf feature representations. Performance was evaluated using sensitivity, specificity, precision, and area under the curve. We combined the predictions of the best-performing algorithms (Youden Index ≥0.3 with sensitivity ≥70% and specificity ≥60%). In the down-sampling unigram approach, Naïve Bayes, SVM/quanteda text models with Tf-Idf, and the linear SVM from the e1071 package with Tf-Idf achieved >90% sensitivity at specificity >65%. Combining the predictions of the 10 best-performing algorithms improved performance to 95% sensitivity and 64% specificity in the validation set. The crude screening burden was reduced by 61% (n = 5979; adjusted: 80.3%), with a false-negative rate of 5% (n = 27). All other approaches performed relatively worse. The down-sampling unigram approach achieved good performance in our data. Combining the predictions of algorithms improved sensitivity while reducing the screening burden by almost two-thirds. Implementing machine learning approaches in title/abstract screening should be investigated further, with the aim of refining these tools and automating their implementation.
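The selection rule quoted in the abstract (Youden Index ≥0.3 with sensitivity ≥70% and specificity ≥60%) can be illustrated with a minimal Python sketch. This is not the authors' code: the names `Metrics`, `confusion_metrics`, and `passes_selection` are hypothetical, and the sketch assumes the standard definition of the Youden Index, J = sensitivity + specificity - 1. The final lines check the reported crude screening-burden reduction under the further assumption that its denominator is the validation set (n = 9857).

```python
from dataclasses import dataclass

EPS = 1e-9  # tolerance for floating-point comparisons at the thresholds


@dataclass
class Metrics:
    sensitivity: float
    specificity: float

    @property
    def youden(self) -> float:
        # Standard Youden Index: J = sensitivity + specificity - 1
        return self.sensitivity + self.specificity - 1.0


def confusion_metrics(y_true, y_pred) -> Metrics:
    """Sensitivity and specificity from binary labels (1 = relevant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return Metrics(sensitivity=tp / (tp + fn), specificity=tn / (tn + fp))


def passes_selection(m: Metrics, min_j=0.3, min_sens=0.70, min_spec=0.60) -> bool:
    """Thresholds taken from the abstract; the function itself is illustrative."""
    return (
        m.youden >= min_j - EPS
        and m.sensitivity >= min_sens - EPS
        and m.specificity >= min_spec - EPS
    )


# Combined-ensemble figures reported for the validation set.
combined = Metrics(sensitivity=0.95, specificity=0.64)
print(f"Youden Index: {combined.youden:.2f}, selected: {passes_selection(combined)}")

# Crude burden reduction, ASSUMING the denominator is the validation set.
excluded, validation_n = 5979, 9857
crude_reduction_pct = 100 * excluded / validation_n
print(f"Crude screening burden reduction: {crude_reduction_pct:.1f}%")
```

Under these assumptions, the combined result reported in the abstract clears all three thresholds, and 5979 of 9857 records rounds to the stated 61%.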

Keywords: NLP; automated screening; citation screening; machine learning; natural language processing; systematic review; text mining.
