Syst Rev. 2019 Jan 15;8(1):23. doi: 10.1186/s13643-019-0942-7.

Machine learning algorithms for systematic review: reducing workload in a preclinical review of animal studies and reducing human screening error

Alexandra Bannach-Brown et al.

Abstract

Background: Here, we outline a method of applying existing machine learning (ML) approaches to aid citation screening in an ongoing broad and shallow systematic review of preclinical animal studies. The aim is to achieve a high-performing algorithm, comparable to human screening, that can reduce the human resources required for this step of a systematic review.

Methods: We applied ML approaches to a broad systematic review of animal models of depression at the citation screening stage. We tested two independently developed ML approaches which used different classification models and feature sets. We recorded the performance of the ML approaches on an unseen validation set of papers using sensitivity, specificity and accuracy. We aimed to achieve 95% sensitivity and to maximise specificity. The classification model providing the most accurate predictions was applied to the remaining unseen records in the dataset and will be used in the next stage of the preclinical biomedical sciences systematic review. We used a cross-validation technique to assign ML inclusion likelihood scores to the human-screened records, to identify potential errors made during the human screening process (error analysis).
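
As an illustration of this screening setup, the sketch below trains a classifier on labelled records and selects the score cut-off that keeps sensitivity at or above the 95% target on a held-out validation set while maximising specificity. The TF-IDF features, logistic regression model and function name here are assumptions for illustration only; the two approaches tested in the review used their own classifiers and feature sets, described in the full text.

```python
# Minimal sketch of citation-screening cut-off selection.
# TF-IDF + logistic regression are illustrative stand-ins, not the
# review's actual pipelines.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def fit_and_pick_cutoff(texts, labels, target_sensitivity=0.95):
    y = np.asarray(labels)
    X_train, X_val, y_train, y_val = train_test_split(
        texts, y, test_size=0.2, stratify=y, random_state=0)

    # Turn titles/abstracts into features and fit the classifier.
    vec = TfidfVectorizer(stop_words="english")
    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    clf.fit(vec.fit_transform(X_train), y_train)

    # Predicted inclusion likelihood for each validation record.
    scores = clf.predict_proba(vec.transform(X_val))[:, 1]

    # Specificity rises with the cut-off, so the highest cut-off that
    # still meets the sensitivity target also maximises specificity.
    cutoff = 0.0
    for t in np.unique(scores):
        pred = scores >= t
        sensitivity = (pred & (y_val == 1)).sum() / (y_val == 1).sum()
        if sensitivity >= target_sensitivity:
            cutoff = t
    return vec, clf, cutoff
```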

Results: ML approaches reached 98.7% sensitivity based on learning from a training set of 5749 records, with an inclusion prevalence of 13.2%. The highest level of specificity reached was 86%. Performance was assessed on an independent validation dataset. Human errors in the training and validation sets were successfully identified using the assigned inclusion likelihood from the ML model to highlight discrepancies. Training the ML algorithm on the corrected dataset improved the specificity of the algorithm without compromising sensitivity. Error analysis correction led to a 3% improvement in sensitivity and specificity, increasing the precision and accuracy of the ML algorithm.
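
The error analysis can be sketched as follows: cross-validation assigns each human-screened record an inclusion likelihood from a fold model that never saw that record during training, and the records where the score most strongly disagrees with the human decision are flagged for re-checking. The pipeline and function name below are again hypothetical stand-ins, not the review's actual models.

```python
# Sketch of the error analysis: out-of-fold inclusion likelihoods from
# cross-validation are compared with the human screening decisions, and
# the strongest disagreements are flagged for human re-checking.
# (Hypothetical pipeline; the review's own models and features differ.)
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline

def flag_possible_screening_errors(texts, human_labels, n_flags=50):
    y = np.asarray(human_labels)
    pipe = make_pipeline(
        TfidfVectorizer(stop_words="english"),
        LogisticRegression(max_iter=1000, class_weight="balanced"))

    # Each record is scored by a fold model that never saw it in training.
    scores = cross_val_predict(
        pipe, texts, y, cv=5, method="predict_proba")[:, 1]

    # |score - label| is large when the model confidently disagrees with
    # the human decision; those records are the candidates for re-review.
    disagreement = np.abs(scores - y)
    return np.argsort(disagreement)[::-1][:n_flags], scores
```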

Conclusions: This work has confirmed the performance and application of ML algorithms for screening in systematic reviews of preclinical animal studies. It has highlighted the novel use of ML algorithms to identify human error. This needs to be confirmed in other reviews with different inclusion prevalence levels, but represents a promising approach to integrating human decisions and automation in systematic review methodology.

Keywords: Analysis of human error; Automation tools; Citation screening; Machine learning; Systematic review.

Declarations

Ethics approval

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1 Diagram of the layout of the study

Fig. 2 Error analysis. The methodology for using cross-validation to assign ML-predicted probability scores; the scores for the records were checked against the original human inclusion decision

Fig. 3 Performance of machine learning approaches. For the interactive version of this plot with cut-off values, see code and data at https://github.com/abannachbrown/The-use-of-text-mining-and-machine-learning-algorithms-in-systematic-reviews/blob/master/ML-fig3.html

Fig. 4 Performance of approach 1 after error analysis. The updated approach is retrained on the corrected training set after error analysis correction. Performance of both the original and the updated approach is measured on the corrected validation set (with error analysis correction). For the interactive version of this plot with the ability to read off performance at all cut-off values, see code and data at https://github.com/abannachbrown/The-use-of-text-mining-and-machine-learning-algorithms-in-systematic-reviews/blob/master/error-analysis-plot.html

