Front Artif Intell. 2021 Aug 5;4:685298.
doi: 10.3389/frai.2021.685298. eCollection 2021.

Sysrev: A FAIR Platform for Data Curation and Systematic Evidence Review


Thomas Bozada Jr et al. Front Artif Intell. 2021.

Abstract

Well-curated datasets are essential to evidence-based decision making and to the integration of artificial intelligence with human reasoning across disciplines. However, many sources of data remain siloed, unstructured, and/or unavailable for complementary and secondary research. Sysrev was developed to address these issues. First, Sysrev was built to aid in systematic evidence reviews (SER), where digital documents are evaluated according to a well-defined process; here Sysrev provides an easy-to-access, publicly available, and free platform for collaborating on SER projects. Second, Sysrev addresses the issue of unstructured, siloed, and inaccessible data in the context of generalized data extraction, where human reviewers and machine learning algorithms are combined to extract insights and evidence for better decision making across disciplines. Sysrev uses the FAIR principles (Findability, Accessibility, Interoperability, and Reuse of digital assets) as primary design principles. Sysrev was developed primarily because of an observed need to reduce redundancy, reduce inefficient use of human time, and increase the impact of evidence-based decision making. This publication is an introduction to Sysrev as a novel technology, with an overview of the features, motivations, and use cases of the tool.

Methods: Sysrev.com is a FAIR-motivated web platform for data curation and SER. Sysrev allows users to create data curation projects called "sysrevs" wherein users upload documents, define review tasks, recruit reviewers, perform review tasks, and automate review tasks.

Conclusion: Sysrev is a web application designed to facilitate data curation and SERs. Thousands of publicly accessible Sysrev projects have been created, accommodating research in a wide variety of disciplines. Described use cases include data curation, managed reviews, and SERs.

Keywords: data extraction; data management; evidence review; machine learning; meta analysis; software; systematic review.

Conflict of interest statement

TB, JB, JW, MD, and TL were employed by Insilica LLC. TL was employed by Toxtrack LLC. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Sysrev projects can be broken into 4 stages from left to right. (A) Articles are collected to form a data source. (B) Review tasks or “labels” are defined. (C) Reviewers are recruited and asked to complete review tasks. Active learning is involved at this stage to prioritize and replicate labelling tasks. (D) Data is exported and analyzed.
FIGURE 2
Users extract testing information from a safety data sheet using a group label. The results of this safety data sheet review are available (with pdf hidden) at sysrev.com/o/2/p/31871/article/8296525.
FIGURE 3
A screenshot of the Sysrev label counting tool. A demonstration of this tool is publicly available at https://sysrev.com/o/2/p/21696/analytics/labels. Label counts can be filtered, visualized, and updated in real time as a sysrev progresses.
FIGURE 4
A screenshot of the Sysrev concordance tool in action. A demonstration of this tool is available at sysrev.com/p/21696/analytics/concordance.
FIGURE 5
(A) Machine learning predictions of article inclusion compared with actual reviewer inclusion decisions from 3 public Sysrev projects on spinal surgery (sysrev.com/p/14872, sysrev.com/p/14873, sysrev.com/p/14874). (B) Sysrev categorical and Boolean predictions. The predictions correctly identify that this article references a “positive effect” on “heart/cardiovascular.” They also give a strong prediction of a “purified” mangiferin substance, even though the article does not explicitly specify this. Model predictions can be seen on every article in a project. This screenshot comes from an Insilica.co project on mangiferin (sysrev.com/p/24557/article/7225450).
FIGURE 6
(A) Three of the larger communities on sysrev.com. Pink nodes represent users; green nodes represent projects. Edges link projects with their users. Several notable projects and users have been enlarged for discussion. (B) Bar chart of completed document reviews for each student in sysrev.com/p/3509, an educational review focused on evidence-based toxicology by Dr. Lena Smirnova. Each bar represents a student. Students were asked to review 20 documents in an article screening exercise; blue bars indicate excluded articles and red bars indicate included articles.
FIGURE 7
(A) Box plots indicate distribution of model performance relative to the worst and best model in a given model’s project. Models are bucketed according to the number of articles labeled before model training. Models improve rapidly until 300 articles have been reviewed. (B) Accuracy metrics for a large sysrev reviewing insect population changes. Model performance is charted as a function of number of articles used in training, across 3 performance metrics, and evaluated on a consistent holdout set. (C) Best model accuracy and balanced accuracy evaluated in 64 sysrevs.
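
Figure 7 reports both accuracy and balanced accuracy. The two can diverge sharply in article screening, where excluded articles usually outnumber included ones. The following is a minimal sketch of the standard definitions for a binary include/exclude label. It is not taken from the Sysrev code base; the function names and the toy holdout set are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def accuracy(actual: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Fraction of predictions that match the reviewer decision."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def balanced_accuracy(actual: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Mean of the recall on included and on excluded articles."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    included = sum(1 for a in actual if a)
    excluded = len(actual) - included
    if included == 0 or excluded == 0:
        raise ValueError("both classes must be present in the holdout set")
    true_pos = sum(1 for a, p in zip(actual, predicted) if a and p)
    true_neg = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    return 0.5 * (true_pos / included + true_neg / excluded)


# Hypothetical holdout set: 2 included articles, 8 excluded articles.
actual = [True, True] + [False] * 8
# A trivial model that excludes every article.
predicted = [False] * 10

print(accuracy(actual, predicted))           # 0.8
print(balanced_accuracy(actual, predicted))  # 0.5
```

In this toy case the model never finds an included article, yet plain accuracy is 0.8. Balanced accuracy drops to 0.5, the value expected from a model with no discriminating power, which is why it is a useful companion metric for imbalanced screening tasks.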
