Review
Proc Natl Acad Sci U S A. 2021 Jun 8;118(23):e2100766118.
doi: 10.1073/pnas.2100766118.

Analyzing the vast coronavirus literature with CoronaCentral

Jake Lever et al. Proc Natl Acad Sci U S A.

Abstract

The SARS-CoV-2 pandemic has caused a surge in research exploring all aspects of the virus and its effects on human health. The overwhelming publication rate means that researchers are unable to keep abreast of the literature. To ameliorate this, we present the CoronaCentral resource that uses machine learning to process the research literature on SARS-CoV-2 together with SARS-CoV and MERS-CoV. We categorize the literature into useful topics and article types and enable analysis of the contents, pace, and emphasis of research during the crisis with integration of Altmetric data. These topics include therapeutics, disease forecasting, as well as growing areas such as "long COVID" and studies of inequality. This resource, available at https://coronacentral.ai, is updated daily.

Keywords: coronavirus; literature analysis; literature categorization; machine learning.

Conflict of interest statement

Competing interest statement: D.L.D., J.L., and R.B.A. are all affiliated with Stanford University.

Figures

Fig. 1.
Overview of research trends and important topics. (A) Largest year-on-year changes in the percentage of papers that mention a biomedical concept using data from PubTator (8). (B) Frequency of each topic and (C) article type across the entire coronavirus literature. (D) The trajectories of the top five topics for original research and comment/editorial articles for SARS-CoV-2. (E) Different proportions of article types for each topic.
Fig. 2.
Communication of research has changed with a greater emphasis on social media and preprint servers. (A) The number of papers categorized with each topic in the 100 papers with highest Altmetric scores. (B) Top journals and preprint servers. (C) Topic breakdown for each preprint server and nonpreprint peer-reviewed journals. Infrequent topics in preprints are grouped in “Other.”
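
The year-on-year comparison described for Fig. 1A can be illustrated with a minimal sketch. It assumes each paper has already been reduced to a publication year and a list of mentioned concepts; it does not query PubTator, and the function names and toy records are hypothetical illustrations, not the authors' pipeline or data.

```python
from collections import defaultdict


def mention_percentages(papers):
    """Return {year: {concept: percent of that year's papers mentioning it}}."""
    totals = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for year, concepts in papers:
        totals[year] += 1
        # Count each concept at most once per paper.
        for concept in set(concepts):
            counts[year][concept] += 1
    return {
        year: {c: 100.0 * n / totals[year] for c, n in counts[year].items()}
        for year in totals
    }


def largest_changes(papers, prev_year, curr_year, top_n=3):
    """Rank concepts by absolute change in mention percentage between two years."""
    pct = mention_percentages(papers)
    prev = pct.get(prev_year, {})
    curr = pct.get(curr_year, {})
    changes = {
        c: curr.get(c, 0.0) - prev.get(c, 0.0) for c in set(prev) | set(curr)
    }
    # Largest absolute change first; concept name breaks ties deterministically.
    return sorted(changes.items(), key=lambda kv: (-abs(kv[1]), kv[0]))[:top_n]


# Hypothetical toy records: (publication year, concepts mentioned in the paper).
toy_papers = [
    (2019, ["camels"]),
    (2019, ["camels"]),
    (2019, ["camels", "bats"]),
    (2019, ["bats"]),
    (2020, ["telemedicine", "bats"]),
    (2020, ["telemedicine"]),
    (2020, ["telemedicine", "anxiety"]),
    (2020, ["telemedicine"]),
]

print(largest_changes(toy_papers, 2019, 2020, top_n=2))
# [('telemedicine', 100.0), ('camels', -75.0)]
```

Changes are expressed in percentage points of each year's papers rather than raw counts, so a year with far more publications does not dominate the ranking.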

References

    1. Wang L. L., et al., “CORD-19: The COVID-19 open research dataset” in Proceedings of the first Workshop on NLP for COVID-19 at ACL 2020 (Association for Computational Linguistics, 2020).
    2. Doanvo A., et al., Machine learning maps research needs in covid-19 literature. Patterns (N Y) 1, 100123 (2020).
    3. Bras P. L., et al., Visualising covid-19 research. arXiv [Preprint] (2020). https://arxiv.org/abs/2005.06380 (Accessed 15 December 2020).
    4. Roberts K., et al., TREC-COVID: Rationale and structure of an information retrieval shared task for COVID-19. J. Am. Med. Inform. Assoc. 27, 1431–1436 (2020).
    5. Zhang E., et al., Covidex: Neural ranking models and keyword search infrastructure for the covid-19 open research dataset. arXiv [Preprint] (2020). https://arxiv.org/abs/2007.07846 (Accessed 15 December 2020).