IEEE Access. 2021 May 20:9:78341-78355.
doi: 10.1109/ACCESS.2021.3082108. eCollection 2021.

A Comparative NLP-Based Study on the Current Trends and Future Directions in COVID-19 Research


Priyankar Bose et al. IEEE Access. 2021.

Abstract

COVID-19 is a global health crisis that has altered human life and still promises to create ripples of death and destruction in its wake. The sea of scientific literature published over a short time span to understand and mitigate this global phenomenon necessitates concerted efforts to organize our findings and focus on the unexplored facets of the disease. In this work, we apply natural language processing (NLP)-based approaches to the scientific literature published on COVID-19 to infer significant keywords that have contributed to our social, economic, demographic, psychological, epidemiological, clinical, and medical understanding of this pandemic. We identify key terms appearing in the COVID-19 literature that vary in representation when compared to other virus-borne diseases such as MERS, Ebola, and Influenza. We also identify countries, topics, and research articles that demonstrate that the scientific community is still reacting to short-term threats such as transmissibility, health risks, treatment plans, and public policies, underscoring the need for collective international efforts towards long-term immunization and drug-related challenges. Furthermore, our study highlights several long-term research directions that are urgently needed for COVID-19, such as global collaboration to create international open-access data repositories, policymaking to curb future outbreaks, psychological repercussions of COVID-19, vaccine development for SARS-CoV-2 variants and their long-term efficacy studies, and mental health issues in both children and the elderly.

Keywords: COVID-19; coefficient of variation; mean squared error; natural language processing.
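
The abstract describes comparing how strongly a term is represented in COVID-19 abstracts relative to another disease's abstracts, and the figure captions refer to a log fold change. The following is a minimal sketch of one common way to compute such a score; it is not the authors' pipeline. The tokenizer, the base-2 logarithm, the pseudocount `pseudo`, and the function names `relative_frequencies` and `log_fold_change` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def relative_frequencies(abstracts):
    """Return each word's share of all word tokens in a list of abstracts."""
    counts = Counter()
    for text in abstracts:
        # Assumed tokenizer: lowercase runs of letters, digits, and hyphens.
        counts.update(re.findall(r"[a-z0-9-]+", text.lower()))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def log_fold_change(target, reference, pseudo=1e-6):
    """log2 of (target frequency / reference frequency) for every word.

    `pseudo` is an assumed smoothing constant that keeps the ratio finite
    when a word is missing from one of the two corpora.
    """
    vocab = set(target) | set(reference)
    return {
        word: math.log2(
            (target.get(word, 0.0) + pseudo) / (reference.get(word, 0.0) + pseudo)
        )
        for word in vocab
    }


# Toy corpora (invented text, not data from the study).
covid = relative_frequencies(["lockdown policy lockdown", "vaccine policy"])
other = relative_frequencies(["vaccine vaccine outbreak", "policy outbreak"])
lfc = log_fold_change(covid, other)

# Positive scores: over-represented in the target; negative: under-represented.
for word, score in sorted(lfc.items(), key=lambda item: item[1], reverse=True):
    print(f"{word:10s} {score:+.2f}")
```

Under this convention, a word with a positive score is over-expressed in the target corpus and a word with a negative score is under-expressed. The choice of pseudocount strongly affects scores for words absent from one corpus, so it should be tuned or replaced by a minimum-count filter in real use.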


Figures

FIGURE 1. Summary of contributions: Each document consists of the abstracts of the scientific literature on one disease and is used to analyze significant over- and under-expressed words, topics, countries, and research articles.
FIGURE 2. Frequency of the top 400 words in the research abstracts.
FIGURE 3. Most frequent words in COVID-19 research.
FIGURE 4. Most frequent words in Ebola research.
FIGURE 5. Most frequent words in Influenza research.
FIGURE 6. Most frequent words in MERS research.
FIGURE 7. Significant common words across all diseases.
FIGURE 8. Top under-expressed words based on log fold change.
FIGURE 9. Distribution of the first 25 under-expressed words by the log ratio.
FIGURE 10. Top over-expressed words based on log fold change.
FIGURE 11. Distribution of the first 25 over-expressed words by the log ratio.
FIGURE 12. Names of the most mentioned countries in COVID-19 abstracts.
FIGURE 13. Frequency of the 25 most mentioned countries in COVID-19 research.
FIGURE 14. Daily COVID-19 infections in the five countries most frequently mentioned in COVID-19 research abstracts.
FIGURE 15. Evolution of the frequency of occurrence of the most mentioned countries in COVID-19 research abstracts.
FIGURE 16. Counts (on a log scale) of the top 5 keywords co-occurring with the top 5 nations mentioned in COVID-19 research abstracts.
FIGURE 17. Evolution of daily COVID-19 infection cases in the least mentioned countries.
FIGURE 18. Relationship between disease documents and topics.
FIGURE 19. Similarity across disease literature based on mean squared error.
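
The last caption refers to comparing disease literatures by mean squared error. As a rough illustration only, the sketch below computes the mean squared error between the word-frequency vectors of two documents over their shared vocabulary. The record does not state which features the authors compared, so the use of relative word frequencies, the tokenizer, and the names `word_frequencies` and `mean_squared_error` are assumptions.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Relative frequency of each word token in one disease document."""
    # Assumed tokenizer: lowercase runs of letters, digits, and hyphens.
    tokens = re.findall(r"[a-z0-9-]+", text.lower())
    if not tokens:
        return {}
    return {word: n / len(tokens) for word, n in Counter(tokens).items()}


def mean_squared_error(freq_a, freq_b):
    """Mean of squared frequency differences over the union vocabulary.

    A word missing from one document is treated as having frequency 0.
    Lower values mean the two documents use words more similarly.
    """
    vocab = set(freq_a) | set(freq_b)
    if not vocab:
        return 0.0
    squared = ((freq_a.get(w, 0.0) - freq_b.get(w, 0.0)) ** 2 for w in vocab)
    return sum(squared) / len(vocab)


# Toy documents (invented text, not data from the study).
documents = {
    "COVID-19": "lockdown policy vaccine",
    "MERS": "camel outbreak vaccine",
}
freqs = {name: word_frequencies(text) for name, text in documents.items()}
mse = mean_squared_error(freqs["COVID-19"], freqs["MERS"])
print(f"MSE(COVID-19, MERS) = {mse:.4f}")
```

The measure is symmetric and equals zero for identical documents, which makes it easy to arrange into a pairwise matrix across diseases. Because it averages over the union vocabulary, its magnitude depends on vocabulary size, so values are only comparable when all documents are scored against the same vocabulary.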


