Meta-Analysis
J Med Internet Res. 2020 Oct 2;22(10):e20509.
doi: 10.2196/20509.

Identification of Risk Factors and Symptoms of COVID-19: Analysis of Biomedical Literature and Social Media Data

Jouhyun Jeon et al. J Med Internet Res.

Abstract

Background: In December 2019, the COVID-19 outbreak started in China and rapidly spread around the world. Lack of a vaccine or optimized intervention raised the importance of characterizing risk factors and symptoms for the early identification and successful treatment of patients with COVID-19.

Objective: This study aims to investigate and analyze biomedical literature and public social media data to understand the association of risk factors and symptoms with the various outcomes observed in patients with COVID-19.

Methods: Through semantic analysis, we collected 45 retrospective cohort studies, which evaluated 303 clinical and demographic variables across 13 different outcomes of patients with COVID-19, and 84,140 Twitter posts from 1036 COVID-19-positive users. Machine learning tools for extracting biomedical information were used to identify mentions of uncommon or novel symptoms in tweets. We then examined and compared the two data sets to expand our landscape of risk factors and symptoms related to COVID-19.

Results: From the biomedical literature, approximately 90% of clinical and demographic variables showed inconsistent associations with COVID-19 outcomes. Consensus analysis identified 72 risk factors that were specifically associated with individual outcomes. From the social media data, 51 symptoms were characterized and analyzed. By comparing social media data with biomedical literature, we identified 25 novel symptoms that were specifically mentioned in tweets but have not previously been well characterized. Furthermore, certain combinations of symptoms were frequently mentioned together in social media.

Conclusions: Identified outcome-specific risk factors, symptoms, and combinations of symptoms may serve as surrogate indicators to identify patients with COVID-19 and predict their clinical outcomes in order to provide appropriate treatments.

Keywords: COVID-19; SARS-CoV-2; Twitter; biomedical literature; diagnosis; risk factor; social media; symptom; treatment; tweets.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Properties of clinical and demographic variables of COVID-19. (A) Landscape of clinical and demographic variables. (B) Association between variables and overall clinical outcomes. The number of variables tested in ≥5 studies appears in brackets. The asterisk indicates association types of cancer. (C) and (D) Association types across clinical outcomes of COVID-19; association types of (C) dry or sore throat and (D) cardiovascular disease for different clinical outcomes are shown. CRRT: continuous renal replacement therapy; ARDS: acute respiratory distress syndrome; ICU: intensive care unit.
Figure 2
Consensus identification of outcome-specific clinical and demographic variables. (A) Outcome-specific clinical and demographic variables in a given outcome of COVID-19. Variables that were specific for only one clinical outcome are shown in the red-dashed box. (B) Clinical and demographic variables that were specific for at least three outcomes. Blue coloring indicates identified outcome-specific variables (risk factors). ICU: intensive care unit; SOFA: Sequential Organ Failure Assessment; ARDS: acute respiratory distress syndrome; IL-10: interleukin 10; NT-proBNP: N-terminal pro-brain natriuretic peptide.
Figure 3
COVID-19–related symptoms extracted from social media data. (A) Landscape of symptoms identified from social media data. Orange and white indicate the presence and absence of symptoms in a given user, respectively. (B) Fraction of common, less common, and rare symptoms. Common symptoms were mentioned by >10% of users, and rare symptoms were mentioned by <1% of users. (C) Co-occurrence of symptoms. One major cluster is shown in the red-dashed box. (D) Number of symptom pairs depending on mentioning frequency. Blue bars (bottom) indicate the number of co-occurring pairs. (E) One major cluster of symptom pairs. Green, gray, and orange indicate rare, less common, and common symptoms, respectively.
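The frequency tiers and pair counts described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the input layout (a dict mapping a user ID to the symptoms that user mentioned), and the choice to use all users as the denominator are assumptions made for illustration. Only the >10% and <1% thresholds come from the caption.

```python
from collections import Counter
from itertools import combinations


def classify_symptoms(user_symptoms):
    """Label each symptom by the share of users who mention it.

    user_symptoms: dict of user ID -> iterable of symptom strings
    (hypothetical layout). Thresholds follow the Figure 3 caption:
    common is >10% of users, rare is <1% of users.
    """
    n_users = len(user_symptoms)
    # Count each symptom at most once per user.
    counts = Counter(s for symptoms in user_symptoms.values() for s in set(symptoms))
    tiers = {}
    for symptom, count in counts.items():
        fraction = count / n_users
        if fraction > 0.10:
            tiers[symptom] = "common"
        elif fraction < 0.01:
            tiers[symptom] = "rare"
        else:
            tiers[symptom] = "less common"
    return tiers


def co_occurring_pairs(user_symptoms):
    """Count how many users mention both symptoms of each unordered pair."""
    pairs = Counter()
    for symptoms in user_symptoms.values():
        # Sorting makes ("cough", "fever") and ("fever", "cough") the same key.
        pairs.update(combinations(sorted(set(symptoms)), 2))
    return pairs
```

Counting each symptom once per user, rather than once per tweet, matches the caption's wording that tiers are defined by the fraction of users; a per-tweet count would need a different denominator.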
Figure 4
Identification of novel symptoms of COVID-19, and comparison of symptoms between biomedical literature and social media data. Symptoms that were observed in the literature or social media are colored in blue; 25 social media–specific symptoms are presented. Green, gray, and orange indicate rare, less common, and common symptoms, respectively.
