Proc Natl Acad Sci U S A. 2021 Dec 21;118(51):e2111455118.
doi: 10.1073/pnas.2111455118.

Global monitoring of the impact of the COVID-19 pandemic through online surveys sampled from the Facebook user base

Christina M Astley et al. Proc Natl Acad Sci U S A.

Abstract

Simultaneously tracking the global impact of COVID-19 is challenging because of regional variation in resources and reporting. Leveraging self-reported survey outcomes via an existing international social media network has the potential to provide standardized data streams to support monitoring and decision-making worldwide, in real time, and with limited local resources. The University of Maryland Global COVID-19 Trends and Impact Survey (UMD-CTIS), in partnership with Facebook, has invited daily cross-sectional samples from the social media platform's active users to participate in the survey since its launch on April 23, 2020. We analyzed UMD-CTIS survey data through December 20, 2020, from 31,142,582 responses representing 114 countries/territories weighted for nonresponse and adjusted to basic demographics. We show consistent respondent demographics over time for many countries/territories. Machine learning models trained on national and pooled global data verified known symptom indicators. COVID-like illness (CLI) signals were correlated with government benchmark data. Importantly, the best-benchmarked UMD-CTIS signal uses a single survey item whereby respondents report on CLI in their local community. In regions with strained health infrastructure but active social media users, we show it is possible to define COVID-19 impact trajectories using a remote platform independent of local government resources. This syndromic surveillance public health tool is the largest global health survey to date and, with brief participant engagement, can provide meaningful, timely insights into the global COVID-19 pandemic at a local scale.

Keywords: COVID-19 surveillance; SARS-CoV-2 testing; global health; human social sensing.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
UMD-CTIS data pipeline, coverage, and demographic distributions. (A) The Facebook active user base (FAUB) is sampled daily and invited to participate in the online UMD-CTIS, administered by Qualtrics and accessed via an online form using a smartphone or computer. Participants are asked about demographics, COVID-19 symptoms, behaviors, and outcomes. Facebook supplies survey weights to account for nonresponse and to adjust for basic demographics of the participant. Aggregated data are released to the public in near real time. Researchers may apply to use raw microdata to study COVID-19. (B) The map of the distribution of surveys per capita during the study period for countries and territories that were sampled and have survey weights (n = 114; all other countries/territories are shown in gray). (C) The distribution of the difference of proportions in each group in UMD-CTIS versus local demographics by six age–gender groups (male and female versus young [18 to 34 y], middle [35 to 54 y], and elderly [>54 y]), that is, $D_g = P_{g,\text{UMD-CTIS}} - P_{g,\text{Census}}$, where $P_g$ is the proportion in group $g$. (D) The distribution of mean absolute differences across age–gender groups (i.e., $\delta = \sum_g |D_g| / 6$). The distribution by week ($w$) of the change ($\Delta$) in the (E) difference of proportions [$\Delta D_{g,w} = D_{g,w} - \operatorname{median}(D_{g,w})$] and (F) mean absolute difference [$\Delta\delta_w = \delta_w - \operatorname{median}(\delta_w)$] versus the median measure for that country/territory over the study period. The range across all locales (light ribbon), 25th to 75th percentile (dark ribbon), and median (solid line) are shown.
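The two demographic measures defined in panels C and D of Fig. 1 can be illustrated with a short Python sketch. It is not the authors' code: the function name, the group labels, and every proportion below are hypothetical values chosen only to show the arithmetic of the per-group difference and the mean absolute difference.

```python
def demographic_differences(survey_props, census_props):
    """Return per-group differences D_g and their mean absolute value (delta).

    Both arguments map an age-gender group label to a proportion.
    """
    if survey_props.keys() != census_props.keys():
        raise ValueError("group labels must match")
    # D_g = P_g(survey) - P_g(census)
    d = {g: survey_props[g] - census_props[g] for g in survey_props}
    # delta = sum over groups of |D_g|, divided by the number of groups (six here)
    delta = sum(abs(v) for v in d.values()) / len(d)
    return d, delta


# Hypothetical proportions for the six age-gender groups (not from the paper).
survey = {"M_young": 0.20, "M_middle": 0.18, "M_elderly": 0.10,
          "F_young": 0.22, "F_middle": 0.19, "F_elderly": 0.11}
census = {"M_young": 0.17, "M_middle": 0.17, "M_elderly": 0.15,
          "F_young": 0.17, "F_middle": 0.17, "F_elderly": 0.17}

d_g, delta = demographic_differences(survey, census)
```

A positive value for a group means that group is over-represented among respondents relative to the census; a smaller mean absolute difference indicates a closer overall match.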
Fig. 2.
The global model predicting recent COVID-19 positive test results using self-reported symptoms and minimal demographic data. (A) The receiver operating characteristic of the hyperparameter tuned global model showing the area under the curve. (B) The SHapley Additive exPlanations distribution of relative feature importance from the global model (green diamonds) compared to country/territory models (box and whisker plots). The within-model feature importance was normalized to loss of smell/taste to facilitate between-model comparison.
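The normalization mentioned for panel B of Fig. 2 can be sketched as follows, assuming that "normalized to loss of smell/taste" means dividing every feature's importance within a model by that model's importance for loss of smell/taste. The feature names and importance values are hypothetical and are not taken from the paper.

```python
def normalize_to_reference(importances, reference="loss_smell_taste"):
    """Scale feature importances so the reference feature equals 1.0."""
    ref_value = importances[reference]
    if ref_value == 0:
        raise ValueError("reference feature has zero importance")
    return {name: value / ref_value for name, value in importances.items()}


# Hypothetical mean absolute SHAP values for one model (not from the paper).
model_importances = {"loss_smell_taste": 0.40, "fever": 0.10, "cough": 0.20}
relative = normalize_to_reference(model_importances)
```

Under this assumption, models with different absolute importance scales become comparable because each is expressed relative to the same reference symptom.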
Fig. 3.
The schematic of COVID-19 case and UMD-CTIS surveillance signal benchmarking globally. For each country and territory, the 7-d smoothed COVID-19 case counts from Our World in Data (A) are compared to the survey-weighted CTIS surveillance measure. (B) The community CLI (CCLI) signal for Bolivia and Italy is shown for illustrative purposes. The survey-weighted sum of “yes” responses to the surveillance questions (here the CCLI survey question) for each week was divided by the sum of survey weights for all surveys over a 7-d window. (C) Time series were normalized to a range of 0 to 1 using the minimum and maximum during the survey period to allow within- and between-locale comparison of trends across a range of values using color intensity. (D) For each country/territory (rows), we combined normalized time series with log10 of the number of surveys (black bar chart), percent surveys per population (white bar chart), age and gender distributions (stacked bar charts), peak day (solid black circle, benchmark; open colored shapes, signals), and the benchmark–signal correlation strength (green) in the form of an annotated heatmap.
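The signal construction in panel B and the normalization in panel C of Fig. 3 can be sketched in Python. This is an illustration under stated assumptions, not the authors' pipeline: the window is taken to be a trailing window ending on each day, the inputs are assumed to be daily sums of survey weights, and the function names and numbers are hypothetical.

```python
def weighted_signal(daily_yes_weight, daily_total_weight, window=7):
    """Ratio of weighted 'yes' responses to total survey weight per window.

    daily_yes_weight[i]   : sum of weights of 'yes' respondents on day i
    daily_total_weight[i] : sum of weights of all respondents on day i
    Assumes a trailing window that ends on day i (an assumption of this sketch).
    """
    signal = []
    for i in range(len(daily_total_weight)):
        start = max(0, i - window + 1)
        yes = sum(daily_yes_weight[start:i + 1])
        total = sum(daily_total_weight[start:i + 1])
        if total <= 0:
            raise ValueError(f"no survey weight in window ending on day {i}")
        signal.append(yes / total)
    return signal


def min_max_normalize(series):
    """Rescale a series to the range 0 to 1 using its own minimum and maximum."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]  # a flat series carries no trend
    return [(x - lo) / (hi - lo) for x in series]


# Hypothetical daily weight sums (not from the paper); a 2-day window keeps it short.
yes_weights = [1.0, 2.0, 3.0, 4.0]
total_weights = [10.0, 10.0, 10.0, 10.0]
signal = weighted_signal(yes_weights, total_weights, window=2)
normalized = min_max_normalize(signal)
```

Because each locale is rescaled by its own minimum and maximum, the normalized series show the shape and timing of trends rather than absolute levels.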
Fig. 4.
The time series heatmap comparing the benchmark cases to UMD-CTIS–based signals. Refer to the illustration of the generation of the time series heatmap components in Fig. 3. Normalized benchmark (black column) and UMD-CTIS (navy through orange columns) signal time series by country or territory (Country/Territory, rows) are clustered by benchmark within geographic regions. Signals include recent positive COVID-19 test result (Positive Test), CCLI, self-reported fever, cough, or loss of smell/taste (Broad CLI), and self-reported loss of smell/taste of less than 14 d duration (Narrow CLI). The number of days to peak for each signal is compared with the benchmark (Peak). The Spearman correlation strength of each UMD-CTIS signal with the benchmark is shown (Correlation). Log10 surveys (LogN) and surveys per population (Pct) are shown as bar charts, and the proportion of surveys for each Age or Gender as stacked bar charts.
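The two comparison measures named in the Fig. 4 caption, Spearman correlation and the difference in peak timing, can be sketched in plain Python. This is a minimal illustration rather than the authors' analysis code: the function names and the two short series are hypothetical, and the series are assumed to be aligned day by day.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed series."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5  # undefined for a constant series


def peak_offset_days(benchmark, signal):
    """Days between the signal peak and the benchmark peak (negative = earlier)."""
    return signal.index(max(signal)) - benchmark.index(max(benchmark))


# Hypothetical normalized daily series (not from the paper).
benchmark = [0.0, 0.2, 0.9, 1.0, 0.4]
ctis_signal = [0.1, 0.5, 1.0, 0.8, 0.3]
rho = spearman(benchmark, ctis_signal)
offset = peak_offset_days(benchmark, ctis_signal)
```

Spearman correlation depends only on rank order, so it is unaffected by the min-max normalization applied to each series.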
