JMIR Form Res. 2024 Jun 28:8:e55013.
doi: 10.2196/55013.

Nonrepresentativeness of Human Mobility Data and its Impact on Modeling Dynamics of the COVID-19 Pandemic: Systematic Evaluation

Chuchu Liu et al. JMIR Form Res.

Abstract

Background: In recent years, a range of novel smartphone-derived data streams about human mobility have become available on a near-real-time basis. These data have been used, for example, to perform traffic forecasting and epidemic modeling. During the COVID-19 pandemic in particular, human travel behavior has been considered a key component of epidemiological modeling to provide more reliable estimates about the volumes of the pandemic's importation and transmission routes, or to identify hot spots. However, nearly universally in the literature, the representativeness of these data, how they relate to the underlying real-world human mobility, has been overlooked. This disconnect between data and reality is especially relevant in the case of socially disadvantaged minorities.

Objective: The objective of this study is to illustrate the nonrepresentativeness of data on human mobility and the impact of this nonrepresentativeness on modeling dynamics of the epidemic. This study systematically evaluates how real-world travel flows differ from census-based estimations, especially in the case of socially disadvantaged minorities, such as older adults and women, and further measures biases introduced by this difference in epidemiological studies.

Methods: To understand the demographic composition of population movements, a nationwide mobility data set from 318 million mobile phone users in China from January 1 to February 29, 2020, was curated. Specifically, we quantified the disparity in population composition between actual migrations and the resident composition according to census data, and showed how this nonrepresentativeness impacts epidemiological modeling by constructing an age-structured SEIR (Susceptible-Exposed-Infected-Recovered) model of COVID-19 transmission.
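
The abstract does not give the model's equations or parameters, so the following is only a minimal sketch of what an age-structured SEIR model can look like, not the authors' implementation. The contact matrix, the rates `beta`, `sigma`, and `gamma`, the seeding fraction, and the helper names `simulate_seir` and `peak` are all illustrative assumptions. The young and middle-aged shares are the rounded percentages reported in this abstract; the older-adult shares (24% and 5%) are inferred as the remainder, on the assumption that the three adult groups sum to 100%.

```python
from typing import List, Sequence, Tuple

# Population shares for young, middle-aged, and older adults.
# The third entry in each list is inferred as the remainder (assumption).
CENSUS_SHARES = [0.36, 0.40, 0.24]
FLOW_SHARES = [0.59, 0.36, 0.05]

# Hypothetical daily contacts of a person in group i with people in group j.
# Not taken from the study and not balanced for reciprocity.
CONTACTS = [
    [12.0, 5.0, 2.0],
    [5.0, 8.0, 2.0],
    [2.0, 2.0, 3.0],
]


def simulate_seir(
    shares: Sequence[float],
    contacts: Sequence[Sequence[float]],
    beta: float = 0.05,          # assumed transmission probability per contact
    sigma: float = 1 / 5.0,      # assumed rate of leaving the exposed state
    gamma: float = 1 / 7.0,      # assumed recovery rate
    seed_fraction: float = 1e-5, # assumed initially infected fraction per group
    days: int = 300,
    dt: float = 0.1,
) -> List[List[float]]:
    """Return daily new infections (E -> I), per group, as population fractions."""
    n = len(shares)
    S = [s * (1 - seed_fraction) for s in shares]
    E = [0.0] * n
    I = [s * seed_fraction for s in shares]
    R = [0.0] * n
    steps_per_day = int(round(1 / dt))
    daily: List[List[float]] = []
    for _day in range(days):
        new_today = [0.0] * n
        for _step in range(steps_per_day):
            # Force of infection on group i: beta * sum_j C[i][j] * I_j / N_j
            lam = [
                beta * sum(contacts[i][j] * I[j] / shares[j] for j in range(n))
                for i in range(n)
            ]
            for i in range(n):
                infected = lam[i] * S[i] * dt   # S -> E
                onset = sigma * E[i] * dt       # E -> I
                recovered = gamma * I[i] * dt   # I -> R
                S[i] -= infected
                E[i] += infected - onset
                I[i] += onset - recovered
                R[i] += recovered
                new_today[i] += onset
        daily.append(new_today)
    return daily


def peak(daily: List[List[float]]) -> Tuple[int, float]:
    """Return (day index, total new infections) at the epidemic peak."""
    totals = [sum(day) for day in daily]
    peak_day = max(range(len(totals)), key=totals.__getitem__)
    return peak_day, totals[peak_day]


if __name__ == "__main__":
    for label, shares in (("census", CENSUS_SHARES), ("flows", FLOW_SHARES)):
        day, value = peak(simulate_seir(shares, CONTACTS))
        print(f"{label}: peak on day {day}, new infections {value:.5f}")
```

Running the same model with the census shares and then the traveler shares is one simple way to see that the assumed age composition alone can shift the size and timing of the peak. The sketch uses explicit Euler steps for brevity; it will not reproduce the figures reported in the study.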

Results: We found a significant difference in the demographic composition between those who travel and the overall population. In the population flows, 59% (n=20,067,526) of travelers are young and 36% (n=12,210,565) are middle-aged (P<.001), which differs markedly from the overall adult population composition of China (where 36% of individuals are young and 40% are middle-aged). This difference would introduce a striking bias in epidemiological studies: the estimated maximum number of daily infections differs by a factor of nearly 3, and the estimated peak times are 46 days apart.
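
To make the size of this disparity concrete, the reported percentages can be turned into a representation ratio (share among travelers divided by share in the census) for each age group. The sketch below is illustrative only: the function names are hypothetical, the inputs are the rounded percentages quoted above, and the older-adult share is inferred as the remainder on the assumption that the three adult groups sum to 100%.

```python
from typing import Dict

# Rounded shares reported in the abstract.
FLOW_SHARE = {"young": 0.59, "middle_aged": 0.36}
CENSUS_SHARE = {"young": 0.36, "middle_aged": 0.40}


def with_remainder(shares: Dict[str, float], key: str = "older") -> Dict[str, float]:
    """Add a group holding whatever share is left (assumes groups sum to 1)."""
    out = dict(shares)
    out[key] = round(1.0 - sum(shares.values()), 10)
    return out


def representation_ratio(
    flow: Dict[str, float], census: Dict[str, float]
) -> Dict[str, float]:
    """Ratio > 1 means the group is overrepresented among travelers."""
    return {group: flow[group] / census[group] for group in flow}


if __name__ == "__main__":
    ratios = representation_ratio(
        with_remainder(FLOW_SHARE), with_remainder(CENSUS_SHARE)
    )
    for group, ratio in ratios.items():
        print(f"{group}: {ratio:.2f}")
```

Under these assumptions the ratios are roughly 1.64 for young adults, 0.90 for middle-aged adults, and 0.21 for older adults, so young adults are overrepresented in the flows and older adults are strongly underrepresented.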

Conclusions: The difference between actual migrations and resident composition strongly impacts outcomes of epidemiological forecasts, which typically assume that flows represent underlying demographics. Our findings imply that it is necessary to measure and quantify the inherent biases related to nonrepresentativeness for accurate epidemiological surveillance and forecasting.

Keywords: COVID-19; data representativeness; epidemiological modeling; human mobility; population composition.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Profiles of the intercity movements extracted from mobile phone data between January 1 and February 29, 2020, in China. (A) and (B) show the daily number of travelers for different age and gender groups; (C) and (D) show the respective ratios. Dashed horizontal lines denote the composition of respective groups in the latest 7th census. As children generally do not have mobile phones, the proportions of young (20-39 years), middle-aged (40-59 years), and older people (≥60 years) add up to 100%.

Figure 2
Dynamics of the incidence rates among different groups predicted by the age-structured model. Solid lines indicate the incidence rates of different age groups from census data, and dashed lines indicate the incidence rates by using traveling data from mobile phones.
