J R Soc Interface. 2018 Feb;15(139):20170776.
doi: 10.1098/rsif.2017.0776.

Tracking random walks

Riccardo Gallotti et al. J R Soc Interface. 2018 Feb.

Abstract

In empirical studies, trajectories of animals or individuals are sampled in space and time. Yet, it is unclear how sampling procedures bias the recorded data. Here, we consider the important case of movements that consist of alternating rests and moves of random durations and study how the estimate of their statistical properties is affected by the way we measure them. We first discuss the ideal case of a constant sampling interval and short-tailed distributions of rest and move durations, and provide an exact analytical calculation of the fraction of correctly sampled trajectories. Further insights are obtained with simulations using more realistic long-tailed rest duration distributions showing that this fraction is dramatically reduced for real cases. We test our results for real human mobility with high-resolution GPS trajectories, where a constant sampling interval allows one to recover at best 18% of the movements, while over-evaluating the average trip length by a factor of 2. Using a sampling interval extracted from real communication data, we recover only 11% of the moves, a value that cannot be increased above 16% even with ideal algorithms. These figures call for a more cautious use of data in quantitative studies of individuals' movements.
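The effect of a constant sampling interval on alternating rests and moves can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes one-dimensional motion at constant speed, exponentially distributed rest and move durations, and a simple reconstruction rule (consecutive sampling intervals with non-zero displacement are merged into one detected move). The function names `simulate_moves` and `sample_constant` and all parameter values are hypothetical, chosen only for illustration.

```python
import bisect
import random


def simulate_moves(n_moves, mean_rest, mean_move, rng):
    """Alternate exponential rests and moves; return (start, end) times of each move."""
    moves, t = [], 0.0
    for _ in range(n_moves):
        t += rng.expovariate(1.0 / mean_rest)        # rest preceding the move
        duration = rng.expovariate(1.0 / mean_move)  # duration of the move itself
        moves.append((t, t + duration))
        t += duration
    return moves


def sample_constant(moves, delta, speed=1.0):
    """Sample the distance travelled every `delta` and rebuild moves from the samples.

    Assumed rule (illustrative only): consecutive sampling intervals with
    non-zero displacement are merged into a single detected move.
    Returns the list of detected move lengths.
    """
    starts = [s for s, _ in moves]
    ends = [e for _, e in moves]
    done_before, total = [], 0.0   # distance completed before each move starts
    for s, e in moves:
        done_before.append(total)
        total += speed * (e - s)

    def travelled(t):
        i = bisect.bisect_right(starts, t) - 1       # last move started by time t
        if i < 0:
            return 0.0
        return done_before[i] + speed * (min(t, ends[i]) - starts[i])

    n_samples = int(ends[-1] // delta) + 2           # last sample falls after the final move
    positions = [travelled(k * delta) for k in range(n_samples)]

    detected, current = [], 0.0
    for a, b in zip(positions, positions[1:]):
        if b > a:                                    # displacement seen in this interval
            current += b - a
        elif current > 0.0:                          # a sampled rest closes the detected move
            detected.append(current)
            current = 0.0
    if current > 0.0:
        detected.append(current)
    return detected


if __name__ == "__main__":
    rng = random.Random(0)
    moves = simulate_moves(2000, mean_rest=2.0, mean_move=0.5, rng=rng)  # hours, illustrative
    true_mean = sum(e - s for s, e in moves) / len(moves)
    for delta in (0.01, 0.5, 5.0):
        est = sample_constant(moves, delta)
        # ratio of detected to actual moves, and of estimated to actual mean length
        print(delta, len(est) / len(moves), (sum(est) / len(est)) / true_mean)
```

Under this rule the total distance is conserved, so every rest that the sampling misses merges two moves into one: the number of detected moves drops and the estimated mean move length rises by the same factor. This is the qualitative bias the abstract describes, although the paper's own definition of a correctly sampled move may differ from the rule assumed here.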

Keywords: animal movement; human mobility; renewal theory; statistical physics.

Conflict of interest statement

We declare we have no competing interests.

Figures

Figure 1.
Examples of trajectory sampling. On a trajectory with exponentially distributed rest and move durations, we show the case of constant sampling interval (red circles) and the case of random sampling interval (blue crosses) with P(Δ) ∝ Δ⁻¹ (formula image, Δmax = 12 h). See electronic supplementary material, figure S1 for a two-dimensional example. (Online version in colour.)
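A random sampling interval with P(Δ) ∝ Δ⁻¹ on a bounded range can be drawn by inverse-transform sampling, since the cumulative distribution is F(Δ) = ln(Δ/Δmin)/ln(Δmax/Δmin). The sketch below is illustrative and not taken from the paper. Δmax = 12 h comes from the caption, while the lower bound `delta_min = 0.1` h is an assumption made here because the caption's value is not recoverable. The function names are hypothetical.

```python
import random


def draw_interval(delta_min, delta_max, rng):
    """Inverse-transform draw from P(delta) proportional to 1/delta on [delta_min, delta_max]."""
    u = rng.random()                                  # uniform in [0, 1)
    return delta_min * (delta_max / delta_min) ** u   # inverse of the log-uniform CDF


def sampling_times(t_end, delta_min, delta_max, rng):
    """Accumulate random intervals into sampling times that cover [0, t_end]."""
    times = [0.0]
    while times[-1] < t_end:
        times.append(times[-1] + draw_interval(delta_min, delta_max, rng))
    return times


if __name__ == "__main__":
    rng = random.Random(0)
    # delta_min = 0.1 h is an assumed value; delta_max = 12 h follows the caption.
    print(sampling_times(48.0, delta_min=0.1, delta_max=12.0, rng=rng))
```

The resulting times can be used in place of a regular grid to compare random and constant sampling on the same trajectory.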
Figure 2.
Distributions P(ℓ*/v) obtained from periodic sampling with exponential distribution of rest and move times. (a) Dependence of equation (2.6) on formula image fixing formula image and formula image = 1 h. The distribution has a maximum when the average rest times exceed the sampling time, and its value is strictly zero for ℓ* > vformula image. (b) Dependence of equation (2.6) on formula image fixing formula image, formula image. Short sampling times introduce a cut-off in the distribution. Large deviations can be observed when sampling time intervals are long. (Online version in colour.)
Figure 3.
Optimal sampling for exponential distributions and constant sampling. (a) We verify numerically (black dots) our analytical results (blue lines) for the first (k = 1, equation (2.8)) and second (k = 2, electronic supplementary material, equation (S28)) moment of the displacements distribution (normalized by 〈ℓ〉 and 〈ℓ²〉, respectively) versus sampling time interval. The original average value 〈ℓ〉 (yellow solid line) is obtained by definition for formula image (filled circle), while it is overestimated by ≈ 10% for formula image (up triangle). The second moment (k = 2) has a deviation of about 10% for both optimal sampling times (empty circle and down triangle). In the inset, we show the ratio of the estimated number of trips n* over the actual number of trips n. With formula image (circle), we correctly evaluate the number of moves, while formula image (triangle) yields a slightly underestimated value n* ≈ 0.90n. (b) The fraction of good moves follows the curve predicted by equation (2.10) (blue line). The maximum value of 51% is reached for formula image (triangle), but at formula image (circle) the value is only 1% lower. We choose here formula image, formula image. (Online version in colour.)
Figure 4.
Maximization of Fgood. (a) The maximum formula image for exponential distributions. We observe that formula image in the limit of small formula image, and decreases as formula image becomes comparable to formula image. The upper bound to sampling quality is 51% for the car mobility conditions of figure 3 (orange solid triangle) and 29% for GeoLife trajectories of figure 5 (yellow empty triangle). (b) The sampling rate formula image optimizing Fgood has a non-trivial dependence on formula image and formula image. We identify a relatively weak dependence on formula image, of the form formula image, with α ranging between 1.84 and 2 for all values of formula image. In particular, for the characteristic values observed for car mobility (orange solid triangle, formula image, formula image), the curve exhibits a plateau, allowing us to approximate formula image. For the GeoLife trajectories (yellow empty triangle), which have significantly shorter rest times (formula image h, formula image), the deviation from this approximation is only about 1.5%. (Online version in colour.)
Figure 5.
Constant sampling on GPS data. Results are obtained by sampling the GeoLife GPS data with a constant sampling interval Δ. We show (black dots) the fraction of moves correctly sampled as a function of the length of the sampling interval formula image. The dashed blue line corresponds to the theoretical curve computed for exponential distributions. The red circle corresponds to formula image, while orange triangles correspond to the empirical maximum formula image h of Fgood. Strikingly, the latter coincides with the theoretical value of formula image for exponential distributions. (Online version in colour.)

