PLoS One. 2012;7(5):e37027.

doi: 10.1371/journal.pone.0037027. Epub 2012 May 29.

A tale of many cities: universal patterns in human urban mobility

Anastasios Noulas¹, Salvatore Scellato, Renaud Lambiotte, Massimiliano Pontil, Cecilia Mascolo

PMID: 22666339
PMCID: PMC3362592
DOI: 10.1371/journal.pone.0037027

Anastasios Noulas et al. PLoS One. 2012.

Affiliation

¹ Computer Laboratory, University of Cambridge, Cambridge, United Kingdom. anastasios.noulas@cl.cam.ac.uk

Erratum in

PLoS One. 2012;7(9). doi:10.1371/annotation/ca85bf7a-7922-47d5-8bfb-bcdf25af8c72

Abstract

The advent of geographic online social networks such as Foursquare, where users voluntarily signal their current location, opens the door to powerful studies on human movement. In particular the fine granularity of the location data, with GPS accuracy down to 10 meters, and the worldwide scale of Foursquare adoption are unprecedented. In this paper we study urban mobility patterns of people in several metropolitan cities around the globe by analyzing a large set of Foursquare users. Surprisingly, while there are variations in human movement in different cities, our analysis shows that those are predominantly due to different distributions of places across different urban environments. Moreover, a universal law for human mobility is identified, which isolates as a key component the rank-distance, factoring in the number of places between origin and destination, rather than pure physical distance, as considered in some previous works. Building on our findings, we also show how a rank-based movement model accurately captures real human movements in different cities.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. Global movements.**
The probability density function (PDF) of human displacements as seen through 35 million location broadcasts (check-ins) across the planet. The power-law fit features an exponent and a threshold confirming previous works on human mobility data. The spatial granularity offered by GPS data allows for the inspection of human movements at very small distances, whereas the global reach of Foursquare reveals the full tail of the planetary distribution of human movements.

**Figure 2. Urban movements.**
The probability density function (PDF) of human displacements within cities (intracity). A pair of successive location broadcasts (check-ins) is included as a sample if both locations belong to the same city. Approximately 10 million such transitions have been measured. The poor power-law fit of the data suggests that the distribution of intracity displacements cannot be fully described by a power law. Short transitions, which account for a large portion of the distribution, are not captured by such a process.
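
The sampling rule in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the `(city, lat, lon)` check-in format, the function names and the use of the haversine formula with a mean Earth radius are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def intracity_displacements(checkins):
    """Distances between successive check-ins that fall in the same city.

    checkins: one user's time-ordered list of (city, lat, lon) tuples
    (a hypothetical input format).
    """
    distances = []
    for (city1, lat1, lon1), (city2, lat2, lon2) in zip(checkins, checkins[1:]):
        if city1 == city2:  # keep the transition only if both ends share a city
            distances.append(haversine_km(lat1, lon1, lat2, lon2))
    return distances
```

A histogram of the returned distances, pooled over users, would give an empirical estimate of the intracity displacement PDF.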

**Figure 3. Urban movement heterogeneities.**
The probability density function (PDF) of human displacements in three cities: Houston, San Francisco and Singapore (for 47, 112 and 79 thousand transitions, respectively). Common trends are observed, e.g., the probability of a jump steadily decreases after the distance threshold of 100 meters, but the shapes of the distributions vary from city to city, suggesting either that human movements do not exhibit universal patterns across cities or that distance is not the appropriate variable to model them.

**Figure 4. City place densities and mean movement lengths.**
Scatter plot of the place density of a city, defined as the number of places per square kilometer, versus the mean length of its transitions in kilometers. Each data point corresponds to a city, while the red line is a fit that highlights the relationship between the two variables. A longer mean transition is associated with a sparser urban environment, indicating that the number of available places per unit area could have an impact on human urban travel.

**Figure 5. City area sizes and mean movement lengths.**
Scatter plot of the area of a city, measured in square kilometers, versus the mean length of its transitions in kilometers. Unlike place density, the area of a city does not seem strongly related to the mean length of its transitions. To measure the area of a city, we segmented the spatial plane around its geographic midpoint into squares of equal size. The area of a city is defined as the total area of all squares that contain at least five places.
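
The area definition in this caption could be computed roughly as below. The square size is not available in this record, so `cell_km` is a hypothetical parameter with an arbitrary default. The sketch also assumes place coordinates have already been projected to kilometers relative to the city midpoint.

```python
import math
from collections import Counter


def city_area_km2(places_km, cell_km=1.0, min_places=5):
    """Total area of grid squares that contain at least `min_places` places.

    places_km: iterable of (x, y) coordinates in km relative to the city
               midpoint (planar projection assumed).
    cell_km:   side length of each grid square (hypothetical default).
    """
    # Count how many places fall into each grid square.
    counts = Counter(
        (math.floor(x / cell_km), math.floor(y / cell_km)) for x, y in places_km
    )
    qualifying = sum(1 for n in counts.values() if n >= min_places)
    return qualifying * cell_km ** 2
```

Dividing the number of places by this area gives the place density used in Figure 4.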

**Figure 6. Rank distributions in three cities.**
(a) Probability density function (PDF) of rank values for three cities (Houston, Singapore and San Francisco). The rank distribution is measured as follows: for each transition from an origin place to a destination place, we measure the rank of the destination, defined as the number of places that are geographically closer to the origin than the destination is. We observe that the distributions of the three cities collapse to a single line, which suggests that universal laws can be formulated in terms of the rank variable. The observation confirms the hypothesis that human movements are driven by the density of the geographic environment rather than the exact distance cost of our travels. A least squares fit (red line) underlines the decreasing trend of the probability of a jump as the rank of a place increases.
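
A minimal sketch of the rank measurement described in this caption. It assumes places are `(x, y)` tuples in a planar projection. Excluding the origin itself from the count is a choice of this sketch rather than something the caption specifies.

```python
import math


def rank_of(origin, destination, places):
    """Number of places strictly closer to `origin` than `destination` is.

    places: list of (x, y) tuples (planar coordinates assumed); the origin
    itself is not counted.
    """
    d_dest = math.dist(origin, destination)
    return sum(
        1 for p in places if p != origin and math.dist(origin, p) < d_dest
    )
```

For example, with places at `(0, 0)`, `(1, 0)`, `(2, 0)` and `(0, 3)`, a jump from `(0, 0)` to `(0, 3)` has rank 2, because two places lie closer to the origin than the destination. The rank therefore depends on how densely places are packed around the origin, not on the distance alone.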

**Figure 7. Rank distributions in urban environments.**
Superimposition of the probability density functions (PDF) of rank values for the thirty-four cities analyzed in the Foursquare dataset. A decreasing trend in the probability of a jump to a place as its rank value increases is common to all of them. The trend remains stable despite the large number of plotted cities and their potential differences with respect to variables such as number of places, number of displacements, area size, density, or other cultural, national or organizational factors.

**Figure 8. Fitting urban movements.**
Probability density functions (PDF) of human movements and corresponding fits with the rank-distance and gravity models in three cities (Houston, San Francisco and Singapore). In the rank-distance model, the probability of transiting from one place to another in a city depends only on the rank value of the destination with respect to the origin. In the case of the gravity model, the deterrence effect of distance is combined with a mass-based attractiveness of the destination place. The mass associated with a place has been defined according to its number of neighboring places. The parameters for the depicted fit of the gravity model are and meters. The places of a city employed for the simulation experiments were those observed in the Foursquare dataset; hence, while the rank-based model is the same for all cities, the underlying spatial distribution of places may vary. Excellent fits are observed for all cities analyzed. It is interesting to note that the model is able to reproduce even minor anomalies, such as the case of San Francisco, where we have ‘jumps’ in the probability of a movement at 20 and 40 kilometers.
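
One way to simulate a rank-based model of this kind is sketched below, assuming the jump probability decays as a power of the rank (weight proportional to `rank ** -alpha`). The function names are hypothetical, and `alpha` must be supplied by the caller, since the fitted exponent is not available in this record. Ranks start at 1 for the nearest place so that every weight is finite.

```python
import math
import random


def rank_weights(origin_idx, places, alpha):
    """Unnormalized jump weights from places[origin_idx] to every other place.

    places: list of (x, y) tuples (planar coordinates assumed).
    alpha:  power-law exponent on the rank (caller-supplied placeholder).
    """
    origin = places[origin_idx]
    others = [i for i in range(len(places)) if i != origin_idx]
    others.sort(key=lambda i: math.dist(origin, places[i]))
    # Nearest place gets rank 1, the next rank 2, ...; ties follow sort order.
    return {i: (r + 1) ** (-alpha) for r, i in enumerate(others)}


def simulate_jump(origin_idx, places, alpha, rng=random):
    """Draw the index of the next place with probability proportional to its weight."""
    weights = rank_weights(origin_idx, places, alpha)
    idxs = list(weights)
    return rng.choices(idxs, weights=[weights[i] for i in idxs], k=1)[0]
```

Because the weights depend only on the ordering of places around the origin, the same rule can be applied unchanged to cities with very different spatial layouts. The distances of the simulated jumps then inherit each city's own place distribution.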

**Figure 9. Fitting urban movements for all cities in the Foursquare dataset.**
The dominance of the rank-distance model over the gravity case extends to the rest of the cities (34 in total) we have experimented with in the Foursquare dataset. The results depicted here correspond to the gravity model with parameters and meters, whereas in the case of the rank-distance model an exponent has been used to simulate movement in all cities and corresponds to the empirical average of the exponents resulting from the fit of the rank value distributions.

**Figure 10. Geographic distribution of places in cities.**
Gaussian kernel density estimation (KDE) applied to the spatial distribution of places in three cities (Houston, San Francisco and Singapore). Each dark point corresponds to a venue observed in the Foursquare dataset, encoded in terms of longitude and latitude values. The output of the KDE is visualised with a thermal map. A principal core of high density is observed in the three cities, but point-wise density and spatial distribution patterns may differ. The rank-based model can cope with those heterogeneities, as it accounts for the relative density for a given pair of places.

**Figure 11. Probability density function (PDF) of observing two randomly selected places at a given distance in a city.**
We have enumerated 11808, 15970 and 15617 unique venues for Houston, San Francisco and Singapore, respectively. The probability increases with distance, as expected in two dimensions, before falling due to finite-size effects. It is interesting to note that the probability for two randomly selected places to be the origin and destination of a jump decreases monotonically with distance (see SI).

**Figure 12. Effect of place coordinate randomization on the performance of the rank-distance model.**
The y-axis shows the KL-divergence between the empirically observed distribution of displacements in a city and the distribution obtained by the *rank-distance* model. The x-axis shows the probability of randomization. To randomize the spatial distribution of places in a city, we iterate through its set of places and randomize the coordinates of each place with that probability. A new pair of coordinates is assigned uniformly within a pre-specified range. A randomization probability of zero corresponds to the case in which the original distribution of displacements within a city is maintained, whereas the opposite extreme, a probability of one, means that all places have been randomized. The error bars correspond to standard deviations across cities.
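
The two ingredients of this experiment can be sketched as follows. The histogram binning, the direction of the divergence (empirical distribution first), the `eps` guard for empty bins and the coordinate ranges are all assumptions of this sketch; the record does not preserve the original definitions.

```python
import math
import random


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) for two histograms over the same bins.

    Both inputs are normalized first; `eps` guards against empty bins in Q
    (a smoothing choice of this sketch).
    """
    sum_p, sum_q = sum(p), sum(q)
    total = 0.0
    for pi, qi in zip(p, q):
        pi, qi = pi / sum_p, qi / sum_q
        if pi > 0:  # bins with zero mass in P contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


def randomize_places(places, prob, x_range, y_range, rng=random):
    """Replace each (x, y) place by a uniform draw from the ranges with probability `prob`."""
    randomized = []
    for x, y in places:
        if rng.random() < prob:
            x, y = rng.uniform(*x_range), rng.uniform(*y_range)
        randomized.append((x, y))
    return randomized
```

Sweeping `prob` from 0 to 1, re-running the movement model on the randomized places and comparing the resulting displacement histogram with the empirical one would trace a curve of the kind shown in the figure.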
