Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2008 Jan 29;105(4):1118-23.
doi: 10.1073/pnas.0706851105. Epub 2008 Jan 23.

Maps of random walks on complex networks reveal community structure

Affiliations
Review

Maps of random walks on complex networks reveal community structure

Martin Rosvall et al. Proc Natl Acad Sci U S A. .

Abstract

To comprehend the multipartite organization of large-scale biological and social systems, we introduce an information theoretic approach that reveals community structure in weighted and directed networks. We use the probability flow of random walks on a network as a proxy for information flows in the real system and decompose the network into modules by compressing a description of the probability flow. The result is a map that both simplifies and highlights the regularities in the structure and their relationships. We illustrate the method by making a map of scientific communication as captured in the citation patterns of >6,000 journals. We discover a multicentric organization with fields that vary dramatically in size and degree of integration into the network of science. Along the backbone of the network-including physics, chemistry, molecular biology, and medicine-information flows bidirectionally, but the map reveals a directional pattern of citation from the applied fields to the basic sciences.

PubMed Disclaimer

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Fig. 1.
Detecting communities by compressing the description of information flows on networks. (A) We want to describe the trajectory of a random walk on the network such that important structures have unique names. The orange line shows one sample trajectory. (B) A basic approach is to give a unique name to every node in the network. The Huffman code illustrated here is an efficient way to do so. The 314 bits shown under the network describe the sample trajectory in A, starting with 1111100 for the first node on the walk in the upper left corner, 1100 for the second node, etc., and ending with 00011 for the last node on the walk in the lower right corner. (C) A two-level description of the random walk, in which major clusters receive unique names, but the names of nodes within clusters are reused, yields on average a 32% shorter description for this network. The codes naming the modules and the codes used to indicate an exit from each module are shown to the left and the right of the arrows under the network, respectively. Using this code, we can describe the walk in A by the 243 bits shown under the network in C. The first three bits 111 indicate that the walk begins in the red module, the code 0000 specifies the first node on the walk, etc. (D) Reporting only the module names, and not the locations within the modules, provides an efficient coarse graining of the network.
Fig. 2.
Fig. 2.
Mapping flow highlights different aspects of structure than does optimizing modularity in directed and weighted networks. The coloring of nodes illustrates alternative partitions of two sample networks. (Left) Partitions show the modular structure as optimized by the map equation (minimum L). (Right) Partitions show the structure as optimized by modularity (maximum Q). In the network shown in A, the left-hand partition minimizes the map equation because the persistence times in the modules are long; with the weight of the bold links set to twice the weight of other links, a random walker without teleportation takes on average three steps in a module before exiting. The right-hand clustering gives a longer description length because a random walker takes on average only 12/5 steps in a module before exiting. The right-hand clustering maximizes the modularity because modularity counts weights of links, the in-degree, and the out-degree in the modules; the right-hand partitioning places the heavily weighted links inside of the modules. In B, for the same reason, the right-hand partition again maximizes modularity, but not so the map equation. Because every node is either a sink or a source in this network, the links do not induce any long-range flow, and the one-step walks are best described as in the left-hand partition, with all nodes in the same cluster.
Fig. 3.
Fig. 3.
A map of science based on citation patterns. We partitioned 6,128 journals connected by 6,434,916 citations into 88 modules and 3,024 directed and weighted links. For visual simplicity, we show only the links that the random surfer traverses >1/5,000th of her time, and we only show the modules that are visited via these links (see SI for the complete list). Because of the automatic ranking of nodes and links by the random surfer (22), we are assured of showing the most important links and nodes. For this particular level of detail, we capture 98% of the node weights and 94% of all flow.
Fig. 4.
Fig. 4.
A map of the social sciences. The journals listed in the 2004 social science edition of Journal Citation Reports (32) are a subset of those illustrated in Fig. 3, totaling 1,431 journals and 217,287 citations. When we map this subset on its own, we get a finer level of resolution. The 10 modules that correspond to the social sciences now are partitioned into 54 modules, but for simplicity we show only links that the random surfer visits at least 1/2,000th of her times together with the modules they connect (see SI for the complete list). For this particular level of detail, we capture 97% of the node weights and 90% of all flow.

References

    1. Newman MEJ. SIAM Review. 2003;45:167–256.
    1. Newman MEJ, Barabási AL, Watts DJ. The Structure and Dynamics of Networks. Princeton, NJ: Princeton Univ Press; 2006.
    1. Girvan M, Newman MEJ. Proc Natl Acad Sci USA. 2002;99:7821–7826. - PMC - PubMed
    1. Palla G, Derényi I, Farkas I, Vicsek T. Nature. 2005;435:814–818. - PubMed
    1. Sales-Pardo M, Guimerà R, Moreira AA, Amaral LAN. Proc Natl Acad Sci USA. 2007;104:15224. - PMC - PubMed

Publication types

MeSH terms

LinkOut - more resources