Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2007 Jun 5;104(23):9564-9.
doi: 10.1073/pnas.0610537104. Epub 2007 May 24.

Mixture models and exploratory analysis in networks

Affiliations

Mixture models and exploratory analysis in networks

M E J Newman et al. Proc Natl Acad Sci U S A. .

Abstract

Networks are widely used in the biological, physical, and social sciences as a concise mathematical representation of the topology of systems of interacting components. Understanding the structure of these networks is one of the outstanding challenges in the study of complex systems. Here we describe a general technique for detecting structural features in large-scale network data that works by dividing the nodes of a network into classes such that the members of each class have similar patterns of connection to other nodes. Using the machinery of probabilistic mixture models and the expectation-maximization algorithm, we show that it is possible to detect, without prior knowledge of what we are looking for, a very broad range of types of structure in networks. We give a number of examples demonstrating how the method can be used to shed light on the properties of real-world networks, including social and information networks.

PubMed Disclaimer

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Fig. 1.
Application of the method described here to the “karate club” network of ref. . The two shaded regions indicate the division of the network in the real world, whereas the shades of the individual vertices indicate the decomposition chosen by the algorithm. The sizes of the vertices indicate the probabilities θ1i for edges from vertices in group 1 (the left group) to be connected to each other vertex, with the probabilities ranging from 0 for the smallest vertices to 0.19 for the largest.
Fig. 2.
Fig. 2.
The adjacency network of English words described in the text. The two shaded groups contain adjectives and nouns respectively and the shades of the individual vertices represent the classification found by the algorithm.
Fig. 3.
Fig. 3.
A directed social network of U.S. high school students and the division into two groups found by the directed version of our method. Vertex shapes show the (self-identified) ethnicity of the students, as indicated.
Fig. 4.
Fig. 4.
The four-group network described in the text, in which connections between vertices are entirely random, except for connections to the eight keystone vertices in the center. Each of the four groups (dashed boxes) is thus distinguished solely by the unique pattern of its connections to the keystone vertices. Vertex shapes represent the groups to which vertices are assigned by our analyses using the traditional community detection (a) and maximum likelihood (b) methods of this paper.

References

    1. Newman MEJ, Barabási A-L, Watts DJ. The Structure and Dynamics of Networks. Princeton: Princeton Univ Press; 2006.
    1. Albert R, Barabási A-L. Rev Mod Phys. 2002;74:47–97.
    1. Newman MEJ. SIAM Rev. 2003;45:167–256.
    1. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang D-U. Phys Rep. 2006;424:175–308.
    1. Wasserman S, Faust K. Social Network Analysis. Cambridge, UK: Cambridge Univ Press; 1994.

LinkOut - more resources