Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2014 Jun 25:15:220.
doi: 10.1186/1471-2105-15-220.

Exploring community structure in biological networks with random graphs

Affiliations

Exploring community structure in biological networks with random graphs

Pratha Sah et al. BMC Bioinformatics. .

Abstract

Background: Community structure is ubiquitous in biological networks. There has been an increased interest in unraveling the community structure of biological systems as it may provide important insights into a system's functional components and the impact of local structures on dynamics at a global scale. Choosing an appropriate community detection algorithm to identify the community structure in an empirical network can be difficult, however, as the many algorithms available are based on a variety of cost functions and are difficult to validate. Even when community structure is identified in an empirical system, disentangling the effect of community structure from other network properties such as clustering coefficient and assortativity can be a challenge.

Results: Here, we develop a generative model to produce undirected, simple, connected graphs with a specified degrees and pattern of communities, while maintaining a graph structure that is as random as possible. Additionally, we demonstrate two important applications of our model: (a) to generate networks that can be used to benchmark existing and new algorithms for detecting communities in biological networks; and (b) to generate null models to serve as random controls when investigating the impact of complex network features beyond the byproduct of degree and modularity in empirical biological networks.

Conclusion: Our model allows for the systematic study of the presence of community structure and its impact on network function and dynamics. This process is a crucial step in unraveling the functional consequences of the structural properties of biological systems and uncovering the mechanisms that drive these systems.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Schematic representation of the steps of our algorithm.(a) The algorithm assigns a within-degree and between-degree to each node, which are represented here as half-within-edges and half-between-edges respectively. (b) The half-between-edges are then connected using a modified version of the Havel-Hakimi algorithm, and to remove degree correlations, the between-edges are randomized. (c) Finally, the half-within-edges are connected using the standard Havel-Hakimi algorithm for each module and (d) the within-edges are randomized to remove degree correlations.
Figure 2
Figure 2
Modular random graphs with n =150, m =375, K =3, P ( s =50)=1 andpk is power law with modularity values of: a) Q =0.1; b) Q = 0.3; and c) Q = 0.6. As the modularity increases, the ratio of the total number of edges within modules to the number of edges in the network increases (i.e. dw¯ increases), while the remaining parameter values (degree distribution, network mean degree, number of modules) are held constant.
Figure 3
Figure 3
Values of (a) Assortativity, r , (b) clustering coefficient, C , and (c) path length, L in modular random graphs do not vary significantly with increasing modularity ( Q ). Each graph has n=2000 nodes, a mean degree d¯=10 and K=10 modules with P(s=200)=1. The data points represent the average value of 50 random graphs. Standard deviations are plotted as error bars.
Figure 4
Figure 4
Performance of the Louvain method [[46]], fast modularity method [[47]], the spin-glass based method [[48]], the infoMAP method [[49]], label propagation [[50]] and the random-walk based method [[51]] in networks with mean degree 10. Fill circles, open circles and triangles represent networks with Poisson, geometric and power-law degree distributions, respectively. Each data point represents the average over ten modular random graphs. Error bars represent standard deviations. The solid line is the reference line where estimated modularity is equal to the input modularity.
Figure 5
Figure 5
Visualization of empirical and random graphs of social interaction of dolphins and food-web trophic interactions at the Little Rock Lake in Wisconsin. Figure (a) is the empirical network of Dolphin social network, (b) its modular random graph, and (c) its random graph counterpart with matched degree distribution (Q=0). Figure (d) is the empirical network for the food-web trophic interaction at Little Rock Lake in Wisconsin, (e) is its modular random graph and (f) its random graph counterpart with matched degree distribution. Modular random graphs have generated to match the overall degree distribution, network mean degree, the level of modularity and the number of modules of the empirical graphs. Random graphs with matched degree distribution are based on the configuration model.
Figure 6
Figure 6
Comparisons of empirical networks, modular random graphs and random graphs with matched degree distribution (based on the configuration model). The figure summarizes network statistics of the empirical network as well as the ensemble mean of two types of random graphs in terms of (a) Modularity, Q ; (b) Assortativity, r; (c) Path length, L and (d) Clustering coefficient, C. The path length value for the empirical Yeast-Protein interaction network is missing as the network contains disconnected components. Error bars denote standard deviation from the ensemble mean of the generated random graphs. Errors bars for modular random graphs in Figure 6(a) have been omitted as the value of modularity (Q) match the empirical networks perfectly. FW = Little Rock food web, YP = Yeast protein interaction network, CM =C.elegans metabolic network and DS = Dolphin social network.

Similar articles

Cited by

References

    1. Proulx SR, Promislow DEL, Phillips PC. Network thinking in ecology and evolution. Trends Ecol Evol. 2005;20(6):345–53. doi: 10.1016/j.tree.2005.04.004. - DOI - PubMed
    1. Bansal S, Khandelwal S, Meyers LA. Exploring biological network structure with clustered random networks. BMC Bioinformatics. 2009;10:405. doi: 10.1186/1471-2105-10-405. - DOI - PMC - PubMed
    1. Girvan M, Newman MEJ. Community structure in social and biological networks. Proc Nat Acad Sci USA. 2002;99(12):7821–7826. doi: 10.1073/pnas.122653799. - DOI - PMC - PubMed
    1. Newman M. Mixing patterns in networks. Phys Rev E. 2003;67(2):026126. - PubMed
    1. Ravasz E, Somera AL, Oltvai ZN, Barabási AL. Mongru Da. Hierarchical organization of modularity in metabolic networks. Science (New York, NY) 2002;297(5586):1551–1555. doi: 10.1126/science.1073374. [ http://www.ncbi.nlm.nih.gov/pubmed/12202830] - DOI - PubMed

Publication types

LinkOut - more resources