Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2010 Sep 2;5(9):e12528.
doi: 10.1371/journal.pone.0012528.

Community landscapes: an integrative approach to determine overlapping network module hierarchy, identify key nodes and predict network dynamics

Affiliations

Community landscapes: an integrative approach to determine overlapping network module hierarchy, identify key nodes and predict network dynamics

István A Kovács et al. PLoS One. .

Abstract

Background: Network communities help the functional organization and evolution of complex networks. However, the development of a method, which is both fast and accurate, provides modular overlaps and partitions of a heterogeneous network, has proven to be rather difficult.

Methodology/principal findings: Here we introduce the novel concept of ModuLand, an integrative method family determining overlapping network modules as hills of an influence function-based, centrality-type community landscape, and including several widely used modularization methods as special cases. As various adaptations of the method family, we developed several algorithms, which provide an efficient analysis of weighted and directed networks, and (1) determine persvasively overlapping modules with high resolution; (2) uncover a detailed hierarchical network structure allowing an efficient, zoom-in analysis of large networks; (3) allow the determination of key network nodes and (4) help to predict network dynamics.

Conclusions/significance: The concept opens a wide range of possibilities to develop new approaches and applications including network routing, classification, comparison and prediction.

PubMed Disclaimer

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1
Figure 1. Description of the ModuLand method family.
For this illustrative example we used the network science co-authorship network without link weights using the LinkLand influence function calculation method with the TotalHill module membership assignment method. The network was laid out using the Kamada-Kawai algorithm and was visualized with a custom Blender script. On the vertical axes influence function values (panel A), or community landscape values (panels B, C and D) of the links are shown. Influence functions of panels A1, or A2 belong to the Barabási-Vicsek, or Girvan-Newman author-pairs, respectively. Panel A3 shows the merged influence function of the Arenas-Pastor-Satorras and Guimera-Amaral co-authorship links. Links and nodes of panels C and D are colored in proportion of the colors of the modules they belong. Panel A: influence function calculation. First, the influence function of each link (or node) of the network were identified. If a link is in the ‘middle’ of a module, it is affected by many influence functions (all the three widely collaborating author-pairs, whose influence functions are shown by the arrows, are from this category). On the contrary, links at module ‘edges’ are affected by few influence functions only. At the bottom of the panel the names of the three algorithms we described in details are shown. Panel B: community landscape construction. Next, the community landscape is constructed by summing up the influence function values for all nodes or links. The hills of the community landscape correspond to the modules of the network. Panel C: determination of overlapping modules. Last, modular centers are identified as the links at the local maxima of the community landscapes, and memberships of links in all network modules are determined. At the top of the panel the names of the three methods we described in details are shown. Panel D: determination of network hierarchy. Optionally, a higher level hierarchical representation of the network can be created, where nodes of the higher level correspond to modules of the original network, and links of the higher level correspond to overlaps between the respective modules. Sizes of higher level nodes correspond to the log size of the respective lower level modules, where the module size is the sum of the membership assignment strengths of all nodes to that module.
Figure 2
Figure 2. Comparison of the ModuLand method with other modularization methods.
Panel A: Comparison of the identified modules with the modules of the benchmark graph of Lancichinetti et al. . Modularization has been performed on benchmark graphs with degree and module size distribution exponents γ = 2 and β = 1 using the NodeLand or LinkLand influence function calculation algorithm with the ProportionalHill module membership assignment method with merging highly correlated modules using an arbitrary chosen correlation threshold of 0.9 (see Section VI.1. in the Electronic Supplementary Material S1; red squares with dashed line and red rectangles with solid line for NodeLand and LinkLand, respectively), the InfoMap method (black circles with dotted line, [5]), the Louvain method (black circles with solid line, [22]) and the CFinder method with k = 4 cliques (black circles with dashed line, [7]). The number of nodes of the benchmark graphs was N = 1000, the maximum degree was Kmax = 50, the average degree was K = 15 and the network fuzziness μ of the x-axis of Panel A) was ranging from 0.1 to 0.85, where μ>0.5 means that the modules are no longer defined in the strong sense. Higher normalized mutual information (shown on the y-axis) represents a better recovery of the original modules. The panel shows the averaged results of 50 representations. Panel B: comparison of module assignment of the cAMP-dependent protein kinase family in the yeast protein-protein interaction network. The panel shows the modular assignment of the 3 catalytic and the regulatory subunit of the yeast cAMP-dependent protein kinase together with that of their first neighbors in the high fidelity protein-protein interaction network of Ekman et al. . For the sake of simplicity only intra-subnetwork contacts have been included. The top left, top right, bottom left and bottom right figures show the modular assignment using the NodeLand, InfoMap, Louvain and CFinder methods, respectively, determined as described in the legend of Panel A. Various colors correspond to different modules. Overlapping ModuLand modules of cAMP kinase members and CFinder modules of their casein kinase II neighbor are marked with pie-charts, where the area of color-codes is proportional to the module membership value of the given node in the given module. To simplify the figure, in case of the NodeLand method (top right figure of Panel B) all neighbors of cAMP-dependent protein kinase family members (which should all have similar pie-charts to the 4 central members due to their multiple modules) were assigned to their maximal modules only.
Figure 3
Figure 3. Overlapping modules of a word-association network.
Modules of the University of South Florida word association network were determined using the LinkLand influence function calculation method and the TotalHill module membership assignment method. During the post-processing of the module assignment, we merged the modules with ProportionalHill module membership assignment-based correlation higher than 0.9 (see Section VI. in the Electronic Supplementary Material S1, we received similar results without this merging process; data not shown). The network was laid out using the Kamada-Kawai algorithm of Graphviz and visualized using a custom program written in Python language using OpenGL graphics. Links were colored in proportion to the colors of the modules they belong. Panel A: modules around the antagonym word, “terrific”. Panel B: modules around the heteronym word, “content”. In addition to the selected words “terrific” and “content” similar words above a similarity threshold of 10% are also shown with a contrast corresponding to their degree of similarity. The extent of similarity between two words was calculated as the sum of the two pair-wise minima of their unity-normalized module membership vector giving the membership assignment strength of the given word to all modules of the network (for more details see Section V.6.e. in the Electronic Supplementary Material S1).
Figure 4
Figure 4. Overlapping modules of a school-friendship network.
We have determined the modular structure of Community-44 of the Add Health survey using the LinkLand influence function calculation method together with the ProportionalHill module membership assignment method. During the post-processing of the module assignment, we merged the modules with ProportionalHill module membership assignment-based correlation higher than 0.9 (see Section VI. in the Electronic Supplementary Material S1, we received similar results without this merging process; data not shown). Panel A: modules of Community-44. The school friendship network was laid out using the Kamada-Kawai algorithm. Nodes represent the individual students, and were colored according to the color of the friendship module they assigned the most. We show the modular structure of the first hierarchical level having 18 modules. The inset of Panel A shows color-codes of the modules with an area proportional to the size of the respective module. Panel B: the number of network modules in case of boys (blue, solid bars) and girls (red-black hatched bars) with mixed racial contents at the lowest hierarchical level (level 0). The extent of mixed racial content was monitored using the “effective number of races” (Section V.6.b. in the Electronic Supplementary Material S1) with a bin-size of 0.5. Panel C: overlaps of boys and girls in friendship circles. The number of boys (blue, solid bars) and girls (red-black hatched bars) having different overlaps in friendship circles were determined in the first hierarchical level with a bin-size of 1. Overlap was measured as the “effective number” (Section V.6.b. in the Electronic Supplementary Material S1) of modules of the given student.
Figure 5
Figure 5. Determination of key nodes of the USA Western Power Grid network.
The figure shows the decreasing integrity of the USA Western Power Grid network as a function of the number of nodes removed. Nodes were removed in the order of their decreasing degree (black alternating dashes and dots) betweenness centrality (red dashed lines) or “bridgeness” (solid blue lines), where “bridgeness” measures the overlap of the given node between different modules as described in detail in Section V.6.d. in the Electronic Supplementary Material S1. Network integrity has been calculated after Latora and Marchiori . Bridgeness was calculated from the modular structure of the lowest hierarchical level as determined by the LinkLand influence function calculation method and the TotalHill module membership assignment method. During the post-processing of the module assignment, we merged the modules with ProportionalHill module membership assignment-based correlation higher than 0.9 (see Section VI. in the Electronic Supplementary Material S1, we received similar results without this merging process; data not shown). On the vertical axis of the insets the betweenness centrality (left, color-coded from red to yellow) and bridgeness (right, color-coded from blue to green) of the nodes of the USA Western Power Grid network are shown. Networks on the insets were laid out using the Kamada-Kawai algorithm and visualized with a custom Blender script.
Figure 6
Figure 6. Prediction of the dynamical behavior of network nodes: segregation of date- and party-hubs based on their modular overlaps.
Overlapping modules of the yeast protein-protein interaction network of Ekman et al. were identified using the LinkLand influence function calculation method with the TotalHill module membership assignment method using the modular structure of the lowest level of hierarchy. During the post-processing of the module assignment, we merged the modules with ProportionalHill module membership assignment-based correlation higher than 0.9 (see Section VI. in the Electronic Supplementary Material S1, we received similar results without this merging process; data not shown). Panel A: 3D view of the yeast protein-protein interaction network. The underlying 2D network layout was set by the Kamada-Kawai algorithm. The vertical positions reflect the community landscape values of the nodes on a linear scale. Nodes were colored as the module of their maximum membership. Panel B: centrality and bridgeness of yeast date- and party-hubs. Hubs having more than 8 neighbors and non-hubs with less neighbors were positioned on the scattergram according to their ModuLand centrality (x-axis, the height of the community landscape) and ModuLand bridgeness (y-axis) as defined in Section V.6.d. in the Electronic Supplementary Material S1. Date- and party-hubs are marked with red circles and blue triangles, respectively, while non-hub proteins are represented by gray crosses. The inset shows a double logarithmic plot of hubs with large centrality.

References

    1. Fortunato S. Community detection in graphs. Phys Rep. 2010;486:75–174.
    1. Girvan M, Newman ME. Community structure in social and biological networks. Proc Natl Acad Sci U S A. 2002;99:7821–7826. - PMC - PubMed
    1. Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D. Defining and identifying communities in networks. Proc Natl Acad Sci U S A. 2004;101:2658–2663. - PMC - PubMed
    1. Nepusz T, Petróczi A, Négyessy L, Bazsó F. Fuzzy communities and the concept of bridgeness in complex networks. Phys Rev E. 2008;77:016107. - PubMed
    1. Rosvall M, Bergstrom CT. Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci U S A. 2008;105:1118–1123. - PMC - PubMed

Publication types