Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2020 Jul 30;10(1):12884.
doi: 10.1038/s41598-020-69795-1.

Exploiting graphlet decomposition to explain the structure of complex networks: the GHuST framework

Affiliations

Exploiting graphlet decomposition to explain the structure of complex networks: the GHuST framework

Rafael Espejo et al. Sci Rep. .

Abstract

The characterization of topology is crucial in understanding network evolution and behavior. This paper presents an innovative approach, the GHuST framework to describe complex-network topology from graphlet decomposition. This new framework exploits the local information provided by graphlets to give a global explanation of network topology. The GHuST framework is comprised of 12 metrics that analyze how 2- and 3-node graphlets shape the structure of networks. The main strengths of the GHuST framework are enhanced topological description, size independence, and computational simplicity. It allows for straight comparison among different networks disregarding their size. It also reduces the complexity of graphlet counting, since it does not use 4- and 5-node graphlets. The application of the novel framework to a large set of networks shows that it can classify networks of distinct nature based on their topological properties. To ease network classification and enhance the graphical representation of them, we reduce the 12 dimensions to their main principal components. Furthermore, the 12 dimensions are easily interpretable. This enables the connection between complex-network analyses and diverse real applications.

PubMed Disclaimer

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Figure 1
2- to 5-node graphlets (from G0 to G29) and their automorphism orbits (from 0 to 72). For each graphlet, nodes in the same automorphism orbits are identified with the same color (e.g. all blue nodes in G1 are in O1, they are in a symmetric position in the graphlet, the green node is in a different topological position, it is in O2).
Figure 2
Figure 2
(a) Variance explained (%) and accumulated variance explained (%) by each of the principal components resulting from the PCA analysis to a set of 1,404 networks. As can be seen, the first 3 principal components summarize more than 90% of the variance of the original data. (b) Graphical representation of the 1,404 networks in the 3-d space defined by the three first principal components. In this representation, several clusters can be appreciated, corresponding to networks with similar topological structure. (c) 2-d projections of the 3-d representation of the 1,404 networks in the space defined by the three first principal components. (d) Loadings of ρ for the three first principal components. As there are no null coefficients, it can be concluded that there are no irrelevant dimensions in the proposed 12 dimensional metric.
Figure 3
Figure 3
Range of variation of each metric dimension, ρi, (green area) and median value of each ρi for the seven set of networks analyzed: autonomous systems, enzymes, Facebook, power networks, retweet, roads and webs. Networks with different structures exhibit a distinct distribution of the ρi values, as highlighted by these charts.
Figure 4
Figure 4
Contributions of each dimension of the GHuST framework to the first principal component obtained for each set of networks analyzed. Larger loadings are associated with a high variance in the original dataset, providing information about the topological differences between networks of the same set.
Figure 5
Figure 5
Variance explained and cumulative variance explained by each of the principal components of the resulting PCA applied independently to each type of network analyzed. As can be seen, in all cases using the first 3 principal components account for more than 90% of the variance of the original dataset.

Similar articles

Cited by

References

    1. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang D-U. Complex networks: Structure and dynamics. Phys. Rep. 2006;424:175–308. doi: 10.1016/j.physrep.2005.10.009. - DOI
    1. Newman M. Networks: An Introduction. Oxford: Oxford University Press; 2010.
    1. Alon U. Network motifs: Theory and experimental approaches. Nat. Rev. Genet. 2007;8:450–461. doi: 10.1038/nrg2102. - DOI - PubMed
    1. Palla G, Derényi I, Farkas I, Vicsek T. Uncovering the overlapping community structure of complex networks in nature and society. Nature. 2005;435:814–818. doi: 10.1038/nature03607. - DOI - PubMed
    1. Barabási A, Bonabeau E. Scale-free networks. Sci. Am. 2003;288:60–69. doi: 10.1038/scientificamerican0503-60. - DOI - PubMed