Proc Natl Acad Sci U S A. 2018 Jul 17;115(29):7468-7472.

doi: 10.1073/pnas.1710547115. Epub 2018 Jul 3.

Local structure can identify and quantify influential global spreaders in large scale social networks

Yanqing Hu¹, Shenggong Ji², Yuliang Jin³, Ling Feng⁴,⁵, H Eugene Stanley⁶, Shlomo Havlin⁷

Affiliations

¹ School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China; huyanq@mail.sysu.edu.cn.
² School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China.
³ Key Laboratory for Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China.
⁴ Computing Science, Institute of High Performance Computing, Agency for Science, Technology, and Research, Singapore 138632.
⁵ Department of Physics, National University of Singapore, Singapore 117551.
⁶ Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215; hes@bu.edu.
⁷ Department of Physics, Bar-Ilan University, Ramat-Gan 52900, Israel.

PMID: 29970418
PMCID: PMC6055149
DOI: 10.1073/pnas.1710547115

Yanqing Hu et al. Proc Natl Acad Sci U S A. 2018.

Abstract

Measuring and optimizing the influence of nodes in big-data online social networks are important for many practical applications, such as viral marketing and the adoption of new products. As viral spreading on a social network is a global process, it is commonly believed that measuring the influence of nodes inevitably requires knowledge of the entire network. Using percolation theory, we show that the spreading process displays a nucleation behavior: Once a piece of information spreads from the seeds to more than a small characteristic number of nodes, it reaches a point of no return and will quickly reach the percolation cluster, regardless of the entire network structure; otherwise the spreading will be contained locally. Thus, we find that, without knowledge of the entire network, any node's global influence can be accurately measured using this characteristic number, which is independent of the network size. This motivates an efficient algorithm with constant time complexity for the long-standing problem of selecting the best seed spreaders, with performance remarkably close to the true optimum.
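
The nucleation picture can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the usual mapping of SIR spreading with transmission probability `beta` onto bond percolation, grows the cluster around a seed, and stops as soon as more than `m` nodes have been reached, where `m` stands in for the characteristic number. The function names, the adjacency-dictionary format, and the parameters are all hypothetical choices made for illustration.

```python
import random
from collections import deque


def truncated_cluster_size(adj, seed, beta, m, rng):
    """Grow a bond-percolation cluster from `seed`; stop once it exceeds m nodes.

    adj: dict mapping each node to a list of neighbours (undirected graph).
    Returns the number of nodes reached, capped at m + 1.
    """
    reached = {seed}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            # Each bond is tested at most once, and is open with probability beta.
            if v not in reached and rng.random() < beta:
                reached.add(v)
                if len(reached) > m:
                    return len(reached)  # past the threshold: counted as viral
                queue.append(v)
    return len(reached)  # spreading died out locally


def viral_probability(adj, seed, beta, m, runs=1000, rng=None):
    """Fraction of truncated runs in which the cluster grows beyond m nodes."""
    rng = rng or random.Random(0)
    hits = sum(truncated_cluster_size(adj, seed, beta, m, rng) > m
               for _ in range(runs))
    return hits / runs
```

Because every run explores at most `m + 1` nodes and their neighbours, the cost of one estimate in this sketch depends on `m` and the local degrees, not on the total number of nodes.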

Keywords: complex network; influence; percolation; social media; viral marketing.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Fig. 1.**
Two-phase phenomena. (A) Examples of simulated local (1, 3, 5) and viral (2, 4, 6) SIR spreadings in the NOLA Facebook network ($\beta = 0.02$, $\beta_{c} = 0.01$). We start the simulation from a randomly chosen node (red, $k = 27$). The active and nonactive nodes are colored in orange and white, respectively. (B) An illustration of giant (*Left*) and finite (*Right*) clusters in a bond percolation process. (C) The spreading probability distribution $g(i, s)$ (columns) is plotted together with the cluster size distribution function $p(s)$ (green line) obtained from percolation. Note that $p(s)$ is the average of $g(i, s)$ over all nodes. In this example, we use the same seed node $i$ as in A, but other randomly chosen nodes give similar bimodal distributions, with the same viral peak at $s^{\infty}$ (*SI Appendix*, section II). (D) The finite part $p_{f}(s)$ (green circles) of $p(s)$ is fitted to Eq. 5 (black solid line) to obtain the characteristic size $s^{*} = 32.9 \pm 0.6$ and the exponent $\tau = 2.50$. (D, *Inset*) The characteristic size $s^{*}$ is fitted to a power-law divergence near the critical point $\beta_{c}$, with a non–mean-field exponent $\sigma = 1.05$. The same network and the same $\beta$ are used in *A–D*.

**Fig. 2.**
Spreading power. (A) Comparison between the truncated spreading power $\tilde{S}(i)$ (Eq. 4) and the exact spreading power $S(i)$ (Eq. 1) in NOLA Facebook and Macau Weibo ($\beta_{c} = 0.05$) networks, where each point represents one node. (B) The $m$ dependence of the relative error $E^{r}(i, m)$ of nodes whose degrees are equal to the average degree $\langle k \rangle$, in the NOLA Facebook ($\beta = 0.02$) network. (B, *Inset*) The $m$ dependence of the relative error $E^{r}(i, m)$ of nodes with different degrees. The relative error decreases quickly with $m$ and becomes smaller than $1\%$ when $m > s^{*}$. (C) Comparison among the influence radius $\ell^{*}$, the average distance of the farthest nodes from the seed nodes $\ell_{\infty}^{*}$, and the network diameter $D$ in nine OSNs and two random networks [an ER network with $N = 50{,}000$, $\langle k \rangle = 10$ and a scale-free (SF) network with $N = 50{,}000$, $P(k) \sim k^{-2.5}$]. We choose $\beta$ in different networks such that the fraction of the giant component is the same; i.e., $s^{\infty} = 0.3N$ (see *SI Appendix*, section I for the real networks description). (D) The NOLA Facebook influence radius $\ell^{*}$ is smaller than both $\ell_{\infty}^{*}$ and $D$ for any $\beta > \beta_{c}$.

**Fig. 3.**
Algorithm time complexity. (A) Comparison of the computational execution count of the percolation-based greedy algorithm (PBGA) and the natural greedy algorithm (NGA) (8) in ER networks with $\beta = 0.2$ ($\beta_{c} = 0.1$). The algorithms select the set of $M = 10$ most influential nodes from $L = 1{,}000$ candidates with degree equal to $\langle k \rangle = 10$. Unlike that of the NGA, the PBGA's computational complexity is independent of network size. (B) Comparison of the computational execution count (rescaled by $\langle k \rangle$ and $m$) of the same algorithms in real OSNs (open symbols, from left to right: CA-GrQc, CA-HepTh, Macau Weibo, Email-Enron, NOLA Facebook, DBLP, Delicious, QQ, and LiveJournal; see *SI Appendix*, section I for the real networks description). The solid symbols are values of the execution count extrapolated to the size of the whole Twitter and Facebook networks. The value of $\beta$ is chosen such that the giant component size is 30% of the network size; i.e., $s^{\infty} = 0.3N$ in each OSN.
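
To make the size-independence claim concrete, here is a rough sketch of a greedy selection loop that relies only on truncated local cluster growth. It is not the PBGA itself, whose details are not given on this page. It assumes that a seed set counts as successful in one percolation realization if the cluster of at least one seed grows beyond `m` nodes, and that node labels are mutually comparable (for example, integers). All names and parameters are hypothetical.

```python
import random
from collections import deque


def any_seed_goes_viral(adj, seeds, beta, m, rng):
    """One percolation realization: does any seed's cluster exceed m nodes?"""
    edge_open = {}  # bond states sampled lazily and shared by all seeds

    def is_open(u, v):
        key = (u, v) if u <= v else (v, u)  # assumes comparable node labels
        if key not in edge_open:
            edge_open[key] = rng.random() < beta
        return edge_open[key]

    for s in seeds:
        reached = {s}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in reached and is_open(u, v):
                    reached.add(v)
                    if len(reached) > m:
                        return True  # truncated: no need to explore further
                    queue.append(v)
    return False


def greedy_select(adj, candidates, M, beta, m, runs=200, rng=None):
    """Greedily pick up to M seeds maximizing the estimated success fraction."""
    rng = rng or random.Random(0)
    chosen, remaining = [], list(candidates)
    while remaining and len(chosen) < M:
        scores = {
            c: sum(any_seed_goes_viral(adj, chosen + [c], beta, m, rng)
                   for _ in range(runs)) / runs
            for c in remaining
        }
        best = max(remaining, key=lambda c: scores[c])
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

In this sketch each evaluation touches at most roughly `m` nodes per seed, so the total work scales with the number of candidates, `M`, `m`, `runs`, and the local degrees rather than with the network size.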

**Fig. 4.**
Algorithm performance on real online social networks. For (A) NOLA Facebook ($\beta = 0.012$) and (B) Macau Weibo ($\beta = 0.055$) networks, we compare the performance of the PBGA with that of other algorithms: brute-force search (BFS), degree discount heuristic (DDH) (8), eigenvector method (EM), genetic algorithm (GA), maximum betweenness (MB) (10), maximum closeness (MC) (11), maximum degree (MD), maximum Katz (MK) index (12), maximum k-shell (MKS) (2), NGA, and MCI (4). The candidate nodes are randomly selected from the nodes with median degree: degree is 10 for nodes in Facebook and out-degree is 3 for nodes in Weibo. Since the candidate nodes have the same degree, the MD method is equivalent to the random selection of seed nodes. $S(V)$ is normalized by dividing by the giant component size $s^{\infty}$. Here $L = 100$ candidates and $M$ varies from 1 to 100. (A and B, *Insets*) The regime $1 \leq M \leq 6$ is enlarged, where the rigorous optimum obtained from BFS is available. (C) On the Facebook network, the combined rigorous lower bound $PR_{\mathrm{comb}}^{\min}$ and the approximated lower bound $PR_{\mathrm{approx}}^{\min}$ are plotted together with the submodular lower bound $PR_{\mathrm{submod}}^{\min} = 0.63$ (17), as functions of $M$. (D) The relative performance between the PBGA solution based on $\beta = 0.02$ ($\beta = 0.05$) and the PBGA solution based on the other $\beta$ values, with both performances $\tilde{V}_{0}$ and $\tilde{V}_{\beta}$ estimated at the same spreading rate $\beta$. The solid symbols (blue star and red diamond) label the performance of 1 when $\beta = \beta_{0}$. We see that for $\beta > \beta_{0}$ the solution at $\beta_{0}$ can be used at $\beta$, since their performances are almost the same, provided both $\beta_{0}$ and $\beta$ are larger than the critical point $\beta_{c} \approx 0.01$.

**Fig. 5.**
Performance comparison between the PBGA and MCI. The vertical axis is the ratio between the seeds' influence of the PBGA and that of MCI. For a small number $M$ of seed nodes, the PBGA significantly outperforms MCI. The difference diminishes as $M$ increases, when both solutions approach the theoretical maximum of the giant component size. The simulation is carried out on the Facebook network with $\beta = 0.012$.

Cited by

Influential Nodes Identification in Complex Networks via Information Entropy.
Guo C, Yang L, Chen X, Chen D, Gao H, Ma J. Entropy (Basel). 2020 Feb 21;22(2):242. doi: 10.3390/e22020242. PMID: 33286016. Free PMC article.
Systematic comparison between methods for the detection of influential spreaders in complex networks.
Erkol Ş, Castellano C, Radicchi F. Sci Rep. 2019 Oct 22;9(1):15095. doi: 10.1038/s41598-019-51209-6. PMID: 31641200. Free PMC article.
Beyond network centrality: individual-level behavioral traits for predicting information superspreaders in social media.
Zhou F, Lü L, Liu J, Mariani MS. Natl Sci Rev. 2024 Mar 1;11(7):nwae073. doi: 10.1093/nsr/nwae073. eCollection 2024 Jul. PMID: 38883306. Free PMC article.
Spreading dynamics of information on online social networks.
Meng F, Xie J, Sun J, Xu C, Zeng Y, Wang X, Jia T, Huang S, Deng Y, Hu Y. Proc Natl Acad Sci U S A. 2025 Jan 28;122(4):e2410227122. doi: 10.1073/pnas.2410227122. Epub 2025 Jan 23. PMID: 39847317. Free PMC article.
Detecting and modelling real percolation and phase transitions of information on social media.
Xie J, Meng F, Sun J, Ma X, Yan G, Hu Y. Nat Hum Behav. 2021 Sep;5(9):1161-1168. doi: 10.1038/s41562-021-01090-z. Epub 2021 Apr 1. PMID: 33795858.

References

1. Rust RT, Oliver RW. The death of advertising. J Advert. 1994;23:71–77.
2. Kitsak M, et al. Identification of influential spreaders in complex networks. Nat Phys. 2010;6:888–893.
3. Wang P, et al. Understanding the spreading patterns of mobile phone viruses. Science. 2009;324:1071–1076.
4. Morone F, Makse HA. Influence maximization in complex networks through optimal percolation. Nature. 2015;524:65–68.
5. Aral S, Walker D. Identifying influential and susceptible members of social networks. Science. 2012;337:337–341.
