Proximity-Based Compression for Network Embedding

Muhammad Ifte Islam et al. Front Big Data. 2021 Jan 26;3:608043. doi: 10.3389/fdata.2020.608043. eCollection 2020.

Abstract

Network embedding, which encodes the structural information of a graph into a low-dimensional vector space, has proven essential for network analysis applications such as node classification and community detection. Although recent methods show promising performance on various applications, graph embedding still faces challenges: the sheer size of a graph may preclude direct application of existing embedding methods, and those methods can lose accuracy to locality effects and noise. In this paper, we propose a novel network embedding method, NECL, that generates embeddings more efficiently and effectively. Our goal is to answer two questions: 1) Does network compression significantly speed up learning? 2) Does network compression improve the quality of the representation? To this end, first, we propose a novel graph compression method based on neighborhood similarity that compresses the input graph to a smaller graph by merging vertices with similar local proximity into super-nodes; second, we embed the compressed graph instead of the original large graph, which both reduces the embedding cost and captures the global structure of the original graph; third, we refine the embeddings from the compressed graph back to the original graph. NECL is a general meta-strategy that improves the efficiency and effectiveness of many state-of-the-art graph embedding algorithms based on node proximity, including DeepWalk, Node2vec, and LINE. Extensive experiments validate the efficiency and effectiveness of our method, which decreases embedding time and improves classification accuracy, as evaluated on single- and multi-label classification tasks with large real-world graphs.
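The compression step described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual Algorithm 1: it assumes Jaccard similarity over neighbor sets as the "neighborhood similarity," a single greedy pass over the edges, and a threshold parameter corresponding to the paper's λ; the names `compress_graph` and `lam` are illustrative.

```python
from collections import defaultdict

def jaccard(a, b):
    """Jaccard similarity between two neighbor sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def compress_graph(edges, lam=0.5):
    """Greedily merge adjacent node pairs whose neighborhoods have
    Jaccard similarity >= lam into super-nodes (cf. the paper's λ).
    Returns (compressed edge list, node -> super-node mapping)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    mapping = {v: v for v in adj}
    # One greedy pass: merge the endpoints of any sufficiently similar edge.
    for u, v in edges:
        ru, rv = mapping[u], mapping[v]
        if ru != rv and jaccard(adj[u], adj[v]) >= lam:
            for n, r in mapping.items():
                if r == rv:
                    mapping[n] = ru
    # A super-node inherits the union of its members' neighbors;
    # self-loops created by merging are dropped.
    new_edges = {(min(mapping[u], mapping[v]), max(mapping[u], mapping[v]))
                 for u, v in edges if mapping[u] != mapping[v]}
    return sorted(new_edges), mapping
```

In NECL, an embedding method such as DeepWalk would then be run on the compressed edge list, and the resulting super-node vectors propagated back to the original vertices via the mapping.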

Keywords: graph classification; graph compression; graph representation learning; network embedding; node similarity.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1. Example of graph compression on the Les Misérables network (Original Network (A), Compressed Network (B), Embedding of Original Network (C), and Embedding of Compressed Network (D)).
FIGURE 2. Example of graph compression: nodes a and b are merged into super-node ab, which is connected to the neighbors of both a and b.
Algorithm 1. Graph Compressing (G, λ).
Algorithm 2. NECL: Network Embedding on Compressed Graph.
FIGURE 3. Detailed classification results on Citeseer.
FIGURE 4. Detailed classification results on Wiki.
FIGURE 5. Detailed classification results on DBLP.
FIGURE 6. Detailed classification results on BlogCatalog.
FIGURE 7. Run time analyses for different similarity threshold values λ (Citeseer (A), Wiki (B), DBLP (C), and BlogCatalog (D)).
FIGURE 8. The ratio of vertices/edges of the compressed graphs to that of the original graphs (Citeseer (A), Wiki (B), DBLP (C), and BlogCatalog (D)).
FIGURE 9. Detailed comparisons of classification results on Citeseer.
FIGURE 10. Detailed comparisons of classification results on Wiki.
FIGURE 11. Detailed comparisons of classification results on DBLP.
FIGURE 12. Detailed comparisons of classification results on BlogCatalog.

