Review

Sensors (Basel). 2023 Apr 21;23(8):4168. doi: 10.3390/s23084168.

Graph Representation Learning and Its Applications: A Survey

Van Thuy Hoang et al.

Abstract

Graphs are data structures that effectively represent relational data in the real world. Graph representation learning is a significant task since it facilitates various downstream tasks, such as node classification and link prediction. Graph representation learning aims to map graph entities to low-dimensional vectors while preserving graph structure and entity relationships. Over the decades, many models have been proposed for graph representation learning. This paper aims to provide a comprehensive picture of graph representation learning models, including traditional and state-of-the-art models on various graphs in different geometric spaces. First, we begin with five types of graph embedding models: graph kernels, matrix factorization models, shallow models, deep-learning models, and non-Euclidean models. We also discuss graph transformer models and Gaussian embedding models. Second, we present practical applications of graph embedding models, from constructing graphs for specific domains to applying models to solve tasks. Finally, we discuss challenges for existing models and future research directions in detail. As a result, this paper provides a structured overview of the diversity of graph embedding models.

Keywords: graph embedding; graph neural networks; graph representation learning; graph transformer.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
The popularity of graph representation learning models in the Scopus database. The line plot shows the number of publications per year for different types of graph representation learning models from 2010 to 2022. There are eight keywords: graph representation learning (GRL), graph kernels (GK), matrix factorization-based graph embedding (MF), graph neural networks (GNNs), graph autoencoders (GAE), graph convolutional networks (GCNs), graph transformers (GT), and non-Euclidean graph embedding (NEGE). Nineteen representative models are shown, including DeepWalk [14], GraRep [15], LINE [16], GGCN [17], GCN [18], HOPE [5], Node2Vec [4], GAT [19], Metapath2Vec [20], Struc2Vec [21], GraphSAGE [22], G2G [23], GIN [24], HGAT [25], DGI [26], HGNN [27], GCNII [28], GT [29], and EGT [30].
Figure 2
A comprehensive view of graph embedding. Given a sparse, high-dimensional graph G=(V,E), where V and E denote the sets of nodes and edges, respectively, graph embedding learning aims to find a function ϕ that maps nodes from graph space to a d-dimensional vector space with d ≪ |V|.
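To make the mapping ϕ concrete, here is a minimal Python sketch of an embedding table for a toy node set; the node names, the dimension d, and the random initialization are illustrative assumptions, since a real model would learn the vectors so that graph structure is preserved.

```python
import numpy as np

nodes = ["v1", "v2", "v3", "v4", "v5"]     # the node set V (hypothetical)
d = 2                                      # embedding dimension, with d << |V| in practice
rng = np.random.default_rng(0)

# Embedding table: one d-dimensional row per node. A real model would learn
# these rows so that neighboring nodes end up with similar vectors.
Z = rng.normal(size=(len(nodes), d))
index = {v: i for i, v in enumerate(nodes)}

def phi(v):
    """The embedding map phi: V -> R^d, implemented as a table lookup."""
    return Z[index[v]]

print(phi("v3"))
```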
Figure 3
Methods for modeling dynamic graphs over time. (a) The representation of a dynamic graph as a series of snapshots; (b) The evolution of edges and nodes in the dynamic graph from time t to t+1. In (a), the graph G is the collection of snapshots G(t) (i.e., G={G(1), G(2), …, G(t)}), where t is the time span, and the entities of G change from time t to t+1. (b) depicts the evolution of edges in the same dynamic graph as (a), in which each edge carries its series of time spans from t to t+1. At time t, the graph has five nodes (v1, v2, v3, v4, v5) and five edges (e13, e15, e34, e45, e23). At time t+1, the edge e23 and node v2 are removed, and a new node v6 and a new edge e56 are added to the graph.
Figure 4
The proposed taxonomy for graph representation learning models.
Figure 5
The Weisfeiler–Lehman (WL) isomorphism test. (a) Original labels, i=0; (b) Relabeled labels, i=1. Two iterations of WL relabeling are shown for a graph with five nodes v1, v2, v3, v4, v5. In (a), the labels of the five nodes are initialized. In (b), in the first iteration, each node is assigned a new label computed from the labels of its adjacent nodes. For example, node v1 is adjacent to nodes v2 and v3; therefore its signature is (v1, v2, v3), which is reassigned as the new label v6. The same steps are repeated until the node labels reach a steady state.
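As an illustration of the relabeling step, the following Python sketch performs one WL iteration. The edge list is assumed for the example (only the v1–v2 and v1–v3 adjacencies are stated in the caption), and new labels are issued in the style v6, v7, ….

```python
from collections import defaultdict

# Hypothetical edge list for the five-node example; only the adjacency of v1
# with v2 and v3 is stated in the caption, the remaining edges are assumed.
edges = [("v1", "v2"), ("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5")]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

labels = {v: v for v in adj}   # i = 0: each node starts with its own label

def wl_iteration(labels):
    """One WL step: relabel each node by (own label, sorted neighbor labels)."""
    signatures = {v: (labels[v], tuple(sorted(labels[u] for u in adj[v]))) for v in adj}
    new_names, relabeled = {}, {}
    for v in sorted(signatures):
        sig = signatures[v]
        if sig not in new_names:                       # compress distinct signatures
            new_names[sig] = f"v{len(adj) + len(new_names) + 1}"
        relabeled[v] = new_names[sig]
    return relabeled

labels = wl_iteration(labels)   # i = 1: relabeled nodes, as in panel (b)
print(labels)                   # repeated until the labels reach a steady state
```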
Figure 6
Node sampling techniques. (a) k-hop sampling; (b) Random-walk sampling. Node vs is the source node and vt is the target node in the graph. In (a), the k-hop proximity sampling strategy begins from the source node vs: the green nodes form the 1st-hop proximity of vs, and the blue and black nodes form its 2nd-hop and 3rd-hop proximity, respectively. In (b), the random-walk sampling strategy takes a random walk (red arrow) from the source node vs to the target node vt.
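For panel (b), the sketch below samples a uniform random walk on a small assumed adjacency list; the graph, the walk length, and the random seed are hypothetical and only illustrate the strategy.

```python
import random

rng = random.Random(0)

# Hypothetical adjacency list containing a source node vs and a target node vt.
adj = {
    "vs": ["v1", "v2"],
    "v1": ["vs", "v2", "vt"],
    "v2": ["vs", "v1"],
    "vt": ["v1"],
}

def random_walk(start, steps):
    """Sample a walk by repeatedly moving to a uniformly random neighbor."""
    walk = [start]
    for _ in range(steps):
        walk.append(rng.choice(adj[walk[-1]]))
    return walk

print(random_walk("vs", 4))   # a walk starting at vs that may reach vt
```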
Figure 7
Sampling strategies in the Node2Vec and WalkLets models. (a) Sampling strategy in the Node2Vec model; (b) Sampling strategy in the WalkLets model. In (a), there are two parameters: the return parameter p and the in–out parameter q, and the values 1, 1/p, and 1/q act as unnormalized transition probabilities. Having just moved from node u to node v, the random walk chooses the next node according to these probabilities. In (b), assume a random path from the DeepWalk model of the form (v1 → v2 → v3 → v4); the corpus of random-walk pairs up to scale k=3 is then: A1={(v1,v2),(v2,v3),(v3,v4)}, A2={(v1,v3),(v2,v4)}, and A3={(v1,v4)}.
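Both ideas in this figure can be sketched in a few lines of Python: the Node2Vec return/in–out bias and the WalkLets corpus of node pairs at each scale k. The helper names are hypothetical, and the bias function only returns the unnormalized step weights.

```python
def node2vec_weight(prev, curr, candidate, adj, p, q):
    """Unnormalized Node2Vec weight for stepping from `curr` to `candidate`,
    given that the walk just arrived from `prev`."""
    if candidate == prev:            # return to the previous node: weight 1/p
        return 1.0 / p
    if candidate in adj[prev]:       # stay at distance 1 from prev: weight 1
        return 1.0
    return 1.0 / q                   # move outward (distance 2 from prev): weight 1/q

def walklets_pairs(walk, k):
    """WalkLets corpus A_k: node pairs exactly k steps apart along the walk."""
    return [(walk[i], walk[i + k]) for i in range(len(walk) - k)]

walk = ["v1", "v2", "v3", "v4"]                 # the DeepWalk path from the caption
print(walklets_pairs(walk, 1))                  # A1 = [(v1, v2), (v2, v3), (v3, v4)]
print(walklets_pairs(walk, 2))                  # A2 = [(v1, v3), (v2, v4)]
print(walklets_pairs(walk, 3))                  # A3 = [(v1, v4)]
```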
Figure 8
Sampling strategy of the Sub2Vec model. Assume two subgraphs G1={v1,v2,v3,v4} and G2={v5,v6,v7,v9}. For neighborhood properties, the model applies random-walk sampling to all nodes in subgraphs G1 and G2 to capture the subgraph structure. For structural properties, a ratio of node degrees is used during sampling. With a random-walk path of length 3, the degree path for G1 is 0.75 → 0.75 → 0.75, while the degree path from node v5 to v9 is 0.25 → 0.75 → 0.25.
Figure 9
Random-walk sampling based on motifs. (a) Random-walk sampling; (b) Motif-based random-walk sampling. (a) presents a random-walk path from node v1 to v7: v1 → v3 → v4 → v5 → v7. In (b), the motif-based path is: (v1,v2,v3) → (v2,v3,v4) → (v2,v4,v5) → (v4,v5,v6).
Figure 10
Updating random-walk paths in the corpus for dynamic graphs. At time t, the graph has three nodes v1, v2, v3 and two edges (v1,v2) and (v2,v3). Assuming a random-walk length of 3, the set of random walks is: (v1,v2,v1), (v1,v2,v3), (v2,v1,v2), (v2,v3,v2), (v3,v2,v1), (v3,v2,v3). At time t+1, the graph gains a new node v4 and a new edge (v2,v4). The new random walks added to the corpus are: (v4,v2,v1), (v4,v2,v3), and (v4,v2,v4).
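The update in this example can be reproduced with a short Python sketch that enumerates the walks starting from the newly added node v4; exhaustively enumerating walks (rather than sampling them) is a simplification for illustration.

```python
def walks_from(adj, start, length):
    """All walks with `length` nodes that start at `start` and follow graph edges."""
    walks = [[start]]
    for _ in range(length - 1):
        walks = [w + [u] for w in walks for u in adj[w[-1]]]
    return [tuple(w) for w in walks]

# Adjacency at time t+1, after adding node v4 and the edge (v2, v4).
adj_t1 = {"v1": ["v2"], "v2": ["v1", "v3", "v4"], "v3": ["v2"], "v4": ["v2"]}

new_walks = walks_from(adj_t1, "v4", 3)
print(new_walks)   # [('v4', 'v2', 'v1'), ('v4', 'v2', 'v3'), ('v4', 'v2', 'v4')]
```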
Figure 11
The edge- and node-collapsing strategies of the HARP model. (a) Edge compression; (b) Node compression. In (a), the super nodes v12 and v34 are formed by merging edges e12 and e34, respectively. In (b), the super nodes v12 and v34 are formed by merging the node pairs (v1,v2) and (v3,v4), respectively.
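A minimal Python sketch of the node-collapsing step, under an assumed edge list: each selected node pair is merged into a super node and its incident edges are rewired, dropping edges that become self-loops.

```python
def collapse(edges, pairs):
    """Merge each (u, v) pair into a super node 'u,v' and rewire the edge list."""
    rename = {}
    for u, v in pairs:
        rename[u] = rename[v] = f"{u},{v}"
    coarse = set()
    for u, v in edges:
        cu, cv = rename.get(u, u), rename.get(v, v)
        if cu != cv:                        # drop edges collapsed into self-loops
            coarse.add(tuple(sorted((cu, cv))))
    return coarse

edges = [("v1", "v2"), ("v2", "v3"), ("v3", "v4"), ("v1", "v4")]   # hypothetical graph
print(collapse(edges, pairs=[("v1", "v2"), ("v3", "v4")]))
# {('v1,v2', 'v3,v4')} -- one super edge between the two super nodes
```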
Figure 12
The self-centered network of the NEWEE model. For instance, the self-centered network of node v2 could be defined as the subgraph G′=(V′,E′), where V′={v1,v2,v3,v4,v5} and E′ is the set of edges among these nodes.
Figure 13
The architecture of the SDNE model. The node features xi and xj are the inputs of the SDNE model. The encoder layers compress the feature vectors xi and xj into vectors Zi and Zj in the latent space. The decoder layers aim to reconstruct the node features.
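A minimal PyTorch sketch of the encoder/decoder idea in this figure: a node's input vector x_i is compressed to a latent vector z_i and then reconstructed. The layer sizes are assumptions, and the first-order proximity loss that couples the pairs (x_i, x_j) is omitted for brevity.

```python
import torch
import torch.nn as nn

n_features, d = 100, 16     # assumed input dimension and embedding size

encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, d))
decoder = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, n_features))

x_i = torch.rand(1, n_features)     # e.g. the adjacency row of node i
z_i = encoder(x_i)                  # latent embedding Z_i
x_hat = decoder(z_i)                # reconstruction of the node's features

# Second-order (reconstruction) term; SDNE additionally re-weights non-zero entries.
loss = nn.functional.mse_loss(x_hat, x_i)
print(z_i.shape, loss.item())
```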
Figure 14
The architecture of the DynGEM model. Similar to the SDNE model, the DynGEM model captures the 1st-order and 2nd-order proximity between two nodes with its encoder and decoder layers. The difference is that the embedding parameters θt at time t are updated from the parameters θt−1 at time t−1.
Figure 15
An example of the Topo-LSTM model. Given a cascade sequence S={(v1,1),(v2,2),(v3,3),(v4,4)}, the model first takes the features x1, x2, x3, x4 of each node as inputs and then infers embeddings via the Topo-LSTM model.
Figure 16
The sampling strategy of [57]. The model lists all node pairs with their respective weights as the input of the autoencoder model.
Figure 17
The temporal random-walk sampling strategy of the LSTM-Node2Vec model during the graph's evolution. (a) t; (b) t+1; (c) t+2. At time t, the graph has four nodes and four edges. At times t+1 and t+2, the graph gains the new nodes v5 and v6, respectively. A temporal random walk for node v1 with length L=3 could be: P={(v2,v3,v4), (v3,v2,v5), (v3,v5,v6), …}.
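One common formulation of a temporal random walk, not necessarily the exact procedure of LSTM-Node2Vec, is sketched below in Python: successive steps may only use edges with non-decreasing timestamps. The edge timestamps are hypothetical.

```python
import random

rng = random.Random(0)
# (u, v, t): undirected edge between u and v that appears at time t (assumed values).
temporal_edges = [("v1", "v2", 0), ("v2", "v3", 0), ("v3", "v4", 0),
                  ("v2", "v5", 1), ("v5", "v6", 2)]

def temporal_walk(start, length):
    """Walk along edges with non-decreasing timestamps, starting from `start`."""
    walk, node, t_min = [start], start, 0
    for _ in range(length - 1):
        options = [(v, t) for u, v, t in temporal_edges if u == node and t >= t_min]
        options += [(u, t) for u, v, t in temporal_edges if v == node and t >= t_min]
        if not options:
            break
        node, t_min = rng.choice(options)
        walk.append(node)
    return walk

print(temporal_walk("v1", 4))   # e.g. a time-respecting walk such as v1 -> v2 -> v5 -> v6
```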
Figure 18
An example of the GraphSAINT model. (a) A subgraph with five nodes v1, v2, v3, v4, and v5; (b) A full GCN with three layers. In the subgraph in (a), three nodes (v1, v2, v3) have higher degrees than the other two nodes (v4, v5). (b) presents a full GCN-based model with three layers. The three nodes with higher degrees are more likely to be sampled in the next layers.
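One simple instance of the idea that higher-degree nodes are more likely to be kept in a sampled subgraph is a degree-proportional node sampler, sketched below in NumPy; the exact GraphSAINT samplers and their normalization terms are not reproduced here, and the degree values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
degrees = np.array([3, 3, 3, 1, 1], dtype=float)   # assumed degrees of v1..v5
probs = degrees / degrees.sum()

# Draw three nodes without replacement, biased toward higher degree.
sampled = rng.choice(len(degrees), size=3, replace=False, p=probs)
print([f"v{i + 1}" for i in sampled])
```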
Figure 19
The architecture of the GAE and VGAE models. The models take the adjacency matrix A and the feature matrix X as inputs. The encoder consists of two graph convolutional layers. In the GAE model, the decoder takes the embedding matrix Z as input and reconstructs the adjacency matrix A using an inner product. In the VGAE model, the encoder output is represented as a Gaussian distribution.
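A minimal NumPy sketch of the GAE forward pass described here: a normalized graph convolution encoder produces Z, and the decoder reconstructs A with a sigmoid of the inner product Z Z^T. The random weights and features, and the use of a single GCN layer instead of two, are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)          # adjacency matrix (toy graph)
X = rng.normal(size=(4, 3))                        # node feature matrix
W = rng.normal(size=(3, 2))                        # GCN weight matrix

# Symmetrically normalized adjacency with self-loops: D^{-1/2} (A + I) D^{-1/2}.
A_tilde = A + np.eye(4)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt

Z = np.maximum(A_hat @ X @ W, 0)                   # ReLU(A_hat X W): node embeddings
A_rec = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))           # sigmoid(Z Z^T): reconstructed A
print(A_rec.round(2))
```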

References

    1. Rossi R.A., Ahmed N.K. The Network Data Repository with Interactive Graph Analytics and Visualization. Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI 2015); Austin, TX, USA, 25–30 January 2015; AAAI Press; pp. 4292–4293.
    2. Ahmed Z., Zeeshan S., Dandekar T. Mining biomedical images towards valuable information retrieval in biomedical and life sciences. Database J. Biol. Databases Curation. 2016;2016:baw118. doi: 10.1093/database/baw118.
    3. Hamilton W.L. Graph Representation Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Springer; Berlin/Heidelberg, Germany: 2020.
    4. Grover A., Leskovec J. node2vec: Scalable Feature Learning for Networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2016); San Francisco, CA, USA, 13–17 August 2016; ACM; pp. 855–864.
    5. Ou M., Cui P., Pei J., Zhang Z., Zhu W. Asymmetric Transitivity Preserving Graph Embedding. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2016); San Francisco, CA, USA, 13–17 August 2016; ACM; pp. 1105–1114.