MultiVERSE: a multiplex and multiplex-heterogeneous network embedding approach

Léo Pio-Lopez¹, Alberto Valdeolivas², Laurent Tichit³, Élisabeth Remy³, Anaïs Baudot⁴,⁵

Affiliations

¹ Aix Marseille Univ, CNRS, Centrale Marseille, I2M, Marseille, France. leo.pio.lopez@gmail.com.
² Heidelberg University, Institute for Computational Biomedicine, Heidelberg, Germany.
³ Aix Marseille Univ, CNRS, Centrale Marseille, I2M, Marseille, France.
⁴ Aix Marseille Univ, INSERM, CNRS, MMG, Marseille, France.
⁵ Barcelona Supercomputing Center, Barcelona, Spain.

PMID: 33888761
PMCID: PMC8062697
DOI: 10.1038/s41598-021-87987-1

Léo Pio-Lopez et al. Sci Rep. 2021 Apr 22;11(1):8794. doi: 10.1038/s41598-021-87987-1.

Abstract

Network embedding approaches are gaining momentum for analysing a large variety of networks. These approaches have demonstrated their effectiveness in tasks such as community detection, node classification, and link prediction. However, very few network embedding methods have been specifically designed to handle multiplex networks, i.e. networks composed of different layers sharing the same set of nodes but having different types of edges. Moreover, to our knowledge, existing approaches cannot embed multiple nodes from multiplex-heterogeneous networks, i.e. networks composed of several multiplex networks containing both different types of nodes and edges. In this study, we propose MultiVERSE, an extension of the VERSE framework using Random Walks with Restart on Multiplex (RWR-M) and Multiplex-Heterogeneous (RWR-MH) networks. MultiVERSE is a fast and scalable method to learn node embeddings from multiplex and multiplex-heterogeneous networks. We evaluate MultiVERSE on several biological and social networks and demonstrate its performance. MultiVERSE outperforms most of the other methods in the tasks of link prediction and network reconstruction for multiplex network embedding, and is also efficient in link prediction for multiplex-heterogeneous network embedding. Finally, we apply MultiVERSE to study rare disease-gene associations using link prediction and clustering. MultiVERSE is freely available on GitHub at https://github.com/Lpiol/MultiVERSE.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Illustrations of the different types of networks. (A) A multiplex network. The different layers share the same set of nodes but different types of edges. (B) A heterogeneous network. The two networks are composed of different types of nodes and edges, connected by bipartite interactions (black dashed lines). (C) A multiplex-heterogeneous network composed of two multiplex networks. The multiplex networks are connected by bipartite interactions (dashed lines). For the sake of simplicity, the figure does not represent all the possible bipartite interactions (each layer of a given multiplex is in reality linked with every layer of the other multiplex).

**Figure 2**
Overview of the MultiVERSE pipeline. Starting from a multiplex-heterogeneous network, we represent its structure through an adjacency matrix (size $|V| \times |V|$); we then compute a similarity matrix using the Random Walk with Restart algorithm, and apply an optimized version of the VERSE algorithm to compute the embeddings. The resulting matrix of embeddings will be used for the applications.
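
The similarity step of this pipeline can be illustrated with a minimal sketch of the standard Random Walk with Restart iteration, p ← (1 − r)·M·p + r·p₀, on a single-layer toy graph. This is not the MultiVERSE implementation: the multiplex and multiplex-heterogeneous variants (RWR-M, RWR-MH) and the VERSE embedding step are not reproduced here, and the function names, the toy adjacency matrix, and the restart value of 0.7 are assumptions chosen for illustration.

```python
def column_normalize(adj):
    """Return a column-stochastic copy of a square adjacency matrix."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def rwr(adj, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * M p + r * p0 until the L1 change drops below tol."""
    n = len(adj)
    m = column_normalize(adj)
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(m[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def rwr_similarity(adj, restart=0.7):
    """Row s holds the RWR distribution obtained when node s is the seed."""
    return [rwr(adj, s, restart) for s in range(len(adj))]


# Hypothetical toy data: the path graph 0 - 1 - 2 - 3.
ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
SIM = rwr_similarity(ADJ)
```

Each row of `SIM` is a probability distribution over nodes, and the probability mass decreases with distance from the seed, which is what makes it usable as a node-to-node similarity.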

**Figure 3**
General approach for link prediction on multiplex networks: (top) for the link prediction heuristics, we apply them to each layer and average them across all layers; (center) for monoplex-based methods, we embed each layer with the given method, then average it; (bottom) for multiplex-based methods, we apply the specific embedding method to the network. The embedding operators are then applied to monoplex- and multiplex-based method embeddings. The three types of methods are finally evaluated for link prediction using a binary classifier and a ROC-AUC is computed.
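
As a rough illustration of this evaluation, the sketch below turns a pair of node embeddings into an edge feature vector with a few operators that are commonly used in the embedding literature (Hadamard, average, weighted L1, weighted L2), and computes a ROC-AUC from scores and labels. The caption does not name the operators, so this selection is an assumption. The trained binary classifier is also replaced by a much simpler stand-in, the sum of the Hadamard features (a dot product), and the embeddings and node pairs are hypothetical toy data.

```python
def hadamard(u, v):
    """Element-wise product of two embeddings."""
    return [a * b for a, b in zip(u, v)]


def average(u, v):
    """Element-wise mean of two embeddings."""
    return [(a + b) / 2 for a, b in zip(u, v)]


def weighted_l1(u, v):
    """Element-wise absolute difference."""
    return [abs(a - b) for a, b in zip(u, v)]


def weighted_l2(u, v):
    """Element-wise squared difference."""
    return [(a - b) ** 2 for a, b in zip(u, v)]


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical 2-dimensional embeddings and candidate node pairs.
EMB = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
PAIRS = [("a", "b"), ("c", "d"), ("a", "c"), ("b", "d")]
LABELS = [1, 1, 0, 0]  # 1 = held-out true edge, 0 = sampled non-edge

# Stand-in for the classifier: score each pair by the sum of its Hadamard features.
SCORES = [sum(hadamard(EMB[u], EMB[v])) for u, v in PAIRS]
AUC = roc_auc(LABELS, SCORES)
```

In a fuller setup, the operator output would be the input features of a classifier fitted on training edges, and the ROC-AUC would be computed on the classifier's predicted probabilities for held-out pairs.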

**Figure 4**
General approach for network reconstruction on multiplex networks: (top) for monoplex-based methods, embed each layer with the given method, then average it; (bottom) for multiplex-based methods, apply the specific embedding method to the network. Embedding operators are then applied to monoplex- and multiplex-based method embeddings. The three types of methods are finally evaluated for network embedding using a binary classifier and a precision@K score is computed.
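
The precision@K score can be sketched as follows: rank all candidate node pairs by a score and report the fraction of the K top-ranked pairs that are edges of the original network. In this minimal sketch the score is a plain dot product of the two embeddings, which stands in for the binary classifier mentioned in the caption. The function names, embeddings, and edge set are hypothetical.

```python
from itertools import combinations


def dot(u, v):
    """Dot product of two embeddings."""
    return sum(a * b for a, b in zip(u, v))


def precision_at_k(embeddings, true_edges, k):
    """Fraction of the k highest-scoring node pairs that are true edges."""
    pairs = list(combinations(sorted(embeddings), 2))
    ranked = sorted(
        pairs,
        key=lambda p: dot(embeddings[p[0]], embeddings[p[1]]),
        reverse=True,
    )
    hits = sum(1 for p in ranked[:k] if frozenset(p) in true_edges)
    return hits / k


# Hypothetical embeddings and ground-truth (undirected) edges.
EMB = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
TRUE_EDGES = {frozenset(("a", "b")), frozenset(("c", "d"))}

P_AT_2 = precision_at_k(EMB, TRUE_EDGES, 2)
P_AT_4 = precision_at_k(EMB, TRUE_EDGES, 4)
```

Edges are stored as `frozenset` pairs so that the lookup does not depend on node order. Precision necessarily drops once K exceeds the number of true edges, which is why the score is usually reported for several values of K.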

**Figure 5**
Cluster containing the HGPS disease node. Disease-Disease edges from the disease multiplex network are represented in green (shared symptoms) and blue (CTD projection). Gene-Gene edges from the molecular multiplex network are represented in pink (Reactome pathways), red (protein-protein interactions) and orange (molecular complexes). Gene-Disease bipartite interactions are represented with black dashed lines.

**Figure 6**
Cluster containing the Xeroderma pigmentosum VII disease node. Gene-Gene edges from the molecular multiplex network are represented in pink (Reactome pathways), red (protein-protein interactions) and orange (molecular complexes). Gene-Disease bipartite interactions are represented with black dashed lines.
