Front Big Data. 2022 Jun 22:5:894632.
doi: 10.3389/fdata.2022.894632. eCollection 2022.

Significant Subgraph Detection in Multi-omics Networks for Disease Pathway Identification

Mohamed Abdel-Hafiz et al. Front Big Data.

Abstract

Chronic obstructive pulmonary disease (COPD) is one of the leading causes of death in the United States. COPD is one of many research areas in which identifying complex pathways and networks of interacting biomarkers is an important avenue toward studying disease progression and potentially discovering cures. Recently, sparse multiple canonical correlation network analysis (SmCCNet) was developed to identify complex relationships between omics associated with a disease phenotype, such as lung function. SmCCNet uses two omics datasets and an associated phenotype to generate a multi-omics graph, which can then be used to explore relationships between omics in the context of a disease. Detecting significant subgraphs within this multi-omics network, i.e., subgraphs that exhibit high correlation with a disease phenotype and high inter-connectivity, can help clinicians identify complex biological relationships involved in disease progression. The current approach to identifying significant subgraphs relies on hierarchical clustering, whose output can inform clinicians about important pathways involved in the disease or phenotype of interest. However, reliance on hierarchical clustering can hinder subgraph quality by biasing the search toward more compact subgraphs and removing larger significant subgraphs. This study introduces new significant subgraph detection techniques. In particular, we introduce two subgraph detection methods, dubbed Correlated PageRank and Correlated Louvain, by extending the Personalized PageRank Clustering and Louvain algorithms, as well as a hybrid approach combining the two proposed methods, and compare them to the hierarchical method currently in use. The proposed methods show significant improvement in the quality of the subgraphs produced when compared to the current state of the art.
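To make the PageRank-based idea concrete, the sketch below runs plain personalized PageRank from a single seed node on a small weighted graph and keeps the highest-ranked nodes as a candidate subgraph. It is a generic illustration only, not the authors' Correlated PageRank: the phenotype-correlation weighting described in the abstract is not modeled, and the function names, the `damping` parameter, the toy graph, and the top-k cut-off are all assumptions made for this example.

```python
def personalized_pagerank(adj, seed, damping=0.85, iters=200, tol=1e-12):
    """Power iteration for personalized PageRank.

    adj  : dict mapping node -> {neighbor: edge weight} (undirected, symmetric)
    seed : node that receives all of the restart probability
    """
    nodes = list(adj)
    # Total edge weight at each node, used to normalize outgoing transitions.
    strength = {n: sum(adj[n].values()) for n in nodes}
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iters):
        # Restart step: a (1 - damping) share of the walk jumps back to the seed.
        new = {n: (1.0 - damping) if n == seed else 0.0 for n in nodes}
        for u in nodes:
            if strength[u] == 0:
                # An isolated node sends its mass back to the seed.
                new[seed] += damping * rank[u]
                continue
            for v, w in adj[u].items():
                new[v] += damping * rank[u] * w / strength[u]
        delta = sum(abs(new[n] - rank[n]) for n in nodes)
        rank = new
        if delta < tol:
            break
    return rank


def top_k_cluster(rank, k):
    """Return the k highest-ranked nodes as a candidate subgraph."""
    return sorted(rank, key=rank.get, reverse=True)[:k]


# Hypothetical toy graph: two tight triangles joined by one weak edge (c-d).
toy = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0, "d": 0.1},
    "d": {"c": 0.1, "e": 1.0, "f": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"d": 1.0, "e": 1.0},
}
scores = personalized_pagerank(toy, seed="a")
cluster = top_k_cluster(scores, k=3)
```

Because the random walk keeps restarting at the seed, most of the rank stays in the seed's own triangle, so the top three nodes form the tightly connected neighborhood around `a`. A fixed top-k cut is the simplest possible choice here; other cut rules could be substituted.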

Keywords: Louvain; PageRank; graph clustering; multi-omics graph; subgraph detection.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Hybrid clustering procedure. The first iteration (1) uses the global graph provided by SmCCNet, with subsequent iterations (2+) using the cluster generated by PageRank in its place.
Figure 2
Graph comparison: (A) shows two sample graphs, G1 (left) and G2 (right); (B) shows the heatmap visualization of the edit distances between G1 and G2, with the green regions representing similarity between the graphs and gray regions representing differences; the ratio between the regions visually represents how similar the two graphs are.
Figure 3
Correlated PageRank results: (A) shows the randomized search results for the sequential approach to Correlated PageRank; (B) shows a distribution of |ρ| for subgraphs identified by sequential Correlated PageRank with randomized seed selection; (C) shows randomized search results for the simultaneous approach to Correlated PageRank; (D) shows a distribution of |ρ| for subgraphs identified by simultaneous Correlated PageRank with randomized seed selection; (E) shows randomized search results for simultaneous Correlated PageRank with selected strong seeds only; (F) shows a distribution of |ρ| for simultaneous Correlated PageRank with strong seeds.
Figure 4
Correlated Louvain results: (A) shows a distribution of |ρ| produced by the Correlated Louvain method with k4 = 0.8; (B) shows the same distribution but with singleton clusters removed, only showing subgraphs with |V| ≥ 2.
Figure 5
Level effect of Correlated Louvain: (A) shows the intermediate clustering of cluster 21 with k4 = 0.8, representing the common behavior of intermediate clusters; (B) shows uncommon behavior of intermediate clusters, such as cluster 1 with k4 = 0.2.
Figure 6
Change in correlation (A) and subnet size (B) of subnets produced by PageRank in each iteration.
Figure 7
Visual comparison of subnet 175 with other top subnets.
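
The graph comparison described in the Figure 2 caption can be approximated with a much simpler edge-level check. The sketch below is a simplified stand-in, not the edit-distance heatmap used in the paper: it treats two undirected graphs as edge sets, separates the edges they share from the edges found in only one of them, and reports the shared fraction as a rough similarity ratio. The function names and the example graphs are hypothetical.

```python
def edge_set(edges):
    """Normalize undirected edges so that (u, v) and (v, u) compare equal."""
    return {frozenset(e) for e in edges}


def compare_graphs(edges1, edges2):
    """Return (shared edges, differing edges, shared fraction of all edges)."""
    e1, e2 = edge_set(edges1), edge_set(edges2)
    shared = e1 & e2      # edges present in both graphs
    differing = e1 ^ e2   # edges present in exactly one graph
    total = len(shared) + len(differing)
    # Two empty graphs are treated as identical.
    similarity = len(shared) / total if total else 1.0
    return shared, differing, similarity


# Hypothetical example graphs.
g1 = [("a", "b"), ("b", "c"), ("c", "d")]
g2 = [("b", "a"), ("b", "c"), ("c", "e")]
shared, differing, similarity = compare_graphs(g1, g2)
```

Here `shared` plays the role of the similar regions and `differing` the role of the regions that differ, so `similarity` is the fraction of all observed edges that the two graphs have in common.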
