Methods for biological data integration: perspectives and challenges
- PMID: 26490630
- PMCID: PMC4685837
- DOI: 10.1098/rsif.2015.0571
Abstract
Rapid technological advances have led to the production of different types of biological data and enabled the construction of complex networks with various types of interactions between diverse biological entities. Standard network data analysis methods have been shown to be limited in dealing with such heterogeneous networked data and, consequently, new methods for integrative data analysis have been proposed. These integrative methods can collectively mine multiple types of biological data and produce more holistic, systems-level biological insights. We survey recent methods for collective mining (integration) of various types of networked biological data. We compare different state-of-the-art methods for data integration and highlight their advantages and disadvantages in addressing important biological problems. We identify the important computational challenges of these methods and provide a general guideline for which methods are suited to specific biological problems or specific data types. Moreover, we propose that recent non-negative matrix factorization-based approaches may become the integration methodology of choice, as they are well suited to, and accurate in, dealing with heterogeneous data and have many opportunities for further development.
Keywords: biological networks; data fusion; heterogeneous data integration; non-negative matrix factorization; omics data; systems biology.
© 2015 The Author(s).
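To make the idea of factorization-based integration concrete, the following is a minimal sketch, not a method taken from the survey. It assumes two non-negative data matrices, X1 and X2, that describe the same set of entities (rows, for example genes) with different features (columns), and it factorizes them with a shared row factor W using standard multiplicative updates. The function name joint_nmf, the toy matrices and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def frob_error(X, W, H):
    """Squared Frobenius norm of X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def joint_nmf(X1, X2, rank, iters=200, seed=0, eps=1e-9):
    """Approximate X1 ~ W H1 and X2 ~ W H2 with one shared non-negative W.

    Hypothetical helper: it minimizes the sum of the two squared
    reconstruction errors with multiplicative updates.
    """
    rng = random.Random(seed)
    n = len(X1)
    # Strictly positive starting values keep every factor non-negative.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H1 = [[rng.random() + 0.1 for _ in range(len(X1[0]))] for _ in range(rank)]
    H2 = [[rng.random() + 0.1 for _ in range(len(X2[0]))] for _ in range(rank)]
    history = []
    for _ in range(iters):
        history.append(frob_error(X1, W, H1) + frob_error(X2, W, H2))
        Wt = transpose(W)
        WtW = matmul(Wt, W)
        # Update each data-specific factor H_i with W held fixed.
        for X, H in ((X1, H1), (X2, H2)):
            num = matmul(Wt, X)
            den = matmul(WtW, H)
            for k in range(rank):
                for j in range(len(H[0])):
                    H[k][j] *= num[k][j] / (den[k][j] + eps)
        # Update the shared factor W using both data sets at once.
        num = add(matmul(X1, transpose(H1)), matmul(X2, transpose(H2)))
        HHt = add(matmul(H1, transpose(H1)), matmul(H2, transpose(H2)))
        den = matmul(W, HHt)
        for i in range(n):
            for k in range(rank):
                W[i][k] *= num[i][k] / (den[i][k] + eps)
    history.append(frob_error(X1, W, H1) + frob_error(X2, W, H2))
    return W, H1, H2, history


# Toy data (made up): 4 entities measured in two different data types.
X1 = [[5, 4, 0], [4, 5, 0], [0, 1, 5], [1, 0, 4]]
X2 = [[3, 0], [4, 1], [0, 5], [1, 4]]

W, H1, H2, history = joint_nmf(X1, X2, rank=2)
print("initial error:", round(history[0], 3))
print("final error:  ", round(history[-1], 3))
```

Because W is shared, each row of W gives a single low-dimensional profile of an entity that is informed by both data types. This is the simplest form of the shared-factor idea, and it handles only data sets that share a row dimension. Integration methods for more heterogeneous data typically add further constraints or relation-specific factors.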