BMC Genomics. 2018 Jan 19;19(Suppl 1):919.

doi: 10.1186/s12864-017-4338-6.

InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk

Liang Cheng¹, Yue Jiang², Hong Ju³, Jie Sun¹, Jiajie Peng⁴, Meng Zhou⁵, Yang Hu⁶

Affiliations

¹ College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, People's Republic of China.
² Hospital for Sick Children, Toronto, M5G 1X8, Canada.
³ Department of Information Engineering, Heilongjiang Biological Science and Technology Career Academy, Harbin, 150081, People's Republic of China.
⁴ School of Computer Science, Northwestern Polytechnical University, Xian, 710072, People's Republic of China.
⁵ College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, People's Republic of China. biofomeng@hotmail.com.
⁶ School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150088, People's Republic of China. huyang@hit.edu.cn.

PMID: 29363423
PMCID: PMC5780854
DOI: 10.1186/s12864-017-4338-6

Liang Cheng et al. BMC Genomics. 2018.

Abstract

Background: Since the establishment of the first biomedical ontology, the Gene Ontology (GO), the number of biomedical ontologies has increased dramatically. More than 300 ontologies have now been built, including the extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because it can identify novel relationships between terms, calculating similarity between ontology terms is one of the major tasks in this research area. Although similarities between terms within each ontology have been studied with in silico methods, term similarities across different ontologies have not been investigated as deeply. The latest method took advantage of a gene functional interaction network (GFIN) to explore such inter-ontology similarities of terms. However, it only used gene interactions and failed to make full use of the connectivity among gene nodes of the network. In addition, all existing methods are designed specifically for GO, and their performance on the wider ontology community remains unknown.

Results: We propose InfAcrOnt, a method that infers similarities between terms across ontologies using the entire GFIN. InfAcrOnt builds a term-gene-gene network comprising ontology annotations and the GFIN, and obtains similarities between terms across ontologies by modeling the information flow within the network with a random walk. In our benchmark experiments on sub-ontologies of GO, InfAcrOnt achieves a high average area under the receiver operating characteristic curve (AUC) (0.9322 and 0.9309) and low standard deviations (1.8746e-6 and 3.0977e-6) on both the human and yeast benchmark datasets, exhibiting superior performance. Meanwhile, comparisons of InfAcrOnt results with prior knowledge on pairwise DO-HPO terms and pairwise DO-GO terms show high correlations.

Conclusions: The experimental results show that InfAcrOnt significantly improves the performance of inferring similarities between terms across ontologies on the benchmark set.

Keywords: Biomedical ontology; Information flow; Random walk; Term similarities.
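
The information-flow idea described in the Results can be illustrated with a small sketch. This is not the authors' implementation: it assumes a random walk with restart over a toy gene network, and it scores a pair of terms by the probability mass that flows from the genes annotated to one term into the genes annotated to the other, averaged over both directions. The function names `rwr` and `cross_similarity`, the restart probability of 0.5, the symmetrized score, and the six-gene network are all hypothetical choices made for illustration.

```python
import numpy as np

def rwr(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart on a weighted, undirected gene network.

    adj  : (n, n) float adjacency matrix (stand-in for a GFIN)
    seed : length-n vector marking the genes where the walker starts
    """
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated genes
    transition = adj / col_sums            # column-stochastic transition matrix
    p0 = seed / seed.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * transition @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

def cross_similarity(adj, genes_a, genes_b, restart=0.5):
    """Assumed score: mean flow from term A's genes to term B's genes and back."""
    n = adj.shape[0]
    seed_a = np.zeros(n)
    seed_a[list(genes_a)] = 1.0
    seed_b = np.zeros(n)
    seed_b[list(genes_b)] = 1.0
    flow_ab = rwr(adj, seed_a, restart)[list(genes_b)].sum()
    flow_ba = rwr(adj, seed_b, restart)[list(genes_a)].sum()
    return 0.5 * (flow_ab + flow_ba)

# Toy network: two gene triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

# Hypothetical annotations: term X from one ontology, terms Y and Z from another.
term_x = {0, 1}
term_y = {2}
term_z = {4, 5}

sim_xy = cross_similarity(adj, term_x, term_y)
sim_xz = cross_similarity(adj, term_x, term_z)
print(f"sim(X, Y) = {sim_xy:.4f}, sim(X, Z) = {sim_xz:.4f}")
```

In this toy case the genes of Y sit in the same densely connected module as the genes of X, so more probability mass flows between them than between X and Z, even though X and Y share no annotated gene. That is the kind of connectivity a direct gene-overlap or single-edge measure would miss.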

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

**Fig. 1**
Sub-graph of the Directed Acyclic Graph of three GO sub-ontologies. Each node indicates a term of GO, and each arrow symbol represents an ‘IS_A’ relationship of GO. For example, “catalytic complex” is linked to “protein complex” by an ‘IS_A’ relationship

**Fig. 2**
Overview of InfAcrOnt demonstrating the basic ideas of measuring similarity between terms across ontologies

**Fig. 3**
ROC analysis of the benchmark set and random sets for human. (a) ROC curves for the experimental results on the benchmark set and a random set for human, showing 1-specificity versus sensitivity of each method for calculating the similarities of terms across BP and MF. (b) Average AUC over 100 iterations for human

**Fig. 4**
The correlation between the term similarity based on ontology annotations and prior knowledge in the HPO project. (a) The distribution of the similarity scores by the InfAcrOnt method. (b) Pearson correlation coefficient between similarity scores based on TF-IDF and other methods

**Fig. 5**
The correlation between the term similarity based on ontology annotations and prior knowledge in PubMed. (a) The distribution of the similarity scores by the InfAcrOnt method. (b) Pearson correlation coefficient between similarity scores based on EMI and other methods

Cited by

Ultrasound Image Classification of Thyroid Nodules Based on Deep Learning.
Yang J, Shi X, Wang B, Qiu W, Tian G, Wang X, Wang P, Yang J. Front Oncol. 2022 Jul 15;12:905955. doi: 10.3389/fonc.2022.905955. eCollection 2022. PMID: 35912199.
Identification of Alzheimer's Disease-Related Genes Based on Data Integration Method.
Hu Y, Zhao T, Zang T, Zhang Y, Cheng L. Front Genet. 2019 Jan 25;9:703. doi: 10.3389/fgene.2018.00703. eCollection 2018. PMID: 30740125.
RF-PseU: A Random Forest Predictor for RNA Pseudouridine Sites.
Lv Z, Zhang J, Ding H, Zou Q. Front Bioeng Biotechnol. 2020 Feb 26;8:134. doi: 10.3389/fbioe.2020.00134. eCollection 2020. PMID: 32175316.
Identification and Classification of Enhancers Using Dimension Reduction Technique and Recurrent Neural Network.
Li Q, Xu L, Li Q, Zhang L. Comput Math Methods Med. 2020 Oct 18;2020:8852258. doi: 10.1155/2020/8852258. eCollection 2020. PMID: 33133227.
eQTLMAPT: Fast and Accurate eQTL Mediation Analysis With Efficient Permutation Testing Approaches.
Wang T, Peng Q, Liu B, Liu X, Liu Y, Peng J, Wang Y. Front Genet. 2020 Jan 9;10:1309. doi: 10.3389/fgene.2019.01309. eCollection 2019. PMID: 31998368.
