BMC Genomics. 2018 Jan 19;19(Suppl 1):919.
doi: 10.1186/s12864-017-4338-6.

InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk

Liang Cheng et al. BMC Genomics. 2018.

Abstract

Background: Since the establishment of the first biomedical ontology, the Gene Ontology (GO), the number of biomedical ontologies has increased dramatically. More than 300 ontologies have now been built, including the extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because it can reveal novel relationships between terms, calculating similarity between ontology terms is one of the major tasks in this research area. Although similarities between terms within a single ontology have been studied with in silico methods, term similarities across different ontologies have not been investigated as deeply. The latest method took advantage of a gene functional interaction network (GFIN) to explore such inter-ontology term similarities. However, it used only gene interactions and failed to make full use of the connectivity among gene nodes of the network. In addition, all existing methods are designed specifically for GO, and their performance on the wider ontology community remains unknown.

Results: We propose a method, InfAcrOnt, to infer similarities between terms across ontologies using the entire GFIN. InfAcrOnt builds a term-gene-gene network comprising ontology annotations and the GFIN, and obtains similarities between terms across ontologies by modeling the information flow within the network with a random walk. In our benchmark experiments on sub-ontologies of GO, InfAcrOnt achieves high average areas under the receiver operating characteristic curve (AUC) (0.9322 and 0.9309) and low standard deviations (1.8746e-6 and 3.0977e-6) on both the human and yeast benchmark datasets, indicating superior performance. In addition, comparisons of InfAcrOnt results with prior knowledge on pairwise DO-HPO terms and pairwise DO-GO terms show high correlations.
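
To make the idea of information flow on a term-gene-gene network concrete, the following is a minimal sketch of a generic random walk with restart on a toy network. It is not the InfAcrOnt implementation: the term and gene identifiers, the restart probability of 0.3, and the choice to read the steady-state probability at a target term as a similarity score are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(annotations, interactions):
    """Build an undirected term-gene-gene graph as a dict of neighbor sets."""
    graph = defaultdict(set)
    for term, genes in annotations.items():
        for gene in genes:  # term-gene annotation edges
            graph[term].add(gene)
            graph[gene].add(term)
    for g1, g2 in interactions:  # gene-gene interaction edges
        graph[g1].add(g2)
        graph[g2].add(g1)
    return dict(graph)


def random_walk_with_restart(graph, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * e_seed until it converges.

    W spreads each node's probability equally over its neighbors.
    """
    if seed not in graph:
        raise KeyError(f"seed {seed!r} has no edges in the graph")
    prob = {node: 0.0 for node in graph}
    prob[seed] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in graph}
        for node, p in prob.items():
            if p == 0.0:
                continue
            share = (1.0 - restart) * p / len(graph[node])
            for neighbor in graph[node]:
                nxt[neighbor] += share
        nxt[seed] += restart  # the walker jumps back to the seed term
        delta = sum(abs(nxt[n] - prob[n]) for n in graph)
        prob = nxt
        if delta < tol:
            break
    return prob


# Toy data: all identifiers below are made up for illustration.
annotations = {
    "DO:term_a": ["g1", "g2"],
    "HPO:term_b": ["g2", "g3"],
    "HPO:term_c": ["g5"],
}
interactions = [("g1", "g3"), ("g3", "g4"), ("g4", "g5")]

graph = build_graph(annotations, interactions)
profile = random_walk_with_restart(graph, "DO:term_a")

# Assumed scoring rule: steady-state probability at the other ontology's term.
score_ab = profile["HPO:term_b"]
score_ac = profile["HPO:term_c"]
```

In this toy network, `HPO:term_b` shares a gene with `DO:term_a`, while `HPO:term_c` is reachable only through a chain of gene interactions, so the walk places more probability on `HPO:term_b`. The gene-gene edges are what let `HPO:term_c` receive any score at all, which is the kind of network connectivity the abstract says earlier methods did not fully exploit.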

Conclusions: The experimental results show that InfAcrOnt significantly improves the performance of inferring similarities between terms across ontologies on the benchmark set.

Keywords: Biomedical ontology; Information flow; Random walk; Term similarities.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Sub-graph of the Directed Acyclic Graph of three GO sub-ontologies. Each node indicates a term of GO, and each arrow symbol represents an ‘IS_A’ relationship of GO. For example, “catalytic complex” is linked to “protein complex” by an ‘IS_A’ relationship
Fig. 2
Overview of InfAcrOnt demonstrating the basic ideas of measuring similarity between terms across ontologies
Fig. 3
ROC analysis of the benchmark set and random sets for human. a ROC curves for the experimental results on the benchmark set and a random set for human, showing 1-specificity versus sensitivity for each method when calculating the similarities of terms across BP and MF. b Average AUC over 100 iterations for human
Fig. 4
The correlation between the term similarity based on ontology annotations and prior knowledge in HPO project. a The distribution of the similarity scores by InfAcrOnt method. b Pearson Correlation Coefficient between similarity scores based on TF-IDF and other methods
Fig. 5
The correlation between the term similarity based on ontology annotations and prior knowledge in PubMed. a The distribution of the similarity scores by InfAcrOnt method. b Pearson Correlation Coefficient between similarity score based on EMI and other methods
