Review
Front Genet. 2020 Apr 24:11:400.
doi: 10.3389/fgene.2020.00400. eCollection 2020.

A Literature Review of Gene Function Prediction by Modeling Gene Ontology

Yingwen Zhao et al. Front Genet.

Abstract

Annotating the functional properties of gene products, i.e., RNAs and proteins, is a fundamental task in biology. The Gene Ontology database (GO) was developed to systematically describe the functional properties of gene products across species, and to facilitate the computational prediction of gene function. As GO is routinely updated, it serves as the gold standard and main knowledge source in functional genomics. Many gene function prediction methods that make use of GO have been proposed. However, no literature review has summarized these methods, or the possibilities for future work, from the perspective of GO. To bridge this gap, we review the existing methods with an emphasis on recent solutions. First, we introduce the conventions of GO and the widely adopted evaluation metrics for gene function prediction. Next, we summarize current methods of gene function prediction that apply GO in different ways, such as using hierarchical or flat inter-relationships between GO terms, compressing the massive number of GO terms, and quantifying semantic similarities. Although many efforts have improved performance by harnessing GO, we conclude that many important but largely overlooked topics remain for future research.

Keywords: directed acyclic graph; functional genomics; gene function prediction; gene ontology; inter-relationships; semantic similarity.


Figures

Figure 1
Snapshot of a directed acyclic graph (DAG) from Gene Ontology. Each ontological term is represented by an alphanumeric identifier, and its biological function is described with a controlled vocabulary. GO terms are hierarchically connected by different types of directed edges. The level of a GO term in the DAG is determined by its furthest distance to the root GO term (“GO:0008150” in BPO, “GO:0005575” in CCO, and “GO:0003674” in MFO). For example, “GO:0048087” is both a direct child and a grandchild of “GO:0048066,” and its furthest distance to the root term is 5, whereas “GO:0006856” is another direct child of “GO:0048066” whose furthest distance to the root is 4, so “GO:0006856” is plotted at a higher level than “GO:0048087”.
Figure 2
The number of published papers related to GO-based gene function prediction over 10 years.
Figure 3
Three issues in gene function prediction (left), and categorization of existing computational solutions based on GO (right).
Figure 4
Exemplar tasks of gene function prediction, which include predicting missing, negative, and noisy annotations.
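
The level rule stated in the Figure 1 caption (a term's level is its furthest distance to the root) can be illustrated with a minimal sketch. The toy graph below is hypothetical: the term names `root`, `a`, `b`, `c`, and `d` and their edges are placeholders, not real GO terms or relations, and the `PARENTS` mapping and `level` function are names introduced only for this example. It assumes the graph is acyclic and that every term reaches a single root.

```python
from functools import lru_cache

# Hypothetical toy DAG, stored as child -> list of direct parents.
# These are placeholder names, not real GO terms or edges.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["a"],
    "c": ["a", "b"],  # "c" is both a direct child and a grandchild of "a"
    "d": ["a"],       # "d" is only a direct child of "a"
}


@lru_cache(maxsize=None)
def level(term: str) -> int:
    """Return the furthest (longest-path) distance from `term` to the root."""
    parents = PARENTS[term]
    if not parents:
        return 0  # the root itself sits at level 0
    # A term with several parents takes the longest route to the root.
    return 1 + max(level(parent) for parent in parents)


if __name__ == "__main__":
    for term in PARENTS:
        print(term, level(term))
```

In this toy graph, `c` and `d` are both direct children of `a`, but `c` also descends from `a` through `b`, so its longest path to the root is one edge longer. This mirrors the caption's example of two direct children of the same parent ending up at different levels.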
