Sci Rep. 2018 Jan 22;8(1):1362.
doi: 10.1038/s41598-018-19333-x.

Gene annotation bias impedes biomedical research

Winston A Haynes et al. Sci Rep. 2018.

Abstract

We found tremendous inequality across gene and protein annotation resources. We observed that this bias leads biomedical researchers to focus on richly annotated genes instead of those with the strongest molecular data. We advocate that researchers reduce these biases by pursuing data-driven hypotheses.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Inequality in gene annotations. (A) We measured the Gini coefficient across a variety of gene annotation resources. (B) We compared the growth in the Gini coefficient of the Gene Ontology to different models of increasing and decreasing inequality. See also Figure S1.
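As an illustration of the inequality measure named in panel A, the following is a minimal sketch of a Gini coefficient computed over per-gene annotation counts. It is not the authors' code: the function name `gini` and the example counts are hypothetical, and the sketch assumes non-negative counts and uses the standard rank-weighted formula.

```python
def gini(counts):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        # No genes, or no annotations at all: treat as perfectly equal.
        return 0.0
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


# Hypothetical annotation counts for four genes.
equal_counts = [5, 5, 5, 5]      # every gene equally annotated
skewed_counts = [0, 0, 0, 10]    # one gene holds every annotation
print(gini(equal_counts), gini(skewed_counts))
```

With this formula, a resource in which a single gene out of n holds all annotations scores (n - 1) / n, which approaches 1 as n grows.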
Figure 2
Published Disease-Gene Associations Not Reflected in Molecular Data. (A) The number of publications for every disease-gene pair was not significantly correlated with the gene expression multicohort analysis effect size FDR rank [Spearman’s correlation = −0.003, p = 0.836]. (B) The number of publications for every disease-gene pair correlated with the number of non-inferred from electronic annotation (non-IEA) Gene Ontology annotations [Spearman’s correlation = 0.110, p = 2.1e–16]. Orange points represent disease-gene associations published in our prior meta-analyses. Purple points have at least 1000 publications. See also Figure S2.
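The rank correlation reported in both panels can be sketched as follows. This is an illustrative, dependency-free version of Spearman's correlation (Pearson's correlation computed on average ranks), not the authors' analysis pipeline. The helper names and the toy inputs are hypothetical, and the p-value computation is omitted.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (sx * sy)


# Hypothetical toy data: publication counts vs. annotation counts per pair.
publications = [1, 4, 10, 250]
annotations = [2, 3, 8, 40]
print(spearman(publications, annotations))
```

Because only ranks enter the calculation, a handful of very heavily published disease-gene pairs cannot dominate the statistic the way they would in a Pearson correlation on raw counts.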

References

    1. Khatri P, Sirota M, Butte AJ. Ten years of pathway analysis: current approaches and outstanding challenges. PLoS computational biology. 2012;8:e1002375. doi: 10.1371/journal.pcbi.1002375.
    2. Ashburner M, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature genetics. 2000;25:25–9. doi: 10.1038/75556.
    3. Croft D, et al. The Reactome pathway knowledgebase. Nucleic acids research. 2014;42:472–7. doi: 10.1093/nar/gkt1102.
    4. Davis AP, et al. The Comparative Toxicogenomics Database’s 10th year anniversary: update 2015. Nucleic acids research. 2015;43:914–20. doi: 10.1093/nar/gku935.
    5. Wishart DS, et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic acids research. 2006;34:668–72. doi: 10.1093/nar/gkj067.
