BMC Genomics. 2015;16 Suppl 8(Suppl 8):S6.
doi: 10.1186/1471-2164-16-S8-S6. Epub 2015 Jun 18.

NET-GE: a novel NETwork-based Gene Enrichment for detecting biological processes associated to Mendelian diseases

Pietro Di Lena et al. BMC Genomics. 2015.

Abstract

Background: Enrichment analysis is widely used to shed light on the molecular mechanisms and functions underlying phenotypes, to enlarge the dataset of possibly related genes/proteins, and to help interpret and prioritize newly determined variations. Several standard and network-based enrichment methods are available. Both approaches rely on the annotations that characterize the genes/proteins in the input set; network-based methods additionally incorporate, in different ways, the physical and functional relationships among genes or proteins that can be extracted from the available biological interaction networks.

Results: Here we describe a novel procedure that extracts from the STRING interactome the sub-networks connecting proteins that share the same Gene Ontology (GO) terms for Biological Process (BP). Enrichment analysis is performed by mapping the protein set to be analyzed onto the sub-networks and then collecting the corresponding annotations. We test the ability of our enrichment method to find annotation terms missed by other available enrichment methods. We benchmarked 244 sets of proteins associated with different Mendelian diseases, according to the OMIM web resource. In 143 cases (58%), the network-based procedure extracts GO terms neglected by the standard method, and in 86 cases (35%), some of the newly enriched GO terms are not included in the set of annotations characterizing the input proteins. We present in detail six cases where our network-based enrichment provides insight into the biological basis of the diseases, outperforming other freely available network-based methods.
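
The mapping step described above can be illustrated with a minimal Python sketch. The abstract does not state which statistical test NET-GE uses, so the one-sided hypergeometric (over-representation) test below is an assumption, as are the function names `hypergeom_sf` and `enrich_against_modules`, the toy protein identifiers, and the background size. Multiple-testing correction is omitted for brevity.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n proteins from a background of N,
    of which K belong to the module (one-sided hypergeometric test)."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def enrich_against_modules(input_set, modules, background_size):
    """Map an input protein set onto per-term network modules.

    `modules` maps a GO term to the set of proteins in its module
    (seed proteins plus connecting proteins). Returns {term: p-value}
    for every module that overlaps the input set.
    """
    results = {}
    for term, members in modules.items():
        hits = len(input_set & members)
        if hits == 0:
            continue
        results[term] = hypergeom_sf(
            hits, background_size, len(members), len(input_set)
        )
    return results


# Hypothetical toy data: identifiers and background size are placeholders.
toy_modules = {
    "GO:TERM_A": {"P1", "P2", "P3", "P4", "P5", "P6"},  # P5, P6 act as connectors
    "GO:TERM_B": {"P7", "P8"},
}
toy_input = {"P5", "P6"}

for term, p_value in enrich_against_modules(toy_input, toy_modules, 1000).items():
    print(term, p_value)
```

In this toy case the input proteins are only connecting proteins of the first module, so the term can be recovered even though neither input protein is directly annotated with it, which mirrors the kind of newly enriched term reported in the benchmark.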

Conclusions: Considering a set of proteins in the context of their interaction network can help in better defining their functions. Our novel method exploits the information contained in the STRING database for building the minimal connecting network containing all the proteins annotated with the same GO term. The enrichment procedure is performed considering the GO-specific network modules and, when tested on the OMIM-derived benchmark sets, it is able to extract enrichment terms neglected by other methods. Our procedure is effective even when the size of the input protein set is small, requiring at least two input proteins.
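
As a rough illustration of the idea of a connecting network, the following Python sketch joins a set of seed proteins by taking the union of pairwise shortest paths in an unweighted graph. This is a simple heuristic chosen for clarity and is not the authors' minimization procedure, whose ranking scores are described in the paper's Methods. The graph, node names, and the helpers `build_graph`, `shortest_path`, and `connecting_module` are all hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def shortest_path(graph, source, target):
    """Breadth-first shortest path; returns None if target is unreachable."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


def connecting_module(graph, seeds):
    """Seeds plus every node on a shortest path between any two seeds."""
    module = set(seeds)
    for a, b in combinations(sorted(seeds), 2):
        path = shortest_path(graph, a, b)
        if path is not None:
            module.update(path)
    return module


# Hypothetical toy interactome: s* are seeds sharing a GO term,
# c* are connectors, x and y are unrelated neighbours.
toy_graph = build_graph([
    ("s1", "c1"), ("c1", "s2"), ("s2", "c2"), ("c2", "s3"),
    ("s1", "x"), ("x", "y"),
])
print(sorted(connecting_module(toy_graph, {"s1", "s2", "s3"})))
```

Here the connectors `c1` and `c2` are pulled into the module because they lie on paths between seeds, while `x` and `y` are left out.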

Figures

Figure 1
Outline of the network module generation of NET-GE. Details on the different steps are explained in Methods. *Ranking scores are hierarchically applied.
Figure 2
Minimal connecting network for GO:0036018. A) Minimal connecting network extracted from STRING 9.2 (http://www.string-db.org), built for the Biological Process term GO:0036018 (cellular response to erythropoietin). The seed genes, directly annotated with GO:0036018, are HGNC:MT2A, HGNC:KIT, HGNC:EPOR and HGNC:MT1X. The connecting genes HGNC:JAK2 and HGNC:IL6, recovered by the minimization procedure, are associated with GO:0019221 (cytokine-mediated signaling pathway). B) Relation between the reference GO term (GO:0036018) and the GO term associated with the connecting genes (GO:0019221).
Figure 3
Number of enriched GO BP terms as a function of their frequency of occurrence in the human proteome. The x-axis groups GO BP terms by frequency of occurrence in the human proteome. The numbers in parentheses indicate the number of GO BP terms falling in each class.
