PLoS Comput Biol. 2012;8(9):e1002694.
doi: 10.1371/journal.pcbi.1002694. Epub 2012 Sep 27.

Tissue-specific functional networks for prioritizing phenotype and disease genes

Yuanfang Guan et al. PLoS Comput Biol. 2012.

Abstract

Integrated analyses of functional genomics data have enormous potential for identifying phenotype-associated genes. Tissue specificity is an important aspect of many genetic diseases, reflecting the potentially different roles of proteins and pathways in diverse cell lineages. Accounting for tissue specificity in global integration of functional genomics data is challenging, as "functionality" and "functional relationships" are often not resolved for specific tissue types. We address this challenge by generating tissue-specific functional networks, which can effectively represent the diversity of protein function for more accurate identification of phenotype-associated genes in the laboratory mouse. Specifically, we created 107 tissue-specific functional relationship networks by integrating genomic data using knowledge of tissue-specific gene expression patterns. Cross-network comparison revealed significantly changed genes enriched for functions related to specific tissue development. We then used these tissue-specific networks to predict genes associated with different phenotypes. Our results demonstrate that prediction performance is significantly improved by using the tissue-specific networks rather than the global functional network. We used a testis-specific functional relationship network to predict genes associated with male fertility and spermatogenesis phenotypes, and experimentally confirmed one top prediction, Mybl1. We then focused on a less-common genetic disease, ataxia, and identified candidates uniquely predicted by the cerebellum network, which are supported by both literature and experimental evidence. Our systems-level, tissue-specific scheme advances over traditional global integration and analyses and establishes a prototype to address the tissue-specific effects of genetic perturbations, diseases and drugs.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Strategy for constructing tissue-specific networks and predicting phenotype-associated genes.
Diverse functional genomic datasets such as expression, protein-protein interactions and phenotype information were integrated in a Bayesian framework to generate tissue-specific networks. Input datasets were probabilistically “weighted” based on how informative they were in reflecting known co-functional proteins that are both expressed in a given tissue. To account for overlap in information in multiple datasets (especially the large number of gene expression microarray datasets), mutual information-based regularization was used to down-weight datasets showing significant overlap with each other. These networks were then used as input into a Support Vector Machine classifier to predict phenotype-related genes. Finally, we implemented a web interface that allows network comparison between tissues.
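The integration step described in this caption can be illustrated with a minimal naive Bayes sketch. It is not the authors' implementation: the function names, the discretized evidence bins, the pseudocount, and the per-dataset `weights` (a simple stand-in for the mutual information-based regularization, which is not reproduced here) are all assumptions made for illustration.

```python
from collections import Counter
from math import exp, log


def tissue_gold_standard(cofunctional_pairs, expressed_in_tissue):
    """Keep only co-functional pairs whose two genes are both expressed in the tissue."""
    return {pair for pair in cofunctional_pairs
            if all(gene in expressed_in_tissue for gene in pair)}


def likelihood_ratios(dataset, positives, negatives, pseudocount=1.0):
    """Per-bin ratio P(bin | positive) / P(bin | negative) for one discretized dataset.

    `dataset` maps a gene pair to an evidence bin (hypothetical encoding).
    """
    bins = set(dataset.values())
    pos = Counter(dataset[p] for p in positives if p in dataset)
    neg = Counter(dataset[p] for p in negatives if p in dataset)
    pos_total = sum(pos.values()) + pseudocount * len(bins)
    neg_total = sum(neg.values()) + pseudocount * len(bins)
    return {b: ((pos[b] + pseudocount) / pos_total)
               / ((neg[b] + pseudocount) / neg_total)
            for b in bins}


def posterior(pair, datasets, ratios, prior=0.5, weights=None):
    """Naive Bayes posterior that `pair` is functionally related in the tissue.

    A weight below 1 shrinks a dataset's contribution; this is an assumed
    simplification, not the regularization used in the paper.
    """
    weights = weights or [1.0] * len(datasets)
    log_odds = log(prior / (1.0 - prior))
    for data, lr, w in zip(datasets, ratios, weights):
        if pair in data:
            log_odds += w * log(lr[data[pair]])
    return 1.0 / (1.0 + exp(-log_odds))
```

In this sketch, a dataset is informative for a tissue when its bins separate the tissue-restricted positive pairs from the negatives, which shows up as likelihood ratios far from 1.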
Figure 2. Tissue-specific networks are more accurate than the global network in reflecting protein functional relationships.
A. 107 tissues were grouped into major body systems according to the anatomical hierarchical structure maintained in GXD. Through three-fold cross-validation, the performance of tissue-specific networks was compared against the global network and the percentage improvement of tissue-specific networks over the global network was plotted. All tissue-specific networks out-performed the global network in this cross-validation analysis. Improvements were consistent across tissues belonging to all major organ systems. Candle-stick plots (minimum, 25%, median, 75% and maximum) represent the distribution of percentage AUC improvement for all tissues in a specific system. B. Example precision-recall curves for tissue-specific networks and the global network, generated using three-fold cross-validation. Across the entire precision-recall space, tissue-specific networks performed better than the global network. Complete precision-recall figures for all networks are included in Dataset S2.
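The quantities summarized in panel A can be sketched as follows. This is a hedged illustration only: the caption does not state which curve the AUC refers to or how the percentage was normalized, so the sketch assumes ROC AUC and improvement relative to the global network's AUC. All function names are hypothetical.

```python
from statistics import quantiles


def auc(scores, labels):
    """ROC AUC: chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def percent_improvement(auc_tissue, auc_global):
    """Improvement of the tissue-specific AUC, as a percentage of the global AUC (assumed)."""
    return 100.0 * (auc_tissue - auc_global) / auc_global


def five_number_summary(values):
    """Minimum, 25%, median, 75% and maximum, as drawn in a candle-stick plot."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)
```

Applying `five_number_summary` to the per-tissue values of `percent_improvement` within one body system gives the five values that one candle-stick would display.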
Figure 3. Top connected genes of Wnt10b in muscle-specific and bone-specific networks.
In A, blue-highlighted genes are directly involved in skeletal muscle development. In B, blue-highlighted genes are involved in bone mineralization or bone structure formation. The enrichment of genes involved in these processes reflects the differential roles of Wnt10b in skeletal muscle and bone.
Figure 4. Tissue-specific networks perform better than the global network in predicting genes related to different phenotypes.
By mapping phenotypes to different tissues according to their terminology and description, we are able to compare the performance of tissue-specific networks and the global network in predicting phenotype-related genes. Candle-stick plots (minimum, 25%, median, 75% and maximum) show the distribution of percentage AUC improvement when predicting phenotype-related genes. A. Phenotypes were grouped according to the number of annotated genes. Tissue-specific functional networks show consistent improvement across different phenotype sizes. B. Phenotypes were grouped according to major organ systems of their corresponding tissue. Improvements were consistent across all major systems. C. Example precision-recall curves for “abnormal osteogenesis” (MP:0000057), “abnormal nervous system electrophysiology” (MP:0002272), “abnormal spleen white pulp morphology” (MP:0002357), and “abnormal CNS glial cell morphology” (MP:0003634) using both tissue-specific networks (shown in red) and global networks (shown in green). For phenotypes such as these, tissue-specific networks are necessary to make accurate predictions.
Figure 5. Prediction and verification of infertility-related genes through male reproductive system-specific networks.
A. Local functional relationship network of the gene Mybl1 in the male reproductive system. The top 18 genes connected to the query set with connection weights higher than 0.634 are displayed. These top functionally related proteins include well characterized male infertility genes such as Dmc1, Ddx4, and Cyct. B. Histological cross-sections of oval seminiferous tubules show that wild type (Mybl1+/+) testis tubules contain many developing germ cells, while mutant (Mybl1repro9/repro9) testis tubules contain many fewer germ cells and more empty space, indicative of infertility.
Figure 6. Top connected genes to Atcay in the cerebellum-specific network reveal likely ataxia candidates.
Edges with weight greater than 0.9 are shown. In the cerebellum network (A), Grm1 and Cacna1a are the top predicted connections to Atcay, with confidences of 0.902 and 0.943, respectively. Both genes are closely connected to Atcay and its top 10 neighbors. In the global network (B), Grm1 and Cacna1a are much more weakly connected to Atcay (0.763 and 0.647, respectively), and are not identified as top connectors to Atcay. In the displayed global network, neither gene is connected to Atcay or to any of its top 10 neighbors.
