PLoS Comput Biol. 2008 Mar 28;4(3):e1000043.

doi: 10.1371/journal.pcbi.1000043.

Prediction of human disease genes by human-mouse conserved coexpression analysis

Ugo Ala¹, Rosario Michael Piro, Elena Grassi, Christian Damasco, Lorenzo Silengo, Martin Oti, Paolo Provero, Ferdinando Di Cunto

PMID: 18369433
PMCID: PMC2268251
DOI: 10.1371/journal.pcbi.1000043

Ugo Ala et al. PLoS Comput Biol. 2008.

Affiliation

¹ Molecular Biotechnology Center, Department of Genetics, Biology and Biochemistry, University of Turin, Turin, Italy.

Abstract

Background: Even in the post-genomic era, identifying candidate genes within loci associated with human genetic diseases is a demanding task, because the critical region may contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share similar expression profiles, high-throughput gene expression data may be an important resource for identifying the best candidates for sequencing. So far, however, gene coexpression has not been used very successfully to prioritize positional candidates.

Methodology/principal findings: We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.

Conclusion: Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, is a strong criterion for predicting disease-relevant relationships among human genes.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Figure 1. Identification of candidate disease genes by means of conserved co-expression clusters.**
A locus with positional candidates (green bar) associated with a disease of unknown molecular basis (yellow) is screened for one or more candidate genes (green sphere) that appear in a conserved coexpression cluster (purple spheres) together with at least two other genes known to be involved in similar phenotypes (yellow spheres), as defined by the MimMiner similarity score.
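The screening rule in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `find_candidates`, the data layout, and the default similarity cutoff of 0.4 (borrowed from the Figure 2 caption, since this caption states no threshold) are all assumptions.

```python
from typing import Dict, FrozenSet, Iterable, Set


def find_candidates(
    locus_genes: Set[str],
    clusters: Iterable[Set[str]],
    known_gene_phenotypes: Dict[str, Set[str]],
    similarity: Dict[FrozenSet[str], float],
    query_phenotype: str,
    min_score: float = 0.4,   # assumed cutoff, not stated in this caption
    min_support: int = 2,     # "at least two other genes" from the caption
) -> Set[str]:
    """Return locus genes that share a coexpression cluster with at least
    `min_support` other genes known to cause a phenotype similar to the
    query phenotype."""
    # Known disease genes whose phenotype is similar enough to the query.
    similar_genes = {
        gene
        for gene, phenotypes in known_gene_phenotypes.items()
        if any(
            similarity.get(frozenset((query_phenotype, p)), 0.0) >= min_score
            for p in phenotypes
        )
    }
    candidates: Set[str] = set()
    for cluster in clusters:
        for gene in cluster & locus_genes:
            # Supporting genes are counted without the candidate itself.
            support = (cluster & similar_genes) - {gene}
            if len(support) >= min_support:
                candidates.add(gene)
    return candidates
```

Here `similarity` maps an unordered pair of phenotype identifiers to a phenotype similarity score, and each cluster is a plain set of gene symbols; both are hypothetical stand-ins for the MimMiner map and the conserved coexpression clusters.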

**Figure 2. Comparison of the Affy and Stanford networks with functional, physical interaction, and disease-related information.**
(A) Prevalence of functionally related coexpression clusters (see Materials and Methods). (B) Number of edges of the conserved coexpression networks (CCNs) joining proteins previously shown to physically interact by different techniques, as recorded in the HPRD database. (C) Number of edges of the indicated networks connecting genes involved in Mendelian phenotypes sharing a MimMiner score of 0.4 or higher. In each case, the results for the actual CCNs are compared to the results averaged on 100 randomized CCNs, with error bars representing the standard deviation of the latter.
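The comparison against randomized networks described in this caption could look roughly like the following Python sketch. The caption does not say how the networks were randomized, so shuffling gene labels while keeping the edge structure is an assumption made here for illustration, as are the function names.

```python
import random
import statistics
from typing import FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]


def count_related_edges(edges: List[Edge], related_pairs: Set[FrozenSet[str]]) -> int:
    """Count network edges whose two genes form a known related pair
    (for example, an interacting pair or a similar-phenotype pair)."""
    return sum(1 for a, b in edges if frozenset((a, b)) in related_pairs)


def randomized_baseline(
    edges: List[Edge],
    related_pairs: Set[FrozenSet[str]],
    n_random: int = 100,  # 100 randomized networks, as in the caption
    seed: int = 0,
) -> Tuple[float, float]:
    """Mean and standard deviation of the related-edge count over
    label-shuffled copies of the network (assumed randomization scheme)."""
    rng = random.Random(seed)
    nodes = sorted({gene for edge in edges for gene in edge})
    counts = []
    for _ in range(n_random):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        permuted = [(relabel[a], relabel[b]) for a, b in edges]
        counts.append(count_related_edges(permuted, related_pairs))
    return statistics.mean(counts), statistics.stdev(counts)
```

The observed count from `count_related_edges` on the real network would then be set against the returned mean, with the standard deviation playing the role of the error bar.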

**Figure 3. Performance of the identification of candidate disease genes as determined by a leave-one-out strategy.**
Artificial loci of different sizes were constructed around known disease genes as explained in the text. (A) Fraction of the artificial loci for which it was possible to identify at least one candidate gene, as a function of the locus size. (B) Average number of candidates in the loci for which at least one candidate was identified, as a function of the locus size. (C) Precision as a function of the locus size. Precision is determined as the ratio between the number of loci whose candidate list contained the starting disease gene and the number of loci with candidates. Filled triangles indicate the results obtained with the Affy network, while empty boxes refer to the Stanford network.
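The three quantities plotted in panels A to C follow directly from the definitions in this caption. The Python sketch below assumes a hypothetical input format, one `(disease_gene, candidate_set)` pair per artificial locus, and is only meant to make the definitions concrete.

```python
from typing import Dict, List, Optional, Set, Tuple


def leave_one_out_metrics(
    results: List[Tuple[str, Set[str]]],
) -> Dict[str, Optional[float]]:
    """Summarize a leave-one-out run.

    `results` holds one (starting disease gene, predicted candidates) pair
    per artificial locus; the candidate set is empty when nothing was found.
    """
    if not results:
        raise ValueError("results must contain at least one locus")
    with_candidates = [(gene, cands) for gene, cands in results if cands]
    # Panel A: fraction of loci with at least one candidate.
    fraction = len(with_candidates) / len(results)
    if not with_candidates:
        return {"fraction": fraction, "mean_candidates": None, "precision": None}
    # Panel B: average candidate count over loci that have candidates.
    mean_candidates = sum(len(c) for _, c in with_candidates) / len(with_candidates)
    # Panel C: loci whose candidates include the starting disease gene,
    # divided by the number of loci with candidates.
    hits = sum(1 for gene, cands in with_candidates if gene in cands)
    precision = hits / len(with_candidates)
    return {
        "fraction": fraction,
        "mean_candidates": mean_candidates,
        "precision": precision,
    }
```

Note that precision, as defined in the caption, is computed only over loci that yielded candidates, so a method can score high precision while covering few loci; panel A is what reports that coverage.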

**Figure 4. Example of identification of candidate disease genes.**
Identification of KCNIP4 (green) in the genomic locus 4p15 (containing 86 genes; orange), associated with partial epilepsy with pericentral spikes (OMIM ID 607221). KCNIP4 was found in a disease-related cluster composed of LOC399947 and its nearest neighbors (red and purple spheres, respectively) that contains 5 other genes known to be involved in similar phenotypes (yellow spheres). A second candidate for this disease was found in another disease-associated cluster (not shown; see Table 2).
