PLoS Comput Biol. 2012;8(12):e1002816.
doi: 10.1371/journal.pcbi.1002816. Epub 2012 Dec 27.

Chapter 2: Data-driven view of disease biology

Casey S Greene et al. PLoS Comput Biol. 2012.

Abstract

Modern experimental strategies often generate genome-scale measurements of human tissues or cell lines in various physiological states. Investigators often use these datasets individually to help elucidate molecular mechanisms of human diseases. Here we discuss approaches that effectively weight and integrate hundreds of heterogeneous datasets into gene-gene networks that focus on a specific process or disease. Diverse and systematic genome-scale measurements give such approaches a great deal of power but also pose a number of challenges. We discuss some of these challenges as well as methods to address them. We also raise important considerations for the assessment and evaluation of such approaches. When carefully applied, these integrative data-driven methods can make novel high-quality predictions that can transform our understanding of the molecular basis of human disease.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Potential distributions of experimental results obtained for datasets collected under three different conditions.
The dotted line indicates the distribution of negative examples and the solid line indicates the distribution of positive examples. In condition A the positive examples more often occur to the right of the negative examples, in condition B both sets overlap, and in condition C the positive examples occur more often to the left of the negative examples.
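
The three conditions in Figure 1 can be illustrated numerically. The following is a minimal sketch, not the authors' method: it scores each condition by the probability that a randomly chosen positive example has a higher measurement than a randomly chosen negative one (the area under the ROC curve). The function name `separation_auc` and the toy measurements are hypothetical and chosen only for illustration.

```python
def separation_auc(positives, negatives):
    """Probability that a random positive scores above a random negative.

    Ties count as half a win. Values near 1.0 correspond to condition A,
    values near 0.5 to condition B, and values near 0.0 to condition C.
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy measurements, one dictionary per condition.
condition_a = {"pos": [0.7, 0.8, 0.9], "neg": [0.1, 0.2, 0.3]}
condition_b = {"pos": [0.4, 0.5, 0.6], "neg": [0.4, 0.5, 0.6]}
condition_c = {"pos": [0.1, 0.2, 0.3], "neg": [0.7, 0.8, 0.9]}

for name, cond in [("A", condition_a), ("B", condition_b), ("C", condition_c)]:
    print(name, separation_auc(cond["pos"], cond["neg"]))
```

A score far from 0.5 in either direction means the dataset is informative in that condition; only the direction of the relationship differs between A and C.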
Figure 2. An example of querying HEFalMp for the role of APOE across all biological processes (http://hefalmp.princeton.edu/).
Figure 3. The result of querying HEFalMp for the role of APOE across all biological processes.
Red links indicate that there is a high probability of a functional relationship between the two genes.
Figure 4. The highest and lowest contributing datasets for the pair of APOE and PLTP are shown (http://hefalmp.princeton.edu/gene/one_specific_gene/18543?argument=21697&context=0).
These contributions are based on how well the bin containing the queried gene pair separated known positive functional relationships from known negative functional relationships.
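
One way to picture a per-bin contribution is as a log-likelihood ratio. The sketch below is an assumption for illustration, not a description of how HEFalMp computes its values: each dataset's measurements are discretized into bins, and a bin's score is the log of how much more often known positive pairs fall into it than known negative pairs. The function name `bin_log_ratios`, the pseudocount, and the toy bin assignments are all hypothetical.

```python
import math
from collections import Counter


def bin_log_ratios(pos_bins, neg_bins, n_bins, pseudocount=1.0):
    """Return log(P(bin | positive) / P(bin | negative)) for each bin.

    pos_bins and neg_bins list the bin index observed for each known
    positive or negative gene pair. The pseudocount avoids division by
    zero when a bin is empty in one of the two classes.
    """
    pos_counts = Counter(pos_bins)
    neg_counts = Counter(neg_bins)
    pos_total = len(pos_bins) + pseudocount * n_bins
    neg_total = len(neg_bins) + pseudocount * n_bins
    ratios = []
    for b in range(n_bins):
        p_pos = (pos_counts[b] + pseudocount) / pos_total
        p_neg = (neg_counts[b] + pseudocount) / neg_total
        ratios.append(math.log(p_pos / p_neg))
    return ratios


# Hypothetical dataset with three bins (0 = low, 2 = high similarity).
pos_bins = [2, 2, 2, 1, 2, 2]  # bins observed for known related pairs
neg_bins = [0, 0, 1, 0, 1, 0]  # bins observed for known unrelated pairs
ratios = bin_log_ratios(pos_bins, neg_bins, n_bins=3)

# A queried pair that falls in bin 2 receives a positive contribution
# from this dataset; a pair in bin 0 receives a negative one.
print(ratios)
```

Under this assumption, a dataset whose bins all score close to zero contributes little to the queried pair, while a dataset with a strongly positive or negative bin contributes a lot.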
Figure 5. The diseases that are significantly connected to APOE through the guilt by association strategy used in HEFalMp.
APOE is annotated to both Alzheimer disease and macular degeneration in OMIM, as indicated by the gold bars to the left of those diseases (http://hefalmp.princeton.edu/gene/diseases?context=0&name=APOE). The other diseases are implicated by APOE's functional relationships to genes annotated to those diseases in OMIM.
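
The guilt by association idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the HEFalMp procedure: it ranks diseases by the mean edge weight between a query gene and the genes annotated to each disease, and it does not perform the significance test that the caption implies. The function name `rank_diseases`, the gene and disease identifiers, and the edge weights are hypothetical placeholders.

```python
def rank_diseases(query_gene, network, disease_genes):
    """Rank diseases by the query gene's mean link weight to their genes.

    network maps frozenset({gene_a, gene_b}) to a functional relationship
    weight in [0, 1]; missing pairs are treated as weight 0.0.
    disease_genes maps a disease name to the set of genes annotated to it.
    """
    scores = {}
    for disease, genes in disease_genes.items():
        # Exclude the query gene so direct annotation does not inflate the score.
        partners = [g for g in genes if g != query_gene]
        if not partners:
            continue
        weights = [network.get(frozenset((query_gene, g)), 0.0) for g in partners]
        scores[disease] = sum(weights) / len(weights)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy network and annotations.
network = {
    frozenset(("G1", "G2")): 0.9,
    frozenset(("G1", "G3")): 0.8,
    frozenset(("G1", "G4")): 0.1,
}
disease_genes = {"disease_x": {"G2", "G3"}, "disease_y": {"G4"}}

ranking = rank_diseases("G1", network, disease_genes)
print(ranking)
```

In this toy example the query gene is strongly linked to both genes annotated to `disease_x`, so that disease ranks first even though the query gene itself carries no annotation to it.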
Figure 6. The genes that are most significantly connected to Alzheimer disease genes using the HEFalMp network and OMIM disease gene annotations (http://hefalmp.princeton.edu/disease/all_genes/55?context=0).
The gold bars to the left of APP and APOE indicate that both genes are annotated to Alzheimer disease in OMIM.
Figure 7. The functional relationship network discovered by a data-driven integration for the YFG gene in YFO.

