Stat Med. 2010 Aug 30;29(19):1998-2011.
doi: 10.1002/sim.3962.

Discovery of complex pathways from observational data

James W Baurley et al. Stat Med.

Abstract

Unraveling complex interactions has been a challenge in epidemiologic research. We introduce a pathway modeling framework that discovers plausible pathways from observational data and allows estimation of both the net effect of the pathway and the types of interactions occurring among genetic or environmental risk factors. Each discovered pathway structure links combinations of observed variables through intermediate latent nodes to a final node, the outcome. Biologic knowledge can be readily applied in this framework as a prior on pathway structure to give preference to more biologically plausible models, thereby providing more precise estimation of Bayes factors for pathways of greatest interest by Markov chain Monte Carlo (MCMC) methods.

Data were simulated for binary inputs, of which only a subset was involved in different pathway topologies. Our algorithm was then used to recover the pathway from the simulated data. The posterior distributions of inputs, pairwise and higher-order interactions, and topologies were obtained by MCMC methods. The evidence in favor of a particular pathway or interaction was summarized using Bayes factors. Our method can correctly identify the risk factors and interactions involved in the simulated pathway. We apply our framework to an asthma case-control data set with polymorphisms in 12 genes.
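
As a rough illustration of how evidence might be summarized from MCMC output, the sketch below estimates the posterior probability that a feature (an input or interaction) appears in the sampled pathways and converts it into a Bayes factor as posterior odds divided by prior odds. The function names, the representation of each sample as a set of feature labels, and the prior probability of 0.5 are assumptions made for this example; the abstract does not specify how the authors computed these quantities.

```python
def posterior_inclusion(samples, feature):
    """Fraction of MCMC samples whose pathway contains `feature`.

    `samples` is assumed to be a list of sets of feature labels,
    one set per retained MCMC iteration (hypothetical representation).
    """
    if not samples:
        raise ValueError("need at least one MCMC sample")
    hits = sum(1 for sample in samples if feature in sample)
    return hits / len(samples)


def bayes_factor(posterior_prob, prior_prob):
    """Bayes factor for a feature versus its absence.

    Computed as posterior odds divided by prior odds. Both
    probabilities must lie strictly between 0 and 1 so the odds
    are finite and nonzero.
    """
    for p in (posterior_prob, prior_prob):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must be strictly between 0 and 1")
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


# Toy MCMC output: 8 sampled pathways, 6 of which include input "Z1".
samples = [frozenset({"Z1", "Z2"})] * 6 + [frozenset({"Z3"})] * 2
p_z1 = posterior_inclusion(samples, "Z1")
bf_z1 = bayes_factor(p_z1, prior_prob=0.5)  # assumed prior inclusion probability
print(p_z1, bf_z1)
```

In practice the prior probability would come from the prior on pathway structure rather than a fixed constant, so a feature favored by biologic knowledge needs more posterior support to reach the same Bayes factor.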

Figures

Figure 1
Graphical representation of the model. The risk factors Z and the outcome Y are observed. The X’s are latent nodes determined by the topology-specific parameters θΛ and the topology Λ.
Figure 2
Top 5 discovered topologies by posterior probability.
Figure 3
The two simulated topologies are built from AND and OR node types: Scenario 1, OR(AND,AND), in the upper panel and AND(OR(AND)) in the lower panel. Each edge is labeled with the corresponding simulated value in brackets and the posterior mean and standard deviation for θΛ and βΛ.
Figure 4
Cluster dendrogram of asthma genes from Gene Ontology.
Figure 5
Discovered asthma pathway involving three genes.
Figure 6
Topology moves. From left to right, a node is removed, deleting the edge to input 3. A new node is then added connecting inputs 1 and 2.
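
The topologies and moves described in the captions above can be sketched in code. The example below represents a topology as nested tuples of AND and OR nodes over binary inputs, evaluates it for one input vector, and applies a remove move followed by an add move. This is a simplified illustration only: it assumes deterministic Boolean gates, whereas the paper's latent nodes depend on the topology-specific parameters θΛ, and the function names, the tuple representation, and the 0-based input indices are all hypothetical.

```python
def evaluate(node, z):
    """Evaluate a topology on a binary input vector `z`.

    A node is either an int (index of an observed input, 0-based)
    or a tuple (kind, child, child, ...) with kind "AND" or "OR".
    """
    if isinstance(node, int):
        return bool(z[node])
    kind, *children = node
    values = [evaluate(child, z) for child in children]
    return all(values) if kind == "AND" else any(values)


def remove_child(node, position):
    """Remove move: drop the child at `position`, deleting its edge."""
    kind, *children = node
    del children[position]
    return (kind, *children)


def add_node(node, kind, positions):
    """Add move: group the children at `positions` under a new node of `kind`."""
    parent_kind, *children = node
    grouped = [children[i] for i in positions]
    rest = [c for i, c in enumerate(children) if i not in positions]
    return (parent_kind, (kind, *grouped), *rest)


# Scenario 1 from the Figure 3 caption, OR(AND, AND), over four inputs.
scenario_1 = ("OR", ("AND", 0, 1), ("AND", 2, 3))
print(evaluate(scenario_1, [1, 1, 0, 0]))  # first AND branch is satisfied
print(evaluate(scenario_1, [1, 0, 0, 1]))  # neither AND branch is satisfied

# Two moves in the spirit of Figure 6 (the starting topology is made up).
start = ("OR", 0, 1, 2)
after_remove = remove_child(start, 2)               # edge to the third input is deleted
after_add = add_node(after_remove, "AND", [0, 1])   # new node joins the first two inputs
print(after_remove, after_add)
```

Returning a new tuple from each move, rather than mutating in place, makes it straightforward to keep the current topology if a proposed move is rejected in an MCMC step.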
