BMC Bioinformatics. 2006 May 8:7:249.
doi: 10.1186/1471-2105-7-249.

Applying dynamic Bayesian networks to perturbed gene expression data

Norbert Dojer et al. BMC Bioinformatics.

Abstract

Background: A central goal of molecular biology is to understand the regulatory mechanisms of gene transcription and protein synthesis. Bayesian networks have a solid basis in statistics, which lets them handle the stochastic aspects of gene expression and noisy measurements in a natural way; this makes them attractive for inferring the structure of gene interactions from microarray data. However, the basic formalism has some disadvantages; for example, it is sometimes hard to distinguish between the origin and the target of an interaction. Two kinds of microarray experiments yield data particularly rich in information about the direction of interactions: time series and perturbation experiments. To handle them correctly, the basic formalism must be modified. For example, dynamic Bayesian networks (DBN) apply to time series microarray data. To our knowledge, the DBN technique has not been applied in the context of perturbation experiments.

Results: We extend the framework of dynamic Bayesian networks to incorporate perturbations. We also propose an exact algorithm for inferring an optimal network and introduce a discretization method specialized for time series data from perturbation experiments. We apply our procedure to realistic simulated data and compare the results with those obtained by standard DBN learning techniques. Finally, we analyze the advantages of using an exact learning algorithm instead of heuristic methods.
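
To make the idea of learning a DBN from perturbed time series concrete, here is a minimal sketch. It is not the authors' algorithm: the data layout, the function names (`transitions`, `bic_score`, `best_parents`), the BIC-style score, the bound on the number of parents and the toy data are all assumptions made for illustration. The sketch relies on two ideas. First, in a DBN the parents of a gene can be chosen independently for each gene, so a small gene set can be searched exhaustively. Second, series in which a gene is itself perturbed are left out when its regulators are scored, because its level is then fixed by the experimenter.

```python
from collections import Counter
from itertools import combinations
from math import log


def transitions(series, child, parents):
    """Collect (parent levels at t, child level at t+1) pairs.

    `series` is a list of (perturbed_genes, slices) tuples; `slices` is a list
    of dicts mapping gene name -> discrete expression level (assumed layout).
    Series in which `child` is perturbed are skipped.
    """
    pairs = []
    for perturbed, slices in series:
        if child in perturbed:
            continue
        for prev, nxt in zip(slices, slices[1:]):
            pairs.append((tuple(prev[p] for p in parents), nxt[child]))
    return pairs


def bic_score(pairs, n_levels, n_parents):
    """Negative log-likelihood plus a BIC-style penalty (lower is better)."""
    if not pairs:
        return 0.0
    joint = Counter(pairs)
    marginal = Counter(config for config, _ in pairs)
    loglik = sum(n * log(n / marginal[config]) for (config, _), n in joint.items())
    n_params = (n_levels - 1) * n_levels ** n_parents
    return -loglik + 0.5 * n_params * log(len(pairs))


def best_parents(series, child, genes, n_levels=2, max_parents=2):
    """Exhaustively score every parent set up to `max_parents` genes."""
    candidates = [g for g in genes if g != child]  # self-loops forbidden
    best = None
    for k in range(max_parents + 1):
        for parents in combinations(candidates, k):
            score = bic_score(transitions(series, child, parents), n_levels, k)
            if best is None or score < best[0]:
                best = (score, parents)
    return best[1]


# Hypothetical toy data: B at time t+1 copies A at time t; C is unrelated.
a = [0, 1, 0, 1, 1, 0, 0, 1]
b = [0, 0, 1, 0, 1, 1, 0, 0]
c = [0, 0, 1, 1, 0, 0, 1, 1]
wild_type = [{"A": x, "B": y, "C": z} for x, y, z in zip(a, b, c)]
# Knockout of B: its level stays at 0 whatever A does.
knockout_b = [{"A": 1, "B": 0, "C": 0} for _ in range(4)]
series = [(set(), wild_type), ({"B"}, knockout_b)]

print(best_parents(series, "B", ["A", "B", "C"]))
```

On this toy data the knockout series contributes nothing to the score of B's parents, but it is still used when scoring the parents of A and C.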

Conclusion: We show that the quality of inferred networks dramatically improves when using data from perturbation experiments. We also conclude that the exact algorithm should be used whenever possible, i.e. when the considered set of genes is small enough.

Figures

Figure 1
The gene regulatory network from [21], its mRNA interactions and the corresponding diagram for the network modified by [17]. (a) The gene regulatory network from [21]. Rectangles denote promoters, zigzags indicate mRNAs and circles stand for proteins, dimers and a ligand (Q). The symbols + and - indicate whether transcription of a gene is activated or inhibited by the relevant transcription factor. The subnetwork involving the genes A and B is a hysteretic oscillator [22]. Protein A activates transcription of both genes. Protein B binds to it, forming a dimer AB, which reduces the amount of free protein A and, consequently, inhibits transcription. Thus oscillations appear. The genes C and D compose a switch [23]: each protein forms a dimer which acts as an inhibitor of transcription of the other gene. Therefore a highly expressed gene switches off expression of the other. The ligand binding mechanism [24] is represented by the subnetwork involving the genes E and F: protein E bound to a ligand Q forms an activator of transcription of the gene F. Finally, there are two cascades: in the first cascade C inhibits G and G inhibits H, while in the second cascade C inhibits K and K activates J. (b) The mRNA interactions in the above network; solid arrows denote transcriptional regulation, dashed ones represent interactions triggered by the ligand and posttranscriptional regulation. (c) The corresponding diagram for the network modified by [17].
Figure 2
Log-ratios of mRNA concentration values of genes A and J for all knockout experiments. Twelve time points at equal intervals between 1100 and 1600 minutes are taken from each experiment (120 slices in total). Ratios lower than 0.001 were set to this value. The horizontal lines indicate the upper and lower limits of the baseline expression level. An extended version of the figure including all 10 genes is available in the supplementary materials [25].
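
The preprocessing described in the Figure 2 caption can be sketched as follows. The caption gives only the floor (0.001), the number of time points (12) and the time window (1100 to 1600 minutes). The base-10 logarithm, the choice of reference concentration, the inclusion of both endpoints of the window and the function names `log_ratio` and `sample_times` are assumptions made for illustration.

```python
from math import log10

FLOOR = 0.001  # ratios below this value are set to it (from the caption)


def log_ratio(value, reference, floor=FLOOR):
    """Log-ratio of a concentration against a reference, clipped at `floor`.

    Base 10 and the meaning of `reference` are assumptions.
    """
    ratio = value / reference
    return log10(max(ratio, floor))


def sample_times(start=1100.0, stop=1600.0, n=12):
    """Return n equally spaced time points, both endpoints included (assumed)."""
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


times = sample_times()
print(len(times), times[0])
print(log_ratio(0.0, 2.0))  # a fully silenced gene is clipped at the floor
```

Clipping the ratio before taking the logarithm keeps a knocked-out gene, whose concentration may reach zero, at a finite value.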
Figure 3
The interactions inferred from unperturbed data (12 slices; the network restricted to 9 genes): (a) by our exact algorithm with self-loops forbidden (showing the edges that occur in every network with the optimal score) and (b) by the Markov chain Monte Carlo method [17]. Black arrows show correctly inferred edges (solid arrows denote transcriptional regulation; dashed arrows denote interactions triggered by the ligand and posttranscriptional regulation); grey dashed arrows represent spurious edges.
Figure 4
The interactions inferred from perturbed data: (a, b) 10 knockout series, each with 12 slices; (c, d) 10 knockout series, each with 3 slices; (e, f) 10 knockout and 10 overexpression series, each with 3 slices. Black arrows show correctly inferred edges (solid arrows denote transcriptional regulation; dashed arrows denote interactions triggered by the ligand and posttranscriptional regulation); grey dashed arrows represent spurious edges. The networks (a, c, e) are obtained by our exact algorithm with self-loops forbidden (if several networks share the optimal score, only the edges occurring in every one of them are shown) and the networks (b, d, f) by the MCMC method [17].

References

    1. de Jong H. Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol. 2002;9:67–103. doi: 10.1089/10665270252833208.
    2. Friedman N. Inferring cellular networks using probabilistic graphical models. Science. 2004;303:799–805. doi: 10.1126/science.1094068.
    3. Akutsu T, Kuhara S, Maruyama O, Miyano S. A System for Identifying Genetic Networks from Gene Expression Patterns Produced by Gene Disruptions and Overexpressions. Genome Inform Ser Workshop Genome Inform. 1998;9:151–160.
    4. Moriyama T, Shinohara A, Takeda M, Maruyama O, Goto T, Miyano S, Kuhara S. A System to Find Genetic Networks Using Weighted Network Model. Genome Inform Ser Workshop Genome Inform. 1999;10:186–195.
    5. Gardner TS, di Bernardo D, Lorenz D, Collins JJ. Inferring genetic networks and identifying compound mode of action via expression profiling. Science. 2003;301:102–105. doi: 10.1126/science.1081900.