Gene Regul Syst Bio. 2011:5:75-88.
doi: 10.4137/GRSB.S7569. Epub 2011 Nov 10.

Application of structure equation modeling for inferring a serial transcriptional regulation in yeast

Sachiyo Aburatani. Gene Regul Syst Bio. 2011.

Abstract

Revealing the gene regulatory systems among DNA and proteins in living cells is one of the central aims of systems biology. In this study, I used Structural Equation Modeling (SEM) in combination with stepwise factor analysis to infer the protein-DNA interactions for gene expression control from only gene expression profiles, in the absence of protein information. I applied my approach to infer the causalities within the well-studied serial transcriptional regulation composed of GAL-related genes in yeast. This allowed me to reveal the hierarchy of serial transcriptional regulation, including previously unclear protein-DNA interactions. The validity of the constructed model was demonstrated by comparing the results with previous reports describing the regulation of the transcription factors. Furthermore, the model revealed combinatory regulation by Gal4p and Gal80p. In this study, the target genes were divided into three types: those regulated by one factor and those controlled by a combination of two factors.

Keywords: Structural equation modeling; expression profile; gene regulatory network; transcriptional regulation.

Figures

Figure 1
Modified four-step procedure. Step 1, construction of unrestricted models for each stage of the serial transcription; Step 2, construction of measurement models for each stage; Step 3, construction of structural equation models for each stage; Step 4, stepwise modeling to connect the different structural equation models.
Figure 2
Possible relationships between variables. The relationships between one observed variable (gene) and two estimated latent variables (F1 and F2). Based on empirical studies, two possible causal relationship directions exist: the observed variable to the latent variables, and one latent variable to the other latent variable. (A), (B) Possibilities for one causal relationship between the observed variable and one latent variable. (C), (D), (E) Possibilities for two causal relationships between the observed variable and the latent variables. (F), (G) Possibilities for three causal relationships between the observed variable and the latent variables.
Figure 3
Inferred network model of the GAL regulatory system. (A) Estimated main structure of the transcriptional regulation. Arrows show causal relationships between genes (rectangles) and transcription factors (circles). Error terms are indicated by ɛ. Relationships between errors are considered to represent other regulatory systems in the cell. For simplicity, these relationships are not shown. (B) Goodness-of-fit scores. The calculations for these scores included relationships between errors. Four criteria were used: GFI > 0.95, AGFI > 0.95, CFI > 0.90 and RMSEA < 0.05. Note: All four scores indicate that the model fit the measured data well.
Figure 4
Biological interpretation of the factors. (A) Key structures of the inferred network. Genes (rectangles) with two names show the ORF name (upper) and the encoded protein name (lower). Circles indicate a transcription factor or a complex of factors. Mig1p and Gal4p form complexes with other proteins. (B) Beginning of the serial transcriptional regulation initiated by Mig1p. Numbers indicate the regression weights of the relationships between variables. The edge from YGL035C to F1 is positive, indicating that expression of YGL035C promotes translation of the protein it encodes. In contrast, the edge from F1 to F2 is negative, indicating an inhibitory relationship. (C) Gal4p and Gal80p. The best-fitting position for the latent variable at the third stage coincided with that of another latent variable; in other words, Gal80p was inferred to form a complex with Gal4p.
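
The four goodness-of-fit criteria listed in the Figure 3 caption (GFI > 0.95, AGFI > 0.95, CFI > 0.90 and RMSEA < 0.05) can be checked mechanically once a model has been fitted. The following is a minimal sketch only: the function names `check_fit` and `model_fits` are hypothetical, and the example scores are made-up illustrative values, not results from the paper.

```python
# Thresholds as stated in the Figure 3 caption.
# ">" means the score must exceed the threshold, "<" means it must fall below it.
FIT_CRITERIA = {
    "GFI": (">", 0.95),
    "AGFI": (">", 0.95),
    "CFI": (">", 0.90),
    "RMSEA": ("<", 0.05),
}


def check_fit(scores):
    """Return a dict mapping each criterion name to True/False (hypothetical helper)."""
    results = {}
    for name, (direction, threshold) in FIT_CRITERIA.items():
        value = scores[name]
        if direction == ">":
            results[name] = value > threshold
        else:
            results[name] = value < threshold
    return results


def model_fits(scores):
    """A model is accepted here only if all four criteria are satisfied."""
    return all(check_fit(scores).values())


# Illustrative values only (assumed, not taken from the paper).
example_scores = {"GFI": 0.97, "AGFI": 0.96, "CFI": 0.93, "RMSEA": 0.03}

if __name__ == "__main__":
    for criterion, passed in check_fit(example_scores).items():
        print(criterion, "pass" if passed else "fail")
    print("model accepted:", model_fits(example_scores))
```

Requiring all four criteria at once mirrors the caption's statement that all four scores indicated a good fit; a looser rule (for example, accepting three of four) would be a different design choice that the source does not describe.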
