PLoS One. 2009;4(4):e5098.
doi: 10.1371/journal.pone.0005098. Epub 2009 Apr 6.

Listen to genes: dealing with microarray data in the frequency domain

Jianfeng Feng et al. PLoS One. 2009.

Abstract

Background: We present a novel and systematic approach to analyze temporal microarray data. The approach includes normalization, clustering and network analysis of genes.

Methodology: Genes are normalized using an error-model-based uniform normalization method that identifies and estimates the sources of variation. The model minimizes the correlation among error terms across replicates. The normalized gene expressions are then clustered according to their power spectral density. The method of complex Granger causality is introduced to reveal interactions between sets of genes. Complex Granger causality, along with partial Granger causality, is applied in both the time and frequency domains, to selected genes as well as to all genes, to reveal networks of interactions. The approach is successfully applied to Arabidopsis leaf microarray data generated from 31,000 genes observed at 22 time points over 22 days. Three circuits are analyzed in detail: a circadian gene circuit, an ethylene circuit, and a new global circuit whose hierarchical structure is used to determine the initiators of leaf senescence.
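As a rough illustration of the causality measure that the partial and complex variants build on, the following is a minimal sketch of plain pairwise Granger causality in the time domain. It is not the authors' partial or complex formulation. The function names (`lagged_design`, `granger_causality`), the model order and the toy series are all assumptions made for this example.

```python
import numpy as np

def lagged_design(series_list, order):
    """Design matrix with an intercept and lags 1..order of each series."""
    n = len(series_list[0])
    cols = [np.ones(n - order)]
    for s in series_list:
        for lag in range(1, order + 1):
            cols.append(s[order - lag:n - lag])
    return np.column_stack(cols)

def granger_causality(x, y, order=1):
    """Pairwise Granger causality from y to x.

    Returns ln(var_restricted / var_unrestricted), where the restricted
    model predicts x from its own past and the unrestricted model also
    uses the past of y. Values near 0 suggest no influence.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = x[order:]

    def residual_variance(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.var(target - design @ coef)

    restricted = residual_variance(lagged_design([x], order))
    unrestricted = residual_variance(lagged_design([x, y], order))
    return np.log(restricted / unrestricted)

# Toy model (hypothetical): y drives x with a one-step delay.
rng = np.random.default_rng(0)
n = 500
y = np.zeros(n)
x = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal()
    x[t] = 0.8 * y[t - 1] + 0.1 * rng.standard_normal()

f_yx = granger_causality(x, y)  # influence of y on x
f_xy = granger_causality(y, x)  # influence of x on y
print(f"F(y->x) = {f_yx:.3f}, F(x->y) = {f_xy:.3f}")
```

In this toy model the measure should be large in the y to x direction and close to zero in the reverse direction. Partial Granger causality additionally conditions on other observed series, which this sketch does not attempt.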

Conclusions: We use a totally data-driven approach to form biological hypotheses. Clustering using power-spectrum analysis helps us identify genes of potential interest. Their dynamics can be captured accurately in the time and frequency domains using the methods of complex and partial Granger causality. With the rise in availability of temporal microarray data, such methods can be useful tools for uncovering hidden biological interactions. We present our method step by step with the help of toy models as well as a real biological dataset. We also analyse three distinct gene circuits of potential interest to Arabidopsis researchers.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Synthesized data.
A. Gene intensity vs. time. B. The magnitude of discrete Fourier transform of the data in A. The DC term is not shown. C. M0 (DC term), M1 (corresponding to the first column in B) and M11 (the 11th column in B). A clear structure of two clusters is shown. D. The histogram of the magnitude of M11.
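The quantities M0, M1 and M11 in this caption can be illustrated with a short sketch that takes the magnitude of the discrete Fourier transform of each gene's time trace. The data here are synthetic and invented for the example, and the helper name `dft_magnitudes` is an assumption. With 22 samples the real-input transform has 12 bins (k = 0 to 11): bin 0 is the DC term, bin 1 is one cycle over the whole record, and bin 11 alternates sign at every sample.

```python
import numpy as np

def dft_magnitudes(expr):
    """Magnitude of the real-input DFT along time (rows = genes)."""
    expr = np.asarray(expr, dtype=float)
    return np.abs(np.fft.rfft(expr, axis=1))

rng = np.random.default_rng(0)
n_points = 22
t = np.arange(n_points)

slow = np.sin(2 * np.pi * t / n_points)  # one cycle over the record -> bin 1
fast = np.cos(np.pi * t)                 # sign flips each sample   -> bin 11

# Two hypothetical genes with the same mean level but different rhythms.
genes = np.vstack([
    5.0 + 2.0 * slow + 0.1 * rng.standard_normal(n_points),
    5.0 + 2.0 * fast + 0.1 * rng.standard_normal(n_points),
])

mags = dft_magnitudes(genes)
m0, m1, m11 = mags[:, 0], mags[:, 1], mags[:, 11]

# Dominant non-DC bin per gene, a crude stand-in for spectral clustering.
dominant = np.argmax(mags[:, 1:], axis=1) + 1
print(dominant)
```

Grouping genes by their dominant bin is only a simple proxy for clustering on the power spectrum; the caption's two-cluster structure is read off the joint distribution of M0, M1 and M11 rather than a single argmax.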
Figure 2. Correlation matrix before and after uniform normalization.
The block at x = 1, 2, ···, 16 is the correlation matrix before applying the uniform normalization (see Text S1). The block at x = 21, 22, ···, 36 is the correlation matrix after applying the uniform normalization (see Text S1). The diagonal elements of both matrices are set to 0.
Figure 3. I. Microarray data of 31,000 genes.
IA. Gene intensity vs. time. Only 200 genes are shown. IB. Magnitude of all genes vs. frequency. There are clearly two main frequencies in the data: one with a one-day period (M11, the 11th column) and one with a 22-day period (M1, the first column). The DC term M0 is not shown. IC. Two-dimensional plot of M11 vs. M1. ID. The histogram of the DC term, which has two peaks. IE. The histogram of M1, which follows a Weibull distribution. IF. The histogram of M11, which follows an exponential distribution. II. Time traces of the top ten genes with a 22-day period, with a one-day period, and with flat expression. IIA. Time traces of the top (in red and black) and bottom (in blue) ten genes ranked by amplitude at the 22-day period. There are two classes: one is up-regulated (red thick lines), the other is down-regulated (black thick lines). IIB. Time traces of the top (in red and black) and bottom (in blue) ten genes ranked by amplitude at the one-day period. There are two classes: one is on-phase (red thick lines), the other is off-phase (black thick lines). IIC. Time traces of the top (in red) and bottom (in blue) ten genes without rhythms. IID, IIE, IIF. The power corresponding to IIA, IIB and IIC, respectively.
Figure 4. One gene circuit controlling circadian activity.
A. Time traces of four genes: ELF4, TOC1, LFY and CCA1. ELF4 and TOC1 are in-phase oscillators, as are LFY and CCA1, but the second pair is off-phase with respect to ELF4 and TOC1. B. Magnitude vs. frequency for the four genes. They have the highest magnitude at the frequency of the one-day period. C. The gene circuit obtained in terms of PGC (see annotation in Text S2). D. Complex interactions between different groups of genes and GI. E. Gene interactions in the frequency domain. The y-axis represents the strength of causal interactions.
Figure 5. A circuit of the ethylene pathway.
A. An ethylene gene circuit with around 16 genes. Only genes with interactions are shown here. The thick arrow is the complex interaction between {CTR1, ETR1 and ERS2} and EIN2. B. Interactions in the frequency domain calculated in terms of PGC. Only 14 significant interactions are shown.
Figure 6. Causal relationship between genes: a global circuit.
A. A total of 11 genes are shown and a clear hierarchical structure is demonstrated. B. The interactions in the frequency domain.
