BMC Bioinformatics. 2002 Jun 21:3:17.
doi: 10.1186/1471-2105-3-17.

The limit fold change model: a practical approach for selecting differentially expressed genes from microarray data

David M Mutch et al. BMC Bioinformatics.

Abstract

Background: The biomedical community is developing new methods of data analysis to process more efficiently the massive data sets produced by microarray experiments. Systematic and global mathematical approaches that can be readily applied to a large number of experimental designs are fundamental to correctly handling these otherwise overwhelming data sets.

Results: The gene selection model presented herein is based on three observations: (1) the variance of gene expression is a function of absolute expression; (2) this relationship can be modeled in order to set an appropriate lower fold change limit of significance; and (3) the resulting function can be used to select differentially expressed genes. The model first evaluates fold change (FC) across the entire range of absolute expression levels for any number of experimental conditions. Genes are systematically binned, and the genes within the top X% of highest FCs in each bin are evaluated both with and without the use of replicates. A function is fitted through the top X% of each bin, thereby defining a limit fold change. All genes selected by the 5% FC model lie above measurement variability at a within-treatment standard deviation (SDwithin) confidence level of 99.9%. Real-time PCR (RT-PCR) analysis demonstrated 85.7% concordance with microarray data selected by the limit function.
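The binning and fitting steps described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that fold change is the ratio of highest to lowest expression across conditions, that genes are sorted by minimum expression and grouped into fixed-size bins, that each bin is represented by its median minimum expression, and that the limit has the form a + b / (minimum expression). The function names, the default bin size, and the use of ordinary least squares are all assumptions made for the example.

```python
import numpy as np


def fit_limit_fold_change(min_expr, fold_change, bin_size=200, top_percent=5.0):
    """Fit limit = a + b / min_expr through the per-bin top-X% fold changes.

    min_expr: minimum absolute expression of each gene across conditions.
    fold_change: fold change of each gene (assumed here to be max / min).
    Returns the fitted coefficients (a, b).
    """
    min_expr = np.asarray(min_expr, dtype=float)
    fold_change = np.asarray(fold_change, dtype=float)

    # Sort genes by minimum expression so that each bin covers a narrow range.
    order = np.argsort(min_expr)
    x_sorted, fc_sorted = min_expr[order], fold_change[order]

    bin_x, bin_limit = [], []
    # Only full bins are used; a trailing partial bin is dropped.
    for start in range(0, len(x_sorted) - bin_size + 1, bin_size):
        window = slice(start, start + bin_size)
        bin_x.append(np.median(x_sorted[window]))
        # For top_percent = 5 this is the 95th percentile of the bin.
        bin_limit.append(np.percentile(fc_sorted[window], 100.0 - top_percent))

    if len(bin_x) < 2:
        raise ValueError("need at least two full bins to fit two coefficients")

    # The model is linear in 1 / min_expr, so ordinary least squares suffices.
    design = np.column_stack([np.ones(len(bin_x)), 1.0 / np.asarray(bin_x)])
    (a, b), *_ = np.linalg.lstsq(design, np.asarray(bin_limit), rcond=None)
    return float(a), float(b)


def select_genes(min_expr, fold_change, a, b):
    """Return a boolean mask of genes whose fold change exceeds the fitted limit."""
    min_expr = np.asarray(min_expr, dtype=float)
    fold_change = np.asarray(fold_change, dtype=float)
    return fold_change > a + b / min_expr
```

Changing top_percent makes the selection more or less stringent, and the fitted coefficients can then be compared across experiments.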

Conclusion: The FC model can confidently select differentially expressed genes, as corroborated by variance data and RT-PCR. The simplicity of the overall process permits selecting model limits that best describe experimental data by extracting information on gene expression patterns across the range of expression levels. Genes selected by this process can be compared consistently between experiments, enabling the user to extract information globally with a high degree of confidence.

Figures

Figure 1
The relationship between absolute value, limit fold change (LFC), and variance across the absolute expression range. A) The x-axis threshold indicates those genes that have a minimum ADI of 20. Genes in bins of 200 are examined for the top 5% highest fold changes (red horizontal lines indicate the 95th percentile for each bin). The line of best fit, drawn through each bin in blue, identifies the overall LFC cut-off and is described by the simple equation 5% LFC = 1.74 + 91.55/min ADI. B) Identifying the top 1% (black line) or 10% (red line) highest fold changes in each bin shifts the LFC curve, when compared to the 5% LFC model (blue line), and alters the severity for the selection of differentially expressed genes (1% LFC = 2.43 + 166.12/min ADI; 10% LFC = 1.59 + 69.47/min ADI). C) The upper 99.9% confidence limit (CL) of a robust estimation of the coefficient of variation (CV) for replicates (within-treatment variability) has been modeled as a function of absolute minimum expression of all treatments, as indicated by the blue line. Overlaying the 99.9% CL on the data selected by the 5% LFC model (red dots) ensures high confidence in the selected genes.
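As a worked illustration of the three limit equations given in this caption, the following Python sketch evaluates the 1%, 5%, and 10% LFC cut-offs and tests a single gene against them. The coefficients and the minimum ADI threshold of 20 are taken from the caption. The function names, and the assumption that a gene's fold change is its highest ADI divided by its lowest ADI across conditions, are introduced here for the example only.

```python
# Coefficients (a, b) of LFC = a + b / min ADI, as quoted in the Figure 1 caption.
LFC_MODELS = {1: (2.43, 166.12), 5: (1.74, 91.55), 10: (1.59, 69.47)}

# Genes whose minimum ADI falls below this threshold are not considered.
MIN_ADI_THRESHOLD = 20.0


def limit_fold_change(min_adi, percent=5):
    """Return the limit fold change at a given minimum ADI for the chosen model."""
    a, b = LFC_MODELS[percent]
    return a + b / min_adi


def is_selected(min_adi, max_adi, percent=5):
    """Check whether a gene's fold change lies above the chosen LFC curve.

    Assumes fold change = max_adi / min_adi (an assumption of this sketch).
    """
    if min_adi < MIN_ADI_THRESHOLD:
        return False
    return max_adi / min_adi > limit_fold_change(min_adi, percent)
```

For a gene with a minimum ADI of 100, the 5% model gives a limit of 1.74 + 91.55/100 = 2.6555, so a threefold change passes the 5% model but not the stricter 1% model, whose limit at the same expression level is 4.0912.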
Figure 2
Schematic representation of the cyclical nature of the limit fold change (LFC) model. Selecting an initial X% LFC model (1) provides a starting point for the identification of those genes differentially regulated. Genes can then be ranked (2) by a calculation combining fold change and absolute expression in order to assign a degree of importance. Validation of the chosen LFC model by a complementary technique such as RT-PCR (3) and/or the characterization of variance (4) enables the analyst to reexamine the initial LFC model and determine the confidence level for the results. Depending on the data set, one could redefine the LFC model and repeat the cycle.
