PLoS One. 2015 Mar 4;10(3):e0118198.
doi: 10.1371/journal.pone.0118198. eCollection 2015.

A regression-based differential expression detection algorithm for microarray studies with ultra-low sample size


Daniel Vasiliu et al. PLoS One. 2015.

Abstract

Global gene expression analysis using microarrays and, more recently, RNA-seq has allowed investigators to understand biological processes at a system level. However, the identification of differentially expressed genes in experiments with small sample size, high dimensionality, and high variance remains challenging, limiting the usability of the tens of thousands of publicly available, and possibly many more unpublished, gene expression datasets. We propose a novel variable selection algorithm for ultra-low-n microarray studies using generalized linear model-based variable selection with a penalized binomial regression algorithm called penalized Euclidean distance (PED). Our method uses PED to build a classifier on the experimental data and rank genes by importance. In place of cross-validation, which is required by most similar methods but is not reliable for experiments with small sample size, we use a simulation-based approach to additively build a list of differentially expressed genes from the rank-ordered list. Our simulation-based approach maintains a low false discovery rate while maximizing the number of differentially expressed genes identified, a feature critical for downstream pathway analysis. We apply our method to microarray data from an experiment perturbing the Notch signaling pathway in Xenopus laevis embryos. This dataset was chosen because it showed very little differential expression according to limma, a powerful and widely used method for microarray analysis. Our method was able to detect a significant number of differentially expressed genes in this dataset and to suggest future directions for investigation. Our method is easily adaptable for analysis of data from RNA-seq and other global expression experiments with low sample size and high dimensionality.
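The ranking step described above (fit a penalized binomial regression that separates the two conditions, then order genes by how much they contribute to the classifier) can be illustrated with a minimal sketch. This is not the PED algorithm: the penalty used here is an ordinary ridge penalty swapped in for illustration, and the function name `rank_genes`, its parameters, and the synthetic 3-versus-3 dataset are all assumptions that do not come from the paper. The simulation-based step that builds the final gene list is not covered.

```python
import math
import random


def rank_genes(samples, labels, penalty=1.0, steps=500, lr=0.02):
    """Rank genes by |coefficient| of a ridge-penalized logistic regression.

    NOTE: the ridge penalty is a stand-in chosen for illustration; it is
    not the PED penalty named in the abstract.
    """
    n_samples, n_genes = len(samples), len(samples[0])
    # Center each gene so the intercept absorbs baseline expression.
    means = [sum(row[j] for row in samples) / n_samples for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in samples]

    weights, intercept = [0.0] * n_genes, 0.0
    for _ in range(steps):
        # Residuals y_i - p_i under the current model.
        residuals = []
        for row, y in zip(centered, labels):
            z = intercept + sum(w * x for w, x in zip(weights, row))
            residuals.append(y - 1.0 / (1.0 + math.exp(-z)))
        # Gradient ascent on the penalized log-likelihood.
        for j in range(n_genes):
            grad = sum(r * row[j] for r, row in zip(residuals, centered))
            weights[j] += lr * (grad - penalty * weights[j])
        intercept += lr * sum(residuals)
    # Most important gene first.
    return sorted(range(n_genes), key=lambda j: abs(weights[j]), reverse=True)


# Hypothetical ultra-low-n experiment: 3 controls vs 3 treated, 50 genes,
# with genes 0, 1, and 2 shifted upward in the treated group.
rng = random.Random(0)
n_genes, planted = 50, [0, 1, 2]
labels = [0, 0, 0, 1, 1, 1]
samples = []
for y in labels:
    row = [rng.gauss(0.0, 0.5) for _ in range(n_genes)]
    for j in planted:
        row[j] += 4.0 * y
    samples.append(row)

ranking = rank_genes(samples, labels)
print("top-ranked genes:", sorted(ranking[:3]))
```

With only six samples and fifty genes, the classifier itself is heavily underdetermined; the sketch uses it only to produce an ordering, which mirrors the role the abstract assigns to the classifier.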


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. A schematic overview of gene selection by PED.

