Comparative Study
Genome Res. 2000 Dec;10(12):2055-61.
doi: 10.1101/gr.gr-1325rr.

The comparison of gene expression from multiple cDNA libraries

D J Stekel et al. Genome Res. 2000 Dec.

Abstract

We describe a method for comparing the abundance of gene transcripts in cDNA libraries. This method allows for the comparison of gene expression in any number of libraries, in a single statistical analysis, to identify differentially expressed genes. Such genes may be of potential biological or pharmaceutical relevance. The formula that we derive is essentially the entropy of a partitioning of genes among cDNA libraries. This work goes beyond previously published analyses, which can either compare only two libraries, or identify a single outlier in a group of libraries. This work also addresses the problem of false positives associated with repeating the test on many thousands of genes. A randomization procedure is described that provides a quantitative measure of the degree of belief in the results; the results are further verified by considering a theoretically derived large deviations rate for the test statistic. As an example, the analysis is applied to four prostate cancer libraries from the Cancer Genome Anatomy Project. The analysis identifies biologically relevant genes that are differentially expressed in the different tumor cell types.
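The abstract describes the statistic only in words, so the following is a minimal sketch rather than the authors' exact formula. It assumes the "entropy of a partitioning" takes the usual log-likelihood-ratio form, R = Σ x_i · log(x_i / (N_i · f)), where x_i is a gene's tag count in library i, N_i is the total tag count of library i, and f is the gene's pooled frequency across all libraries. The function names `r_statistic` and `randomized_r` are hypothetical, and the randomization shown (redistributing a gene's tags among libraries in proportion to library size) is one plausible reading of the procedure mentioned above, not a confirmed reproduction of it.

```python
import math
import random


def r_statistic(counts, library_sizes):
    """Assumed log-likelihood-ratio statistic for one gene across libraries.

    counts[i] is the gene's tag count in library i; library_sizes[i] is the
    total number of tags sequenced from library i.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    # Pooled frequency of the gene under the null hypothesis of equal expression.
    freq = total / sum(library_sizes)
    r = 0.0
    for x, n in zip(counts, library_sizes):
        if x > 0:  # a zero count contributes nothing (x * log x -> 0)
            r += x * math.log(x / (n * freq))
    return r


def randomized_r(total, library_sizes, rng):
    """Draw one null-hypothesis value of R for a gene with `total` tags.

    Each tag is assigned to a library with probability proportional to the
    library's size, which is an assumed null model for illustration.
    """
    counts = [0] * len(library_sizes)
    libraries = range(len(library_sizes))
    for lib in rng.choices(libraries, weights=library_sizes, k=total):
        counts[lib] += 1
    return r_statistic(counts, library_sizes)


if __name__ == "__main__":
    sizes = [1000, 1000, 2000, 4000]   # hypothetical library sizes
    observed = [12, 1, 3, 4]           # hypothetical counts for one gene
    rng = random.Random(0)
    r_obs = r_statistic(observed, sizes)
    null = [randomized_r(sum(observed), sizes, rng) for _ in range(1000)]
    exceed = sum(r >= r_obs for r in null) / len(null)
    print(f"R = {r_obs:.2f}, fraction of randomized R >= observed: {exceed:.3f}")
```

Under this form, R is zero when a gene's counts are exactly proportional to library sizes and grows as the counts become more unevenly partitioned. Comparing the observed R with values from randomized counts gives a rough sense of how often such a value arises by chance, which matters when the test is repeated across many thousands of genes.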

Figures

Figure 1
The number of genes for a given value of the test statistic R is plotted as a function of R. It can be seen that the data falls into two regions. For 1 ⩽ R ⩽ 9, the number of genes decreases exponentially with R. The solid line is the regression in this interval. The slope is −0.9 with standard error 0.07, and is therefore not significantly different from −1 at 5% significance. This is in accordance with the large deviations calculation described in the Methods section. When R > 9, the number of genes is above this exponential curve, and is much greater than predicted by the large deviations calculation.
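The claim that a slope of −0.9 with standard error 0.07 is not significantly different from −1 can be checked with a short calculation. The sketch below uses only the two numbers quoted in the caption. The critical value of 1.96 is a large-sample normal approximation and an assumption here, since the caption does not state the test that was used; a t distribution with few degrees of freedom would give a larger critical value and the same conclusion.

```python
def slope_test_statistic(slope, std_error, null_slope):
    """Standardized distance between a fitted slope and a hypothesized slope."""
    return (slope - null_slope) / std_error


# Values quoted in the Figure 1 caption.
t = slope_test_statistic(slope=-0.9, std_error=0.07, null_slope=-1.0)

# Assumed two-sided 5% critical value under a normal approximation.
CRITICAL_VALUE = 1.96
differs_from_minus_one = abs(t) > CRITICAL_VALUE

print(f"t = {t:.2f}, significantly different from -1: {differs_from_minus_one}")
```

The statistic comes out at about 1.43, below the assumed threshold, which agrees with the caption's statement.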
