BMC Genomics. 2007 Jun 22;8:188.
doi: 10.1186/1471-2164-8-188.

New data on robustness of gene expression signatures in leukemia: comparison of three distinct total RNA preparation procedures

Marta Campo Dell'Orto et al. BMC Genomics. 2007.

Abstract

Background: Microarray gene expression (MAGE) signatures allow insights into the transcriptional processes of leukemias and may evolve into a molecular diagnostic test. Introducing MAGE into the clinical practice of leukemia diagnosis will require a comprehensive assessment of the variation attributable to methodology. Here we systematically assessed the impact of three different total RNA isolation procedures on variation in expression data: method A, lysis of mononuclear cells followed by lysate homogenization and RNA extraction; method B, organic solvent based RNA isolation; and method C, organic solvent based RNA isolation followed by purification.

Results: We analyzed 27 pediatric acute leukemias representing nine distinct subtypes and show that method A yielded better RNA quality, was associated with more differentially expressed genes between leukemia subtypes, showed the lowest degree of variation between experiments, was more reproducible, and was characterized by higher precision in technical replicates. Unsupervised and supervised analyses grouped leukemias according to lineage and clinical features with all three methods, underlining the robustness of MAGE in identifying leukemia-specific signatures.

Conclusion: The signatures in the different subtypes of leukemias, regardless of the different extraction methods used, account for the biggest source of variation in the data. Lysis of mononuclear cells, followed by lysate homogenization and RNA extraction represents the optimum method for robust gene expression data and is thus recommended for obtaining robust classification results in microarray studies in acute leukemias.

Figures

Figure 1
Study concept. (A) Total RNA of each of the first 24 samples was extracted following three different total RNA purification methods A, B, and C. Method A: lysis of the mononuclear cells, followed by lysate homogenization (to reduce viscosity caused by high-molecular-weight cellular components and cell debris) using a biopolymer shredding system in a microcentrifuge spin-column format (QIAshredder, Qiagen), followed by total RNA purification (RNeasy Mini Kit, Qiagen). Method B: TRIzol RNA isolation (Invitrogen). Method C: TRIzol RNA isolation (Invitrogen) followed by an RNeasy purification step (RNeasy Mini Kit, Qiagen). The RNA purification step combines the selective binding properties of a silica-based membrane with the speed of microspin technology. It allows only RNA longer than 200 bases to bind to the silica membrane, which enriches for mRNA because RNA molecules shorter than 200 nucleotides are selectively excluded. (B) For each of three additional samples, nine aliquots of mononuclear cells were collected. Total RNA was processed for each aliquot following one of the three methods, and for each method three independent technical replicates were performed (A,A,A, B,B,B, C,C,C).
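The design in this caption can be enumerated as a quick sanity check on the experiment counts quoted in the later captions (33 experiments per method, 99 in total). The sketch below is illustrative only: the tuple layout and the replicate numbering are assumptions made for the example, not part of the study's data files.

```python
from itertools import product

METHODS = ["A", "B", "C"]

# Patients #1-#24: one extraction per method (replicate index fixed at 1).
single_runs = [(patient, method, 1)
               for patient, method in product(range(1, 25), METHODS)]

# Patients #25-#27: three technical replicates per method.
replicate_runs = [(patient, method, replicate)
                  for patient, method, replicate
                  in product(range(25, 28), METHODS, range(1, 4))]

experiments = single_runs + replicate_runs

# Number of microarray experiments contributed by each method.
per_method = {m: sum(1 for _, method, _ in experiments if method == m)
              for m in METHODS}

print(len(experiments), per_method)
```

Each method contributes 24 single extractions plus 9 replicate extractions, which is where the per-method count of 33 comes from.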
Figure 2
Box plots of quality measurements. The box plots show various quality metrics used to judge the overall performance of the microarray experiments. Each method represents 33 individual microarray experiments (Count). For each of the methods, median values (blue arrow), mean values (black arrow), standard deviation (StdDev) and interquartile range (IQR) are given. The overall P value was calculated for each of the parameters using one-way ANOVA. (A) Total cRNA yield after in vitro transcription (A<C<B; P = 5.308e-12). (B) %P called transcripts (A<B, A<C, B~C; P = 0.020). (C) Scaling factor (A<B, A<C, B~C; P = 1.477e-5). (D) 3'/5' ratio of the housekeeping gene GAPD. Note: one sample was excluded from the GAPD box plot due to strong outlier behavior (PAD_00271, #16, TRIzol method). (E) Q value, defined as the average standard error of pixels in probe cells used for background computation (A>B, A>C, B~C; P = 0.0149). (F) A260/A280 ratio of cRNA measured with a spectrophotometer (A<B, C<B, A~C; P = 0.00227).
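For readers unfamiliar with the test named in this caption, the following is a minimal sketch of how a one-way ANOVA F statistic is computed across three groups. The numbers are made up for illustration and are not the study's measurements. Only the F statistic is computed here; turning it into a P value requires the F distribution, for which a library routine such as `scipy.stats.f_oneway` is normally used.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical quality-metric values for methods A, B, and C.
method_a = [1.0, 2.0, 3.0]
method_b = [2.0, 3.0, 4.0]
method_c = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([method_a, method_b, method_c])
print(f_stat, df_b, df_w)
```

A large F value means the differences between the method means are large relative to the scatter within each method.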
Figure 3
Density curves of global signal intensities. The plots show the overall signal density distribution of all probe sets represented on the HG-U133 Plus 2.0 microarray. The signal used is PS. Data from each microarray analysis is represented by a separate line. The plot is useful to visualize whether there are differences in the overall signal distributions of the experiments. (A) Density curves colored by nine distinct leukemia types. (B) Density curves colored by the three different sample preparation methods.
Figure 4
Unsupervised hierarchical clustering analysis. The unsupervised analysis is based on 2821 interquartile range (IQR) filtered probe sets of the HG-U133 Plus 2.0 microarray across the 99 experiments included in the study. The signal used is PQN. The three major clusters identified by the algorithm represent B lineage ALL (orange), T lineage ALL (blue), and AML (green) leukemia types. The dendrogram then splits further and samples are subdivided according to leukemia subtype characteristics: 1. Pro-B-ALL with t(4;11); 2. c-ALL with t(9;22); 3. T-ALL; 4. c-ALL with t(12;21); 5. Pre-B-ALL with t(1;19); 6. ALL with hyperdiploid karyotype; 7. c-ALL-Pre-B-ALL with DNA-Index DI = 1 and negative for recurrent translocations; 8. AML with t(11q23)/MLL; 9. AML with normal karyotype or other abnormalities. The graph on the left shows the correlation between distances for clustering validation (0–1-vector where 0 means same cluster, 1 means different clusters). Samples are labeled by patient numbers (#1 – #27) and total RNA extraction methods (method A, method B, or method C). For patient samples #25, #26, and #27, three individual technical replicates were performed.
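The IQR filtering step mentioned in this caption can be sketched as follows. This is an assumption-laden illustration: the caption does not state the IQR cutoff, the quartile convention, or the clustering linkage, so the threshold `min_iqr`, the "inclusive" quartile method, and the probe names are all hypothetical.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range using the 'inclusive' quartile convention."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

def filter_by_iqr(expression, min_iqr):
    """Keep probe sets whose signal varies enough across experiments."""
    return {probe: signals
            for probe, signals in expression.items()
            if iqr(signals) >= min_iqr}

# Hypothetical log-scale signals for two probe sets across five experiments.
expression = {
    "probe_flat": [5.0, 5.0, 5.0, 5.0, 5.0],      # no variation, dropped
    "probe_variable": [1.0, 2.0, 3.0, 4.0, 5.0],  # IQR = 2.0, kept
}

kept = filter_by_iqr(expression, min_iqr=1.0)
print(list(kept))
```

Filtering of this kind removes probe sets that are essentially constant across experiments, so the clustering is driven by the probe sets that actually differ between samples.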
Figure 5
Supervised analysis using differentially expressed genes. In the three-dimensional principal component analysis (PCA) 99 samples are included. The signal used is PQN. The analysis is based on 1089 differentially expressed genes that were identified in a supervised way to distinguish between the 9 distinct leukemia subtypes. A sphere represents each sample's gene expression profile using the 1089-gene signature. The first three principal components (PC) account for 58.6% of variation of the data (PC1 = 40.3%, PC2 = 11.3%, PC3 = 7.01%). (A) Distinction by leukemia classification: spheres with the same colors represent the same leukemia subtype. (B) Distinction by sample preparation method: spheres with the same color represent samples processed with the same total RNA preparation method.
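The percentages quoted for the principal components are shares of the total variance. As a small sketch, the helper below converts eigenvalues into percentages (the eigenvalues are made up), and the last lines simply add the three values given in the caption to confirm that they sum to the stated cumulative figure.

```python
def explained_variance_percent(eigenvalues):
    """Convert PCA eigenvalues into percentages of total variance."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]

# Hypothetical eigenvalues, for illustration only.
example = explained_variance_percent([4.0, 1.0])

# Percentages quoted in the caption for PC1, PC2, and PC3.
pc_percent = [40.3, 11.3, 7.01]
cumulative = sum(pc_percent)

print(example, round(cumulative, 1))
```

The three quoted values add up to 58.61%, which matches the 58.6% reported for the first three components.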
Figure 6
One-way ANOVA of technical replicates. Three patient samples (#25, #26, and #27) from distinct leukemia subtypes were analyzed in three independent technical replicates for each of methods A, B, and C, leading to a dataset of 27 gene expression profiles. (A) The graph represents false discovery rate (FDR) values based on one-way analysis of variance (ANOVA) results. For each preparation method, the absolute number (left y-axis) and percentage (right y-axis) of differentially expressed genes between the various leukemia subclasses are given. The x-axis represents a range of false discovery rates (%FDR). Method A: red line; method B: blue line; method C: green line. The vertical line is drawn at an FDR of 0.001 (0.1%). (B) Venn diagram representing the absolute number of overlapping differentially expressed genes for the three methods used. The representation is based on a series of filters: present calls, fold-change, and FDR of 0.001 (0.1%). For example, n = 7,728 genes are found to be consistently differentially expressed between the various leukemia subclasses when comparing method A to method B, method A to method C, and method B to method C. As a second example, n = 2,107 genes are found to be differentially expressed exclusively when using sample preparation method A. Alternatively, n = 1,274 genes are detected as differentially expressed by both method A and method C, but not by method B. (C) Summary table representing the percentages of overlapping differentially expressed genes for the three methods used. The first line represents the comparisons of method A to method B or method C. The second line represents the comparisons of method B to method C or method A. The third line represents the comparisons of method C to method A or method B.
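The regions of a three-set Venn diagram such as the one in panel (B) follow directly from set operations on the per-method gene lists. The sketch below uses made-up gene identifiers. It assumes each list has already passed the filters named in the caption (present calls, fold-change, FDR), which are not reimplemented here.

```python
def venn_regions(a, b, c):
    """Split three sets into the seven disjoint regions of a Venn diagram."""
    return {
        "A_only": a - b - c,
        "B_only": b - a - c,
        "C_only": c - a - b,
        "A_and_B_only": (a & b) - c,
        "A_and_C_only": (a & c) - b,
        "B_and_C_only": (b & c) - a,
        "A_and_B_and_C": a & b & c,
    }

# Hypothetical differentially expressed genes per preparation method.
genes_a = {"g1", "g2", "g3", "g4"}
genes_b = {"g1", "g2", "g5"}
genes_c = {"g1", "g3", "g6"}

regions = venn_regions(genes_a, genes_b, genes_c)
counts = {name: len(members) for name, members in regions.items()}
print(counts)
```

In the caption's terms, `A_and_B_and_C` corresponds to the genes found by all three methods, `A_only` to the genes found exclusively with method A, and `A_and_C_only` to the genes found by methods A and C but not B.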
Figure 7
Signal distributions for three technical replicates. Individual signal intensity distributions at the probe set level (PS) are shown as box plots for the three technical replicates of each of the three methods used. Sample preparation types are indicated on the x-axes; log values of PS signals are shown on the y-axes. Box plots with the same color represent log values of PS signals from the same total RNA preparation procedure: method A (red), method B (blue), or method C (green). (A) Replicates of patient #25. (B) Replicates of patient #26. (C) Replicates of patient #27.
