Comparative Study
BMC Genomics. 2010 May 5;11:282.
doi: 10.1186/1471-2164-11-282.

A comparison of massively parallel nucleotide sequencing with oligonucleotide microarrays for global transcription profiling

James R Bradford et al. BMC Genomics.

Abstract

Background: RNA-Seq exploits the rapid generation of gigabases of sequence data by Massively Parallel Nucleotide Sequencing, allowing for the mapping and digital quantification of whole transcriptomes. Whilst previous comparisons between RNA-Seq and microarrays have been performed at the level of gene expression, in this study we adopt a more fine-grained approach. Using RNA samples from a normal human breast epithelial cell line (MCF-10a) and a breast cancer cell line (MCF-7), we present a comprehensive comparison between RNA-Seq data generated on the Applied Biosystems SOLiD platform and data from Affymetrix Exon 1.0ST arrays. The use of Exon arrays makes it possible to assess the performance of RNA-Seq in two key areas: detection of expression at the granularity of individual exons, and discovery of transcription outside annotated loci.

Results: We found a high degree of correspondence between the two platforms in terms of exon-level fold changes and detection. For example, over 80% of exons detected as expressed in RNA-Seq were also detected on the Exon array, and 91% of exons flagged as changing from Absent to Present on at least one platform had fold changes in the same direction. The greatest detection correspondence was seen when the read count threshold at which to flag exons Absent in the SOLiD data was set to t < 1, suggesting that the background error rate is extremely low in RNA-Seq. We also found RNA-Seq to be more sensitive than the Exon array in detecting differentially expressed exons, reflecting the wider dynamic range achievable on the SOLiD platform. In addition, we found significant evidence of novel protein-coding regions outside known exons, 93% of which map to Exon array probesets, and were able to infer the presence of thousands of novel transcripts through the detection of previously unreported exon-exon junctions.
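The detection comparison described in the Results can be illustrated with a minimal Python sketch. It assumes an exon is called Present in RNA-Seq when its read count is at least 1 (that is, Absent when t < 1) and Present on the array when its DABG p-value falls below 0.01. The function names, exon identifiers and values are hypothetical and are not taken from the study.

```python
def present_calls_rnaseq(read_counts, t=1):
    """Flag an exon Present when its read count is at least t (Absent when count < t)."""
    return {exon: count >= t for exon, count in read_counts.items()}


def present_calls_array(dabg_pvalues, alpha=0.01):
    """Flag an exon Present when its DABG p-value is below alpha (assumed convention)."""
    return {exon: p < alpha for exon, p in dabg_pvalues.items()}


def fraction_confirmed(rnaseq_calls, array_calls):
    """Fraction of exons Present in RNA-Seq that are also Present on the array."""
    detected = [e for e, present in rnaseq_calls.items()
                if present and e in array_calls]
    if not detected:
        return 0.0
    confirmed = sum(array_calls[e] for e in detected)
    return confirmed / len(detected)


# Made-up toy data, for illustration only.
read_counts = {"exon_a": 12, "exon_b": 0, "exon_c": 3, "exon_d": 1, "exon_e": 7}
dabg_pvalues = {"exon_a": 0.001, "exon_b": 0.2, "exon_c": 0.5,
                "exon_d": 0.004, "exon_e": 0.0001}

rnaseq_calls = present_calls_rnaseq(read_counts)
array_calls = present_calls_array(dabg_pvalues)
print(fraction_confirmed(rnaseq_calls, array_calls))
```

In this toy example, four exons are Present in RNA-Seq and three of them are also Present on the array, so the confirmed fraction is 0.75.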

Conclusions: By focusing on exon-level expression, we present the most fine-grained comparison between RNA-Seq and microarrays to date. Overall, our study demonstrates that data from a SOLiD RNA-Seq experiment are sufficient to generate results comparable to those produced from Affymetrix Exon arrays, even using only a single replicate from each platform, and when presented with a large genome.


Figures

Figure 1
Read locations. The proportion of unique reads in (A) MCF-10a and (B) MCF-7, mapping to four genomic locations: known exons and introns, as defined by Ensembl, other annotated regions including ESTs, Genscan predictions and Exon array probe selection regions, and un-annotated regions.
Figure 2
Correspondence between RNA-Seq and Exon arrays. (A) Determination of the read count threshold giving optimum correspondence between both platforms with respect to Present/Absent calls. (B) Present/Absent call correspondence at a read count threshold of zero in RNA-Seq and a DABG score threshold of 0.01 on the array. (C) Comparison of fold changes between RNA-Seq and the array. Red dots indicate exons flagged as Present (P) in both samples and on both platforms (PP->PP). Grey dots indicate exons flagged as Absent (A) in at least one sample on both platforms (AA->AA, PA->PA, AP->AP, PA->AP, AP->PA, AA->PA, AA->AP, PA->AA, AP->AA). Note that, due to the density of the data, some grey points representing exons Absent in both RNA-Seq samples (zero fold change) are masked by other colours. Blue dots indicate exons Absent in at least one RNA-Seq sample but flagged Present in both array samples (PA->PP, AA->PP, AP->PP), and green dots represent exons Present in both samples in RNA-Seq but flagged Absent in at least one sample on the array (PP->PA, PP->AA, PP->AP). (D) Overlap between numbers of exons called differentially expressed by the array and RNA-Seq using a log2 fold change threshold of 2.0 on the array and 3.0 in RNA-Seq (left) and a LIMMA p-value threshold of 1 × 10^-4 on the array and an Audic-Claverie p-value threshold of 1 × 10^-7 in RNA-Seq (right). These thresholds lead to the greatest equivalence between platforms using an overlap metric based on the CS (Equation 2).
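The threshold search in panel (A) can be sketched as a simple sweep over candidate read count thresholds. The sketch below is only an illustration: it scores each threshold by the fraction of exons receiving the same Present/Absent call on both platforms, which is an assumed stand-in and not the metric used in the paper. All names and data are hypothetical.

```python
def call_agreement(read_counts, array_present, t):
    """Fraction of shared exons with the same Present/Absent call on both platforms.

    An exon is Present in RNA-Seq when its read count is at least t.
    """
    exons = read_counts.keys() & array_present.keys()
    if not exons:
        return 0.0
    same = sum((read_counts[e] >= t) == array_present[e] for e in exons)
    return same / len(exons)


def best_threshold(read_counts, array_present, candidates):
    """Return the candidate threshold with the highest agreement (first one wins ties)."""
    return max(candidates,
               key=lambda t: call_agreement(read_counts, array_present, t))


# Made-up toy data, for illustration only.
read_counts = {"a": 0, "b": 1, "c": 2, "d": 5, "e": 0, "f": 9}
array_present = {"a": False, "b": True, "c": True,
                 "d": True, "e": False, "f": True}

chosen = best_threshold(read_counts, array_present, candidates=[1, 2, 3, 5])
print(chosen, call_agreement(read_counts, array_present, chosen))
```

With these toy values, every exon with at least one read is also Present on the array, so the lowest threshold gives full agreement and raising it only adds disagreements.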
