Nat Commun. 2017 May 5;8:15190.
doi: 10.1038/ncomms15190.

Quantification of differential gene expression by multiplexed targeted resequencing of cDNA

Peer Arts et al. Nat Commun.

Abstract

Whole-transcriptome or RNA sequencing (RNA-Seq) is a powerful and versatile tool for functional analysis of different types of RNA molecules, but sample reagent and sequencing cost can be prohibitive for hypothesis-driven studies where the aim is to quantify differential expression of a limited number of genes. Here we present an approach for quantification of differential mRNA expression by targeted resequencing of complementary DNA using single-molecule molecular inversion probes (cDNA-smMIPs) that enable highly multiplexed resequencing of cDNA target regions of ∼100 nucleotides and counting of individual molecules. We show that accurate estimates of differential expression can be obtained from molecule counts for hundreds of smMIPs per reaction and that smMIPs are also suitable for quantification of relative gene expression and allele-specific expression. Compared with low-coverage RNA-Seq and a hybridization-based targeted RNA-Seq method, cDNA-smMIPs are a cost-effective high-throughput tool for hypothesis-driven expression analysis in large numbers of genes (10 to 500) and samples (hundreds to thousands).
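The "counting of individual molecules" mentioned in the abstract can be illustrated with a minimal sketch. It assumes each read has already been assigned to a probe and carries the random molecular tag that single-molecule MIPs incorporate, so reads sharing a probe and a tag are treated as copies of one captured molecule. The probe names and tags are hypothetical, tags are matched exactly with no error correction, and this is not the authors' pipeline.

```python
from collections import defaultdict

def count_molecules(reads):
    """Collapse reads into unique molecules per probe.

    reads: iterable of (probe_id, molecular_tag) pairs, one per sequenced read.
    Returns a dict mapping probe_id to the number of distinct tags seen.
    Assumption: identical tags on the same probe are PCR duplicates.
    """
    tags_per_probe = defaultdict(set)
    for probe_id, tag in reads:
        tags_per_probe[probe_id].add(tag)
    return {probe: len(tags) for probe, tags in tags_per_probe.items()}

# Hypothetical reads: the first two are duplicates of the same molecule.
reads = [
    ("GAPDH_mip1", "ACGTAC"),
    ("GAPDH_mip1", "ACGTAC"),
    ("GAPDH_mip1", "TTGACA"),
    ("IL6_mip3", "GGCATT"),
]
counts = count_molecules(reads)
```

Counting distinct tags rather than reads is what makes the estimate insensitive to uneven PCR amplification between probes.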

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Evaluation of cDNA-smMIPs for estimation of differential expression with artificial transcripts.
(a) Outline of the approach. (b) Accuracy of differential expression estimates from 337 individual cDNA-smMIPs targeting 92 ERCC transcripts in conditions ERCC1 and ERCC2. The 92 transcripts are divided into four groups; for each group, the difference in transcript abundance between conditions ERCC1 and ERCC2 is known and is indicated by the solid lines. Four technical capture replicates were performed on one cDNA sample for each condition. The expression value for each smMIP is estimated using a Bayesian model. Data points are coloured according to their expected expression fold difference. (c) Comparison of differential expression quantification by cDNA-smMIPs and by CaptureSeq, a previously published method that performs targeted RNA-Seq by biotin-labelled oligonucleotide hybridization. Only transcripts with log2-expression values of >4 were included for both methods. Differential expression estimated with the Bayesian model from four ERCC1 and four ERCC2 cDNA-smMIP capture replicates is compared with differential expression estimates from 4 ERCC1 and 5 ERCC2 replicates for CaptureSeq (see Supplementary Tables 1 and 2, respectively, for statistics).
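To make the quantity plotted in panel (b) concrete, here is a rough sketch of a log2 fold change computed from replicate molecule counts for one smMIP. It is a simple pooled estimate with a pseudocount, offered as a stand-in for intuition only; the caption states that the authors use a Bayesian model, which is not reproduced here. The function name, the pseudocount value and the example counts are all assumptions.

```python
import math

def log2_fold_change(counts_a, counts_b, totals_a, totals_b, pseudocount=0.5):
    """Pooled log2 fold change (condition B over condition A) for one smMIP.

    counts_*: molecule counts for this smMIP, one value per replicate.
    totals_*: total molecule counts per replicate (all smMIPs), used to
              normalise for differences in capture depth.
    pseudocount: small constant (an assumption) that avoids log(0).
    """
    rate_a = (sum(counts_a) + pseudocount) / sum(totals_a)
    rate_b = (sum(counts_b) + pseudocount) / sum(totals_b)
    return math.log2(rate_b / rate_a)

# Made-up counts for four replicates per condition at equal depth.
lfc = log2_fold_change(
    counts_a=[40, 44, 38, 42],
    counts_b=[80, 85, 79, 84],
    totals_a=[10000] * 4,
    totals_b=[10000] * 4,
)
```

With these invented counts the estimate comes out close to 1, that is, roughly a two-fold difference, which is the kind of value one would compare against the known ERCC group ratios.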
Figure 2. Validation of cDNA-smMIPs on lymphoblastoid cell lines.
(a) Quantification of relative gene expression on endogenous transcripts of EBV-transformed lymphoblastoid cell lines compared with average gene expression from RNA-Seq data of 660 samples in the Geuvadis project for the same cell type. Two experiments were performed (exp. 1 and exp. 2); in the second experiment, two technical replicates were created by two independent experimenters (designated Rep. 1 and Rep. 2; see Supplementary Table 2). cDNA for experiments 1 and 2 was generated independently from the same RNA for EBV2 and EBV3. (b) Concordance of differential expression between samples EBV2 and EBV3 for individual smMIPs (N=95).
Figure 3. Comparison of cDNA-smMIPs with low-coverage RNA-Seq.
(a) Differences in molecule count (cDNA-smMIPs) and fragment count/read pairs (RNA-Seq) for the 12 genes targeted in the cDNA-smMIPs assay. The data points are averages over the randomly sampled sets of fragments. (b) Fraction of detected genes (genes with at least one mapped read) as a function of the total number of reads. Each data point corresponds to a replicate. (c) Reproducibility of fold changes, estimated as a function of the total number of sequencing read pairs. For cDNA-smMIPs, the correlation is between log2(fold change) estimated in experiment 1 (using two technical replicates per individual) and experiment 2 (two technical replicates per individual). For the low-coverage RNA-Seq, the correlation is between log2(fold change) estimated in experiment 1 (one technical replicate each for individuals HG00117 and NA06986) and experiment 2 (one technical replicate for each individual). Each data point corresponds to a random sampling (without replacement) of the number of fragments (=read pairs) given on the horizontal axis and is based on 8 and 4 technical replicates for cDNA-smMIPs and RNA-Seq, respectively. Corresponding scatter plots between the replicate DE estimates are shown in Supplementary Fig. 10. (d) Comparison of molecule counts (cDNA-smMIPs) and fragments/read pairs (RNA-Seq) mapping to the regions targeted by the cDNA-smMIPs. For each gene, the average count across all smMIPs targeting the same gene is reported.
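The random sampling without replacement described in panel (c) can be sketched as follows. The sketch assumes a per-gene table of fragment counts as input; the function name, the seed handling and the example gene counts are hypothetical, and the authors' actual sampling code is not given in this record.

```python
import random
from collections import Counter

def subsample_counts(gene_counts, n_fragments, seed=0):
    """Draw n_fragments fragments without replacement from a count table.

    gene_counts: dict mapping gene name to observed fragment count.
    Returns a Counter with the down-sampled count per gene.
    """
    # Expand the table into one entry per fragment, labelled by gene.
    pool = [gene for gene, count in gene_counts.items() for _ in range(count)]
    if n_fragments > len(pool):
        raise ValueError("cannot sample more fragments than were observed")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return Counter(rng.sample(pool, n_fragments))

# Made-up counts for three genes, down-sampled to 50 fragments.
observed = {"GENE_A": 120, "GENE_B": 30, "GENE_C": 5}
sub = subsample_counts(observed, 50, seed=1)
```

Repeating the draw at several depths and recomputing fold changes each time gives the kind of depth-versus-reproducibility curve the panel describes. Expanding the table is fine for small examples; for millions of fragments a multivariate hypergeometric draw would be more memory-efficient.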
Figure 4. Expression changes following stimulation of PBMCs.
(a) Outline of PBMC stimulation experiment. (b) Concordance of differential gene expression (Candida versus Control) estimates from cDNA-smMIPs and previously published RNA-Seq data. The cDNA-smMIPs and RNA-Seq experiments were performed on PBMCs from different individuals.
Figure 5. Estimation of allelic ratios with cDNA-smMIPs.
Concordance of allelic ratios estimated from distinct smMIPs (that is, with non-overlapping extension probes and non-overlapping ligation probes) targeting the same SNP in a serial dilution of cDNA from the K562 cell line with cDNA from the HEK293 cell line (8 dilution steps). For all SNPs, the two cell lines are homozygous for opposite alleles. Reported variation in R² is s.d.
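As an illustration of the concordance measure in this caption, the sketch below computes an allelic ratio from molecule counts and the squared Pearson correlation (R²) between the ratios reported by two distinct smMIPs across a dilution series. The counts are invented for eight dilution steps, and the function names are assumptions; the paper's own estimation procedure may differ.

```python
def allelic_ratio(count_allele_1, count_allele_2):
    """Fraction of molecules carrying allele 1 at a SNP."""
    total = count_allele_1 + count_allele_2
    if total == 0:
        raise ValueError("no molecules observed at this SNP")
    return count_allele_1 / total

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

# Invented (allele 1, allele 2) molecule counts from two smMIPs
# targeting the same SNP across eight dilution steps.
probe_1 = [(5, 95), (12, 88), (26, 74), (38, 62), (51, 49), (63, 37), (77, 23), (90, 10)]
probe_2 = [(4, 76), (10, 70), (20, 60), (31, 49), (40, 40), (52, 28), (61, 19), (73, 7)]

ratios_1 = [allelic_ratio(a, b) for a, b in probe_1]
ratios_2 = [allelic_ratio(a, b) for a, b in probe_2]
concordance = r_squared(ratios_1, ratios_2)
```

Because the two cell lines are homozygous for opposite alleles, the expected allelic ratio at each step simply tracks the mixing proportion, so two well-behaved probes should give highly correlated ratios.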
Figure 6. Cost comparison.
Reported cost is per sample, assuming a total of 1,000 samples, starting from RNA and including library preparation and sequencing. A breakdown of the cost for cDNA-smMIPs is given in Supplementary Table 7. For cDNA-smMIPs and qPCR, the calculation is based on 100 target regions in 20 genes; the reported cost also includes cDNA synthesis (iScript), purification and measurement of cDNA concentration (8.25 USD). Cost for RNA-Seq is based on the Illumina TruSeq V2 kit (48 reactions, 3,724 USD), assuming 5 million paired-end reads on Illumina NextSeq at a cost of 32.49 USD. Cost for CaptureSeq is based on the same Illumina TruSeq V2 kit (48 reactions), followed by capture with NimbleGen SeqCap in 5-plex capture (87 USD/sample) as previously described, and 5 million paired-end reads on Illumina NextSeq (32.49 USD). The current commercially available version of SeqCap also permits 12-plex capture.
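The per-sample arithmetic implied by the caption can be sketched using only the prices it quotes. The sketch assumes that the kit price divides evenly over its 48 reactions, that the 32.49 USD sequencing cost applies per sample, and that the components simply add; the figure itself may include items not listed here, so the results should not be read as the paper's reported values.

```python
# Prices quoted in the caption (USD).
TRUSEQ_KIT_USD = 3724.0        # Illumina TruSeq V2 kit
TRUSEQ_REACTIONS = 48          # reactions per kit
SEQUENCING_USD = 32.49         # 5 million paired-end reads on NextSeq
SEQCAP_PER_SAMPLE_USD = 87.0   # SeqCap capture, 5-plex

# Assumption: one kit reaction per sample, no dead volume or failed preps.
library_prep_usd = TRUSEQ_KIT_USD / TRUSEQ_REACTIONS

# Assumption: cost components are purely additive.
rna_seq_usd = library_prep_usd + SEQUENCING_USD
capture_seq_usd = rna_seq_usd + SEQCAP_PER_SAMPLE_USD

print(f"RNA-Seq:    {rna_seq_usd:.2f} USD per sample")
print(f"CaptureSeq: {capture_seq_usd:.2f} USD per sample")
```

Under these assumptions library preparation is the larger share of the RNA-Seq cost at this read depth, which is consistent with the caption's point that reagent cost, not only sequencing, matters at the scale of 1,000 samples.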
