Nucleic Acids Res. 2012 Jan;40(Database issue):D1089-92.
doi: 10.1093/nar/gkr1172. Epub 2011 Dec 1.

RecountDB: a database of mapped and count corrected transcribed sequences

Edward Wijaya et al. Nucleic Acids Res. 2012 Jan.

Abstract

Gene expression analysis continues to benefit from data generated by next-generation sequencing, which allows transcripts to be measured with unmatched accuracy and resolution. However, the high-throughput reads from these technologies also contain many errors, which can compromise the ability to accurately detect and quantify rare transcripts. Fortunately, techniques exist to ameliorate the effects of sequencer error. We present RecountDB, a secondary database derived from primary data in NCBI's Short Read Archive (SRA). RecountDB holds sequence counts from RNA-seq and 5' capped transcription start site experiments, corrected and mapped to the relevant genome. Via a searchable and browseable interface, users can obtain corrected data in formats useful for transcriptomic analysis. The database is currently populated with 2265 entries from 45 organisms and is continuously growing. RecountDB is publicly available at: http://recountdb.cbrc.jp.

Figures

Figure 1.
Effect of sequencing error on the sequence counts. Gray bars represent misread sequences. (a) True counts of input sequences; aaact and gactt are 3 × 10^5 and 600, respectively. (b) Output of the sequencer. Due to sequencer error, misread sequences appear around each input sequence, like crumbs fallen off a cake. Unfortunately, the misreads of a highly abundant sequence can have a higher count than the correct reads of a rare sequence. Thus it is in general not possible to separate true and false sequences with a simple threshold. recount uses a probabilistic model to approximately infer (a) from (b).
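The threshold problem described in this caption can be checked with a small back-of-the-envelope calculation. The sketch below is only illustrative: it assumes a uniform, independent per-base substitution error rate of 1% (a value chosen for the example, not taken from the source), and it is not the probabilistic model that recount uses. The function name `expected_misread_count` is hypothetical.

```python
def expected_misread_count(true_count, seq_len, per_base_error):
    """Expected number of reads carrying ONE specific single-base substitution.

    Assumes (for illustration only) that each base is misread independently
    with probability per_base_error, and that a misread base is equally
    likely to become any of the 3 other bases.
    """
    # One given position is misread as one given base; all others are correct.
    p_specific = (per_base_error / 3) * (1 - per_base_error) ** (seq_len - 1)
    return true_count * p_specific


abundant_true = 3 * 10**5  # true count of aaact in Figure 1(a)
rare_true = 600            # true count of gactt in Figure 1(a)
error_rate = 0.01          # ASSUMED per-base error rate, not from the source

# Expected count of one particular misread of the abundant sequence.
misread = expected_misread_count(abundant_true, 5, error_rate)

# Expected number of error-free reads of the rare sequence.
rare_observed = rare_true * (1 - error_rate) ** 5

print(f"one misread of the abundant sequence: {misread:.1f}")
print(f"correct reads of the rare sequence:   {rare_observed:.1f}")
```

Under these assumed numbers, a single misread variant of the abundant sequence is expected to be seen more often than the genuine rare sequence, so any count threshold that removes the misread also removes the true rare transcript.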
Figure 2.
RecountDB's search interface. (a) A snapshot of the RecountDB entry page. Users can search by keywords such as genome name, type of study or NCBI-SRA file ID. A browseable interface can also be accessed through a link on this page. (b) A typical RecountDB keyword search result page. Each entry contains basic information such as data submitter, type of study and sample source. The results are presented in three formats: TAB, PSL and SAM/BAM (see main text for explanation). The link depicted as a globe symbol takes users to the NCBI-SRA primary site for the data, where the original FASTQ file can be accessed.
Figure 3.
Additional fields provided in the RecountDB SAM format data. OC refers to the observed count and EC to the estimated count (after correction). ‘f’ denotes the value type (float) and is followed by the value of the corresponding count.
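As a rough illustration of how these fields might be read, the sketch below parses the optional `TAG:TYPE:VALUE` columns of a SAM record and extracts the OC and EC counts. The example record and the helper name `parse_count_tags` are hypothetical, made up for this sketch; only the tag names OC and EC and the float type ‘f’ come from the caption above.

```python
def parse_count_tags(sam_line):
    """Return the OC/EC float tags of one SAM alignment line as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    counts = {}
    # The first 11 columns of a SAM record are mandatory; optional
    # TAG:TYPE:VALUE fields start at column 12.
    for field in fields[11:]:
        tag, value_type, value = field.split(":", 2)
        # Only accept float-typed OC/EC tags, as described in Figure 3.
        if tag in ("OC", "EC") and value_type == "f":
            counts[tag] = float(value)
    return counts


# HYPOTHETICAL record, invented for illustration (not real RecountDB data).
example = "\t".join([
    "seq1", "0", "chr1", "100", "255", "5M", "*", "0", "0", "AAACT", "*",
    "OC:f:1200", "EC:f:1350.5",
])

counts = parse_count_tags(example)
print(counts)
```

Checking the type character as well as the tag name keeps the parser from picking up a same-named tag that carries a non-float value.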

