RNA. 2012 Sep;18(9):1586-96.
doi: 10.1261/rna.033233.112. Epub 2012 Jul 25.

RNA editing of protein sequences: a rare event in human transcriptomes

Claudia L Kleinman et al. RNA. 2012 Sep.

Abstract

RNA editing, the post-transcriptional recoding of RNA molecules, has broad potential implications for gene expression. Several recent studies of human transcriptomes reported a high number of differences between DNA and RNA, including events not explained by any known mammalian RNA-editing mechanism. However, RNA-editing estimates differ by orders of magnitude, since technical limitations of high-throughput sequencing have sometimes been overlooked and sequencing errors have been confounded with editing sites. Here, we developed a series of computational approaches to analyze the extent of this process in the human transcriptome, identifying and addressing the major sources of error in a large-scale approach. We apply the detection pipeline to deep sequencing data from lymphoblastoid cell lines expressing ADAR1 at high levels, and show that noncanonical editing is unlikely to occur, with at least 85%-98% of candidate sites being the result of sequencing and mapping artifacts. By implementing a method to detect intronless gene duplications, we show that most previously validated noncanonical sites originate in read mismapping within these regions. Canonical A-to-G editing, on the other hand, is widespread in noncoding Alu sequences and rare in exonic and coding regions, where the validation rate also dropped. The genomic distribution of editing sites we find, together with the lack of consistency across studies or biological replicates, suggests a minor quantitative impact of this process in the overall recoding of protein sequences. We propose instead a primary role of ADAR1 protein as a defense system against elements potentially damaging to the genome.

Figures

FIGURE 1.
Flowchart of the detection method.
FIGURE 2.
Pseudogenes lacking introns are a source of false positives. Retrotransposition of processed mRNAs is a common source of sequence duplication in the human genome. “Exon-first” approaches to transcriptome reconstruction tend to have a high rate of misalignments around these regions, producing sites that are incorrectly interpreted as edited sites. Validation by Sanger sequencing usually does not resolve the ambiguity, since it requires a PCR step in which traces of unprocessed transcripts or contaminating DNA can act as templates and reproduce the artifact obtained with RNA-seq. Yellow vertical lines represent substitutions, which are very common in pseudogenes.
FIGURE 3.
Recoding events identified by RNA-Seq. Number of events for the 12 types of differences between RNA sequence and genomic DNA sequence observed in lymphoblast cell transcriptomes. Labels of x-axis denote DNA and RNA nucleotides observed (e.g., “AG” stands for an A observed in the gDNA and a G observed in the cDNA). See Materials and Methods for details on the definition of repetitive regions.
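The 12 difference types in this figure are the ordered pairs of distinct nucleotides. As a minimal sketch (not the authors' pipeline), the tally could be computed as follows, assuming candidate sites are already available as (gDNA base, cDNA base) pairs; the function name `count_mismatch_types` and the input format are hypothetical.

```python
from collections import Counter
from itertools import permutations

BASES = "ACGT"
BASE_SET = set(BASES)

# The 12 possible labels, e.g. "AG" = A observed in gDNA, G observed in cDNA.
MISMATCH_TYPES = ["".join(pair) for pair in permutations(BASES, 2)]


def count_mismatch_types(sites):
    """Count candidate sites per DNA->RNA label.

    `sites` is an iterable of (dna_base, rna_base) pairs (assumed format).
    Labels with no observations are reported with a count of zero.
    """
    counts = Counter({label: 0 for label in MISMATCH_TYPES})
    for dna, rna in sites:
        dna, rna = dna.upper(), rna.upper()
        # Skip matching positions and ambiguous calls such as N.
        if dna == rna or dna not in BASE_SET or rna not in BASE_SET:
            continue
        counts[dna + rna] += 1
    return dict(counts)
```

Plotting these counts with the labels on the x-axis would give a bar chart of the kind the caption describes.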
FIGURE 4.
Recoding events across genomic regions. Number of events observed for the 12 types of differences between RNA sequence and genomic DNA sequence, separated by type of region. See Materials and Methods for details on the definition of repetitive regions.
FIGURE 5.
(A) Sequence logos (Crooks et al. 2004) generated using sequences flanking candidate recoded sites. (B) A single motif is found when sites corresponding to the same case in the cDNA molecule are grouped together (see Materials and Methods for details); T-to-G and the reverse complement of A-to-G flanking regions are shown.
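One possible reading of grouping sites that correspond to the same case in the cDNA molecule is that each difference is expressed relative to the transcribed strand, so that a site on a minus-strand gene is reverse-complemented before grouping. The sketch below illustrates that idea only; the helper `to_transcript_strand` and the `"+"`/`"-"` strand encoding are assumptions, and the paper's Materials and Methods define the actual procedure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def to_transcript_strand(dna_base, rna_base, strand):
    """Express a reference-strand difference on the transcribed strand.

    `strand` is "+" or "-" (assumed encoding of the gene's orientation).
    For a single position, reverse-complementing reduces to complementing
    both the gDNA base and the cDNA base.
    """
    if strand == "+":
        return dna_base + rna_base
    if strand == "-":
        return COMPLEMENT[dna_base] + COMPLEMENT[rna_base]
    raise ValueError(f"unknown strand: {strand!r}")
```

Under this convention, a T-to-C difference reported on the reference strand for a minus-strand gene is grouped with A-to-G sites.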
