BMC Biol. 2023 Jul 19;21(1):160.
doi: 10.1186/s12915-023-01651-w.

Lessons from discovery of true ADAR RNA editing sites in a human cell line

Fang Wang et al. BMC Biol. 2023.

Abstract

Background: Conversion, or editing, of adenosine (A) into inosine (I), catalyzed by specialized cellular enzymes, is one of the most common post-transcriptional RNA modifications and has an emerging connection to disease. A-to-I conversions can happen at specific sites and lead to an increase in proteome diversity and to changes in RNA stability, splicing, and regulation. Such sites can be detected as adenine-to-guanine sequence changes by next-generation RNA sequencing, an approach that has produced millions of reported sites across multiple genome-wide surveys. Nonetheless, the lack of extensive independent validation in such endeavors, which is critical considering the relatively high error rate of next-generation sequencing, leaves lingering questions about the validity of the current compendiums of editing sites and of the conclusions based on them.

Results: Strikingly, we found that the current analytical methods suffer from very high false positive rates and that a significant fraction of sites in the public databases cannot be validated. In this work, we present potential solutions to these problems and provide a comprehensive and extensively validated list of A-to-I editing sites in a human cancer cell line. Our findings demonstrate that most of the true A-to-I editing sites in a human cancer cell line are located in non-coding transcripts, the so-called RNA 'dark matter'. On the other hand, many ADAR editing events reported in exons of human protein-coding mRNAs, including those that can recode the transcriptome, represent false positives and need to be interpreted with caution. Nonetheless, as yet undiscovered authentic ADAR sites that increase the diversity of the human proteome exist and warrant further identification.

Conclusions: Accurate identification of human ADAR sites remains a challenging problem, particularly for sites in exons of protein-coding mRNAs. As a result, genome-wide surveys of the ADAR editome must still be accompanied by extensive Sanger validation efforts. However, given the vast number of unknown human ADAR sites, there is a need for further development of analytical techniques, potentially those based on deep learning, in order to provide quick and reliable identification of the editome in any sample.

Keywords: ADAR; Cancer; RNA dark matter; RNA editing; lncRNA; vlincRNA.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
A scheme illustrating the underlying concept of this study. True A-to-I RNA editing events, revealed as A-to-G substitutions in RNA sequencing, have to be separated from other A-to-G sequence changes that are unrelated to ADAR editing. This work relies on extensive independent validation by Sanger sequencing as the key component for testing every indicated step of the analytical pipeline, in order to provide an authentic compendium of true ADAR editing events in a particular biological system
Fig. 2
Performance of the different analytical techniques in ADAR site detection. a A Venn diagram showing the number of candidate editing sites predicted by one or multiple methods. Only sites with a maximum editing level of > 0.2 across all 130 samples were used in this analysis. b The numbers and fractions of the candidate annotated (left) and unannotated (right) editing sites detected by only one (REDItools-, RED-ML-, or SPRINT-specific) or multiple different methods. Source data are provided as a Source data file
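The bookkeeping behind a comparison of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the site coordinates, editing levels, and per-method call sets below are invented, and the helper names `passing_sites` and `categorize` are hypothetical. Only the > 0.2 threshold and the three method names come from the caption.

```python
# Hypothetical inputs, invented for illustration only.
# Editing levels per candidate site across samples (site = (chromosome, position)).
levels = {
    ("chr1", 100): [0.05, 0.30],
    ("chr1", 200): [0.10, 0.15],
    ("chr2", 50): [0.40, 0.45],
    ("chr3", 10): [0.90],
}

# Candidate sites reported by each method (assumed to be pre-computed).
method_calls = {
    "REDItools": {("chr1", 100), ("chr1", 200), ("chr2", 50)},
    "RED-ML": {("chr1", 100), ("chr3", 10)},
    "SPRINT": {("chr1", 100), ("chr2", 50)},
}

MIN_MAX_LEVEL = 0.2  # threshold quoted in the caption


def passing_sites(levels, threshold=MIN_MAX_LEVEL):
    """Return sites whose maximum editing level across samples exceeds the threshold."""
    return {site for site, values in levels.items() if max(values) > threshold}


def categorize(method_calls, kept):
    """Group kept sites into '<method>-specific' or 'multiple' categories."""
    categories = {f"{name}-specific": set() for name in method_calls}
    categories["multiple"] = set()
    for site in kept:
        callers = [name for name, sites in method_calls.items() if site in sites]
        if len(callers) == 1:
            categories[f"{callers[0]}-specific"].add(site)
        elif len(callers) > 1:
            categories["multiple"].add(site)
    return categories


kept = passing_sites(levels)
categories = categorize(method_calls, kept)
for name, sites in categories.items():
    print(name, len(sites))
```

Applying the editing-level filter before counting overlaps means a site that only one tool reports at a low level never enters the method-specific counts, which is how the caption describes the analysis.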
Fig. 3
True ADAR editing sites are reproducible in independent RNA-seq samples of the same cell type. a Fractions of candidate sites detected in at least two samples (X-axis) by only one (REDItools-, RED-ML- or SPRINT-specific), or multiple different methods are shown for annotated (orange circles) and unannotated (blue circles) sites. b Examples of typical Sanger validation results for 2 true (top) and 2 false (bottom) ADAR editing sites. Sanger sequencing electropherograms derived from RNA or genomic DNA (gDNA) are shown. The sites targeted for validation are demarcated by the blue dashed lines and their genomic coordinates are given below. The nearby ADAR sites found by Sanger only are demarcated by the orange dashed lines. Note that the false positive site on the bottom right represents an SNP in DNA that was absent from the SNP databases and missed by the K562 genomic resequencing. c Validation ratios (Y-axis) of the annotated (orange) and unannotated (blue) sites predicted in only one or at least two samples. d A Venn diagram representing the number of annotated (left) and unannotated sites (right) which were detected by one or more different methods in at least two samples. e Validation ratios (Y-axis) of annotated (orange circles) and unannotated (blue circles) sites detected in at least two samples by only one or multiple different methods. a–e Only sites with the maximum editing level of > 0.2 across all 130 samples were used in this analysis. Source data are provided as a Source data file
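The two quantities this caption leans on, reproducibility across samples and the validation ratio, can be illustrated with a short Python sketch. It assumes that the validation ratio is the number of Sanger-confirmed sites divided by the number of sites tested, which is a reading of the caption rather than a definition stated in it. All sample names, site labels, and Sanger outcomes below are invented, and both helper functions are hypothetical.

```python
# Hypothetical data, invented for illustration only.
# Which RNA-seq samples each candidate site was detected in.
detections = {
    "siteA": {"s1", "s2", "s3"},
    "siteB": {"s1"},
    "siteC": {"s2", "s4"},
    "siteD": {"s5"},
}

# Sanger outcome for sites that were tested (True = editing confirmed).
sanger = {"siteA": True, "siteB": False, "siteC": True, "siteD": True}


def detected_in_min_samples(detections, min_samples=2):
    """Return sites detected in at least `min_samples` independent samples."""
    return {site for site, samples in detections.items() if len(samples) >= min_samples}


def validation_ratio(sanger, sites):
    """Return confirmed / tested for the given sites, or None if none were tested."""
    tested = [site for site in sites if site in sanger]
    if not tested:
        return None
    return sum(sanger[site] for site in tested) / len(tested)


reproducible = detected_in_min_samples(detections)
single_sample = set(detections) - reproducible

print("reproducible:", validation_ratio(sanger, reproducible))
print("single sample:", validation_ratio(sanger, single_sample))
```

Returning `None` when no site in a group was tested keeps an untested group distinct from a group whose sites all failed validation.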
Fig. 4
True unannotated ADAR editing events are rare in exonic regions. a, b Validation ratios of annotated (orange circles) and unannotated (blue circles) candidate edits in a CDS or b 3′ UTR regions. The number within each circle represents the number of total sites predicted by only one or multiple methods (X-axis). The fraction outside each circle represents the corresponding validation ratio. The hollow circles represent sites detected by only one (REDItools-, RED-ML-, or SPRINT-specific) or multiple methods with low (< 7.3%) validation ratios. c Sanger sequencing electropherograms of all validated sites in CDS regions (4 unannotated and 5 annotated). The sites targeted for validation are demarcated by the blue dashed lines while the nearby unannotated ADAR site found by Sanger only is demarcated by orange dashed lines. Genomic coordinates of all sites and the names of the corresponding genes are shown above the electropherograms with the theoretical amino acid changes caused by the editing shown below. a–c Only sites with the maximum editing level of > 0.2 across all 130 samples were used in this analysis. Source data are provided as a Source data file
Fig. 5
Validation ratios of unannotated ADAR candidate sites in the non-coding regions of the genome. The validation ratios (Y-axis) of non-exonic unannotated sites mapping to the same strands of a introns of annotated genes, b vlincRNAs, but not introns, or c elsewhere in the genome are shown. The number within each circle represents the number of total sites predicted by only one (REDItools-, RED-ML-, or SPRINT-specific) or more methods (X-axis). The fraction outside each circle represents the corresponding validation ratio. The hollow circles represent sites detected by only one or multiple methods with low validation ratios. a–c Only sites with the maximum editing level of > 0.2 across all 130 samples were used in this analysis. Source data are provided as a Source data file
Fig. 6
Genomic landscape of ADAR editing events in a cancer cell line. a Flow chart diagram illustrating the process of obtaining true annotated and unannotated editing sites in the K562 cancer cell line. As shown in this study, sites detected only by REDItools and not supported by other methods have to be interpreted with caution due to very low validation ratios. b, c Pie charts showing distributions of true annotated (b) and unannotated (c) editing sites across the indicated genomic elements. The two numbers for each element represent the number of editing sites mapping within that element and the fraction of the total sites. d Odds ratios of enrichment of the true annotated K562 sites in various genomic elements. The red dashed line represents the odds ratio of 1. e Sequence motifs around the sites positive or negative in the Sanger validation and all final sites predicted by the pipeline shown in a–c. The editing site is represented by the position “0.” f Differences in the fraction of each base at each position. The positive values on the Y-axes mean enrichment in the sites positive in the Sanger validation. b–d Only sites with the maximum editing level of > 0.2 across all 130 samples were used in this analysis. Source data are provided as a Source data file
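For readers unfamiliar with the odds ratio used in panel d, a generic calculation from a 2x2 table is sketched below. The caption does not say how the background was defined, so the background counts, the argument names, and the `odds_ratio` helper are all assumptions made for illustration, and the numbers are invented.

```python
def odds_ratio(sites_in, sites_out, background_in, background_out):
    """Odds ratio for enrichment of editing sites in a genomic element.

    sites_in / sites_out: editing sites inside / outside the element.
    background_in / background_out: background positions inside / outside
    the element (the choice of background is an assumption of this sketch).
    """
    if sites_out == 0 or background_in == 0:
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    return (sites_in * background_out) / (sites_out * background_in)


# Invented counts: 80 of 100 sites fall in an element that covers
# 1,000 of 10,000 background positions.
enriched = odds_ratio(80, 20, 1000, 9000)

# Invented counts: sites are distributed exactly like the background.
neutral = odds_ratio(10, 90, 1000, 9000)

print(enriched, neutral)
```

An odds ratio of 1, the red dashed line in panel d, corresponds to the second case, where sites fall in the element no more often than the background does.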
Fig. 7
Estimates of the minimal fractions of false positive non-repeat ADAR editing sites in public databases. The estimates were provided based on two approaches: (1) failed Sanger validation of annotated ADAR sites detected in at least 1 sample and (2) annotated sites detected in published K562 CCLE (The Cancer Cell Line Encyclopedia) RNA-seq data that could not be reproduced in any of the 130 K562 RNA-seq samples used in this study. Source data are provided as a Source data file
