PLoS One. 2016 Dec 29;11(12):e0169196.
doi: 10.1371/journal.pone.0169196. eCollection 2016.

Novel Role of 3'UTR-Embedded Alu Elements as Facilitators of Processed Pseudogene Genesis and Host Gene Capture by Viral Genomes

Domènec Farré et al. PLoS One. 2016.

Abstract

Since the discovery of the high abundance of Alu elements in the human genome, interest in the functional significance of these retrotransposons has been increasing. Primate Alu and rodent Alu-like elements are retrotransposed by a mechanism driven by the LINE1 (L1)-encoded proteins, the same machinery that generates the L1 repeats, the processed pseudogenes (PPs), and other retroelements. Apart from free Alu RNAs, Alus are also transcribed and retrotranscribed as part of cellular gene transcripts, generally embedded inside 3' untranslated regions (UTRs). Despite several proposed hypotheses, the functional implication of the presence of Alus inside 3'UTRs remains elusive. In this study we hypothesized that Alu elements in 3'UTRs could be involved in the genesis of PPs. By analyzing human genome data we found that 3'UTR-embedded Alu elements are overrepresented in genes that are the source of PPs (PP parent genes). In contrast, the presence of other retrotransposable elements in 3'UTRs does not show this PP-linked overrepresentation. We extended this analysis to the mouse and rat genomes, and the results likewise reveal overrepresentation of 3'UTR-embedded B1 (Alu-like) elements in PP parent genes. Interestingly, we also demonstrated that the overrepresentation of 3'UTR-embedded Alus is particularly significant in PP parent genes with a low germline expression level. Finally, we provide data that support the hypothesis that the L1 machinery is also the system that herpesviruses, and possibly other large DNA viruses, use to capture host genes expressed in germline or somatic cells. Altogether, our results suggest a novel role for Alu or Alu-like elements inside 3'UTRs as facilitators of the genesis of PPs, particularly in lowly expressed genes. Moreover, we propose that this L1-driven mechanism, aided by the presence of 3'UTR-embedded Alus, may also be exploited by DNA viruses to incorporate host genes into their viral genomes.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Contingency tables showing overrepresentation of Alu elements, but not of other retrotransposons, inside 3’UTRs of human PP parent genes.
Plus and minus signs above the tables indicate the presence or absence, respectively, of Alus (A), other SINEs (B), LINEs (C), or LTRs (D) inside the 3’UTR(s) of a gene. Plus and minus signs on the left indicate the presence or absence, respectively, of PPs generated from a gene. Numbers in bold are gene counts; the total numbers of genes are also displayed in the right column and the bottom row of each table. Percentages with respect to each total are also shown. P-values of the χ² test are indicated below each corresponding table.
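The kind of test described in this legend can be sketched in a few lines. The following is a minimal illustration, assuming a plain Pearson χ² statistic without continuity correction on a 2x2 table (the legend does not say which variant was used); the function name `chi2_2x2` and the counts in the usage lines are hypothetical and are not taken from the figure.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Assumption: no continuity correction. Returns (statistic, p_value)
    for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts, invented for illustration only.
# Rows: genes with PPs, genes without PPs. Columns: Alu+, Alu-.
stat, p = chi2_2x2(300, 700, 1500, 7500)
print(f"chi2 = {stat:.1f}, p = {p:.3g}")
```

A small p-value here would only indicate that Alu presence and PP generation are not independent in the table; the direction of the effect has to be read from the row percentages.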
Fig 2. Contingency tables showing overrepresentation of Alu-like elements, but not of other SINEs, inside 3’UTRs of mouse and rat PP parent genes.
Plus and minus signs above the tables indicate the presence or absence, respectively, of B1 or B2 elements (A, B), B1 elements alone (C, D), or other SINEs (E, F) inside the 3’UTR(s) of a gene. Plus and minus signs on the left indicate the presence or absence, respectively, of PPs generated from a gene. Numbers in bold are gene counts; the total numbers of genes are also displayed in the right column and the bottom row of each table. Percentages with respect to each total are also shown. P-values of the χ² test are indicated below each corresponding table.
Fig 3. The overrepresentation of 3’UTR-embedded Alu elements in PP parent genes is independent of transcript length.
(A) Transcript length distribution of genes with and without Alus in their 3’UTRs (Alu+ and Alu−). The P-value of the Mann-Whitney-Wilcoxon test comparing the Alu+ and Alu− distributions and the number of genes (N) in each set are also indicated. (B) Sampling analysis to separate out the possible effect of transcript length (see Methods for details). Ten samples were generated. For each sample, a Mann-Whitney-Wilcoxon (MWW) test indicated that both gene sets (Alu+ and sampled Alu−) have a similar transcript length distribution, and a contingency table showed overrepresentation of 3’UTR-embedded Alu elements in PP parent genes (χ² tested). Only the contingency table of the first sample is shown here; see S7 Fig for the rest of the samples. (C) Contingency tables showing overrepresentation of Alu presence inside 3’UTRs of PP parent genes for each of the sets of genes grouped by their maximum transcript length (nine bins). (D) Percentage of genes that have PPs among 18 sets of genes grouped by their transcript length (the nine bins defined in C) and the presence or absence of 3’UTR-embedded Alu repeats (Alu+ and Alu−). The percentage values represented in D are also displayed in the contingency tables of C, shaded in blue (Alu−) and red (Alu+). In B and C, plus and minus signs above the tables indicate the presence or absence, respectively, of Alus inside the 3’UTR(s) of a gene. Plus and minus signs on the left of the tables indicate the presence or absence, respectively, of PPs generated from a gene. Numbers in bold are gene counts; the total numbers of genes are also displayed in the right column and the bottom row of each table. Percentages with respect to each total are also shown. P-values of the χ² test are indicated below the corresponding table.
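The sampling idea in panel B can be illustrated with a short sketch. The legend defers the actual procedure to the Methods, so the following is only one plausible way to do it, assuming a simple bin-matched draw: for every bin of transcript length, as many Alu− genes are drawn at random as there are Alu+ genes in that bin. The function name `matched_sample`, the bin edges, and the example lengths are all hypothetical.

```python
import random
from collections import defaultdict

def matched_sample(case_values, control_values, bin_edges, seed=0):
    """Draw controls so that their per-bin counts mirror those of the cases.

    Assumption: this bin-matching scheme is illustrative and is not
    claimed to be the procedure used in the paper.
    """
    rng = random.Random(seed)

    def bin_index(value):
        # Number of edges at or below the value identifies its bin.
        return sum(value >= edge for edge in bin_edges)

    pools = defaultdict(list)
    for value in control_values:
        pools[bin_index(value)].append(value)

    wanted = defaultdict(int)
    for value in case_values:
        wanted[bin_index(value)] += 1

    sample = []
    for b, count in wanted.items():
        pool = pools[b]
        # Draw without replacement; take the whole pool if it is too small.
        sample.extend(rng.sample(pool, min(count, len(pool))))
    return sample

# Hypothetical transcript lengths (nt) for illustration only.
alu_plus = [500, 600, 3000]
alu_minus = [400, 450, 700, 900, 1500, 2500, 2600, 4000]
print(matched_sample(alu_plus, alu_minus, bin_edges=[1000, 2000]))
```

Repeating the draw with different seeds gives independent samples, each of which can then be checked with a distribution test and a contingency table as the legend describes.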
Fig 4. The overrepresentation of 3’UTR-embedded Alu elements in PP parent genes is independent of GC-content.
(A) Percentage of genes that have PPs among 24 sets of genes grouped by their GC-content (12 bins) and the presence or absence of 3’UTR-embedded Alu repeats (Alu+ and Alu−). (B) Sampling analysis to separate out the possible effect of GC-content (see Methods for details). Ten samples were generated. For each sample, a Mann-Whitney-Wilcoxon (MWW) test indicated that both gene sets (Alu+ and sampled Alu−) have a similar GC-content distribution, and a contingency table showed overrepresentation of 3’UTR-embedded Alu elements in genes with PPs (χ² tested). Only the contingency table of the first sample is shown here; see S8 Fig for the rest of the samples. Plus and minus signs above the table indicate the presence or absence, respectively, of Alus inside the 3’UTR(s) of a gene. Plus and minus signs on the left of the table indicate the presence or absence, respectively, of PPs generated from a gene. Numbers in bold are gene counts; the total numbers of genes are also displayed in the right column and the bottom row of the table. Percentages with respect to each total are also shown. The P-value of the χ² test is also indicated.
Fig 5. Gene expression and overrepresentation of 3’UTR-embedded Alu elements in PP parent genes.
(A) The top table shows the number of genes grouped by their germline gene expression mean and their number of generated PPs (4x4 groups); total numbers of genes are also displayed. The bottom table shows the corresponding percentage of genes in each row (groups by number of PPs) of the top table. (B) Bar graph representing the data from the bottom table of A. (C) Counts of genes grouped by their germline gene expression mean (4 bins) and the presence or absence of generated PPs (+ and −, respectively). Total numbers of genes are also indicated. The top table displays the overall dataset; the middle table shows only genes with 3’UTR-embedded Alus; the bottom table presents the percentages of genes with these Alus. (D) Bar graph representing the data from the bottom table of C. (E) Same tables as in C but grouping genes into 2 bins of germline gene expression mean. (F) Contingency tables testing overrepresentation of Alu elements inside 3’UTRs of lowly expressed genes (germline gene expression mean lower than 8) with respect to highly expressed genes (germline gene expression mean greater than or equal to 8). The top table displays the overall gene set; the middle table shows only the genes without associated PPs; the bottom table presents only the genes with PPs. Plus and minus signs above the tables indicate the presence or absence, respectively, of Alus inside the 3’UTR(s) of a gene. Numbers in bold are gene counts; the total numbers of genes are also displayed in the right column and the bottom row of each table. Percentages with respect to each total are also shown. P-values of the χ² test are indicated below each corresponding table.
Fig 6. The 3’UTR-embedded Alu overrepresentation in PP parent genes is not a by-product of germline expression differences between Alu+ and Alu− genes.
(A) Germline gene expression mean distribution of genes with and without Alus in their 3’UTRs (Alu+ and Alu−). The graph shows the overall gene set. The P-value of the Mann-Whitney-Wilcoxon test comparing the Alu+ and Alu− distributions is also indicated. (B) Sampling analysis to separate out the possible effect of the germline gene expression level (see Methods for details). Ten samples were generated. For each sample, a Mann-Whitney-Wilcoxon (MWW) test indicated that both gene sets (Alu+ and sampled Alu−) have a similar germline gene expression mean distribution, and a contingency table showed overrepresentation of 3’UTR-embedded Alu elements in PP parent genes (χ² tested). Only the contingency table of the first sample is shown here; see S9 Fig for the rest of the samples. Plus and minus signs above the table indicate the presence or absence, respectively, of Alus inside the 3’UTR(s) of a gene. Plus and minus signs on the left of the table indicate the presence or absence, respectively, of PPs generated from a gene. Numbers in bold are gene counts; the total numbers of genes are also displayed in the right column and the bottom row of the table. Percentages with respect to each total are also shown. The P-value of the χ² test is indicated below the table. (C) Germline gene expression mean distribution of genes with and without Alus in their 3’UTRs (Alu+ and Alu−). The left graph displays only the genes without associated PPs. The right graph presents only the genes with PPs. The P-values of the Mann-Whitney-Wilcoxon test comparing the Alu+ and Alu− distributions are also indicated.
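The distribution comparisons in panels A and C rely on the Mann-Whitney-Wilcoxon test. As a rough sketch of what that test computes, the following implements the U statistic with midranks and a two-sided p-value from the normal approximation. It assumes no tie or continuity correction, so it is only reasonable for fairly large samples, and it is not claimed to match the software or settings the authors used. The function name `mann_whitney_u` and the example values are hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Assumption: no tie correction and no continuity correction.
    Returns (U statistic for x, p_value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # Pool the values, remembering which sample each came from (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        midrank = (i + 1 + j) / 2.0
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value

# Hypothetical expression means for illustration only.
alu_plus = [6.1, 7.4, 7.9, 8.2, 9.0]
alu_minus = [7.0, 8.5, 8.8, 9.6, 10.1]
print(mann_whitney_u(alu_plus, alu_minus))
```

Note that a large p-value from such a test does not establish that two distributions are identical; it only means the test found no evidence of a shift between them.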
Fig 7. Primate genes captured by herpesviruses have Alu elements inside their 3’UTRs.
Blue areas illustrate the last exon of the transcripts of the IL10, DHFR, SLAMF6, IL17A, CD59, CLEC2-like (LOC101037697/LOC374443), LY9, and CD48 genes, where the narrower ending segment indicates the 3’UTR. The edge of the previous exon is also displayed (left open-ended rectangle). Exons are drawn to the same scale with respect to the human genome annotation, except for the CD59 3’UTR, which was cut in the middle (void space with dotted lines) because it is very long. Black oblique lines represent splicing. For LY9, black triangles indicate alternative predicted polyadenylation sites and the dotted lines display a predicted 3’UTR addition. Red rectangles show the position of the Alu elements. The IL17A and CD59 3’UTRs do not have Alu elements. In CD48, as indicated, the Alu element is only present in S. boliviensis and A. nancymaae.
Fig 8. Schematic diagram illustrating the proposed hypothesis of the new role of 3’UTR-embedded Alus in the genesis of PPs and the herpesviral capture of host genes.
Blue pathway: A highly expressed gene A produces a large number of transcripts (a), and thus there is a high probability that one of these transcripts will come into contact with a ribosome that is translating an L1 RNA and bind the L1 ORF2p (b), steal it (c), and move back to the nucleus, where the ORF2p is used to generate a new processed pseudogene of gene A (PP A) (d). Red pathway: The few transcripts of a lowly expressed gene B (e) have, by contrast, a low probability of reaching a ribosome that is translating an L1 RNA. However, the presence of an Alu element inside the 3’UTR of gene B allows gene B transcripts to bind the abundant protein complex SRP9/14, promoting the movement of transcripts to the ribosomes and therefore increasing the likelihood of making contact with a ribosome that is translating an L1 RNA and binding the L1 ORF2p (f), stealing it (g), and moving to the nucleus, where the ORF2p is used to generate a new processed pseudogene of gene B (PP B) (h) or to insert a transcript retrocopy inside an existing herpesviral episome (the circular DNA of a herpesvirus) (i).
