Nature. 2011 Jan 6;469(7328):97-101.
doi: 10.1038/nature09616. Epub 2010 Nov 17.

Formation, regulation and evolution of Caenorhabditis elegans 3'UTRs

Calvin H Jan et al. Nature. 2011.

Abstract

Post-transcriptional gene regulation frequently occurs through elements in mRNA 3' untranslated regions (UTRs). Although crucial roles for 3'UTR-mediated gene regulation have been found in Caenorhabditis elegans, most C. elegans genes have lacked annotated 3'UTRs. Here we describe a high-throughput method for reliable identification of polyadenylated RNA termini, and we apply this method, called poly(A)-position profiling by sequencing (3P-Seq), to determine C. elegans 3'UTRs. Compared to standard methods also recently applied to C. elegans UTRs, 3P-Seq identified 8,580 additional UTRs while excluding thousands of shorter UTR isoforms that do not seem to be authentic. Analysis of this expanded and corrected data set suggested that the high A/U content of C. elegans 3'UTRs facilitated genome compaction, because the elements specifying cleavage and polyadenylation, which are A/U rich, can more readily emerge in A/U-rich regions. Indeed, 30% of the protein-coding genes have mRNAs with alternative, partially overlapping end regions that generate another 10,480 cleavage and polyadenylation sites that had gone largely unnoticed and represent potential evolutionary intermediates of progressive UTR shortening. Moreover, a third of the convergently transcribed genes use palindromic arrangements of bidirectional elements to specify UTRs with convergent overlap, which also contributes to genome compaction by eliminating regions between genes. Although nematode 3'UTRs have median length only one-sixth that of mammalian 3'UTRs, they have twice the density of conserved microRNA sites, in part because additional types of seed-complementary sites are preferentially conserved. These findings reveal the influence of cleavage and polyadenylation on the evolution of genome architecture and provide resources for studying post-transcriptional gene regulation.

Figures

Figure 1. Identification of C. elegans 3′ UTRs
a, Schematic of the 3P-Seq protocol. See text for description. b, Sequence composition of homopolymer runs that were found at 3′ termini of candidate 3P tags and included ≥1 untemplated nucleotide. c, Cleavage heterogeneity surrounding the most abundant cleavage site (position 0). Box plots show results for 380 cleavage sites that were both between two non-A residues (which enabled precise mapping) and within the top quintile of 3P-tag abundance. d, The lin-14 3′UTRs. 3P tags from egg were mapped relative to RNA-Seq data, prior mRNA annotations from the indicated databases, and the proposed lin-4-binding region. Distal and proximal cleavage sites are indicated (black and red arrowheads, respectively). A 50-nucleotide region containing the distal 3P cluster is enlarged (box). Each tag sequence with a unique genome match is depicted as a bar, colored by tag frequency (key). e, Nucleotide sequence composition at mRNA end regions. Shown above are elements implicated in cleavage and polyadenylation (Supplementary Fig. 3c), with colors reflecting their nucleotide composition (A-rich, red; U-rich, blue). The sharp adenosine peak at position +1 (*) was due only partly to cleavage prior to an A. Also contributing to this peak (and to both depletion of A at position −1 and blurring of sequence composition at other positions) was cleavage after an A, for which the templated A was assigned to the poly(A) tail, resulting in a 1-nucleotide offset from the cleavage-site register.
Figure 2. Alternative 3′ UTRs in C. elegans
a, Distribution of the 24,036 3P-Seq-supported UTRs among the types of alternative isoforms. For genes with ALEs that have tandem isoforms (bottom), the ALE tally indicates the number of distal isoforms of proximal ALEs (blue) and the tandem tally indicates the proximal tandem isoforms of all ALEs (red). In all cases, the distal isoform is the 3′-most cleavage site for each gene (black arrowhead). Also depicted are proximal tandem sites and proximal ALE sites (red and blue arrowheads, respectively). Listed (in parentheses) is the number of cleavage sites associated with each isoform type for the 34,525 3P-Seq-supported cleavage sites (which exceeded the number of unique UTRs because OERs produced multiple cleavage sites for the same UTR). The nucleotide composition near proximal and distal sites is shown (right). b, Frequency of PAS motifs for the isoform types indicated. c, Schematics of canonical and alternative operons.
Figure 3. Evolution and topology of 3′-end formation
a, 3′UTR length distributions for the indicated species, considering the most distal annotated isoform for each gene. b, A/U content for C. elegans 3′UTRs of the indicated lengths. c, Relationship between 3′UTR length and 3′UTR A/U content (disregarding content of the last 40 UTR nucleotides), 3′UTR length and genomic A/T content, and 5′UTR length and 5′UTR A/U content for the metazoan species in (a) (r², squared Pearson correlation coefficients). d, OERs. Distances between neighboring cleavage sites are plotted (left). For peaks in the distribution at 15–20 and 35–40 nucleotides (shaded), nucleotide compositions of OERs are shown (middle and right, respectively), with proposed RNA-recognition elements colored as in Fig. 1e. Arrowheads indicate cleavage sites, with shading also indicating positions of upstream cleavage. e, Convergent UTR overlap. Distances between convergent 3′ ends are plotted (left), with negative values indicating overlap. For peaks at 15–22 and −2 to 8 nucleotides of overlap (shaded), nucleotide compositions are shown (middle and right, respectively) as in (d), with shading indicating positions of minus-strand cleavage.
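The quantities compared in panel c can be illustrated with a minimal sketch. It assumes UTR sequences are available as plain strings; the function names `au_content` and `pearson_r` are hypothetical and are not taken from the paper, which does not describe its implementation here. The only detail drawn from the caption is that the last 40 nucleotides of each UTR are disregarded.

```python
from math import sqrt


def au_content(utr, trim=40):
    """Return the A/U fraction of a UTR, ignoring its last `trim` nucleotides.

    DNA-style input is accepted: T is counted as U. Returns None when
    nothing is left after trimming.
    """
    seq = utr.upper().replace("T", "U")
    body = seq[:-trim] if trim > 0 else seq
    if not body:
        return None
    return sum(base in "AU" for base in body) / len(body)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Under these assumptions, an r² value for a set of species would be `pearson_r(lengths, au_fractions) ** 2`, where both lists hold one summary value per species.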
Figure 4. MicroRNA targeting
a, Expanded repertoire of seed-matched sites preferentially conserved in nematode 3′UTRs. Sites conserved only marginally above chance are above the dashed line. Watson-Crick-matched residues, blue or black; residues independent of the miRNA sequence, red. b, Density of miRNA sites conserved above background, combining all site types at the maximally sensitive cutoff. Error bars, one standard deviation (calculated by repeating the analysis for each site type 50 times, each time using a different cohort of control sequences that matched the properties of the miRNA sequences (ref. 18)). c, Relative strength of miRNA site types across clades. Within each clade, two species of comparable divergence were selected. For each miRNA site type, the fraction of sites conserved above background in the two species was normalized to that of the 8mer-A1 (shown in parentheses). d, Enrichment of 8mer-A1 3′UTR sites above expectation based on dinucleotide content. Error bars, one standard deviation, derived as in (b). e, Relationship between 3′UTR length and site enrichment. Site enrichment is plotted for 3′UTRs of the indicated species sorted by length into ten equally sized bins.
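The binning described for panel e can be sketched as follows. This is an illustration only: the record layout `(length, observed_sites, expected_sites)` and the function names are hypothetical, and the expected counts (which the caption bases on dinucleotide content) are taken as given inputs and not computed here.

```python
def equal_size_bins(records, n_bins=10):
    """Sort records by UTR length and split them into n_bins groups
    whose sizes differ by at most one."""
    ordered = sorted(records, key=lambda rec: rec[0])
    n = len(ordered)
    return [ordered[i * n // n_bins:(i + 1) * n // n_bins]
            for i in range(n_bins)]


def enrichment_by_bin(records, n_bins=10):
    """Return (mean length, observed/expected ratio) for each non-empty bin.

    Each record is a tuple (length, observed_sites, expected_sites).
    The ratio is None when a bin has zero expected sites.
    """
    summary = []
    for group in equal_size_bins(records, n_bins):
        if not group:
            continue
        observed = sum(rec[1] for rec in group)
        expected = sum(rec[2] for rec in group)
        mean_length = sum(rec[0] for rec in group) / len(group)
        ratio = observed / expected if expected else None
        summary.append((mean_length, ratio))
    return summary
```

Pooling observed and expected counts within a bin before dividing, as done here, is one possible design choice; averaging per-UTR ratios would weight short and long UTRs differently.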

References

    1. Moore MJ. From birth to death: the complex lives of eukaryotic mRNAs. Science. 2005;309:1514–1518.
    2. Martin KC, Ephrussi A. mRNA localization: gene expression in the spatial dimension. Cell. 2009;136:719–730.
    3. Ahringer J, Kimble J. Control of the sperm-oocyte switch in Caenorhabditis elegans hermaphrodites by the fem-3 3' untranslated region. Nature. 1991;349:346–348.
    4. Wightman B, Burglin TR, Gatto J, Arasu P, Ruvkun G. Negative regulatory sequences in the lin-14 3'-untranslated region are necessary to generate a temporal switch during Caenorhabditis elegans development. Genes Dev. 1991;5:1813–1824.
    5. Merritt C, Rasoloson D, Ko D, Seydoux G. 3' UTRs are the primary regulators of gene expression in the C. elegans germline. Curr Biol. 2008;18:1476–1482.
