RNA. 2005 Oct;11(10):1485-93.
doi: 10.1261/rna.2107305. Epub 2005 Aug 30.

Bioinformatic identification of candidate cis-regulatory elements involved in human mRNA polyadenylation

Jun Hu et al. RNA. 2005 Oct.

Abstract

Polyadenylation is an essential step for the maturation of almost all cellular mRNAs in eukaryotes. In human cells, most poly(A) sites are flanked by the upstream AAUAAA hexamer or a close variant, and downstream U/GU-rich elements. In yeast and plants, additional cis elements have been found to be located upstream of the poly(A) site, including UGUA, UAUA, and U-rich elements. In this study, we have developed a computer program named PROBE (Polyadenylation-Related Oligonucleotide Bidimensional Enrichment) to identify cis elements that may play regulatory roles in mRNA polyadenylation. By comparing human genomic sequences surrounding frequently used poly(A) sites with those surrounding less frequently used ones, we found that cis elements occurring in yeast and plants also exist in human poly(A) regions, including the upstream U-rich elements, and UAUA and UGUA elements. In addition, several novel elements were found to be associated with human poly(A) sites, including several G-rich elements. Thus, we suggest that many cis elements are evolutionarily conserved among eukaryotes, and human poly(A) sites have an additional set of cis elements that may be involved in the regulation of mRNA polyadenylation.

Figures

FIGURE 1.
Schematic of a poly(A) region and identification of cis elements in the −100/−41 region. (A) A poly(A) region is a genomic sequence containing a poly(A) site. The poly(A) site is considered at position 0. Four subregions were investigated in this study, namely −100/−41, −40/−1, +1/+40, and +41/+100. Elements identified for these regions are named AUE, CUE, CDE, and ADE, respectively. (B) Scatter plot of all 4096 hexamers. (X-axis) zsw, corresponding to the difference between strong and weak poly(A) sites in a specific region (the −100/−41 region in this graph); (Y-axis) zoe, corresponding to the difference between observed and expected values in the specific poly(A) region. (Black asterisks) Hexamers whose zsw and zoe are above the respective cutoffs; (gray asterisks) the rest of the hexamers. (C) Clustering of hexamers. Hexamers were clustered according to their mutual dissimilarity by agglomerative hierarchical clustering, which is shown at right. Dissimilarity value 2.6 was used to group hexamers (see Materials and Methods). Grouped hexamers were aligned by a multiple sequence alignment method described in Materials and Methods. Each hexamer group gave rise to a sequence logo, shown at left.
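The region layout in panel A and the two-score selection in panel B can be sketched in a few lines of Python. This is a minimal illustration, not the PROBE program itself: the function names, the `site_index` convention, and the cutoff parameters are assumptions introduced here, and the legend does not say how zsw and zoe are computed, so the sketch takes them as precomputed inputs. The alphabet follows the RNA notation used in the text; genomic input would need T converted to U first.

```python
from itertools import product

# Subregion boundaries (relative to the poly(A) site at position 0) and
# element-class names, as listed in the Figure 1A legend.
SUBREGIONS = {
    "AUE": (-100, -41),
    "CUE": (-40, -1),
    "CDE": (1, 40),
    "ADE": (41, 100),
}

# All 4**6 = 4096 hexamers over the RNA alphabet.
ALL_HEXAMERS = ["".join(p) for p in product("ACGU", repeat=6)]


def slice_subregion(region, site_index, start, end):
    """Return the bases at positions start..end (inclusive) relative to the site.

    `site_index` is the string index of the poly(A) site (position 0);
    this indexing convention is an assumption of the sketch.
    """
    lo = site_index + start
    hi = site_index + end + 1
    if lo < 0 or hi > len(region):
        raise ValueError("region does not cover the requested subregion")
    return region[lo:hi]


def count_hexamers(seq):
    """Count every overlapping 6-mer in `seq`."""
    counts = {}
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        counts[hexamer] = counts.get(hexamer, 0) + 1
    return counts


def select_enriched(zsw, zoe, zsw_cutoff, zoe_cutoff):
    """Return hexamers whose zsw and zoe both exceed their cutoffs.

    The scores are passed in as dicts keyed by hexamer; the cutoff values
    are hypothetical parameters, since the legend does not state them.
    """
    return sorted(
        h for h in zsw
        if zsw[h] > zsw_cutoff and zoe.get(h, float("-inf")) > zoe_cutoff
    )
```

Keeping the scoring step outside the sketch is deliberate: only the subregion boundaries, the hexamer count of 4096, and the "both scores above cutoff" rule are stated in the legend.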
FIGURE 2.
AAUAAA element. (A) Average score (left) and fraction of hits (right) of CUE.2 in regions from all poly(A) sites (black lines), strong poly(A) sites (red lines), and weak poly(A) sites (green lines). (Dotted vertical lines) −100-nt, −40-nt, 0-nt, +40-nt, and +100-nt positions; (dotted horizontal lines) the average of values in the −100-nt to +100-nt region from all poly(A) sites. (B) Association of different PAS hexamers with poly(A) sites of various types. Constitutive poly(A) sites are sites from genes with only one poly(A) site. For genes with multiple poly(A) sites, a site that is utilized more than 75% of the time is classified as a strong poly(A) site. If a gene has a strong site, its other poly(A) sites are classified as weak sites. If it has no strong site, all of its poly(A) sites are classified as median sites.
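The site classification rules in panel B translate directly into a small function. The sketch below is an illustration under assumptions: `classify_sites` and its input format are hypothetical, and usage is taken as a per-site count of supporting observations, since the legend does not say how "utilized more than 75% of the time" is measured.

```python
def classify_sites(usage_by_site, strong_threshold=0.75):
    """Label each poly(A) site of one gene following the Figure 2B rules.

    `usage_by_site` maps a site identifier to a usage count (an assumed
    input format). Returns a dict mapping each site to its label.
    """
    total = sum(usage_by_site.values())
    if total <= 0:
        raise ValueError("gene has no usage observations")

    # A gene with a single poly(A) site: that site is constitutive.
    if len(usage_by_site) == 1:
        return {site: "constitutive" for site in usage_by_site}

    fractions = {site: n / total for site, n in usage_by_site.items()}

    # "More than 75%" is a strict inequality, so at most one site qualifies.
    strong = {site for site, f in fractions.items() if f > strong_threshold}

    if strong:
        return {
            site: "strong" if site in strong else "weak"
            for site in fractions
        }
    # No strong site: every site in the gene is a median site.
    return {site: "median" for site in fractions}
```

For example, a gene with counts `{"pA1": 8, "pA2": 2}` yields one strong and one weak site, while `{"pA1": 3, "pA2": 1}` yields two median sites because 75% exactly does not exceed the threshold.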
FIGURE 3.
U-rich elements. (A) CDE.2; (B) CUE.1; (C) AUE.2. See the Figure 2A legend for detailed description of the graphs.
FIGURE 4.
GU-rich elements. (A) CDE.3; (B) CDE.1; (C) CDE.4. See the Figure 2A legend for detailed description of the graphs.
FIGURE 5.
UAUA and UGUA elements. (A) AUE.3; (B) AUE.4. See the Figure 2A legend for detailed description of the graphs.
FIGURE 6.
Schematic of cis elements for human poly(A) sites. (Gray boxes) Novel candidate cis elements identified in this study.
