2017 Nov 21;114(47):E10244-E10253.
doi: 10.1073/pnas.1706539114. Epub 2017 Nov 6.

An RNA structure-mediated, posttranscriptional model of human α-1-antitrypsin expression

Meredith Corley et al. Proc Natl Acad Sci U S A.

Abstract

Chronic obstructive pulmonary disease (COPD) affects over 65 million individuals worldwide, where α-1-antitrypsin deficiency is a major genetic cause of the disease. The α-1-antitrypsin gene, SERPINA1, expresses an exceptional number of mRNA isoforms generated entirely by alternative splicing in the 5'-untranslated region (5'-UTR). Although all SERPINA1 mRNAs encode exactly the same protein, expression levels of the individual mRNAs vary substantially in different human tissues. We hypothesize that these transcripts behave unequally due to a posttranscriptional regulatory program governed by their distinct 5'-UTRs and that this regulation ultimately determines α-1-antitrypsin expression. Using whole-transcript selective 2'-hydroxyl acylation by primer extension (SHAPE) chemical probing, we show that splicing yields distinct local 5'-UTR secondary structures in SERPINA1 transcripts. Splicing in the 5'-UTR also changes the inclusion of long upstream ORFs (uORFs). We demonstrate that disrupting the uORFs results in markedly increased translation efficiencies in luciferase reporter assays. These uORF-dependent changes suggest that α-1-antitrypsin protein expression levels are controlled at the posttranscriptional level. A leaky-scanning model of translation based on Kozak translation initiation sequences alone does not adequately explain our quantitative expression data. However, when we incorporate the experimentally derived RNA structure data, the model accurately predicts translation efficiencies in reporter assays and improves α-1-antitrypsin expression prediction in primary human tissues. Our results reveal that RNA structure governs a complex posttranscriptional regulatory program of α-1-antitrypsin expression. Crucially, these findings describe a mechanism by which genetic alterations in noncoding gene regions may result in α-1-antitrypsin deficiency.
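To illustrate the general idea behind a leaky-scanning model, here is a minimal Python sketch. It is not the authors' model: their equations are not reproduced on this page. It assumes that each upstream start codon captures a fixed fraction of the scanning ribosomes that reach it, that the remainder scan onward, and that ribosomes do not reinitiate after translating a uORF. The function name `leaky_scanning_te` and all numeric values are hypothetical, and no RNA structure term is included.

```python
from typing import Sequence


def leaky_scanning_te(uorf_init_probs: Sequence[float], cds_init_prob: float) -> float:
    """Fraction of scanning ribosomes that initiate at the main ORF.

    Assumptions (illustrative only, not taken from the paper):
    - each uORF start codon captures a fraction p of arriving ribosomes,
      for example a value derived from its Kozak sequence score;
    - the remaining (1 - p) continue scanning downstream;
    - ribosomes that translate a uORF do not reinitiate.
    """
    for p in (*uorf_init_probs, cds_init_prob):
        if not 0.0 <= p <= 1.0:
            raise ValueError("initiation probabilities must lie in [0, 1]")

    reaching_cds = 1.0
    for p in uorf_init_probs:
        reaching_cds *= 1.0 - p  # ribosomes that leak past this uORF
    return reaching_cds * cds_init_prob


# Hypothetical 5'-UTR with two uORFs, then the same UTR with the first uORF disrupted.
wild_type = leaky_scanning_te([0.5, 0.2], cds_init_prob=0.9)
uorf_mutant = leaky_scanning_te([0.2], cds_init_prob=0.9)
```

Under these assumptions, removing a uORF can only raise the predicted translation efficiency, which matches the direction of the uORF-disruption result described in the abstract.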

Keywords: RNA secondary structure; SERPINA1; translation efficiency; uORFs; α-1-antitrypsin deficiency.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
The SERPINA1 gene produces 11 splice isoforms, all encoding the same protein. (A) All exons in SERPINA1. Coding sequence (CDS) exons are shown in red, and untranslated regions (UTRs) in blue. Each exon, splice donor (SD), and splice acceptor (SA) is identified by a unique name. The two SERPINA1 TSSs are labeled TSS1 and TSS2. Disease-associated variants, as cataloged by the Human Gene Mutation Database, are indicated with black lines, including the common α-1-antitrypsin deficiency-associated Pi*S and Pi*Z alleles. Upstream ORFs (uORFs) are indicated by red boxes and named. uORF δ/δ′ spans a splice junction and is present only in isoforms with exon E1b.2. (B) The total amount of expressed SERPINA1 differs across 16 human tissue types. Total SERPINA1 transcript amounts were estimated from the Illumina BodyMap 2.0 project and are shown in log relative transcripts per million (TPM). (C) The SERPINA1 transcript isoforms are expressed, with different frequencies, across different tissues. Transcripts are specified with their NCBI names. The log(TPM) of each SERPINA1 transcript is shown for each tissue and for A549 and HepG2 cells. TPMs are relative to liver, which expresses the most SERPINA1 and is set to a total of 10^6.
Fig. 2.
Translation efficiency (TE) differs between SERPINA1 transcripts and is affected by uORFs. (A) The TEs of six SERPINA1 5′-UTRs and their SDs, as measured by luciferase reporter assays. Replicate TE values are shown as open squares. Transcripts are labeled by NCBI name. Measurements are relative to the luciferase assay control. The number of uORFs in each transcript is indicated (Bottom). The Kozak sequence of each uORF is listed. (B) Schematic of the SERPINA1 luciferase constructs and empty vector control. Luciferase CDS not to scale. uORFs in each transcript are indicated with Greek letters and shaded by Kozak sequence score (see color scale). Red arrows indicate uORFs selected for mutation. (C) TEs of the six SERPINA1 constructs with disrupted (mutated) uORFs and their SDs, relative to the wild type (above). (D) TEs of wild type and uORF mutant SERPINA1 constructs predicted with a leaky-scanning model of translation (Eq. 1) fit to experimental TEs, as measured by luciferase assays (r² = 0.400, n = 12).
Fig. 3.
SHAPE-MaP structure probing data for SERPINA1 transcripts. (A) SHAPE reactivity of each nucleotide in a region of low median SHAPE values around the start codon of transcript NM_001002236.2. Each value is shown with its SE and colored by SHAPE reactivity according to the color scale. Nucleotides are numbered by their relative position within the transcript; the start codon is labeled +1. (B) SHAPE reactivity of each position in a region of high median SHAPE values in the coding sequence of transcript NM_001002236.2. (C) The windowed, median-centered SHAPE profiles of six SERPINA1 transcripts ordered by length. Higher SHAPE values indicate unstructured (unpaired) regions, while lower SHAPE values indicate structured (base-paired) regions. uORFs are indicated with gray shaded regions and named with Greek letters. Vertical bars separate exons. (D) The minimum free-energy (MFE) secondary structure of transcript NM_001002236.2, modeled by computational folding with SHAPE reactivity information.
Fig. 4.
Structural data greatly improve the leaky-scanning model of translation efficiency (TE). (A) SHAPE-based predicted structures around the uORFs and coding sequence start in transcript NM_001002236.2. uORFs are labeled by name. Bases are colored according to their SHAPE reactivity, as measured by SHAPE-MaP. Bases with unknown SHAPE data are colored gray. Kozak sequences are outlined in green. (B) SHAPE-based predicted structures around the uORF and coding sequence start in transcript NM_000295.4. (C) TEs of wild type and uORF mutant SERPINA1 constructs predicted with the structure leaky-scanning model of translation (Eq. 3) fit to experimental TEs, as measured by luciferase assays (r² = 0.936, n = 12).
Fig. 5.
Structure mutants show translation efficiency (TE) is a function of ΔG of unfolding around the uORF Kozak sequence. (A) TE relative to wild type (WT) for three uORFα structure mutants in transcript NM_001002235.2. Replicate TE values are shown as open squares. The predicted ΔG of unfolding is shown for each structure mutant. (B) Structure mutant and WT TEs plotted with the structure leaky-scanning (solid line) and leaky-scanning (dotted line) models as functions of uORFα ΔG of unfolding. The predicted structure for each mutant and the WT uORFα is shown. Kozak sequences are outlined in green. CAA repeats are abbreviated in the mutants. (C) The structure leaky-scanning and leaky-scanning models as functions of uORFα ΔG of unfolding (lilac), or uORF δ/δ′ ΔG of unfolding (peach). Experimental TEs are plotted for SERPINA1 structure mutants (stars), uORF mutants (triangles), and WT constructs (circles) that contained only uORFα or uORFα, β, and δ/δ′. (D) The structure leaky-scanning and leaky-scanning models as functions of ORF (CDS) ΔG of unfolding. Experimental TEs are plotted for SERPINA1 constructs that contained no uORFs.
Fig. 6.
Predictions of SERPINA1 translation efficiency (TE) in 10 human tissues are improved with the structure leaky-scanning model. (A) Total SERPINA1 transcript versus α-1-antitrypsin protein measurements show no correlation (r² = 0.0, n = 10). Protein measurements are in normalized spectral counts (68); transcript measurements are in transcripts per million (TPM). (B) Leaky-scanning model predictions of TE versus measured TE in each tissue (r² = 0.591, n = 10). Each tissue is labeled and colored in the plot and in the human figure according to its prediction percent error (Eq. 5). (C) Structure leaky-scanning model predictions of TE versus measured TE in each tissue (r² = 0.655, n = 10).
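
The captions above report model fit as r² values and a per-tissue percent error. The sketch below shows one conventional way to compute such quantities. It assumes that r² is the squared Pearson correlation between predicted and measured TE, and that percent error is the absolute deviation divided by the measured value. The paper's exact definitions, including its Eq. 5, are not given on this page, so both function names and both formulas are assumptions.

```python
from typing import Sequence


def r_squared(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return (cov * cov) / (var_p * var_m)


def percent_error(predicted: float, measured: float) -> float:
    """Absolute deviation of a prediction as a percentage of the measurement."""
    if measured == 0:
        raise ValueError("percent error is undefined for a zero measurement")
    return abs(predicted - measured) / abs(measured) * 100.0
```

A prediction that is a perfect linear function of the measurements gives an r² of 1 under this definition even when the individual percent errors are large, which is one reason to report both measures.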
