Proc Natl Acad Sci U S A. 2017 Nov 21;114(47):E10244-E10253.

doi: 10.1073/pnas.1706539114. Epub 2017 Nov 6.

An RNA structure-mediated, posttranscriptional model of human α-1-antitrypsin expression

Meredith Corley¹ ², Amanda Solem¹, Gabriela Phillips¹, Lela Lackey¹, Benjamin Ziehr³ ⁴, Heather A Vincent³ ⁴, Anthony M Mustoe⁵, Silvia B V Ramos⁶, Kevin M Weeks⁵, Nathaniel J Moorman³ ⁴, Alain Laederach⁷ ²

Affiliations

¹ Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599.
² Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599.
³ Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599.
⁴ Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599.
⁵ Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599.
⁶ Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599.
⁷ Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599; alain@unc.edu.

PMID: 29109288
PMCID: PMC5703279
DOI: 10.1073/pnas.1706539114

Meredith Corley et al. Proc Natl Acad Sci U S A. 2017.

Abstract

Chronic obstructive pulmonary disease (COPD) affects over 65 million individuals worldwide, where α-1-antitrypsin deficiency is a major genetic cause of the disease. The α-1-antitrypsin gene, SERPINA1, expresses an exceptional number of mRNA isoforms generated entirely by alternative splicing in the 5'-untranslated region (5'-UTR). Although all SERPINA1 mRNAs encode exactly the same protein, expression levels of the individual mRNAs vary substantially in different human tissues. We hypothesize that these transcripts behave unequally due to a posttranscriptional regulatory program governed by their distinct 5'-UTRs and that this regulation ultimately determines α-1-antitrypsin expression. Using whole-transcript selective 2'-hydroxyl acylation by primer extension (SHAPE) chemical probing, we show that splicing yields distinct local 5'-UTR secondary structures in SERPINA1 transcripts. Splicing in the 5'-UTR also changes the inclusion of long upstream ORFs (uORFs). We demonstrate that disrupting the uORFs results in markedly increased translation efficiencies in luciferase reporter assays. These uORF-dependent changes suggest that α-1-antitrypsin protein expression levels are controlled at the posttranscriptional level. A leaky-scanning model of translation based on Kozak translation initiation sequences alone does not adequately explain our quantitative expression data. However, when we incorporate the experimentally derived RNA structure data, the model accurately predicts translation efficiencies in reporter assays and improves α-1-antitrypsin expression prediction in primary human tissues. Our results reveal that RNA structure governs a complex posttranscriptional regulatory program of α-1-antitrypsin expression. Crucially, these findings describe a mechanism by which genetic alterations in noncoding gene regions may result in α-1-antitrypsin deficiency.

Keywords: RNA secondary structure; SERPINA1; translation efficiency; uORFs; α-1-antitrypsin deficiency.
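
The leaky-scanning idea described in the abstract can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the paper's fitted model: its equations are not reproduced on this page. The function names, the no-reinitiation assumption, and the logistic form of the structure adjustment (with a free coefficient `beta` whose sign and size would have to be fit to data) are all hypothetical.

```python
import math


def leaky_scanning_te(uorf_init_probs, cds_init_prob):
    """Toy leaky-scanning estimate of main-ORF translation efficiency.

    A scanning ribosome initiates at uORF i with probability p_i and
    otherwise scans past it. Assumption: ribosomes that initiate at a
    uORF never reinitiate downstream.
    """
    reach_cds = 1.0
    for p in uorf_init_probs:
        reach_cds *= 1.0 - p
    return reach_cds * cds_init_prob


def structure_adjusted_prob(kozak_prob, dg_unfold, beta):
    """Hypothetical structure adjustment of an initiation probability.

    Shifts the log-odds of initiation by beta * dg_unfold. The sign of
    beta is deliberately not assumed here; beta = 0 recovers the
    Kozak-only probability.
    """
    logit = math.log(kozak_prob / (1.0 - kozak_prob))
    return 1.0 / (1.0 + math.exp(-(logit + beta * dg_unfold)))


# Illustrative numbers only: two uORFs with made-up initiation probabilities.
kozak_only = leaky_scanning_te([0.5, 0.2], cds_init_prob=0.9)
```

With `beta` set to zero the structure-adjusted model collapses to the Kozak-only model, which makes the two easy to compare on the same data.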

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Fig. 1.**
The *SERPINA1* gene produces 11 splice isoforms, all encoding the same protein. (A) All exons in *SERPINA1*. Coding sequence (CDS) exons are shown in red, and untranslated regions (UTRs) in blue. Each exon, splice donor (SD), and splice acceptor (SA) is identified by a unique name. The two *SERPINA1* TSSs are labeled TSS1 and TSS2. Disease-associated variants, as cataloged by the Human Gene Mutation Database, are indicated with black lines, including the common α-1-antitrypsin deficiency-associated Pi*S and Pi*Z alleles. Upstream ORFs (uORFs) are indicated by red boxes and named. uORF δ/δ′ spans a splice junction and is present only in isoforms with exon E1b.2. (B) The total amount of expressed *SERPINA1* differs across 16 human tissue types. Total *SERPINA1* transcript amounts were estimated from the Illumina BodyMap 2.0 project and are shown in log relative transcripts per million (TPM). (C) The *SERPINA1* transcript isoforms are expressed, with different frequencies, across different tissues. Transcripts are specified with their NCBI names. The log(TPM) of each *SERPINA1* transcript is shown for each tissue and for A549 and HepG2 cells. TPMs are relative to liver, which expresses the most *SERPINA1* and is set to a total of 10⁶.
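
The liver-relative scaling described in panel C can be sketched as follows. This is an illustration only: it assumes a base-10 logarithm and that zero-abundance isoforms are simply left undefined, neither of which is stated in the caption. The function name and the isoform labels in the example are hypothetical.

```python
import math


def relative_log_tpm(tpm_by_tissue, reference="liver"):
    """Rescale isoform TPMs so the reference tissue sums to 1e6, then log10.

    tpm_by_tissue maps tissue name -> {isoform name -> TPM}.
    Isoforms with zero abundance are returned as None (assumption).
    """
    scale = 1e6 / sum(tpm_by_tissue[reference].values())
    result = {}
    for tissue, isoforms in tpm_by_tissue.items():
        result[tissue] = {}
        for name, tpm in isoforms.items():
            rel = tpm * scale
            result[tissue][name] = math.log10(rel) if rel > 0 else None
    return result


# Made-up values for two hypothetical isoforms.
example = relative_log_tpm({
    "liver": {"isoform_a": 300.0, "isoform_b": 100.0},
    "lung": {"isoform_a": 4.0, "isoform_b": 0.0},
})
```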

**Fig. 2.**
Translation efficiency (TE) differs between *SERPINA1* transcripts and is affected by uORFs. (A) The TEs of six *SERPINA1* 5′-UTRs and their SDs, as measured by luciferase reporter assays. Replicate TE values are shown as open squares. Transcripts are labeled by NCBI name. Measurements are relative to the luciferase assay control. The number of uORFs in each transcript is indicated (*Bottom*). The Kozak sequence of each uORF is listed. (B) Schematic of the *SERPINA1* luciferase constructs and empty vector control. Luciferase CDS not to scale. uORFs in each transcript are indicated with Greek letters and shaded by Kozak sequence score (see color scale). Red arrows indicate uORFs selected for mutation. (C) TEs of the six *SERPINA1* constructs with disrupted (mutated) uORFs and their SDs, relative to the wild type (above). (D) TEs of wild type and uORF mutant *SERPINA1* constructs predicted with a leaky-scanning model of translation (Eq. 1) fit to experimental TEs, as measured by luciferase assays (r² = 0.400, n = 12).

**Fig. 3.**
SHAPE-MaP structure probing data for *SERPINA1* transcripts. (A) SHAPE reactivity of each nucleotide in a region of low median SHAPE values around the start codon of transcript NM_001002236.2. Each value is shown with its SE and colored by SHAPE reactivity according to the color scale. Nucleotides are numbered by their relative position within the transcript; the start codon is labeled +1. (B) SHAPE reactivity of each position in a region of high median SHAPE values in the coding sequence of transcript NM_001002236.2. (C) The windowed, median-centered SHAPE profiles of six *SERPINA1* transcripts ordered by length. Higher SHAPE values indicate unstructured (unpaired) regions, while lower SHAPE values indicate structured (base-paired) regions. uORFs are indicated with gray shaded regions and named with Greek letters. Vertical bars separate exons. (D) The minimum free-energy (MFE) secondary structure of transcript NM_001002236.2, modeled by computational folding with SHAPE reactivity information.
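
A windowed, median-centered profile like the one in panel C could be computed roughly as below. This is a sketch under assumptions: the window size, the centered sliding window, and the handling of missing reactivities (`None`) are hypothetical choices, not taken from the paper.

```python
from statistics import median


def windowed_median_profile(reactivities, window=51):
    """Sliding-window median of SHAPE reactivities, centered on the global median.

    Positive values mark windows more reactive (less structured) than the
    transcript as a whole; negative values mark more structured windows.
    Positions with no data in their window are returned as None.
    """
    known = [r for r in reactivities if r is not None]
    if not known:
        return [None] * len(reactivities)
    global_median = median(known)
    half = window // 2
    profile = []
    for i in range(len(reactivities)):
        lo = max(0, i - half)
        hi = min(len(reactivities), i + half + 1)
        in_window = [r for r in reactivities[lo:hi] if r is not None]
        profile.append(median(in_window) - global_median if in_window else None)
    return profile


# Made-up reactivities with a small window so the effect is visible.
demo = windowed_median_profile([0.1, 0.2, 0.3, 0.4, 0.5], window=3)
```

Using the median rather than the mean keeps a few very reactive nucleotides from dominating a window.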

**Fig. 4.**
Structural data greatly improve the leaky-scanning model of translation efficiency (TE). (A) SHAPE-based predicted structures around the uORFs and coding sequence start in transcript NM_001002236.2. uORFs are labeled by name. Bases are colored according to their SHAPE reactivity, as measured by SHAPE-MaP. Bases with unknown SHAPE data are colored gray. Kozak sequences are outlined in green. (B) SHAPE-based predicted structures around the uORF and coding sequence start in transcript NM_000295.4. (C) TEs of wild type and uORF mutant *SERPINA1* constructs predicted with the structure leaky-scanning model of translation (Eq. 3) fit to experimental TEs, as measured by luciferase assays (r² = 0.936, n = 12).

**Fig. 5.**
Structure mutants show translation efficiency (TE) is a function of ΔG of unfolding around the uORF Kozak sequence. (A) TE relative to wild type (WT) for three uORFα structure mutants in transcript NM_001002235.2. Replicate TE values are shown as open squares. The predicted ΔG of unfolding is shown for each structure mutant. (B) Structure mutant and WT TEs plotted with the structure leaky-scanning (solid line) and leaky-scanning (dotted line) models as functions of uORFα ΔG of unfolding. The predicted structure for each mutant and the WT uORFα is shown. Kozak sequences are outlined in green. CAA repeats are abbreviated in the mutants. (C) The structure leaky-scanning and leaky-scanning models as functions of uORFα ΔG of unfolding (lilac), or uORF δ/δ′ ΔG of unfolding (peach). Experimental TEs are plotted for *SERPINA1* structure mutants (stars), uORF mutants (triangles), and WT constructs (circles) that contained only uORFα or uORFα, β, and δ/δ′. (D) The structure leaky-scanning and leaky-scanning models as functions of ORF (CDS) ΔG of unfolding. Experimental TEs are plotted for *SERPINA1* constructs that contained no uORFs.

**Fig. 6.**
Predictions of *SERPINA1* translation efficiency (TE) in 10 human tissues are improved with the structure leaky-scanning model. (A) Total *SERPINA1* transcript versus α-1-antitrypsin protein measurements show no correlation (r² = 0.0, n = 10). Protein measurements are in normalized spectral counts (68); transcript measurements are in transcripts per million (TPM). (B) Leaky-scanning model predictions of TE versus measured TE in each tissue (r² = 0.591, n = 10). Each tissue is labeled and colored in the plot and in the human figure according to its prediction percent error (Eq. 5). (C) Structure leaky-scanning model predictions of TE versus measured TE in each tissue (r² = 0.655, n = 10).
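
The two summary statistics quoted in this caption can be sketched as follows. The paper's own percent-error definition (Eq. 5) is not shown on this page, so the conventional formula is assumed here, and r² is assumed to be the squared Pearson correlation. Function names and the example values are hypothetical.

```python
def percent_error(predicted, measured):
    """Conventional percent error of a prediction (assumed definition)."""
    return abs(predicted - measured) / abs(measured) * 100.0


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # no variance in one variable: treat as uncorrelated
    return (sxy * sxy) / (sxx * syy)


# Made-up predicted and measured values, for illustration only.
predicted = [1.0, 2.0, 3.0]
measured = [2.0, 4.0, 6.0]
fit = r_squared(predicted, measured)
```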
