PLoS Genet. 2015 Jul 1;11(7):e1005310.
doi: 10.1371/journal.pgen.1005310. eCollection 2015 Jul.

Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality

Nicholas C Wu et al. PLoS Genet. 2015.

Abstract

Viruses often encode proteins with multiple functions due to their compact genomes. Existing approaches to identify functional residues largely rely on sequence conservation analysis. Inferring functional residues from sequence conservation can produce false positives, in which the conserved residues are functionally silent, or false negatives, where functional residues are not identified since they are species-specific and therefore non-conserved. Furthermore, the tedious process of constructing and analyzing individual mutations limits the number of residues that can be examined in a single study. Here, we developed a systematic approach to identify the functional residues of a viral protein by coupling experimental fitness profiling with protein stability prediction using the influenza virus polymerase PA subunit as the target protein. We identified a significant number of functional residues that were influenza type-specific and were evolutionarily non-conserved among different influenza types. Our results indicate that type-specific functional residues are prevalent and may not otherwise be identified by sequence conservation analysis alone. More importantly, this technique can be adapted to any viral (and potentially non-viral) protein where structural information is available.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Construction of the mutant libraries.
(A) A schematic representation of the fitness profiling experiment is shown. A 240 bp insert was generated by error-prone PCR and BsaI digestion. The corresponding vector was generated by high-fidelity PCR and BsmBI digestion. Each of the nine plasmid libraries in this study consists of ~50,000 clones. Each viral mutant library was rescued by transfecting ~35 million 293T cells. Each infection was performed with ~10 million A549 cells. (B) A schematic representation of the sequencing library preparation is shown. DNA plasmid mutant library or viral cDNA was used for PCR. This PCR amplified the 240 bp randomized region. The amplicon product was then digested with BpmI, end-repaired, dA-tailed, ligated to sequencing adapters, and sequenced using the Illumina MiSeq platform. BpmI digestion removed the primer region in the amplicon PCR, resulting in sequencing reads covering only the barcode for multiplex sequencing and the 240 bp region that was randomized in the mutant library. With this experimental design, the number of mutations carried by individual genomes in the mutant libraries could be precisely determined.
Fig 2. Fitness profiling of PA influenza virus polymerase subunit.
(A) Correlations of log10 relative frequency of individual point mutations between replicates are shown. $\text{Relative frequency}_{\text{mutation } i} = (\text{Occurrence frequency}_{\text{mutation } i}) / (\text{Occurrence frequency}_{\text{WT}})$. (B) Log10 RF indices for silent mutations, nonsense mutations, and missense mutations are shown as histograms. Point mutations located at the 5′-terminal 400 bp and 3′-terminal 400 bp regions are not included in this analysis to avoid complication by the vRNA packaging signal [93, 94]. (C) The locations of the PA C-terminal domain and the PA N-terminal domain are shown as white boxes. The locations of the mutated regions in each mutant library are shown as green boxes. Log10 RF indices for individual point mutations are plotted across the PA gene. Each point mutation is color coded as in panel B. Purple: silent mutations; Cyan: nonsense mutations; Brown: missense mutations. A smooth curve was fitted by loess and plotted for each point mutation type.
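The relative frequency defined in panel A of Fig 2 can be illustrated with a minimal sketch. The function names, the use of raw read counts, and the handling of unobserved mutations are assumptions made for illustration; the authors' actual pipeline is not described on this page.

```python
import math


def relative_frequency(mutation_count: int, wt_count: int, total_reads: int) -> float:
    """Occurrence frequency of a mutation divided by that of wild type (WT).

    Assumption: occurrence frequency is taken as reads / total reads.
    """
    if total_reads <= 0 or wt_count <= 0:
        raise ValueError("total_reads and wt_count must be positive")
    mutation_freq = mutation_count / total_reads
    wt_freq = wt_count / total_reads
    return mutation_freq / wt_freq


def log10_relative_frequency(mutation_count: int, wt_count: int, total_reads: int) -> float:
    """log10 of the relative frequency; -inf when the mutation was never observed."""
    rf = relative_frequency(mutation_count, wt_count, total_reads)
    return math.log10(rf) if rf > 0 else float("-inf")


if __name__ == "__main__":
    # Hypothetical counts: 50 mutant reads and 5,000 WT reads out of 100,000.
    print(log10_relative_frequency(50, 5000, 100000))
```

Because both frequencies share the same denominator, the ratio reduces to mutant reads over WT reads within one sample; the total is kept explicit here only to mirror the caption's definition.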
Fig 3. Systematic identification of functional residues.
(A) Predicted ΔΔG for each point mutation is plotted against the log10 RF index. The horizontal green line represents the RF index cutoff used in this study, RF index = 0.15. For the N-terminal domain, the Spearman’s rank correlation between log10 RF index and predicted ΔΔG is -0.20 (P = 1.3e−4). For the C-terminal domain, the Spearman’s rank correlation between log10 RF index and predicted ΔΔG is -0.18 (P = 6.8e−10). (B) The distributions of relative SASA are shown for residues that carried at least one substitution of interest (RF index < 0.15 and a predicted ΔΔG < 0) and for residues that did not carry any substitutions of interest. (C) This analysis was performed on solvent-exposed residues (relative SASA > 0.2) that carried a deleterious mutation (RF index < 0.15). The pie chart shows the fraction of residues that carried a substitution of interest (ΔΔG < 0) and the fraction that did not (ΔΔG ≥ 0).
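The thresholds quoted in the Fig 3 caption can be combined into a small filter. The sketch below uses only the cutoffs stated in the caption (RF index < 0.15, predicted ΔΔG < 0, relative SASA > 0.2); the `Substitution` record, the function names, and the example values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

RF_INDEX_CUTOFF = 0.15  # cutoff stated in the Fig 3 caption
SASA_CUTOFF = 0.2       # relative SASA above this is treated as solvent exposed


@dataclass(frozen=True)
class Substitution:
    """Hypothetical record for one profiled point mutation."""
    residue: int
    rf_index: float
    ddg: float            # predicted ddG
    relative_sasa: float  # relative SASA of the residue


def is_of_interest(sub: Substitution) -> bool:
    """Deleterious in the fitness profile and predicted ddG below zero."""
    return sub.rf_index < RF_INDEX_CUTOFF and sub.ddg < 0


def fraction_of_interest(subs: list[Substitution]) -> float:
    """Panel C style summary: among solvent-exposed residues carrying a
    deleterious mutation, the fraction with at least one substitution of interest."""
    by_residue = defaultdict(list)
    for sub in subs:
        if sub.relative_sasa > SASA_CUTOFF and sub.rf_index < RF_INDEX_CUTOFF:
            by_residue[sub.residue].append(sub)
    if not by_residue:
        return 0.0
    hits = sum(1 for group in by_residue.values() if any(is_of_interest(s) for s in group))
    return hits / len(by_residue)


if __name__ == "__main__":
    example = [
        Substitution(10, 0.05, -1.0, 0.5),  # exposed, deleterious, ddG < 0
        Substitution(20, 0.10, 2.0, 0.4),   # exposed, deleterious, ddG >= 0
        Substitution(30, 0.05, -0.5, 0.1),  # buried, excluded
        Substitution(40, 0.90, -1.0, 0.6),  # not deleterious, excluded
    ]
    print(fraction_of_interest(example))
```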
Fig 4. Identification of PA residues that carry non-polymerase functions.
(A) Locations of substitutions with an RF index < 0.15 and a predicted ΔΔG < 0 are colored in orange or red. Mutations that were individually reconstructed and analyzed in this study are labeled and colored in red. Residues that were not covered in our profiling data are colored in grey. For PB1, only the N-terminal helix is structurally available in this PDB file, and is colored in green. PDB ID: 2ZNL [23]. (B) The effects of different PA point mutations on influenza polymerase activity were measured using an influenza A virus-inducible luciferase reporter assay [63]. Error bars represent the standard deviation of three biological replicates. (C) The expression level of each PA mutant was tested by immunoblot analysis. (D) TCID50 of the rescued mutant or WT viruses was measured.
Fig 5. Structural analysis of putative functional residues.
(A) The location of a putative functional subdomain is shown on the structure of the influenza polymerase heterotrimeric complex (PDB: 4WSB) [64]. For PA, residues are colored according to the scheme presented in Fig 4. A putative host determinant residue, S552, is colored in magenta. Note that residue 559 carries an arginine [R] instead of a lysine [K] on the PA of A/WSN/33. (B) The effects of different PA point mutations on influenza polymerase activity were measured using an influenza A virus-inducible luciferase reporter assay [63]. Error bars represent the standard deviation of three biological replicates. (C) The expression level of each C-terminal Flag-tagged PA mutant or WT was tested by immunoblot analysis. Actin expression served as a loading control.
Fig 6. Sequence entropy analysis.
(A) Distribution of sequence entropy for functional residues, structural residues, and “other” residues. (B) Distribution of dN/dS for functional residues, structural residues, and “other” residues. (C) Sequence entropy, dN/dS, the natural consensus residue, FRcons category, and FRsubtype category are shown for the validated functional residues in this study. The dashed line indicates the median value across the entire PA segment. For FRcons and FRsubtype, we considered a residue with a category of ≥ 8 as a hit (a total of 72 residues were identified as a hit in each of these two methods).
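The caption does not say how sequence entropy was computed. As a rough illustration only, the sketch below assumes per-position Shannon entropy in bits over an aligned set of protein sequences, ignoring gap characters; the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of one alignment column, ignoring gaps ('-')."""
    counts = Counter(residue for residue in column if residue != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences: list[str]) -> list[float]:
    """Entropy at every position of equal-length aligned sequences."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy("".join(seq[i] for seq in sequences)) for i in range(length)]


if __name__ == "__main__":
    # Toy alignment: position 0 is invariant, positions 1 and 2 are split evenly.
    print(alignment_entropy(["ACD", "ACE", "AGE", "AGD"]))
```

Under this definition a fully conserved position scores 0, so the functional but non-conserved residues described in the abstract would be expected to show higher values.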
Fig 7. Structure-function relationship of residue 281.
(A) The interaction of influenza A PA with the RNA phosphate backbone located between bases 3 and 4 is shown. RNA is colored in green. PA is colored in cyan. Hydrogen bonds are represented by dotted black lines. Numbering of residue position is based on A/WSN/33. Conversion of residue position numbering is described in S3 Table. (B) The interaction of influenza B PA with the RNA phosphate backbone located between bases 3 and 4 is shown. RNA is colored in green. PA is colored in cyan. Hydrogen bonds are represented by dotted black lines. Numbering of residue position is based on A/WSN/33. Conversion of residue position numbering is described in S3 Table.

References

    1. Bairoch A, Apweiler R. The SWISS-PROT protein sequence data bank and its new supplement TREMBL. Nucleic Acids Res. 1996 Jan;24(1):21–25. doi: 10.1093/nar/24.1.21
    2. Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005 Jan;33(Database issue):D501–D504. doi: 10.1093/nar/gki025
    3. Kanz C, Aldebert P, Althorpe N, Baker W, Baldwin A, Bates K, et al. The EMBL Nucleotide Sequence Database. Nucleic Acids Res. 2005 Jan;33(Database issue):D29–D33. doi: 10.1093/nar/gki098
    4. Li Z, Watanabe T, Hatta M, Watanabe S, Nanbo A, Ozawa M, et al. Mutational analysis of conserved amino acids in the influenza A virus nucleoprotein. J Virol. 2009 May;83(9):4153–4162. doi: 10.1128/JVI.02642-08
    5. Stewart SM, Pekosz A. Mutations in the membrane-proximal region of the influenza A virus M2 protein cytoplasmic tail have modest effects on virus replication. J Virol. 2011 Dec;85(23):12179–12187. doi: 10.1128/JVI.05970-11
