PeerJ. 2017 Feb 7:5:e2959.
doi: 10.7717/peerj.2959. eCollection 2017.

Evolutionary and functional implications of hypervariable loci within the skin virome

Geoffrey D Hannigan et al. PeerJ.

Abstract

Localized genomic variability is crucial for the ongoing conflicts between infectious microbes and their hosts. An understanding of evolutionary and adaptive patterns associated with genomic variability will help guide development of vaccines and antimicrobial agents. While most analyses of the human microbiome have focused on taxonomic classification and gene annotation, we investigated genomic variation of skin-associated viral communities. We evaluated patterns of viral genomic variation across 16 healthy human volunteers. Human papillomavirus (HPV) and Staphylococcus phages contained 106 and 465 regions of diversification, or hypervariable loci, respectively. Propionibacterium phage genomes were minimally divergent and contained no hypervariable loci. Genes containing hypervariable loci were involved in functions including host tropism and immune evasion. HPV and Staphylococcus phage hypervariable loci were associated with purifying selection. Amino acid substitution patterns were virus dependent, as were predictions of their phenotypic effects. We identified diversity generating retroelements as one likely mechanism driving hypervariability. We validated these findings in an independently collected skin metagenomic sequence dataset, suggesting that these features of skin virome genomic variability are widespread. Our results highlight the genomic variation landscape of the skin virome and provide a foundation for better understanding community viral evolution and the functional implications of genomic diversification of skin viruses.

Keywords: Bacteriophage; Dermatology; Evolution; Genomic variability; Metagenomics; Virome.

Conflict of interest statement

Samuel S. Minot is an employee of One Codex.

Figures

Figure 1. Phylogenetic & evolutionary characteristics of skin virome hypervariable loci.
(A) Scatter plot depicting the candidate contigs considered for analysis in this study. Each point is a contig that mapped to a reference virus genome. The x-axis shows the length (log10 scale) of the contig subsection that mapped to the reference genome. The y-axis shows the overall coverage of the contig, as a quantification of sequences aligning to the contig. The color highlights the reference virus genome that the contig was most similar to, and the size depicts the e-value (inverse log10) associated with the contig-reference match. The horizontal dashed line marks the threshold of 10× coverage, and the vertical dashed line marks the 750 bp length threshold. (B) Phylogenetic tree of skin virome HPVs and (C) Staphylococcus phages, structured onto a standard phylogenetic tree using reference genomes. HPV phylogeny was based on the L1 major capsid gene and Staphylococcus phage phylogeny was based on the large terminase subunit. Contigs from this study are highlighted as orange dots, and genera are labeled with text. Phylogenetic lengths were normalized to ranks to facilitate visualization. (D) Box plots depicting the evolutionary pressure of HPVs (left) and Staphylococcus bacteriophages (right) at the hypervariable loci (blue) and the regions immediately adjacent to the hypervariable loci (red). Adjacent regions were calculated as being twice the length of the hypervariable loci (see visualization to the right). The hypervariable locus and adjacent region (combination of both sides) from each sample were evaluated for evolutionary pressure (y-axis) using SNPs (pink lines in right illustration). Asterisk indicates a statistically significant difference (p < 0.01). Notched boxplots were created using ggplot and show the median (center line), the inter-quartile range (IQR; upper and lower boxes), the highest and lowest value within 1.5 × IQR (whiskers), and the notch which provides an approximate 95% confidence interval as defined by 1.58 × IQR/sqrt(n).
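The notch definition quoted in the caption (median ± 1.58 × IQR/sqrt(n)) can be illustrated with a minimal Python sketch. This is not the authors' code, which used ggplot; the function name `notch_interval` is hypothetical, and the quartile convention (`method="inclusive"`) is an assumption chosen to resemble R's default.

```python
import math
import statistics


def notch_interval(values):
    """Return (lower, median, upper) for a boxplot notch.

    The notch half-width is 1.58 * IQR / sqrt(n), as stated in the caption.
    The quartile method is an assumption; other conventions shift Q1 and Q3
    slightly for small samples.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to compute quartiles")
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(n)
    return median - half_width, median, median + half_width
```

When the notches of two boxes do not overlap, the medians are often read as differing at roughly the 95% level, which is why the caption describes the notch as an approximate confidence interval.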
Figure 2. Nucleotide and amino acid substitution patterns within viral hypervariable loci.
Heat maps portraying the counts of every possible nucleotide substitution for each SNP found within (A) HPV and (B) Staphylococcus phage hypervariable loci. Tile color weight corresponds to the relative abundance of SNP substitution counts. The diagonal line highlights the panels associated with no substitution. The substitution patterns of amino acids at each SNP are also shown with exponential transformation (C, D). An illustration of the major amino acid substitutions is provided beneath the legends as a reference. Amino acid charge (E, F) and polarity with acidity (G, H) are shown with log10 transformation. The absence of a basic or acidic polar identifier indicates the amino acid is polar but neutral. The HPV substitution profiles are found in the left column and the Staphylococcus phage profiles are found on the right. Chi-square significance p-value, comparing variation profiles between the viruses in each row (i.e., A and B), is shown in the upper right corner of the associated Staphylococcus phage variation profile. The most frequently substituted amino acid pairs are highlighted with a box around the amino acid letters.
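As a rough illustration of the count table behind a nucleotide substitution heat map, the following Python sketch tallies (reference, alternate) base pairs into a 4 × 4 matrix. The input format and the name `substitution_matrix` are assumptions made for this example, not the study's workflow.

```python
from collections import Counter

BASES = "ACGT"


def substitution_matrix(snps):
    """Count substitutions from an iterable of (reference, alternate) bases.

    Returns a 4 x 4 list of lists indexed [reference][alternate] in the
    order A, C, G, T. Diagonal cells correspond to "no substitution" and
    stay at zero unless the input contains identical pairs.
    """
    counts = Counter((ref.upper(), alt.upper()) for ref, alt in snps)
    return [[counts[(ref, alt)] for alt in BASES] for ref in BASES]


# Hypothetical SNP calls: two A>G transitions and one C>T transition.
example = substitution_matrix([("A", "G"), ("a", "g"), ("C", "T")])
```

Dividing each cell by the total count would give the relative abundance that the tile color encodes.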
Figure 3. SVM predicted impact of hypervariable loci on phenotype.
Notched boxplot of deleterious scores in human papillomavirus (red) and Staphylococcus phage (blue) genomes. A low deleterious score indicates a predicted neutral phenotypic effect, while a high score indicates a predicted strong phenotypic effect. Asterisk indicates significant difference by Wilcoxon rank-sum test (p < 1e-15). Boxplot parameters as described in Fig. 1.
Figure 4. The diversity generating retroelement as a mechanism for targeted nucleotide variation.
Alignment illustrating a putative diversity generating retroelement in Staphylococcus phage. (A) Sashimi plot of sequence coverage across the contig. Coverage ranges from 0 to 67×. Below the coverage is a map of the relevant genes predicted within the contig. Sequence alignment of the diversity generating retroelement template region (B) and variable region (C). Linkage disequilibrium heatmap for the template (D) and variable (E) regions. Panels compare variable nucleotides to each other and darker tiles indicate decreased linkage disequilibrium correlation, according to squared allelic correlation (R²) between pairs of SNPs.
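The squared allelic correlation between two SNPs can be sketched as follows, assuming biallelic sites coded 0 (reference) or 1 (alternate) across the same set of haplotypes or reads. This uses the standard textbook definition r² = D² / (pA(1 − pA) pB(1 − pB)); the function name `r_squared` and the input encoding are assumptions, and the tool used in the study is not specified in this text.

```python
def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic SNPs.

    site_a and site_b are equal-length sequences of 0/1 allele codes,
    one entry per haplotype. Returns NaN when either site is monomorphic,
    because the correlation is undefined in that case.
    """
    if len(site_a) != len(site_b) or not site_a:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n                                   # alternate allele frequency at site A
    p_b = sum(site_b) / n                                   # alternate allele frequency at site B
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n   # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")
    d = p_ab - p_a * p_b                                    # linkage disequilibrium coefficient D
    return d * d / denom
```

Computing this value for every pair of variable positions yields the symmetric matrix that a linkage disequilibrium heatmap displays.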
Figure 5. Validation of study findings using secondary dataset.
Results from the Oh et al. dataset, which was analyzed using the same workflow as the primary dataset. (A) Scatter plot depicting the candidate contigs considered for analysis in this study. Each point is a contig that mapped to a reference virus genome. The x-axis shows the length (in nucleotides) of the contig subsection that mapped to the reference genome. The y-axis shows the overall coverage of the contig as a quantification of sequences aligning to the contig. The color highlights the reference virus genome that the contig was most similar to, and the size depicts the blast bit score associated with the contig-reference match. (B) Box plots depicting the evolutionary pressure of Staphylococcus bacteriophages at the hypervariable loci (blue) and the regions immediately adjacent to the hypervariable loci (red). (C) Heat map portraying the counts of every possible nucleotide substitution for each SNP found within 21 Staphylococcus phage hypervariable loci. Tile color weight corresponds to the relative abundance of SNP substitution counts. The diagonal line highlights the panels associated with no substitution. The substitution patterns of amino acids at each (D) SNP, (E) amino acid charge, and (F) polarity with acidity are also shown. (G) Notched boxplot illustrating the percent of primary dataset SNPs whose nucleotide positions were identical to those from the secondary validation sample set (left) compared to a simulated dataset of randomly assigned SNP locations (right). The inset shows an example contig identified in both datasets with 81% identical SNP positions. SNPs are represented as yellow lines, with the inner circle representing the validation dataset, and the middle circle representing the primary dataset. The outermost ring illustrates the contig, colored by nucleotides (A = red, C = blue, G = yellow, T = green). Boxplot parameters as described in Fig. 1.
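The comparison in panel G, the percent of primary dataset SNP positions also found in the validation set versus a random baseline, can be sketched in Python as below. The function names, the uniform sampling of random positions, and the example numbers are assumptions for illustration; the study's actual simulation procedure is not described in this text.

```python
import random


def percent_shared(primary, validation):
    """Percent of primary SNP positions that also occur in validation."""
    primary = set(primary)
    if not primary:
        raise ValueError("primary SNP set is empty")
    return 100.0 * len(primary & set(validation)) / len(primary)


def simulated_percent_shared(primary, contig_length, n_validation, rng):
    """Baseline: replace validation SNPs with uniformly random positions.

    Draws n_validation distinct positions from 0..contig_length-1 and
    reports how many primary positions they hit by chance.
    """
    random_positions = rng.sample(range(contig_length), n_validation)
    return percent_shared(primary, random_positions)


# Hypothetical positions on a 1,000 nt contig.
observed = percent_shared([10, 20, 30, 40], [20, 40, 99])
baseline = simulated_percent_shared([10, 20, 30, 40], 1000, 3, random.Random(0))
```

Repeating the simulation many times gives a distribution of chance overlap against which the observed percentage can be compared.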
