Nat Commun. 2020 Feb 26;11(1):940.
doi: 10.1038/s41467-020-14677-3.

A comprehensive non-redundant gene catalog reveals extensive within-community intraspecies diversity in the human vagina

Bing Ma et al. Nat Commun. 2020.

Abstract

Analysis of metagenomic and metatranscriptomic data is complicated and typically requires extensive computational resources. Leveraging a curated reference database of genes encoded by members of the target microbiome can make these analyses more tractable. In this study, we assemble a comprehensive human vaginal non-redundant gene catalog (VIRGO) that includes 0.95 million non-redundant genes. The gene catalog is functionally and taxonomically annotated. We also construct vaginal orthologous groups (VOG) from VIRGO. The gene-centric design of VIRGO and VOG provides an easily accessible tool to comprehensively characterize the structure and function of vaginal metagenome and metatranscriptome datasets. To highlight the utility of VIRGO, we analyze 1,507 additional vaginal metagenomes and identify a high degree of intraspecies diversity within and across vaginal microbiota. VIRGO offers a convenient reference database and toolkit that will facilitate a more in-depth understanding of the role of vaginal microorganisms in women's health and reproductive outcomes.

Conflict of interest statement

J.R. is co-founder of LUCA Biologics, a biotechnology company focusing on translating microbiome research into live biotherapeutic drugs for women’s health. All other authors declare that they have no competing interests.

Figures

Fig. 1. Percent of vaginal metagenome reads that can be mapped to contigs from the following reference data sets.
Complete VIRGO database, 211 in-house sequenced vaginal metagenomes, 53 HMP DACC vaginal metagenomes, all HMP urogenital reference genomes, 277 genomes of bacteria isolated from vagina, reproductive or urinary system deposited in GenBank, and 139 genomes of urogenital bacteria from HMP DACC database. Values plotted are averages; error bars represent the standard error of the mean.
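The average and standard error of the mean plotted in Fig. 1 can be computed as in the minimal sketch below. The function name and the toy percentages are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the caption does not say which estimator was used.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical percentages of mapped reads for four metagenomes.
mapped = [90.0, 94.0, 96.0, 100.0]
mean, sem = mean_and_sem(mapped)
print(mean)           # 95.0
print(round(sem, 3))  # 2.082
```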
Fig. 2. Pipeline for data processing and integration for the construction of the human vaginal integrated non-redundant gene catalog (VIRGO) and vaginal orthologous groups (VOG) for protein families.
A total of 264 vaginal metagenomes, including 211 sequenced in-house, and 416 genomes of urogenital isolates were processed. The procedure includes preprocessing to remove human contaminants, quality assessment, metagenome assembly, gene calling, functional and taxonomic annotation, gene clustering based on nucleotide sequence similarity to form VIRGO, and Jaccard index-based clustering of amino acid sequences to form VOG. A more detailed illustration is given in Supplementary Fig. 5, and a description is given in the Materials and Methods section.
Fig. 3. Taxonomic and functional composition of the vaginal microbiome in VIRGO.
a Top 20 species with the most abundant gene content in VIRGO. Values are the base-2 logarithm of the ratio of a species' gene content to that of the entire community. Plotted are interquartile ranges (IQRs, boxes), medians (line in box), and mean (red diamond). b Species-specific metagenome accumulation curves for the number of non-redundant genes. c Functional distribution of non-redundant genes in VIRGO. Functional categories were defined using EggNOG (v4.5). d Prevalence of BVAB1 in metagenomes, using a minimum gene-count threshold of 50% of the estimated BVAB1 genome size. A gene was considered present if ≥3 reads mapped to it. e Relationship between the depth of sequencing and the number of bacterial non-redundant genes identified using VIRGO. Each point is a separate metagenome and is color-coded according to community state type.
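The presence rule in panel d (a gene counts as present with ≥3 mapped reads, and a species is counted when at least 50% of its estimated genome is detected) can be sketched as follows. This is an illustration only, not VIRGO's own code: the function names and toy read counts are hypothetical, and expressing the estimated genome size as a number of genes is an assumption.

```python
def present_genes(read_counts, min_reads=3):
    """Return the set of genes with at least `min_reads` mapped reads."""
    return {gene for gene, n in read_counts.items() if n >= min_reads}


def species_detected(read_counts, species_genes, estimated_gene_count,
                     min_fraction=0.5):
    """Call a species present when enough of its genes are detected.

    `estimated_gene_count` stands in for the estimated genome size,
    expressed as a number of genes (an assumption of this sketch).
    """
    found = present_genes(read_counts) & set(species_genes)
    return len(found) >= min_fraction * estimated_gene_count


# Hypothetical toy data: gene identifier -> number of mapped reads.
counts = {"g1": 5, "g2": 2, "g3": 3, "g4": 0}
genes = ["g1", "g2", "g3", "g4"]
print(sorted(present_genes(counts)))                            # ['g1', 'g3']
print(species_detected(counts, genes, estimated_gene_count=4))  # True
```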
Fig. 4
a Boxplot of the number of non-redundant genes in samples of different community state types (CSTs). Boxes represent the interquartile ranges and lines represent the median values. CSTs were defined as previously described, according to the composition and structure of the microbial community. The table below the boxplot gives the percentage of samples in each CST, stratified by high gene count (HGC, >10,000 non-redundant genes) or low gene count (LGC, <10,000 non-redundant genes). b Log2-transformed ratio of the genes of a species in one gene count category over the other across the 264 vaginal metagenomes. Only species that are more than four times more abundant in one category (either HGC or LGC), and that have at least 0.1% abundance and at least 100 genes in either the HGC or LGC group, are shown. Plotted are interquartile ranges (IQRs, boxes), medians (line in box), and mean (red diamond).
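The gene count stratification and the log2 ratio used in Fig. 4 can be sketched as below. The function names and example numbers are hypothetical. The caption does not say how a sample with exactly 10,000 genes is labeled, so assigning it to LGC is an assumption of this sketch.

```python
import math

HGC_THRESHOLD = 10_000  # non-redundant genes, taken from the caption


def gene_count_category(n_genes, threshold=HGC_THRESHOLD):
    """Label a sample as high (HGC) or low (LGC) gene count.

    A count of exactly `threshold` is assigned to LGC (an assumption).
    """
    return "HGC" if n_genes > threshold else "LGC"


def log2_ratio(hgc_value, lgc_value):
    """log2 of an HGC value over an LGC value; positive favors HGC."""
    return math.log2(hgc_value / lgc_value)


print(gene_count_category(25_000))  # HGC
print(gene_count_category(4_000))   # LGC
print(log2_ratio(800, 100))         # 3.0
```

Under this definition, "more than four times more abundant" corresponds to a log2 ratio above 2 or below -2.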
Fig. 5. Demonstration of using VIRGO and VOG to study the vaginal microbiome.
a Four sampling points were selected from a longitudinally profiled subject, prior to (T1), during (T2 and T3), and after (T4) an episode of bacterial vaginosis, using 16S rRNA profiling. b Functional profiling of the metagenome (MG) and metatranscriptome (MT) of each of the four sampling points. Functional categories were annotated using EggNOG (v4.5). c Functional profiles stratified by species using the taxonomic profiling provided by VIRGO. d Demonstrative use of VOG to characterize the G. vaginalis cholesterol-dependent cytolysin (CDC) protein family. Shown are the phylogeny of CDC-containing proteins and an alignment of domain 4 of the CDCs, which is generally well conserved but contains a single divergent site, highlighted in yellow.
Fig. 6. Intraspecies diversity revealed using VIRGO for seven vaginal species: L. crispatus, L. iners, L. jensenii, L. gasseri, G. vaginalis, A. vaginae, and P. timonensis.
a Summary of the number (N) of isolate genomes and metagenome (MG) samples with more than 80% of their average genome’s number of coding genes for a species, based on a dataset of 1,507 in-house vaginal metagenomes characterized using VIRGO. b Boxplot of the number of non-redundant genes in isolate genomes versus vaginal metagenomes. Boxes represent the interquartile ranges, the notch represents the 95% confidence interval, and the line represents the median value. c Heatmap of presence/absence of L. crispatus non-redundant gene profiles for 56 available isolate genomes (gray) and 413 VIRGO-characterized metagenomes that contained either high (>50% relative abundance, red: L. crispatus, green: L. gasseri) or low (<50% relative abundance, cyan) relative abundance of the species. Hierarchical clustering of the profiles was performed using Ward linkage based on their Jaccard similarity coefficient. *number of isolate genomes and metagenome samples. MG: metagenomes. *p < 0.05, ***p < 0.001, Student’s t-test, after correction for multiple comparisons.
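The Jaccard similarity coefficient that underlies the clustering in panel c can be illustrated on presence/absence profiles as follows. This is a minimal sketch with hypothetical sample names and gene sets, not the authors' pipeline. It computes pairwise distances only and does not perform the Ward-linkage clustering itself.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here for two empty profiles
    return len(a & b) / len(union)


def distance_matrix(profiles):
    """Pairwise Jaccard distances (1 - similarity) as a nested list."""
    names = sorted(profiles)
    matrix = [
        [1.0 - jaccard_similarity(profiles[x], profiles[y]) for y in names]
        for x in names
    ]
    return names, matrix


# Hypothetical presence profiles: sample -> set of detected genes.
profiles = {
    "isolate_A": {"g1", "g2", "g3"},
    "mg_1": {"g1", "g2", "g4"},
    "mg_2": {"g5"},
}
names, dist = distance_matrix(profiles)
print(names)       # ['isolate_A', 'mg_1', 'mg_2']
print(dist[0][1])  # 0.5
print(dist[0][2])  # 1.0
```

The resulting matrix could then be handed to a hierarchical clustering routine; that step is not shown here.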
