mSystems. 2020 Feb 25;5(1):e00096-20.
doi: 10.1128/mSystems.00096-20.

Improving Characterization of Understudied Human Microbiomes Using Targeted Phylogenetics

Bruce A Rosa et al. mSystems. 2020.

Abstract

Whole-genome bacterial sequences are required to better understand microbial functions, niche-specific bacterial metabolism, and disease states. Although genomic sequences are available for many of the human-associated bacteria from commonly tested body habitats (e.g., feces), as few as 13% of bacterium-derived reads from other sites such as the skin map to known bacterial genomes. To facilitate a better characterization of metagenomic shotgun reads from underrepresented body sites, we collected over 10,000 bacterial isolates originating from 14 human body habitats, identified novel taxonomic groups based on full-length 16S rRNA gene sequences, clustered the sequences to ensure that no individual taxonomic group was overselected for sequencing, prioritized bacteria from underrepresented body sites (such as skin and respiratory and urinary tracts), and sequenced and assembled genomes for 665 new bacterial strains. Here, we show that addition of these genomes improved read mapping rates of Human Microbiome Project (HMP) metagenomic samples by nearly 30% for the previously underrepresented phylum Fusobacteria, and 27.5% of the novel genomes generated here had high representation in at least one of the tested HMP samples, compared to 12.5% of the sequences in the public databases, indicating an enrichment of useful novel genomic sequences resulting from the prioritization procedure. As our understanding of the human microbiome continues to improve and to enter the realm of therapy developments, targeted approaches such as this to improve genomic databases will increase in importance from both an academic and a clinical perspective.

IMPORTANCE The human microbiome plays a critically important role in health and disease, but current understanding of the mechanisms underlying the interactions between the varying microbiome and the different host environments is lacking. Having access to a database of fully sequenced bacterial genomes provides invaluable insights into microbial functions, but currently sequenced genomes for the human microbiome have largely come from a limited number of body sites (primarily feces), while other sites such as the skin, respiratory tract, and urinary tract are underrepresented, resulting in as little as 13% of bacterium-derived reads mapping to known bacterial genomes. Here, we sequenced and assembled 665 new bacterial genomes, prioritized from a larger database to select underrepresented body sites and bacterial taxa in the existing databases. As a result, we substantially improve mapping rates for samples from the Human Microbiome Project and provide an important contribution to human bacterial genomic databases for future studies.
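
The prioritization idea described in the abstract (cluster isolates by 16S sequence, cap how many are chosen per cluster, and favor underrepresented body sites) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `prioritize`, the `SITE_PRIORITY` weights, the record fields, and the per-cluster cap are all hypothetical, and the cluster assignments are assumed to have been computed beforehand.

```python
from collections import defaultdict

# Hypothetical weights: a higher value marks a body site as more
# underrepresented in existing genome databases (illustrative only).
SITE_PRIORITY = {
    "skin": 3,
    "respiratory tract": 3,
    "urinary tract": 3,
    "feces": 1,
}
DEFAULT_PRIORITY = 2  # assumed weight for sites not listed above


def prioritize(isolates, max_per_cluster=1):
    """Select at most `max_per_cluster` isolates from each 16S cluster,
    preferring isolates from higher-priority body sites."""
    by_cluster = defaultdict(list)
    for iso in isolates:
        by_cluster[iso["cluster"]].append(iso)

    selected = []
    for cluster_id in sorted(by_cluster):
        # Rank by descending site priority, then by id for a stable order.
        ranked = sorted(
            by_cluster[cluster_id],
            key=lambda iso: (-SITE_PRIORITY.get(iso["site"], DEFAULT_PRIORITY), iso["id"]),
        )
        selected.extend(ranked[:max_per_cluster])
    return selected


# Toy input: cluster labels stand in for precomputed 16S clusters.
isolates = [
    {"id": "iso1", "cluster": "c1", "site": "feces"},
    {"id": "iso2", "cluster": "c1", "site": "skin"},
    {"id": "iso3", "cluster": "c2", "site": "feces"},
    {"id": "iso4", "cluster": "c2", "site": "feces"},
]
picked = [iso["id"] for iso in prioritize(isolates)]
print(picked)
```

Capping the selection per cluster is what keeps any single taxonomic group from being overselected, while the site weights bias each cluster's pick toward the underrepresented habitats.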

Keywords: HMP; genome; human microbiome; microbiome; resource.

Figures

FIG 1
Flow chart diagram of overall sample production and prioritization procedure.
FIG 2
Composition of isolates before (left) and after (right) prioritization. (A) Isolate categorization by body habitat. (B) Isolate categorization by phylogeny at a phylum level.
FIG 3
Grouping of the sequenced WUSC bacterial genomes based on body site, phylum, and class. (A) WUSC strains clustered based off 16S read sequence similarity. Phylum is indicated by dendrogram branch color, while body habitat is indicated by the colored bars around the periphery of the image. (B) WUSC strains classified by body site and phylogeny at a phylum (left) or class (right) level. Counts indicate the number of strains in each category; color indicates the level of representation in each category, with green representing high counts and red representing low or none.
FIG 4
Increased characterization of metagenomic shotgun sequences by addition of genomes of novel phylogenetically distinct strains. (A) Relative improvement of read mapping rates from HMP samples to the improved genome database. (B) Number of total mapped reads per sample after novel reference genome sequence inclusion versus the absolute increase in mapped reads after versus before novel sequence inclusion. Samples are colored by body site. The dotted line represents the average relationship between total number of mapped reads and absolute increase in mapped reads after novel reference genome sequence inclusion. (C) The proportion of total mapped reads after novel sequence inclusion attributable to a given phylum versus the relative increase in reads mapped for that phylum after novel sequence inclusion. Phyla are represented as differently colored dots, with names given on the chart.
FIG 5
Novel genome assemblies (black) show increased representation in HMP samples, relative to sample size (>50% breadth, >1× depth, upper right quadrant of graph). Novel genomes are shown in black; public database genomes are shown in dark gray. Numbers are not normalized for database size.
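
The FIG 5 thresholds (>50% breadth, >1× depth) can be illustrated with a short sketch. It assumes the common definitions, which the captions here do not spell out: breadth is the fraction of genome positions covered by at least one read, and depth is the mean per-base coverage. The function names and the per-base depth list are hypothetical inputs, not part of the published method.

```python
def coverage_stats(per_base_depth):
    """Return (breadth, mean depth) for a list of per-position read depths.

    Assumed definitions: breadth = fraction of positions with depth > 0;
    depth = mean depth over all positions.
    """
    n = len(per_base_depth)
    if n == 0:
        raise ValueError("per_base_depth must not be empty")
    breadth = sum(1 for d in per_base_depth if d > 0) / n
    depth = sum(per_base_depth) / n
    return breadth, depth


def is_highly_represented(per_base_depth, min_breadth=0.5, min_depth=1.0):
    """Apply the >50% breadth and >1x depth thresholds named in FIG 5."""
    breadth, depth = coverage_stats(per_base_depth)
    return breadth > min_breadth and depth > min_depth


# Toy genomes of four positions each.
well_covered = [2, 2, 2, 0]    # breadth 0.75, depth 1.5
spiky = [4, 0, 0, 0]           # breadth 0.25, depth 1.0
print(is_highly_represented(well_covered), is_highly_represented(spiky))
```

Requiring both thresholds guards against a genome appearing well represented only because a small conserved region attracts many reads.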
