Science. 2010 May 21;328(5981):994-9.
doi: 10.1126/science.1183605.

A catalog of reference genomes from the human microbiome

Human Microbiome Jumpstart Reference Strains Consortium; Karen E Nelson, George M Weinstock, Sarah K Highlander, Kim C Worley, Heather Huot Creasy, Jennifer Russo Wortman, Douglas B Rusch, Makedonka Mitreva, Erica Sodergren, Asif T Chinwalla, Michael Feldgarden, Dirk Gevers, Brian J Haas, Ramana Madupu, Doyle V Ward, Bruce W Birren, Richard A Gibbs, Barbara Methe, Joseph F Petrosino, Robert L Strausberg, Granger G Sutton, Owen R White, Richard K Wilson, Scott Durkin, Michelle Gwinn Giglio, Sharvari Gujja, Clint Howarth, Chinnappa D Kodira, Nikos Kyrpides, Teena Mehta, Donna M Muzny, Matthew Pearson, Kymberlie Pepin, Amrita Pati, Xiang Qin, Chandri Yandava, Qiandong Zeng, Lan Zhang, Aaron M Berlin, Lei Chen, Theresa A Hepburn, Justin Johnson, Jamison McCorrison, Jason Miller, Pat Minx, Chad Nusbaum, Carsten Russ, Sean M Sykes, Chad M Tomlinson, Sarah Young, Wesley C Warren, Jonathan Badger, Jonathan Crabtree, Victor M Markowitz, Joshua Orvis, Andrew Cree, Steve Ferriera, Lucinda L Fulton, Robert S Fulton, Marcus Gillis, Lisa D Hemphill, Vandita Joshi, Christie Kovar, Manolito Torralba, Kris A Wetterstrand, Amr Abouellleil, Aye M Wollam, Christian J Buhay, Yan Ding, Shannon Dugan, Michael G FitzGerald, Mike Holder, Jessica Hostetler, Sandra W Clifton, Emma Allen-Vercoe, Ashlee M Earl, Candace N Farmer, Konstantinos Liolios, Michael G Surette, Qiang Xu, Craig Pohl, Katarzyna Wilczek-Boney, Dianhui Zhu

Human Microbiome Jumpstart Reference Strains Consortium et al. Science. 2010.

Abstract

The human microbiome refers to the community of microorganisms, including prokaryotes, viruses, and microbial eukaryotes, that populate the human body. The National Institutes of Health launched an initiative that focuses on describing the diversity of microbial species that are associated with health and disease. The first phase of this initiative includes the sequencing of hundreds of microbial reference genomes, coupled to metagenomic sequencing from multiple body sites. Here we present results from an initial reference genome sequencing of 178 microbial genomes. From 547,968 predicted polypeptides that correspond to the gene complement of these strains, previously unidentified ("novel") polypeptides that had both unmasked sequence length greater than 100 amino acids and no BLASTP match to any nonreference entry in the nonredundant subset were defined. This analysis resulted in a set of 30,867 polypeptides, of which 29,987 (approximately 97%) were unique. In addition, this set of microbial genomes allows for approximately 40% of random sequences from the microbiome of the gastrointestinal tract to be associated with organisms based on the match criteria used. Insights into pan-genome analysis suggest that we are still far from saturating microbial species genetic data sets. In addition, the associated metrics and standards used by our group for quality assurance are presented.
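The novelty criteria stated in the abstract (unmasked sequence length greater than 100 amino acids, and no BLASTP match to any nonreference entry) amount to a two-condition filter. The following is a minimal sketch of that filter together with the percentage arithmetic, not the consortium's actual pipeline. The `Polypeptide` record and its field names are hypothetical, and the masking and BLASTP steps are assumed to have been run beforehand, since the abstract does not specify their parameters.

```python
from dataclasses import dataclass


@dataclass
class Polypeptide:
    # Hypothetical record; field names are illustrative, not from the paper.
    identifier: str
    unmasked_length: int        # residues remaining after masking (assumed precomputed)
    has_nonreference_hit: bool  # True if BLASTP matched a nonreference entry (assumed precomputed)


# Threshold stated in the abstract: strictly greater than 100 amino acids.
MIN_UNMASKED_LENGTH = 100


def is_novel(p: Polypeptide) -> bool:
    """Apply both criteria from the abstract to one predicted polypeptide."""
    return p.unmasked_length > MIN_UNMASKED_LENGTH and not p.has_nonreference_hit


def select_novel(polypeptides):
    """Keep only the polypeptides that satisfy both criteria."""
    return [p for p in polypeptides if is_novel(p)]


def percent(part: int, whole: int) -> float:
    return 100.0 * part / whole


# Counts quoted in the abstract.
PREDICTED_TOTAL = 547_968
NOVEL_TOTAL = 30_867
NOVEL_UNIQUE = 29_987

unique_share = percent(NOVEL_UNIQUE, NOVEL_TOTAL)    # the "approximately 97%" in the abstract
novel_share = percent(NOVEL_TOTAL, PREDICTED_TOTAL)  # novel set as a share of all predictions

# Toy records (made up) to exercise the filter.
toy = [
    Polypeptide("p1", 250, False),  # long, no hit: novel
    Polypeptide("p2", 100, False),  # exactly 100: not strictly greater, rejected
    Polypeptide("p3", 300, True),   # has a nonreference hit, rejected
]
novel_ids = [p.identifier for p in select_novel(toy)]
```

With the counts from the abstract, the unique fraction comes to about 97%, and the novel set is a little under 6% of all predicted polypeptides.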


Figures

Figure 1. Phylogenetic tree of 16S rDNA sequences
The tree was created using ~1500 16S rDNA sequences, each representing a single species. Organisms sequenced as part of the HMP are highlighted in blue. Additional coloring indicates separation by phylum: yellow, Actinobacteria; dark green, Bacteroidetes; light green, Cyanobacteria; red, Firmicutes; cyan, Fusobacteria; dark red, Planctomycetes; grey, Proteobacteria; magenta, Spirochaetes; light pink, TM7; tan, Tenericutes. The purpose of this analysis is to show not the details of the branching structure (which include minor known artifacts), but the overall distribution of the HMP strains (in blue) around the tree of life.
Figure 2. Contig N50 comparison for twenty-six draft and improved genomes
Contig N50 values (in bases) for high-quality draft assemblies are shown in magenta, and those for improved high-quality draft sequences are shown in green. These data represent the variety of approaches from the four data generation centers. The majority of shotgun data were produced on the Roche-454 platform, though some assemblies include paired Sanger reads to improve contiguity. All draft assemblies are based on the Roche-Newbler assembler, though some of the improved assemblies are based on PCAP (11) and the Celera assembler due to existing integration with finishing and improvement pipelines. Additional variation comes from the improvement approach. Directed Sanger reads from gap-spanning PCR amplicons serve as the primary approach, while some assemblies have been subjected only to manipulation of the shotgun data: making unrealized joins, removing poor-quality data, and placing unincorporated shotgun reads.
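For readers unfamiliar with the metric compared in Figure 2, contig N50 is conventionally the length L such that contigs of length L or longer together contain at least half of all assembled bases. The sketch below computes it under that standard definition. The contig lengths are made-up toy values, not data from the paper, and the function name `contig_n50` is illustrative.

```python
def contig_n50(contig_lengths):
    """Return the contig N50 for a list of contig lengths (in bases).

    Walk the contigs from longest to shortest and return the length of the
    contig at which the running total first reaches half of all bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy draft assembly: five contigs, 300 bases in total (made-up values).
draft = [100, 80, 60, 40, 20]

# Toy "improved" assembly: the 60- and 40-base contigs have been joined,
# as an improvement step that closes a gap might do.
improved = [100, 100, 80, 20]

draft_n50 = contig_n50(draft)
improved_n50 = contig_n50(improved)
```

In this toy case, joining two contigs raises N50 from 80 to 100 even though the total number of assembled bases is unchanged, which is the kind of contiguity gain the figure compares.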
Figure 3. Inter-strain diversity among Lactobacillus genomes
Each point represents a whole-genome comparison between two Lactobacillus genomes and shows the percentage average nucleotide identity (ANI) on the x-axis as a measure of evolutionary distance, plotted against the percentage of gene content similarity on the y-axis. Only comparisons with ANI values above 85% are shown. The vertical line at 95% corresponds to a recommended cut-off of 70% DNA–DNA reassociation for species delineation. Intra- and inter-species comparisons are color-coded, shown as full or open circles respectively, and labeled with the corresponding taxonomic name in the matching color. Colored ovals assist in identifying related data points belonging to a single named species.
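The two thresholds in the Figure 3 caption (only comparisons above 85% ANI are plotted; 95% ANI marks the species cut-off) can be sketched as a simple filter and classification. This is an illustration only: the `Comparison` record, the placeholder species names, and all numeric values are made up, and treating exactly 95% as within-species is an assumption, since the caption does not say how ties are handled.

```python
from typing import NamedTuple


class Comparison(NamedTuple):
    # Hypothetical record for one pairwise whole-genome comparison.
    species_a: str
    species_b: str
    ani_percent: float           # x-axis in Figure 3
    gene_content_percent: float  # y-axis in Figure 3


PLOT_MIN_ANI = 85.0        # caption: only comparisons above 85% ANI are shown
SPECIES_ANI_CUTOFF = 95.0  # caption: vertical line used for species delineation


def plotted(comparisons):
    """Comparisons that would appear in the figure."""
    return [c for c in comparisons if c.ani_percent > PLOT_MIN_ANI]


def same_species_by_name(c: Comparison) -> bool:
    return c.species_a == c.species_b


def same_species_by_ani(c: Comparison) -> bool:
    # Assumption: a value exactly at the cut-off counts as the same species.
    return c.ani_percent >= SPECIES_ANI_CUTOFF


def discordant(comparisons):
    """Plotted comparisons where the named taxonomy and the ANI cut-off disagree."""
    return [
        c for c in plotted(comparisons)
        if same_species_by_name(c) != same_species_by_ani(c)
    ]


# Made-up comparisons with placeholder species names.
toy = [
    Comparison("species A", "species A", 98.2, 91.0),  # same name, above cut-off
    Comparison("species A", "species B", 88.5, 70.0),  # different names, below cut-off
    Comparison("species C", "species D", 96.1, 85.0),  # different names, above cut-off
    Comparison("species A", "species E", 80.0, 55.0),  # below 85%, not plotted
]

shown = plotted(toy)
flagged = discordant(toy)
```

Here the only flagged comparison is the species C versus species D pair, whose ANI exceeds the cut-off despite the two genomes carrying different names.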

References

    1. The NIH Common Fund Human Microbiome Project. Division of Program Coordination, Planning and Strategic Initiatives, National Institutes of Health, U.S. Department of Health and Human Services; (http://nihroadmap.nih.gov/hmp/)
    2. HMP Project Catalog. Human Microbiome Project Data Analysis Coordinating Center; (http://www.hmpdacc.org/project_catalog.html)
    3. 16S rDNA for cultured bacteria. (http://bioinfo.unice.fr/blast/documentation/alphabetical_list.html)
    4. Reference Genomes of the Human Microbiome Project. Human Microbiome Project Data Analysis Coordinating Center; (http://hmpdacc.org/reference_genomes.php)
    5. Mobley HL, Island MD, Hausinger RP. Microbiol Rev. 1995 Sep;59:451.
