Nat Biotechnol. 2019 Aug;37(8):953-961.
doi: 10.1038/s41587-019-0202-3. Epub 2019 Aug 2.

Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery

Robert D Stewart et al. Nat Biotechnol. 2019 Aug.

Abstract

Ruminants provide essential nutrition for billions of people worldwide. The rumen is a specialized stomach that is adapted to the breakdown of plant-derived complex polysaccharides. The genomes of the rumen microbiota encode thousands of enzymes adapted to digestion of the plant matter that dominates the ruminant diet. We assembled 4,941 rumen microbial metagenome-assembled genomes (MAGs) using approximately 6.5 terabases of short- and long-read sequence data from 283 ruminant cattle. We present a genome-resolved metagenomics workflow that enabled assembly of bacterial and archaeal genomes that were at least 80% complete. Of note, we obtained three single-contig, whole-chromosome assemblies of rumen bacteria, two of which represent previously unknown rumen species, assembled from long-read data. Using our rumen genome collection we predicted and annotated a large set of rumen proteins. Our set of rumen MAGs increases the rate of mapping of rumen metagenomic sequencing reads from 15% to 50-70%. These genomic and protein resources will enable a better understanding of the structure and functions of the rumen microbiota.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Phylogenetic tree of 4,941 RUGs from the cattle rumen, additionally incorporating rumen genomes from the Hungate collection.
The tree was produced from concatenated protein sequences using PhyloPhlAn, and subsequently drawn using GraPhlAn. Labels show Hungate genome names, and were chosen to be informative but not overlap.
Fig. 2. A comparison of the RUG dataset with the Hungate collection and previously published data.
a,b, A comparison of the 4,941 RUGs with the Hungate collection (a) and our previously published data from Stewart et al. (b). The black line indicates the average percentage protein identity with the closest match (right-hand y axis), and blue dots indicate the Mash distance (k = 100,000), a measure of dissimilarity between two DNA sequences, between each RUG and its closest match in the comparison dataset. As expected, high protein identity corresponds to a low Mash distance, and vice versa. The RUGs are sorted independently by average protein identity for a and b. There is a clear inflection point in Fig. 2b, roughly halfway along the x axis, where the protein identity dips below 90% and the Mash distance rises, demonstrating the novelty represented by our new, larger dataset.
Fig. 3. A comparison of Illumina and nanopore metagenomic assembly statistics.
The colored histograms show the distribution of statistics for 282 Illumina assemblies, and the single nanopore assembly is highlighted. a, N50 values. b, Total length of the assembly. c, Length of the longest contig. The nanopore assembly N50 of 268 kb was over 56 times longer than that of the average Illumina assembly (4.7 kb). The Illumina assemblies were often longer in total (average of 600 Mb), but the nanopore assembly, at 178 Mb in length, was not the shortest of the assemblies we produced. The nanopore assembly also produced the longest contig, at 3.8 Mb, seven times longer than the average for the Illumina assemblies (479 kb) and 2.74 times longer than the longest single Illumina contig (1.38 Mb; one of 13 contigs from the 99.19% complete uncultured Bacteroidia bacterium RUG14538). In a direct comparison, the Illumina-only assembly of the same sample had an N50 of 12.2 kb, a total length of 247 Mb and a longest contig of 358 kb.
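For readers unfamiliar with the N50 statistic compared in panel a, the following is a minimal sketch of the usual definition: the length of the contig at which the cumulative length of the contigs, sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the paper's workflow.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Assumes the common definition: sort contigs from longest to
    shortest and report the length of the contig at which the running
    total first reaches at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length


# Toy example (hypothetical lengths, not data from the study).
toy_contigs = [100, 80, 50, 30, 20, 10]
toy_n50 = n50(toy_contigs)
```

Because N50 depends on both contig sizes and total assembly length, a shorter assembly made of a few long contigs (as with the nanopore assembly described above) can have a much higher N50 than a longer, more fragmented one.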
Fig. 4. Maximum percentage identity between CAZyme-predicted proteins from the RUGs and the CAZy database.
GH, glycoside hydrolase (n = 235,001); GT, glycosyl transferase (n = 120,494); PL, polysaccharide lyase (n = 6,834); CE, carbohydrate esterase (n = 55,523); AA, auxiliary activities; CBM, carbohydrate-binding module (n = 23,928); SLH, S-layer homology domain (n = 150); cohesin, cohesin domain (n = 80). Center lines indicate the median value; boxes show the interquartile range; and whiskers extend to the most extreme data point that is no more than 1.5 times the interquartile range from the box.
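The box-plot convention described in this caption can be sketched as follows. This is only an illustration of the stated rule (median, interquartile range, and whiskers reaching the most extreme data point within 1.5 times the interquartile range of the box). The function name `box_plot_summary`, the choice of the "inclusive" quartile method, and the toy values are assumptions; plotting software may compute quartiles slightly differently.

```python
import statistics


def box_plot_summary(values):
    """Summarise values using the box-and-whisker rule in the caption.

    Quartiles use Python's 'inclusive' method, which is an assumption;
    the caption does not say which quartile convention was used.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Whiskers stop at the most extreme observed points that lie no
    # more than 1.5 * IQR beyond the edges of the box.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    lower_whisker = min(v for v in data if v >= lower_fence)
    upper_whisker = max(v for v in data if v <= upper_fence)

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }


# Toy percentages (hypothetical, not values from Fig. 4).
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

In the toy data, the value 100 lies beyond the upper fence, so the upper whisker stops at 8 and 100 would be drawn as an outlier.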
Fig. 5. Taxonomic and functional distribution of proteins.
Top, total number of proteins for 12 phyla and the group of unknown bacteria. Middle, percentage of the proteome predicted to be CAZymes. Bottom, distribution of eight CAZyme classes as a proportion of the total number of predicted CAZymes.
