DRAM for distilling microbial metabolism to automate the curation of microbiome function

Michael Shaffer¹, Mikayla A Borton¹, Bridget B McGivern¹, Ahmed A Zayed², Sabina Leanti La Rosa³, Lindsey M Solden², Pengfei Liu¹, Adrienne B Narrowe¹, Josué Rodríguez-Ramos¹, Benjamin Bolduc², M Consuelo Gazitúa², Rebecca A Daly¹, Garrett J Smith⁴, Dean R Vik², Phil B Pope³, Matthew B Sullivan², Simon Roux⁵, Kelly C Wrighton¹

Affiliations

¹ Department of Soil and Crop Sciences, Colorado State University, Fort Collins, CO 80523, USA.
² Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA.
³ Faculty of Biosciences, Norwegian University of Life Sciences, Aas 1432, Norway.
⁴ Department of Microbiology, Radboud University, Nijmegen 6525, Netherlands.
⁵ Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.

PMID: 32766782
PMCID: PMC7498326
DOI: 10.1093/nar/gkaa621

Michael Shaffer et al. Nucleic Acids Res. 2020 Sep 18;48(16):8883-8900. doi: 10.1093/nar/gkaa621.

Abstract

Microbial and viral communities transform the chemistry of Earth's ecosystems, yet the specific reactions catalyzed by these biological engines are hard to decode due to the absence of scalable, metabolically resolved annotation software. Here, we present DRAM (Distilled and Refined Annotation of Metabolism), a framework to translate the deluge of microbiome-based genomic information into a catalog of microbial traits. To demonstrate the applicability of DRAM across metabolically diverse genomes, we evaluated DRAM performance on a defined, in silico soil community and previously published human gut metagenomes. We show that DRAM accurately assigned microbial contributions to geochemical cycles and automated the partitioning of gut microbial carbohydrate metabolism at substrate levels. DRAM-v, the viral mode of DRAM, established rules to identify virally encoded auxiliary metabolic genes (AMGs), resulting in the metabolic categorization of thousands of putative AMGs from soils and guts. Together, DRAM and DRAM-v provide critical metabolic profiling capabilities that decipher mechanisms underpinning microbiome function.

Figures

**Figure 1.**
Conceptual overview and workflow of the assembly-based software, DRAM (Distilled and Refined Annotation of Metabolism). DRAM (green, A) profiles microbial metabolism from genomic sequences, while DRAM-v profiles the Auxiliary Metabolic Genes (AMGs) (orange, B) in vMAGs. DRAM’s input data files are denoted by circles in gray, while analysis and output files are denoted by rectangles in green for MAGs or orange for AMGs. DRAM’s outputs (the *Raw*, *Distillate* and *Product*) provide three levels of annotation density and metabolic parsing. More details on the output files and specific operation can be found in the Supplementary Text or at https://github.com/shafferm/DRAM/wiki. User-defined taxonomy (e.g. GTDB-Tk (24)) and completion estimates (e.g. CheckM (51)) for MAGs and isolate genomes can be input into DRAM.

**Figure 2.**
DRAM provides multiple levels of metabolic and structural information. (A) Genome cartoon of *Dechloromonas aromatica* RCB demonstrates the usability of DRAM to understand the potential metabolism of a genome. Putative enzymes are colored by location of information in DRAM’s outputs: *Raw* (black), *Distillate* (gray) and *Product* (white). Gene numbers, identifiers, or abbreviations are colored according to metabolic categories outlined in (B) and detailed in Supplementary File 4. Genes with an asterisk had an unidentified localization by PSORTb (101). (B) Flow chart shows the metabolisms from DRAM’s *Distillate*. *Distillate* provides five major categories of metabolism: energy, transporters, miscellaneous (MISC), carbon utilization and organic nitrogen. Each major category contains subcategories, with outlines denoting location of information within *Distillate* and *Product*. (C) Heatmap shows presence (colored) and absence (white) of databases used in comparable annotators to DRAM. Annotators are colored consistently in A–E, with Prokka (30) in black, DFAST (31) in light gray, MetaErg (32) in dark gray and DRAM in red. Bar charts in (D–G) show database size (D), as well as the number of annotated (E), hypothetical (F), and unannotated (G) genes assigned by each annotator when analyzing the *in silico* soil community. See methods for definitions of annotated, hypothetical, and unannotated genes, relative to each annotator.

**Figure 3.**
DRAM *Product* summarizes and visualizes ecosystem-relevant metabolisms across input genomes. Heatmaps in (A–C) were automatically generated by DRAM from the *Product* shown in Supplementary File 3. Sections of the heatmap are ordered to highlight information available in *Product*, including pathway completion (A), subunit completion (B), and presence/absence (C) data. Boxes colored by presence/absence in (C) represent 1–2 genes necessary to carry out a particular process. Hovering over the heatmap cells in the *Product*’s HTML outputs interactively reports the calculated percent completion among other information. *Dechloromonas aromatica* RCB is represented by a genome cartoon in Figure 2A and is highlighted in blue on the heatmaps.

**Figure 4.**
Substrate-resolved survey of carbon metabolism in the human gut. Bar charts represent normalized gene abundance or proportion of reads that mapped to each gene or gene category reported as relative abundance (%) or Gene Per Million (GPM). Reads came from previously published (56) healthy human fecal metagenomes that were assembled and then annotated in DRAM (A–C). (A) Using a subset of 44 randomly selected metagenomes from (56), we profiled and annotated gene abundance patterns colored by DRAM’s *Distillate* categories and subcategories. (B) Using the same metagenomes and sample order as in (A), summary of CAZymes to broader substrate categories reveals differential abundance patterns across the cohort. (C) Data from (B) is graphed by carbohydrate substrates. Boxplots represent the median and one quartile deviation of CAZyme abundance, with each point representing a single person in the 44-member cohort. Putative substrates are ordered by class, then by mean abundance.

**Figure 5.**
DRAM provides a metabolic inventory of microbial traits important in the human gut. Seventy-six medium and high-quality MAGs were reconstructed from a single HMP fecal metagenome. Taxonomy was assigned using GTDB-Tk (24), with colored boxes noting class and name noting genus. The presence (green) or absence (blue) of genes capable of catalyzing carbohydrate degradation or contributing to short chain fatty acid metabolism are reported in the heatmap. We note that the directionality of some of these SCFA conversions is difficult to infer from gene sequence alone. Genomes are clustered by gene presence and hemicellulose substrates are shown in red text.

**Figure 6.**
DRAM-v profiles putative AMGs in viral sequences. Description of DRAM-v's rules for auxiliary score (A) and flag (B) assignments. Auxiliary scores shown in (A) are determined by the location of a putative AMG on the contig relative to other viral hallmark or viral-like genes (determined by VirSorter (61)), with all scores being reported in the *Distillate*. Scores highlighted in red are considered high (1, 2) or medium (3) confidence and thus the putative AMGs are also represented in the *Product*. Flags shown in (B) highlight important details about each putative AMG of which the user should be aware, all being reported in the *Raw*. Putative AMGs with a confidence score 1–3 and a metabolic flag (flag ‘M’; highlighted in red) are included in the *Distillate* and *Product*, unless flags in blue are reported. Flags in black do not decide the inclusion of a putative AMG. (C) Bar graph displaying putative AMGs recovered by DRAM-v from metagenomic files (soil metagenomes (14), left; 44 gut metagenomes from the HMP (56), right) and categorized by the *Distillate* metabolic category: Carbon Utilization, Energy, Organic Nitrogen, Transporters and MISC. Putative AMGs labeled as ‘multiple’ refer to genes that occur in multiple DRAM *Distillate* categories (e.g. transporters for organic nitrogen) and AMGs that are labeled as previously reported are in the viral AMG database compiled here. (D) Sequence similarity network (66) of all AMGs with an auxiliary score of 1–3 recovered from soil and human fecal metagenomes. Nodes are connected by an edge (line) if the pairwise amino acid sequence identity is >80% (see Materials and Methods). Only clusters of >5 members are shown. Nodes are colored by the *Distillate* category defined in (C), while node shape denotes soil or human fecal. Background highlighting denotes if the cluster contains both soil and human fecal nodes (shared), soil nodes only, or human fecal nodes only. Specific AMGs highlighted in the text are shown. (E) Stacked bar chart shows the number of singletons (AMGs that do not align by at least 80% to another recovered AMG) in each sample type, with bars colored by DRAM-v's *Distillate* category.
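
The inclusion rule and the network construction described in this caption can be illustrated with a minimal Python sketch. It is not DRAM-v's code: the `PutativeAMG` class, both function names, the example genes, and the excluding flag `"X"` are hypothetical, and the caption does not name the blue flags, so they are passed in as a parameter. The sketch assumes only what the caption states: an auxiliary score of 1–3 plus the 'M' flag is required, excluding flags override inclusion, and network edges join pairs with >80% amino acid identity.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PutativeAMG:
    """Hypothetical record for one putative AMG (not a DRAM-v data structure)."""
    gene_id: str
    auxiliary_score: int  # lower means higher confidence; 1-3 are high/medium
    flags: set = field(default_factory=set)


def passes_caption_rules(amg, excluding_flags):
    """Score 1-3 plus the metabolic flag 'M', with none of the excluding flags."""
    return (
        1 <= amg.auxiliary_score <= 3
        and "M" in amg.flags
        and not (amg.flags & excluding_flags)
    )


def cluster_by_identity(gene_ids, pairwise_identity, threshold=80.0):
    """Connected components of a graph whose edges have identity > threshold."""
    parent = {g: g for g in gene_ids}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), identity in pairwise_identity.items():
        if identity > threshold and a in parent and b in parent:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for g in gene_ids:
        clusters[find(g)].add(g)
    return list(clusters.values())


# Made-up example data; "X" stands in for an unnamed excluding (blue) flag.
EXCLUDING = {"X"}
amgs = [
    PutativeAMG("g1", 1, {"M"}),
    PutativeAMG("g2", 3, {"M"}),
    PutativeAMG("g3", 4, {"M"}),       # score too low in confidence
    PutativeAMG("g4", 2, {"M", "X"}),  # excluded by the stand-in flag
    PutativeAMG("g5", 2, set()),       # no metabolic flag
    PutativeAMG("g6", 1, {"M"}),
]
kept = [a.gene_id for a in amgs if passes_caption_rules(a, EXCLUDING)]

# Percent identities are invented for illustration.
identity = {("g1", "g2"): 85.0, ("g1", "g6"): 40.0}
clusters = cluster_by_identity(kept, identity)
singletons = [c for c in clusters if len(c) == 1]
```

In this sketch a singleton is a cluster of size one, which corresponds to the singletons counted in panel (E); restricting a plot to clusters of more than five members, as in panel (D), would be a further filter on `clusters`.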

References

1. Thompson L.R., Sanders J.G., McDonald D., Amir A., Ladau J., Locey K.J., Prill R.J., Tripathi A., Gibbons S.M., Ackermann G. et al. A communal catalogue reveals Earth's multiscale microbial diversity. Nature. 2017; 551:457–463.
2. Bolyen E., Rideout J.R., Dillon M.R., Bokulich N.A., Abnet C.C., Al-Ghalith G.A., Alexander H., Alm E.J., Arumugam M., Asnicar F. et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat. Biotechnol. 2019; 37:852–857.
3. Wrighton K.C., Thomas B.C., Sharon I., Miller C.S., Castelle C.J., VerBerkmoes N.C., Wilkins M.J., Hettich R.L., Lipton M.S., Williams K.H. et al. Fermentation, hydrogen, and sulfur metabolism in multiple uncultivated bacterial phyla. Science. 2012; 337:1661–1665.
4. Sharon I., Banfield J.F. Genomes from metagenomics. Science. 2013; 342:1057–1058.
5. Tyson G.W., Chapman J., Hugenholtz P., Allen E.E., Ram R.J., Richardson P.M., Solovyev V.V., Rubin E.M., Rokhsar D.S., Banfield J.F. Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature. 2004; 428:37–43.

Grants and funding

R01 AI143288/AI/NIAID NIH HHS/United States
