Nucleic Acids Res. 2020 Sep 18;48(16):8883-8900.
doi: 10.1093/nar/gkaa621.

DRAM for distilling microbial metabolism to automate the curation of microbiome function

Michael Shaffer et al. Nucleic Acids Res.

Abstract

Microbial and viral communities transform the chemistry of Earth's ecosystems, yet the specific reactions catalyzed by these biological engines are hard to decode due to the absence of scalable, metabolically resolved annotation software. Here, we present DRAM (Distilled and Refined Annotation of Metabolism), a framework to translate the deluge of microbiome-based genomic information into a catalog of microbial traits. To demonstrate the applicability of DRAM across metabolically diverse genomes, we evaluated DRAM performance on a defined, in silico soil community and previously published human gut metagenomes. We show that DRAM accurately assigned microbial contributions to geochemical cycles and automated the partitioning of gut microbial carbohydrate metabolism at substrate levels. DRAM-v, the viral mode of DRAM, established rules to identify virally-encoded auxiliary metabolic genes (AMGs), resulting in the metabolic categorization of thousands of putative AMGs from soils and guts. Together, DRAM and DRAM-v provide critical metabolic profiling capabilities that decipher mechanisms underpinning microbiome function.

Figures

Figure 1.
Conceptual overview and workflow of the assembly-based software, DRAM (Distilled and Refined Annotation of Metabolism). DRAM (green, A) profiles microbial metabolism from genomic sequences, while DRAM-v profiles the Auxiliary Metabolic Genes (AMGs) (orange, B) in vMAGs. DRAM’s input data files are denoted by circles in gray, while analysis and output files are denoted by rectangles in green for MAGs or orange for AMGs. DRAM’s outputs (the Raw, Distillate and Product) provide three levels of annotation density and metabolic parsing. More details on the output files and specific operation can be found in the Supplementary Text or at https://github.com/shafferm/DRAM/wiki. User-defined taxonomy (e.g. GTDB-Tk (24)) and completion estimates (e.g. CheckM (51)) for MAGs and isolate genomes can be input into DRAM.
Figure 2.
DRAM provides multiple levels of metabolic and structural information. (A) Genome cartoon of Dechloromonas aromatica RCB demonstrates the usability of DRAM to understand the potential metabolism of a genome. Putative enzymes are colored by location of information in DRAM’s outputs: Raw (black), Distillate (gray) and Product (white). Gene numbers, identifiers, or abbreviations are colored according to metabolic categories outlined in (B) and detailed in Supplementary File 4. Genes with an asterisk had an unidentified localization by PSORTb (101). (B) Flow chart shows the metabolisms from DRAM’s Distillate. Distillate provides five major categories of metabolism: energy, transporters, miscellaneous (MISC), carbon utilization and organic nitrogen. Each major category contains subcategories, with outlines denoting location of information within Distillate and Product. (C) Heatmap shows presence (colored) and absence (white) of databases used in comparable annotators to DRAM. Annotators are colored consistently in A–E, with Prokka (30) in black, DFAST (31) in light gray, MetaErg (32) in dark gray and DRAM in red. Bar charts in (D–G) show database size (D), as well as the number of annotated (E), hypothetical (F), and unannotated (G) genes assigned by each annotator when analyzing the in silico soil community. See methods for definitions of annotated, hypothetical, and unannotated genes, relative to each annotator.
Figure 3.
DRAM Product summarizes and visualizes ecosystem-relevant metabolisms across input genomes. Heatmaps in (A–C) were automatically generated by DRAM from the Product shown in Supplementary File 3. Sections of the heatmap are ordered to highlight information available in Product, including pathway completion (A), subunit completion (B), and presence/absence (C) data. Boxes colored by presence/absence in (C) represent 1–2 genes necessary to carry out a particular process. Hovering over the heatmap cells in the Product’s HTML outputs interactively reports the calculated percent completion among other information. Dechloromonas aromatica RCB is represented by a genome cartoon in Figure 2A and is highlighted in blue on the heatmaps.
Figure 4.
Substrate-resolved survey of carbon metabolism in the human gut. Bar charts represent normalized gene abundance or proportion of reads that mapped to each gene or gene category reported as relative abundance (%) or Gene Per Million (GPM). Reads came from previously published (56) healthy human fecal metagenomes that were assembled and then annotated in DRAM (A–C). (A) Using a subset of 44 randomly selected metagenomes from (56), we profiled and annotated gene abundance patterns colored by DRAM’s Distillate categories and subcategories. (B) Using the same metagenomes and sample order as in (A), summary of CAZymes to broader substrate categories reveals differential abundance patterns across the cohort. (C) Data from (B) are graphed by carbohydrate substrates. Boxplots represent the median and one quartile deviation of CAZyme abundance, with each point representing a single person in the 44-member cohort. Putative substrates are ordered by class, then by mean abundance.
Figure 5.
DRAM provides a metabolic inventory of microbial traits important in the human gut. Seventy-six medium and high-quality MAGs were reconstructed from a single HMP fecal metagenome. Taxonomy was assigned using GTDB-Tk (24), with colored boxes noting class and name noting genus. The presence (green) or absence (blue) of genes capable of catalyzing carbohydrate degradation or contributing to short chain fatty acid (SCFA) metabolism is reported in the heatmap. We note that the directionality of some of these SCFA conversions is difficult to infer from gene sequence alone. Genomes are clustered by gene presence and hemicellulose substrates are shown in red text.
Figure 6.
DRAM-v profiles putative AMGs in viral sequences. Description of DRAM-v's rules for auxiliary score (A) and flag (B) assignments. Auxiliary scores shown in (A) are determined by the location of a putative AMG on the contig relative to other viral hallmark or viral-like genes (determined by VirSorter (61)), with all scores being reported in the Distillate. Scores highlighted in red are considered high (1, 2) or medium (3) confidence and thus the putative AMGs are also represented in the Product. Flags shown in (B) highlight important details about each putative AMG of which the user should be aware, all being reported in the Raw. Putative AMGs with a confidence score 1–3 and a metabolic flag (flag ‘M’; highlighted in red) are included in the Distillate and Product, unless flags in blue are reported. Flags in black do not decide the inclusion of a putative AMG. (C) Bar graph displaying putative AMGs recovered by DRAM-v from metagenomic files (soil metagenomes (14), left; 44 gut metagenomes from the HMP (56), right) and categorized by the Distillate metabolic category: Carbon Utilization, Energy, Organic Nitrogen, Transporters and MISC. Putative AMGs labeled as ‘multiple’ refer to genes that occur in multiple DRAM Distillate categories (e.g. transporters for organic nitrogen) and AMGs that are labeled as previously reported are in the viral AMG database compiled here. (D) Sequence similarity network (66) of all AMGs with an auxiliary score of 1–3 recovered from soil and human fecal metagenomes. Nodes are connected by an edge (line) if the pairwise amino acid sequence identity is >80% (see Materials and Methods). Only clusters of >5 members are shown. Nodes are colored by the Distillate category defined in (C), while node shape denotes soil or human fecal. Background highlighting denotes if the cluster contains both soil and human fecal nodes (shared), soil nodes only, or human fecal nodes only. Specific AMGs highlighted in the text are shown. (E) Stacked bar chart shows the number of singletons (AMGs that do not align by at least 80% to another recovered AMG) in each sample type, with bars colored by DRAM-v's Distillate category.
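The inclusion rule and the network construction described in the Figure 6 caption can be illustrated with a minimal Python sketch. This is not DRAM-v's code: the function names, the input layout, and the gene identifiers are hypothetical, the flags that exclude a putative AMG (shown in blue in the figure) are not named in the caption and are therefore passed in by the caller, and treating the 80% cutoff as a strict inequality is an assumption.

```python
from collections import defaultdict


def passes_amg_filter(score, flags, excluding_flags):
    """Caption rule: auxiliary score 1-3, flag 'M' present, no excluding flag.

    `excluding_flags` stands in for the flags drawn in blue in the figure;
    the caption does not name them, so the caller must supply them.
    """
    has_excluding = bool(set(flags) & set(excluding_flags))
    return score in (1, 2, 3) and "M" in flags and not has_excluding


def similarity_clusters(genes, identities, threshold=80.0):
    """Connected components of the graph whose edges have identity > threshold.

    `identities` maps (gene_a, gene_b) pairs to percent amino acid identity.
    """
    neighbors = defaultdict(set)
    for (a, b), pct in identities.items():
        if pct > threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    seen, clusters = set(), []
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first walk collects every gene reachable from this one.
        stack, component = [gene], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Toy data (invented for illustration): g1..g6 form a chain, g7 and g8 do not link.
genes = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
identities = {
    ("g1", "g2"): 95.0,
    ("g2", "g3"): 85.0,
    ("g3", "g4"): 90.0,
    ("g4", "g5"): 81.0,
    ("g5", "g6"): 99.0,
    ("g7", "g8"): 79.5,
}

clusters = similarity_clusters(genes, identities)
shown = [c for c in clusters if len(c) > 5]       # panel D shows clusters of >5 members
singletons = [c for c in clusters if len(c) == 1]  # panel E counts unlinked AMGs
print(len(shown), len(singletons))
```

In this toy input, the six chained genes form the only cluster large enough to be drawn, and the two genes below the cutoff are counted as singletons. Clusters of two to five members are neither shown nor counted as singletons, which matches the caption's wording.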
