Review
Curr Opin Struct Biol. 2022 Aug:75:102416.
doi: 10.1016/j.sbi.2022.102416. Epub 2022 Jul 13.

A structural metagenomics pipeline for examining the gut microbiome

Morgan E Walker et al. Curr Opin Struct Biol. 2022 Aug.

Abstract

Metagenomic sequencing data provide a rich resource from which to expand our understanding of differential protein functions involved in human health. Here, we outline a pipeline that combines microbial whole genome sequencing with protein structure data to yield a structural metagenomics-informed atlas of microbial enzyme families of interest. Visualizing metagenomics data through a structural lens facilitates downstream studies including targeted inhibition and probe-based proteomics to define at the molecular level how different enzyme orthologs impact in vivo function. Application of this pipeline to gut microbial enzymes like glucuronidases, TMA lyases, and bile salt hydrolases is expected to pinpoint their involvement in health and disease and may aid in the development of therapeutics that target specific enzymes within the microbiome.

Conflict of interest statement

MRR is a founder of Symberix, Inc., which is developing microbiome-targeted therapeutics. MRR is also the recipient of research funding from Merck and Lilly.

Figures

Figure 1. A pipeline for examining structural metagenomics data from the gut microbiome.
(a). First, large microbial metagenomics databases are examined for sequences of the enzyme of interest. (b). Sequences are then organized using a sequence similarity network and examined through the lens of known structural data about the enzyme family. (c). The outcome of this analysis can be used to understand differential functional and inhibition data about distinct enzyme isoforms, and to develop chemical probes towards future proteomics efforts. (d). Finally, the holistic understanding of an enzyme class developed through this pipeline can then be validated and further explored for physiological impact through in vivo modulation of enzyme activity.
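Step (a) can be pictured as a residue-based screen of candidate sequences. The sketch below is illustrative only: it assumes the candidates have already been aligned to a reference enzyme, and the function names, alignment columns, and toy sequences are hypothetical rather than taken from the review.

```python
def has_required_residues(aligned_seq, required):
    """Check one aligned sequence against a set of required residues.

    required maps a 0-based alignment column (hypothetical positions)
    to the set of residues accepted at that column.
    """
    return all(
        col < len(aligned_seq) and aligned_seq[col] in allowed
        for col, allowed in required.items()
    )


def screen_candidates(candidates, required):
    """Return the names of candidates that carry every required residue."""
    return [
        name
        for name, aligned_seq in candidates.items()
        if has_required_residues(aligned_seq, required)
    ]


# Toy data: two alignment columns that must both hold glutamate (E).
required = {2: {"E"}, 5: {"E"}}
candidates = {
    "seq1": "MKEAAEL",
    "seq2": "MKDAAEL",  # D instead of E at column 2, so it is rejected
    "seq3": "MKE-AEL",  # a gap elsewhere does not affect the check
}
kept = screen_candidates(candidates, required)
print(kept)
```

A real screen would also have to account for alignment quality and for the core structural features mentioned in the captions, which this sketch does not model.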
Figure 2. Structural classification of GUS enzymes.
(a). Functionally critical residues validated by experiment are combined with core structural features to search microbial metagenomic protein sequences for enzyme families, as outlined here for gut microbial β-glucuronidase (GUS) proteins. C-terminus (Ct) and N-terminus (Nt) are labeled. The grey circle indicates the region where C-terminal domains (CTDs) of 200+ residues exist in some GUS enzymes. (b). GUS enzymes from the human and mouse GI microbiomes can be organized by active site features into eight clades (Loop 1, etc.), with each clade demonstrating substrate preferences (small molecule, etc.) unique to this promiscuous group of proteins. Structures are as follows: Loop 1, E. coli; Mini Loop 1, B. fragilis; N-terminal Loop, B. uniformis 1; Loop 2, B. uniformis 2; Mini Loop 2, P. merdae; FMN-binding, R. gnavus 3; Mini Loop 1,2, B. ovatus model created using the Phyre2 Protein Fold Recognition Server; No Loop, B. dorei. (c). Sequence similarity network of the 710 unique gut microbial GUS proteins identified in the Integrated Gene Catalog reveals clustering related to functional clade. SSN was created using the EFI-EST tool and an E value of 1 × 10^−220.
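The idea behind panel (c), linking two sequences only when their pairwise E value falls at or below a stringent cutoff and then reading off the resulting clusters, can be sketched in a few lines. This is not the EFI-EST implementation: it assumes pairwise E values have already been computed by some all-by-all comparison, and the function names and toy hits are hypothetical.

```python
from collections import defaultdict


def build_ssn(pairwise_hits, max_evalue):
    """Build an undirected network from (seq_a, seq_b, evalue) tuples.

    An edge is kept only when evalue <= max_evalue; every sequence
    that appears in the hits is kept as a node, even if isolated.
    """
    graph = defaultdict(set)
    for seq_a, seq_b, evalue in pairwise_hits:
        graph[seq_a]  # touching the key registers the node
        graph[seq_b]
        if seq_a != seq_b and evalue <= max_evalue:
            graph[seq_a].add(seq_b)
            graph[seq_b].add(seq_a)
    return dict(graph)


def find_clusters(graph):
    """Return connected components as sets, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy hits; the cutoff mirrors the 1e-220 value quoted in the caption.
hits = [
    ("A", "B", 1e-250),
    ("B", "C", 1e-230),
    ("C", "D", 1e-100),  # too weak, so no edge is drawn
    ("D", "E", 1e-240),
]
network = build_ssn(hits, max_evalue=1e-220)
clusters = find_clusters(network)
print(clusters)
```

Loosening the cutoff merges clusters and tightening it splits them, so the threshold choice largely determines how well clusters line up with functional clades.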
Figure 3. Structural diversity of GUS enzymes and inhibitors.
(a). The twenty-two extant gut microbial GUS crystal structures reveal several distinct quaternary structures, including four unique tetramers, four unique dimers, a trimer, and a hexamer, many of which influence catalytic function. The oligomeric structure of R. gnavus 3 GUS was ambiguous based on the crystal structure and is thus shown as a monomer. (b). Chemically distinct inhibitors of GUS enzymes, including several that are selective for specific clades of these diverse gut microbial proteins, as well as an activity-based probe scaffold that is active against all GUS enzyme orthologues.
Figure 4. Gut microbial candidates for further biochemical characterization.
CutC proteins appear to be specific for a narrow range of substrates. The bile salt hydrolases exhibit a slightly broader accommodation of distinct substrates, and more variation in their active site structures. The azoreductases are the most promiscuous of the enzymes shown here, being capable of processing a wide range of substrates and exhibiting even more variation in their active site structures. For each of these enzyme classes, a covalent probe compound has been described that can be leveraged for targeted proteomics.
