mSystems. 2020 Jun 23;5(3):e00291-20.
doi: 10.1128/mSystems.00291-20.

GapMind: Automated Annotation of Amino Acid Biosynthesis


Morgan N Price et al. mSystems.

Abstract

GapMind is a Web-based tool for annotating amino acid biosynthesis in bacteria and archaea (http://papers.genomics.lbl.gov/gaps). GapMind incorporates many variant pathways and 130 different reactions, and it analyzes a genome in just 15 s. To avoid error-prone transitive annotations, GapMind relies primarily on a database of experimentally characterized proteins. GapMind correctly handles fusion proteins and split proteins, which often cause errors for best-hit approaches. To improve GapMind's coverage, we examined genetic data from 35 bacteria that grow in defined media without amino acids, and we filled many gaps in amino acid biosynthesis pathways. For example, we identified additional genes for arginine synthesis with succinylated intermediates in Bacteroides thetaiotaomicron, and we propose that Dyella japonica synthesizes tyrosine from phenylalanine. Nevertheless, for many bacteria and archaea that grow in minimal media, genes for some steps still cannot be identified. To help interpret potential gaps, GapMind checks if they match known gaps in related microbes that can grow in minimal media. GapMind should aid the identification of microbial growth requirements.

IMPORTANCE Many microbes can make all of the amino acids (the building blocks of proteins). In principle, we should be able to predict which amino acids a microbe can make, and which it requires as nutrients, by checking its genome sequence for all of the necessary genes. However, in practice, it is difficult to check for all of the alternative pathways. Furthermore, new pathways and enzymes are still being discovered. We built an automated tool, GapMind, to annotate amino acid biosynthesis in bacterial and archaeal genomes. We used GapMind to list gaps: cases where a microbe makes an amino acid but a complete pathway cannot be identified in its genome. We used these gaps, together with data from mutants, to identify new pathways and enzymes. However, for most bacteria and archaea, we still do not know how they can make all of the amino acids.
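The idea of listing gaps and comparing them with known gaps in related microbes can be illustrated with a minimal sketch. This is not GapMind's implementation: the confidence labels, the function names, and the step names (`stepA`, `stepB`, `stepC`) are all hypothetical, and the sketch assumes a pathway with no variants in which a step counts as a gap whenever none of its candidates is high confidence.

```python
from typing import Dict, List, Set

# Hypothetical ordering of confidence labels (an assumption for this sketch).
RANK = {"low": 0, "medium": 1, "high": 2}


def best_confidence(candidates: List[str]) -> str:
    """Return the highest confidence among a step's candidates ('low' if none)."""
    return max(candidates, key=RANK.__getitem__, default="low")


def list_gaps(pathway: List[str], candidates: Dict[str, List[str]]) -> List[str]:
    """Return the steps of a pathway that lack a high-confidence candidate."""
    return [
        step
        for step in pathway
        if best_confidence(candidates.get(step, [])) != "high"
    ]


def unexplained_gaps(gaps: List[str], known_gaps: Set[str]) -> List[str]:
    """Drop gaps that match known gaps in related microbes that grow in minimal media."""
    return [step for step in gaps if step not in known_gaps]


# Hypothetical pathway: stepA has a strong candidate, stepB only a medium one,
# and stepC has no candidate at all.
pathway = ["stepA", "stepB", "stepC"]
candidates = {"stepA": ["high", "low"], "stepB": ["medium"]}

gaps = list_gaps(pathway, candidates)
remaining = unexplained_gaps(gaps, known_gaps={"stepC"})
```

Here `gaps` holds the two steps without a high-confidence candidate, and `remaining` keeps only the one that is not already a known gap in a related organism.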

Keywords: amino acid biosynthesis; gene annotation; high-throughput genetics.


Figures

FIG 1
How GapMind works. (A) A pathway with no variants. (B) The definition of a step. (C) Confidence levels for candidates from ublast. (D) Confidence levels for candidates from HMMER.
FIG 2
GapMind handles fusion proteins and split proteins. (A) HSERO_RS20920 from Herbaspirillum seropedicae SmR1 is a fusion of AroL and AroB (shown with Swiss-Prot identifiers). (B) Split candidates for vitamin B12-dependent methionine synthase (MetH) in Burkholderia phytofirmans PsJN and Bacteroides thetaiotaomicron VPI-5482. a.a., amino acids.
FIG 3
Arginine biosynthesis with succinylated intermediates. (A) The standard pathway. Protein names are from Escherichia coli or Bacillus subtilis. The formation of carbamoyl phosphate (catalyzed by CarAB) is not shown. (B) The pathway in Bacteroides and in other Bacteroidetes. (C) Fitness data from Bacteroides thetaiotaomicron VPI-5482, Echinicola vietnamensis KMM 6221 (DSM 17526), and Pedobacter sp. strain GW460-11-11-14-LB5 (from references and 20). Each fitness value is the log2 change in the abundance of the mutants in a gene during an experiment. Each experiment went from an optical density at 600 nm of 0.02 to saturation (usually 4 to 8 doublings). Fitness values for CA265_RS18510 were not estimated, because mutants of this gene were at low abundance in the starting samples.
FIG 4
Tyrosine synthesis via phenylalanine hydroxylase in Dyella japonica and Echinicola vietnamensis. (A) Gene fitness in Dyella japonica UNC79MFTsu3.2. The x axis shows the median fitness across 59 genes that are predicted to be involved in amino acid biosynthesis (by TIGRFam role [14]), and the y axis shows the fitness of the predicted phenylalanine hydroxylase (PAH). (B) Gene fitness in Echinicola vietnamensis KMM 6221 (DSM 17526) for prephenate dehydrogenase (x axis) and for PAH (y axis). In both panels, we color code experiments by whether or not tyrosine was present in the media. The experiments with tyrosine usually included it via yeast extract or Casamino Acids, while the experiments without tyrosine are in defined media with just one or no amino acids added. Lines show x = 0 and y = 0, corresponding to no effect of mutating the genes. In panel A, lines show x = y.
FIG 5
Number of gaps in amino acid biosynthesis in 148 diverse bacteria and archaea that can grow without amino acids. (These are distinct from the 35 bacteria with fitness data.)
FIG 6
GapMind’s website renders the best paths for amino acid biosynthesis in Desulfovibrio alaskensis G20. Each step is color coded by its confidence level, and a question mark indicates known gaps in related organisms.
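
The FIG 3 caption defines a fitness value as the log2 change in the abundance of the mutants in a gene during an experiment, and notes that values are not estimated when mutants are at low abundance in the starting samples. A minimal sketch of that definition follows. The function name, the pseudocount, and the `min_start` threshold are assumptions made for illustration, and the sketch takes abundances that are already comparable between samples; the published analysis pipeline may normalize and filter differently.

```python
import math
from typing import Optional


def log2_fitness(
    start_abundance: float,
    end_abundance: float,
    pseudocount: float = 1.0,  # assumed; avoids taking the log of zero
    min_start: float = 0.0,    # hypothetical cutoff for "low abundance"
) -> Optional[float]:
    """Return log2(end / start) with a pseudocount, or None if not estimable."""
    if start_abundance < min_start:
        # Too few mutants in the starting sample to estimate a fitness value.
        return None
    return math.log2((end_abundance + pseudocount) / (start_abundance + pseudocount))
```

With the default pseudocount, equal starting and ending abundances give a fitness of 0 (no effect of mutating the gene), and negative values indicate that mutants in the gene became less abundant during growth.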


References

    1. Chen I-M, Markowitz VM, Chu K, Anderson I, Mavromatis K, Kyrpides NC, Ivanova NN. 2013. Improving microbial genome annotations in an integrated database context. PLoS One 8:e54859. doi:10.1371/journal.pone.0054859.
    2. D'Souza G, Waschina S, Pande S, Bohl K, Kaleta C, Kost C. 2014. Less is more: selective advantages can explain the prevalent loss of biosynthetic genes in bacteria. Evolution 68:2559–2570. doi:10.1111/evo.12468.
    3. Price MN, Zane GM, Kuehl JV, Melnyk RA, Wall JD, Deutschbauer AM, Arkin AP. 2018. Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics. PLoS Genet 14:e1007147. doi:10.1371/journal.pgen.1007147.
    4. Seif Y, Choudhary KS, Hefner Y, Anand A, Yang L, Palsson BO. 2020. Metabolic and genetic basis for auxotrophies in Gram-negative species. Proc Natl Acad Sci U S A 117:6264–6273. doi:10.1073/pnas.1910499117.
    5. Tramontano M, Andrejev S, Pruteanu M, Klünemann M, Kuhn M, Galardini M, Jouhten P, Zelezniak A, Zeller G, Bork P, Typas A, Patil KR. 2018. Nutritional preferences of human gut bacteria reveal their metabolic idiosyncrasies. Nat Microbiol 3:514–522. doi:10.1038/s41564-018-0123-9.