BMC Genomics. 2013 Dec 9;14:865.
doi: 10.1186/1471-2164-14-865.

Automated workflow-based exploitation of pathway databases provides new insights into genetic associations of metabolite profiles

Harish Dharuri et al. BMC Genomics. 2013.

Abstract

Background: Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) that associate with clinical phenotypes, but these SNPs usually explain just a small part of the heritability and have relatively modest effect sizes. In contrast, SNPs that associate with metabolite levels generally explain a higher percentage of the genetic variation and demonstrate larger effect sizes. Still, the discovery of SNPs associated with metabolite levels is challenging since testing all metabolites measured in typical metabolomics studies with all SNPs comes with a severe multiple testing penalty. We have developed an automated workflow approach that utilizes prior knowledge of biochemical pathways present in databases like KEGG and BioCyc to generate a smaller SNP set relevant to the metabolite. This paper explores the opportunities and challenges in the analysis of GWAS of metabolomic phenotypes and provides novel insights into the genetic basis of metabolic variation through the re-analysis of published GWAS datasets.

Results: Re-analysis of the published GWAS dataset from Illig et al. (Nature Genetics, 2010) using a pathway-based workflow (http://www.myexperiment.org/packs/319.html) confirmed previously identified hits and identified a new locus of human metabolic individuality, associating aldehyde dehydrogenase 1 family, member L1 (ALDH1L1) with serine/glycine ratios in blood. Replication in an independent GWAS dataset of phospholipids (Demirkan et al., PLoS Genetics, 2012) identified two novel loci supported by additional literature evidence: GPAM (glycerol-3-phosphate acyltransferase) and CBS (cystathionine beta-synthase). In addition, the workflow approach provided novel insight into the affected pathways and the relevance of some of these gene-metabolite pairs in disease development and progression.

Conclusions: We demonstrate the utility of automated exploitation of background knowledge present in pathway databases for the analysis of GWAS datasets of metabolomic phenotypes. We report novel loci and potential biochemical mechanisms that contribute to our understanding of the genetic basis of metabolic variation and its relationship to disease development and progression.
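The multiple testing penalty mentioned in the Background can be illustrated with a short calculation. The sketch below assumes a Bonferroni-style correction and entirely hypothetical counts of metabolites and SNPs; neither the correction method nor the numbers are taken from the paper. It only shows why restricting each metabolite to a pathway-derived SNP set relaxes the per-test significance threshold.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical study dimensions, chosen only for illustration.
n_metabolites = 150
n_snps_genome_wide = 500_000   # every SNP tested against every metabolite
n_snps_per_metabolite = 200    # assumed size of a pathway-derived SNP set

# Threshold when all SNP-metabolite pairs are tested.
genome_wide = bonferroni_threshold(n_metabolites * n_snps_genome_wide)

# Threshold when each metabolite is tested only against its own SNP set.
pathway_based = bonferroni_threshold(n_metabolites * n_snps_per_metabolite)

print(f"genome-wide threshold:   {genome_wide:.2e}")
print(f"pathway-based threshold: {pathway_based:.2e}")
```

Under these assumed numbers the pathway-restricted analysis performs 2,500 times fewer tests, so an association needs a correspondingly less extreme p-value to pass correction.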

Figures

Figure 1
The database interrogation schemes. The two interrogation schemes, the pathway scheme (A) and the reaction scheme (B), are shown. Blue indicates the intermediate steps that filter certain pathways/compounds out of the two schemes to avoid non-specific connections.
Figure 2
Strategy to find biologically relevant SNP-metabolite pairs in published GWAS datasets. Background knowledge pertaining to a metabolite is collected from the pathway databases KEGG and BioCyc in an automated fashion to generate a gene/SNP set relevant to the synthesis and degradation of the metabolite.
Figure 3
Gene set overlap for the KEGG and BioCyc databases. The Venn diagram depicts the overlap between the non-redundant gene set for KEGG and the BioCyc metabolic pathway database. These genes correspond to the combined set from the pathway and reaction interrogation schemes. The total number of unique genes that our method yields is 1246.
Figure 4
Role of ALDH1L1 in the cytosolic one-carbon pool metabolism. A simplified schematic of the one-carbon pool metabolism in the cytosol is depicted. ALDH1L1: Aldehyde Dehydrogenase 1 Family, Member L1; THF: tetrahydrofolate; SHMT: Serine hydroxymethyltransferase.
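The strategy described for Figures 2 and 3 (collect genes per metabolite from KEGG and BioCyc, combine them into a non-redundant set, then keep only SNPs in those genes) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the function names, the SNP identifiers, the placeholder genes (GENE_A, GENE_X, GENE_Y) and the toy mappings are all hypothetical, and a real analysis would obtain the gene sets and SNP annotations from the databases themselves.

```python
from typing import Dict, List, Set, Tuple


def merge_gene_sets(
    kegg: Dict[str, Set[str]], biocyc: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Combine per-metabolite gene sets from two sources into a non-redundant union."""
    merged: Dict[str, Set[str]] = {}
    for metabolite in kegg.keys() | biocyc.keys():
        merged[metabolite] = kegg.get(metabolite, set()) | biocyc.get(metabolite, set())
    return merged


def candidate_pairs(
    metabolite_genes: Dict[str, Set[str]], snp_to_gene: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return sorted (metabolite, SNP) pairs where the SNP lies in a gene of that metabolite's set."""
    pairs = [
        (metabolite, snp)
        for metabolite, genes in metabolite_genes.items()
        for snp, gene in snp_to_gene.items()
        if gene in genes
    ]
    return sorted(pairs)


# Toy inputs (hypothetical): gene sets as they might come back from each database.
kegg = {"serine/glycine ratio": {"ALDH1L1"}}
biocyc = {"serine/glycine ratio": {"GENE_A"}, "metabolite_x": {"GENE_X"}}

# Toy SNP annotation (hypothetical identifiers): SNP -> gene it maps to.
snp_to_gene = {"snp_1": "ALDH1L1", "snp_2": "GENE_Y", "snp_3": "GENE_X"}

pairs = candidate_pairs(merge_gene_sets(kegg, biocyc), snp_to_gene)
print(pairs)
```

Only the pairs returned here would then be looked up in the published GWAS results, which is what keeps the number of tests small relative to an all-against-all scan.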

References

    1. Gieger C, Geistlinger L, Altmaier E, Hrabe de Angelis M, Kronenberg F, Meitinger T, Mewes HW, Wichmann HE, Weinberger KM, Adamski J, et al. Genetics meets metabolomics: a genome-wide association study of metabolite profiles in human serum. PLoS Genet. 2008;4(11):e1000282. doi: 10.1371/journal.pgen.1000282.
    2. Illig T, Gieger C, Zhai G, Romisch-Margl W, Wang-Sattler R, Prehn C, Altmaier E, Kastenmuller G, Kato BS, Mewes HW, et al. A genome-wide perspective of genetic variation in human metabolism. Nat Genet. 2010;42(2):137–141. doi: 10.1038/ng.507.
    3. Suhre K, Shin SY, Petersen AK, Mohney RP, Meredith D, Wagele B, Altmaier E, Deloukas P, Erdmann J, Grundberg E, et al. Human metabolic individuality in biomedical and pharmaceutical research. Nature. 2011;477(7362):54–60. doi: 10.1038/nature10354.
    4. Demirkan A, van Duijn CM, Ugocsai P, Isaacs A, Pramstaller PP, Liebisch G, Wilson JF, Johansson A, Rudan I, Aulchenko YS, et al. Genome-wide association study identifies novel loci associated with circulating phospho- and sphingolipid concentrations. PLoS Genet. 2012;8(2):e1002490. doi: 10.1371/journal.pgen.1002490.
    5. Wang-Sattler R, Yu Z, Herder C, Messias AC, Floegel A, He Y, Heim K, Campillos M, Holzapfel C, Thorand B, et al. Novel biomarkers for pre-diabetes identified by metabolomics. Mol Syst Biol. 2012;8:615.
