Toxicol Sci. 2020 Oct 1;177(2):392-404.
doi: 10.1093/toxsci/kfaa113.

Leveraging the Comparative Toxicogenomics Database to Fill in Knowledge Gaps for Environmental Health: A Test Case for Air Pollution-induced Cardiovascular Disease

Allan Peter Davis et al. Toxicol Sci. 2020.

Abstract

Environmental health studies relate how exposures (eg, chemicals) affect human health and disease; however, in most cases, the molecular and biological mechanisms connecting an exposure with a disease remain unknown. To help fill in these knowledge gaps, we sought to leverage content from the public Comparative Toxicogenomics Database (CTD) to identify potential intermediary steps. In a proof-of-concept study, we systematically compute the genes, molecular mechanisms, and biological events for the environmental health association linking air pollution toxicants with 2 cardiovascular diseases (myocardial infarction and hypertension) as a test case. Our approach integrates 5 types of curated interactions in CTD to build sets of "CGPD-tetramers," computationally constructed information blocks relating a Chemical-Gene interaction with a Phenotype and Disease. This bioinformatics strategy generates 653 CGPD-tetramers for air pollution-associated myocardial infarction (involving 5 pollutants, 58 genes, and 117 phenotypes) and 701 CGPD-tetramers for air pollution-associated hypertension (involving 3 pollutants, 96 genes, and 142 phenotypes). Collectively, we identify 19 genes and 96 phenotypes shared between these 2 air pollutant-induced outcomes, and suggest important roles for oxidative stress, inflammation, immune responses, cell death, and circulatory system processes. Moreover, CGPD-tetramers can be assembled into extensive chemical-induced disease pathways involving multiple gene products and sequential biological events, and many of these computed intermediary steps are validated in the literature. Our method does not require a priori knowledge of the toxicant, interacting gene, or biological system, and can be used to analyze any environmental chemical-induced disease curated within the public CTD framework.
This bioinformatics strategy links and interrelates chemicals, genes, phenotypes, and diseases to fill in knowledge gaps for environmental health studies, as demonstrated for air pollution-associated cardiovascular disease, but can be adapted by researchers for any environmentally influenced disease-of-interest.
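
The integration of 5 curated interaction types into CGPD-tetramers can be illustrated with a minimal Python sketch. It assumes each evidence line is available as a set of pairs (chemical-gene, chemical-phenotype, chemical-disease, gene-phenotype, gene-disease); the function name `build_tetramers`, the gene labels `GENE_A` and `GENE_B`, and the toy data are hypothetical and are not taken from CTD or from the authors' implementation.

```python
def build_tetramers(cg, cp, cd, gp, gd):
    """Return (chemical, gene, phenotype, disease) tuples backed by all 5 evidence lines.

    cg, cp, cd, gp, gd are sets of pairs: chemical-gene, chemical-phenotype,
    chemical-disease, gene-phenotype, and gene-disease interactions.
    """
    tetramers = set()
    for chem, gene in cg:  # evidence line 1: chemical-gene interaction
        # Phenotypes linked to both the chemical (line 2) and the gene (line 4).
        phenotypes = ({p for c, p in cp if c == chem}
                      & {p for g, p in gp if g == gene})
        # Diseases linked to both the chemical (line 3) and the gene (line 5).
        diseases = ({d for c, d in cd if c == chem}
                    & {d for g, d in gd if g == gene})
        for phenotype in phenotypes:
            for disease in diseases:
                tetramers.add((chem, gene, phenotype, disease))
    return tetramers


# Toy data (invented for illustration only).
cg = {("Ozone", "GENE_A"), ("Ozone", "GENE_B")}
cp = {("Ozone", "apoptotic process")}
cd = {("Ozone", "Myocardial Infarction")}
gp = {("GENE_A", "apoptotic process"), ("GENE_B", "apoptotic process")}
gd = {("GENE_A", "Myocardial Infarction")}  # GENE_B has no gene-disease evidence

tetramers = build_tetramers(cg, cp, cd, gp, gd)
print(tetramers)
```

In this toy example only `GENE_A` yields a tetramer; `GENE_B` is dropped because one of the 5 evidence lines (gene-disease) is missing.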

Keywords: air pollution; cardiovascular disease; chemical-induced pathways; database; environmental health.

Figures

Figure 1.
Ambient air pollutant-associated diseases in Comparative Toxicogenomics Database. The numbers of diseases with curated associations with 8 pollutants are clustered by disease categories (y-axis); only the top 7 disease categories are listed. The category “Pathology” includes pathological processes (eg, chromosome aberration, fibrosis, hemorrhage, and shock), and “Signs & Symptoms” includes clinical symptoms (eg, headache, sneezing, cough, and nausea). Chemicals: AP, Air Pollutants; CO, Carbon Monoxide; NO2, Nitrogen Dioxide; O3, Ozone; PM, Particulate Matter; SO2, Sulfur Dioxide; Soot; VE, Vehicle Emissions.
Figure 2.
CGPD-tetramers are computationally generated information units that interrelate 4 data types at Comparative Toxicogenomics Database (CTD). To generate a CGPD-tetramer by data integration, 5 lines of supporting evidence are required (box) as directly curated interactions among the 4 data types: C, chemical; G, gene product; P, phenotype; D, disease. If any 1 line of supporting evidence is lacking, the tetramer is not generated.
Figure 3.
Phenotypes associated with myocardial infarction (MI) in Comparative Toxicogenomics Database. A partial screenshot of the “Phenotypes” data-tab for MI lists the phenotypes that have an inferred relationship to this disease via either a Chemical Inference Network or Gene Inference Network. Here, 198 chemicals have direct interactions with both the phenotype “apoptotic process” (evidence line No. 2) and the disease MI (evidence line No. 3); independently, 17 genes also interact with both this phenotype (evidence line No. 4) and disease (evidence line No. 5). Thus, “apoptotic process” can be inferred to MI via 198 chemicals and 17 genes. Note: 5 of the 198 chemicals listed are environmental ambient air pollutants (boxed within the Chemical Inference Network).
Figure 4.
Step-wise process computing CGPD-tetramers for 2 cardiovascular diseases. The disease hierarchy shows 2 cardiovascular diseases: myocardial infarction (MI) (a heart disease) and hypertension (both a heart and vascular disease). In step 1, MI is associated with 4918 phenotypes inferred via a Chemical Inference Network (CIN) and/or a Gene Inference Network (GIN), representing 324 chemicals and 95 genes, respectively. In step 2, the data are filtered for only phenotypes inferred via both a CIN and a GIN, and then restricted further in step 3 by requiring a chemical in the CIN to have a directly curated Comparative Toxicogenomics Database interaction with a gene in the associated GIN, to yield 758 phenotypes. The data in step 3 are supported by all 5 required lines of evidence and can be used to generate 14 957 CGPD-tetramers (step 4). By limiting the chemicals to ambient air pollutants (step 5), 653 CGPD-tetramers remain, for 5 pollutants, 58 genes, and 117 phenotypes for MI. The same steps are performed for hypertension. Venn analysis discovers 19 genes and 96 phenotypes shared between these 2 cardiovascular diseases associated with air pollution exposure.
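
Step 5 and the Venn analysis in this caption amount to a filter followed by set intersections. The sketch below assumes tetramers are stored as (chemical, gene, phenotype, disease) tuples; the helper names, the placeholder chemical `CHEM_X`, the gene labels, and the toy tetramers are hypothetical and do not reproduce the paper's data.

```python
def restrict_to_chemicals(tetramers, chemicals):
    """Keep only tetramers whose chemical is in the given set (step 5)."""
    return {t for t in tetramers if t[0] in chemicals}


def shared_genes_and_phenotypes(tetramers_a, tetramers_b):
    """Return the genes and phenotypes common to two tetramer sets (Venn overlap)."""
    genes = {t[1] for t in tetramers_a} & {t[1] for t in tetramers_b}
    phenotypes = {t[2] for t in tetramers_a} & {t[2] for t in tetramers_b}
    return genes, phenotypes


# Toy tetramers (invented for illustration only).
mi = {
    ("Ozone", "GENE_A", "apoptotic process", "Myocardial Infarction"),
    ("Particulate Matter", "GENE_B", "inflammatory response", "Myocardial Infarction"),
    ("CHEM_X", "GENE_C", "cell death", "Myocardial Infarction"),
}
hypertension = {
    ("Ozone", "GENE_A", "inflammatory response", "Hypertension"),
    ("Ozone", "GENE_D", "apoptotic process", "Hypertension"),
    ("CHEM_X", "GENE_C", "cell death", "Hypertension"),
}
pollutants = {"Ozone", "Particulate Matter"}

mi_pollutant = restrict_to_chemicals(mi, pollutants)
ht_pollutant = restrict_to_chemicals(hypertension, pollutants)
shared_genes, shared_phenotypes = shared_genes_and_phenotypes(mi_pollutant, ht_pollutant)
print(shared_genes, shared_phenotypes)
```

Filtering before intersecting matters here: `GENE_C` and "cell death" appear for both diseases, but only through the non-pollutant `CHEM_X`, so they are excluded from the shared sets.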
Figure 5.
CGPD-tetramers relating air pollutants, intermediary genes, apoptosis, and myocardial infarction (MI). Forty-three individual CGPD-tetramers were computed that relate 5 environmental chemicals (C) with molecular interactions to 14 genes (G) that modulate the phenotype (P) “apoptotic process” inferred to the disease (D) MI. For visualization, the tetramers are condensed into a network schematic. Chemicals: AP, Air Pollutants; NO2, Nitrogen Dioxide; O3, Ozone; PM, Particulate Matter; VE, Vehicle Emissions; genes are depicted using official gene symbols.
Figure 6.
Aligning CGPD-tetramers. To help inform the knowledge gaps between ozone (O3) exposure and myocardial infarction (MI), the 91 computed CGPD-tetramers are condensed and aligned in a matrix, arranged by shared genes and phenotypes. Here, 33 genes (filled-in boxes) and 27 phenotypes clustered into 6 categories connect this individual pollutant to cardiovascular disease.
Figure 7.
Assembling CGPD-tetramers. Intermediary genes and phenotypes derived from CGPD-tetramers are coalesced and assembled into putative chemical-induced disease pathways to help inform the knowledge gaps connecting ozone (O3) exposure with myocardial infarction (MI) using steps at the molecular, cellular, and system levels. The number of genes (circles) associated with each set of phenotypes is indicated, and phenotype clusters (boxes) are interrelated by shared genes (curved arrows with lists of shared genes), helping to conjoin independent tetramers into a larger integrated network relating ozone exposure to cardiovascular disease. The diagram uses 84 CGPD-tetramers (from Figure 6) linking ozone, 33 genes, 24 phenotypes, and MI.
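
One way to picture how shared genes conjoin phenotype clusters is the short sketch below. It assumes a mapping from each phenotype to a cluster label is already available; that mapping, the function names, the gene labels, and the toy tetramers are all hypothetical, and the sketch is not the authors' assembly procedure.

```python
from collections import defaultdict
from itertools import combinations


def genes_by_cluster(tetramers, cluster_of):
    """Group the genes of (chemical, gene, phenotype, disease) tuples by phenotype cluster."""
    genes = defaultdict(set)
    for _chem, gene, phenotype, _disease in tetramers:
        genes[cluster_of[phenotype]].add(gene)
    return dict(genes)


def shared_gene_links(cluster_genes):
    """Return pairs of clusters that share at least one gene, with the shared genes."""
    links = {}
    for a, b in combinations(sorted(cluster_genes), 2):
        common = cluster_genes[a] & cluster_genes[b]
        if common:
            links[(a, b)] = common
    return links


# Toy tetramers and cluster assignment (invented for illustration only).
ozone_mi = {
    ("Ozone", "GENE_A", "response to oxidative stress", "Myocardial Infarction"),
    ("Ozone", "GENE_A", "inflammatory response", "Myocardial Infarction"),
    ("Ozone", "GENE_B", "inflammatory response", "Myocardial Infarction"),
    ("Ozone", "GENE_C", "apoptotic process", "Myocardial Infarction"),
}
cluster_of = {
    "response to oxidative stress": "oxidative stress",
    "inflammatory response": "inflammation",
    "apoptotic process": "cell death",
}

cluster_genes = genes_by_cluster(ozone_mi, cluster_of)
links = shared_gene_links(cluster_genes)
print(links)
```

Each entry in `links` corresponds to a connection between two phenotype clusters, labeled with the genes they have in common; clusters with no shared gene stay unconnected.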
