Int J Environ Res Public Health. 2012 Jul;9(7):2479-503.
doi: 10.3390/ijerph9072479. Epub 2012 Jul 12.

Using bioinformatic approaches to identify pathways targeted by human leukemogens

Reuben Thomas et al. Int J Environ Res Public Health. 2012 Jul.

Abstract

We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other.
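As a rough illustration of what "enrichment" of a pathway among a chemical's gene targets can mean, the sketch below computes an upper-tail hypergeometric p value. This is a common choice for over-representation analysis, but the abstract does not state which test the authors used, so treat it as an assumption. The function name `enrichment_p_value` and the counts in the example are hypothetical.

```python
from math import comb

def enrichment_p_value(n_universe, n_pathway, n_targets, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_universe: genes in the background set (assumed, e.g. all annotated genes)
    n_pathway:  genes annotated to the pathway
    n_targets:  genes targeted by the chemical
    n_overlap:  targeted genes that fall in the pathway
    """
    max_overlap = min(n_pathway, n_targets)
    total = comb(n_universe, n_targets)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_targets - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 background genes, a 60-gene pathway,
# a chemical with 150 known targets, 8 of which lie in the pathway.
p = enrichment_p_value(20000, 60, 150, 8)
```

A small p value suggests the chemical's targets fall in the pathway more often than chance alone would predict; repeating the calculation for every chemical and pathway would yield a chemicals-by-pathways matrix of p values.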

Keywords: Comparative Toxicogenomics Database; carcinogen; clustering; leukemogen; pathway.

Figures

Figure 1
Human leukemogens and non-leukemogenic carcinogens identified from NTP and IARC reports. The Venn diagram shows the numbers of leukemogens (n = 29) and non-leukemogenic carcinogens (n = 11) identified from IARC and NTP for which CTD data were available. The boxes detail the IARC and NTP carcinogen classifications of all 35 leukemogens and all 16 non-leukemogenic carcinogens, sorted by the agency from which they were selected. Within the boxes, the chemicals are organized by reported disease associations, then by IARC group number, then alphabetically by name. A “-” group or class indicates that no report is available. The 11 chemical names for which no CTD data were available are shown in gray italics.
Figure 2
Unsupervised clustering of KEGG human pathways targeted by 29 leukemogens and 11 non-leukemogenic carcinogens. The 250 pathways are clustered based on the distance between the columns (corresponding to the pathways) of the matrix of transformed pathway enrichment p values over all the 40 chemicals (corresponding to the rows). The figure is a visual representation of the distance matrix between all the chosen pathways. The color of the (i,j)th position of the distance matrix, where i and j represent indices in the set of human pathways indexed by values in the set {1, 2,…250}, is a measure of how close pathway i and pathway j are to each other based on the enrichment of their gene targets on all the 40 chemicals. The color ranges from white to red, with red indicating greater closeness of a pair of pathways. Dashed black lines indicate boundaries of clusters of pathways as determined by the Hierarchical Ordered Partitioning And Collapsing Hybrid (HOPACH) algorithm [60]. Two clusters, labeled 0 and 1, were identified.
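The distance matrix described in this caption can be sketched as follows. The caption does not say which transformation or distance metric was applied, so the negative log10 transform and Euclidean distance used here are assumptions, and the p values are made up for illustration.

```python
from math import log10, sqrt

def transform(p_values):
    """Assumed transform: -log10 of each enrichment p value."""
    return [[-log10(p) for p in row] for row in p_values]

def column_distances(matrix):
    """Euclidean distance between every pair of columns (pathways)."""
    columns = list(zip(*matrix))
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(ci, cj))) for cj in columns]
        for ci in columns
    ]

# Toy matrix: rows are chemicals, columns are pathways (made-up values).
p_values = [
    [0.1, 0.1, 1.0],
    [0.01, 0.01, 1.0],
]
distances = column_distances(transform(p_values))
```

In this toy matrix the first two pathways have identical enrichment profiles, so their distance is zero, while the third pathway sits farther from both. Transposing the matrix before calling `column_distances` would give chemical-to-chemical distances instead.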
Figure 3
Unsupervised clustering of leukemogens. The 29 leukemogens are clustered based on the distance between rows (corresponding to the leukemogens) of the matrix of transformed pathway enrichment p values over all the 250 KEGG human pathways (corresponding to the columns). The figure is a visual representation of the distance matrix between all the chosen leukemogens. The color of the (i,j)th position of the distance matrix, where i and j represent indices in the set of 29 leukemogens indexed by values in the set {1, 2,…29}, is a measure of how close leukemogen i and leukemogen j are to each other based on the enrichment of their gene targets on all the KEGG human pathways. The color ranges from white to red, with red indicating greater closeness of leukemogen pairs. Dashed black lines indicate the boundaries of clusters of leukemogens as determined by the HOPACH algorithm [60]. Eighteen clusters, labeled from 0 to 17, were identified. Listed on the right are the chemical names, with the medoid chemical of each cluster (the chemical whose pathway response pattern is most similar to those of the other chemicals in the cluster) shown in bold, and the cluster membership probabilities of each of the leukemogens.
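The notion of a medoid used in this caption can be illustrated with a minimal sketch: given a distance matrix and the indices of one cluster's members, the medoid is taken here as the member with the smallest summed distance to the rest of the cluster. This does not reproduce HOPACH itself; the helper name `medoid` and the toy distances are hypothetical.

```python
def medoid(distances, members):
    """Return the member index with the smallest total distance to the cluster."""
    return min(members, key=lambda i: sum(distances[i][j] for j in members))

# Toy distance matrix for three chemicals placed at 0, 1 and 5 on a line.
toy_distances = [
    [0, 1, 5],
    [1, 0, 4],
    [5, 4, 0],
]
center = medoid(toy_distances, [0, 1, 2])
```

Here the summed distances are 6, 5 and 9, so the middle chemical (index 1) is selected as the medoid.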
Figure 4
Unsupervised clustering and supervised classification of leukemogens and non-leukemogenic carcinogens. The 40 chemicals (29 leukemogens and 11 non-leukemogenic carcinogens) are clustered based on the distance between the rows (corresponding to the chemicals) of the matrix of transformed pathway enrichment p values over all the 250 KEGG human pathways (corresponding to the columns). The figure is a visual representation of the distance matrix between all the chosen chemicals. The color of the (i,j)th position of the distance matrix, where i and j represent indices in the set of 40 chemicals indexed by values in the set {1, 2,…40}, is a measure of how close chemical i and chemical j are to each other based on the enrichment of their gene targets on all the pathways. The color ranges from white to red, with red indicating greater closeness of a pair of chemicals. Dashed black lines indicate boundaries of chemical clusters as determined by the HOPACH algorithm [60]. Seven clusters, labeled from 0 to 6, were identified. The chemical names, leukemogens in lower case and non-leukemogenic carcinogens in upper case, are provided on the right of the figure. The medoid chemical (the chemical whose pathway response pattern is most similar to those of the other chemicals in the cluster) for each cluster is identified in bold, and the cluster membership probabilities are provided. The mean predictions from the one- and two-class classification methods (Sections 2.4.2 and 2.4.3) are provided in the last two columns. A prediction value of 1 represents the leukemogen class of chemicals and 0 represents the non-leukemogenic carcinogen class. These predictions are based on an approximately 50% false-positive rate.
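The abstract summarizes the two-class result as a 76% chance of distinguishing a random leukemogen/non-leukemogen pair. One standard way to compute a statistic with that interpretation is the fraction of (leukemogen, non-leukemogen) pairs in which the leukemogen receives the higher prediction score, which equals the area under the ROC curve. Whether the authors computed their estimate exactly this way is an assumption; the function name and the scores below are made up.

```python
def pairwise_separation(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Made-up mean prediction scores (1 = leukemogen class, 0 = non-leukemogen class).
leukemogen_scores = [0.9, 0.8, 0.4]
non_leukemogen_scores = [0.5, 0.3]
separation = pairwise_separation(leukemogen_scores, non_leukemogen_scores)
```

With these toy scores, five of the six possible pairs are ordered correctly. A value of 0.5 would correspond to chance-level separation and 1.0 to perfect separation.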

References

    1. Sawyers C.L., Denny C.T., Witte O.N. Leukemia and the disruption of normal hematopoiesis. Cell. 1991;64:337–350. doi: 10.1016/0092-8674(91)90643-D.
    2. Swerdlow S.H., Campo E., Harris N.L., Jaffe E.S., Pileri S.A., Stein H., Thiele J., Vardiman J.W. WHO Classification of Tumours of Haematopoietic and Lymphoid Tissues. IARC; Lyon, France: 2008.
    3. Vardiman J.W. The World Health Organization (WHO) classification of tumors of the hematopoietic and lymphoid tissues: An overview with emphasis on the myeloid neoplasms. Chem. Biol. Interact. 2010;184:16–20. doi: 10.1016/j.cbi.2009.10.009.
    4. American Cancer Society. Cancer Facts & Figures 2012. American Cancer Society; Atlanta, GA, USA: 2012.
    5. Austin H., Delzell E., Cole P. Benzene and leukemia. A review of the literature and a risk assessment. Am. J. Epidemiol. 1988;127:419–439.
