Chem Res Toxicol. 2015 Apr 20;28(4):797-809.
doi: 10.1021/acs.chemrestox.5b00017. Epub 2015 Mar 16.

Site of reactivity models predict molecular reactivity of diverse chemicals with glutathione

Tyler B Hughes et al. Chem Res Toxicol.

Abstract

Drug toxicity is often caused by electrophilic reactive metabolites that covalently bind to proteins. Consequently, the quantitative strength of a molecule's reactivity with glutathione (GSH) is a frequently used indicator of its toxicity. Through cysteine, GSH (and proteins) scavenges reactive molecules to form conjugates in the body. GSH conjugates to specific atoms in reactive molecules: their sites of reactivity. The value of knowing a molecule's sites of reactivity is unexplored in the literature. This study tests the value of site of reactivity data that identifies the atoms within 1213 reactive molecules that conjugate to GSH and builds models to predict molecular reactivity with glutathione. An algorithm originally written to model sites of cytochrome P450 metabolism (called XenoSite) finds clear patterns in molecular structure that identify sites of reactivity within reactive molecules with 90.8% accuracy and separate reactive and unreactive molecules with 80.6% accuracy. Furthermore, the model output strongly correlates with quantitative GSH reactivity data in chemically diverse, external data sets. Site of reactivity data is nearly unstudied in the literature prior to our efforts, yet it contains a strong signal for reactivity that can be utilized to more accurately predict molecule reactivity and, eventually, toxicity.

Conflict of interest statement

Notes

The authors declare no competing financial interest.

Figures

Figure 1
Adverse drug reactions are often caused by reactive metabolites. Acetaminophen is metabolized by cytochromes P450 to N-acetyl-p-benzoquinone imine (NAPQI). NAPQI is electrophilically reactive and covalently binds to nucleophilic sites within proteins, eliciting an immune response. Glutathione (GSH, outlined in gray) protects the body from this adverse drug reaction by scavenging electrophiles like NAPQI, to which GSH binds at its site of reactivity (circled atom). Thus, a site of GSH conjugation is a likely site of protein conjugation, and identifying these sites of reactivity offers information about the mechanism of metabolite toxicity. Several methods have been published that can predict how P450s metabolize molecules. This study, however, focuses on modeling the reactivity of molecules with GSH rather than the metabolism of molecules into reactive species.
Figure 2
The structure of the reactivity model. This diagram shows how information flows through the model, which is composed of one input layer, one hidden layer, and two output layers. This model computed a prediction for each test molecule and atom in the test molecule. Atom reactivity scores (ARS) were computed with a neural network, with one output node, one hidden layer (with ten units), and one input layer. From the 3D structure of input molecule χ, 30 molecule-level and 86 atom-level descriptors were calculated (two input layer nodes for each category are displayed). The diagram shows only two hidden nodes, two molecule input nodes, and two atom input nodes for conciseness. The actual model had several additional nodes in each input and hidden layer. For each atom within χ, all 116 descriptors were fed into the 10 hidden layer nodes (two are displayed), which generated an ARS. The molecule reactivity score (MRS) of χ was computed from the top five ARS corresponding to the scores of the five atoms predicted to be the most reactive within a molecule and all molecule-level descriptors. The molecules on the top right illustrate atom-level data, with sites of reactivity circled, and the molecules on the bottom right illustrate molecule-level data, with reactive molecules circled.
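The information flow described in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' XenoSite implementation: the layer sizes (30 molecule-level and 86 atom-level descriptors, 10 hidden units, top five ARS) come from the caption, while the logistic activations in the atom network, the zero-padding for molecules with fewer than five atoms, the random placeholder weights, and all function names are assumptions made for the example.

```python
import math
import random

N_MOL, N_ATOM, N_HIDDEN, TOP_K = 30, 86, 10, 5  # sizes taken from the caption


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def atom_reactivity_score(descriptors, w_hidden, b_hidden, w_out, b_out):
    """ARS: 116 descriptors -> 10 hidden units -> one output node."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, descriptors)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def molecule_reactivity_score(atom_scores, mol_descriptors, w_mrs, b_mrs):
    """MRS: logistic regression on the top five ARS plus molecule descriptors."""
    top = sorted(atom_scores, reverse=True)[:TOP_K]
    top += [0.0] * (TOP_K - len(top))  # zero-padding is an assumption
    features = top + list(mol_descriptors)
    return sigmoid(sum(w * x for w, x in zip(w_mrs, features)) + b_mrs)


# Placeholder weights and descriptors; a real model would learn the weights.
rng = random.Random(0)


def rand_vec(n):
    return [rng.uniform(-0.1, 0.1) for _ in range(n)]


w_hidden = [rand_vec(N_MOL + N_ATOM) for _ in range(N_HIDDEN)]
b_hidden, w_out, b_out = rand_vec(N_HIDDEN), rand_vec(N_HIDDEN), 0.0
w_mrs, b_mrs = rand_vec(TOP_K + N_MOL), 0.0

mol_desc = rand_vec(N_MOL)                          # one hypothetical molecule
atom_descs = [rand_vec(N_ATOM) for _ in range(7)]   # its seven atoms
ars = [atom_reactivity_score(mol_desc + a, w_hidden, b_hidden, w_out, b_out)
       for a in atom_descs]
mrs = molecule_reactivity_score(ars, mol_desc, w_mrs, b_mrs)
```

Each atom receives all 116 descriptors (the molecule-level values are repeated for every atom), and the molecule-level score reuses the atom-level outputs, which is the two-stage structure the diagram depicts.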
Figure 3
Atom reactivity scores accurately identified sites of reactivity. For each prediction method, average site AUC was computed for 1213 reactive molecules, with their sites of conjugation to glutathione labeled. This metric reflected how often reactive atoms were ranked above unreactive atoms within reactive molecules. The cross-validated atom reactivity scores (ARS), generated by a neural network with 10 hidden nodes trained by gradient descent on the cross-entropy error, outperformed the cross-validated predictions of a logistic regressor (ARS[LR]). The performances of selected atom-level descriptors were also evaluated. The accuracy of the reactivity model exceeded that of πS(r), DN(r), and DE(r), three commonly used reactivity indices.
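One plausible reading of the average site AUC metric is sketched below: for each reactive molecule, compute the fraction of (reactive atom, unreactive atom) pairs in which the reactive atom receives the higher score, then average over molecules. The tie handling (half credit), the skipping of molecules that lack either class of atom, and the function names are assumptions, not details given in the caption.

```python
def site_auc(scores, labels):
    """AUC within one molecule: P(reactive atom is ranked above unreactive atom)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        return None  # undefined without both kinds of atom (assumption: skip)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_site_auc(molecules):
    """Mean of the per-molecule AUCs over all molecules where it is defined."""
    aucs = [a for a in (site_auc(s, y) for s, y in molecules) if a is not None]
    return sum(aucs) / len(aucs)


# Two toy molecules: (atom scores, site-of-reactivity labels).
toy = [
    ([0.9, 0.1, 0.2], [1, 0, 0]),            # perfect ranking, AUC = 1.0
    ([0.3, 0.8, 0.5, 0.1], [1, 0, 1, 0]),    # half of the pairs ordered correctly
]
result = average_site_auc(toy)  # 0.75
```

Because the AUC is computed inside each molecule before averaging, large molecules with many atoms do not dominate the metric.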
Figure 4
Importance of specific descriptors to the atom reactivity model. A permutation sensitivity analysis quantified the importance of descriptors for the final trained atom reactivity model. This listing indicates the 12 most important descriptors in decreasing order of importance from top to bottom. The graph shows the drop in model performance after permuting each descriptor's values, averaged over 10 iterations. All but two of the top descriptors were topological; the remaining two were derived from a quantum simulation of the molecule structure.
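A generic permutation sensitivity analysis can be sketched as follows: shuffle one descriptor column at a time, re-score the fixed trained model, and record the average drop in performance. The toy model, the accuracy metric, and every name here are hypothetical stand-ins; only the idea of permuting each descriptor and averaging over 10 iterations comes from the caption.

```python
import random


def permutation_importance(predict, metric, rows, labels, n_iter=10, seed=0):
    """Average drop in `metric` after shuffling each descriptor column."""
    rng = random.Random(seed)
    baseline = metric([predict(r) for r in rows], labels)
    drops = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_iter):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between descriptor j and labels
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            total += baseline - metric([predict(r) for r in permuted], labels)
        drops.append(total / n_iter)
    return drops


def accuracy(preds, labels):
    return sum((p > 0.5) == bool(y) for p, y in zip(preds, labels)) / len(labels)


# Toy "trained model" that only looks at descriptor 0 and ignores descriptor 1.
def toy_predict(row):
    return row[0]


rows = [[1.0, 0.3], [0.0, 0.9], [1.0, 0.1], [0.0, 0.7],
        [1.0, 0.5], [0.0, 0.2], [1.0, 0.8], [0.0, 0.4]]
labels = [1, 0, 1, 0, 1, 0, 1, 0]
drops = permutation_importance(toy_predict, accuracy, rows, labels)
```

In this toy case the ignored descriptor shows a drop of exactly zero, while the descriptor the model relies on can only lose accuracy when shuffled; ranking descriptors by their drop gives a listing like the one in the figure.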
Figure 5
The reactivity model accurately identified reactive molecules. Several prediction methods were compared based on their ability to identify reactive molecules. The data set included 1484 molecules, 1213 of which are reactive with glutathione and 271 of which are not reactive but are structurally similar to reactive molecules. Performance was measured by computing the area under the ROC curve (molecule AUC). The best-performing approach computed an MRS using a logistic regressor based on molecule-level descriptors and the top five ARS within the molecule. Control models that used only atom-level information (MRS[Atom only] and max[ARS]) or only molecule-level information (MRS[Molecule only]) were less accurate. Similarly, the two published QSAR models (eqs 1 and 2), several molecule descriptors, and the atom descriptors were all less accurate than our reactivity model.
Figure 6
Importance of descriptors to molecular reactivity score. The weights of the final model nodes revealed the relative contribution of individual descriptors to model performance. The descriptors were normalized before training so that the magnitude of the weights directly measured the importance of each descriptor. The values for the five ARS were included, as well as those of selected reactivity indices. As we would hope, a significant weight was placed on all of the ARS descriptors. Moreover, the qualitative contributions of specific quantum descriptors were within expectations based on frontier orbital theory. This analysis increased confidence in the ability of the model to sensibly generalize toward external data.
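The reason normalization makes weight magnitudes comparable can be illustrated with a small sketch. The caption does not say which normalization was used; z-score standardization is assumed here, and the descriptor names and weights in the example are made up.

```python
import statistics


def standardize_columns(rows):
    """Rescale every descriptor column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]


def rank_by_weight(names, weights):
    """Order descriptors by the absolute size of their fitted weight."""
    return sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)


# Two descriptors carrying the same information on very different scales.
raw = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled = standardize_columns(raw)

# Hypothetical weights from a model fitted on standardized descriptors.
ranking = rank_by_weight(["ARS_1", "descriptor_A", "descriptor_B"],
                         [1.8, -0.4, 0.9])
```

After standardization the two columns become identical, so a fitted weight reflects how strongly the model uses a descriptor rather than the units it happened to be measured in.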
Figure 7
Molecule reactivity scores correlated with glutathione reactivity and toxicity of substituted quinones. The model molecule reactivity scores (MRS) correlated closely with hepatocyte toxicity (LC50, top graph) and the rate of reactivity with GSH (kGSH, bottom graph) of 10 substituted p-benzoquinones. The left panel illustrates all 10 test molecules and sorts them by MRS computed by a model trained without using these molecules. For each molecule, the shading intensity represents atom reactivity scores (ARS), which range from 0 to 0.41. Circled atoms are labeled as reactive in our training data set.
Figure 8
Reactivity scores correlated with the nucleophile reactivity of structurally diverse contact allergens. Model performance was assessed using an external data set with 38 molecules. The y axis is the percent depletion of GSH after 15 min incubation with each molecule. The x axis indicates molecule reactivity scores (MRS). For test molecules present in the training data set, the appropriate cross-validated predictions were extracted. The significance of Pearson correlation is reflected by the p-values. To the left, six example molecules are visualized with scaled ARS (which range from 0 to 0.43) and are sorted by MRS, which correspond to the data points marked with an × in the right panel plot. The six corresponding ×’s are in the same horizontal order as that of the visualized molecules. MRS significantly correlated with GSH reactivity.
Figure 9
Molecule reactivity scores distinguished drugs and their reactive metabolites. For each molecule, the shading represents atom reactivity scores (ARS), which ranged from 0 to 0.768. The structures of trimethoprim and felbamate are shown alongside their reactive metabolites and subsequent GSH conjugates. Circled atoms are labeled as reactive in our training data set. For molecules present in the training set, cross-validated predictions are displayed. In these cases, the predictions are obvious to an organic chemist, but they illustrate key points of the method’s behavior. First, it can distinguish accurately between very similar molecules that are reactive and nonreactive. Second, in the current form, it cannot predict how metabolism gives rise to reactive species.
Figure 10
Atom reactivity scores accurately identified nonobvious sites of reactivity. The right panel displays 18 structural motifs known to be reactive. The left panel displays reactive molecules drawn from our training data that do not contain these specific substructures. The molecules are sorted by molecule reactivity scores, which range from 0.78 to 0.99. The shading intensity represents scaled atom reactivity scores (ARS), which range from 0 to 0.78. Our model accurately identified reactive molecules that do not match commonly used structural alerts.

