Molecules. 2024 Jul 1;29(13):3137.
doi: 10.3390/molecules29133137.

Can Graph Machines Accurately Estimate 13C NMR Chemical Shifts of Benzenic Compounds?

François Duprat et al. Molecules. 2024.

Abstract

In the organic laboratory, recording the 13C nuclear magnetic resonance (NMR) spectrum of a newly synthesized compound remains an essential step in elucidating its structure. Interpreting such a spectrum, which is a set of chemical-shift values, is easier when the chemist has a tool that predicts, with sufficient accuracy, the carbon shifts of the structure to be prepared. As there are few open-source methods for accurately estimating this property, we applied our graph-machine approach to build models capable of predicting the chemical shifts of carbons. For this study, we focused on benzenic compounds, building an optimized model by training on a database of 10,577 chemical shifts originating from 2026 structures that contain up to ten types of non-carbon atoms, namely H, O, N, S, P, Si, and the halogens. The model gives a training root-mean-squared relative error (RMSRE) of 0.5%, i.e., a root-mean-squared error (RMSE) of 0.6 ppm, and a mean absolute error (MAE) of 0.4 ppm for estimating the chemical shifts of these roughly 10,000 carbons. The predictive capability of the graph-machine model is also compared with that of three commercial packages on a dataset of 171 original benzenic structures (1012 chemical shifts). On this dataset, the graph-machine model predicts chemical shifts with an RMSE of 0.9 ppm, which compares favorably with the RMSEs of 3.4, 1.8, and 1.9 ppm computed with the ChemDraw v. 23.1.1.3, ACD v. 11.01, and MestReNova v. 15.0.1-35756 packages, respectively. Finally, a Docker-based tool is proposed to predict the carbon chemical shifts of benzenic compounds solely from their SMILES codes.
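
The three error measures quoted in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out its formulas, so the definitions below are the usual ones and are an assumption; in particular, the relative error is taken here as the residual divided by the measured shift. The function name `shift_errors` and the sample shift values are hypothetical and are not data from the study.

```python
import math


def shift_errors(measured, predicted):
    """Return (rmse, mae, rmsre) for paired chemical shifts in ppm.

    Assumed definitions (not stated in the abstract):
      RMSE  = sqrt(mean((measured - predicted)^2))
      MAE   = mean(|measured - predicted|)
      RMSRE = sqrt(mean(((measured - predicted) / measured)^2))
    """
    if not measured or len(measured) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    residuals = [m - p for m, p in zip(measured, predicted)]
    n = len(residuals)
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mae = sum(abs(r) for r in residuals) / n
    rmsre = math.sqrt(sum((r / m) ** 2 for r, m in zip(residuals, measured)) / n)
    return rmse, mae, rmsre


# Hypothetical shifts (ppm) for three carbons, for illustration only.
measured = [128.5, 137.8, 21.4]
predicted = [129.0, 137.0, 21.0]
rmse, mae, rmsre = shift_errors(measured, predicted)
print(f"RMSE = {rmse:.2f} ppm, MAE = {mae:.2f} ppm, RMSRE = {100 * rmsre:.2f}%")
```

Because the RMSE squares each residual, a few badly predicted carbons raise it more than they raise the MAE, which is consistent with the RMSE being the larger of the two values reported above.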

Keywords: Docker; chemical shift; graph machines (GM); machine learning; structured data.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Scatter plot of 13C chemical-shift estimations computed by graph machine from SMILES (node function with 26 hidden neurons) for the 1637 compounds of the training set (blue circles) and the 114 compounds of the test set (red filled circles) vs. measured values of the chemical shift. The black line is the bisector of the plot.
Figure 2
Scatter plot of 13C chemical-shift predictions computed by graph machines (red disks) and MestReNova software v. 15.0.1-35756 (blue circles) vs. measured shift values for the 171 molecules of the test set. The black line is the bisector of the plot, and the dashed red and blue lines are the regression lines for the GM and MestReNova plots.
Figure 3
Structures of training (*) or test set molecules with at least one carbon whose shift estimated with the GM26 model shows a large deviation (experimental minus estimated, in ppm). Shifts of blue carbons are underestimated, while shifts of red carbons are overestimated.
Figure 4
Deviations encountered in the prediction of carbon shifts of three molecules outside the scope of the GM26 model (experimental minus estimated, in ppm). Shifts of blue carbons are underestimated, while shifts of red carbons are overestimated.
Figure 5
Graph-machine construction process for two carbons of 2-methoxytoluene: (ⓐ) conversion of the 2D structure of 2-methoxytoluene into a cyclic graph, (ⓑ) construction of the two directed acyclic graphs for the carbons marked in red, and (ⓒ) generation of the corresponding graph machines.
Figure 6
Distribution of functional carbons (as percentages) for the molecules of the training (blue bars) and test (bistre-colored bars) sets.
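
The first two steps described in the Figure 5 caption, turning the molecular graph into a directed acyclic graph rooted at one carbon, can be sketched in Python as follows. This is only an illustration under stated assumptions: the caption does not say how the ring is opened, so the sketch uses a breadth-first spanning tree, which drops one ring bond and points every remaining edge toward the root. The atom labels, `BONDS`, `rooted_dag`, and `depth` are hypothetical names, hydrogens are omitted, and the final step (generating the graph machine itself) is not shown.

```python
from collections import deque

# Heavy-atom graph of 2-methoxytoluene; atom labels are hypothetical.
BONDS = [
    ("C1", "C2"), ("C2", "C3"), ("C3", "C4"),
    ("C4", "C5"), ("C5", "C6"), ("C6", "C1"),  # benzene ring
    ("C1", "C7"),                              # methyl carbon on C1
    ("C2", "O8"), ("O8", "C9"),                # methoxy group on C2
]


def build_adjacency(bonds):
    """Undirected adjacency lists from a list of bonded atom pairs."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    return adjacency


def rooted_dag(bonds, root):
    """Return {atom: parent}, each edge pointing one step toward `root`.

    Breadth-first search keeps one bond per newly reached atom, so the
    bond that would close the ring is left out (assumed cycle-opening rule).
    """
    adjacency = build_adjacency(bonds)
    parent = {root: None}
    queue = deque([root])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in parent:
                parent[neighbour] = atom
                queue.append(neighbour)
    return parent


def depth(parent, atom):
    """Number of directed edges between `atom` and the root."""
    steps = 0
    while parent[atom] is not None:
        atom = parent[atom]
        steps += 1
    return steps


# One directed graph per carbon of interest: each choice of root gives a new graph.
for root in ("C1", "C7"):
    dag = rooted_dag(BONDS, root)
    edges = sorted((child, par) for child, par in dag.items() if par is not None)
    print(root, edges)
```

Each choice of root yields a different directed graph over the same atoms, which matches the caption's point that two separate graphs are built for the two carbons marked in red.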
