Beilstein J Org Chem. 2023 Dec 5:19:1825-1831.
doi: 10.3762/bjoc.19.134. eCollection 2023.

GlAIcomics: a deep neural network classifier for spectroscopy-augmented mass spectrometric glycans data

Thomas Barillot et al. Beilstein J Org Chem. 2023.

Abstract

Carbohydrate sequencing is a formidable task identified as a strategic goal in modern biochemistry. It relies on identifying a large number of isomers and their connectivity with high accuracy. Recently, gas-phase vibrational laser spectroscopy combined with mass spectrometry has been proposed as a very promising sequencing approach. However, its use as a generic analytical tool depends on the development of recognition techniques that can analyse complex vibrational fingerprints for a large number of monomers. In this study, we used a Bayesian deep neural network model to automatically identify and classify vibrational fingerprints of several monosaccharides. We report high performance of the trained algorithm (GlAIcomics), which can be used to distinguish contaminants and identify a molecule with a high degree of confidence. This opens the possibility of using artificial intelligence in combination with spectroscopy-augmented mass spectrometry for carbohydrate sequencing and glycomics applications.

Keywords: Bayesian neural network; IR; deep learning; glycomics; spectroscopy.
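
As an illustration of how a Bayesian classifier can report confidence, the sketch below summarises repeated stochastic predictions for one spectrum by their mean and their 5% to 95% interpercentile range, the two quantities shown in Figure 5. It is a minimal sketch, not the GlAIcomics implementation: the function names, the input layout (one list of class probabilities per draw), and the thresholds `min_mean` and `max_spread` are assumptions made for the example, and the class list is taken from the Figure 2 caption.

```python
import statistics

# The four categories named in the Figure 2 caption.
CLASSES = ["GalN", "GlcN", "GlcNAc", "ManN"]


def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def summarize(samples):
    """Summarise sampled predictions for one spectrum.

    samples: list of draws, each a list with one probability per class.
    Returns {class_name: (mean_probability, p5_to_p95_range)}.
    """
    summary = {}
    for i, name in enumerate(CLASSES):
        column = [draw[i] for draw in samples]
        spread = percentile(column, 95) - percentile(column, 5)
        summary[name] = (statistics.fmean(column), spread)
    return summary


def classify(samples, min_mean=0.9, max_spread=0.2):
    """Return the most probable class, or None if the call is not confident.

    min_mean and max_spread are illustrative thresholds (assumptions),
    not values reported in the paper.
    """
    summary = summarize(samples)
    best = max(summary, key=lambda name: summary[name][0])
    mean, spread = summary[best]
    return best if mean >= min_mean and spread <= max_spread else None
```

Returning `None` for a low mean or a wide range is one simple way to flag a spectrum that matches no reference category, such as a contaminant; other decision rules are equally possible.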


Figures

Figure 1
(a) Fingerprint of an unknown monosaccharide. (b) Labelled reference spectra of monosaccharide standards.
Figure 2
Typical experimental MS–IR spectra of the four categories of monosaccharides included in the first dataset. Blue: GalN; orange: GlcN; green: GlcNAc; red: ManN.
Figure 3
Synthetic IRMPD spectrum (grey trace) generated from a high-resolution endogenous experimental spectrum of GlcN (black trace) from dataset 1 using additional white noise: 10%; linear signal amplitude modulation: 5%; downsampling coefficient: 2; wavenumber shift: +9 cm−1. The orange trace corresponds to a low-resolution exogenous GlcN spectrum from dataset 2.
Figure 4
Dependence of model accuracy on experimental conditions, represented by the dataset augmentation parameters.
Figure 5
DNN prediction results for the third endogenous dataset (5 hexosamine samples and 7 other molecules). The middle map shows the mean prediction probabilities for each category, and the right-hand map shows the 5% to 95% interpercentile range of the prediction probability distributions for each category.
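
The four augmentation parameters listed in the Figure 3 caption (white noise, linear amplitude modulation, downsampling coefficient, wavenumber shift) can be sketched as a single transformation of a reference spectrum. This is a hedged sketch, not the authors' code: the function `augment` is hypothetical, and the exact meaning of each parameter is an assumption. Here the noise is Gaussian with a standard deviation given as a fraction of the peak intensity, the modulation is a random linear gain ramp across the spectrum, and downsampling keeps every k-th point.

```python
import random


def augment(wavenumbers, intensities, noise=0.10, modulation=0.05,
            downsample=2, shift=9.0, seed=None):
    """Generate one synthetic spectrum from a reference spectrum.

    noise:      Gaussian noise std as a fraction of the peak intensity (assumed).
    modulation: maximum slope of a linear gain ramp across the spectrum (assumed).
    downsample: keep every `downsample`-th point.
    shift:      constant offset added to the wavenumber axis, in cm^-1.
    """
    rng = random.Random(seed)
    n = len(intensities)
    # Fall back to 1.0 so an all-zero spectrum does not give zero noise scale.
    peak = max(abs(y) for y in intensities) or 1.0
    # One random slope per synthetic spectrum, within +/- modulation.
    slope = rng.uniform(-modulation, modulation)

    out_x, out_y = [], []
    for i in range(0, n, downsample):
        t = i / (n - 1) if n > 1 else 0.0        # position along the axis, 0..1
        gain = 1.0 + slope * (2.0 * t - 1.0)     # ramps from 1 - slope to 1 + slope
        y = intensities[i] * gain + rng.gauss(0.0, noise * peak)
        out_x.append(wavenumbers[i] + shift)
        out_y.append(y)
    return out_x, out_y
```

Calling `augment` repeatedly with different seeds and parameter values would yield a family of synthetic spectra per reference, which is the kind of variation Figure 4 relates to model accuracy.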

