Bioinformatics. 2022 Mar 28;38(7):2015-2021.
doi: 10.1093/bioinformatics/btac032.

massNet: integrated processing and classification of spatially resolved mass spectrometry data using deep learning for rapid tumor delineation

Walid M Abdelmoula et al. Bioinformatics. 2022.

Abstract

Motivation: Mass spectrometry imaging (MSI) provides rich biochemical information in a label-free manner and therefore holds promise to substantially impact current practice in disease diagnosis. However, the complex nature of MSI data poses computational challenges in its analysis. The complexity of the data arises from its large size, high dimensionality and spectral nonlinearity. Preprocessing, including peak picking, has been used to reduce raw data complexity; however, peak picking is sensitive to parameter selection that, perhaps prematurely, shapes the downstream analysis for tissue classification and the ensuing biological interpretation.

Results: We propose a deep learning model, massNet, that provides the desired qualities of scalability, nonlinearity and speed in MSI data analysis. This deep learning model was used, without prior preprocessing and peak picking, to classify MSI data from a mouse brain harboring a patient-derived tumor. The massNet architecture automatically learns predictive features, and automated methods were incorporated to identify peaks with potential for tumor delineation. The model's performance was assessed using cross-validation, and the results demonstrate higher accuracy and a substantial gain in speed compared to a support vector machine, an established classical machine learning method.

Availability and implementation: https://github.com/wabdelmoula/massNet. The data underlying this article are available in the NIH Common Fund's National Metabolomics Data Repository (NMDR) Metabolomics Workbench under project id (PR001292) with http://dx.doi.org/10.21228/M8Q70T.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Tissue sections from five intracranial GBM PDX models were divided into training/validation and testing sets: (a) schematic distribution of tissue sections from different GBM models, (b) annotated tumor regions in the MSI datasets which were guided by the H&E annotations (c)
Fig. 2.
Deep learning-based architecture of massNet for probabilistic two-class classification of large-scale MSI data without prior preprocessing and peak picking. The artificial neural network is based on spectral-wise analysis and consists of two modules: a VAE for nonlinear manifold learning, which is captured at the ‘Code’ layer, and two fully connected layers that take input from the ‘Code’ layer to yield probabilistic predictions at the output layer using the sigmoid activation. massNet is regularized with batch normalization and dropout to stabilize learning and speed up optimization.
Fig. 3.
Performance of the VAE module and nonlinear data visualization: overlay of the TIC-normalized average spectrum of original and reconstructed data for both training (a) and testing (c) datasets. UMAP visualization of the five-dimensional latent variable captured by the VAE model reveals a distinction between normal and tumor mass spectra from different GBM models for both training (b) and testing (d) datasets.
Fig. 4.
Classification performance on the MALDI FT-ICR MSI test set: (a) ROC curve distribution for both normal (blue) and tumor (orange) classes with an AUC of 99.54% and 99.55%, respectively. (b) Confusion matrix showing the prediction performance compared to the ground truth labels
Fig. 5.
Spatial mapping of the classification predictions and multimodal integration: spatial distribution of the spectral-wise probabilistic predictions for normal (a) and tumor (b) classes. (c) Closeup visualization of the spatially mapped tumor prediction scores reveals a higher level of uncertainty at the interface between normal and tumor (i.e. tumor margins). (d) The H&E microscopy images show tumor regions in different GBM models (columns). (e) Multimodal integration of the H&E images and ion image at m/z 480.9211 ± 0.01 which is highly correlated and elevated in the tumor region
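
The captions for Figs. 2 and 3 mention TIC normalization, a five-dimensional 'Code' layer, and two fully connected layers ending in a sigmoid output. The sketch below illustrates those ideas in plain Python and is not the authors' implementation (see the repository linked under Availability for that). The function names, the hidden-layer ReLU activation, the hidden-layer size, and all weights are hypothetical choices made for illustration; the VAE encoder, batch normalization, and dropout are omitted.

```python
import math


def tic_normalize(intensities):
    """Total ion current (TIC) normalization: scale a spectrum to sum to 1."""
    total = sum(intensities)
    if total == 0:
        return list(intensities)  # leave an all-zero spectrum unchanged
    return [x / total for x in intensities]


def dense(inputs, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    # Assumption: the source does not state the hidden-layer activation.
    return [max(0.0, v) for v in values]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def classify_code(code, w1, b1, w2, b2):
    """Map a latent 'Code' vector to [P(normal), P(tumor)] via two dense layers."""
    hidden = relu(dense(code, w1, b1))
    return sigmoid(dense(hidden, w2, b2))


# Hypothetical example: a 5-dimensional code, 3 hidden units, 2 output units.
code = [0.2, -0.1, 0.4, 0.0, 0.3]
w1 = [[0.5, -0.2, 0.1, 0.0, 0.3],
      [-0.4, 0.1, 0.2, 0.6, -0.1],
      [0.0, 0.3, -0.5, 0.2, 0.4]]
b1 = [0.0, 0.1, -0.1]
w2 = [[1.0, -1.0, 0.5],
      [-1.0, 1.0, -0.5]]
b2 = [0.0, 0.0]

probabilities = classify_code(code, w1, b1, w2, b2)
print(probabilities)
```

Because each output unit passes through its own sigmoid, the two scores are independent values in (0, 1) and need not sum to 1. This fits the separate normal and tumor probability maps described for Fig. 5, although the exact output layout of the published model is an assumption here.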
