J Pharm Anal. 2025 Jan;15(1):101116.
doi: 10.1016/j.jpha.2024.101116. Epub 2024 Sep 26.

Integration of deep neural network modeling and LC-MS-based pseudo-targeted metabolomics to discriminate easily confused ginseng species

Meiting Jiang et al. J Pharm Anal. 2025 Jan.

Abstract

Metabolomics covers a wide range of applications in life sciences, biomedicine, and phytology. Data acquisition (to achieve high coverage and efficiency) and analysis (to pursue good classification) are two key segments of metabolomics workflows. Various chemometric approaches utilizing either pattern recognition or machine learning have been employed to separate different groups. However, insufficient feature extraction, inappropriate feature selection, overfitting, or underfitting leads to an insufficient capacity to discriminate plants that are easily confused. Using two ginseng varieties, namely Panax japonicus (PJ) and Panax japonicus var. major (PJvm), which contain similar ginsenosides, we integrated pseudo-targeted metabolomics and deep neural network (DNN) modeling to achieve accurate species differentiation. A pseudo-targeted metabolomics approach was optimized with respect to data acquisition mode, ion-pair generation, comparison between multiple reaction monitoring (MRM) and scheduled MRM (sMRM), and chromatographic elution gradient. In total, 1980 ion pairs were monitored within 23 min, allowing for the most comprehensive ginseng metabolome analysis. The established DNN model demonstrated excellent classification performance (in terms of accuracy, precision, recall, F1 score, area under the curve, and receiver operating characteristic (ROC)) using both the entire metabolome data and the feature-selection dataset, outperforming random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost), and multilayer perceptron (MLP). Moreover, DNNs were advantageous for automated feature learning, nonlinear modeling, adaptability, and generalization. This study confirmed the practicality of the established strategy for efficient metabolomics data analysis and reliable classification performance even when using small-volume samples. The established approach holds promise for plant metabolomics and is not limited to ginseng.

Keywords: Deep neural network; Ginseng; Liquid chromatography-mass spectrometry; Pseudo-targeted metabolomics; Species differentiation.
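
The abstract evaluates the classifiers by accuracy, precision, recall, and F1 score. As a minimal sketch of how these four metrics are defined for a two-class problem, the Python below computes them from true and predicted labels. The label encoding (0 for PJ, 1 for PJvm), the function name, and the example predictions are illustrative assumptions and are not taken from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    `positive` marks the class treated as the positive one
    (assumed here: 1 = PJvm, 0 = PJ; this encoding is hypothetical).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up example: 8 samples, one PJ and one PJvm sample misclassified.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With these made-up labels there are three true positives, three true negatives, one false positive, and one false negative, so all four metrics come out at 0.75. The area under the ROC curve additionally needs predicted scores rather than hard labels and is not covered by this sketch.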

Conflict of interest statement

The authors declare that there are no conflicts of interest. As a young editorial board member, Wenzhi Yang recused himself from all review processes related to this article to ensure the fairness and objectivity of the review.

Figures

Graphical abstract
Fig. 1
Technology roadmap for the strategy integrating deep neural network (DNN) modeling and liquid chromatography-mass spectrometry (LC-MS)-based pseudo-targeted metabolomics to discriminate easily confused herbal medicines. 2D-LC: two-dimensional LC; MRM: multiple reaction monitoring; 1D: one-dimensional; IM-QTOF: ion mobility quadrupole time-of-flight; MSE: full information tandem MS; HDMSE: high-definition MSE; DDA: data-dependent acquisition; HDDDA: high-definition DDA; QTrap: triple quadrupole-linear ion trap; EMS-IDA-EPI: enhanced MS scan-information dependent acquisition-enhanced product ion scan; sMRM: scheduled MRM; SPS: split-combine structure; RF: random forest; OPLS-DA: orthogonal partial least squares-discriminant analysis; SVM: support vector machine; XGBoost: extreme gradient boosting; MLP: multilayer perceptron; AUC: area under the curve; ROC: receiver operating characteristic.
Fig. 2
Architecture of the classification model based on deep neural network (DNN). SPS: split-combine structure; Conv1D: 1-dimensional convolution.
Fig. 3
Optimization and establishment of the pseudo-targeted metabolomics approach for differentiating the ginseng varieties. (A) Principal component analysis (PCA) score plots of the quality control (QC) metabolome data (QC1) obtained by four metabolic feature acquisition modes. (B, C) Stability of the QC1 metabolome data obtained by the four metabolic feature acquisition modes: coefficient of variation (CV) comparison of compounds (B) and relative standard deviation (RSD) comparison of compounds (C). (D) The response variation range of the same metabolome (QC1) determined by the four metabolic feature acquisition modes. (E) PCA score plots of the QC2 metabolome data obtained using five chromatographic gradients. LC-MS: liquid chromatography-mass spectrometry; PG: Panax ginseng; RG: red ginseng; 2DLC: two-dimensional LC; MRM: multiple reaction monitoring.
Fig. 4
Comparison of the algorithm principles and representative chromatograms acquired using multiple reaction monitoring (MRM) and scheduled MRM (sMRM). (A) Overlapped chromatograms of all 526 transitions using MRM and the magnified chromatograms of the signal-to-noise ratio for the MRM method. (B) Overlapped chromatograms of all 526 transitions using sMRM and the magnified chromatograms of the signal-to-noise ratio for the sMRM method. (C) Peaks obtained using sMRM (shown on the left) and MRM (on the right) representative of the five common subclasses of ginsenosides: protopanaxatriol (PPT)-type, oleanolic acid (OA)-type, protopanaxadiol (PPD)-type, ocotillol (OT)-type, and C-17 side-chain varied. 20-O-glu-Rf: 20-O-glucosylginsenoside Rf; Re: ginsenoside Re; noto-R1: notoginsenoside R1; Rg1: ginsenoside Rg1; 20(S)-Rh1-6′-acetate: 20(S)-ginsenoside Rh1-6′-acetate; chiku-IVa: chikusetsusaponin IVa; p-Rt1: pseudoginsenoside Rt1; Ro: ginsenoside Ro; Rd: ginsenoside Rd; Rc: ginsenoside Rc; Rb2: ginsenoside Rb2; Ra1: ginsenoside Ra1; 24(R)-p-F11: 24(R)-pseudoginsenoside F11.
Fig. 5
Assessment of the established pseudo-targeted metabolomics approach in terms of quantitative assay performance (the number on the y-axis refers to the number of multiple reaction monitoring (MRM) transitions falling within the defined variation range among the 300 transitions used for the statistical analysis). (A) Linearity scatter results. (B) Intra-day precision results. (C) Inter-day precision results. RSD: relative standard deviation.
Fig. 6
Holistic comparison between two congeneric ginseng varieties (Panax japonicus (PJ) and Panax japonicus var. major (PJvm)) based on the data of 659 metabolic features obtained by the developed pseudo-targeted metabolomics strategy. (A) Score plot of principal component analysis (PCA). (B) Permutation test plot (200 permutations). (C) Heatmap of the two ginseng varieties vs. 23 ginsenoside markers. (D) Multi-channel scheduled multiple reaction monitoring (sMRM) chromatograms showing the detection of representative ginsenoside markers. (E) Box charts illustrating the content difference for three marker compounds between the two ginseng varieties. QC: quality control; p-RP1: pseudoginsenoside RP1; p-Rt1: pseudoginsenoside Rt1; chiku-IV: chikusetsusaponin IV; Rg1: ginsenoside Rg1; pRT2: pseudoginsenoside RT2; p-F11: pseudoginsenoside F11; Rd2: ginsenoside Rd2; OA-GlurA-Glc-Xyl-Ace-3H2O: oleanolic acid-glucuronic acid-glucose-xylose-acetate-trihydrate; vina-R8: vinaginsenoside R8; Re: ginsenoside Re; Rf: ginsenoside Rf; noto-R2: notoginsenoside R2; Ro: ginsenoside Ro.
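
The stability and precision panels in Figs. 3 and 5 are expressed as relative standard deviation (RSD) and as counts of MRM transitions falling within a given variation range. As a rough sketch of that kind of summary, the Python below computes the RSD of each transition across repeated injections and bins the transitions by RSD. The bin edges (5%, 10%, 15%), the transition names, and the peak areas are hypothetical and are not values from the study.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def count_by_rsd_range(peak_areas, edges=(5.0, 10.0, 15.0)):
    """Count transitions per RSD bin.

    `peak_areas` maps a transition name to its peak areas across
    repeated injections. With edges (5, 10, 15) the bins are
    <5%, 5-10%, 10-15% and >=15% (edges are an assumption).
    """
    counts = [0] * (len(edges) + 1)
    for areas in peak_areas.values():
        rsd = rsd_percent(areas)
        # Bin index = number of edges the RSD has reached or passed.
        idx = sum(1 for e in edges if rsd >= e)
        counts[idx] += 1
    return counts


# Hypothetical peak areas for two transitions over three injections.
qc_areas = {
    "transition_A": [100.0, 102.0, 98.0],
    "transition_B": [100.0, 120.0, 80.0],
}
bin_counts = count_by_rsd_range(qc_areas)
```

In this made-up example the first transition has an RSD of 2% and the second an RSD of 20%, so one transition falls in the lowest bin and one in the highest.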
