Nat Commun. 2023 Jun 8;14(1):3309.
doi: 10.1038/s41467-023-38382-z.

Data fusion and multivariate analysis for food authenticity analysis

Yunhe Hong et al. Nat Commun.

Abstract

A mid-level data fusion coupled with multivariate analysis approach is applied to dual-platform mass spectrometry data sets using Rapid Evaporative Ionization Mass Spectrometry and Inductively Coupled Plasma Mass Spectrometry to determine the correct classification of salmon origin and production methods. Salmon (n = 522) from five different regions and two production methods are used in the study. The method achieves a cross-validation classification accuracy of 100% and all test samples (n = 17) have their origins correctly determined, which is not possible with single-platform methods. Eighteen robust lipid markers and nine elemental markers are found, which provide robust evidence of the provenance of the salmon. Thus, we demonstrate that our mid-level data fusion - multivariate analysis strategy greatly improves the ability to correctly identify the geographical origin and production method of salmon, and this innovative approach can be applied to many other food authenticity applications.
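The abstract does not spell out the fusion step, so the following is only a minimal sketch of what mid-level data fusion generally means: features are extracted from each platform separately (here PCA scores) and then concatenated into one matrix for modelling. The autoscaling step, the function names `pca_scores` and `mid_level_fuse`, the component counts, and the random matrices standing in for REIMS and ICP-MS data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def pca_scores(X, n_components):
    """Return the first n_components PCA scores of X (rows = samples)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)            # mean-centre each variable
    std = Xc.std(axis=0)
    std[std == 0] = 1.0                # guard against constant variables
    Xs = Xc / std                      # autoscaling: an assumption, not from the paper
    U, S, _ = np.linalg.svd(Xs, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def mid_level_fuse(blocks, n_components):
    """Concatenate per-block PCA scores column-wise (mid-level fusion)."""
    return np.hstack([pca_scores(X, k) for X, k in zip(blocks, n_components)])

rng = np.random.default_rng(0)
reims = rng.normal(size=(30, 200))     # hypothetical: 30 samples x 200 m/z bins
icpms = rng.normal(size=(30, 20))      # hypothetical: 30 samples x 20 elements
fused = mid_level_fuse([reims, icpms], [5, 3])
print(fused.shape)  # (30, 8)
```

The fused matrix would then be passed to a classifier in place of either single-platform data set.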

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. REIMS lipidomic fingerprints of Alaskan salmon, Icelandic salmon, Norwegian salmon, and Scottish salmon reveal distinct differences amongst the classes.
a PCA score plot of Alaskan, Icelandic, Norwegian, and Scottish salmon. Intra-group differences were seen in the PCA model for the Icelandic group (light blue dots). PC1 and PC3 are shown for clarity; PC1 accounted for 38.37% and PC3 for 15.26% of the total explained variance. b PC1 and PC3 loading plot for the four salmon groups. c PC2 loading plot for the four salmon groups; PC2 accounted for 24.0% of the total explained variance. d PCA score plot of Icelandic farmed and Icelandic wild salmon. e PC1 and PC2 loading plot for Icelandic wild and farmed salmon.
Fig. 2. Main effects of lipid differences on salmon geographical identification.
a Histogram of lipid biomarkers in Alaskan salmon, Icelandic farmed salmon, Icelandic wild salmon, Norwegian salmon, and Scottish salmon. b PCA score plot and c LDA plot of REIMS spectral data (m/z 200–1200) obtained from the five salmon groups. For the mass spectral fingerprints of the five groups, see Supplementary Fig. S1.
Fig. 3. Differential element analysis of Alaskan salmon, Icelandic farmed salmon, Icelandic wild salmon, Norwegian salmon, and Scottish salmon.
a PCA score plot of the elements identified in the five salmon groups. b OPLS-DA for discrimination of salmon geographical origins. c Heatmap of Alaskan salmon, Icelandic farmed salmon, Icelandic wild salmon, Norwegian salmon, and Scottish salmon; the 20 elements are indicated above the heatmap.
Fig. 4. Pairwise comparison of elements in the five salmon groups.
The figure illustrates the significant variations in the levels of Li, B, V, Fe, Co, Zn, Se, As, and Cd across the five salmon groups. Wild salmon groups were found to exhibit elevated levels of Fe, Zn, Se, and Cd compared to farmed groups.
Fig. 5. The procedure of data fusion coupled to the chemometric model approach.
Data acquisition was carried out using REIMS and ICP-MS methods. Data fusion and modeling were then conducted. PLS-DA and OPLS-DA, identified as the optimal models in this research, proved to be effective for analyzing the traceability of salmon origin.
Fig. 6. Unsupervised salmon origin differentiation based on different data fusion strategies, and supervised learning parameter optimisation based on the mid-level data fusion strategy.
a PCA score plot of the five salmon groups after low-level data fusion with min-max normalisation. b PCA score plot of the five salmon groups after mid-level data fusion. c Cumulative explained variance plot of the ICP-MS principal components. d Cumulative explained variance plot of the REIMS principal components. e Evaluation of k for the k-NN model based on mid-level data fusion: k values between 1 and 20 were tested on different sub-datasets to find the optimal parameter, and k = 5 was chosen. f Cumulative R2 and Q2 per component for the PLS-DA model based on mid-level data fusion. Components 1–50 were computed for parameter optimisation, and 25 was determined to be the optimal number of components. g Effect of the number of predictors on the correct classification rate of the RF classifier: npredic values of 1–200 were tested for the five groups, and npredic = 15 was found to be the best value based on mid-level data fusion. h Effect of the number of trees on the correct classification rate of the RF classifier: Ntree = 500 was found to be the best value based on mid-level data fusion.
Fig. 7. PLS-DA and OPLS-DA models for salmon sample origin authenticity analysis based on the mid-level data fusion strategy.
a Original PLS-DA model plot created using 522 salmon samples. b Sample origin authenticity analysis using the PLS-DA model (6 replicates of each sample). c Original OPLS-DA 3D plot. d OPLS-DA model showing the results of salmon origin authenticity identification (6 replicates of each sample). Panels b and d show that a sample labelled “Norway” (light blue group) was classified into the yellow “Scotland” group.
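The captions describe scanning k from 1 to 20 to tune the k-NN classifier but do not state the validation scheme. As a rough illustration only, the sketch below scores each k by leave-one-out cross-validation with Euclidean distance and majority voting; the validation scheme, the distance metric, the helper `loo_knn_accuracy`, and the synthetic three-class data are assumptions, not the procedure used in the study.

```python
import numpy as np

def loo_knn_accuracy(X, y, k):
    """Leave-one-out accuracy of a Euclidean k-NN majority-vote classifier."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all samples
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)        # a sample may not vote for itself
    correct = 0
    for i in range(len(X)):
        nearest = np.argsort(d[i])[:k]
        labels, counts = np.unique(y[nearest], return_counts=True)
        correct += int(labels[np.argmax(counts)] == y[i])
    return correct / len(X)

rng = np.random.default_rng(1)
# Hypothetical fused features: 3 classes, 20 samples each, 8 features
centres = rng.normal(scale=3.0, size=(3, 8))
y = np.repeat(np.arange(3), 20)
X = centres[y] + rng.normal(size=(60, 8))

# Scan k = 1..20, mirroring the range named in the Fig. 6e caption
accuracy = {k: loo_knn_accuracy(X, y, k) for k in range(1, 21)}
best_k = max(accuracy, key=accuracy.get)   # ties resolve to the smallest k
print(best_k, accuracy[best_k])
```

On real data the chosen k depends on the samples and the validation split, so the value printed here says nothing about the k = 5 reported in the caption.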
