J Cheminform. 2023 Sep 23;15(1):87.
doi: 10.1186/s13321-023-00741-9.

Mass-Suite: a novel open-source python package for high-resolution mass spectrometry data analysis

Ximin Hu et al. J Cheminform.

Abstract

Mass-Suite (MSS) is a Python-based, open-source software package for analyzing high-resolution mass spectrometry (HRMS)-based non-targeted analysis (NTA) data, particularly for water quality assessment and other environmental applications. MSS provides flexible, user-defined workflows for HRMS data processing and analysis, including both basic functions (e.g., feature extraction, data reduction, feature annotation, data visualization, and statistical analyses) and advanced exploratory data mining and predictive modeling capabilities that are not provided by currently available open-source software (e.g., unsupervised clustering analyses and a machine learning-based source tracking and apportionment tool). As a key advance, most core MSS functions are supported by machine learning algorithms (e.g., clustering and predictive modeling algorithms) to improve their accuracy and/or efficiency. MSS reliability was validated with mixed chemical standards of known composition, yielding 99.5% feature extraction accuracy and ~52% overlap of extracted features relative to other open-source software tools. Example use cases with laboratory data are provided to illustrate MSS functionalities and demonstrate reliability. MSS expands the HRMS data analysis workflows available for water quality evaluation and environmental forensics, and is readily integrated with existing capabilities. Because MSS is open source, we anticipate further development of improved data analysis capabilities in collaboration with interested users.

Keywords: Mass spectrometry; Non-targeted analysis; Python; Source apportionment; Source tracking; Unsupervised machine learning.
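
The feature overlap reported in the abstract depends on how two feature lists are matched. The following is a minimal sketch of one common approach, assuming features are (m/z, retention time) pairs matched within a ppm tolerance and a retention time window. The function names, the tolerance values, and the example features are hypothetical; this is not the MSS API and not necessarily the matching procedure used in the paper.

```python
from typing import List, Tuple

# A feature is represented here as (m/z, retention time in minutes).
Feature = Tuple[float, float]


def match_features(
    a: List[Feature],
    b: List[Feature],
    ppm_tol: float = 10.0,  # assumed m/z tolerance, in parts per million
    rt_tol: float = 0.2,    # assumed retention time tolerance, in minutes
) -> List[Tuple[int, int]]:
    """Greedily pair each feature in `a` with the first unused feature in `b`
    that falls within both tolerances. Returns (index_in_a, index_in_b) pairs."""
    matched: List[Tuple[int, int]] = []
    used = set()
    for i, (mz_a, rt_a) in enumerate(a):
        for j, (mz_b, rt_b) in enumerate(b):
            if j in used:
                continue
            ppm_error = abs(mz_a - mz_b) / mz_a * 1e6
            if ppm_error <= ppm_tol and abs(rt_a - rt_b) <= rt_tol:
                matched.append((i, j))
                used.add(j)
                break
    return matched


def overlap_fraction(a: List[Feature], b: List[Feature], **tolerances: float) -> float:
    """Fraction of features in `a` that have a match in `b`."""
    if not a:
        return 0.0
    return len(match_features(a, b, **tolerances)) / len(a)


# Hypothetical feature lists from two software tools (illustrative values only).
tool_a = [(200.1000, 5.00), (301.2000, 7.50), (450.3000, 9.10), (512.4000, 3.30)]
tool_b = [(200.1010, 5.05), (301.2100, 7.50), (450.3005, 9.00)]

print(overlap_fraction(tool_a, tool_b))
```

Greedy first-match pairing is the simplest choice; with dense feature lists, a nearest-neighbor or optimal assignment would give more stable results, and the overlap value changes with the tolerances chosen.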

Conflict of interest statement

There are no financial or non-financial competing interests.

Figures

Fig. 1
Overview of a typical MSS workflow for high-resolution mass spectrometry (HRMS) data analysis. Solid lines represent a typical workflow for HRMS non-targeted analysis (NTA) data processing; dashed lines represent additional optional workflows. All modules are optional.
Fig. 2
Comparisons of feature extraction outcomes for identical input samples. Panels A, B, and C show samples #505, #506, and #508, respectively, from the ENTACT study [73], processed with MSS, XCMS, and MSDIAL. Venn diagrams report the overlap of extracted features between platforms. Feature extraction was performed with parameters matched as closely as possible across platforms. Key peak extraction parameters for each platform are reported in Additional file 1: Table S2.
Fig. 3
Fold change (estimated vs. actual concentration) of source (roadway runoff) concentration estimates from a previous study [4] and from MSS model predictions. MSS predictions came from an ensemble random forest model trained on dilutions of roadway runoff source samples. One cluster of compounds (cluster label = 0, N = 587), prioritized by DBSCAN clustering analysis, was used to derive the estimates. The dashed line (fold change = 1) indicates a predicted concentration equal to the actual concentration.
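
The fold change plotted in Fig. 3 can be illustrated with a short sketch. It assumes fold change is the ratio of estimated to actual concentration, which is consistent with the caption's statement that a value of 1 means the prediction equals the actual concentration. The function name and the concentration values are hypothetical and are not taken from the paper.

```python
from statistics import median
from typing import List, Sequence


def fold_changes(estimated: Sequence[float], actual: Sequence[float]) -> List[float]:
    """Return estimated / actual for each sample.

    A value of 1.0 means the estimate equals the actual concentration;
    values above 1.0 are overestimates and values below 1.0 are underestimates.
    """
    if len(estimated) != len(actual):
        raise ValueError("estimated and actual must have the same length")
    result = []
    for est, act in zip(estimated, actual):
        if act <= 0:
            raise ValueError("actual concentrations must be positive")
        result.append(est / act)
    return result


# Hypothetical source concentrations (e.g., percent roadway runoff in a sample).
estimated = [5.0, 10.0, 40.0]
actual = [10.0, 10.0, 20.0]

fc = fold_changes(estimated, actual)
print(fc)          # per-sample fold change
print(median(fc))  # a simple summary across samples
```

Because fold change is a ratio, results like these are usually read on a logarithmic axis, where a twofold overestimate and a twofold underestimate sit at equal distances from the fold change = 1 line.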

References

    1. Wang Z, Walker GW, Muir DCG, Nagatani-Yoshida K. Toward a global understanding of chemical pollution: a first comprehensive analysis of national and regional chemical inventories. Environ Sci Technol. 2020;54:2575–2584. doi: 10.1021/acs.est.9b06379.
    2. Hollender J, Bourgin M, Fenner KB, et al. Exploring the behaviour of emerging contaminants in the water cycle using the capabilities of high resolution mass spectrometry. CHIMIA Int J Chem. 2014;68:793–798. doi: 10.2533/chimia.2014.793.
    3. Tian Z, Peter KT, Gipe AD, et al. Suspect and nontarget screening for contaminants of emerging concern in an urban estuary. Environ Sci Technol. 2020;54:889–901. doi: 10.1021/acs.est.9b06126.
    4. Peter KT, Wu C, Tian Z, Kolodziej EP. Application of nontarget high resolution mass spectrometry data to quantitative source apportionment. Environ Sci Technol. 2019;53:12257–12268. doi: 10.1021/acs.est.9b04481.
    5. Schollée JE, Bourgin M, von Gunten U, et al. Non-target screening to trace ozonation transformation products in a wastewater treatment train including different post-treatments. Water Res. 2018;142:267–278. doi: 10.1016/j.watres.2018.05.045.