Front Mol Biosci. 2022 Jul 22:9:917911.

doi: 10.3389/fmolb.2022.917911. eCollection 2022.

Graph Properties of Mass-Difference Networks for Profiling and Discrimination in Untargeted Metabolomics

Francisco Traquete¹, João Luz¹, Carlos Cordeiro¹, Marta Sousa Silva¹, António E N Ferreira¹

Affiliation

¹ Laboratório de FT-ICR e Espectrometria de Massa Estrutural, MARE-Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.

PMID: 35936789
PMCID: PMC9353772
DOI: 10.3389/fmolb.2022.917911

Francisco Traquete et al. Front Mol Biosci. 2022.

Abstract

Untargeted metabolomics seeks to identify and quantify most metabolites in a biological system. In general, metabolomics results are represented as numerical matrices containing the intensities of the detected variables. These matrices are subsequently analyzed by methods that seek to extract significant biological information from the data. In mass spectrometry-based metabolomics, if mass is detected with sufficient accuracy (below 1 ppm), it is possible to derive mass-difference networks, which have spectral features as nodes and chemical changes as edges. These networks have previously been used as a means to assist formula annotation and to rank the importance of chemical transformations. In this work, we propose a novel role for such networks in untargeted metabolomics data analysis: we demonstrate that their properties as graphs can also be used as signatures for metabolic profiling and class discrimination. For several benchmark examples, we computed six graph properties and found that the degree profile was consistently the property that allowed the best performance of several clustering and classification methods, reaching levels competitive with the performance obtained using intensity data matrices and traditional pretreatment procedures. Furthermore, we propose two new metrics for ranking chemical transformations, derived from network properties, which can be applied to sample comparison or clustering. These metrics illustrate how the graph properties of mass-difference networks can highlight aspects of the information contained in the data that are complementary to the information extracted from intensity-based data analysis.

Keywords: Fourier transform mass spectrometry; graph properties; mass-difference networks; metabolomics data analysis; untargeted metabolomics.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**FIGURE 1**
The concept of mass-difference networks (MDiNs). In this four-node example, neutral mass values (in Da) obtained from a mass spectrometry analysis are represented as nodes, connected with mass differences associated with particular mass-difference-based building blocks (MDBs). Δm, mass difference (in Da).
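
The idea in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the MDB subset and its monoisotopic masses, the function names `build_mdin` and `degree_profile`, the choice to scale the ppm tolerance by the heavier mass, and the four example masses are all hypothetical.

```python
from itertools import combinations

# Hypothetical subset of MDBs with monoisotopic masses in Da (illustration only).
MDBS = {
    "H2": 2.015650,
    "CH2": 14.015650,
    "O": 15.994915,
    "H2O": 18.010565,
    "CO2": 43.989829,
}


def build_mdin(masses, mdbs=MDBS, ppm=1.0):
    """Link every pair of masses whose difference matches an MDB within `ppm`.

    Returns the sorted masses (the nodes) and a list of edges
    (index_low, index_high, mdb_name).
    """
    nodes = sorted(masses)
    edges = []
    for (i, m_low), (j, m_high) in combinations(enumerate(nodes), 2):
        delta = m_high - m_low
        # Assumption: absolute tolerance is ppm of the heavier mass.
        tolerance = ppm * 1e-6 * m_high
        for name, mdb_mass in mdbs.items():
            if abs(delta - mdb_mass) <= tolerance:
                edges.append((i, j, name))
    return nodes, edges


def degree_profile(n_nodes, edges):
    """Count how many edges touch each node."""
    degree = [0] * n_nodes
    for i, j, _ in edges:
        degree[i] += 1
        degree[j] += 1
    return degree


# Four synthetic neutral masses: three related by O, H2 and H2O, one isolated.
example_masses = [180.063388, 196.058303, 198.073953, 250.5]
nodes, edges = build_mdin(example_masses)
degrees = degree_profile(len(nodes), edges)
```

In this toy network the first three masses are mutually connected and the fourth has no neighbors, so its degree is zero. A per-sample vector of such degrees is the kind of signature the abstract refers to as the degree profile.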

**FIGURE 2**
Mass-difference network built from the YD dataset. The inset is a close-up of the selected rectangle in the populated area of the largest network component. Edge colors represent each MDB: O(–NH), NH₃(–O), H₂, CH₂, O, H₂O, NCH, CO, CHOH, S, CH₂O, CONH, CO₂, SO₃, PO₃H, CHCOOH, and CCH₃COOH. Node background colors represent the node degree. Network representations were made with Cytoscape 3.8.1 (Shannon et al., 2003).


**FIGURE 3**
Effect of IDT and sMDiN graph property analysis on clustering performance. **(A)** Correct clustering in HCA; **(B)** discrimination distance in HCA; **(C)** correct first clustering in HCA; **(D)** correct clustering in K-means clustering; **(E)** discrimination distance in K-means clustering; **(F)** adjusted Rand Index in K-means clustering. Methods are as follows: intensity-based data pretreatment (IDT); network analysis: degree analysis (degree), betweenness centrality analysis (betweenness), closeness centrality analysis (closeness), MDB impact (MDBI), weighted MDB impact (WMDBI), and GCD-11 topology analysis (GCD11).

**FIGURE 4**
Classification performance of models developed from IDT-treated data or sMDiN graph property methods. **(A)** Performance of random forest (RF) models; **(B)** performance of projection to latent structures discriminant analysis (PLS-DA) models. For all datasets except HD, accuracy was estimated by 20 iterations of internal three- or fivefold stratified cross-validation, with the error bars representing the standard deviation of the accuracy. For the HD dataset, accuracy was estimated on a test set resulting from a stratified random 70/30% train/test split. Methods are as follows: intensity-based data pretreatment (IDT); network analysis: degree analysis (degree), betweenness centrality analysis (betweenness), closeness centrality analysis (closeness), MDB impact (MDBI), weighted MDB impact (WMDBI), and GCD-11 topology analysis (GCD11).

**FIGURE 5**
MDBI and WMDBI values for the sMDiNs of the YD dataset. **(A)** MDB impact; **(B)** weighted MDB impact. Values were mean-centered and standard scaled. MDBs are ordered by decreasing Gini importance. Samples are triplicates of five yeast strains: the wild-type reference strain (WT) and four isogenic single-gene deletion mutants of this strain, ΔGLO1, ΔGLO2, ΔGRE3, and ΔENO1. MDBs are listed in Table 2. Samples were clustered by HCA, with Euclidean distance and Ward linkage.
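
The mean-centering and standard scaling mentioned in this caption can be illustrated as follows. This is a minimal sketch, not the authors' code: the helper name `standard_scale`, the use of the population standard deviation, the handling of constant columns, and the small table of counts are assumptions made for the example.

```python
from statistics import mean, pstdev


def standard_scale(matrix):
    """Mean-center each column and divide by its standard deviation.

    Rows are samples and columns are features (here, hypothetical MDB values).
    Assumption: population standard deviation; constant columns map to 0.0.
    """
    scaled_columns = []
    for column in zip(*matrix):
        mu = mean(column)
        sd = pstdev(column)
        scaled_columns.append(
            [(value - mu) / sd if sd > 0 else 0.0 for value in column]
        )
    # Transpose back so that rows are samples again.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical values for three samples (rows) and two MDBs (columns).
counts = [
    [10, 4],
    [12, 4],
    [14, 4],
]
scaled = standard_scale(counts)
```

After scaling, each non-constant column has zero mean and unit variance, so MDBs with very different raw magnitudes contribute on a comparable scale to a Euclidean distance such as the one used for the HCA.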

Cited by

A Strategy for Uncovering the Serum Metabolome by Direct-Infusion High-Resolution Mass Spectrometry.
Sun X, Jia Z, Zhang Y, Zhao X, Zhao C, Lu X, Xu G. Metabolites. 2023 Mar 22;13(3):460. doi: 10.3390/metabo13030460. PMID: 36984900. Free PMC article.
