Metabolomics. 2023 Feb 6;19(2):11.
doi: 10.1007/s11306-023-01974-3.

Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software

Xinsong Du et al. Metabolomics. 2023.

Abstract

Background: Liquid chromatography-high resolution mass spectrometry (LC-HRMS) is a popular approach for metabolomics data acquisition and requires many data processing software tools. The FAIR Principles - Findability, Accessibility, Interoperability, and Reusability - were proposed to promote open science and reusable data management, and to maximize the benefit obtained from contemporary and formal scholarly digital publishing. More recently, the FAIR principles were extended to include Research Software (FAIR4RS).

Aim of review: This study facilitates open science in metabolomics by providing an implementation solution for adopting FAIR4RS in LC-HRMS metabolomics data processing software. We believe our evaluation guidelines and results can help improve the FAIRness of research software.

Key scientific concepts of review: We evaluated 124 LC-HRMS metabolomics data processing software tools obtained from a systematic review and selected 61 of them for detailed evaluation using FAIR4RS-related criteria, which were extracted from the literature and refined through internal discussions. We assigned each criterion one or more FAIR4RS categories through discussion. The minimum, median, and maximum percentages of criteria fulfillment across software were 21.6%, 47.7%, and 71.8%, respectively. Statistical analysis revealed no significant improvement in FAIRness over time. We identified four criteria that covered multiple FAIR4RS categories but had a low %fulfillment: (1) no software had semantic annotation of key information; (2) only 6.3% of evaluated software were registered to Zenodo and received DOIs; (3) only 14.5% of selected software had official software containerization or a virtual machine; (4) only 16.7% of evaluated software had fully documented functions in code. Based on these results, we discuss improvement strategies and future directions.
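The fulfillment percentages reported above can be illustrated with a minimal sketch. It assumes each tool is scored as fulfilled or not fulfilled on every criterion and that all criteria apply to all tools; the tool names, scores, and function names below are hypothetical and are not taken from the study.

```python
from statistics import median

# Hypothetical evaluation matrix: tool name -> one boolean per criterion
# (True = criterion fulfilled). Names and values are illustrative only.
evaluations = {
    "tool_a": [True, False, True, True],
    "tool_b": [False, False, True, False],
    "tool_c": [True, True, True, False],
}


def percent_fulfilled(flags):
    """Return the percentage of criteria marked as fulfilled for one tool."""
    return 100.0 * sum(flags) / len(flags)


def percent_tools_meeting(evaluations, criterion_index):
    """Return the percentage of tools that fulfill a single criterion."""
    met = sum(flags[criterion_index] for flags in evaluations.values())
    return 100.0 * met / len(evaluations)


# Per-tool %fulfillment, then the minimum, median, and maximum across tools.
per_tool = {name: percent_fulfilled(flags) for name, flags in evaluations.items()}
values = sorted(per_tool.values())
summary = {"min": values[0], "median": median(values), "max": values[-1]}

if __name__ == "__main__":
    print(per_tool)
    print(summary)
    print(percent_tools_meeting(evaluations, 0))
```

The per-tool figure corresponds to the minimum, median, and maximum values quoted in the abstract, while the per-criterion figure corresponds to statements such as "only 6.3% of evaluated software".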

Keywords: FAIR principles; Liquid chromatography-mass spectrometry; Metabolomics; Open science; Reproducibility; Research software.

Conflict of interest statement

Declarations

Conflict of interest: The authors declare that they have no competing interests.

Figures

Fig. 1
Consort diagram for the literature review and the computational tool extraction. The screening had two phases: literature screening and tool screening. During literature screening, 1396 papers published in the past 5 years were obtained through keyword search, and 70 papers were finally included as relevant. In the tool screening phase, 122 potentially relevant software tools were extracted from the 70 papers. We added two software tools that were recommended by an expert but not mentioned in the 70 papers. All 124 software tools were reviewed in more detail by reading other resources available online, such as documentation and code repositories. Sixty-one software tools were finally considered eligible for the FAIR4RS review.
Fig. 2
Summary of basic results. Functions associated with each tool are displayed on the left side. Percentages of evaluation criteria that a software tool fulfilled are shown in the middle of the figure. The right panel indicates the percentage of criteria fulfillment of each tool. Additionally, names of software powered by AI techniques are highlighted in red, and names of closed-source software are highlighted in black.
Fig. 3
Line chart reflecting the criteria fulfillment of each category. The x-axis represents tool names, ranked by overall percentage of fulfillment from left to right. The y-axis on the left side stands for the percentage of fulfilled criteria. Overall fulfillment is represented by the black solid line. Fulfillment of the findability, accessibility, interoperability, and reusability categories is represented by the blue, yellow, green, and red dashed lines, respectively. Software release time is represented by the grey line, plotted against the secondary y-axis.
Fig. 4
Percentage of software that fulfilled each evaluation criterion. X-axes are values for percentage of fulfillment, and y-axes list all evaluation criteria. There are 4 categories and 47 criteria in total. (A) Blue bars represent findability related criteria; (B) yellow bars represent criteria related to accessibility; (C) green bars represent criteria related to interoperability; and (D) red bars stand for criteria regarding reusability.
Fig. 5
FAIRness trend across time. The x-axes are years representing the selected software's first release times, and the y-axes stand for the averaged %fulfillment of software released in each year. (A) represents the relationship between findability and tool release times; (B) represents the relationship between accessibility and tool release times; (C) represents the relationship between interoperability and tool release times; (D) represents the relationship between reusability and tool release times; (E) represents the relationship between overall FAIR4RS criteria fulfillment and tool release times. Results of Pearson's correlation are also included in each sub-figure; (F) includes information regarding correlations among categories.
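The trend analysis described in the Fig. 5 caption can be sketched as follows. This is an illustration only: it assumes the correlation is computed between release year and the yearly average %fulfillment, which the caption suggests but does not state outright. The records and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical (first release year, %fulfillment) pairs, one per tool.
records = [
    (2015, 40.0), (2015, 50.0),
    (2017, 45.0),
    (2019, 55.0), (2019, 45.0),
    (2021, 50.0),
]


def yearly_means(records):
    """Average the %fulfillment of all tools first released in the same year."""
    by_year = defaultdict(list)
    for year, pct in records:
        by_year[year].append(pct)
    return {year: sum(v) / len(v) for year, v in sorted(by_year.items())}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


means = yearly_means(records)
r = pearson_r(list(means.keys()), list(means.values()))

if __name__ == "__main__":
    print(means)
    print(round(r, 4))
```

A coefficient near zero, or one that is not statistically significant, would be consistent with the abstract's finding of no significant improvement in FAIRness over time. A significance test is not included in this sketch.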
