Commun Biol. 2025 Aug 4;8(1):1149.
doi: 10.1038/s42003-025-08412-1.

fNIRS reproducibility varies with data quality, analysis pipelines, and researcher experience

Meryem A Yücel #  1   2 Robert Luke #  3   4 Rickson C Mesquita #  5   6 Alexander von Lühmann #  7   8 David M A Mehler  9   10   11 Michael Lührs  12 Jessica Gemignani  13   14 Androu Abdalmalak  15   16 Franziska Albrecht  17   18 Iara de Almeida Ivo  12 Christina Artemenko  19 Kira Ashton  20 Paweł Augustynowicz  21 Aahana Bajracharya  22 Elise Bannier  23   24 Beatrix Barth  25   26   27 Laurie Bayet  20 Jacqueline Behrendt  28   29 Hadi Borj Khani  30 Lenaic Borot  31 Jordan A Borrell  32   33 Sabrina Brigadoi  13 Kolby Brink  32 Chiara Bulgarelli  34 Emmanuel Caruyer  24 Hsin-Chin Chen  35 Christopher Copeland  32 Isabelle Corouge  24 Simone Cutini  13   14 Renata Di Lorenzo  36   37 Thomas Dresler  25   26   27 Adam T Eggebrecht  22 Ann-Christine Ehlis  25   26   27 Sinem B Erdoğan  38 Danielle Evenblij  12 Talukdar Raian Ferdous  39 Victoria Fracalossi  20 Erika Franzén  17   18 Anne Gallagher  40 Christian Gerloff  41   42 Judit Gervain  13   14 Noy Goldhamer  43 Louisa K Gossé  34 Ségolène M R Guérin  44   45 Edgar Guevara  46 S M Hadi Hosseini  47 Hamish Innes-Brown  48   49 Isabell Int-Veen  25 Sagi Jaffe-Dax  50 Nolwenn Jégou  24 Hiroshi Kawaguchi  51 Caroline Kelsey  36   37 Michaela Kent  52 Roman Kessler  53 Nadeen Kherbawy  50 Franziska Klein  9   54   55 Nofar Kochavi  50 Matthew Kolisnyk  56 Yogev Koren  57 Agnes Kroczek  25 Alexander Kvist  17 Chen-Hao Paul Lin  22   58 Andreas Löw  59 Siying Luan  60 Darren Mao  61 Giovani G Martins  62 Eike Middell  28   29 Samuel Montero-Hernandez  63   64 Murat Can Mutlu  65 Sergio L Novi  16 Natacha Paquette  40 Ishara Paranawithana  61 Yisrael Parmet  66 Jonathan E Peelle  67 Ke Peng  68 Tommy Peng  61 João Pereira  12   69 Paola Pinti  34 Luca Pollonini  70 Ali Rahimpour Jounghani  47   71 Vanessa Reindl  41   72 Wiebke Ringels  9 Betti Schopp  25 Alina Schulte  48   73 Martin Schulte-Rüther  74   75 Ari Segel  22 Tirdad Seifi Ala  48   49 Maureen J Shader  76 Hadas Shavit  50 Arefeh Sherafati  22   77 
Mojtaba Soltanlou  78   79 Bettina Sorger  12 Emma Speh  22 Kevin D Stubbs  15   80 Katharina Stute  81 Eileen F Sullivan  36 Sungho Tak  82   83 Zeus Tipado  84 Julie Tremblay  40 Homa Vahidi  52 Maaike Van Eeckhoutte  49   85 Phetsamone Vannasing  40 Gregoire Vergotte  86 Marion A Vincent  45 Eileen Weiss  87 Dalin Yang  22 Gülnaz Yükselen  38 Dariusz Zapała  21 Vit Zemanek  88

Meryem A Yücel et al. Commun Biol. 2025.

Erratum in

  • Publisher Correction: fNIRS reproducibility varies with data quality, analysis pipelines, and researcher experience.
    Yücel MA, et al. Commun Biol. 2025 Aug 21;8(1):1256. doi: 10.1038/s42003-025-08714-4. PMID: 40841463. Free PMC article. No abstract available.

Abstract

As data analysis pipelines grow more complex in brain imaging research, understanding how methodological choices affect results is essential for ensuring reproducibility and transparency. This is especially relevant for functional Near-Infrared Spectroscopy (fNIRS), a rapidly growing technique for assessing brain function in naturalistic settings and across the lifespan, yet one that still lacks standardized analysis approaches. In the fNIRS Reproducibility Study Hub (FRESH) initiative, we asked 38 research teams worldwide to independently analyze the same two fNIRS datasets. Despite using different pipelines, nearly 80% of teams agreed on group-level results, particularly when hypotheses were strongly supported by literature. Teams with higher self-reported analysis confidence, which correlated with years of fNIRS experience, showed greater agreement. At the individual level, agreement was lower but improved with better data quality. The main sources of variability were related to how poor-quality data were handled, how responses were modeled, and how statistical analyses were conducted. These findings suggest that while flexible analytical tools are valuable, clearer methodological and reporting standards could greatly enhance reproducibility. By identifying key drivers of variability, this study highlights current challenges and offers direction for improving transparency and reliability in fNIRS research.

Conflict of interest statement

Competing interests: The authors declare the following competing interests: D.Z. and P.A. are employees and shareholders in Cortivision sp. z o.o. (Lublin, Poland), a company that manufactures fNIRS systems. M.L. is employed by the research company Brain Innovation B.V., Maastricht, Netherlands. K.S. is employed by the (f)NIRS manufacturer Artinis Medical Systems B.V., Elst, The Netherlands. A.A. is employed by NIRx Medical Technologies, LLC. The remaining authors declare no competing interests.

Figures

Fig. 1
Fig. 1. Variability in hypothesis testing results across teams.
A Proportion of teams supporting each group-level hypothesis, H, in Dataset I (see text for the description of each hypothesis). B Percentage of teams supporting each individual-level hypothesis in Dataset II, separated by subject, S. In both panels, bar segments indicate the fraction of teams that supported (‘YES,’ green), rejected (‘NO,’ blue), or did not test (‘Not Investigated,’ navy) each hypothesis, and the numbers in each bar give the percentage of each response. C Proportion of teams reporting a significant result among those that tested the hypothesis in Dataset I (n = 31 (H1, H2), n = 30 (H3), n = 35 (H4–H7)). D Proportion of teams reporting a significant result among those that tested the hypothesis in Dataset II (n = 30–34 (H1), n = 28–33 (H2), n = 28–38 (H3), n = 27–32 (H4), depending on the subject). Each color represents a different participant. (FT finger tapping, PMC primary motor cortex, LIFG left inferior frontal gyrus, HG Heschl’s gyrus).
Fig. 2
Fig. 2. Overview of signal processing pipeline choices across teams.
The teams’ pipeline choices for signal processing to extract the hemodynamic brain responses for subsequent statistical analysis are shown in a Sankey flow diagram, with stages grouping choices by typical processing categories. All numbers are given as whole percentages (rounded, no decimal places) to improve readability. Analyses that combined multiple toolboxes count individually toward each category in the pie chart (sum > 100%). Processing stages: Pruning – method for identifying channels to drop from the analysis; (SCI) Scalp Coupling Index, (PSP) Peak Spectral Power, (SNR) Signal-to-Noise Ratio. Motion Artifacts – method for mitigating motion artifacts; (CBSI) Correlation-Based Signal Improvement, (TDDR) Temporal Derivative Distribution Repair, (Spline SG) Spline interpolation with Savitzky-Golay filtering, (Mon. Interp.) Monotonic Interpolation. Resampling – resampling to a new sample rate for analysis. Filtering – temporal filtering. Physio. Preproc. – other preprocessing methods for removing physiological nuisance signals before HRF extraction. HRF Estimation – method for extracting/estimating the hemodynamic brain response; (GLM) General Linear Model. Solvers/Modifiers – details of HRF estimation; (OLS) Ordinary Least Squares, (AR-IRLS) Autoregressive Iteratively Reweighted Least Squares. HRF Regressors – (GLM only) choice of regressors to model the hemodynamic response; (Consec. Gaussians) Consecutive Gaussians, (SPM) Statistical Parametric Mapping, (FIR) Finite Impulse Response. Other Regressors – (GLM only) choice of additional regressors to model physiology; (SC) Short Channels, (PCA of SC) first principal components of all short channels.
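To illustrate one of the pruning metrics named above: the Scalp Coupling Index is commonly computed as the correlation between a channel’s two wavelength signals within the cardiac band. This is a minimal Python sketch of that idea, not the implementation any team used; the band edges (0.7–1.5 Hz) and the filter order are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def scalp_coupling_index(w1, w2, fs, band=(0.7, 1.5)):
    """Illustrative Scalp Coupling Index: Pearson correlation of a
    channel's two wavelength signals after cardiac-band filtering.
    Band edges and filter order are assumptions for this sketch."""
    nyq = fs / 2.0
    b, a = butter(2, [band[0] / nyq, band[1] / nyq], btype="band")
    f1 = filtfilt(b, a, np.asarray(w1, dtype=float))
    f2 = filtfilt(b, a, np.asarray(w2, dtype=float))
    return float(np.corrcoef(f1, f2)[0, 1])
```

A channel with good scalp coupling shows the same cardiac pulsation at both wavelengths, so its SCI approaches 1, while a poorly coupled channel yields values near 0 and would be pruned against some threshold.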
Fig. 3
Fig. 3. Reported statistical analysis steps for fNIRS data analysis pipelines.
The teams’ pipeline choices for testing the working hypotheses of this study are shown in a Sankey flow diagram, with stages grouping choices by typical analysis categories, based on n = 38 pipelines for each of the two datasets. Analyses that combined multiple toolboxes count individually toward each category in the pie chart (sum > 100%). All numbers are given as whole percentages (rounded, no decimal places) to improve readability. Statistical stages: Stat. Method – statistical method employed for hypothesis testing; (t-Test NN) t-test without further specification of type, (Mixed Effects NN) mixed-effects model without further specification of type. Signal Type – tests performed on brain responses measured via HbO, HbR, both, or other. Signal Space – tests performed on responses from individual channels, in image space, or for regions of interest (ROI). Metric – tests performed on GLM beta weights, windowed signal amplitude, or other options. Test for Normality – whether a normality test was performed and, if so, which one; (Kolmog.-Smirn.) Kolmogorov–Smirnov test. Significance Level – threshold for statistical significance. Multiple Comparison Correction – none, or one of three different approaches.
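One common option for the multiple-comparison-correction stage above is the Benjamini–Hochberg false-discovery-rate procedure, applied across channels or hypotheses. A minimal sketch follows; it is an illustration of the general technique, not the specific correction any team reported.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg FDR step-up procedure (illustrative sketch).
    Returns a boolean array marking which p-values are rejected."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)               # indices of p-values, ascending
    thresh = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresh         # step-up comparison per rank
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = int(np.max(np.nonzero(passed)[0]))
        reject[order[: k + 1]] = True   # reject all up to the largest passing rank
    return reject
```

For example, with eight channel-wise p-values and alpha = 0.05, only those below their rank-scaled thresholds survive, which is less conservative than a Bonferroni correction at alpha/m.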
Fig. 4
Fig. 4. Reported use of toolbox default settings across analysis pipelines.
UpSet plot showing the use of default parameters and settings in the groups’ analysis pipelines, based on n = 38 pipelines for each of the two datasets. Rows display the individual categories for which default settings could be chosen, and horizontal bar plots show their cumulative frequency (e.g., groups chose default filter parameters in 47% of all reported analyses). The pie chart shows the fraction of groups that used default settings in one to five categories (matching the color code of the intersection-size bars; e.g., 2.9% used defaults for five categories). Connected black dots in columns display intersections (combinations) of categories, and vertical bar plots show the frequency (intersection size) of these combinations. For example, three groups reported using default settings for the GLM method, artifact correction, and filter parameters combined, and four groups reported using default settings only for the AR model order.
Fig. 5
Fig. 5. Distribution of pipeline choices by agreement with literature support.
Radar charts of the distribution of choices in the analysis pipelines based on their (dis-)agreement with expected hypothesis outcomes, grouped by pipeline stage, based on n = 38 pipelines. A Results for Hypotheses 1, 2, and 7 of Dataset I, which are supported by prior literature and therefore have a high expectation of being confirmed. B Results for Hypotheses 3–6 of Dataset I, which are expected to be true but are only weakly supported by prior literature. C Results for all four hypotheses analyzed at the individual level in Dataset II. In all cases, the numbers on the radial axes represent joint probabilities (in percent) of pooled individual hypothesis outcomes for a chosen category among all users.
Fig. 6
Fig. 6. Relationship between hypothesis testing outcomes and self-reported confidence.
Panels A–C represent group-level hypotheses, while panels D–F illustrate individual-level hypotheses. Hypothesis testing results grouped by teams’ self-reported confidence in their analysis skills are presented for group-level hypotheses (A) and individual-level hypotheses (D). The same results grouped by confidence in the reported outcomes are presented for group-level hypotheses (B) and individual-level hypotheses (E). Sørensen-Dice similarity matrices illustrating the consistency of hypothesis outcomes across teams, organized by self-reported confidence in analysis skills, are presented for group-level hypotheses (C) and individual-level hypotheses (F). The colorbar represents the Sørensen-Dice coefficient, ranging from 0.5 to 1. Note that not all groups reported confidence; hence the number of groups in this plot is smaller than the total of 38 groups.
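The Sørensen-Dice similarity used in panels C and F can be sketched for binary hypothesis-outcome vectors. This toy version considers only supported (1) versus rejected (0) outcomes; how the paper handles ‘Not Investigated’ entries is not reproduced here.

```python
def dice_similarity(a, b):
    """Sorensen-Dice coefficient between two teams' binary hypothesis
    outcomes (1 = supported, 0 = rejected). Illustrative sketch only;
    'Not Investigated' entries are not modeled."""
    a, b = list(a), list(b)
    both = sum(1 for x, y in zip(a, b) if x and y)  # hypotheses both teams supported
    total = sum(a) + sum(b)
    # Convention: two empty outcome sets are treated as perfectly similar.
    return 2 * both / total if total else 1.0
```

Two teams that supported exactly the same hypotheses score 1, while teams with no overlap score 0, which is what the similarity matrices in panels C and F visualize pairwise.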
Fig. 7
Fig. 7. Demographic profile of participating researchers.
A total of 102 researchers, grouped into 38 teams, submitted reports for analysis. Plots show their A geographic affiliation, B years of experience with fNIRS, C highest educational qualification, and D self-reported fields of study. (LA: Latin America, APAC: Asia and Pacific, ME: Middle East, HS: High School, UG: Undergraduate).
