Review. eLife. 2023 Aug 9;12:e85980. doi: 10.7554/eLife.85980.

Enhancing precision in human neuroscience

Stephan Nebe et al. eLife. 2023.

Abstract

Human neuroscience has always pushed the boundaries of what is measurable. During the last decade, concerns about statistical power and replicability, in science in general but also specifically in human neuroscience, have fueled an extensive debate. One important insight from this discourse is the need for larger samples, which naturally increases statistical power. An alternative is to increase the precision of measurements, which is the focus of this review. This option is often overlooked, even though statistical power benefits from increasing precision as much as from increasing sample size. Nonetheless, precision has always been at the heart of good scientific practice in human neuroscience, with researchers relying on lab traditions or rules of thumb to ensure sufficient precision for their studies. In this review, we encourage a more systematic approach to precision. We start by introducing measurement precision and its importance for well-powered studies in human neuroscience. We then elaborate on the determinants of precision in a range of neuroscientific methods (MRI, M/EEG, EDA, eye-tracking, and endocrinology). We end by discussing how a more systematic evaluation of precision, and the application of the resulting insights, can increase reproducibility in human neuroscience.

Keywords: experimental methods; generalizability; human neuroscience; neuroscience; precision; reliability; sample size.


Conflict of interest statement

SN, MR, DB, JB, GD, MG, AG, CG, CG, KH, PJ, LK, AL, SM, MM, CM, TP, LP, DQ, TS, AS, MS, AV, TL, GF No competing interests declared

Figures

Figure 1. Comparison of validity, precision, and accuracy.
(A) A latent construct such as emotional arousal (red dot in the center of the circle) can be operationalized using a variety of methods (e.g., EEG ERN amplitudes, fMRI amygdala activation, or self-reports such as the Self-Assessment Manikin). These methods may differ in their construct validity (black arrows), that is, the measurement may be biased away from the true value of the construct. Of note, in this model, the true values are those of an unknown latent construct, and thus validity will always be at least partially a philosophical question. Some may, for example, argue that measuring neural activity directly with sufficient precision is equivalent to measuring the latent construct. However, we subscribe to an emergent materialism and focus on measurement precision. The important and complex question of validity is thus beyond the scope of this review and should be discussed elsewhere. (B) Accuracy and precision are related to validity, with the important difference that they are fully addressed within the framework of the manifest variable used to operationalize the latent construct (e.g., fMRI amygdala activation). The true value is shown as a blue dot in the center of the circle and, in this example, would be the true activity of the amygdala. The lack of accuracy (dark blue arrow) is determined by the tendency of the measured values to be biased away from this true value, for example, when signal loss in deeper structures alters the blood oxygen-level dependent (BOLD) signal measuring amygdala activity. Oftentimes, accuracy is unknown and can only be statistically estimated (see the Eye-Tracking section for an exception). Precision is determined by the amount of error variance (diffuse dark blue area), that is, precision is high if BOLD signals measured at the amygdala are similar to each other under the assumption that everything else remains equal. The main aim of this review is to discuss how precision can be optimized in human neuroscience.
Figure 2. Relation between reliability and precision.
Hypothetical measurement of a variable at two time points in five participants under different assumptions of between-subjects and within-subject variance. Reliability can be understood as the relative stability of individual z-scores across repeated measurements of the same sample: Do participants who score high during the first assessment also score high in the second (compared to the rest of the sample)? Statistically, its calculation relies on relating the within-subject variance (illustrated by dot size) to the between-subjects variance (i.e., the spread of dots). As can be seen above, high reliability is achieved when the within-subject variance is small and the between-subjects variance is large (i.e., no overlap of dots in the top left panel). Low reliability can occur due to high within-subject variance and low between-subjects variance (i.e., highly overlapping dots in the bottom right), and intermediate reliability might result from similar between- and within-subject variance (top right and bottom left). Consequently, reliability can only be interpreted with respect to subject-level precision when taking the observed population variance (i.e., the group-level precision) into account. For example, an event-related potential in the EEG may be sufficiently reliable after having collected 50 trials in a sample drawn from a population of young healthy adults. The same measure, however, may be unreliable in elderly populations or patients due to increased within-subject variance (i.e., decreased subject-level precision).
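The dependence of reliability on the ratio of between-subjects to within-subject variance can be sketched in a short simulation. This is a hypothetical illustration; the variance values and sample size are arbitrary, not taken from the figure:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulated_reliability(sd_between, sd_within, n_subjects=200):
    """Correlate two simulated repeated measurements of the same sample.

    Test-retest reliability approaches
    sd_between**2 / (sd_between**2 + sd_within**2).
    """
    true_scores = rng.normal(0, sd_between, size=n_subjects)
    t1 = true_scores + rng.normal(0, sd_within, size=n_subjects)
    t2 = true_scores + rng.normal(0, sd_within, size=n_subjects)
    return np.corrcoef(t1, t2)[0, 1]

high = simulated_reliability(sd_between=10, sd_within=1)   # large spread, stable subjects
low  = simulated_reliability(sd_between=1,  sd_within=10)  # overlapping, noisy subjects
```

With these settings, `high` is close to 1 while `low` is close to 0, mirroring the top-left and bottom-right panels: the same within-subject noise is harmless or fatal depending on the between-subjects spread.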
Figure 3. Primary, secondary, and error variance.
(A) There are three main sources of variance in a measurement, each providing a different angle on optimizing precision. Primary (or systematic) variance results from changes in the true value of the manifest (dependent) variable upon manipulation of the independent variable and therefore represents what we desire to measure (e.g., neuronal activity due to emotional stimuli). Secondary variance is attributable to other variables that are not the focus of the research but are under the experimenter's control; for example, the influence of the menstrual cycle on neural activity can either be controlled by measuring all participants at the same time of the cycle or by adding time of cycle as a covariate to the analysis. Trivially, if the research topic were the effect of the menstrual cycle on neural activity, then this variance would be primary variance, highlighting that these definitions depend solely on the research question. Error variance is any change in the measurement that cannot be reasonably accounted for by other variables. It is thus assumed to be a random error (see systematic error for exceptions). Explained variance (see the definition of effect size in the Glossary in the Appendix) is the size of the effect of manipulating the independent variable compared to the total variance after accounting for the measured secondary variance (via covariates). Precision is enhanced if the error variance is minimized and/or the secondary variance is controlled. Methods in human neuroscience differ substantially in the way they deal with error variance (see Kerlinger, 1964, for the first description of the Max-Con-Min principle). (B) In EEG research, a popular method is averaging. On the left, the evoked neuronal response to an auditory stimulus (primary variance, green line) is much smaller than the ongoing neuronal activity (error variance, gray lines). Error variance is assumed to be random and, thus, should cancel out during averaging. The more trials (many gray lines on the left) are averaged, the less error variance remains, assuming that the underlying true evoked neuronal response remains constant (green subject-level evoked potential on the right). Filtering and independent component analysis are further popular methods to reduce error variance in EEG research. After applying these procedures at the subject level, the data can be used for group-level analyses. (C) In fMRI research, a linear model is commonly used to prepare the subject-level data before group analyses. The time series data are modeled using beta weights, a design matrix, and the residuals (see GLM and mass univariate approaches in the Glossary in the Appendix). Essentially, a hypothetical hemodynamic response (green line in the middle) is convolved with the stimuli (red) to form predicted values. Covariates such as movements or physiological parameters are added. Therefore, the error variance (residuals) that remains is the part of the time series that cannot be explained by primary variance (predictors) or secondary variance (covariates). Of course, averaging and modeling approaches can both be used for the same method, depending on the researcher's preferences. Additionally, pre-processing procedures such as artifact rejection are used ubiquitously to reduce error variance.
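The averaging logic of panel (B) can be illustrated with a minimal simulation: the residual error of an averaged waveform shrinks roughly with the square root of the number of trials. The waveform shape, amplitudes, and noise level below are hypothetical, chosen only so that the ongoing activity dwarfs the evoked response:

```python
import numpy as np

rng = np.random.default_rng(1)

t = np.linspace(0, 0.5, 251)                        # 500 ms epoch
true_erp = 2.0 * np.exp(-((t - 0.1) / 0.03) ** 2)   # hypothetical evoked response (a.u.)
noise_sd = 10.0                                     # ongoing activity, much larger than the ERP

def average_erp(n_trials):
    """Average n_trials simulated epochs; random noise partially cancels."""
    trials = true_erp + rng.normal(0, noise_sd, size=(n_trials, t.size))
    return trials.mean(axis=0)

# mean absolute deviation from the true waveform after averaging
err_10  = np.abs(average_erp(10)  - true_erp).mean()
err_640 = np.abs(average_erp(640) - true_erp).mean()
```

Going from 10 to 640 trials (a 64-fold increase) cuts the residual error by roughly a factor of eight, which is the 1/sqrt(N) behavior that makes trial count a lever on subject-level precision.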
Figure 4. Habituation of electrodermal activity.
Habituation of electrodermal activity (EDA) is illustrated using a single subject from Reutter and Gamer, 2023. (A) EDA across the whole experiment, with the red dashed lines marking onsets of painful stimuli and the gray solid line denoting a short break between experimental phases. (B) Skin conductance level (SCL) across trials (separately for experimental phases) showing habituation (i.e., decreasing SCLs) across the experiment. (C) Trial-level EDA after each application of a painful stimulus, showing that SCL and skin conductance response (SCR) amplitudes are reduced as the experiment progresses. (D) SCRs (operationalized as baseline-to-peak differences) decrease over time within the same experimental phase. Interestingly, SCR amplitudes 'recover' at the beginning of the second experimental phase even though this is not the case for SCL. Notably, this strong habituation of SCL and SCR means that increasing trials for higher precision may not always be possible. However, the extent to which components of primary and error variance are reduced by habituation remains an open question. This figure can be reproduced using the data and R script in 'Figure 4—source data 1'.
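One common way to quantify such habituation is an exponential-decay fit to the trial-wise SCR amplitudes; a sketch on simulated data (the decay rate, amplitudes, and noise level are invented for illustration, not taken from the figure's source data):

```python
import numpy as np

rng = np.random.default_rng(2)

# hypothetical SCR amplitudes (µS) decaying exponentially across trials
trials = np.arange(1, 21)
amps = 0.8 * np.exp(-0.1 * trials) + rng.normal(0, 0.01, trials.size)
amps = np.clip(amps, 1e-6, None)  # keep amplitudes positive for the log fit

# a linear fit on the log scale recovers the habituation parameters
slope, intercept = np.polyfit(trials, np.log(amps), 1)
rate = -slope                 # habituation rate per trial (true value: 0.1)
initial = np.exp(intercept)   # initial amplitude (true value: 0.8 µS)
```

A fitted rate of this kind makes the caption's caveat concrete: once amplitudes have decayed toward the noise floor, additional trials add little primary variance, so piling on trials no longer buys subject-level precision.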
Figure 5. Link between precision and accuracy of the gaze signal.
Due to the physiology of the eye, the ground truth of the manifest variable (fixation) is known during the calibration procedure. Therefore, accuracy and precision can be disentangled at this step. Accuracy is high if the calibration procedure leads to estimated gaze points (in blue) being centered around the target (green cross). Precision is high if the gaze points are less spread out. Ideally, both high precision and high accuracy are achieved. Note that the precision and accuracy of the measurement can change significantly after the calibration procedure, for example, because of participant movement.
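Because the target location is known during calibration, accuracy (offset of the mean gaze from the target) and precision (spread of samples around the mean gaze) can be computed separately. A minimal sketch with simulated gaze samples; all coordinates, the offset, and the noise level are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)

target = np.array([512.0, 384.0])  # calibration target in screen pixels
# hypothetical gaze samples: a systematic offset (inaccuracy) plus scatter (imprecision)
gaze = target + np.array([5.0, -3.0]) + rng.normal(0, 2.0, size=(200, 2))

centroid = gaze.mean(axis=0)
# accuracy: distance of the mean gaze position from the known target
accuracy_error = np.linalg.norm(centroid - target)
# precision: RMS spread of samples around their own mean
precision_rms = np.sqrt(((gaze - centroid) ** 2).sum(axis=1).mean())
```

The two numbers dissociate exactly as in the figure: the same data can be precise but inaccurate (small `precision_rms`, large `accuracy_error`) or the reverse, which is why calibration quality must be reported with both.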
Figure 6. Biological rhythms and how to control for them.
(A) Examples of biological rhythms. Pulsatile rhythms refer to cyclic changes beginning within (milli)seconds, ultradian rhythms occur in less than 20 hr, whereas circadian rhythms encompass changes within approximately a day. These rhythms are intertwined (Young et al., 2004) and embedded in even longer rhythms, such as those occurring within a week (circaseptan), within 20–30 days (lunar; a prominent example is the menstrual cycle), within a season (seasonal), or within one year (circannual). (B) Exemplary approaches to account for biological rhythms. Time of day at sampling, both in itself and relative to awakening, is especially important when implementing physiological measures with a circadian rhythm (Nader et al., 2010; Orban et al., 2020) and needs to be controlled (B1-2). For trait measures, reliability can be increased by collecting multiple samples across participants of the same group and/or, better, within participants (B3-4; Schmalenberger et al., 2021).
Figure 7. Hierarchical structure of precision.
Four samples were simulated at different degrees of precision at the group, subject, and trial level. We start with a baseline case for which all levels of precision are comparably low (64 subjects, 50 trials per subject, 500 arbitrary units of random noise at the trial level). Afterwards, the number of subjects is quadrupled to double group-level precision (right panel), but no effect on subject-level precision or reliability is observed (a descriptive drop in reliability is due to sampling error). Subsequently, the number of trials is quadrupled to double subject-level precision. This also increases reliability and, vitally, carries over to improve group-level precision (Baker et al., 2021), albeit to a smaller extent than increasing the sample size by the same factor. Finally, the trial-level deviation from the true subject-level means is halved to double trial-level precision. This improves both subject-level and group-level precision without increasing the number of data points (i.e., subjects or trials).
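The hierarchy described above follows from how variance components combine: trial noise enters the subject-level estimate divided by the number of trials, and the subject-level variance enters the group-level standard error divided by the number of subjects. A small analytic sketch; the variance values are illustrative, not those of the figure's simulation:

```python
import numpy as np

def group_se(n_subjects, n_trials, trial_sd, subject_sd):
    """Standard error of the group mean when each subject's mean is itself
    estimated from n_trials noisy trials (variance components add)."""
    subject_level_var = subject_sd ** 2 + trial_sd ** 2 / n_trials
    return np.sqrt(subject_level_var / n_subjects)

# hypothetical variance components (arbitrary units)
base      = group_se(n_subjects=64,  n_trials=50,  trial_sd=22.0, subject_sd=1.0)
more_subj = group_se(n_subjects=256, n_trials=50,  trial_sd=22.0, subject_sd=1.0)  # 4x subjects
more_tri  = group_se(n_subjects=64,  n_trials=200, trial_sd=22.0, subject_sd=1.0)  # 4x trials
```

Quadrupling subjects halves the group-level standard error exactly, whereas quadrupling trials helps less: it shrinks only the trial-noise term, leaving the true between-subjects variance as a floor, just as the figure's simulation shows.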

References

    1. Adam EK, Quinn ME, Tavernier R, McQuillan MT, Dahlke KA, Gilbert KE. Diurnal cortisol slopes and mental and physical health outcomes: a systematic review and meta-analysis. Psychoneuroendocrinology. 2017;83:25–41. doi: 10.1016/j.psyneuen.2017.05.018.
    2. Airan RD, Vogelstein JT, Pillai JJ, Caffo B, Pekar JJ, Sair HI. Factors affecting characterization and localization of interindividual differences in functional connectivity using MRI. Human Brain Mapping. 2016;37:1986–1997. doi: 10.1002/hbm.23150.
    3. Allen PJ, Josephs O, Turner R. A method for removing imaging artifact from continuous EEG recorded during functional MRI. NeuroImage. 2000;12:230–239. doi: 10.1006/nimg.2000.0599.
    4. Allen MJ, Yen WM. Introduction to Measurement Theory. Waveland Press; 2001.
    5. Allen M, Poggiali D, Whitaker K, Marshall TR, Kievit RA. Raincloud plots: a multi-platform tool for robust data visualization. Wellcome Open Research. 2019;4:63. doi: 10.12688/wellcomeopenres.15191.1.
