Power Analysis and Effect Size in Mixed Effects Models: A Tutorial

Marc Brysbaert¹, Michaël Stevens²

PMID: 31517183
PMCID: PMC6646942
DOI: 10.5334/joc.10

J Cogn. 2018 Jan 12;1(1):9.

Affiliations

¹ Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, B-9000 Gent, BE.
² Ghent University, BE.

Abstract

In psychology, attempts to replicate published findings are less successful than expected. For properly powered studies replication rate should be around 80%, whereas in practice less than 40% of the studies selected from different areas of psychology can be replicated. Researchers in cognitive psychology are hindered in estimating the power of their studies, because the designs they use present a sample of stimulus materials to a sample of participants, a situation not covered by most power formulas. To remedy the situation, we review the literature related to the topic and introduce recent software packages, which we apply to the data of two masked priming studies with high power. We checked how we could estimate the power of each study and how much they could be reduced to remain powerful enough. On the basis of this analysis, we recommend that a properly powered reaction time experiment with repeated measures has at least 1,600 word observations per condition (e.g., 40 participants, 40 stimuli). This is considerably more than current practice. We also show that researchers must include the number of observations in meta-analyses because the effect sizes currently reported depend on the number of stimuli presented to the participants. Our analyses can easily be applied to new datasets gathered.

Keywords: F1 analysis; F2 analysis; effect size; mixed effects models; power analysis; random factors.
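
The recommendation of 1,600 observations per condition (e.g., 40 participants, 40 stimuli) can be explored with a small Monte Carlo simulation. The sketch below is not the analysis used in the article, which relies on mixed effects models and dedicated software packages. It is a simplified stand-in that simulates a counterbalanced participants-by-items design and tests the effect with a by-participant comparison of condition means, using a normal approximation for the critical value. The function name `simulate_power` and every numeric default (effect size, standard deviations, baseline RT) are hypothetical values chosen for illustration, not estimates from the datasets discussed here.

```python
import math
import random
import statistics


def simulate_power(n_participants=40, n_items=40, effect_ms=15.0,
                   sd_participant=50.0, sd_item=30.0, sd_slope=10.0,
                   sd_residual=100.0, n_sims=200, seed=1):
    """Estimate power by simulation for a two-condition RT experiment.

    n_items is the number of stimuli per condition seen by each participant,
    so each condition holds n_participants * n_items observations.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Two-sided 5% criterion; normal approximation instead of the t distribution.
    z_crit = statistics.NormalDist().inv_cdf(0.975)
    hits = 0
    for _ in range(n_sims):
        # Random intercept for every item (2 * n_items items in total).
        item_fx = [rng.gauss(0, sd_item) for _ in range(2 * n_items)]
        diffs = []
        for p in range(n_participants):
            p_int = rng.gauss(0, sd_participant)   # participant intercept
            p_slope = rng.gauss(0, sd_slope)       # participant-specific effect
            related, unrelated = [], []
            for i, fx in enumerate(item_fx):
                rt = 600.0 + p_int + fx + rng.gauss(0, sd_residual)
                # Counterbalancing: each item is seen in both conditions
                # across participants, but only once per participant.
                if (p + i) % 2 == 0:
                    related.append(rt - (effect_ms + p_slope))
                else:
                    unrelated.append(rt)
            diffs.append(statistics.fmean(unrelated) - statistics.fmean(related))
        # By-participant test on the per-participant condition differences.
        se = statistics.stdev(diffs) / math.sqrt(n_participants)
        if abs(statistics.fmean(diffs) / se) > z_crit:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for n_p, n_i in [(10, 10), (20, 20), (40, 40)]:
        power = simulate_power(n_participants=n_p, n_items=n_i)
        print(f"{n_p} participants x {n_i} items per condition: power ~ {power:.2f}")
```

Varying `n_participants` and `n_items` shows how estimated power changes with the number of observations per condition. Because the test here ignores item variability in the analysis step, a mixed effects model fitted to real pilot data remains the appropriate tool for an actual study.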

Conflict of interest statement

The authors have no competing interests to declare.

Figures

**Figure 1**
Construction of the two prime types from the data of the Adelman et al. (2014) priming megastudy. Prime types varied from an identity prime (extreme left) to an all letter different prime (extreme right).

**Figure 2**
Snapshot of the Adelman et al. (2014) database used. Participant is the rank number of the participant tested (not all participants who started the study provided useful results); item = the target word responded to; prime = highly or lowly related to the target; RT is the reaction time to the target in the lexical decision task; correct = whether or not the answer was correct.

**Figure 3**
Input in the Westfall et al. (2014) website to calculate power of a simple design with random effects of participants and targets (items). Data based on the lmer analysis of the Adelman et al. (2014) dataset.

**Figure 4**
Top of the Perea et al. (2015) database.

**Figure 5**
Outcome of the powerCurve command from the simr package for the Perea et al. (2015) dataset. It shows how the power based on the 40 participants tested increases as a function of the number of items. With 40 items we have enough power to observe the 39 ms repetition priming effect.

**Figure 6**
Outcome of the powerCurve command (simr package) for the Perea et al. (2015) dataset. It shows how the power based on the 120 items tested increases as a function of the number of participants. With 7 participants we have enough power to observe the 39 ms repetition priming effect.

References

1. Adelman J. S., Johnson R. L., McCormick S. F., McKague M., Kinoshita S., Bowers J. S., et al. A behavioral database for masked form priming. Behavior Research Methods. 2014;46(4):1052–1067. doi: 10.3758/s13428-013-0442-y.
2. Amrhein V., Korner-Nievergelt F., Roth T. The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research. PeerJ. 2017;5:e3544. doi: 10.7717/peerj.3544.
3. Atas A., San Anton E., Cleeremans A. The reversal of perceptual and motor compatibility effects differs qualitatively between metacontrast and random-line masks. Psychological Research. 2015;79(5):813–828. doi: 10.1007/s00426-014-0611-3.
4. Baayen R. H. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge University Press; 2008.
5. Baayen R. H., Davidson D. J., Bates D. M. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language. 2008;59(4):390–412. doi: 10.1016/j.jml.2007.12.005.
