Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2018 Jan 12;1(1):9.
doi: 10.5334/joc.10.

Power Analysis and Effect Size in Mixed Effects Models: A Tutorial

Affiliations

Power Analysis and Effect Size in Mixed Effects Models: A Tutorial

Marc Brysbaert et al. J Cogn. .

Abstract

In psychology, attempts to replicate published findings are less successful than expected. For properly powered studies replication rate should be around 80%, whereas in practice less than 40% of the studies selected from different areas of psychology can be replicated. Researchers in cognitive psychology are hindered in estimating the power of their studies, because the designs they use present a sample of stimulus materials to a sample of participants, a situation not covered by most power formulas. To remedy the situation, we review the literature related to the topic and introduce recent software packages, which we apply to the data of two masked priming studies with high power. We checked how we could estimate the power of each study and how much they could be reduced to remain powerful enough. On the basis of this analysis, we recommend that a properly powered reaction time experiment with repeated measures has at least 1,600 word observations per condition (e.g., 40 participants, 40 stimuli). This is considerably more than current practice. We also show that researchers must include the number of observations in meta-analyses because the effect sizes currently reported depend on the number of stimuli presented to the participants. Our analyses can easily be applied to new datasets gathered.

Keywords: F1 analysis; F2 analysis; effect size; mixed effects models; power analysis; random factors.

PubMed Disclaimer

Conflict of interest statement

The authors have no competing interests to declare.

Figures

Figure 1
Figure 1
Construction of the two prime types from the data of the Adelman et al. (2014) priming megastudy. Prime types varied from an identity prime (extreme left) to an all letter different prime (extreme right).
Figure 2
Figure 2
Snapshot of the Adelman et al. (2014) database used. Participant is the rank number of the participant tested (not all participants who started the study provided useful results); item = the target word responded to; prime = highly or lowly related to the target; RT is the reaction time to the target in the lexical decision task; correct = whether or not the answer was correct.
Figure 3
Figure 3
Input in the Westfall et al. (2014) website to calculate power of a simple design with random effects of participants and targets (items). Data based on the lmer analysis of the Adelman et al. (2014) dataset.
Figure 4
Figure 4
Top of the Perea et al. (2015) database.
Figure 5
Figure 5
Outcome of the powerCurve command from the simr package for the Perea et al. (2015) dataset. It shows how the power based on the 40 participants tested increases as a function of the number of items. With 40 items we have enough power to observe the 39 ms repetition priming effect.
Figure 6
Figure 6
Outcome of the powerCurve command (simr package) for the Perea et al. (2015) dataset. It shows how the power based on the 120 items tested increases as a function of the number of participants. With 7 participants we have enough power to observe the 39 ms repetition priming effect.

References

    1. Adelman J. S., Johnson R. L., McCormick S. F., McKague M., Kinoshita S., Bowers J. S., et al. A behavioral database for masked form priming. Behavior Research Methods. 2014;46(4):1052–1067. doi: 10.3758/s13428-013-0442-y. - DOI - PubMed
    1. Amrhein V., Korner-Nievergelt F., Roth T. The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research. PeerJ. 2017;5:e3544. doi: 10.7717/peerj.3544. - DOI - PMC - PubMed
    1. Atas A., San Anton E., Cleeremans A. The reversal of perceptual and motor compatibility effects differs qualitatively between metacontrast and random-line masks. Psychological Research. 2015;79(5):813–828. doi: 10.1007/s00426-014-0611-3. - DOI - PubMed
    1. Baayen R. H. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge University Press; 2008. - DOI
    1. Baayen R. H., Davidson D. J., Bates D. M. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language. 2008;59(4):390–412. doi: 10.1016/j.jml.2007.12.005. - DOI

LinkOut - more resources