PLoS One. 2015 Apr 1;10(4):e0120838.
doi: 10.1371/journal.pone.0120838. eCollection 2015.

Menage a quoi? Optimal number of peer reviewers

Richard R Snell. PLoS One. 2015.

Abstract

Peer review is the primary mechanism by which funding agencies allocate financial support and journals select manuscripts for publication, yet recent Cochrane reviews found that the literature on peer review best practice is sparse. Key to improving the process are reducing its inherent vulnerability to a high degree of randomness and, from an economic perspective, limiting both the substantial indirect costs of reviewer time and the direct administrative costs to funding agencies, publishers and research institutions. Using additional reviewers per application may increase reliability and decision consistency, but adds to overall cost and burden. The optimal number of reviewers per application, while not known, is thought to vary with the accuracy of judges or evaluation methods. Here I use bootstrapping of replicated peer review data from a Post-doctoral Fellowships competition to show that five reviewers per application represents a practical optimum: it avoids the large random effects evident when fewer reviewers are used, and beyond it additional reviewers, at increasing cost, provide only diminishing incremental gains in chance-corrected consistency of decision outcomes. Random effects were most evident in the relative mid-range of competitiveness. Results support aggressive high- and low-end stratification or triaging of applications for subsequent stages of review, with the proportion and set of mid-range submissions retained for further consideration depending on the overall success rate.

Conflict of interest statement

Competing Interests: The author has declared that no competing interests exist. The views presented in this article are those of the author and do not necessarily represent those of the Canadian Institutes of Health Research.

Figures

Fig 1. Simulated competition outcomes from a single bootstrap iteration (1 to 21 reviewers per application).
Horizontal sequences represent simulated outcomes for each of 100 Fellowships applications (‘grey’ representing success) with incremental addition of reviewers, within different overall success rate scenarios. For each N to N+1 reviewers, an additional score was sampled (with replacement) from 9 independent assessments. Discontinuities in horizontal grey/white coding reflect changes in decision outcome with addition of a single reviewer. Within each iteration (representing one simulated competition), the outcome for many applications was invariant where N > 2, regardless of the success rate scenario.
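The resampling step described in this caption could be sketched roughly as follows. This is a minimal illustration, not the author's code: the function name `simulate_iteration`, ranking applications by mean score, treating higher scores as better, and breaking ties by application index are all assumptions. Only the with-replacement draw of one extra score per added reviewer comes from the caption.

```python
import random


def simulate_iteration(scores, max_reviewers, success_rate, rng):
    """Simulate one competition with 1..max_reviewers reviewers per application.

    scores: one list of independent assessments per application
            (the caption mentions 9 per application).
    Returns a list where entry n-1 holds the funded/not-funded flags
    obtained with n reviewers.
    """
    n_apps = len(scores)
    # Assumption: a fixed number of top-ranked applications is funded.
    n_funded = max(1, round(n_apps * success_rate))
    drawn = [[] for _ in range(n_apps)]
    outcomes = []
    for _ in range(max_reviewers):
        # Add one more reviewer: sample one score, with replacement.
        for i in range(n_apps):
            drawn[i].append(rng.choice(scores[i]))
        means = [sum(d) / len(d) for d in drawn]
        # Assumption: rank by mean score, higher is better.
        ranked = sorted(range(n_apps), key=lambda i: means[i], reverse=True)
        funded = set(ranked[:n_funded])
        outcomes.append([i in funded for i in range(n_apps)])
    return outcomes


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic scores for illustration only, not data from the study.
    demo = [[rng.uniform(0, 5) for _ in range(9)] for _ in range(100)]
    rows = simulate_iteration(demo, 21, 0.25, rng)
    print(sum(rows[4]), "applications funded with 5 reviewers")
```

Each row of the result corresponds to one column position in the figure: comparing row N-1 with row N shows which applications change outcome when a single reviewer is added.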
Fig 2. Funding success probability profiles in five overall competition scenarios (1 to 21 reviewers per application).
Cumulative sum of 10,000 bootstrapped simulations of competition outcomes, for 100 applications within five overall success scenarios [5% (a), 15% (b), 25% (c), 35% (d), and 50% (e)]. Within some scenarios, the most competitive applications had ≥ ~95% probability of success (Category A). Applications of intermediate competitiveness (Category B) had a probability of success which varied from ~5% to ~95%. The least competitive applications (Category C) were rarely (≤ ~5% probability) or never successful.
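As a rough sketch of how per-application success probabilities and the three categories in this caption might be derived from repeated simulations: the helper names `success_probabilities` and `categorize` are hypothetical, and the cut-offs are taken as exactly 5% and 95%, although the caption gives them only approximately.

```python
def success_probabilities(runs):
    """Fraction of simulated competitions in which each application succeeded.

    runs: list of iterations, each a list of True/False outcomes
          (one per application) at a fixed number of reviewers.
    """
    n_runs = len(runs)
    n_apps = len(runs[0])
    return [sum(run[i] for run in runs) / n_runs for i in range(n_apps)]


def categorize(p, low=0.05, high=0.95):
    """Assign category A, B or C from a success probability.

    Assumption: thresholds are exactly 5% and 95%.
    """
    if p >= high:
        return "A"  # almost always successful
    if p <= low:
        return "C"  # rarely or never successful
    return "B"      # outcome depends heavily on which reviewers are drawn


if __name__ == "__main__":
    # Toy outcomes for three applications over four iterations (made up).
    toy_runs = [
        [True, True, False],
        [True, False, False],
        [True, True, False],
        [True, False, False],
    ]
    probs = success_probabilities(toy_runs)
    print([categorize(p) for p in probs])
```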
Fig 3. Relative proportion of applications in three outcome categories, in five overall competition scenarios (5 reviewers per application).
For 100 applications within five overall competition success scenarios [5% (a), 15% (b), 25% (c), 35% (d), and 50% (e)], different proportions of applications had very high (A), medium (B) or very low (C) probability of success (categories defined in text) over 10,000 iterations of simulated competition results. Horizontal green lines indicated a probability of success of 5% and 95%. Vertical blue lines indicated where locally weighted polynomial regression (LOESS smoothing) curves intersected the 5% or 95% probability of success, or reached the highest ranked application (the limit of the graph).
Fig 4. Bootstrapped kappa statistics of peer review decision consistency with incremental N to N+1 reviewers.
In simulation of CIHR Fellowships competition outcomes, overall decision consistency improved with incremental addition of reviewers regardless of overall success rate scenario. Monte Carlo error analysis (standard deviation of bootstrapped estimates of kappa coefficients) [33] indicated broad overlap among incremental kappa values with increased reviewers. Kappa levels > 0.8 represented “almost perfect” consistency [32]. Kappa values were significant (α < 0.05) except in the 5% success scenario at the 1–2 and 2–3 reviewer increment.
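The chance-corrected agreement between decisions made with N and with N+1 reviewers can be illustrated with Cohen's kappa for two binary outcome vectors. This is a sketch under assumptions: the page does not state which kappa variant was used, the function name `cohen_kappa` is hypothetical, and returning 1.0 when both vectors are constant and identical is a convention chosen here.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of True/False decisions."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of applications with the same decision.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from the marginal success rates.
    pa = sum(a) / n
    pb = sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        # Both vectors are constant and identical; kappa is undefined,
        # so 1.0 is returned by convention in this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up decisions for six applications with N and N+1 reviewers.
    with_n = [True, True, False, False, False, False]
    with_n_plus_1 = [True, False, True, False, False, False]
    print(round(cohen_kappa(with_n, with_n_plus_1), 3))
```

Repeating this over many bootstrap iterations yields a distribution of kappa values per reviewer increment, whose standard deviation corresponds to the Monte Carlo error described in the caption.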
Fig 5. First derivative (S1) and second derivative (S2) of kappa, within incremental N to N+1 reviewers, across five overall competition success rate scenarios.
Relative improvement of kappa reached stability at 4–5 reviewers per application or shortly thereafter, across all overall success rate scenarios [5% (a), 15% (b), 25% (c), 35% (d), and 50% (e)]. Vertical dashed lines represent the approximate S2 asymptotes.
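One simple way to approximate the first and second derivatives of a kappa series is with successive finite differences. This is only an assumed approach, since the page does not say how S1 and S2 were computed, and the kappa values below are invented for illustration rather than taken from the study.

```python
def differences(values):
    """Successive differences: values[k+1] - values[k]."""
    return [b - a for a, b in zip(values, values[1:])]


# Made-up kappa values for increments 1-2, 2-3, ... (not study data).
kappa = [0.40, 0.60, 0.72, 0.79, 0.83, 0.85]

s1 = differences(kappa)  # approximate first derivative: gain per added reviewer
s2 = differences(s1)     # approximate second derivative: change in that gain

if __name__ == "__main__":
    print([round(v, 2) for v in s1])
    print([round(v, 2) for v in s2])
```

In such a series, the point where the second differences settle close to zero marks where adding a reviewer stops changing the rate of improvement, which is the kind of stability the caption describes.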

References

    1. Bornmann L, Wallon G, Ledin A. Does the committee peer review select the best applicants for funding? An investigation of the selection process for two European molecular biology organization programmes. PLoS ONE. 2008; 3(10): e3480. doi: 10.1371/journal.pone.0003480
    2. Kravitz RL, Franks P, Feldman MD, Gerrity M, Byrne C, Tierney WM. Editorial peer reviewers’ recommendations at a general medical journal: are they reliable and do editors care? PLoS ONE. 2010; 5(4): e10072. doi: 10.1371/journal.pone.0010072
    3. Abdoul H, Perrey C, Amiel P, Tubach F, Gottot S, Durand-Zaleski I, et al. Peer review of grant applications: criteria used and qualitative study of reviewer practices. PLoS ONE. 2012; 7: e46054. doi: 10.1371/journal.pone.0046054
    4. Fogelholm M, Leppinen S, Auvinen A, Raitanen J, Nuutinen A, Väänänen K. Panel discussion does not improve reliability of peer review for medical research grant proposals. J Clin Epidemiol. 2012; 65: 47–52. doi: 10.1016/j.jclinepi.2011.05.001
    5. Demicheli V, Di Pietrantonj C. Peer review for improving the quality of grant applications. Cochrane Database of Systematic Reviews. 2007; 2: MR000003. doi: 10.1002/14651858.mr000003.pub2