BMC Med Res Methodol. 2014 Dec 19;14:135.
doi: 10.1186/1471-2288-14-135.

Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range

Xiang Wan et al. BMC Med Res Methodol. 2014.

Abstract

Background: In systematic reviews and meta-analyses, researchers often pool the sample means and standard deviations from a set of similar clinical trials. A number of trials, however, report their results using the median, the minimum and maximum values, and/or the first and third quartiles. Hence, in order to combine results, one may have to estimate the sample mean and standard deviation for such trials.

Methods: In this paper, we improve on the existing literature in several directions. First, we show that the sample standard deviation estimation in Hozo et al.'s method (BMC Med Res Methodol 5:13, 2005) has some serious limitations and is always less satisfactory in practice. Motivated by this, we propose a new estimation method that incorporates the sample size. Second, we systematically study the sample mean and standard deviation estimation problem under several other settings in which the interquartile range is also available for the trials.
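The abstract does not state the estimators themselves. As a minimal sketch, the following uses the range-based formulas as they are commonly cited from the full paper: the mean is approximated by (a + 2m + b)/4 and the standard deviation by the range divided by twice a normal quantile that depends on the sample size n. The function name, argument names and example values are illustrative assumptions, and the formulas should be checked against the paper's summary table before use.

```python
from statistics import NormalDist


def estimate_from_range(a, m, b, n):
    """Approximate the sample mean and SD from the minimum a, median m,
    maximum b and sample size n (range-based scenario).

    Assumption: formulas as commonly cited from the full paper, not
    stated in the abstract itself.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not a <= m <= b:
        raise ValueError("expected a <= m <= b")
    mean = (a + 2 * m + b) / 4
    # Expected position of the largest of n standard normal draws,
    # approximated by the quantile at (n - 0.375) / (n + 0.25).
    z = NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    sd = (b - a) / (2 * z)
    return mean, sd


if __name__ == "__main__":
    # Hypothetical trial reporting min = 0, median = 5, max = 10, n = 25.
    mean, sd = estimate_from_range(0, 5, 10, 25)
    print(f"mean ~ {mean:.2f}, sd ~ {sd:.2f}")
```

Because the divisor grows with n, the same reported range yields a smaller estimated standard deviation for a larger trial, which is the role the sample size plays in this kind of estimator.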

Results: We demonstrate the performance of the proposed methods through simulation studies for each of the three frequently encountered scenarios. For the first two scenarios, our method greatly improves on existing methods and provides a nearly unbiased estimate of the true sample standard deviation for normal data and a slightly biased estimate for skewed data. For the third scenario, our method still performs very well for both normal and skewed data. Furthermore, we compare the estimators of the sample mean and standard deviation under all three scenarios and offer suggestions on which scenario is preferred in real-world applications.
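For the setting in which only the quartiles are reported, a comparable sketch is given below. It assumes the quartile-based formulas as they are commonly cited from the full paper: the mean is approximated by (q1 + m + q3)/3 and the standard deviation by the interquartile range divided by twice a normal quantile that depends on n. The abstract does not say which of its three scenarios this corresponds to, and the function and argument names are hypothetical.

```python
from statistics import NormalDist


def estimate_from_iqr(q1, m, q3, n):
    """Approximate the sample mean and SD from the first quartile q1,
    median m, third quartile q3 and sample size n.

    Assumption: formulas as commonly cited from the full paper, not
    stated in the abstract itself.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not q1 <= m <= q3:
        raise ValueError("expected q1 <= m <= q3")
    mean = (q1 + m + q3) / 3
    # Approximate expected position of the third quartile among n
    # standard normal draws.
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


if __name__ == "__main__":
    # Hypothetical trial reporting q1 = 4, median = 5, q3 = 6, n = 100.
    mean, sd = estimate_from_iqr(4, 5, 6, 100)
    print(f"mean ~ {mean:.2f}, sd ~ {sd:.2f}")
```

For large n the divisor approaches about 1.35, the width of the interquartile range of a standard normal distribution.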

Conclusions: In this paper, we discuss different approximation methods for estimating the sample mean and standard deviation and propose new estimation methods that improve on the existing literature. We conclude with a summary table (an Excel spreadsheet including all formulas) that serves as a comprehensive guide for performing meta-analysis in different situations.

Figures

Figure 1
Relative errors of the sample standard deviation estimation for normal data, where the red lines with solid circles represent Hozo et al.’s method, and the green lines with empty circles represent the new method.
Figure 2
Relative errors of the sample standard deviation estimation for non-normal data (log-normal, beta, exponential and Weibull), where the red lines with solid circles represent Hozo et al.’s method, and the green lines with empty circles represent the new method.
Figure 3
Relative errors of the sample standard deviation estimation for normal data and log-normal data, where the red lines with solid circles represent Bland’s method, and the green lines with empty circles represent the new method.
Figure 4
Relative errors of the sample mean and standard deviation estimations for normal data, where the black solid circles represent the method under scenario [formula image], the red solid triangles represent the method under scenario [formula image], and the green empty circles represent the method under scenario [formula image].
Figure 5
Relative errors of the sample mean estimation for non-normal data (log-normal, beta, exponential and Weibull), where the black lines with solid circles represent the method under scenario [formula image], the red lines with solid triangles represent the method under scenario [formula image], and the green lines with empty circles represent the method under scenario [formula image].
Figure 6
Relative errors of the sample standard deviation estimation for non-normal data (log-normal, beta, exponential and Weibull), where the black lines with solid circles represent the method under scenario [formula image], the red lines with solid triangles represent the method under scenario [formula image], and the green lines with empty circles represent the method under scenario [formula image].

References

    1. Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts: treatments for myocardial infarction. J Am Med Assoc. 1992;268:240–248. doi: 10.1001/jama.1992.03490020088036.
    2. Cipriani A, Geddes J. Comparison of systematic and narrative reviews: the example of the atypical antipsychotics. Epidemiol Psichiatr Soc. 2003;12:146–153. doi: 10.1017/S1121189X00002918.
    3. Hozo SP, Djulbegovic B, Hozo I. Estimating the mean and variance from the median, range, and the size of a sample. BMC Med Res Methodol. 2005;5:13. doi: 10.1186/1471-2288-5-13.
    4. Triola MF. Elementary Statistics, 11th Ed. 2009.
    5. Hogg RV, Craig AT. Introduction to Mathematical Statistics. Maxwell: Macmillan Canada; 1995.
Pre-publication history
    1. The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/14/135/prepub
