Nat Med. 2022 Oct;28(10):2038-2044.

doi: 10.1038/s41591-022-01973-2. Epub 2022 Oct 10.

The Burden of Proof studies: assessing the evidence of risk

Peng Zheng^{1,2}, Ashkan Afshin^{1,2}, Stan Biryukov¹, Catherine Bisignano¹, Michael Brauer^{1,2,3}, Dana Bryazka¹, Katrin Burkart^{1,2}, Kelly M Cercy¹, Leslie Cornaby¹, Xiaochen Dai^{1,2}, M Ashworth Dirac^{1,2}, Kara Estep¹, Kairsten A Fay¹, Rachel Feldman¹, Alize J Ferrari^{1,2,4,5}, Emmanuela Gakidou^{1,2}, Gabriela Fernanda Gil¹, Max Griswold¹, Simon I Hay^{1,2}, Jiawei He¹, Caleb M S Irvine¹, Nicholas J Kassebaum^{1,2,6}, Kate E LeGrand¹, Haley Lescinsky¹, Stephen S Lim^{1,2}, Justin Lo¹, Erin C Mullany¹, Kanyin Liane Ong¹, Puja C Rao¹, Christian Razo¹, Marissa B Reitsma¹, Gregory A Roth^{1,2,7}, Damian F Santomauro^{1,2,4,5}, Reed J D Sorensen¹, Vinay Srinivasan¹, Jeffrey D Stanaway^{1,2}, Stein Emil Vollset^{1,2}, Theo Vos^{1,2}, Nelson Wang⁸, Catherine A Welgan¹, Sarah S Wozniak¹, Aleksandr Y Aravkin^#^{1,2,9}, Christopher J L Murray^#^{10,11}

Affiliations

¹ Institute for Health Metrics and Evaluation, University of Washington, Seattle, WA, USA.
² Department of Health Metrics Sciences, School of Medicine, University of Washington, Seattle, WA, USA.
³ School of Population and Public Health, University of British Columbia, Vancouver, British Columbia, Canada.
⁴ School of Public Health, The University of Queensland, Brisbane, Queensland, Australia.
⁵ Queensland Centre for Mental Health Research, Wacol, Queensland, Australia.
⁶ Department of Anesthesiology & Pain Medicine, University of Washington, Seattle, WA, USA.
⁷ Division of Cardiology, University of Washington, Seattle, WA, USA.
⁸ The George Institute for Global Health, The University of New South Wales, Sydney, New South Wales, Australia.
⁹ Department of Applied Mathematics, University of Washington, Seattle, WA, USA.
¹⁰ Institute for Health Metrics and Evaluation, University of Washington, Seattle, WA, USA. cjlm@uw.edu.
¹¹ Department of Health Metrics Sciences, School of Medicine, University of Washington, Seattle, WA, USA. cjlm@uw.edu.

^# Contributed equally.

PMID: 36216935
PMCID: PMC9556298
DOI: 10.1038/s41591-022-01973-2

Peng Zheng et al. Nat Med. 2022 Oct.

Abstract

Exposure to risks throughout life results in a wide variety of outcomes. Objectively judging the relative impact of these risks on personal and population health is fundamental to individual survival and societal prosperity. Existing mechanisms to quantify and rank the magnitude of these myriad effects and the uncertainty in their estimation are largely subjective, leaving room for interpretation that can fuel academic controversy and add to confusion when communicating risk. We present a new suite of meta-analyses, termed the Burden of Proof studies, designed specifically to help evaluate these methodological issues objectively and quantitatively. Through this data-driven approach that complements existing systems, including GRADE and Cochrane Reviews, we aim to aggregate evidence across multiple studies and enable a quantitative comparison of risk-outcome pairs. We introduce the burden of proof risk function (BPRF), which estimates the level of risk closest to the null hypothesis that is consistent with available data. Here we illustrate the BPRF methodology for the evaluation of four exemplar risk-outcome pairs: smoking and lung cancer, systolic blood pressure and ischemic heart disease, vegetable consumption and ischemic heart disease, and unprocessed red meat consumption and ischemic heart disease. The strength of evidence for each relationship is assessed by computing and summarizing the BPRF, and then translating the summary to a simple star rating. The Burden of Proof methodology provides a consistent way to understand, evaluate and summarize evidence of risk across different risk-outcome pairs, and informs risk analysis conducted as part of the Global Burden of Diseases, Injuries, and Risk Factors Study.
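
The idea of a risk level "closest to the null hypothesis that is consistent with available data" can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the normal approximation on the log relative risk scale, the one-sided 5% quantile, the combination of reported s.d. and between-study heterogeneity in quadrature, and the hypothetical function names `total_sd` and `conservative_log_rr`. It is not the authors' implementation.

```python
import math
from statistics import NormalDist


def total_sd(reported_sd: float, between_study_sd: float) -> float:
    """Combine reported s.d. and between-study heterogeneity.

    Assumption: the two sources are independent, so their variances add.
    """
    return math.sqrt(reported_sd ** 2 + between_study_sd ** 2)


def conservative_log_rr(mean_log_rr: float, sd: float, quantile: float = 0.05) -> float:
    """Return the bound of the uncertainty interval closest to the null (log RR = 0).

    Assumption: log relative risk is approximately normal, and `quantile`
    is the one-sided tail probability (5% here, chosen for illustration).
    """
    z = NormalDist().inv_cdf(1.0 - quantile)
    if mean_log_rr >= 0:
        # Harmful risk: take the lower bound, which lies nearer to zero.
        return mean_log_rr - z * sd
    # Protective risk: take the upper bound, which lies nearer to zero.
    return mean_log_rr + z * sd


if __name__ == "__main__":
    sd = total_sd(reported_sd=0.08, between_study_sd=0.06)
    bound = conservative_log_rr(mean_log_rr=0.5, sd=sd)
    print(f"combined s.d. = {sd:.3f}, conservative log RR = {bound:.3f}")
    print(f"conservative RR = {math.exp(bound):.3f}")
```

In this sketch, a conservative bound that crosses zero (changes sign relative to the mean) would indicate that the data are also consistent with no effect under the stated assumptions.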

Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1. Smoking and lung cancer.**
a, Log relative risk function for smoking and lung cancer. b, Relative risk function for smoking and lung cancer. c, A modified funnel plot for smoking and lung cancer showing the residuals (relative to 0) on the x axis and the estimated standard deviation (s.d.) that includes reported s.d. and between-study heterogeneity on the y axis.

**Fig. 2. Systolic blood pressure and ischemic heart disease.**
a, Log relative risk function for systolic blood pressure and ischemic heart disease. b, Relative risk function for systolic blood pressure and ischemic heart disease. c, A modified funnel plot for systolic blood pressure and ischemic heart disease showing the residuals (relative to 0) on the x axis and the estimated standard deviation (s.d.) that includes reported s.d. and between-study heterogeneity on the y axis.

**Fig. 3. Vegetable consumption and ischemic heart disease.**
a, Log relative risk function for vegetable consumption and ischemic heart disease. b, Relative risk function for vegetable consumption and ischemic heart disease. c, A modified funnel plot for vegetable consumption and ischemic heart disease showing the residuals (relative to 0) on the x axis and the estimated standard deviation (s.d.) that includes reported s.d. and between-study heterogeneity on the y axis.

**Fig. 4. Unprocessed red meat consumption and ischemic heart disease.**
a, Log relative risk function for unprocessed red meat consumption and ischemic heart disease. b, Relative risk function for unprocessed red meat consumption and ischemic heart disease. c, A modified funnel plot for unprocessed red meat consumption and ischemic heart disease showing the residuals (relative to 0) on the x axis and the estimated standard deviation (s.d.) that includes reported s.d. and between-study heterogeneity on the y axis.

**Fig. 5. Model validation for the data-rich scenario.**
a, Estimated risk curves across 100 realizations for all methods for non-log linear risks. b, Estimated risk curves across 100 realizations for all methods for log linear risks. Mrtool is the method used in this paper. Dosresmeta_1stage_ncs refers to the Dosresmeta package with a natural cubic spline, while Dosresmeta_1stage_qs refers to the same tool with a quadratic spline. Dosresmeta_2stage refers to the 2-stage approach in Dosresmeta. Metafor refers to a standard package that assumes a log linear relationship. (See ’Model validation’ for more details.)

**Extended Data Fig. 1. Scenario 1 mean risk curve.**
a, estimated risk curves across 100 realizations for all methods, non-log linear. b, estimated risk curves across 100 realizations for all methods, log linear. c, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. d, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. e, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. f, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. RMSE = root mean squared error.

**Extended Data Fig. 2. Scenario 1 burden of proof risk function.**
a, estimated risk curves across 100 realizations for all methods, non-log linear. b, estimated risk curves across 100 realizations for all methods, log linear. c, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. d, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. e, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. f, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. RMSE = root mean squared error.

**Extended Data Fig. 3. Scenario 2 mean risk curve.**
a, estimated risk curves across 100 realizations for all methods, non-log linear. b, estimated risk curves across 100 realizations for all methods, log linear. c, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. d, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. e, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. f, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. RMSE = root mean squared error.

**Extended Data Fig. 4. Scenario 2 burden of proof risk function.**
a, estimated risk curves across 100 realizations for all methods, non-log linear. b, estimated risk curves across 100 realizations for all methods, log linear. c, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. d, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. e, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. f, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. RMSE = root mean squared error.

**Extended Data Fig. 5. Scenario 3 mean risk curve.**
a, estimated risk curves across 100 realizations for all methods, non-log linear. b, estimated risk curves across 100 realizations for all methods, log linear. c, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. d, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. e, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. f, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. RMSE = root mean squared error.

**Extended Data Fig. 6. Scenario 3 burden of proof risk function.**
a, estimated risk curves across 100 realizations for all methods, non-log linear. b, estimated risk curves across 100 realizations for all methods, log linear. c, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. d, RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. e, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, non-log linear risks. f, log-scale RMSE results summarizing the errors of each method against simulated ground truth across different levels of true between-study heterogeneity, log linear risks. RMSE = root mean squared error.

**Extended Data Fig. 7. Simulation study for γ estimation as a function of the number of studies, using 500 realizations of each setting.**
a, 5th and 95th percentiles of the estimated γ values. b, violin plots to illustrate how the number of studies affects γ estimation. In both panels, it is clear that using FIM improves the quality of γ estimation. FIM = Fisher information matrix. γ = between-study heterogeneity parameter.

**Extended Data Fig. 8. Sensitivity study results comparing mixed effects model (teal) and fixed effects model (orange).**
a, smoking and lung cancer. b, systolic blood pressure and ischemic heart disease. c, vegetables and ischemic heart disease. d, unprocessed red meat and ischemic heart disease.

**Extended Data Fig. 9. Sensitivity study results comparing trimming (teal) and no trimming (orange).**
a, smoking and lung cancer, with trimming. b, smoking and lung cancer, no trimming. c, systolic blood pressure and ischemic heart disease, with trimming. d, systolic blood pressure and ischemic heart disease, no trimming. e, vegetables and ischemic heart disease, with trimming. f, vegetables and ischemic heart disease, no trimming. g, unprocessed red meat and ischemic heart disease, with trimming. h, unprocessed red meat and ischemic heart disease, no trimming.
