Risk Anal. 2022 Feb;42(2):264-278. doi: 10.1111/risa.13725. Epub 2021 Apr 16.

What is a Good Calibration Question?

Victoria Hemming et al. Risk Anal. 2022 Feb.

Abstract

Weighted aggregation of expert judgments based on their performance on calibration questions may improve mathematically aggregated judgments relative to equal weights. However, obtaining validated, relevant calibration questions can be difficult. If so, should analysts settle for equal weights? Or should they use calibration questions that are easier to obtain but less relevant? In this article, we examine what happens to the out-of-sample performance of weighted aggregations under the classical model (CM), compared to equally weighted aggregations, when the set of calibration questions includes many so-called "irrelevant" questions: questions that would ordinarily be considered outside the domain of the questions of interest. We find that performance-weighted aggregations outperform equal weights on the combined CM score, but not on statistical accuracy (i.e., calibration). Importantly, there was no appreciable difference in performance when weights were developed on relevant versus irrelevant questions. Experts were unable to adapt their knowledge across vastly different domains, and in-sample validation did not accurately predict out-of-sample performance on irrelevant questions. We suggest that if relevant calibration questions cannot be found, analysts should use equal weights and draw on alternative techniques to improve judgments. Our study also indicates limits to the predictive accuracy of performance-weighted aggregation, and to the degree to which expertise can be adapted across domains. We note limitations of our study and urge further research into the effect of question type on the reliability of performance-weighted aggregations.

Keywords: Aggregation; calibration; equal weights; expert judgment; performance weights.
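To make the mechanics concrete: the classical model (CM) scores each expert on calibration questions with known answers, multiplying a statistical-accuracy (calibration) score by an information score to produce the combined CM score, which is then normalized into performance weights. The Python sketch below illustrates that pipeline under simplifying assumptions; the experts, seed questions, and intrinsic ranges are invented for illustration, and the optimized significance-level cutoff used in the full CM is omitted.

    import numpy as np
    from scipy.stats import chi2

    # Mass expected to fall between the 5th/50th/95th percentile assessments.
    BIN_PROBS = np.array([0.05, 0.45, 0.45, 0.05])

    def calibration_score(quantile_sets, realizations):
        """Statistical accuracy: p-value of the asymptotic chi-square test that
        realizations fall into the inter-quantile bins with probs BIN_PROBS."""
        counts = np.zeros(len(BIN_PROBS))
        for q, x in zip(quantile_sets, realizations):
            counts[np.searchsorted(q, x)] += 1
        s = counts / len(realizations)
        mask = s > 0
        kl = np.sum(s[mask] * np.log(s[mask] / BIN_PROBS[mask]))  # I(s, p)
        return 1.0 - chi2.cdf(2 * len(realizations) * kl, df=len(BIN_PROBS) - 1)

    def information_score(quantile_sets, ranges):
        """Average relative information against a uniform background measure on
        each question's intrinsic range (a KL divergence, so always >= 0)."""
        per_item = []
        for q, (lo, hi) in zip(quantile_sets, ranges):
            widths = np.diff(np.concatenate(([lo], q, [hi]))) / (hi - lo)
            per_item.append(np.sum(BIN_PROBS * np.log(BIN_PROBS / widths)))
        return float(np.mean(per_item))

    # Toy data: three hypothetical experts give 5th/50th/95th percentiles for
    # ten calibration ("seed") questions with known realizations.
    rng = np.random.default_rng(1)
    realizations = rng.uniform(10, 90, size=10)
    experts = {
        "A": [np.sort(r + rng.normal(0, 5, 3)) for r in realizations],   # roughly calibrated
        "B": [np.sort(r + rng.normal(20, 2, 3)) for r in realizations],  # biased, overconfident
        "C": [np.sort(r + rng.normal(0, 40, 3)) for r in realizations],  # very diffuse
    }

    # Intrinsic range per question: span of all assessments plus the
    # realization, padded by a 10% overshoot so quantiles sit strictly inside.
    ranges = []
    for i, r in enumerate(realizations):
        vals = np.concatenate([[r]] + [experts[e][i] for e in experts])
        pad = 0.10 * (vals.max() - vals.min())
        ranges.append((vals.min() - pad, vals.max() + pad))

    combined = {e: calibration_score(qs, realizations) * information_score(qs, ranges)
                for e, qs in experts.items()}
    total = sum(combined.values())
    pw = {e: c / total for e, c in combined.items()}  # performance weights (no cutoff)
    ew = {e: 1.0 / len(experts) for e in experts}     # equal weights
    print("combined CM scores: ", combined)
    print("performance weights:", pw)
    print("equal weights:      ", ew)

In a full elicitation, these weights would then be applied to the experts' distributions on the questions of interest; the abstract's in-sample versus out-of-sample contrast asks whether weights fitted on one set of calibration questions continue to perform well on questions from a different domain.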

