Reliance on metrics is a fundamental challenge for AI
- PMID: 35607624
- PMCID: PMC9122957
- DOI: 10.1016/j.patter.2022.100476
Abstract
Through a series of case studies, we review how the unthinking pursuit of metric optimization can lead to real-world harms, including recommendation systems promoting radicalization, well-loved teachers fired by an algorithm, and essay grading software that rewards sophisticated garbage. The metrics used are often proxies for underlying, unmeasurable quantities (e.g., "watch time" of a video as a proxy for "user satisfaction"). We propose an evidence-based framework to mitigate such harms by (1) using a slate of metrics to get a fuller and more nuanced picture; (2) conducting external algorithmic audits; (3) combining metrics with qualitative accounts; and (4) involving a range of stakeholders, including those who will be most impacted.
Keywords: DSML 1: Concept: Basic principles of a new data science output observed and reported.
© 2022.
Conflict of interest statement
The authors declare no competing interests.
