Reliance on metrics is a fundamental challenge for AI

Rachel L Thomas¹, David Uminsky²

PMID: 35607624
PMCID: PMC9122957
DOI: 10.1016/j.patter.2022.100476

Review

Rachel L Thomas et al. Patterns (N Y). 2022 May 13;3(5):100476. doi: 10.1016/j.patter.2022.100476.

Affiliations

¹ Queensland University of Technology, Brisbane, QLD, Australia.
² University of Chicago, Chicago, IL, USA.

Abstract

Through a series of case studies, we review how the unthinking pursuit of metric optimization can lead to real-world harms, including recommendation systems promoting radicalization, well-loved teachers fired by an algorithm, and essay grading software that rewards sophisticated garbage. The metrics used are often proxies for underlying, unmeasurable quantities (e.g., "watch time" of a video as a proxy for "user satisfaction"). We propose an evidence-based framework to mitigate such harms by (1) using a slate of metrics to get a fuller and more nuanced picture; (2) conducting external algorithmic audits; (3) combining metrics with qualitative accounts; and (4) involving a range of stakeholders, including those who will be most impacted.

Keywords: DSML 1: Concept: Basic principles of a new data science output observed and reported.

Conflict of interest statement

The authors declare no competing interests.
