Proc Natl Acad Sci U S A. 2018 Jun 12;115(24):6171-6176.

doi: 10.1073/pnas.1721355115. Epub 2018 May 29.

Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms

Affiliations

¹ Information Access Division, National Institute of Standards and Technology, Gaithersburg, MD 20899; jonathon@nist.gov.
² Information Access Division, National Institute of Standards and Technology, Gaithersburg, MD 20899.
³ School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080.
⁴ Department of Electrical and Computer Engineering, University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20854.
⁵ University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20854.
⁶ School of Psychology, The University of New South Wales, Sydney, NSW 2052, Australia.

PMID: 29844174
PMCID: PMC6004481
DOI: 10.1073/pnas.1721355115

P Jonathon Phillips et al. Proc Natl Acad Sci U S A. 2018.

Abstract

Achieving the upper limits of face identification accuracy in forensic applications can minimize errors that have profound social and personal consequences. Although forensic examiners identify faces in these applications, systematic tests of their accuracy are rare. How can we achieve the most accurate face identification: using people and/or machines working alone or in collaboration? In a comprehensive comparison of face identification by humans and computers, we found that forensic facial examiners, facial reviewers, and superrecognizers were more accurate than fingerprint examiners and students on a challenging face identification test. Individual performance on the test varied widely. On the same test, four deep convolutional neural networks (DCNNs), developed between 2015 and 2017, identified faces within the range of human accuracy. Accuracy of the algorithms increased steadily over time, with the most recent DCNN scoring above the median of the forensic facial examiners. Using crowd-sourcing methods, we fused the judgments of multiple forensic facial examiners by averaging their rating-based identity judgments. Accuracy was substantially better for fused judgments than for individuals working alone. Fusion also served to stabilize performance, boosting the scores of lower-performing individuals and decreasing variability. Single forensic facial examiners fused with the best algorithm were more accurate than the combination of two examiners. Therefore, collaboration among humans and between humans and machines offers tangible benefits to face identification accuracy in important applications. These results offer an evidence-based roadmap for achieving the most accurate face identification possible.
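The fusion step described in the abstract, averaging rating-based identity judgments across people and scoring the result by AUC, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the ratings are synthetic, the integer scale from -3 to +3 is inferred from the endpoints named in the Fig. 5 caption, and the helper names `auc` and `fuse` are hypothetical. It is not the authors' analysis code.

```python
from statistics import mean

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Counts the fraction of (same-identity, different-identity) pairs in which
    the same-identity item received the higher score; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fuse(ratings_by_rater):
    """Fuse raters by averaging their ratings for each image pair."""
    return [mean(column) for column in zip(*ratings_by_rater)]

# Synthetic toy data (NOT from the study): 1 = same identity, 0 = different.
labels = [1, 1, 1, 0, 0, 0]
raters = {
    "r1": [3, 1, -1, -2, 1, -3],
    "r2": [2, -1, 2, -3, -1, 1],
    "r3": [-1, 2, 1, 1, -2, -2],
}

individual_aucs = {name: auc(r, labels) for name, r in raters.items()}
fused_auc = auc(fuse(list(raters.values())), labels)

print(individual_aucs)
print(fused_auc)
```

The toy ratings were chosen so that each rater makes a different mistake: each scores about 0.83 alone, while the three-way average separates the two classes completely (AUC of 1.0). This only illustrates the mechanism by which averaging can cancel uncorrelated errors and says nothing about the effect sizes reported in the study.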

Keywords: face identification; face recognition algorithm; forensic science; machine learning technology; wisdom-of-crowds.

Conflict of interest statement

The University of Maryland is filing a US patent application that will cover portions of algorithms A2017a and A2017b. R.R., C.D.C., and R.C. are coinventors on this patent.

Figures

**Fig. 1.**
Examples highlighting the face region in the images used in this study (all image pairs are shown in *SI Appendix*, Figs. S8–S14). (*Left*) This pair is a same identity pair, and (*Right*) this pair shows a different identity pair.

**Fig. 2.**
Human and machine accuracy. Black dots indicate AUCs of individual participants; red dots are group medians. In the algorithms column, red dots indicate algorithm accuracy. Face specialists (facial examiners, facial reviewers, and superrecognizers) surpassed fingerprint examiners, who surpassed the students. The violin plot outlines are estimates of the density for the AUC distribution for the subject groups. The dashed horizontal line marks the accuracy of a 95th percentile student. All algorithms perform in the range of human performance. The best algorithm places slightly above the forensic examiners’ median.

**Fig. 3.**
Plots illustrate the effectiveness of fusing multiple participants within groups. For all groups, combining judgments by simple averaging is effective. The violin plots in *Upper* show the distribution of AUCs for fusing examiners. Red circles indicate median AUCs. In *Lower*, the medians of the AUC distributions for the examiners, reviewers, superrecognizers, fingerprint examiners, and students appear. The median AUC reaches 1.0 for fusing four examiners or fusing three superrecognizers. The median AUC of fusing 10 students was 0.88, substantially below the median AUC for individual examiner accuracy.

**Fig. 4.**
Fusion of examiners and algorithms. Violin plots show the distribution of AUCs for each fusion test. Red dots indicate median AUCs. The distribution of individual examiners and the fusion of two examiners appear in columns 1 and 2. Also, algorithm performance appears in column 7. In between, plots show the forensic facial examiners fused with each of the four algorithms. Fusing one examiner and A2017b is more accurate than fusing two examiners, fusing examiners and A2017a or A2016 is equivalent to fusing two examiners, and fusing examiners with A2015 does not improve accuracy over a single examiner.

**Fig. 5.**
Estimated probability of highly confident same person ratings (+3 judgment, strong evidence the same person) when the identities are different and estimated probability of highly confident different person ratings (-3 judgment, strong evidence different people) when the identity is the same. The 95% confidence intervals are shown.
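The two quantities in the Fig. 5 caption are conditional proportions: how often a +3 rating is given to a different-identity pair, and how often a -3 rating is given to a same-identity pair. A rough sketch of how such rates and a 95% interval could be computed follows. The caption does not say which interval method was used, so the Wilson score interval here is an assumption, as are the synthetic ratings and the hypothetical function names.

```python
from math import sqrt

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def confident_error_counts(ratings, labels):
    """Count highly confident errors.

    Returns ((k_same, n_diff), (k_diff, n_same)) where
    k_same = +3 ratings given to different-identity pairs (label 0),
    k_diff = -3 ratings given to same-identity pairs (label 1).
    """
    diff = [r for r, y in zip(ratings, labels) if y == 0]
    same = [r for r, y in zip(ratings, labels) if y == 1]
    return (diff.count(3), len(diff)), (same.count(-3), len(same))

# Synthetic toy data (NOT from the study): 1 = same identity, 0 = different.
ratings = [3, -3, 3, 1, -3, 2]
labels = [0, 1, 1, 0, 1, 0]

(k_same, n_diff), (k_diff, n_same) = confident_error_counts(ratings, labels)
print(k_same / n_diff, wilson_interval(k_same, n_diff))
print(k_diff / n_same, wilson_interval(k_diff, n_same))
```

The Wilson interval is used in this sketch because it stays within [0, 1] and behaves sensibly when the error count is zero or very small, which is the expected regime for highly confident errors by trained examiners.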

Comment in

Sutherland ME. Human-computer teams are best. Nat Hum Behav. 2018 Jul;2(7):444. doi: 10.1038/s41562-018-0375-7. PMID: 31097803. No abstract available.
