Proc Natl Acad Sci U S A. 2018 Jun 12;115(24):6171-6176.
doi: 10.1073/pnas.1721355115. Epub 2018 May 29.

Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms

P Jonathon Phillips et al. Proc Natl Acad Sci U S A. 2018.

Abstract

Achieving the upper limits of face identification accuracy in forensic applications can minimize errors that have profound social and personal consequences. Although forensic examiners identify faces in these applications, systematic tests of their accuracy are rare. How can we achieve the most accurate face identification: using people and/or machines working alone or in collaboration? In a comprehensive comparison of face identification by humans and computers, we found that forensic facial examiners, facial reviewers, and superrecognizers were more accurate than fingerprint examiners and students on a challenging face identification test. Individual performance on the test varied widely. On the same test, four deep convolutional neural networks (DCNNs), developed between 2015 and 2017, identified faces within the range of human accuracy. Accuracy of the algorithms increased steadily over time, with the most recent DCNN scoring above the median of the forensic facial examiners. Using crowd-sourcing methods, we fused the judgments of multiple forensic facial examiners by averaging their rating-based identity judgments. Accuracy was substantially better for fused judgments than for individuals working alone. Fusion also served to stabilize performance, boosting the scores of lower-performing individuals and decreasing variability. Single forensic facial examiners fused with the best algorithm were more accurate than the combination of two examiners. Therefore, collaboration among humans and between humans and machines offers tangible benefits to face identification accuracy in important applications. These results offer an evidence-based roadmap for achieving the most accurate face identification possible.

Keywords: face identification; face recognition algorithm; forensic science; machine learning technology; wisdom-of-crowds.
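
The fusion step described in the abstract (averaging rating-based identity judgments across people) can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `fuse`, the two raters, and the six image pairs are hypothetical, and the ratings are assumed to lie on a scale from -3 (different people) to +3 (same person), consistent with the +3 rating mentioned in the Fig. 5 caption. Accuracy is measured as AUC, the measure reported in the figure captions, computed here with the standard rank-based (Mann-Whitney) formula.

```python
from statistics import mean

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    labels: True for same-identity pairs, False for different-identity pairs.
    Ties between a same-identity and a different-identity score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fuse(ratings_by_rater):
    """Average the ratings given to each image pair across raters."""
    return [mean(column) for column in zip(*ratings_by_rater)]

# Hypothetical data (not from the study): six image pairs, two raters.
labels = [True, True, True, False, False, False]
rater_a = [3, 1, -1, -2, 0, -3]
rater_b = [2, -1, 2, 1, -3, -2]

fused = fuse([rater_a, rater_b])

print(auc(rater_a, labels))  # individual accuracy of rater A
print(auc(rater_b, labels))  # individual accuracy of rater B
print(auc(fused, labels))    # accuracy of the averaged ratings
```

In this toy data each rater makes one ordering error on a different image pair, so the averaged ratings separate the two classes better than either rater alone. Whether and by how much fusion helps on real data depends on how correlated the raters' errors are.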

Conflict of interest statement

The University of Maryland is filing a US patent application that will cover portions of algorithms A2017a and A2017b. R.R., C.D.C., and R.C. are coinventors on this patent.

Figures

Fig. 1.
Examples highlighting the face region in the images used in this study (all image pairs are shown in SI Appendix, Figs. S8–S14). (Left) This pair is a same identity pair, and (Right) this pair shows a different identity pair.
Fig. 2.
Human and machine accuracy. Black dots indicate AUCs of individual participants; red dots are group medians. In the algorithms column, red dots indicate algorithm accuracy. Face specialists (facial examiners, facial reviewers, and superrecognizers) surpassed fingerprint examiners, who surpassed the students. The violin plot outlines are estimates of the density for the AUC distribution for the subject groups. The dashed horizontal line marks the accuracy of a 95th percentile student. All algorithms perform in the range of human performance. The best algorithm places slightly above the forensic examiners’ median.
Fig. 3.
Plots illustrate the effectiveness of fusing multiple participants within groups. For all groups, combining judgments by simple averaging is effective. The violin plots in Upper show the distribution of AUCs for fusing examiners. Red circles indicate median AUCs. In Lower, the medians of the AUC distributions for the examiners, reviewers, superrecognizers, fingerprint examiners, and students appear. The median AUC reaches 1.0 for fusing four examiners or fusing three superrecognizers. The median AUC of fusing 10 students was 0.88, substantially below the median AUC for individual examiner accuracy.
Fig. 4.
Fusion of examiners and algorithms. Violin plots show the distribution of AUCs for each fusion test. Red dots indicate median AUCs. The distribution of individual examiners and the fusion of two examiners appear in columns 1 and 2. Also, algorithm performance appears in column 7. In between, plots show the forensic facial examiners fused with each of the four algorithms. Fusing one examiner and A2017b is more accurate than fusing two examiners, fusing examiners and A2017a or A2016 is equivalent to fusing two examiners, and fusing examiners with A2015 does not improve accuracy over a single examiner.
Fig. 5.
Estimated probability of highly confident same person ratings (+3 judgment, strong evidence the same person) when the identities are different and estimated probability of highly confident different person ratings (−3 judgment, strong evidence different people) when the identity is the same. The 95% confidence intervals are shown.

Comment in

  • Human-computer teams are best.
    Sutherland ME. Nat Hum Behav. 2018 Jul;2(7):444. doi: 10.1038/s41562-018-0375-7. PMID: 31097803.

