Crowdsourcing Skin Demarcations of Chronic Graft-Versus-Host Disease in Patient Photographs: Training Versus Performance Study
- PMID: 38147369
- PMCID: PMC10777279
- DOI: 10.2196/48589
Abstract
Background: Chronic graft-versus-host disease (cGVHD) is a significant cause of long-term morbidity and mortality in patients after allogeneic hematopoietic cell transplantation. Skin is the most commonly affected organ, and visual assessment of cGVHD can have low reliability. Crowdsourcing data from nonexpert participants has been used for numerous medical applications, including image labeling and segmentation tasks.
Objective: This study aimed to assess the ability of crowds of nonexpert raters (individuals without any prior training in identifying or marking cGVHD) to demarcate photos of cGVHD-affected skin. We also studied the effect of training and feedback on crowd performance.
Methods: Using a Canfield Vectra H1 3D camera, 360 photographs of the skin of 36 patients with cGVHD were taken. Ground truth demarcations were provided in 3D by a trained expert and reviewed by a board-certified dermatologist. In total, 3000 2D images (projections from various angles) were created for crowd demarcation through the DiagnosUs mobile app. Raters were split into high and low feedback groups. The performances of 4 different crowds of nonexperts were analyzed, including 17 raters per image for the low and high feedback groups, 32-35 raters per image for the low feedback group, and the top 5 performers for each image from the low feedback group.
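As an illustration of how several raters' demarcations of one image might be combined and scored, the following is a minimal sketch only. The abstract does not state how the crowd consensus was formed or exactly how surface area error was defined, so the pixel-wise majority vote and the error formula (absolute difference in marked area, as a percentage of skin pixels) are assumptions, as are the function names `majority_vote` and `surface_area_error` and the toy masks.

```python
def majority_vote(masks):
    """Pixel-wise majority vote over equally sized binary masks.

    Assumption: consensus = pixels marked by more than half of the raters.
    Each mask is a list of rows containing 0/1 values.
    """
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in masks) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]


def marked_area(mask, skin):
    """Count marked pixels that fall on skin."""
    return sum(
        1
        for mask_row, skin_row in zip(mask, skin)
        for m, s in zip(mask_row, skin_row)
        if m and s
    )


def surface_area_error(pred, truth, skin):
    """Assumed metric: |predicted area - true area| as a percentage of skin pixels."""
    skin_pixels = sum(sum(row) for row in skin)
    if skin_pixels == 0:
        raise ValueError("skin mask contains no skin pixels")
    diff = abs(marked_area(pred, skin) - marked_area(truth, skin))
    return 100.0 * diff / skin_pixels


# Toy 2x3 example (hypothetical data, not from the study).
skin = [[1, 1, 1], [1, 1, 0]]    # 5 skin pixels
truth = [[1, 1, 0], [0, 0, 0]]   # expert demarcation, area 2
raters = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 1, 1], [0, 0, 0]],
    [[1, 0, 1], [1, 0, 0]],
]

consensus = majority_vote(raters)
error = surface_area_error(consensus, truth, skin)
print(consensus, error)
```

In this toy case the consensus marks 3 pixels against an expert area of 2, giving an error of 1 of 5 skin pixels. A study-level summary such as a median would then be taken over the per-image errors.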
Results: Across 8 demarcation competitions, 130 raters were recruited to the high feedback group and 161 to the low feedback group. This resulted in a total of 54,887 individual demarcations from the high feedback group and 78,967 from the low feedback group. The nonexpert crowds achieved good overall performance for segmenting cGVHD-affected skin with minimal training, achieving a median surface area error of less than 12% of skin pixels for all crowds in both the high and low feedback groups. The low feedback crowds performed slightly worse than the high feedback crowd, even when a larger crowd was used. Tracking the 5 most reliable raters from the low feedback group for each image recovered a performance similar to that of the high feedback crowd. Higher variability between raters for a given image was not found to correlate with lower performance of the crowd consensus demarcation and therefore cannot be used as a measure of reliability. No significant learning was observed during the task as more photos and feedback were seen.
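The idea of tracking the most reliable raters for each image could be sketched as follows. This is a hypothetical illustration, not the authors' procedure: the abstract does not say how reliability was measured, so the Dice overlap against expert demarcations, the per-rater score table, and the names `dice` and `top_k_raters` are all assumptions.

```python
def dice(pred, truth):
    """Dice overlap between two flattened binary masks (lists of 0/1)."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def top_k_raters(rater_scores, raters_on_image, k=5):
    """Pick up to k raters with the highest reliability score for one image.

    rater_scores: dict of rater id -> assumed reliability score
                  (e.g. mean Dice on images with expert demarcations).
    raters_on_image: ids of raters who demarcated this image.
    Raters without a score are skipped; ties are broken by rater id.
    """
    eligible = [r for r in raters_on_image if r in rater_scores]
    return sorted(eligible, key=lambda r: (-rater_scores[r], r))[:k]


# Hypothetical reliability scores and one image's rater list.
scores = {"r1": 0.9, "r2": 0.5, "r3": 0.8, "r4": 0.7}
selected = top_k_raters(scores, ["r2", "r3", "r4", "r9"], k=2)
overlap = dice([1, 1, 0, 0], [1, 0, 0, 0])
print(selected, overlap)
```

Only the selected raters' demarcations would then feed into the consensus for that image, which is one way a small, well-chosen crowd could match a larger one.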
Conclusions: Crowds of nonexpert raters can demarcate cGVHD images with good overall performance. Tracking the top 5 most reliable raters provided optimal results, obtaining the best performance with the lowest number of expert demarcations required for adequate training. However, the agreement amongst individual nonexperts does not help predict whether the crowd has provided an accurate result. Future work should explore the performance of crowdsourcing in standard clinical photos and further methods to estimate the reliability of consensus demarcations.
Keywords: artificial intelligence; cGVHD; crowdsourcing; dermatology; feasibility; graft-versus-host disease; imaging; labeling; medical image; segmentation; skin.
©Andrew J McNeil, Kelsey Parks, Xiaoqi Liu, Bohan Jiang, Joseph Coco, Kira McCool, Daniel Fabbri, Erik P Duhaime, Benoit M Dawant, Eric R Tkaczyk. Originally published in JMIR Dermatology (http://derma.jmir.org), 26.12.2023.
Conflict of interest statement
Conflicts of Interest: EPD is the chief executive officer of Centaur Labs and holds shares in the company. KM is an employee of Centaur Labs.