The cure: design and evaluation of a crowdsourcing game for gene selection for breast cancer survival prediction
- PMID: 25654473
- PMCID: PMC4307816
- DOI: 10.2196/games.3350
Abstract
Background: Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility, and biological interpretability. Methods that take advantage of structured prior knowledge (eg, protein interaction networks) show promise in helping to define better signatures, but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes unheard of before.
Objective: The main objective of this study was to test the hypothesis that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from players of an open, Web-based game. We envisioned capturing knowledge both from the player's prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game.
Methods: We developed and evaluated an online game called The Cure that captured information from players regarding genes for use as predictors of breast cancer survival. Information gathered from game play was aggregated using a voting approach and used to create rankings of genes. The top genes from these rankings were evaluated by annotation enrichment analysis, by comparison to prior predictor gene sets, and by using them to train and test machine learning systems for predicting 10-year survival.
Results: Between its launch in September 2012 and September 2013, The Cure attracted more than 1000 registered players, who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data showed significant enrichment for genes known to be related to key concepts such as cancer, disease progression, and recurrence. In terms of the predictive accuracy of models trained using this information, these gene sets provided comparable performance to gene sets generated using other methods, including those used in commercial tests. The Cure is available on the Internet.
Conclusions: The principal contribution of this work is to show that crowdsourcing games can be developed as a means to address problems involving domain knowledge. While most prior work on scientific discovery games and crowdsourcing in general takes as a premise that contributors have little or no expertise, here we demonstrated a crowdsourcing system that succeeded in capturing expert knowledge.
Keywords: Web applications; artificial intelligence; breast neoplasms; collaborative and social computing systems and tools; computer games; crowdsourcing; feature selection; gene expression; supervised learning; survival analysis.
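Illustrative sketch: The Methods describe aggregating game play with a voting approach to produce gene rankings, but the abstract does not give the exact scheme. The Python below is a minimal sketch of one plausible reading, in which each game contributes one vote per gene it selected and genes are ranked by total votes. The function name `rank_genes_by_votes`, the `min_votes` parameter, the tie-breaking rule, and the example gene lists are all hypothetical and are not taken from the study.

```python
from collections import Counter


def rank_genes_by_votes(games, min_votes=1):
    """Rank genes by the number of games in which they were selected.

    `games` is an iterable of gene-symbol lists, one list per played game.
    This is an assumed, simplified voting scheme, not the authors' method.
    """
    votes = Counter()
    for selected_genes in games:
        # Count each gene at most once per game, even if it appears twice.
        votes.update(set(selected_genes))

    ranked = [(gene, n) for gene, n in votes.items() if n >= min_votes]
    # Sort by descending vote count, then alphabetically to break ties.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical game records, used only to exercise the function.
games = [
    ["BRCA1", "TP53", "ESR1"],
    ["TP53", "ERBB2"],
    ["TP53", "ESR1", "ESR1"],
]
ranking = rank_genes_by_votes(games)
for gene, n in ranking:
    print(gene, n)
```

The top of such a ranking could then be passed to enrichment analysis or used as the feature set for a survival classifier, as the Methods outline; a real aggregation might also weight votes by player performance, which this sketch does not attempt.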
Conflict of interest statement
Conflicts of Interest: None declared.