A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability
- PMID: 33735069
- PMCID: PMC7618278
- DOI: 10.1016/S2589-7500(20)30240-5
Erratum in
- Correction to Lancet Digit Health 2020; published online Oct 1. https://doi.org/10.1016/S2589-7500(20)30240-5. Lancet Digit Health. 2021 Jan;3(1):e7. doi: 10.1016/S2589-7500(20)30290-9. Epub 2020 Dec 1. Lancet Digit Health. 2021. PMID: 33735071 No abstract available.
Abstract
Health data that are publicly available are valuable resources for digital health research. Several public datasets containing ophthalmological imaging have been frequently used in machine learning research; however, the total number of datasets containing ophthalmological health information and their respective content is unclear. This Review aimed to identify all publicly available ophthalmological imaging datasets, detail their accessibility, describe which diseases and populations are represented, and report on the completeness of the associated metadata. With the use of MEDLINE, Google's search engine, and Google Dataset Search, we identified 94 open access datasets containing 507 724 images and 125 videos from 122 364 patients. Most datasets originated from Asia, North America, and Europe. Disease populations were unevenly represented, with glaucoma, diabetic retinopathy, and age-related macular degeneration disproportionately overrepresented in comparison with other eye diseases. The reporting of basic demographic characteristics such as age, sex, and ethnicity was poor, even at the aggregate level. This Review provides greater visibility for ophthalmological datasets that are publicly available as powerful resources for research. Our paper also exposes an increasing divide in the representation of different population and disease groups in health data repositories. The improved reporting of metadata would enable researchers to access the most appropriate datasets for their needs and maximise the potential of such resources.
Copyright © 2021 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.
Conflict of interest statement
XL received a proportion of her funding from the Wellcome Trust, through a Health Improvement Challenge grant (200141/Z/15/Z). LF reports an award from Bayer; personal fees from Allergan; and non-financial support from Allergan, outside the submitted work. PAK reports personal fees from DeepMind, Roche, Novartis, Apellis, Heidelberg Engineering, Topcon, Allergan, Bayer, and Big Picture Medical, outside the submitted work; and is supported by the Moorfields Eye Charity Career Development Award (R190028A) and a UK Research & Innovation Future Leaders Fellowship (MR/T019050/1). MJB is supported by the Wellcome Trust (207472/Z/17/Z). AKD received a proportion of his funding from the Department of Health’s National Institute for Health Research Biomedical Research Centre for Ophthalmology at Moorfields Eye Hospital, University College London Institute of Ophthalmology, Health Data Research UK (London, UK), and the Wellcome Trust, through a Health Improvement Challenge grant (200141/Z/15/Z). All other authors declare no competing interests.
