Challenges in mapping European rare disease databases, relevant for ML-based screening technologies in terms of organizational, FAIR and legal principles: scoping review
- PMID: 37780450
- PMCID: PMC10540868
- DOI: 10.3389/fpubh.2023.1214766
Abstract
Background: Given the increased availability of data sources such as hospital information systems, electronic health records, and health-related registries, a novel approach is required to develop artificial intelligence-based decision support that can assist clinicians in their diagnostic decision-making and shorten rare disease patients' diagnostic odyssey. This review aims to identify key challenges in mapping European rare disease databases that are relevant to machine learning (ML)-based screening technologies, in terms of organizational, FAIR and legal principles.
Methods: A scoping review was conducted based on the PRISMA-ScR checklist. The primary article search was conducted in three electronic databases (MEDLINE/PubMed, Scopus, and Web of Science), and a secondary search was performed in Google Scholar and on the organizations' websites. Each step of this review was carried out independently by two researchers. A charting form for analyzing relevant studies was developed and used to categorize data and identify data items in three domains: organizational, FAIR and legal.
Results: At the end of the screening process, 73 studies were eligible for review based on the inclusion and exclusion criteria; more than 60% (n = 46) of them were published in the last 5 years and originated only from EU/EEA countries. Over the ten-year period (2013-2022), publications followed a clear cyclical trend, with reporting of challenges peaking every four years. Within this trend, the following dynamic was identified: except for 2016, organizational challenges dominated the articles published up to 2018, while legal challenges were the most frequently discussed topic from 2018 to 2022. The data items were distributed by domain as follows: (1) organizational (n = 36): data accessibility and sharing (20.2%); long-term sustainability (18.2%); governance, planning and design (17.2%); lack of harmonization and standardization (17.2%); quality of data collection (16.2%); and privacy risks and small sample size (11.1%); (2) FAIR (n = 15): findable (17.9%); accessible sustainability (25.0%); interoperable (39.3%); and reusable (17.9%); and (3) legal (n = 33): data protection by all means (34.4%); data management and ownership (22.9%); research under GDPR and member state law (20.8%); trust and transparency (13.5%); and digitalization of health (8.3%). During data charting and data item identification, one pattern recurred in all domains: in addition to the outlined challenges, the publications also discussed good practices, guidelines, and recommendations. Publications addressing only good practices, guidelines, and recommendations for overcoming challenges when mapping rare disease (RD) databases in at least one domain accounted for 47.9% (n = 35) of the total.
Conclusion: Despite the opportunities provided by innovation (automation, electronic health records, hospital-based information systems, biobanks, rare disease registries and European Reference Networks), the results of this scoping review reveal a diverse set of challenges that must still be addressed. Immediate action is needed to ensure better governance of rare disease registries, implement FAIR principles, and enhance the EU legal framework.
Keywords: European Reference Networks (ERNs); artificial intelligence; electronic health records; issues; limitations; machine learning; rare disease registry.
Copyright © 2023 Raycheva, Kostadinov, Mitova, Bogoeva, Iskrov, Stefanov and Stefanov.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.