Understanding the Nature of Metadata: Systematic Review
- PMID: 35014967
- PMCID: PMC8790684
- DOI: 10.2196/25440
Abstract
Background: Metadata are created to describe the corresponding data in a detailed and unambiguous way and are used for various applications in different research areas, for example, data identification and classification. However, a clear definition of metadata is crucial for further use. Unfortunately, extensive experience with the processing and management of metadata has shown that the term "metadata" and its use are not always unambiguous.
Objective: This study aimed to understand the definition of metadata and the challenges resulting from metadata reuse.
Methods: A systematic literature search was performed in this study following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines for reporting on systematic reviews. Five research questions were identified to streamline the review process, addressing metadata characteristics, metadata standards, use cases, and problems encountered. This review was preceded by a harmonization process to achieve a general understanding of the terms used.
Results: The harmonization process resulted in a clear set of definitions for metadata processing focusing on data integration. The subsequent literature review was conducted by 10 reviewers with different backgrounds, using the harmonized definitions. After various filtering steps were applied to identify the most relevant papers, this study included 81 peer-reviewed papers from the last decade. The 5 research questions could be answered, resulting in a broad overview of the standards, use cases, problems, and corresponding solutions for the application of metadata in different research areas.
Conclusions: Metadata can be a powerful tool for identifying, describing, and processing information, but creating meaningful metadata is costly and challenging. This review process uncovered many standards, use cases, problems, and solutions for dealing with metadata. The presented harmonized definitions and the new schema have the potential to improve the classification and generation of metadata by creating a shared understanding of metadata and their context.
Keywords: data classification; data identification; data integration; metadata; metadata definition; systematic review.
©Hannes Ulrich, Ann-Kristin Kock-Schoppenhauer, Noemi Deppenwiese, Robert Gött, Jori Kern, Martin Lablans, Raphael W Majeed, Mark R Stöhr, Jürgen Stausberg, Julian Varghese, Martin Dugas, Josef Ingenerf. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 11.01.2022.
Conflict of interest statement
Conflicts of Interest: None declared.