2024 Aug 1:16:e56237.
doi: 10.2196/56237.

Making Metadata Machine-Readable as the First Step to Providing Findable, Accessible, Interoperable, and Reusable Population Health Data: Framework Development and Implementation Study

David Amadi et al. Online J Public Health Inform.

Abstract

Background: Metadata describe and provide context for other data, and they play a pivotal role in realizing the findable, accessible, interoperable, and reusable (FAIR) data principles. By providing comprehensive, machine-readable descriptions of digital resources, metadata allow both machines and human users to discover, access, integrate, and reuse data or content across diverse platforms and applications. However, existing metadata for population health data are difficult to access and hard for machines to interpret, which hinders effective data discovery and reuse.

Objective: To address these challenges, we propose a comprehensive framework using standardized formats, vocabularies, and protocols to render population health data machine-readable, significantly enhancing their FAIRness and enabling seamless discovery, access, and integration across diverse platforms and research applications.

Methods: The framework implements a 3-stage approach. The first stage is Data Documentation Initiative (DDI) integration, in which the DDI Codebook is used to record metadata and detailed documentation for the data and associated assets, ensuring transparency and comprehensiveness. The second stage is Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) standardization, in which the data are harmonized and standardized into the OMOP CDM, facilitating unified analysis across heterogeneous data sets. The third stage is the integration of Schema.org and JavaScript Object Notation for Linked Data (JSON-LD), in which machine-readable metadata are generated using Schema.org entities and embedded within the data using JSON-LD, improving discoverability and comprehension for both machines and human users. We demonstrated the implementation of these 3 stages using Integrated Disease Surveillance and Response (IDSR) data from Malawi and Kenya.
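The third stage can be illustrated with a minimal sketch that builds a Schema.org description and serializes it as JSON-LD. This is not the authors' published code: the helper name `build_dataset_jsonld`, the choice of the Schema.org `Dataset` type, and every field value below are assumptions made for illustration only.

```python
import json


def build_dataset_jsonld(name, description, countries, keywords):
    """Return a Schema.org Dataset description as a JSON-LD-ready dict.

    Hypothetical helper: the property selection is an assumption,
    not the property set used in the study.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "keywords": list(keywords),
        # One Place entry per country covered by the data.
        "spatialCoverage": [{"@type": "Place", "name": c} for c in countries],
    }


# Illustrative values only; they are not taken from the real IDSR metadata.
metadata = build_dataset_jsonld(
    name="IDSR surveillance data (illustrative record)",
    description="Example description of a population health data set.",
    countries=["Malawi", "Kenya"],
    keywords=["IDSR", "OMOP CDM", "population health"],
)

# Serialize to JSON-LD text that can be stored or published with the data.
jsonld_text = json.dumps(metadata, indent=2)
print(jsonld_text)
```

Keeping the metadata as a plain dictionary until the last step makes it straightforward to fill the fields from an upstream source such as a DDI Codebook export, although that mapping is not shown here.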

Results: The implementation of our framework enhanced the FAIRness of population health data, improving discoverability through integration with platforms such as Google Dataset Search. The adoption of standardized formats and protocols streamlined data accessibility and integration across research environments, fostering collaboration and knowledge sharing. Additionally, the use of machine-interpretable metadata enabled researchers to efficiently reuse data for targeted analyses, thereby increasing the overall value of population health resources. The JSON-LD code is accessible via a GitHub repository, and the HTML code integrated with JSON-LD is available on the Implementation Network for Sharing Population Information from Research Entities website.
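Integrating JSON-LD with HTML generally means placing the serialized metadata in a `script` element of type `application/ld+json`, which is the form that dataset crawlers read. The sketch below shows that pattern and checks it by parsing the page back with the standard library. The names `embed_jsonld` and `JsonLdExtractor`, and the sample record, are hypothetical and are not drawn from the project's repository or website.

```python
import json
from html.parser import HTMLParser


def embed_jsonld(metadata):
    """Wrap a metadata dict in a JSON-LD script element (hypothetical helper)."""
    # Escape "</" so the payload cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(metadata, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


class JsonLdExtractor(HTMLParser):
    """Collect every JSON-LD block found in an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._chunks)))
            self._in_jsonld = False


# Illustrative record; not the real metadata from the study.
meta = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "IDSR surveillance data (illustrative record)",
}

page = f"<html><head>{embed_jsonld(meta)}</head><body></body></html>"

extractor = JsonLdExtractor()
extractor.feed(page)
extractor.close()
print(extractor.blocks)
```

Reading the block back out of the page is a cheap way to confirm that the embedded metadata survive the round trip unchanged before the page is published.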

Conclusions: The adoption of machine-readable metadata standards is essential for ensuring the FAIRness of population health data. By embracing these standards, organizations can enhance the visibility, accessibility, and utility of diverse resources, leading to a broader impact, particularly in low- and middle-income countries. Machine-readable metadata can accelerate research, improve health care decision-making, and ultimately promote better health outcomes for populations worldwide.

Keywords: DDI; Data Documentation Initiative; FAIR data principles; JSON-LD; JavaScript Object Notation for Linked Data; OMOP CDM; Observational Medical Outcomes Partnership Common Data Model; data models; data science; machine-readable metadata; metadata; standardization.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Structure of the Implementation Network for Sharing Population Information from Research Entities (INSPIRE) model. aphrc: African Population Health Research Center; CRF: cloud raster format; EHR: electronic health record; ETL: extract, transform, load; IDSR: Integrated Disease Surveillance and Response; ohdsi: Observational Health Data Sciences and Informatics; OMOP CDM: Observational Medical Outcomes Partnership Common Data Model; WHO: World Health Organization.
Figure 2
Syntax example: MedicalObservationalStudy.
Figure 3
Next steps to achieving machine-readable and machine-actionable metadata for public health. CDI: Comprehensive Data Integration; DCAT: Data Catalog Vocabulary; DDI: Data Documentation Initiative; FAIR: findable, accessible, interoperable, reusable; GeoDCAT-AP: a geospatial extension for the DCAT application profile for data portals in Europe; INSPIRE: Implementation Network for Sharing Population Information from Research Entities; OHDSI: Observational Health Data Sciences and Informatics; OMOP CDM: Observational Medical Outcomes Partnership Common Data Model; SDMX: Statistical Data and Metadata Exchange.
