Review
Circ Genom Precis Med. 2025 Jun;18(3):e004624.
doi: 10.1161/CIRCGEN.124.004624. Epub 2025 May 9.

Data Interoperability and Harmonization in Cardiovascular Genomic and Precision Medicine

C Anwar A Chahal et al. Circ Genom Precis Med. 2025 Jun.

Abstract

Despite advances in cardiovascular care and improved outcomes, fragmented healthcare systems, nonequitable access to health care, and nonuniform and unbiased collection and access to healthcare data have exacerbated disparities in healthcare provision and further delayed the technological-enabled implementation of precision medicine. Precision medicine relies on a foundation of accurate and valid omics and phenomics that can be harnessed at scale from electronic health records. Big data approaches in noncardiovascular healthcare domains have helped improve efficiency and expedite the development of novel therapeutics; therefore, applying such an approach to cardiovascular precision medicine is an opportunity to further advance the field. Several endeavors, including the American Heart Association Precision Medicine platform and public-private partnerships (such as BigData@Heart in Europe), as well as cloud-based platforms, such as Terra used for the National Institutes of Health All of Us, are attempting to temporally and ontologically harmonize data. This state-of-the-art review summarizes best practices used in cardiovascular genomic and precision medicine and provides recommendations for systems' requirements that could enhance and accelerate the integration of these platforms.

Keywords: big data; electronic health records; natural language processing; phenomics; translational research, biomedical.

Conflict of interest statement

All authors declare no conflicts of interest, no relevant relationships to disclose, and no funding received from any companies or organizations related to the subject of this article.

Figures

Figure 1.
Summary of present challenges and possible solutions to improve data harmonization and interoperability for genomic and precision medicine. EHR indicates electronic health record.
Figure 2.
Current status of standardization for observational data. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is an open community data standard, designed to standardize the structure and content of observational data and to enable efficient analyses that can produce reliable evidence. A central component of the OMOP CDM is the Observational Health Data Sciences and Informatics (OHDSI) standardized vocabularies. The OHDSI vocabularies allow organization and standardization of medical terms to be used across the various clinical domains of the OMOP common data model and enable standardized analytics that leverage the knowledge base when constructing exposure and outcome phenotypes and other features within characterization, population-level effect estimation, and patient-level prediction studies.
Figure 3.
Overview of Gen3 framework services for cloud-based data analysis and execution. This figure highlights the comprehensive ecosystem supporting cloud-based data analysis and execution. The Gen3 framework, developed to support large-scale biomedical data analysis, leverages cloud infrastructure to provide robust, scalable, and secure services. Components include the following. Synthetic cohorts: the creation and management of synthetic data sets for various research purposes, enabling analysis without compromising real data privacy. Data/metadata queries: tools and services that facilitate efficient querying of data and metadata to support research workflows. Workflow requirements and analysis workflows: specifications and execution pathways necessary for conducting data analyses within the Gen3 environment. Gen3 framework services and data commons: core services and shared resources within the Gen3 ecosystem that support data storage, management, and accessibility. Gen3 workspaces and hello world workflow: user-friendly interfaces and example workflows provided to help users get started with the Gen3 environment. Findable, accessible, interoperable, and reusable (FAIR) principle implementation: emphasis on ensuring data is FAIR, following the FAIR guiding principles as promoted by initiatives such as FORCE11. Cloud-based analysis and execution environment: integration of secure cloud environments for data analysis, providing scalable resources for handling large data sets. Reusable Docker-based tools and workflows: the use of Docker containers to encapsulate tools and workflows, promoting reusability and reproducibility of research. Cloud storage and secure data download requests: mechanisms for storing large volumes of data securely and facilitating controlled access for download and analysis. Cohorts, workflows, and notebooks: support for generating and managing cohorts, executing workflows, and using interactive notebooks for data analysis.
AnVIL indicates Analysis, Visualization, and Informatics Lab-Space; IRB, institutional review board; OMOP, Observational Medical Outcomes Partnership; PCORnet, patient-centered outcomes research network; and SDV, synthetic data vault.
Figure 4.
Multiomic approaches with application procedures in the development stages. Different multiomics approaches are incorporated into healthcare delivery to varying extents. Note that the challenges for data acquisition, analysis, harmonization, and reporting of standards and applicability differ between omics methods.
Figure 5.
Illustration of the procedural workflow demonstrating the incorporation of data science components within the framework of experimental design in biomedical research or clinical study design. A, Source data can be obtained from human cohorts or model systems, with tailored techniques and methodologies used to procure phenotypic and molecular data. B, Given the diversity of data types and features, such as the sequencing technology applied in transcriptomics data sets, an initial step involves data harmonization, followed by metadata extraction to facilitate indexing and standardization. C, Following the transformation of these data into a standardized and accessible format, integration into a unified interface enables investigators to search for and retrieve pertinent digital objects, specifically data sets or computational tools appropriate to the intended study. D, These resources are then leveraged to execute cutting-edge analyses, including machine learning and predictive modeling, with the aim of unveiling robust genotype-phenotype associations and delineating molecular signatures for the cohort. E, The molecular signatures thus obtained undergo subsequent processing and in-depth analysis to derive novel mechanistic, therapeutic, and clinical insights. F, Armed with these newfound insights, researchers contribute to the expansive network of biomedical knowledge, thereby propelling cardiovascular research forward. EHR indicates electronic health record; ICD, International Classification of Diseases; OMIM, Online Mendelian Inheritance in Man; MeSH, Medical Subject Headings; and MOD, Model Organism Database.
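
The vocabulary standardization described in the Figure 2 caption (source codes from different coding systems mapped onto shared standard concepts) can be illustrated with a minimal sketch. This is not the OHDSI tooling or the authors' method: the lookup table, the `ConditionRecord` type, the `harmonize` function, and the concept IDs are all hypothetical placeholders chosen for illustration. Only the ICD codes themselves (I10 and 401.9 for essential hypertension, I48.91 for unspecified atrial fibrillation) are real.

```python
from dataclasses import dataclass

# Hypothetical lookup: (source vocabulary, source code) -> standard concept ID.
# The concept IDs below are placeholders, NOT real OHDSI identifiers.
SOURCE_TO_STANDARD = {
    ("ICD10CM", "I10"): 900001,     # essential hypertension
    ("ICD9CM", "401.9"): 900001,    # same clinical idea, older coding system
    ("ICD10CM", "I48.91"): 900002,  # unspecified atrial fibrillation
}


@dataclass(frozen=True)
class ConditionRecord:
    """One diagnosis row as it might arrive from a source EHR (hypothetical shape)."""
    person_id: int
    vocabulary: str
    code: str


def harmonize(records):
    """Map source-coded records to standard concept IDs.

    Returns (mapped, unmapped): mapped is a list of (person_id, concept_id)
    pairs; unmapped keeps the original records so they can be reviewed
    rather than silently dropped.
    """
    mapped, unmapped = [], []
    for rec in records:
        concept_id = SOURCE_TO_STANDARD.get((rec.vocabulary, rec.code))
        if concept_id is None:
            unmapped.append(rec)
        else:
            mapped.append((rec.person_id, concept_id))
    return mapped, unmapped


if __name__ == "__main__":
    rows = [
        ConditionRecord(1, "ICD10CM", "I10"),
        ConditionRecord(2, "ICD9CM", "401.9"),
        ConditionRecord(3, "LOCAL", "HTN"),  # site-specific code with no mapping
    ]
    mapped, unmapped = harmonize(rows)
    print(mapped)
    print(unmapped)
```

In this sketch, records coded in two different systems resolve to one shared concept, which is what lets a single phenotype definition run across sites. Keeping unmapped records separate reflects one possible design choice: coverage gaps stay visible instead of biasing a cohort unnoticed.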
