BMJ Open. 2023 Apr 3;13(4):e068698.
doi: 10.1136/bmjopen-2022-068698.

Artificial intelligence-based mining of electronic health record data to accelerate the digital transformation of the national cardiovascular ecosystem: design protocol of the CardioMining study

Athanasios Samaras et al. BMJ Open. 2023.

Abstract

Introduction: Mining of electronic health record (EHR) data is increasingly being implemented all over the world but mainly focuses on structured data. The capabilities of artificial intelligence (AI) could reverse the underuse of unstructured EHR data and enhance the quality of medical research and clinical care. This study aims to develop an AI-based model to transform unstructured EHR data into an organised, interpretable dataset and form a national dataset of cardiac patients.

Methods and analysis: CardioMining is a retrospective, multicentre study based on large, longitudinal data obtained from unstructured EHRs of the largest tertiary hospitals in Greece. Demographics, hospital administrative data, medical history, medications, laboratory examinations, imaging reports, therapeutic interventions, in-hospital management and postdischarge instructions will be collected, coupled with structured prognostic data from the National Institute of Health. The target number of included patients is 100 000. Natural language processing techniques will facilitate data mining from the unstructured EHRs. The accuracy of the automated model will be compared with manual data extraction by study investigators. Machine learning tools will provide data analytics. CardioMining aims to advance the digital transformation of the national cardiovascular system and fill the gap in medical recording and big data analysis using validated AI techniques.

Ethics and dissemination: This study will be conducted in keeping with the International Conference on Harmonisation Good Clinical Practice guidelines, the Declaration of Helsinki, the Data Protection Code of the European Data Protection Authority and the European General Data Protection Regulation. The Research Ethics Committee of the Aristotle University of Thessaloniki and Scientific and Ethics Council of the AHEPA University Hospital have approved this study. Study findings will be disseminated through peer-reviewed medical journals and international conferences. International collaborations with other cardiovascular registries will be attempted.

Trial registration number: NCT05176769.

Keywords: CARDIOLOGY; Health informatics; Heart failure; Ischaemic heart disease; Risk management.

Conflict of interest statement

Competing interests: None declared.

Figures

Figure 1
Data extraction method. Data from deidentified electronic health records will undergo both manual and automated extraction. The results of the automated extraction using NLP techniques will be validated on the manually organised dataset using accuracy metrics of the NLP model. The obtained knowledge from the manual dataset will help with the development of an accurate trained AI-model for automated data extraction. EHR, electronic health record; ML, machine learning; NLP, natural language processing.
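
The validation step in Figure 1 can be illustrated with a minimal sketch. The protocol only states that "accuracy metrics" will be used; micro-averaged precision, recall and F1 are assumed here as one common choice, and the record identifiers, labels and the function name `micro_prf` are hypothetical.

```python
from typing import Dict, Set, Tuple


def micro_prf(
    manual: Dict[str, Set[str]], automated: Dict[str, Set[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 of automated labels,
    scored against the manually extracted labels (treated as reference)."""
    tp = fp = fn = 0
    for record_id, gold in manual.items():
        pred = automated.get(record_id, set())
        tp += len(gold & pred)   # labels found by both methods
        fp += len(pred - gold)   # labels only the automated model produced
        fn += len(gold - pred)   # labels the automated model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Hypothetical toy data: two deidentified records with diagnosis labels.
manual = {"rec1": {"PAF", "T2DM"}, "rec2": {"T2DM"}}
automated = {"rec1": {"PAF"}, "rec2": {"T2DM", "PAF"}}

precision, recall, f1 = micro_prf(manual, automated)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Treating the manual dataset as the reference mirrors the figure's description; a real evaluation would likely also report per-field metrics and inter-annotator agreement, which this sketch omits.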
Figure 2
Utilisation of text mining techniques to extract knowledge from unstructured clinical notes. Keyword recognition will provide the baseline information for treating the entities as labels in a multilabel classification task. Text mining techniques will allow the development of a structured database for further processing through machine learning models. PAF, paroxysmal atrial fibrillation; T2DM, type 2 diabetes mellitus; aVF, augmented vector foot; LVEF, left ventricular ejection fraction; TAPSE, tricuspid annular plane systolic excursion; RCA, right coronary artery; TTS, transdermal therapeutic system.
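
As a rough illustration of how keyword recognition could yield labels for a multilabel classification task, the sketch below maps a clinical note to a binary label vector. The lexicon, the patterns and the function name `label_vector` are assumptions made for this example and are not taken from the study; negation ("no PAF") and spelling variants are deliberately not handled.

```python
import re
from typing import Dict, List

# Hypothetical keyword lexicon: each label maps to regex patterns that signal it.
LEXICON: Dict[str, List[str]] = {
    "PAF": [r"paroxysmal atrial fibrillation", r"\bPAF\b"],
    "T2DM": [r"type 2 diabetes mellitus", r"\bT2DM\b"],
}
LABELS = sorted(LEXICON)  # fixed label order for the output vector


def label_vector(note: str) -> List[int]:
    """Return one 0/1 entry per label: 1 if any of its patterns occurs in the note."""
    return [
        int(any(re.search(p, note, flags=re.IGNORECASE) for p in LEXICON[label]))
        for label in LABELS
    ]


note = "History of PAF. No diabetes reported."
print(dict(zip(LABELS, label_vector(note))))
```

Vectors of this kind could serve as baseline targets or features for a downstream classifier; a production pipeline would need negation detection and a clinically curated vocabulary.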
Figure 3
The roadmap of the CardioMining study towards digital transformation of the national cardiovascular ecosystem. The digital transformation of a healthcare system at a national level is a major challenge and a complex task. The low digital maturity of the health sector in Greece, coupled with the ongoing rapid technological changes worldwide, demands urgent action through the implementation of a paradigm shift. The CardioMining study will retrospectively collect unstructured data derived from electronic health records of cardiology departments. Data extraction will be performed both manually by humans and automatically using natural language processing algorithms. The validated artificial intelligence models will contribute to the development of a structured registry of cardiac patients to provide data analytics through machine learning models. CVD, cardiovascular disease.
