PLOS Digit Health. 2022;1(1):e0000004.
doi: 10.1371/journal.pdig.0000004. Epub 2022 Jan 18.

An explainable artificial intelligence approach for predicting cardiovascular outcomes using electronic health records

Sergiusz Wesołowski et al. PLOS Digit Health. 2022.

Abstract

Understanding the conditionally dependent clinical variables that drive cardiovascular health outcomes is a major challenge for precision medicine. Here, we deploy a recently developed, massively scalable comorbidity discovery method called Poisson Binomial based Comorbidity discovery (PBC) to analyze Electronic Health Records (EHRs) from the University of Utah and Primary Children's Hospital (over 1.6 million patients and 77 million visits) for comorbid diagnoses, procedures, and medications. Using explainable Artificial Intelligence (AI) methodologies, we then tease apart the intertwined, conditionally dependent impacts of comorbid conditions and demography upon cardiovascular health, focusing on the key areas of heart transplant, sinoatrial node dysfunction, and various forms of congenital heart disease. The resulting multimorbidity networks make possible wide-ranging explorations of the comorbid and demographic landscapes surrounding these cardiovascular outcomes, and can be distributed as web-based tools for further community-based outcomes research. The ability to transform enormous collections of EHRs into compact, portable tools devoid of Protected Health Information solves many of the legal, technological, and data-scientific challenges associated with large-scale EHR analyses.
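
The method's name refers to the Poisson binomial distribution, which describes the number of successes among independent Bernoulli trials with unequal success probabilities. The following is a minimal sketch of how such a distribution could be used to score a co-occurrence count, assuming each patient has a hypothetical null probability of carrying both terms by chance. The function names and the example probabilities are illustrative assumptions, not the authors' implementation.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli trials.

    probs: per-patient success probabilities (hypothetical null probabilities
    that a given patient carries both terms by chance).
    """
    pmf = [1.0]  # with zero trials, P(K = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this patient does not contribute
            nxt[k + 1] += mass * p      # this patient contributes one co-occurrence
        pmf = nxt
    return pmf


def upper_tail(probs, observed):
    """P(K >= observed) under the Poisson binomial null."""
    return sum(poisson_binomial_pmf(probs)[observed:])


# Two patients, each with a 0.5 chance of co-occurrence: P(K >= 1) = 0.75
example_tail = upper_tail([0.5, 0.5], 1)
```

A small upper-tail probability would indicate that two terms co-occur more often than the per-patient null probabilities predict. The quadratic-time recursion shown here is only practical for small inputs.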

Conflict of interest statement

Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: GL, VD, and MY own shares in Backdrop Health. There are no financial ties regarding this research.

Figures

Fig 1. Percent of medical terms influenced by various demographic features.
Demographic variables used in the comorbidity discovery process are displayed on the y-axis. The percent of all diagnoses, procedures, and medications influenced by a given demographic feature is displayed on the x-axis. For example, sex influences 42.2% of diagnoses, procedures, and medications in the Utah EHR corpus; ancestry influences 27.4% and EHR exposure 100%. EHR exposure includes subject age, length of medical record history, and number of visits. See article [10] for details. Features were selected using L1 regularization.
Fig 2. Patient Disease Network for the Utah Data Resource.
Panel A. Graphical representation of the Patient Disease Network. 39,055 ICD-10 diagnosis codes, 5,716 CPT procedure codes, and 1,764 RxNorm medication codes comprising 50 million comorbidities are represented by the map. To render the patient disease network more readily interpretable, we utilized Minimum Description Length clustering, so that nodes with similar comorbidity patterns lie near one another in the network. The comorbidities of Heart Transplant are labeled in red for reference purposes. See Methods for details. Panel B. Term trajectory for Adult Heart Transplant. Nodes represent diagnoses (black), procedures (red), and medications (blue). Edges are temporally ordered comorbidities (Bonferroni alpha = 10E-9.5); arrows denote direction. Edges are labeled with transition probabilities (e.g. patient flux). For example, an adult patient with viral myocarditis has a 17% chance of developing a heart failure diagnosis, and a 4.9% chance of undergoing heart transplantation. See Methods for additional details and S5 Table for code references for the highlighted terms.
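
The caption for Fig 2 does not define how the transition probabilities are estimated. As a rough illustration only, the sketch below assumes a transition probability is the fraction of patients with a source term who record the target term at a strictly later date. The record layout, term names, and function name are hypothetical, and the toy data are not drawn from the study.

```python
def transition_probability(records, source, target):
    """Fraction of patients with `source` who record `target` strictly later.

    records: dict mapping patient id -> list of (day, term) events.
    This estimator is an assumption made for illustration.
    """
    with_source = 0
    with_later_target = 0
    for events in records.values():
        source_days = [day for day, term in events if term == source]
        if not source_days:
            continue  # patient never had the source term
        with_source += 1
        first_source = min(source_days)
        if any(term == target and day > first_source for day, term in events):
            with_later_target += 1
    return with_later_target / with_source if with_source else 0.0


# Toy records (hypothetical): only p1 has heart_failure after myocarditis.
toy_records = {
    "p1": [(0, "myocarditis"), (30, "heart_failure")],
    "p2": [(0, "myocarditis")],
    "p3": [(5, "heart_failure"), (10, "myocarditis")],
    "p4": [(0, "heart_failure")],
}
toy_flux = transition_probability(toy_records, "myocarditis", "heart_failure")
# Three patients have myocarditis, one develops heart_failure afterwards: 1/3
```

Requiring the target to follow the first source event is what makes the edge directed; dropping that condition would give a symmetric co-occurrence rate instead.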
Fig 3. Multimorbidity Landscape of Heart Transplant.
Panel A. PGM for Adult Transplant. N = 1.6 million individuals. The clinical variables were chosen based on Bonferroni-corrected ICD-10 and RxNorm billing codes significantly associated with (and preceding) heart transplant. Each node represents a diagnosis, procedure, or medication code and each edge represents a conditional dependence between nodes. For a detailed description of the clinical variables, please refer to S5 Table. Panel B. PGM for Pediatric Transplant. N = 26,458 individuals. Clinical variable terms represent terms in the Primary Children’s Hospital echocardiographic database or CCS billing codes when available. For a detailed description of the clinical variables, please refer to S5 Table. DCM: Dilated cardiomyopathy; Norwood: Norwood surgery; HLHS: Hypoplastic left heart syndrome; Glenn: Glenn surgery; Fontan: Fontan surgery; AVSD: Atrioventricular septal defect; ASD: Atrial septal defect; BAV: Bicuspid aortic valve; Coarctation: Coarctation of the aorta; VSD: Ventricular septal defect. Heart Transplant is highlighted in orange. For A and B, the target node (heart transplant) is colored red and nodes with direct connections to the target (i.e., within the Markov blanket) are circled red. Values in Tables represent mean ± STD.
Fig 4. Multimorbidity Landscape of Sinoatrial Node Dysfunction (SND).
Each node represents a diagnosis or procedure, and each edge represents a conditional dependence between nodes. For a detailed description of the clinical variables, please refer to S5 Table. Panel A. Pediatric SND. N = 26,458 individuals. Clinical variable terms represent terms in the Primary Children’s Hospital echocardiographic database or CCS billing codes when available. Fontan: Fontan surgery; HLHS: Hypoplastic left heart syndrome; Norwood: Norwood surgery; dTGA: d-transposition of the great arteries; RV fxn: Right ventricular function; TR: ≥ moderate tricuspid regurgitation; BAV: Bicuspid aortic valve. Panel B. Adult SND. N = 1.6 million individuals. Clinical variable terms represent CCS billing codes. Ancestry: Western European, African American, or Other; Ethnicity: Hispanic or non-Hispanic. DCM: Dilated cardiomyopathy; AS: Aortic stenosis; Coarctation: Coarctation of the aorta. SND is highlighted in red in both panels. The target node (SND) is colored red and nodes with direct connections to the target (i.e., within the Markov blanket) are circled red. Values in Tables represent mean ± STD.
Fig 5. Impact of maternal health on congenital anomalies in the child.
Panel A. Multimorbidity landscape for child’s risk for congenital malformations in the context of pregnancy-induced hypertension. N = 125,014 mothers. Clinical variable terms represent CCS billing codes present in the EHR database. Maternal diagnosis is highlighted in orange; HTN-Preg: Maternal diagnosis of hypertension complicating pregnancy (aka pregnancy-induced hypertension); Diaphragm: Diaphragmatic congenital abnormalities; Genito-Urinary: Genito-Urinary congenital abnormalities; Cardiac: Cardiac and Circulatory congenital abnormalities; Skeletal: Skeletal congenital abnormalities; Down: Trisomy 21; Digestive: Congenital abnormalities of the gastrointestinal tract; Nervous: Nervous system congenital abnormalities; Eye: Congenital abnormalities of the Eye; CleftLip: Cleft lip. Panel B. Multimorbidity landscape for child’s risk of congenital heart defects in the context of pregnancy-induced hypertension. N = 125,014 mothers. ASD: Atrial septal defect; VSD: Ventricular septal defect; HLHS: Hypoplastic left heart syndrome; Coarctation: Coarctation of the aorta; TOF: Tetralogy of Fallot; BAV: Bicuspid aortic valve. For a detailed description of the clinical variables, please refer to S5 Table. The target node (HTN-Preg) is colored red and nodes with direct connections to the target (i.e., within the Markov blanket) are circled red. Values in Tables represent mean ± standard deviation.
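
The captions for Figs 3 to 5 refer to the Markov blanket of a target node. In a directed graphical model, the Markov blanket of a node consists of its parents, its children, and the other parents of its children; in an undirected network it reduces to the node's neighbors. The captions do not state which case applies, so the sketch below assumes a directed network. The node labels and edges are placeholders and do not represent any relationship reported in the study.

```python
def markov_blanket(edges, target):
    """Markov blanket of `target` in a directed graph.

    edges: iterable of (parent, child) pairs.
    Returns the set of parents, children, and co-parents of `target`.
    """
    edges = list(edges)
    parents = {u for u, v in edges if v == target}
    children = {v for u, v in edges if u == target}
    # Other parents of the target's children ("co-parents")
    co_parents = {u for u, v in edges if v in children and u != target}
    return parents | children | co_parents


# Placeholder graph: D -> A -> T -> C <- B
toy_edges = [("A", "T"), ("T", "C"), ("B", "C"), ("D", "A")]
toy_blanket = sorted(markov_blanket(toy_edges, "T"))
# → ['A', 'B', 'C']; D is outside the blanket because it reaches T only through A
```

Conditioned on its Markov blanket, a node is independent of every other node in the network, which is why the blanket is a natural set of variables to highlight around an outcome.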

References

    1. Valderas J. M., Starfield B., Sibbald B., Salisbury C. & Roland M. Defining Comorbidity: Implications for Understanding Health and Health Services. Ann. Fam. Med. 7, 357–363 (2009).
    2. Kraisangka J. et al. Bayesian Network vs. Cox’s Proportional Hazard Model of PAH Risk: A Comparison. in Artificial Intelligence in Medicine (eds. Riaño D., Wilk S. & ten Teije A.) 139–149 (Springer International Publishing, 2019). doi: 10.1007/s11906-019-0950-y
    3. Capobianco E. & Lio P. Comorbidity: a multidimensional approach. Trends Mol. Med. 19, 515–521 (2013). doi: 10.1016/j.molmed.2013.07.004
    4. Guo M. et al. Analysis of disease comorbidity patterns in a large-scale China population. BMC Med. Genomics 12, 177 (2019).
    5. Hu J. X., Thomas C. E. & Brunak S. Network biology concepts in complex disease comorbidities. Nat. Rev. Genet. 17, 615–629 (2016). doi: 10.1038/nrg.2016.87
