2023 Aug 17;6:846202.
doi: 10.3389/fdata.2023.846202. eCollection 2023.

Comorbidity network analysis using graphical models for electronic health records

Bo Zhao et al. Front Big Data.

Abstract

Importance: The comorbidity network represents multiple diseases and their relationships in a graph. Understanding comorbidity networks among critical care unit (CCU) patients can help doctors diagnose patients faster, minimize missed diagnoses, and potentially decrease morbidity and mortality.

Objective: The main objective of this study was to identify the comorbidity network among CCU patients using a novel application of a machine learning method (graphical modeling method). The second objective was to compare the machine learning method with a traditional pairwise method in simulation.
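For orientation, a traditional pairwise approach typically scores each pair of diagnoses from its own 2x2 contingency table, ignoring all other diagnoses. The sketch below is one minimal version of that idea, not the comparison method used in the study. The function name, the toy patient records, and the 0.5 continuity correction are all assumptions made for illustration.

```python
import math

def pairwise_log_odds_ratio(records, a, b, correction=0.5):
    """Log odds ratio between diagnoses `a` and `b`.

    `records` is a list of sets, one set of diagnosis labels per patient.
    `correction` is added to every cell of the 2x2 table (a Haldane-Anscombe
    style continuity correction, assumed here) so empty cells stay finite.
    """
    n11 = n10 = n01 = n00 = 0
    for dx in records:
        has_a, has_b = a in dx, b in dx
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n10 += 1
        elif has_b:
            n01 += 1
        else:
            n00 += 1
    c = correction
    return math.log(((n11 + c) * (n00 + c)) / ((n10 + c) * (n01 + c)))

# Toy data: each set is one hypothetical patient's diagnosis categories.
patients = [
    {"A", "B"}, {"A", "B"}, {"A", "B"}, {"A"},
    {"B"}, set(), set(), set(), {"C"}, {"A", "C"},
]

log_or_ab = pairwise_log_odds_ratio(patients, "A", "B")
```

Because each pair is scored in isolation, an association between A and B that is fully explained by a third diagnosis still shows up as an edge, which is the limitation a graphical model is meant to address.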

Method: This cross-sectional study used CCU patients' data from the Medical Information Mart for Intensive Care-3 (MIMIC-3) dataset, an electronic health record (EHR) database of patients with CCU hospitalizations at Beth Israel Deaconess Hospital from 2001 to 2012. A machine learning method (graphical modeling method) was applied to identify the comorbidity network of 654 diagnosis categories among 46,511 patients.
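One common way to fit a graphical model to binary diagnosis indicators is node-wise regression: regress each diagnosis on all the others with a sparsity-inducing penalty and draw an edge where the coefficients are nonzero. The abstract does not spell out the fitting procedure, so the sketch below is only an illustration under assumptions: a plain proximal gradient solver, a fixed penalty `lam` rather than eBIC selection, an "AND" rule for combining the two regressions per pair, and a hypothetical three-diagnosis toy dataset.

```python
import math

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_elastic_net_logistic(X, y, lam, alpha, lr=0.5, n_iter=2000):
    """Proximal gradient descent for logistic regression with penalty
    lam * (alpha * sum|b_j| + (1 - alpha) * sum b_j^2) on the mean
    negative log-likelihood. The intercept is left unpenalized."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            eta = b0 + sum(bj * xj for bj, xj in zip(beta, xi))
            resid = 1.0 / (1.0 + math.exp(-eta)) - yi
            g0 += resid
            for j in range(p):
                g[j] += resid * xi[j]
        b0 -= lr * g0 / n
        # Gradient step on the smooth part (likelihood + ridge), then L1 prox.
        beta = [
            soft_threshold(bj - lr * (gj / n + 2 * lam * (1 - alpha) * bj),
                           lr * lam * alpha)
            for bj, gj in zip(beta, g)
        ]
    return b0, beta

def comorbidity_edges(data, lam=0.05, alpha=0.9):
    """Node-wise regressions; keep an edge only if both directions are nonzero."""
    p = len(data[0])
    coef = [[0.0] * p for _ in range(p)]
    for j in range(p):
        y = [row[j] for row in data]
        X = [row[:j] + row[j + 1:] for row in data]
        _, beta = fit_elastic_net_logistic(X, y, lam, alpha)
        others = [k for k in range(p) if k != j]
        for k, bk in zip(others, beta):
            coef[j][k] = bk
    return {
        (j, k): (coef[j][k] + coef[k][j]) / 2
        for j in range(p) for k in range(j + 1, p)
        if coef[j][k] != 0.0 and coef[k][j] != 0.0
    }

# Toy data: diagnoses 0 and 1 co-occur often; diagnosis 2 is independent.
counts = {(1, 1): 4, (0, 0): 4, (1, 0): 1, (0, 1): 1}
toy = [[a, b, c] for (a, b), k in counts.items() for c in (0, 1) for _ in range(k)]

edges = comorbidity_edges(toy)
```

On this toy data the only surviving edge links diagnoses 0 and 1. At the scale reported in the study (654 categories, 46,511 patients) a dedicated solver would be needed; this pure-Python loop is meant only to make the idea concrete.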

Results: Out of the 654 diagnosis categories, the graphical modeling method identified a comorbidity network of 2,806 associations among 510 diagnosis categories. Two medical professionals reviewed the comorbidity network and confirmed that the associations were consistent with current medical understanding. The strongest association in the network was between "poisoning by psychotropic agents" and "accidental poisoning by tranquilizers" (logOR: 8.16); other strong associations included "diagnoses of mitral and aortic valve" and "other rheumatic heart disease" (logOR: 5.15). The most connected diagnosis category was "disorders of fluid, electrolyte, and acid-base balance," which was associated with 63 other diagnosis categories. Using a data-driven approach, the method also partitioned the diagnosis categories into 14 modularity classes. In simulation studies, the method outperformed traditional pairwise comorbidity network methods.
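Summary statistics such as "most connected diagnosis" and "strongest association" can be read directly off an edge list. The sketch below shows one way to do that. The edge list, its labels (`dx_A` and so on), and its log odds ratios are invented placeholders, not values from the study.

```python
from collections import Counter

# Hypothetical edge list: (diagnosis_1, diagnosis_2, log odds ratio).
toy_edges = [
    ("dx_A", "dx_B", 1.2),
    ("dx_A", "dx_C", 0.9),
    ("dx_A", "dx_D", 0.4),
    ("dx_B", "dx_D", 0.7),
]

def node_degrees(edge_list):
    """Count how many other diagnoses each diagnosis is connected to."""
    degree = Counter()
    for a, b, _ in edge_list:
        degree[a] += 1
        degree[b] += 1
    return degree

def strongest_edge(edge_list):
    """Return the edge with the largest absolute log odds ratio."""
    return max(edge_list, key=lambda e: abs(e[2]))

degrees = node_degrees(toy_edges)
most_connected, n_neighbors = degrees.most_common(1)[0]
top_edge = strongest_edge(toy_edges)
```

Degree counts each edge once per endpoint, so in an undirected network the degree of a node equals the number of diagnosis categories it is associated with.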

Conclusion and relevance: Our graphical modeling method inferred a logical comorbidity network whose associations were consistent with current medical understanding and outperformed traditional network methods in simulation. Our comorbidity network method can potentially assist CCU doctors in diagnosing patients faster and minimizing missed diagnoses.

Keywords: comorbidity network analysis; critical care unit; electronic health records; graphic modeling method; machine learning.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Workflow of our data processing, analysis, and evaluation steps.
Figure 2
The penalized logistic regression loss with the elastic net penalty. The first two terms are the usual negative log-likelihood of logistic regression; the third term is the elastic net penalty. In the third term, λ is the penalty coefficient, chosen by eBIC for best model fit; α is the elastic net tuning parameter, which sets the proportion of LASSO vs. ridge in the penalty; $\sum_{j=1}^{p}|\beta_j|$ is the LASSO penalty; and $\sum_{j=1}^{p}\beta_j^{2}$ is the ridge penalty.
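The figure itself is not reproduced here, but the caption describes a loss of roughly the following shape. This is a reconstruction under assumptions: the symbols $n$, $y_i$, and $x_i$ (sample size, binary outcome, and predictor vector for patient $i$) do not appear in the caption, and the exact weights on the two penalties (for example, whether the ridge term carries a factor of 1/2) cannot be recovered from it.

```latex
\mathcal{L}(\beta)
  = -\sum_{i=1}^{n} y_i \, x_i^{\top}\beta
    + \sum_{i=1}^{n} \log\!\bigl(1 + e^{x_i^{\top}\beta}\bigr)
    + \lambda \Bigl[ \alpha \sum_{j=1}^{p} |\beta_j|
    + (1-\alpha) \sum_{j=1}^{p} \beta_j^{2} \Bigr]
```

With $\alpha = 1$ the penalty reduces to the pure LASSO, and with $\alpha = 0$ to pure ridge.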
Figure 3
Graph representation of the comorbidity network produced by our graphical modeling method. The network summary statistics and the interpretation of the top network features (edges, nodes, and modularity classes) are discussed in the main text. The disease network in this figure was drawn with Gephi, using the "Force Atlas" layout with repulsion strength = 10,000. Node size represents degree and node color represents modularity class. The graph was partitioned into 14 modularity classes. The biggest class is the yellow nodes in the far upper right of the graph, which are diseases related to injury and accident. Representative diseases with high degree are “Fracture of rib(s), sternum, larynx, and trachea,” “Other motor vehicle traffic accident involving collision with another motor vehicle,” and “Fall on same level from slipping, tripping, or stumbling.” The second biggest class is the blue nodes in the upper left of the graph, which are diseases related to the digestive system. Representative diseases with high degree are “Intestinal obstruction without mention of hernia,” “Gastrointestinal hemorrhage,” “Peritonitis,” and “Chronic liver disease and cirrhosis.” One disease with the highest degree is “Other and unspecified anemias”; it is strongly associated with gastrointestinal hemorrhage, and the model includes it in this class. The third biggest class is the pink nodes in the far lower right of the graph, which are related to neonatal diseases. Representative diseases with high degree are “Epilepsy,” “Disorders relating to short gestation and unspecified low birthweight,” and “Other perinatal jaundice.” Other classes are related to heart diseases (red nodes in the lower left), renal diseases (brown nodes in the upper middle), pulmonary diseases (light brown nodes in the upper middle), mental diseases (dark green nodes in the lower right), and so on.

References

ACOG Practice Bulletins (2021). Multifetal gestations: twin, triplet, and higher-order multifetal pregnancies. Obstet. Gynecol. 137, e145–e162. doi: 10.1097/AOG.0000000000004397
Aguado A., Moratalla-Navarro F., López-Simarro F., Moreno V. (2020). MorbiNet: multimorbidity networks in adult general population. Analysis of type 2 diabetes mellitus comorbidity. Sci. Rep. 10, 1–12. doi: 10.1038/s41598-020-59336-1
Bastian M., Heymann S., Jacomy M. (2009). “Gephi: an open source software for exploring and manipulating networks,” in Proceedings of the International AAAI Conference on Web and Social Media, 3, 361–362. doi: 10.1609/icwsm.v3i1.13937
Brunson J. C., Agresta T. P., Laubenbacher R. C. (2020). Sensitivity of comorbidity network analysis. JAMIA Open 3, 94–103. doi: 10.1093/jamiaopen/ooz067
Combes A., Mokhtari M., Couvelard A., Trouillet J.-L., Baudot J., Hénin D., et al. (2004). Clinical and autopsy diagnoses in the intensive care unit: a prospective study. Arch. Intern. Med. 164, 389–392. doi: 10.1001/archinte.164.4.389
