Front Physiol. 2018 Jul 3:9:835.
doi: 10.3389/fphys.2018.00835. eCollection 2018.

A Machine Learning Aided Systematic Review and Meta-Analysis of the Relative Risk of Atrial Fibrillation in Patients With Diabetes Mellitus

Zhaohan Xiong et al. Front Physiol.

Abstract

Background: Meta-analysis is a widely used tool in which weighted information from multiple similar studies is aggregated to increase statistical power. However, the exponential growth of publications in key areas of medical science has rendered manual identification of relevant studies increasingly time-consuming. The aim of this work was to develop a machine learning technique capable of robust automatic study selection for meta-analysis. We validated this approach with an up-to-date meta-analysis investigating the association between diabetes mellitus (DM) and new-onset atrial fibrillation (AF).

Methods: The PubMed online database was searched from 1960 to September 2017, identifying 4,177 publications that mentioned both DM and AF. Relevant studies were selected as follows. First, publications were clustered based on common text features using an unsupervised K-means algorithm. Clusters that best matched the selected set of potentially relevant studies (a "training" set of 139 articles) were then identified using maximum entropy classification. The 556 articles selected automatically on this basis were screened manually to identify potentially relevant studies. To determine the validity of the automated process, a parallel set of studies was also assembled by manually screening all initially searched publications. Finally, detailed manual selection was performed on the full texts of the studies in both sets using standard criteria. Quality assessment, meta-regression random-effects models, sensitivity analysis and publication bias assessment were then conducted.

Results: Machine learning-assisted screening identified the same 29 studies for meta-analysis as manual screening alone. Machine learning enabled more robust and efficient study selection, reducing the number of studies requiring manual screening from 4,177 to 556 articles. A pooled analysis using the most conservative estimates indicated that patients with DM had a ~49% greater risk of developing AF than individuals without DM. After adjusting for three additional risk factors (hypertension, obesity and heart disease), patients with DM still had a 23% greater risk. Using multivariate adjusted models, the risk of developing AF in patients with DM was similar for all DM subtypes. Women with DM were 24% more likely to develop AF than men with DM. The risk of new-onset AF in patients with DM has also increased over the years.

Conclusions: We have developed a novel machine learning method to identify publications suitable for inclusion in meta-analysis. This approach has the capacity to provide a more efficient and more objective study selection process for future such studies. We have used it to demonstrate that DM is a strong, independent risk factor for AF, particularly for women.
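The cluster-then-select idea described in the Methods can be illustrated with a toy sketch. This is not the authors' code. It assumes plain bag-of-words vectors as the text features, and it substitutes a simple overlap count with the training set for the maximum entropy classifier. The six "abstracts", the starting centroids and the training indices are all made up for illustration.

```python
import math
from collections import Counter


def vectorize(docs):
    """Turn each document into a length-normalised bag-of-words vector."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vec = [counts[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def kmeans(vectors, init_idx, iters=20):
    """Plain K-means; init_idx names the starting centroids (deterministic)."""
    centroids = [vectors[i][:] for i in init_idx]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each document joins its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def most_relevant_cluster(labels, training_idx):
    """Cluster holding the most hand-labelled training documents.

    Assumption: this overlap count stands in for the paper's
    maximum entropy classification step.
    """
    hits = Counter(labels[i] for i in training_idx)
    return hits.most_common(1)[0][0]


# Hypothetical mini-corpus (not taken from the study).
docs = [
    "diabetes atrial fibrillation cohort risk",
    "diabetes atrial fibrillation incidence risk cohort",
    "atrial fibrillation ablation catheter technique",
    "catheter ablation technique outcomes",
    "diabetes risk atrial fibrillation relative risk cohort",
    "ablation catheter outcomes atrial fibrillation",
]
training_idx = [0, 4]  # hypothetical manually labelled relevant articles

labels = kmeans(vectorize(docs), init_idx=[0, 3])
best = most_relevant_cluster(labels, training_idx)
to_screen = [i for i, lab in enumerate(labels) if lab == best]
print("cluster labels:", labels)
print("articles passed on to manual screening:", to_screen)
```

Only the documents in the selected cluster go forward to manual screening, which is how the reviewing workload shrinks in the approach described above.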

Keywords: atrial fibrillation; diabetes mellitus; machine learning; meta-analysis; risk factor.

Figures

Figure 1
Flowchart of search and selection strategy used in this study. The study selection flowchart is displayed here, in which a machine learning approach was developed to facilitate the publication selection. Articles from searched publications were first grouped into 14 clusters by unsupervised machine learning. Then supervised machine learning was used to identify clusters of articles with greatest relevance to the labeled training set identified based on a subset of articles from the initial search. Full texts of the studies identified were then reviewed and 29 articles were selected for the meta-analysis which was validated by the conventional manual selection approach.
Figure 2
(A) The 4,177 articles from the original search, grouped into clusters by unsupervised machine learning. (B) Cluster #5, containing 416 articles, was automatically identified as having the greatest relevance to the labeled training set. DM, diabetes mellitus; AF, atrial fibrillation.
Figure 3
Estimated risks of AF in patients with DM using the most conservative risk estimates provided in the individual studies. Subgroup summaries for cohort/randomized and case-control studies are in bold at the bottom of each subgroup. DM, diabetes mellitus; AF, atrial fibrillation; RR, relative risk; CI, confidence interval.
Figure 4
Risk estimates with different additional risk factor adjustments. (A) Forest plot of risk values adjusted for hypertension in addition to age-and/or-sex/none and other included risk factors. (B) RR estimate adjusted for BMI in addition to age-and/or-sex/none and other included risk factors. (C) RR estimate adjusted for various heart conditions in addition to age-and/or-sex/none and other included risk factors. (D) Summary estimate for RRs after adjusting for hypertension, BMI and various heart conditions in addition to age-and/or-sex/none and other included risk factors. BMI, body mass index; DM, diabetes mellitus; AF, atrial fibrillation; RR, relative risk; CI, confidence interval.
Figure 5
No significant difference in risks of AF incidence in patients with DM for undefined DM subtypes, type 2 DM and type 1 DM using the multivariate model. DM, diabetes mellitus; AF, atrial fibrillation; RR, relative risk.
Figure 6
Significant difference in the risk of AF incidence between men and women with DM using the multivariate model. Summary estimate for publications that reported risk values for men (A) and for women (B). DM, diabetes mellitus; AF, atrial fibrillation; RR, relative risk; CI, confidence interval.
Figure 7
The increasing trend for RRs of AF in patients with DM grouped by the median year of patient enrolment. The risk was estimated using the most conservative risks provided in included individual studies. DM, diabetes mellitus; AF, atrial fibrillation; RR, relative risk.
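The summary estimates in the forest plots come from random-effects pooling of per-study relative risks. The sketch below shows one common way to do this, the DerSimonian-Laird estimator applied on the log-RR scale. Which estimator the authors used is an assumption here, since the abstract only says "random-effects models". The three input studies are invented numbers, not data from the paper.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_random_effects(studies):
    """DerSimonian-Laird random-effects pooling of relative risks.

    studies: list of (rr, ci_low, ci_high) tuples with 95% confidence intervals.
    Returns (pooled_rr, pooled_ci_low, pooled_ci_high, tau2).
    """
    # Work on the log scale, where the RR estimate is roughly normal.
    y = [math.log(rr) for rr, _, _ in studies]
    # Recover each standard error from the width of the reported 95% CI.
    se = [(math.log(hi) - math.log(lo)) / (2 * Z95) for _, lo, hi in studies]

    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / s ** 2 for s in se]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw

    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw
    k = len(studies)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to every study's variance.
    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return (
        math.exp(pooled),
        math.exp(pooled - Z95 * se_pooled),
        math.exp(pooled + Z95 * se_pooled),
        tau2,
    )


# Hypothetical studies: (RR, 95% CI low, 95% CI high). Not from the paper.
hypothetical_studies = [
    (1.20, 1.00, 1.44),
    (1.50, 1.20, 1.875),
    (1.40, 1.10, 1.78),
]

rr, lo, hi, tau2 = pool_random_effects(hypothetical_studies)
print(f"pooled RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f}), tau^2 = {tau2:.4f}")
```

A pooled RR of 1.49 corresponds to the "~49% greater risk" phrasing in the abstract, since the excess risk is RR minus 1.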

