Sci Rep. 2024 Mar 7;14(1):5609.
doi: 10.1038/s41598-024-56170-7.

Development of risk models of incident hypertension using machine learning on the HUNT study data

Filip Emil Schjerven et al. Sci Rep. 2024.

Abstract

In this study, we aimed to create an 11-year hypertension risk prediction model using data from the Trøndelag Health (HUNT) Study in Norway, involving 17 852 individuals (20-85 years; 38% male; 24% incidence rate) with blood pressure (BP) below the hypertension threshold at baseline (1995-1997). We assessed 18 clinical, behavioral, and socioeconomic features, employing machine learning models such as eXtreme Gradient Boosting (XGBoost), Elastic regression, K-Nearest Neighbor, Support Vector Machines (SVM) and Random Forest. For comparison, we used logistic regression and a decision rule as reference models and validated six external models, with focus on the Framingham risk model. The top-performing models consistently included XGBoost, Elastic regression and SVM. These models efficiently identified hypertension risk, even among individuals with optimal baseline BP (< 120/80 mmHg), although improvement over reference models was modest. The recalibrated Framingham risk model outperformed the reference models, approaching the best-performing ML models. Important features included age, systolic and diastolic BP, body mass index, height, and family history of hypertension. In conclusion, our study demonstrated that linear effects sufficed for a well-performing model. The best models efficiently predicted hypertension risk, even among those with optimal or normal baseline BP, using few features. The recalibrated Framingham risk model proved effective in our cohort.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Smoothed calibration curves for the test set. A calibration curve close to the dashed reference line indicates close agreement between the model's predictions and the observed incidence in the test set. Curves are shown as pointwise mean curves calculated by bootstrapping. KNN K-nearest neighbors, SVM support vector machines, XGBoost eXtreme gradient boosting.
Figure 2
Calibration curves with histogram of predictions above. The histogram is colored by proportion of incidence. Curves are shown as pointwise mean curves with red shaded 95% confidence interval calculated by bootstrapping. KNN K-nearest neighbors, SVM support vector machines, XGBoost eXtreme gradient boosting.
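The "pointwise mean curve with 95% confidence interval calculated by bootstrapping" described in the calibration captions can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: it uses a simple equal-width binned calibration curve in place of the smoothed curves, a percentile interval, synthetic data, and hypothetical function names (`binned_calibration`, `bootstrap_calibration`).

```python
import random

def binned_calibration(preds, outcomes, n_bins=5):
    """Observed event rate per equal-width bin of predicted risk (None for an empty bin)."""
    sums = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(preds, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # a prediction of exactly 1.0 goes in the last bin
        sums[b] += y
        counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

def bootstrap_calibration(preds, outcomes, n_boot=200, n_bins=5, seed=0):
    """Per bin: (pointwise mean, 2.5th percentile, 97.5th percentile) over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(preds)
    per_bin = [[] for _ in range(n_bins)]
    for _ in range(n_boot):
        # Resample individuals with replacement, then recompute the whole curve.
        idx = [rng.randrange(n) for _ in range(n)]
        curve = binned_calibration([preds[i] for i in idx],
                                   [outcomes[i] for i in idx], n_bins)
        for b, value in enumerate(curve):
            if value is not None:
                per_bin[b].append(value)
    summary = []
    for values in per_bin:
        if not values:
            summary.append(None)
            continue
        values.sort()
        lo = values[int(0.025 * (len(values) - 1))]
        hi = values[int(0.975 * (len(values) - 1))]
        summary.append((sum(values) / len(values), lo, hi))
    return summary

# Synthetic, well-calibrated example data (an assumption for illustration only).
data_rng = random.Random(1)
preds = [data_rng.random() for _ in range(2000)]
outcomes = [1 if data_rng.random() < p else 0 for p in preds]
curve = bootstrap_calibration(preds, outcomes)
```

For well-calibrated predictions, each bin's mean observed rate should sit near the bin's midpoint, which corresponds to a curve lying close to the dashed reference line.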
Figure 3
Decision curves of all models. Net benefit was standardized to have a max value of 1. Curves are shown as pointwise mean curves calculated by bootstrapping. BP blood pressure, KNN K-nearest neighbors, SVM support vector machines, XGBoost eXtreme gradient boosting.
Figure 4
Decision curves with histogram of predictions above. The histogram is colored by the proportion of incidence. Net Benefit is standardized to have a max value of 1. Curves are shown as pointwise mean curves with red shaded 95% confidence interval calculated by bootstrapping. KNN K-nearest neighbors, SVM support vector machines, XGBoost eXtreme gradient boosting.
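As a rough illustration of a net benefit "standardized to have a max value of 1", the sketch below computes the usual decision-curve net benefit at one risk threshold and divides it by the outcome prevalence. Dividing by prevalence is a common standardization and is assumed here; the captions do not state the exact formula used. The function names and the toy data are hypothetical.

```python
def net_benefit(preds, outcomes, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(preds, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(preds, outcomes) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - fp / n * threshold / (1 - threshold)

def standardized_net_benefit(preds, outcomes, threshold):
    """Net benefit divided by prevalence, so a perfect model scores 1 (assumed scaling)."""
    prevalence = sum(outcomes) / len(outcomes)
    return net_benefit(preds, outcomes, threshold) / prevalence

# Toy data: 2 cases among 8 individuals (prevalence 0.25).
outcomes = [1, 1, 0, 0, 0, 0, 0, 0]
perfect = [0.9, 0.8, 0.1, 0.1, 0.2, 0.1, 0.05, 0.1]  # separates cases from non-cases
treat_all = [1.0] * 8                                 # flags everyone as high risk

snb_perfect = standardized_net_benefit(perfect, outcomes, 0.25)
snb_treat_all = standardized_net_benefit(treat_all, outcomes, 0.25)
```

In this toy setup the perfect classifier reaches the maximum of 1, while the treat-all strategy has zero net benefit when the threshold equals the prevalence. A decision curve repeats this calculation over a grid of thresholds.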
Figure 5
(a) Coefficient sizes in least absolute shrinkage and selection operator (LASSO) regression fitted on the training set with increasing regularization. Only the last 10 features to be zeroed out are shown. (b) The performance of the LASSO regression model on the test set as regularization was increased. Curves are shown as pointwise mean curves with red shaded 95% confidence interval calculated by bootstrapping. AUC area under the receiver operating characteristic curve, BMI body mass index, BP blood pressure, Chol cholesterol, Fam. hist. of hyp. family history of hypertension, HDL high-density lipoprotein, ICI integrated calibration index, Log natural logarithm, PAI physical activity indicator.
Figure 6
Permutation importance calculated for XGBoost, SVM, KNN and random forest models. The importance of a feature or cluster was determined as the average decrease in Scaled Brier score on the test set when the feature or cluster was permuted. Features are colored following Fig. 5, Panel A, with gray for ‘Sex’ and ‘Marital status’, and combined colors for clusters. Irrelevant features or clusters, defined as those with a mean decrease of less than 0.004 in Scaled Brier score, were left out for conciseness. Features in clusters were permuted simultaneously. BMI body mass index, BP blood pressure, Cl. # feature cluster #, HDL high-density lipoprotein, KNN K-nearest neighbors, SVM support vector machines, XGBoost eXtreme gradient boosting.
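The permutation importance described in the Figure 6 caption can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the scaled Brier score is taken as 1 minus the ratio of the Brier score to that of a reference model predicting the mean incidence, the model is a hypothetical fixed logistic function of age (`toy_model`), and the data are synthetic. Columns passed together are permuted with the same shuffle, mirroring "features in clusters were permuted simultaneously".

```python
import math
import random

def scaled_brier(preds, outcomes):
    """1 - Brier / Brier_ref, where the reference predicts the mean incidence for everyone."""
    n = len(outcomes)
    brier = sum((p - y) ** 2 for p, y in zip(preds, outcomes)) / n
    base = sum(outcomes) / n
    brier_ref = base * (1 - base)
    return 1 - brier / brier_ref

def permutation_importance(model, rows, outcomes, columns, n_repeats=20, seed=0):
    """Mean decrease in scaled Brier score when `columns` are permuted together."""
    rng = random.Random(seed)
    baseline = scaled_brier([model(r) for r in rows], outcomes)
    drops = []
    for _ in range(n_repeats):
        order = list(range(len(rows)))
        rng.shuffle(order)
        permuted = []
        for i, row in enumerate(rows):
            new_row = list(row)
            for c in columns:
                # One shared shuffle keeps a cluster's columns aligned with each other.
                new_row[c] = rows[order[i]][c]
            permuted.append(new_row)
        drops.append(baseline - scaled_brier([model(r) for r in permuted], outcomes))
    return sum(drops) / n_repeats

def toy_model(row):
    """Hypothetical risk model: depends on age (column 0) and ignores column 1."""
    age, _unused = row
    return 1 / (1 + math.exp(-0.08 * (age - 50)))

# Synthetic data: age drives the outcome, the second column is pure noise.
data_rng = random.Random(42)
rows = [[data_rng.uniform(20, 85), data_rng.gauss(0, 1)] for _ in range(1000)]
outcomes = [1 if data_rng.random() < toy_model(r) else 0 for r in rows]

imp_age = permutation_importance(toy_model, rows, outcomes, [0])
imp_noise = permutation_importance(toy_model, rows, outcomes, [1])
```

Here the noise column has zero importance because the toy model never reads it, so it would fall below the 0.004 cutoff mentioned in the caption, while permuting age clearly lowers the scaled Brier score.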
