EBioMedicine. 2022 Oct:84:104243.
doi: 10.1016/j.ebiom.2022.104243. Epub 2022 Sep 6.

Machine learning integration of multimodal data identifies key features of blood pressure regulation

Panayiotis Louca et al. EBioMedicine. 2022 Oct.

Abstract

Background: Association studies have identified several biomarkers for blood pressure and hypertension, but a thorough understanding of their mutual dependencies is lacking. By integrating two different high-throughput datasets, biochemical and dietary data, we aim to understand the multifactorial contributors of blood pressure (BP).

Methods: We included 4,863 participants from TwinsUK with concurrent BP, metabolomics, genomics, biochemical measures, and dietary data. We used 5-fold cross-validation with the XGBoost machine learning algorithm to identify important features in the context of one another in TwinsUK (80% training, 20% test). The features tested in TwinsUK were then probed using the same algorithm in an independent dataset of 2,807 individuals from the Qatari Biobank (QBB).
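As a rough illustration of the partitioning described in Methods, the sketch below generates 5-fold cross-validation indices using only the Python standard library. It assumes one reading of the abstract, namely that each fold serves once as the roughly 20% test set while the other four folds form the roughly 80% training set. The function name `k_fold_indices` and the seed are hypothetical, no model is fitted, and the sketch ignores any grouping of related participants (such as twin pairs), which the abstract does not describe.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper for illustration; not the authors' code.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k near-equal folds
    for i, test in enumerate(folds):
        # Training set is every fold except the i-th one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 4,863 is the TwinsUK sample size reported in the abstract.
splits = list(k_fold_indices(4863, k=5))
for train, test in splits:
    print(len(train), len(test), round(len(test) / 4863, 3))
```

With k = 5, every participant appears in exactly one test fold, and each test fold holds about one fifth of the sample.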

Findings: Our model explained 39·2% [4·5%, MAE:11·32 mmHg (95%CI, +/- 0·65)] of the variance in systolic BP (SBP) in TwinsUK. Of the top 50 features, the most influential non-demographic variables were dihomo-linolenate, cis-4-decenoyl carnitine, lactate, chloride, urate, and creatinine along with dietary intakes of total, trans and saturated fat. We also highlight the incremental value of each included dimension. Furthermore, we replicated our model in the QBB [SBP variance explained = 45·2% (13·39%)] cohort and 30 of the top 50 features overlapped between cohorts.
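The two performance figures quoted in Findings, variance explained and mean absolute error (MAE), can be computed as in the minimal sketch below. It assumes that "variance explained" corresponds to the usual coefficient of determination (R²), which the abstract does not state explicitly. The function names and the four SBP values are made up for illustration and are not study data.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def variance_explained(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Illustrative values in mmHg (hypothetical, not from the paper).
actual = [110.0, 120.0, 130.0, 140.0]
predicted = [115.0, 118.0, 128.0, 135.0]

mae = mean_absolute_error(actual, predicted)
r2 = variance_explained(actual, predicted)
print(f"MAE = {mae:.2f} mmHg, variance explained = {r2:.1%}")
```

MAE is expressed in the units of the outcome (mmHg), while variance explained is unitless, which is why the abstract reports both.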

Interpretation: We show that an integrated analysis of omics, biochemical and dietary data improves our understanding of their interrelationships and expands the range of potential biomarkers for blood pressure. Our results point to potentially key biological pathways to be prioritised for mechanistic studies.

Funding: Chronic Disease Research Foundation, Medical Research Council, Wellcome Trust, Qatar Foundation.

Keywords: Blood pressure; Diet; Genomics; Machine learning; Metabolomics.

Conflict of interest statement

Declaration of interests: T.D.S. is a co-founder and shareholder of Zoe Global. All other authors declare no competing financial interests.

Figures

Figure 1
Consort diagram of data quality control, machine learning model building and model evaluation. Data included traditional risk factors (age, sex and BMI), biochemical measures, 206 known metabolites, a BP polygenic risk score, and energy intake and dietary intake for 45 nutrients (including salt intake).
Figure 2
Importance of top 50 features in SBP. Bars represent SHAP values indicating the average relative importance of each feature, coloured according to the type of data. Base layer labels indicate metabolite super pathways, where A = amino acids, Ch = carbohydrates, L = lipids, N = nucleotides, P = peptides, and CoV = cofactors and vitamins.
Figure 3
SHAP plot of top 50 features influencing our model's prediction of SBP. Features are ranked in descending order based on their influence on our model and the x-axis denotes SHAP values. Each dot represents an individual subject and is coloured according to the magnitude of the feature (red depicts a higher feature value, and blue depicts a lower value). The horizontal location of a dot (x-axis) depicts whether it corresponds with a higher or lower prediction.
Figure 4
Scatter plot of SBP within our sample and predicted SBP of the XGBoost algorithm. Actual SBP of each subject within our sample is plotted along the y-axis and predicted SBP from our model along the x-axis (in mmHg). The colour gradient of each point denotes the density of participants within a particular region of the plot.
Figure 5
Biplot depicting the principal component analysis of features most influential to our SBP model. Lines depict how strongly each feature influences a principal component, and the angle between two features represents the correlation between those features.
