Nat Commun. 2023 Feb 3;14(1):604.
doi: 10.1038/s41467-023-36231-7.

Atlas of plasma NMR biomarkers for health and disease in 118,461 individuals from the UK Biobank

Heli Julkunen et al. Nat Commun. 2023.

Abstract

Blood lipids and metabolites are markers of current health and future disease risk. Here, we describe plasma nuclear magnetic resonance (NMR) biomarker data for 118,461 participants in the UK Biobank. The biomarkers cover 249 measures of lipoprotein lipids, fatty acids, and small molecules such as amino acids, ketones, and glycolysis metabolites. We provide an atlas of associations of these biomarkers to prevalence, incidence, and mortality of over 700 common diseases ( nightingalehealth.com/atlas ). The results reveal a plethora of biomarker associations, including susceptibility to infectious diseases and risk of various cancers, joint disorders, and mental health outcomes, indicating that abundant circulating lipids and metabolites are risk markers beyond cardiometabolic diseases. Clustering analyses indicate similar biomarker association patterns across different disease types, suggesting latent systemic connectivity in the susceptibility to a diverse set of diseases. This work highlights the value of NMR based metabolic biomarker profiling in large biobanks for public health research and translation.

Conflict of interest statement

H.J., M.T., H.K., K.N., V.M., J.N.-K., A.J.K., P.S., J.B., and P.W. are employees of Nightingale Health Plc, and hold shares or stock options in Nightingale Health Plc. A.C. is a former employee of Nightingale Health Plc. V.S. has received an honorarium for consulting from Sanofi and has ongoing research collaboration with Bayer Ltd outside this work. The remaining authors declare no competing interests.

Figures

Fig. 1. Nuclear magnetic resonance (NMR) biomarker data in the UK Biobank and atlas of disease associations.
a Process of the Nightingale Health-UK Biobank Initiative: 1) EDTA plasma samples from the baseline survey were prepared on 96-well plates and shipped to Nightingale Health laboratories in Finland, 2) Buffer was added and samples transferred to NMR tubes, 3) Samples were measured using six 500 MHz proton NMR spectrometers, 4) Automated spectral processing software was used to quantify 249 biomarker measures from each sample, 5) Quality control metrics based on blind duplicates and internal control samples were used to track consistency throughout the project, 6) Biomarker data were cleaned, provided to UK Biobank and released to the research community. b Overview of biomarker types included in the Nightingale Health NMR biomarker panel. c Schematic illustration of the atlas of biomarker-disease associations published along with this study. The webtool can display the associations of all biomarkers versus prevalence, incidence and mortality of each disease endpoint, as well as each biomarker versus all disease endpoints.
Fig. 2. Biomarkers for future disease onset across a spectrum of diseases.
a Total number of incident disease associations by biomarker at statistical significance level p < 5e-5. The disease outcomes were defined based on 3-character ICD-10 codes with 50 or more events from chapters A-N, with a total of 556 diseases tested for association. The colour coding indicates the proportion of associations coming from each ICD-10 chapter from A to N. b–e Twenty most significant associations for four biomarkers: b Glycoprotein acetyls, c Ratio of polyunsaturated fatty acids to monounsaturated fatty acids (PUFA/MUFA), d Alanine, and e Branched-chain amino acids (BCAA). The forest plots highlight 20 of the most significant associations, arranged according to decreasing association magnitude. Data are presented as hazard ratios and 95% confidence intervals (CI), per SD-scaled biomarker concentrations. All models were adjusted for age, sex and UK Biobank assessment centre, using age as the timescale of the Cox proportional hazards regression. Similar disease-wide association plots for all 249 biomarkers across all endpoints analysed are available in the biomarker-disease atlas webtool. Source data are provided as a Source Data file.
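The quantities named in this caption (SD-scaled concentrations, hazard ratios with 95% CI, and the p < 5e-5 threshold) relate to each other as in the minimal sketch below. It assumes a Cox model has already produced a log hazard ratio and its standard error; the function names, the Wald test with a normal approximation, and the 1.96 multiplier are illustrative assumptions, not details taken from the paper. Only the threshold value comes from the caption.

```python
import math

P_THRESHOLD = 5e-5  # significance level stated in the figure captions


def sd_scale(values):
    """Centre values and scale them to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def hazard_ratio_summary(log_hr, se):
    """Convert a log hazard ratio and its standard error (hypothetical
    inputs, e.g. from a fitted Cox model) into HR, 95% CI and a
    two-sided Wald p value under a normal approximation."""
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return {
        "hr": math.exp(log_hr),
        "ci_low": math.exp(log_hr - 1.96 * se),
        "ci_high": math.exp(log_hr + 1.96 * se),
        "p": p,
        "significant": p < P_THRESHOLD,
    }
```

Because the biomarker is SD-scaled before fitting, the resulting hazard ratio reads as the change in hazard per one standard deviation higher concentration, which is what makes biomarkers measured in different units comparable on one forest plot.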
Fig. 3. Biomarker profiles for the incidence of various types of diseases.
Hazard ratios of biomarkers with the incidence of six disease examples: A41 Sepsis (red; n = 117,806, 2986 events), C34 Lung cancer (light blue; n = 117,964, 1210 events), F32 Depression (green; n = 116,993, 5455 events), G47 Sleep disorders (dark blue; n = 117,325, 1865 events), I21 Myocardial infarction (orange; n = 116,797, 2523 events) and M81 Osteoporosis (lavender; n = 117,538, 3326 events). Data are presented as hazard ratios and 95% confidence intervals (CI), per SD-scaled biomarker concentrations. The models were adjusted for age, sex and UK Biobank assessment centre, using age as the timescale of the Cox proportional hazards regression. Filled points indicate statistically significant associations (p < 5e-5), and hollow points are non-significant ones. Similar forest plots for all 249 NMR biomarkers across all endpoints analysed are provided in the biomarker-disease atlas webtool. BCAA indicates branched-chain amino acids, DHA docosahexaenoic acid, MUFA monounsaturated fatty acids, PUFA polyunsaturated fatty acids, SFA saturated fatty acids. Source data are provided as a Source Data file.
Fig. 4. Clustering of incident diseases according to their biomarker signatures.
a Heatmap showing the clustering of biomarker association signatures for the incidence of a diverse set of diseases. The diseases represent three diseases from each ICD-10 chapter from A to N, selected based on the highest number of significant associations. The colouring indicates the association magnitudes in units of the effect sizes, i.e. log(hazard ratio per SD). The dendrograms depict the similarity of the association patterns, computed using complete linkage clustering based on the linear correlation between the association signatures. Significant associations with p value < 5e-5 are marked with an asterisk. All models were adjusted for age, sex and UK Biobank assessment centre, using age as the timescale of the Cox proportional hazards regression. Examples of overall biomarker signatures compared for incidence of b Other diseases of liver (K76) and Other polyneuropathies (G62), and c Acute myocardial infarction (I21) and Heart failure (I50). The hazard ratios for each biomarker are shown as points with 95% confidence intervals (CI) indicated in vertical and horizontal error bars. The colouring of the points indicates the significance of the biomarker association for the pair of diseases. The red lines denote a hazard ratio of 1, and the grey line denotes the diagonal. BCAA indicates branched-chain amino acids, DHA docosahexaenoic acid, MUFA monounsaturated fatty acids, PUFA polyunsaturated fatty acids, SFA saturated fatty acids. Source data are provided as a Source Data file.
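Complete linkage clustering on correlation between association signatures, as described in this caption, can be sketched as follows. This is an illustration under assumptions: the distance is taken to be 1 minus the Pearson correlation (the caption states only that linear correlation was used), and the function names and input layout are hypothetical. Each signature is a list of log(hazard ratio per SD) values, one per biomarker, in the same biomarker order for every disease.

```python
import math


def pearson(x, y):
    """Pearson (linear) correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def complete_linkage(signatures):
    """Agglomerative clustering with complete linkage.

    signatures: dict mapping a disease label to its list of log(HR per SD).
    Returns the merges in order as (cluster_a, cluster_b, distance), where
    distance = 1 - Pearson correlation (an assumed choice of metric).
    """
    names = list(signatures)
    dist = {
        frozenset((a, b)): 1.0 - pearson(signatures[a], signatures[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest
                # pairwise distance between their members.
                d = max(dist[frozenset((a, b))]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

The order and heights of the returned merges are what a dendrogram draws: diseases whose biomarker signatures are highly correlated join early at a small distance, while anticorrelated signatures join last.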
Fig. 5. Comparison of nuclear magnetic resonance (NMR) and clinical chemistry biomarker associations.
Hazard ratios of biomarkers for which both NMR-based (red) and clinical chemistry (blue) measurements are available, against the incidence of six disease examples: A41 Sepsis (n = 117,806, 2986 events), C34 Lung cancer (n = 117,964, 1210 events), F32 Depression (n = 116,993, 5455 events), G47 Sleep disorders (n = 117,325, 1865 events), I21 Myocardial infarction (n = 116,797, 2523 events) and M81 Osteoporosis (n = 117,538, 3326 events). Data are presented as hazard ratios and 95% confidence intervals (CI), per SD-scaled biomarker concentrations. The models were adjusted for age, sex and UK Biobank assessment centre, using age as the timescale of the Cox proportional hazards regression. Filled points indicate statistically significant associations (p < 5e-5), and hollow points non-significant ones. Source data are provided as a Source Data file.
Fig. 6. Replication of biomarker associations with incident disease.
Biomarker associations for six disease endpoints are shown for THL Biobank (red) and UK Biobank for the full study population (light blue) as well as for individuals without self-reported use of cholesterol-lowering medication (dark blue): a All-cause mortality, b Major adverse cardiovascular event, c Diabetes, d Chronic obstructive pulmonary disease (COPD), e Chronic kidney failure and f Liver diseases. Results from THL Biobank were meta-analysed for five prospective Finnish cohorts (FINRISK 1997, 2002, 2007, and 2012, and Health 2000). Data are presented as hazard ratios and 95% confidence intervals (CI), per SD-scaled biomarker concentrations. All models were adjusted for age and sex, using age as the timescale of the Cox proportional hazards regression. Analyses in the UK Biobank were additionally adjusted for the UK Biobank assessment centre. Filled points indicate statistically significant associations (p < 5e-5), and hollow points non-significant ones. The black horizontal line denotes a hazard ratio of 1. Event numbers for incident disease or mortality in the two biobanks are shown in Table 2. ICD-10 codes used for compiling the composite endpoints are listed in Supplementary Table 2. The replication results are shown here for six endpoints available in THL Biobank; results for all overlapping endpoints are shown in Supplementary Fig. 16. Results are shown separately for each of the five Finnish cohorts in Supplementary Fig. 17. BCAA indicates branched-chain amino acids, DHA docosahexaenoic acid, MUFA monounsaturated fatty acids, PUFA polyunsaturated fatty acids, SFA saturated fatty acids. Source data are provided as a Source Data file.
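The caption states that results from the five Finnish cohorts were meta-analysed but does not say how. One common way to pool per-cohort log hazard ratios is fixed-effect inverse-variance weighting, sketched below purely as an assumption about what such a step might look like; the function name and input format are hypothetical.

```python
import math


def fixed_effect_meta(estimates):
    """Pool per-cohort estimates by inverse-variance weighting.

    estimates: list of (log_hr, se) tuples, one per cohort.
    Returns (pooled_log_hr, pooled_se). The fixed-effect model is an
    assumed choice; the source does not specify the pooling method.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * log_hr for w, (log_hr, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se
```

Cohorts with smaller standard errors (typically those with more events) receive more weight, and the pooled standard error is never larger than the smallest individual one. The pooled log hazard ratio can then be exponentiated to give the hazard ratio and CI shown on a forest plot.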
Fig. 7. Age-stratified biomarker profiles for the onset of various types of diseases.
Biomarker profiles stratified by age tertiles: 1st tertile (3–53 years of age; dark blue), 2nd tertile (54–61 years of age; red) and 3rd tertile (62–71 years of age; green). Results are shown for 17 biomarkers across six disease examples: A41 Sepsis (n = 117,806, 2986 events), C34 Lung cancer (n = 117,964, 1210 events), F32 Depression (n = 116,993, 5455 events), G47 Sleep disorders (n = 117,325, 1865 events), I21 Myocardial infarction (n = 116,797, 2523 events) and M81 Osteoporosis (n = 117,538, 3326 events). Results for the remaining 20 biomarkers are shown in Supplementary Fig. 19. Data are presented as hazard ratios and 95% confidence intervals (CI), per SD-scaled biomarker concentrations. The models were adjusted for age, sex and UK Biobank assessment centre, using age as the timescale of the Cox proportional hazards regression. Filled points indicate statistically significant associations (p < 5e-5), and hollow points non-significant ones. Similar forest plots for all 249 NMR biomarkers across all endpoints analysed are provided in the biomarker-disease atlas webtool. DHA indicates docosahexaenoic acid, MUFA monounsaturated fatty acids, PUFA polyunsaturated fatty acids, SFA saturated fatty acids. Source data are provided as a Source Data file.
