Neuroimage. 2021 Aug 15:237:118189.
doi: 10.1016/j.neuroimage.2021.118189. Epub 2021 May 20.

Integrating large-scale neuroimaging research datasets: Harmonisation of white matter hyperintensity measurements across Whitehall and UK Biobank datasets

Valentina Bordin et al. Neuroimage. 2021.

Abstract

Large scale neuroimaging datasets present the possibility of providing normative distributions for a wide variety of neuroimaging markers, which would vastly improve the clinical utility of these measures. However, a major challenge is our current poor ability to integrate measures across different large-scale datasets, due to inconsistencies in imaging and non-imaging measures across the different protocols and populations. Here we explore the harmonisation of white matter hyperintensity (WMH) measures across two major studies of healthy elderly populations, the Whitehall II imaging sub-study and the UK Biobank. We identify pre-processing strategies that maximise the consistency across datasets and utilise multivariate regression to characterise study sample differences contributing to differences in WMH variations across studies. We also present a parser to harmonise WMH-relevant non-imaging variables across the two datasets. We show that we can provide highly calibrated WMH measures from these datasets with: (1) the inclusion of a number of specific standardised processing steps; and (2) appropriate modelling of sample differences through the alignment of demographic, cognitive and physiological variables. These results open up a wide range of applications for the study of WMHs and other neuroimaging markers across extensive databases of clinical data.

Keywords: Harmonisation; MRI; UK Biobank; White matter hyperintensities.

Conflict of interest statement

Declaration of Competing Interest: M.J. receives royalties from licensing of FSL to non-academic, commercial entities.

Figures

Fig. 1
Effect of rater, assessed both in terms of between- (A and B) and within-rater variability (C). Each panel displays a comparison of the agreement (measured with Dice Similarity Index) between manual masks annotated by the raters (left box-plots) and BIANCA outputs generated with masks from those raters (right box-plots). Solid and dotted lines refer to results obtained on subjects characterised, respectively, by high and low WMH load. Legend: R1 = rater 1; R2a = rater 2, first rating; R2b = rater 2, second rating (1 year after the first rating and blind to it); M = manual; B = BIANCA.
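The Dice Similarity Index used in this caption can be illustrated with a minimal sketch. It assumes each mask is a flat sequence of 0/1 voxel labels. The function name, the toy masks, and the convention for two empty masks are assumptions made for illustration and are not taken from the paper or from BIANCA.

```python
def dice_index(mask_a, mask_b):
    """Dice Similarity Index, 2|A ∩ B| / (|A| + |B|), for two binary masks.

    Masks are flat sequences of 0/1 voxel labels (hypothetical layout).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


# Toy example: a "manual" mask and an "automated" mask over six voxels.
manual = [0, 1, 1, 1, 0, 0]
automated = [0, 1, 1, 0, 1, 0]
print(dice_index(manual, automated))
```

The same function applies to rater-versus-rater and manual-versus-automated comparisons, since the index is symmetric in its two arguments.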
Fig. 2
Effect of bias field correction (BC) on ‘travelling heads’ data from the WH dataset. (A) Example data from 1 subject acquired on both scanners, before and after BC, showing improvement in image similarity after BC. (B) Cost function (correlation ratio) between Scanner1/Scanner2 images of the 5 travelling head participants, calculated before and after BC (*** p < 0.001).
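For readers unfamiliar with the correlation ratio, the following is a rough sketch of its textbook definition: the fraction of variance in one image that is explained by binning the intensities of the other. It is not the FSL implementation. The caption does not state whether the plotted cost is this quantity or one minus it, and the function name, the bin count, and the handling of a constant image are assumptions.

```python
def correlation_ratio(ref, other, n_bins=8):
    """Textbook correlation ratio (eta squared) of `other` given binned `ref`.

    Both images are flat sequences of intensities of equal length.
    Returns 1 - (within-bin sum of squares / total sum of squares).
    """
    if len(ref) != len(other) or not ref:
        raise ValueError("images must be non-empty and of equal length")
    lo, hi = min(ref), max(ref)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for a constant image
    bins = [[] for _ in range(n_bins)]
    for r, o in zip(ref, other):
        idx = min(int((r - lo) / width), n_bins - 1)
        bins[idx].append(o)
    total_mean = sum(other) / len(other)
    total_ss = sum((o - total_mean) ** 2 for o in other)
    if total_ss == 0:
        # Undefined for a constant image; this sketch returns 0.0 by choice.
        return 0.0
    within_ss = 0.0
    for values in bins:
        if values:
            mean = sum(values) / len(values)
            within_ss += sum((o - mean) ** 2 for o in values)
    return 1.0 - within_ss / total_ss


# Toy example: `other` is fully determined by `ref`, so the ratio is maximal.
print(correlation_ratio([0, 1, 2, 3], [10, 20, 30, 40], n_bins=4))
```

Unlike a plain correlation coefficient, this measure does not assume a linear intensity relationship between the two images.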
Fig. 3
BIANCA performance – scanner upgrade scenario. Box-plot of the Dice Similarity Index (DI) between BIANCA output and the corresponding manual masks for the different analysis options tested during our study (specified on the x axis). All the displayed results were evaluated on a sub-sample of manually segmented subjects (12 for WH1 and 12 for WH2) balanced in terms of WMH load and using leave-one-out cross-validation whenever appropriate (details in the main text).
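Leave-one-out cross-validation, as mentioned in this caption, can be sketched as follows: each manually segmented subject is held out in turn while the remaining subjects form the training set. The subject identifiers and the function name are hypothetical, and the actual training and testing steps are outside the scope of this sketch.

```python
def leave_one_out(subjects):
    """Yield (training_subjects, held_out_subject) pairs, one per subject."""
    for i, held_out in enumerate(subjects):
        training = subjects[:i] + subjects[i + 1:]
        yield training, held_out


# Hypothetical identifiers for three manually segmented subjects.
subjects = ["sub-01", "sub-02", "sub-03"]
for training, held_out in leave_one_out(subjects):
    # A segmentation model would be trained on `training` and evaluated
    # on `held_out` here, so no subject is tested with its own mask.
    print(held_out, training)
```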
Fig. 4
Association between WMHs and age – scanner upgrade scenario. Scatter plot of the relationship between WMH volumes (expressed as % of total brain volume, y axis) and age (x axis), for WH1 (cyan) and WH2 (purple) data. Regression lines with 95% confidence interval are also displayed. Each plot refers to one of the investigated analysis options: (A) without BC, single-site training, FA included; (B) with BC, single-site training, FA included; (C) with BC, site-specific training, FA included; (D) with BC, mixed training, FA included; (E) with BC, mixed training, FA excluded. Evaluation was conducted on the full sample of data for both datasets (WH1 = 513, WH2 = 200).
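The two quantities plotted here, WMH volume as a percentage of total brain volume and a regression line against age, can be sketched as below. All numbers are made up for illustration and are not study data. The function names are hypothetical, and the 95% confidence interval shown in the figure is not computed.

```python
def wmh_percent(wmh_volume, brain_volume):
    """Express a WMH volume as a percentage of total brain volume."""
    return 100.0 * wmh_volume / brain_volume


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up values: ages in years, volumes in arbitrary but consistent units.
ages = [62, 68, 71, 75]
wmh_volumes = [2.0, 4.5, 5.0, 9.0]
brain_volumes = [1100.0, 1080.0, 1050.0, 1020.0]

percentages = [wmh_percent(w, b) for w, b in zip(wmh_volumes, brain_volumes)]
slope, intercept = fit_line(ages, percentages)
print(slope, intercept)
```

Fitting one line per dataset and comparing the lines is one way to read the panels: the closer the fitted lines, the more comparable the measures across scanners.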
Fig. 5
Multivariate model – scanner upgrade scenario. Percentage of variance (y axis) explained by non-imaging variables (reported on the x axis) in the linear multivariate model that was implemented (Elastic Net). Evaluation was conducted on the full sample of data (WH1 = 513, WH2 = 200). Each plot refers to one of the investigated analysis options: (A) without BC, single-site training, FA included; (B) with BC, single-site training, FA included; (C) with BC, site-specific training, FA included; (D) with BC, mixed training, FA included; (E) with BC, mixed training, FA excluded. Variable scanner/site (SC) highlighted in red. Values are reported in Table 6 and Supplementary Table S3.
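For reference, an Elastic Net is a linear regression with a combined L1 and L2 penalty. One common way of writing its objective is given below. The symbols and the scaling of the penalty terms are assumptions, since the exact parameterisation used in the study is not stated here: $y_i$ is the WMH measure for subject $i$, $x_i$ the vector of non-imaging variables (including scanner/site), $\lambda \ge 0$ the overall penalty strength, and $\alpha \in [0, 1]$ the mix between the two penalties.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}\;
    \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{\top} \beta \right)^2
    + \lambda \left[ \alpha \lVert \beta \rVert_1
    + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right]
```

With $\alpha = 1$ this reduces to the lasso and with $\alpha = 0$ to ridge regression. The L1 term can shrink the coefficients of uninformative variables to exactly zero.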
Fig. 6
BIANCA performance – retrospective data merging scenario. Box-plot of the Dice Similarity Index (DI) between BIANCA output and the corresponding manual mask for the different analysis options tested during our study (specified on the x axis). All the displayed results were evaluated on a sub-sample of manually segmented subjects (12 for WH1, 12 for WH2 and 12 for UKB) balanced in terms of WMH load and using leave-one-out cross-validation.
Fig. 7
Association between WMHs and age – retrospective data merging scenario. Scatter plot of the relationship between WMH volumes (expressed as % of total brain volume, y axis) and age (x axis), for WH1 (cyan), WH2 (purple) and UKB (orange) data. Regression lines with 95% confidence interval are also displayed. Each plot refers to one of the investigated analysis options: (A) with BC, site-specific training, FA excluded; (B) with BC, mixed training, FA excluded. Evaluation was conducted on the full sample of data for all datasets (WH1 = 513, WH2 = 200, UKB = 2285).
Fig. 8
Multivariate model – retrospective data merging scenario. Percentage of variance (reported on the y axis) explained by non-imaging variables (reported on the x axis) in the linear multivariate model that was implemented (Elastic Net). Evaluation was conducted on the full sample of data for all the involved populations (WH1 = 513, WH2 = 200, UKB = 2285). Each plot refers to one of the investigated analysis options: (A) with BC, site-specific training, FA excluded; (B) with BC, mixed training, FA excluded. Variable scanner/site (SC) highlighted in red. Values are reported in Table 6 and Supplementary Table S3.
