Detect and correct bias in multi-site neuroimaging datasets

Christian Wachinger¹, Anna Rieckmann², Sebastian Pölsterl³; Alzheimer’s Disease Neuroimaging Initiative and the Australian Imaging Biomarkers and Lifestyle flagship study of ageing

Affiliations

¹ Lab for Artificial Intelligence in Medical Imaging (AI-Med), Department of Child and Adolescent Psychiatry, University Hospital, LMU München, Germany. Electronic address: christian.wachinger@med.uni-muenchen.de.
² Umeå Center for Functional Brain Imaging, Department of Radiation Sciences, Umeå University.
³ Lab for Artificial Intelligence in Medical Imaging (AI-Med), Department of Child and Adolescent Psychiatry, University Hospital, LMU München, Germany.

PMID: 33152602
DOI: 10.1016/j.media.2020.101879

Med Image Anal. 2021 Jan;67:101879. Epub 2020 Oct 21.

Abstract

The desire to train complex machine learning algorithms and to increase the statistical power in association studies drives neuroimaging research to use ever-larger datasets. The most obvious way to increase sample size is by pooling scans from independent studies. However, simple pooling is often ill-advised as selection, measurement, and confounding biases may creep in and yield spurious correlations. In this work, we combine 35,320 magnetic resonance images of the brain from 17 studies to examine bias in neuroimaging. In the first experiment, Name That Dataset, we provide empirical evidence for the presence of bias by showing that scans can be correctly assigned to their respective dataset with 71.5% accuracy. Given such evidence, we take a closer look at confounding bias, which is often viewed as the main shortcoming in observational studies. In practice, we neither know all potential confounders nor do we have data on them. Hence, we model confounders as unknown, latent variables. Kolmogorov complexity is then used to decide whether the confounded or the causal model provides the simplest factorization of the graphical model. Finally, we present methods for dataset harmonization and study their ability to remove bias in imaging features. In particular, we propose an extension of the recently introduced ComBat algorithm to control for global variation across image features, inspired by adjusting for unknown population stratification in genetics. Our results demonstrate that harmonization can reduce dataset-specific information in image features. Further, confounding bias can be reduced and even turned into a causal relationship. However, harmonization also requires caution as it can easily remove relevant subject-specific information. Code is available at https://github.com/ai-med/Dataset-Bias.

Keywords: Bias; Big data; Causal inference; Harmonization; MRI.
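
The abstract's two central ideas, detecting dataset bias by predicting which dataset a scan came from (Name That Dataset) and reducing it through harmonization, can be illustrated on synthetic data. The sketch below is not the authors' method or code (that is at the repository linked in the abstract). It assumes a simple per-site additive shift and multiplicative scale on each feature, uses a nearest-centroid classifier as a stand-in for the dataset classifier, and applies a plain location-scale adjustment rather than ComBat or its proposed extension. All function names and parameters are hypothetical.

```python
import numpy as np

def make_synthetic_sites(n_per_site=200, n_features=5, n_sites=3, seed=0):
    """Simulate features with an assumed per-site additive shift and scale."""
    rng = np.random.default_rng(seed)
    X, site = [], []
    for s in range(n_sites):
        shift = rng.normal(0.0, 2.0, size=n_features)   # additive site effect
        scale = rng.uniform(0.5, 1.5, size=n_features)  # multiplicative site effect
        X.append(shift + scale * rng.normal(size=(n_per_site, n_features)))
        site.append(np.full(n_per_site, s))
    return np.vstack(X), np.concatenate(site)

def nearest_centroid_accuracy(X, site, seed=0):
    """Predict the site of held-out samples from per-site training centroids."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    train, test = idx[:half], idx[half:]
    labels = np.unique(site)
    centroids = np.stack(
        [X[train][site[train] == s].mean(axis=0) for s in labels]
    )
    # Squared Euclidean distance of every test sample to every centroid.
    dists = ((X[test][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = labels[dists.argmin(axis=1)]
    return float((pred == site[test]).mean())

def location_scale_harmonize(X, site):
    """Standardize each feature within each site, then restore pooled mean/std.

    Simplified stand-in for harmonization: no empirical Bayes shrinkage and
    no protection of biological covariates, so it can also remove
    subject-specific signal that differs between sites.
    """
    X_h = np.empty_like(X, dtype=float)
    pooled_mean, pooled_std = X.mean(axis=0), X.std(axis=0)
    for s in np.unique(site):
        mask = site == s
        z = (X[mask] - X[mask].mean(axis=0)) / X[mask].std(axis=0)
        X_h[mask] = z * pooled_std + pooled_mean
    return X_h

X, site = make_synthetic_sites()
X_h = location_scale_harmonize(X, site)
acc_before = nearest_centroid_accuracy(X, site)
acc_after = nearest_centroid_accuracy(X_h, site)
print(f"site accuracy before: {acc_before:.2f}, after: {acc_after:.2f}")
```

In this toy setting, site prediction accuracy well above chance (1/3 for three sites) before harmonization signals dataset-specific information in the features, and a drop toward chance afterwards indicates that the adjustment removed it. Because the adjustment ignores covariates such as age or diagnosis, it also illustrates the abstract's caution that harmonization can strip relevant subject-specific information.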

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.