Representation Learning with Statistical Independence to Mitigate Bias
- PMID: 34522832
- PMCID: PMC8436589
- DOI: 10.1109/wacv48630.2021.00256
Abstract
The presence of bias (in datasets or tasks) is inarguably one of the most critical challenges in machine learning applications and has given rise to pivotal debates in recent years. Such challenges range from spurious associations between variables in medical studies to racial bias in gender or face recognition systems. Controlling for all types of bias at the dataset curation stage is cumbersome and sometimes impossible. The alternative is to use the available data and build models that incorporate fair representation learning. In this paper, we propose such a model based on adversarial training with two competing objectives, learning features that have (1) maximum discriminative power with respect to the task and (2) minimal statistical mean dependence on the protected (bias) variable(s). Our approach does so by incorporating a new adversarial loss function that encourages the correlation between the bias and the learned features to vanish. We apply our method to synthetic data, medical images (containing task bias), and a dataset for gender classification (containing dataset bias). Our results show that the features learned by our method not only yield superior prediction performance but are also unbiased.
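To make the two competing objectives concrete, the following is a minimal sketch and not the authors' implementation. It assumes the adversarial term is the squared Pearson correlation between a bias predictor's output and the true bias variable. The function names, the weight `lam`, and the stabilizer `eps` are hypothetical and chosen only for illustration.

```python
def squared_correlation(pred, target, eps=1e-12):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    # Unnormalized covariance and variances; the 1/n factors cancel in the ratio.
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    # eps (assumed) avoids division by zero when either input is constant.
    return cov * cov / (var_p * var_t + eps)


def adversarial_objectives(task_loss, bias_pred, bias_true, lam=1.0):
    """Return (encoder_loss, bias_predictor_loss) for one batch.

    Assumed setup: the bias predictor tries to recover the bias variable
    from the learned features, while the encoder tries to stay accurate
    on the task and drive that correlation toward zero.
    """
    corr2 = squared_correlation(bias_pred, bias_true)
    bias_predictor_loss = -corr2            # adversary maximizes correlation
    encoder_loss = task_loss + lam * corr2  # encoder minimizes it alongside the task loss
    return encoder_loss, bias_predictor_loss
```

In an actual training loop the two losses would be minimized in alternation with an automatic differentiation framework; plain Python sequences are used here only to keep the arithmetic visible.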