2021 Jan;2021:2512-2522.
doi: 10.1109/wacv48630.2021.00256. Epub 2021 Jun 14.

Representation Learning with Statistical Independence to Mitigate Bias

Ehsan Adeli et al. IEEE Winter Conf Appl Comput Vis. 2021 Jan.

Abstract

The presence of bias (in datasets or tasks) is inarguably one of the most critical challenges in machine learning and has given rise to pivotal debates in recent years. Such challenges range from spurious associations between variables in medical studies to racial bias in gender or face recognition systems. Controlling for all types of bias at the dataset-curation stage is cumbersome and sometimes impossible. The alternative is to use the available data and build models that incorporate fair representation learning. In this paper, we propose such a model based on adversarial training with two competing objectives, learning features that have (1) maximum discriminative power with respect to the task and (2) minimal statistical mean dependence on the protected (bias) variable(s). Our approach does so by incorporating a new adversarial loss function that encourages a vanishing correlation between the bias and the learned features. We apply our method to synthetic data, medical images (containing task bias), and a dataset for gender classification (containing dataset bias). Our results show that the features learned by our method not only yield superior prediction performance but are also unbiased.
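The correlation-based adversarial penalty described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the helper names are hypothetical, and a squared Pearson correlation between each feature dimension and the protected variable stands in for the paper's adversarial loss term.

```python
import numpy as np

def squared_pearson_corr(f, b):
    """Squared Pearson correlation between one feature column f and bias b."""
    f = f - f.mean()
    b = b - b.mean()
    denom = np.sqrt((f ** 2).sum() * (b ** 2).sum())
    if denom == 0.0:
        return 0.0  # a constant column carries no linear dependence
    return float(((f * b).sum() / denom) ** 2)

def bias_penalty(features, bias):
    """Adversarial surrogate: mean squared correlation between each learned
    feature dimension and the protected variable. Driving this toward zero
    encourages vanishing linear dependence between features and bias."""
    return float(np.mean([squared_pearson_corr(features[:, j], bias)
                          for j in range(features.shape[1])]))
```

In training, a penalty of this form would be subtracted from (or traded off against) the task loss so that the feature extractor is rewarded for decorrelating its representation from the protected variable.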


Figures

Figure 1:
Average face images for each shade category (1st row), average saliency map of the trained baseline (2nd row), and BR-Net (3rd row), color-coded with the normalized saliency for each pixel. BR-Net results in more stable patterns across all 6 shades. The last column shows the t-SNE projection of the representations learned by each method. Our method produces a representation space invariant to the bias variable (shade), while the baseline shows a pattern influenced by the bias. The average accuracy of per-shade gender classification over 5 runs of 5-fold cross-validation (pre-trained on ImageNet, fine-tuned on GS-PPB) is shown on each average map. BR-Net not only obtains better accuracy for the darker shades but also regularizes the model to improve overall per-category accuracy.
Figure 2:
BR-Net architecture: the feature extractor (FE) learns features, F, that successfully classify (C) the input while being invariant (statistically independent) to the protected variables, b, using the bias predictor (BP) and the adversarial loss, −λLbp (based on the correlation coefficient). Solid arrows show forward paths, while dashed backward arrows indicate back-propagation with the respective gradient values.
Figure 3:
BR-Net can remove direct dependency between F and b for both dataset and task bias.
Figure 4:
Formation of the synthetic dataset (a) and comparison of results across methods (b).
Figure 5:
t-SNE projection of the learned features for different methods. Color indicates the value of σB.
Figure 6:
Statistical dependence between the learned features and age for the CTRL cohort in the HIV experiment, quantitatively measured by the squared distance correlation (dcor²).
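The dcor² measure named in this caption is the squared distance correlation, a statistic that is zero only under (distributional) independence. A minimal NumPy sketch for 1-D samples, as an illustrative re-implementation rather than the authors' code:

```python
import numpy as np

def dist_corr_sq(x, y):
    """Squared distance correlation between two 1-D samples."""
    x = np.asarray(x, dtype=float)[:, None]
    y = np.asarray(y, dtype=float)[:, None]
    # Pairwise distance matrices
    a = np.abs(x - x.T)
    b = np.abs(y - y.T)
    # Double centering: subtract row means, column means, add grand mean
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()          # squared distance covariance
    dvar_x = (A * A).mean()         # squared distance variances
    dvar_y = (B * B).mean()
    denom = np.sqrt(dvar_x * dvar_y)
    return float(dcov2 / denom) if denom > 0 else 0.0
```

A value near 1 indicates strong dependence (e.g., a variable with itself), while values near 0 indicate the kind of independence between features and the protected variable that the figure tracks.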
Figure 7:
Accuracy, TNR, and TPR for the HIV experiment as a function of the number of training iterations for (a) the 3D CNN baseline and (b) BR-Net. Our method is robust against the imbalanced age distribution between the HIV and CTRL cohorts.
Figure 8:
Accuracy of gender prediction from face images across all shades (1 to 6) of the GS-PPB dataset with two backbones, (left) VGG16 and (right) ResNet50. BR-Net consistently results in more accurate predictions in all 6 shade categories.
Figure 9:
Learned representations by different methods. Color encodes the 6 categories of skin shade.
