IEEE Trans Med Imaging. 2022 May;41(5):1176-1187. doi: 10.1109/TMI.2021.3135002. Epub 2022 May 2.

PathAL: An Active Learning Framework for Histopathology Image Analysis


Wenyuan Li et al. IEEE Trans Med Imaging. 2022 May.

Abstract

Deep neural networks, in particular convolutional networks, have rapidly become a popular choice for analyzing histopathology images. However, training these models relies heavily on a large number of samples manually annotated by experts, which is cumbersome and expensive. In addition, it is difficult to obtain a perfect set of labels due to the variability between expert annotations. This paper presents a novel active learning (AL) framework for histopathology image analysis, named PathAL. To reduce the required number of expert annotations, PathAL selects two groups of unlabeled data in each training iteration: a set of "informative" samples that require additional expert annotation, and a set of "confident predictive" samples that are automatically added to the training set using the model's pseudo-labels. To reduce the impact of noisy-labeled samples in the training set, PathAL systematically identifies noisy samples and excludes them to improve the generalization of the model. Our model advances existing AL methods for medical image analysis in two ways. First, we present a selection strategy to improve classification performance with fewer manual annotations. Unlike traditional methods that focus only on finding the most uncertain samples with low prediction confidence, we discover a large number of high-confidence samples from the unlabeled set and automatically add them for training with assigned pseudo-labels. Second, we design a method to distinguish between noisy samples and hard samples using a heuristic approach. We exclude the noisy samples while preserving the hard samples to improve model performance. Extensive experiments demonstrate that our proposed PathAL framework achieves promising results on a prostate cancer Gleason grading task, obtaining similar performance with 40% fewer annotations compared to the fully supervised learning scenario. An ablation study is provided to analyze the effectiveness of each component in PathAL, and a pathologist reader study is conducted to validate our proposed algorithm.
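The iteration described in the abstract (discard noisy samples, send uncertain samples to experts, pseudo-label confident samples) can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the function name, thresholds, and the use of a single per-sample confidence score are all assumptions.

```python
def pathal_step(labeled, unlabeled, confidence, tau_info=0.5, tau_conf=0.9, noisy=frozenset()):
    """One illustrative PathAL-style iteration (names/thresholds are assumptions).

    labeled / unlabeled: lists of sample ids
    confidence: dict mapping unlabeled id -> model confidence in its prediction
    noisy: ids flagged by the noise monitor
    Returns (new_labeled, new_unlabeled, to_annotate, pseudo_labeled).
    """
    # 1) discard samples the noise monitor flagged
    labeled = [s for s in labeled if s not in noisy]
    # 2) most uncertain unlabeled samples -> request expert annotation
    to_annotate = [s for s in unlabeled if confidence[s] < tau_info]
    # 3) highly confident unlabeled samples -> add with pseudo-labels
    pseudo = [s for s in unlabeled if confidence[s] >= tau_conf]
    new_labeled = labeled + to_annotate + pseudo
    new_unlabeled = [s for s in unlabeled if s not in to_annotate and s not in pseudo]
    return new_labeled, new_unlabeled, to_annotate, pseudo
```

In this sketch, samples with mid-range confidence stay in the unlabeled pool for later iterations, which matches the paper's idea of only spending expert effort on the most informative cases.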


Figures

Fig. 1.
(a) Schematics of our proposed PathAL. The core algorithm of PathAL consists of three steps in the ith iteration: 1) discarding noisy samples Ni; 2) requesting human experts to annotate informative samples Ii and adding them to Li+1; and 3) adding confident predictive samples Ci with their “pseudo-labels” to Li+1. The curriculum classification (CC) algorithm and overfitting-to-underfitting (O2U) monitor are used to select Ni, Ii, and Ci. (b) Illustration of the CC algorithm. Tissues from one slide are mapped to a single point in deep feature space, where K-Means clustering is used to group them into subsets. The CC algorithm is applied to each subset, and image complexity is classified as “easy”, “medium”, or “hard” based on local density. (c) Principles for determining Ni and Ci based on CC and O2U results. A sample that is classified as “easy” based on its complexity but has large training loss variation is more likely to be incorrectly annotated. If it is classified as “hard” for its complexity, it is more likely to be a difficult sample. If a sample’s complexity is classified as “easy” and the variation of its predictive entropy under the current model is low, we have higher confidence that the current prediction is correct.
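The caption's CC step labels samples within each k-means subset by local density. A minimal sketch of that idea, assuming local density is measured as inverse mean k-nearest-neighbor distance and complexity is assigned by density tercile (both assumptions; the paper's exact definitions may differ):

```python
import math

def complexity_by_density(points, k=3):
    """Label samples in one k-means subset as easy/medium/hard by local density.

    points: list of feature vectors (tuples) belonging to one cluster.
    Density = inverse of the mean distance to the k nearest neighbors;
    the densest third is "easy", the sparsest third is "hard".
    """
    densities = []
    for i, p in enumerate(points):
        nn = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)[:k]
        densities.append(1.0 / (1e-9 + sum(nn) / len(nn)))  # inverse mean k-NN distance
    order = sorted(range(len(points)), key=lambda i: -densities[i])
    labels = [None] * len(points)
    third = math.ceil(len(points) / 3)
    for rank, i in enumerate(order):
        labels[i] = "easy" if rank < third else ("medium" if rank < 2 * third else "hard")
    return labels
```

An isolated point far from its cluster neighbors ends up "hard", while points in a dense region are "easy", mirroring the caption's density-based complexity classes.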
Fig. 2.
Illustration of data pre-processing steps. A binary tissue mask is first extracted; the mid-line is then found using morphological closing; the mid-line is then partitioned into patches based on the patch size and overlap; finally, the blue ratio of each patch is calculated and the top k patches are selected.
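The final step ranks patches by blue ratio. A minimal sketch of that selection, using one blue-ratio formulation common in the histopathology literature (the paper's exact definition may differ, and the per-patch mean-RGB representation here is an assumption):

```python
def blue_ratio(rgb_mean):
    """One common blue-ratio score from the literature: emphasizes pixels
    that are strongly blue relative to red and green (hematoxylin-stained
    nuclei). rgb_mean is a (R, G, B) tuple of mean channel values."""
    r, g, b = rgb_mean
    return (100.0 * b / (1.0 + r + g)) * (256.0 / (1.0 + r + g + b))

def top_k_patches(patches, k):
    """Keep the k patches with the highest blue ratio.
    Each patch is a dict with a "mean_rgb" entry (an assumed representation)."""
    return sorted(patches, key=lambda p: blue_ratio(p["mean_rgb"]), reverse=True)[:k]
```

A patch dominated by blue (nucleus-rich tissue) outranks pale or background-heavy patches, which is why blue ratio is a cheap proxy for tissue content.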
Fig. 3.
(a) t-SNE plot in deep feature space. Each point in the figure represents a slide whose color indicates its ISUP grade. As training progressed, different ISUP grades became more separable in the deep feature space, indicating the model captured more essential information to make correct predictions. (b) The trend of “grade concentration”, which measures the ISUP grade distribution within subsets clustered by k-means. The insets of the figure demonstrate typical ISUP distributions for the subsets. At the beginning of training, the ISUP grades were more diffuse, while at the end of training, each cluster concentrated on fewer grades. (c)(d) The training loss for every sample in Li, and predictive entropy for every sample in Ui, during the O2U process.
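The caption does not give a formula for "grade concentration"; one natural proxy is one minus the normalized entropy of the ISUP-grade histogram within a cluster. The sketch below uses that proxy purely for illustration (the metric definition is an assumption, not the paper's):

```python
import math
from collections import Counter

def grade_concentration(grades):
    """Illustrative concentration proxy: 1 - normalized entropy of the
    ISUP-grade histogram within one cluster (an assumed metric).
    1.0 = every slide has the same grade; 0.0 = uniform over all six grades."""
    counts = Counter(grades)
    n = len(grades)
    probs = [c / n for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(6)  # six ISUP grade groups (0-5)
    return 1.0 - entropy / max_entropy
```

Under this proxy, the trend described in the caption (diffuse grades early, concentrated grades late) would show up as the score rising toward 1.0 over training.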
Fig. 4.
(a) Toy example for a 2-class classification. The dotted circle indicates the decision boundary, with class 0 (C=0) inside the circle and class 1 (C=1) outside it. Samples far from the decision boundary are considered easy samples (blue for C=0 and orange for C=1), while samples closer to the decision boundary are considered hard samples. We also randomly insert noisy samples (indicated by larger purple dots) that have wrong labels. (b) A heat map of the averaged predictive entropy of each sample during the O2U process. (c) A confusion matrix of easy, hard, and noisy samples, with the horizontal axis representing the classified results and the vertical axis representing the original categories.
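The heuristic the figure illustrates (easy-looking sample with unstable training loss under O2U's cyclic learning rate is likely mislabeled; unstable loss on a hard sample just means it is difficult) can be sketched as below. The variance threshold and per-sample loss-history representation are assumptions for illustration:

```python
from statistics import pstdev

def flag_samples(loss_history, complexity, var_thresh=0.5):
    """Sketch of the Fig. 4 heuristic (threshold is an assumption).

    loss_history: dict id -> list of training losses across O2U LR cycles
    complexity:   dict id -> "easy" / "medium" / "hard" from the CC step
    A sample with high loss variation is "noisy" if CC called it easy
    (label likely wrong), otherwise "hard" (genuinely difficult); stable
    losses mean "clean".
    """
    flags = {}
    for sid, losses in loss_history.items():
        high_var = pstdev(losses) > var_thresh
        if high_var and complexity[sid] == "easy":
            flags[sid] = "noisy"   # easy-looking but unstable -> likely mislabeled
        elif high_var:
            flags[sid] = "hard"    # unstable and genuinely complex -> keep
        else:
            flags[sid] = "clean"
    return flags
```

This captures the key asymmetry in the caption: high loss variation alone is ambiguous, and it is the combination with CC complexity that separates noisy labels from hard samples.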
Fig. 5.
(a) Performance comparison between PathAL and other AL baselines. (b) QWK for each group (Ni, Ci, Ii) during training. (c) Percentage of noisy samples returned for training in later iterations. (d) Performance comparison between PathAL and random sampling with noisy-sample injection.

References

    1. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, Van Der Laak JA, Van Ginneken B, and Sánchez CI, “A survey on deep learning in medical image analysis,” Medical image analysis, vol. 42, pp. 60–88, 2017. - PubMed
    1. Yang C-W, Lin T-P, Huang Y-H, Chung H-J, Kuo J-Y, Huang WJ, Wu HH, Chang Y-H, Lin AT, and Chen K-K, “Does extended prostate needle biopsy improve the concordance of gleason scores between biopsy and prostatectomy in the taiwanese population?” Journal of the Chinese Medical Association, vol. 75, no. 3, pp. 97–101, 2012. - PubMed
    1. Cheplygina V, de Bruijne M, and Pluim JP, “Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis,” Medical image analysis, vol. 54, pp. 280–296, 2019. - PubMed
    1. Budd S, Robinson EC, and Kainz B, “A survey on active learning and human-in-the-loop deep learning for medical image analysis,” arXiv preprint arXiv:1910.02923, 2019. - PubMed
    1. Dgani Y, Greenspan H, and Goldberger J, “Training a neural network based on unreliable human annotation of medical images,” in 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). IEEE, 2018, pp. 39–42.

Publication types