J Pathol Inform. 2023 Feb 27;14:100301. doi: 10.1016/j.jpi.2023.100301. eCollection 2023.

Pan-tumor T-lymphocyte detection using deep neural networks: Recommendations for transfer learning in immunohistochemistry


Frauke Wilm et al. J Pathol Inform. 2023.


Abstract

The success of immuno-oncology treatments promises long-term cancer remission for an increasing number of patients. The response to checkpoint inhibitor drugs has shown a correlation with the presence of immune cells in the tumor and tumor microenvironment. An in-depth understanding of the spatial localization of immune cells is therefore critical for understanding the tumor's immune landscape and predicting drug response. Computer-aided systems are well suited for efficiently quantifying immune cells in their spatial context. Conventional image analysis approaches are often based on color features and therefore require a high level of manual interaction. More robust image analysis methods based on deep learning are expected to decrease this reliance on human interaction and improve the reproducibility of immune cell scoring. However, these methods require sufficient training data, and previous work has reported low robustness of these algorithms when they are tested on out-of-distribution data from different pathology labs or samples from different organs. In this work, we used a new image analysis pipeline to explicitly evaluate the robustness of marker-labeled lymphocyte quantification algorithms as a function of the number of training samples, before and after transfer to a new tumor indication. For these experiments, we adapted the RetinaNet architecture for the task of T-lymphocyte detection and employed transfer learning to bridge the domain gap between tumor indications and reduce the annotation costs for unseen domains. On our test set, we achieved human-level performance for almost all tumor indications, with an average precision of 0.74 in-domain and 0.72-0.74 cross-domain. From our results, we derive recommendations for model development regarding annotation extent, training sample selection, and label extraction for the development of robust algorithms for immune cell scoring. By extending the task of marker-labeled lymphocyte quantification to a multi-class detection task, the prerequisite for subsequent analyses, e.g., distinguishing lymphocytes in the tumor stroma from tumor-infiltrating lymphocytes, is met.
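The average precision (AP) reported above can be sketched for point-based cell detections as follows. This is a minimal illustration, not the paper's exact evaluation protocol: the greedy score-ordered matching and the pixel matching radius are assumptions made for the example.

```python
def average_precision(dets, gts, radius=6.0):
    """Compute AP for point detections.

    dets: list of (x, y, score) detections.
    gts:  list of (x, y) ground-truth cell centers.
    Each detection (highest score first) is greedily matched to the
    nearest unmatched ground truth within `radius` pixels.
    """
    if not gts:
        return 0.0
    dets = sorted(dets, key=lambda d: -d[2])
    matched = [False] * len(gts)
    hits = []
    for x, y, _ in dets:
        best, best_d = -1, radius
        for i, (gx, gy) in enumerate(gts):
            if matched[i]:
                continue
            d = ((x - gx) ** 2 + (y - gy) ** 2) ** 0.5
            if d <= best_d:
                best, best_d = i, d
        if best >= 0:
            matched[best] = True
            hits.append(1)  # true positive
        else:
            hits.append(0)  # false positive
    # Accumulate area under the precision-recall curve.
    ap, tp, prev_recall = 0.0, 0, 0.0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        recall = tp / len(gts)
        precision = tp / k
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

With this definition, a detector that finds every cell with no false positives scores an AP of 1.0, while a high-confidence false positive ranked above the true detections lowers the AP even at identical recall.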

Keywords: Deep learning; Domain adaptation; Immuno-oncology; Immunohistochemistry; Transfer learning; Tumor-infiltrating lymphocytes.

PubMed Disclaimer

Figures

Graphical abstract
Fig. 1. Annotation and training pipeline. ROIs sized 2 mm² are annotated using commercially available image analysis software.
Fig. 2. Kappa variants per tumor indication. Annotators A, B, and C are compared to each other (hollow symbols) and to semi-automatic annotations created with the image analysis software (SW) (filled symbols). The pathologist consensus is also compared to the software annotations (star).
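As a rough illustration of the agreement metric used for the annotator comparisons, Cohen's kappa for two raters can be computed as below. This is a minimal sketch for two raters over categorical cell labels; the specific kappa variants evaluated in the paper may differ.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two raters.

    a, b: equal-length lists of categorical labels (e.g. cell classes)
    assigned by two annotators to the same cells.
    """
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    categories = sorted(set(a) | set(b))
    # Observed agreement: fraction of cells labeled identically.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each rater's label frequencies.
    p_e = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if p_e == 1.0:  # both raters used a single identical label
        return 1.0
    return (p_o - p_e) / (1 - p_e)
```

Identical label sequences yield kappa = 1.0, while agreement at chance level yields kappa = 0.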
Fig. 3. Examples of consensus annotations (orange: tumor cells, green: CD3+ cells, blue: non-specified cells, purple: cells without agreement). HNSCC: head and neck squamous cell carcinoma, NSCLC: non-small cell lung cancer, TNBC: triple-negative breast cancer, GC: gastric cancer.
Fig. 4. Average precision (AP) when deploying the models on test data from the source domain (HNSCC) and unseen target domains. The models were trained on fields of view from an increasing number of whole slide images. The error bars visualize the standard deviation of the 5 repetitions, and the curve fits a logistic regression. HNSCC: head and neck squamous cell carcinoma, NSCLC: non-small cell lung cancer, TNBC: triple-negative breast cancer, GC: gastric cancer.
Fig. 5. Kappa variants per tumor indication. Annotators A, B, and C are compared to each other (hollow symbols) and to the detection results of RetinaNet22 (asterisks).
Fig. 6. Comparison of consensus annotations (upper row) vs. network predictions (bottom row) (orange: tumor cells, green: CD3+ cells, blue: non-specified cells, purple: cells without agreement). HNSCC: head and neck squamous cell carcinoma, NSCLC: non-small cell lung cancer, TNBC: triple-negative breast cancer, GC: gastric cancer.
Fig. 7. Examples where the models had difficulty differentiating tumor and non-tumor cells (orange: tumor cells, green: CD3+ cells, blue: non-specified cells).
Fig. 8. Difference in average precision (AP) for model deployment vs. fine-tuning with 1 target domain image. The error bars visualize the minimum to maximum range of the performance difference. HNSCC: head and neck squamous cell carcinoma, NSCLC: non-small cell lung cancer, TNBC: triple-negative breast cancer, GC: gastric cancer.
Fig. 9. Average performance of benchmark models when tested on all tumor indications. Matrix entry m_{i,j} is the average precision (AP) obtained when training on the indication in row i and testing on the indication in column j. Diagonal elements indicate in-domain performance, whereas off-diagonal elements represent cross-domain performance. HNSCC: head and neck squamous cell carcinoma, NSCLC: non-small cell lung cancer, TNBC: triple-negative breast cancer, GC: gastric cancer.
Fig. B.1. Detection results for common tissue and staining artifacts and morphologically challenging regions (orange: tumor cells, green: CD3+ cells, blue: non-specified cells).

