Front Immunol. 2024 Feb 2:15:1334348.
doi: 10.3389/fimmu.2024.1334348. eCollection 2024.

MIHIC: a multiplex IHC histopathological image classification dataset for lung cancer immune microenvironment quantification

Ranran Wang et al. Front Immunol.

Abstract

Background: Immunohistochemistry (IHC) is a widely used laboratory technique for cancer diagnosis in which specific antibodies selectively bind to target proteins in tissue samples and the bound proteins are then made visible through chemical staining. Deep learning approaches have the potential to quantify the tumor immune microenvironment (TIME) in digitized IHC histological slides. However, there is a lack of publicly available IHC datasets explicitly collected for in-depth TIME analysis.

Method: In this paper, a Multiplex IHC Histopathological Image Classification (MIHIC) dataset is created from manual annotations by pathologists and made publicly available for exploring deep learning models that quantify variables associated with the TIME in lung cancer. The MIHIC dataset comprises a total of 309,698 multiplex IHC stained histological image patches covering seven distinct tissue types: Alveoli, Immune cells, Necrosis, Stroma, Tumor, Other and Background. Using the MIHIC dataset, we conduct a series of experiments with both convolutional neural networks (CNNs) and transformer models to benchmark IHC stained histological image classification. Finally, we quantify lung cancer immune microenvironment variables by applying the top-performing model to tissue microarray (TMA) cores, and these variables are subsequently used to predict patients' survival outcomes.

Result: Experiments show that transformer models tend to perform slightly better than CNN models in histological image classification, although the best models of both types reach the same highest accuracy of 0.811 on the MIHIC testing dataset. The automatically quantified TIME variables, which reflect the proportions of immune cells over stroma and of tumor over tissue core, show prognostic value for overall survival of lung cancer patients.
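
As a rough illustration of how proportion-based TIME variables of this kind could be derived from patch-level predictions, the Python sketch below counts the predicted labels within one tissue core and takes log ratios. It is not the authors' implementation: using patch counts as a proxy for area, treating every non-Background patch as part of the tissue core, and the small `eps` smoothing term are all assumptions, and `time_variables` is a hypothetical helper name.

```python
import math
from collections import Counter

# The seven MIHIC tissue classes named in the abstract.
CLASSES = ["Alveoli", "Immune cells", "Necrosis", "Stroma", "Tumor", "Other", "Background"]


def time_variables(patch_labels, eps=1e-6):
    """Compute two log-ratio TIME variables from predicted patch labels of one core.

    Assumptions (not taken from the paper): patch counts stand in for tissue
    area, the tissue core is every patch that is not Background, and eps is
    added to numerator and denominator to avoid log(0) or division by zero.
    """
    counts = Counter(patch_labels)
    unknown = set(counts) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown tissue labels: {sorted(unknown)}")

    # Hypothetical definition of "tissue core": all non-Background patches.
    tissue = sum(n for label, n in counts.items() if label != "Background")

    def log_ratio(numerator, denominator):
        return math.log((numerator + eps) / (denominator + eps))

    return {
        "ln(Immune cells/Stroma)": log_ratio(counts["Immune cells"], counts["Stroma"]),
        "ln(Tumor/Tissue core)": log_ratio(counts["Tumor"], tissue),
    }


if __name__ == "__main__":
    # Toy predictions for a single core; the numbers are made up for illustration.
    labels = ["Immune cells"] * 20 + ["Stroma"] * 40 + ["Tumor"] * 30 + ["Background"] * 10
    for name, value in time_variables(labels).items():
        print(f"{name}: {value:.3f}")
```

Working on the log scale keeps ratios symmetric around zero, so a core with twice as many immune patches as stroma patches and one with half as many sit at equal distance from a balanced core.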

Conclusion: To the best of our knowledge, MIHIC is the first publicly available lung cancer IHC histopathological dataset that includes images with 12 different IHC stains, meticulously annotated by multiple pathologists across 7 distinct categories. This dataset holds significant potential for researchers to explore novel techniques for quantifying the TIME and advancing our understanding of the interactions between the immune system and tumors.

Keywords: database; image classification; immunohistochemical image; lung cancer; transformer models.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Overview of our study.
Figure 2. Illustration of MIHIC dataset creation. (A) Manually annotated tissue cores, (B) different regions of interest (ROIs), (C) image patch extraction.
Figure 3. Examples of 7 histological image types included in the MIHIC dataset.
Figure 4. The comprehensive pipeline of quantitative analysis for TMA sections. (A) Tissue core extraction, (B) extracted tissue core samples, (C) diagram of TIME quantification and survival analysis.
Figure 5. Tissue identification in a tissue core. (A) CD3 tissue core, (B) tissue classification result.
Figure 6. Confusion matrix of different CNN models.
Figure 7. Confusion matrix of different Transformer models.
Figure 8. ROC curves of multi-class classification using various deep learning models. (A) ROC curves for different CNN models, (B) ROC curves for different Transformer models.
Figure 9. Box plots of quantified TIME variables. (A) ln(Immune cells/Tumor), (B) ln(Immune cells/Stroma), (C) ln(Immune cells/Necrosis), (D) ln(Tumor/Stroma), (E) ln(Stroma/Tissue core), (F) ln(Immune/Tissue core), (G) ln(Tumor/Tissue core).
Figure 10. Survival analysis visualization. (A) Immune cells/Stroma distribution density and cutoff point selection, (B) Kaplan-Meier survival curves based on Immune cells/Stroma, (C) Cumulative hazard based on Immune cells/Stroma, (D) Tumor/Tissue core distribution density and cutoff point selection, (E) Kaplan-Meier survival curves based on Tumor/Tissue core, (F) Cumulative hazard based on Tumor/Tissue core.
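
The survival analysis named in the Figure 10 caption (split patients at a cutoff on a TIME variable, then compare Kaplan-Meier curves) can be sketched in plain Python as follows. This is a minimal illustration of the standard Kaplan-Meier product-limit estimator, not the authors' code: the function names `kaplan_meier` and `split_by_cutoff` are hypothetical, the cutoff is taken as a given input because the page does not say how it was selected, and the toy data are invented.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  : follow-up durations, one per patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t (deaths and censored).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def split_by_cutoff(values, cutoff):
    """Return (low, high) lists of patient indices for value <= cutoff and > cutoff."""
    low = [i for i, v in enumerate(values) if v <= cutoff]
    high = [i for i, v in enumerate(values) if v > cutoff]
    return low, high


if __name__ == "__main__":
    # Invented toy cohort: a TIME variable, follow-up in months, and event flags.
    ratio = [0.2, 0.9, 0.4, 1.3, 0.7, 1.1]
    months = [5, 30, 12, 40, 18, 36]
    died = [1, 0, 1, 0, 1, 1]

    low, high = split_by_cutoff(ratio, cutoff=0.8)  # cutoff value is hypothetical
    for name, group in (("low", low), ("high", high)):
        curve = kaplan_meier([months[i] for i in group], [died[i] for i in group])
        print(name, curve)
```

For follow-up times `[1, 2, 2, 3, 4]` with event flags `[1, 1, 0, 1, 0]`, the estimator steps down to roughly 0.8, 0.6 and 0.3 at times 1, 2 and 3; the censored patients reduce the number at risk without producing a step of their own.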
