Cell Rep. 2018 Apr 3;23(1):181-193.e7.
doi: 10.1016/j.celrep.2018.03.086.

Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images

Joel Saltz et al. Cell Rep. 2018.

Abstract

Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characterization of TCGA samples demonstrates one use for the TCGA image archives with insights into the tumor-immune microenvironment.

Keywords: artificial intelligence; bioinformatics; computer vision; deep learning; digital pathology; immuno-oncology; lymphocytes; machine learning; tumor microenvironment; tumor-infiltrating lymphocytes.

Conflict of interest statement

DECLARATION OF INTERESTS

Michael Seiler, Peter G. Smith, Ping Zhu, Silvia Buonamici, and Lihua Yu are employees of H3 Biomedicine, Inc. Parts of this work are the subject of a patent application: WO2017040526 titled “Splice variants associated with neo-morphic sf3b1 mutants.” Shouyoung Peng, Anant A. Agrawal, James Palacino, and Teng Teng are employees of H3 Biomedicine, Inc. Andrew D. Cherniack, Ashton C. Berger, and Galen F. Gao receive research support from Bayer Pharmaceuticals. Gordon B. Mills serves on the External Scientific Review Board of Astrazeneca. Anil Sood is on the Scientific Advisory Board for Kiyatec and is a shareholder in BioPath. Jonathan S. Serody receives funding from Merck, Inc. Kyle R. Covington is an employee of Castle Biosciences, Inc. Preethi H. Gunaratne is founder, CSO, and shareholder of NextmiRNA Therapeutics. Christina Yau is a part-time employee/consultant at NantOmics. Franz X. Schaub is an employee and shareholder of SEngine Precision Medicine, Inc. Carla Grandori is an employee, founder, and shareholder of SEngine Precision Medicine, Inc. Robert N. Eisenman is a member of the Scientific Advisory Boards and shareholder of Shenogen Pharma and Kronos Bio. Daniel J. Weisenberger is a consultant for Zymo Research Corporation. Joshua M. Stuart is the founder of Five3 Genomics and shareholder of NantOmics. Marc T. Goodman receives research support from Merck, Inc. Andrew J. Gentles is a consultant for Cibermed. Charles M. Perou is an equity stock holder, consultant, and Board of Directors member of BioClassifier and GeneCentric Diagnostics and is also listed as an inventor on patent applications on the Breast PAM50 and Lung Cancer Subtyping assays. Matthew Meyerson receives research support from Bayer Pharmaceuticals; is an equity holder in, consultant for, and Scientific Advisory Board chair for OrigiMed; and is an inventor of a patent for EGFR mutation diagnosis in lung cancer, licensed to LabCorp. 
Eduard Porta-Pardo is an inventor of a patent for domainXplorer. Han Liang is a shareholder and scientific advisor of Precision Scientific and Eagle Nebula. Da Yang is an inventor on a pending patent application describing the use of antisense oligonucleotides against specific lncRNA sequence as diagnostic and therapeutic tools. Yonghong Xiao was an employee and shareholder of TESARO, Inc. Bin Feng is an employee and shareholder of TESARO, Inc. Carter Van Waes received research funding for the study of IAP inhibitor ASTX660 through a Cooperative Agreement between NIDCD, NIH, and Astex Pharmaceuticals. Raunaq Malhotra is an employee and shareholder of Seven Bridges, Inc. Peter W. Laird serves on the Scientific Advisory Board for AnchorDx. Joel Tepper is a consultant at EMD Serono. Kenneth Wang serves on the Advisory Board for Boston Scientific, Microtech, and Olympus. Andrea Califano is a founder, shareholder, and advisory board member of DarwinHealth, Inc. and a shareholder and advisory board member of Tempus, Inc. Toni K. Choueiri serves as needed on advisory boards for Bristol-Myers Squibb, Merck, and Roche. Lawrence Kwong receives research support from Array BioPharma. Sharon E. Plon is a member of the Scientific Advisory Board for Baylor Genetics Laboratory. Beth Y. Karlan serves on the Advisory Board of Invitae.

Figures

Figure 1. Workflow for Training, Model Development, and Subsequent Generation of TIL Maps
Top: to train and develop the CNN models, a pathologist reviews images and marks regions containing lymphocytes and necrosis. These training data are divided into patches, which are used to train CNNs for lymphocyte and necrosis detection. A pathologist periodically reviews the results for accuracy and corrects the predictions. This yields a pair of trained CNNs. Bottom: the trained CNNs are then applied to the full set of 5,455 images from 13 cancer types to generate TIL maps. During TIL map generation, a probability map for TILs is generated from each image. These probabilities are then reviewed, and lymphocyte selection thresholds are established using a selective sampling strategy (further information in Method Details). These thresholds are then used to obtain the final TIL maps. See also Figure S1 and Tables S1 and S2.
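The last step of this workflow, converting a probability map into a binary TIL map with a threshold, can be sketched as below. This is a minimal illustration only: the function name, the grid encoding (None for patches without tissue), the `>=` comparison, and the threshold value are assumptions, not the thresholds or procedure established in the study.

```python
from typing import List, Optional

ProbMap = List[List[Optional[float]]]
TilMap = List[List[Optional[int]]]


def til_map_from_probabilities(prob_map: ProbMap, threshold: float) -> TilMap:
    """Binarize a per-patch TIL probability map (hypothetical helper).

    Each entry is the predicted TIL probability of one patch, or None
    where the patch contains no tissue. A patch is marked TIL-positive (1)
    when its probability is at or above the threshold, otherwise 0.
    """
    return [
        [None if p is None else int(p >= threshold) for p in row]
        for row in prob_map
    ]


# Illustrative values only; 0.5 is an assumed threshold.
example_probs = [[0.9, 0.2, None], [0.5, 0.49, 0.7]]
example_map = til_map_from_probabilities(example_probs, threshold=0.5)
```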
Figure 2. Assessment of TIL Prediction
(A) Receiver operating characteristic (ROC) curve depicting the performance of the CNN applied to TCGA lung adenocarcinoma patches. The current method is compared with a popular CNN called VGG16 (see main text description). (B) Comparison of TIL scores of super-patches between pathologists and the computational stain. x axis: median scores from three pathologists assessing 400 super-patches as having low, medium, or high lymphocyte infiltrate. y axis: scores from computational staining, on a scale from 0 to 64.
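For readers unfamiliar with the curve in panel A, the sketch below shows how ROC points and the area under the curve can be computed from patch labels and classifier scores. The labels and scores are invented for illustration and the function names are hypothetical; this is not the evaluation code used in the study.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs.

    labels are 1 (TIL-positive patch) or 0; scores are classifier outputs.
    Patches with tied scores are processed together so that a tie
    contributes a single point to the curve.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    pairs = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        current = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == current:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Invented example data.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = area_under_curve(roc_points(example_labels, example_scores))
```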
Figure 3. TIL Fraction by Tumor Category
(A–E) Percent TIL fraction, the proportion of TIL-positive patches within a TIL map, is shown by various categorizations of TCGA tumor samples. Each plotted point represents a tumor sample for (A) 13 TCGA tumor types (4,612 cases), (B) six subtypes characterized by differences in the nature of the overall immune response (Thorsson et al., 2018) (C5 has very few samples here), (C) gastrointestinal adenocarcinoma subtypes, (D) lung squamous cell carcinoma subtypes, and (E) breast adenocarcinoma subtypes. See also Figure S2.
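The percent TIL fraction defined above can be sketched as follows. The encoding (1 for a TIL-positive patch, 0 for tissue without TILs, None for no tissue) and the choice to count only tissue patches in the denominator are assumptions made for this illustration.

```python
from typing import List, Optional


def percent_til_fraction(til_map: List[List[Optional[int]]]) -> float:
    """Percentage of tissue patches in a TIL map that are TIL-positive.

    Patches encoded as None (no tissue) are excluded from the
    denominator; this exclusion is an assumption of this sketch.
    """
    tissue = [patch for row in til_map for patch in row if patch is not None]
    if not tissue:
        raise ValueError("TIL map contains no tissue patches")
    return 100.0 * sum(tissue) / len(tissue)


# Invented 2 x 3 map: three of five tissue patches are TIL-positive.
example_fraction = percent_til_fraction([[1, 0, None], [1, 0, 1]])
```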
Figure 4. Comparison of TIL Proportion from Imaging and Molecular Estimates
(A) Spearman correlation coefficients and p values for comparison of TIL fraction from spatial estimates of TIL maps and molecular estimates of TIL fraction from processing of cancer genomics data using deconvolution methods (see main text). (B) Each point represents a breast adenocarcinoma tumor sample, with the value of TIL fraction from TIL maps (x axis) and from molecular estimates (y axis). (C) As in B for 12 additional TCGA tumor types. See also Figure S3 and the companion manuscript (Thorsson et al., 2018).
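The Spearman coefficient used in panel A is the Pearson correlation of the ranks of the two estimates. A minimal standard-library sketch, with invented values standing in for image-based and molecular TIL fractions, might look like this; it does not compute p values.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented per-sample TIL fractions (image-based vs. molecular).
image_fraction = [1.0, 2.0, 3.0, 4.0, 5.0]
molecular_fraction = [2.0, 1.0, 4.0, 3.0, 5.0]
example_rho = spearman(image_fraction, molecular_fraction)
```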
Figure 5. Examples of TIL Map Structural Patterns
(A–D) Four cases representing different degrees of lymphocyte infiltration. Each example is labeled by TCGA participant barcode and has the following three panels. Left: H&E diagnostic image at low magnification with tumor regions circled in yellow; middle: TIL map; red represents a positive TIL patch, blue represents a tissue region with no TIL patch, while black represents no tissue; right: diagrams of clusters of TIL patches derived from the affinity propagation clustering of the TIL patches. Line segments connect cluster members with a central representative for each cluster, and colors are arbitrarily assigned to aid visual separation of clusters. (E) TIL map, cluster statistics, and global patterns for the four examples in A–D. Each column represents one way to characterize the TIL map, ranging from simple measures such as TIL count and density to more complex ones characterizing details of cluster properties and image patterns (see main text). See also Table S2.
Figure 6. Associations of TIL Local Spatial Structure with Cancer Type and Survival
Associations are shown with cluster indices, which summarize properties of clusters derived from affinity propagation clusters of the TIL map—properties that provide details on local structure beyond simple densities. (A) Ball-Hall cluster indices for all slide images considered in the study. The Ball-Hall index is a particular clustering index, summarizing the mean, through all the clusters, of their mean dispersion and is equivalent to the mean of the squared distances of the points of the cluster with respect to its center. In our data, the Ball-Hall index is correlated (ρSpearman = 0.95) with the mean cluster extent, CE. (B) Table of significant associations between TIL fraction-adjusted cluster indices and overall survival based on Cox regression, accounting for age and gender as additional clinical covariates. (C) Overall survival for median-stratified TIL fraction-adjusted Ball-Hall index in breast cancer. Significance test p value is shown in the lower left. (D) Same as C but for adjusted Banfield-Raftery index in skin cutaneous melanoma. The Banfield-Raftery index is the weighted sum of the logarithms of the mean cluster dispersion and, in our data, often correlates with the number of clusters. See also Figure S4.
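The two cluster indices described in this caption can be sketched from their verbal definitions. The code below takes clusters as lists of 2D patch coordinates and uses each cluster's centroid as its center; the function names, the input layout, and the decision to skip clusters with zero dispersion in the Banfield-Raftery sum (where the logarithm is undefined) are assumptions of this illustration, not the study's implementation.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def sum_squared_distances(points: Sequence[Point]) -> float:
    """Sum of squared distances from each point to the cluster centroid."""
    n = len(points)
    centroid = tuple(sum(coord) / n for coord in zip(*points))
    return sum(
        sum((a - b) ** 2 for a, b in zip(point, centroid))
        for point in points
    )


def ball_hall_index(clusters: Sequence[Sequence[Point]]) -> float:
    """Mean, over clusters, of the mean squared distance to the centroid."""
    dispersions = [sum_squared_distances(c) / len(c) for c in clusters]
    return sum(dispersions) / len(dispersions)


def banfield_raftery_index(clusters: Sequence[Sequence[Point]]) -> float:
    """Sum over clusters of n_k * log(mean dispersion of cluster k).

    Clusters with zero dispersion (for example singletons) are skipped
    here because log(0) is undefined; that handling is an assumption.
    """
    total = 0.0
    for cluster in clusters:
        dispersion = sum_squared_distances(cluster) / len(cluster)
        if dispersion > 0:
            total += len(cluster) * math.log(dispersion)
    return total


# Invented clusters of patch coordinates.
example_clusters: List[List[Point]] = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (0.0, 4.0), (0.0, 8.0)],
]
example_ball_hall = ball_hall_index(example_clusters)
example_banfield_raftery = banfield_raftery_index(example_clusters)
```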
Figure 7. Association of Spatial Structural Patterns with Tumor Type and Cell Fractions
(A) Each row corresponds to one of four spatial structure patterns, assigned in a manner consistent with the descriptions currently used to characterize the nature of the immune infiltrate in standard histopathological examinations, and each column is a TCGA tumor type. The values shown are the sample count for each tumor type and spatial structure pattern, divided by the counts expected by chance. The ratio of observed to expected co-membership counts is shown on a color scale, where the largest ratios are in red, values near unity are in yellow, and blue represents fewer than expected counts. (B) Estimates of the proportion of CD4, CD8, NK cells, and B cells were segregated by spatial structure patterns and averaged. Bars show the proportion within each structural pattern. These proportions are estimated using molecular data of the TCGA. See also Figure S5.
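The observed-to-expected ratio in panel A can be illustrated with a small contingency table. The sketch assumes that "expected by chance" means independence of pattern and tumor type, so the expected count is (row total × column total) / grand total; that reading, the function name, and the counts are assumptions for illustration.

```python
from typing import List, Sequence


def observed_to_expected(counts: Sequence[Sequence[int]]) -> List[List[float]]:
    """Ratio of observed counts to counts expected under independence.

    Rows are spatial structure patterns and columns are tumor types.
    Every row and column total must be positive.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    grand_total = sum(row_totals)
    if min(row_totals) <= 0 or min(col_totals) <= 0:
        raise ValueError("every row and column needs at least one sample")
    return [
        [
            counts[i][j] * grand_total / (row_totals[i] * col_totals[j])
            for j in range(len(col_totals))
        ]
        for i in range(len(row_totals))
    ]


# Invented counts: 2 patterns x 2 tumor types.
example_ratios = observed_to_expected([[30, 10], [10, 30]])
```

Ratios above 1 mark pattern and tumor-type combinations that occur more often than expected, and ratios below 1 mark combinations that occur less often.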

References

    1. Angell H, Galon J. From the immune contexture to the Immunoscore: the role of prognostic and predictive immune markers in cancer. Curr Opin Immunol. 2013;25:261–267.
    2. Bailey P, Chang DK, Nones K, Johns AL, Patch AM, Gingras MC, Miller DK, Christ AN, Bruxner TJC, Quinn MC, et al. Australian Pancreatic Cancer Genome Initiative. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature. 2016;531:47–52.
    3. Ball GH, Hall DJ. Technical Report April 1965 prepared for the Information Sciences Branch of the Office of Naval Research. Stanford Research Institute - Clearinghouse for Federal Scientific and Technical Information; 1965. ISODATA, a novel method of data analysis and pattern classification; pp. 2–50.
    4. Banfield JD, Raftery AE. Model-Based Gaussian and Non-Gaussian Clustering. Biometrics. 1993;49:803–821.
    5. Bayramoglu N, Heikkila J. Transfer learning for cell nuclei classification in histopathology images. In: Hua G, Jégou H, editors. Computer Vision – ECCV 2016 Workshops. Springer; 2016. pp. 532–539. Lecture Notes in Computer Science.
