[Preprint]. 2023 May 26:rs.3.rs-2947001.
doi: 10.21203/rs.3.rs-2947001/v1.

A population-level computational histologic signature for invasive breast cancer prognosis

Mohamed Amgad et al. Res Sq. .

Abstract

Breast cancer is a heterogeneous disease with variable survival outcomes. Pathologists grade the microscopic appearance of breast tissue using the Nottingham criteria, which are qualitative and do not account for non-cancerous elements within the tumor microenvironment (TME). We present the Histomic Prognostic Signature (HiPS), a comprehensive, interpretable scoring of the survival risk incurred by breast TME morphology. HiPS uses deep learning to accurately map cellular and tissue structures in order to measure epithelial, stromal, immune, and spatial interaction features. It was developed using a population-level cohort from the Cancer Prevention Study (CPS)-II and validated using data from three independent cohorts: the PLCO trial, CPS-3, and The Cancer Genome Atlas. HiPS consistently outperformed pathologists in predicting survival outcomes, independent of TNM stage and pertinent variables. This was largely driven by stromal and immune features. In conclusion, HiPS is a robustly validated biomarker to support pathologists and improve prognostication.

Figures

Figure 1
Figure 1. Overview of the methodological approach and datasets used.
a. Established clinical prognosticators in breast cancer. The AJCC Staging Manual defines three treatment-oriented subtypes: Luminal-like cancer patients are eligible for hormone therapy, HER2-like cancer patients are eligible for trastuzumab, and TNBC patients are not eligible for targeted therapies. We limited our analysis to invasive cancers that were not metastatic at the time of diagnosis. All specimens are routinely assessed using the standard IHC/ISH panel (ER/PR/HER2) and H&E-stained slides. b. Our workflow for determining the Histomic Prognostic Signature (HiPS). Breast cancer resection specimens were fixed in formalin, embedded in paraffin, cut, stained, and digitally scanned. A panoptic segmentation model identified tissue regions and nuclei in each slide, followed by the computational extraction of interpretable morphologic features. These features include stromal, immune, and spatial interaction features not included in Nottingham grading. The most prognostic features within each biologic theme, combined with ER/PR/HER2, were used to fit a Cox regression model to cancer-specific survival data. The resultant HiPS score is an interpretable weighted combination of histologic features. Additionally, we learned thresholds to identify three distinct prognostic groups. Finally, we validated HiPS using clinical, genomic, and epidemiologic data. c. An overview of the datasets used. We include patients from almost all geographic regions of the United States, covering 614 counties in 48 states, plus the District of Columbia. CPS-II data were used for prognostic model fitting, while PLCO, TCGA, and CPS-3 were independent validation cohorts. Prediagnostic risk factor exposure data were available for all datasets except TCGA, while survival outcomes were available for all datasets except CPS-3. TCGA and PLCO specimens were exclusively sourced from tertiary medical centers, unlike the CPS datasets, which were mostly sourced from non-tertiary and community hospitals.
Partially created with BioRender.com.
Figure 2
Figure 2. Thematic categorization and selection of features using the CPS-II cohort.
Supporting results are provided in Tables S23-25. a. Conceptual differences between visual grading and computational assessment. Pathologists use the Nottingham grading criteria, a visual semi-quantitative aggregate score of epithelial tubule formation, nuclear pleomorphism, and mitotic figures. In contrast, our models quantitatively assess the entire tumor microenvironment, including stromal and immune cells, stromal matrix, and spatial interactions. b. Left: Association of epithelial histomic features with visual grading (ordinal regression) versus association with survival (Cox regression). Error bars represent the standard error. Right: Box plots of the feature value distributions of two histomic features highly associated with grade (left), and two highly associated with survival (right). Feature selection for HiPS was guided by the association with survival, not grade. In fact, the standard deviation of epithelial nest size closely tracks grade, yet is only modestly prognostic compared to alternative epithelial features. P-values were obtained using the Kruskal-Wallis test. c. Inter-feature correlations. The squares represent biological themes and subthemes. Except for interaction features, cross-theme correlations are mostly weak, reflecting the independence of different themes. d. Univariable Cox regression coefficients for all 109 histomic features. The most prognostic feature within each of the 26 subthemes was included in the HiPS signature. Partially created with BioRender.com.
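The per-subtheme selection described in panel d can be sketched in a few lines. This is only an illustration, not the authors' code: the feature names, subtheme labels, and coefficient values below are hypothetical, and ranking by the absolute value of the univariable Cox coefficient is an assumption, since the caption does not state the exact criterion for "most prognostic".

```python
def select_top_feature_per_subtheme(coefficients, subtheme_of):
    """Return {subtheme: feature}, keeping the feature with the largest
    absolute coefficient in each subtheme (assumed ranking criterion)."""
    best = {}
    for feature, coef in coefficients.items():
        subtheme = subtheme_of[feature]
        current = best.get(subtheme)
        if current is None or abs(coef) > abs(coefficients[current]):
            best[subtheme] = feature
    return best


# Hypothetical univariable Cox coefficients, for illustration only.
coefficients = {
    "epi_nest_size_std": 0.05,
    "epi_nest_size_mean": 0.21,
    "caf_density": -0.30,
    "caf_nuclear_area_mean": 0.12,
}
# Hypothetical mapping of each feature to its subtheme.
subtheme_of = {
    "epi_nest_size_std": "epithelial_nests",
    "epi_nest_size_mean": "epithelial_nests",
    "caf_density": "caf_abundance",
    "caf_nuclear_area_mean": "caf_morphology",
}

selected = select_top_feature_per_subtheme(coefficients, subtheme_of)
```

With 109 features grouped into 26 subthemes, the same loop would return 26 entries, one per subtheme.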
Figure 3
Figure 3. The Histomic Prognostic Signature.
a. Relative contribution of histomic features to the HiPS score. HiPS combines 26 computationally-derived morphologic descriptors of the tumor microenvironment from H&E WSI scans with the breast IHC/ISH panel. While epithelial features were influential, we found that stromal, immune, and cell-cell interaction features had an equally important prognostic role. b. Distribution of the HiPS scores among patients from the CPS-II cohort. The distribution was modeled as a mixture of three Gaussians defining low (H1), intermediate (H2), and high (H3) risk prognostic groups. c. Illustrations of the most influential features on the HiPS score, ordered by importance. The star symbol indicates features whose mean and variance values were both influential, while the cross symbol indicates features whose variance alone was influential. Cancer-Associated Fibroblast (CAF) density and acellular stromal matrix heterogeneity were among the top-five features. The morphology and local interactions of CAFs and Tumor-Infiltrating Lymphocytes (TILs) also played an important role. Partially created with BioRender.com.
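To make the three-Gaussian mixture in panel b concrete, the sketch below assigns a score to the component with the highest posterior probability. The mixture weights, means, and standard deviations are hypothetical placeholders rather than fitted values from the paper, and the caption does not say whether the published groups were derived this way or through fixed cut-offs.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def assign_group(score, components):
    """Return (most likely group, posterior probabilities) for one score.

    components maps a group name to (weight, mean, std).
    """
    weighted = {
        name: weight * normal_pdf(score, mean, std)
        for name, (weight, mean, std) in components.items()
    }
    total = sum(weighted.values())
    posteriors = {name: value / total for name, value in weighted.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Hypothetical mixture parameters (weight, mean, std), for illustration only.
COMPONENTS = {
    "H1": (0.4, 2.5, 0.8),
    "H2": (0.4, 4.8, 0.8),
    "H3": (0.2, 7.0, 1.0),
}

group, posteriors = assign_group(4.8, COMPONENTS)
```

Scores far outside the range of all components would make every density underflow to zero, so a production version would work with log-densities instead.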
Figure 4
Figure 4. Stromal features critically impact the HiPS score and alter risk categorization in stage I cancers.
a. Clinico-pathologic characteristics of stage I cancers where HiPS altered the Nottingham risk categorization (91 of 231 stage I patients). By definition, stage I cancers do not have nodal involvement, so histology carries more weight in guiding clinical decision making. In the supplement we show that HiPS improves outcome stratification in this cohort. Patients are sorted by the percent contribution of epithelial and ER/PR/HER2 features to the HiPS score. We also show the HiPS subscores, each of which summarizes the influence of features within each biological theme. The six HiPS subscores sum to the total HiPS score. Note that non-epithelial features contribute heavily to the total HiPS score in this cohort, and were heavily influential in altering the patients’ risk categories. b. Sample case where stromal features were heavily influential. Two features are illustrated: 1. The variation in peri-CAF stromal matrix intensity, which reflects stromal interface changes like desmoplasia and is favorably prognostic; 2. Clustering of CAFs within a 64 μm radius of epithelial cells, which is adversely prognostic. Partially created with BioRender.com.
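The relationship between per-feature terms, theme subscores, and the total score can be sketched as follows. The feature names, theme labels, and values are hypothetical, and computing percent contribution from absolute subscore values is an assumption; the caption does not define how the percentage is calculated.

```python
def hips_subscores(weighted_terms, theme_of):
    """Sum the weighted feature terms within each theme."""
    subscores = {}
    for feature, term in weighted_terms.items():
        theme = theme_of[feature]
        subscores[theme] = subscores.get(theme, 0.0) + term
    return subscores


def percent_contribution(subscores, themes):
    """Share of total absolute subscore magnitude held by the given themes
    (assumed definition of percent contribution)."""
    total_abs = sum(abs(value) for value in subscores.values())
    if total_abs == 0:
        return 0.0
    part = sum(abs(subscores.get(theme, 0.0)) for theme in themes)
    return 100.0 * part / total_abs


# Hypothetical weighted terms (coefficient times feature value).
weighted_terms = {
    "epi_nuclear_area": 1.0,
    "er_pr_her2_panel": 0.5,
    "caf_density": 1.0,
    "matrix_heterogeneity": 0.5,
    "til_density": -0.5,
    "caf_epi_clustering": 0.5,
}
# Hypothetical theme labels; the caption does not name the six themes.
theme_of = {
    "epi_nuclear_area": "epithelial",
    "er_pr_her2_panel": "er_pr_her2",
    "caf_density": "stromal",
    "matrix_heterogeneity": "stromal",
    "til_density": "immune",
    "caf_epi_clustering": "interaction",
}

subscores = hips_subscores(weighted_terms, theme_of)
total_score = sum(subscores.values())
epithelial_share = percent_contribution(subscores, ["epithelial", "er_pr_her2"])
```

Because the score is linear in its terms, the subscores add up to the total regardless of how features are grouped.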
Figure 5
Figure 5. Kaplan-Meier analysis of HiPS groups compared to the Control groups.
Detailed results and at-risk tables are provided in Tables S11-16. CPS-II was our discovery cohort, while PLCO and TCGA were independent validation cohorts. Control prognostic groups were obtained by combining Nottingham grades from pathology reports with the ER/PR/HER2 panel using the same methodology as the HiPS score. All P-values are from the log-rank test, and all hazard ratios (HR) are relative to the lowest risk category. a. Breast cancer-specific survival (BCSS) outcomes for patients within the CPS-II cohort using the Control groups (Nottingham grade + ER/PR/HER2). b. Left: Sankey diagram of reclassification from Control groups to HiPS groups in CPS-II. Right: BCSS outcomes for patients in the CPS-II cohort using the HiPS groups. Note the higher HR than for the Control groups in panel a. c. Left: Sankey diagram of reclassification from Control group C3 to HiPS groups within the CPS-II cohort. Right: BCSS outcomes for Control group C3 patients when stratified into HiPS groups. HiPS enables BCSS outcome stratification within Control group C3. d. BCSS outcome stratification for the CPS-II cohort using the alternative model, HiPSepithelial, which only relies on epithelial histologic features and the ER/PR/HER2 panel. Group 3 HR values are smaller than those observed with the full HiPS model in panel b. e-h. Equivalent results to panels a-d, but for the PLCO cohort. i-l. Equivalent results to panels a-d, but using Progression-Free Interval (PFI) outcomes in the TCGA cohort. In TCGA, high censorship rates and missing cause of death information make PFI the most suitable outcome measure. Partially created with BioRender.com.
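For readers unfamiliar with the curves in this figure, here is a minimal sketch of the standard Kaplan-Meier product-limit estimator for one group. The follow-up times and event flags are made up for illustration; this is not the authors' analysis code, and it omits confidence intervals and the log-rank comparison between groups.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up durations; events: 1 for an observed event, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Made-up follow-up data: five patients, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 0, 1])
```

Censored patients never lower the curve directly; they only shrink the number at risk for later event times, which is why high censorship rates (as noted for TCGA) make the tail of the curve less reliable.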
Figure 6
Figure 6. The HiPS score is consistent with established risk profiles.
In each of the plots shown, the green and purple y-axis ranges indicate H1 (scores <3.6) and H3 (scores ≥6), respectively. Supporting results and sample sizes are provided in Tables S1-2 and S27. P-values for panel b are from the independent two-sample t-test, while those in panels c-d are from one-way ANOVA. a. Illustration of some of the epidemiological and genomic associations discussed. b. The distribution of HiPS scores by cancer detection method. Cancers detected using screening programs had lower HiPS scores than self-detected ones, likely reflecting early detection before developing high-risk features. c. HiPS score distributions within the PAM50 genomic subtypes. Basal and Luminal A cancers have the highest and lowest HiPS scores, respectively, consistent with known risk profiles. d. HiPS score distributions within genomic TNBC subtypes. Consistent with known mutational burdens, Basal-Like 1 (BL1) cancers have the highest scores, while Luminal Androgen Receptor (LAR) cancers have the lowest scores. UNS represents an unspecified TNBC subtype. e-f. Scatter plots of HiPS scores versus the fraction of genome altered and the composite Buffa hypoxia score. Higher genome alteration and tumor hypoxia correlate directly with the HiPS score. Partially created with BioRender.com.
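The score ranges quoted in this caption translate into a simple lookup. The sketch below uses the two stated cut-offs (H1 below 3.6, H3 at 6 or above) and assumes that the intermediate group H2 covers everything in between, which the caption implies but does not state outright.

```python
H1_UPPER = 3.6  # scores below this fall in H1, per the caption
H3_LOWER = 6.0  # scores at or above this fall in H3, per the caption


def hips_group(score):
    """Map a HiPS score to a risk group label."""
    if score < H1_UPPER:
        return "H1"
    if score >= H3_LOWER:
        return "H3"
    # Assumed: H2 is the remaining range between the two cut-offs.
    return "H2"


groups = [hips_group(s) for s in (2.0, 3.6, 5.9, 6.0)]
```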

