Nat Commun. 2016 Aug 16;7:12474.
doi: 10.1038/ncomms12474.

Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features

Kun-Hsing Yu et al. Nat Commun. 2016.

Abstract

Lung cancer is the most prevalent cancer worldwide, and histopathological assessment is indispensable for its diagnosis. However, human evaluation of pathology slides cannot accurately predict patients' prognoses. In this study, we obtain 2,186 haematoxylin and eosin stained histopathology whole-slide images of lung adenocarcinoma and squamous cell carcinoma patients from The Cancer Genome Atlas (TCGA), and 294 additional images from Stanford Tissue Microarray (TMA) Database. We extract 9,879 quantitative image features and use regularized machine-learning methods to select the top features and to distinguish shorter-term survivors from longer-term survivors with stage I adenocarcinoma (P<0.003) or squamous cell carcinoma (P=0.023) in the TCGA data set. We validate the survival prediction framework with the TMA cohort (P<0.036 for both tumour types). Our results suggest that automatically derived image features can predict the prognosis of lung cancer patients and thereby contribute to precision oncology. Our methods are extensible to histopathology images of other organs.

Figures

Figure 1. Quantitative image features accurately distinguished malignancies from adjacent dense normal tissues.
(a) ROC curves for classifying lung adenocarcinoma versus adjacent dense normal tissues in the TCGA test set. Classifiers with 80 features attained an average AUC of 0.81. (b) ROC curves for classifying lung squamous cell carcinoma versus adjacent dense normal tissues in the TCGA test set. Classifiers with 80 features attained an average AUC of 0.85. The performance of different classifiers is shown. AUC, area under the ROC curve; CIT, conditional inference trees; ROC, receiver operating characteristic.
Figure 2. Quantitative image features successfully distinguished histopathology images of lung adenocarcinoma from those of lung squamous cell carcinoma.
(a) ROC curves for classifying the two malignancies in the TCGA test set. Most classifiers achieved AUC>0.7. (b) ROC curves for classifying the two malignancies in the TMA test set. Most classifiers achieved AUC>0.75, indicating that our informatics pipeline was successfully validated in the independent TMA data set. The performance of different classifiers is shown. AUC, area under the ROC curve; CIT, conditional inference trees; ROC, receiver operating characteristic.
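
The AUC values reported in Figures 1 and 2 summarize each ROC curve as a single number. As a minimal sketch of what that number means (not the authors' code), the AUC can be computed as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores below are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. tumour), 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with made-up values (hypothetical, for illustration only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would scale better.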
Figure 3. Quantitative image features predicted the survival outcomes of stage I lung adenocarcinoma patients.
(a) Kaplan–Meier curves of lung adenocarcinoma patients stratified by tumour stage. Patients with higher stages tended to have worse prognosis (log-rank test P value<0.001 in TCGA data set, log-rank test P value=0.0068 in TMA data set). However, the survival outcomes varied widely (left: TCGA data set; right: TMA data set). (b) Kaplan–Meier curves of stage I lung adenocarcinoma patients stratified by tumour grade. Tumour grade did not significantly correlate with survival (left: TCGA data set, log-rank test P value=0.06; right: TMA data set, log-rank test P value=0.0502). (c) Kaplan–Meier curves of stage I lung adenocarcinoma patients stratified using quantitative image features. Image features predicted the survival outcomes. An elastic net-Cox proportional hazards model categorized patients into two prognostic groups, with a statistically significant difference in their survival outcomes in the TCGA test set (log-rank test P value=0.0023). (d) The same classification workflow was validated in the TMA data set, with comparable prediction performance (log-rank test P value=0.028). (e) Sample image of stage I adenocarcinoma with long survival. This patient had stage IB, grade 3 lung adenocarcinoma and survived more than 99 months after diagnosis. Our classifier correctly predicted the patient as a long survivor. (f) Sample image of stage I adenocarcinoma with short survival. This patient had stage IB, grade 3 lung adenocarcinoma and survived less than 12 months after diagnosis. Our classifier correctly predicted the patient as a short survivor.
Figure 4. Quantitative image features predicted the survival outcomes of lung squamous cell carcinoma patients.
(a) Kaplan–Meier curves of lung squamous cell carcinoma patients stratified by tumour stage. Although patients with higher stages generally have worse outcomes, the trend was not statistically significant (left: TCGA data set, log-rank test P value=0.216; right: TMA data set, log-rank test P value=0.388). (b) Kaplan–Meier curves of stage I lung squamous cell carcinoma patients stratified by tumour grade. Tumour grade did not significantly correlate with survival (left: TCGA data set, log-rank test P value=0.847; right: TMA data set, log-rank test P value=0.964). (c) Kaplan–Meier curves of lung squamous cell carcinoma patients stratified using quantitative image features. The image features predicted the survival outcomes. An elastic net-Cox proportional hazards model categorized patients into two prognostic groups, with a statistically significant difference in their survival in the TCGA test set (log-rank test P value=0.023). (d) The same classification workflow was validated in the TMA data set, with comparable prediction performance (log-rank test P value=0.035). (e) Sample image of lung squamous cell carcinoma in a patient with long survival. This patient had stage I, grade 1 lung squamous cell carcinoma and survived more than 70 months after diagnosis. Our classifier correctly predicted the patient as a long survivor. (f) Sample image of squamous cell carcinoma in a patient with short survival. This patient had stage I, grade 1 lung squamous cell carcinoma and survived only 12.4 months after diagnosis. Our classifier correctly predicted the patient as a short survivor.
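
The survival comparisons in Figures 3 and 4 rest on two standard tools: the Kaplan–Meier estimator for the survival curves and the two-group log-rank test for the P values. The following is a minimal sketch of both, assuming the two prognostic groups have already been assigned (in the study, by an elastic net-Cox model, which is not reproduced here). The function names and the toy follow-up times are hypothetical and are not data from the study.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    """
    curve = []
    surv = 1.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) on 1 degree of freedom."""
    times = list(times_a) + list(times_b)
    events = list(events_a) + list(events_b)
    observed_a = expected_a = variance = 0.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        n_a = sum(1 for u in times_a if u >= t)   # at risk in group A
        n = sum(1 for u in times if u >= t)       # at risk overall
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance; zero when only one patient remains
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy follow-up data in months (hypothetical, for illustration only).
short_times, short_events = [3, 5, 7, 9, 12], [1, 1, 1, 1, 1]
long_times, long_events = [20, 30, 40, 55, 60], [1, 0, 1, 0, 0]

curve = kaplan_meier(short_times, short_events)
chi2, p_value = logrank_test(short_times, short_events, long_times, long_events)
print(curve)
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

The log-rank test compares the observed number of deaths in one group with the number expected if both groups shared the same hazard, summed over all event times. A small P value indicates that the two survival curves differ more than chance alone would explain.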
