Clin Cancer Res. 2016 Oct 1;22(19):4880-4889.
doi: 10.1158/1078-0432.CCR-15-2900. Epub 2016 Jun 28.

An Expression Signature as an Aid to the Histologic Classification of Non-Small Cell Lung Cancer

Luc Girard et al. Clin Cancer Res. 2016.

Abstract

Purpose: Most non-small cell lung cancers (NSCLC) are now diagnosed from small specimens, and classification using standard pathology methods can be difficult. This is of clinical relevance as many therapy regimens and clinical trials are histology dependent. The purpose of this study was to develop an mRNA expression signature as an adjunct test for routine histopathologic classification of NSCLCs.

Experimental design: A microarray dataset of resected adenocarcinomas (ADC) and squamous cell carcinomas (SCC) was used as the learning set for an ADC-SCC signature. The Cancer Genome Atlas (TCGA) lung RNAseq dataset was used for validation. Another microarray dataset of ADCs and matched nonmalignant lung was used as the learning set for a tumor versus nonmalignant signature. The classifiers were selected as the most differentially expressed genes and sample classification was determined by a nearest distance approach.

Results: We developed a 62-gene expression signature that contained many genes used in immunostains for NSCLC typing. It includes 42 genes that distinguish ADC from SCC and 20 genes differentiating nonmalignant lung from lung cancer. Testing of the TCGA and other public datasets resulted in high prediction accuracies (93%-95%). In addition, a prediction score was derived that correlates both with histologic grading and prognosis. We developed a practical version of the Classifier using the HTG EdgeSeq nuclease protection-based technology in combination with next-generation sequencing that can be applied to formalin-fixed paraffin-embedded (FFPE) tissues and small biopsies.

Conclusions: Our RNA classifier provides an objective, quantitative method to aid in the pathologic diagnosis of lung cancer. Clin Cancer Res; 22(19); 4880-9. ©2016 AACR.

Conflict of interest statement

Disclosure of Potential Conflicts of Interest: Ihab Botros and Debrah Thompson are both employees and stockholders of HTG Molecular Diagnostics, Inc. (HTG). Ignacio Wistuba is a consultant for HTG and an occasional participant of its advisory board. He also received financial support from HTG as part of a research agreement. No other authors of this article have financial ties to HTG.

Figures

Figure 1
(A) A volcano plot shows that many genes are significantly different between ADC and SCC. Among these are several genes for immunostains typically used by pathologists, including high molecular weight keratins (KRTs), TP63, DSG3, and TITF1 (NKX2-1). Color-coding is related to the distance of each point from the plot’s origin and shows significance levels (red: highly significant). (B) Heatmap of ADC-SCC signature in the MDACC dataset. Red, high mRNA expression; green, low mRNA expression. Twenty-one genes overexpressed in ADC and twenty-one genes overexpressed in SCC were selected for this signature. Blue labels, genes known to be relevant to lung cancer pathogenesis. Arrows, commonly used immunostains. (C) “Correlation plot” where each point represents the Pearson correlation values between the 42-gene signature expression of an individual sample and the mean expression of ADCs (x-axis) and SCCs (y-axis). The scores shown as arrows are defined as (Correl ADC − Correl SCC)/2. They are proportional to the distances from each point to the line y = x. Dotted lines represent cutoff score values below which the samples are thought to be less well-differentiated. Color-coding in this and similar plots represents pathological diagnoses. (D) “Score plot” where the y-axis represents the scores calculated from (C). PD, Poorly Differentiated. Dotted lines are the score cutoffs.
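The score defined in panel (C) can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the function names, the toy four-gene profiles, and the `cutoff` parameter are assumptions (the caption does not state the ADC-SCC cutoff value). It computes the two Pearson correlations, the score (Correl ADC − Correl SCC)/2, and the distance to the line y = x, which equals √2 times the absolute score.

```python
import numpy as np

def histology_score(sample, mean_adc, mean_scc):
    """Return (r_adc, r_scc, score), where score = (r_adc - r_scc) / 2."""
    # Pearson correlation of the sample's signature expression with each class mean.
    r_adc = np.corrcoef(sample, mean_adc)[0, 1]
    r_scc = np.corrcoef(sample, mean_scc)[0, 1]
    return r_adc, r_scc, (r_adc - r_scc) / 2.0

def distance_to_diagonal(r_adc, r_scc):
    """Perpendicular distance from the point (r_adc, r_scc) to the line y = x."""
    return abs(r_adc - r_scc) / np.sqrt(2.0)

def call_histology(score, cutoff):
    """Label a sample from its score.

    `cutoff` is a hypothetical parameter; scores inside (-cutoff, +cutoff)
    are flagged as less well-differentiated, following the caption's wording.
    """
    if score >= cutoff:
        return "ADC"
    if score <= -cutoff:
        return "SCC"
    return "less well-differentiated"

# Toy profiles (made-up values, four genes instead of 42).
mean_adc = np.array([1.0, 2.0, 3.0, 4.0])
mean_scc = np.array([4.0, 3.0, 2.0, 1.0])
sample = np.array([1.0, 2.0, 3.0, 5.0])

r_adc, r_scc, score = histology_score(sample, mean_adc, mean_scc)
dist = distance_to_diagonal(r_adc, r_scc)
```

A positive score means the sample correlates better with the ADC mean profile, a negative score with the SCC mean profile.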
Figure 2
The ADC-SCC signature is validated on the TCGA and EDRN/Canary datasets and shown as score plots as in Fig. 1D. The prediction accuracies for TCGA (before revision) were 97% (ADC) and 93% (SCC), for an overall accuracy of 95% (Table 1). These accuracies were 97% and 96%, respectively (overall: 97%), after revision of the diagnosis of selected TCGA cases, which included many NSCLC-NOS (blue). The prediction accuracy for the EDRN/Canary dataset, which consists of ADCs only, was 99%.
Figure 3
(A) A volcano plot shows that many genes are significantly different between tumor and non-malignant lung. (B) Ten genes overexpressed in tumors and ten genes overexpressed in non-malignant lung tissue were selected for the Tumor vs Nonmalignant signature. (C) This signature applied to the training set can distinguish tumor from non-malignant lung with complete accuracy (100%). (D) A score plot with the cutoff scores (±0.10) shows a range of possible tumor content (tumor, low-tumor content, non-malignant).
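The three-way reading of the score plot in panel (D) can be expressed as a small rule. This is a sketch under stated assumptions: the ±0.10 cutoff comes from the caption, but the function name, the sign convention (positive scores are tumor-like, by analogy with the score in Fig. 1C), and the handling of scores exactly at the cutoff are assumptions.

```python
def call_tumor_content(score, cutoff=0.10):
    """Map a Tumor vs Nonmalignant score to one of three categories.

    Assumes positive scores are tumor-like and negative scores are
    non-malignant-like; the sign convention is not stated in the caption.
    """
    if score > cutoff:
        return "tumor"
    if score < -cutoff:
        return "non-malignant"
    # Scores between the two cutoffs are treated as low tumor content.
    return "low-tumor content"

labels = [call_tumor_content(s) for s in (0.45, 0.03, -0.30)]
```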
Figure 4
The Tumor-Nonmalignant signature is validated on the TCGA dataset, which consists of 979 lung tumors and 108 non-malignant lung tissues. The prediction accuracies are 98% for tumors and 100% for non-malignant lung (overall: 98%, Table 1). Many tumor samples have negative scores, which could be due to greater stromal infiltration. A score plot for the MDACC dataset, which has tumor samples only, shows an accuracy of 92% (Table 1).
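As a consistency check, the overall TCGA accuracy can be approximated as the sample-weighted mean of the two per-class accuracies. The sketch below uses the rounded percentages from the caption, so it only approximates the value in Table 1; the function name is an assumption.

```python
def overall_accuracy(class_sizes, class_accuracies):
    """Sample-weighted mean of per-class accuracies."""
    total = sum(class_sizes)
    # Expected number of correctly classified samples in each class.
    correct = sum(n * a for n, a in zip(class_sizes, class_accuracies))
    return correct / total

# 979 tumors at 98% and 108 non-malignant samples at 100% (rounded caption values).
acc = overall_accuracy([979, 108], [0.98, 1.00])
print(f"{acc:.1%}")
```

With these inputs the weighted mean is about 0.982, which rounds to the 98% overall accuracy quoted in the caption.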
