Genome Med. 2013 Apr 17;5(4):33.
doi: 10.1186/gm437. eCollection 2013.

Isoform level expression profiles provide better cancer signatures than gene level expression profiles

ZhongFa Zhang et al. Genome Med.

Abstract

Background: The majority of mammalian genes generate multiple transcript variants and protein isoforms through alternative transcription and/or alternative splicing, and the dynamic changes at the transcript/isoform level between non-oncogenic and cancer cells remain largely unexplored. We hypothesized that isoform level expression profiles would be better than gene level expression profiles at discriminating between non-oncogenic and cancer cells.

Methods: We analyzed 160 Affymetrix exon-array datasets, comprising cell lines of non-oncogenic or oncogenic tissue origins. We obtained the transcript-level and gene level expression estimates, and used unsupervised and supervised clustering algorithms to study the profile similarity between the samples at both gene and isoform levels.

Results: Hierarchical clustering, based on isoform level expressions, effectively grouped the non-oncogenic and oncogenic cell lines with a virtually perfect homogeneity-grouping rate (97.5%), regardless of the tissue origin of the cell lines. However, this rate was much lower, being 75% at best, based on the gene level expressions. Statistical analyses of the difference between cancer and non-oncogenic samples identified the existence of numerous genes with differentially expressed isoforms, which otherwise were not significant at the gene level. We also found that canonical pathways of protein ubiquitination, purine metabolism, and breast-cancer regulation by stathmin1 were significantly enriched among genes that show differential expression at isoform level but not at gene level.

Conclusions: In summary, cancer cell lines, regardless of their tissue of origin, can be effectively discriminated from non-cancer cell lines at isoform level, but not at gene level. This study suggests the existence of an isoform signature, rather than a gene signature, which could be used to distinguish cancer cells from normal cells.

Figures

Figure 1
Hierarchical clustering dendrograms of 160 datasets (73 oncogenic and 87 non-oncogenic) from cell lines of various tissue origins, using expression estimates of (A) 87,345 transcripts and (B) 27,063 genes. The top 76% of genes/transcripts with the highest coefficient of variation (CV) of expression profile across all the samples were used for the clustering (see Additional file 3: Figures S1 and S2 for a series of dendrograms obtained by different CV cut-off points). The non-oncogenic melanocyte (N.Mel) and melanoma (T.Mel) cell lines were clustered together and separated into non-oncogenic and oncogenic groups in dendrogram A, whereas in dendrogram B, the N.HMEC samples were not grouped together and were clustered with the overall oncogenic group. N.HMEC is the normal breast cell line (human mammary epithelial cells) and T.MCF7 is the breast-cancer cell line MCF7.
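The CV-based feature selection described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy expression values, and the 50% cut-off are assumptions chosen for illustration, and the sketch assumes expression estimates with a non-zero mean (a CV is not meaningful otherwise).

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return sample standard deviation divided by the absolute mean."""
    m = mean(values)
    if m == 0:
        # CV is undefined for a zero mean; treat such features as invariant here.
        return 0.0
    return stdev(values) / abs(m)


def top_cv_features(expression, fraction):
    """Rank features by CV across samples and keep the top `fraction` of them.

    `expression` maps a feature id (transcript or gene) to its expression
    values across all samples. Both names are hypothetical.
    """
    ranked = sorted(
        expression,
        key=lambda feature: coefficient_of_variation(expression[feature]),
        reverse=True,
    )
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


# Toy data: four made-up transcripts measured in four samples.
toy_expression = {
    "tx_a": [1.0, 1.1, 0.9, 1.0],
    "tx_b": [1.0, 5.0, 0.5, 8.0],
    "tx_c": [2.0, 2.0, 2.0, 2.0],
    "tx_d": [3.0, 1.0, 6.0, 2.0],
}

selected = top_cv_features(toy_expression, 0.5)
print(selected)
```

The retained features would then be passed to a hierarchical clustering routine; the caption reports a 76% cut-off, which corresponds to `fraction=0.76` in this sketch.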
Figure 2
Cumulative distribution function (CDF) and silhouette width plots demonstrate that isoform level clustering is more robust than gene-level clustering. (A) The empirical CDF plots were based on resampling 200 times at either isoform or gene level. (B, C) Silhouette width plots of the clustering results based on (B) gene-level expression of 27,063 genes and (C) isoform-level expression of 87,345 transcripts for the 160 datasets representing oncogenic and non-oncogenic cell lines. The stability and robustness of the clusters are indicated by average silhouette width. The sample number falling in each cluster and the silhouette width of each cluster are also shown on the figure.
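For readers unfamiliar with silhouette width, the following sketch computes it under the standard definition: for each sample, s = (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster. The Euclidean distance, the function name, and the toy points are assumptions; the caption does not state which distance measure the authors used.

```python
from math import dist


def silhouette_widths(points, labels):
    """Return one silhouette width per sample (Euclidean distance assumed)."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for i, point in enumerate(points):
        # a: mean distance to the other members of the sample's own cluster.
        own = [
            dist(point, other)
            for j, other in enumerate(points)
            if j != i and labels[j] == labels[i]
        ]
        if not own:
            # Convention: a singleton cluster gets a width of 0.
            widths.append(0.0)
            continue
        a = sum(own) / len(own)

        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, other) for other, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters
            if c != labels[i]
        )

        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


# Toy data: two well-separated groups labelled N (non-oncogenic) and T (oncogenic).
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
toy_labels = ["N", "N", "T", "T"]

widths = silhouette_widths(toy_points, toy_labels)
average_width = sum(widths) / len(widths)
print(widths, average_width)
```

Values close to 1 indicate samples that sit firmly inside their cluster; an average width near 0 or below suggests weak or incorrect grouping.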
Figure 3
Venn diagram showing the overlapping genes that were significantly upregulated (Up) or downregulated (Dn) either at gene level or isoform level in cancer. If more than one isoform from a single gene were significantly upregulated or downregulated, they were counted only once in the diagram. The differentially expressed genes/transcript variants were obtained by comparing the groups of (A) all non-oncogenic and all oncogenic cell lines, (B) melanocyte and melanoma cell lines and (C) human mammary epithelial cell (HMEC) and human breast adenocarcinoma (MCF7) cell lines. (D) Venn diagram shows the overlap of genes in A, B and C. The overlapping genes that were either upregulated (114 genes) or downregulated (68 genes) both at isoform and gene level were designated as the core isoforms and core genes. The 260 isoforms correspond to 182 unique genes. (E, F) Heat-map diagrams of differential gene expression of (E) core isoforms and (F) core genes.
Figure 4
Mean normalized expression estimates of MITF and its transcript variants in melanocyte (N) and melanoma (T) cell lines, and of TPM4 and its transcript variants in non-oncogenic and oncogenic breast cell lines (HMEC-N; MCF7-T) and tissues. (A) Although the gene level expression of MITF was not significantly different between melanoma and melanocytes, three of its isoforms showed significant differences in their expression: ENST00000352241 was overexpressed, whereas ENST00000433517 and ENST00000472437 were underexpressed in melanoma. (B) Similarly, two transcript variants of TPM4 showed opposing expression patterns between the non-oncogenic HMEC and the MCF7 breast-cancer cell lines. (C) Validation results via RT-qPCR experiments showing the relative fold expressions of the two major transcript variants of TPM4 (ENST00000300933 and ENST00000344824) in various human breast-cancer tissue subtypes compared with the surrounding matched non-oncogenic breast tissue. We analyzed two patient samples for estrogen receptor (ER)-positive breast-cancer tissues (ER+S1 and ER+S2) and one patient sample each for Her2 gene-positive breast cancer (Her2+) and triple-negative breast cancer (TNBC) subtypes. MCF7 is an ER+ breast-cancer cell line.
Figure 5
Hierarchical clustering dendrograms and heat map based on core transcript and gene expression and the gene network for the core genes. The hierarchical clustering dendrogram of 160 samples and heat map of gene-expression estimates of (A) 18 core transcripts and (B) 14 core genes that were significantly enriched in the top 5 gene networks identified by Ingenuity Pathway Analysis (IPA). The 18 transcripts were those having the maximum coefficient of variation (CV) values of all the core isoforms, representing 14 core genes. (C) IPA gene network of the 14 core genes, of which 12 belong to the canonical pathway with associated known functions in hematological system development and function, tissue morphology, and cellular development (in red).
Figure 6
Canonical pathways significantly enriched with genes that are differentially expressed at isoform level but not at gene level in three pairwise comparisons: 1) Overall: all oncogenic versus all non-oncogenic cell lines, 2) mcf: MCF7 versus HMEC, and 3) melanoma: melanoma cell lines versus melanocytes. Number of genes and the log P values for each pathway are plotted in the lower and upper panels of the bar chart, respectively.
