BMC Bioinformatics. 2006 Apr 11;7:202.
doi: 10.1186/1471-2105-7-202.

Profiling alternatively spliced mRNA isoforms for prostate cancer classification

Chaolin Zhang et al. BMC Bioinformatics. 2006.

Abstract

Background: Prostate cancer is one of the leading causes of cancer illness and death among men in the United States and worldwide. There is an urgent need to discover good biomarkers for early clinical diagnosis and treatment. Previously, we developed an exon-junction microarray-based assay and profiled 1532 mRNA splice isoforms from 364 potential prostate cancer-related genes in 38 prostate tissues. Here, we investigate the advantage of using splice isoforms, which couple transcriptional and splicing regulation, for cancer classification.

Results: As many as 464 splice isoforms from more than 200 genes are differentially regulated in tumors at a false discovery rate (FDR) of 0.05. Remarkably, about 30% of genes have isoforms that are called significant but do not exhibit differential expression at the overall mRNA level. A support vector machine (SVM) classifier trained on 128 signature isoforms correctly predicts 92% of the cases, outperforming the classifier that uses overall mRNA abundance by about 5%. Classification performance also improves with multivariate variable selection methods, which take correlation among variables into account.
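The FDR threshold mentioned above can be illustrated with a minimal sketch. The abstract does not say which FDR procedure was used (the figure legends report q-values), so the Benjamini-Hochberg step-up procedure is shown here purely as a common stand-in. The function name and the p-values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the test is called significant."""
    m = len(p_values)
    # Indices of the tests, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is called significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical per-isoform p-values (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.9]
significant = benjamini_hochberg(p_values, fdr=0.05)
print(sum(significant), "of", len(p_values), "isoforms called significant")
```

Note that three of the hypothetical p-values fall below 0.05 yet are not called significant, because the per-rank threshold is stricter than the nominal level when many isoforms are tested at once.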

Conclusion: These results demonstrate that profiling of splice isoforms provides unique and important information that cannot be detected by conventional microarrays.

Figures

Figure 1
Prostate tumor and normal samples can be separated into distinct groups. (A) A thumbnail overview of the result of the two-way average-linkage hierarchical clustering of 38 arrays (columns) and 1532 isoforms (rows), as described in ref [30]. (B) Zoomed-in view of the array clustering dendrogram. The two array clusters, C1 and C2, are enriched in normal samples and tumor samples, respectively. Cluster C2 is formed by two sub-clusters, reflecting differences in tumor percentage and stroma. (C-E) Isoform signatures up- or down-regulated in different array clusters. (F and G) The result of SVD. (F) The percentage of variation (y-axis) captured by each principal component (x-axis). (G) The low-dimensional projection of arrays in the 3D space spanned by the first three principal components. SVD identified the same hierarchical structure as revealed by hierarchical clustering.
Figure 2
Profiling splice isoforms provides additional useful information for prostate cancer classification. (A) The validity of estimating the overall mRNA abundance level from the isoform abundance level. The overall mRNA level was estimated by summing the abundances of the individual isoforms of each gene. The estimated mRNA abundances of 107 genes were compared with direct measurements by an independent expression microarray design (described in main text). Plotted is a scatter plot of the log expression ratios of these genes in two prostate cancer cell lines, LNCaP and PC-3. These two approaches show good agreement (R² = 0.80, p = 2.2e-16). (B) 159 of the 364 genes profiled in the DASL assay exhibit differential expression between tumors and normal samples at the overall mRNA level (q-value = 0.05). Most of them (92%) have isoforms with significant differential expression. (C and D) 464 isoforms from 222 genes are reported as being differentially expressed between tumors and normal tissues (q-value = 0.05) and may be prostate cancer marker candidates. 32% of these genes (corresponding to 22% of the significant isoforms) do not show differential expression at the overall mRNA level and therefore cannot be detected by conventional microarrays.
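The gene-level estimate described in panel (A), summing isoform abundances per gene and comparing two samples as a log ratio, can be sketched as follows. This is only an illustration under stated assumptions: the gene and isoform names, the abundance values, and the choice of log base 2 are hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict

# Hypothetical isoform abundances: (gene, isoform) -> abundance per sample.
isoform_abundance = {
    ("GENE_A", "iso1"): {"LNCaP": 120.0, "PC-3": 30.0},
    ("GENE_A", "iso2"): {"LNCaP": 40.0, "PC-3": 130.0},
    ("GENE_B", "iso1"): {"LNCaP": 50.0, "PC-3": 200.0},
}


def gene_level_abundance(isoform_abundance):
    """Estimate overall mRNA abundance by summing the isoforms of each gene."""
    totals = defaultdict(lambda: defaultdict(float))
    for (gene, _isoform), per_sample in isoform_abundance.items():
        for sample, value in per_sample.items():
            totals[gene][sample] += value
    return {gene: dict(samples) for gene, samples in totals.items()}


def log2_ratio(a, b):
    """Log expression ratio of abundance a over abundance b (base 2 assumed)."""
    return math.log2(a / b)


gene_totals = gene_level_abundance(isoform_abundance)
for gene, samples in gene_totals.items():
    print(gene, round(log2_ratio(samples["LNCaP"], samples["PC-3"]), 3))
```

In this toy input the two GENE_A isoforms change in opposite directions, so the summed gene-level ratio is zero even though each isoform differs between samples. That is the kind of case panels (C and D) describe as invisible at the overall mRNA level.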
Figure 3
Performance is measured by leave-one-out cross-validation. To obtain an unbiased result, variable selection and training are done on the training arrays, which are completely independent of the testing array. (A) Comparison of the classification performance of SVM-RFE-selected variables using individual isoforms versus overall mRNAs. (B) Comparison of the classification performance of variable subsets selected by SVM-RFE and by t-test, using individual isoforms.
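The evaluation design in this legend, where variable selection is repeated inside every leave-one-out fold so that the held-out array never influences it, can be sketched in a few lines. This is a simplified illustration and not the authors' pipeline: a nearest-centroid rule stands in for the SVM, a Welch t-statistic ranking stands in for SVM-RFE, and all function names and data values are hypothetical.

```python
from statistics import mean, variance


def t_score(a, b):
    """Absolute Welch t-statistic between two lists of values."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / (se + 1e-12)


def select_features(X, y, k):
    """Rank variables by t-score using only the samples passed in."""
    scores = []
    for j in range(len(X[0])):
        normal = [row[j] for row, label in zip(X, y) if label == 0]
        tumor = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((t_score(normal, tumor), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def nearest_centroid_predict(X, y, features, x_new):
    """Stand-in classifier (not an SVM): pick the closer class centroid."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y)):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroid = [mean(row[j] for row in rows) for j in features]
        dist = sum((x_new[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k):
    """Leave-one-out CV with variable selection redone inside each fold."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        # The held-out array X[i] is never seen by selection or training.
        features = select_features(X_train, y_train, k)
        correct += nearest_centroid_predict(X_train, y_train, features, X[i]) == y[i]
    return correct / len(X)


# Hypothetical data: 8 arrays x 3 variables; 0 = normal, 1 = tumor.
X = [[1.0, 3.0, 2.0], [1.2, 2.0, 2.5], [0.9, 3.5, 2.1], [1.1, 2.4, 2.6],
     [5.0, 2.9, 2.2], [5.3, 2.1, 2.4], [4.8, 3.4, 2.0], [5.1, 2.5, 2.7]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = loocv_accuracy(X, y, k=1)
print("LOOCV accuracy:", accuracy)
```

Selecting variables once on all arrays and then cross-validating only the classifier would let the held-out array leak into the selection step and tends to give optimistic accuracy estimates, which is the bias the legend says the design avoids.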

References

    1. Parkin DM, Bray FI, Devesa SS. Cancer burden in the year 2000. The global picture. Eur J Cancer. 2001;37:4–66. doi: 10.1016/S0959-8049(01)00267-2.
    2. Jemal A, Thomas A, Murray T, Thun M. Cancer statistics, 2002. CA Cancer J Clin. 2002;52:23–47.
    3. Jemal A, Murray T, Samuels A, Ghafoor A, Ward E, Thun MJ. Cancer statistics, 2003. CA Cancer J Clin. 2003;53:5–26.
    4. Denmeade SR, Isaacs JT. A history of prostate cancer treatment. Nat Rev Cancer. 2002;2:389–396. doi: 10.1038/nrc801.
    5. Nelson WG, De Marzo AM, Isaacs WB. Prostate Cancer. N Engl J Med. 2003;349:366–381. doi: 10.1056/NEJMra021562.
