Meta-Analysis
EBioMedicine. 2018 Jun;32:234-244.
doi: 10.1016/j.ebiom.2018.05.010. Epub 2018 Jun 1.

RankProd Combined with Genetic Algorithm Optimized Artificial Neural Network Establishes a Diagnostic and Prognostic Prediction Model that Revealed C1QTNF3 as a Biomarker for Prostate Cancer

Qi Hou et al. EBioMedicine. 2018 Jun.

Abstract

Prostate cancer (PCa) is the most commonly diagnosed cancer in males in the Western world. Although prostate-specific antigen (PSA) has been widely used as a biomarker for PCa diagnosis, its results can be controversial. Therefore, new biomarkers are needed to enhance the clinical management of PCa. Differentially expressed genes (DEGs) were identified from publicly available microarray data by meta-analysis with RankProd. A genetic algorithm optimized artificial neural network (GA-ANN) was used to establish a diagnostic prediction model and to filter candidate genes. The diagnostic and prognostic capabilities of the prediction model and the candidate genes were investigated in both GEO and TCGA datasets. Candidate genes were further validated by qPCR, Western blot and tissue microarray. The RankProd meta-analysis of microarray data from 133 cases and 30 controls identified 2306 significantly up-regulated and 1311 significantly down-regulated probes. The overall accuracy of the PCa diagnostic prediction model, which consists of a 15-gene signature, reached up to 100% in both the training and test datasets. The prediction model also performed well for the diagnosis (AUC = 0.953) and prognosis (AUC for 5-year overall survival = 0.808) of PCa in the TCGA database. The expression levels of three genes, FABP5, C1QTNF3 and LPHN3, were validated by qPCR. High C1QTNF3 expression was further validated in PCa tissue by Western blot and tissue microarray. In the GEO datasets, C1QTNF3 was a good predictor for the diagnosis of PCa (GSE6956: AUC = 0.791; GSE8218: AUC = 0.868; GSE26910: AUC = 0.972). In the TCGA database, C1QTNF3 was significantly associated with recurrence-free survival of PCa patients (P < .001, AUC = 0.57). In this study, we developed a diagnostic and prognostic prediction model for PCa and identified C1QTNF3 as a promising biomarker. This approach can be applied to other high-throughput data from different platforms for the discovery of oncogenes or biomarkers in different kinds of diseases.
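
To make the meta-analysis step more concrete, the following is a minimal sketch of the rank product statistic that underlies RankProd-style analyses. It is not the authors' pipeline: the gene names and fold changes are hypothetical, the function `rank_product` is an illustrative helper, and the permutation-based significance estimation used in practice is omitted. Ties are broken arbitrarily by sort order.

```python
import math

def rank_product(fold_changes):
    """Geometric mean of per-dataset ranks for each gene.

    fold_changes maps a gene name to a list of log fold changes
    (tumor vs. normal), one value per dataset. Rank 1 is the most
    up-regulated gene in a dataset, so a small rank product means
    the gene is consistently up-regulated across datasets.
    """
    genes = list(fold_changes)
    n_datasets = len(next(iter(fold_changes.values())))
    log_rank_sum = {g: 0.0 for g in genes}
    for d in range(n_datasets):
        # Order genes by descending fold change within dataset d.
        ordered = sorted(genes, key=lambda g: fold_changes[g][d], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_rank_sum[g] += math.log(rank)
    # exp(mean(log(rank))) is the geometric mean of the ranks.
    return {g: math.exp(log_rank_sum[g] / n_datasets) for g in genes}

# Hypothetical log2 fold changes in three datasets (illustrative only).
toy = {
    "GENE_A": [2.1, 1.8, 2.5],
    "GENE_B": [0.2, -0.1, 0.3],
    "GENE_C": [1.0, 2.0, 0.9],
    "GENE_D": [-1.5, -2.0, -1.1],
}
rp = rank_product(toy)
for gene in sorted(rp, key=rp.get):
    print(gene, round(rp[gene], 3))
```

In this toy input, GENE_A has ranks 1, 2 and 1 across the three datasets and therefore the smallest rank product. Ranking by ascending fold change instead would surface consistently down-regulated genes.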

Keywords: Artificial neural network; Biomarker; Genetic algorithm; Prostate cancer; RankProd.

Figures

Fig. 1
Flowchart for the systematic analysis and validation of key genes in PCa.
Fig. 2
Heatmap of the top 1000 differentially expressed genes (DEGs) from RankProd. Blue shading represents normal tissue and red shading represents patient tumor tissue.
Fig. 3
GO enrichment and KEGG pathway analysis for up- and down-regulated genes in PCa. (a) Biological process. (b) Molecular function. (c) Cellular component. (d) KEGG pathway.
Fig. 4
ANN model training process. (a) Effect of the number of hidden layer nodes on the accuracy of the ANN and GA-ANN models. (b) Effect of the number of hidden layer nodes on the modeling time of the ANN and GA-ANN models. (c) Configuration of the final ANN. (d) Mean squared error during ANN training. After six epochs, the mean squared error of the prediction model trained by the ANN falls below the threshold of 0.005. (e) Regression plot of the outputs of the ANN-trained prediction model against the targets. The outputs are nearly equal to the targets, indicating that the model fits the training targets closely.
Fig. 5
Diagnostic and prognostic capacity of the 15-gene signature for PCa in the TCGA dataset. (a) Boxplots of the distribution of 15-gene expression values in the TCGA PCa cohort. The line within the box indicates the median value; the box spans the interquartile range. (b) ROC curve of the 15-gene signature for PCa diagnosis. (c) Kaplan-Meier curves for the low- and high-risk groups separated by the PI of the 15-gene signature in the TCGA PCa cohort. Differences in overall survival between the two groups were analyzed by log-rank test (P = .003). (d) Kaplan-Meier curves for the low- and high-risk groups of the 15-gene signature in the TCGA PCa cohort. Differences in DFS between the two groups were analyzed by log-rank test (P = .003). (e) ROC curves for the prediction of 5-year overall survival by the 15-gene signature model, PSA screening and the Gleason score. (f) ROC curves for the prediction of 5-year DFS by the 15-gene signature, PSA screening and the Gleason score.
Fig. 6
qPCR assay of the FABP5, C1QTNF3 and LPHN3 genes in PCa and normal adjacent tissues. Scatter plots of gene expression and the distribution of fold changes in gene expression across samples.
Fig. 7
Validation of C1QTNF3 expression by tissue microarray and Western blot. (a) Pathological sections of PCa and para-carcinoma tissue. (b) C1QTNF3 expression in tumor tissue was significantly increased compared with para-carcinoma tissue. (c) C1QTNF3 expression assayed by Western blot.
Fig. 8
Diagnostic and prognostic capacity of C1QTNF3 for PCa in GEO and TCGA datasets. (a) C1QTNF3 expression in three datasets (GSE6956, GSE8218, GSE26910) from the GEO database. (b) ROC curves and AUC values showing the diagnostic capacity of C1QTNF3 for PCa in the GSE6956, GSE8218 and GSE26910 datasets. (c) C1QTNF3 expression was associated with DFS time (log-rank test, P < .001). (d) The AUC of C1QTNF3 for DFS time is 0.57.
Supplementary file 7
PI distribution of high-risk and low-risk patients.
Supplementary file 8
Validation of the 15-gene signature in datasets from Taylor et al. and Ross-Adams et al. (a) Kaplan-Meier curves for the low- and high-risk groups separated by the PI of the 15-gene signature in the Taylor cohort. Differences in disease-free time between the two groups were analyzed by log-rank test (P = .003). (b) ROC curve for the prediction of 5-year disease-free time by the 15-gene signature model in the Taylor cohort. (c) Kaplan-Meier curves for the low- and high-risk groups separated by the PI of the 15-gene signature in the Ross-Adams cohort. Differences in biochemical relapse time between the two groups were analyzed by log-rank test (P = .036). (d) ROC curve for the prediction of 5-year biochemical relapse time by the 15-gene signature model in the Ross-Adams cohort.

