Oncol Lett. 2020 Jan;19(1):388-398.
doi: 10.3892/ol.2019.11068. Epub 2019 Nov 7.

Identification of key genes for predicting colorectal cancer prognosis by integrated bioinformatics analysis

Gong-Peng Dai et al. Oncol Lett. 2020 Jan.

Abstract

Colorectal cancer (CRC) is a life-threatening disease with a poor prognosis. Therefore, it is crucial to identify molecular prognostic biomarkers for CRC. The present study aimed to identify potential key genes that could be used to predict the prognosis of patients with CRC. Three CRC microarray datasets (GSE20916, GSE73360 and GSE44861) were downloaded from the Gene Expression Omnibus (GEO) database, and one dataset was obtained from The Cancer Genome Atlas (TCGA) database. The three GEO datasets were analyzed to detect differentially expressed genes (DEGs) using the BRB-ArrayTools software. Functional and pathway enrichment analyses of these DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery tool. A protein-protein interaction (PPI) network of DEGs was constructed, hub genes were extracted, and modules of the PPI network were analyzed. To investigate the prognostic values of the hub genes in CRC, data from the CRC datasets of TCGA were used to perform the survival analyses based on the sample splitting method and Cox regression model. Correlation among the hub genes was evaluated using Spearman's correlation analysis. In the three GEO datasets, a total of 105 common DEGs were identified, including 51 down- and 54 up-regulated genes in CRC compared with normal colorectal tissues. A PPI network consisting of 100 DEGs and 551 edges was constructed, and 44 nodes were identified as hub genes. Among these 44 genes, the four hub genes TIMP metallopeptidase inhibitor 1 (TIMP1), solute carrier family 4 member 4 (SLC4A4), aldo-keto reductase family 1 member B10 (AKR1B10) and ATP binding cassette subfamily E member 1 (ABCE1) were associated with overall survival (OS) in patients with CRC. Three significant modules were extracted from the PPI network. The hub gene TIMP1 was present in Module 1, ABCE1 was involved in Module 2 and SLC4A4 was identified in Module 3. 
Univariate analysis revealed that TIMP1, SLC4A4, AKR1B10 and ABCE1 were associated with the OS of patients with CRC. Multivariate analysis demonstrated that SLC4A4 may be an independent prognostic factor associated with OS. Furthermore, the results from correlation analysis revealed that there was no correlation between TIMP1, SLC4A4 and ABCE1, whereas AKR1B10 was positively correlated with SLC4A4. In conclusion, the four key genes TIMP1, SLC4A4, AKR1B10 and ABCE1 associated with the OS of patients with CRC were identified by integrated bioinformatics analysis. These key genes may be used as prognostic biomarkers to predict the survival of patients with CRC, and may therefore represent novel therapeutic targets for CRC.

Keywords: colorectal cancer; differentially expressed genes; protein-protein interaction; survival.

Figures

Figure 1.
Expression pattern of genes between CRC and normal samples. (A) Volcano plots exhibiting expression data of CRC and normal tissues in the microarray profiles of GSE20916, GSE44861 and GSE73360. The x-axis presents the mean differences between CRC and normal samples. The y-axis presents the log-transformed P-values. DEGs are shown in blue. (B) Hierarchical clustering analysis of DEGs between CRC and normal samples in GSE20916, GSE44861 and GSE73360. Each row represents a DEG and each column represents a different sample. (C) Intersection of all DEGs (n=105), up-regulated DEGs (n=54) and down-regulated DEGs (n=51) among the expression data of GSE20916, GSE44861 and GSE73360. (D) Hierarchical clustering analysis of the mutual up-regulated and down-regulated DEGs in the expression data of GSE20916, GSE44861 and GSE73360. Red represents down-regulated genes, whereas blue represents up-regulated genes. CRC, colorectal cancer; DEGs, differentially expressed genes.
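The intersection step in panel (C) can be illustrated with a minimal sketch. It assumes each dataset has already been reduced to a mapping from gene symbol to direction of change; the function name `common_degs` and the gene names are hypothetical placeholders, not results or code from the study, and requiring a consistent direction across datasets is an assumption about how the common DEGs were defined.

```python
# Toy DEG calls per dataset: gene symbol -> direction ("up" or "down").
# Gene names are placeholders, not genes reported by the study.
def common_degs(calls_by_dataset):
    """Return (up, down): genes called in every dataset with one consistent direction."""
    datasets = list(calls_by_dataset.values())
    shared = set(datasets[0])
    for calls in datasets[1:]:
        shared &= set(calls)  # keep only genes present in every dataset
    up = {g for g in shared if all(d[g] == "up" for d in datasets)}
    down = {g for g in shared if all(d[g] == "down" for d in datasets)}
    return up, down


toy_calls = {
    "GSE20916": {"GENE_A": "up", "GENE_B": "down", "GENE_C": "up"},
    "GSE44861": {"GENE_A": "up", "GENE_B": "down", "GENE_D": "up"},
    "GSE73360": {"GENE_A": "up", "GENE_B": "down", "GENE_C": "down"},
}
up, down = common_degs(toy_calls)
print(sorted(up), sorted(down))
```

In this toy input only GENE_A and GENE_B survive, because GENE_C and GENE_D are each missing from at least one dataset.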
Figure 2.
PPI network construction and module composition. (A) The constructed PPI network comprised 100 nodes and 551 edges, including 49 up-regulated genes and 51 down-regulated genes. Nodes represent proteins (genes). Red nodes represent down-regulated DEGs. Orange nodes represent up-regulated DEGs. Edge width was determined according to the combined score of the PPI relationship. (B) A total of 18 nodes and 136 edges were included in Module 1. (C) A total of six nodes and 12 edges were included in Module 2. (D) A total of six nodes and 10 edges were included in Module 3. DEGs, differentially expressed genes; PPI, protein-protein interaction.
Figure 3.
Crosstalk analysis for significant pathways. ECM, extracellular matrix; TNF, tumor necrosis factor.
Figure 4.
Kaplan-Meier OS curves according to the expression levels of TIMP1, SLC4A4, AKR1B10 and ABCE1. OS curves demonstrated that high TIMP1 expression and low SLC4A4/AKR1B10/ABCE1 expression were significantly associated with low OS in patients with colorectal cancer. ABCE1, ATP binding cassette subfamily E member 1; AKR1B10, aldo-keto reductase family 1 member B10; OS, overall survival; SLC4A4, solute carrier family 4 member 4; TIMP1, TIMP metallopeptidase inhibitor 1.
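For readers unfamiliar with how such curves are computed, the following is a minimal sketch of the Kaplan-Meier product-limit estimator applied to two expression groups. The abstract mentions a sample splitting method but does not state the cutoff, so the median split here is an assumption; the function names and the toy data are hypothetical and do not come from TCGA.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample True (high) if its expression exceeds the median (assumed cutoff)."""
    cutoff = median(expression)
    return [x > cutoff for x in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct time with an observed death.

    times: follow-up durations; events: 1 if death was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every sample whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # deaths and censored samples both leave the risk set
    return curve


# Toy data: expression values, follow-up times (months) and event indicators.
expression = [5.1, 2.0, 7.3, 1.2, 6.8, 3.3]
times = [10, 30, 8, 40, 12, 25]
events = [1, 0, 1, 1, 1, 0]

is_high = split_by_median(expression)
for label, flag in (("high", True), ("low", False)):
    group = [(t, e) for t, e, h in zip(times, events, is_high) if h == flag]
    print(label, kaplan_meier([t for t, _ in group], [e for _, e in group]))
```

A comparison of the two curves (for example with a log-rank test) and the Cox regression described in the abstract are outside the scope of this sketch.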
Figure 5.
Correlations among the four hub genes (TIMP1, SLC4A4, AKR1B10 and ABCE1) evaluated using Spearman's correlation analysis based on Spearman's correlation coefficient (|R|<0.3) and P<0.05. ABCE1, ATP binding cassette subfamily E member 1; AKR1B10, aldo-keto reductase family 1 member B10; SLC4A4, solute carrier family 4 member 4; TIMP1, TIMP metallopeptidase inhibitor 1.
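As an illustration of the statistic used in this figure, Spearman's coefficient is the Pearson correlation of the rank-transformed values. The sketch below is a plain-Python version with average ranks for ties; the function names and the toy expression vectors are hypothetical, and the P-value computation is omitted.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy expression vectors for two hypothetical genes across five samples.
gene_1 = [2.1, 3.4, 1.0, 5.6, 4.2]
gene_2 = [0.9, 1.8, 0.5, 2.2, 2.9]
print(round(spearman_rho(gene_1, gene_2), 3))
```

The function assumes both inputs have the same length and that neither is constant, since a constant vector gives zero rank variance.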
