PeerJ. 2021 Feb 9:9:e10556.
doi: 10.7717/peerj.10556. eCollection 2021.

A risk score model with five long non-coding RNAs for predicting prognosis in gastric cancer: an integrated analysis combining TCGA and GEO datasets

Yiguo Wu et al. PeerJ. 2021.

Abstract

Background: Gastric cancer (GC) is one of the most common carcinomas of the digestive tract, and the prognosis for these patients may be poor. There is evidence that some long non-coding RNAs (lncRNAs) can predict the prognosis of patients with GC. However, few lncRNA signatures have been used to predict prognosis. Herein, we aimed to construct a risk score model based on the expression of five lncRNAs to predict the prognosis of patients with GC and to provide new potential therapeutic targets.

Methods: We performed differential expression and survival analyses to identify differentially expressed, survival-related lncRNAs using GC patient expression profile data from The Cancer Genome Atlas (TCGA) database. We then established a formula including five lncRNAs to predict the prognosis of patients with GC. In addition, to verify the prognostic value of this risk score model, two independent Gene Expression Omnibus (GEO) datasets, GSE62254 (N = 300) and GSE15459 (N = 200), were employed as validation groups.
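
As an illustration of how a five-lncRNA formula of this kind is typically applied, the sketch below computes a risk score as a weighted sum of expression values and splits patients at the median score. It makes several assumptions that the abstract does not confirm: the linear form of the score, the median cutoff, and every coefficient value (the numbers are placeholders, not the fitted values). The function names `risk_score` and `split_by_median` are hypothetical. Only the five lncRNA names come from the article's keywords.

```python
from statistics import median

# Placeholder coefficients, for illustration only. The abstract does not
# report the fitted values; these numbers are NOT from the article.
COEFFICIENTS = {
    "LINC00106": 0.10,
    "LINC00205": -0.20,
    "MIR100HG": 0.30,
    "OVAAL": 0.15,
    "TRHDE-AS1": 0.25,
}


def risk_score(expression):
    """Weighted sum of the five lncRNA expression values for one patient.

    Assumes a linear formula (coefficient * expression, summed), which is a
    common choice for such signatures but is not stated in the abstract.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median score.

    The median cutoff is an assumption; scores equal to the median are
    labeled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


if __name__ == "__main__":
    # Toy expression values for two made-up patients.
    patients = {
        "patient_a": {g: 1.0 for g in COEFFICIENTS},
        "patient_b": {g: 2.0 for g in COEFFICIENTS},
    }
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    print(split_by_median(scores))
```

With real data, the coefficients would come from the fitted model, and the same cutoff rule would be applied separately to the training and validation groups.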

Results: Based on the characteristics of the five lncRNAs, patients with GC were divided into high-risk and low-risk subgroups. The prognostic value of the risk score model with five lncRNAs was confirmed in both TCGA and the two independent GEO datasets. Furthermore, stratification analysis showed that this model had independent prognostic value in patients with stage II-IV GC. We constructed a nomogram model combining clinical factors and the five lncRNAs to increase the accuracy of prognostic prediction. Enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) suggested that the five lncRNAs are associated with multiple pathways related to cancer occurrence and progression.

Conclusion: The risk score model including five lncRNAs can predict the prognosis of patients with GC, especially those with stage II-IV disease, and may provide potential therapeutic targets in the future.

Keywords: Gastric cancer; LINC00106; LINC00205; Long non-coding RNA; MIR100HG; OVAAL; Prognosis; TRHDE-AS1.

Conflict of interest statement

The authors declare there are no competing interests.

Figures

Figure 1. Expression of the five lncRNAs, overall survival, and disease-free survival in gastric cancer patients in the TCGA dataset.
(A) Volcano plot in which blue dots mark the five lncRNAs whose expression differs significantly between tumor and normal tissue, based on the criteria of an absolute log2 fold change (FC) > 1 and adjusted P < 0.05. (B) Heatmap of the five-lncRNA expression profile of the 414 patients in the TCGA dataset. MIR100HG and TRHDE-AS1 show similar expression patterns across these patients, as do the other three lncRNAs with one another. (C–D) Survival curves based on the OS and DFS of the 408 patients in the TCGA dataset.
Figure 2. The prognostic value of the lncRNA-based risk model in the training group.
(A–B) Kaplan–Meier analysis of patients’ OS and DFS in the high-risk (n = 204) and low-risk (n = 204) subgroups of the training group. (C) Scatter plot of the lncRNA-based risk model distribution for patient survival status. (D) The percentage of patient survival status in the high-risk and low-risk subgroups of the training group. (E) The lncRNA-based risk model distribution for patient recurrence. (F) The percentage of patient recurrence in the high-risk and low-risk subgroups of the training group. (G–H) Time-dependent ROC analysis of the risk score for predicting OS at the 4-year cutoff and DFS at the 2-year cutoff in the training group. The area under the curve was calculated for each ROC curve. *** P < 0.001.
Figure 3. The prognostic value of the lncRNA-based risk model in two independent GEO validation groups.
(A–B) Kaplan–Meier analysis of the OS of GC patients in the high-risk and low-risk subgroups of the two independent validation groups (GSE62254 and GSE15459). (C–D) Scatter plot of the five-lncRNA-based risk score distribution for patient survival status in the two independent validation groups. (E–F) Time-dependent ROC analysis of the risk score for predicting OS at the 4-year cutoff in the two independent validation groups. The area under the curve was calculated for each ROC curve.
Figure 4. The prognostic value of lncRNA-based risk model in subgroups according to the TNM stage.
(A–D) Kaplan–Meier analysis of the OS of GC patients with stage I, II, III and IV, respectively.
Figure 5. Forest plot to evaluate prognostic value of lncRNA-based risk model in subgroups divided by clinical factors.
Figure 6. The prognostic value of a nomogram model combining the five-lncRNA signature with clinical factors.
(A) A nomogram model combining the five-lncRNA signature with clinical factors for predicting the 4-year OS of GC patients. (B) The nomogram calibration curve used to evaluate the prediction of 4-year OS of GC patients. The C-index of this model was also calculated.
Figure 7. Potential functions of the five lncRNAs.
(A) The Pearson correlation coefficient between 19,605 protein-coding genes and the five lncRNAs in the TCGA dataset. (B) Functional enrichment bubble map of pathways from KEGG pathway analysis. Bubble size represents the number of genes enriched in the pathway.
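
The Pearson correlation named in Figure 7A can be sketched as follows. This is a minimal, standard-library illustration of the coefficient itself, not the authors' pipeline: the function name `pearson` and the toy expression vectors are hypothetical, and the article's actual software and any correlation threshold are not given in this section.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (numerator) and the two deviation norms.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Toy expression values across four made-up samples.
    lncrna = [1.0, 2.0, 3.0, 4.0]
    coding_gene = [2.1, 3.9, 6.2, 7.8]
    print(round(pearson(lncrna, coding_gene), 3))
```

In an analysis like the one in panel A, a coefficient of this kind would be computed for each lncRNA against each protein-coding gene across patients.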
