Front Genet. 2021 Jun 9:12:648800.
doi: 10.3389/fgene.2021.648800. eCollection 2021.

Whole Transcriptome Data Analysis Reveals Prognostic Signature Genes for Overall Survival Prediction in Diffuse Large B Cell Lymphoma

Mengmeng Pan et al. Front Genet.

Abstract

Background: Despite improved clinical treatment outcomes in diffuse large B cell lymphoma (DLBCL), the high relapse rate in DLBCL patients remains a barrier, because therapeutic strategy selection based on potential targets is still unsatisfactory. Further exploration of prognostic biomarkers is therefore urgently needed to improve the prognosis of DLBCL.

Methods: Univariable and multivariable Cox regression models were used to screen gene signatures for DLBCL overall survival (OS) prediction. Differential expression analysis, based on Student's t test and fold change, was used to identify representative genes in the high-risk and low-risk groups. Functional differences between the high-risk and low-risk groups were identified by gene set enrichment analysis.
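
The differential expression step can be illustrated with a minimal sketch. It assumes expression values are already on a log2 scale, so the log2 fold change is a difference in group means, and it uses the pooled-variance form of Student's t statistic. The function names and the expression values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def log2_fold_change(a, b):
    """For log2-scale expression, the log2 fold change is the difference in means."""
    return mean(a) - mean(b)


# Hypothetical log2 expression of one gene in each risk group (illustrative only).
high_risk = [5.1, 4.8, 5.3, 5.0]
low_risk = [7.2, 6.9, 7.4, 7.1]

t_stat = student_t(high_risk, low_risk)
lfc = log2_fold_change(high_risk, low_risk)
print(f"t = {t_stat:.2f}, log2FC = {lfc:.2f}")
```

A negative statistic and a negative log2 fold change here would indicate lower expression in the high-risk group. Converting the t statistic to a P value requires the t distribution, which is omitted from this sketch.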

Results: We systematically analyzed three NCBI Gene Expression Omnibus (GEO) datasets to screen candidate genes significantly associated with OS in DLBCL. To construct a prognostic model, five genes (CEBPA, CYP27A1, LST1, MREG, and TARP) were then selected and tested using the multivariable Cox model and stepwise regression. Kaplan-Meier curves confirmed the good predictive performance of this five-gene Cox model. The prognostic model and the expression levels of the five genes were then validated in an independent dataset. High expression levels of these five genes were significantly associated with favorable prognosis in DLBCL in both the training and validation datasets. Further analysis revealed the independent value and superiority of this prognostic model in risk prediction. Functional enrichment analysis identified key pathways responsible for unfavorable outcome and potential therapeutic targets in DLBCL.
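
How a five-gene Cox model turns expression values into risk groups can be sketched as follows. The sketch assumes the risk score is the model's linear predictor (the weighted sum of expression values) and that patients are split at the median score. The coefficients below are hypothetical placeholders, chosen to be negative only because high expression of the five genes was associated with favorable prognosis; the median cutoff and the patient data are likewise assumptions.

```python
from statistics import median

# Hypothetical Cox coefficients (NOT the fitted values from the study).
COEFS = {"CEBPA": -0.30, "CYP27A1": -0.25, "LST1": -0.20, "MREG": -0.15, "TARP": -0.10}


def risk_score(expr):
    """Linear predictor of a Cox model: sum of coefficient * expression."""
    return sum(COEFS[gene] * expr[gene] for gene in COEFS)


def stratify(patients):
    """Label each patient 'high' or 'low' risk using a median cutoff (an assumption)."""
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical patients with the same expression level for all five genes.
patients = {
    "P1": dict.fromkeys(COEFS, 8.0),
    "P2": dict.fromkeys(COEFS, 6.0),
    "P3": dict.fromkeys(COEFS, 4.0),
    "P4": dict.fromkeys(COEFS, 5.0),
}

groups = stratify(patients)
print(groups)
```

With negative coefficients, patients with lower expression of the five genes receive higher risk scores, which matches the direction of association reported above.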

Conclusion: We developed a five-gene Cox model for predicting the clinical outcome of DLBCL patients. Potential drug selection guided by this model may also help clinicians improve clinical practice for the benefit of patients.

Keywords: biomarkers; diffuse large B cell lymphoma; overall survival; prognosis; risk score.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Screening a five-gene Cox model in the public DLBCL datasets from Gene Expression Omnibus (GEO). (A) Venn diagram summarizing the overlap between the prognostic genes identified by univariable Cox regression analysis in three public DLBCL datasets with accession numbers of GSE32918 (n = 172), GSE4475 (n = 166) and GSE69051 (n = 172). (B) The forest plots represent the association of the five gene signatures with overall survival in the three public DLBCL datasets.
FIGURE 2
The performance of the five gene signatures in predicting the patients’ risk. K-M curves for the prognostic model in the training datasets (A) and the five validation datasets (B–F). The red and blue lines represent the high- and low-risk groups, respectively. The numbers within risk tables on the bottom represent the number of survivors at that time point.
FIGURE 3
The expression patterns of five prognostic gene signatures in the training and five validation sets. The expression patterns of the five prognostic genes in training (A) and validation (B–F) sets. The risk scores were estimated by the linear predictors of the Cox model. The samples were ordered by the risk scores.
FIGURE 4
The Cox model based on the five gene signatures was superior to other models. The performance of the four prognostic models in the validation datasets of TCGA (n = 43), GSE34171 (n = 68), GSE10846 (n = 414), GSE31312 (n = 470), and GSE11318 (n = 203) is displayed in panels (A–E). The log2-hazard ratios and 95% confidence intervals are denoted by the red boxes and lines.
FIGURE 5
The risk stratification based on the five prognostic genes is independent of clinical factors. (A) The risk scores in different IPI groups (left panel) and clinical stages (right panel). The boxes show the median and the interquartile range (IQR) of the risk scores grouped by the IPI scoring system and clinical stage in the validation dataset. There are no significant differences between those groups (P > 0.05). (B) Kaplan–Meier survival curves show the overall survival of samples grouped by combining the IPI scoring system and the five-gene-based risk stratification. ***P < 0.0001. (C–G) Differences in overall survival between the high-risk and low-risk groups within a specific subtype or chemotherapy regimen: (C) ABC subtype; (D) GCB subtype; (E) unclassified subtype; (F) DLBCL treated with a CHOP-like regimen; (G) DLBCL treated with an R-CHOP-like regimen.
FIGURE 6
The molecular characteristics and potential drugs for the two risk groups. (A) The top ten GO terms enriched by the upregulated genes in the high-risk and low-risk groups. The dot size and color represent the ratio of gene counts and statistical significance, respectively. (B) The probability density function of the Spearman's correlation between the five prognostic genes and the differentially expressed genes (DEGs). The colors represent the validation datasets. (C) The upregulated immune checkpoint proteins and the corresponding drugs in the low-risk group. (D) The upregulated cell cycle kinases and their potential drugs in the high-risk group.
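
The Kaplan-Meier curves and risk tables in Figure 2 rest on the product-limit estimator, which can be sketched as below. This is a minimal illustration and not the authors' code; the function name, the follow-up times, and the event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time per patient
    events : True if death was observed, False if the patient was censored
    Returns a list of (time, number_at_risk, survival) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e)
        removed = sum(1 for tt, _ in data if tt == t)
        if deaths:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, n_at_risk, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= removed
        i += removed
    return curve


# Hypothetical follow-up (months) for one risk group.
times = [5, 8, 8, 12, 20, 24]
events = [True, True, False, True, False, True]

curve = kaplan_meier(times, events)
for t, n, s in curve:
    print(f"t={t:>2}  at risk={n}  S(t)={s:.3f}")
```

Running the estimator separately on the high-risk and low-risk groups would give the two curves compared in each panel; the number at risk at each time corresponds to the entries of a risk table.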
