Front Genet. 2022 Jul 1:13:928175.
doi: 10.3389/fgene.2022.928175. eCollection 2022.

Prognosis Prediction Through an Integrated Analysis of Single-Cell and Bulk RNA-Sequencing Data in Triple-Negative Breast Cancer

Xiangru Wang et al. Front Genet. 2022.

Abstract

Background: Genomic and antigenic heterogeneity make it difficult to assess outcomes precisely in triple-negative breast cancer (TNBC) patients. This study was designed to identify key genes related to cell differentiation and tumor grade, and to use them to improve prognosis prediction in TNBC patients through an integrated analysis of single-cell and bulk RNA-sequencing (RNA-seq) data.

Methods: We collected RNA-seq and microarray data of TNBC from two public datasets. Using single-cell pseudotime analysis, differentially expressed genes (DEGs) among trajectories from 1534 cells of 6 TNBC patients were identified as potential genes crucial for cell differentiation. The grade- and tumor mutational burden (TMB)-related DEGs were then explored via a weighted correlation network analysis using the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset. We used these DEGs to construct a prognostic signature, which was validated in another independent dataset. Because gene set variation analysis indicated differences in immune-related pathways between risk groups, we also explored the immune differences between the two groups.

Results: A signature of 10 genes related to grade and TMB was developed to assess the outcomes of TNBC patients, and it separated patients with better and worse overall survival in both cohorts. The low-risk group generally harbored higher immune infiltration than the high-risk group.

Conclusion: Cell differentiation-, grade-, and TMB-related DEGs were identified using single-cell and bulk RNA-seq data. A 10-gene signature for prognosis prediction in TNBC patients was constructed and performed well in the training and validation cohorts. The signature was closely related to tumor immune infiltration, which might provide evidence for the crucial roles of immune cells in malignant initiation and progression in TNBC.

Keywords: Triple-negative breast cancer (TNBC); immune infiltration; prognosis; single cell; tumor mutational burden.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Flowchart of data processing and analyses.
FIGURE 2
Normalization, filtration, and dimension reduction of single-cell RNA-seq data. (A) The gene counts per cell (nFeature_RNA), number of unique molecular identifiers (UMIs) per cell (nCount_RNA), and percentage of mitochondrial genes per cell (percent.mt) of the single-cell RNA-seq data. (B) The top 2000 variable genes across cells are marked in red, and the 10 most highly variable genes are labeled. (C) The top two dimensions of all tumor cells. (D) An elbow plot of the standard deviation of each principal component (PC), used to guide PC selection.
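The three quality-control metrics in panel (A) can be illustrated on a toy count matrix. This is a minimal sketch in plain Python, not the authors' pipeline: the gene names, counts, the "MT-" prefix convention for mitochondrial genes, and the filtering thresholds are all assumptions chosen for illustration.

```python
# Toy count matrix: keys are genes, each list holds UMI counts for three cells.
# All gene names and values are made up for illustration.
counts = {
    "MT-CO1": [5, 0, 30],
    "MT-ND1": [5, 0, 10],
    "KRT18": [40, 3, 10],
    "EPCAM": [50, 0, 0],
    "CD3E": [0, 2, 0],
}


def qc_metrics(counts):
    """Return per-cell nFeature_RNA, nCount_RNA, and percent.mt."""
    n_cells = len(next(iter(counts.values())))
    metrics = []
    for j in range(n_cells):
        n_count = sum(row[j] for row in counts.values())
        n_feature = sum(1 for row in counts.values() if row[j] > 0)
        # Assumption: mitochondrial genes are recognized by an "MT-" prefix.
        mt = sum(row[j] for gene, row in counts.items() if gene.startswith("MT-"))
        percent_mt = 100.0 * mt / n_count if n_count else 0.0
        metrics.append(
            {"nFeature_RNA": n_feature, "nCount_RNA": n_count, "percent.mt": percent_mt}
        )
    return metrics


def passes_qc(m, min_features=3, max_percent_mt=20.0):
    """Hypothetical thresholds; the source does not state the cutoffs used."""
    return m["nFeature_RNA"] >= min_features and m["percent.mt"] <= max_percent_mt


metrics = qc_metrics(counts)
kept = [j for j, m in enumerate(metrics) if passes_qc(m)]
print(metrics)
print("cells kept:", kept)
```

In this toy example only the first cell is kept: the second detects too few genes and the third has a high mitochondrial fraction.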
FIGURE 3
Cell clustering, cell-type annotation, and pseudotime analysis. (A) Fourteen different cell clusters were identified by performing t-distributed stochastic neighbor embedding (t-SNE). (B) Cell types were further annotated and labeled using cell markers. (C) All cells were ordered along trajectories to construct a pseudotime axis. Different colors represent different states. (D) Cells colored by pseudotime; the deeper the color, the earlier the cell lies along the progression.
FIGURE 4
Identification of genes related to the TNBC grade and tumor mutational burden (TMB) through weighted gene coexpression network analysis (WGCNA). (A) All samples were clustered and displayed. We selected 35 as the cutHeight. (B) A gene dendrogram was generated, and genes were clustered into modules marked by different colors. (C) The correlations between gene modules and grade and TMB were calculated using Pearson’s correlation and shown in a heatmap. The blue and turquoise gene modules, containing 831 genes, were found to be significantly related to grade and TMB (p < 0.05).
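The module-trait heatmap in panel (C) rests on Pearson's correlation between a per-sample module summary and a clinical trait. The sketch below shows that calculation in plain Python. The module summary values and grades are invented, and the function is a generic Pearson coefficient rather than a call into the WGCNA package.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-sample summary of one gene module and the tumor grade
# of the same four samples.
module_summary = [0.10, 0.40, 0.35, 0.80]
grade = [1, 2, 2, 3]

r = pearson(module_summary, grade)
print(f"module-grade correlation: {r:.3f}")
```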
FIGURE 5
Identification of a prognostic signature in the training cohort. (A,B) Through LASSO regression, a signature including 10 genes was developed based on the optimal λ.
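A LASSO-derived signature is typically applied as a weighted sum of expression values, with patients then split into risk groups. The following sketch assumes that form. The gene names and coefficients are placeholders (the 10 genes and their weights are not listed here), and the median cutoff is an assumption, since the source does not state how the groups were separated.

```python
from statistics import median

# Placeholder coefficients for a three-gene toy signature (hypothetical values).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}

# Hypothetical normalized expression values per patient.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 2.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 8.0, "GENE_C": 0.0},
    "P4": {"GENE_A": 6.0, "GENE_B": 4.0, "GENE_C": 0.0},
}


def risk_score(expression, coefs):
    """Weighted sum of expression values over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}

# Assumption: patients above the cohort median are labeled high-risk.
cutoff = median(scores.values())
groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

print(scores)
print("cutoff:", cutoff)
print(groups)
```

A fixed cutoff learned in the training cohort and reused in validation cohorts is a common alternative to a per-cohort median.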
FIGURE 6
Validation of the risk signature. (A–D) KM curves displayed a significantly better OS of the low-risk group in the training cohort (A), the test cohort (B), the whole METABRIC cohort (C), and the GEO cohort (D).
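The KM curves compared here are Kaplan-Meier estimates of overall survival. A minimal sketch of the estimator in plain Python follows. The follow-up times and event flags are invented, and a real analysis would compute one curve per risk group and compare them with a log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  : follow-up time per patient
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up (years) for five patients in one risk group.
times = [2, 4, 4, 6, 8]
events = [1, 1, 0, 1, 0]

for t, s in kaplan_meier(times, events):
    print(f"t={t}: S(t)={s:.2f}")
```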
FIGURE 7
Assessment of the performance of the risk signature. (A) The area under the ROC curve (AUC) values were 0.760, 0.766, and 0.744 for the 3-, 5-, and 8-year survivals, respectively, in the training cohort. (B) AUC values were 0.661, 0.625, and 0.650 for the 3-, 5-, and 8-year survivals, respectively, in the test cohort. (C) AUC values were 0.736, 0.729, and 0.716 for the 3-, 5-, and 8-year survivals, respectively, in the whole METABRIC cohort. (D) AUC values were 0.663, 0.651, and 0.665 for the 3-, 5-, and 8-year survivals, respectively, in the GEO cohort.
FIGURE 8
Nomogram construction and validation. Univariate (A) and multivariate (B) Cox regression analysis indicated that age, stage, and risk score were independent prognostic factors in the METABRIC cohort. (C) Age, stage, and risk score were included to construct a nomogram. The points for age, stage, and risk score were read from the nomogram, and the total points could facilitate the prediction of prognosis. (D) A calibration curve indicated good agreement between the actual observed OS and the predicted OS. (E) The efficacy of the nomogram was also assessed using ROC curves; the AUCs were 0.764, 0.778, and 0.744 for the 3-, 5-, and 8-year survivals, respectively. (F) The nomogram had a higher C-index than the other clinical traits in prognosis prediction.
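The C-index in panel (F) measures how often a model ranks patients correctly by outcome. The sketch below implements Harrell's concordance index in plain Python under simplifying assumptions: pairs with tied survival times are skipped, tied risk scores count as half, and the data are invented.

```python
def concordance_index(times, events, risk):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter follow-up time than patient j. The pair is concordant
    when the patient with the shorter survival has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (years), event flags, and model risk scores.
times = [2, 4, 6]
events = [1, 0, 1]
risk = [0.9, 0.1, 0.5]

print("C-index:", concordance_index(times, events, risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the C-index is a convenient single number for comparing a nomogram against individual clinical traits.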
FIGURE 9
Gene expression in particular cell types. The expression of the 10 signature genes in the cellular landscape.
FIGURE 10
Gene set variation analysis (GSVA) differences between the risk groups. The GSVA differences indicated that the high-risk group had lower activity of immune-related pathways.
FIGURE 11
Immune infiltration differences between different groups. Nearly all immune cells (A) and activities (B) were higher in the low-risk group. (C) The correlations between the risk score and genes in the signature and immune infiltration (* p < 0.05, ** p < 0.01, *** p < 0.001).
FIGURE 12
Correlations between the risk score and genes in the signature and immune checkpoint genes.
