Noncoding RNA Res. 2020 Feb 24;5(2):48-59.
doi: 10.1016/j.ncrna.2020.02.004. eCollection 2020 Jun.

A bioinformatics approach to identify novel long, non-coding RNAs in breast cancer cell lines from an existing RNA-sequencing dataset

Oza Zaheed et al. Noncoding RNA Res.

Abstract

Breast cancer research has traditionally centred on genomic alterations, hormone receptor status and changes in cancer-related proteins to provide new avenues for targeted therapies. Due to advances in next generation sequencing technologies, long, non-coding RNAs (lncRNAs) have emerged as regulators of normal cellular events, with links to various disease states, including breast cancer. Here we describe our bioinformatic analyses of a previously published RNA sequencing (RNA-seq) dataset to identify lncRNAs with altered expression levels in a subset of breast cancer cell lines. From this dataset of 675 cancer cell lines, a subset of 18 cell lines was selected for our analyses that included 16 breast cancer lines, one ductal carcinoma in situ line and one normal-like breast epithelial cell line. Principal component analysis demonstrated correlation with well-established categorisation methods of breast cancer (i.e. luminal A/B, HER2 enriched and basal-like A/B). Through detailed comparison of differentially expressed lncRNAs in each breast cancer sub-type with normal-like breast epithelial cells, we identified 15 lncRNAs with consistently altered expression, including three uncharacterised lncRNAs. Utilising data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) project via Gene Expression Profiling Interactive Analysis (GEPIA2), we assessed clinical relevance of several identified lncRNAs with invasive breast cancer. Lastly, we determined the relative expression level of six lncRNAs across a spectrum of breast cancer cell lines to experimentally confirm the findings of our bioinformatic analyses. Overall, we show that the use of existing RNA-seq datasets, if re-analysed with modern bioinformatic tools, can provide a valuable resource to identify lncRNAs that could have important biological roles in oncogenesis and tumour progression.

Keywords: Bioinformatics; Breast cancer; Ductal carcinoma in situ; Long non-coding RNAs (lncRNAs); RNA sequencing (RNA-seq); Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR).

Figures

Fig. 1
Breast cancer cell lines distinguished by malignant versus non-malignant show differential expression of lncRNAs (A) Principal component analysis of selected breast cancer cell lines grouped by molecular classification, normal-like, DCIS, luminal A, luminal B, HER2 enriched, basal A and basal B. PC1 (x-axis) is representative of the non-malignant cell line (MCF10A); PC2 (y-axis) is representative of the 17 malignant cell lines. (B) Volcano plot (log2 FC > 10, p ≤ 0.01) to filter differentially expressed lncRNAs in malignant cell lines versus normal-like, MCF10A. (C) Heatmap of differentially expressed lncRNAs in malignant versus non-malignant cell lines. DSCAM-AS1 and LOC105372815 were the most highly expressed lncRNAs in many of the cell lines examined.
Fig. 2
Differential expression of lncRNAs in ER/PR positive, HER2 sensitive and triple-negative breast cancer cell lines (A) Volcano plot (log2 FC > 10, p ≤ 0.01 indicated as dashed lines) and (B) Corresponding heatmap of differentially expressed lncRNAs in ER/PR positive breast cancer cell lines versus normal-like, MCF10A, analysed across all 18 cell lines examined. Similar analyses were done for HER2 sensitive lines (C) Volcano plot (log2 FC > 10, p ≤ 0.01), (D) Corresponding heatmap of lncRNA expression across all cell lines; and triple-negative breast cancer cell lines (those lacking ER/PR/HER2) (E) Volcano plot (log2 FC > 10, p ≤ 0.01), (F) Corresponding heatmap of lncRNA expression across all cell lines.
Fig. 3
Differential expression of lncRNAs in DCIS, luminal A and luminal B breast cancer cell lines (A) Volcano plot (log2 FC > 10, p ≤ 0.01 indicated as dashed lines) and (B) Corresponding heatmap of differentially expressed lncRNAs in DCIS cell line, MCF10DCIS.com, versus normal-like, MCF10A, analysed across all 18 cell lines examined. Similar analyses were done for luminal A cell lines (BT-483, CAMA-1, KPL-1, MCF-7) (C) Volcano plot (log2 FC > 10, p ≤ 0.01), (D) Corresponding heatmap of lncRNA expression across all cell lines; and luminal B breast cancer cell lines (MDA-MB-330, UACC-812, ZR-75-30) (E) Volcano plot (log2 FC > 10, p ≤ 0.01), (F) Corresponding heatmap of lncRNA expression across all cell lines.
Fig. 4
Differential expression of lncRNAs in HER2 enriched, basal-like type A (basal A) and basal-like type B (basal B) (A) Volcano plot (log2 FC > 10, p ≤ 0.01 indicated as dashed lines) and (B) Corresponding heatmap of differentially expressed lncRNAs in HER2 positive breast cancer cell lines (MDA-MB-453, SK-BR-3, UACC-893) versus normal-like, non-malignant line, MCF10A, analysed across all 18 cell lines examined. Similar analyses were performed for basal A breast cancer lines (BT-20, MDA-MB-436, MFM-223) (C) Volcano plot (log2 FC > 10, p ≤ 0.01), (D) Corresponding heatmap of lncRNA expression across all cell lines; and for basal B breast cancer cell lines (CAL-120, MDA-MB-157, MDA-MB-231) (E) Volcano plot (log2 FC > 10, p ≤ 0.01), (F) Corresponding heatmap of lncRNA expression across all cell lines.
Fig. 5
Clinical relevance of select lncRNAs identified bioinformatically using Gene Expression Profiling Interactive Analysis (GEPIA2). Breast cancer survival analysis plots (Kaplan-Meier) were generated for lncRNAs (A) CELF2-AS1; (C) DSCAM-AS1; (E) ELFN1-AS1; (G) LINC00885 and (I) ZNF667-AS1 using GEPIA2 [37]. Corresponding box plots of the comparative expression of the same lncRNAs (B) CELF2-AS1; (D) DSCAM-AS1; (F) ELFN1-AS1; (H) LINC00885 and (J) ZNF667-AS1 in breast cancer tumour samples (red) versus normal tissue samples (grey) generated using GEPIA2.
Fig. 6
Experimental confirmation of differential expression of selected lncRNAs in a breast cancer cell line panel, representing each molecular subtype. qRT-PCR was performed using cDNA synthesized from total RNA isolated from MCF10A (normal-like), MCF10DCIS.com (DCIS), MCF7 (luminal A), ZR-75-30 (luminal B), SK-BR-3 (HER2 positive), and MDA-MB-231 (basal B) cells. Relative expression of lncRNAs (A) CCAT1; (B) DSCAM-AS1; (C) LINC00885; (D) LOC105372815; (E) MUC5B-AS1 and (F) ZNF667-AS1, as compared to GAPDH, are shown, using one-way ANOVA (GraphPad Prism v.8.3.0). **** p-value < 0.0001; *** p-value < 0.001; ** p-value < 0.01.
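
The volcano plots in Figs. 1-4 filter lncRNAs using the thresholds given in the captions (log2 FC > 10, p ≤ 0.01). The following is a minimal sketch of that kind of filter, not the authors' pipeline. The `GeneResult` type, the `filter_volcano` function, and the placeholder entries are hypothetical. Applying the fold-change cutoff to the absolute value, so that both up- and down-regulated lncRNAs are kept, is an assumption that the captions do not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneResult:
    """One differential-expression result (hypothetical container)."""
    name: str
    log2_fc: float   # log2 fold change versus the reference cell line
    p_value: float   # p-value from the differential-expression test


def filter_volcano(results: List[GeneResult],
                   fc_cutoff: float = 10.0,
                   p_cutoff: float = 0.01) -> List[GeneResult]:
    """Keep entries that pass both volcano-plot thresholds.

    Assumption: the fold-change cutoff is applied to |log2 FC|,
    so strongly down-regulated entries are retained as well.
    """
    return [
        r for r in results
        if abs(r.log2_fc) > fc_cutoff and r.p_value <= p_cutoff
    ]


# Placeholder values for illustration only; these are not data from the study.
example = [
    GeneResult("lncRNA_A", 12.5, 0.001),   # passes both thresholds
    GeneResult("lncRNA_B", -11.0, 0.005),  # passes (down-regulated)
    GeneResult("lncRNA_C", 4.0, 0.0001),   # fails the fold-change cutoff
    GeneResult("lncRNA_D", 15.0, 0.2),     # fails the p-value cutoff
]

selected = [r.name for r in filter_volcano(example)]
print(selected)
```

In a real analysis the p-values would typically be corrected for multiple testing before filtering; whether the study did so is not stated in the captions.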
