Cancer Res. 2023 Oct 13;83(20):3462-3477.
doi: 10.1158/0008-5472.CAN-22-3186.

Integrative Genomic Analyses Identify LncRNA Regulatory Networks across Pediatric Leukemias and Solid Tumors

Apexa Modi et al. Cancer Res. 2023.

Abstract

Long noncoding RNAs (lncRNA) play an important role in gene regulation and contribute to tumorigenesis. While pan-cancer studies of lncRNA expression have been performed for adult malignancies, the lncRNA landscape across pediatric cancers remains largely uncharted. Here, we curated RNA sequencing data for 1,044 pediatric leukemia and extracranial solid tumors and integrated paired tumor whole genome sequencing and epigenetic data in relevant cell line models to explore lncRNA expression, regulation, and association with cancer. A total of 2,657 lncRNAs were robustly expressed across six pediatric cancers, including 1,142 exhibiting histotype-elevated expression. DNA copy number alterations contributed to lncRNA dysregulation at a proportion comparable to protein coding genes. Application of a multidimensional framework to identify and prioritize lncRNAs impacting gene networks revealed that lncRNAs dysregulated in pediatric cancer are associated with proliferation, metabolism, and DNA damage hallmarks. Analysis of upstream regulation via cell type-specific transcription factors further implicated distinct histotype-elevated and developmental lncRNAs. Integration of these analyses prioritized lncRNAs for experimental validation, and silencing of TBX2-AS1, the top-prioritized neuroblastoma-specific lncRNA, resulted in significant growth inhibition of neuroblastoma cells, confirming the computational predictions. Taken together, these data provide a comprehensive characterization of lncRNA regulation and function in pediatric cancers and pave the way for future mechanistic studies.

Significance: Comprehensive characterization of lncRNAs in pediatric cancer leads to the identification of highly expressed lncRNAs across childhood cancers, annotation of lncRNAs showing histotype-specific elevated expression, and prediction of lncRNA gene regulatory networks.

Conflict of interest statement

The authors declare no potential conflicts of interest.

Figures

Figure 1: Pan-pediatric cancer transcriptome characterization
(A) Overview of the pan-pediatric cancer RNA-seq dataset and schematic of data processing and filtering. Reads from RNA-seq FASTQ files were aligned using the STAR algorithm, and gene transcripts were then mapped in a guided de novo manner and quantified via the StringTie algorithm. Genes were considered novel if they did not have transcript exon structures matching genes in the GENCODE v19 or RefSeq v74 databases. Novel genes were assigned as lncRNAs based on length >200bp and non-coding potential calculated using the PLEK algorithm. Transcripts with low expression (FPKM <1 in >80% of samples) were not considered for further analysis. (B) Pie graph showing the quantity of robustly expressed protein coding genes, GENCODE/RefSeq annotated lncRNAs, and novel lncRNAs. The number of genes expressed per cancer is also shown. The adjoining schematic gives an overview of additional data types that were integrated with transcriptome data: WGS, ChIP-seq, and chromatin capture. Listed are the analyses used to elucidate lncRNAs with functional roles in pediatric cancer. (C) Cumulative expression plots comparing the number of lncRNAs and (D) protein coding genes, respectively, that constitute the total sum of gene expression (FPKM) per pediatric cancer. (E) Percentage of total lncRNA expression (FPKM) accounted for by the union of the top five expressed lncRNAs per cancer (total 11 lncRNAs).
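The low-expression filter in panel (A) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the dictionary layout (transcript ID mapped to a list of per-sample FPKM values), and the parameter names are assumptions made for illustration. Only the thresholds (FPKM <1 in >80% of samples) come from the caption.

```python
def is_robustly_expressed(fpkm_values, min_fpkm=1.0, max_low_fraction=0.8):
    """Return False when FPKM < min_fpkm in more than max_low_fraction of samples.

    Thresholds follow the caption; everything else is a hypothetical layout.
    """
    if not fpkm_values:
        return False
    low = sum(1 for value in fpkm_values if value < min_fpkm)
    return low / len(fpkm_values) <= max_low_fraction


def filter_transcripts(expression):
    """Keep transcripts that pass the filter.

    `expression` maps a transcript ID to its per-sample FPKM values
    (an assumed input format, not one described in the source).
    """
    return {
        transcript_id: values
        for transcript_id, values in expression.items()
        if is_robustly_expressed(values)
    }
```

For example, a transcript with FPKM below 1 in nine of ten samples would be dropped, while one below 1 in seven of ten samples would be kept.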
Figure 2: LncRNAs exhibit tissue specific expression that can distinguish cancers
(A) Tissue specificity index (tau score), which ranges from 0 (ubiquitously expressed) to 1 (tissue specific), plotted for genes across three gene types: protein coding genes, lncRNAs, and novel lncRNAs. Table shows the tau score range and mean per gene type. (B) Heatmap showing the hierarchically clustered gene expression for the top five most tissue specific lncRNAs per cancer, ranked by highest tau score. Samples from each cancer cluster together based on expression of these genes alone. (C) Number of tissue specific known and novel lncRNAs in each cancer, as defined by the tissue specific gene threshold: tau score > 0.8.
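As a rough illustration of the tau score, the sketch below uses the commonly cited definition, tau = sum(1 - x_i / max(x)) / (n - 1), over one expression value per tissue. The caption does not say how expression was summarized per cancer or whether values were log-transformed, so the input format and function names here are assumptions. Only the 0 to 1 range and the > 0.8 threshold come from the caption.

```python
def tau_score(expression_by_tissue):
    """Tissue specificity index: 0 = ubiquitous, 1 = specific to one tissue.

    `expression_by_tissue` holds one non-negative summary value per tissue
    (for instance a mean per cancer type; that choice is an assumption).
    """
    n = len(expression_by_tissue)
    if n < 2:
        raise ValueError("tau needs at least two tissues")
    peak = max(expression_by_tissue)
    if peak <= 0:
        return 0.0  # gene not expressed anywhere; treated as non-specific here
    return sum(1 - value / peak for value in expression_by_tissue) / (n - 1)


def is_tissue_specific(expression_by_tissue, threshold=0.8):
    """Apply the caption's tissue specific threshold (tau score > 0.8)."""
    return tau_score(expression_by_tissue) > threshold
```

A gene expressed equally in all six cancers scores 0, and a gene expressed in only one scores 1.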
Figure 3: A similar proportion of lncRNAs and protein coding genes are dysregulated due to SCNA
(A) The proportion of protein coding and lncRNA genes that have significant differential expression due to SCNA, separated by copy number type (amplification or deletion). The number of genes found in SCNA loci is shown per cancer. Genes were evaluated for differential expression due to copy number using the Wilcoxon rank sum test (p-value < 0.05 and log |fold change| > 1.5), comparing samples with no SCNA to samples with low/high SCNA as defined by GISTIC scores. (B) The number of differentially expressed lncRNAs per chromosome and per cancer, distinguished by color. Chromosomes 1 and 17 had the most dysregulated lncRNAs, which is associated with the greater frequency of SCNA on these chromosomes across cancers. (C) Number of samples with structural variant breakpoints in or near (+/− 2.5kb) lncRNAs that are also located in copy number regions, stratified by amplification or deletion status of the locus.
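The comparison in panel (A) can be sketched in plain Python as a two-sided Wilcoxon rank sum test plus a fold-change cutoff. This is a simplified illustration rather than the authors' code. It uses the normal approximation without tie or continuity correction, and it assumes a log2 ratio of group means with a pseudocount of 1, since the caption does not state the log base or how fold change was computed. All function and parameter names are hypothetical.

```python
import math


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank sum test (normal approximation, no tie correction)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups need at least one sample")
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share the mean rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w = sum(rank for rank, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n_a * (n_a + n_b + 1) / 2
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (w - mean_w) / sd_w
    return z, math.erfc(abs(z) / math.sqrt(2))


def is_scna_dysregulated(no_scna, with_scna, alpha=0.05, min_abs_log_fc=1.5):
    """Apply the caption's two cutoffs; log2 and the pseudocount are assumptions."""
    _, p_value = rank_sum_test(no_scna, with_scna)
    mean_ref = sum(no_scna) / len(no_scna)
    mean_alt = sum(with_scna) / len(with_scna)
    log_fc = math.log2((mean_alt + 1) / (mean_ref + 1))
    return p_value < alpha and abs(log_fc) > min_abs_log_fc
```

In practice a statistics library would normally be used for the test; the hand-written version is shown only to make the ranking step explicit.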
Figure 4: lncRNA modulators impact transcriptional networks involving proliferation
(A) Schematic that shows the three ways (attenuate, enhance, or invert) in which differentially expressed lncRNA modulators can impact transcription factor and target gene relationships. lncRNA modulators are associated with a TF-target gene pair based on a significant difference between TF-target gene expression correlation in samples with low lncRNA expression (lowest quartile) vs samples with high lncRNA expression (highest quartile). (B) The proportion of lncRNA modulator types associated with significantly dysregulated lncRNA modulator-TF-target gene (lncMod) triplets. The number of significantly dysregulated lncMod triplets is listed per cancer. (C) Number of lncRNA modulator genes that are common in lncMod triplets across cancers. Common lncRNA modulator genes tend to have a lower tau score compared to lncRNA modulators associated with only one cancer. (D) Gene set enrichment, using the MSigDB Hallmark gene set, of target genes associated with lncRNA modulators in each cancer (Fisher’s exact test, FDR < 0.1). (E) Transcription factors associated with the B-ALL expression specific lncRNA, BLACE, ranked based on number of regulated target genes. (F) Expression heatmap of BLACE and the target genes of the XBP1 transcription factor, grouped by associated hallmark gene set, in samples within the bottom and top quartiles of BLACE expression in B-ALL.
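The quartile comparison described in panel (A) can be sketched as follows. This is an illustrative reading of the caption, not the lncMod implementation. The function names are hypothetical, the significance test on the correlation difference is omitted because the caption does not specify it, and the rule used to label a triplet as attenuate, enhance, or invert is an assumption based on the three categories named in the schematic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def quartile_correlations(lncrna, tf, target):
    """TF-target correlation in the lowest and highest lncRNA expression quartiles."""
    order = sorted(range(len(lncrna)), key=lambda i: lncrna[i])
    quartile_size = len(order) // 4
    if quartile_size < 3:
        raise ValueError("too few samples per quartile")
    low, high = order[:quartile_size], order[-quartile_size:]
    r_low = pearson([tf[i] for i in low], [target[i] for i in low])
    r_high = pearson([tf[i] for i in high], [target[i] for i in high])
    return r_low, r_high


def modulation_type(r_low, r_high):
    """Assumed labeling rule: a sign flip is 'invert', a stronger correlation at
    high lncRNA expression is 'enhance', and anything else is 'attenuate'."""
    if r_low * r_high < 0:
        return "invert"
    return "enhance" if abs(r_high) > abs(r_low) else "attenuate"
```

A full analysis would also test whether the difference between the two correlations is statistically significant before calling a triplet dysregulated.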
Figure 5: Identification of lncRNAs associated with distinct neuroblastoma cell states
(A) The MES and ADRN signature score for TARGET NBL samples, with each sample labeled with either ADRN, Mixed, or MES phenotype based on clustering analysis. (B) Heatmap of the expression of lncRNAs that have significant correlation with either the MES or ADRN score (|r| > 0.6, p-value < 0.01). lncRNAs were correlated with protein coding genes on the same chromosome and subsequent gene set enrichment analysis was performed for MES and ADRN protein coding genes separately. (C) Schematic of how ADRN associated CRC regulated genes are identified using ChIP-seq and chromatin interaction data. We identified lncRNAs based on three types of regulation. 1) CRC transcription factors bind directly at the promoter of the lncRNA. (D) 2) CRC TFs bind an enhancer region that interacts with a lncRNA promoter. (E) 3) CRC TFs bind the promoter of a different gene, and this promoter interacts with a lncRNA promoter. CRC TF binding was identified from ChIP-seq data, while enhancer-promoter and promoter-promoter interactions were identified from chromatin capture data. (F) Filtering of lncRNAs expressed in NBL based on CRC TF regulation and differential expression based on sample phenotypes (ADRN or MES). (G) Expression of TBX2 and TBX2-AS1 stratified by NBL sample phenotype (ADRN or MES). (H) ChIP-seq tracks for histone marks and CRC transcription factors in the NBL cell line BE(2)C, and promoter capture C chromatin interactions in the NBL cell line NB1643, at the TBX2/TBX2-AS1 locus.
Figure 6: TBX2-AS1 influences NBL cell proliferation and E2F1-target gene expression
(A) Expression of TBX2 and TBX2-AS1 in NBL tumor samples with and without 17q gain. (B) The top MSigDB Hallmarks enriched across target genes (p-value < 0.01) regulated by TBX2-AS1 as predicted from lncMod analysis. (C) The transcription factors with the most target genes regulated by TBX2-AS1 as predicted from lncMod analysis. (D) Expression of gene targets of the E2F1 transcription factor that are enriched for proliferation hallmarks, in samples with low and high TBX2 and TBX2-AS1 expression. TBX2 expression is highly correlated with that of TBX2-AS1 (Pearson’s r=0.77). (E) Expression correlation between E2F1 and its lncMod predicted target genes (n=36) in TARGET NBL Stage 4 non-MYCN-amplified samples in the lowest versus highest quartile (25%) of TBX2-AS1 expression. (F) siRNA knockdown efficiency of TBX2-AS1 and TBX2 in the NBL cell line NLF. (G) Western blot analysis of TBX2 in the siTBX2 and siTBX2-AS1 treated NLF cell line (representative blot shown). (H) Quantification of TBX2 protein expression from three Western blots of independent knockdown experiments. (I) Representative image of cell growth (as measured by RT-CES assay) of the NBL cell line NLF. Cell index is normalized to the time point when siRNA reagent is added, at 24 hours post cell plating. (J) Images of NLF cells after siTBX2-AS1 and siTBX2 treatment show morphology changes. (K) Results from iRegulon analysis for genes that are up- or down-regulated upon siTBX2-AS1 treatment in NLF. The Venn diagram shows the number of genes with evidence of motif or ChIP-seq binding of the listed transcription factors. (L) Expression correlation between E2F1 and its lncMod predicted target genes (n=36) identified using RNA-sequencing expression profiling from the NLF cell line treated with either siNTC or siTBX2-AS1. (M) Expression correlation between E2F1 and its lncMod predicted target genes (n=36) identified using RNA-sequencing expression profiling from the NLF cell line treated with either siNTC or siTBX2.
