BMC Genomics. 2019 Feb 15;20(1):137.
doi: 10.1186/s12864-019-5497-4.

PLAIDOH: a novel method for functional prediction of long non-coding RNAs identifies cancer-specific LncRNA activities

Sarah C Pyfrom et al. BMC Genomics.

Abstract

Background: Long non-coding RNAs (lncRNAs) exhibit remarkable cell-type specificity and disease association. The functional versatility of lncRNAs includes epigenetic modification, nuclear domain organization, transcriptional control, regulation of RNA splicing and translation, and modulation of protein activity. However, most lncRNAs remain uncharacterized due to a shortage of predictive tools available to guide functional experiments.

Results: To address this gap for lymphoma-associated lncRNAs identified in our studies, we developed a new computational method, Predicting LncRNA Activity through Integrative Data-driven 'Omics and Heuristics (PLAIDOH), which has several features not found in other methods. PLAIDOH integrates transcriptome, subcellular localization, enhancer landscape, genome architecture, chromatin interaction, and RNA-binding (eCLIP) data and generates statistically defined output scores. PLAIDOH's approach identifies and ranks functional connections between individual lncRNA, coding gene, and protein pairs using enhancer, transcript cis-regulatory, and RNA-binding protein interactome scores that predict the relative likelihood of these different lncRNA functions. When applied to 'omics datasets that we collected from lymphoma patients, or to publicly available cancer (TCGA) or ENCODE datasets, PLAIDOH identified and prioritized well-known lncRNA-target gene regulatory pairs (e.g., HOTAIR and HOX genes, PVT1 and MYC), validated hits in multiple lncRNA-targeted CRISPR screens, and lncRNA-protein binding partners (e.g., NEAT1 and NONO). Importantly, PLAIDOH also identified novel putative functional interactions, including one lymphoma-associated lncRNA based on analysis of data from our human lymphoma study. We validated PLAIDOH's predictions for this lncRNA using knock-down and knock-out experiments in lymphoma cell models.

Conclusions: We developed a new method for predicting and ranking functional connections between individual lncRNA, coding gene, and protein pairs, and validated its predictions by genetic experiments and comparison to published CRISPR screens. PLAIDOH expedites validation and follow-on mechanistic studies of lncRNAs in any biological system. It is available at https://github.com/sarahpyfrom/PLAIDOH.

Keywords: Epigenetics; Interactome; Long non-coding RNA; Lymphoma; RNA-binding protein; Transcriptional control; cis-regulation; lncRNA.
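
The abstract describes ranking lncRNA/coding gene pairs (LCPs), and the figure legends refer to genes within 400 kb regions flanking each lncRNA. As a rough illustration only, the following Python sketch enumerates candidate pairs within such a window. The `Feature` class, the `find_lcps` function, the feature names, and all coordinates are hypothetical and are not taken from PLAIDOH's code or input format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic interval; all fields here are illustrative assumptions."""
    name: str
    chrom: str
    start: int
    end: int


def gap(a: Feature, b: Feature) -> int:
    """Distance in bp between two intervals (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def find_lcps(lncrnas, coding_genes, window=400_000):
    """Pair each lncRNA with every coding gene on the same chromosome
    whose distance from the lncRNA is at most `window` bp."""
    pairs = []
    for lnc in lncrnas:
        for gene in coding_genes:
            if lnc.chrom == gene.chrom and gap(lnc, gene) <= window:
                pairs.append((lnc.name, gene.name, gap(lnc, gene)))
    return pairs


# Made-up coordinates for demonstration.
lncrnas = [Feature("lncRNA1", "chr1", 1_000_000, 1_010_000)]
genes = [
    Feature("geneA", "chr1", 1_200_000, 1_250_000),  # 190 kb away
    Feature("geneB", "chr1", 2_000_000, 2_050_000),  # outside the window
    Feature("geneC", "chr2", 1_000_000, 1_050_000),  # different chromosome
]
print(find_lcps(lncrnas, genes))
```

The nested loop is quadratic in the number of features; for genome-scale inputs an interval index or a sorted sweep would be the usual choice, but the pairing rule itself stays the same.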

Conflict of interest statement

Ethics approval and consent to participate

This study was approved annually by the Washington University School of Medicine Institutional Review Board (IRB ID 201104208). It was performed in accordance with the Declaration of Helsinki. Informed written consent was obtained from participants after the nature and possible consequences of the studies were explained. Specimens were de-identified and all identifying information was secured to protect subjects from risks associated with participating in genomic studies.

Consent for publication

Not applicable.

Datasets used are publicly available and sources are listed in the Data Sources above and in PLAIDOH documentation (www.github.com/sarahpyfrom/PLAIDOH). Additional post-process data tables used to generate figures for the manuscript are available from the corresponding author on reasonable request.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Hundreds of lncRNAs are dysregulated in NHL compared to normal B cells. a Schematic depicts collection, flow cytometry purification, and ‘omics profiling of malignant and normal B lymphocytes from NHL patients and healthy volunteers [21]. b Diagram of NHL lncRNA discovery pipeline. RNA-seq data was analyzed using a de novo processing pipeline to enable identification of novel transcripts (Cufflinks). Novel RNA transcripts were merged with annotated transcripts (Cuffmerge). c Volcano plot highlights lncRNA transcripts with significantly different expression in NHL tumor samples compared to normal B cells (red). Relative expression of lncRNA transcripts shown in log2 fold change expression (FPKM) versus –log10 adjusted p value (FDR, Benjamini-Hochberg) for NHL:normal B cells. d Data as in c, with different types of lncRNAs highlighted in different colors (red: annotated lncRNAs, blue: intergenic lincRNAs, green: novel (not annotated) lncRNAs)
Fig. 2
An overview of the PLAIDOH pipeline and algorithm output. a Schematic of the single, input file required by PLAIDOH to identify all possible lncRNA/Coding gene Pairs (LCPs) in the user’s dataset. b Overview of the datasets that are used by PLAIDOH to annotate lncRNAs and predict activity based on genomic and epigenomic context. c Abridged example of the primary PLAIDOH output table, showing the three scores PLAIDOH calculates for each LCP as well as the 30+ additional columns of valuable information about the lncRNA and coding gene in each LCP. d Examples of graphs output by PLAIDOH as part of its standard run settings. The three LCPs and lncRNA1 diagrammed in a are highlighted in red and green, respectively
Fig. 3
PLAIDOH reveals global patterns of LCP co-expression. LncRNA expression (log10 FPKM) (a & c) or LCP correlation (−log10 Spearman adjusted p-value) (b & d) is plotted relative to genomic distance from each lncRNA to a coding gene (a & b) or the nearest enhancer (c & d) within 400 kb regions flanking the lncRNA. LCPs with positive Spearman correlation coefficients (rho) are plotted in the upper half of each plot; those with negative Spearman correlation coefficients (rho) are plotted in the lower half. Black points highlight LCPs with adjusted Spearman p-values < 0.05 or FPKM > 1
Fig. 4
PLAIDOH ranks lncRNAs by number and fraction of correlated coding genes. a Contour plot shows the frequency of significant LCP counts as a function of the number of all possible coding gene pairs for each lncRNA. Color indicates increasing log10 frequency of LCPs at each x,y data point (white-blue-green). Highlighted in red are two LCPs in which single lncRNAs are each highly correlated with large clusters of coding genes. b Genomic maps of the two LCPs shown in a. c Z-Scores of LCP correlation coefficients plotted by distance between each lncRNA and coding gene pair; positively correlated LCPs are plotted in the left panel and negatively correlated LCPs are in the right panel. Highlighted in red are LCPs in which single lncRNAs are correlated with only one coding gene. d Genomic maps of the LCPs shown in c
Fig. 5
LncRNAs demonstrate common or cancer-type specific correlation profiles. a Venn diagram shows the number of significant LCPs shared or unique among five TCGA cancer types. Significant = Spearman correlation adj p < 0.05 for LCP expression. b Binary heatmap shows the pattern of correlation significance for LCPs across TCGA cancer types. Spearman adj p < 0.05 (purple); p > 0.01 (white). c Heatmap of LCP Spearman correlation p-values for expression of AC096992.2 and each of the genes within 400 kb. Spearman adj p < 0.01 (purple); p < 0.05 (blue); p ≥ 0.05 (white). d Bar graph shows expression of AC096992.2 in TCGA cancer types. e Box plot shows Spearman correlation coefficients (rho) for expression of AC096992.2 and all genes within 400 kb flanking. f Heatmap of LCP Spearman correlation p-values for expression of AC138207.5 and each of the genes within 400 kb flanking. Colors as in C. g Bar graph shows expression of AC138207.5 in TCGA cancer types. h Box plot shows Spearman correlation coefficients (rho) for expression of AC138207.5 and all genes within 400 kb flanking
Fig. 6
PLAIDOH ranks LCPs by likely transcriptional regulatory mechanism, inferred from Enhancer and LncRNA Cis-regulatory Scores. a-f Plots show LCPs from ENCODE cell lines (a-c) or TCGA DLBC samples (d-f). a & d Plots show LCPs ranked by increasing LncRNA Transcript Cis-regulatory Scores. Red points are known cis-acting lncRNAs; in green are novel LCPs with the highest scores and/or containing known lymphoma oncogenes. b & e As in a & d, but ranked by increasing Enhancer Scores. Highlighted in red are known enhancer-associated lncRNAs; in green are novel LCPs with the highest scores and/or containing known lymphoma oncogenes. c & f XY plots show Enhancer versus LncRNA Transcript Cis-regulatory Scores segregating LCPs. Dotted lines in a-f reflect score cut-offs based on the geometric inflection points calculated from the data in a, b, d & e. Red and green data points are from a & b (for c), or d & e (for f)
Fig. 7
PLAIDOH ranks lncRNAs using biological and experimental data from RNA binding protein interaction. a Interaction matrix of lncRNAs and RNA Binding Proteins. Binding events of concordantly localized lncRNAs and RBPs are colored by subcellular localization to the nucleus (blue), cytoplasm (red) or both nucleus and cytoplasm (purple). Discordantly-localized interactions are colored grey. No evidence of binding is white. b Plot shows lncRNA expression versus RBP binding-site density per kilobase of RNA transcript for each lncRNA/RBP interaction shown in panel a. Data point size is scaled to RBP expression level and subcellular localization interactions are colored to match panel a. Labeled dots highlight previously published and validated binding of RBP/lncRNA pairs
Fig. 8
Validation of PLAIDOH’s functional predictions for a lncRNA highly expressed in human NHL. a UCSC Genome browser view of H3K4me3 ChIP-seq (NHL) and RNA-seq (NHL, normal B cells) for the RP11-960L18.1 locus. b XY plot shows Enhancer versus LncRNA Transcript Cis-regulatory Scores in primary NHL samples, highlighted are RP11-960L18.1 and the two most proximal coding genes. c Expression of PLCG2 and RP11-960L18.1 measured by qRT-PCR in HBL1 lymphoma B cell line treated with scramble or one of two RP11-960L18.1 shRNAs. d Western Blot for PLCG2 or GAPDH in HBL1 cells treated with scramble or one of two RP11-960L18.1 shRNAs. Triangles indicate relative number of cells loaded on the gel. e Subcellular localization of RNA transcripts determined by cell-fractionation of control (WT) HBL1 cells followed by qRT-PCR (CP: cytoplasm, NC: nuclear, NP: nucleoplasm, CA: chromatin-associated). f Plot shows lncRNA expression versus log10 RBP binding-site density per kilobase of RNA transcript for each lncRNA/RBP interaction, highlighted are RBPs that bind RP11-960L18.1. Data point size is scaled to RBP expression level and subcellular localization interactions are colored as in Fig. 7
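
The figure legends report Spearman correlation coefficients (rho) with adjusted p-values for LCP expression, and the Fig. 1 legend names Benjamini-Hochberg FDR adjustment. The Python sketch below shows both calculations in their textbook form. It is an illustration under stated assumptions, not PLAIDOH's implementation: the function names are hypothetical, and the legends do not say which adjustment PLAIDOH applies to the Spearman p-values, so pairing them with Benjamini-Hochberg here is an assumption.

```python
def rank_average(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    rx, ry = rank_average(x), rank_average(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (sx * sy)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = m - pos  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In this sketch, `spearman_rho` would be applied to the expression vectors of a lncRNA and a coding gene across samples, and `benjamini_hochberg` to the p-values collected over all tested pairs. The p-value for each rho is not computed here; in practice it would come from a statistics library.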

