Database (Oxford). 2023 Mar 4:2023:baad009.
doi: 10.1093/database/baad009.

lncHUB2: aggregated and inferred knowledge about human and mouse lncRNAs

Giacomo B Marino et al. Database (Oxford). 2023.

Abstract

Long non-coding ribonucleic acids (lncRNAs) constitute the largest group of non-coding RNAs. However, knowledge about their function and regulation is limited. lncHUB2 is a web server database that provides known and inferred knowledge about the function of 18 705 human and 11 274 mouse lncRNAs. lncHUB2 produces reports that contain the secondary structure fold of the lncRNA, related publications, the most correlated coding genes, the most correlated lncRNAs, a network that visualizes the most correlated genes, predicted mouse phenotypes, predicted membership in biological processes and pathways, predicted upstream transcription factor regulators, and predicted disease associations. In addition, the reports include subcellular localization information; expression across tissues, cell types, and cell lines; and predicted small molecules and CRISPR knockout (CRISPR-KO) genes prioritized by their likelihood to up- or downregulate the expression of the lncRNA. Overall, lncHUB2 is a database with rich information about human and mouse lncRNAs, and as such it can facilitate hypothesis generation for many future studies. Database URL: https://maayanlab.cloud/lncHUB2.

Figures

Figure 1.
lncHUB2 Appyter and web application workflow. The lncHUB2 Appyter or web-based application takes as input any of 18 705 unique human or 11 274 unique mouse lncRNAs and generates a report. This report contains information such as the predicted secondary structure and expression levels in various tissues and cell lines. Additionally, using gene–gene correlations generated from publicly available RNA-seq data from ARCHS4, lncHUB2 provides predicted biological functions, predicted small molecules and CRISPR-KO gene regulators, and gene–gene co-expression networks for exploring associations between closely related genes and lncRNAs based on expression similarity.
Figure 2.
UMAP plots of 18 705 human lncRNAs and 11 274 mouse lncRNAs. (A) Color intensity reflects each lncRNA’s median expression in the testis, where MALAT1 has the highest relative expression across tissues. The arrow points to the location of MALAT1 on the UMAP plot. (B) Color intensity reflects each lncRNA’s log median expression in the peripheral nervous system, where Dleu2 has the highest relative expression across tissues. The arrow points to the location of Dleu2 on the UMAP plot.
Figure 3.
Comparing FANTOM6 lncRNA knockdowns followed by expression profiling with gene–gene co-expression correlation data from ARCHS4. For each lncRNA in FANTOM6, we used Fisher’s exact test to compute the significance of the overlap between the top 200 differentially expressed genes (DEGs) for each lncRNA knockdown (|log2 fold change (FC)| > 0.5; false discovery rate (FDR) < 0.05; |Z-score| > 1.645) in at least one knockdown condition and the top 200 most positively and top 200 most negatively correlated genes from the ARCHS4 gene–gene co-expression matrix. The P-values were then converted to −log10(P-values) and are visualized as stacked bar charts, where the bottom part of each bar denotes the significance of overlap with the positively correlated genes and the top part denotes the significance of overlap with the negatively correlated genes for each lncRNA. Of the 87 lncRNAs assessed, only the 37 whose downregulated genes had the most overlap and the 21 whose upregulated genes had the most overlap are shown.
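The overlap test described in this caption can be illustrated with a minimal sketch. It assumes a one-sided (enrichment) Fisher’s exact test computed as a hypergeometric tail, and it uses a toy background of 20 genes with made-up gene identifiers; the function names, the background size, and the one-sided choice are illustrative assumptions, not details taken from the paper.

```python
from math import comb, log10

def overlap_pvalue(genes_a, genes_b, background_size):
    """One-sided Fisher's exact test (hypergeometric tail) for gene-set overlap.

    Returns the probability of seeing at least the observed overlap when
    len(genes_b) genes are drawn at random from the background.
    """
    a, b = set(genes_a), set(genes_b)
    if background_size < len(a | b):
        raise ValueError("background must contain every gene in both sets")
    observed = len(a & b)
    total = comb(background_size, len(b))
    tail = sum(
        comb(len(a), i) * comb(background_size - len(a), len(b) - i)
        for i in range(observed, min(len(a), len(b)) + 1)
    )
    return tail / total

def neg_log10(p, floor=1e-300):
    """Convert a P-value to -log10(P), guarding against underflow to zero."""
    return -log10(max(p, floor))

# Toy data (hypothetical): DEGs after a knockdown vs. co-expressed genes.
background = 20
knockdown_degs = {"g0", "g1", "g2", "g3", "g4"}
coexpressed = {"g3", "g4", "g5", "g6", "g7"}

p_value = overlap_pvalue(knockdown_degs, coexpressed, background)
bar_height = neg_log10(p_value)  # the quantity plotted as a bar segment
```

In the figure, one such value would be computed against the positively correlated genes and another against the negatively correlated genes, and the two stacked per lncRNA.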
Figure 4.
Predicting lncRNA–disease associations using gene–gene co-expression correlations and evaluating the predictions. For each disease term from the DisGeNET gene-set library downloaded from Enrichr, the 18 705 human lncRNAs were ranked by their negative mean Pearson correlation coefficient (PCC) with the corresponding gene set (bars at the center). The area under the receiver operating characteristic curve (AUROC) was calculated (bars at the right side of the plot) using the ranks of lncRNAs known to be associated with the same disease based on experimentally validated lncRNA–disease associations from LncRNADisease v2.0 (bars at the left side of the plot).
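As a rough illustration of this ranking-and-evaluation scheme, the sketch below scores each lncRNA by its mean correlation with a disease gene set and then computes the AUROC from pairwise comparisons between known and unknown associations. The correlation values, lncRNA names, and gene names are invented toy data, and the helper names are hypothetical.

```python
def mean_pcc_scores(corr, gene_set):
    """Score each lncRNA by its mean correlation with the genes in gene_set.

    corr maps lncRNA -> {gene: PCC}. Genes missing from a row are skipped.
    """
    scores = {}
    for lncrna, row in corr.items():
        values = [row[g] for g in gene_set if g in row]
        scores[lncrna] = sum(values) / len(values) if values else 0.0
    return scores

def auroc(scores, positives):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for name, s in scores.items() if name in positives]
    neg = [s for name, s in scores.items() if name not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy correlation rows (hypothetical values).
corr = {
    "lncA": {"G1": 0.8, "G2": 0.6, "G3": -0.1},
    "lncB": {"G1": 0.1, "G2": 0.0, "G3": 0.9},
    "lncC": {"G1": 0.5, "G2": 0.7, "G3": 0.2},
    "lncD": {"G1": -0.3, "G2": -0.2, "G3": 0.4},
}
disease_genes = {"G1", "G2"}

scores = mean_pcc_scores(corr, disease_genes)
# Sorting by negative mean PCC puts the most correlated lncRNA first.
ranking = sorted(scores, key=lambda name: -scores[name])
auc = auroc(scores, positives={"lncA", "lncC"})
```

An AUROC near 0.5 would indicate that known disease lncRNAs are ranked no better than chance, while values near 1 indicate that they concentrate at the top of the ranking.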
Figure 5.
Unsupervised learning to predict the localization of lncRNAs by cell line. (A) Co-expression gene–gene correlations were used to predict localization values for each human lncRNA for the 15 cell lines in lncAtlas. For each human lncRNA, the 35 371 genes present across the cell types in lncAtlas were ranked by PCCs and ranks were multiplied by the existing RCIs from lncAtlas and summed. True positives and false positives were calculated for CN RCIs >1 and <−1 per cell line. (B) Subcellular localization RCIs for XIST, which are available for the displayed cell lines from lncAtlas. (C) Predicted subcellular localization for TSIX, an antisense gene to XIST. Subcellular localization information for TSIX is not available in lncAtlas for the five cell lines. These cell lines have the highest AUROCs as reported in (A).
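The rank-weighted sum described in panel (A) might look like the following sketch. The caption does not state the rank direction, so this version assumes that the most correlated gene receives the highest rank; the gene names, correlation values, and RCI values are toy data, and `predict_localization` is a hypothetical helper name.

```python
def predict_localization(pcc_row, rci):
    """Rank-weighted sum of known RCIs for one lncRNA in one cell line.

    pcc_row maps gene -> PCC with the query lncRNA; rci maps gene -> known RCI.
    Assumption: genes are ranked in ascending PCC order, so the most
    correlated gene gets the largest rank and therefore the largest weight.
    """
    shared = [g for g in pcc_row if g in rci]
    ordered = sorted(shared, key=lambda g: pcc_row[g])
    return sum(rank * rci[g] for rank, g in enumerate(ordered, start=1))

# Toy inputs (hypothetical): three genes with known RCIs in one cell line.
pcc_with_query = {"G1": 0.9, "G2": 0.1, "G3": -0.5}
known_rci = {"G1": 2.0, "G2": -1.0, "G3": -3.0}

score = predict_localization(pcc_with_query, known_rci)
```

Under this assumption, a positive score means the query lncRNA is most correlated with genes that have positive RCIs, and a negative score means the opposite. The raw magnitude depends on the number of genes, so scores are only comparable across lncRNAs scored against the same gene list.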
Figure 6.
Interactive gene–gene co-expression network for the lncRNA HOTAIR. The HOTAIR gene–gene co-expression network contains the top 100 genes most correlated with HOTAIR. The thickness of the edges represents the magnitude of the PCCs, and nodes representing genes are colored by their chromosome of origin, except for the queried lncRNA, which is colored in bright red. The network is pruned so that each node has, on average, fewer than three edges.
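One way to build and prune such a network is sketched below. The caption does not describe the pruning procedure, so this version simply assumes that the strongest edges are kept until the average degree (2 × edges / nodes) would reach the limit. The gene names A, B, and C and all correlation values are toy data, and `build_pruned_network` is a hypothetical helper.

```python
from itertools import combinations

def build_pruned_network(query, corr, top_n=100, max_avg_degree=3.0):
    """Return (nodes, edges) for the genes most correlated with query.

    corr is a symmetric nested dict: corr[a][b] is the PCC between a and b.
    Assumed pruning rule: keep edges in order of decreasing |PCC| while the
    average degree stays strictly below max_avg_degree.
    """
    neighbors = sorted(
        (g for g in corr[query] if g != query),
        key=lambda g: corr[query][g],
        reverse=True,
    )[:top_n]
    nodes = [query] + neighbors
    candidates = [
        (a, b, corr[a][b])
        for a, b in combinations(nodes, 2)
        if b in corr.get(a, {})
    ]
    candidates.sort(key=lambda edge: abs(edge[2]), reverse=True)
    kept = []
    for edge in candidates:
        if 2 * (len(kept) + 1) / len(nodes) >= max_avg_degree:
            break
        kept.append(edge)
    return nodes, kept

# Toy symmetric correlation table (hypothetical values).
pairs = {
    ("HOTAIR", "A"): 0.9, ("HOTAIR", "B"): 0.8, ("HOTAIR", "C"): 0.2,
    ("A", "B"): 0.7, ("A", "C"): 0.1, ("B", "C"): 0.3,
}
corr = {}
for (a, b), value in pairs.items():
    corr.setdefault(a, {})[b] = value
    corr.setdefault(b, {})[a] = value

nodes, edges = build_pruned_network("HOTAIR", corr, top_n=2, max_avg_degree=1.5)
```

In a rendering of this toy network, edge thickness would be proportional to the absolute PCC stored as the third element of each edge tuple.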
Figure 7.
MEG3 involvement in dermatitis and chronic inflammatory response. MEG3 is co-expressed (positively correlated) with the genes contained in the left box, which are downregulated by IL17C and IL17RB. Genes in the right box represent the intersection of genes associated with dermatitis and chronic inflammation when knocked out in mice and which are negatively correlated with MEG3.
