Nucleic Acids Res. 2013 Jan;41(Database issue):D246-51.
doi: 10.1093/nar/gks915. Epub 2012 Oct 5.

LNCipedia: a database for annotated human lncRNA transcript sequences and structures

Pieter-Jan Volders et al. Nucleic Acids Res. 2013 Jan.

Abstract

Here, we present LNCipedia (http://www.lncipedia.org), a novel database for human long non-coding RNA (lncRNA) transcripts and genes. LncRNAs constitute a large and diverse class of non-coding RNA genes. Although several lncRNAs have been functionally annotated, the majority remains to be characterized. Different high-throughput methods to identify new lncRNAs (including RNA sequencing and annotation of chromatin-state maps) have been applied in various studies resulting in multiple unrelated lncRNA data sets. LNCipedia offers 21 488 annotated human lncRNA transcripts obtained from different sources. In addition to basic transcript information and gene structure, several statistics are determined for each entry in the database, such as secondary structure information, protein coding potential and microRNA binding sites. Our analyses suggest that, much like microRNAs, many lncRNAs have a significant secondary structure, in line with their presumed association with proteins or protein complexes. Available literature on specific lncRNAs is linked, and users or authors can submit articles through a web interface. Protein coding potential is assessed by two different prediction algorithms: Coding Potential Calculator and HMMER. In addition, a novel strategy has been integrated for detecting potentially coding lncRNAs by automatically re-analysing the large body of publicly available mass spectrometry data in the PRIDE database. LNCipedia is publicly available and allows users to query and download lncRNA sequences and structures based on different search criteria. The database may serve as a resource to initiate small- and large-scale lncRNA studies. As an example, the LNCipedia content was used to develop a custom microarray for expression profiling of all available lncRNAs.

Figures

Figure 1.
LNCipedia is generated in a multistep process that comprises importing, naming, analysis and visualization of lncRNA genes. Import scripts for the FASTA, BED and GFF file formats process lncRNA transcripts and detect redundancy. LncRNA naming is preceded by the creation of lncRNA transcript clusters and requires information on the nearest protein-coding gene on the same DNA strand. Every lncRNA transcript is subsequently analysed using multiple algorithms, and the results are appended to the database. A web interface built using Perl enables lncRNA visualization and database querying.
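
The redundancy check performed at import can be illustrated with a minimal Python sketch. It is not the LNCipedia import code: the caption does not say how redundancy is defined, so the sketch assumes that two transcripts are redundant when they share chromosome, strand and exact exon coordinates. The function names `exon_key` and `deduplicate` are hypothetical, and the input is assumed to be standard 12-column BED.

```python
from typing import Dict, List, Tuple

ExonKey = Tuple[str, str, Tuple[Tuple[int, int], ...]]


def exon_key(bed_line: str) -> ExonKey:
    """Reduce one BED12 line to (chrom, strand, exon coordinates)."""
    f = bed_line.rstrip("\n").split("\t")
    chrom, start, strand = f[0], int(f[1]), f[5]
    # BED12 columns 11 and 12: comma-separated block sizes and block starts,
    # the latter given relative to chromStart. A trailing comma is allowed.
    sizes = [int(x) for x in f[10].split(",") if x]
    offsets = [int(x) for x in f[11].split(",") if x]
    exons = tuple((start + o, start + o + s) for o, s in zip(offsets, sizes))
    return chrom, strand, exons


def deduplicate(bed_lines: List[str]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (unique lines, [(duplicate name, name of first occurrence)])."""
    seen: Dict[ExonKey, str] = {}
    unique: List[str] = []
    duplicates: List[Tuple[str, str]] = []
    for line in bed_lines:
        # Skip blank lines and the header lines that BED files may carry.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        key = exon_key(line)
        name = line.split("\t")[3]
        if key in seen:
            duplicates.append((name, seen[key]))
        else:
            seen[key] = name
            unique.append(line)
    return unique, duplicates
```

Exact coordinate matching is the strictest possible definition. A real import step might also need to tolerate small differences at transcript ends, which this sketch does not attempt.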
Figure 2.
The SOX1 protein-coding gene locus contains three lncRNAs on the same DNA strand, numbered according to their distance in relation to SOX1. LncRNA transcripts are numbered according to their order in the gene, starting with the most upstream transcript.
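
The numbering rule in this caption can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the classes `Transcript` and `LncGene`, the function `name_locus`, and the output pattern `lnc-<gene>-<n>:<m>` are all assumptions made for the example, and "upstream" is taken to mean the lowest start on the plus strand and the highest end on the minus strand.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transcript:
    tid: str
    start: int
    end: int


@dataclass
class LncGene:
    strand: str
    transcripts: List[Transcript]

    @property
    def start(self) -> int:
        return min(t.start for t in self.transcripts)

    @property
    def end(self) -> int:
        return max(t.end for t in self.transcripts)


def distance(gene: LncGene, anchor_start: int, anchor_end: int) -> int:
    """Gap in bases between an lncRNA gene and the protein-coding gene (0 if overlapping)."""
    if gene.end < anchor_start:
        return anchor_start - gene.end
    if gene.start > anchor_end:
        return gene.start - anchor_end
    return 0


def name_locus(symbol: str, anchor_start: int, anchor_end: int,
               strand: str, genes: List[LncGene]) -> Dict[str, str]:
    """Map transcript IDs to names; only genes on the anchor's strand are named."""
    same_strand = [g for g in genes if g.strand == strand]
    # Gene number: rank by distance to the protein-coding gene, nearest first.
    ranked = sorted(same_strand, key=lambda g: distance(g, anchor_start, anchor_end))
    names: Dict[str, str] = {}
    for gene_no, gene in enumerate(ranked, start=1):
        # Transcript number: order within the gene, most upstream first.
        if strand == "+":
            ordered = sorted(gene.transcripts, key=lambda t: t.start)
        else:
            ordered = sorted(gene.transcripts, key=lambda t: t.end, reverse=True)
        for tx_no, tx in enumerate(ordered, start=1):
            names[tx.tid] = f"lnc-{symbol}-{gene_no}:{tx_no}"
    return names
```

With toy coordinates, three same-strand lncRNA genes around a gene labelled `SOX1` would receive gene numbers 1 to 3 in order of increasing distance, and a gene on the opposite strand would be left unnamed by this call.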
Figure 3.
The transcript page in the web interface provides a clear overview of information available on a specific lncRNA transcript.
