PLoS Genet. 2009 Aug;5(8):e1000617.
doi: 10.1371/journal.pgen.1000617. Epub 2009 Aug 21.

Genomic and transcriptional co-localization of protein-coding and long non-coding RNA pairs in the developing brain

Jasmina Ponjavic et al. PLoS Genet. 2009 Aug.

Abstract

Besides protein-coding mRNAs, eukaryotic transcriptomes include many long non-protein-coding RNAs (ncRNAs) of unknown function that are transcribed away from protein-coding loci. Here, we have identified 659 intergenic long ncRNAs whose genomic sequences individually exhibit evolutionary constraint, a hallmark of functionality. Of this set, those expressed in the brain are more frequently conserved and are significantly enriched with predicted RNA secondary structures. Furthermore, brain-expressed long ncRNAs are preferentially located adjacent to protein-coding genes that are (1) also expressed in the brain and (2) involved in transcriptional regulation or in nervous system development. This led us to the hypothesis that spatiotemporal co-expression of ncRNAs and nearby protein-coding genes represents a general phenomenon, a prediction that was confirmed subsequently by in situ hybridisation in developing and adult mouse brain. We provide the full set of constrained long ncRNAs as an important experimental resource and present, for the first time, substantive and predictive criteria for prioritising long ncRNA and mRNA transcript pairs when investigating their biological functions and contributions to development and disease.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. A set of 659 non-coding RNA (ncRNA) transcripts, where each exhibits evidence of constraint on nucleotide substitutions since the mouse-human last common ancestor, shows significant enrichments in sequence predicted to contain folded RNA structures.
(A) An aggregated histogram showing 1,113 ncRNAs whose relative substitution rates in mouse-human comparisons could be estimated reliably (see Materials and Methods). Each bin provides the number of ncRNAs whose relative substitution rate falls within a given interval. Brain-expressed ncRNAs are indicated in blue, non-brain-expressed ncRNAs in red, and ncRNAs that exhibit significantly reduced substitution rates are represented as non-shaded bars. Of all ncRNAs with relative substitution rates between 0.9 and 1.0, 93% exhibit rates that are not significantly different from likely selectively neutral sequence and were, therefore, classified as non-constrained (shaded bars). (B) Evofold-predicted RNA secondary structures (red bars) and conserved sequence (of two types: either PhastCons multispecies conserved elements [MCSs; dark blue] or indel-purified segments [IPSs; light blue]) are each significantly enriched within constrained long ncRNAs. Such ncRNAs also tend to be depleted within segmentally duplicated (SDs; light green) and human copy number variable (CNVs; dark green) sequence. Checkmarks and crosses indicate whether there is evidence for long ncRNAs to be expressed in the brain and to show sequence constraint (see main text). The fold difference (X-axis) is shown on a log2 scale. An asterisk (*) indicates that an ncRNA set is significantly enriched/depleted in an annotation when compared with annotation densities in G+C-matched and randomly sampled sequences (p<2×10⁻⁴).
Figure 2. Brain-derived ncRNAs, in particular those expressed during development, tend to lie adjacent to protein-coding genes that are involved in transcriptional regulation during development.
(A) Shown are fold-enrichments (X-axis) of Gene Ontology (GO) terms (Y-axis) for constrained brain-expressed ncRNAs. (B) Brain-derived ncRNAs that are expressed during mouse embryonic or neonatal development show significant tendencies to be proximal to transcription factor and developmental protein-coding genes, whereas those expressed in adult mice show no significant associations (not shown). (A, B) GO terms are listed if they are over-represented among protein-coding genes proximal to ncRNAs compared to those proximal to randomly sampled sequences (p<10⁻³, EFDR = 0.08 entries). The fold difference (X-axis) is calculated between observed densities of ncRNAs associated with GO terms of nearby protein-coding genes and expected densities of corresponding G+C-matched and randomly sampled sequences. Abbreviations: 1 regulation of transcription, DNA dependent, 2 multicellular organismal development.
Figure 3. Non-brain-derived ncRNAs, in particular those expressed in adult mice, tend to be transcribed adjacent to protein-coding genes involved in signal transduction pathways.
(A) Shown are fold-enrichments (X-axis) of Gene Ontology (GO) terms (Y-axis) for non-brain-expressed ncRNAs that are evolutionarily constrained. (B) Non-brain-derived ncRNAs that are expressed either in adult mice (upper subpanel, light gray) or during mouse embryonic or neonatal development (lower subpanel, dark gray) show significant tendencies to be proximal to protein-coding genes with protein kinase, transcription factor and developmental GO annotations. (A, B) GO terms are listed if they are over-represented among protein-coding genes proximal to ncRNAs compared to those proximal to randomly sampled sequences (p<10⁻³, EFDR = 0.08 entries). The fold difference (X-axis) is calculated between observed densities of ncRNAs associated with GO terms of nearby protein-coding genes and expected densities of corresponding G+C-matched and randomly sampled sequences. Abbreviations: 1 regulation of transcription, DNA dependent, 2 multicellular organismal development. Kinase and phosphatase genes contribute strongly to the enrichments observed for metal ion-, ATP-, or manganese ion-binding.
Figure 4. Brain-derived ncRNAs tend to be transcribed adjacent to protein-coding genes with high expression in the mouse vomeronasal organ and olfactory bulb.
Shown are brain- (A) and non-brain-expressed (B) ncRNAs that are evolutionarily constrained. The Y-axis represents tissues in which protein-coding genes located in proximity to an ncRNA are expressed at unusually high levels (see Materials and Methods). ncRNAs are significantly associated with protein-coding genes that are expressed in these tissues (Y-axis) when compared to randomly sampled G+C-matched sequence (p<10⁻³, EFDR = 0.05 entries). The significant fold increase is shown on the X-axis. Non-brain-derived ncRNAs tend to be in close proximity to protein-coding genes expressed in tongue, prostate, intestine and digits, while brain-expressed ncRNAs tend to be located near protein-coding genes expressed in the vomeronasal organ and olfactory bulb. Similar results are found when ncRNAs are partitioned by their expression in brain or in non-brain tissues during development (Table S2).
Figure 5. Developmental neuronal expression patterns of Slitrk1, Vangl2, and Rbms1 overlap with those from ncRNAs transcribed from adjacent genomic sequence.
Brightfield images of in situ hybridization from adjacent wild-type sections are shown. (A) Slitrk1 and the ncRNA AK049627 (derived from an E12 spinal cord cDNA library) are expressed throughout mid/late embryonic development, with specific co-expression in the brain and spinal column. (B) A similar pattern of co-expression in the CNS is observed for Vangl2 and the adjacent ncRNA AK082938 (derived from an E12 spinal cord library). (C) AK149041 (isolated from a P2 sympathetic ganglion library) was expressed with the adjacent Rbms1 gene at low levels in all major regions of the post-natal and adult brain (data not shown), although high levels of co-expression are observed in the developing Purkinje cell layer in the cerebellum from P12 to adulthood; higher magnification of the adult cerebellum shows that expression of both transcripts occurs in individual Purkinje cell bodies. The sense strand probe from the corresponding protein-coding gene is also shown. (A, B, C) Scale bars represent 2 mm in all cases. No expression information regarding any of these ncRNAs is currently available from the Allen Brain Atlas. (D) The genomic landscape for each protein-coding (light blue) and non-coding (red) transcript pair is shown. Experimental evidence for transcription in the form of CAGE tag clusters (TC) (orange) and EST (green) data is also represented (as modified from the FANTOM3 Mouse Genomic Element Viewer (http://fantom32p.gsc.riken.jp/gev-f3/gbrowse/mm5): only unique transcripts and ESTs are shown). The size of a TC reflects the number of CAGE tags that are mapped to this region. A TC and its surrounding genomic sequence together can be considered a core promoter. It is evident that all three ncRNAs have further experimental support from ESTs (including those that are unspliced) and/or CAGE TCs (also listed in Table S4). AK082938 and AK149041 ncRNA transcripts are overlapped by ESTs and CAGE TCs that are derived from brain-associated tissues from adult and developing mice, whereas AK049627 has EST support from brain-associated tissues from developing mice.
Figure 6. Adult brain expression patterns of Mitf, Gabrb1, and Add2 overlap with those from ncRNAs transcribed from adjacent genomic sequence.
Brightfield images of in situ hybridization from adjacent wild-type adult male 8-week-old brain sections are shown. (A) Both Mitf (I) and AK018196 (II) were co-expressed at low levels throughout the brain including the olfactory bulb (data not shown) but also show high levels of expression in the facial nuclei of the medulla. (B) Gabrb1 (I) and AK045528 (II) are co-expressed in most brain regions (data not shown), including specifically around the Purkinje cell layer of the cerebellum. (C) Add2 (I) and AK013768 (II) are also expressed in all areas of the brain, but levels are substantially higher in the hippocampus in both cases. (A, B, C) The sense strand probe from the corresponding protein-coding gene is shown (III). Scale bars represent 0.25 mm (A, B III) and 0.5 mm (C III). No expression information regarding any of these ncRNAs is currently available from the Allen Brain Atlas. Column IV represents the genomic landscape for each protein-coding (light blue) and non-coding (red) transcript pair. Experimental evidence for transcription in the form of CAGE tag clusters (TC) (orange) and EST (green) data is also represented (as modified from the FANTOM3 Mouse Genomic Element Viewer (http://fantom32p.gsc.riken.jp/gev-f3/gbrowse/mm5): only unique transcripts and ESTs are shown). The size of a TC reflects the number of CAGE tags that are mapped to this region. A TC and its surrounding genomic sequence together can be considered a core promoter. It is evident that all three ncRNAs have further experimental support from ESTs (including those that are unspliced) and CAGE TCs (also listed in Table S4). AK045528 and AK013768 ncRNA transcripts are overlapped by ESTs and CAGE TCs that are derived from brain-associated tissues from adult and developing mice, whereas AK018196 has support from adult mouse brain ESTs.
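
The fold differences and significance thresholds quoted in the captions for Figures 1 to 4 compare an observed annotation density with densities in G+C-matched, randomly sampled sequences. The Python sketch below illustrates one common way to make such a comparison; it is not the authors' pipeline. The function name, the input values, and the (k+1)/(N+1) empirical p-value convention are assumptions made for illustration, and the one-sided test shown covers enrichment only.

```python
import math


def fold_enrichment(observed, null_densities):
    """Compare an observed density with densities from random control samples.

    Hypothetical helper, not taken from the paper.
    Returns (fold, log2_fold, empirical_p) for a one-sided enrichment test.
    """
    if not null_densities:
        raise ValueError("need at least one control sample")
    expected = sum(null_densities) / len(null_densities)
    if expected <= 0:
        raise ValueError("expected density must be positive")

    fold = observed / expected

    # Count control samples at least as dense as the observed set.
    k = sum(1 for d in null_densities if d >= observed)
    # Assumed convention: add one to numerator and denominator so the
    # estimated p-value is never exactly zero.
    empirical_p = (k + 1) / (len(null_densities) + 1)

    return fold, math.log2(fold), empirical_p


# Illustrative numbers only (annotation density per Mb, invented for the example).
controls = [1.0, 2.0, 3.0, 2.0, 2.0]
fold, log2_fold, p = fold_enrichment(8.0, controls)
print(fold, log2_fold, round(p, 3))  # 4.0 2.0 0.167
```

For a depletion test, as would apply to the SD and CNV annotations in Figure 1B, the opposite tail would be counted instead. The smallest p-value this estimate can report is 1/(N+1), so the attainable significance is bounded by the number of control samples drawn.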
