PLoS One. 2014 Oct 2;9(10):e109443.
doi: 10.1371/journal.pone.0109443. eCollection 2014.

Promoter analysis reveals globally differential regulation of human long non-coding RNA and protein-coding genes

Tanvir Alam et al. PLoS One. 2014.

Abstract

Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.

Conflict of interest statement

Competing Interests: The authors confirm that co-author Vladimir Bajic is a PLOS ONE Editorial Board member. This does not alter their adherence to PLOS ONE Editorial policies and criteria.

Figures

Figure 1. DNA feature distributions in the promoters of lncRNA genes and protein-coding genes.
DNA feature distributions in the promoters of protein-coding and lncRNA genes. Blue line corresponds to promoters of protein-coding genes; red line corresponds to lncRNA gene promoters. Figure 1a–d show the distribution of each feature in a sliding window of 100 bp with a step of 50 bp, resulting in 39 windows per plot. Figure 1e–f show the percentage of promoters in which each feature was found. Transparent regions correspond to the 5–95% bootstrap confidence interval of the statistics. WC: word commonality; PALIN: palindromes; CGI: CpG islands; RE: repetitive elements (all types of repeats except “simple repeats”, “low complexity regions” and “satellite repeats”). The enrichment score was calculated using a right-sided Fisher's exact test (Table S3).
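The windowing and enrichment test named in this caption can be illustrated with a minimal sketch. It is not the authors' code. The 2000 bp promoter length is an assumption, inferred only from the fact that a 100 bp window with a 50 bp step yields 39 windows at that length. The function names `sliding_windows` and `fisher_right_sided` are hypothetical.

```python
from math import comb


def sliding_windows(length, window=100, step=50):
    """Return half-open (start, end) coordinates of every full window."""
    return [(s, s + window) for s in range(0, length - window + 1, step)]


def fisher_right_sided(a, b, c, d):
    """Right-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    With the margins fixed, this is the hypergeometric probability of
    observing a count of `a` or more in the top-left cell.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Assumed promoter length of 2000 bp (not stated in the caption).
windows = sliding_windows(2000)
print(len(windows), windows[0], windows[-1])
```

In a comparison of this kind, the 2x2 table would hold counts of lncRNA and protein-coding promoters with and without a given feature. The actual counts are in the paper's Table S3 and are not reproduced here.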
Figure 2. Distribution of histone modification marks in the GM12878 cell line across lncRNA and protein-coding gene promoters.
The figure shows the fraction of all promoters covered by a particular chromatin mark. Blue line corresponds to promoters of protein-coding genes; red line corresponds to lncRNA gene promoters. Transparent regions correspond to the 5–95% bootstrap confidence interval of the statistics.
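A 5–95% bootstrap confidence interval for a coverage fraction can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses a plain percentile bootstrap, and the function name `bootstrap_fraction_ci`, the resample count, and the toy input are all hypothetical.

```python
import random


def bootstrap_fraction_ci(flags, n_boot=1000, lower=5, upper=95, seed=0):
    """Percentile bootstrap interval for the fraction of 1s in `flags`.

    `flags` holds one 0/1 value per promoter (1 = covered by the mark).
    Promoters are resampled with replacement `n_boot` times, and the
    `lower` and `upper` percentiles of the resampled fractions are returned.
    """
    rng = random.Random(seed)
    n = len(flags)
    stats = sorted(sum(rng.choices(flags, k=n)) / n for _ in range(n_boot))
    lo = stats[int(n_boot * lower / 100)]
    hi = stats[min(n_boot - 1, int(n_boot * upper / 100))]
    return lo, hi


# Toy input: 30 of 100 promoters covered by the mark (made-up numbers).
toy_flags = [1] * 30 + [0] * 70
print(bootstrap_fraction_ci(toy_flags))
```

Resampling promoters rather than positions keeps each promoter's profile intact, so the interval reflects variation between promoters.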
Figure 3. Performance of the prediction model.
Quality of the models based on the complete feature set and on several combinations of features. RE: repetitive elements; PALIN: palindromes; SKEW: A/T and C/G skews; CGI: CpG islands; TFBS: transcription factor binding sites; WC: word commonality; CS: chromatin states; k-mer: mono-, di-, and tri-nucleotide frequencies; COMBINE: combination of all feature types for the complete promoter set (CPS).
