BMC Evol Biol. 2011 Apr 16:11:102.
doi: 10.1186/1471-2148-11-102.

The sequence, structure and evolutionary features of HOTAIR in mammals

Sha He et al. BMC Evol Biol. 2011.

Abstract

Background: An increasing number of long noncoding RNAs (lncRNAs) have been identified recently. Different from all the others that function in cis to regulate local gene expression, the newly identified HOTAIR is located between HoxC11 and HoxC12 in the human genome and regulates HoxD expression in multiple tissues. Like the well-characterised lncRNA Xist, HOTAIR binds to polycomb proteins to methylate histones at multiple HoxD loci, but unlike Xist, many details of its structure and function, as well as the trans regulation, remain unclear. Moreover, HOTAIR is involved in the aberrant regulation of gene expression in cancer.

Results: To identify conserved domains in HOTAIR and study the phylogenetic distribution of this lncRNA, we searched the genomes of 10 mammalian and 3 non-mammalian vertebrates for matches to its 6 exons and the two conserved domains within the 1800 bp exon6 using Infernal. There was just one high-scoring hit for each mammal, but many low-scoring hits were found in both mammals and non-mammalian vertebrates. These hits and their flanking genes in four placental mammals and platypus were examined to determine whether HOTAIR contained elements shared by other lncRNAs. Several of the hits were within unknown transcripts or ncRNAs, many were within introns of, or antisense to, protein-coding genes, and conservation of the flanking genes was observed only between human and chimpanzee. Phylogenetic analysis revealed discrete evolutionary dynamics for orthologous sequences of HOTAIR exons. Exon1 at the 5' end and a domain in exon6 near the 3' end, which contain domains that bind to multiple proteins, have evolved faster in primates than in other mammals. Structures were predicted for exon1, two domains of exon6 and the full HOTAIR sequence. The sequence and structure of two fragments, in exon1 and the domain B of exon6 respectively, were identified to robustly occur in predicted structures of exon1, domain B of exon6 and the full HOTAIR in mammals.
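
The hit triage described above (one high-scoring hit per mammal, many low-scoring hits elsewhere) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Hit` class, the `split_hits` function, the score cutoff, the species labels and every value shown are hypothetical, and real Infernal output would first have to be parsed into this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search hit (hypothetical record, not an Infernal data type)."""
    species: str
    locus: str
    score: float  # bit score reported by the search tool


def split_hits(hits, min_top_score):
    """Separate the best hit per species from all remaining hits.

    A species contributes a "top" hit only if its best score reaches
    min_top_score (an assumed, user-chosen cutoff).
    """
    best = {}
    for h in hits:
        if h.species not in best or h.score > best[h.species].score:
            best[h.species] = h
    top = {sp: h for sp, h in best.items() if h.score >= min_top_score}
    top_set = set(top.values())
    rest = [h for h in hits if h not in top_set]
    return top, rest


# Made-up values for illustration only; they are not results from the paper.
hits = [
    Hit("human", "locus_1", 250.0),
    Hit("human", "locus_2", 18.5),
    Hit("mouse", "locus_3", 140.2),
    Hit("mouse", "locus_4", 12.1),
    Hit("chicken", "locus_5", 15.0),
]
top, rest = split_hits(hits, min_top_score=100.0)
```

With a layout like this, the low-scoring hits in `rest` stay available for the follow-up step of examining their flanking genes, rather than being discarded once the candidate orthologue is chosen.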

Conclusions: HOTAIR exists in mammals, has poorly conserved sequences and considerably conserved structures, and has evolved faster than nearby HoxC genes. Exons of HOTAIR show distinct evolutionary features, and a 239 bp domain in the 1804 bp exon6 is especially conserved. These features, together with the absence of some exons and sequences in mouse, rat and kangaroo, suggest ab initio generation of HOTAIR in marsupials. Structure prediction identifies two fragments in the 5' end exon1 and the 3' end domain B of exon6, with sequence and structure invariably occurring in various predicted structures of exon1, the domain B of exon6 and the full HOTAIR.

Figures

Figure 1
Sequence conservation of HOTAIR orthologues in mammals. (A) The sequences of HOTAIR orthologues are obviously conserved in primates but less well conserved in other animals (from UCSC Genome Browser). (B) Orthologues of the HOTAIR exons exist only in mammals. Exon1, exon3, exon4, exon5 and domain B of exon6 are better conserved than exon2, exon6 and domain A of exon6 (indicated by the darkness of the boxes). Note that the sequence of the exon6 orthologue is significantly shorter in rat than in other mammals and contains just two domains. The two boxes under each exon6 are domain A (right side) and domain B (left side), linked by a double line indicating a gap of 130 bp (unmatched part in the Infernal search). The gaps in exon6 of dolphin and dog also indicate unmatched parts in the Infernal search. The double slashes in the schematic of the dolphin gene indicate long introns. (C) The order and orientation of HOTAIR and its neighbouring HoxC genes in mammals. X: HoxC is absent.
Figure 2
Phylogeny of HOTAIR. (A) A tree built with concatenated sequences of orthologues of exon1, exon3, exon4 and exon5. (B) A tree built with sequences of exon6 orthologues. C1 indicates the first local clock, while C2a, C2b and C2c indicate the second local clock inserted at three different places in different computations.
Figure 3
Predicted structures of exon1 orthologues in mammals. (A) The structure predicted by PMmulti and used by Infernal. This consensus structure consists of one big arc and three substructures. In some mammals, the bottom substructure contains three small loops, but in cow, dolphin, mouse and rat, it is a big loop. The middle substructure contains three tiny loops and the top substructure contains a hairpin at its end in all animals. (B) Two structures predicted by Mfold in human. Although the overall structures are different, the hairpin structure found in the PMmulti-predicted structure invariably occurs in both structures. (C) Two structures predicted by Mfold: one in cow and one in dog. The sequence and its hairpin structure (slightly varied) occur in both structures.
Figure 4
Predicted structures of orthologues of domain B of exon6 in mammals. (A) The structure predicted by PMmulti and used by Infernal. The circled part was identified by comparing the structure with the structures predicted by Mfold based on the position and base pairing of sequence. (B) One structure predicted by Mfold in chimpanzee; the circled part is nearly the same as that predicted for the human sequence. (C) One structure predicted by Mfold in mouse; the circled part is slightly different but still occurs at one end. (D) One structure predicted by Mfold in dog; the circled part is embedded within other sequences.
