Genome Res. 2012 Apr;22(4):642-55.
doi: 10.1101/gr.132233.111. Epub 2012 Jan 10.

An ancient genomic regulatory block conserved across bilaterians and its dismantling in tetrapods by retrogene replacement

Ignacio Maeso et al. Genome Res. 2012 Apr.

Abstract

Developmental genes are regulated by complex, distantly located cis-regulatory modules (CRMs), often forming genomic regulatory blocks (GRBs) that are conserved among vertebrates and among insects. We have investigated GRBs associated with Iroquois homeobox genes in 39 metazoans. Despite 600 million years of independent evolution, Iroquois genes are linked to ankyrin-repeat-containing Sowah genes in nearly all studied bilaterians. We show that Iroquois-specific CRMs populate the Sowah locus, suggesting that regulatory constraints underlie the maintenance of the Iroquois-Sowah syntenic block. Surprisingly, tetrapod Sowah orthologs are intronless and not associated with Iroquois; however, teleost and elephant shark data demonstrate that this is a derived feature, and that many Iroquois-CRMs were ancestrally located within Sowah introns. Retroposition, gene, and genome duplication have allowed selective elimination of Sowah exons from the Iroquois regulatory landscape while keeping associated CRMs, resulting in large associated gene deserts. These results highlight the importance of CRMs in imposing constraints to genome architecture, even across large phylogenetic distances, and of gene duplication-mediated genetic redundancy to disentangle these constraints, increasing genomic plasticity.

Figures

Figure 1.
CNRs in nematode and drosophilid Sowah genes. Ancora plots of CNR density (top), VISTA plots (middle), and phastCons tracks (bottom) of the Irx-Sowah region of nematodes (A) and flies (B). (Red bars) Region depicted in Ancora plots zoomed in on VISTA and phastCons tracks. VISTA colored peaks (blue, coding; turquoise, UTR; pink, noncoding) indicate regions of at least 50 bp and ≥90% similarity (≥85% in the case of Caenorhabditis japonica) in nematodes and 60 bp and ≥90% similarity in flies. Only gene symbols corresponding to Sowah (swah-1 in nematodes) and Irx (ara, caup, and mirr in Drosophila) are indicated. Numbers at the left correspond to the percentage of base pairs covered by CNRs in Ancora plots, percentage identity in VISTA analyses, and conservation scores in phastCons.
Figure 2.
Internal organization of the Irx-Sowah complex in B. floridae. VISTA plot of the alignments between each of the three Irx genes (plus a fourth region corresponding to a putative IrxD locus lost during amphioxus evolution) and their respective surrounding noncoding regions, including Sowah in the case of IrxA. Colored peaks (blue, coding; turquoise, UTR; pink, noncoding) indicate regions of at least 100 bp and ≥70% similarity. High-copy number elements (such as repeats and mobile elements) are masked and their presence is indicated by khaki segments above the VISTA plot. Vertical bars of different colors below the VISTA plot represent the different conserved repeated blocks, indicating their respective location to Sowah and Irx loci (DS, Downstream Sowah; S, within Sowah; U, Upstream Irx; I, Introns of Irx; D, Downstream Irx. DS/S bars indicate elements of uncertain identity [DS or S]). Black rectangles and arrows indicate the exon sequences of Irx, Sowah, or the remains of Sowah duplicates. The blocks tested for transcriptional enhancer activity are indicated and named with a number and letter code. The letter refers to the Irx locus with which they are associated; color indicates whether they are tissue-specific enhancers (green), unspecific enhancers (yellow), or negative elements (black).
Figure 3.
Comparison of the expression patterns of Sowah and Iroquois genes in amphioxus and fly. In situ hybridization of B. lanceolatum Sowah (A,B) and IrxB (C,D) genes in 15-h early neurulas (A,C) and 21-h neurulas (B,D) in dorsal and lateral views, respectively. Sowah transcripts were detected almost ubiquitously, with stronger expression in the dorsal half of the embryos. In contrast, IrxB showed a very defined and restricted pattern in the endoderm (en), notochord (nt), and neural plate (np). The anterior limit of expression in the neural plate, which is conserved in evolution (Irimia et al. 2010), is indicated by an arrow. The expression of IrxA and IrxC was similar at these stages (data not shown). (bp) Blastopore. In situ hybridization of sowah (E,F) and caup (G,H) in D. melanogaster stage 17 (E) and stage 12 (G) embryos (dorsal views) and third instar larvae wing imaginal discs (F,H, anterior is to the left). (E,F) sowah is expressed in the pharynx (*) and cephalic nervous system of late embryos (arrow in E points to nonspecific staining of the cuticular denticle belts), but is undetectable in the wing imaginal discs. (G,H) During embryonic development, ara and caup display coincident dynamic expression patterns in the epidermis (ep), central nervous system (CNS) and mesoderm, as well as in the head (h). In the wing disc, caup is expressed in the prospective regions of the 1, 3, and 5 longitudinal veins (L1, L3, L5), pleura (Pl); tegula (Tg); dorsal radius (DR); alula and lateral notum (n).
Figure 4.
Transcriptional enhancer activity of B. floridae sequences from the Irx-Sowah complex. Lateral views of 48-hpf zebrafish showing GFP expression driven by the 5b, 10b, and 10d CNRs. The 5b-driven expression is detected in the spinal cord and in the telencephalon; 10b and 10d consistently drove expression throughout the CNS (midbrain, hindbrain, and spinal cord) and in the eye. Anterior is to the right. (e) eye; (h) hindbrain; (m) midbrain; (s) spinal cord; (t) telencephalon.
Figure 5.
Expression of caup in sowahEGP imaginal discs and embryos. (A) Physical map of the Iro-C locus. Genomic DNA is shown as a thick black bar with a 60-kb gap delimited by //. Transcripts are shown as black arrows below the genes (blue). Exons are shown in orange, with protein-coding regions colored darker. Red broken lines within brackets represent deleted regions. The purple box represents a region bound by several transcription factors as determined by ChIP-on-chip assays. (B–G) In situ hybridization with a caup probe of wing (B–E) and leg (F,G) discs of the indicated genotype. caup expression is not affected in the trans-heterozygous combination of sowahEGP1 and iroDFM3, used to rescue the early embryonic lethality of sowahEGP1. In sowahEGP2 (D) and sowahEGP3 (E) discs, caup expression is absent in L3 and DR domains and strongly reduced in Pl and Tg regions (marked with black asterisks). (G) sowahEGP3 leg disc, in which caup wild-type expression in a ring-like pattern (F) is lost. (H–K) Anti-Caup staining of wild-type (H,H',J) and sowahEGP3 (I,I',K) late-stage 13 embryos. Yellow asterisks mark the expression of ara/caup in the nervous system; arrowheads point to the head mandibular segment. The brown signal in I corresponds to a nonspecific staining of the tracheal (respiratory) system. (H,I) ventral, (J,K) lateral views, (H',I') enlarged views of the head.
Figure 6.
Genomic organization of Sowah and Irx genes in vertebrates. Schematic representation of the genomic organization of r-Sowah and Sowah (black arrows), Septin (white arrows), and Irx (gray arrows) genes in humans (A) and a generalized teleost (B). Red geometrical shapes represent CNRs within Irx clusters: triangles represent the UltraConserved Regions (UCRs) (de la Calle-Mustienes et al. 2005), and ellipses and rectangles the only two CNRs present within Sowah2 in teleosts. For simplicity, only schematic intron–exon structures are indicated for Sowah1 and Sowah2. (C) VISTA plot of the alignments between IrxB clusters of different vertebrate species, using elephant shark as a reference sequence for the comparison. Colored peaks (blue, coding; turquoise, UTR; pink, noncoding) indicate regions of at least 100 bp and ≥70% similarity. Sowah2-CNRs are demarcated by a red rectangle.
Figure 7.
Sowah intronic CNRs. Two of the Sowah2 pre-WGD CNRs that function as tissue-specific enhancers in reporter assays (represented by red ovals and rectangles in Fig. 6). (Left) CNR 54390 (Sowah2 intron 7); (right) CNR 3240 (Sowah2 intron 8). (A) Sequence alignment of the 54390 and 3240 Sowah-CNRs in different Irx complexes of several species. Shadowed nucleotides correspond to >60% sequence conservation. (B,C) GFP expression driven by the elements 54390 and 3240. Paralogous sequences of the element 54390 in complexes IrxA and IrxB drove similar expression patterns in Xenopus embryos (C) and zebrafish (B) transgenic lines. In the case of the CNR 3240, only that present in the IrxA complex was found to be positive in transgenesis studies. (e) Eye; (m) midbrain; (h) hindbrain; (s) spinal cord.
Figure 8.
Evolutionary scenarios for the convergent loss of Sowah coding exons near Irx genes in chordates. (A) A retroposition event to other parts of the genome (indicated as a solid black arrow, r-Sowah) allows the original, Irx-linked Sowah to lose the coding sequences (black bars), while retaining the functional noncoding regions (red). A similar event occurred at the base of the vertebrates. (B) A polyploidization (a WGD) creates redundancy of Sowah genes. Therefore, some Sowah genes can lose their coding sequences. This event was observed in teleosts, after the third round of WGD. (C) Gene redundancy is acquired by tandem duplication of Irx and Sowah, as reported for amphioxus. Subsequently, one of the Sowah copies loses its coding sequences, whereas the functional noncoding regions are maintained.

References

    1. Abascal F, Zardoya R, Posada D 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21: 2104–2105
    2. Aday AW, Zhu LJ, Lakshmanan A, Wang J, Lawson ND 2011. Identification of cis regulatory features in the embryonic zebrafish genome through large-scale profiling of H3K4me1 and H3K4me3 binding sites. Dev Biol 357: 450–462
    3. Aparicio S, Morrison A, Gould A, Gilthorpe J, Chaudhuri C, Rigby P, Krumlauf R, Brenner S 1995. Detecting conserved regulatory elements with the model genome of the Japanese puffer fish, Fugu rubripes. Proc Natl Acad Sci 92: 1684–1688
    4. Becker T, Lenhard B 2007. The random versus fragile breakage models of chromosome evolution: a matter of resolution. Mol Genet Genomics 278: 487–491
    5. Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D 2004. Ultraconserved elements in the human genome. Science 304: 1321–1325
