Comparative Study
Genome Res. 2007 Dec;17(12):1898-908.
doi: 10.1101/gr.6669607. Epub 2007 Nov 7.

Genomic regulatory blocks underlie extensive microsynteny conservation in insects

Pär G Engström et al. Genome Res. 2007 Dec.

Abstract

Insect genomes contain larger blocks of conserved gene order (microsynteny) than would be expected under a random breakage model of chromosome evolution. We present evidence that microsynteny has been retained to keep large arrays of highly conserved noncoding elements (HCNEs) intact. These arrays span key developmental regulatory genes, forming genomic regulatory blocks (GRBs). We recently described GRBs in vertebrates, where most HCNEs function as enhancers and HCNE arrays specify complex expression programs of their target genes. Here we present a comparison of five Drosophila genomes showing that HCNE density peaks centrally in large synteny blocks containing multiple genes. Besides developmental regulators that are likely targets of HCNE enhancers, HCNE arrays often span unrelated neighboring genes. We describe differences in core promoters between the target genes and the unrelated genes that offer an explanation for the differences in their responsiveness to enhancers. We show examples of a striking correspondence between boundaries of synteny blocks, HCNE arrays, and Polycomb binding regions, confirming that the synteny blocks correspond to regulatory domains. Although few noncoding elements are highly conserved between Drosophila and the malaria mosquito Anopheles gambiae, we find that A. gambiae regions orthologous to Drosophila GRBs contain an equivalent distribution of noncoding elements highly conserved in the yellow fever mosquito Aëdes aegypti and coincide with regions of ancient microsynteny between Drosophila and mosquitoes. The structural and functional equivalence between insect and vertebrate GRBs marks them as an ancient feature of metazoan genomes and as a key to future studies of development and gene regulation.

Figures

Figure 1.
HCNE arrays are centrally positioned in large synteny blocks. (A) RA-HCNE sequence is enriched in large synteny blocks compared to RA-CDS. Dashed lines show the distributions when sequence not covered by any synteny block is excluded. (B) HCNE density, RA-CDS density, and synteny blocks on Dmel chromosome arm 2L. Synteny blocks (green boxes with black borders) are shown between the density curves and in the area under them. Density peaks were detected above a threshold (gray line) set to cover 80% of the area under the density curve for the chromosome arm. In the magnified section, HCNE density peaks are labeled with inferred regulatory target genes located in the same synteny block as the HCNE density peak. (C) Line histogram of position of density peaks within synteny blocks. For each density peak that was located within a synteny block, we computed the distance between the peak and the synteny break closest to it, and scaled the distances to [0, 0.5] by dividing with synteny block size. Dashed lines show distributions from 10,000 randomizations where synteny blocks were ordered independently of density peaks (Supplemental Fig. S3). (D) Histogram of median distance in each of the 10,000 randomizations. Arrows indicate medians for the nonrandomized data, and one-sided P-values indicate the fraction of randomizations having equal or more extreme medians.
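The position statistic and randomization test described in panels C and D can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that synteny blocks tile the chromosome arm contiguously from coordinate 0 (real data contain gaps between blocks), and the names `scaled_distances` and `permutation_p_value` are hypothetical.

```python
import bisect
import random
from statistics import median

def scaled_distances(peaks, block_sizes):
    """Scaled distance from each peak to its nearest synteny break, in [0, 0.5].

    Assumption: blocks are laid end to end starting at coordinate 0.
    Peaks that fall outside every block are skipped.
    """
    bounds = [0]
    for size in block_sizes:
        bounds.append(bounds[-1] + size)
    distances = []
    for peak in peaks:
        i = bisect.bisect_right(bounds, peak) - 1
        if 0 <= i < len(block_sizes):
            start, end = bounds[i], bounds[i + 1]
            distances.append(min(peak - start, end - peak) / (end - start))
    return distances

def permutation_p_value(peaks, block_sizes, n_iter=10_000, seed=0):
    """One-sided P-value: fraction of shuffles whose median is >= the observed median."""
    rng = random.Random(seed)
    observed = median(scaled_distances(peaks, block_sizes))
    sizes = list(block_sizes)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(sizes)  # reorder blocks independently of the peak positions
        if median(scaled_distances(peaks, sizes)) >= observed:
            hits += 1
    return observed, hits / n_iter
```

A value near 0.5 means a peak sits at the center of its block, and a value near 0 means it sits next to a synteny break.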
Figure 2.
Genes associated with HCNE arrays tend to be in large fly-mosquito synteny blocks. (A,B) Examples of synteny blocks. Gaps within synteny blocks are colored yellow. Green lines connect genes in conserved microsynteny between Dmel and Agam. Microsynteny conservation between Dmel and Agam was determined by examining chained BLASTZ and TBLASTN alignments in the UCSC Genome Browser (Kent et al. 2002). Sometimes only parts of genes could be matched (e.g., in the case of ct). Aaeg contigs aligned to the Agam assembly are shown with regions having ≥50% identity over 50 bp in black and other regions in yellow. HCNE densities were computed as the fraction of bases in HCNEs in sliding windows of 40 kb. The UCSC Genome Browser (Kent et al. 2002) was used in making the images. (A) The ct locus in Dmel (upper panel) and Agam (lower panel). ct and nine other genes (underlined) show strong evidence of being in conserved microsynteny among the five flies. The orange line indicates a noncoding BLASTZ match between Dmel and Agam and hints at the location of the first ct exon in Agam. Comparison of HCNE density curves also supports that the first ct exon in Agam is in the area indicated by the orange line. Supporting a common origin of the HCNE clusters at the ct loci in flies and mosquitoes, the HCNE density curves have similar shapes. Two density peaks are visible in both organisms: one between CG9657 and ct, and the other within the borders of ct. The developmental transcriptional regulator brk (Moser and Campbell 2005) is centrally positioned in an adjacent synteny block. CG9650, which dominates a neighboring HCNE-rich synteny block, is expressed in developing CNS and PNS and encodes a putative C2H2 zinc finger protein (McGovern et al. 2003). (B) The tailup (tup) locus in Dmel (upper panel) and Agam (lower panel). tup is in conserved microsynteny with CG18397 among the five flies, Agam and Aaeg. tup encodes a homeodomain transcription factor involved in development (Thor and Thomas 1997). CG18397 is predicted to encode a protein with an AMP-dependent synthetase and ligase domain. In both flies and mosquitoes, HCNEs are found throughout the synteny block. Some HCNEs are within introns of tup and CG18397. This, combined with the lack of evidence for a functional relationship between the two genes, indicates that they have been kept in proximity in order to maintain the HCNE array. (C) For each gene that we could assign to a synteny block, we measured the span of its synteny block excluding the region spanned by the gene itself (in order to control for differences in gene size). Each curve shows the cumulative distribution of synteny block span, measured in Dmel bp, around genes in a particular category. Categories were defined from GO biological process annotation and HCNE density. The category “any biological process” contains all genes annotated with a GO biological process term other than “biological process unknown.” Genes in HCNE-dense regions overlap a 40-kb region where at least 1% (400 bp) of the sequence is in HCNEs. Numbers within parentheses indicate the number of genes annotated to the indicated process and assigned to a single synteny block.
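The density measure used in this figure, the fraction of bases in HCNEs within sliding 40-kb windows, can be sketched as below. This is an illustrative reading of the caption rather than the authors' implementation: the function name `hcne_density`, the half-open 0-based interval convention, and the `step` between window starts are assumptions, since the caption does not state them.

```python
def hcne_density(hcne_intervals, chrom_length, window=40_000, step=1_000):
    """Return (window_start, fraction of window bases inside HCNEs) pairs.

    hcne_intervals: iterable of half-open (start, end) coordinates, assumed to
    lie within [0, chrom_length]. Overlapping intervals are counted once.
    """
    # Difference array: +1 where an HCNE starts, -1 where it ends.
    delta = [0] * (chrom_length + 1)
    for start, end in hcne_intervals:
        delta[start] += 1
        delta[end] -= 1

    # prefix[i] = number of HCNE-covered bases in [0, i).
    prefix = [0] * (chrom_length + 1)
    depth = 0
    for i in range(chrom_length):
        depth += delta[i]
        prefix[i + 1] = prefix[i] + (1 if depth > 0 else 0)

    densities = []
    for start in range(0, chrom_length - window + 1, step):
        covered = prefix[start + window] - prefix[start]
        densities.append((start, covered / window))
    return densities
```

Under this sketch, a window would count as HCNE-dense in the sense of panel C when its density is at least 0.01, that is, 400 bp out of 40 kb.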
Figure 3.
Associations between core promoter types and gene functions. (A) Bars show the fraction of genes in each core promoter category that are annotated with indicated GO terms. All GO terms shown are significantly associated with a core promoter category at Bonferroni-adjusted P < 0.01 (see also Supplemental Table S3). (B) Violin plots (boxplots with added kernel density curves) show distributions of Pearson correlation coefficients for expression correlations between randomly selected gene pairs taken from pairs of core promoter categories indicated by colored rectangles below the plots. High correlations are frequent between genes with DRE core promoters and genes with Motif 1/6 core promoters, as well as among genes within each of those categories. Each distribution is based on a sample of 1000 randomly selected gene pairs. Genes were not compared against themselves.
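The sampling behind panel B, Pearson correlations for randomly selected gene pairs drawn from two core promoter categories, might look like the following sketch. The data layout (a dict mapping gene name to a list of expression values) and the names `pearson` and `sample_pair_correlations` are assumptions for illustration; the caption does not describe how the expression data were stored or sampled beyond the pair count and the exclusion of self-comparisons.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def sample_pair_correlations(expr, cat_a, cat_b, n_pairs=1000, seed=0):
    """Correlations for n_pairs random gene pairs, one gene from each category.

    A gene is never paired with itself, so the two categories together must
    contain at least two distinct genes.
    """
    rng = random.Random(seed)
    correlations = []
    while len(correlations) < n_pairs:
        gene_a = rng.choice(cat_a)
        gene_b = rng.choice(cat_b)
        if gene_a == gene_b:
            continue  # genes are not compared against themselves
        correlations.append(pearson(expr[gene_a], expr[gene_b]))
    return correlations
```

Passing the same list as `cat_a` and `cat_b` gives the within-category distributions, and passing two different lists gives the between-category ones.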
Figure 4.
HCNE-clusters spanning coregulated genes and boundary agreement among synteny blocks, HCNE clusters, and Polycomb binding regions. Gene models are colored by predicted core promoter type as in Fig. 2. Only selected genes are labeled. (A) The paralogous zinc finger genes elB and noc, implicated in tracheal and appendage development, have different, but partially overlapping, spatial expression patterns during embryonic development (Dorfman et al. 2002) and are coexpressed in larval leg and wing discs (Weihe et al. 2004). Among the five flies, elB and noc are in conserved microsynteny with a tRNA gene and at least three protein-coding genes (underlined), which have no evidence of being functionally related to elB or noc: pburs encodes a subunit of the hormone bursicon required for wing expansion and associated cuticle changes after flies emerge from pupae (Luo et al. 2005); CG3474 is predicted to encode a cuticle component; CG4218 is predicted to encode a protein of unknown function. (B) The paralogous T-box genes H15 and mid are involved in regulation of heart development and have similar spatial expression patterns during embryonic development (Miskolczi-McCallum et al. 2005; Reim et al. 2005). They are in conserved microsynteny with four other genes (underlined) among the five flies: CG12512, predicted to encode a protein with an AMP-dependent synthetase and ligase domain; nompC, encoding a mechanosensory transduction channel (Walker et al. 2000); and two genes of unknown function. The developmental regulators vri (George and Terracol 1997) and tomb (Jiang et al. 2007) are centrally positioned in neighboring synteny blocks. Two transcript isoforms are shown for tkv because it has two major transcription start sites with different types of core promoter predictions.
Figure 5.
The E32 enhancer trap insertion at the Dmel decapentaplegic (dpp) locus. Gene models are colored by predicted core promoter type as in Fig. 2. Two transcript isoforms are shown for dpp because this gene has different core promoter predictions for two of its major transcription start sites (other dpp start sites are not shown, see St Johnston et al. 1990). The HCNE-spanned gene desert downstream from dpp contains several conserved enhancers that specify dpp expression in imaginal discs (Merli et al. 1996). Although the neighboring, divergently transcribed genes SLY1 homologous (Slh) and out at first (oaf) are insensitive to the array of dpp enhancers and have different expression patterns, the enhancer trap insertion E32, inserted into the 5′-untranslated region of oaf (arrow), reproduces part of the dpp expression pattern in imaginal discs (Merli et al. 1996). Five other genes (underlined) are in conserved microsynteny with dpp, Slh, and oaf among the investigated flies.

References

    1. Bailey P.J., Klos J.M., Andersson E., Karlen M., Kallstrom M., Ponjavic J., Muhr J., Lenhard B., Sandelin A., Ericson J. A global genomic transcriptional code associated with CNS-expressed genes. Exp. Cell Res. 2006;312:3108–3119.
    2. Bejerano G., Pheasant M., Makunin I., Stephen S., Kent W.J., Mattick J.S., Haussler D. Ultraconserved elements in the human genome. Science. 2004;304:1321–1325.
    3. Boyer L.A., Plath K., Zeitlinger J., Brambrink T., Medeiros L.A., Lee T.I., Levine S.S., Wernig M., Tajonar A., Ray M.K., et al. Polycomb complexes repress developmental regulators in murine embryonic stem cells. Nature. 2006;441:349–353.
    4. Brudno M., Malde S., Poliakov A., Do C.B., Couronne O., Dubchak I., Batzoglou S. Glocal alignment: Finding rearrangements during alignment. Bioinformatics. 2003;19:i54–i62.
    5. Butler J.E., Kadonaga J.T. Enhancer-promoter specificity mediated by DPE or TATA core promoter motifs. Genes & Dev. 2001;15:2515–2519.
