Genetics. 2013 Aug;194(4):903-26.
doi: 10.1534/genetics.113.152546. Epub 2013 Jun 7.

Probing the boundaries of orthology: the unanticipated rapid evolution of Drosophila centrosomin

Robert C Eisman et al. Genetics. 2013 Aug.

Abstract

The rapid evolution of essential developmental genes and their protein products is both intriguing and problematic. The rapid evolution of gene products with simple protein folds and a lack of well-characterized functional domains typically results in a low discovery rate of orthologous genes. Additionally, in the absence of orthologs, it is difficult to study the processes and mechanisms underlying rapid evolution. In this study, we have investigated the rapid evolution of centrosomin (cnn), an essential gene encoding centrosomal protein isoforms required during syncytial development in Drosophila melanogaster. Until recently, the rapid divergence of cnn made identification of orthologs difficult and questionable because Cnn violates many of the assumptions underlying models for protein evolution. To overcome these limitations, we have identified a group of insect orthologs and present conserved features likely to be required for the functions attributed to cnn in D. melanogaster. We also show that the rapid divergence of Cnn isoforms is apparently due to frequent coding sequence indels and an accelerated rate of intronic additions and eliminations. These changes appear to be buffered by multi-exon and multi-reading frame maximum potential ORFs, simple protein folds, and the splicing machinery. These buffering features also occur in other genes in Drosophila and may help prevent potentially deleterious mutations due to indels in genes with large coding exons and exon-dense regions separated by small introns. This work promises to be useful for future investigations of cnn and potentially other rapidly evolving genes and proteins.

Keywords: Cnn; Drosophila; centrosome; indels; rapid evolution.

Figures

Figure 1
The rapid divergence of cnn in Drosophila. (Top) The coding sequence of the centrosomin gene evolves rapidly within the genus Drosophila as shown by the weak hybridization of labeled cnn-RA transcripts with genomic DNA on two Southern blots. The first seven lanes are species in the subgenus Sophophora, and the strongest signals are members of the melanogaster group. The last two lanes are species in the subgenus Drosophila, representing the extent of the phylogenetic range producing detectable signal. (Bottom) The Cnn-PA protein also evolves rapidly, as evidenced by decreased reactivity of the anti-Cnn antibody in immunostained embryos. Cnn staining (green) is strong in Dmel, Dsim, Dere, and Dyak, all members of the melanogaster subgroup, but is weak in Dpse and almost undetectable in Dvir using the same antibody concentrations. The staining in both Dpse and Dvir can be improved by increasing the concentration of the Cnn antibody. Microtubules are shown in red and DNA in blue. Dmel: D. melanogaster; Dsim: D. simulans; Dere: D. erecta; Damb: D. ambigua; Dwil: D. willistoni; Dsal: D. saltans; Dana: D. ananassae; Dmer: D. mercatorum; Dvir: D. virilis; Dyak: D. yakuba; Dpse: D. pseudoobscura.
Figure 2
Conserved motifs and structure of Cnn-PA in Drosophila. Protein alignments of the KFC, SESAW, and SPD motifs and the carboxy terminus are nearly invariant within the genus Drosophila and are present in all long-form splice variants of Cnn. Their position in Cnn-PA is shown aligned to the secondary structure of the protein. Coiled-coil domains are represented as spirals and noncoiled α-helical regions are solid. Dbip: D. biarmipes; Dtak: D. takahashii; Dfic: D. ficusphila; Dele: D. elegans; Dbip: D. bipectinata; Dmoj: D. mojavensis; Dgri: D. grimshawi.
Figure 3
Drosophila MPOs confound gene-modeling programs. Splicing predictions for cnn transcripts are difficult due to a lack of strong splicing signals, multi-exon MPOs, and MPOs that extend well beyond the ends of known splice sites. A comparison of MPOs in all Drosophila species in this study (boxed area) showing all MPOs found in cnn reveals that MPOs are variable between species. At the top are all the MPOs in D. melanogaster cnn; the multi-colored boxes are the exons encoding Cnn-PA (below the gray boxes), and the gray and black boxes are coding exons present in other cnn splice variants. The MPO maps show that the transcriptional complexity of D. melanogaster cnn is probably conserved within the genus.
Figure 4
Intronic indels are common and random in Drosophila cnn. Cnn-PA coding exons (color coded as in Fig. 3) span the entire cnn gene. When maps are to scale and aligned at the conserved carboxy terminus, intronic indels are obvious. Because of frequent and random indels, the exon positions are variable as is the size of cnn across the genus. Gene sizes are shown on the left.
Figure 5
Cnn-PA motifs and structure are conserved within insects. Similar to Figure 2, we show the conserved motifs and carboxy terminus aligned to the structure of Cnn-PA from A. mellifera. Because some databases and the literature assert that the vertebrate genes CDK5RAP2 and mmg are orthologous to cnn, we have included the human co-orthologs in alignments. The human genes have a significantly different structure (not shown), the carboxy terminus has no significant homology to insects, and the terminus is more divergent between human paralogs than it is across the insects (bottom). The SESAW and SPD motifs are not detectable in any vertebrate gene. The Cnn motifs have diverged between orders but are highly conserved within orders. The carboxy terminus (CTERM) is the most divergent of the conserved motifs as shown in alignments between the Hymenoptera (top) and D. melanogaster (top, top line), but is highly conserved within the Hymenoptera and Lepidoptera (bottom). Agam: A. gambiae; Aegy: A. aegypti; Amel: A. mellifera; Bter: B. terrestris; Bimp: B. impatiens; Nvit: N. vitripennis; Cflo: C. floridanus; Aech: A. echinatior; Sinv: S. invicta; Hsal: H. saltator; Bmor: B. mori; Dple: D. plexippus; Hsap: Homo sapiens.
Figure 6
cnn intron–exon structure changes significantly between insect orders. A comparison of Cnn-PA coding exons relative to D. melanogaster in the Diptera (A), Hymenoptera (B), and Lepidoptera and one coleopteran (C) shows that multiple exon fusions and splitting events have occurred in cnn orthologs. Coding exons are color-coded as in Figure 3 to indicate sequence homology. The gray boxes in T. castaneum have no homology to any other Cnn proteins, but the motifs and structure are conserved, and the lepidopterans have no exons homologous to D. melanogaster exon 4. Although homologous exons are split in the hymenopterans and D. plexippus, the overall arrangement of coding exons is similar to Drosophila. As in Drosophila, gene size (left) is variable.
Figure 7
Sequence divergence and frequent indels are associated with the rapid evolution of Cnn-PA in the Diptera. Pair-wise comparisons of Cnn-PA proteins from Drosophila and mosquitoes (gray boxes) show that the percentage of identity of the protein decreases (section below diagonal solid boxes) and indels accumulate (section above diagonal solid boxes) over relatively short periods of time. A neighbor-joining tree of these Cnn-PA proteins showing the distance based on the number of differences with gaps distributed proportionally shows that the molecular phylogeny for cnn is consistent with the accepted organismal phylogeny for these species. Cnn-PA protein size is shown below species name (top row) and haploid genome sizes are in parentheses (left column).
Figure 8
The divergence of Cnn-PA coiled-coil domains in dipterans. A schematic representation of the divergence of dipteran Cnn-PA conserved motifs (black bars) and the coiled-coil domains (shaded bars) shows the average percentage of identity and the number of gaps needed for alignment across each region. The least-conserved motif is the carboxy terminus. The divergence graph is above structural models for (top pair) D. melanogaster and D. virilis, (middle pair) A. gambiae and A. aegypti, and D. melanogaster and A. gambiae. Arrowheads indicate exon splice sites for each species.
Figure 9
A lower limit for the divergence and number of indels in Cnn-PA orthologs. A table similar to that in Figure 7, showing the divergence of Cnn-PA in the other insects in this study, reveals a trend similar to that in the dipterans. Comparisons between orders suggest that there is a lower limit for sequence divergence and an upper limit for the number of indels tolerated in Cnn-PA.
Figure 10
The divergence of Cnn-PA is consistent in all insects. A schematic representation of the divergence of Cnn-PA, similar to that in Figure 8, showing comparisons between (top left) A. mellifera (top) and N. vitripennis (bottom); (top right) B. mori (top) and D. plexippus (bottom); (bottom left) D. melanogaster (top) and A. mellifera (bottom); and (bottom right) D. melanogaster (top) and B. mori (bottom). The rate of change in the Hymenoptera and Lepidoptera is similar to the rate in the Diptera. The comparisons with D. melanogaster show the lower limit of divergence for Cnn-PA across this phylogenetic range. Arrowheads indicate exon splice sites for each species.
Figure 11
Cnn-PA allelic variation within a D. melanogaster population. The 23 amino acid substitutions present within a single population of D. melanogaster are mapped along the structural model of Cnn-PA, showing the amino acid substitution and position above the model. Ten of the 23 amino acid substitutions are not chemically similar residues. In addition to the wild-type protein, unique combinations of one to four of these substitutions produce 29 allelic variants of Cnn-PA in this population. Brackets above the model show the positions of the conserved Cnn motifs, and the positions of the exon splice junctions are indicated by solid triangles below the model.
Figure 12
Splicing changes and nontriplet indels create new codons in Cnn-PA orthologs. (A) Rapid changes in splicing of the intron between exon 6 (green) and exon 7 (red) in D. melanogaster, D. biarmipes, and D. pseudoobscura due to indels have shortened the intron in D. biarmipes and moved the intron in D. pseudoobscura. The exon and MPO (light red) maps (top) are aligned to the SPD motif (left) and at the start of the carboxy terminus (right), showing the effect of these changes on exon position. A single substitution in D. biarmipes has introduced a stop codon (boxed, middle), changing the MPO. The aligned protein sequences (bottom), with the two splice sites marked (arrowheads), show the rapid divergence associated with these changes, which are typical of existing splice sites in orthologs and at sites of exon fusion. The evidence suggests that these changes are due to nontriplet indels and new codon usage in coding sequence and relaxed splicing. (B) An alignment of this same region from the North Carolina lines shows that multiple lines have begun to accumulate nontriplet indels (top). These changes do not change splicing, but they do reveal the buffering capacity of MPOs (light red boxes, bottom) and show how comparisons between MPOs are useful for the detection of very small significant changes in sequence. (C) A schematic (top) comparison of the MPOs (light red) for exon 4c (blue) and exon 5 (brown) between A. mellifera and N. vitripennis suggests that a change has occurred. Comparisons of the aligned introns (top) show a large deletion, and three different nontriplet indels in N. vitripennis have significantly changed the splicing. Alignments of the coding sequence (middle, boxed) and translated peptide encoded by exon 4c show that either nontriplet indels have generated new codons or, for some unknown reason, the nucleotide substitution rate is accelerated in this exon.
Figure 13
Minimal divergence between proteins perturbs Cnn-PA function during development. The ectopic expression of a D. melanogaster EGFP::Cnn-PA fusion protein in D. melanogaster, D. simulans, D. yakuba, and D. pseudoobscura during syncytial development shows that minimal divergence between proteins expressed in a heterospecific embryo disrupts normal cleavage. Live imaging of EGFP (A–D) shows that the tagged protein localizes to the centrosome, but in D. yakuba the protein is less punctate and more diffuse and in D. pseudoobscura the amount of tagged protein at centrosomes is reduced and variable. Immunostaining of fixed embryos recognizes both the tagged and native Cnn protein and is similar to what is observed in live animals during prophase (compare I–L to A–D). Although centrosomes are more obvious in D. pseudoobscura in fixed material, there is still significant variability among nuclei (L). During metaphase (Q–T) defects are clear in D. simulans (Q) and D. yakuba (S) as indicated by free centrosomes (white arrowheads). Either centrosome replication is precocious or centrosomes are split at many poles in D. yakuba. The phenotype seen in D. pseudoobscura is interesting, as the spindles are similar to spindles in cnn loss-of-function mutant embryos. However, in D. pseudoobscura (T), while Cnn is at centrosomes, spindle poles have multiple centrosomes and many spindles are multipolar. These results show that <10% divergence and a single indel between two coexpressed Cnn proteins are sufficient to significantly perturb normal function. Control animals not expressing the EGFP-tagged Cnn all appear to be normal throughout the cell cycle (prophase: E–H; metaphase: M–P). Fixed embryos (E–T) are stained with anti-Cnn (green) and anti-β-tubulin (red), and DNA is stained with TOTO3 (blue).
