PLoS One. 2009 Dec 17;4(12):e8345.
doi: 10.1371/journal.pone.0008345.

Molecular evolution of Drosophila cuticular protein genes

R Scott Cornman. PLoS One. 2009.

Abstract

Several multigene families have been described that together encode scores of structural cuticular proteins in Drosophila, although the functional significance of this diversity remains to be explored. Here I investigate the evolutionary histories of several multigene families (CPR, Tweedle, CPLCG, and CPF/CPFL) that vary in age, size, and sequence complexity, using sequenced Drosophila genomes and mosquito outgroups. My objective is to describe the rates and mechanisms of 'cuticle-ome' divergence, in order to identify conserved and rapidly evolving elements. I also investigate potential examples of interlocus gene conversion and concerted evolution within these families during Drosophila evolution. The absolute rate of change in gene number (per million years) is an order of magnitude lower for cuticular protein families within Drosophila than it is among Drosophila and the two mosquito taxa, implying that major transitions in the cuticle proteome have occurred at higher taxonomic levels. Several hotspots of intergenic conversion and/or gene turnover were identified, e.g. some gene pairs have independently undergone intergenic conversion within different lineages. Some gene conversion hotspots were characterized by conversion tracts initiating near nucleotide repeats within coding regions, and similar repeats were found within concertedly evolving cuticular protein genes in Anopheles gambiae. Rates of amino-acid substitution were generally severalfold higher along the branch connecting the Sophophora and Drosophila species groups, and 13 genes have Ka/Ks significantly greater than one along this branch, indicating adaptive divergence. Insect cuticular proteins appear to be a source of adaptive evolution within genera and, at higher taxonomic levels, subject to periods of gene-family expansion and contraction followed by quiescence. 
However, this relative stasis is belied by hotspots of molecular evolution, particularly concerted evolution, during the diversification of Drosophila. The prominent association between interlocus gene conversion and repeats within the coding sequence of interacting genes suggests that the latter promote strand exchange.
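
The claim that Ka/Ks is "significantly greater than one" on a branch can be illustrated with a likelihood-ratio comparison. The abstract does not say which statistical test was used, so this is only a minimal sketch of one common approach. It assumes two log-likelihoods are already available from a codon-model fit: one with the ω of the branch of interest estimated freely, and one with that ω fixed at 1. The function name, the 0.05 threshold, and the numeric inputs are hypothetical and are not taken from the paper.

```python
import math


def lrt_omega_gt_one(lnl_free, lnl_fixed, omega_hat, alpha=0.05):
    """Likelihood-ratio test of omega = 1 against a freely estimated omega.

    lnl_free  -- log-likelihood with the branch omega estimated (assumed input)
    lnl_fixed -- log-likelihood with the branch omega fixed at 1 (assumed input)
    omega_hat -- the freely estimated omega for that branch
    alpha     -- significance threshold (an assumption, not from the paper)
    Returns (statistic, p_value, supports_omega_gt_one).
    """
    # Twice the log-likelihood difference; clamp tiny negative values to zero.
    stat = max(0.0, 2.0 * (lnl_free - lnl_fixed))
    # Survival function of a chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    supports = omega_hat > 1.0 and p_value < alpha
    return stat, p_value, supports


# Hypothetical values, for illustration only.
stat, p_value, supports = lrt_omega_gt_one(
    lnl_free=-1234.5, lnl_fixed=-1237.5, omega_hat=2.4
)
print(f"2*dlnL = {stat:.2f}, p = {p_value:.4f}, omega > 1 supported: {supports}")
```

The check on `omega_hat > 1.0` matters because a significant departure from ω = 1 can also arise when ω is well below one, which would indicate constraint rather than adaptive divergence.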

Conflict of interest statement

Competing Interests: The author has declared that no competing interests exist.

Figures

Figure 1. Schematic representing the estimation of ω, the ratio of nonsynonymous to synonymous substitutions, along branches of the Drosophila phylogeny.
A. Based on initial pairwise calculations of Ka/Ks, three distinct ω parameters were estimated for each set of orthologous genes. One ω was estimated for the Sophophora species group, one for the Drosophila species group, and a third ω was estimated for the branch connecting the two species groups. The branches of the tree labeled with each ω class are indicated by colored boxes. B. To test whether the estimated ω for the branch between the Sophophora and Drosophila groups is significantly greater than 1 for a particular set of orthologous genes, ω is recalculated for that branch with all other branches assigned to a single background ω class.
Figure 2. The array of CPR cuticular protein genes located approximately at band 44C of D. melanogaster chromosome 2R and the orthologous regions in six other Drosophila species.
A. Schematic of the organization of genes in the array, with colored boxes matching colored symbols in the phylogeny according to the legend at left. Names at top are of D. melanogaster genes; plus and minus symbols indicate relative orientation. Genes with dark outlines are predicted to be intronless. B. Neighbor-joining phylogeny (see Methods) of predicted amino acid sequence with bootstrap support indicated.
Figure 3. The array of CPR cuticular protein genes located approximately at band 65A of D. melanogaster chromosome 3L and the orthologous regions in six other Drosophila species.
A. Schematic of the organization of genes in the array, with colored boxes matching colored symbols in the phylogeny according to the legend at left. Numbered positions in the array correspond to numbered clades in the phylogeny of part B. Names at top are of D. melanogaster genes; plus and minus symbols indicate relative orientation. B. Neighbor-joining phylogeny (see Methods) of predicted amino acid sequence. Arrow indicates the clade within which genes cluster as paralogs rather than as orthologs (see text for details). Bootstrap support is not shown for clarity, but is greater than 90% for all numbered groups and generally low outside these groups.
Figure 4. Gene organization and polymorphism patterns within the 67F array of Drosophila CPR genes.
A. Gene organization in the 12 Drosophila genomes. The arrow indicates the melanogaster species group. Gray boxes indicate highly similar paralogs. B. Number of polymorphic sites within a sliding window of 50 bases across aligned Cpr67Fa1 and Cpr67Fa2 alleles (step of 25 bases). Polymorphism between paralogs is much lower than between orthologs across the entire coding region. The part of the X axis corresponding to the R&R Consensus is shaded.
Figure 5. Gene organization and divergence patterns within the 84A array of Drosophila CPR genes.
Note that genes in this array have the historical names Edg84A and Ccp84Aa-g and thus diverge from nomenclature for other arrays. A. Gene organization in ten Drosophila genomes. D. persimilis and D. pseudoobscura were excluded due to the presence of gaps in the genome sequence. Pairs of genes that cluster by species rather than with orthologs are shaded corresponding shades of gray. B. Graph of the number of polymorphic sites within a sliding window of 50 bases between aligned Ccp84Aa and Ccp84Ab alleles (step of 25 bases). Polymorphism is much lower between paralogs than between orthologs in the central region of the gene, which includes the R&R Consensus, but tends to increase at the 5′ and 3′ ends of coding sequence, approaching the level of orthologs. The part of the X axis corresponding to the R&R Consensus is shaded.
Figure 6. Graph of the number of polymorphic sites within a sliding window of 50 bases between aligned Ccp84Ad and Ccp84Ad′/Cpr5C alleles (step of 25 bases).
Note that Cpr5C is the presumed ortholog of Ccp84Ad′ based on amino-acid sequence phylogeny, but occurs on the X chromosome (see text). Polymorphism is lower between paralogs than between orthologs for all comparisons. However, the level of paralog polymorphism is lowest for those species (solid lines) that have Ccp84Ad′. Species (dotted lines) that lack Ccp84Ad′ but instead have Cpr5C show intermediate levels of divergence. The part of the X axis corresponding to the R&R Consensus is shaded.
Figure 7. Dot plots of the 84A array in two Drosophila species that illustrate sequence repetition (indicated by red circles) within the coding regions of those genes evolving concertedly.
Independently evolving genes in the array lack this sequence repetition. Dot plots of this region for the other ten Drosophila species are shown in Text S4.
Figure 8. Alignment of Ccp84Aa and Ccp84Ab sequence identified as repetitive in dot plots.
Nucleotide sequence was aligned with ClustalW and then trimmed around the most conserved repeat unit, although genes typically have more than two such units at varying degrees of conservation. Brackets indicate two copies of a repeated sequence that is well conserved among all species. Other repeats can be seen that are found in only a subset of species.
Figure 9. Concerted evolution of Drosophila Tweedle genes within an array at 97C.
A. Neighbor-joining phylogeny of predicted proteins. B. Dot plot of array.
Figure 10. Dot plots of the co-orthologous regions of the D. melanogaster 97C Tweedle array from D. pseudoobscura and D. willistoni, with the region around Dwil5471 and Dwil6460 enlarged.
Figure 11. Dot plots of concertedly evolving cuticular protein genes of An. gambiae. (see , for details).
Additional dot plots are shown in Text S4.
Figure 12. Graphs showing the range of ω estimates for orthologous ‘single-copy’ genes of each cuticular protein gene family, as described in the text.
CPR genes occurring in tandem arrays or as isolated genes (‘singletons’) are shown separately. For each gene, the values in the Sophophora (S) and the Drosophila (D) species groups are shown separately, connected by a line to more clearly illustrate the trends among genes. The red lines indicate the mean values.
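
The sliding-window counts described for Figures 4 to 6 (windows of 50 bases, step of 25 bases) can be sketched as follows. This is an illustrative reconstruction, not the author's code. It assumes the input is a list of equal-length aligned nucleotide strings, that a site counts as polymorphic when its column holds more than one distinct non-gap character, and that gaps are written as "-". The function name and the toy sequences are hypothetical.

```python
def polymorphic_sites_per_window(alignment, window=50, step=25):
    """Count polymorphic columns in each sliding window of an alignment.

    alignment -- list of aligned sequences of equal length (gaps as '-')
    window    -- window width in bases (50 in the figure captions)
    step      -- offset between successive windows (25 in the figure captions)
    Returns a list of (window_start, polymorphic_site_count) tuples.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    # Flag each column: 1 if more than one distinct non-gap base, else 0.
    # Ignoring gap characters is an assumption made for this sketch.
    is_polymorphic = [
        1 if len({seq[i].upper() for seq in alignment} - {"-"}) > 1 else 0
        for i in range(length)
    ]

    return [
        (start, sum(is_polymorphic[start:start + window]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment of two 100-base sequences differing at positions 10 and 60.
seq_a = "A" * 100
seq_b = "A" * 10 + "T" + "A" * 49 + "G" + "A" * 39
counts = polymorphic_sites_per_window([seq_a, seq_b])
print(counts)
```

Running the same function once on a pair of paralogs and once on a pair of orthologs, then plotting both series against window start, would give the kind of comparison the captions describe.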

Cited by

  • A Complex Lens for a Complex Eye.
    Stahl AL, Baucom RS, Cook TA, Buschbeck EK. Integr Comp Biol. 2017 Nov 1;57(5):1071-1081. doi: 10.1093/icb/icx116. PMID: 28992245. Free PMC article.
  • The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species.
    Papanicolaou A, Schetelig MF, Arensburger P, Atkinson PW, Benoit JB, Bourtzis K, Castañera P, Cavanaugh JP, Chao H, Childers C, Curril I, Dinh H, Doddapaneni H, Dolan A, Dugan S, Friedrich M, Gasperi G, Geib S, Georgakilas G, Gibbs RA, Giers SD, Gomulski LM, González-Guzmán M, Guillem-Amat A, Han Y, Hatzigeorgiou AG, Hernández-Crespo P, Hughes DS, Jones JW, Karagkouni D, Koskinioti P, Lee SL, Malacrida AR, Manni M, Mathiopoulos K, Meccariello A, Munoz-Torres M, Murali SC, Murphy TD, Muzny DM, Oberhofer G, Ortego F, Paraskevopoulou MD, Poelchau M, Qu J, Reczko M, Robertson HM, Rosendale AJ, Rosselot AE, Saccone G, Salvemini M, Savini G, Schreiner P, Scolari F, Siciliano P, Sim SB, Tsiamis G, Ureña E, Vlachos IS, Werren JH, Wimmer EA, Worley KC, Zacharopoulou A, Richards S, Handler AM. Genome Biol. 2016 Sep 22;17(1):192. doi: 10.1186/s13059-016-1049-2. PMID: 27659211. Free PMC article.
  • The CPCFC cuticular protein family: Anatomical and cuticular locations in Anopheles gambiae and distribution throughout Pancrustacea.
    Vannini L, Bowen JH, Reed TW, Willis JH. Insect Biochem Mol Biol. 2015 Oct;65:57-67. doi: 10.1016/j.ibmb.2015.07.002. Epub 2015 Jul 8. PMID: 26164413. Free PMC article.
  • Mosquito genomics. Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes.
    Neafsey DE, et al. Science. 2015 Jan 2;347(6217):1258522. doi: 10.1126/science.1258522. Epub 2014 Nov 27. PMID: 25554792. Free PMC article.
  • A novel chitin binding crayfish molar tooth protein with elasticity properties.
    Tynyakov J, Bentov S, Abehsera S, Khalaila I, Manor R, Katzir Abilevich L, Weil S, Aflalo ED, Sagi A. PLoS One. 2015 May 26;10(5):e0127871. doi: 10.1371/journal.pone.0127871. eCollection 2015. PMID: 26010981. Free PMC article.
