PLoS Genet. 2005 Oct;1(4):e44.
doi: 10.1371/journal.pgen.0010044.

Positive selection of Iris, a retroviral envelope-derived host gene in Drosophila melanogaster

Harmit S Malik et al. PLoS Genet. 2005 Oct.

Abstract

Eukaryotic genomes can usurp enzymatic functions encoded by mobile elements for their own use. A particularly interesting kind of acquisition involves the domestication of retroviral envelope genes, which confer infectious membrane-fusion ability to retroviruses. So far, these examples have been limited to vertebrate genomes, including primates where the domesticated envelope is under purifying selection to assist placental function. Here, we show that in Drosophila genomes, a previously unannotated gene (CG4715, renamed Iris) was domesticated from a novel, active Kanga lineage of insect retroviruses at least 25 million years ago, and has since been maintained as a host gene that is expressed in all adult tissues. Iris and the envelope genes from Kanga retroviruses are homologous to those found in insect baculoviruses and gypsy and roo insect retroviruses. Two separate envelope domestications from the Kanga and roo retroviruses have taken place, in fruit fly and mosquito genomes, respectively. Whereas retroviral envelopes are proteolytically cleaved into the ligand-interaction and membrane-fusion domains, Iris appears to lack this cleavage site. In the takahashii/suzukii species groups of Drosophila, we find that Iris has tandemly duplicated to give rise to two genes (Iris-A and Iris-B). Iris-B has significantly diverged from the Iris-A lineage, primarily because of the "invention" of an intron de novo in what was previously exonic sequence. Unlike domesticated retroviral envelope genes in mammals, we find that Iris has been subject to strong positive selection between Drosophila species. The rapid, adaptive evolution of Iris is sufficient to unambiguously distinguish the phylogenies of three closely related sibling species of Drosophila (D. simulans, D. sechellia, and D. mauritiana), a discriminative power previously described only for a putative "speciation gene." Iris represents the first instance of a retroviral envelope-derived host gene outside vertebrates. 
It is also the first example of a retroviral envelope gene that has been found to be subject to positive selection following its domestication. The unusual selective pressures acting on Iris suggest that it is an active participant in an ongoing genetic conflict. We propose a model in which Iris has "switched sides," having been recruited by host genomes to combat baculoviruses and retroviruses, which employ homologous envelope genes to mediate infection.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. CG4715 Homologs
(A) Baculoviral and insect retroviral env genes are shown in their respective genomic context. Baculoviruses, represented by Autographa californica nucleopolyhedrovirus (ACNV) and Lymantria dispar nucleopolyhedrosis virus (LDNV), are double-stranded DNA viruses whose genome size is close to 150 kilobases [72], while retroviruses, represented by roo and gypsy, are close to 7 kilobases in length [73]. CG4715 is an open reading frame found in the same genomic context in many species of Drosophila. CG4715/Iris and its env homologs are shown in black (open reading frame direction shown by arrows) while neighboring genes are shown in gray. Note that the gypsy env is expressed through a spliced message. Kyte-Doolittle hydropathy plots of encoded protein products from CG4715 (B) and the roo env gene (C) are shown. The putative signal peptide (SP) and C-terminal, transmembrane hydrophobic peptide (Tm) are highlighted in bold, while the furin cleavage site in the roo envelope protein is indicated by an arrowhead.
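A Kyte-Doolittle hydropathy plot like those in panels (B) and (C) is a sliding-window average of per-residue hydropathy values. The following is a minimal sketch of that calculation, not the authors' procedure: the window length of 19 residues and the toy peptide are assumptions made for illustration, since the caption does not state the window used.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window along the protein.

    The window length is an assumption; the caption does not give one.
    Non-standard residue codes (for example X) raise KeyError.
    """
    if window < 1 or window > len(protein):
        return []
    scores = [KD[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Hypothetical toy peptide: a hydrophobic stretch flanked by charged residues.
toy = "KKDE" + "LIVLLAVILLIVAALLVIL" + "DEKR"
profile = hydropathy_profile(toy, window=19)
# Index of the window with the highest mean hydropathy.
peak = max(range(len(profile)), key=profile.__getitem__)
```

In this toy case the highest-scoring window is the one that exactly covers the hydrophobic stretch, which is how a signal peptide or transmembrane segment shows up as a peak in such a plot.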
Figure 2. Phylogenetic Analysis of CG4715 Homologs
(A) CG4715 has been preserved in its syntenic location in Drosophila species. In species from the takahashii/suzukii species groups like D. lutescens, an additional paralog, CG4715-B (gray shading), is found in tandem orientation. D. ananassae has an additional transposon insertion in this syntenic location between CG4715 and CG4552, while the genomes of D. mojavensis and D. virilis lack CG4715 orthologs between CG4577 and CG4552. For D. ananassae and D. pseudoobscura, sequence was obtained from genome sequencing data (indicated with an asterisk) and confirmed by sequencing. (B) An “expected” phylogeny of Drosophila species is shown, summarizing results from many genes [30,31]. (C) A neighbor-joining phylogeny of CG4715 orthologs based on C-terminal amino acid sequence is presented. For some species, only the C-terminal sequence was obtained (indicated by a “p” for partial). This phylogeny is largely in agreement with the accepted species phylogeny in (B), indicating that the gene has been transmitted by strict vertical inheritance. Although there is a slight discordance in the phylogenetic placement of D. ananassae, D. eugracilis, and D. auraria, these branches have only low bootstrap support. A second lineage of CG4715 paralogs, CG4715-B, is evident (gray shading) in the takahashii/suzukii species groups.
Figure 3. CG4715/Iris Relationships to Viral Envelope Genes
(A) The central domains of CG4715 and related viral env genes were aligned, and a neighbor-joining phylogenetic tree was constructed. The tree separates the CG4715-env superfamily into four groups: the baculoviruses, the BEL clade retroviruses roo and Kanga, the gypsy-like retroviruses, and host-genome-borne CG4715 orthologs in Drosophila and mosquito genomes. While the tree overall does not provide high resolution to discern the order of divergence of each of the clades, there is very strong phylogenetic resolution (bootstrap support of key nodes shown) to unambiguously group CG4715 orthologs with the Kanga retrovirus lineage, indicating that this lineage of retroviruses is the likely source of the CG4715 lineage. (B) Neighbor-joining phylogeny of selected representatives from the BEL clade of retrotransposons indicates that the Kanga retroviruses from Drosophila genomes form a monophyletic clade (the presumed ancestor of CG4715 is indicated by a yellow oval). Most retrotransposons in the BEL lineage do not possess an env gene (blue lettering), while many elements that do (red) have acquired non-homologous env genes from a different viral source [19,20].
Figure 4. Iris Expression in D. melanogaster
Iris expression through various stages of development was assayed using (A) RT-PCR and (B) Northern blots. Both show that Iris is predominantly expressed in adult females and males. (C) RT-PCR analysis on individually dissected tissues from adult flies shows that Iris is expressed in somatic tissues but that expression is slightly reduced in ovaries and testes. RT-PCR to Karyopherin alpha-3 (αKap3, a ubiquitous nuclear import factor, CG9423) is shown as a control for amounts of template RNA in the RT-PCR reaction, and to show that there is no detectable contamination from genomic DNA.
Figure 5. Complete Alignment of Iris Proteins
An alignment of full-length Iris proteins from various Drosophila species is shown. All invariant residues are shown against a black background (except cysteines, which are highlighted in yellow), while similar residues are shown against a gray background. We did not include the Iris-B lineage here for ease of presentation (these are presented in Figure 6). Several features are conserved, including the signal peptide (predicted cleavage site indicated by arrowheads), C-terminal transmembrane domain (shown as a box), and several invariant cysteine residues (c1 through c6, highlighted in yellow) that are a characteristic feature of Iris and related envelope proteins. Other cysteine residue pairs (1–1 and 2–2, also highlighted in yellow) show co-conservation, i.e., loss of one results in loss of the other.
Figure 6. Iris Paralogs in the takahashii/suzukii Species Groups
(A) An alignment of representative Iris-A and Iris-B proteins from the takahashii/suzukii species groups is shown. Iris-A and Iris-B proteins are highly similar to each other. Notable differences include pairs of cysteine residues that are conserved in the B lineage (indicated with “B”), but not in A. The B lineage also has a shorter cytoplasmic tail and is missing several residues (PLLEK amino acid residues) that are invariant in the A lineage. In addition, an internal segment of the Iris-A protein is lost from the Iris-B protein, by virtue of this genomic sequence becoming an intron (confirmed by RT-PCR). (B and C) Hydropathy plots of representative Iris-A and Iris-B proteins show that the overall architecture of the two proteins is largely unaffected by the differences between the two lineages. (D) A hypothetical model for the origin of the divergent Iris-B gene starts with the tandem gene duplication. A cryptic splice acceptor (SA) site arises by mutation, and can be neutrally maintained. However, the simultaneous occurrence of a splice donor (SD) site activates the SA site and leads to a portion of the coding exon being spliced out from the mature message. If this is deleterious, the SD-SA combination is culled by selection. However, in rare cases, like the Iris-B gene, this could lead to a novel functional gene that is favored by selection. Subsequently, the SD and SA sites are maintained by purifying selection.
Figure 7. Sliding Window dN/dS Analyses of Different Drosophila Iris Genes
We have chosen non-overlapping sets of the Drosophila species to do a pair-wise analysis of dN compared to dS. We present a sliding window analysis (window size 150 base pairs, slide of 50 base pairs) of dS and the dN/dS ratio (y-axis) plotted against nucleotide position (x-axis). Under neutrality, a dN/dS ratio of 1 is expected (dashed lines). We present a comparison of (A) D. melanogaster versus D. simulans, (B) D. yakuba versus D. teissieri, (C) D. erecta versus D. orena, (D) D. paralutea A versus D. lutescens A, and (E) D. paralutea B versus D. prostipennis B. In all these comparisons except (D), at least one window where dN/dS significantly exceeds 1 is seen (indicated by asterisks; significance tested by simulations in the K-estimator program [43]).
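The windowing scheme in this caption (150-base-pair windows advanced in 50-base-pair steps) can be sketched as follows. This is an illustration under stated assumptions, not the K-estimator procedure: it only tallies raw synonymous and nonsynonymous codon differences per window, whereas real dN and dS estimates also normalize by the number of synonymous and nonsynonymous sites and correct for multiple substitutions. The function names are hypothetical, and the sequences are assumed to be aligned, uppercase, and in frame from position 0.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def sliding_windows(length, window=150, step=50):
    """Start and end (0-based, end-exclusive) of every full window."""
    return [(s, s + window) for s in range(0, length - window + 1, step)]


def window_counts(seq_a, seq_b, window=150, step=50):
    """Per window: (start, synonymous codon differences, nonsynonymous ones)."""
    rows = []
    length = min(len(seq_a), len(seq_b))
    for start, end in sliding_windows(length, window, step):
        # A 50 bp step is not a multiple of 3, so begin at the first
        # in-frame codon at or after the window start.
        first = -(-start // 3) * 3
        syn = nonsyn = 0
        for i in range(first, end - 2, 3):
            codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
            aa_a, aa_b = CODON_TABLE.get(codon_a), CODON_TABLE.get(codon_b)
            if aa_a is None or aa_b is None or codon_a == codon_b:
                continue  # skip gaps, ambiguous bases, and identical codons
            if aa_a == aa_b:
                syn += 1
            else:
                nonsyn += 1
        rows.append((start, syn, nonsyn))
    return rows


# Hypothetical 300 bp pair: one synonymous and one nonsynonymous change.
codons = ["TTT"] * 100
seq_a = "".join(codons)
codons[10] = "TTC"  # Phe -> Phe, synonymous
codons[40] = "TTA"  # Phe -> Leu, nonsynonymous
seq_b = "".join(codons)
rows = window_counts(seq_a, seq_b)
```

Because consecutive windows overlap by 100 base pairs, a single substitution contributes to up to three windows, which is why peaks in such plots span several adjacent positions.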
Figure 8. PAML Analyses of Iris Evolution
(A) A free-ratio model for Iris evolution in Drosophila is presented with numbers above branches indicating (whole gene) dN/dS ratios estimated for each individual branch. Only the lineage leading to the sibling species D. mauritiana, D. sechellia, and D. simulans (thick line) has a dN/dS ratio that appears to be greater than 1. When this value of dN/dS = 1.82 was compared against the neutral expectation of 1, the higher value fit the data marginally better (p < 0.08). (B) Individual residues highlighted by PAML analyses as having been subject to recurrent positive selection are shown by inverted triangles. Also schematized are the signal peptide cleavage site (arrowheads) and C-terminal hydrophobic peptide (box). Dark, dashed lines indicate the ten cysteine residues (1–1, 2–2, c1 through c6) highlighted in Figure 5. We note that most of the residues identified at high confidence appear to cluster around the 2–2 pair of cysteine residues, suggesting a functional interaction surface here [46].
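The caption does not say which test produced p < 0.08. One common way to compare a model with a branch dN/dS fixed at 1 against a model where it is estimated freely is a likelihood ratio test with one degree of freedom, sketched below under that assumption. The log-likelihood values are hypothetical placeholders chosen for illustration and are not taken from the paper.

```python
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio statistic and p-value for one extra free parameter.

    lnl_null: log-likelihood with the branch dN/dS fixed at 1.
    lnl_alt:  log-likelihood with the branch dN/dS estimated freely.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods (not from the paper).
stat, p_value = lrt_one_df(lnl_null=-2000.0, lnl_alt=-1998.4)
```

With one degree of freedom, a statistic of about 3.84 corresponds to p = 0.05, so a marginal result such as the one reported would fall somewhat below that threshold.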
Figure 9. Iris Phylogeny in Closely Related Species
(A) Phylogenetic analysis of Iris coding regions from different strains of D. melanogaster, D. simulans, D. sechellia, and D. mauritiana, the latter three species believed to have diverged less than half a million years ago [51]. Based on distance, parsimony or likelihood methods (bootstrap values indicated in ovals), the phylogeny clearly separates the three species. This is largely due to six sites that are “unambiguous” as far as phylogenetic information is concerned, indicated with “!.” An unambiguous site is defined as one in which the same derived nucleotide is found fixed in two of the three species (e.g., D. simulans and D. sechellia), whereas the third species (e.g., D. mauritiana) is fixed for the ancestral nucleotide, corresponding to the out-group, D. melanogaster. (B) Iris is only the second known gene to inform about the phylogeny of the three sibling species D. simulans, D. sechellia, and D. mauritiana with statistical significance. In the Iris phylogeny, D. mauritiana branched earliest, while previously D. sechellia was found to branch earliest. This suggests that the chronology of speciation events among these three species is more complicated than previously suggested [52].
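The definition of an “unambiguous” site given in this caption translates directly into a per-column check on an alignment. The sketch below is one possible reading of that definition; the function names, the input layout (one string of bases per species, one character per strain), and the example columns are assumptions made for illustration.

```python
def fixed_base(column):
    """Return the base if every strain of a species carries it, else None."""
    bases = set(column)
    return bases.pop() if len(bases) == 1 else None


def unambiguous_grouping(ancestral, species_columns):
    """Return the sorted pair of species sharing a fixed derived base, or None.

    ancestral: base carried by the out-group (D. melanogaster in the caption).
    species_columns: dict of species name -> bases at this site, one per strain.
    """
    fixed = {name: fixed_base(col) for name, col in species_columns.items()}
    if len(fixed) != 3 or None in fixed.values():
        return None  # needs exactly three species, each fixed for one base
    derived = [name for name, base in fixed.items() if base != ancestral]
    if len(derived) == 2 and fixed[derived[0]] == fixed[derived[1]]:
        return tuple(sorted(derived))
    return None


# Hypothetical alignment columns (not data from the paper).
informative = unambiguous_grouping(
    "G", {"simulans": "AAA", "sechellia": "AA", "mauritiana": "GG"})
polymorphic = unambiguous_grouping(
    "G", {"simulans": "AG", "sechellia": "AA", "mauritiana": "GG"})
```

Here `informative` names the two species that share the derived base, while `polymorphic` is None because one species is not fixed at that site.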
Figure 10. Two Hypothetical Models to Explain Positive Selection of Iris
(A) Under the first model, Iris has been domesticated for a role other than host defense. As part of this housekeeping function, Iris proteins reside on the cell surface, where they can be recognized as receptors by viral envelope proteins. Variants of Iris that cannot be recognized by the viral envelopes have a selective advantage. (B) A second model considers the possibility that Iris can act as a dominant negative agent that prevents retroviral envelope trimers (red) from mediating infection. In this scenario, viruses encode envelope trimers that can be cleaved into the SU ligand interaction and TM membrane fusion domains. In the absence of Iris, or if Iris lacks the specificity to bind the envelope trimers, the viral envelopes can mediate infection of the target cell. However, if the Iris protein (blue) can bind the viral envelopes and arrest the membrane fusion step, then the host defends against the viral infection. In this scenario, Iris directly acts as a host defense protein. Note that in both scenarios, Iris is predicted to be subject to positive selection (to decrease virus binding in the first model, and to increase virus binding in the second).

References

    1. Nakamura TM, Morin GB, Chapman KB, Weinrich SL, Andrews WH, et al. Telomerase catalytic subunit homologs from fission yeast and human. Science. 1997;277:955–959.
    2. Eickbush TH. Telomerase and retrotransposons: Which came first? Science. 1997;277:911–912.
    3. Pardue ML, DeBaryshe PG. Retrotransposons provide an evolutionarily robust non-telomerase mechanism to maintain telomeres. Annu Rev Genet. 2003;37:485–511.
    4. Levis RW, Ganesan R, Houtchens K, Tolar LA, Sheen FM. Transposons in place of telomeric repeats at a Drosophila telomere. Cell. 1993;75:1083–1093.
    5. Agrawal A, Eastman QM, Schatz DG. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. Nature. 1998;394:744–751.
