Mol Biol Evol. 2010 Oct;27(10):2284-99.
doi: 10.1093/molbev/msq114. Epub 2010 May 5.

Haplotype structure and expression divergence at the Drosophila cellular immune gene eater

Punita Juneja et al. Mol Biol Evol. 2010 Oct.

Abstract

The protein Eater plays an important role in microbial recognition and defensive phagocytosis in Drosophila melanogaster. We sequenced multiple alleles of the eater gene from an African and a North American population of D. melanogaster and found signatures of a partial selective sweep in North America that is localized around the second intron. This pattern is consistent with local adaptation to novel selective pressures during range expansion out of Africa. The North American sample is divided into two predominant haplotype groups, and the putatively selected haplotype is associated with a significantly higher gene expression level, suggesting that gene regulation is a possible target of selection. The eater alleles contain from 22 to 40 repeat units that are characterized by the presence of a cysteine-rich NIM motif. NIM repeats in the structural stalk of the protein exhibit concerted evolution as a function of physical location in the repeat array. Several NIM repeats within eater have previously been implicated in binding to microbial ligands, a function which in principle might subject them to special evolutionary pressures. However, we find no evidence of elevated positive selection on these pathogen-interacting units. Our study presents an instance where gene expression rather than protein structure is thought to drive the adaptive evolution of a pathogen recognition molecule in the immune system.

Figures

FIG. 1.
Gene structure and survey region. Variable number NIM repeat units were excluded when calculating population genetic statistics. The signal sequence, NIM 2, and the transmembrane (TM) domain are interrupted by introns, which are indicated by up-carets with the size of the intron (in base pairs) given above the caret. The numbers below sequence domains indicate their size in base pairs. The forward-pointing arrow indicates the transcriptional start site. The 1765 base pair region immediately upstream of the transcriptional start site is indicated and includes the minimal enhancer region (Tokusumi et al. 2009). The boxes labeled 5′ and 3′ UTR are untranslated regions. “*” indicates NIM repeats that have previously been implicated in microbial binding (Kocks et al. 2005). Dark lines below the gene schematic indicate various survey regions considered in different components of this article. Variable number repeat units are indicated with a “v” subscript. “NIM 8–like” repeats are shown with gray and white diagonal lines, and “alternate” repeats are shown in gray. (NIM 8–like consensus motif: CKPICSxxCENGxCxAPEKCSCNGY, “alternate” consensus motif: CxxVCxxGCKNGFCxAPxKCSCxxxx.) Between 0 and 15 repeats were not sequenced in the interior of the gene (shown with hatched lines).
FIG. 2.
Polymorphic sites for the eater locus. The US population (Ithaca, NY) is divided into “A”- and “B”-type haplotypes based on sequence between base pairs 390 and 488 (highlighted in gray). “A” haplotypes are above the dotted line and “B” haplotypes are below. Nonsynonymous (N) and synonymous polymorphisms (S) in coding regions are indicated. Stop codons were found segregating in two individuals from Zimbabwe (boxed). CF2-II motif polymorphisms are shown (▾; see fig. 5). Variable number repeat units between NIM 8 and NIM 9 could not be aligned with confidence and are not shown, but the approximate length of that region is shown in base pairs (VN). Sites with alignment gaps were considered if there was a polymorphism. Base pair position within the gene corresponds with Figure 1 with the first position being the transcriptional start site and the 5′ upstream region indicated as −1765 to −1.
FIG. 3.
Linkage disequilibrium (r²) plotted across the concatenated gene region. Each pixel represents r² plotted between a pair of segregating sites. Exons are shown with black boxes, with the transcriptional start site indicated with an arrow. Introns are shown as lines between the exons. The black triangle indicates the block of high linkage disequilibrium in the second intron of the US (Ithaca, NY) population and is indicated in the Zimbabwe population for comparison.
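As an illustration of the statistic plotted in this figure, the squared correlation between a pair of biallelic segregating sites can be computed from phased haplotypes roughly as follows. This is a minimal sketch, not the authors' pipeline: the function name `r_squared` and the input layout (one allele character per haplotype at each site) are assumptions made for the example, and the record does not say how gaps or multiallelic sites were handled.

```python
def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic sites.

    site_a and site_b are equal-length sequences holding the allele carried
    by each phased haplotype at the two sites (hypothetical input layout).
    """
    n = len(site_a)
    if n == 0 or n != len(site_b):
        raise ValueError("sites must be non-empty and of equal length")

    # Use the allele on the first haplotype as the reference allele at each site.
    ref_a, ref_b = site_a[0], site_b[0]
    p_a = sum(x == ref_a for x in site_a) / n
    p_b = sum(y == ref_b for y in site_b) / n
    p_ab = sum(x == ref_a and y == ref_b for x, y in zip(site_a, site_b)) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")

    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium, D
    return d * d / denom
```

Calling `r_squared` on every pair of segregating sites would give the matrix of values that a plot of this kind displays, one pixel per pair.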
FIG. 4.
Plot of nucleotide diversity in the North American population by haplotype group. Nucleotide diversity is plotted for sliding windows with a window length of 200 sites and a step size of 75 sites. A schematic of the gene is shown below the graph with exons indicated as black boxes and the transcription start site indicated with an arrow. A spike in apparent diversity is seen over intron 2, where two divergent haplotypes (groups “A” and “B”) are segregating. There is no excess diversity within either haplotype.
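The sliding-window calculation described in this caption (window length of 200 sites, step size of 75 sites) can be sketched as below. This is an assumption-laden illustration rather than the authors' code: the function names are hypothetical, sequences are assumed to be pre-aligned strings of equal length, "sites" is taken to mean alignment columns, and gaps are treated like any other character.

```python
from itertools import combinations


def pairwise_pi(seqs):
    """Average pairwise differences per site among aligned sequences."""
    total_diffs = 0
    n_pairs = 0
    for s, t in combinations(seqs, 2):
        total_diffs += sum(a != b for a, b in zip(s, t))
        n_pairs += 1
    if n_pairs == 0 or len(seqs[0]) == 0:
        raise ValueError("need at least two non-empty sequences")
    return total_diffs / (n_pairs * len(seqs[0]))


def sliding_pi(seqs, window=200, step=75):
    """Nucleotide diversity in overlapping windows along an alignment.

    Returns a list of (window_start, pi) tuples. The defaults match the
    window length and step size quoted in the caption.
    """
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, pairwise_pi(chunk)))
    return results
```

Running `sliding_pi` separately on the "A" and "B" haplotype groups, and once on the pooled sample, would give the kind of per-group comparison the figure describes.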
FIG. 5.
Polymorphisms in putative CF2-II transcription factor recognition motifs in positions within the second intron of North American haplotypes “A” and “B.” Motif GTATATATA is considered a perfect match. The score indicates how well the input sequence matches the motif. The position within the gene region is indicated above each nucleotide. Haplotype group “A” has four high score matches to the CF2-II motif.
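To make the idea of scoring a sequence against the GTATATATA motif concrete, the sketch below slides the motif along a sequence and reports the fraction of matching positions in each window. The scoring scheme here is an assumption chosen for simplicity; the caption does not specify how the score was computed, and motif-search tools typically use a position weight matrix instead. The function name `scan_motif` and the `min_score` parameter are hypothetical.

```python
CF2_II_MOTIF = "GTATATATA"  # perfect-match motif quoted in the caption


def scan_motif(seq, motif=CF2_II_MOTIF, min_score=0.0):
    """Return (position, window, score) for each window scoring >= min_score.

    The score is the fraction of positions identical to the motif, so a
    perfect match scores 1.0. This is an illustrative identity score, not
    the scoring used in the article.
    """
    seq = seq.upper()
    k = len(motif)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        score = sum(a == b for a, b in zip(window, motif)) / k
        if score >= min_score:
            hits.append((i, window, score))
    return hits
```

Comparing the hits returned for an "A" haplotype and a "B" haplotype at the same positions would show which polymorphisms raise or lower the match to the motif.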
FIG. 6.
Absence of genetic differentiation (RST) between populations in variable number repeat sizes. (a) Box plots of the distribution of sizes of variable number repeat region by population. (b) Pairwise RST values between populations. *P = 0.0287 (not significant after a Bonferroni correction); P > 0.05 for all other pairwise comparisons.
FIG. 7.
Nearest genetic neighbors between NIM repeat units. Genetic distances were calculated between all pairwise combinations of NIM repeats from different individuals. The thickness of the connecting lines and the number on the line indicate the proportion of times that the nearest neighbor of a particular repeat unit was the indicated NIM repeat. Genetic distances were calculated with the Kimura 2-parameter model using an alignment of the 78 base pair NIM consensus motif that is conserved between all repeat units. NIM 1 through NIM 8 all showed the same pattern, so the intervening repeats are not shown (region indicated with dots). Variable number repeat units are shaded (“NIM 8–like” = gray and white stripes and “alternate” = gray). Some variable number repeat units were not sequenced (region indicated with a jagged line).
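For reference, the Kimura 2-parameter distance between two aligned sequences is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites differing by a transition and by a transversion. The sketch below computes it for a single pair of sequences. It is illustrative only: the function name is hypothetical, and skipping columns that contain gaps or ambiguous bases is an assumption, since the caption does not say how such sites were treated.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: ignore columns with gaps or ambiguous bases.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

A nearest-neighbor analysis like the one in the figure would apply such a distance to every pair of repeat units from different individuals and record, for each unit, which repeat position gives the smallest distance.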
