PLoS Genet. 2012;8(12):e1003090.
doi: 10.1371/journal.pgen.1003090. Epub 2012 Dec 20.

Genome-wide fine-scale recombination rate variation in Drosophila melanogaster


Andrew H Chan et al. PLoS Genet. 2012.

Abstract

Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, than a widely used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots, each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than in the North American population. The correlation among various genomic features, including recombination rates, diversity, divergence, GC content, gene content, and sequence quality, is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and diversity.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Comparison of the results of LDhelmet and LDhat for 25 datasets simulated under neutrality.
In each plot, different colors represent the results for different datasets. The left and right columns show the estimated recombination maps of LDhelmet and LDhat, respectively, using the same block penalty of 50. Our method LDhelmet generally produces less noisy estimates than those produced by LDhat. (First Row) Each dataset was simulated with a constant recombination rate of formula image per bp. (Second Row) Each dataset was simulated with a hotspot of width formula image kb starting at location formula image kb. The background recombination rate was formula image per bp, while the hotspot intensity was formula image the background rate, i.e., formula image per bp. The maps are shown in their entirety, including potential edge effects.
Figure 2. LDhelmet results on simulations with realistic variable recombination rates.
In each study, the program MaCS was used to simulate data, with sample size 22, for a 1 Mb region with the variable recombination map shown in red. (We used formula image; output was postprocessed to incorporate an empirical quadra-allelic mutation model.) Estimated recombination maps are shown in blue. The same block penalty of 50 was used in both cases. (A) The average recombination rate for the region is formula image per kb, representative of the interior of the North American X. (B) The average recombination rate for the region is formula image per kb, representative of the interior of the African X.
Figure 3. Comparison of the results of LDhelmet and LDhat for 25 datasets simulated under strong positive selection.
In each plot, different colors represent the results for different datasets. The left and right columns show the estimated recombination maps of LDhelmet and LDhat, respectively, using the same block penalty of 50. In each simulation, the selected site was placed at position formula image kb and the population-scaled selection coefficient was set to formula image. The fixation time of the selected site was formula image coalescent unit in the past. Although the estimated recombination maps are in general noisier than those for the neutral case (cf. Figure 1), LDhelmet is more robust than LDhat. As illustrated in the plots, LDhat produces false inference of elevated recombination rates near the selected site more frequently than does LDhelmet. The same scenarios of recombination patterns as in Figure 1 were considered: (First Row) with a constant recombination rate of formula image per bp, and (Second Row) with a hotspot of width formula image kb starting at location 11.5 kb. The background recombination rate was formula image per bp, while the hotspot intensity was formula image the background rate, i.e., formula image per bp. The maps are shown in their entirety, including potential edge effects.
Figure 4. LDhelmet's estimated fine-scale recombination maps for RAL and RG populations of D. melanogaster.
The North American sample (RAL) comprised 37 genomes, while the African sample (RG) comprised 22 genomes.
Figure 5. Comparison of LDhelmet estimates to the empirical genetic map of Singh et al.
The experimental genetic map of Singh et al. is shown in black with formula image confidence intervals. The LDhelmet estimate for the RAL sample is shown in blue, while the estimate for the RG sample is shown in red. The LDhelmet estimates were converted into units of cM/Mb by normalizing them to have the same total genetic distance as the empirical map for the region. The three maps demonstrate high correlation, especially near the center of the region, where they share the highest peak in the same interval.
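The normalization described in this caption can be sketched as a simple rescaling. The following is a minimal illustration, not the authors' code: it assumes the estimated map is given as a list of (start_bp, end_bp, rate_per_bp) intervals in population-scaled units, and that the total genetic distance of the empirical map for the same region is known in cM. The function name and input layout are hypothetical.

```python
def normalize_to_cm_per_mb(intervals, total_cm):
    """Rescale a piecewise-constant map so its total matches total_cm.

    intervals: list of (start_bp, end_bp, rate_per_bp) tuples (assumed layout).
    total_cm:  total genetic distance of the empirical map for the region, in cM.
    Returns the same intervals with rates expressed in cM/Mb.
    """
    # Total estimated distance in the map's own (population-scaled) units.
    total_est = sum((end - start) * rate for start, end, rate in intervals)
    if total_est <= 0:
        raise ValueError("estimated map has non-positive total distance")
    # One global factor converts estimated units per bp into cM per bp.
    scale = total_cm / total_est
    # Multiply by 1e6 to go from cM per bp to cM per Mb.
    return [(start, end, rate * scale * 1e6) for start, end, rate in intervals]


if __name__ == "__main__":
    example = [(0, 1000, 0.01), (1000, 3000, 0.02)]
    for start, end, rate in normalize_to_cm_per_mb(example, total_cm=0.005):
        print(start, end, round(rate, 6))
```

Because a single global factor is applied, the rescaling preserves the relative shape of the estimated map and only fixes its overall scale, which is why the comparison in the figure is about the location of peaks rather than absolute rates.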
Figure 6. Comparison with FlyBase genetic map.
Plotted for each chromosome arm are the estimated recombination maps using our method and the consensus experimental map hosted at FlyBase. To ease comparison, each map is LOESS-smoothed using a span of 15%. LDhelmet estimates were converted into units of cM/Mb by normalizing them to have the same total genetic distance as the empirical map.
Figure 7. A putative hotspot found by LDhelmet and confirmed by sequenceLDhot.
(Top): Estimated recombination rate for RAL (blue) and RG (red) in a 50 kb region of chromosome 3R, and their respective mean recombination rates in this region (dotted). (Bottom): Evidence of recombination hotspots in the same region, evaluated according to sequenceLDhot. The dotted line shows the likelihood ratio cutoff we used.
Figure 8. Fine-scale recombination maps for the X chromosome subtelomeric region.
The telomere is at the left end of the region. The recombination rate between positions 10 kb and 20 kb is considerably higher than the rate in the subtelomeric region immediately to the right. This trend is much more pronounced in the North American X than in the African X, consistent with a previous study.
Figure 9. Local wavelet power spectrum of recombination rate variation across chromosome arm 2L.
The whole arm is shown on the left, and a detailed view of the central 1 Mb is shown on the right, for RAL and RG. Black contours denote regions of significant power at the 5% level, and the white contour denotes the cone of influence. Color scale is relative to a white noise process with the same variance. Lower panels show estimates of the corresponding recombination maps.
Figure 10. Pairwise correlation of detail wavelet coefficients of RAL and RG recombination maps for chromosome arm 2L.
Black circles denote Kendall's rank correlation between pairs of detail coefficients at each scale. Crosses denote the correlation that would be required for significance at the 1% level in a two-tailed test; red crosses are those scales at which the correlation is in fact significant.
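The per-scale comparison in this caption can be illustrated with a small sketch. It is not the authors' pipeline: it assumes a Haar wavelet basis, two maps sampled on the same grid with a power-of-two length, and the simple tau-a form of Kendall's rank correlation (tied pairs contribute zero). All function names are hypothetical, and significance thresholds are not computed here.

```python
import math


def haar_details(signal):
    """Return Haar detail coefficients, finest scale first."""
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two, at least 2")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail = scaled difference, approximation = scaled sum of each pair.
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def per_scale_correlation(map_a, map_b):
    """Kendall's tau-a between detail coefficients of two maps at each scale.

    Scales with a single coefficient have no pairs and yield None.
    """
    result = []
    for da, db in zip(haar_details(map_a), haar_details(map_b)):
        result.append(kendall_tau_a(da, db) if len(da) >= 2 else None)
    return result


if __name__ == "__main__":
    ral = [4, 1, 3, 7, 2, 9, 8, 0]  # toy values, not real data
    rg = [5, 1, 2, 8, 3, 7, 9, 1]
    print(per_scale_correlation(ral, rg))
```

Correlating detail coefficients scale by scale, rather than the raw maps, is what lets broad-scale agreement and fine-scale disagreement between the two populations be reported separately.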
Figure 11. Wavelet coherence analysis comparing RAL against RG.
(Left): Wavelet coherence of the two maps for chromosome arm 2L. The cone of influence is shown in white. (Right): For each arm, the plot shows the fraction of the genome with significantly high coherence at the 5% level, at each scale.
Figure 12. Global wavelet power spectrum and pairwise correlations of detail wavelet coefficients of RAL and RG data for chromosome arm 2L.
Diagonal plots show the global wavelet power spectrum of each feature of the RAL (blue) and RG (red) data. Off-diagonal plots show Kendall's rank correlation between pairs of detail coefficients at each scale, with respect to the wavelet decomposition of the two indicated features. Crosses denote the correlation that would be required for significance at the 1% level in a two-tailed test; red crosses are those scales at which the correlation is in fact significant. The bottom left and top right plots correspond to RAL and RG, respectively.
Figure 13. Linear model for wavelet transform of recombination map of chromosome arm 2L.
(A) In a linear model for the detail coefficients of the wavelet transform of the recombination map of chromosome arm 2L, covariates are the detail coefficients of wavelet transforms of data quality, gene content, GC content, divergence, and diversity. Shown is the −formula image-value of the regression coefficient at the given scale, as determined by a t-test. Colored boxes indicate significant relationships, with red positive and blue negative. Also shown is the adjusted formula image. (B) As above, but with the recombination map of the other population as an additional covariate.

