PLoS Genet. 2022 Jan 25;18(1):e1009764.
doi: 10.1371/journal.pgen.1009764. eCollection 2022 Jan.

The estimates of effective population size based on linkage disequilibrium are virtually unaffected by natural selection

Irene Novo et al. PLoS Genet.

Abstract

The effective population size (Ne) is a key parameter for quantifying the magnitude of genetic drift and inbreeding, with important implications for human evolution. The increasing availability of high-density genetic markers allows historical changes in Ne to be estimated from measures of genome diversity or of linkage disequilibrium between markers. Directional selection is expected to reduce diversity and Ne, and this reduction is modulated by the heterogeneity of recombination rate across the genome. Here we use computer simulations to investigate the consequences of selection (both positive and negative) and recombination rate heterogeneity for the estimation of historical Ne. We also investigate the relationship between diversity parameters and Ne across different regions of the genome using human marker data. We show that the estimates of historical Ne obtained from linkage disequilibrium between markers (NeLD) are virtually unaffected by selection. In contrast, estimates obtained by coalescence mutation-recombination-based methods can be strongly affected by it, which could have important consequences for the estimation of human demography. The simulation results are supported by the analysis of human data. The estimates of NeLD obtained for particular genomic regions correlate very weakly or not at all with recombination rate, nucleotide diversity, proportion of polymorphic sites, background selection statistic, minor allele frequency of SNPs, number of loss-of-function and missense variants, and gene density. This suggests that NeLD measures mainly reflect demographic changes in population size across generations.
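To make the link between linkage disequilibrium and Ne concrete, the following is a minimal sketch of the classical approximation E[r²] ≈ 1/(1 + 4·Ne·c), usually attributed to Sved, for two markers separated by recombination rate c. It is an illustration only and is not the estimator used in the paper: it ignores sampling error, mutation, and changes in population size over time. The function names are hypothetical.

```python
def expected_r2(ne: float, c: float) -> float:
    """Approximate expected squared correlation between two loci
    at recombination rate c in a population of effective size ne."""
    return 1.0 / (1.0 + 4.0 * ne * c)


def ne_from_r2(r2: float, c: float) -> float:
    """Invert the approximation above to recover Ne from an observed r2.

    Assumes r2 has already been corrected for sampling (not done here).
    """
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must lie in (0, 1]")
    if c <= 0.0:
        raise ValueError("recombination rate c must be positive")
    return (1.0 / r2 - 1.0) / (4.0 * c)


# Round trip with illustrative values (N = 1,000 as in the simulations;
# c = 0.01 lies inside the 0.0025 to 0.0250 window mentioned for NeLD).
r2 = expected_r2(ne=1000.0, c=0.01)
recovered = ne_from_r2(r2, c=0.01)
print(r2, recovered)
```

Because r² decays with 4·Ne·c, pairs of markers at different recombination distances carry information about Ne at different times in the past, which is the general idea behind historical LD-based estimates.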

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Estimates of effective population size from linkage disequilibrium (NeLD, sample size of n = 100 individuals), Relate (NeRelate, n = 100), MSMC (NeMSMC, n = 4), and from nucleotide diversity (π), this latter calculated as N = π/(4μ), where μ is the nucleotide mutation rate, for scenarios with different recombination rates (RR in cM/Mb) uniform across the genome.
Simulations assume a fixed population size of N = 1,000 individuals under neutrality (random mating or partial self-fertilization with a frequency of 50% selfed progeny), background selection and selective sweeps (both for random mating populations), with constant mutation rate μ = 10⁻⁸ per base per generation. Estimates were obtained including windows of recombination rate between pairs of SNPs ranging from c = 0.0025 to 0.0250 for NeLD and averaging historical estimates of Ne between generations 150 to 350 for NeRelate and NeMSMC. Error bars represent one standard error above and below the mean of the simulation replicates. NeLD estimates were obtained pooling all replicates to speed up computation. All simulations were run for up to 100 replicates.
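The diversity-based estimate in the Fig 1 caption, N = π/(4μ), can be sketched in a few lines. The function name is hypothetical, and the input value of π is simply the neutral expectation 4Nμ for N = 1,000 and μ = 10⁻⁸, not a result taken from the paper.

```python
def ne_from_diversity(pi: float, mu: float) -> float:
    """Estimate population size as N = pi / (4 * mu).

    pi: nucleotide diversity (per site)
    mu: mutation rate per base per generation
    """
    if mu <= 0.0:
        raise ValueError("mutation rate must be positive")
    if pi < 0.0:
        raise ValueError("nucleotide diversity cannot be negative")
    return pi / (4.0 * mu)


mu = 1e-8   # mutation rate stated in the caption
pi = 4e-5   # illustrative: neutral expectation 4 * N * mu for N = 1,000
estimate = ne_from_diversity(pi, mu)
print(estimate)  # approximately 1000
```

Any process that lowers π below its neutral expectation, such as background selection or selective sweeps, lowers this estimate proportionally, which is why it serves as the selection-sensitive comparison in the figure.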
Fig 2. Estimates of historical effective population size from linkage disequilibrium (NeLD, sample size n = 100 individuals), Relate (NeRelate, n = 100) and MSMC (NeMSMC, n = 4) from the present generation (generation 0) back to 400 or 1,000 generations in the past.
Simulations were run for 100 replicates with constant population size (N) under random mating and neutrality (grey), background selection (BS, red) or selective sweeps (SS, green), with constant (1 cM/Mb) or variable recombination rates (RR), and constant mutation rate μ = 10⁻⁸ or 10⁻⁹ mutations per base per generation for N = 1,000 or N = 10,000, respectively. The true simulated effective size from variance of family size (NeVk = N) is shown in black.
Fig 3. Estimates of historical effective population size from linkage disequilibrium (NeLD, sample size n = 100 individuals), Relate (NeRelate, n = 100) and MSMC (NeMSMC, n = 4) from the present generation (generation 0) back to 300 generations in the past.
Simulations were run for 100 replicates with constant population size (N) under random mating and neutrality (grey), background selection (BS, red) or selective sweeps (SS, green), with constant (1 cM/Mb) or variable recombination rate (RR), and constant mutation rate μ = 10⁻⁸ or 10⁻⁹ mutations per base per generation for N = 1,000 or N = 10,000, respectively. The true simulated effective size from variance of family size (NeVk = N) is shown in black.
Fig 4. Estimates of historical effective population size from linkage disequilibrium (NeLD, sample size n = 100 individuals), Relate (NeRelate, n = 100) and MSMC (NeMSMC, n = 4) from the present generation (generation 0) back to 600 generations in the past.
Simulations were run for 100 replicates with constant population size (N) under random mating and neutrality (grey), background selection (BS, red) or selective sweeps (SS, green), with constant (1 cM/Mb) or variable recombination rate (RR), and constant mutation rate μ = 10⁻⁸ or 10⁻⁹ mutations per base per generation for N = 1,000 or N = 10,000, respectively. The true simulated effective size from variance of family size (NeVk) is shown in black.
Fig 5. Spearman’s correlation coefficient (r) between nucleotide diversity (π; panel a) or estimates of linkage disequilibrium effective population size (NeLD; panel b) and different diversity variables, estimated within genomic regions.
RR: recombination rate; B: B statistic; P: proportion of polymorphic nucleotides; MAF: minor allele frequency; LoF: number of loss-of-function variants; missense: number of missense variants; gene dens: gene density. Estimates are based on samples of n = 99 for the Finnish population and n = 16 for the Koryak population. P-values: * < 0.05, *** < 0.001.
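For readers unfamiliar with the statistic in Fig 5, Spearman's correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below uses only the standard library; the function names and the per-region numbers are hypothetical and are not data from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


# Hypothetical per-region values: recombination rate (cM/Mb) and a
# diversity measure that rises monotonically but not linearly with it.
recombination_rate = [0.5, 1.0, 1.5, 2.0, 3.0]
diversity = [1.0, 4.0, 9.0, 16.0, 81.0]
print(spearman_r(recombination_rate, diversity))
```

Because only ranks enter the calculation, any monotonic relationship yields a coefficient near 1 or -1 regardless of its shape, which suits comparisons among heterogeneous genomic variables.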
