Stat Appl Genet Mol Biol. 2015 Feb;14(1):65-81.
doi: 10.1515/sagmb-2014-0030.

Inference for one-step beneficial mutations using next generation sequencing

Andrzej J Wojtowicz et al. Stat Appl Genet Mol Biol. 2015 Feb.

Abstract

Experimental evolution is an important research method that allows for the study of evolutionary processes occurring in microorganisms. Here we present a novel approach to experimental evolution that is based on the application of next generation sequencing. Under this approach, population-level sequencing is applied to an evolving population in which multiple first-step beneficial mutations occur concurrently. As a result, frequencies of multiple beneficial mutations are observed in each replicate of an experiment. For this new type of data we develop methods of statistical inference. In particular, we propose a method for imputing selection coefficients of first-step beneficial mutations. The imputed selection coefficients are then used for testing the distribution of first-step beneficial mutations and for estimation of the mean selection coefficient. In the case when selection coefficients are uniformly distributed, the collected data may also be used to estimate the total number of available first-step beneficial mutations.

Figures

Figure 1
Dynamics of the mean proportion of a beneficial mutation under different values of . Data obtained by simulation in which a single genotype with selection coefficient si=0.1 competed against the wild type. According to the results, the mean proportion of a mutation converges to the function g(si) when ≥10.
Figure 2
Dynamics of the standard deviation of the proportion of a beneficial mutation under different values of . Data obtained by simulation in which a single genotype with selection coefficient si=0.1 competed against the wild type. According to the results, as increases, variance of the proportion decreases. It can be concluded that when →∞, the standard deviation goes to 0 and the observed proportion converges to its expected value which is given by the function g(si).
Figure 3
Plots of imputed selection coefficients vs. their true values. Selection coefficients were generated from either a uniform or an exponential distribution. Each plot contains combined results from 25 (uniform case) or 50 (exponential case) simulations. Selection coefficients were imputed for mutations observed ≥2 times. The proposed method performs well: in most cases, imputed values are close to the actual values of the selection coefficients. Accuracy of imputation increases as the effects of mutations become larger.
Figure 4
Coefficient of variation (MSE/s) of the imputed selection coefficients. The results are based on 10,000 replicates of the simulation procedure. The presented curves are left-truncated at the point where mutations are observed in <10% (i.e., fewer than 1000) of the replicates. Accuracy of imputation improves as the effects of mutations become larger. Increasing the number of reads from k=100 to k=500 allows additional low- or medium-effect mutations to be observed and imputed, but it does not improve the accuracy of imputation for large-effect mutations, since these are already accurately estimated when k=100.
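As an illustration of the accuracy measure in this caption, the following minimal Python sketch computes an error-to-effect ratio from replicate imputations. It assumes that the ratio is the root mean squared error divided by the true selection coefficient s; the square root is an assumption, since the caption prints only "MSE/s". The function name `relative_rmse` and the Gaussian noise used to mimic imputation error are hypothetical and are not taken from the paper.

```python
import math
import random


def relative_rmse(imputed, true_s):
    """Root mean squared error of imputed values, divided by the true coefficient.

    Assumption: this is the quantity the caption labels "coefficient of variation".
    """
    if true_s <= 0:
        raise ValueError("true_s must be positive")
    if not imputed:
        raise ValueError("need at least one imputed value")
    mse = sum((x - true_s) ** 2 for x in imputed) / len(imputed)
    return math.sqrt(mse) / true_s


# Hypothetical illustration: Gaussian noise stands in for imputation error.
# It is NOT the paper's imputation procedure.
rng = random.Random(0)
true_s = 0.1
imputed = [rng.gauss(true_s, 0.02) for _ in range(10_000)]
cv = relative_rmse(imputed, true_s)
print(f"relative RMSE at s={true_s}: {cv:.3f}")
```

With a fixed absolute error, this ratio shrinks as s grows, which matches the caption's observation that larger-effect mutations are imputed more accurately in relative terms.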
Figure 5
Type I error and power of the LRT when the test is applied to imputed selection coefficients. The power of the test was examined under the uniform distribution of selection coefficients. The test was conducted if the sample size was ≥5. According to the left panel, type I error of the test is moderately inflated. The right panel indicates that the power of the LRT is high when selection coefficients are uniformly distributed.
Figure 6
Coefficient of variation (MSE/r) of estimators of r when selection coefficients are a sample from a uniform distribution. The approach based on the joint likelihood function for r and δ provides more accurate estimates than the simplified method, but as the number of reads (k) increases the difference becomes smaller.
