Estimating the scaled mutation rate and mutation bias with site frequency data
- PMID: 25453604
- DOI: 10.1016/j.tpb.2014.10.002
Abstract
The distribution of allele frequencies of a large number of biallelic sites is known as the "allele-frequency spectrum" or "site-frequency spectrum" (SFS). Without selection and in regions of relatively high recombination rates, sites may be assumed to be independently and identically distributed. With a beta equilibrium distribution of allelic proportions and binomial sampling, a beta-binomial compound likelihood results for each site. The likelihood of the data and the posterior distribution of two parameters, the scaled mutation rate θ and the mutation bias α, are investigated in the general case and for small scaled mutation rates θ. In the general case, an expectation-maximization (EM) algorithm is derived to obtain maximum likelihood estimates of both parameters. With an appropriate prior distribution, a Markov chain Monte Carlo sampler to integrate the posterior distribution is also derived. As far as I am aware, previous maximum likelihood or Bayesian estimators of θ explicitly or implicitly assume small scaled mutation rates, i.e., θ≪1. For θ≪1, maximum likelihood estimators are also derived for both parameters using a Taylor series expansion of the beta-binomial distribution. The estimator of θ is a variant of the Ewens-Watterson estimator and of the maximum likelihood estimator derived with the Poisson Random Field approach. With a conjugate prior distribution, marginal and conditional beta posterior distributions are also derived for both parameters.
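The per-site beta-binomial likelihood described in the abstract can be illustrated with a minimal sketch. It assumes, as a parameterization not stated in the abstract, that the equilibrium allelic proportion follows Beta(αθ, (1−α)θ); the function names `beta_binomial_logpmf` and `sfs_log_likelihood` are hypothetical and chosen only for this illustration. It evaluates the log-likelihood of an observed SFS and is not the EM algorithm or the MCMC sampler derived in the paper.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(y, m, theta, alpha):
    """Log-probability of observing y copies of the focal allele among m sampled.

    ASSUMPTION: the allelic proportion is Beta(alpha*theta, (1-alpha)*theta)
    distributed at equilibrium; this parameterization is illustrative.
    """
    a = alpha * theta
    b = (1.0 - alpha) * theta
    log_choose = (math.lgamma(m + 1) - math.lgamma(y + 1)
                  - math.lgamma(m - y + 1))
    return log_choose + log_beta(y + a, m - y + b) - log_beta(a, b)


def sfs_log_likelihood(counts, theta, alpha):
    """Log-likelihood of an SFS under independent, identically distributed sites.

    counts[y] is the number of sites with y focal alleles, y = 0..m,
    so the sample size is m = len(counts) - 1.
    """
    m = len(counts) - 1
    return sum(n * beta_binomial_logpmf(y, m, theta, alpha)
               for y, n in enumerate(counts) if n)
```

Because sites are treated as independent, the log-likelihood is a sum over frequency classes weighted by the number of sites in each class. A function like `sfs_log_likelihood` could be passed to a generic numerical optimizer over (θ, α) with θ > 0 and 0 < α < 1, although that would be a substitute for, not a reproduction of, the estimators derived in the paper.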
Keywords: Beta–binomial; EM-algorithm; Markov chain Monte Carlo algorithm; Mutation–drift equilibrium; Posterior; Stirling distribution.
Copyright © 2014 Elsevier Inc. All rights reserved.