PLoS One. 2012;7(10):e48638.
doi: 10.1371/journal.pone.0048638. Epub 2012 Oct 31.

Towards improvements in the estimation of the coalescent: implications for the most effective use of Y chromosome short tandem repeat mutation rates

Steven C Bird. PLoS One. 2012.

Abstract

Over the past two decades, many short tandem repeat (STR) microsatellite loci on the human Y chromosome have been identified together with mutation rate estimates for the individual loci. These have been used to estimate the coalescent age, or the time to the most recent common ancestor (TMRCA) expressed in generations, in conjunction with the average square difference measure (ASD), an unbiased point estimator of TMRCA based upon the average within-locus allele variance between haplotypes. The ASD estimator, in turn, depends on accurate mutation rate estimates to be able to produce good approximations of the coalescent age of a sample. Here, a comparison is made between three published sets of per locus mutation rate estimates as they are applied to the calculation of the coalescent age for real and simulated population samples. A novel evaluation method is developed for estimating the degree of conformity of any Y chromosome STR locus of interest to the strict stepwise mutation model and specific recommendations are made regarding the suitability of thirty-two commonly used Y-STR loci for the purpose of estimating the coalescent. The use of the geometric mean for averaging ASD and û across loci is shown to improve the consistency of the resulting estimates, with decreased sensitivity to outliers and to the number of STR loci compared or the particular set of mutation rates selected.
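The calculation described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes the common pairwise form of ASD and the strict single-step stepwise model relation E[ASD] = 2μt for a pair of haplotypes separated by a coalescence time of t generations. The function names, the toy haplotypes and the mutation rates are all hypothetical.

```python
import math
from itertools import combinations


def asd_per_locus(haplotypes):
    """Average squared difference in repeat count over all haplotype pairs,
    computed separately for each locus."""
    pairs = list(combinations(haplotypes, 2))
    n_loci = len(haplotypes[0])
    return [
        sum((a[k] - b[k]) ** 2 for a, b in pairs) / len(pairs)
        for k in range(n_loci)
    ]


def tmrca_per_locus(asd, rates):
    """Per-locus TMRCA in generations, assuming E[ASD] = 2 * mu * t
    (an assumption of this sketch, see the lead-in)."""
    return [d / (2.0 * mu) for d, mu in zip(asd, rates)]


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    """Geometric mean; defined only when every value is strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    # Hypothetical toy data: three haplotypes typed at two loci.
    toy_haplotypes = [(13, 24), (14, 24), (13, 26)]
    # Hypothetical per-locus mutation rates (per generation).
    toy_rates = [0.002, 0.004]

    asd = asd_per_locus(toy_haplotypes)
    estimates = tmrca_per_locus(asd, toy_rates)
    print("per-locus ASD:", asd)
    print("per-locus TMRCA:", estimates)
    print("arithmetic mean:", arithmetic_mean(estimates))
    print("geometric mean:", geometric_mean(estimates))
```

The geometric mean is never larger than the arithmetic mean, and a single locus with an inflated estimate pulls it up less, which fits the reduced sensitivity to outliers that the abstract reports. A locus with no variation in the sample gives ASD = 0 and cannot enter a geometric mean, so it has to be handled separately.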

Conflict of interest statement

Competing Interests: The author has declared that no competing interests exist.

Figures

Figure 1. Calculated q values for 32 loci, SMGF data set.
A lower q value equals a better fit to the S-SMM for an individual locus.
Figure 2. Coalescent calculations () and confidence intervals.
A) Burgarella mutation rates, 32 loci, best to worst order. B) Ballantyne mutation rates, 32 loci, best to worst order. C) Busby et al., 2012, 15 recommended loci, best to worst order. D) YHRD mutation rates, best to worst order, 15 loci. TMRCA is given in generations. Error bars represent 95% confidence intervals for each TMRCA estimate. Dashed lines indicate the median formula image value for each comparison, shown in the box at the right end of the dashed line.
Figure 3. Regression coefficients and confidence intervals.
ASD for each locus compared with mutation rates for each locus, using three published studies. A) Burgarella rates, 32 loci. Loci identified in bold face (e.g., DYS448) fall within the 95% mean confidence interval for the locus set compared. B) Ballantyne rates, 32 loci. C) YHRD rates, 15 loci. D) Burgarella rates, 24 loci, with loci that have formula image0.001 removed from the calculation. The data set compared is from the “British Isles DNA Project” (n = 245, 32 locus haplotypes), together with subsets of the same data set (a 24 locus reduced set and a 15 locus set corresponding to the YHRD ssSTR loci).
Figure 4. 32 locus STR data set with changing growth rates and genealogy shapes.
A) formula image and 95% confidence intervals plotted for the STR data set (n = 245, 32 loci) as the Rgrowth (Ne*r) parameter is varied from Rgrowth = 0 (no growth, constant-sized population) to a “star” genealogy (approximating infinite exponential growth by all descendant lines), in orders of magnitude from 1.0E+1 to 1.0E+11. formula image = 345.09 was the median value for all Rgrowth models and was shown empirically to be independent of the shape of the sampled genealogy, growth rate and population size. B) Percentage change of 95% CI overestimation compared to the “star” genealogy for the full range of comparisons in 4A.

References

    1. Jobling MA, Tyler-Smith C (2003) The human Y chromosome: an evolutionary marker comes of age. Nat Rev Genet 4: 598–612.
    2. Walsh B (2000) Estimating the time to the most recent common ancestor for the Y chromosome or mitochondrial DNA for a pair of individuals. Genetics 156: 897–912.
    3. Ballantyne K, Goedbloed M, Fang R, Schaap O, Lao O, et al. (2010) Mutability of Y-Chromosomal Microsatellites: Rates, Characteristics, Molecular Bases, and Forensic Implications. Am J Hum Genet 87: 341–353.
    4. Burgarella C, Navascués M (2011) Mutation rate estimates for 110 Y-chromosome STRs combining population and father–son pair data. Eur J Hum Genet 19: 70–75.
    5. Ohta T, Kimura M (1973) A model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population. Genet Res 22: 201–204.