The structure of genealogies and the distribution of fixed differences between DNA sequence samples from natural populations
- PMID: 1916247
- PMCID: PMC1204556
- DOI: 10.1093/genetics/128.4.831
Abstract
When two samples of DNA sequences are compared, one way in which they may differ is in the presence of fixed differences, which are defined as sites at which all of the sequences in one sample are different from all of the sequences in a second sample. The probability distribution of the number of fixed differences is developed. The theory employs Wright-Fisher genealogies and the infinite sites mutation model. For the case when both samples are drawn randomly from the same population it is found that genealogies permitting fixed differences are very unlikely. Thus the mere presence of fixed differences between samples is statistically significant, even for small samples. The theory is extended to samples from populations that have been separated for some time. The relationship between a simple Poisson distribution of mutations and the distribution of fixed differences is described as a function of the time since populations have been isolated. It is shown how these results may contribute to improved tests of recent balancing or directional selection.
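The definition of a fixed difference given above can be made concrete with a small sketch. The function name `count_fixed_differences` and the toy sequences are illustrative assumptions, not taken from the paper; the code only counts sites that satisfy the stated definition and does not implement the paper's probability distribution. It assumes the sequences are already aligned and of equal length.

```python
def count_fixed_differences(sample_a, sample_b):
    """Count sites at which every sequence in sample_a differs from
    every sequence in sample_b (hypothetical helper, aligned input assumed)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must contain at least one sequence")

    length = len(sample_a[0])
    if any(len(seq) != length for seq in list(sample_a) + list(sample_b)):
        raise ValueError("all sequences must have the same aligned length")

    fixed = 0
    for site in range(length):
        # Set of states observed at this site within each sample.
        states_a = {seq[site] for seq in sample_a}
        states_b = {seq[site] for seq in sample_b}
        # A fixed difference means no state is shared between the samples,
        # so every sequence in one sample differs from every one in the other.
        if states_a.isdisjoint(states_b):
            fixed += 1
    return fixed


# Toy data (made up for illustration): only the first site is a fixed
# difference; the third and fifth sites are polymorphic within one sample
# and share a state with the other sample.
sample_a = ["ACGTA", "ACGTA", "ACTTA"]
sample_b = ["GCGTC", "GCGTC", "GCGTA"]
n_fixed = count_fixed_differences(sample_a, sample_b)
print(n_fixed)
```

A site that is polymorphic within a sample can still count here, as long as the two samples share no state at it; under the infinite sites model each site has at most two states, so in that setting a fixed difference implies each sample is monomorphic at the site.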
