The statistical distribution of nucleic acid similarities
- PMID: 3871073
- PMCID: PMC341021
- DOI: 10.1093/nar/13.2.645
The statistical distribution of nucleic acid similarities
Abstract
All pairs of a large set of known vertebrate DNA sequences were searched by computer for most similar segments. Analysis of this data shows that the computed similarity scores are distributed proportionally to the logarithm of the product of the lengths of the sequences involved. This distribution is closely related to recent results of Erdos and others on the longest run of heads in coin tossing. A simple rule is derived for determination of statistical significance of the similarity scores and to assist in relating statistical and biological significance.
References
Publication types
MeSH terms
Substances
Grants and funding
LinkOut - more resources
Full Text Sources
Other Literature Sources
Medical
