Comparative Study
J Mol Evol. 1995 Apr;40(4):464-73.
doi: 10.1007/BF00164032.

The size distribution of insertions and deletions in human and rodent pseudogenes suggests the logarithmic gap penalty for sequence alignment

X Gu et al. J Mol Evol. 1995 Apr.

Abstract

The size distributions of deletions, insertions, and indels (i.e., insertions or deletions) were studied, using 78 human processed pseudogenes and other published data sets. The following results were obtained:

(1) Deletions occur more frequently than do insertions in sequence evolution; none of the pseudogenes studied shows significantly more insertions than deletions.
(2) Empirically, the size distributions of deletions, insertions, and indels can be described well by a power law, i.e., $f_k = C k^{-b}$, where $f_k$ is the frequency of deletion, insertion, or indel with gap length $k$, $b$ is the power parameter, and $C$ is the normalization factor.
(3) The estimates of $b$ for deletions and insertions from the same data set are approximately equal to each other, indicating that the size distributions for deletions and insertions are approximately identical.
(4) The variation in the estimates of $b$ among various data sets is small, indicating that the effect of local structure exists but only plays a secondary role in the size distribution of deletions and insertions.
(5) The linear gap penalty, which is most commonly used in sequence alignment, is not supported by our analysis; rather, the power law for the size distribution of indels suggests that an appropriate gap penalty is $w_k = a + b \ln k$, where $a$ is the gap creation cost and $b \ln k$ is the gap extension cost.
(6) The higher frequency of deletion over insertion suggests that the gap creation cost of insertion ($a_i$) should be larger than that of deletion ($a_d$); that is, $a_i - a_d = \ln R$, where $R$ is the frequency ratio of deletions to insertions.
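The link between the power law in result (2) and the logarithmic penalty in results (5) and (6) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' procedure. It assumes the penalty is read as the negative log frequency of a gap of length k, so that the creation cost a equals minus the log of the normalization factor C. It also assumes the distribution is truncated at a maximum gap length. The function names, the linear penalty form a + b*k, and all numeric values (b = 1.5, R = 3.0, and so on) are hypothetical and are not taken from the paper.

```python
import math


def power_law_frequencies(b, max_len):
    """Return f_k = C * k**(-b) for k = 1..max_len, normalized to sum to 1.

    Truncating at max_len is an assumption made here so that C is finite
    and easy to compute; the abstract does not specify a truncation.
    """
    weights = [k ** (-b) for k in range(1, max_len + 1)]
    c = 1.0 / sum(weights)
    return [c * w for w in weights]


def log_gap_penalty(k, a, b):
    """Logarithmic gap penalty w_k = a + b * ln(k) for a gap of length k >= 1."""
    return a + b * math.log(k)


def linear_gap_penalty(k, a, b):
    """A linear penalty a + b * k, shown only for comparison (assumed form)."""
    return a + b * k


def insertion_creation_cost(a_d, ratio):
    """Creation cost for insertions, from a_i - a_d = ln(R).

    ratio is R, the frequency ratio of deletions to insertions.
    """
    return a_d + math.log(ratio)


if __name__ == "__main__":
    b = 1.5  # hypothetical power parameter, not an estimate from the paper
    freqs = power_law_frequencies(b, max_len=50)
    a = -math.log(freqs[0])  # under the assumed reading, a = -ln(C)

    for k in (1, 2, 5, 10, 50):
        neg_log_freq = -math.log(freqs[k - 1])
        print(k, round(neg_log_freq, 4), round(log_gap_penalty(k, a, b), 4),
              round(linear_gap_penalty(k, a, b), 4))

    # With a hypothetical R = 3.0, insertions cost ln(3) more to create.
    print(insertion_creation_cost(a, 3.0) - a)
```

Under these assumptions the negative log frequency and the logarithmic penalty coincide for every k, while the linear penalty grows much faster for long gaps. That difference is the practical consequence of result (5).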
