A simple derivation of the distribution of pairwise local protein sequence alignment scores

Olivier Bastien. Evol Bioinform Online. 2008 Feb 14;4:41-5.

Abstract

Confidence in pairwise alignments of biological sequences, obtained by methods such as BLAST or Smith-Waterman, is critical for automatic analyses of genomic data. In the asymptotic limit of long sequences, the Karlin-Altschul model computes a P-value assuming that the number of high-scoring matching regions above a threshold is Poisson distributed. Using a simple approach combined with recent results in reliability theory, we demonstrate here that the Karlin-Altschul model can be derived with no reference to extreme value theory. Sequences were considered as systems whose components are amino acids and which have a high redundancy of information, reflected by their alignment scores. The evolution of the information shared between aligned components determined the Shared Amount of Information (S.A.I.) between sequences, i.e. the score. The parameters of the Gumbel distribution of aligned sequence scores thus find some theoretical rationale here. The first is the hazard rate of the distribution of scores between residues, and the second is the probability that two aligned residues do not lose bits of information (i.e. conserve an initial pairing score) when a mutation occurs.

Keywords: Karlin-Altschul theorem; conservation function; reliability theory.
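
To make the P-value mentioned in the abstract concrete, the following is a minimal sketch of the standard Karlin-Altschul form, in which the number of local alignments scoring at least x between sequences of lengths m and n is taken to be Poisson distributed with mean E = K m n exp(-lambda x), so that P(S >= x) = 1 - exp(-E). The function names and the numeric values chosen for lambda and K are illustrative assumptions; they are not taken from the paper, and the paper's own derivation through reliability theory is not reproduced here.

```python
import math


def expected_hits(score, m, n, lam, K):
    """Poisson mean E = K * m * n * exp(-lam * score).

    m, n : lengths of the two compared sequences
    lam  : scale parameter of the Gumbel-type tail (assumed value)
    K    : pre-factor of the Gumbel-type tail (assumed value)
    """
    return K * m * n * math.exp(-lam * score)


def karlin_altschul_pvalue(score, m, n, lam, K):
    """P(S >= score) = 1 - exp(-E) under the Poisson assumption."""
    e_value = expected_hits(score, m, n, lam, K)
    # -expm1(-E) equals 1 - exp(-E) but stays accurate when E is tiny.
    return -math.expm1(-e_value)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    lam, K = 0.3, 0.1
    for s in (30, 50, 70):
        p = karlin_altschul_pvalue(s, m=300, n=300, lam=lam, K=K)
        print(f"score={s:3d}  P-value={p:.3e}")
```

For large scores E becomes small and the P-value is approximately equal to E, which is why the two quantities are often used interchangeably in that regime.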
