Discrete Appl Math. 1996 Dec 1;71(1-3):259-268.
doi: 10.1016/S0166-218X(96)00068-6.

Fast multiple alignment of ungapped DNA sequences using information theory and a relaxation method

Thomas D Schneider et al. Discrete Appl Math.

Abstract

An information theory based multiple alignment ("Malign") method was used to align the DNA binding sequences of the OxyR and Fis proteins, whose sequence conservation is so spread out that it is difficult to identify the sites. In the algorithm described here, the information content of the sequences is used as a unique global criterion for the quality of the alignment. The algorithm uses look-up tables to avoid recalculating computationally expensive functions such as the logarithm. Because there are no arbitrary constants and because the results are reported in absolute units (bits), the best alignment can be chosen without ambiguity. Starting from randomly selected alignments, a hill-climbing algorithm can track through the immense space of s^n combinations where s is the number of sequences and n is the number of positions possible for each sequence. Instead of producing a single alignment, the algorithm is fast enough that one can afford to use many start points and to classify the solutions. Good convergence is indicated by the presence of a single well-populated solution class having higher information content than other classes. The existence of several distinct classes for the Fis protein indicates that those binding sites have self-similar features.
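The procedure outlined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Malign program. It assumes ungapped sequences over A, C, G, T, a fixed window width, no small-sample correction to the information content, a greedy move that re-positions one sequence at a time, and a solution class defined by offsets relative to the first sequence. The function names (make_table, information, hill_climb, classify) and the toy sequences are hypothetical.

```python
import math
import random
from collections import Counter


def make_table(s):
    """Look-up table: table[c] = (c/s) * log2(c/s) for base counts c = 0..s."""
    return [0.0] + [(c / s) * math.log2(c / s) for c in range(1, s + 1)]


def information(seqs, offsets, width, table):
    """Information content (bits) of the aligned window, summed over columns."""
    total = 0.0
    for j in range(width):
        counts = Counter(seq[o + j] for seq, o in zip(seqs, offsets))
        # 2 bits is the maximum for DNA; adding sum(f * log2 f) subtracts H.
        total += 2.0 + sum(table[c] for c in counts.values())
    return total


def hill_climb(seqs, width, table, rng):
    """Start from random offsets; accept any single-sequence move that gains bits."""
    max_off = [len(seq) - width for seq in seqs]
    offsets = [rng.randint(0, m) for m in max_off]
    best = information(seqs, offsets, width, table)
    improved = True
    while improved:
        improved = False
        for i in range(len(seqs)):
            for cand in range(max_off[i] + 1):
                if cand == offsets[i]:
                    continue
                trial = offsets[:i] + [cand] + offsets[i + 1:]
                score = information(seqs, trial, width, table)
                if score > best + 1e-12:
                    offsets, best, improved = trial, score, True
    return tuple(offsets), best


def classify(seqs, width, starts=200, seed=0):
    """Run many random starts and group the local optima into classes."""
    rng = random.Random(seed)
    table = make_table(len(seqs))
    classes = Counter()   # class key -> number of starts that reached it
    bits = {}             # class key -> best information content seen
    for _ in range(starts):
        offsets, score = hill_climb(seqs, width, table, rng)
        key = tuple(o - offsets[0] for o in offsets)  # relative offsets
        classes[key] += 1
        bits[key] = max(score, bits.get(key, 0.0))
    return classes, bits


if __name__ == "__main__":
    toy = ["TTACGTAA", "ACGTAATT", "GGACGTCC", "CACGTGGA"]  # made-up data
    classes, bits = classify(toy, width=4, starts=50)
    for key, n in classes.most_common():
        print(key, n, round(bits[key], 3))
```

In this sketch, a single heavily populated class with the highest bit score plays the role of the "good convergence" signal described above, while several well-populated classes would correspond to competing alignments.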

Figures

Figure 1. Distribution of 10000 alignments of 16 OxyR binding sites.
Figure 2. Distribution of 10000 alignments of 44 Fis binding sites.
