J Theor Biol. 2010 Mar 21;263(2):227-36.
doi: 10.1016/j.jtbi.2009.12.012. Epub 2009 Dec 16.

New method for global alignment of 2 DNA sequences by the tree data structure

Zhao-Hui Qi et al. J Theor Biol. 2010.

Abstract

We introduce a new approach to the problem of DNA sequence alignment. The method consists of three parts: (i) a simple alignment algorithm, (ii) an extension algorithm for the largest common substring, and (iii) a graphical simple alignment tree (GSA tree). The approach first obtains a graphical representation of the scores of DNA sequences using the scoring equation R(0)*R-S(0)*S-T(0)*(a+bk). A GSA tree is then constructed to facilitate the global alignment of 2 DNA sequences. Finally, we give several practical examples to illustrate the utility and practicality of the approach.
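
The abstract does not spell out the simple alignment algorithm, so the following is only a minimal sketch of one plausible reading of part (i): slide one sequence along the other without inserting spaces and score each overlap. The function names are hypothetical, and the identity score 9 and substitution score 1 are borrowed from the Fig. 2 caption.

```python
from typing import List, Tuple

R = 9  # identity score (value quoted in the Fig. 2 caption)
S = 1  # substitution score, subtracted once per mismatch (Fig. 2 caption)


def ungapped_alignments(g1: str, g2: str) -> List[Tuple[int, int, int, int]]:
    """Return (offset, identities, substitutions, score) for every overlap.

    offset is the index in g1 that g2[0] is placed against; a negative
    offset means g2 starts to the left of g1. No spaces are inserted.
    """
    results = []
    for offset in range(-(len(g2) - 1), len(g1)):
        start = max(0, offset)                # first overlapping index in g1
        end = min(len(g1), offset + len(g2))  # one past the last overlapping index
        identities = sum(1 for i in range(start, end) if g1[i] == g2[i - offset])
        substitutions = (end - start) - identities
        score = R * identities - S * substitutions
        results.append((offset, identities, substitutions, score))
    return results


def best_ungapped(g1: str, g2: str) -> Tuple[int, int, int, int]:
    """Pick the highest-scoring ungapped alignment (hypothetical helper)."""
    return max(ungapped_alignments(g1, g2), key=lambda r: r[3])


if __name__ == "__main__":
    print(best_ungapped("ACGT", "CGT"))
```

Two sequences of lengths M and N have M + N - 1 such overlaps, so this brute-force version does on the order of M*N character comparisons. It makes no attempt to reproduce the time-saving variant mentioned in the Fig. 3 caption.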

Figures

Fig. 1
The building steps of all possible alignments without spaces within G1 and G2.
Fig. 2
Graphical score representation of the simple alignments of sequences G1 and G2. The scoring equation is R(0)*R-S(0)*S-T(0)*(a+bk), where the identity score R is 9, the substitution score S is 1, the gap opening penalty a is 15, and the gap extension penalty b is 1.
Fig. 3
Improved simple alignment process that consumes less time (G1 has length M, G2 has length N, and let MN).
Fig. 4
An example of a graphical simple alignment tree. Strings G1 and G2 include 6 substrings in the first-level sub-alignment: U11 C21 U31 C41 U51 C61. The substring U11 includes 3 substrings in the second-level sub-alignment: C12 U22 C32. The substring U31 includes 2 substrings in the second-level sub-alignment: U42 C52. The substring U51 includes 3 substrings in the second-level sub-alignment: U62 C72 U82. The substring U22 includes 2 substrings in the third-level sub-alignment: U13 C23. The substring U82 includes 2 substrings in the third-level sub-alignment: C33 U43.
Fig. 5
The graphical simple alignment tree to be used to align the sequences G1 and G2 (G1, GGCCTCTGCCTAATCACACAGATCTAACAGGATTATTTC; G2, GGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTC).
Fig. 6
The graphical simple alignment tree to be used to align the sequences G1 and G2 (G1, GCCCTCGCGGGCAACATTTAATTCACAGCCAGTTCTCTCAACAGTGATTATC; G2, CTGGGTCTTCAGGTCCTTTATGCTTAACACAAATCTATCGTTAACAGGACTATTCT).
Fig. 7
The global alignment between string a and string b. Identities: 328/448 (73.2%); Gaps: 8/448 (1.8%); Score: 2772.
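
As a rough check on how the scoring equation R(0)*R-S(0)*S-T(0)*(a+bk) might be applied, the sketch below reads R(0) and S(0) as the numbers of identities and substitutions and charges each gap of length k a penalty of a+bk. That reading is an assumption, as is the function name `alignment_score`. The parameters are the ones quoted in the Fig. 2 caption, and it is assumed they also apply to Fig. 7. The Fig. 7 caption reports only the total of 8 gap positions, so the split into four gaps of length 2 is hypothetical, and the substitution count 448 - 328 - 8 = 112 assumes every column is an identity, a substitution, or a gap.

```python
from typing import Iterable


def alignment_score(
    identities: int,
    substitutions: int,
    gap_lengths: Iterable[int],
    R: int = 9,   # identity score (Fig. 2 caption)
    S: int = 1,   # substitution score (Fig. 2 caption)
    a: int = 15,  # gap opening penalty (Fig. 2 caption)
    b: int = 1,   # gap extension penalty (Fig. 2 caption)
) -> int:
    """Score = R*identities - S*substitutions - sum of (a + b*k) over gaps."""
    gap_penalty = sum(a + b * k for k in gap_lengths)
    return R * identities - S * substitutions - gap_penalty


if __name__ == "__main__":
    # Totals from the Fig. 7 caption: 328 identities and 8 gap positions in 448 columns.
    columns, identities, gap_positions = 448, 328, 8
    substitutions = columns - identities - gap_positions  # 112, under the assumption above
    hypothetical_gaps = [2, 2, 2, 2]  # assumed grouping of the 8 gap positions
    print(alignment_score(identities, substitutions, hypothetical_gaps))  # 2772
```

Under these assumptions the result matches the score of 2772 reported for Fig. 7. Any grouping of the 8 gap positions into four gaps gives the same total, so the check supports this reading of the equation but does not determine the actual gap layout.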
