EXCLI J. 2015 Dec 15:14:1232-55.
doi: 10.17179/excli2015-302. eCollection 2015.

An enhanced algorithm for multiple sequence alignment of protein sequences using genetic algorithm

Manish Kumar. EXCLI J.

Abstract

Multiple sequence alignment (MSA) is one of the most fundamental operations in biological sequence analysis. The basic problem in MSA is to determine the most biologically plausible alignment of a set of protein or DNA sequences. This paper proposes an alignment method for MSA based on a genetic algorithm. Two kinds of genetic operators, crossover and mutation, were defined and implemented in the proposed method in order to study the evolution of the population and the quality of the aligned sequences. The proposed method is assessed on a protein benchmark dataset (BAliBASE) by comparing its results with those obtained with other alignment algorithms, including SAGA, RBT-GA, PRRP, HMMT, SB-PIMA, CLUSTALX, CLUSTAL W, DIALIGN and PILEUP8. Experiments on a wide range of data show that the proposed algorithm achieves much better alignment quality (in terms of score) than previously proposed algorithms.

Keywords: bioinformatics; crossover operator; genetic algorithm; multiple sequence alignment; mutation operator.
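
The abstract names crossover and mutation as the two operator families but does not spell out their mechanics, so the following is only a minimal sketch of how such operators can act on an alignment stored as a list of equal-length gapped strings. Everything in it is an assumption made for illustration: the function names, the sum-of-pairs scoring values (match 1, mismatch -1, gap -2), the residue-preserving one-point crossover, and the gap-shift mutation, which is a generic stand-in and not one of the paper's exchange, reverse, position or inverse operators.

```python
import random

GAP = "-"


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score with illustrative (assumed) scoring values."""
    score = 0
    for column in zip(*alignment):
        for i in range(len(column)):
            for j in range(i + 1, len(column)):
                a, b = column[i], column[j]
                if a == GAP and b == GAP:
                    continue  # gap-gap pairs contribute nothing
                if a == GAP or b == GAP:
                    score += gap
                elif a == b:
                    score += match
                else:
                    score += mismatch
    return score


def one_point_crossover(parent_a, parent_b, cut):
    """Take columns [0, cut) from parent_a and the remainder from parent_b.

    Each row of parent_b is cut after the same number of residues that the
    left part of parent_a contains, so no residue is lost or duplicated.
    """
    lefts, rights = [], []
    for row_a, row_b in zip(parent_a, parent_b):
        left = row_a[:cut]
        residues_needed = len(left) - left.count(GAP)
        seen, pos = 0, 0
        while seen < residues_needed:
            if row_b[pos] != GAP:
                seen += 1
            pos += 1
        lefts.append(left)
        rights.append(row_b[pos:])
    # Right-justify the right parts so they keep parent_b's column layout.
    width = max(len(r) for r in rights)
    return [l + GAP * (width - len(r)) + r for l, r in zip(lefts, rights)]


def shift_gap_mutation(alignment, rng):
    """Move one randomly chosen gap to a random position in the same row."""
    rows = list(alignment)
    candidates = [i for i, row in enumerate(rows) if GAP in row]
    if not candidates:
        return rows
    i = rng.choice(candidates)
    row = rows[i]
    k = rng.choice([p for p, c in enumerate(row) if c == GAP])
    row = row[:k] + row[k + 1:]          # remove the chosen gap
    new_pos = rng.randrange(len(row) + 1)
    rows[i] = row[:new_pos] + GAP + row[new_pos:]  # reinsert it elsewhere
    return rows


parent_a = ["AC-GT", "A-CGT", "ACG-T"]
parent_b = ["ACGT-", "-ACGT", "AC-GT"]
child = one_point_crossover(parent_a, parent_b, cut=2)
mutant = shift_gap_mutation(child, random.Random(0))
print(child, sum_of_pairs(child))
print(mutant, sum_of_pairs(mutant))
```

Both operators leave the underlying sequences unchanged and only move gaps, which is the invariant any alignment operator has to respect; the score function is what a selection step would use to rank the resulting candidates.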

Figures

Table 1. Summary of the test results of proposed method
Table 2. Average computation times (s) comparison over Ref. 1, 2, 3, 4 and 5
Table 3. Experimental results with Ref. 1 datasets of BAliBase 2.0
Table 4. Experimental results with Ref. 3 datasets of BAliBase 2.0
Table 5. Experimental results with Ref. 2 datasets of BAliBase 2.0
Table 6. Performance evaluation of the proposed algorithm with hill climbing approach and randomly generated population through guide tree
Figure 1. Example of a multiple sequence alignment
Figure 2. One point crossover I
Figure 3. One point crossover II
Figure 4. Exchange Mutation operator
Figure 5. Reverse Mutation operator
Figure 6. Position mutation operator
Figure 7. Inverse mutation operator
Figure 8. Bar graph comparison result of scores between proposed and other methods over Ref. 1
Figure 9. Bar graph comparison result of scores between proposed and other methods over Ref. 1
Figure 10. Bar graph comparison result of scores between proposed and other methods over Ref. 3
Figure 11. Bar graph comparison result of scores between proposed and other methods over Ref. 3
Figure 12. Bar graph comparison result of scores between proposed and other methods over Ref. 2
Figure 13. Average score comparison between proposed and other methods over Ref. 1
Figure 14. Average score comparison between proposed and other methods over Ref. 3
Figure 15. Average score comparison between proposed and other methods over Ref. 2
