. 2011 Aug 25:12:353.

doi: 10.1186/1471-2105-12-353.

Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

Farhana Naznin¹, Ruhul Sarker, Daryl Essam

Affiliations

PMID: 21867510
PMCID: PMC3180391
DOI: 10.1186/1471-2105-12-353

Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

Farhana Naznin et al. BMC Bioinformatics. 2011.

. 2011 Aug 25:12:353.

doi: 10.1186/1471-2105-12-353.

Authors

Farhana Naznin¹, Ruhul Sarker, Daryl Essam

Affiliation

¹ School of Engineering and Information Technology, University of New South Wales at Australian Defence Force Academy, Canberra, Australia.

PMID: 21867510
PMCID: PMC3180391
DOI: 10.1186/1471-2105-12-353

Abstract

Background: Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships.

Results: In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0.

Conclusions: The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research.

PubMed Disclaimer

Figures

**Figure 2**
**Flowchart of Initial Generation**.

**Figure 3**
(a) Guide tree; (b) 1 represents randomly selected sequence numbers from (a), and 0 represents unselected sequence numbers; (c) subtree made by the selected sequences; (d) subtree made by the remaining sequences of (a); and (e) new Guide tree made from (c) and (d).

**Figure 4**
**Shuffling mechanism, sequence numbers 8 and 3 interchange their positions**.

**Figure 5**
**Single point crossover**.

**Figure 6**
**Multiple point crossovers**.

**Figure 8**
**Vertical Division Process**.

**Figure 9**
**Graphical presentation of generating New Generation**.

**Figure 10**
**Graphical presentations of the performance of the VDGA_Decomp_3 method w.r.to the Best and Average WSPM score per generation**.

**Figure 11**
**Graphical presentations of the experimental results on MSA-GA selected datasets**.

**Figure 12**
**Overall Performance of all methods in MSA-GA selected datasets**.

**Figure 13**
**Graphical presentations of the experimental results on reference 2 datasets**.

**Figure 14**
**Graphical presentations of the experimental results on reference 3 datasets**.

**Figure 15**
**Overall Performance of all methods in reference 2 datasets**.

**Figure 16**
**Overall performance of all methods in reference 3 datasets**.

See this image and copyright information in PMC

Cited by

Review on the Application of Machine Learning Algorithms in the Sequence Data Mining of DNA.
Yang A, Zhang W, Wang J, Yang K, Han Y, Zhang L. Yang A, et al. Front Bioeng Biotechnol. 2020 Sep 4;8:1032. doi: 10.3389/fbioe.2020.01032. eCollection 2020. Front Bioeng Biotechnol. 2020. PMID: 33015010 Free PMC article. Review.
Naturally selecting solutions: the use of genetic algorithms in bioinformatics.
Manning T, Sleator RD, Walsh P. Manning T, et al. Bioengineered. 2013 Sep-Oct;4(5):266-78. doi: 10.4161/bioe.23041. Epub 2012 Dec 6. Bioengineered. 2013. PMID: 23222169 Free PMC article. Review.
IBBOMSA: An Improved Biogeography-based Approach for Multiple Sequence Alignment.
Yadav RK, Banka H. Yadav RK, et al. Evol Bioinform Online. 2016 Oct 27;12:237-246. doi: 10.4137/EBO.S40457. eCollection 2016. Evol Bioinform Online. 2016. PMID: 27812276 Free PMC article.
Disease Sequences High-Accuracy Alignment Based on the Precision Medicine.
Li M, Long H, Wang H, Fu H, Xu D, Shen Y, Yao Y, Liao B. Li M, et al. Biomed Res Int. 2018 Feb 22;2018:1718046. doi: 10.1155/2018/1718046. eCollection 2018. Biomed Res Int. 2018. PMID: 29682519 Free PMC article.
Bacterial Foraging Optimization -Genetic Algorithm for Multiple Sequence Alignment with Multi-Objectives.
Manikandan P, Ramyachitra D. Manikandan P, et al. Sci Rep. 2017 Aug 18;7(1):8833. doi: 10.1038/s41598-017-09499-1. Sci Rep. 2017. PMID: 28821841 Free PMC article.

References

1. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. doi: 10.1093/nar/22.22.4673. - DOI - PMC - PubMed
1. Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970;48:443–453. doi: 10.1016/0022-2836(70)90057-4. - DOI - PubMed
1. Stoye J, Perrey SW, Dress AWM. Improving the divide-and-conquer approach to sum-of-pairs multiple sequence alignment. App Maths Letters. 1997;10:67–73.
1. Bonizzoni P, Vedova GD. The complexity of multiple sequence alignment with SP-score that is a metric. Theor Comp Sci. 2001;259:63–79. doi: 10.1016/S0304-3975(99)00324-2. - DOI
1. Feng DF, Dolittle RF. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol. 1987;25:351–350. doi: 10.1007/BF02603120. - DOI - PubMed

MeSH terms

Actions
Actions
Actions
Actions
Actions
Actions
Actions
Actions

LinkOut - more resources

Full Text Sources

Save citation to file

Email citation

Add to Collections

Add to My Bibliography

Your saved search

Create a file for external citation management software

Your RSS Feed

Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

Affiliation

Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

Authors

Affiliation

Abstract

Figures

Similar articles

Cited by

References

MeSH terms

LinkOut - more resources

Full Text Sources