CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
- PMID: 7984417
- PMCID: PMC308517
- DOI: 10.1093/nar/22.22.4673
Abstract
The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than in regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.
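The third and fourth modifications can be illustrated with a minimal sketch that computes one gap opening penalty per column of a partial alignment. This is not the CLUSTAL W implementation: the function names, the hydrophilic residue set, the run length and all numeric factors below are assumptions chosen only to show the idea of lowering the penalty at columns that already contain gaps and inside hydrophilic stretches.

```python
# Illustrative sketch only. Every constant and name here is an assumption,
# not a value or identifier taken from the CLUSTAL W program.

HYDROPHILIC = set("DEGKNQPRS")  # assumed set of hydrophilic residues
BASE_GAP_OPEN = 10.0            # assumed base gap opening penalty
GAP_POSITION_FACTOR = 0.3       # assumed reduction at columns with existing gaps
HYDROPHILIC_FACTOR = 2.0 / 3.0  # assumed reduction inside hydrophilic stretches
HYDROPHILIC_RUN = 5             # assumed minimum length of a hydrophilic stretch


def hydrophilic_columns(row, run=HYDROPHILIC_RUN):
    """Return indices covered by runs of at least `run` hydrophilic residues."""
    covered = set()
    start = None
    # The trailing "#" is a sentinel that closes a run ending at the last column.
    for i, residue in enumerate(row + "#"):
        if residue in HYDROPHILIC:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= run:
                covered.update(range(start, i))
            start = None
    return covered


def position_gap_open_penalties(alignment, base=BASE_GAP_OPEN):
    """Compute a gap opening penalty for each column of a partial alignment.

    `alignment` is a list of equal-length strings using "-" for gaps.
    """
    n_seqs = len(alignment)
    length = len(alignment[0])

    # A column counts as hydrophilic if any row has a qualifying run there.
    hydro = set()
    for row in alignment:
        hydro |= hydrophilic_columns(row)

    penalties = []
    for col in range(length):
        gaps = sum(1 for row in alignment if row[col] == "-")
        if gaps > 0:
            # Existing gaps: the more sequences are gapped, the lower the penalty.
            penalty = base * GAP_POSITION_FACTOR * (n_seqs - gaps) / n_seqs
        elif col in hydro:
            # No gap yet, but a likely loop region: reduce the penalty.
            penalty = base * HYDROPHILIC_FACTOR
        else:
            penalty = base
        penalties.append(penalty)
    return penalties
```

Calling `position_gap_open_penalties(["ACDEGKNW", "AC-EGKNW"])` returns a list of eight penalties, with the lowest value at the column that already contains a gap and intermediate values across the hydrophilic stretch of the first sequence.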
Similar articles
- Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 1996;266:383-402. doi: 10.1016/s0076-6879(96)66024-8. PMID: 8743695
- Gaps in structurally similar proteins: towards improvement of multiple sequence alignment. Proteins. 2004 Jan 1;54(1):71-87. doi: 10.1002/prot.10508. PMID: 14705025
- Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proc Natl Acad Sci U S A. 1996 Oct 29;93(22):12098-103. doi: 10.1073/pnas.93.22.12098. PMID: 8901539 Free PMC article.
- Comparison of linear gap penalties and profile-based variable gap penalties in profile-profile alignments. Comput Biol Chem. 2011 Oct 12;35(5):308-18. doi: 10.1016/j.compbiolchem.2011.07.006. Epub 2011 Jul 22. PMID: 22000802
- Pairwise statistical significance and empirical determination of effective gap opening penalties for protein local sequence alignment. Int J Comput Biol Drug Des. 2008;1(4):347-67. doi: 10.1504/ijcbdd.2008.022207. PMID: 20063463 Review.
Cited by
- Molecular cloning and characterization of a cDNA encoding kiwifruit L-myo-inositol-1-phosphate synthase, a key gene of inositol formation. Mol Biol Rep. 2013 Jan;40(1):697-705. doi: 10.1007/s11033-012-2110-1. Epub 2012 Oct 11. PMID: 23065229
- Genomic divergence in sympatry indicates strong reproductive barriers and cryptic species within Eucalyptus salubris. Ecol Evol. 2021 Mar 29;11(10):5096-5110. doi: 10.1002/ece3.7403. eCollection 2021 May. PMID: 34025994 Free PMC article.
- Effect of Heterodera schachtii female age on susceptibility to three fungal hyperparasites in the genus Hyalorbilia. J Nematol. 2020 Sep 4;52:e2020-93. doi: 10.21307/jofnem-2020-093. eCollection 2020. PMID: 33829185 Free PMC article.
- PASE: a novel method for functional prediction of amino acid substitutions based on physicochemical properties. Front Genet. 2013 Mar 6;4:21. doi: 10.3389/fgene.2013.00021. eCollection 2013. PMID: 23508070 Free PMC article.
- Structural basis for high-affinity binding of LEDGF PWWP to mononucleosomes. Nucleic Acids Res. 2013 Apr 1;41(6):3924-36. doi: 10.1093/nar/gkt074. Epub 2013 Feb 8. PMID: 23396443 Free PMC article.