The construction of amino acid substitution matrices for the comparison of proteins with non-standard compositions
- PMID: 15509610
- DOI: 10.1093/bioinformatics/bti070
Abstract
Motivation: Amino acid substitution matrices play a central role in protein alignment methods. Standard log-odds matrices, such as those of the PAM and BLOSUM series, are constructed from large sets of protein alignments that have implicit background amino acid frequencies. However, these matrices are frequently used to compare proteins with markedly different amino acid compositions, such as transmembrane proteins or proteins from organisms with strongly biased nucleotide compositions. It has been argued elsewhere that standard matrices are not ideal for such comparisons, and a rationale has been presented for transforming a standard matrix for use in a non-standard compositional context.
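To make the notion of "implicit background frequencies" concrete, here is a minimal Python sketch of the usual log-odds form, s_ij = scale * ln(q_ij / (p_i * p_j)), where q_ij are target (aligned-pair) frequencies and p_i are background frequencies. The function names and the two-letter toy alphabet are hypothetical and chosen only for illustration; the numbers are not taken from PAM, BLOSUM, or the paper.

```python
import math

def implied_background(target):
    """Background frequencies implied by symmetric target frequencies:
    each p_i is the marginal (row sum) of the q_ij."""
    return [sum(row) for row in target]

def log_odds_matrix(target, background, scale=1.0):
    """Return s[i][j] = scale * ln(q_ij / (p_i * p_j))."""
    n = len(background)
    return [
        [scale * math.log(target[i][j] / (background[i] * background[j]))
         for j in range(n)]
        for i in range(n)
    ]

# Toy two-letter alphabet; hypothetical target frequencies that sum to 1.
toy_target = [[0.4, 0.1],
              [0.1, 0.4]]
toy_background = implied_background(toy_target)
toy_scores = log_odds_matrix(toy_target, toy_background)
```

In this toy case, identities score positive and mismatches score negative, and the background frequencies are fixed by the target frequencies rather than by the sequences being compared. That is the sense in which a standard matrix carries its own composition.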
Results: This paper presents the mathematical details underlying the compositional adjustment of amino acid or DNA substitution matrices.
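The abstract does not spell out the adjustment procedure, so the following is only an illustrative sketch of the general idea: rescale a set of target frequencies until their marginals match a new background composition, then recompute log-odds scores. It uses iterative proportional fitting, a generic technique swapped in here for illustration and not claimed to be the paper's algorithm. All names and numbers (`fit_marginals`, `biased_bg`, the two-letter alphabet) are hypothetical.

```python
import math

def fit_marginals(target, row_bg, col_bg, iterations=200):
    """Iterative proportional fitting: alternately rescale rows and
    columns of `target` so its marginals approach row_bg and col_bg."""
    q = [row[:] for row in target]  # work on a copy
    n = len(q)
    for _ in range(iterations):
        for i in range(n):  # match row sums to row_bg
            row_sum = sum(q[i])
            q[i] = [x * row_bg[i] / row_sum for x in q[i]]
        for j in range(n):  # match column sums to col_bg
            col_sum = sum(q[i][j] for i in range(n))
            for i in range(n):
                q[i][j] *= col_bg[j] / col_sum
    return q

def log_odds(q, row_bg, col_bg):
    """Unscaled log-odds scores ln(q_ij / (p_i * p'_j))."""
    n = len(q)
    return [[math.log(q[i][j] / (row_bg[i] * col_bg[j])) for j in range(n)]
            for i in range(n)]

# Hypothetical "standard" target frequencies and a biased composition.
standard_target = [[0.4, 0.1],
                   [0.1, 0.4]]
biased_bg = [0.7, 0.3]

adjusted_target = fit_marginals(standard_target, biased_bg, biased_bg)
adjusted_scores = log_odds(adjusted_target, biased_bg, biased_bg)
```

The adjusted target frequencies keep the substitution pattern of the starting matrix as far as the new marginals allow, which is one way to read "transforming a standard matrix for use in a non-standard compositional context".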