A new Fourier transform approach for protein coding measure based on the format of the Z curve
- PMID: 9789094
- DOI: 10.1093/bioinformatics/14.8.685
Abstract
Motivation: At the core of most protein gene-finding algorithms are the coding measures used to make a decision on coding/non-coding. Of the protein coding measures, the Fourier measure is one of the most important. However, due to the limited length of the windows usually used, the accuracy of the measure is not satisfactory. This paper is devoted to improving the accuracy by lengthening the sequence to amplify the periodicity of 3 in the coding regions.
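To make the idea of a Fourier coding measure concrete, the following is a minimal sketch of an ordinary period-3 Fourier measure computed on a single window. It is not the lengthen-shuffle algorithm of this paper, whose details are not given in the abstract. The function name `period3_measure`, the ±1 encoding table, and the normalisation by mean spectral power are illustrative assumptions.

```python
import numpy as np

# Assumed Z-curve-style +/-1 encodings of each base (illustrative convention):
#   x: purine (A, G) vs pyrimidine (C, T)
#   y: amino (A, C) vs keto (G, T)
#   z: weak (A, T) vs strong (G, C) hydrogen bonding
ENCODING = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}


def period3_measure(window: str) -> float:
    """Ratio of spectral power at frequency 1/3 to the mean non-DC power.

    Hypothetical helper: large values suggest a period-3 signal, as is
    expected in protein coding regions.
    """
    seq = window.upper()
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("window length must be a positive multiple of 3")

    # One row per component (x, y, z), one column per base.
    signals = np.array([ENCODING[base] for base in seq], dtype=float).T

    # Power spectrum of each component, summed over the three components.
    power = np.abs(np.fft.fft(signals, axis=1)) ** 2
    total = power.sum(axis=0)

    peak = total[n // 3]           # bin corresponding to period 3
    background = total[1:].mean()  # mean power, excluding the DC bin
    return float(peak / background)
```

With a window of 162 bp, the bin `n // 3` is index 54. A perfectly periodic window such as `"ATG" * 54` gives a value far above that of a typical random sequence, for which the ratio is close to 1 on average.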
Results: A new algorithm is presented called the lengthen-shuffle Fourier transform algorithm. For the same window length, the percentage accuracy of the new algorithm is 6-7% higher than that of the ordinary Fourier transform algorithm. The resulting percentage accuracy (average of specificity and sensitivity) of the new measure is 84.9% for the window length 162 bp.
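The accuracy figure quoted above is an average of sensitivity and specificity. The sketch below shows that arithmetic, assuming the common definitions (sensitivity as the fraction of coding windows classified as coding, specificity as the fraction of non-coding windows classified as non-coding). The abstract does not spell these definitions out, and the function name and the counts in the usage note are hypothetical.

```python
def average_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Average of sensitivity and specificity (assumed definitions).

    tp, fn: coding windows classified as coding / non-coding
    tn, fp: non-coding windows classified as non-coding / coding
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

For example, with hypothetical counts of 80 of 100 coding windows and 90 of 100 non-coding windows classified correctly, `average_accuracy(80, 20, 90, 10)` averages 0.80 and 0.90.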
Availability: The program is available on request from C.-T. Zhang.
Contact: ctzhang@tju.edu.cn