Discover 1: a new program to search for unusually represented DNA motifs
- PMID: 8255770
- PMCID: PMC310630
- DOI: 10.1093/nar/21.22.5152
Abstract
DISCOVER1 (DIStribution COunter VERsion 1) is a new program that identifies DNA motifs whose frequency deviates strongly from the expected frequency. The program generates families of patterns, each family sharing a common set of defined bases. Undefined bases are inserted among the defined bases in different ways, generating the diverse patterns of each family. The occurrences of the different patterns are then compared and analysed within each family, under the assumption that all patterns of a family should have the same probability of occurrence. Extensive use of computer memory, combined with the immediate sorting of counts by address calculation, allows a complete count of all DNA motifs in a single pass over the DNA sequence. This approach offers a very fast way to search for unusually distributed patterns and can identify inexact as well as exact patterns.
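The counting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the DISCOVER1 implementation: the function names (`gap_layouts`, `address_of`, `count_families`, `family_deviation`), the 2-bit base encoding, the `max_span` window limit, and the chi-square-style score are all assumptions made for illustration, since the abstract does not specify them. The sketch assumes at least two defined bases per family, with the first and last positions of each pattern always defined.

```python
from itertools import combinations

# Assumed 2-bit encoding of the four bases (not specified in the abstract).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def gap_layouts(k, max_span):
    """Return offset tuples placing k defined bases (k >= 2) in windows of
    width k..max_span; the first and last window positions are always defined."""
    layouts = []
    for span in range(k, max_span + 1):
        for inner in combinations(range(1, span - 1), k - 2):
            layouts.append((0, *inner, span - 1))
    return layouts


def address_of(bases):
    """Map a string of defined bases to its table index (base-4 number)."""
    address = 0
    for base in bases:
        address = address * 4 + BASE_CODE[base]
    return address


def count_families(seq, k, max_span):
    """Count every family under every gap layout in one pass over seq.

    counts[address][j] is the number of windows where layout j spells the
    defined bases encoded by address.
    """
    layouts = gap_layouts(k, max_span)
    counts = [[0] * len(layouts) for _ in range(4 ** k)]
    codes = [BASE_CODE.get(base, -1) for base in seq.upper()]
    # Only start positions where the widest layout fits are used, so every
    # layout is tested on the same number of windows.
    for start in range(len(codes) - max_span + 1):
        for j, offsets in enumerate(layouts):
            address = 0
            for off in offsets:
                code = codes[start + off]
                if code < 0:  # ambiguous base such as N: skip this window
                    break
                address = address * 4 + code
            else:
                counts[address][j] += 1
    return layouts, counts


def family_deviation(row):
    """Chi-square-style score against equal expected counts per layout
    (an assumed statistic, chosen only to illustrate the comparison)."""
    expected = sum(row) / len(row)
    if expected == 0:
        return 0.0
    return sum((c - expected) ** 2 / expected for c in row)
```

Calling `count_families(seq, 3, 4)` returns the gap layouts together with a table that has one row per family and one column per layout; passing a row to `family_deviation` gives a rough measure of how unevenly that family's patterns are distributed. Because the address is computed directly from the bases, each window updates its counter without any search or later sorting step, which is the property the abstract attributes to address calculation.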