Training back-propagation neural networks to define and detect DNA-binding sites
- PMID: 2014171
- PMCID: PMC333596
- DOI: 10.1093/nar/19.2.313
Abstract
A three-layered back-propagation neural network was trained to recognize E. coli promoters of the 17-base spacing class. To this end, the network was presented with 39 promoter sequences and derivatives of those sequences as positive inputs; random sequences of 60% A + T content and sequences containing 2 promoter-down point mutations were used as negative inputs. The entire promoter sequence of 58 bases, approximately -50 to +8, was entered as input. The network was asked to associate an output of 1.0 with promoter sequence input and 0.0 with non-promoter input. Generally, after 100,000 input cycles, the network was virtually perfect in classifying the training set. A trained network was about 80% effective in recognizing 'new' promoters that were not in the training set, with a false positive rate below 0.1%. Network searches on pBR322 and on the lambda genome were also performed. Overall, the results were somewhat better than those of the best rule-based procedures. The trained network can be analyzed both for its choice of base and for its relative weighting, positive and negative, at each position of the sequence. This method, which requires only appropriate input/output training pairs, can be used to define and search for any DNA regulatory sequence for which there are sufficient exemplars.
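The training setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the input encoding, the hidden-layer size, or the learning rate, so the four-units-per-base encoding, `n_hidden=4`, `lr=0.5`, and the scan threshold below are all assumptions. The positive examples are synthetic stand-ins (random 60% A + T sequences with the well-known -35 and -10 consensus hexamers planted 17 bases apart), not the 39 promoters used in the paper, and the toy run uses far fewer than 100,000 input cycles.

```python
import math
import random

BASES = "ACGT"
SEQ_LEN = 58  # window length from the abstract (about -50 to +8)


def one_hot(seq):
    """Assumed encoding: four 0/1 input units per base, flattened."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec


def random_at_rich(rng, length=SEQ_LEN, at_fraction=0.6):
    """Random sequence with expected A+T content of at_fraction."""
    gc = (1.0 - at_fraction) / 2
    at = at_fraction / 2
    return "".join(rng.choices(BASES, weights=[at, gc, gc, at], k=length))


def toy_positive(rng):
    """Synthetic stand-in for a promoter: consensus hexamers 17 bases apart."""
    seq = list(random_at_rich(rng))
    seq[15:21] = "TTGACA"  # -35 consensus; placement in the window is assumed
    seq[38:44] = "TATAAT"  # -10 consensus, after a 17-base spacer
    return "".join(seq)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerNet:
    """Input layer -> one hidden layer -> single sigmoid output unit."""

    def __init__(self, n_in, n_hidden, rng):
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.3, 0.3) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.3, 0.3) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(row[-1] + sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1]
        out = sigmoid(self.w2[-1] + sum(w * h for w, h in zip(self.w2, hidden)))
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One back-propagation update on squared error for a single example."""
        hidden, out = self.forward(x)
        delta_out = (target - out) * out * (1.0 - out)
        for j, h in enumerate(hidden):
            # Hidden delta uses the output weight before it is updated.
            delta_h = delta_out * self.w2[j] * h * (1.0 - h)
            self.w2[j] += lr * delta_out * h
            row = self.w1[j]
            for i, xi in enumerate(x):
                if xi:  # only active input units receive an update
                    row[i] += lr * delta_h * xi
            row[-1] += lr * delta_h
        self.w2[-1] += lr * delta_out
        return out


def train_toy(seed=0, n_each=30, epochs=30, n_hidden=4):
    """Train on synthetic positives (target 1.0) and random negatives (target 0.0)."""
    rng = random.Random(seed)
    data = [(one_hot(toy_positive(rng)), 1.0) for _ in range(n_each)]
    data += [(one_hot(random_at_rich(rng)), 0.0) for _ in range(n_each)]
    net = ThreeLayerNet(4 * SEQ_LEN, n_hidden, rng)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, target in data:
            net.train_step(x, target)
    return net, data


def scan(net, sequence, threshold=0.5):
    """Slide a SEQ_LEN window along a sequence; return (offset, score) hits."""
    hits = []
    for start in range(len(sequence) - SEQ_LEN + 1):
        score = net.forward(one_hot(sequence[start:start + SEQ_LEN]))[1]
        if score > threshold:
            hits.append((start, score))
    return hits
```

Calling `net, data = train_toy()` and then `scan(net, some_sequence)` mimics, on toy data, the kind of search the abstract reports for pBR322 and the lambda genome. The first-layer weights in `net.w1` are indexed four per position, so they can be inspected position by position, in the spirit of the per-position weighting analysis the abstract mentions.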
Similar articles
- Analysis of E.coli promoter structures using neural networks. Nucleic Acids Res. 1994 Jun 11;22(11):2158-65. doi: 10.1093/nar/22.11.2158. PMID: 8029027. Free PMC article.
- Escherichia coli promoters: neural networks develop distinct descriptions in learning to search for promoters of different spacing classes. Nucleic Acids Res. 1992 Jul 11;20(13):3471-7. doi: 10.1093/nar/20.13.3471. PMID: 1630917. Free PMC article.
- Neural network optimization for E. coli promoter prediction. Nucleic Acids Res. 1991 Apr 11;19(7):1593-9. doi: 10.1093/nar/19.7.1593. PMID: 2027766. Free PMC article.
- Escherichia coli promoters. II. A spacing class-dependent promoter search protocol. J Biol Chem. 1989 Apr 5;264(10):5531-4. PMID: 2647721.
- Consensus methods for finding and ranking DNA binding sites. Application to Escherichia coli promoters. J Mol Biol. 1989 May 20;207(2):301-10. doi: 10.1016/0022-2836(89)90256-8. PMID: 2666673.
Cited by
- A general procedure for locating and analyzing protein-binding sequence motifs in nucleic acids. Proc Natl Acad Sci U S A. 1998 Sep 1;95(18):10710-5. doi: 10.1073/pnas.95.18.10710. PMID: 9724769. Free PMC article.
- The Haemophilus influenzae dprABC genes constitute a competence-inducible operon that requires the product of the tfoX (sxy) gene for transcriptional activation. J Bacteriol. 1997 Aug;179(15):4815-20. doi: 10.1128/jb.179.15.4815-4820.1997. PMID: 9244270. Free PMC article.
- Non-canonical sequence elements in the promoter structure. Cluster analysis of promoters recognized by Escherichia coli RNA polymerase. Nucleic Acids Res. 1997 Dec 1;25(23):4703-9. doi: 10.1093/nar/25.23.4703. PMID: 9365247. Free PMC article.
- Computer-assisted prediction, classification, and delimitation of protein binding sites in nucleic acids. Nucleic Acids Res. 1993 Apr 11;21(7):1655-64. doi: 10.1093/nar/21.7.1655. PMID: 8479918. Free PMC article.
- DNA sequence and characterization of Haemophilus influenzae dprA+, a gene required for chromosomal but not plasmid DNA transformation. J Bacteriol. 1995 Jun;177(11):3235-40. doi: 10.1128/jb.177.11.3235-3240.1995. PMID: 7768823. Free PMC article.