Computer-assisted prediction, classification, and delimitation of protein binding sites in nucleic acids
- PMID: 8479918
- PMCID: PMC309377
- DOI: 10.1093/nar/21.7.1655
Abstract
We present a method to determine the location and extent of protein binding regions in nucleic acids by computer-assisted analysis of sequence data. The program ConsIndex establishes a library of consensus descriptions based on sequence sets containing known regulatory elements. These defined consensus descriptions are used by the program ConsInspector to predict binding sites in new sequences. We show that the programs correctly determine the significant regions involved in transcriptional control of seven sequence elements. The internal profile of relative variability of individual nucleotide positions within these regions paralleled experimental profiles of biological significance. Consensus descriptions are determined by employing an anchored alignment scheme, the results of which are then evaluated by a novel method which is superior to cluster algorithms. The alignment procedure is able to include several closely related sequences without biasing the consensus description. Moreover, the algorithm detects additional elements on the basis of a moderate distance correlation and is capable of discriminating between real binding sites and false positive matches. The software is well suited to cope with the frequent phenomenon of optional elements present in a subset of functionally similar sequences, while taking maximal advantage of the existing sequence database. Since it requires only a minimum of seven sequences for a single element, it is applicable to a wide range of binding sites.
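The general idea of deriving a consensus description from aligned sites and using it to search new sequences can be illustrated with a minimal sketch. This is not the ConsIndex or ConsInspector algorithm, whose anchored alignment and evaluation steps are not specified in the abstract. It swaps in a generic position frequency profile with an information-content weight per position, and every function name, the scoring formula, and the threshold are assumptions made for illustration. Only the minimum of seven sequences is taken from the abstract.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_profile(aligned_sites):
    """Column-wise base frequencies for equal-length, pre-aligned sites."""
    # The abstract states a minimum of seven sequences per element.
    if len(aligned_sites) < 7:
        raise ValueError("need at least seven aligned sites")
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("sites must be pre-aligned to equal length")
    profile = []
    for column in zip(*aligned_sites):
        counts = Counter(column)
        profile.append({b: counts[b] / len(aligned_sites) for b in BASES})
    return profile

def conservation(column):
    """Information content in bits: 0 for a uniform column, 2 for an invariant one."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return 2.0 - entropy

def consensus(profile):
    """Most frequent base at each position (ties resolved in A, C, G, T order)."""
    return "".join(max(BASES, key=lambda b: col[b]) for col in profile)

def scan(profile, sequence, threshold):
    """Return (offset, score) for every window scoring at or above threshold.

    Assumed scoring: conservation-weighted frequency of the observed base,
    divided by the total weight, so scores lie between 0 and 1.
    """
    weights = [conservation(col) for col in profile]
    total = sum(weights)
    if total == 0:
        return []  # no position carries any information
    width = len(profile)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        score = sum(
            w * col.get(base, 0.0)
            for w, col, base in zip(weights, profile, window)
        ) / total
        if score >= threshold:
            hits.append((i, score))
    return hits
```

In this sketch the per-position conservation values play a role loosely analogous to the profile of relative variability mentioned in the abstract: highly conserved positions dominate the score, while variable positions contribute little.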
Similar articles
- Specific modelling of regulatory units in DNA sequences. Pac Symp Biocomput. 1997:151-62. PMID: 9390288
- Computer tool FUNSITE for analysis of eukaryotic regulatory genomic sequences. Proc Int Conf Intell Syst Mol Biol. 1995;3:197-205. PMID: 7584437
- Context specific transcription factor prediction. Ann Biomed Eng. 2007 Jun;35(6):1053-67. doi: 10.1007/s10439-007-9268-z. Epub 2007 Mar 22. PMID: 17377845 Free PMC article.
- On computer-assisted analysis of biological sequences: proline punctuation, consensus sequences, and apolipoprotein repeats. J Lipid Res. 1986 Oct;27(10):1011-34. PMID: 3540168 Review.
- DNA binding sites: representation and discovery. Bioinformatics. 2000 Jan;16(1):16-23. doi: 10.1093/bioinformatics/16.1.16. PMID: 10812473 Review.
Cited by
- ALU repeats in promoters are position-dependent co-response elements (coRE) that enhance or repress transcription by dimeric and monomeric progesterone receptors. Mol Endocrinol. 2009 Jul;23(7):989-1000. doi: 10.1210/me.2009-0048. Epub 2009 Apr 16. PMID: 19372234 Free PMC article.
- Predicting gene regulatory elements in silico on a genomic scale. Genome Res. 1998 Nov;8(11):1202-15. doi: 10.1101/gr.8.11.1202. PMID: 9847082 Free PMC article.
- TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res. 1996 Jan 1;24(1):238-41. doi: 10.1093/nar/24.1.238. PMID: 8594589 Free PMC article.
- The design of transcription-factor binding sites is affected by combinatorial regulation. Genome Biol. 2005;6(12):R103. doi: 10.1186/gb-2005-6-12-r103. Epub 2005 Dec 2. PMID: 16356266 Free PMC article.
- Detecting DNA regulatory motifs by incorporating positional trends in information content. Genome Biol. 2004;5(7):R50. doi: 10.1186/gb-2004-5-7-r50. Epub 2004 Jun 24. PMID: 15239835 Free PMC article.