An editing environment for DNA sequence analysis and annotation (extended abstract)
- PMID: 9697184
- DOI: 10.2172/563243
Abstract
This paper presents a computer system for analyzing and annotating large-scale genomic sequences. The core of the system is a multiple-gene structure identification program, which predicts the most "probable" gene structures from the available evidence, including pattern recognition, EST, and protein homology information. A graphical user interface lets the user interactively control which evidence is used in the gene identification process. To overcome the computational bottleneck of the database similarity search used in gene identification, we have developed an effective way to partition a database into a set of sub-databases of "related" sequences, reducing the search problem on a large database to a signature identification problem followed by a search on a much smaller sub-database. This reduces the number of sequences to be searched from N to O(√N) on average, where N is the number of sequences in the original database, and hence greatly reduces the search time. The system gives the user the ability to direct and modify the analysis and modeling in real time.
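The two-stage search described in the abstract can be illustrated with a minimal sketch. The abstract does not say how sub-database signatures are built or matched, so everything here is an assumption made for illustration: each signature is taken to be the union of its members' 3-mers, matching is done by counting shared k-mers, and the names `kmers`, `build_index`, and `search` are hypothetical. With N sequences split into about √N sub-databases of about √N sequences each, a query costs roughly √N signature comparisons plus √N sequence comparisons.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two k-mer sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_index(partitions, k=3):
    """Pair each sub-database with a signature.

    Assumption: the signature is simply the union of the members' k-mers.
    """
    return [
        (set().union(*(kmers(s, k) for s in members)), members)
        for members in partitions
    ]


def search(query, index, k=3):
    """Two-stage search over a partitioned database.

    Stage 1 (signature identification): pick the sub-database whose
    signature shares the most k-mers with the query.
    Stage 2: scan only that sub-database for the most similar sequence.
    Returns the best hit and the number of comparisons performed.
    """
    q = kmers(query, k)
    _, members = max(index, key=lambda entry: len(q & entry[0]))
    best = max(members, key=lambda s: jaccard(q, kmers(s, k)))
    return best, len(index) + len(members)


# Toy database: N = 9 sequences in 3 sub-databases of 3 "related" sequences.
partitions = [
    ["ACGTACGTAC", "ACGTACGTTC", "ACGTACGAAC"],
    ["GGGCCCGGGC", "GGGCCCGGTC", "GGGCCAGGGC"],
    ["TTTTAAAATT", "TTTTAAAAGT", "TTTAAAAATT"],
]
index = build_index(partitions)
hit, comparisons = search("GGGCCCGGTA", index)
print(hit, comparisons)
```

On this toy input the query is compared against 3 signatures and 3 sequences (6 comparisons) instead of all 9 sequences. The gap widens as N grows, which is the source of the O(√N) figure, provided the partitions stay roughly balanced.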