Informatics issues in large-scale sequence analysis: elucidating the protein kinases of C. elegans
- PMID: 11074585
- DOI: 10.1002/1097-4644(20010201)80:2<181::aid-jcb30>3.0.co;2-1
Abstract
With the availability of the nearly complete genomic sequence of C. elegans, the first multicellular organism to be sequenced, molecular biology has entered the postgenomic era. Annotation of the genomic sequence, that is, identifying the genes and other biologically relevant sections of the genome, is an important and nontrivial next step. A first-pass annotation will necessarily be incomplete, but it will drive further biological experiments, which in turn will help to improve the annotation. Given the scale of genome sequence analysis, annotation should be automated as much as possible without sacrificing the quality of the analysis. In this work, we outline our approach to identifying the protein kinases of C. elegans from the genomic sequence. We describe new tools we have developed for the analysis, management, and visualization of genomic data. By developing modular and scalable solutions, this study provides a framework for future analysis of the Drosophila and human genomes.
Copyright 2000 Wiley-Liss, Inc.