Using database matches with HMMGene for automated gene detection in Drosophila
- PMID: 10779492
- PMCID: PMC310864
- DOI: 10.1101/gr.10.4.523
Abstract
The application of the gene finder HMMGene to the Adh region of Drosophila melanogaster is described, and the prediction results are analyzed. HMMGene is based on a probabilistic model called a hidden Markov model, and this probabilistic framework facilitates the inclusion of database matches of varying degrees of certainty. It is shown that database matches clearly improve the performance of the gene finder. For instance, the sensitivity for coding exons predicted with both ends correct grows from 62% to 70% on a high-quality test set when matches to proteins, cDNAs, repeats, and transposons are included. When ESTs are used, however, the specificity drops more than the sensitivity increases. This is due to the high noise level in EST matches; the reasons for this noise and possible ways of reducing it are discussed in more detail.
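The exon-level figures quoted in the abstract count a coding exon as correct only when both of its ends match the annotation. A minimal sketch of that kind of scoring is given below. It assumes the convention common in the gene-finding literature, where sensitivity is the fraction of annotated exons predicted exactly and specificity is the fraction of predicted exons that are exactly correct. The function name, the `(start, end)` tuple representation, and the toy coordinates are illustrative assumptions, not taken from the paper, and strand is ignored for brevity.

```python
from typing import Iterable, Tuple

# Hypothetical representation: an exon is a (start, end) coordinate pair.
Exon = Tuple[int, int]


def exon_level_accuracy(
    annotated: Iterable[Exon], predicted: Iterable[Exon]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the exact-exon level.

    An exon counts as correct only if both ends match an annotated exon.
    Assumed convention: specificity = correct / predicted.
    """
    annotated_set = set(annotated)
    predicted_set = set(predicted)
    exact = annotated_set & predicted_set  # both ends correct

    sensitivity = len(exact) / len(annotated_set) if annotated_set else 0.0
    specificity = len(exact) / len(predicted_set) if predicted_set else 0.0
    return sensitivity, specificity


# Toy data, invented purely for illustration.
annotated = [(100, 200), (300, 450), (600, 720), (900, 1000)]
predicted = [(100, 200), (300, 440), (600, 720), (900, 1000), (1100, 1200)]

sn, sp = exon_level_accuracy(annotated, predicted)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

In this toy case the exon predicted as `(300, 440)` has one wrong end and so counts both as a missed annotated exon and as a wrong prediction, which is why noisy evidence can lower specificity even while sensitivity rises.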
Comment in
- A biologist's view of the Drosophila genome annotation assessment project. Genome Res. 2000 Apr;10(4):391-3. doi: 10.1101/gr.10.4.391. PMID: 10779478. Review.