This is a preprint.
Efficient evidence-based genome annotation with EviAnn
- PMID: 40463080
- PMCID: PMC12132231
- DOI: 10.1101/2025.05.07.652745
Abstract
For many years, machine learning-based ab initio gene finding approaches have been central components of eukaryotic genome annotation pipelines, and they remain so today. The reliance on these approaches was originally sustained by the high cost and low availability of gene expression data, a primary source of evidence for gene annotation along with protein homology. However, innovations in modern sequencing technologies have revolutionized the acquisition of gene expression data, allowing scientists to rely more heavily on this class of evidence. In addition, proteins found in a multitude of well-annotated genomes represent another invaluable resource for gene annotation. Existing annotation packages often underutilize these data sources, which prompted us to develop EviAnn (Evidence-based Annotator), a novel evidence-based eukaryotic gene annotation system. EviAnn takes a strongly data-driven approach, building the exon-intron structure of genes from transcript alignments or protein-sequence homology rather than from purely ab initio gene finding techniques. We show that when provided with the same input data, EviAnn consistently outperforms current state-of-the-art packages including BRAKER3, MAKER2, and FINDER, while utilizing considerably less computer time. Annotation of a mammalian genome can be completed in less than an hour on a single multi-core server. EviAnn is freely available under an open-source license from https://github.com/alekseyzimin/EviAnn_release and from Bioconda as "eviann".