Galba: genome annotation with miniprot and AUGUSTUS
- PMID: 37653395
- PMCID: PMC10472564
- DOI: 10.1186/s12859-023-05449-z
Abstract
Background: The Earth BioGenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes continue to lack annotation of protein-coding genes. In addition, no transcriptome data is available for some genomes.
Results: Various gene annotation tools have been developed, but each has its limitations. Here, we introduce GALBA, a fully automated pipeline that uses miniprot, a rapid protein-to-genome aligner, in combination with AUGUSTUS to predict genes with high accuracy. Accuracy results indicate that GALBA is particularly strong in the annotation of large vertebrate genomes. We also present use cases in insects, vertebrates, and a land plant. GALBA is fully open source and available as a Docker image for easy execution with Singularity in high-performance computing environments.
Conclusions: Our pipeline addresses the critical need for accurate gene annotation in newly sequenced genomes, and we believe that GALBA will greatly facilitate genome annotation for diverse organisms.
Keywords: AUGUSTUS; Gene prediction; Miniprot; Protein coding gene.
© 2023. BioMed Central Ltd., part of Springer Nature.
Conflict of interest statement
The authors declare that they have no competing interests.
Update of
- GALBA: Genome Annotation with Miniprot and AUGUSTUS. bioRxiv [Preprint]. 2023 Apr 10:2023.04.10.536199. doi: 10.1101/2023.04.10.536199. PMID: 37090650. Update in: BMC Bioinformatics. 2023 Aug 31;24(1):327. doi: 10.1186/s12859-023-05449-z.
