BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database
- PMID: 33575650
- PMCID: PMC7787252
- DOI: 10.1093/nargab/lqaa108
Abstract
The task of eukaryotic genome annotation remains challenging. Only a few genomes can serve as annotation standards, and these were achieved through a tremendous investment of human curation effort. Even in the best-annotated genomes, the correctness of all alternative isoforms remains a good subject for further investigation. The new BRAKER2 pipeline generates external protein support and integrates it into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS. BRAKER2 continues the line started by BRAKER1, in which self-training GeneMark-ET and AUGUSTUS made gene predictions supported by transcriptomic data. Among the challenges addressed by the new pipeline was the generation of reliable hints to protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. Unlike other pipelines for eukaryotic genome annotation, BRAKER2 is fully automatic. Under equal conditions it compares favorably with other pipelines, e.g. MAKER2, in terms of accuracy and performance. Development of BRAKER2 should help harmonize the annotation of protein-coding genes across genomes of different eukaryotic species. However, we fully understand that several more innovations are needed in transcriptomic and proteomic technologies, as well as in algorithmic development, to reach the goal of highly accurate annotation of eukaryotic genomes.
© The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.