BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS
- PMID: 26559507
- PMCID: PMC6078167
- DOI: 10.1093/bioinformatics/btv661
Abstract
Motivation: Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a workflow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. The complementary strengths of GeneMark-ET and AUGUSTUS motivated the design of a new combined tool for automatic gene prediction.
Results: We present BRAKER1, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS. As input, BRAKER1 requires a genome assembly file and a file in BAM format with spliced alignments of RNA-Seq reads to the genome. First, GeneMark-ET performs iterative training and generates initial gene structures. Second, AUGUSTUS uses the predicted genes for training and then integrates RNA-Seq read information into the final gene predictions. In our experiments, we observed that BRAKER1 was more accurate than MAKER2 when RNA-Seq was used as the sole source for training and prediction. BRAKER1 does not require pre-trained parameters or a separate expert-prepared training step.
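The two inputs and the stage order described above can be sketched in Python. This is an illustrative sketch only, not the authors' implementation. The executable name `braker.pl` and the `--genome=` / `--bam=` options are assumptions not stated in this abstract, and the file names are hypothetical. The helper only assembles a command line and never runs it.

```python
from typing import List, Tuple

# Stage order as described in the abstract (descriptions paraphrase the text).
PIPELINE_STAGES: List[Tuple[str, str]] = [
    ("GeneMark-ET", "iterative unsupervised training; generates initial gene structures"),
    ("AUGUSTUS training", "trains on the genes predicted by GeneMark-ET"),
    ("AUGUSTUS prediction", "integrates RNA-Seq read information into final gene predictions"),
]


def build_braker_command(genome: str, bam: str, executable: str = "braker.pl") -> List[str]:
    """Assemble (but do not execute) a BRAKER1-style command line.

    Assumption: the pipeline is started through an executable named
    `braker.pl` that accepts `--genome=` and `--bam=` options. Check the
    documentation of the installed version before relying on this.
    """
    if not genome or not bam:
        raise ValueError("both a genome assembly file and a BAM file are required")
    return [executable, f"--genome={genome}", f"--bam={bam}"]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    command = build_braker_command("genome.fa", "rnaseq.bam")
    print(" ".join(command))
    for number, (stage, description) in enumerate(PIPELINE_STAGES, start=1):
        print(f"{number}. {stage}: {description}")
```

Returning the command as a list rather than a single string makes it easy to inspect before handing it to a process launcher such as `subprocess.run`.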
Availability and implementation: BRAKER1 is available for download at http://bioinf.uni-greifswald.de/bioinf/braker/ and http://exon.gatech.edu/GeneMark/
Contact: katharina.hoff@uni-greifswald.de or borodovsky@gatech.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.