Annotation of 2,507 Saccharomyces cerevisiae genomes
- PMID: 38488392
- PMCID: PMC10986567
- DOI: 10.1128/spectrum.03582-23
Erratum in
- Erratum for Wang et al., "Annotation of 2,507 Saccharomyces cerevisiae genomes". Microbiol Spectr. 2024 Nov 12;12(12):e0237424. doi: 10.1128/spectrum.02374-24. Online ahead of print. PMID: 39527776 Free PMC article. No abstract available.
Abstract
Saccharomyces cerevisiae (baker's yeast, budding yeast) is one of the most important model organisms for biological research and is a crucial microorganism in industry. Currently, a huge number of Saccharomyces cerevisiae genome sequences are available in the public domain. However, these genomes are distributed across different websites, and a large number of them were released without annotation. To provide a single, complete annotated genome data resource, we collected 2,507 Saccharomyces cerevisiae genome assemblies and re-annotated 2,506 assemblies using a custom annotation pipeline, producing a total of 15,407,164 protein-coding gene models. A second custom pipeline clustered all of these gene sequences into families. A total of 1,506 single-copy genes were selected as marker genes and used to evaluate the genome completeness and base quality of all assemblies. Pangenomic analyses were performed on a selected subset of 847 medium-high-quality genomes. Statistical comparisons revealed a number of gene families showing copy number variations among different organism sources. To the authors' knowledge, this study represents the largest genome annotation project of S. cerevisiae so far, providing rich genomic resources for future studies of the model organism S. cerevisiae and its relatives.
IMPORTANCE
Saccharomyces cerevisiae (baker's yeast, budding yeast) is one of the most important model organisms for biological research and is a crucial microorganism in industry. Although a huge number of Saccharomyces cerevisiae genome sequences are available in the public domain, these genomes are distributed across different websites and most were released without annotation, hindering the efficient reuse of these genome resources. Here, we collected 2,507 genomes for Saccharomyces cerevisiae, performed genome annotation, and evaluated genome quality. All the data obtained have been deposited in public repositories and are freely accessible to the community. This study represents the largest genome annotation project of S. cerevisiae so far, providing one complete annotated genome data set for S. cerevisiae, an important workhorse for fundamental biology, biotechnology, and industry.
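The abstract does not describe how the marker genes were chosen or how completeness was scored, so the following is only a minimal sketch of one common way to do it, not the authors' pipeline. It assumes gene families are already available as per-genome copy counts; the function names, the `min_fraction` threshold, and the toy data are all hypothetical.

```python
from collections import Counter


def select_single_copy_markers(copy_counts, min_fraction=0.95):
    """Pick families present as exactly one copy in most genomes.

    copy_counts: {genome_id: {family_id: copy_number}} (assumed input layout).
    min_fraction: hypothetical cutoff, not a value taken from the paper.
    """
    n_genomes = len(copy_counts)
    single_copy_hits = Counter()
    for families in copy_counts.values():
        for family, copies in families.items():
            if copies == 1:
                single_copy_hits[family] += 1
    return {
        family
        for family, hits in single_copy_hits.items()
        if hits / n_genomes >= min_fraction
    }


def completeness(families, markers):
    """Fraction of marker families found at least once in one genome."""
    if not markers:
        return 0.0
    present = sum(1 for marker in markers if families.get(marker, 0) >= 1)
    return present / len(markers)


# Toy data for illustration only.
toy_counts = {
    "g1": {"famA": 1, "famB": 1, "famC": 2},
    "g2": {"famA": 1, "famB": 1, "famC": 1},
    "g3": {"famA": 1, "famC": 3},
}
toy_markers = select_single_copy_markers(toy_counts, min_fraction=0.6)
toy_scores = {g: completeness(f, toy_markers) for g, f in toy_counts.items()}
```

In this toy case `famC` is excluded because it is single-copy in only one of three genomes, and `g3` scores lower than the others because it lacks `famB`.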
Keywords: Saccharomyces cerevisiae; annotation; genome.
Conflict of interest statement
Xiaoping Hou, Yang He, Jun-Hong Yu, Shumin Hu, and Hua Yin are employed by Tsingtao Brewery Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Similar articles
- The new modern era of yeast genomics: community sequencing and the resulting annotation of multiple Saccharomyces cerevisiae strains at the Saccharomyces Genome Database. Database (Oxford). 2013 Mar 13;2013:bat012. doi: 10.1093/database/bat012. Print 2013. PMID: 23487186 Free PMC article.
- Genome-wide metabolic re-annotation of Ashbya gossypii: new insights into its metabolism through a comparative analysis with Saccharomyces cerevisiae and Kluyveromyces lactis. BMC Genomics. 2014 Sep 24;15(1):810. doi: 10.1186/1471-2164-15-810. PMID: 25253284 Free PMC article.
- Saccharomyces cerevisiae: gene annotation and genome variability, state of the art through comparative genomics. Methods Mol Biol. 2011;759:31-40. doi: 10.1007/978-1-61779-173-4_2. PMID: 21863479
- The Ecology and Evolution of the Baker's Yeast Saccharomyces cerevisiae. Genes (Basel). 2022 Jan 26;13(2):230. doi: 10.3390/genes13020230. PMID: 35205274 Free PMC article. Review.
- European Functional Analysis Network (EUROFAN) and the functional analysis of the Saccharomyces cerevisiae genome. Electrophoresis. 1998 Apr;19(4):617-24. doi: 10.1002/elps.1150190427. PMID: 9588813 Review.
Cited by
- Mapping-based genome size estimation. BMC Genomics. 2025 May 14;26(1):482. doi: 10.1186/s12864-025-11640-8. PMID: 40369445 Free PMC article.
- Microbial interactions and ecology in fermented food ecosystems. Nat Rev Microbiol. 2025 May 23. doi: 10.1038/s41579-025-01191-w. Online ahead of print. PMID: 40410356 Review.
- Overview of the Saccharomyces cerevisiae population structure through the lens of 3,034 genomes. G3 (Bethesda). 2024 Nov 19;14(12):jkae245. doi: 10.1093/g3journal/jkae245. Online ahead of print. PMID: 39559979 Free PMC article.