Nucleic Acids Res. 2022 Jan 7;50(D1):D996-D1003.
doi: 10.1093/nar/gkab1007.

Ensembl Genomes 2022: an expanding genome resource for non-vertebrates

Andrew D Yates et al. Nucleic Acids Res.

Abstract

Ensembl Genomes (https://www.ensemblgenomes.org) provides access to non-vertebrate genomes and analysis complementing vertebrate resources developed by the Ensembl project (https://www.ensembl.org). The two resources collectively present genome annotation through a consistent set of interfaces spanning the tree of life, presenting genome sequence, annotation, variation, transcriptomic data and comparative analysis. Here, we present our largest increase in plant, metazoan and fungal genomes since the project's inception, creating one of the world's most comprehensive genomic resources, and describe our efforts to reduce genome redundancy in our Bacteria portal. We detail our new efforts in gene annotation, our emerging support for pangenome analysis, our efforts to accelerate data dissemination through the Ensembl Rapid Release resource and our new AlphaFold visualization. Finally, we present details of our future plans, including updates on our integration with Ensembl and how we plan to improve our support for the microbial research community. Software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license). Data updates are synchronised with Ensembl's release cycle.
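
As an illustration of the programmatic interfaces mentioned in the abstract, the following is a minimal sketch of querying a gene record over REST. It assumes the public Ensembl REST server at https://rest.ensembl.org and its `lookup/id` endpoint also serve non-vertebrate species; neither is stated in the abstract. The helper names `build_lookup_url` and `fetch_json` are hypothetical, and the gene identifier AT5G61850 is inferred from the transcript AT5G61850.1 named in the Figure 3 caption.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server (not named in the abstract).
REST_SERVER = "https://rest.ensembl.org"


def build_lookup_url(stable_id: str, expand: bool = False) -> str:
    """Build a lookup/id URL for a stable identifier (hypothetical helper)."""
    path = "/lookup/id/" + urllib.parse.quote(stable_id, safe="")
    query = {"content-type": "application/json"}
    if expand:
        # Assumption: expand=1 asks the server to include child features.
        query["expand"] = "1"
    # safe="/" keeps the media type readable as application/json.
    return REST_SERVER + path + "?" + urllib.parse.urlencode(query, safe="/")


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Fetch a URL and decode the JSON body (performs a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json(build_lookup_url("AT5G61850"))` would then return the record as a dictionary, provided the server is reachable and the identifier exists in the current release.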

Figures

Figure 1.
Shows the change in Ensembl Bacteria's collection, aggregated by our ten most-represented phyla, between releases 48 and 49. Component A shows the overall change in genome numbers in each phylum, with over 15,000 genomes coming from three phyla. Component B demonstrates that overall family coverage within phyla has improved irrespective of the removal of genomes. Component C shows an increase in genomes without a known family, with the majority occurring in Proteobacteria.
Figure 2.
EPO multiple genome alignment visualization of chromosome 1 in three rice genomes: Oryza sativa indica Group (top), Oryza sativa japonica Group (middle) and Oryza glaberrima (bottom). Orange discontinuous blocks represent the areas of alignment across all three genomes. Each genome displays its genes and can be used to identify regions of uniqueness in each genome and identify potential areas of mis-assembly or mis-annotation. This alignment can be browsed at http://plants.ensembl.org/Oryza_nivara/Location/Compara_Alignments/Image?align=9910;db=core;r=1:586653-632276.
Figure 3.
An AlphaFold 3D prediction for the Arabidopsis thaliana protein Q00958 (LFY: AT5G61850.1) displayed as a Richardson model using Mol*. The central panel annotates the model with regions of high confidence (blue) to low confidence (orange), with its protein sequence displayed above. The right-hand panel enables highlighting of one or more exons, variants and protein features, which are toggled by clicking on the eye icon. Variants can be turned on or off individually or according to how deleterious or tolerated they are. Only variants resulting in protein changes with SIFT scores are made available for display.
