Nucleic Acids Res. 2015 Jan;43(Database issue):D599-605.
doi: 10.1093/nar/gku1062. Epub 2014 Dec 15.

Update on RefSeq microbial genomes resources

Tatiana Tatusova et al. Nucleic Acids Res. 2015 Jan.

Abstract

The NCBI RefSeq genome collection (http://www.ncbi.nlm.nih.gov/genome) represents all three major domains of life (Eukarya, Bacteria and Archaea) as well as viruses. Prokaryotic genome sequences are the most rapidly growing part of the collection. During 2014, more than 10,000 microbial genome assemblies were publicly released, bringing the total number of prokaryotic genomes close to 30,000. We continue to improve the quality and usability of the microbial genome resources by providing easy access to the data and to the results of pre-computed analyses, and by improving analysis and visualization tools. A number of improvements have been incorporated into the Prokaryotic Genome Annotation Pipeline. Several new features have been added to the RefSeq prokaryotic genome data processing pipeline, including the calculation of genome groups (clades) and the optimization of protein cluster generation using a pan-genome approach.

Figures

Figure 1.
Growth of genomes, species and genera: rapid growth of the number of isolates with relatively slow growth of new genera. Note that the data does not include assemblies from environmental studies where the number of novel species is growing much faster.
Figure 2.
Shigella/Escherichia coli pan-genome: core and mobile components.
Figure 3.
Escherichia coli genomes grouped by BLAST similarity distance; each green box represents a short proximity genome group.
Figure 4.
RefSeq prokaryotic genome processing: pan-genome and genome close proximity groups.
