Microbiome. 2020 Sep 16;8(1):134.
doi: 10.1186/s40168-020-00903-z.

Estimate of the sequenced proportion of the global prokaryotic genome

Zheng Zhang et al. Microbiome. 2020.

Abstract

Background: Sequencing prokaryotic genomes has revolutionized our understanding of the many roles played by microorganisms. However, the proportions of cells and taxa on Earth that belong to genome-sequenced bacteria or archaea remain unknown. This study explored this basic question using large-scale alignment between the sequences released by the Earth Microbiome Project and 155,810 prokaryotic genomes from public databases.

Results: The median proportions of genome-sequenced cells and taxa (at 100% identity in the 16S-V4 region) in different biomes reached 38.1% (16.4-86.3%) and 18.8% (9.1-52.6%), respectively. The sequenced proportions of the prokaryotic genomes in biomes were significantly negatively correlated with the alpha diversity indices, and the proportions sequenced in host-associated biomes were significantly higher than those in free-living biomes. Because a set of cosmopolitan OTUs that occur in multiple samples has been preferentially sequenced, the per-biome proportions overstate global coverage: only 2.1% of the global prokaryotic taxa are represented by sequenced genomes. Most of the biomes were occupied by a few predominant taxa with a high relative abundance and much higher genome-sequenced proportions than the numerous rare taxa.

Conclusions: These results reveal the current state of prokaryotic genome sequencing across Earth's biomes, offer guidance for a more reasonable and efficient exploration of prokaryotic genomes, and promote our understanding of microbial ecological functions.
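The difference between the cell proportion, the taxon proportion, and the much lower global figure reported in the Results can be illustrated with a small sketch. It is not the authors' pipeline: the OTU table, the sample names, and the set of "sequenced" OTUs are invented toy data, and the helper functions `cell_proportion` and `taxon_proportion` are hypothetical names introduced only for this illustration.

```python
from statistics import median

# Toy OTU table: sample -> {OTU id: cell (read) count}. All values are invented.
samples = {
    "gut_1":  {"otu_A": 900, "otu_B": 80, "otu_C": 20},
    "gut_2":  {"otu_A": 700, "otu_B": 250, "otu_D": 50},
    "soil_1": {"otu_A": 100, "otu_E": 300, "otu_F": 300, "otu_G": 300},
}

# Assumption: these OTUs match a sequenced genome at 100% identity.
sequenced = {"otu_A", "otu_B"}


def cell_proportion(counts, sequenced):
    """Abundance-weighted share: cells belonging to sequenced OTUs / all cells."""
    total = sum(counts.values())
    hits = sum(n for otu, n in counts.items() if otu in sequenced)
    return hits / total


def taxon_proportion(otus, sequenced):
    """Unweighted share: distinct sequenced OTUs / all distinct OTUs."""
    otus = set(otus)
    return len(otus & sequenced) / len(otus)


per_sample_cells = {s: cell_proportion(c, sequenced) for s, c in samples.items()}
per_sample_taxa = {s: taxon_proportion(c, sequenced) for s, c in samples.items()}

# Pool every distinct OTU across samples, then ask how many are sequenced.
pooled_otus = set().union(*samples.values())
pooled_taxa = taxon_proportion(pooled_otus, sequenced)

print("median cell proportion :", median(per_sample_cells.values()))
print("median taxon proportion:", median(per_sample_taxa.values()))
print("pooled taxon proportion:", pooled_taxa)
```

In this toy table the widespread, sequenced `otu_A` is counted once per sample in the per-sample proportions but only once in the pooled set, while each sample adds its own unsequenced rare OTUs to the pool. The pooled taxon proportion therefore falls below the per-sample median, which is the same direction of effect the abstract attributes to cosmopolitan OTUs.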

Keywords: Earth microbiome project; Genome sequencing; Microbiome; Predominant taxa; Prokaryotic biome.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Genome-sequenced degree of prokaryotic biomes. a Genome-sequenced proportion of cells. b Genome-sequenced proportion of taxa. OTUs share 100% identities with the sequenced genomes. Based on the analysis of 10,000 EMP samples, each gray point represents a single sample. For the box plots, the middle line indicates the median, the box represents the 25th–75th percentiles, and the error bar indicates the 10th–90th percentiles of observations. Environment types were classified by EMPO; red represents host associated and green represents free living
Fig. 2
High genome-sequenced proportion of prokaryotic cosmopolitan taxa. a OTUs that can exist in one or more samples. b OTUs that can exist in one or more environment types. The gray column represents the proportion of OTUs that can exist in one or more samples (environments), and the red column represents the genome-sequenced proportion of OTUs. c Lower POTU than BOTU is caused by a high genome-sequenced proportion of cosmopolitan taxa
Fig. 3
Genome-sequenced proportion of prokaryotic taxa from global or different environment types. a As the number of samples increases, the POTU (100%) declines exponentially and finally stabilizes at 2.1%. A random selection of 1000, 2000, …, 9000 samples was performed 10 times for each group to calculate the mean value and standard deviation. b Significant difference of POTU among environment types. The red point is POTU (100%), the blue point is POTU (98.6%), and the orange point is POTU (97%)
Fig. 4
High genome-sequenced proportion of prokaryotic taxa with high abundance. a The top 1% of the prokaryotic taxa account for 72.9% of the global prokaryotic biomes. b The top 1% of the prokaryotic taxa from different environment types accounted for more than 40% with a genome-sequenced proportion greater than 10%. The gray column represents the cellular proportion of the top 1% of the taxa, and the red column represents the POTU (100%). c High genome-sequenced proportion of the top 1%. The red line is POTU (100%), the blue line is POTU (98.6%), and the orange line is POTU (97%)
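
The resampling described in the caption of Fig. 3a (random subsets of samples, 10 replicates per subset size, mean and standard deviation of the pooled POTU) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the samples are synthetic, each holding the same 5 "cosmopolitan" OTUs (assumed sequenced) plus 20 rare OTUs unique to that sample (assumed unsequenced), and `potu` and `subsample_potu` are hypothetical helper names.

```python
import random
from statistics import mean, stdev


def potu(sample_otus, sequenced):
    """Proportion of distinct OTUs, pooled over the given samples, that are sequenced."""
    pooled = set().union(*sample_otus)
    return len(pooled & sequenced) / len(pooled)


def subsample_potu(samples, sequenced, size, replicates=10, seed=0):
    """Draw `size` samples without replacement `replicates` times; return mean and SD of POTU."""
    rng = random.Random(seed)
    values = [potu(rng.sample(samples, size), sequenced) for _ in range(replicates)]
    return mean(values), stdev(values)


# Synthetic data (assumption): 200 samples, each with 5 shared and 20 unique OTUs.
cosmopolitan = {f"cosmo_{i}" for i in range(5)}
samples = [
    cosmopolitan | {f"rare_{s}_{j}" for j in range(20)}
    for s in range(200)
]
sequenced = set(cosmopolitan)  # only the shared OTUs have genomes in this toy setup

for size in (10, 50, 100, 200):
    m, sd = subsample_potu(samples, sequenced, size)
    print(f"{size:>4} samples: POTU mean={m:.4f} sd={sd:.4f}")
```

Because every synthetic sample contributes new unsequenced OTUs while the sequenced set stays fixed, the pooled POTU here falls steadily as more samples are included. Real data would level off once additional samples stop contributing new taxa, which this toy setup does not model.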
