Nat Commun. 2022 Apr 7;13(1):1908.
doi: 10.1038/s41467-022-29438-7.

MiDAS 4: A global catalogue of full-length 16S rRNA gene sequences and taxonomy for studies of bacterial communities in wastewater treatment plants

Morten Kam Dahl Dueholm et al. Nat Commun. 2022.

Abstract

Microbial communities are responsible for biological wastewater treatment, but our knowledge of their diversity and function is still poor. Here, we sequence more than 5 million high-quality, full-length 16S rRNA gene sequences from 740 wastewater treatment plants (WWTPs) across the world and use the sequences to construct the 'MiDAS 4' database. MiDAS 4 is an amplicon sequence variant resolved, full-length 16S rRNA gene reference database with a comprehensive taxonomy from domain to species level for all sequences. We use an independent dataset (269 WWTPs) to show that MiDAS 4, compared to commonly used universal reference databases, provides a better coverage for WWTP bacteria and an improved rate of genus and species level classification. Taking advantage of MiDAS 4, we carry out an amplicon-based, global-scale microbial community profiling of activated sludge plants using two common sets of primers targeting regions of the 16S rRNA gene, revealing how environmental conditions and biogeography shape the activated sludge microbiota. We also identify core and conditionally rare or abundant taxa, encompassing 966 genera and 1530 species that represent approximately 80% and 50% of the accumulated read abundance, respectively. Finally, we show that for well-studied functional guilds, such as nitrifiers or polyphosphate-accumulating organisms, the same genera are prevalent worldwide, with only a few abundant species in each genus.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Sampling of WWTPs across the world.
a Geographical distribution of WWTPs included in the study and their process configuration. b Distribution of plant types. MBBR moving bed bioreactor, MBR membrane bioreactor. c Distribution of process types for the activated sludge plants. C carbon removal, C,N carbon removal with nitrification, C,N,DN carbon removal with nitrification and denitrification, C,N,DN,P carbon removal with nitrogen removal and enhanced biological phosphorus removal (EBPR). The values next to the bars are the number of WWTPs in each group.
Fig. 2. Novel sequences and de novo taxa defined in the MiDAS 4 reference database.
The phylogenetic trees are based on a multiple alignment of all MiDAS 4 reference sequences, which were first aligned against the global SILVA 138 alignment using the SINA aligner, and subsequently pruned according to the ssuref:bacteria positional variability by parsimony filter in ARB to remove hypervariable regions. The eight phyla with most FL-ASVs are highlighted in different colours. Sequence novelty was determined by the percent identity between each FL-ASV and their closest relative in the SILVA_138_SSURef_Nr99 database according to Usearch mapping and the taxonomic thresholds proposed by Yarza et al. shown in Table 1. Taxonomy novelty was defined based on the assignment of de novo taxa by AutoTax.
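The sequence-novelty step in this caption can be sketched as a simple threshold lookup. This is a minimal illustration, not the authors' pipeline: Table 1 is not reproduced here, so the identity cut-offs below are assumed to be the values published by Yarza et al. (2014), and the function name `novelty_rank` is hypothetical. Whether a sequence exactly at a threshold counts as novel is also an assumption.

```python
# Percent-identity thresholds per rank, assumed from Yarza et al. (2014);
# Table 1 of the article is not shown in this excerpt.
YARZA_THRESHOLDS = [
    ("species", 98.7),
    ("genus", 94.5),
    ("family", 86.5),
    ("order", 82.0),
    ("class", 78.5),
    ("phylum", 75.0),
]


def novelty_rank(identity_pct):
    """Return the most general rank at which a sequence is novel.

    identity_pct is the percent identity to the closest reference sequence.
    Returns None when the identity is at or above the species threshold.
    """
    novel = None
    for rank, threshold in YARZA_THRESHOLDS:
        if identity_pct < threshold:
            novel = rank  # below this threshold: novel at this rank
        else:
            break  # thresholds decrease, so no more general rank can apply
    return novel
```

For example, under these assumed thresholds an FL-ASV with 96% identity to its closest SILVA relative would be labelled a novel species within a known genus.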
Fig. 3. Database evaluation based on amplicon data from the Global Water Microbiome Consortium project.
Raw amplicon data from the Global Water Microbiome Consortium project was processed to resolve ASVs of the 16S rRNA gene V4 region. The ASVs for each of the samples were filtered based on their relative abundance (only ASVs with ≥0.01% relative abundance were kept) before the analyses. The percentage of the microbial community represented by the remaining ASVs after the filtering was 88.35 ± 2.98% (mean ± SD) across samples. High-identity (≥99%) hits were determined by the stringent mapping of ASVs to each reference database. Classification of ASVs was done using the SINTAX classifier. The violin and box plots represent the distribution of percent of ASVs with high-identity hits or genus/species-level classifications for each database across n = 1165 biologically independent samples. Box plots indicate median (middle line), 25th, 75th percentile (box) and the min and max values after removing outliers based on 1.5x interquartile range (whiskers). Outliers have been removed from the box plots to ease visualisation. Different colours are used to distinguish the different databases.
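Two steps in this caption lend themselves to a short sketch: the per-sample abundance filter (ASVs kept at ≥0.01% relative abundance) and the 1.5x interquartile-range rule used for the box plot whiskers. The function names `filter_asvs` and `tukey_whiskers` are hypothetical, and the quartile method (linear interpolation, "inclusive") is an assumption, since the caption does not state which one the plotting software used.

```python
import statistics


def filter_asvs(counts, min_rel_abundance_pct=0.01):
    """Keep ASVs at or above a relative abundance cut-off within one sample.

    counts maps ASV id -> read count. Returns the kept ASVs and the percentage
    of the sample's reads that they represent.
    """
    total = sum(counts.values())
    if total == 0:
        return {}, 0.0
    kept = {
        asv: n
        for asv, n in counts.items()
        if 100.0 * n / total >= min_rel_abundance_pct
    }
    retained_pct = 100.0 * sum(kept.values()) / total
    return kept, retained_pct


def tukey_whiskers(values, k=1.5):
    """Return (lower whisker, upper whisker, outliers) using the k x IQR rule."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return min(inside), max(inside), outliers
```

The retained percentage returned by `filter_asvs` corresponds to the quantity the caption reports as 88.35 ± 2.98% across samples.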
Fig. 4. Comparison of relative genus abundance based on V1–V3 and V4 region 16S rRNA gene amplicon data.
a Mean relative abundance was calculated based on 709 activated sludge samples. Genera present at ≥0.001% relative abundance in V1–V3 and/or V4 datasets are considered. Genera with less than twofold difference in relative abundance between the two primer sets are shown with gray circles, and those that are overrepresented by at least twofold with one of the primer sets are shown in red (V4) and blue (V1–V3). The twofold difference is an arbitrary choice; however, it relates to the uncertainty we usually encounter in amplicon data. Genus names are shown for all taxa present at a minimum of 0.1% mean relative abundance (excluding those with de novo names). b Heatmaps of the most abundant genera with more than twofold relative abundance difference between the two primer sets.
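The colouring rule in panel a can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, and a genus detected with only one primer set is treated here as overrepresented by that set.

```python
def is_considered(v13_pct, v4_pct, min_pct=0.001):
    """A genus is considered if it reaches min_pct in V1-V3 and/or V4 data."""
    return max(v13_pct, v4_pct) >= min_pct


def classify_primer_bias(v13_pct, v4_pct, fold=2.0):
    """Label a genus by the primer set that overrepresents it at least fold-fold.

    Inputs are mean relative abundances (%) from the two primer sets.
    """
    if v13_pct <= 0 and v4_pct <= 0:
        return "absent"
    if v4_pct >= fold * v13_pct:
        return "V4"  # shown in red in the figure
    if v13_pct >= fold * v4_pct:
        return "V1-V3"  # shown in blue in the figure
    return "similar"  # shown in gray in the figure
```

Because the twofold cut-off is described as arbitrary, `fold` is left as a parameter so that other cut-offs can be explored.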
Fig. 5. Effects of process and environmental factors on the activated sludge microbial community structure.
Principal coordinate analyses of Bray–Curtis and Soerensen beta-diversity for genera based on V1–V3 amplicon data. Samples are coloured based on metadata. The fraction of variation in the microbial community explained by each variable in isolation was determined by PERMANOVA (Adonis R2-values). Exact P values <0.001 could not be confidently determined due to the permutational nature of the test. Process types: C carbon removal, C,N carbon removal with nitrification, C,N,DN carbon removal with nitrification and denitrification, C,N,DN,P carbon removal with nitrogen removal and enhanced biological phosphorus removal (EBPR). Temperature range: Very low = <10 °C, low = 10–15 °C, moderate = 15–20 °C, high = 20–25 °C, very high = 25–30 °C, extremely high = >30 °C. Industrial load: None = 0%, very low = 0–10%, low = 10–30%, medium = 30–50%, high = 50–100%, all = 100%.
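The two beta-diversity measures named in this caption differ in what they weigh: Bray–Curtis uses abundances, whereas Soerensen uses only presence or absence. A minimal sketch of both, using their standard textbook definitions, is given below. The function names and the dictionary-based sample format are assumptions for illustration, and the ordination and PERMANOVA steps are not reproduced.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two samples (genus -> abundance)."""
    genera = set(x) | set(y)
    diff = sum(abs(x.get(g, 0) - y.get(g, 0)) for g in genera)
    total = sum(x.get(g, 0) + y.get(g, 0) for g in genera)
    return diff / total if total else 0.0


def sorensen(x, y):
    """Soerensen dissimilarity based on presence/absence of genera."""
    a = {g for g, v in x.items() if v > 0}
    b = {g for g, v in y.items() if v > 0}
    if not a and not b:
        return 0.0
    return 1.0 - 2.0 * len(a & b) / (len(a) + len(b))
```

Two plants that share the same genera in very different proportions have a Soerensen dissimilarity of 0 but a high Bray–Curtis dissimilarity, which is why the two measures can respond differently to the same metadata variable.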
Fig. 6. Identification of core and conditionally rare or abundant taxa based on V1–V3 amplicon data.
a Identification of strict, general and loose core genera based on how often a given genus was observed at a relative abundance above 0.1% in WWTPs. b Identification of conditionally rare or abundant (CRAT) genera based on whether a given genus was observed at a relative abundance above 1% in at least one WWTP. The cumulative genus abundance is based on all ASVs classified at the genus-level. All core genera were removed before identification of the CRAT genera. c, d Number of genera and species, respectively, and their abundance in different process types across the global WWTPs. Values for genera and species are divided into strict core, general core, loose core, CRAT, other taxa and unclassified ASVs. The relative abundance of different groups was calculated based on the mean relative abundance of individual genera or species across samples. C carbon removal, C,N carbon removal with nitrification, C,N,DN carbon removal with nitrification and denitrification, C,N,DN,P carbon removal with nitrogen removal and enhanced biological phosphorus removal (EBPR).
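The classification logic in panels a and b can be sketched as below. The abundance cut-offs (above 0.1% for core membership, above 1% in at least one WWTP for CRAT) come from the caption. The prevalence cut-offs separating strict, general and loose core are not stated in this excerpt, so the values used here (more than 80%, 50% and 20% of plants) are assumptions, as is the function name `classify_genus`.

```python
# Prevalence cut-offs per core tier. ASSUMED values: the caption does not
# state them, so adjust these to match the article's definitions.
CORE_TIERS = [("strict core", 0.8), ("general core", 0.5), ("loose core", 0.2)]


def classify_genus(abundances_pct, core_min_pct=0.1, crat_min_pct=1.0):
    """Classify one genus from its relative abundance (%) in each WWTP."""
    n_plants = len(abundances_pct)
    if n_plants == 0:
        return "other"
    # Fraction of plants in which the genus exceeds the core abundance cut-off.
    prevalence = sum(a > core_min_pct for a in abundances_pct) / n_plants
    for tier, cutoff in CORE_TIERS:
        if prevalence > cutoff:
            return tier
    # Core genera are excluded before CRAT genera are identified.
    if any(a > crat_min_pct for a in abundances_pct):
        return "CRAT"
    return "other"
```

Checking the core tiers first mirrors the caption's statement that all core genera were removed before the CRAT genera were identified.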
Fig. 7. Percent relative abundance of strict and general core taxa across process types.
The taxonomy for the core genera indicates phylum and genus. For general core species, genus names are also provided. De novo taxa in the core are highlighted in red. C carbon removal, C,N carbon removal with nitrification, C,N,DN carbon removal with nitrification and denitrification, C,N,DN,P carbon removal with nitrogen removal and enhanced biological phosphorus removal (EBPR).
Fig. 8. Global diversity of genera belonging to major functional groups.
The percent relative abundance represents the mean abundance for each country considering only WWTPs with the relevant process types. Countries are grouped based on continent (shifting colour).
Fig. 9. Global diversity of known filamentous organisms.
The percent relative abundance represents the mean abundance for each country across all process types. Countries are grouped based on the continent (shifting colour).


