Nat Microbiol. 2022 Jan;7(1):34-47.
doi: 10.1038/s41564-021-01014-7. Epub 2021 Dec 6.

Species- and site-specific genome editing in complex bacterial communities

Benjamin E Rubin et al. Nat Microbiol. 2022 Jan.

Abstract

Understanding microbial gene functions relies on the application of experimental genetics in cultured microorganisms. However, the vast majority of bacteria and archaea remain uncultured, precluding the application of traditional genetic methods to these organisms and their interactions. Here, we characterize and validate a generalizable strategy for editing the genomes of specific organisms in microbial communities. We apply environmental transformation sequencing (ET-seq), in which nontargeted transposon insertions are mapped and quantified following delivery to a microbial community, to identify genetically tractable constituents. Next, DNA-editing all-in-one RNA-guided CRISPR-Cas transposase (DART) systems for targeted DNA insertion into organisms identified as tractable by ET-seq are used to enable organism- and locus-specific genetic manipulation in a community context. Using a combination of ET-seq and DART in soil and infant gut microbiota, we conduct species- and site-specific edits in several bacteria, measure gene fitness in a nonmodel bacterium and enrich targeted species. These tools enable editing of microbial communities for understanding and control.

Conflict of interest statement

Competing Interests

The Regents of the University of California have patents pending related to this work on which B.E.R., S.D., B.F.C., A.M.D., J.F.B., and J.A.D. are inventors. J.A.D. is a co-founder of Caribou Biosciences, Editas Medicine, Intellia Therapeutics, Scribe Therapeutics and Mammoth Biosciences, a scientific advisory board member of Caribou Biosciences, Intellia Therapeutics, eFFECTOR Therapeutics, Scribe Therapeutics, Synthego, Mammoth Biosciences and Inari, and is a Director at Johnson & Johnson and has sponsored research projects by Biogen, Roche and Pfizer. J.F.B. is a founder of Metagenomi. R.B. is a shareholder of Caribou Biosciences, Intellia Therapeutics, Locus Biosciences, Inari, TreeCo, and Ancilia Biosciences.

Figures

Extended Data Fig. 1 | Library preparation and data normalization for ET-Seq.
a, ET-Seq requires low-coverage metagenomic sequencing and customized insertion sequencing. Insertion sequencing relies on custom splinkerette adaptors, which minimize non-specific amplification, a digestion step for degradation of delivery vector containing fragments, and nested PCR to enrich for fragments containing insertions with high specificity. The second round of nested PCR adds unique dual index adaptors for Illumina sequencing. b, This insertion sequencing data is first normalized by the reads to internal standard DNA which is added equally to all samples and serves to correct for variation in reads produced per sample. Secondly, it is normalized by the relative metagenomic abundances of the community members.
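The two normalization steps in panel b can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, argument names, and the choice to express each step as a simple division are assumptions, and the caption does not state any further scaling that the real analysis may apply.

```python
def normalize_insertion_reads(insertion_reads, internal_standard_reads, relative_abundance):
    """Normalize per-species insertion read counts for one sample.

    insertion_reads: dict of species -> insertion-mapping read count (hypothetical input)
    internal_standard_reads: reads assigned to the internal standard DNA in this sample
    relative_abundance: dict of species -> metagenomic relative abundance (0 to 1)
    """
    if internal_standard_reads <= 0:
        raise ValueError("internal standard reads must be positive")
    normalized = {}
    for species, reads in insertion_reads.items():
        abundance = relative_abundance[species]
        if abundance <= 0:
            raise ValueError(f"no metagenomic abundance for {species}")
        # Step 1: correct for variation in reads produced per sample.
        per_standard = reads / internal_standard_reads
        # Step 2: correct for how abundant the species is in the community.
        normalized[species] = per_standard / abundance
    return normalized
```

With these assumptions, a species with four times the raw insertion reads of another but four times the relative abundance ends up with the same normalized value.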
Extended Data Fig. 2 | Measurement and correction of chimeric reads.
a, The response of chimeric reads, measured as total normalized read counts to insertions into wildtype S. meliloti DNA spiked-in before library preparation, to increasing quantities of donor vector. Plot is log10 scaled on the x- and y-axes for readability. Dashed lines indicate log-log linear fit to data (R² = 0.86 without correction, n = 7 biological replicates; R² = 0.92 with correction, n = 7 biological replicates). b, Frequency of read properties (imperfect insert sequence = single difference in last 5 bp of transposon right end from expected sequence; imperfect host sequence = mismatch in first 3 bp of genomic sequence at transposon genome junction when aligned to host genome) identified as strongly associated with S. meliloti insertions, in which all reads are expected to be chimeric, used as markers for filtering chimeric reads. Box plots indicate median and bound 1st and 3rd quartile, whiskers indicate max/min values (n = 7 biological replicates). Plot is log10 scaled on the y-axis for readability. c, Fraction of insertion mapping reads filtered out of each dataset, for each organism/vector (n = 7 biological replicates) following chimera filtering. Box plots indicate median and bound 1st and 3rd quartile, whiskers indicate max/min values. Plot is log10 scaled on the y-axis for readability.
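The two read properties defined in panel b can be written as simple sequence checks. The sketch below is illustrative only: the function names and inputs are hypothetical, "single difference" is read here as exactly one mismatch, and the caption does not say how the two markers are combined into the final filter, so the code only reports them.

```python
def hamming(a, b):
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def chimera_markers(observed_tn_end, expected_tn_end, observed_host_start, reference_host_start):
    """Return (imperfect_insert, imperfect_host) flags for one read.

    observed_tn_end / expected_tn_end: transposon right-end sequence from the read
        and the expected sequence (at least 5 bp each).
    observed_host_start / reference_host_start: genomic sequence at the junction
        from the read and from the aligned host genome (at least 3 bp each).
    """
    # Imperfect insert: one difference in the last 5 bp of the transposon right end
    # (assumption: "single difference" means exactly one mismatch).
    imperfect_insert = hamming(observed_tn_end[-5:], expected_tn_end[-5:]) == 1
    # Imperfect host: any mismatch in the first 3 bp of genomic sequence at the junction.
    imperfect_host = hamming(observed_host_start[:3], reference_host_start[:3]) >= 1
    return imperfect_insert, imperfect_host
```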
Extended Data Fig. 3 | ET-Seq determined insertion efficiencies for all nine consortium members as a fraction of the entire community.
ET-Seq determined insertion efficiencies for conjugation, electroporation, and natural transformation on the synthetic soil community (n = 3 biological replicates). The values shown are the estimated fraction of the total community population made up by each constituent species' transformed cells. Control samples received no exogenous DNA. Average relative abundance across all samples is indicated in parentheses (n = 18 independent samples).
Extended Data Fig. 4 | Benchmarking DART vectors.
a, E. coli WM3064 to E. coli BL21(DE3) conjugation, transposition, and selection schematic (top) and guide RNAs targeting the lacZ α-fragment of recipient BL21(DE3), which is absent from donor WM3064 (bottom). b,d,f, Percent selectable transposed colonies is calculated as the number of colonies obtained with gentamycin selection divided by total viable colonies in absence of selection. b, Insertion-receiving colonies divided into on- and off-targeted. This was calculated by multiplying % selectable colonies for representative guides in d and f (highlighted by grey bars) by the on- or off-target rates (shown in Fig. 2b). c, Transposition with VcDART was tested using three promoters. The variant using the Plac promoter, harvested from pHelper_ShCAST_sgRNA, was also used for Fig. 2–5 and Extended Data Fig. 4b, 5, 6, and 8. d, Efficiencies of VcDART using various promoters. e, Transposition with ShDART was tested with three transcriptional configurations, all using Plac. The configuration used for characterization of ShCasTn originally was also used for Fig. 2 and Extended Data Fig. 4b. f, Efficiencies of ShDART using various promoters. b, d, f, Crossbar indicates mean and error bars indicate one standard deviation from the mean (n = 3 biological replicates). Guide RNAs ending in “NT” are non-targeting negative control samples.
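The arithmetic behind panels b, d and f can be sketched as below. The function names and the example numbers in the usage note are hypothetical; only the two relationships (selected colonies divided by total viable colonies, and percent selectable multiplied by the on- or off-target rate) come from the caption.

```python
def percent_selectable(selected_colonies, total_viable_colonies):
    """Percent of viable colonies that grow under antibiotic selection."""
    if total_viable_colonies <= 0:
        raise ValueError("total viable colonies must be positive")
    return 100.0 * selected_colonies / total_viable_colonies


def split_by_target(percent_sel, on_target_rate):
    """Split percent selectable colonies into on- and off-target portions.

    on_target_rate: fraction of insertions that are on target (0 to 1);
    the off-target rate is taken as its complement (an assumption).
    """
    if not 0.0 <= on_target_rate <= 1.0:
        raise ValueError("on_target_rate must be between 0 and 1")
    on_target = percent_sel * on_target_rate
    off_target = percent_sel * (1.0 - on_target_rate)
    return on_target, off_target
```

For example, 5 selected colonies out of 1,000 viable colonies gives 0.5% selectable colonies; with a hypothetical on-target rate of 0.9, that splits into roughly 0.45% on-target and 0.05% off-target.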
Extended Data Fig. 5 | Sanger sequencing of VcDART mutants from the synthetic soil microbial community.
a, Representative Sanger sequencing chromatogram of PCR product spanning transposon insertion site at targeted pyrF locus in K. michiganensis and b, in P. simiae mutant colonies following VcDART-mediated transposon integration and selection. Target-site duplications (TSD) are indicated with dashed boxes.
Extended Data Fig. 6 | Insertion counts in Ralstonia sp. after metabolic enrichment for P. simiae.
a, Raw number of paired end reads in shotgun sequencing analysis detected as spanning a transposon-genome junction for the P. simiae and Ralstonia sp. genomes in each of three replicate enrichment samples. b, Number of paired end reads detected normalized to the coverage of each genome within each respective sample. The mean numbers of inserts normalized to coverage were compared between P. simiae and Ralstonia sp. (mean for P. simiae = 0.1250; mean for Ralstonia sp. = 0.0042) and were significantly different (P-value = 0.00058; two-sample t-test).
Extended Data Fig. 7 | Relative abundance of stool sample inoculum and infant gut community used for VcDART editing.
The gut microbiome compositions were obtained by read mapping to 1005 reference genomes from Lou et al. 2021. Bar height represents normalized subspecies relative abundance, and bars are colored by strain.
Extended Data Fig. 8 | ET-Seq determined insertion efficiency for the infant gut community.
Insertion efficiency as quantified by ET-Seq for nine microbial species determined to be present by metagenomic sequencing. Experimental samples were conjugated with a donor containing the unguided mariner transposon (pHLL250; n = 3 biological replicates). Control samples did not receive the donor (n = 3 biological replicates). Percentages next to species names indicate their mean relative fraction in the infant gut community, averaged across the 6 biological replicate experiments performed.
Extended Data Fig. 9 | Target site locus and strain comparisons for selective enrichment from infant gut community.
a, Clinically relevant gene clusters targeted by VcDART for selective enrichment included a locus associated with fimbriae biosynthesis (top) and a propanediol utilization gene cluster (bottom). Insets show mapped reads to these loci in E. coli subsp. 2 and subsp. 3, which were assembled from enrichment culture shotgun sequencing data. The right end of the VcDART transposon cargo was assembled (green), is bridged to the genome, and is supported by paired end read mapping. VcDART target sites (protospacer) are indicated in dark red. b, Dendrogram displaying average nucleotide identity differences between all E. coli genomes analyzed as part of the infant gut community. Strains in black were genomes originally recovered from metagenomic assembly in Lou, et al. 2021. Strains in red were assembled out of enrichment cultures in this study.
Extended Data Fig. 10 | Location of VcDART transposon insertions in isolated E. coli mutant colonies following infant gut community editing.
a, Insertion orientations and locations relative to target site were determined by locus-specific PCR and Sanger sequencing on colonies picked from selective solid medium after editing the infant gut community with VcDART guided by the fimbriae associated locus-targeting guide RNA and b, the propanediol metabolism locus-targeting guide RNA (n = 3 biological replicates).
Fig. 1 | ET-Seq for quantitative measurement of insertion efficiency in a microbial community.
a, ET-Seq provides data on insertion efficiency of multiple delivery approaches, including conjugation, electroporation, and natural DNA transformation, on microbial community members. In this illustrative example, the blue strain is most amenable to electroporation (star). This data allows for the determination of feasible targets and delivery methods for DART targeted editing. b, ET-Seq determined efficiencies for known quantities of spiked-in pre-edited K. michiganensis. Solid line is the fit of the linear regression to the data not including zeros (n = 11 independent samples; R² = 0.89). c, ET-Seq determined insertion efficiencies (insertion containing portion within each species) for conjugation, electroporation, and natural transformation of the synthetic soil community (n = 3 biological replicates). Average relative abundance for each organism is indicated in parentheses.
Fig. 2 | Benchmarking all-in-one conjugative targeted vectors.
a, Schematic of VcDART and ShDART delivery vectors. b, Fraction of insertions that occur 200 bp downstream of the 3’ end of the protospacer target site. Mean for three independent biological replicates is shown as cross bars. c-d, Aggregate unique insertion counts (n = 3 biological replicates) across the E. coli BL21(DE3) genome, determined by presence of unique barcodes, using c, VcDART and d, ShDART. The inset shows a 60 bp window downstream of the target site where the peak of targeted insertions was observed. Insertion distance downstream of the target site is calculated from the 3’ end of the protospacer.
Fig. 3 | Selection free targeted editing and mutant tracking in the synthetic soil consortium.
a, The main figure shows the number of insertions detected by ET-Seq in each species normalized for sequencing effort by the B. thetaiotaomicron internal standard. The insets show the location of unique insertions summed for the three replicates in K. michiganensis (upper) and P. simiae (lower) *p<0.001 Poisson Probability. b, The diagram shows the use of ET-Seq to quantify the fitness effect of a VcDART mutation of interest, measured as the ratio of mutant of interest reads normalized to Safe Site mutant reads at the assay end point divided by their ratio at the beginning. c, Fitness of pyrF mutant under 5-FOA treatment as measured by the ratio of pyrF to Safe Site reads. Lines connect biologically paired replicates sampled longitudinally.
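The fitness measure described in panel b is a ratio of ratios, which can be sketched as follows. The function and argument names are hypothetical, and read counts are assumed to be already normalized as described for ET-Seq.

```python
def relative_fitness(mutant_start, safe_site_start, mutant_end, safe_site_end):
    """Ratio of mutant-to-Safe-Site reads at the end point over the same ratio at the start.

    Values above 1 mean the mutant of interest gained on the Safe Site mutant
    during the assay; values below 1 mean it lost ground.
    """
    if min(mutant_start, safe_site_start, safe_site_end) <= 0:
        raise ValueError("start counts and Safe Site counts must be positive")
    start_ratio = mutant_start / safe_site_start
    end_ratio = mutant_end / safe_site_end
    return end_ratio / start_ratio
```

As an illustration with made-up counts, a mutant that starts at parity with the Safe Site control (100 reads each) and ends at 400 reads against 100 has a relative fitness of 4.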
Fig. 4 | Enrichment of targeted strains in microbial communities.
a, VcDART delivery of antibiotic markers into a microbial community using species-specific crRNA, followed by selection for transposon cargo, facilitates isolation of targeted organisms. b, Relative abundance of synthetic soil community constituents measured by metagenomic sequencing before conjugative VcDART delivery and after selection for pyrF-targeted antibiotic cassette in K. michiganensis or P. simiae. c, VcDART delivery of a nutrient utilization pathway, guided by species-specific crRNA, into a microbial community facilitates enrichment of a targeted organism through growth on the appropriate nutrient. d, Relative abundance of the constituents of a four-member community incapable of utilizing lactose measured before conjugative VcDART delivery and after lactose-based enrichment for Safe Site-targeted lacZY transposition into P. simiae.
Fig. 5 | Strain-resolved targeted editing in the infant gut microbiota.
a, Relative abundance of the infant gut community before and after VcDART editing and selection for targeted loci within E. coli subsp. 2 and 3. b, Fraction of insertions that occur within 20 bp of the expected target site (50 bp downstream of the 3’ end of protospacer). c-d, Unique insertion locations for targeted loci within c, E. coli subsp. 2 and d, E. coli subsp. 3. The main figures show unique insertions detected by ET-Seq normalized by the B. thetaiotaomicron internal standard. The insets show aggregate unique insertion counts (n = 3 biological replicates) within the protospacer adjacent region. In a and c-d members with relative abundance above 0.1% are shown and the targeted E. coli subsp. is noted with asterisks.
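The on-target fraction in panel b can be computed along these lines. This is a sketch under stated assumptions: the function name, the coordinate convention (integer genome positions, with `strand` set to 1 or -1 to give the downstream direction), and the inclusive window are all hypothetical choices; only the 50 bp offset and 20 bp window come from the caption.

```python
def on_target_fraction(insertion_positions, protospacer_3p_end, strand=1, offset=50, window=20):
    """Fraction of insertions within `window` bp of the expected target site.

    The expected site is `offset` bp downstream of the protospacer 3' end;
    `strand` (1 or -1) sets which direction counts as downstream.
    """
    if not insertion_positions:
        raise ValueError("no insertions to evaluate")
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    expected_site = protospacer_3p_end + strand * offset
    # Count insertions whose distance from the expected site is within the window.
    hits = sum(abs(pos - expected_site) <= window for pos in insertion_positions)
    return hits / len(insertion_positions)
```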
