Nature. 2022 Oct;610(7933):731-736.
doi: 10.1038/s41586-022-05256-1. Epub 2022 Oct 19.

Borgs are giant genetic elements with potential to expand metabolic capacity

Basem Al-Shayeb et al. Nature. 2022 Oct.

Abstract

Anaerobic methane oxidation exerts a key control on greenhouse gas emissions [1], yet factors that modulate the activity of microorganisms performing this function remain poorly understood. Here we discovered extraordinarily large, diverse DNA sequences that primarily encode hypothetical proteins through studying groundwater, sediments and wetland soil where methane production and oxidation occur. Four curated, complete genomes are linear, up to approximately 1 Mb in length and share genome organization, including replichore structure, long inverted terminal repeats and genome-wide unique perfect tandem direct repeats that are intergenic or generate amino acid repeats. We infer that these are highly divergent archaeal extrachromosomal elements with a distinct evolutionary origin. Gene sequence similarity, phylogeny and local divergence of sequence composition indicate that many of their genes were assimilated from methane-oxidizing Methanoperedens archaea. We refer to these elements as 'Borgs'. We identified at least 19 different Borg types coexisting with Methanoperedens spp. in four distinct ecosystems. Borgs provide methane-oxidizing Methanoperedens archaea access to genes encoding proteins involved in redox reactions and energy conservation (for example, clusters of multihaem cytochromes and methyl coenzyme M reductase). These data suggest that Borgs might have previously unrecognized roles in the metabolism of this group of archaea, which are known to modulate greenhouse gas emissions, but further studies are now needed to establish their functional relevance.

Conflict of interest statement

J.F.B. is a co-founder of Metagenomi. J.A.D. is a co-founder of Caribou Biosciences, Editas Medicine, Scribe Therapeutics, Intellia Therapeutics and Mammoth Biosciences; is a scientific advisory board member of Vertex, Caribou Biosciences, Intellia Therapeutics, Scribe Therapeutics, Mammoth Biosciences, Algen Biotechnologies, Felix Biosciences, The Column Group and Inari Agriculture; is Chief Science Advisor to Sixth Street, a director at Johnson & Johnson, Altos and Tempus; and has research projects sponsored by Apple Tree Partners and Roche.

Figures

Fig. 1. Borgs share overall genomic features.
a, Genome replichores (arrows) and coding strands (black bars) for aligned pairs of the four complete (Black, Purple, Sky and Lilac) and one near-complete (Orange) Borg. Blocks of sequence with identifiable nucleotide similarity are shown in between each pair (coloured graphs linked by lines; y axes show similarity). b, Genome overviews showing the distribution of three or more perfect tandem direct repeats (gold rods) along the complete genomes. Insets provide examples of local elevated GC content associated with certain gene clusters and within gene and intergenic tandem direct repeats (gold arrows).
Fig. 2. Borg and Methanoperedens spp. genomic features and abundance patterns.
a, The average genome GC contents of Borgs and Methanoperedens spp. are distinct. The black line denotes the median, and the dashed lines show the interquartile range. b, Groups of related Methanoperedens spp. (rows) correlate with groups of Borgs (columns) across a set of 50 samples. The asterisks indicate two-sided Pearson correlations above 0.92 with FDR-corrected P values below 2.0 × 10⁻²⁰ that suggest that Brown, Green, Orange, Beige and Ochre Borgs associate with one group of Methanoperedens spp., Olive, Cyan, Gold, Apricot and Rose associate with a second group, and Black associates with a third group. Black asterisks indicate best association with a Methanoperedens genome (correlation ≥ 0.92, P ≤ 1 × 10⁻²⁰); grey asterisks indicate association with a scaffold containing the Methanoperedens L11 marker gene (correlation ≥ 0.92, P ≤ 1 × 10⁻²⁰). c, Frequency of genes in different functional groups in the four complete Borg genomes. d, Comparison of the protein family composition of Borgs and Methanoperedens spp. Clustering on the basis of shared protein family content highlights groups of Borg-specific protein families (blue shading) and protein families shared with their hosts (orange shading). The full clustering, including diverse archaeal mobile elements, is shown in Extended Data Fig. 5. PEGA, surface layer protein; PHA, polyhydroxyalkanoate.
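The abundance association in panel b rests on Pearson correlation across samples followed by a false discovery rate correction. The Python sketch below shows that kind of calculation on toy numbers. The caption does not name the FDR procedure, so the Benjamini-Hochberg method used here is an assumption, and the function names `pearson` and `benjamini_hochberg` and all input values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values (assumed FDR method), in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy per-sample coverages for one hypothetical Borg and one hypothetical host.
borg_coverage = [1.0, 2.0, 3.0, 4.0]
host_coverage = [2.0, 4.0, 6.0, 8.0]
r = pearson(borg_coverage, host_coverage)

# Toy raw P values from several hypothetical Borg-host comparisons.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In practice the P value for each correlation would come from a statistics library; only the correction step is sketched here.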
Fig. 3. Cell cartoon illustrating capacities inferred to be provided to Methanoperedens spp. by the coexisting Lilac Borg.
Like all Borgs, this Borg lacks the capacity for independent existence, and we infer that it replicates within host Methanoperedens spp. cells. Borg-specific proteins are those that were not identified in the genome of coexisting Methanoperedens spp. Borg-encoded capacities are grouped into the major categories of energy metabolism (including the MCR complex involved in methane oxidation), extracellular electron transfer (including MHCs) involved in electron transport to external electron acceptors, central carbon metabolism (including genes that enable production of polyhydroxybutyrate (PHB)) and stress response/defence (including production of compatible solutes). Locus codes are listed in Supplementary Table 7.
Extended Data Fig. 1. Geochemical profiles of the permanently moist and organic-rich wetland soils.
(A) The mean concentrations of total carbon, nitrogen as well as (B) iron and manganese in wetland soils at 20 cm (n = 3), 40 cm (n = 5), and 90 cm (n = 2) where n denotes the number of biological samples. Deeper soils, where these extrachromosomal elements are most abundant, are somewhat depleted in carbon, iron and manganese compared to shallow soils. Error bars denote standard deviation. 36 samples were collected and sequenced, with 1 to 10 independent samples collected from the same soil depth.
Extended Data Fig. 2. Sets of three or more perfect tandem direct repeats (TDR) are a characteristic feature of the Borg genomes.
Up to 54 instances occur in the four complete Borg genomes, with, on average, one repeat every 12 (Lilac) to 31 (Sky) kbp. These repeat regions fragment assemblies and cause local assembly errors, which we resolved by manual curation (Methods). Within the TDR regions of the four curated, complete genomes, the unit repeats occur up to 20 times and unit repeats are up to 54 bp in length (Supplementary Table 2). Between 54 and 64% of these perfect TDRs are encoded in intergenic regions, although part or all of the first repeat may occur within the C-terminus of a protein-coding gene. When the TDRs occur within proteins, the unit lengths are almost always divisible by 3, so they introduce perfect amino acid repeats. TDR sequences within a single Borg genome are almost always unique. Repeat sequence comparison from the four complete curated Borgs highlights the novelty of almost all TDR sequences (both within and across genomes).
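To make the definition concrete, the Python sketch below scans a sequence for a unit that is repeated perfectly at least three times back to back. It is an illustration only, not the procedure used in the study. The function name `find_tandem_repeats`, its parameters and the toy sequence are assumptions made for this example.

```python
def find_tandem_repeats(seq, min_copies=3, min_unit=2, max_unit=60):
    """Return (start, unit, copies) for perfect tandem direct repeats.

    A hit is a unit of min_unit..max_unit bases repeated at least
    min_copies times with no mismatches and no gaps between copies.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        best = None
        for k in range(min_unit, max_unit + 1):
            if i + k * min_copies > n:
                break  # not enough sequence left for min_copies units
            unit = seq[i:i + k]
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            # Keep the longest span; ties go to the shorter unit found first.
            if copies >= min_copies and (
                best is None or copies * k > best[2] * len(best[1])
            ):
                best = (i, unit, copies)
        if best:
            hits.append(best)
            i += len(best[1]) * best[2]  # jump past the repeat region
        else:
            i += 1
    return hits


# Toy sequence: four copies of a 6 bp unit between unrelated flanks.
toy = "TTGCC" + "ACGTAG" * 4 + "CCT"
hits = find_tandem_repeats(toy)

# A unit length divisible by 3 keeps the reading frame, so a repeat inside
# a gene would translate into a perfect amino acid repeat.
in_frame = [len(unit) % 3 == 0 for _, unit, _ in hits]
```

The quadratic scan is fine for short examples; a genome-scale search would call for a dedicated repeat finder.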
Extended Data Fig. 3. All genomes have two replichores of unequal lengths.
GC skew (grey plots) and cumulative GC skew (green lines) across the four complete Borg genomes, all of which end in long inverted terminal repeats (1.4–2.7 kbp in length). The cumulative GC skew plots indicate replication is initiated in these terminal repeats (red lines). Blue lines mark the predicted replication termini. The red and blue lines define two replichores of unequal length that correspond almost completely to distinct coding strands (almost all genes on the +ve strand of the large replichore and on the -ve strand of the small replichore).
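For readers unfamiliar with the plots, GC skew is commonly computed per window as (G − C)/(G + C), and the cumulative skew is the running sum of those window values. A turning point in the cumulative curve marks where the skew changes sign, which is how replichore boundaries are typically inferred. The Python sketch below assumes that standard definition. The window size, function names and toy sequence are illustrative and are not taken from the study.

```python
from itertools import accumulate


def gc_skew(seq, window=1000, step=1000):
    """GC skew, (G - C) / (G + C), for successive windows along seq."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_gc_skew(seq, window=1000, step=1000):
    """Running sum of the windowed GC skew values."""
    return list(accumulate(gc_skew(seq, window, step)))


# Toy sequence: a G-rich stretch followed by a C-rich stretch, mimicking
# two replichores with opposite skew.
toy = "G" * 40 + "C" * 40
skews = gc_skew(toy, window=10, step=10)
cumulative = cumulative_gc_skew(toy, window=10, step=10)

# Index of the window where the cumulative curve turns over.
turning_window = cumulative.index(max(cumulative))
```

Whether a maximum or a minimum corresponds to the origin or the terminus depends on the strand convention, so the turning point alone does not say which is which.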
Extended Data Fig. 4. Taxonomic profiles of the four complete Borg genomes.
A. In all cases, the majority of proteins have no similarity to proteins in the reference database (“Unknown”; e-value of > 0.0001). For the cases where a protein has an identifiable hit (blue and red bars in A), the plots in B. show the taxonomy of the organisms in which those hits were identified. Only cases where the same organism accounted for hits for > 0.5% of genes are shown. The results clearly indicate that the vast majority of cases where proteins have identifiable matches involve matches to proteins of Methanoperedens spp. (gold bars).
Extended Data Fig. 5. The clustering based on protein family content demonstrates that the Methanoperedens, Borgs, archaeal viruses and plasmids/minichromosomes are distinct from each other.
(A) Colored blocks indicate presence of each protein family in the corresponding genome. The blue highlight at the top indicates the Methanoperedens spp. (top) and Borg (bottom) protein family profiles. For details see Fig. 2d. We note that archaeal plasmids are highly undersampled. If Borgs are ultimately classified as plasmids, they dramatically expand the known characteristics (e.g., size, linear genomes) and diversity of archaeal plasmids. (B) Borg protein inventories (purple highlight) compared to giant linear bacterial plasmids. (C) Protein families occurring in more than 5 genomes of Borgs and giant linear bacterial plasmids. Few protein families are shared between Borgs and linear plasmids in bacteria beyond methyltransferases, histidine kinases, and other enzymes unrelated to replication. (D) Average Nucleotide Identity of different Methanoperedens species that coexist with Borgs (red) and previously reported genomes (gray) and the 95% species threshold shown with a dashed line.
Extended Data Fig. 6. Ribosomal protein analyses and phylogenies.
(A) The array of single-copy archaeal ribosomal genes (columns) vs. Borg (blue) and Methanoperedens spp. (gold) genomes, illustrating that although Borgs often have rpL11 and, occasionally, other ribosomal proteins, they do not have the gene inventory needed to construct ribosomes. (B) Left, dendrogram of hierarchical clustering of all-vs-all Pearson correlation values between all Borgs and Methanoperedens spp. from the wetland. Right, maximum likelihood phylogeny of concatenated ribosomal proteins from Methanoperedens species that do and do not coexist with Borgs and previously reported genomes. We found no data indicating the presence of Borgs in samples containing previously reported Methanoperedens genomes. We searched for Borgs in the samples highlighted in blue using the same methods used to detect Borgs in this study and concluded that they do not contain Borgs. A subset of the Borg-free samples contain Methanoperedens spp. at very high abundance levels.
Extended Data Fig. 7. Abundance and distribution of Borgs and Methanoperedens spp. in the wetland soil and Rifle aquifer.
A. Relative abundances of Methanoperedens spp. and Borgs in samples collected over time and arrayed by sample collection depth from the wetland soils, sediments and groundwater. The absolute abundances of Borgs are far greater in the deeper than in the shallower soils. B. Although some Borgs can substantially exceed the combined abundance of all Methanoperedens spp., no Borgs were detected in some Methanoperedens-bearing samples. “W” indicates that the sample was pumped groundwater.
Extended Data Fig. 8. Genome comparisons and CRISPR-Cas interactions.
(A) Genome-to-genome comparisons provide evidence for recombination between two of the most closely related Borgs, Sky and Rose. These Borgs share only moderate overall genomic nucleic acid identity although, as is the case for other Borgs (Fig. 1a), they have blocks of partially alignable sequence throughout their genomes. Notable, and indicating recent homologous recombination, are 100% identical regions of up to ~11 kbp in length (B). Although not fully manually curated to completion, the relevant Rose Borg genome regions were carefully checked by inspection of the mapped reads to rule out chimeric assembly that could otherwise explain perfect identity with the Sky Borg sequence (Sky is one of the four curated complete genomes). (C) Read coverages over the Rose and Sky genomes are consistent throughout, with the regions in B noted with green boxes. (D) Diagram illustrating the organization of the Type III-A CRISPR-Cas system variant (lacking acquisition machinery and Csm6) in the Orange Borg. One spacer from the CRISPR array targets a small protein with a ribbon-helix-helix motif, a common transcriptional regulator in archaeal mobile elements, in a mobile region of a Methanoperedens genome bin from the same wetland site.
Extended Data Fig. 9. The Borg ribosomal sequences form monophyletic groups that cluster adjacent to those from Methanoperedens spp.
Phylogenetic trees constructed using the protein sequences for (A) ribosomal protein L11 (rpL11), (B) ribosomal protein S2 and (C) ribosomal protein 3ae.

References

    1. Wallenius AJ, Dalcin Martins P, Slomp CP, Jetten MSM. Anthropogenic and environmental constraints on the microbial methane cycle in coastal sediments. Front. Microbiol. 2021;12:631621.
    2. Thauer RK, Kaster A-K, Seedorf H, Buckel W, Hedderich R. Methanogenic archaea: ecologically relevant differences in energy conservation. Nat. Rev. Microbiol. 2008;6:579–591.
    3. Hanson RS, Hanson TE. Methanotrophic bacteria. Microbiol. Rev. 1996;60:439–471.
    4. Boetius A, et al. A marine microbial consortium apparently mediating anaerobic oxidation of methane. Nature. 2000;407:623–626.
    5. Hallam SJ, Girguis PR, Preston CM, Richardson PM, DeLong EF. Identification of methyl coenzyme M reductase A (mcrA) genes associated with methane-oxidizing archaea. Appl. Environ. Microbiol. 2003;69:5483–5491.