[Preprint]. 2025 Oct 20:2025.03.25.645238.
doi: 10.1101/2025.03.25.645238.

Co-occurrence is associated with horizontal gene transfer across marine bacteria independent of phylogeny

Gavin M Douglas et al. bioRxiv.

Abstract

Understanding the drivers and consequences of horizontal gene transfer (HGT) is a key goal of microbial evolution research. Although co-occurring taxa have long been recognized to undergo HGT more often, this association is confounded by other factors, most notably their phylogenetic relatedness. To disentangle these factors, we analyzed 15,339 marine prokaryotic genomes (mainly bacteria) and their distribution in the global ocean. We identified HGT events across these genomes and enrichments for functions previously shown to be prone to HGT. By mapping metagenomic reads from 1,862 ocean samples to these genomes, we also identified co-occurrence patterns and environmental associations. Although we observed an expected negative association between HGT rates and phylogenetic distance, we detected an association between co-occurrence and phylogenetic distance only for closely related taxa. This observation narrows the previously reported trend to closely related taxa, rather than supporting a consistent pattern across all taxonomic levels, at least within marine environments. In addition, we identified a significant association between co-occurrence and HGT, which remains even after controlling for phylogenetic distance and measured environmental variables. In a subset of samples with extended environmental data, we identified higher HGT levels associated with particle-attached bacteria and associations of varying directions with specific environmental variables, such as chlorophyll a and photosynthetically available radiation. Overall, our findings demonstrate the significant influence of ecological associations in shaping marine bacterial evolution through HGT.

Keywords: Horizontal gene transfer; bacteria; co-occurrence; lateral gene transfer; metagenomics; ocean ecology; phylogenetics; prokaryotes.

Conflict of interest statement

Conflict of interest disclosure: The authors state that they have no conflicts of interest with the content of this article.

Figures

Figure 1:
Summary of putative horizontal gene transfer events based on sequence identity cutoffs. (a) Counts of putative horizontal gene transfer (HGT) events based on the cluster-based approach, by the lowest taxonomic level at which the compared genomes differed. The total number of genome comparisons per level is also indicated. These cells are coloured by the percentage per row. (b) Proportion of putative horizontal gene transfer events per taxonomic level, after normalizing by the total number of genome comparisons per level.
Figure 2:
Cluster of Orthologous Genes (COG) categories enriched among horizontal gene transfer (HGT) hits based on sequence identity. HGT results used for this analysis were based on the cluster-based approach. Expected patterns of enrichment are indicated for COG categories at the top, and individual COG categories (in abbreviated form in some cases) are indicated at the bottom. The row groupings indicate the identity cut-off, in addition to the taxonomic level at which the genomes that encode the genes differ. Results are only shown for significant Fisher’s exact tests (Benjamini-Hochberg-corrected P < 0.05); light grey cells indicate non-significant tests.
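As a rough illustration of the kind of test named in the Figure 2 caption, the Python sketch below computes a Fisher's exact P-value for enrichment of one category among HGT hits and then applies the Benjamini-Hochberg correction across categories. The function names, the one-sided ("greater") alternative, and all counts are assumptions made for this example; they are not taken from the preprint, which does not state these details here.

```python
from math import comb


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact P for over-representation.

    Hypothetical 2x2 table layout (an assumption of this sketch):
        a = HGT genes in the category      b = HGT genes not in it
        c = non-HGT genes in the category  d = non-HGT genes not in it
    Returns P(X >= a) under the hypergeometric null.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    k_max = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, k_max + 1))
    return tail / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up counts for three hypothetical categories.
tables = [(30, 70, 100, 800), (12, 88, 110, 790), (3, 1, 1, 3)]
raw_p = [fisher_greater(*t) for t in tables]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

Under this sketch, only categories whose adjusted P-value falls below 0.05 would be drawn as coloured cells, and the rest would appear in light grey.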
Figure 3:
Breakdown of putative horizontally transferred genes identified, by co-occurrence and phylogenetic distance. (a) Tallies and percentages of pairwise genome comparisons, by whether they have at least one horizontal gene transfer (HGT) event inferred and whether they co-occur. (b) Boxplots over violin plots of all genomes tallied in panel a, displaying phylogenetic distance by HGT and co-occurrence. The width of the violin plots does not reflect the total sample size for each category (see panel a for counts in each category).
Figure 4:
Distributions of the difference in median value of each environmental variable between genome pairs, split by whether the genome pairs are connected by HGT. The median value for each variable was computed for each genome based on all the metagenomic samples in which it was found. The violin plots for the “no HGT” relationship show more detailed variation, likely because there are many more genome pairs in this category.
Figure 5:
Coefficients in the logistic regression model based on all pairwise genome comparisons. The response is whether those genomes have at least one called horizontal gene transfer event (based on the cluster-based approach) between them or not. Co-occurrence (based on the hypergeometric approach) is a binary variable indicating whether those genomes significantly co-occur, and “small particle enriched” and “less-filtered enriched” indicate whether the genomes being compared are both enriched in samples filtered to enrich for small particles (small), or in those with little or no size filtering (“less-filtered”). The other variables are ordered quantile normalized continuous values. Variables with ‘diff.’ represent the median inter-sample difference in each environmental variable for all samples encoding the two genomes being compared. Coefficient values less than 0 indicate that decreases in those variables are associated with increases in horizontal gene transfer, whereas a positive value represents the opposite. The intercept for this model was −14.35. Error bars represent 95% confidence intervals (1.96 × standard error). All variables were significant (P < 0.05) except for “small particle enriched”.
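To make the quantities in the Figure 5 caption concrete, the following Python sketch shows how a 95% confidence interval of the form coefficient ± 1.96 × standard error is formed, and how a logistic intercept maps to a baseline probability. Only the intercept (−14.35) comes from the caption. The coefficient and standard error are hypothetical placeholders, and reading the intercept as the HGT probability when every predictor equals zero is the standard logistic-regression interpretation, not a claim made by the authors.

```python
from math import exp


def wald_ci(coef, se, z=1.96):
    """Confidence interval as coefficient +/- z * standard error."""
    return coef - z * se, coef + z * se


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + exp(-x))


INTERCEPT = -14.35  # value reported in the Figure 5 caption

# Probability of a called HGT event when all predictors are 0
# (binary variables "off", normalized continuous variables at 0).
baseline_probability = inv_logit(INTERCEPT)

# Hypothetical coefficient and standard error, for illustration only.
example_coef, example_se = 0.8, 0.1
ci_low, ci_high = wald_ci(example_coef, example_se)

# A coefficient is on the log-odds scale; exponentiating gives an odds ratio.
example_odds_ratio = exp(example_coef)
```

The strongly negative intercept corresponds to a baseline probability well below one in a million, which is consistent with HGT being called for only a small fraction of all pairwise genome comparisons.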
Figure 6:
Key associations between environmental variables and horizontal gene transfer prevalence. (a) Variable importance of tested environmental variables for predicting horizontal gene transfer prevalence. Variable colour indicates the direction of the association (based on Spearman correlation, for context), while grey indicates variables that did not have significant variable importance. (b) Boxplots of horizontal gene transfer prevalence for samples split into groupings of photosynthetically available radiation (PAR) below and above 20%, which is the approximate centre of a large range with no samples.
Figure 7:
Breakdown of horizontal gene transfer (HGT) and related feature prevalence across 43 Tara Oceans samples. The same biological samples are matched based on whether they were enriched for small or large particles based on the fraction filter (i.e., to enrich for free-floating or particle-attached bacteria). The y-axis for all panels is the proportion of genomes in a sample with at least one instance of that category (e.g., the proportion of genomes in a sample with at least one gene identified to be either acquired or transferred through HGT). Asterisks indicate significant comparisons (adjusted P < 0.05) based on paired Wilcoxon tests.
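The paired comparison mentioned in the Figure 7 caption can be sketched as an exact two-sided Wilcoxon signed-rank test in plain Python. This is a minimal sketch that assumes no ties among the absolute differences and skips the multiple-testing adjustment. The function name and the proportions are invented for illustration, and the caption does not say whether the authors used an exact or an approximate P-value.

```python
def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W, p) where W is the smaller of the positive and negative
    rank sums. Zero differences are dropped; tied absolute differences
    are not handled in this sketch.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != n:
        raise ValueError("tied |differences| are not handled in this sketch")
    rank = {value: i + 1 for i, value in enumerate(abs_sorted)}
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    max_sum = n * (n + 1) // 2
    w = min(w_plus, max_sum - w_plus)
    # counts[s] = number of sign assignments whose positive ranks sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]
    p = 2 * sum(counts[: w + 1]) / 2 ** n
    return w, min(1.0, p)


# Invented per-sample proportions of genomes with at least one HGT gene,
# matched by biological sample across the two size fractions.
large_fraction = [0.30, 0.42, 0.25, 0.51, 0.38]
small_fraction = [0.20, 0.30, 0.21, 0.36, 0.31]
w_stat, p_value = wilcoxon_signed_rank_exact(large_fraction, small_fraction)
```

With only five pairs, the smallest attainable two-sided P-value is 2/32, so a real analysis needs more matched samples (the caption reports 43) before any comparison can reach the 0.05 threshold.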
