2022 Jul 4;12(7):995.
doi: 10.3390/life12070995.

Realistic Gene Transfer to Gene Duplication Ratios Identify Different Roots in the Bacterial Phylogeny Using a Tree Reconciliation Method

Nico Bremer et al. Life (Basel).

Abstract

The rooting of phylogenetic trees permits important inferences about ancestral states and the polarity of evolutionary events. Recently, methods that reconcile discordance between gene trees and species trees (tree reconciliation methods) have become increasingly popular for rooting species trees. Rooting via reconciliation requires values for a particular parameter, the gene transfer to gene duplication ratio (T:D), which in current practice is estimated on the fly from discordances observed in the trees. To date, the accuracy of T:D estimates obtained by reconciliation analyses has not been compared to T:D estimates obtained by independent means; hence, the effect of T:D upon inferences of species tree roots is altogether unexplored. Here we investigated the issue in detail by performing tree reconciliations of more than 10,000 gene trees under a variety of T:D ratios for two phylogenetic cases: a bacterial (prokaryotic) tree with 265 species and a fungal-metazoan (eukaryotic) tree with 31 species. We show that a current tree reconciliation method, ALE, automatically estimates virtually identical T:D ratios across bacterial genes and fungal-metazoan genes. The T:D ratios estimated by ALE differ 10- to 100-fold from robust, ALE-independent estimates from real data. More important is our finding that the root inferences using ALE in both datasets are strongly dependent upon T:D. Using more realistic T:D ratios, the number of roots inferred by ALE consistently increases and, in some cases, clearly incorrect roots are inferred. Furthermore, our analyses reveal that gene duplications have a far greater impact on ALE's preferences for phylogenetic root placement than gene transfers or gene losses do. Overall, we show that obtaining reliable species tree roots with ALE is only possible when gene duplications are abundant in the data and the number of falsely inferred gene duplications is low.
Finding a sufficient sample of true gene duplications for rooting species trees critically depends on the T:D ratios used in the analyses. T:D ratios, while being important parameters of genome evolution in their own right, affect the root inferences with tree reconciliations to an unanticipated degree.

Keywords: gene duplication rate; gene transfer rate; genome evolution; species tree rooting; tree reconciliation.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1
Rooting the bacterial species tree with ALE using different T:D settings. The species tree encompasses 265 bacterial species and was reconstructed from the concatenated alignment of 62 protein-coding genes using maximum-likelihood [17]. Sixty-two alternative root positions within the tree were tested through the reconciliation of 11,269 maximum-likelihood gene trees, and the most likely roots that passed the AU tests (p > 0.05) are indicated with black arrows. (a) 1:1 T:D ratio; (b) 50:1 T:D ratio; (c) 100:1 T:D ratio. The clade shaded green corresponds to the Terrabacteria lineage, and the clade shaded blue corresponds to the Gracilicutes lineage.
Figure 2
The rates of gene gains and gene losses across bacterial genes. The rates of gains (horizontal axis), measured as the sum of gene duplication and gene transfer rates, versus the rates of losses (vertical axis) as obtained with different T:D settings with ALE (columns) conditioned upon three alternative bacterial roots [17]. (A–C) 1:1 T:D ratio. (D–F) 50:1 T:D ratio. (G–I) 100:1 T:D ratio. Insets show the average transfer-to-duplication ratio (T:D) and loss-to-gain ratio (L:G) across 11,265 gene trees. The color bars indicate the number of gene trees (density).
Figure 3
Inferences of the fungi-metazoan species tree with ALE. The tree was reconstructed via maximum-likelihood analyses of 117 concatenated protein-coding genes spanning 31 species (for complete species composition see Supplemental Table S3). The species tree is shown rooted on the known root branch that separates fungi (pink) from metazoa (light green). The rooting analyses were performed under different T:D settings (rows). The reconciliations were performed for 15,614 maximum-likelihood gene trees against all possible rooted versions of the unrooted tree (59 roots in total). The results for the AU test are shown separately for all gene trees [left; (a,c,e,g)] and for the 117 gene trees that contain all species without paralogs [right; (b,d,f,h)], referred to in the text as single-copy gene trees. The root branches that passed the AU test (p > 0.05) are marked with black arrows.
Figure 4
Power analyses for the number of inferred roots as a function of varying evolutionary rates. Gene trees were ranked according to the evolutionary rates for gene duplication (D) (a,b), gene transfer (T) (c,d) and gene loss (L) (e,f) independently (estimated autonomously by ALE, referred to in the text as 1:1 T:D ratio). The AU tests were performed interactively for all non-overlapping sets of 100 consecutive gene trees in the ranked list. The number of significant roots (vertical axis) was plotted against the mean evolutionary rate of the gene trees (horizontal axis). The insets show the correlation coefficient (r) and p-value (p) from the two-tailed Spearman correlation tests. Gene duplication rates, not gene transfer rates, have the strongest negative correlation with the number of inferred roots.
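
The binning procedure described in the Figure 4 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`bin_by_rate`, `rate_vs_roots`, `spearman_r`) and the `count_roots` callback, which stands in for running the AU test on one set of gene trees, are hypothetical. The sketch also assumes that an incomplete final set of fewer than 100 trees is dropped, which the caption does not state, and it computes only the Spearman coefficient, not the p-value.

```python
from statistics import mean


def bin_by_rate(rates, bin_size=100):
    """Rank gene trees by rate and return non-overlapping bins of indices.

    Assumption: a trailing bin with fewer than bin_size trees is dropped.
    """
    order = sorted(range(len(rates)), key=lambda i: rates[i])
    last_start = len(order) - bin_size
    return [order[s:s + bin_size] for s in range(0, last_start + 1, bin_size)]


def rate_vs_roots(rates, count_roots, bin_size=100):
    """Return (mean rate, number of roots) for each bin of ranked gene trees.

    count_roots is a hypothetical callback: given the indices of the gene
    trees in one bin, it returns how many roots pass the test for that bin.
    """
    points = []
    for idx in bin_by_rate(rates, bin_size):
        points.append((mean(rates[i] for i in idx), count_roots(idx)))
    return points


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

Given per-gene-tree duplication rates and a root-counting callback, `rate_vs_roots` yields the points for one panel, and `spearman_r` applied to the two coordinates gives the coefficient shown in the inset. A negative value would correspond to fewer inferred roots at higher rates.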

References

    1. Kluge A.G., Farris J.S. Quantitative phyletics and the evolution of anurans. Syst. Zool. 1969;18:1–32. doi: 10.2307/2412407.
    2. Brown J.R., Doolittle W.F. Root of the universal tree of life based on ancient aminoacyl-tRNA synthetase gene duplications. Proc. Natl. Acad. Sci. USA. 1995;92:2441–2445. doi: 10.1073/pnas.92.7.2441.
    3. Farris J.S. Estimating phylogenetic trees from distance matrices. Am. Nat. 1972;106:645–668. doi: 10.1086/282802.
    4. Tria F.D.K., Landan G., Dagan T. Phylogenetic rooting using minimal ancestor deviation. Nat. Ecol. Evol. 2017;1:0193. doi: 10.1038/s41559-017-0193.
    5. Lepage T., Bryant D., Philippe H., Lartillot N. A general comparison of relaxed molecular clock models. Mol. Biol. Evol. 2007;24:2669–2680. doi: 10.1093/molbev/msm193.