Methods Ecol Evol. 2015 Nov 11;7(3):358-368.
doi: 10.1111/2041-210X.12490.

HEXT, a software supporting tree-based screens for hybrid taxa in multilocus data sets, and an evaluation of the homoplasy excess test

Kevin Schneider et al. Methods Ecol Evol.

Abstract

The homoplasy excess test (HET) is a tree-based screen for hybrid taxa in multilocus nuclear phylogenies. Homoplasy between a hybrid taxon and the clades containing the parental taxa reduces bootstrap support in the tree. The HET is based on the expectation that excluding the hybrid taxon from the data set increases the bootstrap support for the parental clades, whereas excluding non-hybrid taxa has little effect on statistical node support. To carry out a HET, bootstrap trees are calculated with taxon-jackknife data sets, that is, data sets excluding one taxon (species, population) at a time. Excess increase in bootstrap support for certain nodes upon exclusion of a particular taxon indicates the hybrid (the excluded taxon) and its parents (the clades with increased support).

We introduce a new software program, hext, which generates the taxon-jackknife data sets, runs the bootstrap tree calculations, and identifies excess bootstrap increases as outlier values in boxplot graphs. hext is written in the R language and accepts binary data (0/1; e.g. AFLP) as well as co-dominant SNP and genotype data.

We demonstrate the usefulness of hext in large SNP data sets containing putative hybrids and their parents. For instance, using published data of the genus Vitis (~6,000 SNP loci), hext output supports V. × champinii as a hybrid between V. rupestris and V. mustangensis.

With simulated SNP and AFLP data sets, excess increases in bootstrap support were not always connected with the hybrid taxon (false positives), whereas the expected bootstrap signal failed to appear on several occasions (false negatives). Potential causes for both types of spurious results are discussed.

With both empirical and simulated data sets, the taxon-jackknife output generated by hext provided additional signatures of hybrid taxa, including changes in tree topology across trees, consistent effects of exclusions of the hybrid and the parent taxa, and moderate (rather than excessive) increases in bootstrap support.

hext significantly facilitates the taxon-jackknife approach to hybrid taxon detection, even though the simple test for excess bootstrap increase may not reliably identify hybrid taxa in all applications.
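The data-preparation side of the procedure described in the abstract can be illustrated with a minimal Python sketch. It is not hext's code or interface (hext is an R program). The function names `taxon_jackknife` and `bootstrap_loci` and the toy 0/1 matrix are hypothetical, and tree building and node scoring are deliberately left out.

```python
import random

def taxon_jackknife(matrix):
    """Yield (excluded_taxon, reduced_matrix), leaving out one taxon at a time."""
    for excluded in matrix:
        yield excluded, {t: row for t, row in matrix.items() if t != excluded}

def bootstrap_loci(matrix, rng):
    """Resample loci (columns) with replacement, keeping every taxon (row)."""
    n_loci = len(next(iter(matrix.values())))
    cols = [rng.randrange(n_loci) for _ in range(n_loci)]
    return {t: [row[c] for c in cols] for t, row in matrix.items()}

# Hypothetical binary marker matrix (AFLP-like): four taxa by six loci.
toy = {
    "parent1": [1, 1, 0, 0, 1, 0],
    "parent2": [0, 0, 1, 1, 0, 1],
    "hybrid":  [1, 0, 1, 0, 1, 1],
    "other":   [0, 1, 0, 1, 0, 0],
}

rng = random.Random(1)  # fixed seed so the sketch is reproducible
jackknife_sets = dict(taxon_jackknife(toy))

# One bootstrap replicate of the data set that excludes the putative hybrid.
replicate = bootstrap_loci(jackknife_sets["hybrid"], rng)
```

In a full analysis, each jackknife data set would be bootstrapped many times, a tree would be built from every replicate, and the support for each node of the full tree would be recorded per excluded taxon.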

Keywords: AFLP; Canidae; SNP; Vitis champinii; bootstrap support; homoplasy excess test; hybridization; phylogenetics.


Figures

Figure 1
Excess homoplasy introduced by a hybrid taxon in a multilocus phylogenetic tree. (a) The hybrid is placed intermediate to the parental taxa. Bootstrap support (numbers above branches) for clades containing the parental taxa is low due to homoplasy with the hybrid. Circled numbers identify nodes. (b) Exclusion of the hybrid increases bootstrap support for clades containing the parental taxa. (c) Exclusion of one parent taxon causes changes in bootstrap support (BS) and tree topology: increased bootstrap support for both parental clades, and placement of the hybrid with its other parent. (d) BS values for each node observed in the full tree. BS values in the full tree (first line) and in each taxon‐jackknife tree are compiled in the table. SC, support carryover: BS values were not scored for nodes that were sister to the excluded taxon. NA, node had joined the excluded taxon. (e) Boxplots representing the distribution of BS values scored in the taxon‐jackknife trees for each node observed in the full tree. The boxes encompass the 50% of observed values that lie between the first and the third quartile and define the interquartile range (IQR). Vertical bars within boxes mark the median value. Whiskers extend to the smallest and the largest BS value located within the 1·5 × IQR distance from the boxes, whereas values beyond this distance are considered outliers and represented by dots. Dot colour indicates whether an outlier was caused by the exclusion of the hybrid, parent 1 or parent 2.
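The outlier rule in panel (e) can be written out as a short Python sketch for a single node. The function name `upper_outliers`, the taxon labels and the BS values are hypothetical. Unscored entries (SC or NA in the table) are represented as `None`, and the quartile convention used here (Python's "inclusive" method) is an assumption that may differ slightly from the hinges used by R boxplots.

```python
from statistics import quantiles

def upper_outliers(bs_by_excluded, k=1.5):
    """Return the excluded taxa whose BS value lies more than k * IQR above Q3."""
    # Drop entries that were not scored (SC or NA in the BS table).
    scored = {t: v for t, v in bs_by_excluded.items() if v is not None}
    q1, _, q3 = quantiles(list(scored.values()), n=4, method="inclusive")
    fence = q3 + k * (q3 - q1)
    return {t: v for t, v in scored.items() if v > fence}

# Hypothetical BS values (%) for one node, keyed by the taxon that was excluded.
node_bs = {"A": 52, "B": 55, "C": 50, "D": 54, "E": 53, "hybrid": 95, "sister": None}

flagged = upper_outliers(node_bs)
print(flagged)  # {'hybrid': 95}
```

With these toy values the quartiles are 52.25 and 54.75, so the upper fence sits at 58.5 and only the tree that excludes the hybrid exceeds it.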
Figure 2
Homoplasy excess test results for the Vitis SNP data set. V. × champinii is a natural hybrid between V. rupestris and V. mustangensis. Trees were rooted with V. rotundifolia, and nodes are labelled with BS values. (a) The full tree, containing hybrid and parent taxa. (b) Tree obtained after exclusion of the hybrid, V. × champinii. (c) Graphical output of hext, showing boxplots of BS values across taxon‐jackknife trees for nodes at which at least one outlier value was detected at a distance >1·5 IQR from the box. Description of the nodes (on left vertical axis) and identification of species whose exclusion caused the upper BS outlier(s) were added manually. Bold font highlights nodes and taxa contributing the expected hybrid signals. Nodes A and B are identified in 2a, node “V. rupestris” joins the five V. rupestris accessions. gird., V. girdiana; rup., V. rupestris; syl., V. sylvestris; pal., V. palmata; mon., V. monticola; rip., V. riparia; acer., V. acerifolia; pia, V. piasezkii; champ., V. × champinii; must., V. mustangensis. (d, e) Annotated boxplots revealing reduced homoplasy and tree topology changes after exclusion of the putative hybrid taxon V. × champinii (d) and its parental species V. rupestris and V. mustangensis (e). The node defining monophyly of V. × champinii did not occur in the full tree (2a) and is therefore not included in the default hext output shown in 2c. The boxplot for this node was obtained using the hext option to create custom boxplots for defined nodes. Both exclusion of V. rupestris and of V. mustangensis yielded a BS of 100% (overlapping signals drawn as light‐blue circle with dark‐blue ring). The smaller of the two upper outliers at the V. × champinii node originates from the tree excluding V. acerifolia, and is not interpreted as evidence of the putative hybrid origin of V. × champinii. Similarly, in (d), the smaller of the two upper outliers at the V. rupestris node originates from the tree excluding V. piasezkii, and is also not interpreted as evidence of the putative hybrid origin of V. × champinii. Other upper outliers not highlighted in (d) are due to the exclusion of a parent as annotated in (e). Lower outliers highlighted in neither (d) nor (e) are also not considered connected to the putative hybrid origin of V. × champinii.
Figure 3
(a) Tree topology used to simulate AFLP and SNP data sets. Simulations assumed that hybridization between two lineages marked with corresponding symbols gave rise to a novel hybrid taxon. (b) Full tree obtained from the SNP data set in simulation #11 (Table 2) including a hybrid taxon originating from hybridization between the lineages marked with red diamonds (l × s). Bootstrap support (BS) is given near nodes (% in 1000 bootstrap replicates). BS boxplots are shown for selected nodes to illustrate the patterns described in the text.
Figure 4
(a) Negative correlation between numbers of false‐positive boxplot outliers (upper BS outliers unconnected to hybrid taxon) and average node support across the full tree in simulations of AFLP and SNP data. Outliers were identified as values at a distance >1·5 × IQR (open symbols) and >3 × IQR (filled symbols) from the third quartile. (b) True positive boxplot outliers (upper BS outliers upon exclusion of hybrid taxon) were more likely to occur when average node support in the full tree was low.
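The distinction between true and false positive outliers at the two thresholds can be sketched in Python as follows. This is an illustration only: the function name `count_upper_outliers`, the node and taxon labels, and all BS values are hypothetical, and the "inclusive" quartile method is an assumed convention rather than the one used for the figure.

```python
from statistics import quantiles

def count_upper_outliers(table, hybrid, k):
    """Count upper BS outliers caused by excluding the hybrid vs. any other taxon."""
    true_pos = false_pos = 0
    for bs_by_excluded in table.values():
        scored = {t: v for t, v in bs_by_excluded.items() if v is not None}
        q1, _, q3 = quantiles(list(scored.values()), n=4, method="inclusive")
        fence = q3 + k * (q3 - q1)  # distance is measured from the third quartile
        for taxon, value in scored.items():
            if value > fence:
                if taxon == hybrid:
                    true_pos += 1
                else:
                    false_pos += 1
    return true_pos, false_pos

# Hypothetical BS table: node -> {excluded taxon: BS (%) in that jackknife tree}.
table = {
    "node1": {"t1": 50, "t2": 52, "t3": 53, "t4": 54, "t5": 55, "hyb": 95},
    "node2": {"t1": 60, "t2": 62, "t3": 63, "t4": 64, "t5": 70, "hyb": 61},
}

print(count_upper_outliers(table, "hyb", k=1.5))  # (1, 1)
print(count_upper_outliers(table, "hyb", k=3))    # (1, 0)
```

In this toy table the stricter 3 × IQR threshold removes the outlier unconnected to the hybrid while retaining the one caused by excluding the hybrid. The caption does not imply that this holds in general.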
