Syst Biol. 2020 Jul 1;69(4):756-773.
doi: 10.1093/sysbio/syz074.

Hierarchical Hybrid Enrichment: Multitiered Genomic Data Collection Across Evolutionary Scales, With Application to Chorus Frogs (Pseudacris)

Sarah E Banker et al. Syst Biol. 2020.

Abstract

Determining the optimal targets of genomic subsampling for phylogenomics, phylogeography, and population genomics remains a challenge for evolutionary biologists. Of the available methods for subsampling the genome, hybrid enrichment (sequence capture) has become one of the primary means of data collection for systematics, due to the flexibility and cost efficiency of this approach. Despite the utility of this method, information is lacking as to what genomic targets are most appropriate for addressing questions at different evolutionary scales. In this study, first, we compare the benefits of target loci developed for deep and shallow scales by evaluating these loci at each of three taxonomic levels: within a genus (phylogenetics), within a species (phylogeography), and within a hybrid zone (population genomics). Specifically, we target evolutionarily conserved loci that are appropriate for deeper phylogenetic scales and more rapidly evolving loci that are informative for phylogeographic and population genomic scales. Second, we assess the efficacy of targeting multiple locus sets for different taxonomic levels in the same hybrid enrichment reaction, an approach we term hierarchical hybrid enrichment. Third, we apply this approach to the North American chorus frog genus Pseudacris to answer key evolutionary questions across taxonomic and temporal scales. We demonstrate that in this system the type of genomic target that produces the most resolved gene trees differs depending on the taxonomic level, although the potential for error is substantially lower for the deep-scale loci at all levels. We successfully recover data for the two different locus sets with high efficiency.
Using hierarchical data targeting deep and shallow levels, we 1) resolve the phylogeny of the genus Pseudacris and introduce a novel visual and hypothesis testing method that uses nodal heat maps to examine the robustness of branch support values to the removal of sites and loci; 2) estimate the phylogeographic history of Pseudacris feriarum, which reveals up to five independent invasions leading to sympatry with congener Pseudacris nigrita to form replicated reinforcement contact zones with ongoing gene flow into sympatry; and 3) quantify with high confidence the frequency of hybridization in one of these zones between P. feriarum and P. nigrita, which is lower than microsatellite-based estimates. We find that the hierarchical hybrid enrichment approach offers an efficient, multitiered data collection method for simultaneously addressing questions spanning multiple evolutionary scales. [Anchored hybrid enrichment; heat map; hybridization; phylogenetics; phylogeography; population genomics; reinforcement; reproductive character displacement.]

Figures

Figure 1.
The effect of site and locus filtering on overall phylogenetic support. Sites were ranked by rate, then subsampled in 12 nested strategies (a), with an increasing number of the most variable sites being excluded in each successive strategy. For each site-removal strategy, gene trees were estimated from the resulting subsampled alignments and compared in order to compute the Euclidean distance from each tree to the center of multidimensional tree distance space (b). Trees were subsampled in 12 nested strategies with an increasing number of the loci with the largest tree distance being excluded in each successive strategy. Phylogenetic trees were estimated using each of the 144 resulting subsampled data sets (12 site-filtered × 12 locus-filtered) and the average support value (averaged across the tree) is shown as a heat map, in which black indicates very strong support and white indicates moderate support (bootstrap support values from concatenated RAxML analyses are shown here). Overall support remains strong, except when a large portion of the sites or loci are removed, suggesting that the quantity of data available for analysis is more than sufficient to resolve the majority of the clades in the tree.
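The nested filtering grid described in this caption can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the evenly spaced exclusion fractions, the 50% maximum, and all input values are hypothetical, and the tree coordinates stand in for whatever embedding of tree distance space was actually used.

```python
import math
from itertools import product

def distance_to_center(points):
    """Euclidean distance from each point to the centroid of all points."""
    dims = len(points[0])
    center = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    return [math.dist(p, center) for p in points]

def nested_exclusions(scores, n_strategies=12, max_fraction=0.5):
    """Return n_strategies nested lists of kept indices.

    Strategy 0 keeps everything; each later strategy drops a larger share of
    the highest-scoring items (fastest sites, or loci farthest from center).
    The even spacing and max_fraction are assumptions, not values from the paper.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # lowest score first
    steps = max(n_strategies - 1, 1)
    kept = []
    for k in range(n_strategies):
        n_drop = int(len(order) * max_fraction * k / steps)
        kept.append(sorted(order[: len(order) - n_drop]))
    return kept

# Hypothetical inputs: per-site rates and gene-tree coordinates in distance space.
site_rates = [0.1, 2.3, 0.4, 1.7, 0.05, 0.9, 3.1, 0.2]
tree_coords = [(0, 0), (2, 0), (0, 3), (2, 2), (1, 1), (9, 9)]

site_sets = nested_exclusions(site_rates)
locus_sets = nested_exclusions(distance_to_center(tree_coords))

# Every (site strategy, locus strategy) pair is one cell of the heat map.
grid = list(product(range(len(site_sets)), range(len(locus_sets))))
```

Each cell in `grid` would then receive the average support of the tree estimated from that subsampled data set; the tree estimation itself is outside the scope of this sketch.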
Figure 2.
Degree of gene tree resolution for six combinations of hybrid enrichment locus type and taxonomic scale. For each combination, the proportion of gene tree branches with support greater than a specified value (varied on the x-axis) was computed. Note that the locus type producing the most resolved gene trees depends on the taxonomic scale. Also note that the axis scaling is not the same for the three graphs.
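The quantity plotted in this figure, the proportion of branches whose support exceeds a threshold, is simple to compute. The sketch below is a minimal illustration: the function name, the support values, and the thresholds are hypothetical and are not taken from the study.

```python
def proportion_resolved(supports, thresholds):
    """For each threshold, the fraction of branches with support strictly above it."""
    if not supports:
        raise ValueError("need at least one branch support value")
    return [sum(s > t for s in supports) / len(supports) for t in thresholds]

# Hypothetical bootstrap values pooled across the gene trees of one combination.
branch_supports = [100, 95, 72, 40, 88, 15, 63, 99]
curve = proportion_resolved(branch_supports, [0, 50, 70, 90])
```

Computing one such curve per combination of locus type and taxonomic scale would give six curves that can be compared across panels.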
Figure 3.
ASTRAL tree (a) and phylogeny with branch lengths (b, inset) of the genus Pseudacris showing phylogenetic sensitivity heat maps on nodes. Taxon numbers correspond to Supplementary Table S1 available on Dryad. On the heat maps, the color scale is the same as in Figure 1. Numbers to the lower left of a heat map indicate the support value when all data were included in the analysis (corresponds to lower left pixel of heat map), if that value was less than 100. The two nodes on which further hypothesis testing was performed (Fig. 4) are indicated by yellow circles around the heat map. The analogous concatenated maximum likelihood tree is presented in Supplementary Figure S2 available on Dryad.
Figure 4.
Hypothesis testing framework for comparing heat maps. The heat maps of three alternative resolutions were compared for each of two different clades (a and b) through randomization tests. Previous studies supporting each alternative resolution are indicated as follows: “Barrow” is Barrow et al. (2014), “Lemmon” is Lemmon et al. (2007a,b), and “Duellman” is Duellman et al. (2016). For both clades, the leftmost resolution was strongly favored, as shown in Figure 3.
Figure 5.
Phylogeography of P. feriarum, showing the geographic extent of the sampling (a), the intraspecific nuclear tree (b), and estimated routes of dispersal across the range and into the five river systems (c). Independent invasions of P. feriarum from allopatry into sympatry with P. nigrita are indicated by different colors on the tree (sympatric samples are colored, allopatric samples are not). Colors of clades match sample colors on map and correspond to colors of the arrows (numbering matches transects in Supplementary Fig. S3 available on Dryad). Bootstrap support is shown by grayscale dots on nodes.
