Genetics. 2007 Apr;175(4):1773-85.
doi: 10.1534/genetics.106.066258. Epub 2006 Dec 28.

Phylogenetic mapping of recombination hotspots in human immunodeficiency virus via spatially smoothed change-point processes

Vladimir N Minin et al. Genetics. 2007 Apr.

Abstract

We present a Bayesian framework for inferring spatial preferences of recombination from multiple putative recombinant nucleotide sequences. Phylogenetic recombination detection has been an active area of research for the last 15 years. However, only recently have attempts been made to summarize information from several instances of recombination. We propose a hierarchical model that allows for simultaneous inference of recombination breakpoint locations and spatial variation in recombination frequency. The dual multiple change-point model for phylogenetic recombination detection resides at the lowest level of our hierarchy under the umbrella of a common prior on breakpoint locations. The hierarchical prior allows information about spatial preferences of recombination to be shared among individual data sets. To overcome the sparseness of breakpoint data, dictated by the modest number of available recombinant sequences, we a priori impose a biologically relevant correlation structure on recombination location log odds via a Gaussian Markov random field hyperprior. To examine the capabilities of our model to recover spatial variation in recombination frequency, we simulate recombination from a predefined distribution of breakpoint locations. We then proceed with the analysis of 42 human immunodeficiency virus (HIV) intersubtype gag recombinants and identify a putative recombination hotspot.
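To make the idea of spatially correlated recombination log odds concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation. It assumes a first-order random-walk form for the Gaussian Markov random field, a fixed starting value (which the abstract does not specify), and independent per-site breakpoint draws. All function names and parameter values are hypothetical.

```python
import math
import random


def sample_gmrf_log_odds(n_sites, omega, start=-4.0, rng=None):
    """Draw site-wise log odds from a first-order random walk.

    Assumption: neighbouring log odds differ by Gaussian increments with
    precision omega (variance 1 / omega). The start value is arbitrary.
    """
    rng = rng or random.Random()
    sd = 1.0 / math.sqrt(omega)  # convert precision to standard deviation
    log_odds = [start]
    for _ in range(n_sites - 1):
        log_odds.append(log_odds[-1] + rng.gauss(0.0, sd))
    return log_odds


def to_probabilities(log_odds):
    """Inverse-logit transform: map each log odds value into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-g)) for g in log_odds]


def simulate_breakpoints(probs, rng=None):
    """Mark each site as a breakpoint independently with its own probability."""
    rng = rng or random.Random()
    return [i for i, p in enumerate(probs) if rng.random() < p]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
log_odds = sample_gmrf_log_odds(n_sites=200, omega=25.0, rng=rng)
probs = to_probabilities(log_odds)
breakpoints = simulate_breakpoints(probs, rng=rng)
```

A larger precision `omega` gives smoother profiles, because neighbouring sites are then forced to have similar log odds. This smoothing is what lets sparse breakpoint data from many recombinants inform one another.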

Figures

Figure 1.
Illustration of phylogenetic recombination detection. A multiple sequence alignment is divided into two parts by a recombination breakpoint (dashed line). These two parts support distinct phylogenies, shown on either side of the alignment. Sites that provide information about the topology of a phylogenetic tree are shown in boldface type.
Figure 2.
Simulation study. The left top plot shows recombination probabilities used to simulate recombination events in primate mitochondrial DNA sequences. The letter A denotes the probability mass over the region [401, 600]. The rest of the plots on the left depict the sites at which recombination was simulated (solid dots) and the posterior median (solid line) and 95% BCIs (shading) of inferred site-specific recombination probabilities. The right top plot depicts the prior density of GMRF precision ω. Posterior densities of ω are plotted underneath the prior.
Figure 3.
Analysis of HIV recombinants. The top plot illustrates the locations of gene products in the HIV gag coding region and marks the position of an instability element (INS) in the Capsid reading frame (hatched box). Below the gene map we show posterior medians (solid line) and 95% BCIs (shading) of population-level recombination probabilities. In the bottom two plots we depict averaged individual-level recombination probabilities (vertical bars), estimated jointly with the hierarchical model (plot second from bottom) and independently with the DMCP model (bottom plot). Solid circles mark breakpoint locations in individual recombinants as estimated by the joint and independent approaches.
Figure 4.
Number of breakpoints in individual recombinants. The top plot shows independently estimated posterior mean numbers of breakpoints plotted against jointly estimated posterior mean numbers of breakpoints for the 42 HIV gag individual recombinants. In the two bottom bar plots, we show the posterior mean numbers of breakpoints for the two types of recombination analysis that correspond to the x- and y-axes of the top plot.
Figure 5.
Convergence diagnostics. The left plot depicts 42 scaled regeneration quantile (SRQ) plots, where t_i denotes the iteration at which the total number of breakpoints in individual alignments returns to its posterior median for the ith time. On the right, Gelman–Rubin potential scale reduction factors (PSRFs, solid line) and their corresponding 97.5% quantiles (dashed line) for the recombination log odds are plotted against site indexes.
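For readers unfamiliar with the PSRF mentioned in the Figure 5 caption, the following sketch computes the basic Gelman–Rubin statistic for one scalar parameter from several equal-length chains. It uses the textbook formula and is not taken from the paper. It does not compute the 97.5% quantile shown in the figure, and the function name `psrf` is hypothetical.

```python
import statistics


def psrf(chains):
    """Basic Gelman-Rubin potential scale reduction factor.

    chains: list of equal-length lists of draws for one scalar parameter.
    Values near 1 suggest the chains are sampling the same distribution.
    """
    n = len(chains[0])  # draws per chain
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain sample variance
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: between-chain variance, scaled by chain length
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


agreeing = psrf([[1, 2, 3, 4], [1, 2, 3, 4]])
disagreeing = psrf([[0, 1, 0, 1], [10, 11, 10, 11]])
```

In the setting of the caption, one such value would be computed per site, using the draws of that site's recombination log odds from each independent chain.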
