Nat Commun. 2011:2:321.
doi: 10.1038/ncomms1325. Epub 2011 May 24.

A novel methodology for large-scale phylogeny partition

Mattia C F Prosperi et al. Nat Commun. 2011.

Abstract

Understanding the determinants of virus transmission is a fundamental step in designing effective screening and intervention strategies to control viral epidemics. Phylogenetic analysis can be a valid approach for identifying transmission chains, and very large data sets can be analysed through parallel computation. Here we propose and validate a new methodology for the partition of large-scale phylogenies and the inference of transmission clusters. This approach, based on a depth-first search algorithm, combines the evaluation of node reliability, tree topology and patristic distance analysis. The method was applied to identify transmission clusters in a phylogeny of 11,541 human immunodeficiency virus-1 subtype B pol gene sequences from a large Italian cohort. Molecular transmission chains were characterized by different clinical/demographic factors, such as the interaction between male homosexuals and male heterosexuals. Our method takes advantage of a flexible notion of transmission cluster and can become a general framework for analysing other epidemics.

Conflict of interest statement

A.De L. received speakers' honoraria and served as a consultant or participated in advisory boards for GlaxoSmithKline, Gilead, Bristol-Myers Squibb, Abbott Virology, Janssen-Cilag Tibotec, Siemens Diagnostics and Monogram Biosciences. S.R. has received research grants and has been involved in advisory boards or educational courses supported by the following companies: Abbott, Boehringer-Ingelheim, Bristol-Myers Squibb, Gilead, GlaxoSmithKline, ViiV Healthcare, Merck, and Janssen-Cilag. M.Z. has received research funding from Pfizer; served as a consultant for Abbott Molecular, Boehringer Ingelheim, Gilead Sciences and Janssen-Cilag; and served on speakers' bureaus for Abbott, Bristol-Myers Squibb, Merck, and Pfizer. B.B. has received funds for speaking, consultancy and travel from ViiV Healthcare, Gilead Sciences, Abbott Molecular, Janssen-Cilag and Siemens Health Care. The remaining authors declare no competing financial interests.

Figures

Figure 1. Automated partition of a phylogenetic tree.
Graphical example of a depth-first tree search for automated phylogenetic tree partition. The method considers nodes/sub-trees with a reliability ≥ 90% and ≥ 2 distinct patients, and recognizes a sub-tree as a cluster when its median pairwise patristic distance is below a percentile threshold of the whole-tree patristic distance distribution (here, the 10th percentile). (a) An example of a phylogenetic tree, where each patient/sequence is identified by a letter (A–Z) and each tree node has an associated reliability value (for example, bootstrap support). (b) Histogram of the whole-tree patristic distance distribution. The vertical black line corresponds to the 10th percentile distance threshold. The partition method identifies three clusters (yellow, red and green) and discards the grey sub-tree.
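The partition rule in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, the nearest-rank `percentile` helper, the function names and the choice to stop descending once a sub-tree is accepted as a cluster are all assumptions made for illustration, and the quadratic distance computation is only suitable for small trees.

```python
import math
from dataclasses import dataclass, field
from itertools import combinations
from statistics import median
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: branch length to parent, reliability in [0, 1]."""
    length: float = 0.0
    support: float = 0.0
    patient: Optional[str] = None          # set on leaves only
    children: List["Node"] = field(default_factory=list)


def leaf_depths(node):
    """Return (patient, distance from `node` down to that leaf) for every leaf."""
    if not node.children:
        return [(node.patient, 0.0)]
    out = []
    for child in node.children:
        out.extend((p, d + child.length) for p, d in leaf_depths(child))
    return out


def pairwise_distances(node):
    """All leaf-to-leaf patristic distances inside the sub-tree rooted at `node`."""
    dists = []
    groups = []
    for child in node.children:
        dists.extend(pairwise_distances(child))            # pairs within one child
        groups.append([d + child.length for _, d in leaf_depths(child)])
    for g1, g2 in combinations(groups, 2):                 # pairs across children
        dists.extend(d1 + d2 for d1 in g1 for d2 in g2)
    return dists


def percentile(values, q):
    """Nearest-rank percentile (an assumed convention, not taken from the paper)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def partition(root, min_support=0.9, pct=10):
    """Depth-first search returning clusters as sorted lists of patient labels."""
    threshold = percentile(pairwise_distances(root), pct)
    clusters = []
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.children:
            continue                                       # a single leaf is never a cluster
        patients = {p for p, _ in leaf_depths(node)}
        if (node.support >= min_support
                and len(patients) >= 2
                and median(pairwise_distances(node)) < threshold):
            clusters.append(sorted(patients))              # accept; do not descend further
            continue
        stack.extend(reversed(node.children))              # otherwise keep searching below
    return clusters


def leaf(name, length):
    return Node(length=length, patient=name)


# Toy tree: one tight, well-supported pair; one distant pair; one poorly supported pair.
toy = Node(children=[
    Node(length=0.5, support=0.95, children=[leaf("A", 0.01), leaf("B", 0.01)]),
    Node(length=0.5, support=0.95, children=[leaf("C", 0.40), leaf("D", 0.40)]),
    Node(length=0.5, support=0.50, children=[leaf("E", 0.02), leaf("F", 0.02)]),
])
print(partition(toy))
```

In this toy tree only the A/B pair passes all three checks: C/D is too divergent and E/F has reliability below 90%.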
Figure 2. Phylogeny of Italian HIV-1 subtype B pol isolates.
Maximum-likelihood phylogenetic tree of 11,541 HIV-1 subtype B pol gene sequences from the Italian ARCA cohort. The tree is rooted on subtype J and depicted using three-dimensional hyperbolic geometry. Nodes and leaves are highlighted by yellow points.
Figure 3. Tree topology.
Topological analysis of a maximum-likelihood phylogenetic tree composed of 11,541 HIV-1 subtype B pol sequences from the Italian ARCA cohort, rooted on subtype J. Median (interquartile range) branch length (blue) and number of nodes (red) for each tree level are depicted.
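A per-level summary of the kind plotted in Figure 3 could be computed along the following lines. This is a sketch under stated assumptions: a "level" is taken to be the number of edges between a node and the root, the tree is encoded as nested `(branch_length, children)` tuples, and the quartiles use the default method of Python's `statistics.quantiles`; none of these choices is stated in the source.

```python
from collections import defaultdict
from statistics import median, quantiles


def level_summary(root):
    """Node count and median (interquartile range) branch length per tree level.

    `root` is a hypothetical nested tuple (branch_length, children), where
    `children` is a list of tuples of the same shape.
    """
    lengths = defaultdict(list)
    stack = [(root, 0)]
    while stack:
        (length, children), level = stack.pop()
        lengths[level].append(length)
        stack.extend((child, level + 1) for child in children)

    summary = {}
    for level in sorted(lengths):
        values = lengths[level]
        if len(values) >= 2:
            q1, _, q3 = quantiles(values, n=4)
        else:
            q1 = q3 = values[0]        # quartiles are undefined for a single value
        summary[level] = {
            "nodes": len(values),
            "median": median(values),
            "iqr": (q1, q3),
        }
    return summary


# Toy tree: root with two children, one of which has two leaf children.
toy_tree = (0.0, [(0.1, [(0.3, []), (0.5, [])]), (0.2, [])])
for level, stats in level_summary(toy_tree).items():
    print(level, stats["nodes"], stats["median"], stats["iqr"])
```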
