Inferring the source of transmission with phylogenetic data
- PMID: 24367249
- PMCID: PMC3868546
- DOI: 10.1371/journal.pcbi.1003397
Abstract
Identifying the source of transmission using pathogen genetic data is complicated by numerous biological, immunological, and behavioral factors. Incomplete or sparse sampling of cases is a major source of error. Unsampled cases may act as either a common source of infection or as an intermediary in a transmission chain for hosts infected with genetically similar pathogens. The probability of common-source or intermediate transmission events is difficult to quantify, which has hindered the development of statistical tests that confirm or rule out putative transmission pairs with genetic data. We present a method to incorporate additional information about an infectious disease epidemic, such as incidence and prevalence of infection over time, to inform estimates of the probability that one sampled host is the direct source of infection of another host in a pathogen gene genealogy. These methods enable forensic applications, such as source-case attribution, for infectious disease epidemics with incomplete sampling, which is usually the case for high-morbidity community-acquired pathogens like HIV, influenza, and dengue virus. These methods also enable epidemiological applications such as the identification of factors that increase the risk of transmission. We demonstrate these methods in the context of the HIV epidemic in Detroit, Michigan, and we evaluate the suitability of current sequence databases for forensic and epidemiological investigations. We find that currently available sequences collected for drug resistance testing of HIV are unlikely to be useful in most forensic investigations, but are useful for identifying transmission risk factors.
Conflict of interest statement
The authors have declared that no competing interests exist.
Figures
Right: True positive versus false positive rates (ROC) using estimated infector probabilities for classification of who infected whom in simulated HIV epidemics. The ROC curves were calculated for 208 pairs of individuals clustered in cherries in the transmission genealogy. Estimates are shown for the true transmission genealogy for a sample of 662 individuals and for the average infector probability calculated from a sample of 50 trees from a Bayesian phylogenetic posterior distribution.
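The ROC evaluation described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each candidate pair already has an estimated infector probability per sampled tree and a known true label from simulation. The function names (`average_infector_probability`, `roc_points`, `auc`) and the toy numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def average_infector_probability(per_tree_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average each pair's infector probability over a sample of trees.

    per_tree_probs[t][i] is the (assumed, precomputed) probability for
    pair i under sampled tree t.
    """
    n_trees = len(per_tree_probs)
    n_pairs = len(per_tree_probs[0])
    return [sum(tree[i] for tree in per_tree_probs) / n_trees for i in range(n_pairs)]


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    labels[i] is 1 if pair i is a true transmission pair, else 0.
    The threshold is swept from the highest score to the lowest, and
    tied scores are processed together as one threshold step.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative pair")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example: 2 sampled trees, 4 candidate pairs (made-up values).
per_tree = [[0.9, 0.2, 0.6, 0.4],
            [0.7, 0.4, 0.8, 0.2]]
truth = [1, 0, 1, 0]
mean_probs = average_infector_probability(per_tree)
curve = roc_points(mean_probs, truth)
```

In this toy case both true pairs receive higher averaged probabilities than both false pairs, so `auc(curve)` is 1.0. Real data would mix the two groups and give a lower area.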
