Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Comparative Study
. 2003 Jul;20(7):1168-80.
doi: 10.1093/molbev/msg128. Epub 2003 May 30.

Predicting demographic group structures based on DNA sequence data

Affiliations
Comparative Study

Predicting demographic group structures based on DNA sequence data

Jon P Anderson et al. Mol Biol Evol. 2003 Jul.

Abstract

The ability to infer relationships between groups of sequences, either by searching for their evolutionary history or by comparing their sequence similarity, can be a crucial step in hypothesis testing. Interpreting relationships of human immunodeficiency virus type 1 (HIV-1) sequences can be challenging because of their rapidly evolving genomes, but it may also lead to a better understanding of the underlying biology. Several studies have focused on the evolution of HIV-1, but there is little information to link sequence similarities and evolutionary histories of HIV-1 to the epidemiological information of the infected individual. Our goal was to correlate patterns of HIV-1 genetic diversity with epidemiological information, including risk and demographic factors. These correlations were then used to predict epidemiological information through analyzing short stretches of HIV-1 sequence. Using standard phylogenetic and phenetic techniques on 100 HIV-1 subtype B sequences, we were able to show some correlation between the viral sequences and the geographic area of infection and the risk of men who engage in sex with men. To help identify more subtle relationships between the viral sequences, the method of multidimensional scaling (MDS) was performed. That method identified statistically significant correlations between the viral sequences and the risk factors of men who engage in sex with men and individuals who engage in sex with injection drug users or use injection drugs themselves. Using tree construction, MDS, and newly developed likelihood assignment methods on the original 100 samples we sequenced, and also on a set of blinded samples, we were able to predict demographic/risk group membership at a rate statistically better than by chance alone. Such methods may make it possible to identify viral variants belonging to specific demographic groups by examining only a small portion of the HIV-1 genome. Such predictions of demographic epidemiology based on sequence information may become valuable in assigning different treatment regimens to infected individuals.

PubMed Disclaimer

Publication types

MeSH terms