Mol Biol Evol. 2013 May;30(5):1224-8.
doi: 10.1093/molbev/mst028. Epub 2013 Feb 13.

Hierarchical and spatially explicit clustering of DNA sequences with BAPS software

Lu Cheng et al. Mol Biol Evol. 2013 May.

Abstract

Phylogeographical analyses have become commonplace for a myriad of organisms with the advent of cheap DNA sequencing technologies. Bayesian model-based clustering is a powerful tool for detecting important patterns in such data and can be used to decipher even quite subtle signals of systematic differences in molecular variation. Here, we introduce two upgrades to the Bayesian Analysis of Population Structure (BAPS) software, which enable 1) spatially explicit modeling of variation in DNA sequences and 2) hierarchical clustering of DNA sequence data to reveal nested genetic population structures. We provide a direct interface to map the results from spatial clustering with Google Maps using the portal http://www.spatialepidemiology.net/ and illustrate this approach using sequence data from Borrelia burgdorferi. The usefulness of hierarchical clustering is demonstrated through an analysis of the metapopulation structure within a bacterial population experiencing a high level of local horizontal gene transfer. The tools that are introduced are freely available at http://www.helsinki.fi/bsg/software/BAPS/.

Figures

Fig. 1.
Google Maps representation of the estimated spatial genetic population structure of North American Borrelia burgdorferi produced from the BAPS output using the tool available in the portal http://www.spatialepidemiology.net/, last accessed November 5, 2012.
Fig. 2.
BAPS clustering of 427 genotypes from 23 species in the viridans group Streptococci. Each leaf node of the tree is labeled with a color corresponding to a BAPS cluster.
Fig. 3.
Results from a hierarchical BAPS clustering of 25,000 strains of simulated bacteria from a population subdivided into 25 patches of 1,000 strains each, with no between-patch migration and no patch turnover. A mutation rate of 0.0001 per locus/individual/generation was used in the simulation, and the population was subject to local recombination at a per-locus rate 10 times higher than the mutation rate. The tree on the left is the result from the first level of BAPS clustering, with leaf colors indicating assignment to the detected clusters. The trees on the right show cluster assignments from the second level of BAPS clustering, where two “conservative” clusters are correctly split with respect to the underlying patches used in the simulation process.
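
The simulation settings quoted in the Fig. 3 caption can be checked with a little arithmetic. The sketch below is only an illustration of those numbers, not code from BAPS or from the authors' simulator; the constant and function names are hypothetical, and the recombination rate is derived on the assumption that "10 times more frequent than mutation" means a simple multiple of the per-locus mutation rate.

```python
# Illustrative check of the Fig. 3 simulation settings.
# All names below are hypothetical; only the numeric values come from the caption.

N_PATCHES = 25                 # patches in the subdivided population
STRAINS_PER_PATCH = 1_000      # strains in each patch
MUTATION_RATE = 1e-4           # per locus / individual / generation
RECOMB_TO_MUTATION_RATIO = 10  # recombination is 10x as frequent as mutation


def total_strains(n_patches: int, strains_per_patch: int) -> int:
    """Total number of simulated strains across all patches."""
    return n_patches * strains_per_patch


def recombination_rate(mutation_rate: float, ratio: float) -> float:
    """Per-locus recombination rate, assuming a simple multiple of mutation."""
    return mutation_rate * ratio


TOTAL = total_strains(N_PATCHES, STRAINS_PER_PATCH)
RECOMB = recombination_rate(MUTATION_RATE, RECOMB_TO_MUTATION_RATIO)

print(f"total strains: {TOTAL}")
print(f"per-locus recombination rate: {RECOMB:.4f}")
```

Under these assumptions the total matches the 25,000 strains stated in the caption, and the implied recombination rate is 0.001 per locus.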
