BMC Biol. 2009 Jan 26;7:3.
doi: 10.1186/1741-7007-7-3.

Assigning strains to bacterial species via the internet

Cynthia J Bishop et al. BMC Biol. 2009.

Abstract

Background: Methods for assigning strains to bacterial species are cumbersome and no longer fit for purpose. The concatenated sequences of multiple house-keeping genes have been shown to define and circumscribe bacterial species as sequence clusters. The advantage of this approach (multilocus sequence analysis; MLSA) is that, for any group of related species, a strain database can be produced and combined with software that allows query strains to be assigned to species via the internet. As an exemplar of this approach, we studied the viridans streptococci, a group whose strains are very difficult to assign to species using standard taxonomic procedures, and developed a website that allows species assignment via the internet.

Results: Seven house-keeping gene sequences were obtained from 420 streptococcal strains to produce a viridans group database. The reference tree produced using the concatenated sequences identified sequence clusters which, by examining the position on the tree of the type strain of each viridans group species, could be equated with species clusters. MLSA also identified clusters that may correspond to new species, and previously described species whose status needs to be re-examined. A generic website and software for electronic taxonomy were developed. The site (http://www.eMLSA.net) allows the sequences of the seven gene fragments of a query strain to be entered and returns the species assignment according to the strain's position within an assigned species cluster on the reference tree.
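
The concatenation step described above can be illustrated with a minimal Python sketch. It is not the eMLSA.net implementation: it assumes the per-locus sequences are already aligned and trimmed to equal length, uses only the three loci named in the figure legends (ppaC, pyk, tuf) rather than the full seven-locus scheme, and uses invented toy sequences. The functions `concatenate` and `p_distance` are hypothetical names, and the simple proportion of differing sites stands in for whatever distance measure the authors used to build the tree.

```python
from typing import Dict, Sequence

# Assumption: three loci named in the figure legends; the real scheme has seven.
LOCI = ("ppaC", "pyk", "tuf")


def concatenate(alleles: Dict[str, str], loci: Sequence[str] = LOCI) -> str:
    """Join the per-locus sequences of one strain in a fixed locus order."""
    missing = [locus for locus in loci if locus not in alleles]
    if missing:
        raise ValueError(f"missing loci: {missing}")
    return "".join(alleles[locus].upper() for locus in loci)


def p_distance(a: str, b: str) -> float:
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Toy sequences, invented for illustration only.
strain_a = {"ppaC": "ATGC", "pyk": "GGCA", "tuf": "TTAA"}
strain_b = {"ppaC": "ATGC", "pyk": "GGTA", "tuf": "TTAG"}

dist = p_distance(concatenate(strain_a), concatenate(strain_b))
print(f"p-distance between toy strains: {dist:.3f}")
```

A matrix of such pairwise distances over all strains is the usual input to a neighbour-joining tree builder; the fixed locus order matters only in that every strain must be concatenated the same way.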

Conclusion: The MLSA approach resulted in the identification of well-resolved species clusters within this taxonomically challenging group and, using the software we have developed, allows unknown strains to be assigned to viridans species via the internet. Submission of new strains will provide a growing resource for the taxonomy of viridans group streptococci, allowing the recognition of potential new species and taxonomic anomalies. More generally, as the software at the MLSA website is generic, MLSA schemes and strain databases for other groups of related species can be hosted at this website, providing a portal for microbial electronic taxonomy.

Figures

Figure 1
Tree showing the positions of well-characterized strains and type strains within sequence clusters. A neighbour-joining radial tree was constructed using the concatenated sequences of all 420 strains. The viridans group strains that are coloured were those assigned to species in the laboratory of MK; the positions on the tree of the type strains of viridans group species are indicated. Mitis, Anginosus and Salivarius group species are shown, respectively, as coloured circles, squares and diamonds. Bootstrap support values (%) for each of the nodes leading to the Mitis group sequence clusters are indicated; values for the clusters within the Anginosus and Salivarius groups are shown in Figure 2C. The colour key is ordered top to bottom according to the position of the clusters on the radial tree, from S. pneumoniae to S. pyogenes.
Figure 2
The MLSA reference tree. A. The neighbour-joining tree of Figure 1 was relabelled so that all strains within a sequence cluster were assigned to the species inferred from the positions on the tree of the type strain and other well-characterized strains. The limits of the S. mitis and S. pseudopneumoniae clusters are somewhat unclear and the strains on the flanks of this cluster are shown as white circles (uncertain species) to indicate this. Strains that were re-checked as they were outliers of species clusters, or were distinct from all other strains, are indicated by a red asterisk. B. The tree constructed from the same set of concatenated sequences using minimum evolution. C. The neighbour-joining tree with the Mitis group clusters collapsed, to show more clearly the sequence clusters within the Anginosus and Salivarius groups. The positions of the type strains and bootstrap values for the nodes are shown.
Figure 3
Comparison of the clustering of strains using two different MLSA schemes. The neighbour-joining tree obtained for a set of 93 strains of S. pneumoniae, S. pseudopneumoniae, S. mitis and S. oralis using the seven-locus MLSA scheme (A) was compared with the tree obtained from the same strains using the concatenated sequences of six of the loci used in the S. pneumoniae MLST scheme (B). The two trees are drawn to the same scale.
Figure 4
Effect on clustering patterns of adding an additional locus (guaA) to the seven-locus scheme. The guaA gene was successfully sequenced from 326 of the Mitis group strains and the neighbour-joining tree obtained for these strains using the seven-locus MLSA scheme (A) was compared with that obtained from the same strains by adding the guaA sequence to the concatenated sequences of the seven loci (B). The colour code for species clusters is as in Figure 2. The positions on the tree of the type strains are indicated.
Figure 5
Phenotypically distinct sub-clusters within the S. oralis species cluster. The region of the neighbour-joining tree that includes strains within the S. oralis species cluster is shown in more detail. The positions on the tree of the type strains are indicated. The sub-cluster of phenotypically distinct strains that are arginine hydrolysis and α-maltosidase positive, and which express the Lancefield group K cell wall carbohydrate antigen, is indicated (highlighted in blue). A further subgroup, which included the S. oligofermentans type strain (SK1136) and consisted exclusively of IgA protease-negative strains, is also indicated (highlighted in yellow). Bootstrap values for relevant nodes are shown. Green strain names indicate IgA1 protease-positive strains, whereas red strain names indicate IgA1 protease-negative strains. The IgA1 protease status of a few strains (black strain names) was unknown.
Figure 6
Examples of individual gene trees produced from the sequences of all 420 strains. Panels A-C show the ppaC, pyk and tuf gene trees, each of which fails to resolve some of the species clusters. Strain SK264 (assigned as S. parasanguinis; labelled in Figure 2A) was used as the query strain in Figures 7 and 8. The ppaC allele of this strain is assigned as resident compatible as it falls within an unresolved cluster that includes the other S. parasanguinis sequences. For both pyk and tuf, the sequences from SK264 fall within clusters that are well resolved (bootstrap values of ≥ 80%) from the cluster that includes all (or the great majority) of the other S. parasanguinis sequences and are assigned as foreign alleles. For pyk the source of the foreign allele is unclear as its sequence falls within a cluster that includes those from several species. For tuf, the sequence appears to have been introduced from an S. australis strain. The major unresolved clusters in the trees are indicated.
Figure 7
Species assignment on the internet. The eMLSA.net page returned after entering the seven gene sequences of a query strain and requesting a species assignment. The species assigned to the five most closely matching concatenated sequences are returned (left) along with an unrooted neighbour-joining radial tree indicating the position of the query strain. In this case, the query strain (SK264) is assigned as S. parasanguinis as the five most similar concatenated sequences are all from this species and the query strain falls within the S. parasanguinis species cluster on the tree.
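
The "five most closely matching concatenated sequences" rule in this legend can be sketched in Python as follows. This is a hedged illustration, not the eMLSA.net algorithm: it covers only the nearest-match part of the assignment and omits the check that the query falls within the species cluster on the tree. The reference strain names (`ref1` to `ref7`), the toy sequences, and the functions `closest_matches` and `unanimous_species` are all hypothetical, and a plain mismatch count over aligned sequences is assumed as the similarity measure.

```python
from typing import List, Optional, Tuple

# (strain name, species, aligned concatenated sequence); toy data, invented.
Reference = Tuple[str, str, str]

REFERENCES: List[Reference] = [
    ("ref1", "S. parasanguinis", "ATGCGGCATTAA"),
    ("ref2", "S. parasanguinis", "ATGCGGCATTAG"),
    ("ref3", "S. parasanguinis", "ATGCGGTATTAA"),
    ("ref4", "S. parasanguinis", "ATGAGGCATTAA"),
    ("ref5", "S. parasanguinis", "ATGCGGTATTAG"),
    ("ref6", "S. australis", "TTGAGGTACTAG"),
    ("ref7", "S. australis", "TTGAGGTACTAA"),
]


def mismatches(a: str, b: str) -> int:
    """Count differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_matches(
    query: str, references: List[Reference], k: int = 5
) -> List[Tuple[str, str, int]]:
    """Return the k references with the fewest mismatches to the query."""
    scored = [(name, sp, mismatches(query, seq)) for name, sp, seq in references]
    return sorted(scored, key=lambda row: row[2])[:k]  # stable sort keeps ties in input order


def unanimous_species(matches: List[Tuple[str, str, int]]) -> Optional[str]:
    """Return the species if every match agrees, otherwise None."""
    species = {sp for _, sp, _ in matches}
    return species.pop() if len(species) == 1 else None


query = "ATGCGGCATTAA"  # toy query sequence
top5 = closest_matches(query, REFERENCES, k=5)
print(unanimous_species(top5), top5)
```

Returning `None` when the closest matches disagree leaves ambiguous queries unassigned rather than guessing, which mirrors the legend's requirement that all five matches come from the same species.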
Figure 8
Online assignment of alleles as resident or foreign. Having assigned a query strain to a species (Figure 7), the locus view page shows the assignment of each of the seven individual sequences as resident to that species (or compatible with being resident) or foreign. In the example, the sequences of five of the genes from strain SK264 are assigned by eMLSA.net as resident (that is, S. parasanguinis) or resident compatible, but the pyk and tuf genes are assigned as foreign. The locus view page allows the individual gene trees (in this case, the tuf tree) to be displayed to explore why the algorithm considers the sequences of an allele of a query strain to be resident, resident compatible or foreign (see Figure 6 for details). The colour codes for species clusters are as in Figure 2.
