Review
J Mol Microbiol Biotechnol. 2011;21(3-4):83-96.
doi: 10.1159/000334611. Epub 2012 Jan 31.

Phylogenetic characterization of transport protein superfamilies: superiority of SuperfamilyTree programs over those based on multiple alignments

Jonathan S Chen et al. J Mol Microbiol Biotechnol. 2011.

Abstract

Transport proteins function in the translocation of ions, solutes and macromolecules across cellular and organellar membranes. These integral membrane proteins fall into >600 families as tabulated in the Transporter Classification Database (www.tcdb.org). Recent studies, some of which are reported here, define distant phylogenetic relationships between families with the creation of superfamilies. Several of these are analyzed using a novel set of programs designed to allow reliable prediction of phylogenetic trees when sequence divergence is too great to allow the use of multiple alignments. These new programs, called SuperfamilyTree1 and 2 (SFT1 and 2), allow display of protein and family relationships, respectively, based on thousands of comparative BLAST scores rather than multiple alignments. Superfamilies analyzed include: (1) Aerolysins, (2) RTX Toxins, (3) Defensins, (4) Ion Transporters, (5) Bile/Arsenite/Riboflavin Transporters, (6) Cation:Proton Antiporters, and (7) the Glucose/Fructose/Lactose superfamily within the prokaryotic phosphoenolpyruvate-dependent Phosphotransferase System. In addition to defining the phylogenetic relationships of the proteins and families within these seven superfamilies, evidence is provided showing that the SFT programs outperform programs that are based on multiple alignments whenever sequence divergence of superfamily members is extensive. The SFT programs should be applicable to virtually any superfamily of proteins or nucleic acids.
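
The general idea of building a tree from pairwise BLAST scores instead of a multiple alignment can be illustrated with a minimal sketch. This is not the SFT1 or SFT2 code, whose internals the abstract does not describe. It assumes that each pair of proteins already has a single symmetric bit score, converts scores to distances by normalizing against the highest score, and clusters with plain UPGMA rather than the Fitch method named in the figure legends. The protein names, scores, and function names are all hypothetical.

```python
from itertools import combinations


def scores_to_distances(scores, names):
    """Map pairwise bit scores to distances in [0, 1].

    Assumption: distance = 1 - score / highest score. Pairs with no
    reported BLAST hit are treated as score 0, i.e. maximally distant.
    """
    top = max(scores.values())
    dist = {}
    for a, b in combinations(names, 2):
        s = scores.get((a, b), scores.get((b, a), 0.0))
        dist[frozenset((a, b))] = 1.0 - s / top
    return dist


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering; returns a nested-tuple tree."""
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # Distance to every other cluster is the size-weighted average.
        for c in sizes:
            if c != a and c != b:
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Hypothetical proteins and bit scores, for illustration only.
names = ["P1", "P2", "P3", "P4"]
scores = {
    ("P1", "P2"): 200, ("P3", "P4"): 180,
    ("P1", "P3"): 40, ("P1", "P4"): 35,
    ("P2", "P3"): 45, ("P2", "P4"): 30,
}
tree = upgma(names, scores_to_distances(scores, names))
print(tree)
```

Because only pairwise scores are needed, a sketch like this never has to place highly divergent sequences into a shared alignment, which is the situation the abstract identifies as problematic for alignment-based methods.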

Figures

Fig. 1
Phylogenetic (Fitch) trees for the Aerolysin superfamily using the proteins in TCDB as of February 2011. Four different methods of tree construction were used: ClustalX-based neighbor-joining (a), ProtPars-based parsimony (b), the BLAST-derived SFT1 program results showing all Aerolysin superfamily members (c), and the SFT2-based tree showing all Aerolysin superfamily families (d). a–c Numbers indicate the protein TC#s (last two digits of the complete TC#). d Family abbreviations are presented with TC family numbers in parentheses. b–d Small numbers adjacent to the branches represent the ‘bootstrap’ values, indicating the reliability of the branching order. See TCDB for protein identification.
Fig. 2
Phylogenetic (Fitch) trees for the RTX toxin superfamily using the proteins in TCDB as of February 2011. Three different methods of tree construction were used: ClustalX-based neighbor-joining (a), MrBayes-based Bayesian (b), and the BLAST-derived SFT1 program showing all RTX toxin superfamily members (c). In all three figures, numbers indicate the protein TC#s. Family TC#s are indicated within parentheses under the family abbreviation. c Small numbers adjacent to the branches represent the ‘bootstrap’ values, indicating the reliability of the branching order. See TCDB for protein identification.
Fig. 3
Phylogenetic (Fitch) trees for the Defensin superfamily using the proteins in TCDB as of February 2011. Two different methods of tree construction were used: ClustalX-based neighbor-joining (a) and the BLAST-derived SFT1 (b). Both trees show all Defensin superfamily members. Numbers indicate the protein TC#s, while family TC#s are indicated within parentheses under the family name. b Small numbers adjacent to the branches represent the ‘bootstrap’ values, indicating the relative reliability of the branching order.
Fig. 4
Phylogenetic (Fitch) trees for the IT superfamily using the proteins in TCDB. Three different methods of tree construction were used: ClustalX-based neighbor-joining (a), the BLAST-derived SFT1 approach showing all IT superfamily members (b), and the SFT2 approach showing all IT superfamily families (c). a, b Numbers indicate the protein TC#s (last two digits of the complete TC#). c Family abbreviations are presented with TC family numbers in parentheses. b, c Small numbers adjacent to the branches represent the ‘bootstrap’ values, indicating the reliability of the branching order. See TCDB for protein identification.
Fig. 5
Phylogenetic trees for the BART superfamily using the proteins in TCDB. a ClustalX. b SFT1. The conventions of presentation are the same as for figure 3. For family and protein identification, see TCDB.
Fig. 6
Phylogenetic trees for the CPA superfamily using the proteins in TCDB. a ClustalX. b SFT1. The conventions of presentation are the same as for figure 3. For family and protein identification, see TCDB.
Fig. 7
Phylogenetic (Fitch) trees for the PTS-GFL superfamily using the proteins examined previously by Nguyen et al. [2006]. Only the BLAST-based SFT1 method of tree construction was used. a Numbers indicate the protein TC#s, while family TC#s are indicated within parentheses under the family designation. Small numbers adjacent to the branches represent the ‘bootstrap’ values, indicating the reliability of the branching order. At the ends of each branch, the protein abbreviation is presented as in table 1 of Nguyen et al. [2006].

References

    1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410.
    2. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402.
    3. An JH, Kim YS. A gene cluster encoding malonyl-CoA decarboxylase (MatA), malonyl-CoA synthetase (MatB) and a putative dicarboxylate carrier protein (MatC) in Rhizobium trifolii–cloning, sequencing, and expression of the enzymes in Escherichia coli. Eur J Biochem. 1998;257:395–402.
    4. Bhattacharjee H, Zhou T, Li J, Gatti DL, Walmsley AR, Rosen BP. Structure-function relationships in an anion-translocating ATPase. Biochem Soc Trans. 2000;28:520–526.
    5. Castillo R, Saier MH. Functional promiscuity of homologues of the bacterial ArsA ATPases. Int J Microbiol. 2010;2010:187–373.
