BMC Evol Biol. 2010 Apr 12;10:99.
doi: 10.1186/1471-2148-10-99.

FLU, an amino acid substitution model for influenza proteins

Cuong Cao Dang et al. BMC Evol Biol. 2010.

Abstract

Background: The amino acid substitution model is the core component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Although several general amino acid substitution models have been estimated from large and diverse protein databases, they remain inappropriate for analyzing specific species, e.g., viruses. Emerging epidemics of influenza viruses raise the need for comprehensive studies of these dangerous viruses. We propose an influenza-specific amino acid substitution model to enhance the understanding of the evolution of influenza viruses.

Results: A maximum likelihood approach was applied to estimate an amino acid substitution model (FLU) from approximately 113,000 influenza protein sequences, comprising approximately 20 million residues. FLU outperforms 14 widely used models in constructing maximum likelihood phylogenetic trees for the majority of influenza protein alignments. On average, FLU gains approximately 42 log-likelihood points for an alignment of 300 sites. Moreover, the topologies of trees constructed using FLU frequently differ from those constructed using other models. FLU therefore affects both the likelihood and the inferred tree topology. It is implemented in PhyML and can be downloaded from ftp://ftp.sanger.ac.uk/pub/1000genomes/lsq/FLU or used through the PhyML 3.0 server at http://www.atgc-montpellier.fr/phyml/.

Conclusions: FLU should be useful for any influenza protein analysis system that requires an accurate description of amino acid substitutions.

Figures

Figure 1
Amino acid frequencies of the FLU, HIVb, and LG models, together with the empirical frequencies counted from all alignments (denoted Influenza).
Figure 2
The exchangeability coefficients in the FLU, HIVb, and LG models. The black bubble at the intersection of row X and column Y represents the exchangeability between amino acid X and amino acid Y in FLU. Similarly, the grey and white bubbles represent the exchangeabilities between amino acids in the LG and HIVb models, respectively. The bubbles show remarkable differences between these models.
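As a reminder of how exchangeability coefficients and amino acid frequencies are usually combined in an empirical substitution model, the sketch below builds a time-reversible rate matrix with off-diagonal entries Q[i][j] = r[i][j] * pi[j]. It is a minimal illustration under stated assumptions, not the FLU implementation: the three-state alphabet, the numbers in `toy_exch` and `toy_freqs`, and the function name `build_rate_matrix` are all hypothetical.

```python
def build_rate_matrix(exch, freqs):
    """Build a reversible rate matrix from symmetric exchangeabilities
    and equilibrium frequencies, scaled to one substitution per unit time."""
    n = len(freqs)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Off-diagonal rate: exchangeability times target frequency.
                q[i][j] = exch[i][j] * freqs[j]
        # Diagonal entry makes each row sum to zero.
        q[i][i] = -sum(q[i])
    # Expected number of substitutions per unit time at equilibrium.
    mean_rate = -sum(freqs[i] * q[i][i] for i in range(n))
    return [[value / mean_rate for value in row] for row in q]


# Hypothetical toy values (three states instead of 20 amino acids).
toy_exch = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 0.5],
    [2.0, 0.5, 0.0],
]
toy_freqs = [0.5, 0.3, 0.2]

toy_q = build_rate_matrix(toy_exch, toy_freqs)
for row in toy_q:
    print(["%.4f" % value for value in row])
```

Because the exchangeabilities are symmetric, the resulting matrix satisfies detailed balance (pi[i] * Q[i][j] equals pi[j] * Q[j][i]), which is why models of this kind can differ in both their frequencies and their exchangeabilities.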
Figure 3
The bubbles display the relative differences between exchangeability coefficients in FLU and HIVb (left), and in FLU and LG (right). On the left side, each bubble represents the value of (FLUij - HIVbij)/(FLUij + HIVbij), where FLUij (HIVbij) is the exchangeability coefficient between amino acids i and j in FLU (HIVb). Values of 1/3 and 2/3 mean that the FLU coefficient is 2 and 5 times as large as that of HIVb, respectively. Values of -1/3 and -2/3 mean that the HIVb coefficient is 2 and 5 times as large as that of FLU, respectively. The right side is read in the same way, but compares FLU with LG.
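The correspondence between the plotted values and the fold differences follows from inverting the formula: if d = (a - b)/(a + b), then a/b = (1 + d)/(1 - d). The short sketch below checks this with exact fractions; the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def relative_difference(a, b):
    """Relative difference (a - b) / (a + b) between two coefficients."""
    return (a - b) / (a + b)


def ratio_from_relative_difference(d):
    """Invert the relative difference to recover the ratio a / b."""
    return (1 + d) / (1 - d)


for d in (Fraction(1, 3), Fraction(2, 3), Fraction(-1, 3), Fraction(-2, 3)):
    print(d, "->", ratio_from_relative_difference(d))
```

A ratio of 1/2 or 1/5 for the negative values corresponds to the second coefficient being 2 or 5 times as large as the first.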
Figure 4
The Robinson-Foulds (RF) distance between trees inferred using FLU and trees inferred using the HIVb, LG, JTT, and HIVw models. The horizontal axis indicates the RF distance between two tree topologies, whereas the vertical axis indicates the number of alignments.
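For readers unfamiliar with the RF distance, the sketch below counts the non-trivial bipartitions (splits) that appear in one unrooted tree but not in the other. It is a minimal illustration, not the tool used in the study: trees are assumed to be given as nested tuples of leaf names, the example trees are made up, and the unnormalized symmetric-difference count is used (some programs report half of this value or a normalized version).

```python
def leaf_set(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions induced by internal edges.

    Each split is stored as the side that does not contain a fixed
    anchor leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if anchor in below else below
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must have the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon trees that differ in one internal grouping.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(tree_a, tree_b))
```

A distance of 0 means the two topologies are identical; larger values mean more internal branches are not shared.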
Figure 5
Flowchart of the procedure used to estimate the influenza-specific amino acid substitution model.
