Understanding the undelaying mechanism of HA-subtyping in the level of physic-chemical characteristics of protein
- PMID: 24809455
- PMCID: PMC4014573
- DOI: 10.1371/journal.pone.0096984
Erratum in
- PLoS One. 2014;9(6):e99921
Abstract
The evolution of the influenza A virus to increase its host range is a major concern worldwide. The molecular mechanisms by which host range increases are largely unknown. Influenza surface proteins play determining roles in reorganization of host sialic acid receptors and host range. In an attempt to uncover the physicochemical attributes that govern HA subtyping, we performed a large-scale functional analysis of over 7000 sequences of 16 different HA subtypes. A large number (896) of physicochemical protein characteristics were calculated for each HA sequence. Then, 10 different attribute weighting algorithms were used to find the key characteristics distinguishing HA subtypes. Furthermore, to discover machine learning models that can predict HA subtypes, various Decision Tree, Support Vector Machine, Naïve Bayes, and Neural Network models were trained on the calculated protein characteristics dataset as well as on 10 trimmed datasets generated by the attribute weighting algorithms. The prediction accuracies of the machine learning methods were evaluated by 10-fold cross-validation. The results highlighted the frequency of Gln (selected by 80% of attribute weighting algorithms), percentage/frequency of Tyr, percentage of Cys, and frequencies of Try and Glu (selected by 70% of attribute weighting algorithms) as the key features associated with HA subtyping. The Random Forest tree induction algorithm and the RBF kernel function of SVM (scaled by grid search) showed a high accuracy of 98% in clustering and predicting HA subtypes based on protein attributes. Decision tree models were successful in monitoring the short mutation/reassortment paths by which influenza virus can gain the key protein structure of another HA subtype and increase its host range in a short period of time with less energy consumption.
Extracting and mining a large number of amino acid attributes of HA subtypes of influenza A virus through supervised algorithms represents a new avenue for understanding and predicting the possible future structure of influenza pandemics.
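As a rough illustration of two steps named in the abstract, the sketch below computes simple amino acid composition features (counts and percentages, such as those reported for Gln, Tyr, and Cys) and builds index splits for 10-fold cross-validation. It is not the authors' pipeline: the 896 attributes, the attribute weighting algorithms, and the classifiers are not reproduced, reading "frequency" as a raw count is an assumption, and the function names and the toy sequence are hypothetical.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return the count and percentage of each standard amino acid."""
    counts = Counter(sequence.upper())
    # Only standard residues contribute to the denominator.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    features = {}
    for aa in AMINO_ACIDS:
        features[f"count_{aa}"] = counts[aa]
        features[f"percent_{aa}"] = 100.0 * counts[aa] / total if total else 0.0
    return features


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    for fold in range(k):
        test = indices[fold::k]  # every k-th sample, offset by the fold number
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# Hypothetical toy fragment for illustration, not a real HA sequence.
toy = "MKQQYCWE"
feats = composition_features(toy)
folds = list(k_fold_indices(25, k=10))
```

In practice each HA sequence would become one feature row, and a classifier would be trained on the `train` indices and scored on the `test` indices of every fold, with the fold accuracies averaged.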
Similar articles
- Identifying discriminative amino acids within the hemagglutinin of human influenza A H5N1 virus using a decision tree. IEEE Trans Inf Technol Biomed. 2008 Nov;12(6):689-95. doi: 10.1109/TITB.2008.896871. PMID: 19000947
- Accurate classification and hemagglutinin amino acid signatures for influenza A virus host-origin association and subtyping. Virology. 2014 Jan 20;449:328-38. doi: 10.1016/j.virol.2013.11.010. Epub 2013 Dec 22. PMID: 24418567
- Prediction of thermostability from amino acid attributes by combination of clustering with attribute weighting: a new vista in engineering enzymes. PLoS One. 2011;6(8):e23146. doi: 10.1371/journal.pone.0023146. Epub 2011 Aug 10. PMID: 21853079 Free PMC article.
- Receptor binding and pH stability - how influenza A virus hemagglutinin affects host-specific virus infection. Biochim Biophys Acta. 2014 Apr;1838(4):1153-68. doi: 10.1016/j.bbamem.2013.10.004. Epub 2013 Oct 24. PMID: 24161712 Review.
- Acid-induced membrane fusion by the hemagglutinin protein and its role in influenza virus biology. Curr Top Microbiol Immunol. 2014;385:93-116. doi: 10.1007/82_2014_393. PMID: 25007844 Free PMC article. Review.
Cited by
- Integration of Cross Species RNA-seq Meta-Analysis and Machine-Learning Models Identifies the Most Important Salt Stress-Responsive Pathways in Microalga Dunaliella. Front Genet. 2019 Aug 29;10:752. doi: 10.3389/fgene.2019.00752. eCollection 2019. PMID: 31555319 Free PMC article.
- Function-based classification of hazardous biological sequences: Demonstration of a new paradigm for biohazard assessments. Front Bioeng Biotechnol. 2022 Oct 7;10:979497. doi: 10.3389/fbioe.2022.979497. eCollection 2022. PMID: 36277394 Free PMC article.
- Computational approaches for classification and prediction of P-type ATPase substrate specificity in Arabidopsis. Physiol Mol Biol Plants. 2016 Jan;22(1):163-74. doi: 10.1007/s12298-016-0351-5. Epub 2016 Apr 7. PMID: 27186030 Free PMC article.
- Unified Transcriptomic Signature of Arbuscular Mycorrhiza Colonization in Roots of Medicago truncatula by Integration of Machine Learning, Promoter Analysis, and Direct Merging Meta-Analysis. Front Plant Sci. 2018 Nov 12;9:1550. doi: 10.3389/fpls.2018.01550. eCollection 2018. PMID: 30483277 Free PMC article.
- Machine learning-based prediction of gastroparesis risk following complete mesocolic excision. Discov Oncol. 2024 Sep 27;15(1):483. doi: 10.1007/s12672-024-01355-9. PMID: 39331201 Free PMC article.