PeerJ Comput Sci. 2024 Apr 25:10:e1980.
doi: 10.7717/peerj-cs.1980. eCollection 2024.

Physicochemical properties-based hybrid machine learning technique for the prediction of SARS-CoV-2 T-cell epitopes as vaccine targets

Syed Nisar Hussain Bukhari et al. PeerJ Comput Sci.

Abstract

Most existing SARS-CoV-2 vaccines work by presenting the whole pathogen, in attenuated form, to the immune system to invoke an immune response. In contrast, a peptide-based vaccine (PBV) relies on identifying and chemically synthesizing only the immunodominant peptides, known as T-cell epitopes (TCEs), to induce a specific immune response against a particular pathogen. PBVs have nevertheless received less attention, despite holding large untapped potential for improving vaccine safety and immunogenicity. Identifying TCEs for PBV design through wet-lab experiments is difficult, expensive, and time-consuming. Machine learning (ML) techniques can predict TCEs accurately, saving time and cost and speeding up vaccine development. This work proposes novel hybrid ML techniques, based on the physicochemical properties of peptides, to predict SARS-CoV-2 TCEs. The proposed techniques were evaluated using various ML model evaluation metrics and demonstrated promising results. The hybrid of a decision tree classifier with a chi-squared feature weighting technique and a forward search algorithm for optimal feature selection was identified as the best model, with an accuracy of 98.19%. Furthermore, K-fold cross-validation (KFCV) was performed to check that the model is reliable; the results indicate that the hybrid random forest model performs consistently well in terms of accuracy relative to the other hybrid approaches. The predicted TCEs are highly likely to serve as promising vaccine targets, subject to both in vivo and in vitro evaluation. This development could potentially save countless lives globally, help prevent future epidemic-scale outbreaks, and reduce the risk of mutation escape.

Keywords: COVID-19; Hybrid technique; Machine learning; Peptide based vaccine; SARS-CoV-2; T-cell epitope.
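
The hybrid idea named in the abstract (chi-squared feature weighting, forward feature search, and K-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. Everything below is an assumption made for illustration: the function names, the median-based binarisation used for the chi-squared score, the synthetic two-feature data, and in particular the nearest-centroid classifier, which stands in for the decision tree used in the paper. It is not the authors' implementation.

```python
import random
from statistics import mean


def chi2_score(feature, labels):
    """Chi-squared statistic of a feature, binarised at its median, against binary labels."""
    cut = sorted(feature)[len(feature) // 2]
    n = len(labels)
    obs = [[0, 0], [0, 0]]  # rows: feature low/high, columns: label 0/1
    for x, y in zip(feature, labels):
        obs[int(x >= cut)][int(y)] += 1
    score = 0.0
    for r in range(2):
        for c in range(2):
            expected = sum(obs[r]) * (obs[0][c] + obs[1][c]) / n
            if expected > 0:
                score += (obs[r][c] - expected) ** 2 / expected
    return score


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def centroid_accuracy(X, y, cols, train, test):
    """Accuracy of a nearest-centroid classifier (stand-in for a decision tree)."""
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [mean(row[c] for row in rows) for c in cols]
    correct = 0
    for i in test:
        dist = {lab: sum((X[i][c] - m) ** 2 for c, m in zip(cols, cen))
                for lab, cen in centroids.items()}
        correct += int(min(dist, key=dist.get) == y[i])
    return correct / len(test)


def cv_accuracy(X, y, cols, k=5):
    """Mean held-out accuracy over k folds, using only the columns in cols."""
    scores = []
    for fold in k_fold_indices(len(y), k):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        scores.append(centroid_accuracy(X, y, cols, train, fold))
    return mean(scores)


def forward_search(X, y, k=5):
    """Visit features in descending chi-squared order; keep each one that raises CV accuracy."""
    ranked = sorted(range(len(X[0])),
                    key=lambda j: chi2_score([row[j] for row in X], y),
                    reverse=True)
    selected, best = [], 0.0
    for j in ranked:
        acc = cv_accuracy(X, y, selected + [j], k)
        if acc > best:
            selected, best = selected + [j], acc
    return selected, best


# Synthetic data (hypothetical): feature 0 separates the classes, feature 1 is noise.
rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[label * 2.0 + rng.gauss(0, 0.3), rng.gauss(0, 1.0)] for label in y]
selected, acc = forward_search(X, y)
print("selected features:", selected, "mean CV accuracy:", round(acc, 3))
```

On real data, the rows of `X` would be physicochemical descriptors computed per peptide and `y` the epitope/non-epitope label; swapping the stand-in classifier for a decision tree changes only `centroid_accuracy`.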

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Proposed methodology.
Figure 2. KFCV technique.
Figure 3. KFCV results of hybrid models.
