Comput Biol Med. 2022 Jul:146:105419.
doi: 10.1016/j.compbiomed.2022.105419. Epub 2022 Apr 25.

A machine learning-based approach to determine infection status in recipients of BBV152 (Covaxin) whole-virion inactivated SARS-CoV-2 vaccine for serological surveys

Prateek Singh  1 Rajat Ujjainiya  1 Satyartha Prakash  2 Salwa Naushin  1 Viren Sardana  1 Nitin Bhatheja  2 Ajay Pratap Singh  1 Joydeb Barman  2 Kartik Kumar  2 Saurabh Gayali  2 Raju Khan  3 Birendra Singh Rawat  4 Karthik Bharadwaj Tallapaka  5 Mahesh Anumalla  5 Amit Lahiri  6 Susanta Kar  6 Vivek Bhosale  6 Mrigank Srivastava  6 Madhav Nilakanth Mugale  6 C P Pandey  6 Shaziya Khan  6 Shivani Katiyar  6 Desh Raj  6 Sharmeen Ishteyaque  6 Sonu Khanka  6 Ankita Rani  6 Promila  6 Jyotsna Sharma  6 Anuradha Seth  6 Mukul Dutta  6 Nishant Saurabh  7 Murugan Veerapandian  8 Ganesh Venkatachalam  8 Deepak Bansal  9 Dinesh Gupta  10 Prakash M Halami  11 Muthukumar Serva Peddha  11 Ravindra P Veeranna  11 Anirban Pal  12 Ranvijay Kumar Singh  13 Suresh Kumar Anandasadagopan  14 Parimala Karuppanan  15 Syed Nasar Rahman  14 Gopika Selvakumar  15 Subramanian Venkatesan  14 Malay Kumar Karmakar  16 Harish Kumar Sardana  17 Anamika Kothari  18 Devendra Singh Parihar  17 Anupma Thakur  17 Anas Saifi  17 Naman Gupta  17 Yogita Singh  17 Ritu Reddu  17 Rizul Gautam  17 Anuj Mishra  17 Avinash Mishra  19 Iranna Gogeri  20 Geethavani Rayasam  21 Yogendra Padwad  22 Vikram Patial  22 Vipin Hallan  22 Damanpreet Singh  22 Narendra Tirpude  22 Partha Chakrabarti  23 Sujay Krishna Maity  23 Dipyaman Ganguly  23 Ramakrishna Sistla  24 Narender Kumar Balthu  25 Kiran Kumar A  25 Siva Ranjith  25 B Vijay Kumar  25 Piyush Singh Jamwal  26 Anshu Wali  26 Sajad Ahmed  26 Rekha Chouhan  26 Sumit G Gandhi  27 Nancy Sharma  27 Garima Rai  27 Faisal Irshad  27 Vijay Lakshmi Jamwal  27 Masroor Ahmad Paddar  27 Sameer Ullah Khan  27 Fayaz Malik  27 Debashish Ghosh  28 Ghanshyam Thakkar  29 S K Barik  30 Prabhanshu Tripathi  31 Yatendra Kumar Satija  32 Sneha Mohanty  31 Md Tauseef Khan  31 Umakanta Subudhi  33 Pradip Sen  34 Rashmi Kumar  34 Anshu Bhardwaj  34 Pawan Gupta  34 Deepak Sharma  34 Amit Tuli  34 Saumya Ray Chaudhuri  34 Srinivasan Krishnamurthi  34 L Prakash  35 Ch V Rao  36 B N 
Singh  36 Arvindkumar Chaurasiya  37 Meera Chaurasiyar  37 Mayuri Bhadange  37 Bhagyashree Likhitkar  37 Sharada Mohite  37 Yogita Patil  37 Mahesh Kulkarni  37 Rakesh Joshi  37 Vaibhav Pandya  38 Sachin Mahajan  38 Amita Patil  38 Rachel Samson  37 Tejas Vare  37 Mahesh Dharne  37 Ashok Giri  37 Sachin Mahajan  38 Shilpa Paranjape  39 G Narahari Sastry  40 Jatin Kalita  40 Tridip Phukan  40 Prasenjit Manna  40 Wahengbam Romi  40 Pankaj Bharali  40 Dibyajyoti Ozah  40 Ravi Kumar Sahu  40 Prachurjya Dutta  40 Moirangthem Goutam Singh  41 Gayatri Gogoi  41 Yasmin Begam Tapadar  41 Elapavalooru Vssk Babu  42 Rajeev K Sukumaran  43 Aishwarya R Nair  44 Anoop Puthiyamadam  43 Prajeesh Kooloth Valappil  43 Adrash Velayudhan Pillai Prasannakumari  43 Kalpana Chodankar  45 Samir Damare  45 Ved Varun Agrawal  46 Kumardeep Chaudhary  1 Anurag Agrawal  1 Shantanu Sengupta  47 Debasis Dash  48

Prateek Singh et al. Comput Biol Med. 2022 Jul.

Abstract

Data science has been an invaluable part of the COVID-19 pandemic response, with applications ranging from tracking viral evolution to understanding vaccine effectiveness. Asymptomatic breakthrough infections have been a major problem in assessing vaccine effectiveness in populations globally. Serological discrimination of vaccine response from infection has so far been limited to Spike protein vaccines, since whole-virion vaccines generate antibodies against all the viral proteins. Here, we show how a statistical and machine learning (ML) based approach can be used to discriminate between SARS-CoV-2 infection and the immune response to an inactivated whole-virion vaccine (BBV152, Covaxin). For this, we assessed serial data on antibodies against Spike and Nucleocapsid antigens, along with age, sex, number of doses taken, and days since last dose, for 1823 Covaxin recipients. An ensemble ML model, incorporating a consensus clustering approach alongside a support vector machine model, was built on 1063 samples where reliable qualifying data existed, and then applied to the entire dataset. Of 1448 self-reported negative subjects, our ensemble ML model classified 724 as infected. For method validation, we determined the relative ability of a random subset of samples to neutralize the Delta versus the wild-type strain using a surrogate neutralization assay. We worked on the premise that antibodies generated by a whole-virion vaccine would neutralize the wild-type strain more efficiently than the Delta strain. In 100 of 156 samples where the ML prediction differed from the self-reported uninfected status, neutralization against the Delta strain was more effective, indicating infection. We found 71.8% of subjects predicted to be infected during the surge, which is concordant with the percentage of sequences classified as Delta (75.6%–80.2%) over the same period. Our approach will help in real-world vaccine effectiveness assessments where whole-virion vaccines are commonly used.

Keywords: BBV152; COVID-19; Covaxin; Ensemble methods; Infection; Machine learning; SARS-CoV-2.

Conflict of interest statement

We declare no conflict of interest.

Figures

Fig. 1
Workflow of the study to identify COVID-19 infection status. Using a consensus of supervised (machine learning) and unsupervised (clustering) approaches, COVID-19 infection status was ascertained in 1063 individuals who provided samples in Phase 3 (P3) and also in Phase 1 or Phase 2 (P1/P2). The final ensemble model was then used to predict COVID-19 infection status for all Covaxin-administered individuals in P3.
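The consensus idea in this workflow can be illustrated with a minimal sketch. It assumes, following the Fig. 3A legend, that a sample keeps its cluster label when two clusterings agree and is marked indeterminate when they disagree. The function names, the label-alignment step, and the toy labels are hypothetical and are not taken from the study; the sketch also assumes both clusterings produce the same small number of clusters.

```python
from itertools import permutations

INDETERMINATE = "indeterminate"


def align_labels(reference, other):
    """Rename the labels in `other` so they agree with `reference` as often as possible.

    Cluster labels are arbitrary, so cluster "x" from one method may correspond
    to cluster 1 from another. Every one-to-one renaming is tried and the one
    with the most agreements is kept (only practical for a few clusters).
    """
    ref_labels = sorted(set(reference))
    other_labels = sorted(set(other))
    if len(other_labels) > len(ref_labels):
        raise ValueError("`other` has more clusters than `reference`")
    best_map, best_agree = None, -1
    for perm in permutations(ref_labels, len(other_labels)):
        mapping = dict(zip(other_labels, perm))
        agree = sum(mapping[o] == r for o, r in zip(other, reference))
        if agree > best_agree:
            best_map, best_agree = mapping, agree
    return [best_map[o] for o in other]


def consensus(labels_a, labels_b):
    """Keep a label where both clusterings agree; otherwise mark the sample indeterminate."""
    aligned_b = align_labels(labels_a, labels_b)
    return [a if a == b else INDETERMINATE for a, b in zip(labels_a, aligned_b)]


# Toy example: two clusterings of six samples that disagree on the third sample.
method_a = [1, 1, 1, 2, 2, 2]
method_b = ["x", "x", "y", "y", "y", "y"]
result = consensus(method_a, method_b)
```

In this toy example only the third sample is marked indeterminate. How the study combines the clustering consensus with the supervised model is not specified in this record, so that step is left out of the sketch.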
Fig. 2
Data structure and antibody level distribution. A) Sample distribution and overlap among the three phases [P1 (June–November 2020), P2 (December 2020–April 2021), P3 (May–August 2021)] of the CSIR Cohort of Covaxin-administered individuals (N = 1823). B) Distribution of antibodies to Nucleocapsid (COI) and Spike (U/mL), shown as density histograms for the 1823 individuals. C) PCA plot of the 1823 Covaxin-administered individuals based on six features: COI, U/mL, age, gender, days since last vaccination, and number of doses. Self-reported COVID-19 infection is depicted in red. D) Sample distribution stratified by self-reported COVID-19 infection status and doses taken (N = 1823). Density-based contours indicate the presence of two subgroups among self-reported not-infected individuals in both the 1-dose and 2-dose groups. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Fig. 3
Development and validation of prediction models. A) Consensus clustering with the k-prototype and VarSelLCM methods (N = 1063). Light brown and blue represent concordance between the two clustering approaches for Cluster 1 and Cluster 2, respectively. Black represents discordance between the two methods; these samples are indeterminate. B) Supervised machine learning (SVM method) based prediction of infection status (N = 1063), further stratified by self-reported COVID-19 infection status and the number of vaccine doses. C) Ensemble ML model-based prediction of COVID-19 infection in all individuals (N = 1823), further stratified by self-reported infection status and the number of vaccine doses. D) Phase 2 seronegative subjects who gave samples in Phase 3, were predicted to be infected by the Ensemble model, and were analyzed using a surrogate virus neutralization assay (sVNT) (N = 39). 71.8% of samples predicted to be infected by the Ensemble model were found to be Delta infected using a variant-specific sVNT assay. Samples were labelled Delta infected when Delta Inhibition % exceeded WT Inhibition % by a margin based on standard error. Samples processed without dilution that had less than 30% inhibition were labelled Delta Not infected. All other data points were labelled Delta Uninfected. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
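The labelling rule described for panel D can be written out as a small sketch. Several details are assumptions rather than facts from the legend: the margin is passed in as a plain number because its derivation from the standard error is not given, the rules are checked in the order the legend lists them, and the 30% cutoff is applied to both the Delta and wild-type inhibition values. The function and parameter names are hypothetical.

```python
def label_svnt(delta_inhibition, wt_inhibition, margin, undiluted=True, low_cutoff=30.0):
    """Assign an sVNT label from Delta and wild-type (WT) inhibition percentages.

    margin     -- assumed numeric stand-in for the standard-error-based margin
    undiluted  -- whether the sample was processed without dilution
    low_cutoff -- inhibition percentage below which an undiluted sample is low
    """
    # Rule 1: Delta inhibition exceeds WT inhibition by more than the margin.
    if delta_inhibition > wt_inhibition + margin:
        return "Delta infected"
    # Rule 2 (assumed to apply to both readings): low inhibition in an undiluted sample.
    if undiluted and max(delta_inhibition, wt_inhibition) < low_cutoff:
        return "Delta Not infected"
    # Rule 3: everything else.
    return "Delta Uninfected"


# Illustrative values only; these are not measurements from the study.
examples = [
    label_svnt(80.0, 60.0, margin=5.0),
    label_svnt(20.0, 25.0, margin=5.0),
    label_svnt(50.0, 70.0, margin=5.0),
]
```

With these illustrative inputs the three calls return one label of each kind. A different rule order or cutoff target would change the outcome for samples that satisfy both of the first two conditions.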
