Viruses. 2023 Feb 28;15(3):645.
doi: 10.3390/v15030645.

Human Genome Polymorphisms and Computational Intelligence Approach Revealed a Complex Genomic Signature for COVID-19 Severity in Brazilian Patients

André Filipe Pastor et al. Viruses.

Abstract

We present a combined genome polymorphism and machine learning approach for the prognosis of severe COVID-19. Ninety-six Brazilian severe COVID-19 patients and controls were genotyped for 296 innate immunity loci. Our model used a feature selection algorithm, recursive feature elimination coupled with a support vector machine (SVM-RFE), to find the optimal subset of loci for classification, followed by a support vector machine with a linear kernel (SVM-LK) to classify patients into the severe COVID-19 group. The best features selected by the SVM-RFE method comprised 12 SNPs in 12 genes: PD-L1, PD-L2, IL10RA, JAK2, STAT1, IFIT1, IFIH1, DC-SIGNR, IFNB1, IRAK4, IRF1, and IL10. In the COVID-19 prognosis step, SVM-LK achieved 85% accuracy, 80% sensitivity, and 90% specificity. In comparison, univariate analysis of the 12 selected SNPs highlighted individual variant alleles associated with risk (PD-L1 and IFIT1) or protection (JAK2 and IFIH1). Variant genotypes carrying risk effects were found in the PD-L2 and IFIT1 genes. The proposed complex classification method can be used to identify individuals at high risk of developing severe COVID-19 outcomes even before infection, which is a disruptive concept in COVID-19 prognosis. Our results suggest that genetic context is an important factor in the development of severe COVID-19.

Keywords: COVID-19 genetics; SARS-CoV-2 infection; complex genomic classifier; machine learning.
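
The two-stage procedure described in the abstract (SVM-RFE to pick a subset of loci, then a linear-kernel SVM on that subset) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are a small synthetic toy set, genotypes are encoded as -1/0/1 (an assumed centered coding that lets the sketch omit a bias term), and the linear SVM is trained with a simple Pegasos-style subgradient method. All function names, sizes, and hyperparameters here are hypothetical. In practice a library implementation such as scikit-learn's `RFE` wrapped around a linear `SVC` would normally be used.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=100, seed=0):
    """Linear SVM (hinge loss + L2 penalty) trained with Pegasos-style
    stochastic subgradient descent. Labels are -1/+1. Returns the weights.
    No bias term is fitted (assumes roughly centered features)."""
    rng = random.Random(seed)
    w, t = [0.0] * len(X[0]), 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # L2 shrinkage
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def svm_rfe(X, y, n_keep):
    """Recursive feature elimination: refit the linear SVM and drop the
    feature with the smallest squared weight until n_keep features remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_active, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active


def predict(w, x):
    """Classify one sample by the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def make_toy_genotypes(n_samples=60, n_loci=8, seed=1):
    """Synthetic data only: the label depends on loci 0 and 5, ties at random."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.choice([-1, 0, 1]) for _ in range(n_loci)]
        score = row[0] + row[5]
        label = 1 if score > 0 else -1 if score < 0 else rng.choice([-1, 1])
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_genotypes()
X_train, y_train, X_test, y_test = X[:45], y[:45], X[45:], y[45:]

# Stage 1: feature selection with SVM-RFE on the training split only.
selected = svm_rfe(X_train, y_train, n_keep=3)

# Stage 2: linear SVM restricted to the selected loci, scored on held-out samples.
w = train_linear_svm([[r[j] for j in selected] for r in X_train], y_train)
y_pred = [predict(w, [r[j] for j in selected]) for r in X_test]
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print("selected loci:", selected, "test accuracy:", accuracy)
```

Running feature selection on the training split only, as in the sketch, keeps the held-out samples from influencing which loci are chosen.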

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
The general impact of each human genotype on the genome polymorphisms/machine learning COVID-19 prognosis classifier under SHAP (Shapley Additive exPlanations) analysis. The impact of each selected genotype (feature) on the model output (mild or severe COVID-19 cases) is shown. In blue, reference homozygotes; in purple, heterozygotes; in pink, alternative homozygotes. The figure depicts SHAP analysis over the test dataset only; some alleles may not appear with all three values.
Figure 2
General performance of our human genome polymorphisms/machine learning COVID-19 prognosis classifier (SVM evaluation with Linear Kernel over 20 test samples: 10 mild cases and 10 severe cases). (A) Evaluation metrics for our complex COVID-19 classifier. (B) Confusion matrix: class 0 stands for mild cases, whereas class 1 stands for severe cases of COVID-19.
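
The reported metrics can be cross-checked against the test-set size stated in this caption. The sketch below assumes the percentages from the abstract (85% accuracy, 80% sensitivity, 90% specificity) are exact over the 10 severe and 10 mild test samples, which implies 8 true positives, 2 false negatives, 9 true negatives, and 1 false positive. These counts are inferred arithmetically, not read from panel (B), and the function name is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of severe cases correctly flagged
    specificity = tn / (tn + fp)  # fraction of mild cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Inferred counts (assumption): class 1 = severe (10 samples), class 0 = mild (10 samples).
tp, fn, tn, fp = 8, 2, 9, 1
accuracy, sensitivity, specificity = binary_metrics(tp, fn, tn, fp)
print(f"accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

With 10 samples per class, each misclassified sample shifts sensitivity or specificity by 10 percentage points, so these estimates carry wide uncertainty.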
Figure 3
The selected SNPs from the proposed genome polymorphisms/machine learning severe COVID-19 classifier in cellular context. The genes carrying the SNPs selected by our classifier are related to the IL-10 and IFN cellular pathways, which are highlighted in the figure (in red). In the viral recognition phase (1), the virus is recognized by receptors, such as DC-SIGNR, and then enters the cell, where the vRNA is identified by intracellular molecules, such as the IRAK4 and MDA5 proteins. Then, in the antiviral signaling phase (2), these molecules activate the expression of genes involved in the antiviral response, such as IL-10 and IFN. On the other hand, PD-L1 and PD-L2 are inhibitors of the IL-10 pathway triggered by DC-SIGNR activation. The IL-10 and IFN proteins are delivered to the extracellular compartment, where they bind to their receptors in the cell plasma membrane, activating the JAK-STAT protein cascade. This cascade triggers the viral blocking phase (3), promoting the expression of antiviral factors that affect viral replication and translation, such as IFIT, as well as positive regulators, including PD-L1 and IRF1. PD-L1, incidentally, activates the expression of other ISGs, maintaining the cell’s antiviral status, while IRF1 is a transcription factor that activates the expression of immune genes, such as IFN, PD-L1 and IFIT.
