Comput Biol Med. 2021 Jul:134:104531.
doi: 10.1016/j.compbiomed.2021.104531. Epub 2021 May 29.

Diagnosis and prediction of COVID-19 severity: can biochemical tests and machine learning be used as prognostic indicators?

Alexandre de Fátima Cobre et al. Comput Biol Med. 2021 Jul.

Abstract

Objective: This study aimed to implement and evaluate machine learning-based models to predict COVID-19 diagnosis and disease severity.

Methods: COVID-19 test samples (positive or negative results) from patients who attended a single hospital were evaluated. Patients diagnosed with COVID-19 were categorised according to the severity of the disease. Data were submitted to exploratory analysis (principal component analysis, PCA) to detect outlier samples, recognise patterns, and identify important variables. Based on patients' laboratory test results, machine learning models were implemented to predict disease positivity and severity. Artificial neural network (ANN), decision tree (DT), partial least squares discriminant analysis (PLS-DA), and K-nearest neighbours (KNN) models were used. The four models were validated using accuracy, measured as the area under the receiver operating characteristic (ROC) curve.
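
The validation metric named in the Methods, the area under the ROC curve, can be illustrated with a minimal sketch. The function name `roc_auc`, the toy labels, and the toy scores below are hypothetical and are not taken from the study. The sketch uses the pairwise formulation: the area equals the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0 (negative) or 1 (positive)
    scores: iterable of classifier scores, higher meaning "more positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not study data).
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The nested loop is quadratic in the number of samples, which is fine for a demonstration. A rank-based implementation would be the usual choice for datasets of the size reported here.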

Results: The first subset of data had 5,643 patient samples (5,086 negatives and 557 positives for COVID-19). The second subset included 557 COVID-19 positive patients. The ANN, DT, PLS-DA, and KNN models allowed the classification of negative and positive samples with >84% accuracy. It was also possible to classify patients with severe and non-severe disease with an accuracy >86%. The following were associated with the prediction of COVID-19 diagnosis and severity: hyperferritinaemia, hypocalcaemia, pulmonary hypoxia, hypoxemia, metabolic and respiratory acidosis, low urinary pH, and high levels of lactate dehydrogenase.
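
The sample counts reported in the Results can be checked with a few lines of arithmetic. This is only a sketch: the variable names are hypothetical, and the positive fraction is not stated in the abstract but derived here from the reported counts.

```python
# Counts as reported in the Results for the diagnostic subset.
n_negative = 5086
n_positive = 557

n_total = n_negative + n_positive      # should match the reported 5,643
prevalence = n_positive / n_total      # derived here, not reported in the abstract

print(f"total samples: {n_total}")
print(f"positive fraction: {prevalence:.1%}")
```

The diagnostic subset is therefore imbalanced, with roughly one positive sample in ten. The ROC curve is built from rates computed within each class, so the area under it does not depend directly on this class ratio.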

Conclusion: Our analysis shows that all the models could assist in the diagnosis and prediction of COVID-19 severity.

Keywords: Blood test; COVID-19; Diagnosis; Machine learning model; Severity; Urine test.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1. Exploratory analysis. Principal component analysis (PCA) model for the discrimination of negative and positive samples (A) and samples from patients with severe and non-severe disease (B).
Fig. 2. Graph of leverage versus studentized residuals for detecting outlier samples. For diagnostic data: outlier analysis of negative samples (A) and positive samples (B). For severity data: outlier analysis of samples from patients with non-severe disease (C) and severe disease (D).
Fig. 3. ROC curves showing the accuracy of the machine learning models. Artificial neural network (ANN): diagnosis (A) and severity (B). Decision tree (DT): diagnosis (C) and severity (D). Partial least squares discriminant analysis (PLS-DA): diagnosis (E) and severity (F). K-nearest neighbours (KNN): diagnosis (G) and severity (H).
