Chaos Solitons Fractals. 2021 Jan:142:110512.
doi: 10.1016/j.chaos.2020.110512. Epub 2020 Nov 28.

Data analysis of Covid-19 pandemic and short-term cumulative case forecasting using machine learning time series methods

Serkan Ballı. Chaos Solitons Fractals. 2021 Jan.

Abstract

The Covid-19 pandemic is the most significant health disaster to have spread across the world in the past eight months, and there is no clear date yet for when it will end. As of 18 September 2020, more than 31 million people had been infected worldwide. Predicting the Covid-19 trend has become a challenging problem. In this study, Covid-19 data from 20/01/2020 to 18/09/2020 for the USA, Germany, and the world were obtained from the World Health Organization. The dataset consists of weekly confirmed cases and weekly cumulative confirmed cases for 35 weeks. The distribution of the most up-to-date weekly case data was examined, and its parameters were estimated for the fitted statistical distributions. Furthermore, a machine learning time series prediction model was proposed to obtain the disease curve and forecast the epidemic trend. Four machine learning methods were used: linear regression, multi-layer perceptron, random forest, and support vector machines (SVM). The methods were compared using the RMSE, APE, and MAPE metrics, and SVM gave the best fit to the trend. According to the estimates, the global pandemic will peak at the end of January 2021, with approximately 80 million people cumulatively infected.
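
To make the evaluation step concrete, the following is a minimal sketch of how a weekly cumulative series might be turned into lagged input windows and scored with RMSE, APE, and MAPE. It is not the author's implementation: the function names, the lag count, the naive baseline, and the numbers are all illustrative assumptions, and the metric definitions are the standard textbook ones, which the abstract does not spell out.

```python
import math

def make_lagged(series, n_lags):
    """Build supervised pairs: each target is preceded by its n_lags previous values."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(series[t - n_lags:t])
        targets.append(series[t])
    return inputs, targets

def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def ape(actual, predicted):
    """Absolute percentage error for each point (actual values must be non-zero)."""
    return [abs(a - p) / abs(a) * 100 for a, p in zip(actual, predicted)]

def mape(actual, predicted):
    """Mean of the per-point absolute percentage errors."""
    errors = ape(actual, predicted)
    return sum(errors) / len(errors)

# Made-up cumulative counts for illustration only; NOT data from the study.
cumulative = [100, 150, 230, 340, 480, 650]
X, y = make_lagged(cumulative, n_lags=2)

# Hypothetical naive baseline: repeat the last weekly increase.
# A regression model such as SVM would be fitted on (X, y) instead.
naive = [window[-1] + (window[-1] - window[-2]) for window in X]

print("RMSE:", rmse(y, naive))
print("APE per week:", [round(e, 2) for e in ape(y, naive)])
print("MAPE:", round(mape(y, naive), 2))
```

Any of the four methods named in the abstract could replace the naive baseline here; the metrics would then be computed on held-out weeks in the same way.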

Keywords: Covid-19; Machine learning; Multi-layer perceptron; Statistical distribution; Support vector machines.

Conflict of interest statement

The author declares that he has no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1. Histogram of weekly cases for global, Germany and USA
Fig. 2. Probability plot for weekly global cases
Fig. 3. Probability plot for weekly cases in Germany
Fig. 4. Probability plot for weekly cases in USA
Fig. 5. Prediction of weekly cumulative global cases
Fig. 6. Prediction of weekly cumulative cases for Germany
Fig. 7. Prediction of weekly cumulative cases for USA
