Sci Rep. 2023 Oct 20;13(1):17953.
doi: 10.1038/s41598-023-44924-8.

A novel bidirectional LSTM deep learning approach for COVID-19 forecasting

Nway Nway Aung et al. Sci Rep.

Abstract

COVID-19 has resulted in significant morbidity and mortality globally. We develop a model that uses data from the thirty days before a fixed time point to forecast the daily number of new COVID-19 cases fourteen days later in the early stages of the pandemic. Various time-dependent factors, including the number of daily confirmed cases, reproduction number, policy measures, mobility and flight numbers, were collected. A deep-learning model using a bidirectional long short-term memory (Bi-LSTM) architecture was trained on data from 22 Jan 2020 to 8 Jan 2021 to forecast the number of new daily COVID-19 cases 14 days in advance across 190 countries, from 9 to 31 Jan 2021. A second model with fewer variables but a similar architecture was also developed. Results were summarised by mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and total absolute percentage error, and were compared against results from a classical ARIMA model. Median MAE was 157 daily cases (IQR: 26-666) under the first model and 150 (IQR: 26-716) under the second. Countries with more accurate forecasts had more daily cases and experienced more waves of COVID-19 infections. Among countries with over 10,000 cases over the prediction period, median total absolute percentage error was 33% (IQR: 18-59%) and 34% (IQR: 16-66%) for the first and second models respectively. Both models had median total absolute percentage errors comparable to those of the classical ARIMA model, but lower maximum total absolute percentage errors. A deep-learning approach using Bi-LSTM architecture and open-source data was validated on 190 countries to forecast the daily number of cases in the early stages of the COVID-19 outbreak. Fewer variables could potentially be used without impacting prediction accuracy.
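
The four summary metrics named in the abstract can be sketched in a few lines of plain Python. This is an illustration only, not the authors' code. MAE, RMSE and MAPE follow their standard definitions. The abstract does not define total absolute percentage error, so the version below (absolute difference between total predicted and total observed cases, as a percentage of total observed cases) is an assumption, as are the function names and the choice to skip zero-case days in MAPE.

```python
import math

def mae(actual, predicted):
    """Mean absolute error, in daily cases."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in daily cases."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error.

    Assumption: days with zero observed cases are skipped to avoid
    division by zero; the abstract does not say how these are handled.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

def total_ape(actual, predicted):
    """Total absolute percentage error over the whole prediction period.

    Assumption: |total predicted - total observed| / total observed, in percent.
    This definition is not given in the abstract.
    """
    return 100.0 * abs(sum(predicted) - sum(actual)) / sum(actual)

# Made-up numbers for illustration, not data from the study.
observed = [100, 200, 300]
forecast = [110, 190, 330]
print(mae(observed, forecast), rmse(observed, forecast))
print(mape(observed, forecast), total_ape(observed, forecast))
```

Under this assumed definition, daily over- and under-predictions can cancel in the total, so the total percentage error can be smaller than MAPE for the same forecast.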

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Use of 30 days prior data to predict the number of new cases 14 days later.
Figure 2
Conceptual framework of the proposed forecasting methods.
Figure 3
Modelling with Bidirectional long short-term memory networks.
Figure 4
Input features used for prediction. Key: flights—daily number of flights; deaths—cumulative number of COVID-19 deaths; confirmed—cumulative number of confirmed cases; recovery—cumulative number of recovered cases; E0_movil—daily reproduction number, Rt, smoothed; E0_estimated—daily reproduction number, Rt; new_tests_smoothed—daily test numbers; new_tests_smoothed_per_thousand—daily test numbers per thousand population; retail_and_recreation, grocery_and_pharmacy, parks, transit_stations, workplaces, residential—mobility data from Google; contact_tracing—level of contact tracing (3 levels); restrictions_internal_movements—restrictions on internal movement during the COVID-19 pandemic (3 levels); containment_index—Containment and Health Index, a composite measure of eleven response metrics; stringency index—Government Stringency Index, a composite measure of nine response metrics; international_travel_controls—government policies on restrictions on international travel (5 levels); facial_coverings—use of face coverings outside the home; stay_home_requirements—government policies on stay-at-home requirements or household lockdowns; cancel_public_events—government policies on the cancellation of public events; school_closures—government policies on school closures.
Figure 5
Ranking scatterplot of 84 countries.
Figure 6
Percentage error scatterplot of 84 countries.
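
The input/target layout described in the Figure 1 caption (30 days of prior data, target 14 days later) can be illustrated with a short sliding-window sketch. This is not the authors' code. The function name `make_windows`, its parameters, and the convention that the "fixed time point" is the last day of the 30-day window are assumptions made for illustration. A single series stands in for the multivariate inputs listed under Figure 4.

```python
def make_windows(series, lookback=30, horizon=14):
    """Pair each `lookback`-day window with the value `horizon` days
    after the window's last day.

    Assumption: the fixed time point is the last day of the window,
    so the target index is (start + lookback - 1) + horizon.
    """
    samples = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        window = series[start:start + lookback]
        target = series[start + lookback - 1 + horizon]
        samples.append((window, target))
    return samples

# Synthetic series of 50 "days" (day index used as the value), for illustration.
daily_cases = list(range(50))
samples = make_windows(daily_cases)
first_window, first_target = samples[0]
print(len(samples), first_window[0], first_window[-1], first_target)
```

A series shorter than 44 days yields no samples under these defaults, since no window has a target 14 days after its last day.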
