Sci Rep. 2023 Oct 20;13(1):17953.

doi: 10.1038/s41598-023-44924-8.

A novel bidirectional LSTM deep learning approach for COVID-19 forecasting

Nway Nway Aung¹, Junxiong Pang²,³, Matthew Chin Heng Chua⁴, Hui Xing Tan⁵

Affiliations

¹ Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, Singapore, 119615, Singapore. nwaynwayaung.lily@gmail.com.
² Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore.
³ Centre for Outbreak Preparedness, SingHealth Duke-NUS Global Health Institute, Duke-NUS Medical School, NUS, Singapore, Singapore.
⁴ Department of Biomedical Informatics, Yong Loo Lin School of Medicine, National University of Singapore, 1E Kent Ridge Road, Singapore, 119228, Singapore.
⁵ Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, Singapore, 119615, Singapore. tan.huixing@gmail.com.

PMID: 37863921
PMCID: PMC10589260
DOI: 10.1038/s41598-023-44924-8

Nway Nway Aung et al. Sci Rep. 2023.

Abstract

COVID-19 has resulted in significant morbidity and mortality globally. We develop a model that uses data from the thirty days before a fixed time point to forecast the daily number of new COVID-19 cases fourteen days later, in the early stages of the pandemic. Various time-dependent factors, including the number of daily confirmed cases, reproduction number, policy measures, mobility and flight numbers, were collected. A deep-learning model using a Bidirectional Long Short-Term Memory (Bi-LSTM) architecture was trained on data from 22 Jan 2020 to 8 Jan 2021 to forecast the daily number of new COVID-19 cases 14 days in advance across 190 countries, from 9 to 31 Jan 2021. A second model with fewer variables but a similar architecture was also developed. Results were summarised by mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and total absolute percentage error, and compared against results from a classical ARIMA model. Median MAE was 157 daily cases (IQR: 26-666) under the first model, and 150 (IQR: 26-716) under the second. Countries with more accurate forecasts had more daily cases and experienced more waves of COVID-19 infections. Among countries with over 10,000 cases over the prediction period, median total absolute percentage error was 33% (IQR: 18-59%) and 34% (IQR: 16-66%) for the first and second models respectively. Both models had comparable median total absolute percentage errors but lower maximum total absolute percentage errors compared with the classical ARIMA model. A deep-learning approach using Bi-LSTM architecture and open-source data was validated on 190 countries to forecast the daily number of cases in the early stages of the COVID-19 outbreak. Fewer variables could potentially be used without impacting prediction accuracy.
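
The 30-day input window and 14-day forecast horizon described above can be illustrated with a minimal sketch. This is not the authors' preprocessing code: the function name `make_windows`, its parameters `lookback` and `horizon`, and the list-of-rows data layout are assumptions made for illustration only.

```python
def make_windows(features, new_cases, lookback=30, horizon=14):
    """Pair each `lookback`-day block of features with the new-case
    count observed `horizon` days after the block's last day.

    features:  list of per-day feature rows (hypothetical layout)
    new_cases: list of daily new-case counts, same length as features
    """
    if len(features) != len(new_cases):
        raise ValueError("features and new_cases must cover the same days")

    samples = []
    # t is the index of the last day inside the input window.
    for t in range(lookback - 1, len(new_cases) - horizon):
        window = features[t - lookback + 1 : t + 1]  # 30 consecutive days
        target = new_cases[t + horizon]              # value 14 days later
        samples.append((window, target))
    return samples


# Toy series: 50 days, one feature per day equal to the day index.
toy_features = [[day] for day in range(50)]
toy_cases = list(range(50))
toy_samples = make_windows(toy_features, toy_cases)
```

With 50 days of data, only windows whose target day still falls inside the series are produced, so the number of usable samples is smaller than the number of days.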

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Use of 30 days prior data to predict the number of new cases 14 days later.

**Figure 2**
Conceptual framework of the proposed forecasting methods.

**Figure 3**
Modelling with Bidirectional long short-term memory networks.

**Figure 4**
Input features used for prediction. Key: flights: daily number of flights; deaths: cumulative number of COVID-19 deaths; confirmed: cumulative number of confirmed cases; recovery: cumulative number of recovered cases; E0_movil: daily reproduction number, Rt, smoothed; E0_estimated: daily reproduction number, Rt; new_tests_smoothed: daily test numbers; new_tests_smoothed_per_thousand: daily test numbers per thousand population; retail_and_recreation, grocery_and_pharmacy, parks, transit_stations, workplaces, residential: mobility data from Google; contact_tracing: level of contact tracing (3 levels); restrictions_internal_movements: restrictions on internal movement during the COVID-19 pandemic (3 levels); containment_index: Containment and Health Index, a composite measure of eleven response metrics; stringency index: Government Stringency Index, a composite measure of nine response metrics; international_travel_controls: government policies on restrictions on international travel (5 levels); facial_coverings: use of face coverings outside the home; stay_home_requirements: government policies on stay-at-home requirements or household lockdowns; cancel_public_events: government policies on the cancellation of public events; school_closures: government policies on school closures.

**Figure 5**
Ranking scatterplot of 84 countries.

**Figure 6**
Percentage error scatterplot of 84 countries.
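
The error measures named in the abstract can be sketched as below. MAE, RMSE and MAPE follow their standard definitions. The abstract does not define total absolute percentage error, so `total_ape` uses an assumed reading (absolute difference between summed forecasts and summed actual cases, relative to summed actual cases); the full paper may define it differently. All function names are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the same units as the data (daily cases)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalises large misses more than MAE."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error over days with non-zero actuals.

    Skipping zero-case days is a choice made here to avoid division by
    zero; it is not taken from the source.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def total_ape(actual, predicted):
    """ASSUMED definition: percentage gap between total forecast cases
    and total actual cases over the whole prediction period."""
    return 100.0 * abs(sum(predicted) - sum(actual)) / sum(actual)


# Toy three-day example (made-up numbers, not study data).
toy_actual = [100, 200, 300]
toy_predicted = [110, 190, 330]
toy_mae = mae(toy_actual, toy_predicted)
toy_rmse = rmse(toy_actual, toy_predicted)
toy_mape = mape(toy_actual, toy_predicted)
toy_total = total_ape(toy_actual, toy_predicted)
```

Under this assumed definition, daily over- and under-predictions can cancel in the total, which is why a country can have a sizeable MAE yet a small total percentage error.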

Cited by

A dynamic ensemble model for short-term forecasting in pandemic situations.
Botz J, Valderrama D, Guski J, Fröhlich H. PLOS Glob Public Health. 2024 Aug 22;4(8):e0003058. doi: 10.1371/journal.pgph.0003058. PMID: 39172923.
