Observational Study

J Med Internet Res. 2025 May 12:27:e67156.

doi: 10.2196/67156.

Population-Wide Depression Incidence Forecasting Comparing Autoregressive Integrated Moving Average and Vector Autoregressive Integrated Moving Average to Temporal Fusion Transformers: Longitudinal Observational Study

Deliang Yang^#¹, Yiyi Tang^#^{1,2}, Vivien Kin Yi Chan³, Qiwen Fang³, Sandra Sau Man Chan⁴, Hao Luo⁵, Ian Chi Kei Wong^{3,6,7}, Huang-Tz Ou^{8,9}, Esther Wai Yin Chan^{3,10}, David Makram Bishai¹¹, Yingyao Chen¹², Martin Knapp¹³, Mark Jit^{10,14,15,16}, Dawn Craig¹⁷, Xue Li^{1,3,10}

Affiliations

¹ Department of Medicine, School of Clinical Medicine, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China (Hong Kong).
² Faculty of Science, University of Hong Kong, Hong Kong, China (Hong Kong).
³ Department of Pharmacology and Pharmacy, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China (Hong Kong).
⁴ Department of Psychiatry, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong).
⁵ School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada.
⁶ School of Pharmacy, Aston University, Birmingham, United Kingdom.
⁷ Advanced Data Analytics for Medical Science (ADAMS) Limited, Hong Kong, China (Hong Kong).
⁸ Institute of Clinical Pharmacy and Pharmaceutical Sciences, College of Medicine, National Cheng Kung University, Tainan, Taiwan.
⁹ Department of Pharmacy, College of Medicine, National Cheng Kung University, Tainan, Taiwan.
¹⁰ Laboratory of Data Discovery for Health (D24H), Hong Kong, China (Hong Kong).
¹¹ Division of Health Economics, Policy and Management, School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China (Hong Kong).
¹² National Health Commission Key Laboratory of Health Technology Assessment, Fudan University, Shanghai, China.
¹³ Care Policy and Evaluation Centre, London School of Economics and Political Science, London, United Kingdom.
¹⁴ Department of Infectious Disease Epidemiology, Faculty of Epidemiology and Population Health, London School of Hygiene & Tropical Medicine, London, United Kingdom.
¹⁵ Department of Global and Environmental Health, School of Global Public Health, New York University, New York, NY, United States.
¹⁶ School of Public Health, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China (Hong Kong).
¹⁷ Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom.

^# Contributed equally.

PMID: 40354111
PMCID: PMC12107200
DOI: 10.2196/67156

Deliang Yang et al. J Med Internet Res. 2025.

Abstract

Background: Accurate prediction of population-wide depression incidence is vital for effective public mental health management. However, incidence is often shaped by abrupt socioeconomic events, such as pandemics, economic crises, and social unrest, which create complex structural breaks in the time-series data. These structural breaks can affect the performance of forecasting methods in different ways, so understanding and comparing models across these scenarios is essential.

Objective: This study aimed to develop depression incidence forecasting models and to compare the performance of autoregressive integrated moving average (ARIMA) and vector-ARIMA (VARIMA) models with that of temporal fusion transformers (TFT) under different structural break scenarios.

Methods: We developed population-wide depression incidence forecasting models and compared the performance of ARIMA- and VARIMA-based methods with that of TFT-based methods. Using monthly depression incidence in Hong Kong from 2002 to 2022, we applied sliding windows to segment the whole time series into 72 ten-year subsamples. The forecasting models were trained, validated, and tested on each subsample. Within each 10-year subsample, the first 7 years were used for training, the eighth year for hold-out validation, and the ninth and tenth years for testing. Accuracy on the testing set of each 10-year subsample was measured by symmetric mean absolute percentage error (SMAPE).
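The window construction described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the `Window` class and `sliding_windows` function are hypothetical names, and the year-by-year sliding step is taken from the Figure 2C caption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """One 10-year subsample, split by calendar year (hypothetical helper)."""
    train_years: range
    val_years: range
    test_years: range


def sliding_windows(first_year=2002, last_year=2022, length=10, n_train=7, n_val=1):
    """Slide a `length`-year window one year at a time over the study period.

    Each window is split into training years, validation years, and the
    remaining years for testing (7 / 1 / 2 with the defaults).
    """
    windows = []
    # The last valid start year is the one whose window ends at `last_year`.
    for start in range(first_year, last_year - length + 2):
        train_end = start + n_train
        val_end = train_end + n_val
        windows.append(
            Window(
                train_years=range(start, train_end),
                val_years=range(train_end, val_end),
                test_years=range(val_end, start + length),
            )
        )
    return windows


windows = sliding_windows()
last = windows[-1]
print(len(windows))  # 12 windows per group
print(list(last.train_years), list(last.val_years), list(last.test_years))
```

With the defaults, the last window trains on 2013-2019, validates on 2020, and tests on 2021-2022, which matches the split quoted in the Figure 4 caption. Applying the same 12 windows to each of the 6 population groups gives the 72 subsamples.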

Results: In subsamples without a significant slope or trend change (structural break), multivariate TFT significantly outperformed univariate TFT, VARIMA, and ARIMA, with an average SMAPE of 11.6% compared to 13.2% (P=.01) for univariate TFT, 16.4% (P=.002) for VARIMA, and 14.8% (P=.003) for ARIMA. Adjusting for the unemployment rate improved TFT performance more than it improved VARIMA performance. In unstable periods, TFT was more robust to sharp interruptions, whereas VARIMA and ARIMA performed better when incidence surged and remained high.
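For readers unfamiliar with the metric behind these percentages, here is a minimal sketch of SMAPE. Several variants of the formula exist and the abstract does not state which one the authors used. This sketch assumes the common form that divides each absolute error by the mean of the absolute actual and forecast values, which bounds the result between 0% and 200%. The function name `smape` and the sample numbers are illustrative only.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0 to 200).

    Assumed variant: mean of |f - a| / ((|a| + |f|) / 2) over all points.
    Points where both values are zero contribute zero error.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        if denom != 0:
            total += abs(f - a) / denom
    return 100 * total / len(actual)


# Made-up monthly incidence values, for illustration only.
observed = [100.0, 120.0, 90.0]
predicted = [110.0, 115.0, 95.0]
print(round(smape(observed, predicted), 2))
```

Because the denominator uses both the observed and the forecast value, swapping the two arguments gives the same score, unlike the ordinary mean absolute percentage error.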

Conclusions: This study provides a comparative evaluation of TFT, ARIMA, and VARIMA models for forecasting depression incidence under various structural break scenarios, offering insights into predicting disease burden during both stable and unstable periods. The findings support a decision-making framework for model selection based on the nature of disruptions and data characteristics. For public health policymaking, the results suggest that TFT may be a more suitable tool for disease burden forecasting during periods of stable burden or when sudden temporary interruptions, such as pandemics or socioeconomic variation, affect disease occurrence.

Keywords: ARIMA; deep learning; depression incidence forecasting; electronic health records; machine learning; medical informatics; population-wide depression incidence; structural break scenarios; temporal fusion transformers; vector-ARIMA.

©Deliang Yang, Yiyi Tang, Vivien Kin Yi Chan, Qiwen Fang, Sandra Sau Man Chan, Hao Luo, Ian Chi Kei Wong, Huang-Tz Ou, Esther Wai Yin Chan, David Makram Bishai, Yingyao Chen, Martin Knapp, Mark Jit, Dawn Craig, Xue Li. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 12.05.2025.

Conflict of interest statement

Conflicts of Interest: XL received research grants from the Research Fund Secretariat of the Health Bureau, Health and Medical Research Fund (HMRF, HKSAR), Health and Medical Research Fund Fellowship Scheme (HMRF Fellowship, HKSAR), Research Grants Council Early Career Scheme (RGC/ECS, HKSAR), and commission grants from the Hospital Authority of Hong Kong; educational and investigator-initiated research funds from Janssen, Pfizer, and Amgen; internal funding from the University of Hong Kong; and consultancy fees from Pfizer, Merck Sharp & Dohme, Open Health, and the Office of Health Economics. She is also the former non-executive director of ADAMS Limited Hong Kong. All of these are outside the submitted work. ICKW reports research funding from the Hong Kong Research Grants Council, the Hong Kong Health and Medical Research Fund, the European Commission, IQVIA, and Amgen outside the submitted work; is a director of Jacobson Medical and Advanced Data Analytics for Medical Science (ADAMS) in Hong Kong and a former director of Therakind Ltd in London and Asia Medicine Regulatory Affairs (AMERA) Services Limited; was a consultant to IQVIA and the World Health Organization; and serves as a member of the Pharmacy and Poisons Board, Hong Kong SAR.

Figures

**Figure 1**
Diagram of the streamlined analytical plan for the comparative study of forecasting models. ARIMA: Auto-Regressive Integrated Moving Average; EMR: electronic medical records; Multivariate TFT: Multivariate Temporal Fusion Transformers; SMAPE: symmetric mean absolute percentage error; Univariate TFT: Univariate Temporal Fusion Transformers; VARIMA: Vector Auto-Regressive Integrated Moving Average.

**Figure 2**
Depression incidence and unemployment rate in Hong Kong between 2002 and 2023. (A) Depression incidence time series for the overall population (age-standardized) and for specific age subgroups: vertical pink dotted lines represent the breakpoints in the time series as indicated by Chow’s test. (B) Unemployment rate time series for the overall 20 years population and for specific age subgroups. (C) Ten-year sub-time-series sample set construction: the full series is segmented into 10-year sub-time series (one window each) by sliding year by year. In the example shown for 2002-2022, there are 12 sliding windows in total. The first 7 years in each sub-time series are the training set, the eighth year is the validation set, and the ninth and tenth years are the testing set. In the database used in this study, we analyzed 72 sub-time-series datasets (12 samples×6 groups) from the overall population and age subgroups. (D) An example of a stable period sample. (E) An example of an unstable period sample with sharp interruptions. (F) An example of an unstable period sample with a level shift.

**Figure 3**
Testing accuracy comparison between models. (A) Stable periods with no breakpoint between training, validation, and testing periods. (B) Unstable periods with one or more breakpoints between training, validation, and testing periods. ARIMA: autoregressive integrated moving average; multiTFT: multivariate temporal fusion transformers; SMAPE: symmetric mean absolute percentage error; uniTFT: univariate temporal fusion transformers; VARIMA: vector autoregressive integrated moving average.

**Figure 4**
Model performance comparison during unstable periods with a sharp interruption or level shift. Training set: 2013-2019; validation set: 2020; testing set: 2021-2022. (A) Model performance comparison during unstable periods with a sharp interruption in 2019: the red circles highlight the last points in the training set, which heavily influenced the prediction output of the ARIMA/VARIMA models due to their autoregression mechanism. These points fall within the sharp interruption period of 2019. (B) Model performance comparison during unstable period with level shift at or after 2020: the red lines indicate the changes in levels occurring at or after 2020. ARIMA: autoregressive integrated moving average; multivariate TFT: multivariate temporal fusion transformers; univariate TFT: univariate temporal fusion transformers; VARIMA: vector autoregressive integrated moving average.
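The Figure 2A caption says the breakpoints were identified with Chow's test. The sketch below shows how the Chow F statistic for one candidate break can be computed. It assumes a simple linear trend (intercept plus slope) is fitted within each segment. The regression specification and significance threshold the authors actually used are not given here. The function names and the synthetic series are hypothetical.

```python
def rss_linear(y):
    """Residual sum of squares of an OLS fit y = a + b*t, with t = 0, 1, 2, ..."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return sum((yi - (intercept + slope * t)) ** 2 for t, yi in enumerate(y))


def chow_f(y, split, k=2):
    """Chow F statistic for a break at index `split`; k = parameters per segment.

    A large value means two separate trend lines fit much better than one
    pooled line, which suggests a structural break at `split`.
    """
    if split <= k or len(y) - split <= k:
        raise ValueError("each segment needs more than k observations")
    rss_pooled = rss_linear(y)
    rss_split = rss_linear(y[:split]) + rss_linear(y[split:])
    dof = len(y) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)


# Synthetic 24-month series: small deterministic wiggle around a flat level.
stable = [10 + ((i * 7) % 5 - 2) * 0.1 for i in range(24)]
# Same series with a level shift of +5 from month 12 onward.
shifted = [v + (5 if i >= 12 else 0) for i, v in enumerate(stable)]

print(chow_f(stable, 12))   # small: no evidence of a break
print(chow_f(shifted, 12))  # large: clear level shift
```

In practice the statistic would be compared with an F distribution with k and n - 2k degrees of freedom to obtain a P value. That step is left out to keep the sketch free of third-party dependencies.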
