PLoS One. 2018 Nov 12;13(11):e0206753.
doi: 10.1371/journal.pone.0206753. eCollection 2018.

Clustering time series based on dependence structure

Beibei Zhang et al. PLoS One. 2018.

Abstract

The clustering of time series has attracted growing research interest in recent years. The most popular clustering methods assume that the time series are only linearly dependent, but this assumption usually fails in practice. To overcome this limitation, we study clustering methods applicable to time series with a general dependence structure. We propose a copula-based distance to measure dissimilarity among time series and consider an estimator of it whose strong consistency is guaranteed. Once the pairwise distance matrix for the time series has been obtained, we apply a hierarchical clustering algorithm to cluster the time series and establish its consistency. Numerical studies, including a large number of simulations and analyses of practical data, show that our method performs well.
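The pipeline described in the abstract (a copula-based distance, then hierarchical clustering on the pairwise distance matrix) can be illustrated with a minimal sketch. The abstract does not give the exact form of the distance or its estimator, so everything below is an assumption made for illustration: the distance is taken to be an L2 difference between rank-based empirical copulas of the lagged pairs (X_t, X_{t+lag}), summed over lags 1 to `max_lag`; the grid size `m`, the average-linkage rule, and all function names are hypothetical choices, not the authors' implementation. The toy data are two groups of AR(1) series with opposite-sign coefficients.

```python
import math
import random


def ranks_to_unit(x):
    """Map a sample to pseudo-observations in (0, 1] using ranks."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    u = [0.0] * len(x)
    for rank, i in enumerate(order, start=1):
        u[i] = rank / len(x)
    return u


def empirical_copula(x, lag, grid):
    """Empirical copula of (X_t, X_{t+lag}) evaluated on grid x grid."""
    u = ranks_to_unit(x[:-lag])
    v = ranks_to_unit(x[lag:])
    n = len(u)
    return [[sum(1 for a, b in zip(u, v) if a <= p and b <= q) / n
             for q in grid] for p in grid]


def copula_distance(x, y, max_lag=1, m=10):
    """Assumed distance: L2 gap between lagged empirical copulas."""
    grid = [(i + 0.5) / m for i in range(m)]
    total = 0.0
    for lag in range(1, max_lag + 1):
        cx = empirical_copula(x, lag, grid)
        cy = empirical_copula(y, lag, grid)
        sq = sum((cx[i][j] - cy[i][j]) ** 2
                 for i in range(m) for j in range(m))
        total += sq / (m * m)  # midpoint-rule approximation of the integral
    return math.sqrt(total)


def average_linkage(dist, k):
    """Agglomerative clustering (average linkage) down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters


def simulate_ar1(phi, n, rng):
    """Toy AR(1) series: x_t = phi * x_{t-1} + standard normal noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


rng = random.Random(0)
# Series 0-2 have positive serial dependence, series 3-5 negative.
series = [simulate_ar1(0.8, 300, rng) for _ in range(3)]
series += [simulate_ar1(-0.8, 300, rng) for _ in range(3)]

dist = [[copula_distance(a, b) for b in series] for a in series]
clusters = average_linkage(dist, k=2)
print(clusters)
```

Because the distance is built from ranks, it is unchanged by strictly increasing transformations of a series, which is what lets a copula-type measure pick up dependence that autocorrelation-based distances can miss. For larger collections, the distance matrix would more plausibly be passed to a library routine such as `scipy.cluster.hierarchy.linkage` than to the hand-written loop above.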

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Example 1. Boxplot of clustering similarity indices. The distances from left to right: dACF, dPACF, dPIC, dM, dP, dLP, dNP, dLNP, dDLS, dLK, dGLK, dISD, Copula(K = 1), Copula(K = 2), Copula(K = 3), Copula(K = 4), and Copula(K = 5).
Fig 2. Example 1. Multidimensional scaling plot.
Fig 3. Example 2. Boxplot of clustering similarity indices: heterogeneity enlarged.
Fig 4. Example 3. Boxplots of clustering similarity indices: adjusting nonlinear strength. The distances from left to right: dACF, dPACF, dPIC, dM, dP, dLP, dNP, dLNP, dDLS, dLK, dGLK, dISD, Copula(K = 1), Copula(K = 2), Copula(K = 3), Copula(K = 4), and Copula(K = 5).
Fig 5. Annual real GDP data analysis. GDP clustering dendrogram based on copula distance with K = 2.
Fig 6. Annual real GDP data analysis. Plot of average silhouette coefficient (K = 2).
Fig 7. Annual real GDP data analysis. Map of the two groups of GDP data based on copula distance (K = 2).
Fig 8. Population growth data analysis. Population growth clustering dendrogram based on copula distance with K = 2.
Fig 9. Population growth data analysis. Plot of average silhouette coefficient (K = 2).

