Int J Environ Res Public Health. 2020 Jul 22;17(15):5286.
doi: 10.3390/ijerph17155286.

A Clustering Approach to Classify Italian Regions and Provinces Based on Prevalence and Trend of SARS-CoV-2 Cases

Andrea Maugeri et al. Int J Environ Res Public Health.

Abstract

While several efforts have been made to control the epidemic of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in Italy, differences between and within regions have made it difficult to plan phase two management after the national lockdown. Here, we propose a simple and immediate clustering approach to categorize Italian regions based on the prevalence and trend of SARS-CoV-2 positive cases prior to the start of phase two on 4 May 2020. Applying both hierarchical and k-means clustering, we identified three regional groups: regions in cluster 1 exhibited higher prevalence and the highest trend of SARS-CoV-2 positive cases; those in cluster 2 constituted an intermediate group; and those in cluster 3 had a lower prevalence and the lowest trend of SARS-CoV-2 positive cases. At the provincial level, we used a similar approach, but based on the prevalence and trend of total SARS-CoV-2 cases. Notably, provinces in cluster 1 exhibited the highest prevalence and trend of SARS-CoV-2 cases. Provinces in clusters 2 and 3, by contrast, showed a median prevalence of approximately 11 cases per 10,000 residents, although provinces in cluster 3 had the lowest trend of cases. K-means clustering yielded an alternative cluster solution in terms of the prevalence and trend of SARS-CoV-2 cases. Our study describes a simple and immediate approach to monitoring the SARS-CoV-2 epidemic at the regional and provincial levels. These findings offer a snapshot of the epidemic at present, which could help outline the hierarchy of needs at the subnational level. However, integrating our approach with further indicators and characteristics could improve our findings and allow its application to different contexts and with additional aims.
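
As an illustration of the two-step idea described in the abstract (hierarchical clustering followed by k-means), the following is a minimal sketch in plain Python. It is not the authors' code: the six data points are made-up (prevalence, trend) pairs for hypothetical regions, the function names are invented for this example, and no scaling or log transformation is applied, which a real analysis would likely need. Ward's criterion is implemented as the increase in within-cluster sum of squares when two clusters merge, and k-means is started from the centroids of the hierarchical solution.

```python
from math import fsum


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(fsum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return fsum((x - y) ** 2 for x, y in zip(a, b))


def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a = [points[m] for m in clusters[i]]
                b = [points[m] for m in clusters[j]]
                # Increase in within-cluster sum of squares if a and b merge.
                cost = (len(a) * len(b) / (len(a) + len(b))
                        * sq_dist(centroid(a), centroid(b)))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


def kmeans_consolidate(points, labels, k, max_iter=100):
    """Lloyd's k-means started from the centroids of the given labels.

    Assumes no cluster becomes empty, which holds for the toy data below.
    """
    for _ in range(max_iter):
        centers = [centroid([p for p, l in zip(points, labels) if l == c])
                   for c in range(k)]
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical (prevalence per 10,000, trend) pairs; NOT data from the study.
points = [(30.0, 2.0), (28.0, 1.8), (15.0, 1.0),
          (14.0, 1.1), (3.0, 0.2), (4.0, 0.3)]

initial = ward_clusters(points, k=3)
labels = kmeans_consolidate(points, initial, k=3)
print(labels)
```

On data this well separated, the k-means step leaves the hierarchical solution unchanged; on real data the two methods can disagree, as the abstract reports for the provincial analysis.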

Keywords: COVID-19; clustering; epidemiology; positive cases.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Description and correlation of regional indicators: (A) the distribution of Italian regions by prevalence of SARS-CoV-2 positive cases, their trend, and the number of tests performed; and (B) the correlations between the prevalence of SARS-CoV-2 positive cases, their trend, and the number of tests performed; the results are reported as Spearman’s correlation coefficient from −1 (in red) to 1 (in green).
Figure 2
Scatter plots of the relationships of the number of tests performed with (A) the prevalence of SARS-CoV-2 positive cases and (B) their trend. Indicators are reported as log-transformed values.
Figure 3
Dendrogram of the hierarchical clustering of regions based on Ward’s criterion.
Figure 4
Scatter plot illustrating how the regional clusters were distributed on the prevalence of SARS-CoV-2 positive cases and their trend. The clustering solution was obtained by hierarchical clustering and consolidated by the k-means algorithm.
Figure 5
Dendrogram of the hierarchical clustering of the provinces based on Ward’s criterion.
Figure 6
Scatter plot illustrating how provincial clusters were distributed on the prevalence of SARS-CoV-2 cases and their trend: (A) the clustering solution obtained by hierarchical clustering; and (B) the clustering solution obtained using the k-means algorithm.

