Int J Health Geogr. 2017 May 15;16(1):19.
doi: 10.1186/s12942-017-0093-9.

Visualizing statistical significance of disease clusters using cartograms


Barry J Kronenfeld et al. Int J Health Geogr.

Abstract

Background: Health officials and epidemiological researchers often use maps of disease rates to identify potential disease clusters. Because these maps exaggerate the prominence of low-density districts and hide potential clusters in urban (high-density) areas, many researchers have used density-equalizing maps (cartograms) as a basis for epidemiological mapping. However, no guidelines currently exist for visually assessing statistical uncertainty on such maps. To address this shortcoming, we develop techniques for visually determining the statistical significance of clusters spanning one or more districts on a cartogram. The techniques are developed within a geovisual analytics framework that does not rely on automated significance testing and can therefore facilitate visual analysis to detect clusters that automated techniques might miss.

Results: On a cartogram of the at-risk population, the statistical significance of a disease cluster can be determined from the rate, area and shape of the cluster under standard hypothesis testing scenarios. We develop formulae to determine, for a given rate, the area required for statistical significance of a priori and a posteriori designated regions under certain test assumptions. Uniquely, our approach enables dynamic inference for aggregate regions formed by combining individual districts. The method is implemented in interactive tools that provide choropleth mapping, automated legend construction and dynamic search tools to facilitate cluster detection and assessment of the validity of tested assumptions. A case study of leukemia incidence in California demonstrates the ability to visually distinguish between statistically significant and non-significant regions.
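The relationship between rate and required area can be illustrated with a minimal sketch. It is not the formula derived in the paper: it assumes a simple one-sided Poisson test of an a priori designated region against a known baseline rate, and it relies on the fact that area on an at-risk population cartogram is proportional to person-years. The function names `poisson_sf` and `min_population_for_significance`, the search step and the example rates are all hypothetical.

```python
import math


def poisson_sf(k, mu):
    """Return P(X >= k) for X ~ Poisson(mu), with mu > 0."""
    if k <= 0:
        return 1.0
    # Sum the lower tail P(X <= k - 1) in log space to avoid underflow.
    lower = sum(
        math.exp(-mu + i * math.log(mu) - math.lgamma(i + 1))
        for i in range(k)
    )
    return max(0.0, 1.0 - lower)


def min_population_for_significance(region_rate, baseline_rate,
                                    alpha=0.05, step=1000,
                                    max_population=10_000_000):
    """Smallest at-risk population (in multiples of `step`) at which a
    region with `region_rate` is significantly above `baseline_rate`.

    Assumption: the baseline rate is treated as known, and the region
    is designated a priori (no multiple-testing adjustment).
    Returns None if no such population is found up to `max_population`.
    """
    if region_rate <= baseline_rate:
        return None
    population = step
    while population <= max_population:
        observed = round(region_rate * population)
        expected = baseline_rate * population
        if observed > 0 and poisson_sf(observed, expected) <= alpha:
            return population
        population += step
    return None


# Hypothetical rates per person-year, for illustration only.
baseline = 13e-5
for rate in (20e-5, 30e-5):
    n = min_population_for_significance(rate, baseline)
    print(f"rate {rate:.5f}: minimum population {n} person-years")
```

Multiplying the returned population by the cartogram's area-per-person scale would give a minimum area of the kind shown in a map legend. Under these assumptions, a higher observed rate needs a smaller area to reach significance, which is the intuition the visual approach relies on.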

Conclusion: The proposed geovisual analytics approach enables intuitive visual assessment of statistical significance of arbitrarily defined regions on a cartogram. Our research prompts a broader discussion of the role of geovisual exploratory analyses in disease mapping and the appropriate framework for visually assessing the statistical significance of spatial clusters.

Keywords: Cartograms; Density equalizing maps; Disease mapping; Geovisual analytics; Scan statistics.


Figures

Fig. 1
Ordinary map (a) and at-risk population cartogram (b) of California counties. County sizes on the cartogram are proportional to SEER database age-adjusted leukemia at-risk population values, reported in person-years. Selected cities are shown for reference
Fig. 2
Standard choropleth map of leukemia incidence in California counties, 2008–2013. Counties highlighted in red outline have rates that are significantly higher than the remainder of the state according to the a priori test of significance at the 0.05 significance level
Fig. 3
Geovisual analysis environment with a priori test hypothesis and 0.05 significance threshold selected. Legend shows the minimum area for which the observed rate of an a priori designated map region would be significantly higher than the rest of the state
Fig. 4
Geovisual analysis environment with global scan hypothesis and 0.05 significance threshold selected. Legend shows the minimum area for which the presence of a square region with the given rate would indicate a statistically significant event cluster
Fig. 5
Illustration of two dynamically placed moving scan windows. Reported significance is calculated under the assumption of an a priori designated region. Population sizes of the two regions (A and B) are in person-years
Fig. 6
Illustration of two user-defined aggregate regions. Reported significance is calculated using the global scan test under the assumption of a square scan window

