A simple method to generate equal-sized homogenous strata or clusters for population-based sampling
- PMID: 21376276
- PMCID: PMC3073640
- DOI: 10.1016/j.annepidem.2010.11.016
Abstract
Purpose: Statistical efficiency and cost efficiency can be achieved in population-based samples through stratification and/or clustering. Strata typically combine subgroups of the population that are similar with respect to an outcome. Clusters are often taken from preexisting units, but may be formed to minimize between-cluster variance, or to equalize exposure to a treatment or risk factor. Area probability sample design procedures for the National Children's Study required contiguous strata and clusters that maximized within-stratum and within-cluster homogeneity while maintaining approximately equal size of the strata or clusters. However, there were few methods that allowed such strata or clusters to be constructed under these contiguity and equal size constraints.
Methods: A search algorithm generates equal-size cluster sets that approximately span the space of all possible clusters of equal size. An optimal cluster set is chosen based on analysis of variance and convexity criteria.
Results: The proposed algorithm is used to construct 10 strata based on demographics and air pollution measures in Kent County, MI, following census tract boundaries. A brief simulation study is also conducted.
Conclusions: The proposed algorithm is effective at uncovering underlying clusters from noisy data. It can be used in multi-stage sampling where equal-size strata or clusters are desired.
Copyright © 2011 Elsevier Inc. All rights reserved.
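Illustrative sketch (not the authors' implementation): the Methods paragraph describes generating many candidate sets of equal-size clusters and then selecting one by an analysis-of-variance criterion. The abstract does not specify the search algorithm, so the Python example below swaps in a much simpler stand-in: it slices the units into equal-size strips along many directions, then keeps the candidate with the largest share of variance explained (between-group sum of squares over total sum of squares). The function names, the toy grid data, and the use of strips (which are convex by construction, so no separate convexity check is applied) are all assumptions made for illustration.

```python
import numpy as np


def equal_size_slices(coords, k, angle):
    """Hypothetical candidate generator: cut units into k equal-size strips.

    Units are ordered by their projection onto the direction given by
    `angle`, then split into k consecutive chunks of (near-)equal size.
    """
    direction = np.array([np.cos(angle), np.sin(angle)])
    order = np.argsort(coords @ direction, kind="stable")
    labels = np.empty(len(coords), dtype=int)
    for group, chunk in enumerate(np.array_split(order, k)):
        labels[chunk] = group
    return labels


def anova_r2(values, labels):
    """Between-group sum of squares divided by total sum of squares."""
    grand_mean = values.mean()
    total_ss = ((values - grand_mean) ** 2).sum()
    between_ss = sum(
        (labels == g).sum() * (values[labels == g].mean() - grand_mean) ** 2
        for g in np.unique(labels)
    )
    return between_ss / total_ss


def best_partition(coords, values, k, n_candidates=180):
    """Score every candidate slicing and return the most homogeneous one."""
    angles = np.linspace(0.0, np.pi, n_candidates, endpoint=False)
    candidates = [equal_size_slices(coords, k, a) for a in angles]
    scores = [anova_r2(values, labels) for labels in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]


# Toy data (an assumption, not from the paper): a 10 x 10 grid of areal
# units whose outcome rises from west to east, plus noise.
rng = np.random.default_rng(0)
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
values = coords[:, 0] + rng.normal(scale=0.5, size=len(coords))

labels, score = best_partition(coords, values, k=5)
print(np.bincount(labels), round(score, 3))
```

A real application would replace the strip generator with a search over contiguous groupings of census tracts and would score several variables at once (for example, demographics and air pollution measures); the selection step would keep the same shape.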