Estimating disease prevalence using census data
- PMID: 18047747
- PMCID: PMC2870920
- DOI: 10.1017/S0950268807009752
Abstract
We describe a method that uses publicly available data to estimate disease prevalence in small geographic areas, with Helicobacter pylori as a model infection. Using data from the Third National Health and Nutrition Examination Survey, risk parameters for H. pylori infection were obtained by logistic regression and validated in an independent cohort, where the model predicted 737.5 infections against 736 observed. The prevalence of H. pylori infection in the San Francisco Bay Area was then estimated from the probabilities given by the predictive logistic model, with the risk parameters applied to individual-level 1990 U.S. Census data. Predicted H. pylori prevalence was compared with gastric cancer incidence obtained from the Northern California Cancer Center; it correlated positively with gastric cancer incidence (P < 0.001, R² = 0.87) and showed no statistically significant association with other malignancies. Because they rely exclusively on publicly available data, these methods may be applied to selected conditions with strong demographic predictors.
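The estimation step described in the abstract can be sketched as follows: a logistic model turns each individual's demographic attributes into an infection probability, and summing those probabilities over the records in an area gives the expected number of infections and hence the prevalence. This is only a minimal illustration. The coefficient values, the predictor names (`age_per_decade`, `low_income`), and the sample records are hypothetical and are not the risk parameters or census variables used in the study.

```python
import math

# Hypothetical coefficients for illustration only; NOT the fitted values from the study.
COEFS = {"intercept": -2.0, "age_per_decade": 0.3, "low_income": 0.8}


def infection_probability(age_years, low_income):
    """Logistic model: p = 1 / (1 + exp(-z)), with z a linear score of the predictors."""
    z = (
        COEFS["intercept"]
        + COEFS["age_per_decade"] * (age_years / 10.0)
        + COEFS["low_income"] * (1.0 if low_income else 0.0)
    )
    return 1.0 / (1.0 + math.exp(-z))


def expected_prevalence(records):
    """Return (expected number of infections, estimated prevalence) for one area."""
    probs = [infection_probability(r["age"], r["low_income"]) for r in records]
    expected = sum(probs)  # expected count is the sum of individual probabilities
    return expected, expected / len(probs)


# Made-up individual-level records standing in for census microdata of one small area.
area = [
    {"age": 25, "low_income": False},
    {"age": 60, "low_income": True},
    {"age": 40, "low_income": False},
]

expected, prevalence = expected_prevalence(area)
print(f"expected infections: {expected:.2f}, prevalence: {prevalence:.2%}")
```

The same summation supports the kind of validation the abstract reports: applied to a cohort with known infection status, the expected count can be compared directly with the observed count.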