Epidemiol Infect. 2008 Sep;136(9):1253-60.
doi: 10.1017/S0950268807009752. Epub 2007 Nov 30.

Estimating disease prevalence using census data

M Choy et al. Epidemiol Infect. 2008 Sep.

Abstract

We describe a method that uses publicly available data to estimate disease prevalence in small geographic areas, with Helicobacter pylori as a model infection. Using data from the Third National Health and Nutrition Examination Survey, risk parameters for H. pylori infection were obtained by logistic regression and validated in an independent cohort, where the model predicted 737.5 infections against 736 observed. The prevalence of H. pylori infection in the San Francisco Bay Area was then estimated from the probabilities produced by the predictive logistic model, with the risk parameters applied to individual-level 1990 U.S. Census data. Predicted H. pylori prevalence was also compared with gastric cancer incidence obtained from the Northern California Cancer Center; it showed a positive correlation with gastric cancer incidence (P<0.001, R²=0.87) and no statistically significant association with other malignancies. Because they rely exclusively on publicly available data, these methods may be applied to selected conditions with strong demographic predictors.
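The prediction step in the abstract can be illustrated with a minimal sketch. It assumes a logistic model whose per-person probabilities are summed to give an expected number of infections, which is one natural reading of a non-integer prediction such as 737.5. The coefficients, the two predictors (age and foreign birth), and the sample records are all hypothetical; the abstract does not list the fitted risk parameters.

```python
import math

# Hypothetical coefficients for illustration only. The fitted risk
# parameters from the survey data are not given in the abstract.
COEFFS = {"intercept": -2.0, "age_per_decade": 0.3, "foreign_born": 1.1}


def infection_probability(age_years, foreign_born):
    """Logistic-model probability of infection for one individual."""
    z = (
        COEFFS["intercept"]
        + COEFFS["age_per_decade"] * (age_years / 10.0)
        + COEFFS["foreign_born"] * (1.0 if foreign_born else 0.0)
    )
    return 1.0 / (1.0 + math.exp(-z))


def expected_infections(records):
    """Expected number of infections: the sum of individual probabilities."""
    return sum(infection_probability(age, fb) for age, fb in records)


def predicted_prevalence(records):
    """Predicted prevalence: expected infections divided by population size."""
    return expected_infections(records) / len(records)


# Made-up individual-level records: (age in years, foreign-born flag).
records = [(25, False), (40, True), (63, False), (71, True)]

print(f"expected infections: {expected_infections(records):.2f}")
print(f"predicted prevalence: {predicted_prevalence(records):.2%}")
```

Applied to census records grouped by county, the same two functions would yield one predicted prevalence per county.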

Figures

Fig. 1
ROC curve for validation dataset. Sensitivity and specificity refer to the prediction of H. pylori infection using demographic risk factors with serology as the reference standard. The black dot (●) represents the greatest balance between sensitivity and specificity. The area under the curve is 0·69.
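The caption does not say how the point of greatest balance was chosen. As a hedged illustration, the sketch below uses Youden's J statistic (sensitivity + specificity − 1), a common criterion that may differ from the authors' choice. The scores and labels are made up.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A case is predicted positive when its score is >= the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        points.append((t, tp / pos, tn / neg))
    return points


def best_balance(points):
    """Pick the point maximizing Youden's J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1.0)


# Made-up predicted probabilities and serology results (1 = infected).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 0, 1, 1]

threshold, sensitivity, specificity = best_balance(roc_points(scores, labels))
print(threshold, sensitivity, specificity)
```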
Fig. 2
Map of study area. Percentages indicate the predicted prevalence of H. pylori infection in that county.
Fig. 3
Regression plots of H. pylori and age-adjusted gastric cancer rates. Each open symbol (○) represents a county, with the size of the symbol proportional to the population of the county. The regression was weighted by population and has the equation: gastric cancer rate = −11·76 + 63·172 × H. pylori prevalence.
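A small sketch of how the Fig. 3 equation could be evaluated and how a population-weighted line of this kind could be fitted. It reads the caption's raised decimal points as ordinary decimals and assumes prevalence is entered as a proportion (0 to 1); the caption does not state the units of either variable. The county values and populations are invented.

```python
INTERCEPT = -11.76   # from the Fig. 3 caption
SLOPE = 63.172       # from the Fig. 3 caption


def predicted_rate(prevalence):
    """Gastric cancer rate from the caption's equation.

    Assumes prevalence is a proportion between 0 and 1 (an assumption).
    """
    return INTERCEPT + SLOPE * prevalence


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Invented county data placed exactly on the caption's line, to check the fit.
prevalence = [0.22, 0.25, 0.28, 0.31]
population = [250_000, 700_000, 1_200_000, 400_000]
rates = [predicted_rate(p) for p in prevalence]

a, b = weighted_linear_fit(prevalence, rates, population)
print(f"intercept={a:.2f}, slope={b:.3f}")
```

With real county data the points would scatter around the line, and the population weights would pull the fit toward the larger counties.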
