Drawing statistical inferences from historical census data, 1850-1950
- PMID: 19771946
- PMCID: PMC2831349
- DOI: 10.1353/dem.0.0062
Abstract
Virtually all quantitative microdata used by social scientists derive from samples that incorporate clustering, stratification, and weighting adjustments (Kish 1965, 1992). Such data can yield standard error estimates that differ dramatically from those derived from a simple random sample of the same size. Researchers using historical U.S. census microdata, however, usually apply methods designed for simple random samples. The resulting p values and confidence intervals could be inaccurate and could lead to erroneous research conclusions. Because U.S. census microdata samples are among the most widely used sources for social science and policy research, the need for reliable standard error estimation is critical. We evaluate the historical microdata samples of the Integrated Public Use Microdata Series (IPUMS) project from 1850 to 1950 in order to determine (1) the impact of sample design on standard error estimates, and (2) how to apply modern standard error estimation software to historical census samples. We exploit a unique new data source from the 1880 census to validate our methods for standard error estimation, and then we apply this approach to the 1850-1870 and 1900-1950 decennial censuses. We conclude that Taylor series estimation can be used effectively with the historical decennial census microdata samples and should be applied in research analyses that have the potential for substantial clustering effects.
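The contrast the abstract draws between simple-random-sample formulas and Taylor series estimation can be illustrated with a minimal sketch. This is not the authors' code or the IPUMS procedure. It assumes a single stratum, equal weights, no finite population correction, and clusters that stand in for households. The function names `srs_se` and `taylor_cluster_se` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def srs_se(values):
    """Standard error of a mean under the simple-random-sample assumption."""
    n = len(values)
    mean = sum(values) / n
    sample_var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(sample_var / n)


def taylor_cluster_se(values, cluster_ids):
    """Taylor series (linearization) standard error of a mean when
    observations are sampled in clusters.

    Assumptions (illustrative only): one stratum, equal weights,
    no finite population correction.
    """
    n = len(values)
    mean = sum(values) / n
    # Linearized cluster totals: each cluster's summed deviations from the
    # overall mean, scaled by the total sample size.
    totals = defaultdict(float)
    for v, c in zip(values, cluster_ids):
        totals[c] += (v - mean) / n
    k = len(totals)
    # Between-cluster variance of the linearized totals (they sum to zero).
    variance = k / (k - 1) * sum(z ** 2 for z in totals.values())
    return math.sqrt(variance)


# Toy example: four two-person households whose members share the same value.
values = [1, 1, 0, 0, 1, 1, 0, 0]
households = [1, 1, 2, 2, 3, 3, 4, 4]

se_srs = srs_se(values)
se_cluster = taylor_cluster_se(values, households)
design_effect = (se_cluster / se_srs) ** 2
print(se_srs, se_cluster, design_effect)
```

In this toy case the clustered standard error is larger than the simple-random-sample one because household members are perfectly alike. When every observation is its own cluster, the two functions return the same value.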