Review
Epidemiology. 2011 May;22(3):382-9.
doi: 10.1097/EDE.0b013e3182125cff.

Designs for the combination of group- and individual-level data

Sebastien Haneuse et al. Epidemiology. 2011 May.

Abstract

Background: Studies of ecologic or aggregate data suffer from a broad range of biases when scientific interest lies with individual-level associations. To overcome these biases, epidemiologists can choose from a range of designs that combine these group-level data with individual-level data. The individual-level data provide information to identify, evaluate, and control bias, whereas the group-level data are often readily accessible and provide gains in efficiency and power. Within this context, the literature on developing models, particularly multilevel models, is well-established, but little work has been published to help researchers choose among competing designs and plan additional data collection.

Methods: We review recently proposed "combined" group- and individual-level designs and methods that collect and analyze data at 2 levels of aggregation. These include aggregate data designs, hierarchical related regression, two-phase designs, and hybrid designs for ecologic inference.

Results: The various methods differ in (i) the data elements available at the group and individual levels and (ii) the statistical techniques used to combine the 2 data sources. Implementing these techniques requires care, and it may often be simpler to ignore the group-level data once the individual-level data are collected. A simulation study, based on birth-weight data from North Carolina, is used to illustrate the benefit of incorporating group-level information.

Conclusions: Our focus is on settings where there are individual-level data to supplement readily accessible group-level data. In this context, no single design is ideal. Choosing which design to adopt depends primarily on the model of interest and the nature of the available group-level data.
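One source of the ecologic bias mentioned in the Background can be sketched numerically. The example below is an illustration only, not the authors' simulation study and not based on the North Carolina data. It assumes a hypothetical binary exposure, a log-linear individual-level risk model with made-up coefficients `B0` and `B1`, and five hypothetical groups that differ only in exposure prevalence. Under those assumptions, regressing the log of the group-level risk on prevalence does not recover the individual-level coefficient, whereas a model that respects how individual risks aggregate does.

```python
import math


def ols(xs, ys):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical individual-level model: risk(x) = exp(B0 + B1 * x), x in {0, 1}.
B0 = math.log(0.05)  # assumed baseline risk of 5%
B1 = math.log(2.0)   # assumed individual-level log relative risk

# Hypothetical exposure prevalence in each of five groups.
prevalences = [0.1, 0.2, 0.3, 0.4, 0.5]


def group_risk(q):
    """Expected group-level risk: average of individual risks in the group."""
    return (1 - q) * math.exp(B0) + q * math.exp(B0 + B1)


rates = [group_risk(q) for q in prevalences]

# Naive ecologic analysis: apply the individual-level (log-linear) form
# directly to the group-level data.
_, naive_slope = ols(prevalences, [math.log(r) for r in rates])

# Aggregation-consistent analysis: the group risk is linear in prevalence,
# rate = a + c * q with a = exp(B0) and c = exp(B0) * (exp(B1) - 1).
a, c = ols(prevalences, rates)
recovered_b1 = math.log(1 + c / a)

print("individual-level B1:      ", B1)
print("naive ecologic slope:     ", naive_slope)
print("aggregation-consistent B1:", recovered_b1)
```

In this idealized setting the group-level data alone suffice once the model is specified correctly. With within-group confounding or unknown joint distributions of exposures, that is no longer true, which is the situation the combined designs reviewed here aim to address by adding individual-level data.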

Figures

Figure 1. County-specific outcome and exposure data for the North Carolina low birth weight data.
