Proc Natl Acad Sci U S A. 2010 Apr 13;107(15):6743-7.
doi: 10.1073/pnas.1000261107. Epub 2010 Mar 29.

Assessing respondent-driven sampling

Sharad Goel et al. Proc Natl Acad Sci U S A.

Abstract

Respondent-driven sampling (RDS) is a network-based technique for estimating traits in hard-to-reach populations, for example, the prevalence of HIV among drug injectors. In recent years RDS has been used in more than 120 studies in more than 20 countries and by leading public health organizations, including the Centers for Disease Control and Prevention in the United States. Despite the widespread use and growing popularity of RDS, there has been little empirical validation of the methodology. Here we investigate the performance of RDS by simulating sampling from 85 known network populations. Across a variety of traits we find that RDS is substantially less accurate than generally acknowledged and that reported RDS confidence intervals are misleadingly narrow. Moreover, because we model a best-case scenario in which the theoretical RDS sampling assumptions hold exactly, it is unlikely that RDS performs any better in practice than in our simulations. Notably, the poor performance of RDS is driven not by the bias but by the high variance of estimates, a possibility that had been largely overlooked in the RDS literature. Given the consistency of our results across networks and our generous sampling conditions, we conclude that RDS as currently practiced may not be suitable for key aspects of public health surveillance where it is now extensively applied.
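
To make the simulation idea concrete, the following is a minimal sketch, not the authors' code. It assumes the idealized RDS model is a with-replacement random walk on an undirected network, and that the estimator weights each respondent by the inverse of their degree. The toy network, the trait, and every function name are hypothetical and chosen only for illustration.

```python
import random


def toy_network(size=30):
    """Hypothetical network: a ring plus chords that turn node 0 into a hub."""
    adj = {v: [(v - 1) % size, (v + 1) % size] for v in range(size)}
    for v in range(2, size - 1, 3):
        adj[0].append(v)
        adj[v].append(0)
    return adj


def random_walk_sample(adj, n, rng):
    """Draw n nodes by a with-replacement random walk (assumed idealized RDS)."""
    node = rng.choice(sorted(adj))  # random seed respondent, not recorded
    sample = []
    for _ in range(n):
        node = rng.choice(adj[node])  # each respondent recruits one neighbor
        sample.append(node)
    return sample


def inverse_degree_estimate(sample, adj, trait):
    """Weight each sampled node by 1/degree to offset oversampling of hubs."""
    num = sum(trait[v] / len(adj[v]) for v in sample)
    den = sum(1 / len(adj[v]) for v in sample)
    return num / den


def unweighted_mean(sample, trait):
    """Plain sample proportion, ignoring how the sample was collected."""
    return sum(trait[v] for v in sample) / len(sample)


adj = toy_network()
# Hypothetical binary trait: 1 for nodes directly tied to the hub.
trait = {v: int(0 in adj[v]) for v in adj}
truth = sum(trait.values()) / len(trait)

rng = random.Random(0)
weighted, unweighted = [], []
for _ in range(200):  # repeat the simulated study many times
    s = random_walk_sample(adj, 50, rng)
    weighted.append(inverse_degree_estimate(s, adj, trait))
    unweighted.append(unweighted_mean(s, trait))

print("true prevalence:", truth)
print("mean weighted estimate:", sum(weighted) / len(weighted))
print("mean unweighted estimate:", sum(unweighted) / len(unweighted))
```

Repeating the simulated study many times on a network where the truth is known is what allows the spread of the estimates, and not only their average, to be examined.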

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Design effects for the 13 binary traits in Project 90 (A) and the 46 binary traits in Add Health (B); in Add Health, each circle indicates the design effect of a trait in one of 84 schools.
Fig. 2.
Coverage rates of nominal 95% RDS confidence intervals in Project 90 (A) and Add Health (B). For each trait in Add Health, the mean coverage over all 84 schools is shown.
Fig. 3.
Comparison of bias, standard error, and RMSE between the RDS estimator and the unweighted mean of the RDS sample in Project 90 (A–C) and Add Health (D–F). Each point corresponds to a given trait in a given network population.
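
The quantities named in these captions can be computed from repeated simulated estimates roughly as follows. This is a hedged sketch under stated assumptions: the design effect is taken as the variance of the estimator divided by the variance of a simple random sample of the same size, with no finite-population correction, which may differ from the exact definition used in the paper. The function names and the numbers in the usage lines are hypothetical.

```python
import math
from statistics import mean, pvariance


def summarize(estimates, truth, n):
    """Bias, standard error, RMSE, and design effect of repeated estimates.

    estimates: one estimate per simulated study
    truth:     known population proportion (strictly between 0 and 1)
    n:         sample size of each simulated study
    """
    bias = mean(estimates) - truth
    variance = pvariance(estimates)
    rmse = math.sqrt(mean((e - truth) ** 2 for e in estimates))
    srs_variance = truth * (1 - truth) / n  # assumed SRS benchmark
    return {
        "bias": bias,
        "se": math.sqrt(variance),
        "rmse": rmse,  # satisfies rmse**2 == bias**2 + se**2
        "design_effect": variance / srs_variance,
    }


def coverage_rate(intervals, truth):
    """Share of (low, high) confidence intervals that contain the truth."""
    hits = sum(1 for low, high in intervals if low <= truth <= high)
    return hits / len(intervals)


# Hypothetical numbers, for illustration only.
stats = summarize([0.2, 0.4], truth=0.25, n=10)
rate = coverage_rate([(0.1, 0.3), (0.3, 0.5), (0.2, 0.26)], truth=0.25)
print(stats)
print(rate)
```

Because the squared RMSE splits into squared bias plus variance, an estimator can have near-zero bias and still a large RMSE when its variance is high, which is the pattern the abstract describes.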
