Comparative Study
Drug Alcohol Depend. 2014 Feb 1:135:71-7.
doi: 10.1016/j.drugalcdep.2013.11.020. Epub 2013 Dec 3.

A simulative comparison of respondent driven sampling with incentivized snowball sampling--the "strudel effect"


V Anna Gyarmathy et al. Drug Alcohol Depend.

Abstract

Background: Respondent driven sampling (RDS) and incentivized snowball sampling (ISS) are two sampling methods that are commonly used to reach people who inject drugs (PWID).

Methods: We generated a set of simulated RDS samples on an actual sociometric ISS sample of PWID in Vilnius, Lithuania ("original sample") to assess if the simulated RDS estimates were statistically significantly different from the original ISS sample prevalences for HIV (9.8%), Hepatitis A (43.6%), Hepatitis B (Anti-HBc 43.9% and HBsAg 3.4%), Hepatitis C (87.5%), syphilis (6.8%) and Chlamydia (8.8%) infections and for selected behavioral risk characteristics.
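The recruitment process behind such a simulation can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `simulate_rds`, the coupon limit of 3, the target sample size, and the toy network are all assumptions, and recruitment is simplified to each participant handing coupons to randomly chosen contacts who have not yet been recruited.

```python
import random
from collections import deque


def simulate_rds(adjacency, seeds, coupons=3, target_size=100, rng=None):
    """Simulate one RDS recruitment run on a known (sociometric) network.

    adjacency: dict mapping each person to the set of their network contacts.
    seeds: people who start the recruitment chains.
    coupons: hypothetical maximum number of recruits per participant.
    Returns a dict mapping each recruited person to their recruiter
    (None for seeds).
    """
    rng = rng or random.Random()
    recruiter_of = {seed: None for seed in seeds}
    queue = deque(seeds)
    while queue and len(recruiter_of) < target_size:
        person = queue.popleft()
        # Only contacts who have not been recruited yet are eligible.
        eligible = sorted(c for c in adjacency[person] if c not in recruiter_of)
        rng.shuffle(eligible)
        for contact in eligible[:coupons]:
            if len(recruiter_of) >= target_size:
                break
            recruiter_of[contact] = person
            queue.append(contact)
    return recruiter_of


# Toy network (hypothetical): one component of six people, one isolated pair.
toy_network = {
    0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4},
    6: {7}, 7: {6},
}
toy_sample = simulate_rds(toy_network, seeds=[0], rng=random.Random(1))
```

In this toy run the seed sits in the six-person component, so recruitment chains can never reach persons 6 and 7. Recruitment along network ties cannot cross between disconnected components.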

Results: The original sample consisted of a large component of 249 people (83% of the sample) and 13 smaller components with 1-12 individuals. Generally, as long as all seeds were recruited from the large component of the original sample, the simulation samples simply recreated the large component. There were no significant differences between the large component and the entire original sample for the characteristics of interest. Altogether 99.2% of 360 simulation sample point estimates were within the confidence interval of the original prevalence values for the characteristics of interest.
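The counts in the results can be cross-checked with simple arithmetic, and the component structure of a network can be computed from its ties. The sketch below is illustrative only: the function names and the toy network are hypothetical, ties are assumed to be undirected, and the split of the 360 estimates into 30 samples times 12 characteristics is inferred from the figure captions rather than stated in the abstract.

```python
from collections import deque


def connected_components(adjacency):
    """Return the connected components of an undirected network, largest first."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def largest_component_share(adjacency):
    """Fraction of all network members who belong to the largest component."""
    return len(connected_components(adjacency)[0]) / len(adjacency)


# Cross-checks on the reported figures.
implied_total = round(249 / 0.83)          # sample size implied by "249 people (83%)"
n_estimates = 30 * 12                      # assumed: 30 samples x 12 characteristics
n_within = round(0.992 * n_estimates)      # estimates inside the original interval
n_outside = n_estimates - n_within         # estimates outside it
```

Under these assumptions, the original sample comprised roughly 300 people, and 357 of the 360 simulated point estimates fell inside the original confidence intervals, leaving 3 outside.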

Conclusions: When population characteristics are reflected in large network components that dominate the population, RDS and ISS may produce samples that have statistically non-different prevalence values, even though some isolated network components may be under-sampled and/or statistically significantly different from the main groups. This so-called "strudel effect" is discussed in the paper.

Keywords: Incentivized snowball sampling; People who inject drugs; Prevalence estimates; Respondent driven sampling; Sampling methodology; Simulations.


Figures

Figure 1
Sociometric graph of the original sample – participants not recruited into any of the 30 RDS samples are marked black.
Figure 2
Recruitment chains of one of the RDS simulation samples – seeds are marked black.
Figure 3
Point estimates and 95% confidence intervals of the RDS simulation samples for selected infections and risk behaviours (past 30 days).
3a: Distributive syringe sharing
3b: Receptive syringe sharing
3c: Sharing cookers and filters
3d: Using condoms every time during sex
3e: Having two or more sex partners
3f: Prevalence of HAV
3g: Prevalence of anti-HBc
3h: Prevalence of HBsAg
3i: Prevalence of HCV
3j: Prevalence of HIV
3k: Prevalence of Chlamydia
3l: Prevalence of syphilis
Notes:
  1. For better visualization and interpretation, point estimates of the simulation samples were ordered by magnitude (the X axis of each chart represents the rank order of the simulation sample point estimates), and only the related confidence interval ranges are depicted (therefore the Y axis ranges are different for each chart both in terms of minimum and maximum values and in terms of units within the ranges between minimum and maximum).

  2. Prevalence and, respectively, 95% confidence intervals for each characteristic in the original sample are represented as black and, respectively, gray vertical lines within each chart to provide a visual reference for the point estimates of the simulation samples.

  3. A difference is considered statistically significant when a point estimate of a simulation sample lies either below the lower confidence limit or above the upper confidence limit of the prevalence in the original sample for the relevant characteristic.
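The criterion in note 3 can be expressed as a short check. This is a sketch under stated assumptions: the function names are hypothetical, the confidence limits are taken as given inputs (the caption does not say how they were computed), and the numbers in the usage lines are made-up placeholders, not values from the study.

```python
def outside_reference_interval(estimate, ci_lower, ci_upper):
    """True if a simulated point estimate lies below the lower or above the
    upper confidence limit of the original-sample prevalence."""
    return estimate < ci_lower or estimate > ci_upper


def share_within(estimates, ci_lower, ci_upper):
    """Fraction of simulated point estimates that fall inside the interval."""
    inside = [
        e for e in estimates
        if not outside_reference_interval(e, ci_lower, ci_upper)
    ]
    return len(inside) / len(estimates)


# Placeholder values for illustration only (not study data).
example_estimates = [0.05, 0.10, 0.12, 0.14]
example_share = share_within(example_estimates, ci_lower=0.06, ci_upper=0.14)
```

An estimate that falls exactly on a confidence limit is counted as inside the interval, which matches the wording "below the lower" and "above the upper" limit.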

