Sociol Methodol. 2017 Aug;47(1):274-306.
doi: 10.1177/0081175017716489. Epub 2017 Jul 6.

NEW SURVEY QUESTIONS AND ESTIMATORS FOR NETWORK CLUSTERING WITH RESPONDENT-DRIVEN SAMPLING DATA

Ashton M Verdery et al. Sociol Methodol. 2017 Aug.

Abstract

Respondent-driven sampling (RDS) is a popular method for sampling hard-to-survey populations that leverages social network connections through peer recruitment. While RDS is most frequently applied to estimate the prevalence of infections and risk behaviors of interest to public health, such as HIV/AIDS or condom use, it is rarely used to draw inferences about the structural properties of social networks among such populations because it does not typically collect the necessary data. Drawing on recent advances in computer science, we introduce a set of data collection instruments and RDS estimators for network clustering, an important topological property that has been linked to a network's potential for diffusion of information, disease, and health behaviors. We use simulations to explore how these estimators, originally developed for random walk samples of computer networks, perform when applied to RDS samples with characteristics encountered in realistic field settings that depart from random walks. In particular, we explore the effects of multiple seeds, without replacement versus with replacement, branching chains, imperfect response rates, preferential recruitment, and misreporting of ties. We find that clustering coefficient estimators retain desirable properties in RDS samples. This paper takes an important step toward calculating network characteristics using nontraditional sampling methods, and it expands the potential of RDS to tell researchers more about hidden populations and the social factors driving disease prevalence.

Keywords: HIV/AIDS; clustering coefficient; estimation; hidden populations; respondent-driven sampling (RDS); sampling; small world model; social networks; transitivity; triad.
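
The idea of estimating clustering from a random walk, as described in the abstract, can be sketched in a few lines. This is a minimal illustration, not the paper's estimators: the abstract does not state the formulas, so the reweighting below is an assumption derived from the fact that a simple random walk on an undirected graph visits nodes in proportion to their degree. The toy graph and the names `random_walk` and `estimate_clustering` are hypothetical.

```python
import random

def random_walk(adj, steps, rng):
    """Simple random walk with replacement on an undirected graph."""
    neighbors = {v: sorted(ns) for v, ns in adj.items()}
    node = rng.choice(sorted(adj))
    walk = [node]
    for _ in range(steps - 1):
        node = rng.choice(neighbors[node])
        walk.append(node)
    return walk

def estimate_clustering(adj, walk):
    """Estimate (mean local, global) clustering from a random walk.

    Assumption: each middle node is visited with probability proportional
    to its degree, so observations are reweighted by degree-based terms.
    Only the degree of each visited node and whether the previous and next
    nodes are tied are used, which is what a survey could ask about.
    """
    r = len(walk)
    phi_local = 0.0
    phi_global = 0.0
    for prev, node, nxt in zip(walk, walk[1:], walk[2:]):
        d = len(adj[node])
        closed = 1 if nxt in adj[prev] else 0  # 0 when the walk backtracks
        if d > 1:
            phi_local += closed / (d - 1)
        phi_global += closed * d
    phi_local /= r - 2
    phi_global /= r - 2
    psi_local = sum(1 / len(adj[v]) for v in walk) / r
    psi_global = sum(len(adj[v]) - 1 for v in walk) / r
    return phi_local / psi_local, phi_global / psi_global

# Hypothetical toy network: a triangle (a, b, c) with a pendant node d on c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
walk = random_walk(toy, 50_000, random.Random(0))
est_local, est_global = estimate_clustering(toy, walk)
print(est_local, est_global)
```

For this toy graph the population values are 7/12 for the mean local coefficient (counting the degree-1 node as 0) and 3/5 for the global coefficient, so a long walk should land close to both. The sketch covers only the idealized single-chain, with-replacement case; the RDS departures the abstract lists (multiple seeds, branching, nonresponse, misreported ties) are not modeled.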

Figures

Figure 1.
Example network with hypothetical random walk sampling (RWS) and components needed to calculate local and global clustering coefficients for the whole network.
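
As a rough illustration of the two quantities named in this caption, the following sketch computes local and global clustering coefficients for a whole network under the standard definitions. The toy graph is hypothetical and is not the network shown in the figure; treating nodes with fewer than two neighbors as having local clustering 0 is a convention assumed here.

```python
from itertools import combinations

def clustering_coefficients(adj):
    """Return (local dict, mean local, global) for an undirected graph.

    adj maps each node to the set of its neighbors.
    """
    local = {}
    closed = 0    # neighbor pairs that are themselves tied, over all nodes
    possible = 0  # all neighbor pairs, over all nodes
    for node, nbrs in adj.items():
        pairs = len(nbrs) * (len(nbrs) - 1) // 2
        links = sum(1 for u, w in combinations(sorted(nbrs), 2) if w in adj[u])
        # Assumed convention: fewer than two neighbors gives local clustering 0.
        local[node] = links / pairs if pairs else 0.0
        closed += links
        possible += pairs
    mean_local = sum(local.values()) / len(local)
    global_cc = closed / possible if possible else 0.0
    return local, mean_local, global_cc

# Hypothetical toy network: a triangle (a, b, c) with a pendant node d on c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
local, mean_local, global_cc = clustering_coefficients(toy)
print(local, mean_local, global_cc)
```

Here nodes a and b have local clustering 1, node c has 1/3, and node d has 0, giving a mean local coefficient of 7/12. The global coefficient is 3/5, because three of the five neighbor pairs in the network are closed.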
Figure 2.
Largest weakly connected component of Project 90 data set; nodes shaded by race (grey = white; black = nonwhite) and sized by degree. The network is displayed using the ForceAtlas2 algorithm, with no node overlap, in Gephi 0.9.
Figure 3.
Performance of Hardiman Katzir estimators by estimator and question format in RWS on the Project 90 data set. Note: These are nonstandard box plots that show the mean rather than the median as the central line; the thick dashed line indicates the population parameter.
Figure 4.
Performance of Hardiman Katzir estimators by estimator and question format in RWS and RDS scenarios on the Project 90 data set. Note: These are nonstandard box plots that show the mean rather than the median as the central line; the thick dashed line indicates the population parameter.
