PLoS One. 2012;7(4):e34740.
doi: 10.1371/journal.pone.0034740. Epub 2012 Apr 6.

One plus one makes three (for social networks)

Emőke-Ágnes Horvát et al. PLoS One. 2012.

Erratum in

  • PLoS One. 2012;7(4). doi: 10.1371/annotation/c2a07195-0843-4d98-a220-b1c5b77a7e1a. Horvát, Emöke-Ágnes [corrected to Horvát, Emőke-Ágnes]

Abstract

Members of social network platforms often choose to reveal private information, and thus sacrifice some of their privacy, in exchange for the manifold opportunities and amenities offered by such platforms. In this article, we show that the seemingly innocuous combination of knowledge of confirmed contacts between members on the one hand and their email contacts to non-members on the other hand provides enough information to deduce a substantial proportion of relationships between non-members. Using machine learning we achieve an area under the (receiver operating characteristic) curve (AUC) of at least 0.85 for predicting whether two non-members known by the same member are connected or not, even for conservative estimates of the overall proportion of members, and the proportion of members disclosing their contacts.
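As an illustration of the evaluation measure named in the abstract, the following is a minimal sketch of how an AUC can be computed from classifier scores. It uses the rank interpretation of AUC (the probability that a randomly chosen connected pair scores higher than a randomly chosen unconnected pair, with ties counted as one half). The function name `auc` and the example labels and scores are assumptions for illustration and are not taken from the article.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 (1 = the two non-members are connected).
    scores: classifier scores, higher meaning "more likely connected".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four non-member pairs.
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, which is the scale on which the reported value of at least 0.85 should be read.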

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Definitions and examples.
Any social network platform divides society into two sets: the set of members formula image (black nodes) and of non-members formula image. In our toy example formula image of formula image individuals, i.e. a fraction of formula image, are members. The relevant subset formula image of non-members (red nodes) that are in contact with at least one member is distinguished from other non-members (gray nodes). formula image of the formula image members, i.e., a fraction of formula image, have disclosed their outside social contacts. The knowledge of the set of edges formula image between members (black, bi-directed) and the set of edges formula image (green) to non-members is enough to infer a substantial fraction of edges between non-members (red edges).
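The partition described in the Figure 1 caption can be sketched in code as follows. This is an illustrative reading of the caption only: the function name `observed_view`, the adjacency-dictionary representation, and the toy graph are assumptions, and the full network is treated as undirected with no self-loops.

```python
def observed_view(adj, members, disclosing):
    """Split a full network into what a platform could observe.

    adj:        dict mapping each person to the set of their contacts.
    members:    set of people who are platform members.
    disclosing: subset of members who disclosed their outside contacts.
    """
    non_members = set(adj) - members

    # Relevant non-members: in contact with at least one member.
    relevant = {u for u in non_members if adj[u] & members}

    # Confirmed contacts between members (stored as unordered pairs).
    member_edges = {
        frozenset((a, b)) for a in members for b in adj[a] if b in members
    }

    # Contacts to non-members, known only for disclosing members.
    disclosed_edges = {
        (m, u) for m in disclosing for u in adj[m] if u not in members
    }
    return relevant, member_edges, disclosed_edges


# Hypothetical toy network: a and b are members, only a discloses.
toy = {
    "a": {"b", "x"},
    "b": {"a", "y"},
    "x": {"a", "y"},
    "y": {"b", "x", "z"},
    "z": {"y"},
}
relevant, member_edges, disclosed_edges = observed_view(toy, {"a", "b"}, {"a"})
```

In this toy network the edge between x and y is the kind of non-member relationship the article sets out to infer; it appears in neither of the two observed edge sets.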
Figure 2. Comparison of basic network analytic statistics of the five data sets obtained from Traud et al.
Figure 3. Membership propagation in a toy example according to different propagation models.
Note that real social networks exhibit more long-range edges. Examples for the platform penetration value formula image show the nodes from which the propagation started (black nodes with white core). Other members are marked black and relevant non-members red; for ease of reading arrows are not displayed, but black edges are bidirectional while green edges point from black to red nodes. With BFS and DFS the network is explored starting from one node (denoted by a white circle); with RW and EN there are more nodes from which the propagation is launched; and finally, for RS all selected nodes can be seen as starting nodes.
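A minimal sketch of the BFS variant of membership propagation mentioned in the caption: starting from one node, people are marked as members in breadth-first order until a target fraction of the population has joined. The function name `bfs_members`, the parameter name `penetration`, the rounding of the target size, and the tie-breaking by sorted neighbor order are all assumptions; the article's exact procedure may differ.

```python
from collections import deque


def bfs_members(adj, start, penetration):
    """Select members by breadth-first propagation from `start`.

    adj:         dict mapping each person to the set of their contacts.
    penetration: target fraction of the population that become members.
    Stops early if the component containing `start` is exhausted.
    """
    target = round(penetration * len(adj))
    members = set()
    seen = {start}
    queue = deque([start])

    while queue and len(members) < target:
        person = queue.popleft()
        members.add(person)
        # Sorted only to make the sketch deterministic.
        for contact in sorted(adj[person]):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return members


# Hypothetical path network 0 - 1 - ... - 9, half of whom become members.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 10} for i in range(10)}
path_members = bfs_members(path, start=0, penetration=0.5)
```

Swapping the queue for a stack would give a depth-first counterpart of the same sketch.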
Figure 4. Features based on different edge sets between the exclusive, joint, and common neighborhoods of v and w.
All left-hand nodes belong to the joint neighborhood of v and w. formula image is exclusive to formula image, while formula image are exclusive to formula image, and formula image are common neighbors of both. Our features comprise the absolute number of edges between common neighbors (black, dashed edges), exclusive neighbors (black, straight edge), joint neighborhood (all black edges between nodes formula image), and an exclusive and a common neighbor (black, dotted edges). For each of these counts we also added its normalized value, obtained by dividing by the number of possible edges between the neighbors involved.
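The four edge counts and their normalized values could be computed along the following lines. This is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the network is treated as undirected, "exclusive neighbors" is read as edges with both endpoints in the set of neighbors exclusive to either v or w, and a count with no possible edges is normalized to 0.0.

```python
from itertools import combinations


def edges_within(nodes, adj):
    """Count edges with both endpoints in `nodes`."""
    return sum(1 for a, b in combinations(nodes, 2) if b in adj[a])


def edges_between(left, right, adj):
    """Count edges from `left` to `right` (assumed disjoint sets)."""
    return sum(1 for a in left for b in right if b in adj[a])


def ratio(count, possible):
    return count / possible if possible else 0.0


def neighborhood_features(v, w, adj):
    """Absolute and normalized edge counts around the pair (v, w)."""
    nv = adj[v] - {v, w}
    nw = adj[w] - {v, w}
    common = nv & nw
    exclusive = nv ^ nw          # neighbors of exactly one of v, w
    joint = nv | nw

    def pairs(nodes):
        return len(nodes) * (len(nodes) - 1) // 2

    counts = {
        "common": (edges_within(common, adj), pairs(common)),
        "exclusive": (edges_within(exclusive, adj), pairs(exclusive)),
        "joint": (edges_within(joint, adj), pairs(joint)),
        "exclusive_common": (
            edges_between(exclusive, common, adj),
            len(exclusive) * len(common),
        ),
    }
    features = {}
    for name, (count, possible) in counts.items():
        features[name] = count
        features[name + "_norm"] = ratio(count, possible)
    return features


# Hypothetical example: v knows a, b, c; w knows b, c, d.
g = {
    "v": {"a", "b", "c"},
    "w": {"b", "c", "d"},
    "a": {"v", "b"},
    "b": {"v", "w", "a", "c"},
    "c": {"v", "w", "b", "d"},
    "d": {"w", "c"},
}
feats = neighborhood_features("v", "w", g)
```

Each pair of non-members would yield one such feature dictionary, which could then serve as a row of input to a classifier.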
Figure 5. Prediction accuracy (AUC) of samples based on all member recruitment models in the cross-validation training scheme applied to UNC data.
The white square denotes a data point where there was not enough data to make the prediction.
Figure 6. 4 → 1 cross-prediction accuracy.
Minimal (lower triangle) and maximal (upper triangle) prediction accuracy across all five member recruitment models is shown as a function of platform penetration formula image and the disclosure parameter formula image. Upper row: formula image; lower row: formula image; black triangles denote data points where formula image was smaller than the corresponding fraction of positive samples among all samples.
Figure 7. 1 → 1 cross-prediction accuracy.
formula image values for each of the five member recruitment models at formula image. The formula image and formula image-axis show on which network the random forest was trained and tested, respectively. The white field indicates that there were too few edge samples to reasonably train the classifier.
