Bioinformatics. 2025 Jun 2;41(6):btaf273.
doi: 10.1093/bioinformatics/btaf273.

Beacon Reconstruction Attack: Reconstruction of genomes in genomic data-sharing beacons using summary statistics

Kousar Saleem et al. Bioinformatics. 2025.

Abstract

Motivation: The genomic data-sharing beacon protocol, developed by the Global Alliance for Genomics and Health, offers a privacy-preserving mechanism for querying genomic datasets while restricting direct data access. Despite this design, beacons remain vulnerable to privacy attacks. This study introduces a novel privacy vulnerability of the protocol: an attacker can reconstruct large portions of the genomes of all beacon participants using only the summary statistics that the protocol reports.

Results: We introduce a novel optimization-based algorithm that leverages beacon responses and SNP correlations for reconstruction. By optimizing for SNP correlations and allele frequencies, the proposed approach reconstructs genomes with a substantially higher F1-score (70%) than baseline methods (45%) on beacons generated using individuals from the HapMap and OpenSNP datasets. We show that reconstructed genomes can be used by downstream applications, such as membership inference attacks against other beacons. Our findings reveal that beacons releasing allele frequencies substantially increase the reconstruction risk, underscoring the need for enhanced privacy-preserving mechanisms to protect genomic data.

Availability and implementation: Our implementation is available at https://github.com/ASAP-Bilkent/Beacon-Reconstruction-Attack.
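To make the reported F1-scores concrete, the following is a minimal sketch of how an F1-score could be computed for a reconstructed genotype matrix. It assumes genomes are encoded as binary matrices (rows are individuals, columns are SNPs, 1 marks presence of the minor allele) and that reconstructed rows are already aligned with the true rows. The function name `reconstruction_f1` and this encoding are illustrative assumptions; the evaluation protocol actually used by the authors is not described in this abstract.

```python
def reconstruction_f1(true_genomes, reconstructed):
    """F1-score over all (individual, SNP) cells of two binary matrices.

    Assumption: both arguments are lists of equal-length lists of 0/1,
    with rows already aligned between the two matrices.
    """
    if len(true_genomes) != len(reconstructed):
        raise ValueError("matrices must have the same number of rows")

    tp = fp = fn = 0
    for true_row, rec_row in zip(true_genomes, reconstructed):
        if len(true_row) != len(rec_row):
            raise ValueError("rows must have the same number of SNPs")
        for t, r in zip(true_row, rec_row):
            if t == 1 and r == 1:
                tp += 1  # minor allele correctly reconstructed
            elif t == 0 and r == 1:
                fp += 1  # minor allele predicted but absent
            elif t == 1 and r == 0:
                fn += 1  # minor allele present but missed

    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


truth = [[1, 0, 1], [0, 1, 0]]
guess = [[1, 0, 0], [0, 1, 1]]
score = reconstruction_f1(truth, guess)
```

Because the F1-score ignores true negatives, it is not inflated by the many SNP positions where an individual simply lacks the minor allele, which is one plausible reason to prefer it over plain accuracy in this setting.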

Figures

Figure 1.
The system model of our approach. The attacker finds out N and decides on the subset of SNPs M to reconstruct. 1. The attacker takes a snapshot of the beacon for M. 2. The beacon can respond with allele frequencies or with yes/no for each SNP. In the latter case, the attacker estimates allele frequencies from the population. 3. The attacker initializes the matrix using the baseline algorithm, and then 4. updates the assignments such that the SNP correlations are preserved. 5. The attacker optimizes for the original allele frequencies obtained. 6. Steps 4 and 5 are alternated until convergence.
Figure 2.
F1-score comparison for |M| = 1000 when frequency is unknown. Plot A shows reconstruction from 64 left-out samples of the HapMap dataset; Plot B shows reconstruction using Mexican samples.
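Steps 3 to 6 of the system model in Figure 1 can be illustrated with a toy sketch. This is not the authors' algorithm (their implementation is in the repository linked in the abstract); it is a simplified stand-in that swaps in a greedy local search. It assumes a binary individuals-by-SNPs matrix, a per-SNP carrier frequency, and a target SNP co-occurrence matrix that an attacker would estimate from a reference population. All function names are hypothetical. Because entries are only swapped within a column, the per-SNP frequencies are preserved by construction instead of being restored in a separate alternating step.

```python
import random


def init_from_frequencies(n, freqs, rng):
    """Toy initialization: SNP j gets round(freqs[j] * n) carriers in random rows."""
    g = [[0] * len(freqs) for _ in range(n)]
    for j, f in enumerate(freqs):
        for i in rng.sample(range(n), round(f * n)):
            g[i][j] = 1
    return g


def cooccurrence(g):
    """Fraction of individuals carrying both SNP a and SNP b, for every pair."""
    n, m = len(g), len(g[0])
    return [[sum(row[a] * row[b] for row in g) / n for b in range(m)]
            for a in range(m)]


def loss(g, target):
    """Squared error between the matrix's co-occurrences and the target."""
    c = cooccurrence(g)
    m = len(target)
    return sum((c[a][b] - target[a][b]) ** 2 for a in range(m) for b in range(m))


def refine(g, target, rng, steps=2000):
    """Greedy search: keep a within-column swap only if it lowers the loss."""
    g = [row[:] for row in g]  # work on a copy, leave the input untouched
    n, m = len(g), len(g[0])
    best = loss(g, target)
    for _ in range(steps):
        j = rng.randrange(m)
        i1, i2 = rng.sample(range(n), 2)
        if g[i1][j] == g[i2][j]:
            continue  # swapping equal values changes nothing
        g[i1][j], g[i2][j] = g[i2][j], g[i1][j]
        new = loss(g, target)
        if new < best:
            best = new
        else:
            g[i1][j], g[i2][j] = g[i2][j], g[i1][j]  # revert the swap
    return g


rng = random.Random(0)
truth = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]]  # hypothetical beacon
freqs = [0.5, 0.5, 0.5]
target = cooccurrence(truth)  # stand-in for population-derived correlations
start = init_from_frequencies(len(truth), freqs, rng)
result = refine(start, target, rng, steps=200)
```

The search never accepts a swap that increases the loss, so the refined matrix matches the target correlations at least as well as the initialization, while every column keeps its original carrier count. Recomputing the full loss after each swap is quadratic in the number of SNPs and is only practical at toy scale.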
