Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2017 Jul 26;10(Suppl 2):44.
doi: 10.1186/s12920-017-0277-y.

BLOOM: BLoom filter based oblivious outsourced matchings

Affiliations

BLOOM: BLoom filter based oblivious outsourced matchings

Jan Henrik Ziegeldorf et al. BMC Med Genomics. .

Abstract

Background: Whole genome sequencing has become fast, accurate, and cheap, paving the way towards the large-scale collection and processing of human genome data. Unfortunately, this dawning genome era does not only promise tremendous advances in biomedical research but also causes unprecedented privacy risks for the many. Handling storage and processing of large genome datasets through cloud services greatly aggravates these concerns. Current research efforts thus investigate the use of strong cryptographic methods and protocols to implement privacy-preserving genomic computations.

Methods: We propose FHE-BLOOM and PHE-BLOOM, two efficient approaches for genetic disease testing using homomorphically encrypted Bloom filters. Both approaches allow the data owner to securely outsource storage and computation to an untrusted cloud. FHE-BLOOM is fully secure in the semi-honest model while PHE-BLOOM slightly relaxes security guarantees in a trade-off for highly improved performance.

Results: We implement and evaluate both approaches on a large dataset of up to 50 patient genomes each with up to 1000000 variations (single nucleotide polymorphisms). For both implementations, overheads scale linearly in the number of patients and variations, while PHE-BLOOM is faster by at least three orders of magnitude. For example, testing disease susceptibility of 50 patients with 100000 variations requires only a total of 308.31 s (σ=8.73 s) with our first approach and a mere 0.07 s (σ=0.00 s) with the second. We additionally discuss security guarantees of both approaches and their limitations as well as possible extensions towards more complex query types, e.g., fuzzy or range queries.

Conclusions: Both approaches handle practical problem sizes efficiently and are easily parallelized to scale with the elastic resources available in the cloud. The fully homomorphic scheme, FHE-BLOOM, realizes a comprehensive outsourcing to the cloud, while the partially homomorphic scheme, PHE-BLOOM, trades a slight relaxation of security guarantees against performance improvements by at least three orders of magnitude.

Keywords: Bloom filters; Homomorphic encryption; Secure outsourcing.

PubMed Disclaimer

Figures

Fig. 1
Fig. 1
The problem scenario of Track 3 in the iDASH competition: A researcher aims to securely outsource expensive genome analysis to the cloud using homomorphic encryption
Fig. 2
Fig. 2
General overview of our approach: The data owner holds a database with genomes of Patients P i=1…n which she encodes row-wise as Bloom filters (Step 1) and encrypts and uploads them to the cloud (Step 2) in the preprocessing phase. In the online phase, the data owner encodes, encrypts (only in FHE-BLOOM), and uploads her query. In the encrypted domain, the cloud matches the query to each database record (Step 3) and aggregates the results (Step 4) without ever learning the data in clear by utilizing the homomorphic properties of the chosen encryption scheme. The final results are returned to the data owner, who decrypts (Step 5) and postprocesses the results (Step 6) to obtain a list of patients that match her query
Fig. 3
Fig. 3
Query time of FHE-BLOOM grows linearly in the number of patients n
Fig. 4
Fig. 4
Query time of FHE-BLOOM grows linearly in the number of SNPs m (note the non-linear x-axis)
Fig. 5
Fig. 5
Query time of FHE-BLOOM grows linearly with exponentially decreasing p (note the logarithmic x-axis)
Fig. 6
Fig. 6
Query time of PHE-BLOOM grows linearly in ⌈n/s P
Fig. 7
Fig. 7
Query time of PHE-BLOOM increases linearly in the maximum number of SNPs m (note the non-linear x-axis)
Fig. 8
Fig. 8
Query time of PHE-BLOOM grows linearly with exponentially decreasing p (note the logarithmic x-axis)

References

    1. Collins FS, Varmus H. A New Initiative on Precision Medicine. N Engl J Med. 2015;372(9):793–5. doi: 10.1056/NEJMp1500523. - DOI - PMC - PubMed
    1. Church GM. The Personal Genome Project. Mol Syst Biol. 2005;1(1). - PMC - PubMed
    1. Gibbs RA, Belmont JW, Hardenbol P, Willis TD, Yu F, Yang H, Ch’ang LY, Huang W, Liu B, Shen Y, et al. The International HapMap Project. Nature. 2003;426(6968):789–96. doi: 10.1038/nature02168. - DOI - PubMed
    1. 23andme. https://www.23andme.com/. Accessed 3 Dec 2016.
    1. Homer N, Szelinger S, Redman M, Duggan D, Tembe W, Muehling J, Pearson JV, Stephan DA, Nelson SF, Craig DW. Resolving Individuals Contributing Trace Amounts of DNA to Highly Complex Mixtures Using High-Density SNP Genotyping Microarrays. PLoS Genet. 2008;4(8):1000167. doi: 10.1371/journal.pgen.1000167. - DOI - PMC - PubMed

Publication types