Lossy compression of statistical data using quantum annealer

Boram Yoon¹, Nga T T Nguyen², Chia Cheng Chang³,⁴,⁵,⁶, Ermal Rrapaj⁴

Affiliations

¹ CCS-7, Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA. boram@lanl.gov.
² CCS-3, Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA.
³ RIKEN iTHEMS, Wako, Saitama, 351-0198, Japan.
⁴ Department of Physics, University of California, Berkeley, CA, 94720, USA.
⁵ Nuclear Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
⁶ LinkedIn Corporation, Sunnyvale, CA, 94085, USA.

PMID: 35264581
PMCID: PMC8907274
DOI: 10.1038/s41598-022-07539-z

Boram Yoon et al. Sci Rep. 2022 Mar 9;12(1):3814. doi: 10.1038/s41598-022-07539-z.

Abstract

We present a new lossy compression algorithm for statistical floating-point data based on representation learning with binary variables. The algorithm finds a set of basis vectors and their binary coefficients that accurately reconstruct the original data. The optimization of the basis vectors is performed classically, while the binary coefficients are obtained through both simulated and quantum annealing for comparison. We also present a bias correction procedure that estimates and removes the error and bias introduced into statistical data analyses by the inexact reconstruction of the lossy compression. The compression algorithm is demonstrated on two different datasets from lattice quantum chromodynamics simulations. The results obtained using simulated annealing show 3 to 3.5 times better compression performance than an algorithm based on a neural-network autoencoder. Calculations using quantum annealing also show promising results, but their performance is limited by the integrated control error of the quantum processing unit, which yields large uncertainties in the biases and coupling parameters. We further compare hardware between the previous-generation D-Wave 2000Q and the current D-Wave Advantage system. Our study shows that the Advantage system is more likely than the 2000Q to obtain low-energy solutions for these problems.
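The alternating structure described in the abstract (a classical update of the basis vectors and an annealing update of the binary coefficients) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the least-squares basis update, the single-bit-flip Metropolis schedule, and all default parameters are assumptions chosen for illustration. On annealing hardware, the bit update would instead be posed as a quadratic binary optimization problem, which is not shown here.

```python
import numpy as np


def anneal_bits(X, B, Phi, n_sweeps, rng, t_hot=1.0, t_cold=1e-3):
    """Simulated annealing (single-bit-flip Metropolis) of codes B at fixed basis Phi."""
    B = B.copy()
    R = X - B @ Phi                                  # residuals, shape (n_samples, dim)
    scale = np.mean(np.sum(X**2, axis=1)) + 1e-12    # sets the temperature units
    norms = np.sum(Phi**2, axis=1)                   # |Phi_k|^2 for each basis vector
    for T in np.geomspace(t_hot, t_cold, n_sweeps) * scale:
        for k in range(B.shape[1]):
            delta = 1.0 - 2.0 * B[:, k]              # +1 turns bit k on, -1 turns it off
            # Change in squared reconstruction error if bit k is flipped (all samples at once).
            dE = -2.0 * delta * (R @ Phi[k]) + norms[k]
            # Downhill moves are always accepted; uphill moves with Boltzmann probability.
            accept = rng.random(len(B)) < np.exp(-np.clip(dE, 0.0, None) / T)
            B[accept, k] += delta[accept]
            R[accept] -= delta[accept, None] * Phi[k]
    return B


def fit_binary_compression(X, n_bits, n_iters=20, n_sweeps=30, seed=0):
    """Find binary codes B (n_samples, n_bits) and basis Phi (n_bits, dim) with X ~ B @ Phi."""
    rng = np.random.default_rng(seed)
    B = rng.integers(0, 2, size=(X.shape[0], n_bits)).astype(float)
    for _ in range(n_iters):
        Phi, *_ = np.linalg.lstsq(B, X, rcond=None)  # classical step: least-squares basis
        B = anneal_bits(X, B, Phi, n_sweeps, rng)    # annealing step: binary coefficients
    Phi, *_ = np.linalg.lstsq(B, X, rcond=None)      # final basis for the final codes
    return B, Phi


def relative_error(X, B, Phi):
    """Squared reconstruction error normalized by the squared norm of the data."""
    return np.sum((X - B @ Phi) ** 2) / np.sum(X**2)
```

In this sketch each data vector is stored as `n_bits` binary coefficients, and the basis `Phi` is shared across the whole dataset. The bias correction procedure mentioned in the abstract is not included.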

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Correlation pattern of the 16 components of the vector (left) and axial-vector (right) data. Red indicates high correlation (correlation coefficient = 1), and white indicates no correlation.

**Figure 2**
$Q^{2}$, defined in Eq. (8), for different numbers of storage bits at data dimension $D = 16$. For the principal component analysis (PCA) and autoencoder (AE) approaches, the number of storage bits is calculated as $32 \times N_{z}$, assuming single-precision floating-point numbers. For the binary compression algorithm with $N_{q} \geq 48$, we use the boosting approach with $N_{q1} = N_{q2} = N_{q} / 2$.
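As a rough worked example of the bit counting in this caption, the per-vector storage can be written out as follows. It is an assumption here, not stated in the caption, that only the compressed representation of each data vector is counted and that the shared basis vectors or decoder weights are excluded:

$$
\text{bits}_{\mathrm{PCA/AE}} = 32 \times N_{z}, \qquad \text{bits}_{\mathrm{binary}} = N_{q}, \qquad \text{bits}_{\mathrm{original}} = 32 \times D = 512 \quad (D = 16).
$$

Under this counting, $N_{q} = 32$ binary coefficients occupy the same storage as a single-precision latent vector with $N_{z} = 1$.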

**Figure 3**
Scaling of $Q^{2}$ with the data dimension $D$ for the vector (left) and axial-vector (right) data. The upper panels show the results of the binary compression algorithm at fixed compression rates ($N_{q} = D$ and $N_{q} = 2D$), and the lower panels compare the binary compression and autoencoder algorithms at fixed sizes of the compressed space ($N_{z} = 1, 2, 3$ for the autoencoder, and $N_{q} = 32$ for binary compression). For the binary compression algorithm with $N_{q} \geq 48$, we use the boosting approach with $N_{q1} = N_{q2} = N_{q} / 2$.

**Figure 4**
Cumulative distribution function (CDF) of the normalized reconstruction error from all feasible samples obtained from the D-Wave 2000Q (red) and Advantage system (blue) for the axial-vector data. For $N_{q} = 32$, about 50% and 38% of the samples from the D-Wave 2000Q and Advantage, respectively, were feasible. For $N_{q} = 60$, the fractions were about 51% and 18%, respectively.

**Figure 5**
Cumulative distribution function (CDF) of the normalized reconstruction error from all feasible samples obtained from the D-Wave 2000Q (red) and Advantage system (blue) for the vector data. For $N_{q} = 32$, about 95% and 91% of the samples from the D-Wave 2000Q and Advantage, respectively, were feasible. For $N_{q} = 60$, the fractions were about 63% and 73%, respectively.
