Sci Rep. 2022 Mar 9;12(1):3814.
doi: 10.1038/s41598-022-07539-z

Lossy compression of statistical data using quantum annealer

Boram Yoon et al. Sci Rep. 2022.

Abstract

We present a new lossy compression algorithm for statistical floating-point data based on representation learning with binary variables. The algorithm finds a set of basis vectors and their binary coefficients that precisely reconstruct the original data. The optimization of the basis vectors is performed classically, while the binary coefficients are obtained through both simulated and quantum annealing for comparison. We also present a bias correction procedure that estimates and eliminates the error and bias introduced into statistical data analyses by the inexact reconstruction of the lossy compression. The compression algorithm is demonstrated on two different datasets from lattice quantum chromodynamics simulations. The results obtained using simulated annealing show 3 to 3.5 times better compression performance than an algorithm based on a neural-network autoencoder. Calculations using quantum annealing also show promising results, but performance is limited by the integrated control error of the quantum processing unit, which yields large uncertainties in the biases and coupling parameters. We further compare the previous-generation D-Wave 2000Q with the current D-Wave Advantage system. Our study shows that the Advantage system is more likely than the 2000Q to obtain low-energy solutions for these problems.
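The abstract's core idea, a classically optimized basis combined with annealed binary coefficients, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the squared-error objective ||x - phi a||^2, the single-bit-flip simulated annealing schedule, the least-squares basis update, and the function names anneal_binary_coeffs and fit_binary_code. The paper's actual loss function, annealing setup, and boosting scheme may differ.

```python
import numpy as np


def anneal_binary_coeffs(x, phi, n_sweeps=200, t_start=1.0, t_end=0.01, rng=None):
    """Search for a in {0,1}^Nq that minimizes ||x - phi @ a||^2.

    Plain single-bit-flip simulated annealing (an illustrative stand-in
    for the annealers discussed in the abstract).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_q = phi.shape[1]
    a = rng.integers(0, 2, size=n_q)
    resid = x - phi @ a
    energy = float(resid @ resid)
    best_a, best_e = a.copy(), energy
    # Geometric cooling schedule from t_start down to t_end.
    for t in np.geomspace(t_start, t_end, n_sweeps):
        for j in rng.permutation(n_q):
            delta = 1 - 2 * a[j]  # +1 for a 0->1 flip, -1 for a 1->0 flip
            new_resid = resid - delta * phi[:, j]
            new_e = float(new_resid @ new_resid)
            # Metropolis rule: always accept downhill, sometimes uphill.
            if new_e <= energy or rng.random() < np.exp((energy - new_e) / t):
                a[j] ^= 1
                resid, energy = new_resid, new_e
                if energy < best_e:
                    best_a, best_e = a.copy(), energy
    return best_a


def fit_binary_code(X, n_q, n_iter=10, seed=0):
    """Alternate between annealed binary coefficients and a least-squares basis.

    X has shape (n_samples, D). Returns phi with shape (D, n_q) and the
    binary coefficient matrix A with shape (n_samples, n_q).
    """
    rng = np.random.default_rng(seed)
    phi = rng.normal(scale=X.std(), size=(X.shape[1], n_q))
    for _ in range(n_iter):
        A = np.array([anneal_binary_coeffs(x, phi, rng=rng) for x in X])
        # Classical step: solve X ~= A @ phi.T for phi in the least-squares sense.
        phi = np.linalg.lstsq(A.astype(float), X, rcond=None)[0].T
    # Final coefficients for the final basis.
    A = np.array([anneal_binary_coeffs(x, phi, rng=rng) for x in X])
    return phi, A
```

In this sketch each sample is stored as the n_q bits in its row of A, the basis phi is stored once for the whole dataset, and a sample is reconstructed as phi @ a. Because the objective is quadratic in the binary vector a, it is the kind of problem that can be posed to an annealer; how the paper maps it onto the D-Wave hardware is not shown here.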

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Correlation pattern of the 16 components of the vector (left) and the axial-vector (right) data. Red indicates high correlation (correlation coefficient = 1), and white indicates no correlation.
Figure 2
Q2, defined in Eq. (8), for different numbers of storage bits at data dimension D=16. For the principal component analysis (PCA) and autoencoder (AE) approaches, the number of storage bits is calculated as 32×Nz, assuming single-precision floating-point numbers. For the binary compression algorithm with Nq≥48, we use the boosting approach with Nq1=Nq2=Nq/2.
Figure 3
Scaling of Q2 with data dimension D for the vector (left) and axial-vector (right) data. The upper figures show the results of the binary compression algorithm at fixed compression rates (Nq=D and Nq=2D), and the bottom figures compare the binary compression and autoencoder algorithms at fixed sizes in the compressed space (Nz=1,2,3 for the autoencoder, and Nq=32 for binary compression). For the binary compression algorithm with Nq≥48, we use the boosting approach with Nq1=Nq2=Nq/2.
Figure 4
Cumulative distribution function (CDF) of the normalized reconstruction error from all feasible samples obtained from the D-Wave 2000Q (red) and Advantage system (blue) for the axial-vector data. About 50% and 38% of the samples were feasible from D-Wave 2000Q and Advantage for Nq=32, respectively. For Nq=60 there were about 51% and 18%, respectively.
Figure 5
Cumulative distribution function (CDF) of the normalized reconstruction error from all feasible samples obtained from the D-Wave 2000Q (red) and Advantage system (blue) for the vector data. About 95% and 91% of the samples were feasible from D-Wave 2000Q and Advantage for Nq=32, respectively. For Nq=60 there were about 63% and 73%, respectively.
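The storage accounting used in the Figure 2 and Figure 3 captions can be made concrete with a short sketch. The 32×Nz rule for PCA and autoencoder codes comes from the caption; counting one bit per binary coefficient (Nq bits per sample) is an assumption inferred from the captions, and the function names are hypothetical. The one-time cost of storing the basis vectors or network weights is ignored.

```python
def float_code_bits(n_z, bits_per_float=32):
    """Bits per sample for a latent code of n_z single-precision floats."""
    return bits_per_float * n_z


def binary_code_bits(n_q):
    """Bits per sample for n_q binary coefficients (assumed one bit each)."""
    return n_q


# Under these assumptions, an autoencoder with Nz=1 and a binary code with
# Nq=32 occupy the same per-sample storage.
same_size = float_code_bits(1) == binary_code_bits(32)
```

This is why the captions can place Nz=1,2,3 and Nq=32 on a common storage axis.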
