2022 Oct 4;13(1):5847. doi: 10.1038/s41467-022-33441-3.

Noise-injected analog Ising machines enable ultrafast statistical sampling and machine learning


Fabian Böhm et al. Nat Commun.

Abstract

Ising machines are a promising non-von-Neumann computational concept for neural network training and combinatorial optimization. However, while various neural networks can be implemented with Ising machines, their inability to perform fast statistical sampling makes them inefficient for training neural networks compared with digital computers. Here, we introduce a universal concept for achieving ultrafast statistical sampling with analog Ising machines by injecting noise. With an opto-electronic Ising machine, we experimentally demonstrate that this can be used for accurate sampling of Boltzmann distributions and for unsupervised training of neural networks, with accuracy equal to software-based training. Through simulations, we find that Ising machines can perform statistical sampling orders of magnitude faster than software-based methods. This extends the use of Ising machines beyond combinatorial optimization and makes them efficient tools for machine learning and other applications.
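For orientation, the software baseline the paper compares against is Markov-chain Monte Carlo sampling of the Ising Boltzmann distribution. The following is a minimal Metropolis–Hastings sketch; the coupling matrix `J`, temperature `T`, and step count are placeholders, not the paper's experimental parameters:

```python
import numpy as np

def metropolis_sample(J, T, steps, rng):
    """Metropolis-Hastings sampling of the Ising Boltzmann distribution.

    J is a symmetric (N, N) coupling matrix with zero diagonal; the
    energy is E = -1/2 * s^T J s. Returns the visited energies.
    """
    N = J.shape[0]
    s = rng.choice([-1.0, 1.0], size=N)
    h = J @ s                              # local fields h_i = sum_j J_ij s_j
    E = -0.5 * s @ h
    energies = np.empty(steps)
    for t in range(steps):
        i = rng.integers(N)
        dE = 2.0 * s[i] * h[i]             # energy cost of flipping spin i
        if dE <= 0.0 or rng.random() < np.exp(-dE / T):
            s[i] = -s[i]
            h += 2.0 * s[i] * J[:, i]      # incremental local-field update
            E += dE
        energies[t] = E
    return energies
```

Each iteration proposes one single-spin flip, which is the sequential bottleneck the noise-injected analog machine avoids by letting all spins fluctuate in parallel.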


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Schematic of noise-induced sampling with spatially multiplexed and time-multiplexed analog Ising machines.
a Schematic of a spatially multiplexed analog Ising machine for noise-induced sampling. The system consists of a set of N mutually coupled bistable nonlinear systems that represent N spin states. Inset: bifurcation diagram of the spin amplitude as a function of the feedback gain for a single bistable system. Below the bifurcation point at α = 1 (red dashed line), only the trivial solution exists (solid black line); above it, the trivial solution becomes unstable (black dotted line) and two new bistable fixed points arise (orange and blue lines). b Exemplary time evolution of the Ising energy (orange) and the spin amplitudes (blue) while solving a Maxcut optimization problem with N = 100 spins. c Experimental setup of the time-multiplexed opto-electronic Ising machine, into which Gaussian white noise with standard deviation δ is injected. PC polarization controller, ADC analog-to-digital converter, DAC digital-to-analog converter, MZM Mach–Zehnder modulator, EA electronic amplifier, FPGA field-programmable gate array.
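The bistable dynamics in the bifurcation diagram can be illustrated with a small simulation. The polynomial gain-dissipative form below (linear gain α, cubic saturation, linear coupling β, injected Gaussian noise of standard deviation δ) is a common model for such machines and is an assumption here, standing in for the paper's own equations:

```python
import numpy as np

def simulate_ising_machine(J, alpha, beta, delta, dt, steps, rng):
    """Forward-Euler integration of a noise-injected gain-dissipative
    Ising machine: bistable local dynamics plus linear spin coupling."""
    N = J.shape[0]
    x = 0.01 * rng.standard_normal(N)        # analog spin amplitudes
    traj = np.empty((steps, N))
    for t in range(steps):
        drift = (alpha - 1.0) * x - x**3 + beta * (J @ x)
        # injected Gaussian white noise with variance delta**2
        x = x + dt * drift + delta * np.sqrt(dt) * rng.standard_normal(N)
        traj[t] = x
    return traj
```

Without noise (δ = 0) and without coupling, the amplitudes settle onto the bistable fixed points at ±√(α − 1), matching the bifurcation diagram; with δ > 0 the spins hop stochastically between the two branches, which is the sampling mechanism the paper exploits.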
Fig. 2. Experimental demonstration of noise-induced Boltzmann sampling with a time-multiplexed opto-electronic Ising machine.
Time evolution (a, c) and sampled distribution function (b, d) of the Ising energy for noise-induced sampling (a, b) and discontinuous sampling (c, d). In b, d, the energy distributions obtained with the Ising machine (IM) are compared to those obtained with the Metropolis–Hastings algorithm (MCMC). e Boltzmann distribution obtained from noise-induced sampling as a function of the noise variance δ² for the three degenerate energy levels of a 4-spin ring network (dots, squares, and triangles) at α = 1.2 and β = 0.5. The probabilities are compared to the analytical solutions (solid lines) obtained from equation (1) at different temperatures T. The overlap of the distributions is quantified by the Kullback–Leibler divergence DKL (dashed line). f Relation between temperature and noise variance for the problem in e at different coupling strengths β.
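The comparison in e can be reproduced in miniature: for a small network the Boltzmann distribution over energy levels can be enumerated exactly, and the Kullback–Leibler divergence then quantifies the overlap with any sampled distribution. A sketch, using the same 4-spin ring topology as in panel e (equation (1) itself is not reproduced in this excerpt):

```python
import numpy as np
from itertools import product

def boltzmann_exact(J, T):
    """Exact Boltzmann distribution over the energy levels of a small
    Ising network, obtained by enumerating all 2^N spin states."""
    N = J.shape[0]
    weights = {}
    for spins in product([-1.0, 1.0], repeat=N):
        s = np.array(spins)
        E = -0.5 * s @ J @ s
        weights[E] = weights.get(E, 0.0) + np.exp(-E / T)
    Z = sum(weights.values())               # partition function
    return {E: w / Z for E, w in weights.items()}

def kl_divergence(p, q):
    """D_KL(p || q) for two distributions given as {energy: probability}."""
    return sum(pi * np.log(pi / q[E]) for E, pi in p.items() if pi > 0)
```

For the 4-spin ring this enumeration yields exactly the three degenerate energy levels mentioned in the caption.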
Fig. 3. Experimental demonstration of unsupervised training of RBMs with a time-multiplexed opto-electronic Ising machine.
a Activation probability of single neurons as a function of the neuron bias for different temperatures. The probabilities have been obtained from continuous sampling of 100 independent Ising spins at different noise levels (squares, dots, and triangles) and are compared to the analytical solution at different temperatures (solid lines). b Activation probabilities for an RBM with 16 hidden and 16 visible neurons with random weights and biases. Probabilities for the analog Ising machine (IM, orange bars) have been obtained by continuous sampling at a fixed noise strength and are compared against probabilities obtained with the Metropolis–Hastings algorithm (MCMC, blue bars). c Comparison of the pseudolikelihood L and the prediction accuracy η for Ising-machine- and MCMC-based sampling during unsupervised training on a handwritten-digit recognition task.
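Unsupervised RBM training of this kind typically uses contrastive divergence, in which each weight update requires stochastic samples of the hidden and visible units; it is exactly this sampling step that the Ising machine can perform physically instead of in software. A standard CD-1 update as a sketch (the paper's precise training procedure is not detailed in this excerpt):

```python
import numpy as np

def cd1_update(W, bv, bh, v0, lr, rng):
    """One contrastive-divergence (CD-1) update of an RBM with weight
    matrix W, visible biases bv, and hidden biases bh."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    ph0 = sigmoid(v0 @ W + bh)                        # P(h=1 | v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hidden units
    pv1 = sigmoid(h0 @ W.T + bv)                      # reconstruction
    v1 = (rng.random(pv1.shape) < pv1).astype(float)  # sample visible units
    ph1 = sigmoid(v1 @ W + bh)
    # CD-1 approximation of the log-likelihood gradient
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    bv += lr * (v0 - v1)
    bh += lr * (ph0 - ph1)
    return W, bv, bh
```

The two `rng.random(...) < p` lines are the Gibbs sampling steps that an Ising machine replaces with noise-driven analog dynamics.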
Fig. 4. Simulation-based investigation of the scalability of noise-induced Boltzmann sampling.
a Kullback–Leibler divergence as a function of the problem size when sampling the energy distributions of different random sparse graphs (dots). The solid line shows the average. The insets (i) and (ii) show exemplary sampled distributions in comparison to the Metropolis–Hastings algorithm (MCMC) for N = 64 (i) and N = 8192 (ii) spins. b, c Average energy (top panel) and Kullback–Leibler divergence (lower panel) as a function of the temperature and the noise variance δ² for a sparse random graph with N = 64 (b) and N = 8192 (c). The average energy and Kullback–Leibler divergence are compared against MCMC-based sampling (blue lines). For DKL of the MCMC-based sampling, repeated sampling runs are compared against the reference distribution. The shaded regions show the standard deviation.
Fig. 5. Simulated sampling rate of spatially multiplexed analog Ising machines.
a Estimation of the sampling rate of a spatially multiplexed analog Ising machine at different analog bandwidths for randomly generated sparse Maxcut problems at a temperature of T = 2 (dots). The lines indicate the average over the different graphs. The sampling rate is estimated from the autocorrelation function of the Ising energy as the point at which samples become statistically independent. b Number of iterations z required to create statistically independent samples with the Metropolis–Hastings algorithm (MCMC) and with simulations of analog Ising machines (IM,sim) using the forward Euler method. Also shown is the average runtime t to obtain independent samples when executing the Metropolis–Hastings algorithm and the forward Euler integration on the same CPU.
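The sampling-rate estimate in a is derived from the autocorrelation of the Ising-energy time trace. A minimal sketch of such an estimate, using an e⁻¹ threshold as the independence criterion (the paper's exact criterion is an assumption here):

```python
import numpy as np

def decorrelation_time(energy, threshold=np.exp(-1.0)):
    """Number of steps after which energy samples become statistically
    independent, estimated as the first lag at which the normalized
    autocorrelation of the energy trace drops below the threshold."""
    e = energy - energy.mean()
    acf = np.correlate(e, e, mode="full")[e.size - 1:]  # lags 0..n-1
    acf /= acf[0]                                        # normalize to 1 at lag 0
    below = np.nonzero(acf < threshold)[0]
    return int(below[0]) if below.size else e.size
```

The sampling rate is then the analog bandwidth divided by this decorrelation time, so a faster-decaying autocorrelation directly translates into more independent samples per second.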
Fig. 6. Characterization of the time-multiplexed opto-electronic Ising machine.
a Measurement of the MZM output as a function of the input voltage. The dashed line shows the operating bias voltage of the Ising machine, and the gray region denotes the voltage range. b Measurement of the spin amplitude distribution as a function of the gain for 100 uncoupled spins (β = 0). The fixed points are compared against the numerical models in Eqs. (5) and (6).
Fig. 7. Scalability of sampling accuracy and speed for time-multiplexed analog Ising machines.
a Kullback–Leibler divergence as a function of the problem size when sampling the energy distributions of the different random sparse graphs in Fig. 4a (dots) using simulations of the time-discrete model in Eq. (9). The solid line shows the average. b Number of iterations z required to create statistically independent samples for the graphs in a with the Metropolis–Hastings algorithm (MCMC) and with simulations of analog Ising machines (IM,sim) using the time-discrete model. Also shown is the average runtime t to obtain independent samples when executing the Metropolis–Hastings algorithm and the time-discrete Ising machine model on the same CPU.
Fig. 8. Simulated Boltzmann sampling performance of different analog Ising machine models.
a Average energy at different temperatures for the 2D Ising model. Samples are obtained with the Metropolis–Hastings algorithm (MCMC, solid line) and with Ising machine simulations for different gain-dissipative systems (triangle: polynomial, square: clipped, circle: sigmoid). Insets: energy distributions for the four different systems at T = 1.8 (i) and T = 3.4 (ii). b Relation between the noise variance δ² and the temperature for the 2D Ising model for the different gain-dissipative systems. c Unsupervised training for the digit recognition task in Fig. 3 using simulations of different gain-dissipative systems.

