Sci Adv. 2019 Dec 20;5(12):eaay6946. doi: 10.1126/sciadv.aay6946. eCollection 2019 Dec.

Wave physics as an analog recurrent neural network


Tyler W Hughes et al., Sci Adv.

Abstract

Analog machine learning hardware platforms promise to be faster and more energy efficient than their digital counterparts. Wave physics, as found in acoustics and optics, is a natural candidate for building analog processors for time-varying signals. Here, we identify a mapping between the dynamics of wave physics and the computation in recurrent neural networks. This mapping indicates that physical wave systems can be trained to learn complex features in temporal data, using standard training techniques for neural networks. As a demonstration, we show that an inverse-designed inhomogeneous medium can perform vowel classification on raw audio signals as their waveforms scatter and propagate through it, achieving performance comparable to a standard digital implementation of a recurrent neural network. These findings pave the way for a new class of analog machine learning platforms, capable of fast and efficient processing of information in its native domain.
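The mapping described in the abstract rests on the fact that a vanilla RNN carries a hidden state forward in time, just as a wave field is carried forward between time steps. A minimal sketch of the standard RNN update (all matrix names and sizes here are illustrative, not taken from the paper):

```python
import numpy as np

def rnn_step(h, x, Wh, Wx, Wy):
    """One step of a vanilla RNN: the hidden state h is carried between
    time steps, analogous to how the wave field is carried between time
    steps of the discretized wave equation."""
    h_new = np.tanh(Wh @ h + Wx @ x)   # hidden-state update, sigma_h = tanh
    y = np.abs(Wy @ h_new) ** 2        # power-like output nonlinearity
    return h_new, y

rng = np.random.default_rng(0)
Nh, Nx, Ny = 8, 3, 2                   # hidden, input, output sizes (arbitrary)
Wh = rng.normal(size=(Nh, Nh))
Wx = rng.normal(size=(Nh, Nx))
Wy = rng.normal(size=(Ny, Nh))

h = np.zeros(Nh)
for t in range(5):                     # run over a short random input sequence
    h, y = rnn_step(h, rng.normal(size=Nx), Wh, Wx, Wy)
```

In the physical analog, the dense matrices are replaced by the (sparse, local) operators of the wave equation, and the trainable parameters become the material distribution.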


Figures

Fig. 1
Fig. 1. Conceptual comparison of a standard RNN and a wave-based physical system.
(A) Diagram of an RNN cell operating on a discrete input sequence and producing a discrete output sequence. (B) Internal components of the RNN cell, consisting of trainable dense matrices W(h), W(x), and W(y). Activation functions for the hidden state and output are represented by σ(h) and σ(y), respectively. (C) Diagram of the directed graph of the RNN cell. (D) Diagram of a recurrent representation of a continuous physical system operating on a continuous input sequence and producing a continuous output sequence. (E) Internal components of the recurrence relation for the wave equation when discretized using finite differences. (F) Diagram of the directed graph of discrete time steps of the continuous physical system and illustration of how a wave disturbance propagates within the domain.
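The recurrence relation in (E) can be sketched in one dimension with a standard leapfrog finite-difference update, where the pair of fields at the current and previous time steps plays the role of the RNN hidden state. Grid size, CFL number, boundary treatment, and source placement below are illustrative choices, not the paper's:

```python
import numpy as np

# 1D scalar wave equation discretized with finite differences (leapfrog).
# The state carried between steps is the pair (u_prev, u).
N, steps = 100, 200
c2 = 0.25                    # (c * dt / dx)**2, kept below 1 for stability
u_prev = np.zeros(N)
u = np.zeros(N)

for t in range(steps):
    lap = np.roll(u, 1) - 2 * u + np.roll(u, -1)   # discrete Laplacian (periodic)
    u_next = 2 * u - u_prev + c2 * lap             # leapfrog time update
    u_next[N // 2] += np.sin(0.2 * t)              # injected source term
    u_prev, u = u, u_next                          # advance the hidden state
```

Making the wave speed spatially varying (c2 an array rather than a scalar) is what turns this fixed recurrence into a trainable one.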
Fig. 2
Fig. 2. Schematic of the vowel recognition setup and the training procedure.
(A) Raw audio waveforms of spoken vowel samples from three classes. (B) Layout of the vowel recognition system. Vowel samples are independently injected at the source, located at the left of the domain, and propagate through the center region, indicated in green, where a material distribution is optimized during training. The dark gray region represents an absorbing boundary layer. (C) For classification, the time-integrated power at each probe is measured and normalized to be interpreted as a probability distribution over the vowel classes. (D) Using automatic differentiation, the gradient of the loss function with respect to the density of material in the green region is computed. The material density is updated iteratively, using gradient-based stochastic optimization techniques until convergence.
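The classification rule in (C) — time-integrated power at each probe, normalized into a probability distribution — can be written in a few lines. The array shapes and names here are assumptions for illustration:

```python
import numpy as np

def classify(probe_fields):
    """probe_fields: array of shape (n_probes, n_timesteps) holding the
    field measured at each output probe. Returns a probability
    distribution over vowel classes, one class per probe."""
    power = np.sum(np.abs(probe_fields) ** 2, axis=1)  # integrate |u|^2 over time
    return power / power.sum()                          # normalize to probabilities

rng = np.random.default_rng(1)
p = classify(rng.normal(size=(3, 1000)))   # 3 probes = 3 vowel classes
```

Because this readout is differentiable in the simulated fields, the gradient of a cross-entropy loss with respect to the material density in the training region can be obtained by automatic differentiation, as described in (D).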
Fig. 3
Fig. 3. Vowel recognition training results.
Confusion matrix over the training and testing datasets for the initial structure (A and B) and final structure (C and D), indicating the percentage of correctly (diagonal) and incorrectly (off-diagonal) predicted vowels. Cross-validated training results showing the mean (solid line) and SD (shaded region) of the (E) cross-entropy loss and (F) prediction accuracy over 30 training epochs and five folds of the dataset, which consists of 279 total vowel samples of male and female speakers. (G to I) The time-integrated intensity distribution for a randomly selected input (G) ae vowel, (H) ei vowel, and (I) iy vowel.
Fig. 4
Fig. 4. Frequency content of the vowel classes.
The plotted quantity is the mean energy spectrum for the ae, ei, and iy vowel classes. a.u., arbitrary units.
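A mean energy spectrum of the kind plotted in Fig. 4 can be computed by averaging the squared FFT magnitude over the samples of each class. The sampling rate and signal length below are placeholders, not values from the dataset:

```python
import numpy as np

def mean_energy_spectrum(signals, fs):
    """Average |FFT|^2 over a batch of equal-length audio signals.
    signals: array of shape (n_samples, n_timesteps).
    Returns the one-sided frequency axis (Hz) and mean spectrum (a.u.)."""
    spectra = np.abs(np.fft.rfft(signals, axis=1)) ** 2   # energy per frequency bin
    freqs = np.fft.rfftfreq(signals.shape[1], d=1.0 / fs)
    return freqs, spectra.mean(axis=0)                    # average over the class

rng = np.random.default_rng(2)
freqs, spec = mean_energy_spectrum(rng.normal(size=(10, 1024)), fs=16000.0)
```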

