Nat Commun. 2022 Oct 21;13(1):6284. doi: 10.1038/s41467-022-33629-7.

Experimentally validated memristive memory augmented neural network with efficient hashing and similarity search

Ruibin Mao et al. Nat Commun. 2022.

Abstract

Lifelong on-device learning is a key challenge for machine intelligence, and it requires learning from few, often single, samples. Memory-augmented neural networks have been proposed to achieve this goal, but their memory module must reside in off-chip memory, severely limiting their practical use. In this work, we experimentally validate that all the different structures in a memory-augmented neural network can be implemented on a fully integrated memristive crossbar platform with an accuracy that closely matches digital hardware. The demonstration is supported by new crossbar functions, including a crossbar-based content-addressable memory and locality sensitive hashing that exploits the intrinsic stochasticity of memristor devices. Simulations show that such an implementation can be efficiently scaled up for one-shot learning on more complex tasks. This demonstration paves the way for practical on-device lifelong learning and opens possibilities for novel attention-based algorithms that were not feasible in conventional hardware.


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Memory-augmented neural networks in crossbar arrays.
a Schematic of a crossbar-based MANN architecture. The expensive data transfer of a von Neumann architecture can be alleviated by performing analog matrix multiplication, locality sensitive hashing, and nearest-neighbor search directly in the memristor crossbars where the data are stored. b Optical image of a 64 × 64 crossbar array in a fully integrated memristor chip. c Top view of four 50 nm × 50 nm integrated cross-point memristors. d Cross-section of the memristor chip, with complementary metal-oxide-semiconductor (CMOS) circuits at the bottom, interconnects in the middle, and metal vias on the surface for memristor integration through back-end processes. Animal figures in (a) are taken from www.flaticon.com.
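A crossbar computes a matrix-vector product in a single analog step: each row voltage multiplies the conductances along that row, and the currents summing on each column give the dot products (Ohm's and Kirchhoff's laws). A minimal sketch of that behavior; the conductance and voltage ranges below are illustrative assumptions, not measured values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 64 x 64 array (the size shown in Fig. 1b).
G = rng.uniform(0.0, 150e-6, size=(64, 64))   # device conductances (S)
v = rng.uniform(-0.2, 0.2, size=64)           # row (input) voltages (V)

# Each column current is sum_i G[i, j] * v[i]: an analog dot product,
# so the whole matrix-vector multiplication happens in one step.
i_out = G.T @ v
print(i_out.shape)  # (64,) column currents
```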
Fig. 2. Robust ternary locality sensitive hashing in analog memristive crossbars.
a Illustration of the locality sensitive hashing (LSH) and ternary locality sensitive hashing (TLSH) concepts. b LSH or TLSH implemented in memristor crossbars. Each adjacent column pair represents one hashing plane, so a crossbar with N + 1 columns can generate N hashing bits with this method. Greyscale shades on the memristor symbols represent random conductance states. c A random memristor conductance distribution in a 64 × 129 crossbar after applying five RESET pulses to each device. The intrinsic stochasticity of memristor devices results in a lognormal-like distribution near 0 μS. d The distribution of the conductance difference between devices in adjacent columns. The differential conductance distribution is random with zero mean, matching the requirements of our hashing scheme. e The conductance difference map of size 64 × 128 (spanning three crossbar arrays, each of size 64 × 64). f The correlation between cosine distance and Hamming distance for different hashing representations shows that the Hamming distance generated by both hardware and software closely approximates the cosine distance. IQR interquartile range. g The linear correlation coefficient between Hamming distance and cosine distance increases with the total number of hashing bits. The hardware TLSH approach shows a higher correlation coefficient than the hardware LSH approach because of its reduced number of unstable bits, as detailed in Supplementary Fig. 5. CD cosine distance, HD Hamming distance.
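In other words, each adjacent column pair stores one random hyperplane (the zero-mean conductance difference of panel d), a hashing bit is the sign of the input's projection onto it, and TLSH adds a third "don't care" state for projections too close to zero, suppressing the unstable bits. A sketch under assumed parameters (the lognormal scale and the ternary threshold theta are our illustrative choices, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(1)

def tlsh(x, G, theta=5e-6):
    """Ternary LSH sketch. G has N + 1 columns; adjacent-column
    differences form N zero-mean random hyperplanes (Fig. 2b, d).
    theta is an assumed threshold, not the paper's value."""
    planes = np.diff(G, axis=1)          # (D, N) differential hyperplanes
    proj = x @ planes                    # analog projections
    bits = np.sign(proj)                 # +1 / -1 hashing bits
    bits[np.abs(proj) < theta] = 0       # ternary "don't care" bits
    return bits

# Random conductances with a lognormal-like spread near 0 uS (panel c);
# the distribution parameters below are illustrative assumptions.
D, N = 64, 128
G = rng.lognormal(mean=np.log(2e-6), sigma=1.0, size=(D, N + 1))
x = rng.standard_normal(D)               # input feature vector
print(tlsh(x, G)[:16])                   # first 16 ternary hash bits
```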
Fig. 3. TCAM implemented in a crossbar array, capable of Hamming distance calculation.
a Illustration of the basic principle of using a dot product to distinguish "match" and "mismatch" cases. b Schematic of Hamming distance calculation in a crossbar. The figure shows three 3-dimensional ternary key vectors stored in a 3 × 6 crossbar with differential encoding. Differential voltages representing the ternary bits of the search vector are applied to the source lines, and the output current on each bit line represents the ternary Hamming distance (THD) between the search vector and the key stored in memory. c The readout conductance map after eight binary vectors were experimentally stored in the crossbar as memory. In the experiment, we set Gon to 150 μS and Vsearch to 0.2 V. d Distribution of Gon and Goff. e Output current shows a linear relation with the Hamming distance, measuring the degree of mismatch. IQR interquartile range. f Current distributions are well separated from each other, so the number of mismatched bits (i.e., the Hamming distance) can be read out directly.
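The differential encoding makes each mismatched bit contribute roughly Gon·Vsearch of current, while a matched or wildcard bit contributes only a small Goff leakage, so the bit-line current scales linearly with the Hamming distance. An idealized sketch using the caption's Gon = 150 μS and Vsearch = 0.2 V; Goff and the exact encoding convention are our assumptions for illustration:

```python
import numpy as np

G_ON, G_OFF = 150e-6, 1e-6   # siemens; Gon from the caption, Goff assumed
V_SEARCH = 0.2               # volts, as given in the caption

def encode_key(key):
    """Differential conductance encoding of one ternary key: bit 1 ->
    (Gon, Goff), bit 0 -> (Goff, Gon), wildcard (None) -> (Goff, Goff),
    which contributes almost no current regardless of the search bit."""
    pairs = []
    for k in key:
        if k is None:
            pairs += [G_OFF, G_OFF]
        else:
            pairs += [G_ON, G_OFF] if k == 1 else [G_OFF, G_ON]
    return np.array(pairs)

def encode_search(query):
    """Complementary search voltages: bit 1 -> (0, V), bit 0 -> (V, 0),
    wildcard -> (0, 0). A mismatched bit then drives ~Gon*V of current,
    while a matched bit leaks only ~Goff*V."""
    v = []
    for s in query:
        if s is None:
            v += [0.0, 0.0]
        else:
            v += [0.0, V_SEARCH] if s == 1 else [V_SEARCH, 0.0]
    return np.array(v)

# Three ternary 3-bit keys in a 3 x 6 array, as in Fig. 3b.
keys = np.stack([encode_key(k) for k in ([1, 0, 1], [0, 0, 1], [1, None, 0])])
i_bl = keys @ encode_search([1, 1, 1])       # one bit-line current per key
print(np.round(i_bl / (G_ON * V_SEARCH)))    # ~Hamming distances: [1. 2. 1.]
```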
Fig. 4. Experimental demonstration of few-shot learning with memristive crossbar arrays.
a Schematic of the CNN structure implemented in the memristor crossbar array. The conductance map shows the weight mapping of the CNN kernels, with dimensions given in the order output channel (O), input channel (I), height (H), and width (W). The conductance maps for the complete CNN kernels are shown in Supplementary Fig. 10. b Linear relationship between the sensing current from the crossbar-based TCAM and the number of mismatched bits during search operations. c Classification accuracy with cosine similarity, software-based LSH with 128 bits, and end-to-end experimental results on crossbar arrays. Five experimental data points are provided for each task. Software LSH shows run-to-run variation due to different random initializations of the hashing planes. d, e Simulated classification accuracy of the 5-way 1-shot problem (d) and the 25-way 1-shot problem (e) as a function of device fluctuations in the memristor model, for both TLSH and LSH. Fluctuations from nearly zero to 1 μS are shown; the actual experimental fluctuation level is indicated with an arrow.
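The inference step behind panels b and c can be pictured as: hash each class's support feature vector into a stored key, hash the query the same way, and return the class whose key has the smallest Hamming distance, which the TCAM reads out as the smallest bit-line current. An idealized sketch with binary LSH, random Gaussian hyperplanes, and synthetic features, not the experimental pipeline:

```python
import numpy as np

rng = np.random.default_rng(2)

def classify_one_shot(support, query, planes):
    """5-way 1-shot sketch: hash feature vectors with random hyperplanes,
    then pick the support class at the smallest Hamming distance (the
    smallest TCAM bit-line current in Fig. 4b)."""
    keys = np.sign(support @ planes)          # one hashed key per class
    q = np.sign(query @ planes)               # hashed query
    hd = np.count_nonzero(keys != q, axis=1)  # Hamming distances
    return int(np.argmin(hd))

D, BITS = 64, 128                        # 128 hashing bits, as in panel c
planes = rng.standard_normal((D, BITS))
support = rng.standard_normal((5, D))    # one feature vector per class
query = support[3] + 0.1 * rng.standard_normal(D)  # noisy view of class 3
print(classify_one_shot(support, query, planes))   # expect 3
```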
Fig. 5. Experimentally validated simulation results on the Mini-ImageNet dataset.
a The TLSH and TCAM architecture for the scaled-up MANN. The matrices for both hashing and the external memory are partitioned into H × W memristor crossbar tiles, which mitigates the voltage-drop problem in large crossbars and increases the utilization rate. b Accuracy of our experimentally validated models on the Mini-ImageNet dataset. Error bars show the 95% confidence interval over 100 repeated inference experiments. c The execution time of search operations per inference on a GPU increases drastically once the external memory size reaches a threshold, confirming that the operation is memory intensive. d Comparison of search latency and energy consumption for 5-way 1-shot learning on the Omniglot and Mini-ImageNet datasets. On the GPU, the models for both datasets store the same number of entries (8192), but Mini-ImageNet requires a larger memory capacity because of its higher feature-vector dimension (512 vs. 64), leading to even larger improvements in latency and energy efficiency. The numbers of hashing bits used in the crossbar arrays are 128 and 4096 for Omniglot and Mini-ImageNet, respectively.
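The tiling in panel a amounts to a block-partitioned matrix-vector product: each tile computes a partial product over its slice of rows and columns, and the partial results are accumulated digitally, so no single array has to carry the voltage (IR) drop of the full matrix. A sketch of this decomposition, with an assumed 64 × 64 tile size:

```python
import numpy as np

def tiled_mvm(G, v, tile=64):
    """Block-partitioned crossbar MVM (our illustration of Fig. 5a's
    H x W tiling; the tile size is an assumption). Each tile computes a
    partial product, and partial currents along a tile row are summed."""
    R, C = G.shape
    out = np.zeros(C)
    for r in range(0, R, tile):
        for c in range(0, C, tile):
            out[c:c + tile] += G[r:r + tile, c:c + tile].T @ v[r:r + tile]
    return out

rng = np.random.default_rng(3)
G = rng.uniform(0, 150e-6, (512, 4096))  # e.g., a 512-d to 4096-bit hashing matrix
v = rng.uniform(-0.2, 0.2, 512)
assert np.allclose(tiled_mvm(G, v), G.T @ v)  # matches the untiled product
```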

