The unreasonable effectiveness of deep learning in artificial intelligence

Terrence J Sejnowski

Proc Natl Acad Sci U S A. 2020 Dec 1;117(48):30033-30038. doi: 10.1073/pnas.1907373117. Epub 2020 Jan 28.

Abstract

Deep learning networks have been trained to recognize speech, caption photographs, and translate text between languages at high levels of performance. Although applications of deep learning networks to real-world problems have become ubiquitous, our understanding of why they are so effective is lacking. These empirical results should not be possible according to sample complexity in statistics and nonconvex optimization theory. However, paradoxes in the training and effectiveness of deep learning networks are being investigated and insights are being found in the geometry of high-dimensional spaces. A mathematical theory of deep learning would illuminate how they function, allow us to assess the strengths and weaknesses of different network architectures, and lead to major improvements. Deep learning has provided natural ways for humans to communicate with digital devices and is foundational for building artificial general intelligence. Deep learning was inspired by the architecture of the cerebral cortex and insights into autonomy and general intelligence may be found in other brain regions that are essential for planning and survival, but major breakthroughs will be needed to achieve these goals.

Keywords: artificial intelligence; deep learning; neural networks.

Conflict of interest statement

The author declares no competing interest.

Figures

Fig. 1.
Cover of the 1884 edition of Flatland: A Romance of Many Dimensions by Edwin A. Abbott (1). Inhabitants were 2D shapes, with their rank in society determined by the number of sides.
Fig. 2.
The Neural Information Processing Systems conference brought together researchers from many fields of science and engineering. The first meeting was held at the Denver Tech Center in 1987, and the conference has been held annually since then. The first few meetings were sponsored by the IEEE Information Theory Society.
Fig. 3.
Early perceptrons were large-scale analog systems (3). (Left) An analog perceptron computer receiving a visual input. The racks contained potentiometers driven by motors whose resistance was controlled by the perceptron learning algorithm. (Right) Article in the New York Times, July 8, 1958, from a UPI wire report. The perceptron machine was expected to cost $100,000 on completion in 1959 (around $1 million in today’s dollars); the IBM 704 computer, which cost $2 million in 1958 ($20 million in today’s dollars), could perform 12,000 multiplies per second, which was blazingly fast at the time. The much less expensive Samsung Galaxy S6 phone, which can perform 34 billion operations per second, is more than a million times faster. Reprinted from ref. .
Fig. 4.
Nature has optimized birds for energy efficiency. (A) The curved feathers at the wingtips of an eagle boost energy efficiency during gliding. (B) Winglets on commercial jets save fuel by reducing drag from vortices.
Fig. 5.
Levels of investigation of brains. Energy efficiency is achieved by signaling with small numbers of molecules at synapses. Interconnects between neurons in the brain are 3D. Connectivity is high locally but relatively sparse between distant cortical areas. The organizing principle in the cortex is based on multiple maps of sensory and motor surfaces in a hierarchy. The cortex coordinates with many subcortical areas to form the central nervous system (CNS) that generates behavior.
Fig. 6.
The caption that accompanies the engraving in Flammarion’s book reads: “A missionary of the Middle Ages tells that he had found the point where the sky and the Earth touch ….” Image courtesy of Wikimedia Commons/Camille Flammarion.

References

    1. Abbott E. A., Flatland: A Romance of Many Dimensions (Seeley & Co., London, 1884).
    2. Breiman L., Statistical modeling: The two cultures. Stat. Sci. 16, 199–231 (2001).
    3. Chomsky N., Knowledge of Language: Its Nature, Origins, and Use (Convergence, Praeger, Westport, CT, 1986).
    4. Sejnowski T. J., The Deep Learning Revolution: Artificial Intelligence Meets Human Intelligence (MIT Press, Cambridge, MA, 2018).
    5. Rosenblatt F., Perceptrons and the Theory of Brain Mechanisms (Cornell Aeronautical Lab Inc., Buffalo, NY, 1961), vol. VG-1196-G, p. 621.
