The unreasonable effectiveness of deep learning in artificial intelligence

Terrence J Sejnowski

Proc Natl Acad Sci U S A. 2020 Dec 1;117(48):30033-30038. doi: 10.1073/pnas.1907373117. Epub 2020 Jan 28.

Abstract

Deep learning networks have been trained to recognize speech, caption photographs, and translate text between languages at high levels of performance. Although applications of deep learning networks to real-world problems have become ubiquitous, our understanding of why they are so effective is lacking. These empirical results should not be possible according to sample complexity in statistics and nonconvex optimization theory. However, paradoxes in the training and effectiveness of deep learning networks are being investigated and insights are being found in the geometry of high-dimensional spaces. A mathematical theory of deep learning would illuminate how they function, allow us to assess the strengths and weaknesses of different network architectures, and lead to major improvements. Deep learning has provided natural ways for humans to communicate with digital devices and is foundational for building artificial general intelligence. Deep learning was inspired by the architecture of the cerebral cortex and insights into autonomy and general intelligence may be found in other brain regions that are essential for planning and survival, but major breakthroughs will be needed to achieve these goals.

Keywords: artificial intelligence; deep learning; neural networks.

Conflict of interest statement

The author declares no competing interest.

Figures

Fig. 1.
Cover of the 1884 edition of Flatland: A Romance of Many Dimensions by Edwin A. Abbott (1). Inhabitants were 2D shapes, with their rank in society determined by the number of sides.
Fig. 2.
The Neural Information Processing Systems conference brought together researchers from many fields of science and engineering. The first meeting was held at the Denver Tech Center in 1987, and the conference has been held annually since then. The first few meetings were sponsored by the IEEE Information Theory Society.
Fig. 3.
Early perceptrons were large-scale analog systems (3). (Left) An analog perceptron computer receiving a visual input. The racks contained potentiometers driven by motors whose resistance was controlled by the perceptron learning algorithm. (Right) Article in the New York Times, July 8, 1958, from a UPI wire report. The perceptron machine was expected to cost $100,000 on completion in 1959 (around $1 million in today’s dollars); the IBM 704 computer, which cost $2 million in 1958 ($20 million in today’s dollars), could perform 12,000 multiplies per second, which was blazingly fast at the time. The much less expensive Samsung Galaxy S6 phone, which can perform 34 billion operations per second, is more than a million times faster. Reprinted from ref. .
Fig. 4.
Nature has optimized birds for energy efficiency. (A) The curved feathers at the wingtips of an eagle boost energy efficiency during gliding. (B) Winglets on commercial jets save fuel by reducing drag from vortices.
Fig. 5.
Levels of investigation of brains. Energy efficiency is achieved by signaling with small numbers of molecules at synapses. Interconnects between neurons in the brain are 3D. Connectivity is high locally but relatively sparse between distant cortical areas. The organizing principle in the cortex is based on multiple maps of sensory and motor surfaces in a hierarchy. The cortex coordinates with many subcortical areas to form the central nervous system (CNS) that generates behavior.
Fig. 6.
The caption that accompanies the engraving in Flammarion’s book reads: “A missionary of the Middle Ages tells that he had found the point where the sky and the Earth touch ….” Image courtesy of Wikimedia Commons/Camille Flammarion.

References

    1. Abbott E. A., Flatland: A Romance of Many Dimensions (Seeley & Co., London, 1884).
    2. Breiman L., Statistical modeling: The two cultures. Stat. Sci. 16, 199–231 (2001).
    3. Chomsky N., Knowledge of Language: Its Nature, Origins, and Use (Convergence, Praeger, Westport, CT, 1986).
    4. Sejnowski T. J., The Deep Learning Revolution: Artificial Intelligence Meets Human Intelligence (MIT Press, Cambridge, MA, 2018).
    5. Rosenblatt F., Perceptrons and the Theory of Brain Mechanisms (Cornell Aeronautical Lab Inc., Buffalo, NY, 1961), vol. VG-1196-G, p. 621.
