Review. 2018 Feb 1;2018:7068349. doi: 10.1155/2018/7068349. eCollection 2018.

Deep Learning for Computer Vision: A Brief Review

Athanasios Voulodimos et al. Comput Intell Neurosci. 2018.

Abstract

Over recent years, deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent. This review provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, namely Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, future directions in designing deep learning schemes for computer vision problems, and the challenges involved therein, are briefly outlined.


Figures

Figure 1
Example architecture of a CNN for a computer vision task (object detection).
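The core operation behind such an architecture can be illustrated in a few lines. Below is a minimal sketch, not taken from the paper, of a single convolutional layer's computation (a valid 2D cross-correlation followed by a ReLU nonlinearity) in plain Python; a full CNN stacks such layers with pooling before a classifier or detector head. The vertical-edge kernel and the toy image are illustrative choices.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation of a 2D image with a 2D kernel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = 0.0
            for di in range(kh):
                for dj in range(kw):
                    s += image[i + di][j + dj] * kernel[di][dj]
            row.append(s)
        out.append(row)
    return out

def relu(fmap):
    """Elementwise ReLU nonlinearity applied to a feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]

# A vertical-edge kernel applied to a tiny image whose right half is bright.
image = [[0, 0, 1, 1]] * 4
kernel = [[-1, 1], [-1, 1]]  # responds to left-to-right intensity rises
feature_map = relu(conv2d(image, kernel))  # peaks at the column of the edge
```

Running this produces a 3×3 feature map whose middle column is activated, i.e. the kernel has localized the vertical edge; in a trained CNN, such kernels are learned from data rather than hand-designed.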
Figure 2
Deep Belief Network (DBN) and Deep Boltzmann Machine (DBM). The top two layers of a DBN form an undirected graph and the remaining layers form a belief network with directed, top-down connections. In a DBM, all connections are undirected.
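Both DBNs and DBMs are built by stacking Restricted Boltzmann Machines, whose undirected weights are shared between the up-pass and the down-pass. The following sketch, written for illustration and not drawn from the paper (the tiny weight matrix is arbitrary), shows one Gibbs-sampling step in a binary RBM: sampling hidden units given visible units, then re-sampling the visible units with the same weights transposed.

```python
import math
import random

random.seed(1)  # fixed seed so the stochastic sampling is reproducible

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_hidden(v, W, b_h):
    """Sample h ~ P(h_j = 1 | v) = sigmoid(sum_i W[i][j] * v_i + b_h[j])."""
    probs = [sigmoid(sum(W[i][j] * v[i] for i in range(len(v))) + b_h[j])
             for j in range(len(b_h))]
    return [1 if random.random() < p else 0 for p in probs]

def sample_visible(h, W, b_v):
    """Sample v ~ P(v_i = 1 | h); the undirected weights are reused, transposed."""
    probs = [sigmoid(sum(W[i][j] * h[j] for j in range(len(h))) + b_v[i])
             for i in range(len(b_v))]
    return [1 if random.random() < p else 0 for p in probs]

# Tiny RBM: 3 visible units, 2 hidden units, arbitrary untrained weights.
W = [[0.5, -0.4], [0.3, 0.9], [-0.7, 0.2]]
b_v = [0.0, 0.0, 0.0]
b_h = [0.0, 0.0]

v0 = [1, 0, 1]
h0 = sample_hidden(v0, W, b_h)   # up-pass
v1 = sample_visible(h0, W, b_v)  # down-pass (stochastic reconstruction)
```

Contrastive-divergence training alternates such up/down passes to estimate the weight gradient; a DBN then stacks trained RBMs greedily, layer by layer.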
Figure 3
Denoising autoencoder [56].
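The denoising autoencoder's pipeline (corrupt the input, encode, decode, then score reconstruction of the clean input) can be sketched structurally as follows. This is an illustrative, untrained 4-2-4 example with arbitrary weights, not the paper's implementation; training would minimize the reconstruction loss over many examples.

```python
import math
import random

random.seed(0)  # fixed seed so the masking noise is reproducible

def corrupt(x, p=0.3):
    """Masking noise: zero each input component with probability p."""
    return [0.0 if random.random() < p else v for v in x]

def layer(x, W, b):
    """Affine map followed by a sigmoid nonlinearity."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    return [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def reconstruction_loss(x, x_hat):
    """Squared error between the *clean* input and the reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

# Tiny 4-2-4 denoising autoencoder with arbitrary (untrained) weights.
W_enc = [[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, 0.2, -0.1]]
b_enc = [0.0, 0.0]
W_dec = [[0.5, -0.3], [-0.2, 0.8], [0.1, 0.2], [0.4, -0.1]]
b_dec = [0.0, 0.0, 0.0, 0.0]

x = [1.0, 0.0, 1.0, 0.0]            # clean input
x_tilde = corrupt(x)                 # corrupted version fed to the encoder
h = layer(x_tilde, W_enc, b_enc)     # hidden code
x_hat = layer(h, W_dec, b_dec)       # reconstruction
loss = reconstruction_loss(x, x_hat)
```

The key point Figure 3 conveys is that the loss compares the reconstruction against the clean input x, not the corrupted x_tilde, which forces the code h to capture structure that is robust to the corruption.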
Figure 4
Object detection results comparison from [66]. (a) Ground truth; (b) bounding boxes obtained with [32]; (c) bounding boxes obtained with [66].

References

    1. McCulloch W. S., Pitts W. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biology. 1943;5(4):115–133. doi: 10.1007/BF02478259.
    2. LeCun Y., Boser B., Denker J., et al. Handwritten digit recognition with a back-propagation network. In: Touretzky D., editor. Advances in Neural Information Processing Systems 2 (NIPS'89). Denver, CO, USA; 1990.
    3. Hochreiter S., Schmidhuber J. Long short-term memory. Neural Computation. 1997;9(8):1735–1780. doi: 10.1162/neco.1997.9.8.1735.
    4. Hinton G. E., Osindero S., Teh Y.-W. A fast learning algorithm for deep belief nets. Neural Computation. 2006;18(7):1527–1554. doi: 10.1162/neco.2006.18.7.1527.
    5. TensorFlow. Available online: https://www.tensorflow.org.