Review. 2018 Feb 1;2018:7068349. doi: 10.1155/2018/7068349. eCollection 2018.

Deep Learning for Computer Vision: A Brief Review

Athanasios Voulodimos et al. Comput Intell Neurosci. 2018.

Abstract

Over recent years, deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent. This review provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, namely Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, future directions in designing deep learning schemes for computer vision problems, and the challenges involved therein, are briefly outlined.


Figures

Figure 1
Example architecture of a CNN for a computer vision task (object detection).
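The core operation behind such an architecture can be illustrated in a few lines. Below is a minimal sketch, not taken from the paper, of a single convolutional layer's computation (a valid 2D cross-correlation followed by a ReLU nonlinearity) in plain Python; a full CNN stacks such layers with pooling before a classifier or detector head. The vertical-edge kernel and the toy image are illustrative choices.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation of a 2D image with a 2D kernel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = 0.0
            for di in range(kh):
                for dj in range(kw):
                    s += image[i + di][j + dj] * kernel[di][dj]
            row.append(s)
        out.append(row)
    return out

def relu(fmap):
    """Elementwise ReLU nonlinearity applied to a feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]

# A vertical-edge kernel applied to a tiny image whose right half is bright.
image = [[0, 0, 1, 1]] * 4
kernel = [[-1, 1], [-1, 1]]  # responds to left-to-right intensity rises
feature_map = relu(conv2d(image, kernel))  # peaks at the column of the edge
```

Running this produces a 3×3 feature map whose middle column is activated, i.e. the kernel has localized the vertical edge; in a trained CNN, such kernels are learned from data rather than hand-designed.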
Figure 2
Deep Belief Network (DBN) and Deep Boltzmann Machine (DBM). The top two layers of a DBN form an undirected graph and the remaining layers form a belief network with directed, top-down connections. In a DBM, all connections are undirected.
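Both DBNs and DBMs are built by stacking Restricted Boltzmann Machines, whose undirected weights are shared between the up-pass and the down-pass. The following sketch, written for illustration and not drawn from the paper (the tiny weight matrix is arbitrary), shows one Gibbs-sampling step in a binary RBM: sampling hidden units given visible units, then re-sampling the visible units with the same weights transposed.

```python
import math
import random

random.seed(1)  # fixed seed so the stochastic sampling is reproducible

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_hidden(v, W, b_h):
    """Sample h ~ P(h_j = 1 | v) = sigmoid(sum_i W[i][j] * v_i + b_h[j])."""
    probs = [sigmoid(sum(W[i][j] * v[i] for i in range(len(v))) + b_h[j])
             for j in range(len(b_h))]
    return [1 if random.random() < p else 0 for p in probs]

def sample_visible(h, W, b_v):
    """Sample v ~ P(v_i = 1 | h); the undirected weights are reused, transposed."""
    probs = [sigmoid(sum(W[i][j] * h[j] for j in range(len(h))) + b_v[i])
             for i in range(len(b_v))]
    return [1 if random.random() < p else 0 for p in probs]

# Tiny RBM: 3 visible units, 2 hidden units, arbitrary untrained weights.
W = [[0.5, -0.4], [0.3, 0.9], [-0.7, 0.2]]
b_v = [0.0, 0.0, 0.0]
b_h = [0.0, 0.0]

v0 = [1, 0, 1]
h0 = sample_hidden(v0, W, b_h)   # up-pass
v1 = sample_visible(h0, W, b_v)  # down-pass (stochastic reconstruction)
```

Contrastive-divergence training alternates such up/down passes to estimate the weight gradient; a DBN then stacks trained RBMs greedily, layer by layer.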
Figure 3
Denoising autoencoder [56].
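The denoising autoencoder's pipeline (corrupt the input, encode, decode, then score reconstruction of the clean input) can be sketched structurally as follows. This is an illustrative, untrained 4-2-4 example with arbitrary weights, not the paper's implementation; training would minimize the reconstruction loss over many examples.

```python
import math
import random

random.seed(0)  # fixed seed so the masking noise is reproducible

def corrupt(x, p=0.3):
    """Masking noise: zero each input component with probability p."""
    return [0.0 if random.random() < p else v for v in x]

def layer(x, W, b):
    """Affine map followed by a sigmoid nonlinearity."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    return [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def reconstruction_loss(x, x_hat):
    """Squared error between the *clean* input and the reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

# Tiny 4-2-4 denoising autoencoder with arbitrary (untrained) weights.
W_enc = [[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, 0.2, -0.1]]
b_enc = [0.0, 0.0]
W_dec = [[0.5, -0.3], [-0.2, 0.8], [0.1, 0.2], [0.4, -0.1]]
b_dec = [0.0, 0.0, 0.0, 0.0]

x = [1.0, 0.0, 1.0, 0.0]            # clean input
x_tilde = corrupt(x)                 # corrupted version fed to the encoder
h = layer(x_tilde, W_enc, b_enc)     # hidden code
x_hat = layer(h, W_dec, b_dec)       # reconstruction
loss = reconstruction_loss(x, x_hat)
```

The key point Figure 3 conveys is that the loss compares the reconstruction against the clean input x, not the corrupted x_tilde, which forces the code h to capture structure that is robust to the corruption.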
Figure 4
Object detection results comparison from [66]. (a) Ground truth; (b) bounding boxes obtained with [32]; (c) bounding boxes obtained with [66].

References

    1. McCulloch W. S., Pitts W. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biology. 1943;5(4):115–133. doi: 10.1007/BF02478259.
    2. LeCun Y., Boser B., Denker J., et al. Handwritten digit recognition with a back-propagation network. In: Touretzky D., editor. Advances in Neural Information Processing Systems 2 (NIPS'89). Denver, CO, USA; 1990.
    3. Hochreiter S., Schmidhuber J. Long short-term memory. Neural Computation. 1997;9(8):1735–1780. doi: 10.1162/neco.1997.9.8.1735.
    4. Hinton G. E., Osindero S., Teh Y.-W. A fast learning algorithm for deep belief nets. Neural Computation. 2006;18(7):1527–1554. doi: 10.1162/neco.2006.18.7.1527.
    5. TensorFlow. Available online: https://www.tensorflow.org.