Learning optimized features for hierarchical models of invariant object recognition
- PMID: 12816566
- DOI: 10.1162/089976603321891800
Abstract
There is an ongoing debate over the capabilities of hierarchical neural feedforward architectures for real-world invariant object recognition. Although a variety of hierarchical models exist, appropriate supervised and unsupervised learning methods remain a topic of intense research. We propose a feedforward recognition model that shares components such as weight sharing, pooling stages, and competitive nonlinearities with earlier approaches but focuses on new methods for learning optimal feature-detecting cells in intermediate stages of the hierarchical network. We show that principles of sparse coding, which were previously applied mostly to the initial feature detection stages, can also be employed to obtain optimized intermediate complex features. We suggest a new approach to optimizing the learning of sparse features under the constraints of a weight-sharing, or convolutional, architecture that uses pooling operations to achieve gradual invariance in the feature hierarchy. The approach explicitly enforces symmetry constraints such as translation invariance on the feature set. This reduces the dimension of the search space of optimal features and allows the basis representatives that achieve a sparse decomposition of the input to be determined more efficiently. We analyze the quality of the learned feature representation by investigating the recognition performance of the resulting hierarchical network on object and face databases. We show that a hierarchy with features learned on a single object data set can also be applied to face recognition without parameter changes and is competitive with other recent machine learning recognition approaches. To investigate the interplay between sparse coding and processing nonlinearities, we also consider alternative feedforward pooling nonlinearities such as presynaptic maximum selection and sum-of-squares integration.
The comparison shows that a combination of strong competitive nonlinearities with sparse coding offers the best recognition performance in the difficult scenario of segmentation-free recognition in cluttered surround. We demonstrate that for both learning and recognition, a precise segmentation of the objects is not necessary.
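As a concrete illustration of the pooling stages compared above, here is a minimal NumPy sketch (not the authors' implementation; the image and kernel shapes, the rectifying nonlinearity, and all function names are illustrative assumptions) contrasting presynaptic maximum selection with sum-of-squares integration over a weight-sharing (convolutional) feature map:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid' 2-D correlation, standing in for one weight-sharing stage."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

def pool(responses, size=2, mode="max"):
    """Spatial pooling over non-overlapping size x size blocks.

    mode="max"   -> presynaptic maximum selection
    mode="sumsq" -> sum-of-squares (energy) integration
    """
    h, w = responses.shape
    h, w = h - h % size, w - w % size
    blocks = responses[:h, :w].reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "sumsq":
        return (blocks ** 2).sum(axis=(1, 3))
    raise ValueError(f"unknown pooling mode: {mode}")

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))
# Rectification stands in for a competitive nonlinearity between stages.
feature_map = np.maximum(conv2d_valid(image, kernel), 0.0)
pooled_max = pool(feature_map, size=2, mode="max")
pooled_energy = pool(feature_map, size=2, mode="sumsq")
print(pooled_max.shape, pooled_energy.shape)  # (3, 3) (3, 3)
```

Both pooling variants shrink the feature map and make the responses tolerant to small translations within each block; they differ in whether a single strong afferent dominates (max) or all afferents contribute quadratically (sum-of-squares).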
Similar articles
- Adaptive object recognition model using incremental feature representation and hierarchical classification. Neural Netw. 2012 Jan;25(1):130-40. doi: 10.1016/j.neunet.2011.06.020. PMID: 21783342
- Learning invariant object recognition in the visual system with continuous transformations. Biol Cybern. 2006 Feb;94(2):128-42. doi: 10.1007/s00422-005-0030-z. PMID: 16369795
- Combining reconstruction and discrimination with class-specific sparse coding. Neural Comput. 2007 Jul;19(7):1897-918. doi: 10.1162/neco.2007.19.7.1897. PMID: 17521283
- Invariant visual object recognition: a model, with lighting invariance. J Physiol Paris. 2006 Jul-Sep;100(1-3):43-62. doi: 10.1016/j.jphysparis.2006.09.004. PMID: 17071062
- Object recognition and segmentation by a fragment-based hierarchy. Trends Cogn Sci. 2007 Feb;11(2):58-64. doi: 10.1016/j.tics.2006.11.009. PMID: 17188555
Cited by
- Biased Competition in Visual Processing Hierarchies: A Learning Approach Using Multiple Cues. Cognit Comput. 2011 Mar;3(1):146-166. doi: 10.1007/s12559-010-9092-x. PMID: 21475682
- The global landscape of cognition: hierarchical aggregation as an organizational principle of human cortical networks and functions. Sci Rep. 2015 Dec 16;5:18112. doi: 10.1038/srep18112. PMID: 26669858
- Visual object categorization in birds and primates: integrating behavioral, neurobiological, and computational evidence within a "general process" framework. Cogn Affect Behav Neurosci. 2012 Mar;12(1):220-40. doi: 10.3758/s13415-011-0070-x. PMID: 22086545
- Modeling invariant object processing based on tight integration of simulated and empirical data in a Common Brain Space. Front Comput Neurosci. 2012 Mar 9;6:12. doi: 10.3389/fncom.2012.00012. PMID: 22408617
- A feedforward architecture accounts for rapid categorization. Proc Natl Acad Sci U S A. 2007 Apr 10;104(15):6424-9. doi: 10.1073/pnas.0700622104. PMID: 17404214