IEEE Trans Pattern Anal Mach Intell. 2011 Mar;33(3):631-8. doi: 10.1109/TPAMI.2010.173.

Kernel optimization in discriminant analysis

Di You et al. IEEE Trans Pattern Anal Mach Intell. 2011 Mar.

Abstract

Kernel mapping is one of the most widely used approaches for deriving nonlinear classifiers. The idea is to use a kernel function that maps the original, nonlinearly separable problem to a space of intrinsically larger dimensionality in which the classes are linearly separable. A major problem in the design of kernel methods is finding the kernel parameters that make the problem linear in the mapped representation. This paper derives the first criterion that specifically aims to find a kernel representation in which the Bayes classifier becomes linear. We illustrate how this result can be successfully applied to several kernel discriminant analysis algorithms. Experimental results on a large number of databases and classifiers demonstrate the utility of the proposed approach. The paper also shows (theoretically and experimentally) that a kernel version of Subclass Discriminant Analysis yields the highest recognition rates.
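As a rough, illustrative sketch of this setting (not the criterion derived in the paper): the snippet below maps a nonlinearly separable dataset through an approximate RBF kernel feature map and runs a linear discriminant on the mapped features, choosing the kernel parameter gamma by cross-validated accuracy as a simple stand-in for the paper's optimization. The dataset, the Nystroem approximation, and all parameter values are assumptions made for the example.

```python
# Sketch: kernel map + linear discriminant, with the kernel parameter chosen
# by cross-validation (a stand-in for the criterion derived in the paper).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.kernel_approximation import Nystroem
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_moons(n_samples=400, noise=0.15, random_state=0)  # nonlinearly separable data

pipe = Pipeline([
    ("kmap", Nystroem(kernel="rbf", n_components=100, random_state=0)),  # approximate kernel map
    ("lda", LinearDiscriminantAnalysis()),                               # linear classifier in mapped space
])

# Search over the RBF parameter gamma -- the "kernel parameter" of the abstract.
search = GridSearchCV(pipe, {"kmap__gamma": np.logspace(-2, 2, 9)}, cv=5)
search.fit(X, y)
print("best gamma:", search.best_params_["kmap__gamma"])
print("cross-validated accuracy:", round(search.best_score_, 3))
```

With an appropriate gamma, the linear discriminant in the mapped space separates the two interleaved half-moon classes far better than any hyperplane in the original two dimensions could.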

Figures

Fig. 1
Here we show an example of two nonlinearly separable class distributions, each consisting of three subclasses. (a) Classification boundary of LDA. (b) SDA's solution. Note how this solution is piecewise linear (i.e., linear when separating subclasses, but nonlinear when classifying classes). (c) KDA's solution.
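A hedged sketch of this setting (the subclass means and spreads below are illustrative, not the figure's data): each class is a mixture of three Gaussian subclasses placed so that no single hyperplane separates the two classes, and a plain linear discriminant is compared against a kernel-mapped one.

```python
# Sketch: two classes, three Gaussian subclasses each, arranged so the classes
# are not linearly separable; compare LDA with a kernel-mapped discriminant.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

def sample_class(means, n_per=60, spread=0.3):
    # One Gaussian blob of n_per points around each subclass mean.
    return np.vstack([rng.normal(m, spread, size=(n_per, 2)) for m in means])

X0 = sample_class([(-2, 0), (0, 2), (2, 0)])   # class 0: three subclasses
X1 = sample_class([(-1, 1), (1, 1), (0, -1)])  # class 1: interleaved subclasses
X = np.vstack([X0, X1])
y = np.r_[np.zeros(len(X0)), np.ones(len(X1))]

lda = LinearDiscriminantAnalysis().fit(X, y)
kernel_lda = make_pipeline(
    Nystroem(kernel="rbf", gamma=1.0, random_state=0),
    LinearDiscriminantAnalysis(),
).fit(X, y)

print("LDA training accuracy:         ", round(lda.score(X, y), 3))
print("kernel + LDA training accuracy:", round(kernel_lda.score(X, y), 3))
```

The single linear boundary of LDA cannot separate interleaved subclasses like these, while the kernel-mapped discriminant can follow a nonlinear boundary, which is the contrast panels (a) and (c) illustrate.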
Fig. 2
Three examples of the use of the homoscedastic criterion, Q1. The examples are for two Normal distributions with equal covariance matrices up to scale and rotation. (a) The value of Q1 decreases as the angle θ increases; the 2D rotation θ between the two distributions is on the x axis, and the value of Q1 is on the y axis. (b) When θ = 0°, the two distributions are homoscedastic, and Q1 takes its maximum value of .5. Note how for distributions that are close to homoscedastic (i.e., θ ≈ 0°), the value of the criterion remains high. (c) When θ = 45°, the value has decreased to about .4. (d) By θ = 90°, Q1 ≈ .3.
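The definition of Q1 is not reproduced on this page. As an assumed stand-in with the properties described in the caption (bounded above by .5, with the maximum attained exactly when the two covariances are equal), the sketch below evaluates tr(Σ1Σ2) / (tr(Σ1²) + tr(Σ2²)) as one Normal distribution's covariance is rotated against the other's.

```python
# Sketch: a homoscedasticity measure that peaks at .5 when the two covariance
# matrices are equal and decreases as one is rotated away from the other.
# This exact formula is an assumption, not necessarily the paper's Q1.
import numpy as np

def q1(S1, S2):
    # Bounded by .5 (Cauchy-Schwarz); equals .5 iff S1 == S2.
    return np.trace(S1 @ S2) / (np.trace(S1 @ S1) + np.trace(S2 @ S2))

def rotate(S, theta):
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ S @ R.T

S1 = np.diag([4.0, 1.0])                  # an elongated Gaussian covariance
for deg in (0, 45, 90):
    S2 = rotate(S1, np.deg2rad(deg))      # same covariance, rotated by theta
    print(f"theta = {deg:>2} deg  Q1 = {q1(S1, S2):.3f}")
```

For this particular covariance, the printed values decrease monotonically from .5 as θ grows from 0° to 90°, mirroring the trend described in panels (a) through (d).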
Fig. 3
Here we show a two-class classification problem with multi-modal class distributions. When σ = 1, both KDA (a) and KSDA (b) generate solutions with small training error. (c) However, when the model complexity is low (σ = 3), KDA fails. (d) KSDA's solution resolves this problem with piecewise smooth, nonlinear classifiers.
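KSDA itself is not spelled out on this page; the sketch below only illustrates the subclass idea under stated assumptions: each class is split into subclasses with k-means (an assumed substitute for the paper's subclass division), a kernel-mapped linear discriminant is trained on the subclass labels, and predicted subclasses are mapped back to their parent classes. The RBF width σ is converted to scikit-learn's gamma as 1/(2σ²).

```python
# Sketch of the subclass idea behind KSDA (assumptions: k-means subclasses,
# Nystroem RBF feature map, LDA as the linear discriminant).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline

def fit_subclass_discriminant(X, y, n_sub=3, sigma=3.0):
    """Train a kernel-mapped discriminant on subclass labels; return the
    classifier and a map from predicted subclass back to parent class."""
    gamma = 1.0 / (2.0 * sigma ** 2)            # RBF width sigma -> gamma
    sub_labels = np.empty(len(y), dtype=int)
    parent = {}
    next_id = 0
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        km = KMeans(n_clusters=n_sub, n_init=10, random_state=0).fit(X[idx])
        sub_labels[idx] = km.labels_ + next_id   # globally unique subclass ids
        for k in range(n_sub):
            parent[next_id + k] = c
        next_id += n_sub
    clf = make_pipeline(Nystroem(kernel="rbf", gamma=gamma, random_state=0),
                        LinearDiscriminantAnalysis()).fit(X, sub_labels)
    to_class = np.vectorize(parent.get)          # subclass id -> parent class
    return clf, to_class

# Usage on any multi-modal two-class dataset (X, y), e.g. the one sketched
# under Fig. 1:
#   clf, to_class = fit_subclass_discriminant(X, y, n_sub=3, sigma=3.0)
#   class_predictions = to_class(clf.predict(X))
```

Because the discriminant only has to separate compact subclasses rather than whole multi-modal classes, a smoother kernel (larger σ, lower complexity) can still yield a low-error, piecewise smooth class boundary, which is the behavior panel (d) illustrates.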
