2014 Oct;25(10):1879-93. doi: 10.1109/TNNLS.2013.2297686.

Multiobjective optimization for model selection in kernel methods in regression

Di You et al. IEEE Trans Neural Netw Learn Syst. 2014 Oct.

Abstract

Regression plays a major role in many scientific and engineering problems. The goal of regression is to learn the unknown underlying function from a set of sample vectors with known outcomes. In recent years, kernel methods in regression have facilitated the estimation of nonlinear functions. However, two major (interconnected) problems remain open. The first is the bias-versus-variance tradeoff: if the model used to estimate the underlying function is too flexible (i.e., high model complexity), the variance will be very large; if the model is too rigid (i.e., low complexity), the bias will be large. The second is how to select the appropriate parameters of the kernel function. To address these two problems, this paper derives a new smoothing kernel criterion, which uses the roughness of the estimated function as a measure of model complexity. We then use multiobjective optimization to derive a criterion for selecting the parameters of that kernel. The goal of this criterion is to find a tradeoff between the bias and the variance of the learned function, that is, to improve the model fit while keeping the model complexity in check. We provide extensive experimental evaluations on a variety of problems in machine learning, pattern recognition, and computer vision. The results demonstrate that the proposed approach yields smaller estimation errors than state-of-the-art methods.
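The bias-variance tension the abstract describes can be made concrete with a small numerical sketch of kernel ridge regression with a Gaussian (RBF) kernel, the model family used in Fig. 1 below. This is not the authors' implementation; the names sigma (kernel width) and lam (regularization weight) are illustrative stand-ins for the paper's kernel and regularization parameters, and the data are synthetic.

    # Minimal kernel ridge regression sketch (not the authors' implementation).
    # `sigma` is the Gaussian kernel width; `lam` is the regularization weight.
    import numpy as np

    def rbf_kernel(A, B, sigma):
        # K[i, j] = exp(-||a_i - b_j||^2 / (2 * sigma^2))
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def krr_fit(X, y, sigma, lam):
        # Dual coefficients alpha solve (K + lam * I) alpha = y.
        K = rbf_kernel(X, X, sigma)
        return np.linalg.solve(K + lam * np.eye(len(X)), y)

    def krr_predict(X_train, alpha, X_new, sigma):
        return rbf_kernel(X_new, X_train, sigma) @ alpha

    # A flexible setting (small sigma, small lam) gives low bias but high
    # variance; a rigid setting (large sigma, large lam) gives the reverse.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(50, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
    alpha = krr_fit(X, y, sigma=0.5, lam=1e-2)
    print("training RSS:", np.sum((y - krr_predict(X, alpha, X, sigma=0.5)) ** 2))

Model selection then amounts to choosing sigma and lam, which is exactly where the two criteria of Fig. 1 come into conflict.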


Figures

Fig. 1
The two plots in this figure show the conflict between the RSS and the curvature measure with respect to (a) the kernel parameter σ and (b) the regularization parameter λ in kernel ridge regression. The Boston Housing dataset [5] is used in this example. Note that in both cases, as one criterion increases, the other decreases; thus, a compromise between the two criteria must be determined.
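The conflict this figure plots can be reproduced qualitatively with a parameter sweep. In the sketch below, the roughness proxy (squared second differences of the fitted curve on a fine grid) is an illustrative stand-in for the paper's smoothing kernel criterion, whose exact form is not given in this abstract, and the data are synthetic rather than Boston Housing.

    # Sweep the kernel width sigma: training RSS and a curvature proxy move in
    # opposite directions, as in Fig. 1(a). The squared-second-difference
    # roughness is an illustrative stand-in for the paper's criterion.
    import numpy as np

    def rbf(A, B, s):
        return np.exp(-((A[:, None, :] - B[None, :, :]) ** 2).sum(-1) / (2 * s * s))

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, (50, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
    grid = np.linspace(-3, 3, 200)[:, None]

    for sigma in (0.1, 0.3, 1.0, 3.0):
        alpha = np.linalg.solve(rbf(X, X, sigma) + 1e-2 * np.eye(50), y)
        f = rbf(grid, X, sigma) @ alpha              # fitted curve on a fine grid
        rss = np.sum((y - rbf(X, X, sigma) @ alpha) ** 2)
        rough = np.sum(np.diff(f, n=2) ** 2)         # discrete curvature proxy
        print(f"sigma={sigma:4.1f}  RSS={rss:9.4f}  roughness={rough:10.6f}")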
Fig. 2
Here we show a case with two objective functions. u(S) represents the set of all objective vectors, with the Pareto frontier colored in red. The Pareto-optimal solution θ can be found by minimizing u1 subject to u2 being upper-bounded by ε.
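The selection rule in this caption (minimize u1 subject to u2 ≤ ε) can be sketched as a simple feasibility-filtered search. The toy objectives and the grid of candidates below are illustrative only; the paper formulates this as a constrained optimization problem over the kernel parameters.

    # Epsilon-constraint sketch: among candidates, minimize u1 subject to
    # u2 <= eps. Toy objectives with a genuine conflict stand in for the
    # paper's data-fit and model-complexity criteria.
    import numpy as np

    def eps_constraint(candidates, u1, u2, eps):
        feasible = [(u1(t), t) for t in candidates if u2(t) <= eps]
        if not feasible:
            raise ValueError("no candidate satisfies u2 <= eps")
        return min(feasible)[1]

    u1 = lambda t: (t - 2.0) ** 2   # falls as t approaches 2 (e.g., RSS)
    u2 = lambda t: t ** 2           # rises with t (e.g., roughness)
    thetas = np.linspace(0.0, 3.0, 301)
    print(eps_constraint(thetas, u1, u2, eps=1.0))   # ~1.0: best fit within budget

Sweeping ε over the range of u2 traces out (weakly) Pareto-optimal points, which is the behavior Fig. 3 contrasts between the original and modified methods.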
Fig. 3
Comparison between the proposed modified and the original ε-constraint methods. We use '*' to indicate the objective vector and 'o' to mark the solution vector. Solutions given by (a) the ε-constraint method and (b) the proposed modified ε-constraint approach on the first example, and by (c) the ε-constraint method and (d) the modified ε-constraint approach on the second example. Note that the proposed approach identifies the Pareto frontier, while the original algorithm identifies only weakly Pareto-optimal solutions, since its solution vectors go beyond the Pareto frontier.
Fig. 4
Sample images showing the same person at different ages.
Fig. 5
This figure plots the estimated (lighter dashed curve) and actual (darker dashed curve) maximum daily temperature for a period of more than 200 days. The estimated results are given by the algorithm proposed in this paper.

References

    1. Data for Evaluating Learning in Valid Experiments (DELVE). http://www.cs.toronto.edu/~delve/
    2. FG-NET aging database. http://www.fgnet.rsunit.com/
    3. Andersen ED, Andersen KD. The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm. In: Frenk H, Roos K, Terlaky T, Zhang S, editors. High Performance Optimization, vol. 33 of Applied Optimization. Springer US; 2000. pp. 197-232.
    4. Bishop CM. Neural Networks for Pattern Recognition. Oxford University Press; 1995.
    5. Blake CL, Merz CJ. UCI Repository of Machine Learning Databases. University of California, Irvine; 1998. http://www.ics.uci.edu/~mlearn/MLRepository.html
