Published on Fri May 21 2021

A Precise Performance Analysis of Support Vector Regression

Houssem Sifaou, Abla Kammoun, Mohamed-Slim Alouini

In this paper, we study the hard and soft support vector regression techniques applied to a set of linear measurements. We illustrate that adding more samples may be harmful to the test performance of support vector regression, while it is always beneficial when the parameters are optimally selected.

Abstract

In this paper, we study the hard and soft support vector regression techniques applied to a set of linear measurements of the form $y_i = \boldsymbol{\beta}_\star^{T}{\bf x}_i + n_i$, where $\boldsymbol{\beta}_\star$ is an unknown vector, $\left\{{\bf x}_i\right\}_{i=1}^n$ are the feature vectors and $\left\{n_i\right\}_{i=1}^n$ model the noise. In particular, under some plausible assumptions on the statistical distribution of the data, we characterize the feasibility condition for the hard support vector regression in the regime of high dimensions and, when feasible, derive an asymptotic approximation for its risk. Similarly, we study the test risk for the soft support vector regression as a function of its parameters. Our results are then used to optimally tune the parameters intervening in the design of hard and soft support vector regression algorithms. Based on our analysis, we illustrate that adding more samples may be harmful to the test performance of support vector regression, while it is always beneficial when the parameters are optimally selected. Such a result is reminiscent of a similar phenomenon observed in modern learning architectures, according to which optimally tuned architectures present a monotonically decreasing test risk curve with respect to the number of samples.
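
The setup described above can also be explored numerically. The following is a minimal simulation sketch, not the paper's asymptotic analysis: it draws linear measurements $y_i = \boldsymbol{\beta}_\star^{T}{\bf x}_i + n_i$ and compares the test risk of soft support vector regression with arbitrarily fixed parameters against a cross-validated choice of (C, epsilon) as the number of samples grows. The dimension, noise level, solver (scikit-learn's LinearSVR) and parameter grids are illustrative assumptions, not choices made in the paper.

# Illustrative sketch: simulate y_i = beta_star^T x_i + n_i and compare the
# test risk of soft SVR with fixed parameters against a cross-validated
# choice of (C, epsilon). All constants below are arbitrary assumptions.
import numpy as np
from sklearn.svm import LinearSVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
p = 200                                           # feature dimension
beta_star = rng.standard_normal(p) / np.sqrt(p)   # unknown vector beta_star

def sample(n, noise_std=0.5):
    # Draw n linear measurements y = X beta_star + noise.
    X = rng.standard_normal((n, p))
    y = X @ beta_star + noise_std * rng.standard_normal(n)
    return X, y

X_test, y_test = sample(5000)

def test_risk(model, X, y):
    # Mean-squared prediction error on held-out data.
    return np.mean((model.fit(X, y).predict(X_test) - y_test) ** 2)

for n in (100, 200, 400, 800):
    X, y = sample(n)
    fixed = LinearSVR(C=10.0, epsilon=0.0, max_iter=20000)   # untuned soft SVR
    tuned = GridSearchCV(                                    # (C, epsilon) tuned by CV
        LinearSVR(max_iter=20000),
        {"C": [0.01, 0.1, 1.0, 10.0], "epsilon": [0.0, 0.1, 0.5, 1.0]},
        cv=5,
    )
    print(f"n={n:4d}  fixed-risk={test_risk(fixed, X, y):.3f}  "
          f"tuned-risk={test_risk(tuned, X, y):.3f}")

Plotting the fixed-parameter risk against n can be used to check whether additional samples hurt when (C, epsilon) are left untuned, which is the phenomenon the abstract refers to.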

Fri May 28 2021
Machine Learning
Support vector machines and linear regression coincide with very high-dimensional features
The support vector machine (SVM) and minimum Euclidean norm least squares (MLS) are two fundamentally different approaches to fitting linear models. They have recently been connected in models for very high-dimensional data through a phenomenon of support vector proliferation.
Mon Mar 09 2020
Machine Learning
A working likelihood approach to support vector regression with a data-driven insensitivity parameter
The insensitivity parameter in support vector regression determines the set of support vectors, which greatly impacts the prediction. A data-driven approach is proposed to determine an approximate value for this insensitivity parameter by minimizing a generalized loss function originating from the likelihood principle.
Tue Apr 21 2020
Machine Learning
A novel embedded min-max approach for feature selection in nonlinear support vector machine classification
Feature selection has become a challenging problem in several machine learning fields. We propose an embedded feature selection method based on a min-max optimization problem. By leveraging duality theory, we reformulate the problem and solve it directly.
Thu Feb 20 2014
Machine Learning
A Quasi-Newton Method for Large Scale Support Vector Machines
This paper adapts a recently developed regularized stochastic version of the quasi-Newton method for the solution of support vector machine classification problems. The proposed method is shown to converge almost surely to the optimal classifier at a rate that is linear in expectation.
Mon Apr 03 2017
Machine Learning
Geometric Insights into Support Vector Machine Behavior using the KKT Conditions
The support vector machine (SVM) is a powerful and widely used classification algorithm. This paper uses the Karush-Kuhn-Tucker conditions to provide rigorous mathematical proof for new insights into SVM.
Mon May 13 2019
Machine Learning
Exact high-dimensional asymptotics for Support Vector Machine
The Support Vector Machine (SVM) is one of the most widely used classification methods. We propose a set of equations that exactly characterizes the behavior of the SVM. In particular, we give exact formulas for the variability of the optimal coefficients.
Mon May 28 2018
Machine Learning
Optimal ridge penalty for real-world high-dimensional data can be zero or negative due to the implicit ridge regularization
Conventional wisdom is that large models require strong regularization to prevent overfitting. We show that this rule can be violated by linear regression in the underdetermined situation. The optimal value of the ridge penalty in this situation can be negative.
Sun Jan 06 2019
Machine Learning
Scaling description of generalization with number of parameters in deep learning
Supervised deep learning involves the training of neural networks with a large number of parameters. With enough parameters, one can essentially fit all the training data points. Sparsity-based arguments would suggest that the generalization error increases as the number of parameters grows past a certain threshold.
Wed Dec 04 2019
Machine Learning
Deep Double Descent: Where Bigger Models and More Data Hurt
We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon. As we increase model size, performance first gets worse and then gets better. We unify the above phenomena by defining a new complexity measure.
Mon Mar 18 2019
Machine Learning
Two models of double descent for weak features
The "double descent" risk curve was proposed to qualitatively describe the out-of-sample prediction accuracy of variably-parameterized machine learning models. This article provides a precise mathematical analysis for the shape of this curve in two simple data models.
Mon Jun 25 2018
Machine Learning
Does data interpolation contradict statistical optimality?
We show that learning methods interpolating the training data can achieve optimal rates for the problems of nonparametric regression and prediction with square loss.
Thu Jan 30 2020
Machine Learning
Analytic Study of Double Descent in Binary Classification: The Impact of Loss
The risk curve exhibits a double-descent (DD) trend as a function of the model size. Our analysis emphasizes that crucial features of DD curves depend both on the training data and on the learning algorithm.