Title: | The SVM Path Algorithm |
---|---|
Description: | Computes the entire regularization path for the two-class svm classifier with essentially the same cost as a single SVM fit. |
Authors: | Trevor Hastie |
Maintainer: | Trevor Hastie <[email protected]> |
License: | GPL-2 |
Version: | 0.970 |
Built: | 2024-10-31 03:03:15 UTC |
Source: | https://github.com/cran/svmpath |
Datasets for illustrating the svmpath function, that can be plotted while its running
data(svmpath)
data(svmpath)
In each case a list with a component x
(t column matrix) and a
component y
(vector of +1/-1 values)
"Balanced" refers to whether the number of +1s is the same as the -1s.
"Overlap" indicates whether the classes are linearly separable.
mixture.data
is a balanced dataset with 100 observations in
each class. The others are smaller with between 10-12 obs total.
The paper http://www-stat.stanford.edu/~hastie/Papers/svmpath.pdf, as well as the talk http://www-stat.stanford.edu/~hastie/TALKS/svmpathtalk.pdf.
data(svmpath) attach(balanced.overlap) svmpath(x,y,trace=TRUE,plot=TRUE) detach(2)
data(svmpath) attach(balanced.overlap) svmpath(x,y,trace=TRUE,plot=TRUE) detach(2)
produces a plot of the svm solution along the path, and optinally indicates support points
## S3 method for class 'svmpath' plot(x, step, Size = 60, elbow.show = TRUE, support.show = TRUE, ...)
## S3 method for class 'svmpath' plot(x, step, Size = 60, elbow.show = TRUE, support.show = TRUE, ...)
x |
the |
step |
which step to plot; default is the last step. Use
|
Size |
If the solution is non-linear, this is the gridsize for |
elbow.show |
Should the points on the elbow be indicated |
support.show |
Should the support points be indicated |
... |
additional arguments to plot, allowing one to change, for example, "main", "xlab" etc |
A two-dimensional plot is produced of the SVM solution. Makes sense only if X is two-dimensional. If not, the first two dimensions will be used
A list is returned silently, with the ingredients of the plot
Trevor Hastie
The paper http://www-stat.stanford.edu/~hastie/Papers/svmpath.pdf, as well as the talk http://www-stat.stanford.edu/~hastie/TALKS/svmpathtalk.pdf.
coef.svmpath, svmpath, predict.svmpath, print.svmpath,summary.svmpath
data(svmpath) attach(balanced.overlap) fit <- svmpath(x,y,trace=TRUE,plot=FALSE) plot(fit,step=2) detach(2)
data(svmpath) attach(balanced.overlap) fit <- svmpath(x,y,trace=TRUE,plot=FALSE) plot(fit,step=2) detach(2)
Provide a value for lambda
, and produce the fitted lagrange alpha
values. Provide values for x
, and get fitted function values or
class labels.
## S3 method for class 'svmpath' predict(object, newx, lambda, type = c("function", "class", "alpha", "margin"),...)
## S3 method for class 'svmpath' predict(object, newx, lambda, type = c("function", "class", "alpha", "margin"),...)
object |
fitted |
newx |
values of |
lambda |
the value of the regularization parameter. Note that
|
type |
type of prediction, with default |
... |
Generic compatibility |
This implementation of the SVM uses a parameterization that is slightly
different but equivalent to the usual (Vapnik) SVM. Here
.
The Lagrange multipliers are related via
, where
is the usual multiplier, and
our multiplier. Note that if
alpha=0
, that
observation is right of the elbow; alpha=1
, left of the elbow;
0<alpha<1
on the elbow. The latter two cases are all support
points.
In each case, the desired prediction.
Trevor Hastie
The paper http://www-stat.stanford.edu/~hastie/Papers/svmpath.pdf, as well as the talk http://www-stat.stanford.edu/~hastie/TALKS/svmpathtalk.pdf.
coef.svmpath, svmpath
data(svmpath) attach(balanced.overlap) fit <- svmpath(x,y,trace=TRUE,plot=TRUE) predict(fit, lambda=1,type="alpha") predict(fit, x, lambda=.9) detach(2)
data(svmpath) attach(balanced.overlap) fit <- svmpath(x,y,trace=TRUE,plot=TRUE) predict(fit, lambda=1,type="alpha") predict(fit, x, lambda=.9) detach(2)
print a summary of the fitted svmpath object
## S3 method for class 'svmpath' print(x, digits, maxsteps, ...)
## S3 method for class 'svmpath' print(x, digits, maxsteps, ...)
x |
object to be printed |
digits |
number of significant digits (default 6) |
maxsteps |
the number of steps to print; default all |
... |
additional arguments to the generic print function |
For each step taken by the algorithm, one or more lines are printed. The step is described in terms of the observation number involved, a coded version of what happened, such as "L->E" meaning "from the Left set" to the "Elbow". Initially all the sets are empty. It gives the margin (sum of the xi), the size of the elbow, and the training error.
Trevor Hastie
The paper http://www-stat.stanford.edu/~hastie/Papers/svmpath.pdf, as well as the talk http://www-stat.stanford.edu/~hastie/TALKS/svmpathtalk.pdf.
coef.svmpath, svmpath, predict.svmpath
data(svmpath) attach(balanced.overlap) fit <- svmpath(x,y,trace=TRUE,plot=TRUE) print(fit) detach(2)
data(svmpath) attach(balanced.overlap) fit <- svmpath(x,y,trace=TRUE,plot=TRUE) print(fit) detach(2)
compute the kernel matrix for svmpath
radial.kernel(x, y=x, param.kernel = 1/p,...) poly.kernel(x, y=x, param.kernel = 1,...)
radial.kernel(x, y=x, param.kernel = 1/p,...) poly.kernel(x, y=x, param.kernel = 1,...)
x |
an n x p matrix of features |
y |
an m x p matrix of features (if omitted, it defaults to |
param.kernel |
the parameter(s) for the kernel. For this radial kernel, the parameter is known in the fields as "gamma". For the polynomial kernel, it is the "degree" |
... |
unused |
For the radial kernel, this computes the function for each pair of rows x,y
from the input matrices. Here g is param.kernel.
For the polynomial kernel, it computes
, where d is
param.kernel
.
An n x m matrix.
Trevor Hastie
The paper http://www-stat.stanford.edu/~hastie/Papers/svmpath.pdf, as well as the talk http://www-stat.stanford.edu/~hastie/TALKS/svmpathtalk.pdf.
svmpath
data(svmpath) attach(balanced.overlap) fit<-svmpath(x,y,kernel=radial.kernel) detach(2)
data(svmpath) attach(balanced.overlap) fit<-svmpath(x,y,kernel=radial.kernel) detach(2)
printing an svmpath object can produce a lot of lines. The summary methods gives a more concise description by picking out a subset of the steps
## S3 method for class 'svmpath' summary(object, nsteps = 5, digits = 6, ...)
## S3 method for class 'svmpath' summary(object, nsteps = 5, digits = 6, ...)
object |
the |
nsteps |
usually omitted, but can be changed to get longer summaries |
digits |
number of significant digits |
... |
additional arguments to the generic summary function |
Uses the pretty
function to extract the approximately the desired
number of steps. Always includes the first and last step.
returns a dataframe with the steps, value of lambda, training error, size of elbow, number of support points, and the sum of the overlaps
Trevor Hastie
The paper http://www-stat.stanford.edu/~hastie/Papers/svmpath.pdf, as well as the talk http://www-stat.stanford.edu/~hastie/TALKS/svmpathtalk.pdf.
coef.svmpath, svmpath, predict.svmpath, print.svmpath
data(svmpath) attach(balanced.overlap) fit <- svmpath(x,y,trace=TRUE,plot=TRUE) summary(fit) detach(2)
data(svmpath) attach(balanced.overlap) fit <- svmpath(x,y,trace=TRUE,plot=TRUE) summary(fit) detach(2)
The SVM has a regularization or cost parameter C, which controls the amount by which points overlap their soft margins. Typically either a default large value for C is chosen (allowing minimal overlap), or else a few values are compared using a validation set. This algorithm computes the entire regularization path (i.e. for all possible values of C for which the solution changes), with a cost a small (~3) multiple of the cost of fitting a single model.
svmpath(x, y, K, kernel.function = poly.kernel, param.kernel = 1, trace, plot.it, eps = 1e-10, Nmoves = 3 * n, digits = 6, lambda.min = 1e-04,ridge=0, ...)
svmpath(x, y, K, kernel.function = poly.kernel, param.kernel = 1, trace, plot.it, eps = 1e-10, Nmoves = 3 * n, digits = 6, lambda.min = 1e-04,ridge=0, ...)
x |
the data matrix (n x p) with n rows (observations) on p variables (columns) |
y |
The "-1,+1" valued response variable. |
K |
a n x n kernel matrix, with default value |
kernel.function |
This is a user-defined function. Provided are
|
param.kernel |
parameter(s) of the kernels |
trace |
if |
plot.it |
a flag indicating whether a plot should be produced
(default |
eps |
a small machine number which is used to identify minimal step sizes |
Nmoves |
the maximum number of moves |
digits |
the number of digits in the printout |
lambda.min |
The smallest value of |
ridge |
Sometimes the algorithm encounters singularities; in this
case a small value of ridge, around 1e-12, can help. Default is |
... |
additional arguments to some of the functions called by
svmpath. One such argument that can be passed is |
The algorithm used in svmpath()
is described in detail in
"The Entire Regularization Path for the Support Vector Machine" by
Hastie, Rosset, Tibshirani and Zhu (2004). It exploits the fact that
the "hinge" loss-function is piecewise linear, and the penalty term is
quadratic. This means that in the dual space, the lagrange multipliers
will be pieceise linear (c.f. lars
).
a "svmpath" object is returned, for which there are print, summary, coef and predict methods.
Currently the algorithm can get into machine errors if
epsilon
is too small, or if lambda.min
is too
small. Increasing either from their defaults should make the problems
go away, by terminating the algorithm slightly early.
This implementation of the algorithm does not use updating to solve the "elbow" linear equations. This is possible, since the elbow changes by a small number of points at a time. Future version of the software will do this. The author has encountered numerical problems with early attempts at this.
Trevor Hastie
The paper http://www-stat.stanford.edu/~hastie/Papers/svmpath.pdf, as well as the talk http://www-stat.stanford.edu/~hastie/TALKS/svmpathtalk.pdf.
print, coef, summary, predict, and FilmPath
data(svmpath) attach(unbalanced.separated) svmpath(x,y,trace=TRUE,plot=TRUE) detach(2) ## Not run: svmpath(x,y,kernel=radial.kernel,param.kernel=.8)
data(svmpath) attach(unbalanced.separated) svmpath(x,y,trace=TRUE,plot=TRUE) detach(2) ## Not run: svmpath(x,y,kernel=radial.kernel,param.kernel=.8)