The Support Kernel Machine Algorithm

By Thomas Stahlbuhk, advised by Dr. Gert Lanckriet, with Guillaume Obozinski
Introduction
The Support Vector Machine
The Primal Problem
A Support Vector Machine (SVM) is a kernel-based learning algorithm that uses optimization theory to learn a high-dimensional hyperplane capable of binary classification (see the plot opposite).
The primal problem for finding the SVM's optimal hyperplane is:
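The equation image is missing from this copy; the standard soft-margin SVM primal, consistent with the w and b the poster describes, is:

```latex
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,n.
```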
[Figure: Plot of a Learned Hyperplane]
Where w is a vector normal to the hyperplane and b sets the hyperplane's offset from the origin.
The Support Kernel Machine
A Support Kernel Machine (SKM) is very similar to an SVM. Unlike the SVM, however, the SKM can learn from multiple input kernels.
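As a minimal sketch (hypothetical code, not the project's actual Matlab or C++ implementation), the central object the SKM works with is a weighted sum of kernel (Gram) matrices, one matrix per input kernel:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: combines m kernel matrices K_j into a single
// Gram matrix K = sum_j eta_j * K_j, entry by entry. The weights eta_j
// play the role of the kernel weights the SKM learns.
Matrix combineKernels(const std::vector<Matrix>& kernels,
                      const std::vector<double>& eta) {
  const std::size_t n = kernels[0].size();
  Matrix K(n, std::vector<double>(n, 0.0));
  for (std::size_t j = 0; j < kernels.size(); ++j) {
    for (std::size_t r = 0; r < n; ++r) {
      for (std::size_t c = 0; c < n; ++c) {
        K[r][c] += eta[j] * kernels[j][r][c];  // weighted entry-wise sum
      }
    }
  }
  return K;
}
```

With nonnegative weights, the result is again a valid kernel matrix, which is what lets a single SVM-style solver run on the combined kernel.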
The Dual Problem
[Figure caption: Hyperplane learned using the C++ code and a single first-degree polynomial kernel]
By solving the Lagrangian:
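The equations are not legible in this copy; for the soft-margin SVM primal with slack variables ξ_i, the standard Lagrangian is

```latex
L(w, b, \xi, \alpha, \mu) = \tfrac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\xi_i
  - \sum_{i=1}^{n}\alpha_i\left[y_i\,(w^{\top}x_i + b) - 1 + \xi_i\right]
  - \sum_{i=1}^{n}\mu_i\,\xi_i,
```

with multipliers α_i ≥ 0 and μ_i ≥ 0. Eliminating w, b, and ξ yields the dual

```latex
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i
  - \tfrac{1}{2}\sum_{i=1}^{n}\sum_{l=1}^{n}\alpha_i\,\alpha_l\,y_i\,y_l\; x_i^{\top}x_l
\quad\text{s.t.}\quad 0 \le \alpha_i \le C,\quad \sum_{i=1}^{n}\alpha_i\,y_i = 0.
```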
Applications of the Support Kernel Machine
Kernel-Based Learning
The Examples
The SKM learns its hyperplane from a data set:
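The data-set notation is missing from this copy; a conventional form, matching the X and Y the poster defines, is:

```latex
\{(x_i, y_i)\}_{i=1}^{n} \subset X \times Y, \qquad y_i \in \{-1, +1\}.
```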
[Figure: Matlab vs. C++ Implementation, Varying Number of Kernels]
Which can be “kernelized” to become:
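A standard form of the kernelized dual (reconstructed, not copied from the poster) is:

```latex
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i
  - \tfrac{1}{2}\sum_{i=1}^{n}\sum_{l=1}^{n}\alpha_i\,\alpha_l\,y_i\,y_l\; k(x_i, x_l)
\quad\text{s.t.}\quad 0 \le \alpha_i \le C,\quad \sum_{i=1}^{n}\alpha_i\,y_i = 0.
```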
Note that we are now solving only for the Lagrange multipliers, α. Most of these values will become zero; the examples whose multipliers remain nonzero act as the vectors that "support" the hyperplane.
[Plot data: time in minutes vs. number of kernels, 0–30; series: Matlab Implementation, C++ Implementation]
The Primal Problem
[Figure: Matlab vs. C++ Implementation, Varying C Constraint]
If our input space is decomposed into blocks:
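The decomposition equation is not shown in this copy; the standard form used in SKM formulations is:

```latex
x = (x_1, \dots, x_m), \qquad x_j \in X_j, \qquad X = X_1 \times \cdots \times X_m.
```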
We find the primal problem for the SKM is:
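The equation image is absent here; the SKM primal of Bach, Lanckriet, and Jordan (2004), which matches the block decomposition and the slack variable ξ the poster describes, reads:

```latex
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\Big(\sum_{j=1}^{m} d_j\,\|w_j\|_{2}\Big)^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad y_i\Big(\sum_{j=1}^{m} w_j^{\top} x_{ji} + b\Big) \ge 1 - \xi_i,\quad \xi_i \ge 0,
```

where the d_j are fixed positive weights and x_{ji} denotes block j of example i.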
Where X is the input space of the examples and Y is the set of possible class labels.
Kernelization
To classify the input data we need a measure of similarity between the examples in X. To do this, we map the examples into a dot-product (feature) space via a mapping function:
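The mapping function is not shown in this copy; in standard kernel-methods notation it is:

```latex
\Phi : X \to F, \qquad x \mapsto \Phi(x),
```

where F is the feature space.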
We can arrive at the dual problem:
- Text and Handwriting Recognition
- Computer Vision
- Biometrics
- Economics
- Genome Mapping
Project Results
and take the canonical dot product of the
result to form a kernel matrix from the function:
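The function image is missing here; the standard kernel built from a feature map Φ and the canonical dot product is:

```latex
k(x, x') = \langle \Phi(x), \Phi(x') \rangle, \qquad K_{il} = k(x_i, x_l).
```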
Where the weight vector, w, has the same
block decomposition and ξ is a slack variable.
The Dual Problem
By using conic duality and kernelization we can arrive at a dual problem. The result is that we end up learning the best linear combination of the input kernels:
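The combination itself is not shown in this copy; in standard SKM and multiple-kernel-learning notation it is a nonnegative weighting of the input Gram matrices:

```latex
K = \sum_{j=1}^{m} \eta_j\, K_j, \qquad \eta_j \ge 0.
```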
[Plot data: time in seconds vs. C constraint, 0.001–10; series: Matlab Implementation, C++ Implementation]