Linear Algebra in a Nutshell
Part 2: Norms, Kernels and Dimensions

Stefan Evert
Institute of Cognitive Science
University of Osnabrück, Germany
[email protected]
Rovereto, 20 March 2007

Distance · Metric spaces · Vector norms · Euclidean geometry · Normal vector · Isometry · General inner product · Kernel trick

What's missing?
- We know (almost :-)) everything about vector spaces and the methods of linear algebra now
- But we need something else in order to perform clustering or find dimensions of major variance . . .
- Can you guess what is missing?
  → We need a notion of distance!

Measuring distance
- Distance between vectors $\vec{u}, \vec{v} \in \mathbb{R}^n$ → (dis)similarity of data points
  - $\vec{u} = (u_1, \dots, u_n)$
  - $\vec{v} = (v_1, \dots, v_n)$
- Euclidean distance $d_2(\vec{u}, \vec{v})$
- "city block" distance $d_1(\vec{u}, \vec{v})$
- Both are special cases of the p-distance $d_p(\vec{u}, \vec{v})$ (for $p \in [1, \infty]$):
  $d_p(\vec{u}, \vec{v}) := \left( |u_1 - v_1|^p + \dots + |u_n - v_n|^p \right)^{1/p}$
  $d_\infty(\vec{u}, \vec{v}) = \max \{ |u_1 - v_1|, \dots, |u_n - v_n| \}$
[Figure: points u and v in the (x1, x2) plane, with d1(u, v) = 5 and d2(u, v) = 3.6]

Metric: a measure of distance
- A general measure of the distance $d(\vec{u}, \vec{v})$ between points $\vec{u}$ and $\vec{v}$ is called a metric and must satisfy the following axioms:
  - $d(\vec{u}, \vec{v}) = d(\vec{v}, \vec{u})$
  - $d(\vec{u}, \vec{v}) > 0$ for $\vec{u} \neq \vec{v}$
  - $d(\vec{u}, \vec{u}) = 0$
  - $d(\vec{u}, \vec{w}) \leq d(\vec{u}, \vec{v}) + d(\vec{v}, \vec{w})$ (triangle inequality)
- Metrics are a very broad class of distance measures, some of which do not fit well into vector spaces
- E.g., metrics need not be translation-invariant:
  $d(\vec{u} + \vec{x}, \vec{v} + \vec{x}) \neq d(\vec{u}, \vec{v})$
- Another unintuitive example is the discrete metric
  $d(\vec{u}, \vec{v}) = 0$ if $\vec{u} = \vec{v}$, and $d(\vec{u}, \vec{v}) = 1$ if $\vec{u} \neq \vec{v}$
  → exercise: show that the discrete metric satisfies the axioms
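
The p-distance and the metric axioms above can be tried out numerically. The following is a minimal sketch in plain Python; the function names and the sample points are assumptions made for illustration (they are not part of the slides), and checking the axioms on a finite sample is a sanity check, not a proof of the exercise.

```python
import math
from itertools import product


def p_distance(u, v, p):
    """p-distance d_p(u, v) for p in [1, inf]; p = math.inf gives the maximum distance."""
    diffs = [abs(ui - vi) for ui, vi in zip(u, v)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def discrete_metric(u, v):
    """Discrete metric: 0 for identical points, 1 otherwise."""
    return 0 if u == v else 1


def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check the metric axioms for d on all pairs and triples of a finite sample."""
    for u, v in product(points, repeat=2):
        if abs(d(u, v) - d(v, u)) > tol:        # symmetry
            return False
        if u == v and abs(d(u, v)) > tol:       # d(u, u) = 0
            return False
        if u != v and d(u, v) <= 0:             # d(u, v) > 0 for u != v
            return False
    for u, v, w in product(points, repeat=3):
        if d(u, w) > d(u, v) + d(v, w) + tol:   # triangle inequality
            return False
    return True


# Hypothetical sample points; u and v differ by (3, 2)
u, v = (1.0, 1.0), (4.0, 3.0)
sample = [u, v, (0.0, 0.0), (2.0, 5.0)]

print(p_distance(u, v, 1))          # city-block distance
print(p_distance(u, v, 2))          # Euclidean distance
print(p_distance(u, v, math.inf))   # maximum distance
print(satisfies_metric_axioms(discrete_metric, sample))
```

Passing `lambda a, b: p_distance(a, b, 2)` to `satisfies_metric_axioms` checks the Euclidean distance on the same sample.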

Distance & norm
- Intuitively, distance $d(\vec{u}, \vec{v})$ should correspond to length $\|\vec{u} - \vec{v}\|$ of vector $\vec{u} - \vec{v}$
  - $d(\vec{u}, \vec{v})$ is a metric
  - $\|\vec{u} - \vec{v}\|$ is a norm
  - $\|\vec{u}\| = d(\vec{u}, \vec{0})$
- Such a metric is always translation-invariant
[Figure: vectors u and v in the (x1, x2) plane, with ‖u‖ = d(u, 0), ‖v‖ = d(v, 0) and d(u, v) = ‖u − v‖]
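
As an illustration of the relation between distance and norm, here is a small sketch that builds a distance from the Euclidean norm and checks translation invariance on one example. The helper names and the example vectors are hypothetical and chosen only for this demonstration.

```python
import math


def norm2(u):
    """Euclidean length of a vector given as a tuple of floats."""
    return math.sqrt(sum(ui * ui for ui in u))


def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))


def sub(u, v):
    return tuple(ui - vi for ui, vi in zip(u, v))


def distance(u, v, norm=norm2):
    """Metric induced by a norm: d(u, v) = ||u - v||."""
    return norm(sub(u, v))


# Hypothetical example vectors and an arbitrary translation x
u, v, x = (1.0, 1.0), (4.0, 3.0), (-2.0, 7.5)
zero = (0.0, 0.0)

# The length of u is its distance from the origin
print(distance(u, zero) == norm2(u))

# Translating both points by x leaves the distance unchanged
print(math.isclose(distance(add(u, x), add(v, x)), distance(u, v)))
```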

Norm: a measure of length
- A general norm $\|\vec{u}\|$ for the length of a vector $\vec{u}$ must satisfy the following axioms:
  - $\|\vec{u}\| > 0$ for $\vec{u} \neq \vec{0}$
  - $\|\lambda \vec{u}\| = |\lambda| \cdot \|\vec{u}\|$ (homogeneity, not req'd for metric)
  - $\|\vec{u} + \vec{v}\| \leq \|\vec{u}\| + \|\vec{v}\|$ (triangle inequality)
- every norm defines a translation-invariant metric
  $d(\vec{u}, \vec{v}) := \|\vec{u} - \vec{v}\|$
- p-norm for $p \in [1, \infty]$:
  $\|\vec{u}\|_p := \left( |u_1|^p + \dots + |u_n|^p \right)^{1/p}$
  $d_p(\vec{u}, \vec{v}) = \|\vec{v} - \vec{u}\|_p$

Norm: a measure of length
- Visualisation of norms in $\mathbb{R}^2$ by plotting unit circle for each norm, i.e. points $\vec{u}$ with $\|\vec{u}\| = 1$
- Here: p-norms $\|\cdot\|_p$ for different values of p
- Triangle inequality ⇐⇒ unit circle is convex
- This shows that p-norms with p < 1 would violate the triangle inequality
[Figure: Unit circle according to p-norm, for p = 1, p = 2, p = 5 and p = ∞; axes x1 and x2 from −1.0 to 1.0, centred on the origin]

Operator and matrix norm
- The norm of a linear map (or "operator") $f : U \to V$ between normed vector spaces U and V is defined as
  $\|f\| := \max \{ \|f(\vec{u})\| \mid \vec{u} \in U, \|\vec{u}\| = 1 \}$
  - $\|f\|$ depends on the norms chosen in U and V!
- The definition of the operator norm implies
  $\|f(\vec{u})\| \leq \|f\| \cdot \|\vec{u}\|$
- norm of a matrix A = norm of corresponding map f
  - NB: this is not the same as a p-norm of A in $\mathbb{R}^{k \cdot n}$
  - spectral norm induced by Euclidean vector norms in U and V = largest singular value of A (→ SVD)
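
The operator norm and the remark about p < 1 can both be explored with a few lines of plain Python. This is only a sketch under stated assumptions: the matrix `A`, the function names and the sampling resolution are hypothetical, the code is restricted to maps from R² to R², and sampling directions gives a lower-bound estimate of the maximum, not an exact value.

```python
import math


def p_norm(u, p):
    """(|u_1|^p + ... + |u_n|^p)^(1/p); p = math.inf gives the maximum norm."""
    if math.isinf(p):
        return max(abs(ui) for ui in u)
    return sum(abs(ui) ** p for ui in u) ** (1.0 / p)


def apply(A, u):
    """Matrix-vector product; A is a list of rows."""
    return tuple(sum(a * ui for a, ui in zip(row, u)) for row in A)


def operator_norm_2d(A, p, steps=3600):
    """Estimate max ||A u||_p over all u in R^2 with ||u||_p = 1 by sampling directions."""
    best = 0.0
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        d = (math.cos(t), math.sin(t))
        n = p_norm(d, p)
        u = (d[0] / n, d[1] / n)   # rescale the direction onto the unit circle of the p-norm
        best = max(best, p_norm(apply(A, u), p))
    return best


# Hypothetical example matrix
A = [[3.0, 1.0], [2.0, 2.0]]

# The operator norm depends on the vector norm chosen
for p in (1, 2, math.inf):
    print(p, operator_norm_2d(A, p))

# ||A w|| <= ||A|| * ||w|| for an arbitrary vector w (Euclidean norms)
w = (2.0, -1.0)
print(p_norm(apply(A, w), 2) <= operator_norm_2d(A, 2) * p_norm(w, 2))

# For p = 0.5 the triangle inequality fails: ||e1 + e2|| > ||e1|| + ||e2||
e1, e2 = (1.0, 0.0), (0.0, 1.0)
print(p_norm((1.0, 1.0), 0.5), p_norm(e1, 0.5) + p_norm(e2, 0.5))
```

For the Euclidean case the estimate can be compared with the square root of the largest eigenvalue of AᵀA, which is the largest singular value of A.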

Cave canem!
- Discussion about which norm to use for measuring distributional similarity in word space models
- Measures of distance between points:
  - "natural" Euclidean norm $\|\cdot\|_2$
  - city-block ("Manhattan") distance $\|\cdot\|_1$
  - maximum distance $\|\cdot\|_\infty$
  - and many other formulae . . .
- Measures of the similarity of arrows:
  - "cosine distance" ∼ $u_1 v_1 + \dots + u_n v_n$
  - Dice coefficient (matching non-zero coordinates)
  - and, of course, many other formulae . . .
  → these measures determine angles between arrows
- Don't do this! The Euclidean norm induces a much richer and more intuitive geometric structure
  → There's a trick to make Euclidean norms more flexible

Euclidean norm & inner product
- The Euclidean norm $\|\vec{u}\|_2 = \sqrt{\langle \vec{u}, \vec{u} \rangle}$ is special because it can be derived from the inner product:
  $\langle \vec{u}, \vec{v} \rangle := \vec{x}^T \vec{y} = x_1 y_1 + \dots + x_n y_n$
  where $\vec{u} \equiv_E \vec{x}$ and $\vec{v} \equiv_E \vec{y}$ are the standard coordinates of $\vec{u}$ and $\vec{v}$ (certain other coordinate systems also work)
- The inner product is a positive definite and symmetric bilinear form with the following properties:
  - $\langle \lambda \vec{u}, \vec{v} \rangle = \langle \vec{u}, \lambda \vec{v} \rangle = \lambda \langle \vec{u}, \vec{v} \rangle$
  - $\langle \vec{u} + \vec{u}', \vec{v} \rangle = \langle \vec{u}, \vec{v} \rangle + \langle \vec{u}', \vec{v} \rangle$
  - $\langle \vec{u}, \vec{v} + \vec{v}' \rangle = \langle \vec{u}, \vec{v} \rangle + \langle \vec{u}, \vec{v}' \rangle$
  - $\langle \vec{u}, \vec{v} \rangle = \langle \vec{v}, \vec{u} \rangle$ (symmetric)
  - $\langle \vec{u}, \vec{u} \rangle = \|\vec{u}\|^2 > 0$ for $\vec{u} \neq \vec{0}$ (positive definite)
  - also called dot product or scalar product

Angles and orthogonality
- The Euclidean inner product has an important geometric interpretation: it can be used to define angles and orthogonality
- Cauchy-Schwarz inequality:
  $|\langle \vec{u}, \vec{v} \rangle| \leq \|\vec{u}\| \cdot \|\vec{v}\|$
- Angle φ between vectors $\vec{u}, \vec{v} \in \mathbb{R}^n$:
  $\cos \varphi := \dfrac{\langle \vec{u}, \vec{v} \rangle}{\|\vec{u}\| \cdot \|\vec{v}\|}$
  - cos φ is the "cosine distance" measure of similarity
- $\vec{u}$ and $\vec{v}$ are orthogonal iff $\langle \vec{u}, \vec{v} \rangle = 0$
  - the shortest connection between a point $\vec{u}$ and a subspace U is orthogonal to all vectors $\vec{v} \in U$

Cartesian coordinates
- A set of vectors $\vec{b}^{(1)}, \dots, \vec{b}^{(n)}$ is called orthonormal if the vectors are pairwise orthogonal and of unit length:
  - $\langle \vec{b}^{(j)}, \vec{b}^{(k)} \rangle = 0$ for $j \neq k$
  - $\langle \vec{b}^{(k)}, \vec{b}^{(k)} \rangle = \|\vec{b}^{(k)}\|^2 = 1$
- An orthonormal basis and the corresponding coordinates are called Cartesian
- Cartesian coordinates are particularly intuitive, and the inner product has the same form wrt. every Cartesian basis B: for $\vec{u} \equiv_B \vec{x}'$ and $\vec{v} \equiv_B \vec{y}'$, we have
  $\langle \vec{u}, \vec{v} \rangle = (\vec{x}')^T \vec{y}' = x'_1 y'_1 + \dots + x'_n y'_n$
- NB: the column vectors of the matrix B are orthonormal
  - recall that the columns of B specify the standard coordinates of the vectors $\vec{b}^{(1)}, \dots, \vec{b}^{(n)}$
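
The definitions of inner product, Euclidean norm, angle and orthonormality above translate directly into code. The sketch below uses plain Python tuples; the example vectors and the rotated basis are hypothetical choices, and the Cartesian coordinates are computed as inner products with the basis vectors, which is assumed here as a standard property of orthonormal bases.

```python
import math


def dot(x, y):
    """Euclidean inner product x^T y in standard coordinates."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm derived from the inner product."""
    return math.sqrt(dot(x, x))


def cosine(x, y):
    """cos(phi) for the angle phi between two non-zero vectors."""
    return dot(x, y) / (norm(x) * norm(y))


def is_orthonormal(vectors, tol=1e-12):
    """True if <b(j), b(k)> is 1 for j = k and 0 otherwise (up to tol)."""
    for j, bj in enumerate(vectors):
        for k, bk in enumerate(vectors):
            expected = 1.0 if j == k else 0.0
            if abs(dot(bj, bk) - expected) > tol:
                return False
    return True


# Hypothetical example vectors
u, v = (3.0, 2.0), (1.0, 2.0)

print(cosine(u, v), math.degrees(math.acos(cosine(u, v))))
print(abs(dot(u, v)) <= norm(u) * norm(v))   # Cauchy-Schwarz inequality

# A Cartesian basis: the standard basis rotated by 30 degrees (arbitrary choice)
t = math.radians(30.0)
basis = [(math.cos(t), math.sin(t)), (-math.sin(t), math.cos(t))]
print(is_orthonormal(basis))

# Coordinates wrt. the Cartesian basis; the inner product keeps the same form
x_prime = tuple(dot(u, b) for b in basis)
y_prime = tuple(dot(v, b) for b in basis)
print(math.isclose(dot(x_prime, y_prime), dot(u, v)))
```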

Orthogonal projection
- Cartesian coordinates $\vec{u} \equiv_B \vec{x}$ can easily be computed:
  $\langle \vec{u}, \vec{b}^{(k)} \rangle = \Big\langle \sum_{j=1}^{n} x_j \vec{b}^{(j)}, \vec{b}^{(k)} \Big\rangle = \sum_{j=1}^{n} x_j \underbrace{\langle \vec{b}^{(j)}, \vec{b}^{(k)} \rangle}_{= \delta_{jk}} = x_k$
  - Kronecker delta: $\delta_{jk} = 1$ for $j = k$ and 0 for $j \neq k$
- Orthogonal projection $P_V : \mathbb{R}^n \to V$ to subspace $V := \mathrm{sp}(\vec{b}^{(1)}, \dots, \vec{b}^{(k)})$ (for k < n) is given by
  $P_V \vec{u} := \sum_{j=1}^{k} \vec{b}^{(j)} \langle \vec{u}, \vec{b}^{(j)} \rangle$

Hyperplanes & normal vectors
- A hyperplane $U \subseteq \mathbb{R}^n$ through the origin $\vec{0}$ can be characterized by the equation
  $U = \{ \vec{u} \in \mathbb{R}^n \mid \langle \vec{u}, \vec{n} \rangle = 0 \}$
  for a suitable $\vec{n} \in \mathbb{R}^n$ with $\|\vec{n}\| = 1$
- $\vec{n}$ is called the normal vector of U
- The orthogonal projection $P_U$ into U is given by
  $P_U \vec{v} := \vec{v} - \vec{n} \langle \vec{v}, \vec{n} \rangle$
- An arbitrary hyperplane $\Gamma \subseteq \mathbb{R}^n$ can analogously be characterized by
  $\Gamma = \{ \vec{u} \in \mathbb{R}^n \mid \langle \vec{u}, \vec{n} \rangle = a \}$
  where $a \in \mathbb{R}$ is the (signed) distance of Γ from $\vec{0}$

Orthogonal matrices
- A matrix A whose column vectors are orthonormal is called an orthogonal matrix
- $A^T$ is orthogonal iff A is orthogonal
- The inverse of an orthogonal matrix is simply its transpose, i.e. $A^{-1} = A^T$
  - it is easy to show $A^T A = I$ by matrix multiplication, since the columns of A are orthonormal
  - since $A^T$ is also orthogonal, it follows that $A A^T = (A^T)^T A^T = I$
  - side remark: the transposition operator $\cdot^T$ is called an involution because $(A^T)^T = A$

Isometric maps
- An endomorphism $f : \mathbb{R}^n \to \mathbb{R}^n$ is called an isometry iff $\langle f(\vec{u}), f(\vec{v}) \rangle = \langle \vec{u}, \vec{v} \rangle$ for all $\vec{u}, \vec{v} \in \mathbb{R}^n$
- Geometric interpretation: isometries preserve angles and distances (which are defined in terms of $\langle \cdot, \cdot \rangle$)
- f is an isometry iff its matrix A is orthogonal
- Coordinate transformations between Cartesian systems are isometric (because B and $B^{-1} = B^T$ are orthogonal)
- Every isometric endomorphism of $\mathbb{R}^n$ can be written as a combination of planar rotations and axial reflections in a suitable Cartesian coordinate system
  $R_\varphi^{(1,3)} = \begin{pmatrix} \cos \varphi & 0 & -\sin \varphi \\ 0 & 1 & 0 \\ \sin \varphi & 0 & \cos \varphi \end{pmatrix}, \quad Q^{(2)} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$
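
The projection formulas and the orthogonality conditions above can be checked with a short sketch in plain Python. All function names and example vectors are hypothetical; the rotation and reflection matrices follow the forms given for $R_\varphi^{(1,3)}$ and $Q^{(2)}$, and the angle 0.7 is an arbitrary choice.

```python
import math


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def apply(A, u):
    """Matrix-vector product; A is a list of rows."""
    return tuple(dot(row, u) for row in A)


def project_onto_span(u, basis):
    """P_V u = sum_j b(j) <u, b(j)> for orthonormal vectors b(1), ..., b(k)."""
    result = [0.0] * len(u)
    for b in basis:
        c = dot(u, b)
        for i, bi in enumerate(b):
            result[i] += c * bi
    return tuple(result)


def project_onto_hyperplane(v, n):
    """P_U v = v - n <v, n> for a hyperplane through the origin with unit normal n."""
    c = dot(v, n)
    return tuple(vi - c * ni for vi, ni in zip(v, n))


def is_orthogonal(A, tol=1e-12):
    """True if the column vectors of A are orthonormal."""
    cols = list(zip(*A))
    return all(
        abs(dot(cols[j], cols[k]) - (1.0 if j == k else 0.0)) <= tol
        for j in range(len(cols))
        for k in range(len(cols))
    )


def rotation_13(phi):
    """Planar rotation in the plane of the first and third coordinate axes."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]


# Axial reflection of the second coordinate
Q2 = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical example vectors
u, v = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)
e1, e2, e3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

print(project_onto_span(u, [e1, e2]))    # projection onto the x1-x2 plane
print(project_onto_hyperplane(u, e3))    # same plane, described by its normal vector

R = rotation_13(0.7)
print(is_orthogonal(R), is_orthogonal(Q2))
# An isometry preserves inner products
print(math.isclose(dot(apply(R, u), apply(R, v)), dot(u, v)))
```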

General inner products
- General inner products can be defined by
  $\langle \vec{u}, \vec{v} \rangle_B := (\vec{x}')^T \vec{y}' = x'_1 y'_1 + \dots + x'_n y'_n$
  wrt. non-Cartesian basis B ($\vec{u} \equiv_B \vec{x}'$, $\vec{v} \equiv_B \vec{y}'$)
- $\langle \cdot, \cdot \rangle_B$ can be expressed in standard coordinates $\vec{u} \equiv_E \vec{x}$, $\vec{v} \equiv_E \vec{y}$ using the transformation matrix B:
  $\langle \vec{u}, \vec{v} \rangle_B = (\vec{x}')^T \vec{y}' = (B^{-1} \vec{x})^T (B^{-1} \vec{y}) = \vec{x}^T (B^{-1})^T B^{-1} \vec{y} =: \vec{x}^T C \vec{y}$

General inner products
- The coefficient matrix $C := (B^{-1})^T B^{-1}$ of the general inner product is symmetric
  $C^T = (B^{-1})^T ((B^{-1})^T)^T = (B^{-1})^T B^{-1} = C$
  and positive definite
  $\vec{x}^T C \vec{x} = (B^{-1} \vec{x})^T (B^{-1} \vec{x}) = (\vec{x}')^T \vec{x}' \geq 0$

General inner products
- An example:
  $\vec{b}^{(1)} = (3, 2), \quad \vec{b}^{(2)} = (1, 2)$
  $B = \begin{pmatrix} 3 & 1 \\ 2 & 2 \end{pmatrix}, \quad B^{-1} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{4} \\ -\frac{1}{2} & \frac{3}{4} \end{pmatrix}, \quad C = \begin{pmatrix} .5 & -.5 \\ -.5 & .625 \end{pmatrix}$
- graph shows unit circle of the inner product C, i.e. points $\vec{x}$ with $\vec{x}^T C \vec{x} = 1$
[Figure: unit circle of the inner product C in the (x1, x2) plane, with the basis vectors b1 and b2; axes from −3 to 3]
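
The example can be reproduced numerically. The sketch below recomputes $B^{-1}$ and $C$ from the basis vectors given above and evaluates the general inner product in standard coordinates; the helper names are hypothetical, and the 2×2 inverse uses the usual adjugate formula, which is an addition not spelled out on the slides.

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def apply(M, x):
    """Matrix-vector product; M is a list of rows."""
    return tuple(dot(row, x) for row in M)


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(M, N):
    cols = list(zip(*N))
    return [[dot(row, col) for col in cols] for row in M]


def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Columns of B are the basis vectors b(1) = (3, 2) and b(2) = (1, 2)
b1, b2 = (3.0, 2.0), (1.0, 2.0)
B = [[3.0, 1.0], [2.0, 2.0]]
B_inv = inverse_2x2(B)
C = matmul(transpose(B_inv), B_inv)


def inner_B(x, y):
    """General inner product <u, v>_B = x^T C y in standard coordinates."""
    return dot(x, apply(C, y))


print(B_inv)
print(C)
# The basis vectors are orthonormal wrt. the inner product they define
print(inner_B(b1, b1), inner_B(b1, b2), inner_B(b2, b2))
```

With respect to this inner product, `b1` and `b2` behave like an orthonormal basis, which is why both lie on the unit circle shown in the graph.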

General inner products
- C is a symmetric matrix
- There is always an orthonormal basis so that C has diagonal form
  → "standard" dot product with additional scaling factors (wrt. this orthonormal basis)
- Intuitively, the unit circle is a squashed and rotated circle
[Figure: unit circle of the inner product in the (x1, x2) plane, with the orthonormal basis vectors c1 and c2; axes from −3 to 3]
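
To make the diagonal form concrete, the following sketch computes an orthonormal eigenbasis for a symmetric 2×2 matrix in closed form and applies it to the coefficient matrix C = [[.5, −.5], [−.5, .625]] from the example. The closed-form eigen-decomposition and the function name are assumptions of this illustration, not a method stated on the slides; the semi-axis length 1/√λ follows from $\vec{x}^T C \vec{x} = 1$ along each eigenvector.

```python
import math


def eigen_symmetric_2x2(C):
    """Eigenvalues and unit eigenvectors of a symmetric matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = C
    if b == 0.0:
        # already diagonal: the standard basis vectors are eigenvectors
        return [(a, (1.0, 0.0)), (d, (0.0, 1.0))]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mean + radius, mean - radius):
        vx, vy = b, lam - a               # solves (C - lam I) v = 0 when b != 0
        length = math.hypot(vx, vy)
        pairs.append((lam, (vx / length, vy / length)))
    return pairs


C = [[0.5, -0.5], [-0.5, 0.625]]
pairs = eigen_symmetric_2x2(C)

for lam, c in pairs:
    # lam is the scaling factor of the dot product along the basis vector c;
    # the unit circle x^T C x = 1 has semi-axis length 1 / sqrt(lam) in that direction
    print(lam, c, 1.0 / math.sqrt(lam))
```

The two printed directions are orthogonal, and the two semi-axis lengths describe how strongly the circle is squashed along each of them.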
The kernel trick
Tweak your space, don’t tweak your norms . . .
- Use standard inner products, but map data to higher-dimensional space before applying them
  → All methods of Euclidean geometry are still available
- Non-linear mappings can drastically change the geometry of the original vector space
- The kernel trick allows efficient computation of inner products and distances without an explicit high-dimensional representation
  $\langle \vec{u}, \vec{v} \rangle = f(\vec{u}, \vec{v})$
  where f must satisfy the properties of an inner product
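
A small, well-known instance may help to see what the kernel trick buys. The sketch below assumes a specific non-linear map from R² to R³ and the homogeneous quadratic kernel; both are choices made for this illustration and are not taken from the slides. It shows that the kernel function returns the same value as the standard inner product of the explicitly mapped vectors, and that distances in the mapped space can be obtained from kernel values alone.

```python
import math


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def feature_map(x):
    """Assumed explicit map phi: R^2 -> R^3, phi(x) = (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def kernel(x, y):
    """Homogeneous quadratic kernel f(x, y) = <x, y>^2, computed in the original space."""
    return dot(x, y) ** 2


def kernel_distance(x, y):
    """Distance between phi(x) and phi(y), computed from kernel values only."""
    squared = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(0.0, squared))   # guard against tiny negative rounding errors


# Hypothetical example vectors
u, v = (3.0, 2.0), (1.0, 2.0)

# Inner product in the mapped space, with and without the explicit representation
print(dot(feature_map(u), feature_map(v)), kernel(u, v))

# Distance in the mapped space, with and without the explicit representation
diff = tuple(a - b for a, b in zip(feature_map(u), feature_map(v)))
print(math.sqrt(dot(diff, diff)), kernel_distance(u, v))
```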
Kernelisation
- Kernelised versions of all algorithms from linear algebra and normed vector spaces can be formulated
- A hyperplane in a kernelised space corresponds to a non-linear classifier in the original space
  → this is the principle behind support vector machines
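
The correspondence between a hyperplane in the mapped space and a non-linear classifier in the original space can be illustrated with a toy example. The sketch assumes the quadratic map phi(x) = (x1², √2·x1·x2, x2²), a hand-picked normal vector and a hypothetical radius of 2; it is not a support vector machine, since nothing is learned from data, but it shows a linear decision rule in R³ acting as a circular decision boundary in R².

```python
import math


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def feature_map(x):
    """Assumed explicit map phi: R^2 -> R^3, phi(x) = (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Hyperplane { z | <z, n> = a } in the mapped space, with unit normal vector n.
# Since <phi(x), n> = (x1^2 + x2^2) / sqrt(2), it corresponds to the circle
# x1^2 + x2^2 = radius^2 in the original space.
radius = 2.0                                   # hypothetical choice
n = (1.0 / math.sqrt(2.0), 0.0, 1.0 / math.sqrt(2.0))
a = radius ** 2 / math.sqrt(2.0)


def classify(x):
    """+1 if phi(x) lies beyond the hyperplane (x outside the circle), else -1."""
    return 1 if dot(feature_map(x), n) > a else -1


for point in [(0.5, 0.5), (1.0, 1.0), (-1.5, -1.5), (3.0, 0.0)]:
    print(point, classify(point))
```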
I think that’s enough for today . . .