Privacy-Preserving Support
Vector Machines via Random
Kernels
July 28, 2017
Olvi Mangasarian
UW Madison & UCSD La Jolla
Edward Wild
UW Madison
Horizontally and Vertically Partitioned Data
[Figure: an m × n data matrix A (m examples, n features); horizontal partitioning splits A into row blocks A1, A2, A3, while vertical partitioning splits A into column blocks A·1, A·2, A·3]
Problem Statement
• Entities with related data wish to learn a classifier based on
all data
• The entities are unwilling to reveal their data to each other
– If each entity holds a different set of features for all examples, then
the data is said to be vertically partitioned
– If each entity holds a different set of examples with all features,
then the data is said to be horizontally partitioned
• Our approach: privacy-preserving support vector machine
(PPSVM) using random kernels
– Provides accurate classification
– Does not reveal private information
Outline
• Support vector machines (SVMs)
• Reduced and random kernel SVMs
• Privacy-preserving SVM for vertically
partitioned data
• Privacy-preserving SVM for horizontally
partitioned data
• Summary
Support Vector Machines (SVMs)
• Linear kernel: (K(A, B′))ij = (AB′)ij = AiBj′ = K(Ai, Bj′)
• Gaussian kernel with parameter μ: (K(A, B′))ij = exp(−μ‖Ai′ − Bj′‖²)
• x ∈ R^n; A contains all data points, {+…+} ⊂ A+, {−…−} ⊂ A−; e is a vector of ones
• The SVM classifier is defined by parameters u and threshold γ of the nonlinear separating surface K(x′, A′)u = γ
• Constraints: K(A+, A′)u ≥ eγ + e and K(A−, A′)u ≤ eγ − e
• Slack variable y ≥ 0 allows points to lie on the wrong side of the bounding surfaces K(x′, A′)u = γ ± 1
• Minimize e′y (the hinge loss, or plus function max{·, 0}) to fit the data
• Minimize e′s (equal to ‖u‖₁ at the solution) to reduce overfitting
[Figure: + and − points separated by the surface K(x′, A′)u = γ, with bounding surfaces K(x′, A′)u = γ + 1 and K(x′, A′)u = γ − 1]
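The following NumPy sketch (not from the talk) illustrates the two kernels defined above; the function names and the Gaussian parameter name mu are assumptions made for this example.

import numpy as np

def linear_kernel(A, B):
    # (K(A, B'))_ij = A_i B_j'
    return A @ B.T

def gaussian_kernel(A, B, mu=1.0):
    # (K(A, B'))_ij = exp(-mu * ||A_i - B_j||^2)
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-mu * sq_dists)

A = np.random.default_rng(0).standard_normal((5, 3))
B = np.random.default_rng(1).standard_normal((2, 3))
print(linear_kernel(A, B).shape, gaussian_kernel(A, B).shape)  # (5, 2) (5, 2)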
Reduced and Random Kernel SVMs
• Reduced SVM (L&M, 2001): replace the full kernel matrix K(A, A′) with K(A, Ā′), where Ā′ consists of a randomly selected subset of the rows of A
• Random reduced SVM (M&T, 2006): B′ is a completely random matrix
• Using the random kernel K(A, B′) is a key result for generating a simple and accurate privacy-preserving SVM
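A minimal sketch of the two constructions above, assuming NumPy and illustrative sizes: the reduced kernel uses a random subset of the rows of A, while the random kernel uses a completely random matrix B with the same number of columns as A.

import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 30
A = rng.standard_normal((m, n))

# Reduced SVM (L&M, 2001): Abar is a randomly selected ~10% subset of rows of A.
rows = rng.choice(m, size=m // 10, replace=False)
Abar = A[rows]
K_reduced = A @ Abar.T            # m x (m/10) instead of m x m

# Random reduced SVM (M&T, 2006): B is a completely random matrix with the
# same number of columns as A.
B = rng.standard_normal((m // 10, n))
K_random = A @ B.T                # m x (m/10)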
Error of Random Kernels is Comparable to Full Kernels: Linear Kernels
• B is a random matrix with the same number of columns as A and 10% as many rows, so dim(AB′) << dim(AA′)
• Each point represents one of 7 datasets from the UCI repository
[Figure: Random Kernel AB′ Error plotted against Full Kernel AA′ Error; the diagonal marks equal error for random and full kernels]

Error of Random Kernels is Comparable to Full Kernels: Gaussian Kernels
[Figure: Random Kernel K(A, B′) Error plotted against Full Kernel K(A, A′) Error for the same 7 datasets]
Vertically Partitioned Data:
Each entity holds different features for the same examples
[Figure: the data matrix partitioned column-wise into blocks A·1, A·2, A·3, one per entity]
Serial Secure Computation of the Linear Kernel AA′ (Yu-Vaidya-Jiang, 2006)
• Entity 1 masks its contribution with a random matrix R1 and passes on A·1A·1′ + R1
• Entity 2 adds its block: (A·1A·1′ + R1) + A·2A·2′
• Entity 3 adds its block: ((A·1A·1′ + R1) + A·2A·2′) + A·3A·3′
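A sketch of the serial accumulation above, assuming (beyond what the slide shows) that entity 1 later subtracts its random mask R1 to recover AA′; all variable names are hypothetical.

import numpy as np

rng = np.random.default_rng(1)
m = 100
A1, A2, A3 = (rng.standard_normal((m, k)) for k in (4, 3, 5))  # column blocks A.1, A.2, A.3

R1 = rng.standard_normal((m, m))          # random mask known only to entity 1
s1 = A1 @ A1.T + R1                       # entity 1 -> entity 2
s2 = s1 + A2 @ A2.T                       # entity 2 -> entity 3
s3 = s2 + A3 @ A3.T                       # entity 3 -> entity 1
AAt = s3 - R1                             # entity 1 removes its mask (assumed final step)

A = np.hstack([A1, A2, A3])
assert np.allclose(AAt, A @ A.T)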
Our Parallel Secure Computation of the Random Linear Kernel AB′
• Each entity j computes its own block A·jB·j′ in parallel
• The blocks are summed to give AB′ = A·1B·1′ + A·2B·2′ + A·3B·3′
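A sketch of the parallel computation, assuming NumPy: each entity forms its block product locally, and the sum of the blocks equals AB′ because B is partitioned column-wise in the same way as A.

import numpy as np

rng = np.random.default_rng(2)
m, kbar = 100, 10                         # kbar = number of rows of B
blocks = [4, 3, 5]                        # feature counts held by the three entities
A_parts = [rng.standard_normal((m, n_j)) for n_j in blocks]
B_parts = [rng.standard_normal((kbar, n_j)) for n_j in blocks]  # entity j's private B.j

# Each entity publishes only A.j B.j'; the sum equals AB'.
AB = sum(Aj @ Bj.T for Aj, Bj in zip(A_parts, B_parts))

A, B = np.hstack(A_parts), np.hstack(B_parts)
assert np.allclose(AB, A @ B.T)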
Privacy Preserving SVMs for Vertically
Partitioned Data via Random Kernels
• Each of q entities privately owns a block of data
A·1, …, A·q that it is unwilling to share with the others
• Each entity j picks its own random matrix B·j and
distributes K(A·j, B·j′) to the other q − 1 entities
• K(A, B′) = K(A·1, B·1′) ⊙ … ⊙ K(A·q, B·q′)
– ⊙ is + for the linear kernel
– ⊙ is the Hadamard (element-wise) product for the Gaussian kernel
• A new point x = (x1′, …, xq′)′ can be distributed among
the entities by similarly computing
K(x′, B′) = K(x1′, B·1′) ⊙ … ⊙ K(xq′, B·q′)
• Recovering A·j from K(A·j, B·j′) without knowing B·j is
essentially impossible
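A sketch of the block combination described above, assuming every entity uses the same Gaussian parameter mu; the helper function and names are illustrative, not the authors' code.

import numpy as np

def gaussian_kernel(A, B, mu=0.1):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-mu * d2)

rng = np.random.default_rng(3)
m, kbar = 60, 6
A_parts = [rng.standard_normal((m, n_j)) for n_j in (4, 3, 5)]
B_parts = [rng.standard_normal((kbar, n_j)) for n_j in (4, 3, 5)]

# Linear kernel: K(A, B') is the sum of the per-entity blocks.
K_lin = sum(Aj @ Bj.T for Aj, Bj in zip(A_parts, B_parts))

# Gaussian kernel: K(A, B') is the Hadamard product of the blocks, because the
# squared distance decomposes as a sum over feature blocks, so the exponentials multiply.
K_gau = np.ones((m, kbar))
for Aj, Bj in zip(A_parts, B_parts):
    K_gau *= gaussian_kernel(Aj, Bj)

A, B = np.hstack(A_parts), np.hstack(B_parts)
assert np.allclose(K_lin, A @ B.T)
assert np.allclose(K_gau, gaussian_kernel(A, B))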
Results for PPSVM on Vertically
Partitioned Data
• Compare classifiers which share feature data with
classifiers which do not share
– Seven datasets from the UCI repository
• Simulate situations in which each entity has only a
subset of features
– In the first situation, the features are evenly divided among 5
entities
– In the second situation, each entity receives about 3
features
Error Rate of Sharing Data Generally Better than not Sharing: Linear Kernels
• 7 datasets represented by two points each
[Figure: Error Rate With Sharing plotted against Error Rate Without Sharing]

Error Rate of Sharing Data Generally Better than not Sharing: Nonlinear Kernels
[Figure: Error Sharing Data plotted against Error Without Sharing Data]
Horizontally Partitioned Data:
Each entity holds different examples with the same features
[Figure: the data matrix partitioned row-wise into blocks A1, A2, A3, one per entity]
Privacy Preserving SVMs for Horizontally
Partitioned Data via Random Kernels
• Each of q entities privately owns a block of data A1, …,
Aq that it is unwilling to share with the other q − 1
entities
• The entities all agree on the same random matrix B′ and
each entity j distributes K(Aj, B′) to all entities
• K(A, B′) is obtained by stacking the blocks K(A1, B′), …, K(Aq, B′) row-wise
• Aj cannot be recovered uniquely from K(Aj, B′)
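A sketch of the horizontal case, assuming NumPy: every entity applies the shared random matrix B to its own examples, and only the products Aj B′ are exchanged and stacked.

import numpy as np

rng = np.random.default_rng(4)
n, kbar = 30, 5
A_parts = [rng.standard_normal((20, n)) for _ in range(3)]   # each entity holds ~20 examples
B = rng.standard_normal((kbar, n))                           # shared random matrix

K_blocks = [Aj @ B.T for Aj in A_parts]      # each entity publishes only Aj B'
K = np.vstack(K_blocks)                      # = K(A, B') for the stacked data A

assert np.allclose(K, np.vstack(A_parts) @ B.T)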
Privacy Preservation:
Infinite Number of Solutions for Ai given AiB′
• Given
– the random matrix B ∈ R^(k×n) with fewer rows than columns (k < n)
– Pi = AiB′, so that BAi′ = Pi′
• Consider an attempt to solve for row r of Ai, 1 ≤ r ≤ mi, from the equation
– BAir′ = Pir′, Air′ ∈ R^n
– Every square submatrix of the random matrix B is nonsingular
– Hence each row equation has at least (n choose k) ≥ n distinct solutions Air′
• Thus there are at least n^mi solutions Ai to the equation BAi′ = Pi′
• If each entity has 20 points in R^30, there are at least 30^20 solutions
• Furthermore, each of the infinite number of matrices in the
affine hull of these matrices is a solution
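A sketch illustrating the non-uniqueness argument above, assuming NumPy: given only B and the product of a private row with B′, the least-norm solution plus any null-space vector of B reproduces the same observation.

import numpy as np

rng = np.random.default_rng(5)
kbar, n = 5, 30                       # B has fewer rows than columns
B = rng.standard_normal((kbar, n))
x_true = rng.standard_normal(n)       # a private row Air'
p = B @ x_true                        # what an adversary sees (together with B)

x_ls = np.linalg.lstsq(B, p, rcond=None)[0]    # one solution (minimum norm)
null_basis = np.linalg.svd(B)[2][kbar:].T      # basis of the (n - kbar)-dim null space of B
x_other = x_ls + null_basis @ rng.standard_normal(n - kbar)

# Both reproduce the observed product, yet neither need equal the private row.
assert np.allclose(B @ x_ls, p) and np.allclose(B @ x_other, p)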
Results for PPSVM on Horizontally
Partitioned Data
• Compare classifiers which share examples
with classifiers which do not share
– Seven datasets from the UCI repository
• Simulate a situation in which each entity
has only a subset of about 25 examples
Error Rate of Sharing Data is Better than not Sharing: Linear Kernels
[Figure: Error Sharing Data plotted against Error Without Sharing Data]

Error Rate of Sharing Data is Better than not Sharing: Gaussian Kernels
[Figure: Error Sharing Data plotted against Error Without Sharing Data]
Summary
• Privacy preserving SVM for vertically or
horizontally partitioned data
– Based on using the random kernel K(A, B0)
– Learn classifier using all data, but without
revealing privately held data
– Classification accuracy is better than an SVM
without sharing, and comparable to an SVM
where all data is shared