The Teaching Dimension of Linear Learners

Submission and Formatting Instructions for ICML 2016
Supplemental Material: Proofs
Proof of Theorem 1
Proof. Let $n^*$ be the minimal number of training items needed to ensure a unique solution $\theta^*$. First consider the case $n^* = 0$. It happens if and only if $\theta^* = 0$ and $\mathrm{Rank}(A) = d$, which is a special case of $A\theta^* = 0$. Clearly, this case is consistent with LB1. Next consider the case $n^* \ge 1$. Since $\theta^*$ solves (1), the KKT condition holds:
\[ -\lambda A\theta^* \in \sum_{i=1}^{n^*} \partial_1 \ell(x_i^\top \theta^*, y_i)\, x_i. \tag{29} \]
We seek all $\delta$ such that $\theta^* + \delta$ satisfies
\[ A(\theta^* + \delta) = A\theta^* \quad\text{and}\quad x_i^\top(\theta^* + \delta) = x_i^\top \theta^*, \quad \forall i = 1, \cdots, n^*. \tag{30} \]
For any such $\delta$, simple algebra verifies that $\theta^* + t\delta$ satisfies the KKT condition (29) for any $t \in [0, 1]$. Consequently, $\theta^* + \delta$ also solves the problem in (1). To see this, we consider two situations:

• If the loss function $\ell(\cdot,\cdot)$ is convex in the first argument, the KKT condition is a sufficient optimality condition, which means that $\theta^* + \delta$ solves (1).

• If the loss function $\ell(\cdot,\cdot)$ is smooth (not necessarily convex) in the first argument, we have $f(\theta^*) = f(\theta^* + \delta)$ by the Taylor expansion (recall $f$ is defined in equation (1)):
\begin{align*}
f(\theta^* + \delta) &= f(\theta^*) + \langle \nabla f(\theta^* + t\delta),\, \delta \rangle \quad \text{(for some } t \in [0, 1]) \\
&= f(\theta^*) + \Big\langle \sum_{i=1}^{n^*} \nabla_1 \ell(x_i^\top (\theta^* + t\delta), y_i)\, x_i + \lambda A(\theta^* + t\delta),\, \delta \Big\rangle \\
&= f(\theta^*) + \underbrace{\Big\langle \sum_{i=1}^{n^*} \nabla_1 \ell(x_i^\top \theta^*, y_i)\, x_i + \lambda A\theta^*,\, \delta \Big\rangle}_{=0 \text{ due to the KKT condition (29)}} \\
&= f(\theta^*).
\end{align*}

Therefore, $\theta^* + \delta$ also solves (1). However, the uniqueness of $\theta^*$ requires $\delta = 0$ to be the only value satisfying (30). This is equivalent to saying
\[ \mathrm{Null}(A) \cap \mathrm{Null}(\mathrm{Span}\{x_1, \cdots, x_{n^*}\}) = \{0\}. \tag{31} \]
It indicates that
\[ \mathrm{Rank}(A) + \mathrm{Dim}(\mathrm{Span}\{x_1, \cdots, x_{n^*}\}) \ge d. \]
From $n^* \ge \mathrm{Dim}(\mathrm{Span}\{x_1, \cdots, x_{n^*}\})$, we have $n^* \ge d - \mathrm{Rank}(A)$. We proved the general case of LB1.
If we have $A\theta^* \ne 0$, we can further improve LB1. Let $g^* = (g_1^*, \ldots, g_{n^*}^*)^\top$ be the vector satisfying
\[ -\lambda A\theta^* = \sum_{i=1}^{n^*} g_i^* x_i \quad\text{and}\quad g_i^* \in \partial_1 \ell(x_i^\top \theta^*, y_i) \quad \forall i = 1, 2, \cdots, n^*. \tag{32} \]
Since $\theta^*$ satisfies the KKT condition, such a vector $g^*$ must exist. Applying $A\theta^* \ne 0$ to (32), we have $g^* \ne 0$ and
\[ \mathrm{Dim}\left( \mathrm{Span}\{A_{\cdot 1}, A_{\cdot 2}, \cdots, A_{\cdot d}\} \cap \mathrm{Span}\{x_1, \cdots, x_{n^*}\} \right) \ge 1. \tag{33} \]
To satisfy (31), we must have
\[ d = \mathrm{Dim}\left( \mathrm{Span}\{A_{\cdot 1}, A_{\cdot 2}, \cdots, A_{\cdot d}, x_1, \cdots, x_{n^*}\} \right). \]
Using the fact from linear algebra that
\[ \mathrm{Dim}\left( \mathrm{Span}\{A_{\cdot 1}, \cdots, A_{\cdot d}, x_1, \cdots, x_{n^*}\} \right) = \underbrace{\mathrm{Dim}\left( \mathrm{Span}\{A_{\cdot 1}, \cdots, A_{\cdot d}\} \right)}_{=\mathrm{Rank}(A)} + \underbrace{\mathrm{Dim}\left( \mathrm{Span}\{x_1, \cdots, x_{n^*}\} \right)}_{\le n^*} - \underbrace{\mathrm{Dim}\left( \mathrm{Span}\{A_{\cdot 1}, \cdots, A_{\cdot d}\} \cap \mathrm{Span}\{x_1, \cdots, x_{n^*}\} \right)}_{\ge 1 \text{ from (33)}}, \]
we conclude that $n^* \ge d - \mathrm{Rank}(A) + 1$. This completes the proof of LB1.
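The dimension count behind LB1 can be checked numerically. The following sketch is our own illustration (not part of the original supplement; all variable names are ours): with $\mathrm{Rank}(A) = r$, fewer than $d - r$ generic items leave the joint span short of $\mathbb{R}^d$, so (31) fails, while $d - r$ generic items suffice.

```python
import numpy as np

# Illustration of the dimension count behind LB1 (our own sketch, not the
# paper's code). A is a d x d matrix of rank r; condition (31) can hold only
# if Rank(A) + Dim(Span{x_1, ..., x_n}) >= d.
rng = np.random.default_rng(0)
d, r = 6, 2
M = rng.standard_normal((r, d))
A = M.T @ M                                  # d x d, Rank(A) = r

# With n = d - r - 1 generic items, the columns of A together with the x_i
# cannot span R^d, so Null(A) intersects the items' null space nontrivially
# and theta* cannot be unique.
X_few = rng.standard_normal((d, d - r - 1))
assert np.linalg.matrix_rank(np.hstack([A, X_few])) < d

# With n = d - r generic items the joint span is all of R^d, consistent
# with the lower bound n* >= d - Rank(A).
X_enough = rng.standard_normal((d, d - r))
assert np.linalg.matrix_rank(np.hstack([A, X_enough])) == d
```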
Proof of Theorem 2
Proof. When $A$ has full rank, we have an equivalent expression for the KKT condition (29):
\[ -\lambda A^{\frac{1}{2}} \theta^* \in \sum_{i=1}^{n^*} A^{-\frac{1}{2}} x_i \, \partial_1 \ell(x_i^\top \theta^*, y_i). \tag{34} \]
Let us decompose $A^{-\frac{1}{2}} x_i$ for all $i = 1, \cdots, n^*$ into $A^{-\frac{1}{2}} x_i = \alpha_i A^{\frac{1}{2}} \theta^* + u_i$, where $u_i$ is orthogonal to $A^{\frac{1}{2}} \theta^*$, that is, $u_i^\top A^{\frac{1}{2}} \theta^* = 0$. Equivalently, $x_i = \alpha_i A\theta^* + A^{\frac{1}{2}} u_i$. Applying this decomposition, we have
\[ x_i^\top \theta^* = \alpha_i \|\theta^*\|_A^2 + u_i^\top A^{\frac{1}{2}} \theta^* = \alpha_i \|\theta^*\|_A^2. \]
Putting it back into (34), we obtain
\[ -\lambda A^{\frac{1}{2}} \theta^* \in \sum_{i=1}^{n^*} \left( \alpha_i A^{\frac{1}{2}} \theta^* + u_i \right) \partial_1 \ell(\alpha_i \|\theta^*\|_A^2, y_i). \tag{35} \]
Since $u_i$ is orthogonal to $A^{\frac{1}{2}} \theta^*$, (35) can be rewritten as
\[ \exists \alpha_i \in \mathbb{R},\ \exists y_i \in \mathcal{Y},\ \exists g_i \in \partial_1 \ell(\alpha_i \|\theta^*\|_A^2, y_i), \quad \forall i = 1, \cdots, n^*, \]
satisfying
\[ \sum_{i=1}^{n^*} g_i u_i = 0 \tag{36} \]
\[ -\lambda A^{\frac{1}{2}} \theta^* = A^{\frac{1}{2}} \theta^* \sum_{i=1}^{n^*} \alpha_i g_i. \tag{37} \]
Since $A\theta^* \ne 0$, we have $A^{\frac{1}{2}} \theta^* \ne 0$, and (37) is equivalent to $-\lambda = \sum_{i=1}^{n^*} \alpha_i g_i$. It follows that
\[ \lambda = -\sum_{i=1}^{n^*} \alpha_i g_i \le n^* \sup_{\alpha \in \mathbb{R},\, y \in \mathcal{Y},\, g \in \partial_1 \ell(\alpha \|\theta^*\|_A^2,\, y)} (-\alpha g) = n^* \sup_{\alpha \in \mathbb{R},\, y \in \mathcal{Y},\, g \in -\partial_1 \ell(\alpha \|\theta^*\|_A^2,\, y)} \alpha g. \]
It indicates the lower bound for $n^*$:
\[ n^* \ge \left\lceil \frac{\lambda}{\sup_{\alpha \in \mathbb{R},\, y \in \mathcal{Y},\, g \in -\partial_1 \ell(\alpha \|\theta^*\|_A^2,\, y)} \alpha g} \right\rceil. \]

Proof of Theorem 3
Proof. Let $D = \{x_i, y_i\}_{i=1,\cdots,n}$ be a teaching set for $[w^*; b^*]$. The following KKT condition needs to be satisfied:
\[ 0 \in \sum_{i=1}^{n} \partial \ell(y_i (x_i^\top w^* + b^*)) \, y_i \begin{bmatrix} x_i \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} A w^* \\ 0 \end{bmatrix}. \tag{38} \]
If we construct a new training set
\[ \hat{D} = \left\{ \hat{x}_i = x_i + \frac{b^*}{\|w^*\|_A^2} A w^*, \ \hat{y}_i = y_i \right\}_{i=1,\cdots,n}, \]
then $[w^*; 0]$ satisfies the KKT condition defined on $\hat{D}$. This can be verified as follows (note $\hat{y}_i \hat{x}_i^\top w^* = y_i (x_i^\top w^* + b^*)$, since $(A w^*)^\top w^* = \|w^*\|_A^2$):
\begin{align*}
&\sum_{i=1}^{n} \partial \ell(\hat{y}_i \hat{x}_i^\top w^*) \, \hat{y}_i \begin{bmatrix} \hat{x}_i \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} A w^* \\ 0 \end{bmatrix} \\
&= \sum_{i=1}^{n} \partial \ell(y_i (x_i^\top w^* + b^*)) \, y_i \begin{bmatrix} x_i + \frac{b^*}{\|w^*\|_A^2} A w^* \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} A w^* \\ 0 \end{bmatrix} \\
&= \underbrace{\sum_{i=1}^{n} \partial \ell(y_i (x_i^\top w^* + b^*)) \, y_i \begin{bmatrix} x_i \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} A w^* \\ 0 \end{bmatrix}}_{\ni\, 0 \text{ from (38)}} + \frac{b^*}{\|w^*\|_A^2} \begin{bmatrix} A w^* \\ 0 \end{bmatrix} \underbrace{\sum_{i=1}^{n} \partial \ell(y_i (x_i^\top w^* + b^*)) \, y_i}_{\ni\, 0 \text{ from (38)}} \\
&\ni 0, \tag{39}
\end{align*}
where $0 \in \sum_{i=1}^{n} \partial \ell(y_i (x_i^\top w^* + b^*)) \, y_i$ is from the bias dimension in (38). It follows that
\[ 0 \in \sum_{i=1}^{n} \partial \ell(\hat{y}_i \hat{x}_i^\top w^*) \, \hat{y}_i \hat{x}_i + \lambda A w^*, \]
which is equivalent to
\[ 0 \in \sum_{i=1}^{n} \partial \ell(\hat{y}_i \hat{x}_i^\top w^*) \, A^{-\frac{1}{2}} \underbrace{\hat{y}_i \hat{x}_i}_{=:z_i} + \lambda A^{\frac{1}{2}} w^* = \sum_{i=1}^{n} \partial \ell(z_i^\top w^*) \, A^{-\frac{1}{2}} z_i + \lambda A^{\frac{1}{2}} w^*. \tag{40} \]
We decompose $A^{-\frac{1}{2}} z_i = \alpha_i A^{\frac{1}{2}} w^* + u_i$, where $u_i$ satisfies $u_i^\top A^{\frac{1}{2}} w^* = 0$. Applying this decomposition to (40), we have
\[ -\lambda A^{\frac{1}{2}} w^* \in \sum_{i=1}^{n} \partial \ell(\alpha_i \|w^*\|_A^2) \left( \alpha_i A^{\frac{1}{2}} w^* + u_i \right). \tag{41} \]
Since $u_i$ is orthogonal to $A^{\frac{1}{2}} w^*$, (41) implies that
\[ -\lambda A^{\frac{1}{2}} w^* \in \sum_{i=1}^{n} \partial \ell(\alpha_i \|w^*\|_A^2) \, \alpha_i A^{\frac{1}{2}} w^*. \]
Since $w^* \ne 0$, we have
\[ -\lambda \in \sum_{i=1}^{n} \partial \ell(\alpha_i \|w^*\|_A^2) \, \alpha_i. \]
Together with
\[ \lambda = -\sum_{i=1}^{n} \alpha_i g_i \le n \sup_{\alpha \in \mathbb{R},\, g \in -\partial \ell(\alpha \|w^*\|_A^2)} \alpha g \quad \text{for some } g_i \in \partial \ell(\alpha_i \|w^*\|_A^2), \]
we obtain LB3.
Proof of Proposition 1
Proof. We simply verify the KKT condition to see that $\theta^*$ is a solution to (10) by applying the construction in (11). The uniqueness of $\theta^*$ is guaranteed by the strong convexity of (10).
Proof of Proposition 2
Proof. We only need to verify that the KKT condition holds for $\theta^*$. Due to the strong convexity of (12), uniqueness is guaranteed automatically. We denote the negative subgradient $-\partial_a \max(1 - a, 0) = I(a)$, where
\[ I(a) = \begin{cases} 1, & \text{if } a < 1 \\ [0, 1], & \text{if } a = 1 \\ 0, & \text{otherwise.} \end{cases} \tag{42} \]
The KKT condition is
\begin{align*}
\sum_{i=1}^{n} y_i x_i \, \partial_1 \max(1 - y_i x_i^\top \theta^*, 0) + \lambda \theta^*
&= -\sum_{i=1}^{n} y_i x_i \, I(y_i x_i^\top \theta^*) + \lambda \theta^* \\
&= -n \, I\!\left( \frac{\lambda \|\theta^*\|^2}{\lceil \lambda \|\theta^*\|^2 \rceil} \right) \frac{\lambda \theta^*}{\lceil \lambda \|\theta^*\|^2 \rceil} + \lambda \theta^* \\
&= -\lambda \theta^* \, I\!\left( \frac{\lambda \|\theta^*\|^2}{\lceil \lambda \|\theta^*\|^2 \rceil} \right) + \lambda \theta^* \\
&\ni 0,
\end{align*}
where the second equality plugs in the teaching set construction (13), and the last line is due to $I\!\left( \lambda \|\theta^*\|^2 / \lceil \lambda \|\theta^*\|^2 \rceil \right)$ giving either the set $[0, 1]$ or the value $1$.
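As a concrete sanity check (our own illustration, not the paper's code; the items $x_i = \lambda \theta^* / \lceil \lambda \|\theta^*\|^2 \rceil$ with $y_i = 1$ are read off from the chain above and stand in for the paper's construction (13)), the KKT condition can be verified numerically:

```python
import math
import numpy as np

# Numerical check of the hinge-loss KKT condition above (our sketch).
# Teaching set: n = ceil(lam * ||theta*||^2) copies of x = lam * theta* / n
# with label y = +1, as read off from the displayed derivation.
theta = np.array([0.6, -0.8, 1.1])     # arbitrary target model theta*
lam = 2.5                              # regularization weight lambda
n = math.ceil(lam * float(theta @ theta))
x = lam * theta / n                    # each teaching item

margin = float(x @ theta)              # y x^T theta* = lam * ||theta*||^2 / n
assert margin <= 1.0                   # hinge is active, so 1 is in I(margin)

# KKT: n * g * x = lam * theta* for the subgradient choice g = 1 in I(margin).
residual = n * 1.0 * x - lam * theta
assert np.allclose(residual, 0.0)
```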
Proof of Corollary 2

Proof. We show this number matches LB2. Let $A = I$, $\ell(a, b) = \max(1 - ab, 0)$, and consider the denominator of (7):
\[ \sup_{\alpha \in \mathbb{R},\, y \in \mathcal{Y},\, g \in -\partial_1 \ell(\alpha \|\theta^*\|^2,\, y)} \alpha g = \sup_{\alpha,\, y \in \{-1, 1\},\, g \in y I(y \alpha \|\theta^*\|^2)} \alpha g = \sup_{\alpha,\, g \in I(\alpha \|\theta^*\|^2)} \alpha g = \frac{1}{\|\theta^*\|^2}, \]
where the first equality is due to $-\partial_1 \ell(a, b) = b I(ab)$. Therefore, $\mathrm{LB2} = \lceil \lambda \|\theta^*\|^2 \rceil$, which matches the construction in (13).
Proof of Proposition 3

Proof. We first verify that $\theta^*$ is a solution to (14) based on the teaching set construction in (16). We only need to verify that the gradient of (14) is zero. Computing the gradient of (14), we have
\begin{align*}
&-\sum_{i=1}^{n} \frac{y_i x_i}{1 + \exp\{y_i x_i^\top \theta^*\}} + \lambda \theta^* \\
&= -n \, \frac{1}{1 + \exp\left\{ \tau^{-1}\!\left( \lambda \|\theta^*\|^2 \left\lceil \frac{\lambda \|\theta^*\|^2}{\tau_{\max}} \right\rceil^{-1} \right) \right\}} \, \tau^{-1}\!\left( \lambda \|\theta^*\|^2 \left\lceil \frac{\lambda \|\theta^*\|^2}{\tau_{\max}} \right\rceil^{-1} \right) \frac{\theta^*}{\|\theta^*\|^2} + \lambda \theta^* \\
&= -n \, \lambda \|\theta^*\|^2 \left\lceil \frac{\lambda \|\theta^*\|^2}{\tau_{\max}} \right\rceil^{-1} \frac{\theta^*}{\|\theta^*\|^2} + \lambda \theta^* \\
&= -\lambda \theta^* + \lambda \theta^* = 0,
\end{align*}
where we used the construction (16) with $n = \lceil \lambda \|\theta^*\|^2 / \tau_{\max} \rceil$, the fact $\lambda \|\theta^*\|^2 \lceil \lambda \|\theta^*\|^2 / \tau_{\max} \rceil^{-1} \le \tau_{\max}$ (so that $\tau^{-1}$ is well defined), and the property $a = \frac{\tau^{-1}(a)}{1 + e^{\tau^{-1}(a)}}$. The strong convexity of (14) automatically implies uniqueness.
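The quantities in this proof can be checked numerically. The sketch below is ours (not the paper's code): it computes $\tau_{\max} = \sup_t t/(1+e^t)$, inverts $\tau$ by bisection, and confirms that the gradient of the regularized logistic loss vanishes at $\theta^*$ under the teaching set read off from the displayed algebra.

```python
import math
import numpy as np

# A numerical sketch (ours) of the key quantities: tau(t) = t / (1 + e^t),
# tau_max = sup_t tau(t), and the vanishing gradient at theta* under
# n = ceil(lam * ||theta*||^2 / tau_max) copies of
# x = tau_inv(lam * ||theta*||^2 / n) * theta* / ||theta*||^2 with label y = 1.
def tau(t):
    return t / (1.0 + math.exp(t))

# Stationarity of tau: 1 + e^t = t e^t, equivalently t = 1 + e^{-t}; the
# fixed-point iteration converges since |d/dt (1 + e^{-t})| < 1 near the root.
t_star = 1.0
for _ in range(100):
    t_star = 1.0 + math.exp(-t_star)
tau_max = tau(t_star)                         # about 0.2785; equals t_star - 1

def tau_inv(v):
    # Inverse of tau on [0, t_star], where tau is increasing; plain bisection.
    lo, hi = 0.0, t_star
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if tau(mid) < v else (lo, mid)
    return 0.5 * (lo + hi)

theta = np.array([1.0, -0.5])                 # arbitrary target model theta*
lam = 0.3
sq = float(theta @ theta)
n = math.ceil(lam * sq / tau_max)             # teaching set size
x = tau_inv(lam * sq / n) * theta / sq        # each teaching item, y_i = +1

grad = -n * x / (1.0 + math.exp(x @ theta)) + lam * theta
assert np.allclose(grad, 0.0, atol=1e-8)      # the gradient of (14) vanishes
```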
Proof of Corollary 3

Proof. We show that the number matches LB2. In (7), let $A = I$ and $\ell(a, b) = \log(1 + \exp\{-ab\})$. The denominator of LB2 is
\[ \sup_{\alpha \in \mathbb{R},\, y \in \mathcal{Y},\, g \in -\partial_1 \ell(\alpha \|\theta^*\|^2,\, y)} \alpha g = \sup_{\alpha,\, y \in \{-1, 1\},\, g = y (1 + \exp\{y \alpha \|\theta^*\|^2\})^{-1}} \alpha g = \sup_{\alpha} \frac{\alpha}{1 + \exp\{\alpha \|\theta^*\|^2\}} = \|\theta^*\|^{-2} \sup_{t} \frac{t}{1 + \exp\{t\}} = \frac{\tau_{\max}}{\|\theta^*\|^2}, \]
which implies $\mathrm{LB2} = \left\lceil \frac{\lambda \|\theta^*\|^2}{\tau_{\max}} \right\rceil$.

Proof of Proposition 4
Proof. We first prove the case $w^* = 0$. We can verify that the KKT condition is satisfied by designing $x_1$ and $y_1$ as in (18):
\begin{align*}
(x_1^\top w^* + b^* - y_1) x_1 + \lambda w^* &= 0 \\
x_1^\top w^* + b^* - y_1 &= 0.
\end{align*}
The uniqueness of $[w^*; b^*]$ follows from the strong convexity of (17) when $n = 1$.

We then prove the case $w^* \ne 0$. With simple algebra, we can verify that the KKT condition holds via the construction in (19):
\begin{align*}
(x_1^\top w^* + b^* - y_1) x_1 + (x_2^\top w^* + b^* - y_2) x_2 + \lambda w^* &= 0 \\
(x_1^\top w^* + b^* - y_1) + (x_2^\top w^* + b^* - y_2) &= 0.
\end{align*}
Similarly, the uniqueness is implied by the strong convexity of (17) when $n = 2$.
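Any pair of items satisfying the two displayed identities teaches $[w^*; b^*]$. The sketch below is our own illustration (the paper's construction (19) is not reproduced in this chunk, so the symmetric choice $x_1 = c w^*$, $x_2 = -c w^*$ and the resulting labels are assumptions of ours, consistent with the identities):

```python
import numpy as np

# Our own sketch of a pair satisfying the two KKT identities for the
# nonzero-w* case of ridge regression. Residuals r_i = x_i^T w* + b* - y_i
# are chosen as r1 = -lam/(2c), r2 = +lam/(2c) so both identities hold.
w = np.array([1.0, 2.0])   # target weights w* (nonzero case)
b = -0.7                   # target offset b*
lam, c = 0.5, 1.0
x1, x2 = c * w, -c * w
r1, r2 = -lam / (2 * c), lam / (2 * c)
y1 = x1 @ w + b - r1       # labels back-solved from the residuals
y2 = x2 @ w + b - r2

# The two displayed KKT identities:
assert np.allclose(r1 * x1 + r2 * x2 + lam * w, 0.0)   # gradient w.r.t. w
assert np.isclose(r1 + r2, 0.0)                        # gradient w.r.t. b
```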
Proof of Corollary 4

Proof. We match the lower bound LB1 in (6). Note $\theta^* = [w^*; b^*] \in \mathbb{R}^{d+1}$, and $A$ in this case is a $(d+1) \times (d+1)$ matrix: the $d \times d$ identity matrix $I_d$ padded with one additional row and column of zeros for the offset. Therefore $\mathrm{Rank}(A) = \mathrm{Rank}(I_d) = d$. When $w^* = 0$, $A\theta^* = 0$ and $\mathrm{LB1} = (d+1) - \mathrm{Rank}(A) = 1$. When $w^* \ne 0$, $A\theta^* \ne 0$ and $\mathrm{LB1} = (d+1) - \mathrm{Rank}(A) + 1 = 2$. These lower bounds match the teaching set sizes in (18) and (19), respectively.
Proof of Proposition 5

Proof. Unlike in previous learners (including the homogeneous SVM), we no longer have strong convexity w.r.t. $b$. In order to prove that (21) is a teaching set, we need to verify the KKT condition and verify solution uniqueness.
We first verify the KKT condition to show that the solution set under (21) includes the target model $[w^*; b^*]$. From (21), we have
\[ x_+^\top w^* + b^* = 1, \qquad -x_-^\top w^* - b^* = 1. \tag{43} \]
Applying them to the KKT condition and using the notation in (42), we obtain
\begin{align*}
&-\frac{n}{2} I(x_+^\top w^* + b^*) \begin{bmatrix} x_+ \\ 1 \end{bmatrix} + \frac{n}{2} I(-x_-^\top w^* - b^*) \begin{bmatrix} x_- \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} w^* \\ 0 \end{bmatrix} \\
&= -\frac{n}{2} I(1) \begin{bmatrix} x_+ \\ 1 \end{bmatrix} + \frac{n}{2} I(1) \begin{bmatrix} x_- \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} w^* \\ 0 \end{bmatrix} \\
&= -\frac{n}{2} I(1) \begin{bmatrix} x_+ - x_- \\ 0 \end{bmatrix} + \lambda \begin{bmatrix} w^* \\ 0 \end{bmatrix} \quad \text{(the bias dimension cancels to 0)} \\
&= -I(1) \, \frac{n}{\|w^*\|^2} \begin{bmatrix} w^* \\ 0 \end{bmatrix} + \lambda \begin{bmatrix} w^* \\ 0 \end{bmatrix} \quad \text{(applying (21))} \\
&\ni 0 \quad \text{(observing } n \ge \lambda \|w^*\|^2 \text{)}.
\end{align*}
It proves that $[w^*; b^*]$ solves (20) under our teaching set construction.
Next we prove uniqueness by contradiction. We use $f(w, b)$ to denote the objective function in (20) under the teaching set. It is easy to verify that $f(w^*, b^*) = \frac{\lambda}{2} \|w^*\|^2$. Assume that there exists another solution $[\bar{w}; \bar{b}]$ different from $[w^*; b^*]$. We can obtain $\|\bar{w}\| \le \|w^*\|$ due to
\[ \frac{\lambda}{2} \|w^*\|^2 = f(w^*, b^*) = f(\bar{w}, \bar{b}) \ge \frac{\lambda}{2} \|\bar{w}\|^2. \]
The second equality is due to $[\bar{w}; \bar{b}]$ being a solution; the inequality holds because the objective dominates its regularization term. Therefore, there are only two possibilities for the norm of $\bar{w}$: $\|\bar{w}\| = \|w^*\|$ or $\|\bar{w}\| = t \|w^*\|$ for some $0 \le t < 1$. Next we show that both cases are impossible.
(Case 1) For the case $\|\bar{w}\| = \|w^*\|$, we have
\begin{align*}
f(\bar{w}, \bar{b}) &= \frac{n}{2} \max\left( 1 - (x_+^\top \bar{w} + \bar{b}),\, 0 \right) + \frac{n}{2} \max\left( 1 + (x_-^\top \bar{w} + \bar{b}),\, 0 \right) + \frac{\lambda}{2} \|\bar{w}\|^2 \\
&= \frac{n}{2} \max\Big( \underbrace{x_+^\top (w^* - \bar{w}) + (b^* - \bar{b})}_{=:\delta_+},\, 0 \Big) + \frac{n}{2} \max\Big( \underbrace{-x_-^\top (w^* - \bar{w}) - (b^* - \bar{b})}_{=:\delta_-},\, 0 \Big) + \frac{\lambda}{2} \|w^*\|^2 \\
&= \frac{n}{2} \max(\delta_+, 0) + \frac{n}{2} \max(\delta_-, 0) + f(w^*, b^*),
\end{align*}
using (43) in the second equality. From $f(\bar{w}, \bar{b}) = f(w^*, b^*)$, it follows that $\delta_+ \le 0$ and $\delta_- \le 0$. Since
\[ \delta_+ + \delta_- = (x_+ - x_-)^\top (w^* - \bar{w}) = \frac{2 (w^*)^\top (w^* - \bar{w})}{\|w^*\|^2} = 2 - \frac{2 \bar{w}^\top w^*}{\|w^*\|^2}, \]
we have $\bar{w}^\top w^* \ge \|w^*\|^2$. But because $\|\bar{w}\| = \|w^*\|$, we must have $\bar{w} = w^*$ (by Cauchy-Schwarz). Applying this new observation to $\delta_+ \le 0$ and $\delta_- \le 0$, we obtain $b^* = \bar{b}$. It means that $[w^*; b^*] = [\bar{w}; \bar{b}]$, contradicting our assumption $[w^*; b^*] \ne [\bar{w}; \bar{b}]$.
(Case 2) Next we turn to the case $\|\bar{w}\| = t \|w^*\|$ for some $t \in [0, 1)$. Recall our assumption that $[\bar{w}; \bar{b}]$ solves (20). Then it follows that the following specific construction $[\hat{w}; \hat{b}]$ solves (20) as well:
\[ \hat{w} = t w^*, \qquad \hat{b} = t b^*. \tag{44} \]
To see this, we consider the following optimization problem:
\begin{align*}
\min_{w, b} \quad & L(w, b) := \frac{n}{2} \max(1 - (x_+^\top w + b), 0) + \frac{n}{2} \max(1 + (x_-^\top w + b), 0) \\
\text{s.t.} \quad & \|w\| \le t \|w^*\|. \tag{45}
\end{align*}
Since $[\bar{w}; \bar{b}]$ solves (20), it is easy to see that $[\bar{w}; \bar{b}]$ solves (45) too; otherwise there would exist a solution of (45) that gives a lower objective value in (20). Then we can verify that $[\hat{w}; \hat{b}]$ solves (45) as well by showing that the following optimality condition holds:
\[ -\begin{bmatrix} \frac{\partial L(w, b)}{\partial w} \\ \frac{\partial L(w, b)}{\partial b} \end{bmatrix}_{[\hat{w}; \hat{b}]} \cap \underbrace{N_{\|w\| \le t \|w^*\|}(\hat{w}, \hat{b})}_{\text{normal cone to the set } \{[w; b] : \|w\| \le t \|w^*\|\} \text{ at } [\hat{w}; \hat{b}]} \ne \emptyset. \tag{46} \]
Given a convex closed set $\Omega$ and a point $\theta \in \Omega$, the normal cone at $\theta$ is defined to be the set
\[ N_\Omega(\theta) = \{ \phi : \langle \phi, \omega - \theta \rangle \le 0 \quad \forall \omega \in \Omega \}. \]
The optimality condition basically says that, at an optimal point, the negative (sub)gradient direction overlaps with the normal cone; in other words, there is no feasible direction that decreases the objective at that point. Readers can refer to Nocedal & Wright (2006) or Bertsekas & Nedic (2003) for more explanation of this geometric optimality condition.
Because of (43) and (44), we have $x_+^\top \hat{w} + \hat{b} = t < 1$ and $-x_-^\top \hat{w} - \hat{b} = t < 1$. Thus at $[\hat{w}; \hat{b}]$ the negative subgradient is
\[ -\begin{bmatrix} \frac{\partial L(w, b)}{\partial w} \\ \frac{\partial L(w, b)}{\partial b} \end{bmatrix}_{[\hat{w}; \hat{b}]} = \frac{n}{2} \begin{bmatrix} x_+ - x_- \\ 0 \end{bmatrix} = \frac{n}{\|w^*\|^2} \begin{bmatrix} w^* \\ 0 \end{bmatrix}, \tag{47} \]
and the normal cone is
\[ N_{\|w\| \le t \|w^*\|}(\hat{w}, \hat{b}) = \left\{ s \begin{bmatrix} w^* \\ 0 \end{bmatrix} : s \ge 0 \right\}. \tag{48} \]
The intersection is non-empty by choosing $s = \frac{n}{\|w^*\|^2}$. Since both $[\hat{w}; \hat{b}]$ and $[\bar{w}; \bar{b}]$ solve (45), we have $L(\hat{w}, \hat{b}) = L(\bar{w}, \bar{b})$. Together with $\|\hat{w}\| = \|\bar{w}\|$, we have
\[ f(\hat{w}, \hat{b}) = L(\hat{w}, \hat{b}) + \frac{\lambda}{2} \|\hat{w}\|^2 = f(\bar{w}, \bar{b}) = f(w^*, b^*). \]
Therefore, we proved that $[\hat{w}; \hat{b}]$ solves (20) as well. To see the contradiction, let us check the value of $f(\hat{w}, \hat{b})$ via a different route:
\begin{align*}
f(\hat{w}, \hat{b}) &= f(t w^*, t b^*) \\
&= \frac{n}{2} \max\left( 1 - t (x_+^\top w^* + b^*),\, 0 \right) + \frac{n}{2} \max\left( 1 + t (x_-^\top w^* + b^*),\, 0 \right) + \frac{\lambda}{2} \|w^*\|^2 t^2 \\
&= \frac{n}{2} \max(1 - t, 0) + \frac{n}{2} \max(1 - t, 0) + \frac{\lambda}{2} \|w^*\|^2 t^2 \\
&= n (1 - t) - \frac{\lambda}{2} \|w^*\|^2 (1 - t^2) + \frac{\lambda}{2} \|w^*\|^2 \\
&\ge n (1 - t) - \frac{n}{2} (1 - t^2) + \frac{\lambda}{2} \|w^*\|^2 \\
&= \frac{n}{2} (1 - t)^2 + f(w^*, b^*) \\
&> f(w^*, b^*),
\end{align*}
where the first inequality uses the fact that $n \ge \lambda \|w^*\|^2$. This contradicts our earlier assertion $f(\hat{w}, \hat{b}) = f(w^*, b^*)$. Putting cases 1 and 2 together, we prove uniqueness.
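The margin identities (43), the relation $x_+ - x_- = 2 w^* / \|w^*\|^2$, and the subgradient cancellation can be checked numerically. In this sketch (ours, not the paper's code) the concrete points $x_\pm$ are one choice consistent with those identities; the paper's construction (21) may differ in its specifics:

```python
import math
import numpy as np

# One choice of x_plus, x_minus on the line spanned by w* that satisfies the
# margin identities (43) and x_plus - x_minus = 2 w* / ||w*||^2 (a hypothetical
# instance; the paper's construction (21) is not reproduced here).
w = np.array([2.0, -1.0])              # target weights w*
b = 0.4                                # target offset b*
lam = 1.5
sq = float(w @ w)
x_plus = (1.0 - b) * w / sq            # label +1
x_minus = -(1.0 + b) * w / sq          # label -1
assert math.isclose(x_plus @ w + b, 1.0)      # (43), positive point on margin
assert math.isclose(-(x_minus @ w) - b, 1.0)  # (43), negative point on margin
assert np.allclose(x_plus - x_minus, 2.0 * w / sq)

# With an even n >= lam * ||w*||^2 split evenly between the two points, the
# common subgradient value g = lam * ||w*||^2 / n lies in I(1) = [0, 1] and
# makes the KKT condition hold exactly.
c = math.ceil(lam * sq)
n = c + c % 2                          # even n >= lam * ||w*||^2
g = lam * sq / n
assert 0.0 <= g <= 1.0
kkt_w = -(n / 2) * g * x_plus + (n / 2) * g * x_minus + lam * w
kkt_b = -(n / 2) * g + (n / 2) * g     # bias dimension cancels
assert np.allclose(kkt_w, 0.0) and kkt_b == 0.0
```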
Proof of Corollary 5

Proof. The upper bound directly follows Proposition 5. We only need to show the lower bound $\mathrm{LB3} = \lceil \lambda \|w^*\|^2 \rceil$ in Theorem 3. Let $A = I$, $\ell(a) = \max(1 - a, 0)$, and consider the denominator of (9):
\[ \sup_{\alpha \in \mathbb{R},\, g \in -\partial \ell(\alpha \|w^*\|^2)} \alpha g = \sup_{\alpha,\, g \in I(\alpha \|w^*\|^2)} \alpha g = \frac{1}{\|w^*\|^2}, \]
where the first equality is due to $-\partial \ell(a) = I(a)$. Therefore, $\mathrm{LB3} = \lceil \lambda \|w^*\|^2 \rceil$, which proves the lower bound.

Proof of Proposition 6
Proof. We first point out that, for $t$ to be well defined, the argument to $\tau^{-1}(\cdot)$ has to be bounded: $\frac{\lambda \|w^*\|^2}{n} \le \tau_{\max}$. This implies $n \ge \frac{\lambda \|w^*\|^2}{\tau_{\max}}$. The size of our proposed teaching set is the smallest among all symmetric constructions that satisfy this constraint.

We verify the KKT condition to show that the construction in (23) includes the solution $[w^*; b^*]$. From (23), we have
\[ x_+^\top w^* + b^* = t, \qquad x_-^\top w^* + b^* = -t. \]
We apply them and the teaching set construction to compute the gradient of (22):
\begin{align*}
&-\frac{n}{2} \frac{1}{1 + \exp\{x_+^\top w^* + b^*\}} \begin{bmatrix} x_+ \\ 1 \end{bmatrix} + \frac{n}{2} \frac{1}{1 + \exp\{-x_-^\top w^* - b^*\}} \begin{bmatrix} x_- \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} w^* \\ 0 \end{bmatrix} \\
&= -\frac{n}{2} \frac{1}{1 + \exp\{t\}} \begin{bmatrix} x_+ \\ 1 \end{bmatrix} + \frac{n}{2} \frac{1}{1 + \exp\{t\}} \begin{bmatrix} x_- \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} w^* \\ 0 \end{bmatrix} \\
&= -\frac{n}{\|w^*\|^2} \frac{t}{1 + \exp\{t\}} \begin{bmatrix} w^* \\ 0 \end{bmatrix} + \lambda \begin{bmatrix} w^* \\ 0 \end{bmatrix} \\
&= -\frac{n}{\|w^*\|^2} \frac{\lambda \|w^*\|^2}{n} \begin{bmatrix} w^* \\ 0 \end{bmatrix} + \lambda \begin{bmatrix} w^* \\ 0 \end{bmatrix} \\
&= 0.
\end{align*}
This verifies the KKT condition.
Finally we show uniqueness. The Hessian matrix of the objective function (22) under our training set (23) is
\[ \underbrace{\frac{n}{2} \frac{\exp\{t\}}{(1 + \exp\{t\})^2}}_{=:a} \underbrace{\begin{bmatrix} x_+ x_+^\top + x_- x_-^\top & x_+ + x_- \\ x_+^\top + x_-^\top & 2 \end{bmatrix}}_{=:A} + \lambda \underbrace{\begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}}_{=:B}. \]
Note $a > 0$ and $A = \begin{bmatrix} x_+ \\ 1 \end{bmatrix} \begin{bmatrix} x_+ \\ 1 \end{bmatrix}^\top + \begin{bmatrix} x_- \\ 1 \end{bmatrix} \begin{bmatrix} x_- \\ 1 \end{bmatrix}^\top$ is positive semi-definite. We show that $a A + \lambda B$ is positive definite. Suppose not. Then there exists $[u; v] \ne 0$ such that $[u; v]^\top (a A + \lambda B) [u; v] = 0$. This implies $[u; v]^\top (a A) [u; v] + \lambda u^\top u = 0$. Since the first term is non-negative ($A$ is positive semi-definite), $u = 0$. But then we have $2 a v^2 = 0$, which implies $[u; v] = 0$, a contradiction. Therefore uniqueness is guaranteed.
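The positive-definiteness argument can be confirmed numerically on an arbitrary instance. This sketch is ours, not the paper's code; the points $x_\pm$ are a hypothetical choice consistent with $x_\pm^\top w^* + b^* = \pm t$, and the particular value of $t$ does not matter for the argument since only $a > 0$ is used:

```python
import math
import numpy as np

# Numerical check that a*A + lam*B is positive definite (our sketch; x_plus,
# x_minus are a hypothetical instance with (x+-)^T w* + b* = +-t).
w = np.array([1.0, 0.5, -2.0])
b = -0.3
lam, n, t = 0.7, 6, 0.9
sq = float(w @ w)
x_plus = (t - b) * w / sq
x_minus = -(t + b) * w / sq

def aug(x):
    return np.append(x, 1.0)                   # append the bias coordinate

a = (n / 2) * math.exp(t) / (1.0 + math.exp(t)) ** 2
A = np.outer(aug(x_plus), aug(x_plus)) + np.outer(aug(x_minus), aug(x_minus))
B = np.diag(np.append(np.ones(len(w)), 0.0))   # identity padded with zeros
H = a * A + lam * B                            # Hessian of (22) at the optimum

assert a > 0
assert np.linalg.eigvalsh(A).min() > -1e-12    # A is positive semi-definite
assert np.linalg.eigvalsh(H).min() > 0         # H is positive definite
```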
Proof of Corollary 6

Proof. The upper bound directly follows Proposition 6. We only need to show the lower bound $\left\lceil \frac{\lambda \|w^*\|^2}{\tau_{\max}} \right\rceil = \mathrm{LB3}$ in Theorem 3. Let $A = I$ and $\ell(a) = \log(1 + \exp\{-a\})$, and consider the denominator of (9):
\[ \sup_{\alpha \in \mathbb{R},\, g \in -\partial \ell(\alpha \|w^*\|^2)} \alpha g = \sup_{\alpha,\, g = (1 + \exp\{\alpha \|w^*\|^2\})^{-1}} \alpha g = \sup_{\alpha} \frac{\alpha}{1 + \exp\{\alpha \|w^*\|^2\}} = \|w^*\|^{-2} \sup_{t} \frac{t}{1 + \exp\{t\}} = \frac{\tau_{\max}}{\|w^*\|^2}, \]
which implies $\mathrm{LB3} = \left\lceil \frac{\lambda \|w^*\|^2}{\tau_{\max}} \right\rceil$.