Learning From Data Lecture 6 Bounding The Growth Function Bounding the Growth Function Models are either Good or Bad The VC Bound - replacing |H| with mH (N ) M. Magdon-Ismail CSCI 4100/6100 recap: The Growth Function mH(N ) A new measure for the diversity of a hypothesis set. H(x1, . . . , xN ) = {(h(x1), . . . , h(xN ))} The dichotomies (N -tuples) H implements on x1, . . . , xN . H H viewed through D The growth function mH(N ) considers the worst possible x1, . . . , xN . mH (N ) = max |H(x1, . . . , xN )|. x1 ,...,xN This lecture: Can we bound mH (N ) by a polynomial in N ? Can we replace |H| by mH (N ) in the generalization bound? c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 2 /31 Example growth functions −→ Example Growth Functions N 4 5 ··· 1 2 3 2-D perceptron 2 4 8 14 ··· 1-D pos. ray 2 3 4 5 ··· 2-D pos. rectangles 2 4 8 16 < 25 · · · • mH(N ) drops below 2N – there is hope. • A break point is any k for which mH (k) < 2k . c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 3 /31 Quiz I −→ Pop Quiz I ∗ I give you a set of k ∗ points x1, . . . , xk∗ on which H implements < 2k dichotomys. (a) k ∗ is a break point. (b) k ∗ is not a break point. (c) all break points are > k ∗. (d) all break points are ≤ k ∗. (e) we don’t know anything about break points. c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 4 /31 Answer −→ Pop Quiz I ∗ I give you a set of k ∗ points x1, . . . , xk∗ on which H implements < 2k dichotomys. (a) k ∗ is a break point. (b) k ∗ is not a break point. (c) all break points are > k ∗. (d) all break points are ≤ k ∗. X (e) we don’t know anything about break points. c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 5 /31 Quiz II −→ Pop Quiz II ∗ For every set of k ∗ points x1, . . . , xk∗ , H implements < 2k dichotomys. (a) k ∗ is a break point. (b) k ∗ is not a break point. (c) all k ≥ k ∗ are break points. (d) all k < k ∗ are break points. (e) we don’t know anything about break points. c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 6 /31 Answer −→ Pop Quiz II ∗ For every set of k ∗ points x1, . . . , xk∗ , H implements < 2k dichotomys. X (a) k ∗ is a break point. (b) k ∗ is not a break point. X (c) all k ≥ k ∗ are break points. (d) all k < k ∗ are break points. (e) we don’t know anything about break points. c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 7 /31 Quiz III −→ Pop Quiz III To show that k is not a break point for H: (a) Show a set of k points x1, . . . xk which H can shatter. (b) Show H can shatter any set of k points. (c) Show a set of k points x1, . . . xk which H cannot shatter. (d) Show H cannot shatter any set of k points. (e) Show mH(k) = 2k . c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 8 /31 Answer −→ Pop Quiz III To show that k is not a break point for H: X (a) Show a set of k points x1, . . . xk which H can shatter. overkill (b) Show H can shatter any set of k points. (c) Show a set of k points x1, . . . xk which H cannot shatter. (d) Show H cannot shatter any set of k points. X (e) Show mH (k) = 2k . c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 9 /31 Quiz IV −→ Pop Quiz IV To show that k is a break point for H: (a) Show a set of k points x1, . . . xk which H can shatter. (b) Show H can shatter any set of k points. (c) Show a set of k points x1, . . . xk which H cannot shatter. (d) Show H cannot shatter any set of k points. (e) Show mH (k) > 2k . c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 10 /31 Answer −→ Pop Quiz IV To show that k is a break point for H: (a) Show a set of k points x1, . . . xk which H can shatter. (b) Show H can shatter any set of k points. (c) Show a set of k points x1, . . . xk which H cannot shatter. X (d) Show H cannot shatter any set of k points. (e) Show mH (k) > 2k . c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 11 /31 Combinatorial puzzle again −→ Back to Our Combinatorial Puzzle How many dichotomies can you list on 4 points so that no 2 is shattered. x1 ◦ ◦ ◦ ◦ • x2 ◦ ◦ ◦ • ◦ x3 ◦ ◦ • ◦ ◦ x4 ◦ • ◦ ◦ ◦ Can we add a 6th dichotomy? c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 12 /31 Can’t add a 6th dichotomy −→ Can’t Add A 6th Dichotomy x1 ◦ ◦ ◦ ◦ • ◦ c AM L Creator: Malik Magdon-Ismail x2 ◦ ◦ ◦ • ◦ • x3 ◦ ◦ • ◦ ◦ • x4 ◦ • ◦ ◦ ◦ ◦ Bounding the Growth Function: 13 /31 B(N, K) −→ The Combinatorial Quantity B(N, k) How many dichotomies can you list on 4 points so that no 2 are shattered. ↑ ↑ N k B(N , k): Max. number of dichotomys on N points so that no k are shattered. x1 ◦ ◦ ◦ • x2 ◦ ◦ • ◦ x3 ◦ • ◦ ◦ B(3, 2) = 4 c AM L Creator: Malik Magdon-Ismail x1 ◦ ◦ ◦ ◦ • x2 ◦ ◦ ◦ • ◦ x3 ◦ ◦ • ◦ ◦ x4 ◦ • ◦ ◦ ◦ B(4, 2) = 5 Bounding the Growth Function: 14 /31 B(4, 3) −→ Let’s Try To Bound B(4, 3) How many dichotomies can you list on 4 points so that no subset of 3 is shattered. x1 ◦ ◦ ◦ ◦ • ◦ ◦ • ◦ • • c AM L Creator: Malik Magdon-Ismail x2 ◦ ◦ ◦ • ◦ ◦ • ◦ • ◦ • x3 ◦ ◦ • ◦ ◦ • ◦ ◦ • • ◦ x4 ◦ • ◦ ◦ ◦ • • • ◦ ◦ ◦ Bounding the Growth Function: 15 /31 Two kinds of dichotomys −→ Two Kinds of Dichotomys Prefix appears once or prefix appears twice. x1 ◦ ◦ ◦ ◦ • ◦ ◦ • ◦ • • c AM L Creator: Malik Magdon-Ismail x2 ◦ ◦ ◦ • ◦ ◦ • ◦ • ◦ • x3 ◦ ◦ • ◦ ◦ • ◦ ◦ • • ◦ x4 ◦ • ◦ ◦ ◦ • • • ◦ ◦ ◦ Bounding the Growth Function: 16 /31 Reorder the dichotomys −→ Reorder the Dichotomys α β β c AM L Creator: Malik Magdon-Ismail x1 ◦ • • ◦ ◦ ◦ • ◦ ◦ ◦ • x2 • ◦ • ◦ ◦ • ◦ ◦ ◦ • ◦ x3 • • ◦ ◦ • ◦ ◦ ◦ • ◦ ◦ x4 ◦ ◦ ◦ ◦ ◦ ◦ ◦ • • • • α: prefix appears once β: prefix appears twice B(4, 3) = α + 2β Bounding the Growth Function: 17 /31 Bound for α + β −→ First, Bound α + β α β β c AM L Creator: Malik Magdon-Ismail x1 ◦ • • ◦ ◦ ◦ • ◦ ◦ ◦ • x2 • ◦ • ◦ ◦ • ◦ ◦ ◦ • ◦ x3 • • ◦ ◦ • ◦ ◦ ◦ • ◦ ◦ x4 ◦ ◦ ◦ ◦ ◦ ◦ ◦ • • • • α + β ≤ B(3, 3) ↑ A list on 3 points, with no 3 shattered (why?) Bounding the Growth Function: 18 /31 Bound for β −→ Second, Bound β α β β c AM L Creator: Malik Magdon-Ismail x1 ◦ • • ◦ ◦ ◦ • ◦ ◦ ◦ • x2 • ◦ • ◦ ◦ • ◦ ◦ ◦ • ◦ x3 • • ◦ ◦ • ◦ ◦ ◦ • ◦ ◦ x4 ◦ ◦ ◦ ◦ ◦ ◦ ◦ • • • • β ≤ B(3, 2) ↑ If 2 points are shattered, then using the mirror dichotomies you shatter 3 points (why?) Bounding the Growth Function: 19 /31 Combine the bounds −→ Combining to Bound α + 2β α β β c AM L Creator: Malik Magdon-Ismail x1 ◦ • • ◦ ◦ ◦ • ◦ ◦ ◦ • x2 • ◦ • ◦ ◦ • ◦ ◦ ◦ • ◦ x3 • • ◦ ◦ • ◦ ◦ ◦ • ◦ ◦ x4 ◦ ◦ ◦ ◦ ◦ ◦ ◦ • • • • B(4, 3) = α + β + β ≤ B(3, 3) + B(3, 2) The argument generalizes to (N , k) B(N , k) ≤ B(N − 1, k) + B(N − 1, k − 1) Bounding the Growth Function: 20 /31 Simple boundary cases −→ Boundary Cases: B(N , 1) and B(N , N ) 1 N 1 1 2 1 3 1 4 1 5 1 6 .. 1 .. c AM L Creator: Malik Magdon-Ismail 2 3 k 4 5 6 ··· 3 B(N , 1) = 1 (why?) 7 B(N , N ) = 2N − 1 15 (why?) 31 63 ... Bounding the Growth Function: 21 /31 Getting B(3, 2) −→ Recursion Gives B(N , k) Bound B(N , k) ≤ B(N − 1, k) + B(N − 1, k − 1) 1 N c AM L Creator: Malik Magdon-Ismail 1 1 2 1 3 1 4 1 5 1 6 .. 1 .. 2 3 k 4 5 6 ··· 63 .. ... 3 ց ↓ 4 7 15 31 .. .. .. .. Bounding the Growth Function: 22 /31 Filling the table −→ Recursion Gives B(N , k) Bound B(N , k) ≤ B(N − 1, k) + B(N − 1, k − 1) 1 N c AM L Creator: Malik Magdon-Ismail 2 3 k 4 5 1 1 2 1 3 3 1 4 7 4 1 5 11 15 5 1 6 16 26 31 6 .. 1 .. 7 .. 22 .. 42 .. 57 .. Bounding the Growth Function: 23 /31 6 ··· 63 .. ... B(N , k) ≤ k−1 X i=0 N i −→ Analytic Bound for B(N , k) Theorem. B(N , k) ≤ k−1 X N i=0 i . Proof: (Induction on N.) 1. Verify for N = 1: B(1, 1) ≤ k−1 P N 2. Suppose B(N, k) ≤ . i i=0 N Lemma. Nk + k−1 = Nk+1 . 1 0 =1 X B(N + 1, k) ≤ B(N, k) + B(N, k − 1) k−1 P N k−2 P N ≤ + i i = i=0 k−1 P i=0 = 1+ = 1+ N i + i=1 k−1 P i=1 k−1 P N i k−1 P i=0 c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 24 /31 N +1 i N +1 i i=1 = i=0 k−1 P + N i−1 N i−1 (lemma) mH (N ) ≤ B(N, k) −→ mH(N ) is bounded by B(N, k)! Theorem. Suppose that H has a break point at k. Then, mH (N ) ≤ B(N, k). x1 ◦ ◦ ◦ ◦ • ◦ ◦ .. x2 ◦ ◦ ◦ • ◦ ◦ • .. x3 ◦ ◦ • ◦ ◦ • ◦ .. c AM L Creator: Malik Magdon-Ismail x4 ◦ • ◦ ◦ ◦ • • .. . . . xN ... • ... ◦ ... ◦ ... ◦ ... • ... • ... ◦ . . . .. Consider any k points. They cannot be shattered (otherwise k woud not be a break point). B(N, k) is largest such list. mH (N ) ≤ B(N, k) Bounding the Growth Function: 25 /31 Once broken, forever polynomial −→ Once bitten, twice shy . . . Once Broken, Forever Polynomial Theorem. If k is any break point for H, so mH (k) < 2k , then k−1 X N . mH (N ) ≤ i i=0 Facts (Problems 2.5 and 2.6): k−1 X N i=0 i ≤ N k−1 + 1 k−1 eN k−1 (polynomial in N ) This is huge: if we can replace |H| with mH (N ) in the bound, then learning is feasible. c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 26 /31 There’s good, bad, no ugly −→ A Hypothesis Set is either Good and Bad log mH (N ) the bad H the ugly H the good H N 1 2 3 2-D perceptron 2 4 8 1-D pos. ray 2 3 2-D pos. rectangles 2 4 c AM L Creator: Malik Magdon-Ismail N 4 mH (N ) 5 ··· 14 ··· ··· ≤ N3 + 1 4 5 ··· ··· ≤ N1 + 1 8 16 < 25 ··· ≤ N4 + 1 Bounding the Growth Function: 27 /31 We have a bound on mH ; next: |H| ← mH ? −→ We have One Step in the Puzzle X Can we get a polynomial bound on mH (N ) even for infinite H? X Can we replace |H| with mH (N ) in the generalization bound? c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 28 /31 Ghost ‘test’ set D′ represents Eout −→ Eout ր Age ց Age ′ Ein Income The ghost data set: a ‘fictitious’ data set D ′: Income Ein Income (i) How to Deal With Eout (Sketch) Probability distribution ′ of Ein, Ein Age ′ Ein is like a test error on N new points. ′ Ein deviates from Eout implies Ein deviates from Ein . ′ Ein and Ein have the same distribution. ′ Ein Ein ′ P[(Ein (g), Ein(g)) “deviate”] ≥ 1 2 P [(Eout(g), Ein(g)) “deviate”] Eout We can analyze deviations between two in-sample errors. c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 29 /31 D ∪ D′ =⇒ mH (2N ) −→ (ii) Real Plus Ghost Data Set = 2N points x1 x2 x3 . . . xN xN +1 xN +2 xN +3 . . . x2N ◦ ◦ • ... ◦ • • ◦ ... ◦ Number of dichotomys is at most mH (2N ). Up to technical details, analyze a “hypothesis set” of size at most mH (2N ). c AM L Creator: Malik Magdon-Ismail Bounding the Growth Function: 30 /31 The VC-Bound −→ The Vapnik-Chervonenkis Bound (VC Bound) P [|Ein(g) − Eout(g)| > ǫ] ≤ 4mH(2N )e−ǫ 2 N/8 , P [|Ein(g) − Eout(g)| ≤ ǫ] ≥ 1 − 4mH(2N )e−ǫ Eout(g) ≤ Ein(g) + c AM L Creator: Malik Magdon-Ismail q 8 N log 4mHδ(2N ) , Bounding the Growth Function: 31 /31 2 for any ǫ > 0. N/8 , for any ǫ > 0. w.p. at least 1 − δ.
© Copyright 2024 Paperzz