15-251: Great Theoretical Ideas In Computer Science
Recitation 11 Solutions

VC Dimension

Recall that the VC dimension of a class C is the largest number d such that there exists a set S of size d that is shattered by C, meaning that every labeling of the points of S with + and − is realized by some member of C.

(a) What is the VC dimension of the class of circles on the plane?

Solution: 3. For any three noncollinear points, we can easily draw a circle that includes all three of them, any two of them, any one of them, or none of them, so any three noncollinear points are shattered by the class of circles.

However, no set of four points is shattered. We show this by exhibiting an impossible labeling in each of three cases:

1. If the four points are collinear, the labeling +−+− (going along the line) is impossible, among numerous others.
2. If the convex hull of the four points is a triangle, then the labeling + (the three vertices of the triangle) and − (the interior point) is not possible.
3. If the convex hull of the four points is a quadrilateral, then let $(a_1, a_2)$ be the points separated by the long diagonal and $(b_1, b_2)$ be the points separated by the short diagonal. At least one of the two labelings "$+(a_1, a_2)$, $-(b_1, b_2)$" and "$+(b_1, b_2)$, $-(a_1, a_2)$" must be impossible: if both were possible, then there would be some circle $c_1$ realizing the first labeling and some other circle $c_2$ realizing the second, and the symmetric difference of these circles, $(c_1 \setminus c_2) \cup (c_2 \setminus c_1)$, would consist of four disjoint regions. This is impossible for circles, since two distinct circles cross in at most two points.

Since some set of 3 points is shattered by the class of circles, and no set of 4 points is, the VC dimension of the class of circles is 3.

(b) Let X be the set of binary strings of length 4 (i.e. the set of 4-character strings consisting only of 0 and 1). Let the class H be the set of “schemas” over X, where a schema is a string of length 4 over the symbols 1, 0, and ∗, and ∗ matches either 0 or 1.
For example, h = 1∗∗∗ returns “true” for any string that starts with 1, and “false” for everything else. Similarly, h = ∗∗∗∗ returns “true” for all strings. Is X shattered by H?

Solution: X is not shattered by H, because there exists a labeling that is not realized by any single schema. Consider the labeling

+(0000, 1111), −(everything else)

The only schema that accepts both 0000 and 1111 is ∗∗∗∗, since the two strings disagree in every position. However, this schema also accepts everything else, so there is no schema that accepts 0000 and 1111 but rejects the other strings. Therefore, X is not shattered by H.

(c) Bonus: What is the VC dimension of the class of ellipses in the plane?

Solution: 5. Note that the 5 vertices of a regular pentagon are shattered by ellipses, so the VC dimension is at least 5. The proof that no 6 points are shatterable by ellipses is left as an exercise to the reader.

Lazy Diagonalization

Show that the reals in [0, 1] ($\mathbb{R}_{[0,1]}$) are uncountable, using the binary representation.

Solution: We could just write the reals in decimal, but instead we will use this problem as an opportunity to bring out some of the subtleties of the binary representation and introduce the technique of lazy diagonalization.

Suppose the reals in [0, 1] have the same cardinality as $\mathbb{N}$. Then there exists a bijective mapping $f : \mathbb{N} \to \mathbb{R}_{[0,1]}$, i.e. $f(i)$ is the real number (in binary representation) assigned to $i$ under this mapping.

First, a bit of notation. We say that $x_i$ is the $i$th digit of the number $x$.

Where normal diagonalization fails: Define a real number in [0, 1], represented in binary, $X = 0.c_0 c_1 c_2 \ldots$, where $c_i = 1 - f(i)_i$. By construction, the bit string $0.c_0 c_1 c_2 \ldots$ differs from the bit string of $f(i)$ in position $i$, so it is not the bit string assigned to any $i \in \mathbb{N}$. The catch is that the same number X may have a different representation. For example, 0.0100000... does not have the same bits as 0.00111111..., but the two numbers are actually the same. So X could still equal some $f(i)$ written with other bits, and our argument doesn’t actually quite work!
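The coincidence of the two expansions in the example can be checked with exact arithmetic. The following is a minimal sketch, not part of the original handout: the helper names `binary_value` and `gap_to_quarter` are hypothetical, and the infinite expansion 0.00111... is approximated by truncating it after n ones.

```python
from fractions import Fraction


def binary_value(bits):
    """Exact value of the finite binary expansion 0.b1 b2 b3 ... (hypothetical helper)."""
    return sum(Fraction(b, 2 ** (k + 1)) for k, b in enumerate(bits))


def gap_to_quarter(n):
    """Difference between 0.01 and 0.00 followed by n ones (a truncation of 0.00111...)."""
    terminating = binary_value([0, 1])          # 0.01 in binary, i.e. 1/4
    truncated = binary_value([0, 0] + [1] * n)  # 0.0011...1 with n ones
    return terminating - truncated


if __name__ == "__main__":
    for n in (1, 5, 20):
        # The gap is 2^-(n+2), so it shrinks to 0 as more ones are appended.
        print(n, gap_to_quarter(n))
```

The gap halves with every extra 1, which is why the infinite expansion 0.00111... names the same real number as 0.01.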
Fixing it with lazy diagonalization: We define $X = 0.c_0 c_1 c_2 \ldots$, where $c_{3i} = 1 - f(i)_{3i}$, $c_{3i+1} = 0$ and $c_{3i+2} = 1$ for $i \in \mathbb{N}$. This time, X differs from $f(i)$ in position $3i$ for each $i \in \mathbb{N}$. This time though, we don’t have the same glitch: because $c_{3i+1}$ and $c_{3i+2}$ always differ, the expansion of X never ends in all 0s or all 1s, so X has only one binary representation. Thus, since $X \in [0, 1]$, we have arrived at a contradiction and can conclude that $|\mathbb{R}_{[0,1]}| \neq |\mathbb{N}|$.

Note to the lazy prover: we could theoretically solve this problem in base 10 by avoiding using 9s and 0s (say, using 5s and 6s). The idea here is more general though.

Cantor Set

Consider the following set. Start with the interval [0, 1]. Then remove the middle third, so you’re left with

$$\left[0, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, 1\right]$$

Now from each of the two remaining intervals, remove the middle third again, leaving

$$\left[0, \tfrac{1}{9}\right] \cup \left[\tfrac{2}{9}, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, \tfrac{7}{9}\right] \cup \left[\tfrac{8}{9}, 1\right]$$

Continue this process infinitely. The numbers that remain make up the Cantor Set.

(a) What is the “length” of the Cantor Set? For example, the length of $\left[0, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, 1\right]$ is $\left(\tfrac{1}{3} - 0\right) + \left(1 - \tfrac{2}{3}\right) = \tfrac{2}{3}$.

Solution: Informally, we remove $\tfrac{1}{3}$ of the remaining length at each step, meaning that the length is $\left(\tfrac{2}{3}\right)^k$ after $k$ steps, which goes to 0 as $k \to \infty$.

Formally, we can prove this with a summation of a geometric series. We start with a length of 1. Let us count the length removed at each step. At the $i$th time that we remove something, we start with $2^{i-1}$ intervals, each of length $\tfrac{1}{3^{i-1}}$, and we remove $\tfrac{1}{3}$ of each of them, so in total we remove

$$\sum_{i=1}^{\infty} 2^{i-1} \cdot \frac{1}{3^{i}} = \frac{1}{3} \sum_{i=0}^{\infty} \left(\frac{2}{3}\right)^{i} = \frac{1}{3} \cdot \frac{1}{1 - \frac{2}{3}} = 1$$

so the “length” remaining is 0.

(b) What is the cardinality of the Cantor set? (Hint: think of the numbers in base-3.)

Solution: If we look at the numbers in base-3, we notice that at the $i$th step we remove the numbers whose $i$th digit must be 1. An endpoint with two representations, such as $\tfrac{1}{3} = 0.1 = 0.0222\ldots$ in base 3, is kept, because it has a representation that avoids the digit 1. Then the numbers that remain are those that can be represented with only 0s or 2s in base 3.
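These are uncountable, since we can diagonalize: for any proposed bijection with $\mathbb{N}$, let $\mathrm{CONFUSE}_i$ be 0 if the $i$th digit of the $i$th number is 2, and 2 otherwise. Then CONFUSE differs from every number in the list, but since it has only 0s and 2s it is an element of the Cantor set and should appear in the list, so no bijection works.

Instead of diagonalizing, we can also compare the Cantor set with [0, 1] directly: take the base-3 representation (using only 0s and 2s) of an element of the Cantor set and change all the 2s to 1s, reading the result as the binary representation of a real number in [0, 1]. Every number in [0, 1] has a binary representation, so this map is onto [0, 1]. It is not quite one-to-one (for example, $\tfrac{1}{3} = 0.0222\ldots$ and $\tfrac{2}{3} = 0.2$ in base 3 both map to $\tfrac{1}{2}$), but a map onto [0, 1] already shows that the Cantor set is at least as large as [0, 1]. Since the Cantor set is also a subset of [0, 1], the two sets have the same cardinality.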
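The interval construction and the length computation from part (a) can be reproduced with exact fractions. This is a small sketch under stated assumptions, not part of the original solutions: the function names `cantor_intervals` and `total_length` are hypothetical, and only finitely many removal steps are simulated.

```python
from fractions import Fraction


def cantor_intervals(k):
    """Closed intervals left after k rounds of removing middle thirds from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        next_intervals = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            next_intervals.append((lo, lo + third))   # keep the left third
            next_intervals.append((hi - third, hi))   # keep the right third
        intervals = next_intervals
    return intervals


def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(hi - lo for lo, hi in intervals)


if __name__ == "__main__":
    for k in range(5):
        # After k steps there are 2^k intervals with total length (2/3)^k.
        print(k, len(cantor_intervals(k)), total_length(cantor_intervals(k)))
```

For k = 2 the function returns the four intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1] from the problem statement, and the total length follows the (2/3)^k pattern used in the informal argument.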