The Noisy Power Method
Moritz Hardt (IBM), Eric Price (IBM → UT Austin)
2014-10-31

Problem

Common problem: find a low-rank approximation to a matrix A.
- PCA: apply to the covariance matrix.
- Spectral analysis: PageRank, Cheeger's inequality for cuts, etc.

Simple algorithm: the power method

AKA subspace iteration, subspace power iteration.
- Choose a random X_0 ∈ R^{n×k}.
- Repeat:

    Y_{t+1} = A X_t        (noisy variant: Y_{t+1} = A X_t + G)
    X_{t+1} = orthonormalize(Y_{t+1})

- Converges towards U, the space spanned by the top k eigenvectors.
- Question 1: how quickly?
  - [Stewart '69, ..., Halko-Martinsson-Tropp '10]
- Question 2: how robust to noise?
  - Application-specific bounds: [Hardt-Roth '13, Mitliagkas-Caramanis-Jain '13, Jain-Netrapalli-Sanghavi '13]

Basic power method, k = 1

- Choose a random unit vector x ∈ R^n.
- Output A^q x, renormalized to a unit vector.
- Equivalently: x_{t+1} = A x_t / ||A x_t|| for t = 0, ..., q − 1.
- Suppose A has eigenvectors v_1, ..., v_n and eigenvalues λ_1 > λ_2 ≥ ··· ≥ λ_n ≥ 0.
- Start with x_0 = α_1 v_1 + α_2 v_2 + ··· + α_n v_n.
- After q iterations,

    A^q x_0 = Σ_i λ_i^q α_i v_i ∝ v_1 + Σ_{i≥2} (λ_i/λ_1)^q (α_i/α_1) v_i.

- For q ≥ log_{λ_1/λ_2}(d/(ε α_1)), A^q x is proportional to v_1 ± O(ε).
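As a sanity check, the k = 1 iteration can be sketched in a few lines of numpy (illustrative only; the matrix, the gap λ_1 = 1.0 vs λ_2 = 0.5, and the iteration count are arbitrary choices, not parameters from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetric matrix with a known spectrum: lambda_1 = 1.0, lambda_2 = 0.5.
n = 100
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random eigenbasis
eigs = np.concatenate(([1.0, 0.5], np.linspace(0.4, 0.0, n - 2)))
A = Q @ np.diag(eigs) @ Q.T
v1 = Q[:, 0]                                       # top eigenvector

# Power method: x_{t+1} = A x_t / ||A x_t||.
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
for _ in range(60):
    x = A @ x
    x /= np.linalg.norm(x)

# Up to sign, x is now aligned with v1.
err = min(np.linalg.norm(x - v1), np.linalg.norm(x + v1))
```

With a (λ_2/λ_1) = 0.5 contraction per step, 60 iterations drive the error to machine precision.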
- That is, q = O((λ_1/(λ_1 − λ_2)) · log(n/ε)) iterations suffice.

Handling noise, k = 1

Consider the iteration

    y_{t+1} = A x_t + G
    x_{t+1} = y_{t+1} / ||y_{t+1}||

What conditions on G will cause this to converge to within ε?
- G must make progress at the beginning.
- G must not perturb by ε at the end.
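A numpy sketch of this noisy iteration (the i.i.d. Gaussian noise model and the scale sigma are assumptions for illustration; the point is that the error stalls around ||G||/(λ_1 − λ_2) rather than going to zero):

```python
import numpy as np

rng = np.random.default_rng(1)

d = 200
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigs = np.concatenate(([1.0, 0.5], np.linspace(0.4, 0.0, d - 2)))
A = Q @ np.diag(eigs) @ Q.T                        # gap lambda_1 - lambda_2 = 0.5
v1 = Q[:, 0]

sigma = 1e-4                                       # per-entry noise scale, well below the gap
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
for _ in range(100):
    y = A @ x + sigma * rng.standard_normal(d)     # fresh noise G at every step
    x = y / np.linalg.norm(y)

# Error stalls near ||G|| / (lambda_1 - lambda_2) ~ sigma * sqrt(d) / 0.5.
err = min(np.linalg.norm(x - v1), np.linalg.norm(x + v1))
```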
Writing G_1 = ⟨v_1, G⟩, the two conditions are

    |G_1| ≤ (λ_1 − λ_2) / √d        ||G|| ≤ ε (λ_1 − λ_2)

- Looser requirements hold in the middle.

Theorem: the iteration converges to v_1 ± O(ε) in O((λ_1/(λ_1 − λ_2)) log(d/ε)) iterations if every G satisfies both conditions above.

Noisy convergence proof (k = 1)

Use a potential-based argument to show progress at each step.
- Potential:

    tan θ = sqrt(Σ_{j>1} α_j²) / α_1

- With no noise:

    tan θ_{t+1} = sqrt(Σ_{j>1} λ_j² α_j²) / (λ_1 α_1) ≤ (λ_2/λ_1) tan θ_t

- With noise G satisfying the conditions (|G_1|, ||G|| small enough):

    tan θ_{t+1} ≤ (λ_2 sqrt(Σ_{j>1} α_j²) + ||G||) / (λ_1 α_1 − |G_1|) ≤ ε + (λ_2/λ_1)^{1/4} tan θ_t

Noisy convergence proof (general k)

Use the "principal angle" θ from X to U: let U ∈ R^{d×k} hold the top k eigenvectors and V = U^⊥.
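The noiseless contraction tan θ_{t+1} ≤ (λ_2/λ_1) tan θ_t can be observed directly (numpy sketch; the spectrum with λ_2/λ_1 = 0.6 is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)

n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate(([1.0, 0.6], np.linspace(0.5, 0.0, n - 2)))
A = Q @ np.diag(eigs) @ Q.T
v1 = Q[:, 0]

def tan_theta(x):
    # Potential: tan of the angle between x and v1.
    c = abs(v1 @ x) / np.linalg.norm(x)
    return np.sqrt(max(1.0 - c * c, 0.0)) / c

x = rng.standard_normal(n)
x /= np.linalg.norm(x)
ratios = []
for _ in range(20):
    before = tan_theta(x)
    x = A @ x
    x /= np.linalg.norm(x)
    ratios.append(tan_theta(x) / before)

# Every noiseless step contracts the potential by at least lambda_2/lambda_1 = 0.6.
worst = max(ratios)
```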
    tan θ := ||V^T X|| / ||U^T X|| = sqrt( Σ_{j>k} α_j² / Σ_{j≤k} α_j² )

- With no noise:

    tan θ_{t+1} = sqrt( Σ_{j>k} λ_j² α_j² / Σ_{j≤k} λ_j² α_j² ) ≤ (λ_{k+1}/λ_k) tan θ_t

- With noise G "small enough" we will have

    tan θ_{t+1} ≤ (λ_{k+1} ||V^T X|| + ||G||) / (λ_k ||U^T X|| − ||U^T G||) ≤ ε + (λ_{k+1}/λ_k)^{1/4} tan θ_t

Noisy power method lemma

Theorem. Consider running the noisy power method on a random starting space X_0 ∈ R^{d×k}. Let U ∈ R^{d×k} have the top k eigenvectors of A. If at each iteration

    5 ||G|| ≤ ε (λ_k − λ_{k+1})        5 ||U^T G|| ≤ (λ_k − λ_{k+1}) / √(kd)

then after L = O((λ_k/(λ_k − λ_{k+1})) log(d/ε)) iterations,

    tan Θ(X_L, U) ≲ ε   ⟺   ||(I − X_L X_L^T) U|| ≲ ε.

- Can also iterate on a p > k dimensional subspace; the second condition becomes

    5 ||U^T G|| ≤ (λ_k − λ_{k+1}) (√p − √(k−1)) / √d

  - The kth singular value of a random X is typically (√p − √(k−1)) / √d.
  - For p = 2k this reads 5 ||U^T G|| ≤ (λ_k − λ_{k+1}) √(k/d).
- If G is fairly uniform, expect ||U^T G|| ≈ ||G|| √(k/d).
  - The first condition is the main one: the iteration will converge to ε ≈ ||G|| / (λ_k − λ_{k+1}).

Conjectures to remove eigengap

If λ_k = λ_{k+1}, our theorem is useless.

Conjecture (can depend on the λ_k − λ_{2k+1} eigengap). Consider running the noisy power method on a random starting space X_0 ∈ R^{d×2k}. Let U ∈ R^{d×k} have the top k eigenvectors of A.
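A block (general-k) version of the noisy iteration in the lemma's p = 2k setting (numpy sketch; the spectrum, noise scale, and iteration count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

d, k = 100, 3
p = 2 * k                                          # iterate on a p = 2k dimensional subspace
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigs = np.concatenate(([1.0, 0.95, 0.9], 0.4 * rng.random(d - k)))
A = Q @ np.diag(eigs) @ Q.T                        # gap lambda_k - lambda_{k+1} >= 0.5
U = Q[:, :k]                                       # top-k eigenspace

sigma = 1e-5
X, _ = np.linalg.qr(rng.standard_normal((d, p)))
for _ in range(200):
    Y = A @ X + sigma * rng.standard_normal((d, p))   # noisy multiply
    X, _ = np.linalg.qr(Y)                            # orthonormalize

# Success measure from the lemma: ||(I - X X^T) U|| should be small.
resid = np.linalg.norm(U - X @ (X.T @ U), ord=2)
```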
If at each iteration

    5 ||G|| ≤ ε (λ_k − λ_{2k+1})        5 ||U^T G|| ≤ (λ_k − λ_{2k+1}) √(k/d)

then after L = O((λ_k/(λ_k − λ_{2k+1})) log(d/ε)) iterations,

    tan Θ(X_L, U) ≲ ε   ⟺   ||(I − X_L X_L^T) U|| ≲ ε.

- If X is n × p, maybe the relevant eigengap is λ_k − λ_{p+1}?
- But do we need any eigengap at all?

Do we need any eigengap at all?
- Yes, for X to approximate U:  ||(I − X X^T) U|| ≤ ε.
- Not clear, for X to approximate A:  ||(I − X X^T) A|| ≤ λ_{k+1} + ε.
  - This is weaker: it doesn't imply Frobenius approximation.

Conjecture. Consider running the noisy power method on a random starting space X_0 ∈ R^{d×2k}. Let U ∈ R^{d×k} have the top k eigenvectors of A. If at each iteration

    ||G|| ≤ ε        ||U^T G|| ≤ ε √(k/d)

then after L = O((λ_{k+1}/ε) log(d/ε)) iterations,

    ||(I − X_L X_L^T) A|| ≤ λ_{k+1} + O(ε).

Review of our theorem

Theorem. Consider running the noisy power method on a random starting space X_0 ∈ R^{d×2k}. Let U ∈ R^{d×k} have the top k eigenvectors of A. If at each iteration

    5 ||G|| ≤ ε (λ_k − λ_{k+1})        5 ||U^T G|| ≤ (λ_k − λ_{k+1}) √(k/d)

then after L = O((λ_k/(λ_k − λ_{k+1})) log(d/ε)) iterations,

    tan Θ(X_L, U) ≲ ε   ⟺   ||(I − X_L X_L^T) U|| ≲ ε.

- Gaussian G: if G_{i,j} ∼ N(0, σ²) then ||G|| ≲ √d · σ and ||U^T G|| ≲ √k · σ with high probability. Hence σ = ε (λ_k − λ_{k+1}) / √d is tolerable.

Outline

1. Applications

Applications of the Noisy Power Method

Will discuss two applications of our theorem:
- Privacy-preserving spectral analysis [Hardt-Roth '13]
- Streaming PCA [Mitliagkas-Caramanis-Jain '13]
In both cases, we get an improved bound.

Privacy-preserving spectral analysis

Can we find differentially private approximations to the top eigenvectors?
- Think of A as related to the adjacency matrix of a graph (web links, social network, etc.).
  - Top eigenvectors are useful to study and reveal (e.g. PageRank, Cheeger cuts).
  - Don't want to reveal whether x and y are friends.
- A randomized algorithm f is (ε, δ)-differentially private if: for any A, A' with ||A − A'|| ≤ 1, and for any subset S of the range,

    Pr[f(A) ∈ S] ≤ e^ε Pr[f(A') ∈ S] + δ.

- Typical dependence is poly((1/ε) log(1/δ)).

Hardt-Roth '12: the noisy power method X → AX + G preserves privacy if G_{i,j} ∼ N(0, σ²) for large enough σ at each stage.
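The privacy mechanism adds i.i.d. Gaussian noise, and the theorem consumes it through ||G|| and ||U^T G||; the √d·σ and √k·σ scalings can be checked numerically (illustrative sketch with an arbitrary subspace U):

```python
import numpy as np

rng = np.random.default_rng(4)

d, k, sigma = 400, 5, 1.0
U, _ = np.linalg.qr(rng.standard_normal((d, k)))   # an arbitrary k-dimensional subspace
G = sigma * rng.standard_normal((d, k))            # i.i.d. Gaussian noise, one power-method step

norm_G = np.linalg.norm(G, ord=2)                  # concentrates near sqrt(d) * sigma
norm_UG = np.linalg.norm(U.T @ G, ord=2)           # concentrates near sqrt(k) * sigma
```

By rotation invariance U^T G is a k × k Gaussian matrix, which is why its spectral norm scales with √k rather than √d.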
- Apply our theorem to see how well the result approximates U.
- Complicated expression, but for example:
  - Suppose A represents a random graph with a planted sparse cut.
  - Then the (ε, δ)-differentially private result is very close to U.
  - [Hardt-Roth: k = 1 case.]
- Added bonus: the algorithm exploits the sparsity of A.

Streaming PCA

- Can take samples x_1, x_2, ... ∼ D in R^d.
- Want to estimate the covariance matrix Σ = E[x x^T].
- Easy answer is the empirical covariance:

    Σ̂ = (1/n) Σ_{i=1}^n x_i x_i^T.

- But that uses d² space. Can we use O(dk) space if Σ is nearly low rank?
- [Mitliagkas-Caramanis-Jain '13] Yes, using more samples.
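The key space-saving step: Σ̂ X can be accumulated sample by sample in O(dk) memory, without ever materializing the d × d matrix. A numpy sketch (sizes are arbitrary; standard normal samples stand in for the distribution D):

```python
import numpy as np

rng = np.random.default_rng(5)

d, k, n = 1000, 4, 500
X, _ = np.linalg.qr(rng.standard_normal((d, k)))
samples = rng.standard_normal((n, d))              # stand-in for x_1, ..., x_n ~ D

# Streaming accumulation of (1/n) * sum_i x_i (x_i^T X): only O(dk) state.
Y = np.zeros((d, k))
for xi in samples:
    Y += np.outer(xi, xi @ X)
Y /= n

# Agrees with forming the d x d empirical covariance explicitly (d^2 space).
Sigma_hat = samples.T @ samples / n
max_diff = np.max(np.abs(Y - Sigma_hat @ X))
```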
- Can do one iteration of the power method in small space:
    X_{t+1} = Σ̂ X_t = (1/n) ∑_{i=1}^n x_i (x_iᵀ X_t)

Streaming PCA
- The algorithm: in every iteration, take a batch of samples to move in the correct direction.
- How many iterations, and how many samples per iteration?
- This is then a noisy power method problem:
    X_{t+1} = Σ X_t + (Σ̂ − Σ) X_t
- Just need to bound the norm of (Σ̂ − Σ)X in each iteration.
- Nice case: the spiked covariance model.
  - Gaussian, where Σ has k eigenvalues λ_1, …, λ_k = Θ(1), perturbed by Gaussian noise N(0, σ²) in each coordinate.
  - Õ((1 + σ)⁶ dk / ε²) samples suffice.
  - A factor-k improvement on [Mitliagkas-Caramanis-Jain '13].
- But it also applies to less nice cases:
  - Strong bounds whenever D has exponential concentration.
  - A nontrivial result for general distributions.
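The small-space iteration can be sketched one sample at a time. A hypothetical numpy illustration under the spiked covariance model (dimensions, noise level, and batch size are arbitrary choices):

```python
import numpy as np

def streaming_power_step(sample, X, n):
    """One power step X -> orthonormalize(Sigma_hat X), one sample at a time.

    The d x d matrix Sigma_hat is never formed: each sample contributes
    x (x^T X), so memory stays O(dk). `sample` is a stand-in for the stream.
    """
    Y = np.zeros_like(X)
    for _ in range(n):
        x = sample()
        Y += np.outer(x, x @ X) / n        # accumulate x (x^T X), an O(dk) update
    Q, _ = np.linalg.qr(Y)                 # orthonormalize
    return Q

# Spiked covariance: x = A z + sigma * g, so Sigma = A A^T + sigma^2 I.
rng = np.random.default_rng(1)
d, k, sigma = 40, 2, 0.1
A, _ = np.linalg.qr(rng.standard_normal((d, k)))      # planted top-k directions
sample = lambda: A @ rng.standard_normal(k) + sigma * rng.standard_normal(d)

X, _ = np.linalg.qr(rng.standard_normal((d, k)))
for _ in range(10):                                    # a few noisy power steps
    X = streaming_power_step(sample, X, n=5000)
align = np.linalg.svd(A.T @ X, compute_uv=False)[-1]   # near 1 once recovered
```

Here (Σ̂ − Σ)X is exactly the per-step noise term, and its norm shrinks as the per-iteration batch size n grows.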
Recap and open questions
- The noisy power method is a useful tool.
- Can show tan θ(X_L, U) ≤ ε if
    5‖G‖ ≤ ε(λ_k − λ_{k+1})   and   5‖UᵀG‖ ≤ (λ_k − λ_{k+1}) √(k/d)
- Can we apply it to more problems?
- Can we prove a theorem without the eigengap?
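As an empirical sanity check on this bound, one can rescale the noise so that 5‖G‖ = ε(λ_k − λ_{k+1}) and watch tan θ fall below ε. A sketch (matrix, spectrum, and iteration count are arbitrary test choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 2
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.array([5.0, 4.0] + [1.0] * (n - 2))
A = Q @ np.diag(lam) @ Q.T                       # eigengap lambda_2 - lambda_3 = 3
U = Q[:, :k]                                     # true top-k eigenspace

eps = 0.1
noise_norm = eps * (lam[1] - lam[2]) / 5         # so that 5 ||G|| = eps * gap
X, _ = np.linalg.qr(rng.standard_normal((n, k)))
for _ in range(40):
    G = rng.standard_normal((n, k))
    G *= noise_norm / np.linalg.norm(G, 2)       # rescale to spectral norm ||G||
    X, _ = np.linalg.qr(A @ X + G)               # noisy power step

c = np.linalg.svd(U.T @ X, compute_uv=False)[-1]  # cos of largest principal angle
c = min(c, 1.0)
tan_theta = np.sqrt(1 - c**2) / c
```

The rescaled G also satisfies the second condition here, since ‖UᵀG‖ ≤ ‖G‖ = 0.06 while (λ_k − λ_{k+1})√(k/d) ≈ 0.77, so `tan_theta` settles below ε.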