Random Moments for Sketched Mixture Learning
Nicolas Keriven (1,2), Rémi Gribonval (2), Gilles Blanchard (3), Yann Traonmilin (2)
(1) Université Rennes 1  (2) Inria Rennes Bretagne-Atlantique  (3) University of Potsdam
SPARS 2017, 07/06/2017

Outline
• Introduction
• Illustration (previous work)
• Main results
• Conclusion

Statistical Learning
• Hypothesis: a database of n vectors in dimension d, drawn i.i.d. from some distribution.
• A loss function measures the quality of a hypothesis.
• Goal: minimize the expected risk.
• Examples: PCA, classification, regression, k-means, density estimation.
• Classical approach: Empirical Risk Minimization (ERM).
• For large d or large n, ERM becomes costly: compress the database before learning.

Compressive Statistical Learning: large d
• Dimensionality reduction, see e.g. [Calderbank 2009, Boutsidis 2010]:
  - random projection
  - feature selection
• Result: a database of compressed vectors.
Compressive Statistical Learning: large n
• Subsampling and coresets, see e.g. [Feldman 2010]:
  - uniform sampling (naive)
  - adaptive, weighted sampling
  - hierarchical construction
• Result: a reduced database.

Compressive Statistical Learning: linear sketch
• Linear sketch, see [Thaper 2002, Cormode 2011]:
  - sketch of a union of databases = sum of their sketches
  - extremely convenient for streaming / parallel computing
• Sketches are classically used for simple queries. Can we do learning?

Random Sketching Operator
• A linear sketch is a set of empirical generalized moments of the data.
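The two properties above, sketches as empirical generalized moments and linearity under database unions, can be illustrated with plain random Fourier moments. This is a minimal sketch of the idea; the function name and the Gaussian choice of frequencies are illustrative assumptions, not the toolbox's API:

```python
import numpy as np

def sketch(X, W):
    """Empirical sketch of a database X (n x d): the average of the
    random Fourier moments exp(i * w_j . x) over the n samples."""
    return np.exp(1j * X @ W.T).mean(axis=0)  # shape (m,)

rng = np.random.default_rng(0)
d, m = 5, 20
W = rng.normal(size=(m, d))        # m random frequencies (illustrative choice)

X1 = rng.normal(size=(100, d))     # two chunks of a data stream
X2 = rng.normal(size=(300, d))

# Linearity: the sketch of the union of two databases is the size-weighted
# average of their sketches, so chunks can be sketched independently
# (streaming / distributed computation) and merged afterwards.
s_union = sketch(np.vstack([X1, X2]), W)
s_merged = (100 * sketch(X1, W) + 300 * sketch(X2, W)) / 400
print(np.allclose(s_union, s_merged))  # True
```

Note that the sketch has a fixed size m: merging more chunks never grows it, which is what makes the representation attractive for large n.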
• Equivalently: linear measurements of the underlying probability distribution.
• Reminiscent of Compressive Sensing: hence a random design of the sketching operator.

Illustration (previous work)

Experimental illustration
• Compressive Learning-OMP (CL-OMP) algorithm [Keriven 2015, 2016]: OMP combined with non-convex updates.
• k-means (d = 10, k = 10), compared with Matlab's kmeans and VLFeat's gmm:
  - faster and more memory-efficient on large databases
  - the number of measurements does not depend on n
• Similar experiments for GMMs (d = 10, k = 10).

Main results

Statistical learning, revisited
• Hypothesis: a database of n i.i.d. vectors, a loss function; goal: minimize the expected risk.
• Here: k-means, and GMM with known covariance.

k-means
• Hypothesis class: sets of k centroids, with
  - a minimal separation between centroids
  - a bounded domain (for the centroids, not the samples)
• Loss function: the k-means cost.
• Sketching operator: (weighted) random Fourier sampling, with "smoothing" weights.

k-means: result
• For (weighted) random Fourier sampling: if the sketch size is large enough (polynomial in k and d), then with high probability on the draw of the frequencies, the risk of the centroids estimated from the sketch is controlled.

GMM with known covariance
• Hypothesis class: mixtures of k Gaussians with known covariance, with
  - a minimal separation between the means
  - a bounded domain (for the means, not the samples)
• Loss function: the (negative) log-likelihood.
• Sketching operator: random Fourier sampling, with a frequency distribution linked to the separation.

GMM: result
• For random Fourier sampling: if the sketch size is large enough, then with high probability the risk is controlled, with a trade-off involving the minimal separation.

GMM trade-off
• Trade-off between the separation of the means and the size of the sketch: a smaller separation requires more high frequencies, hence a larger sketch.

Sketch size
• The sketched estimation problem is a non-convex optimization; greedy heuristic: CL-OMP [Keriven 2016].
• k-means (measured by the SSE on k points): the required sketch size is polynomial in k and d in theory, and of the order of k·d empirically.
• GMMs with known covariance: measured by the relative log-likelihood, with similar behavior.
Sketch of proof
• Key idea 1: the sketching operator is built from random features.
• Step 1: relate the risk to a kernel metric, via kernel mean embeddings [Smola 2007] and random features [Rahimi 2007].
• Key idea 2 / Step 2: a Compressive Sensing analysis: the sketching operator satisfies a Restricted Isometry Property (RIP) [Bourrier 2014] on the model set.
• Main difficulty: controlling metrics between mixtures that get arbitrarily close to each other in an infinite-dimensional space; this fails without assumptions, hence the separation hypothesis.

Conclusion

Contributions
• An efficient sketched mixture learning framework, using random generalized moments.
• A combination of many tools:
  - kernel mean embeddings
  - random Fourier features
  - an analysis inspired by Compressive Sensing

Outlooks
• Bridge the gap between theory and practice.
• Other models (already done in practice), with other sketching operators.
• Non-linear sketches? (neural networks…)
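The first step of the proof sketch above rests on the fact that random Fourier features approximate a shift-invariant kernel in expectation [Rahimi 2007]. A minimal numerical check, with Gaussian frequencies (an illustrative choice) and the Gaussian kernel they induce via Bochner's theorem:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 3, 100_000
W = rng.normal(size=(m, d))   # frequencies ~ N(0, I) induce a Gaussian kernel

def features(v):
    """Normalized random Fourier features of a point v."""
    return np.exp(1j * W @ v) / np.sqrt(m)

x, y = rng.normal(size=d), rng.normal(size=d)

# Bochner's theorem: E_w[exp(i w . (x - y))] = exp(-||x - y||^2 / 2) for
# w ~ N(0, I), so the inner product of features concentrates on the kernel.
approx = np.vdot(features(y), features(x)).real
exact = np.exp(-np.linalg.norm(x - y) ** 2 / 2)
print(abs(approx - exact))  # small, O(1/sqrt(m))
```

Since a sketch is the empirical mean of such features, the Euclidean distance between two sketches approximates the kernel maximum mean discrepancy between the two underlying distributions, which is the kernel metric that Step 1 relates to the risk.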
The SketchMLbox
SketchMLbox (sketchml.gforge.inria.fr) currently handles:
• mixtures of Diracs ("k-means")
• GMMs with known covariance
• GMMs with unknown diagonal covariance
• soon:
  - mixtures of multivariate alpha-stable distributions (only known algorithm!)
  - Gaussian Locally Linear Mapping [Deleforge 2014]
• Optimized for user-defined …

Thank you!

References
• Keriven, Bourrier, Gribonval, Perez. Sketching for Large-Scale Learning of Mixture Models. ICASSP 2016.
• Keriven, Bourrier, Gribonval, Perez. Sketching for Large-Scale Learning of Mixture Models (extended version). Submitted to Information and Inference, arXiv:1606.0238.
• Keriven, Tremblay, Gribonval, Traonmilin. Compressive K-means. ICASSP 2017.
• Keriven, Tremblay, Gribonval. SketchMLbox (sketchml.gforge.inria.fr).
• Gribonval, Blanchard, Keriven, Traonmilin. Compressive Statistical Learning. Online soon.

Appendix: CLOMPR
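The appendix refers to CLOMPR; the greedy structure of CL-OMP(R) for mixtures of Diracs can be sketched as follows. This is a much-simplified toy version: it replaces the gradient-based atom search and the final non-convex update of the actual algorithm with a random candidate search, and all names are illustrative.

```python
import numpy as np

def atom(c, W):
    # Sketch of a Dirac at c: (A delta_c)_j = exp(i w_j . c)
    return np.exp(1j * W @ c)

def clomp_diracs(z, W, k, n_cand=2000, rng=None):
    """Greedy (OMP-style) recovery of k centroids from a sketch z."""
    if rng is None:
        rng = np.random.default_rng(0)
    centroids, r = [], z.copy()
    for _ in range(k):
        # Atom search: pick the candidate Dirac most correlated with the
        # residual (random search here; CL-OMP uses a local optimization).
        cands = rng.uniform(-3, 3, size=(n_cand, W.shape[1]))
        scores = np.abs(np.exp(1j * cands @ W.T) @ np.conj(r))
        centroids.append(cands[np.argmax(scores)])
        # Re-fit the mixture weights on all atoms, update the residual.
        A = np.stack([atom(c, W) for c in centroids], axis=1)
        w, *_ = np.linalg.lstsq(A, z, rcond=None)
        r = z - A @ w
    return np.array(centroids)

# Toy usage: recover 2 well-separated cluster centers from a 100-moment sketch.
rng = np.random.default_rng(1)
true_c = np.array([[2.0, 2.0], [-2.0, -2.0]])
X = np.vstack([c + 0.3 * rng.normal(size=(500, 2)) for c in true_c])
W = rng.normal(size=(100, 2))
z = np.exp(1j * X @ W.T).mean(axis=0)
C = clomp_diracs(z, W, k=2)
```

In the real CL-OMP(R), the atom search is a gradient ascent over the sketch objective and a final global non-convex step refines all centroids and weights jointly, which is what makes it practical in higher dimension.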