Introduction to Kernel PCA and Near-optimal Sparse L1-PCA

Motivation
• We need to capture "non-linear" patterns in data, but principal component analysis (PCA) is linear.
• Idea: use a feature mapping $\phi$ from the low-dimensional space to a high-dimensional space, where the data becomes linearly separable.

To answer this, let's revisit PCA...
• $X = [x_1, \dots, x_N] \in \mathbb{R}^{D \times N}$: data matrix
• Assume zero-centered data: $\sum_{n=1}^{N} x_n = 0$
• Calculate the covariance matrix: $C = \frac{1}{N} X X^\top$
• Eigenvector calculation: $C v_i = \lambda_i v_i$ ... (1)
• Low-dimensional projection: $y_n = V^\top x_n$, where $V = [v_1, \dots, v_d]$

• Two lesser-known facts (checked numerically in the sketch after this outline):
  1. The projected data are de-correlated in the new basis: $\frac{1}{N} Y Y^\top = V^\top C V$ is a diagonal sub-matrix of eigenvalues.
  2. Every eigenvector can be written exactly as a linear combination of the data vectors: $v_i = \frac{1}{N \lambda_i} \sum_n (x_n^\top v_i)\, x_n = \sum_n \alpha_{in} x_n$.

Non-linear transformation
• Replace each $x_n$ by $\phi(x_n)$ and run PCA in feature space: $C_\phi = \frac{1}{N} \sum_n \phi(x_n)\phi(x_n)^\top$, with $C_\phi v = \lambda v$.
• Using (1) and PC property #2, every feature-space eigenvector has the form $v = \sum_n \alpha_n \phi(x_n)$.
• Define the kernel function $k(x_i, x_j) = \phi(x_i)^\top \phi(x_j)$. After a few simplifications, the eigen-problem becomes $K \alpha = N \lambda\, \alpha$, where $K_{ij} = k(x_i, x_j)$ (a sketch follows this outline).

Focus: the kernel function
• Q. Do we need the individual $\phi(x_n)$? A. No — we need only the projections $\phi(x_i)^\top \phi(x_j)$, so the "mysterious" $\phi$ is never computed explicitly.
• Given two points $x$ and $z$, we need $\phi(x)^\top \phi(z)$. Let $k(x, z) = (x^\top z)^2$; for $x, z \in \mathbb{R}^2$ this equals $\phi(x)^\top \phi(z)$ with $\phi(x) = (x_1^2,\ \sqrt{2}\, x_1 x_2,\ x_2^2)$.
• A kernel of the form $k(x, z) = (x^\top z)^d$ corresponds to an inner product in a higher-dimensional space.
• Computation happens entirely in $x$-space: we design the kernel, never the mapping.

[Figure: four panels on two-class data — original data (x, y, z axes), standard PCA, polynomial kernel of order 5, and radial (RBF) kernel projections.]

Dual benefit: L1-SPCA
• $L_2$-SPCA (sparse PCA) offers interpretability: a direction that not only maximizes data variance but also has only a small number of non-zero components.
• $L_1$-PCA offers robustness against outliers.
• Sparsity enhances interpretability and the $L_1$-norm enhances robustness. Let's have the dual benefit of robustness and interpretability simultaneously: $L_1$-SPCA — interpretable and robust against outliers (heuristic sketches at the end of this outline).
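The PCA recap above maps directly onto a few lines of NumPy. The sketch below is illustrative: the random data and the sizes D, N, d are my choices, not from the slides. It also verifies both lesser-known facts numerically.

```python
# Minimal PCA sketch following the slide notation (X, C, V, lambda).
import numpy as np

rng = np.random.default_rng(0)
D, N, d = 5, 200, 2                        # ambient dim., samples, target dim.
X = rng.standard_normal((D, N))            # data matrix, columns are samples

X = X - X.mean(axis=1, keepdims=True)      # zero-center the data
C = (X @ X.T) / N                          # covariance matrix C = (1/N) X X^T
lam, V = np.linalg.eigh(C)                 # eigen-decomposition (ascending)
lam, V = lam[::-1], V[:, ::-1]             # reorder by decreasing eigenvalue
Y = V[:, :d].T @ X                         # low-dimensional projection

# Fact 1: the covariance of the projected data is diagonal.
cov_Y = (Y @ Y.T) / N
assert np.allclose(cov_Y, np.diag(np.diag(cov_Y)), atol=1e-10)

# Fact 2: the top eigenvector is a linear combination of the data
# vectors, v = X @ alpha with alpha_n = (x_n^T v) / (N * lambda),
# since v = (1/lambda) C v = (1/(N*lambda)) X X^T v.
v1 = V[:, 0]
alpha = (X.T @ v1) / (N * lam[0])
assert np.allclose(X @ alpha, v1, atol=1e-10)
```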
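The derivation above reduces feature-space PCA to the eigen-problem $K\alpha = N\lambda\,\alpha$ on a kernel matrix centered in feature space. Below is a minimal kernel-PCA sketch; the choice of an RBF kernel, its width `gamma`, and the concentric-rings test data are illustrative assumptions (the slides fix neither a kernel nor its parameters).

```python
import numpy as np

def rbf_kernel(X, gamma):
    """K_ij = exp(-gamma * ||x_i - x_j||^2); columns of X are samples."""
    sq = np.sum(X**2, axis=0)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X.T @ X))

def kernel_pca(X, d=2, gamma=0.5):
    N = X.shape[1]
    K = rbf_kernel(X, gamma)
    one = np.full((N, N), 1.0 / N)
    Kc = K - one @ K - K @ one + one @ K @ one   # center K in feature space
    mu, A = np.linalg.eigh(Kc)                   # Kc @ alpha = (N*lambda) alpha
    mu, A = mu[::-1][:d], A[:, ::-1][:, :d]      # keep the top-d eigenpairs
    A = A / np.sqrt(mu)                          # unit-norm feature-space directions
    return (Kc @ A).T                            # d x N projected samples

# Usage on two concentric rings, the kind of data shown in the figure;
# for a suitable gamma the rings separate along the top components.
t = np.linspace(0.0, 2.0 * np.pi, 100)
ring = np.vstack([np.cos(t), np.sin(t)])
X = np.hstack([ring, 3.0 * ring])                # class 1: inner, class 2: outer
Y = kernel_pca(X, d=2, gamma=0.5)
```

Note that the projections are computed from kernel values alone: $y_i = v^\top \phi(x_i) = \sum_n \alpha_n\, k(x_n, x_i)$, which is exactly the $K_c A$ product above; $\phi$ itself never appears.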
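The polynomial-kernel identity on the "Focus" slide is easy to check numerically: for $x, z \in \mathbb{R}^2$, evaluating $(x^\top z)^2$ in $x$-space agrees with the explicit inner product in $\phi$-space. The two test points are arbitrary.

```python
import numpy as np

def phi(x):
    """Explicit feature map for k(x, z) = (x^T z)^2 on R^2."""
    return np.array([x[0]**2, np.sqrt(2.0) * x[0] * x[1], x[1]**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
assert np.isclose((x @ z)**2, phi(x) @ phi(z))   # both sides equal 1.0
```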
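The slides state the robustness property of $L_1$-PCA without giving an algorithm. As one concrete instance, the sketch below implements the standard fixed-point heuristic in the spirit of Kwak's PCA-L1, which maximizes $\|X^\top v\|_1$ instead of $\|X^\top v\|_2$; it is not the near-optimal algorithm named in the title.

```python
import numpy as np

def l1_pca_direction(X, iters=200, seed=0):
    """Fixed-point heuristic for max_v ||X^T v||_1; columns of X are samples."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(X.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        s = np.sign(X.T @ v)            # the polarity each sample contributes
        s[s == 0] = 1.0                 # tie-breaking convention
        v_new = X @ s                   # best direction for fixed polarities
        v_new /= np.linalg.norm(v_new)
        if np.allclose(v_new, v):       # reached a fixed point
            break
        v = v_new
    return v
```

The robustness comes from the objective: $\|X^\top v\|_1$ grows only linearly with an outlier's magnitude, whereas the usual $L_2$ objective grows quadratically and lets a single outlier dominate the principal direction.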
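For the dual benefit sought on the last slide, one toy way to combine the two ideas is to hard-threshold the $L_1$-PCA direction to its $k$ largest-magnitude entries after every update. Both the sparsity level $k$ and this truncation scheme are my illustrative assumptions; this is not the near-optimal $L_1$-SPCA algorithm the slides refer to.

```python
import numpy as np

def l1_spca_direction(X, k, iters=200, seed=0):
    """Toy sparse+robust direction: L1-PCA update followed by keeping
    only the k largest-magnitude entries (hard thresholding)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(X.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        s = np.sign(X.T @ v)
        s[s == 0] = 1.0
        v_new = X @ s                               # robust (L1) update
        keep = np.argsort(np.abs(v_new))[-k:]       # indices of k largest entries
        sparse = np.zeros_like(v_new)
        sparse[keep] = v_new[keep]                  # enforce sparsity
        v = sparse / np.linalg.norm(sparse)
    return v    # a direction that is both sparse and outlier-robust
```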