Gene Selection for Cancer Classification using Support Vector Machines
Jesye

Content
• Feature ranking with correlation coefficients
• SVM Recursive Feature Elimination (SVM RFE)
• Data sets
• Experimental results

Ranking criteria
$w_i = \frac{m_i(+) - m_i(-)}{s_i(+) + s_i(-)}$   (Golub, 1999)
• $m_i(+)$: mean of gene $i$ in class (+)
• $m_i(-)$: mean of gene $i$ in class (-)
• $s_i(+)$: standard deviation of gene $i$ in class (+)
• $s_i(-)$: standard deviation of gene $i$ in class (-)

Ranking criteria
• (Golub, 1999): rank by $w_i$, selecting an equal number of genes from the (+) and (-) classes
• (Furey, 2000): rank by the absolute value $|w_i|$
• (Pavlidis, 2000): rank by $\frac{(m_i(+) - m_i(-))^2}{s_i(+)^2 + s_i(-)^2}$, similar to Fisher's discriminant criterion
• This paper: rank by $w_i^2$
(A code sketch of this criterion and its variants follows the last slide.)

Recursive Feature Elimination
• 1) Train the classifier.
• 2) Compute the ranking criterion for all features.
• 3) Remove the feature with the smallest ranking criterion.

SVM RFE
Inputs:
• Training examples $X_0 = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k, \ldots, \mathbf{x}_l]$
• Class labels $\mathbf{y} = [y_1, y_2, \ldots, y_k, \ldots, y_l]$
Output:
• Feature ranked list $\mathbf{r}$

SVM RFE
Initialize:
• Surviving features $\mathbf{s} = [1, 2, \ldots, n]$
• Feature ranked list $\mathbf{r} = [\,]$

SVM RFE
Repeat until $\mathbf{s} = [\,]$:
• Restrict the training examples to the surviving features: $X = X_0(:, \mathbf{s})$
• Train the SVM: $\boldsymbol{\alpha} = \text{SVM-train}(X, \mathbf{y})$
• Compute the weight vector: $\mathbf{w} = \sum_k \alpha_k y_k \mathbf{x}_k$
• Compute the ranking criterion: $c_i = (w_i)^2$, for all $i$
• Find the feature with the smallest criterion: $f = \arg\min(\mathbf{c})$
• Update the ranked list: $\mathbf{r} = [s(f), \mathbf{r}]$
• Eliminate that feature: $\mathbf{s} = s(1:f-1,\, f+1:\text{length}(\mathbf{s}))$
End
(A runnable sketch of this loop also follows the last slide.)

SVM Model
Minimize over $\alpha_k$:
$J = \frac{1}{2} \sum_{h,k} y_h y_k \alpha_h \alpha_k \,(\mathbf{x}_h \cdot \mathbf{x}_k + \lambda \delta_{hk}) - \sum_k \alpha_k$
Subject to: $0 \le \alpha_k \le C$ and $\sum_k \alpha_k y_k = 0$
Outputs: parameters $\alpha_k$

Data sets
• Leukemia: 7129 genes × 72 samples
• Colon: 2000 genes × 62 samples

Data sets
For example (rows are genes, columns are samples; each sample is labeled Cancer or Normal):

          Sample 1   Sample 2   ……   Sample k
Gene 1    29         19         ……   16
Gene 2    5          17         ……   40
……        ……         ……         ……   ……
Gene n    13         8          ……   2

Experimental results
• Leukemia (SVM-RFE), accuracy in %:

Number of genes   100     50      34      20      10      8       5       3       1
Train accuracy    100     100     100     100     100     100     100     100     92.093
Test accuracy     99.31   98.276  99.31   98.621  98.621  96.552  95.172  92.759  78.966

Experimental results
• Colon (SVM-RFE), accuracy in %:

Number of genes   100    50     33     20     10     8      5       3       1
Train accuracy    100    100    100    100    100    100    99.189  95.405  80
Test accuracy     80.4   80.8   82     79.2   78.8   77.6   75.6    77.6    71.6

The End
Thank you for watching!
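Appendix: code sketches

The ranking-criterion slides can be summarised in a few lines of code. The sketch below, in Python with NumPy, computes the Golub (1999) correlation score per gene; the function name golub_scores and the eps guard are illustrative choices, not part of the original slides.

```python
# A minimal sketch of the correlation-coefficient criterion
#   w_i = (m_i(+) - m_i(-)) / (s_i(+) + s_i(-))
# from the "Ranking criteria" slide (Golub, 1999). The name golub_scores
# and the eps guard are illustrative, not from the slides.
import numpy as np

def golub_scores(X, y, eps=1e-12):
    """Score each gene; X is (samples, genes), y holds labels in {-1, +1}."""
    pos, neg = X[y == 1], X[y == -1]
    m_pos, m_neg = pos.mean(axis=0), neg.mean(axis=0)
    s_pos, s_neg = pos.std(axis=0), neg.std(axis=0)
    # eps avoids division by zero when a gene has zero spread in both classes
    return (m_pos - m_neg) / (s_pos + s_neg + eps)

# Variants listed on the slides:
#   Furey (2000):    rank genes by |w_i|
#   Pavlidis (2000): rank by (m_i(+) - m_i(-))^2 / (s_i(+)^2 + s_i(-)^2)
#   This paper:      rank by w_i^2, with w taken from a trained linear SVM
```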
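The next sketch covers the SVM RFE loop from the pseudocode slides. It assumes scikit-learn's linear SVC plays the role of SVM-train (the slides do not name an implementation) and removes one feature per pass; the function name svm_rfe and the default C=1.0 are illustrative.

```python
# A sketch of the SVM RFE loop from the "SVM RFE" slides, assuming
# scikit-learn's linear SVC as the SVM-train step. Variable names
# (X0, y, s, r, c, f) mirror the pseudocode; svm_rfe and C=1.0 are
# illustrative choices, not from the slides.
import numpy as np
from sklearn.svm import SVC

def svm_rfe(X0, y, C=1.0):
    """Rank features from most to least useful.

    X0 : (l, n) array of training examples (l samples, n genes)
    y  : (l,) array of class labels in {-1, +1}
    """
    s = list(range(X0.shape[1]))   # surviving feature indices
    r = []                         # ranked list; best feature ends up first
    while s:                                        # repeat until s = []
        X = X0[:, s]                                # X = X0(:, s)
        clf = SVC(kernel="linear", C=C).fit(X, y)   # alpha = SVM-train(X, y)
        w = clf.coef_.ravel()                       # w = sum_k alpha_k y_k x_k
        c = w ** 2                                  # c_i = (w_i)^2
        f = int(np.argmin(c))                       # smallest ranking criterion
        r.insert(0, s[f])                           # r = [s(f), r]
        del s[f]                                    # drop feature f from s
    return r

# Usage: to evaluate a subset of m genes (e.g. m = 100, 50, ..., 1 as in the
# result tables), keep the first m entries of r and retrain the SVM on them.
```

Removing a single feature per iteration follows the slides exactly; with thousands of genes it is common to remove larger chunks per pass to keep training time manageable.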