IWIM 2007 workshop, Sept. 23-26, 2007, Prague

Regularization of Evolving Polynomial Models
Pavel Kordík
CTU Prague, Faculty of Electrical Engineering, Department of Computer Science

External criteria in GMDH theory

GAME model
• Input variables x1, x2, ..., xn feed the first layer of units, followed by the 2nd layer of units and the output variable.
• Linear unit: $y = \sum_{i=1}^{n} a_i x_i + a_{n+1}$
• Polynomial unit: $y = \sum_{i=1}^{n} \left( a_i \prod_{j=1}^{m} x_j^{r} \right) + a_0$
• Units in a layer are evolved by a genetic algorithm.

Encoding into chromosomes (GAME model evolution)
[Diagram: input layer and frozen layer(s) with nodes numbered 1 to 7, feeding the actual layer, which is evolved by a niching GA.]
• Linear transfer unit
  Inputs (positions 1234567): 1001000
  Transfer function: not encoded, $y = a_1 x_1 + a_2 x_2 + a_0$
• Polynomial transfer unit
  Inputs (positions 1234567): 0000110
  Transfer function (positions 1234567 1234567): 2115130 1203211, $y = a_1 x_1 x_2^3 + a_2 x_1^2 x_2 + a_0$

Polynomial units encoded into neurons
$y = 8.94 \, x_2^3 x_4^2 - 2.37 \, x_1 x_4^5 + 7.12 \, x_4$

Element   Coeff.   used_field (x1 x2 x3 x4 x5)   degree_field (x1 x2 x3 x4 x5)
1          8.94    0 1 0 1 0                     2 3 1 2 4
2         -2.37    1 0 0 1 0                     1 4 2 5 6
3          7.12    0 0 0 1 0                     3 2 4 1 1

Data set division
• A (2/3): training data. Optimize coefficients (learning of units).
• B (1/3): validation data. Select surviving units.
• Testing data: check if the model overfits the data.
• Adaptive division?

Fitness function: which units survive?
• RMS error of the unit on the training data: feedback for optimization methods, $E = \sum_{i=1}^{m} (y' - y)^2$
• RMS error of the unit on the validation data: used to compute the fitness of units

External criterion
• Computation of error on the validation set
• Fitness = 1/CR

[Figure: RMS error on the Antro training data set and the Antro testing data set, with bars for R = 12, 50, 300, 725, 1600 and 3000.]
Optimal value of R is 300 on the Antro data set.
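The used_field / degree_field encoding from the "Polynomial units encoded into neurons" slide can be illustrated with a minimal Python sketch. The three elements are copied from the slide; the function names `active_terms` and `evaluate_unit` are hypothetical, and the sketch assumes that a digit of degree_field only matters where used_field holds a 1 (which is what reproduces the polynomial shown on the slide).

```python
# Elements copied from the slide: (coefficient, used_field, degree_field),
# one character per input variable x1..x5.
ELEMENTS = [
    (8.94, "01010", "23124"),
    (-2.37, "10010", "14256"),
    (7.12, "00010", "32411"),
]


def active_terms(elements):
    """Return (coefficient, {input_index: degree}) for each element.

    Assumption: a degree digit is ignored unless used_field has '1' there.
    """
    terms = []
    for coeff, used_field, degree_field in elements:
        degrees = {
            j: int(d)
            for j, (u, d) in enumerate(zip(used_field, degree_field), start=1)
            if u == "1"
        }
        terms.append((coeff, degrees))
    return terms


def evaluate_unit(elements, x):
    """Evaluate the polynomial unit for an input vector x = [x1, ..., x5]."""
    y = 0.0
    for coeff, degrees in active_terms(elements):
        term = coeff
        for j, degree in degrees.items():
            term *= x[j - 1] ** degree  # inputs are numbered from 1
        y += term
    return y
```

Decoding the first element gives the coefficient 8.94 with x2 raised to 3 and x4 raised to 2, matching the first term of the polynomial on the slide. With all inputs set to 1, `evaluate_unit` simply sums the three coefficients (approximately 13.69).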
CRrms-r-val criterion on real data
[Figure: RMS error on the Building training data set and the Building testing data set for the outputs WBE, WBCW and WBHW, with bars for R = 12, 50, 300, 725, 1600 and 3000.]
Optimal value of R is 725 on the Building data set.

CR should be sensitive to noise
[Figure: CR plotted against model complexity, from y = a1x1 + a2 up to y = a1x1^3x4 + ... + a6x2 + a7, with curves for high, medium and low noise and for values of R between 12 and 3000. Annotation: stop in the minimum of CR.]

How to estimate the penalization strength (1/R)?
Variance of the output variable?

Experiments with synthetic data
[Figure: regularization on testing data. Panels "Training & Validation set" and "Validation set" show the RMS on the testing data for the criteria RMS-tr&val, R300-tr&val and RMS-p-n-tr&val at 0%, 5%, 10%, 20%, 50%, 100% and 200% noise.]
Regularization works, but the difference of R300 and p-n is not significant. Validate just on validation set!
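The interplay of validation error, penalization strength 1/R and the rule "stop in the minimum of CR" can be sketched in Python. The slides do not spell out the formula of the CRrms-r-val criterion, so the multiplicative form used here, CR = RMS_val * (1 + p/R) with p the number of coefficients of a unit, is an assumption made only for illustration. All function names are hypothetical.

```python
import math


def rms(predicted, target):
    """Root mean square error between two equally long sequences."""
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(target))


def regularized_criterion(rms_val, n_coeffs, R):
    """ASSUMED form of a regularized criterion: validation RMS error
    inflated by a complexity penalty of strength 1/R."""
    return rms_val * (1.0 + n_coeffs / R)


def fitness(cr):
    """Fitness = 1/CR, as stated on the "External criterion" slide."""
    return 1.0 / cr


def select_unit(candidates, R):
    """Pick the candidate with minimal CR.

    candidates: list of (name, validation_rms, number_of_coefficients).
    """
    return min(candidates, key=lambda c: regularized_criterion(c[1], c[2], R))
```

Under this assumed form, a small R penalizes complexity strongly and favours simple units, while a large R lets the validation error dominate. For two hypothetical candidates, a linear unit with validation RMS 0.30 and 2 coefficients and a polynomial unit with validation RMS 0.25 and 7 coefficients, `select_unit` prefers the linear unit at R = 12 and the polynomial unit at R = 3000.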
Theoretical and experimental aspects of regularization
[Figure: two panels with a vertical axis from 0 to 1 and horizontal-axis labels from RMS R2 to RMS R3000 (including R12, R50, R300, R725 and R1600); one series per noise level: 0%, 5%, 10%, 20%, 50%, 100% and 200% noise.]

So which criterion is the best?
[Figure: comparison of the criteria RMS-valid, R300-val, RMS-p-n-valid, R300-tr&val, RMS-tr&val and RMS-p-n-tr&valid at 0%, 5%, 10%, 20%, 50%, 100% and 200% noise.]

Regularized polynomial models on Antro data set
It is evident that the optimal value of R is between 100 and 1000, the same result as in our previous experiments with the Antro data set (Ropt = 300). Linear models are still better than the best polynomial model!

Conclusion
• Experiments with regularization of polynomial models
• Every data set requires a different level of penalization for complexity
• It can be partially derived from the variance of the output variable
• The regularization is still not sufficient: linear models perform better on highly noisy data sets!

Thank you!