Information Entropy: Illustrating Example

Andrew Kusiak
Intelligent Systems Laboratory
2139 Seamans Center
The University of Iowa, Iowa City, Iowa 52242-1527
[email protected]
http://www.icaen.uiowa.edu/~ankusiak
Tel: 319-335-5934, Fax: 319-335-5669

Etymology of “Entropy”
• Entropy = randomness
• Amount of uncertainty

Definitions
• Information content:
  I(s1, s2, ..., sm) = −Σ (i = 1..m) (si/s) · log2(si/s)
• Entropy of an attribute A with v values:
  E(A) = Σ (j = 1..v) [(s1j + ... + smj)/s] · I(s1j, ..., smj)
• Information gain:
  Gain(A) = I(s1, s2, ..., sm) − E(A)

Shannon Entropy
• S = a finite probability space composed of two disjoint events E1 and E2 with probabilities p1 = p and p2 = 1 − p, respectively.
• The Shannon entropy is defined as
  H(S) = H(p1, p2) = −p log p − (1 − p) log(1 − p)

Example (Case 1)
D1 = number of examples in class 1, D2 = number of examples in class 2.

No.  F1    D
1    Blue  1
2    Blue  1
3    Blue  1
4    Blue  1
5    Red   2
6    Red   2
7    Red   2
8    Red   2

I(D1, D2) = −4/8 · log2(4/8) − 4/8 · log2(4/8) = 1
For Blue: D11 = 4, D21 = 0; I(D11, D21) = −4/4 · log2(4/4) = 0
For Red: D12 = 0, D22 = 4; I(D12, D22) = −4/4 · log2(4/4) = 0
E(F1) = 4/8 · I(D11, D21) + 4/8 · I(D12, D22) = 0
Gain(F1) = I(D1, D2) − E(F1) = 1
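The definitions above are easy to check numerically. Below is a minimal Python sketch (the function and variable names are my own, not from the slides) that computes the information content and the information gain, reproducing the Case 1 result I(D1, D2) = 1 and Gain(F1) = 1:

```python
from collections import Counter
from math import log2

def information(labels):
    """Information content I(s1,...,sm) = -sum_i (si/s) * log2(si/s)."""
    s = len(labels)
    return -sum((c / s) * log2(c / s) for c in Counter(labels).values())

def gain(features, labels):
    """Information gain of splitting `labels` on `features`: Gain = I - E."""
    s = len(labels)
    groups = {}
    for f, d in zip(features, labels):
        groups.setdefault(f, []).append(d)
    expected = sum(len(g) / s * information(g) for g in groups.values())
    return information(labels) - expected

# Case 1 from the slides: four Blue examples in class 1, four Red in class 2.
f1 = ["Blue"] * 4 + ["Red"] * 4
d = [1, 1, 1, 1, 2, 2, 2, 2]
print(information(d))  # 1.0
print(gain(f1, d))     # 1.0
```

Because each F1 value maps to a single class, the expected information after the split is 0 and the gain equals the full information content of the class distribution.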
Example (Case 2)

No.  F1    D
1    Blue  1
2    Blue  1
3    Blue  2
4    Blue  2
5    Red   2
6    Red   3
7    Red   3
8    Red   3

I(D1, D2, D3) = −2/8 · log2(2/8) − 3/8 · log2(3/8) − 3/8 · log2(3/8) = 1.56
For Blue: D11 = 2, D21 = 2, D31 = 0; I(D11, D21) = −2/4 · log2(2/4) − 2/4 · log2(2/4) = 1
For Red: D12 = 0, D22 = 1, D32 = 3; I(D22, D32) = −1/4 · log2(1/4) − 3/4 · log2(3/4) = 0.81
E(F1) = 4/8 · I(D11, D21) + 4/8 · I(D22, D32) = 0.905
Gain(F1) = I(D1, D2, D3) − E(F1) = 0.655

Example (Case 3)

No.  F1    D
1    Blue  1
2    Blue  2
3    Blue  2
4    Blue  2
5    Red   3
6    Red   3
7    Red   3
8    Red   3

I(D1, D2, D3) = −1/8 · log2(1/8) − 3/8 · log2(3/8) − 4/8 · log2(4/8) = 1.41
For Blue: D11 = 1, D21 = 3, D31 = 0; I(D11, D21) = −1/4 · log2(1/4) − 3/4 · log2(3/4) = 0.81
For Red: D12 = 0, D22 = 0, D32 = 4; I(D32) = −4/4 · log2(4/4) = 0
E(F1) = 4/8 · I(D11, D21) + 4/8 · I(D32) = 0.41
Gain(F1) = I(D1, D2, D3) − E(F1) = 1

Example (Case 4)

No.  F1     D
1    Blue   1
2    Blue   1
3    Green  3
4    Green  3
5    Green  3
6    Red    2
7    Red    2
8    Red    2

I(D1, D2, D3) = −2/8 · log2(2/8) − 3/8 · log2(3/8) − 3/8 · log2(3/8) = 1.56
For Blue: D11 = 2, D21 = 0, D31 = 0; I(D11) = −2/2 · log2(2/2) = 0
For Red: D12 = 0, D22 = 3, D32 = 0; I(D22) = −3/3 · log2(3/3) = 0
For Green: D13 = 0, D23 = 0, D33 = 3; I(D33) = −3/3 · log2(3/3) = 0
E(F1) = 2/8 · I(D11) + 3/8 · I(D22) + 3/8 · I(D33) = 0
Gain(F1) = I(D1, D2, D3) − E(F1) = 1.56
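The arithmetic in Cases 2–4 can be recomputed directly from the per-class example counts. A small sketch (the helper name `I` is mine, mirroring the slides' notation) that verifies each quantity:

```python
from math import log2

def I(*counts):
    """Information content of a vector of per-class example counts."""
    s = sum(counts)
    return -sum(c / s * log2(c / s) for c in counts if c)

# Case 2: classes split 2/3/3 overall; Blue holds counts (2, 2), Red holds (1, 3).
i2 = I(2, 3, 3)                          # slides: 1.56
e2 = 4/8 * I(2, 2) + 4/8 * I(1, 3)       # slides: 0.905
g2 = i2 - e2                             # slides: 0.655

# Case 3: classes split 1/3/4; Blue holds (1, 3), Red holds (4,) - a pure group.
g3 = I(1, 3, 4) - (4/8 * I(1, 3) + 4/8 * I(4))   # slides: 1

# Case 4: classes split 2/3/3; every F1 value is pure, so E(F1) = 0.
g4 = I(2, 3, 3) - 0                      # slides: 1.56
print(round(g2, 3), round(g3, 3), round(g4, 2))
```

Note that Case 3's gain comes out to exactly 1 even though no single split value is pure: the Red group is pure and the Blue group's residual information exactly cancels against the extra information in the three-class distribution.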
Example (Case 5)

No.  F1     D
1    Blue   1
2    Green  1
3    Green  3
4    Green  3
5    Green  4
6    Green  4
7    Red    2
8    Red    2

I(D1, D2, D3, D4) = −2/8 · log2(2/8) − 2/8 · log2(2/8) − 2/8 · log2(2/8) − 2/8 · log2(2/8) = 2
For Blue: D11 = 1, D21 = 0, D31 = 0, D41 = 0; I(D11) = −1/1 · log2(1/1) = 0
For Red: D12 = 0, D22 = 2, D32 = 0, D42 = 0; I(D22) = −2/2 · log2(2/2) = 0
For Green: D13 = 1, D23 = 0, D33 = 2, D43 = 2; I(D13, D33, D43) = −1/5 · log2(1/5) − 2/5 · log2(2/5) − 2/5 · log2(2/5) = 1.52
E(F1) = 1/8 · I(D11) + 2/8 · I(D22) + 5/8 · I(D13, D33, D43) = 0.95
Gain(F1) = I(D1, D2, D3, D4) − E(F1) = 1.05

Summary

Case  E(F1)   Gain(F1)
1     0       1
2     0.905   0.655
3     0.41    1
4     0       1.56
5     0.95    1.05

Continuous values → a split
(See http://www.icaen.uiowa.edu/%7Ecomp/Public/Kantardzic.pdf)

Play Tennis: Training Data Set
Outlook, Temperature, Humidity, and Wind are the features (attributes); Play tennis is the decision.

Outlook   Temperature  Humidity  Wind    Play tennis
sunny     hot          high      weak    no
sunny     hot          high      strong  no
overcast  hot          high      weak    yes
rain      mild         high      weak    yes
rain      cool         normal    weak    yes
rain      cool         normal    strong  no
overcast  cool         normal    strong  yes
sunny     mild         high      weak    no
sunny     cool         normal    weak    yes
rain      mild         normal    weak    yes
sunny     mild         normal    strong  yes
overcast  mild         high      strong  yes
overcast  hot          normal    weak    yes
rain      mild         high      strong  no
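The per-feature information gains quoted on the feature-selection slide can be reproduced from this training table. Below is a sketch (function and variable names are my own); note the slides appear to truncate rather than round some values, e.g. the computed outlook gain is ≈ 0.247 versus the quoted 0.246:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """I(s1,...,sm) over the class counts of `labels`."""
    s = len(labels)
    return -sum(c / s * log2(c / s) for c in Counter(labels).values())

def gain(rows, col, labels):
    """Gain(S, A) for the attribute stored in column `col` of each row."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[col], []).append(y)
    e = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - e

# Play-tennis training set: (outlook, temperature, humidity, wind, play).
DATA = [
    ("sunny", "hot", "high", "weak", "no"),
    ("sunny", "hot", "high", "strong", "no"),
    ("overcast", "hot", "high", "weak", "yes"),
    ("rain", "mild", "high", "weak", "yes"),
    ("rain", "cool", "normal", "weak", "yes"),
    ("rain", "cool", "normal", "strong", "no"),
    ("overcast", "cool", "normal", "strong", "yes"),
    ("sunny", "mild", "high", "weak", "no"),
    ("sunny", "cool", "normal", "weak", "yes"),
    ("rain", "mild", "normal", "weak", "yes"),
    ("sunny", "mild", "normal", "strong", "yes"),
    ("overcast", "mild", "high", "strong", "yes"),
    ("overcast", "hot", "normal", "weak", "yes"),
    ("rain", "mild", "high", "strong", "no"),
]
X = [r[:4] for r in DATA]
y = [r[4] for r in DATA]
gains = {name: gain(X, i, y)
         for i, name in enumerate(["outlook", "temperature", "humidity", "wind"])}
for name, g in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {g:.3f}")
```

Outlook has the highest gain, which is why it is chosen as the root of the decision tree in the construction that follows.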
Feature Selection
Information gain of each feature on the play-tennis training set:
• Gain(S, outlook) = 0.246
• Gain(S, humidity) = 0.151
• Gain(S, wind) = 0.048
• Gain(S, temperature) = 0.029
Outlook has the largest information gain, so it is selected as the root of the tree.

Constructing the Decision Tree
Split the training set on Outlook (values: Sunny, Overcast, Rain) and recurse on each subset. The Sunny branch, for example, retains:

Temperature  Humidity  Wind    Play tennis
hot          high      weak    no
hot          high      strong  no
mild         high      weak    no
cool         normal    weak    yes
mild         normal    strong  yes

The Overcast branch contains only "yes" examples and becomes a leaf; the Sunny and Rain branches are split further on Humidity and Wind, respectively.

Complete Decision Tree
Outlook
• Sunny → Humidity: High → No, Normal → Yes
• Overcast → Yes
• Rain → Wind: Strong → No, Weak → Yes

From Decision Trees to Rules
IF Outlook = Overcast
   OR (Outlook = Sunny AND Humidity = Normal)
   OR (Outlook = Rain AND Wind = Weak)
THEN Play tennis = Yes

Decision Trees: Key Characteristics
• Complete space of finite discrete-valued functions
• Maintaining a single hypothesis
• No backtracking in search
• All training examples used at each step
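The greedy construction described above (pick the highest-gain feature, split, recurse until each node is pure) is the core of Quinlan's ID3 algorithm. A compact sketch, with my own function names and dictionary-valued rows, that reproduces the complete tree from the slides:

```python
from collections import Counter
from math import log2

def entropy(ys):
    n = len(ys)
    return -sum(c / n * log2(c / n) for c in Counter(ys).values())

def info_gain(rows, ys, feat):
    groups = {}
    for row, y in zip(rows, ys):
        groups.setdefault(row[feat], []).append(y)
    rest = sum(len(g) / len(ys) * entropy(g) for g in groups.values())
    return entropy(ys) - rest

def id3(rows, ys, feats):
    if len(set(ys)) == 1:
        return ys[0]                                # pure node -> leaf
    if not feats:
        return Counter(ys).most_common(1)[0][0]     # majority vote
    best = max(feats, key=lambda f: info_gain(rows, ys, f))
    branches = {}
    for val in {row[best] for row in rows}:
        sub = [(r, y) for r, y in zip(rows, ys) if r[best] == val]
        branches[val] = id3([r for r, _ in sub], [y for _, y in sub],
                            [f for f in feats if f != best])
    return {best: branches}

FEATS = ["outlook", "temperature", "humidity", "wind"]
DATA = [
    ("sunny", "hot", "high", "weak", "no"),
    ("sunny", "hot", "high", "strong", "no"),
    ("overcast", "hot", "high", "weak", "yes"),
    ("rain", "mild", "high", "weak", "yes"),
    ("rain", "cool", "normal", "weak", "yes"),
    ("rain", "cool", "normal", "strong", "no"),
    ("overcast", "cool", "normal", "strong", "yes"),
    ("sunny", "mild", "high", "weak", "no"),
    ("sunny", "cool", "normal", "weak", "yes"),
    ("rain", "mild", "normal", "weak", "yes"),
    ("sunny", "mild", "normal", "strong", "yes"),
    ("overcast", "mild", "high", "strong", "yes"),
    ("overcast", "hot", "normal", "weak", "yes"),
    ("rain", "mild", "high", "strong", "no"),
]
rows = [dict(zip(FEATS, r[:4])) for r in DATA]
labels = [r[4] for r in DATA]
tree = id3(rows, labels, FEATS)
print(tree)
```

Running this yields Outlook at the root, a Humidity test under Sunny, a Wind test under Rain, and a "yes" leaf under Overcast, matching the complete decision tree and the rule derived from it.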
Avoiding Overfitting the Data
[Figure: classification accuracy versus size of tree. Accuracy on the training data set keeps increasing as the tree grows, while accuracy on the testing data set peaks and then declines.]

Reference
J. R. Quinlan, Induction of decision trees, Machine Learning, vol. 1, 1986, pp. 81-106.