Classification
Per Bruun Brockhoff, [email protected], and Michael Edberg Hansen, IMM, DTU

Overview
- Classification – basic aim
- Simple allocation rules
- Extensions:
  - not the same costs of misclassification
  - prior knowledge available
  - NOT the same (co)variances
  - multivariate feature observations
- Other methods: k-nearest neighbour, PLS regression
- Multiclass problems

Classification – basic aim
Simple example: group 1 (females) has mean weight μ1 = 73 kg; group 2 (males) has mean weight μ2 = 100 kg; the feature x is weight. We observe the weight of a new person. Task: predict the sex of this new person ("classify").

Simple example: allocation rule
The "obvious" method:
- IF the weight is close to the female level: c_new = 1 (female)
- IF the weight is close to the male level: c_new = 2 (male)
The natural cut point is the midpoint (μ1 + μ2)/2 = (73 + 100)/2 = 86.5 kg:

$$c_{\text{new}} = \begin{cases} 1, & \text{if } x_{\text{new}} < (\mu_1 + \mu_2)/2 \\ 2, & \text{if } x_{\text{new}} \ge (\mu_1 + \mu_2)/2 \end{cases}$$

Considering population distributions
With class densities f1 (females) and f2 (males) for weight, the allocation rule is "the most probable":

$$c(x_{\text{new}}) = \begin{cases} 1, & \text{if } f_1(x_{\text{new}})/f_2(x_{\text{new}}) > 1 \\ 2, & \text{if } f_1(x_{\text{new}})/f_2(x_{\text{new}}) < 1 \end{cases}$$

IF both distributions are normal with the same variance, this reduces to the midpoint rule:

$$f_1(x) > f_2(x) \iff x < (\mu_1 + \mu_2)/2$$

Classification: complications and extensions
Complications:
- not the same cost of misclassifying in class 1 or 2
- prior knowledge available about the class sizes (NOT fifty-fifty)
- NOT the same (co)variances
Extensions:
- multivariate feature observations
- more than two classes

Loss function for wrong classification

                  True class 1    True class 2
  Predicted 1     0               L(1|2)
  Predicted 2     L(2|1)          0

In this course we work with equal losses: L(1|2) = L(2|1).

Different priors
Take knowledge of the class sizes into account via prior probabilities π1 and π2, usually estimated by n1/(n1+n2) and n2/(n1+n2). The allocation rule becomes

$$c(x) = \begin{cases} 1, & \text{if } f_1(x)/f_2(x) > \pi_2/\pi_1 \\ 2, & \text{if } f_1(x)/f_2(x) < \pi_2/\pi_1 \end{cases}$$

Two-class bivariate example: Discriminant Analysis
For normally distributed data,

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{p/2}\,|\boldsymbol{\Sigma}|^{1/2}} \, e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})}$$

To classify we must study f1(x) versus f2(x).
[Figure: two bivariate normal classes in the (fat weight, weight) plane.]

Discriminant Analysis
- Σ1 = Σ2: linear discriminant function (LDA)
- Σ1 ≠ Σ2: quadratic discriminant function (QDA)

Bivariate normal with homogeneous variance
Same x1 and x2 variance and no correlation: the boundary f1(x)/f2(x) = 1 is a straight line (Exercise 2).
[Figure: two circular classes in (x1, x2) separated by a linear boundary.]
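To make the LDA/QDA distinction concrete, here is a minimal numerical sketch, assuming equal priors and losses (c = 1); the means, covariances and query point are made up for illustration, and the quadratic rule follows the slides' Mahalanobis comparison without a log-determinant term:

```python
import numpy as np

# Hypothetical parameters (made up): two bivariate classes,
# e.g. (weight, fat weight) as in the two-class example above.
mu1, mu2 = np.array([73.0, 20.0]), np.array([100.0, 12.0])
S1 = np.array([[40.0, 5.0], [5.0, 9.0]])      # class 1 covariance
S2 = np.array([[60.0, -8.0], [-8.0, 16.0]])   # class 2 covariance

def lda_class(x, pooled):
    """Linear rule (Sigma1 = Sigma2): threshold a linear score at the midpoint."""
    w = np.linalg.solve(pooled, mu1 - mu2)    # Sigma^{-1}(mu1 - mu2)
    t = 0.5 * (mu1 + mu2) @ w                 # boundary for equal priors (c = 1)
    return 1 if x @ w > t else 2

def qda_class(x):
    """Quadratic rule: compare class-specific Mahalanobis distances
    (equal priors; no log-determinant term, matching the slides' form)."""
    d1 = (x - mu1) @ np.linalg.solve(S1, x - mu1)
    d2 = (x - mu2) @ np.linalg.solve(S2, x - mu2)
    return 1 if d1 < d2 else 2

x_new = np.array([86.5, 15.0])
print(lda_class(x_new, pooled=(S1 + S2) / 2), qda_class(x_new))
```

With a common covariance the two rules agree; with the class-specific S1 and S2 they can differ for points near the boundary.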
Bivariate normal with homogeneous variance
WITH correlation: the boundary f1(x)/f2(x) = 1 is still a straight line (Exercise 1).
[Figure: two correlated elliptical classes in (x1, x2) with a linear boundary.]

Fisher Discriminant Analysis (CVA)
Find the linear projection that maximizes the ratio

$$\frac{\text{between-group variance}}{\text{within-group variance}}$$

and classify by a cutoff t_c on the resulting univariate discriminant score.
[Figure: the projected class densities f1 and f2 along the discriminant direction, with the cutoff t_c.]

Use data to estimate distributions
Plugging estimated means and a common covariance Σ into f1(x)/f2(x) > c and taking logs gives a rule that is linear in x:

$$\mathbf{x}^T \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2) - \tfrac{1}{2}\boldsymbol{\mu}_1^T \boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_1 + \tfrac{1}{2}\boldsymbol{\mu}_2^T \boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_2 - \log c > 0$$

Bivariate normal with heterogeneous covariance
The boundary f1(x)/f2(x) = 1 is now curved (Exercise 3).
[Figure: two classes with different covariances and a quadratic boundary in (x1, x2).]

Use data to estimate distributions
With class-specific covariances the rule compares Mahalanobis distances:

$$-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_1)^T \boldsymbol{\Sigma}_1^{-1}(\mathbf{x}-\boldsymbol{\mu}_1) + \tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_2)^T \boldsymbol{\Sigma}_2^{-1}(\mathbf{x}-\boldsymbol{\mu}_2) - \log c > 0$$

Relations
- Mahalanobis: finds points of "equidistance"
- Fisher: finds a univariate discrimination score
- "General Bayes": finds points of "equi-probability"
For 2 classes, normal distributions and variance homogeneity, all three methods are the same (= LDA). For 2 classes, normal distributions and variance heterogeneity, Bayes = Mahalanobis = QDA (which can be mimicked by an LDA with squared and cross-product terms).

k-Nearest Neighbour method
The k-NN classifier is a very intuitive method: examples are classified based on their similarity with the training data. k-NN only requires:
- an integer k,
- a set of labeled examples,
- a measure of "closeness".

[Figures: three 3-NN illustrations in (x1, x2) — a query point "?" among classes c = 1 and c = 2 is assigned by majority vote of its three nearest neighbours, here to c = 1.]

Characteristics of the NN classifier
Advantages:
- analytically tractable, simple implementation
- nearly optimal in the large-sample limit (N → ∞)
Disadvantages:
- large storage requirements
- computationally intensive recall
- highly susceptible to the curse of dimensionality
1-NN versus k-NN: determine k by cross-validation.

Multiclass problems: the Mahalanobis/Fisher approach (LDA)
The same ratio of between-group variance to within-group variance is maximized with more than two groups.
[Figure: seven class regions π1, ..., π7 partitioning the (x1, x2) plane.]

Example
742 obs: 7 mutants and 600 ions (Nobs = 742, Nmetabolites = 135).
[Figures: PCA score plot (1st vs 2nd principal component) and CVA plot (1st vs 2nd canonical discriminant function), classes WT, CAT, OPI, INO2, HAP4, INO4, INO24.]

k-Nearest Neighbour method: example
Data is generated for a 2-dimensional 3-class problem, where the likelihoods are unimodal and distributed in rings around a common mean. k = 5; metric = Euclidean distance. These classes are also non-linearly separable.
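The voting rule behind these 3-NN and 5-NN illustrations fits in a few lines. A minimal sketch, assuming Euclidean distance and majority vote (the training points and query are made up):

```python
import numpy as np

def knn_classify(x_new, X_train, y_train, k=3):
    """k-NN with Euclidean distance: majority vote among the k closest examples."""
    d = np.linalg.norm(X_train - x_new, axis=1)       # distance to every training point
    nearest = y_train[np.argsort(d)[:k]]              # labels of the k nearest
    classes, counts = np.unique(nearest, return_counts=True)
    return classes[np.argmax(counts)]                 # most frequent label wins

# Made-up 2-D training data with labels 1 and 2
X = np.array([[1.0, 1.2], [0.8, 0.9], [1.1, 1.0],
              [3.0, 3.1], [2.9, 3.3], [3.2, 2.8]])
y = np.array([1, 1, 1, 2, 2, 2])
print(knn_classify(np.array([2.5, 2.6]), X, y, k=3))  # -> 2
```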
Solution: 5-nearest neighbour
[Figure: the 5-NN decision regions and class assignments for the ring-shaped 3-class problem.]

Link to regression and PLS
- Two-group LDA can be (exactly) obtained by simple regression: just use the binary y's in an MLR (a minimal sketch is given at the end of these notes).
- A version of QDA (not 100% equivalent) can be obtained by including product and squared variables in the MLR.
- PLSR and/or PCR is also a good idea!
- Multi-class discrimination can be handled by PLS2: make K dummy variables (easy/inbuilt in Unscrambler).

Classification in Unscrambler
- LDA: multiclass, multivariate X (assuming n_i > p)
- QDA (Mahalanobis)
- All of these may be combined with PCA (the PCA-LDA option)
- PLS1: two-group LDA, PLS-DA (a "PLS" version of PCA-LDA)
- PLS2: multi-group LDA, PLS-DA
- The PLS methods can be made "QDA-like" by inclusion of squared/crossed variables

Overview (recap)
- Classification – basic aim
- Simple allocation rules
- Extensions: not the same costs of misclassification; prior knowledge available; NOT the same (co)variances; multivariate feature observations
- Other methods: k-nearest neighbour, PLS regression
- Multiclass problems
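As a concrete illustration of the regression route to two-group LDA mentioned under "Link to regression and PLS", here is a minimal sketch; the data are simulated, the mean-score cutoff is a simplification (the exact LDA cut differs for unequal group sizes), and this is not Unscrambler's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up two-group data: 20 observations per class, 3 features
X = np.vstack([rng.normal(0.0, 1.0, (20, 3)), rng.normal(1.5, 1.0, (20, 3))])
y = np.concatenate([np.zeros(20), np.ones(20)])   # binary response coding the class

# MLR on the binary y: for two groups the fitted coefficients are proportional
# to the LDA direction Sigma^{-1}(mu1 - mu2), so thresholding the fitted
# scores reproduces the LDA allocation.
Xc = np.column_stack([np.ones(len(X)), X])        # add an intercept column
beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
scores = Xc @ beta
pred = (scores > scores.mean()).astype(int)       # simple cut at the mean score
print("training accuracy:", (pred == y).mean())
```

For multiple classes the same idea carries over by regressing on K dummy variables, which is what PLS2-based PLS-DA does.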