L5: Quadratic classifiers

• Bayes classifiers for Normally distributed classes
  – Case 1: \Sigma_i = \sigma^2 I
  – Case 2: \Sigma_i = \Sigma (\Sigma diagonal)
  – Case 3: \Sigma_i = \Sigma (\Sigma non-diagonal)
  – Case 4: \Sigma_i = \sigma_i^2 I
  – Case 5: \Sigma_i \neq \Sigma_j (general case)
• Numerical example
• Linear and quadratic classifiers: conclusions

Bayes classifiers for Gaussian classes

• Recap
  – In L4 we showed that the decision rule that minimizes P[error] can be formulated in terms of a family of discriminant functions g_i(x)

[Figure: discriminant-function classifier — features x_1, x_2, x_3, ..., x_d feed discriminant functions g_1(x), g_2(x), ..., g_C(x); their outputs, weighted by costs, pass to a "select max" stage that assigns the class]

• For Gaussian classes, these discriminant functions reduce to simple expressions
  – The multivariate Normal pdf is
      p(x) = (2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\left[-\frac{1}{2}(x-\mu)^T \Sigma^{-1}(x-\mu)\right]
  – Using Bayes rule, and eliminating the evidence p(x), which is constant across classes, the DFs become
      g_i(x) = P(\omega_i|x) \propto p(x|\omega_i)\,P(\omega_i) = (2\pi)^{-n/2} |\Sigma_i|^{-1/2} \exp\left[-\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1}(x-\mu_i)\right] P(\omega_i)
  – Taking natural logs (and dropping the constant (2\pi)^{-n/2})
      g_i(x) = -\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1}(x-\mu_i) - \frac{1}{2}\log|\Sigma_i| + \log P(\omega_i)
  – This expression is called a quadratic discriminant function

Case 1: \Sigma_i = \sigma^2 I

• This situation occurs when the features are statistically independent and have the same variance for all classes
  – In this case, the quadratic DFs become
      g_i(x) = -\frac{1}{2}(x-\mu_i)^T (\sigma^2 I)^{-1}(x-\mu_i) - \frac{1}{2}\log|\sigma^2 I| + \log P(\omega_i) \equiv -\frac{1}{2\sigma^2}(x-\mu_i)^T(x-\mu_i) + \log P(\omega_i)
  – Expanding this expression
      g_i(x) = -\frac{1}{2\sigma^2}\left[x^T x - 2\mu_i^T x + \mu_i^T \mu_i\right] + \log P(\omega_i)
  – Eliminating the term x^T x, which is constant for all classes
      g_i(x) = -\frac{1}{2\sigma^2}\left[-2\mu_i^T x + \mu_i^T \mu_i\right] + \log P(\omega_i) = w_i^T x + w_{i0}
  – So the DFs are linear, and the boundaries g_i(x) = g_j(x) are hyper-planes
  – If we assume equal priors
      g_i(x) = -\frac{1}{2\sigma^2}(x-\mu_i)^T(x-\mu_i)
    • This is called a minimum-distance or nearest-mean classifier
    • The equiprobable contours are hyper-spheres
    • For unit variance (\sigma^2 = 1), maximizing g_i(x) amounts to minimizing the Euclidean distance \|x - \mu_i\|

[Figure: minimum-distance classifier — x enters a bank of distance computations to \mu_1, \mu_2, ..., \mu_C; a minimum selector outputs the class; after Schalkoff, 1992]

• Example
  – Three-class 2D problem with equal priors
      \mu_1 = [3, 2]^T, \quad \mu_2 = [7, 4]^T, \quad \mu_3 = [2, 5]^T, \quad \Sigma_1 = \Sigma_2 = \Sigma_3 = 2I

Case 2: \Sigma_i = \Sigma (\Sigma diagonal)

• The classes still have the same covariance matrix, but the features are allowed to have different variances
  – In this case, the quadratic DFs become
      g_i(x) = -\frac{1}{2}(x-\mu_i)^T \Sigma^{-1}(x-\mu_i) - \frac{1}{2}\log|\Sigma| + \log P(\omega_i) = -\frac{1}{2}\sum_{j=1}^{n}\frac{(x_j-\mu_{i,j})^2}{\sigma_j^2} - \frac{1}{2}\log\prod_{j=1}^{n}\sigma_j^2 + \log P(\omega_i)
  – Eliminating the terms x_j^2, which are constant for all classes
      g_i(x) = -\frac{1}{2}\sum_{j=1}^{n}\frac{-2x_j\mu_{i,j} + \mu_{i,j}^2}{\sigma_j^2} - \frac{1}{2}\log\prod_{j=1}^{n}\sigma_j^2 + \log P(\omega_i)
  – This discriminant is also linear, so the decision boundaries g_i(x) = g_j(x) are again hyper-planes
  – The equiprobable contours are hyper-ellipses aligned with the feature axes
  – The only difference with the previous classifier is that the distance along each axis is normalized by the variance of that axis (a short numerical sketch of these linear discriminants follows the example below)

• Example
  – Three-class 2D problem with equal priors
      \mu_1 = [3, 2]^T, \quad \mu_2 = [5, 4]^T, \quad \mu_3 = [2, 5]^T, \quad \Sigma_1 = \Sigma_2 = \Sigma_3 = \mathrm{diag}(1, 2)
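The Case 1–2 discriminants are simple enough to implement in a few lines. Below is a minimal numpy sketch, not part of the original lecture: it evaluates the variance-normalized discriminant of Case 2 using the means, variances, and equal priors of the example above; the test point x is an invented value, and setting both variances equal recovers the nearest-mean classifier of Case 1.

    import numpy as np

    def linear_discriminants(x, means, sigma2, priors):
        # g_i(x) = -0.5 * sum_j (x_j - mu_ij)^2 / sigma_j^2 + log P(w_i);
        # the x_j^2 and log|Sigma| terms are shared by all classes, so keeping
        # them does not change the argmax and keeps the code compact
        d2 = (((x - means) ** 2) / sigma2).sum(axis=1)  # variance-normalized distances
        return -0.5 * d2 + np.log(priors)

    means  = np.array([[3.0, 2.0], [5.0, 4.0], [2.0, 5.0]])  # Case 2 example means
    sigma2 = np.array([1.0, 2.0])                            # shared diagonal variances
    priors = np.array([1/3, 1/3, 1/3])                       # equal priors

    x = np.array([4.0, 3.0])                                 # invented test point
    g = linear_discriminants(x, means, sigma2, priors)
    print(int(g.argmax()))                                   # class with largest g_i(x)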
Case 3: \Sigma_i = \Sigma (\Sigma non-diagonal)

• The classes have an equal covariance matrix, but it is no longer diagonal
  – The quadratic discriminant becomes
      g_i(x) = -\frac{1}{2}(x-\mu_i)^T \Sigma^{-1}(x-\mu_i) - \frac{1}{2}\log|\Sigma| + \log P(\omega_i)
  – Eliminating the term \log|\Sigma|, which is constant for all classes, and assuming equal priors
      g_i(x) = -\frac{1}{2}(x-\mu_i)^T \Sigma^{-1}(x-\mu_i)
  – The quadratic term is called the Mahalanobis distance, a very important concept in statistical pattern recognition
    • The Mahalanobis distance is a vector distance that uses a \Sigma^{-1} norm:
        \|x-\mu\|_{\Sigma^{-1}}^2 = (x-\mu)^T \Sigma^{-1}(x-\mu)
    • \Sigma^{-1} acts as a stretching factor on the space
    • When \Sigma = I, the Mahalanobis distance becomes the familiar Euclidean distance

[Figure: contours of constant Mahalanobis distance, \|x_i - \mu\|_{\Sigma^{-1}}^2 = K (ellipses), vs. constant Euclidean distance, \|x_i - \mu\|^2 = K (circles), in the (x_1, x_2) plane]

  – Expanding the quadratic term
      g_i(x) = -\frac{1}{2}\left[x^T \Sigma^{-1} x - 2\mu_i^T \Sigma^{-1} x + \mu_i^T \Sigma^{-1} \mu_i\right]
  – Removing the term x^T \Sigma^{-1} x, which is constant for all classes
      g_i(x) = -\frac{1}{2}\left[-2\mu_i^T \Sigma^{-1} x + \mu_i^T \Sigma^{-1} \mu_i\right] = w_i^T x + w_{i0}
  – So the DFs are still linear, and the decision boundaries are again hyper-planes
  – The equiprobable contours are hyper-ellipses aligned with the eigenvectors of \Sigma
  – This is known as a minimum (Mahalanobis) distance classifier

[Figure: minimum Mahalanobis distance classifier — x and \Sigma enter a bank of distance computations to \mu_1, \mu_2, ..., \mu_C; a minimum selector outputs the class]

• Example
  – Three-class 2D problem with equal priors
      \mu_1 = [3, 2]^T, \quad \mu_2 = [5, 4]^T, \quad \mu_3 = [2, 5]^T, \quad \Sigma_1 = \Sigma_2 = \Sigma_3 = \begin{bmatrix}1 & 0.7\\ 0.7 & 2\end{bmatrix}

Case 4: \Sigma_i = \sigma_i^2 I

• In this case, each class has a different covariance matrix, which is proportional to the identity matrix
  – The quadratic discriminant becomes
      g_i(x) = -\frac{1}{2\sigma_i^2}(x-\mu_i)^T(x-\mu_i) - \frac{n}{2}\log\sigma_i^2 + \log P(\omega_i)
  – This expression cannot be reduced further
  – The decision boundaries are quadratic: hyper-spheres, since the quadratic term is proportional to the identity matrix
  – The equiprobable contours are hyper-spheres centered at the class means

• Example
  – Three-class 2D problem with equal priors
      \mu_1 = [3, 2]^T, \ \Sigma_1 = 0.5I; \quad \mu_2 = [5, 4]^T, \ \Sigma_2 = I; \quad \mu_3 = [2, 5]^T, \ \Sigma_3 = 2I
  – [Figure: decision boundaries, with a zoomed-out view]

Case 5: \Sigma_i \neq \Sigma_j (general case)

• We already derived the expression for the general case
      g_i(x) = -\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1}(x-\mu_i) - \frac{1}{2}\log|\Sigma_i| + \log P(\omega_i)
  – Reorganizing terms in a quadratic form yields
      g_i(x) = x^T W_{2,i}\, x + w_{1,i}^T x + w_{0,i}
    where
      W_{2,i} = -\frac{1}{2}\Sigma_i^{-1}, \quad w_{1,i} = \Sigma_i^{-1}\mu_i, \quad w_{0,i} = -\frac{1}{2}\mu_i^T \Sigma_i^{-1}\mu_i - \frac{1}{2}\log|\Sigma_i| + \log P(\omega_i)
  – The equiprobable contours are hyper-ellipses, oriented with the eigenvectors of \Sigma_i for that class
  – The decision boundaries are again quadratic: hyper-ellipses or hyper-paraboloids
  – Notice that the quadratic term in the discriminant is proportional to the Mahalanobis distance for covariance \Sigma_i (a minimal implementation sketch of this discriminant appears after the numerical example below)

• Example
  – Three-class 2D problem with equal priors
      \mu_1 = [3, 2]^T, \ \Sigma_1 = \begin{bmatrix}1 & -1\\ -1 & 2\end{bmatrix}; \quad \mu_2 = [5, 4]^T, \ \Sigma_2 = \begin{bmatrix}1 & -1\\ -1 & 7\end{bmatrix}; \quad \mu_3 = [3, 4]^T, \ \Sigma_3 = \begin{bmatrix}0.5 & 0.5\\ 0.5 & 3\end{bmatrix}
  – [Figure: decision boundaries, with a zoomed-out view]

Numerical example

• Derive a linear DF for the following 2-class, 3D problem
    \mu_1 = [0, 0, 0]^T, \quad \mu_2 = [1, 1, 1]^T, \quad \Sigma_1 = \Sigma_2 = \tfrac{1}{4}I, \quad P(\omega_2) = 2P(\omega_1)
  – Solution: since \sigma^2 = 1/4, we have \frac{1}{2\sigma^2} = 2, so
      g_1(x) = -2\left[x^2 + y^2 + z^2\right] + \log P(\omega_1)
      g_2(x) = -2\left[(x-1)^2 + (y-1)^2 + (z-1)^2\right] + \log 2P(\omega_1)
  – Setting g_1(x) \gtrless g_2(x) and cancelling the quadratic terms, which are common to both sides,
      0 \gtrless 4(x+y+z) - 6 + \log 2
    so we choose \omega_1 when
      x + y + z < \frac{6 - \log 2}{4} \approx 1.33
    and \omega_2 otherwise; the decision boundary is the hyper-plane x + y + z = 1.33
  – Classify the test example x_u = [0.1, 0.7, 0.8]^T
      0.1 + 0.7 + 0.8 = 1.6 > 1.33 \ \Rightarrow\ x_u \in \omega_2
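As a sanity check, the snippet below (not part of the original lecture; the priors are taken as P(\omega_1) = 1/3 and P(\omega_2) = 2/3 to satisfy the stated ratio) evaluates both discriminants at x_u and confirms the threshold and the class assignment.

    import numpy as np

    # Numerical check of the 2-class example: mu1 = 0, mu2 = 1 on each axis,
    # Sigma = 0.25*I, and P(w2) = 2*P(w1) (taken as 1/3 and 2/3 here).
    mu1, mu2 = np.zeros(3), np.ones(3)
    P1, P2 = 1/3, 2/3
    sigma2 = 0.25

    def g(x, mu, prior):
        # g_i(x) = -||x - mu_i||^2 / (2 sigma^2) + log P(w_i)
        return -np.sum((x - mu) ** 2) / (2 * sigma2) + np.log(prior)

    x_u = np.array([0.1, 0.7, 0.8])
    print(g(x_u, mu1, P1), g(x_u, mu2, P2))   # -3.38 < -2.29, so w2 wins
    print((6 - np.log(2)) / 4)                # threshold ~1.327 < sum(x_u) = 1.6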
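To tie the five cases together, here is a minimal numpy sketch, again not from the original lecture, of the general-case quadratic discriminant of Case 5; Cases 1–4 follow by constraining the \Sigma_i. The class parameters are those of the Case 5 example above, and the test point is invented.

    import numpy as np

    def quadratic_discriminant(x, mu, Sigma, prior):
        # g_i(x) = -0.5 (x-mu)^T Sigma^{-1} (x-mu) - 0.5 log|Sigma| + log P(w_i)
        diff = x - mu
        maha2 = diff @ np.linalg.solve(Sigma, diff)   # squared Mahalanobis distance
        _, logdet = np.linalg.slogdet(Sigma)          # numerically stable log|Sigma|
        return -0.5 * maha2 - 0.5 * logdet + np.log(prior)

    mus    = [np.array([3., 2.]), np.array([5., 4.]), np.array([3., 4.])]
    Sigmas = [np.array([[1., -1.], [-1., 2.]]),
              np.array([[1., -1.], [-1., 7.]]),
              np.array([[0.5, 0.5], [0.5, 3.]])]

    x = np.array([4.0, 3.0])                          # invented test point
    g = [quadratic_discriminant(x, m, S, 1/3) for m, S in zip(mus, Sigmas)]
    print(int(np.argmax(g)))                          # Bayes decision: largest g_i(x)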
Conclusions

• The examples in this lecture illustrate the following points
  – The Bayes classifier for Gaussian classes (general case) is quadratic
  – The Bayes classifier for Gaussian classes with equal covariance matrices is linear
  – The minimum-Mahalanobis-distance classifier is Bayes-optimal for
    • normally distributed classes,
    • equal covariance matrices, and
    • equal priors
  – The minimum-Euclidean-distance classifier is Bayes-optimal for
    • normally distributed classes,
    • equal covariance matrices proportional to the identity matrix, and
    • equal priors
  – Both the Euclidean- and Mahalanobis-distance classifiers are linear classifiers (a small numerical check of the Mahalanobis case appears below)
• Thus, some of the simplest and most popular classifiers can be derived from decision-theoretic principles
  – Using a particular minimum-distance classifier (Euclidean or Mahalanobis) implicitly corresponds to certain statistical assumptions
  – Whether these assumptions hold can rarely be answered in practice; in most cases, we can only determine whether the classifier solves our problem
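The linearity of the Mahalanobis-distance classifier can be verified numerically: with a shared covariance, the linear form w_i^T x + w_{i0} derived in Case 3 must select the same class as the nearest Mahalanobis mean. A small sketch, not from the original lecture, using the Case 3 example parameters and random test points:

    import numpy as np

    rng = np.random.default_rng(0)
    Sigma = np.array([[1.0, 0.7], [0.7, 2.0]])       # shared covariance (Case 3 example)
    mus = np.array([[3., 2.], [5., 4.], [2., 5.]])   # class means (Case 3 example)
    Si = np.linalg.inv(Sigma)

    for _ in range(1000):
        x = rng.uniform(0.0, 7.0, size=2)            # random test point
        maha2 = np.array([(x - m) @ Si @ (x - m) for m in mus])
        lin = np.array([m @ Si @ x - 0.5 * m @ Si @ m for m in mus])  # w_i^T x + w_i0
        assert lin.argmax() == maha2.argmin()        # same decision for every point
    print("linear form and Mahalanobis rule agree")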