What does a classifier do?
Just like linear regression: find weights such that the weighted sum of inputs matches a desired output
• The inputs are multivoxel activation patterns
• The desired outputs are the stimulus conditions which elicited that activation
Then, apply a threshold to that output:
• If the weighted sum is less than zero, then the prediction is Class A
• If the weighted sum is greater than zero, then the prediction is Class B
Differences between classifiers are mostly a function of how the weights are calculated

A toy example of a classifier, using two feature dimensions
[Figure: sumo wrestlers and basketball players plotted on Height and Weight axes, separated by a classifier boundary]

A linear classifier draws a straight line
[Figure: same Height and Weight plot, with a straight classifier boundary]

Some dimensions can be more important than others
[Figure: same Height and Weight plot, with a classifier boundary]

A nonlinear classifier draws a not-straight line!
[Figure: same Height and Weight plot, with a curved classifier boundary]

Some patterns are more separable than others, given a particular set of measurements
[Figure: two panels on Height and Weight axes: sumo wrestlers vs. basketball players, and faculty vs. students]

Neural decoding, as it is usually done
[Figure from: Norman, Polyn, Detre & Haxby (2006), Trends in CogSci, 10(9), 424-30]

Cross-validation: Training and testing sets

Cross-validation in everyday life
Replication in science (sort of)
• Original study: training set
• Replication attempt: testing set

Generalisation, by classifiers and by the brain

Overfitting in everyday life
http://content.time.com/time/specials/packages/article/0,28804,1856094_1856096_1856102,00.html

Problem: With so many feature dimensions, you can fit anything you want
Solution: Cross-validation
• Divide the data into a training set and a testing set
• The classifier succeeds only if it is able to find an underlying regularity that is shared across the training and testing sets

Overfitting in 2-D
http://commons.wikimedia.org/wiki/File:Overfitting.svg

The multivariate approach in genetics
What’s the gene for disease X?
Unless X is a monogenic disease, e.g. Huntington’s, there is no single gene for X
You need to screen multiple genes simultaneously, and try to predict disease occurrence from that multivariate dataset
• E.g. Clarke R, Ressom HW, Wang A, Xuan J, Liu MC, Gehan EA, Wang Y. (2008) The properties of high-dimensional data spaces: implications for exploring gene and protein expression data. Nature Reviews Cancer. 8(1), 37-49.
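
A worked toy example
The classifier and cross-validation steps described in these slides (weighted sum, threshold at zero, separate training and testing sets) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular toolbox: the height and weight numbers are made up, the function names are hypothetical, and the weights are calculated as the difference between the two class means, which is just one simple choice among many. An offset term is also assumed, so that the zero threshold falls midway between the class means.

```python
# Toy data: made-up (height in cm, weight in kg) values, for illustration only.
SUMO = [(180, 150), (185, 160), (178, 145), (183, 170), (175, 140), (188, 165)]
BASKETBALL = [(200, 100), (205, 105), (198, 95), (210, 110), (195, 90), (203, 98)]

PATTERNS = SUMO + BASKETBALL
LABELS = ["A"] * len(SUMO) + ["B"] * len(BASKETBALL)  # A = sumo, B = basketball


def mean_pattern(patterns):
    """Average each feature dimension across a list of patterns."""
    n = len(patterns)
    return [sum(p[d] for p in patterns) / n for d in range(len(patterns[0]))]


def train(patterns, labels):
    """Calculate weights as the difference between the two class means.

    This is only one simple weight rule, assumed for this sketch. The offset
    places the zero threshold midway between the class means.
    """
    mean_a = mean_pattern([p for p, y in zip(patterns, labels) if y == "A"])
    mean_b = mean_pattern([p for p, y in zip(patterns, labels) if y == "B"])
    weights = [b - a for a, b in zip(mean_a, mean_b)]
    offset = -sum(w * (a + b) / 2 for w, a, b in zip(weights, mean_a, mean_b))
    return weights, offset


def predict(weights, offset, pattern):
    """Threshold the weighted sum at zero: below is Class A, otherwise Class B."""
    weighted_sum = sum(w * x for w, x in zip(weights, pattern)) + offset
    return "A" if weighted_sum < 0 else "B"


def cross_validate(patterns, labels, n_folds):
    """Train on all folds but one, test on the held-out fold, and repeat."""
    n_correct = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(len(patterns)) if i % n_folds == fold]
        train_idx = [i for i in range(len(patterns)) if i % n_folds != fold]
        weights, offset = train([patterns[i] for i in train_idx],
                                [labels[i] for i in train_idx])
        n_correct += sum(predict(weights, offset, patterns[i]) == labels[i]
                         for i in test_idx)
    return n_correct / len(patterns)


if __name__ == "__main__":
    accuracy = cross_validate(PATTERNS, LABELS, n_folds=4)
    print(f"Cross-validated accuracy: {accuracy:.2f}")
```

Because these toy classes are well separated, every held-out pattern is classified correctly. The point of the sketch is the procedure: the weights are always calculated without seeing the patterns they are tested on, so success requires a regularity shared across training and testing sets.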