An Overview of Multiple Instance Learning Using Diverse Density
Yixin Chen
Dept. of Computer Science and Engineering, Pennsylvania State University

Outline
- Introduction
- Multiple-Instance Learning
- Diverse Density
- Concept Classes
- Concept Learning using Diverse Density
- Two Examples
- Comments

Introduction
- Supervised learning: classification, regression (decision trees, nearest neighbor, ANN, SVM)
- Unsupervised learning: clustering, PCA, ICA
- Learning from partially labeled data

Multiple-Instance Learning
- Each example is labeled.
- An example is not a single feature vector, but a collection of instances. Such a collection is called a bag.
- Each instance is described by a feature vector.
- The number of instances in a bag varies.
- Negative bag: all instances in it are negative.
- Positive bag: at least one of the instances in it is positive.
- Example applications: drug discovery, stock prediction, image retrieval.

Multiple-Instance Learning
- With one instance per bag, the problem reduces to regular supervised learning.
- Treating every instance in a positive (negative) bag as positive (negative) does not work.
- Concatenating all instances together does not work either.

Diverse Density
- Treat bags as sets; quantify the intersection of the positive bags minus the union of the negative bags.
- Use soft versions of intersection, union, and difference.
- Think of the instances and bags as coming from some probability distribution.
- The location of an instance is treated as evidence for the location of the concept.

Diverse Density
- Assign every possible concept a measure of "goodness": its Diverse Density.
- Diverse Density measures not merely a co-occurrence of samples (i.e., an intersection of instances), but a co-occurrence of instances from different (diverse) positive bags.
- The Diverse Density at a point measures how many different positive bags have instances near that point, and how far the negative instances are from that point.
- Use Diverse Density to generate a concept from multiple-instance examples.
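The slides reference the "goodness" measure without showing its formula. As a reconstruction (following Maron and Lozano-Pérez's original formulation, not taken from these slides), the Diverse Density of a candidate concept t given positive bags B_i+ and negative bags B_i- can be written as:

```latex
% Diverse Density of a candidate concept t (maximum-likelihood form):
DD(t) = \prod_i \Pr\!\left(t \mid B_i^{+}\right) \, \prod_i \Pr\!\left(t \mid B_i^{-}\right)

% Noisy-or model for the bag-level probabilities, where B_{ij} denotes
% the j-th instance of bag i:
\Pr\!\left(t \mid B_i^{+}\right) = 1 - \prod_j \left(1 - e^{-\|B_{ij}^{+} - t\|^2}\right)
\Pr\!\left(t \mid B_i^{-}\right) = \prod_j \left(1 - e^{-\|B_{ij}^{-} - t\|^2}\right)
```

A positive bag contributes a high value when any of its instances is close to t; a negative bag contributes a high value only when all of its instances are far from t, which is exactly the "intersection of positive bags minus union of negative bags" intuition above.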
Concept Classes
- Single-point concept class: every concept corresponds to a single point in feature space. Every positive bag has at least one instance generated by the true concept, corrupted by Gaussian noise.
- Single point-and-scaling concept class: additionally takes the scaling of the dimensions into account. Every positive bag has at least one instance generated by the true concept, corrupted by Gaussian noise with a diagonal covariance matrix.
- Disjunctive point-and-scaling concept class: more complicated concepts formed by allowing a disjunction of d single-point concepts. A bag is positive if at least one of its instances matches one of the d concepts; a bag is negative if none of its instances match any of them.

Concept Learning using Diverse Density
- Maximize Diverse Density using gradient-based optimization.
- Restart from multiple starting points to escape local maxima.
- Learning disjunctive concepts is computationally expensive.

Concept Learning using Diverse Density: EM-DD
- E-step: the current concept is used to pick, from each bag, the one instance most likely to be responsible for the bag's label.
- M-step: gradient ascent finds a new concept that maximizes the Diverse Density defined on the instances chosen in the E-step.
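The maximization procedure described above can be sketched in a few lines. This is a minimal illustration for the single-point concept class under the noisy-or model, restarted from every positive instance as the slides suggest; the function names and the synthetic data are illustrative, not from the original work.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_dd(t, pos_bags, neg_bags):
    """Negative log Diverse Density of candidate concept t (noisy-or model)."""
    nll = 0.0
    for bag in pos_bags:
        # Pr(t | B_i+) = 1 - prod_j (1 - exp(-||x_j - t||^2))
        p = 1.0 - np.prod(1.0 - np.exp(-np.sum((bag - t) ** 2, axis=1)))
        nll -= np.log(max(p, 1e-12))
    for bag in neg_bags:
        # Pr(t | B_i-) = prod_j (1 - exp(-||x_j - t||^2))
        p = np.prod(1.0 - np.exp(-np.sum((bag - t) ** 2, axis=1)))
        nll -= np.log(max(p, 1e-12))
    return nll

def learn_concept(pos_bags, neg_bags):
    """Gradient-based maximization, restarted from every positive instance
    to escape local maxima (the multiple-starting-points strategy)."""
    best_t, best_val = None, np.inf
    for bag in pos_bags:
        for start in bag:
            res = minimize(neg_log_dd, start, args=(pos_bags, neg_bags))
            if res.fun < best_val:
                best_t, best_val = res.x, res.fun
    return best_t

# Tiny synthetic example: true concept at (5, 5); each positive bag has one
# instance near it plus distractors, negative bags contain only distractors.
rng = np.random.default_rng(0)
true_t = np.array([5.0, 5.0])
pos_bags = [np.vstack([true_t + 0.1 * rng.standard_normal(2),
                       rng.uniform(-10, 0, size=(3, 2))]) for _ in range(5)]
neg_bags = [rng.uniform(-10, 0, size=(4, 2)) for _ in range(5)]
concept = learn_concept(pos_bags, neg_bags)
```

On this synthetic data the recovered `concept` lands close to the true point (5, 5), since only that region is near an instance of every positive bag while staying far from all negative instances.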
Example 1: Image Classification
- Bag generation: image features from blocks and regions, using color and texture.
- Find the concept with maximal Diverse Density.
- Use the distance to the concept to classify images.

Example 1: Performance
- 120 positive images, 600 negative images, 30 runs.
- Mountain/non-mountain: error rate = 0.20
- Sunset/non-sunset: error rate = 0.11
- Waterfall/non-waterfall: error rate = 0.21

Example 2: Image Retrieval
- Bag generation: image features from blocks and regions, using color and texture.
- Find the concept with maximal Diverse Density.
- Use the distance to the concept to rank images.

Example 2: Performance
- 120 sunset images, 600 other images, 6 training examples.
- Among the top 120 retrieved images, precision = 70%.

Comments
- The single point-and-scaling concept class is too simple.
- The disjunctive point-and-scaling concept class is too expensive.
- Rule-based composite concepts look promising.
- May not work for image retrieval.

References
- O. Maron, Learning from Ambiguity.
- T. Dietterich et al., Solving the Multiple-Instance Problem with Axis-Parallel Rectangles.
- Q. Zhang et al., EM-DD: An Improved Multiple-Instance Learning Technique.
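As a footnote to the retrieval example above ("use the distance to the concept to rank images"), one common choice of bag-to-concept distance is the minimum over a bag's instances. Everything below is a hypothetical sketch of that ranking step, not the system evaluated in the slides.

```python
import numpy as np

def bag_distance(bag, t):
    """Distance from a bag (array of instances) to concept t:
    the closest instance determines the bag's distance."""
    return np.min(np.linalg.norm(bag - t, axis=1))

def rank_images(bags, t):
    """Return bag indices sorted from most to least relevant to concept t."""
    return sorted(range(len(bags)), key=lambda i: bag_distance(bags[i], t))

# Toy usage: bag 0 contains an instance near the concept, bag 1 does not.
t = np.array([1.0, 1.0])
bags = [np.array([[5.0, 5.0], [1.1, 0.9]]),
        np.array([[4.0, 4.0], [6.0, 6.0]])]
order = rank_images(bags, t)  # bag 0 ranks first
```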