Automatic Image Annotation Using Group Sparsity

Shaoting Zhang1, Junzhou Huang1, Yuchi Huang1, Yang Yu1, Hongsheng Li2, Dimitris Metaxas1
1 CBIM, Rutgers University, NJ
2 IDEA Lab, Lehigh University, PA

Introduction
• Goal: image annotation automatically assigns relevant text keywords to any given image, reflecting its content.
• Previous methods:
  – Topic models [Barnard et al., J. Mach. Learn. Res. '03; Putthividhya et al., CVPR '10]
  – Mixture models [Carneiro et al., TPAMI '07; Feng et al., CVPR '04]
  – Discriminative models [Grangier et al., TPAMI '08; Hertz et al., CVPR '04]
  – Nearest-neighbor-based methods [Makadia et al., ECCV '08; Guillaumin et al., ICCV '09]

Introduction
• Limitations:
  – Features are often preselected, yet the properties of different features and feature combinations are not well investigated in the image annotation task.
  – Feature selection is not well investigated in this application.
• Our method and contributions:
  – Use feature selection to solve the annotation problem.
  – Use a clustering prior and a sparsity prior to guide the selection.

Outline
• Regularization based Feature Selection
  – Annotation framework
  – L2 norm regularization
  – L1 norm regularization
  – Group sparsity based regularization
• Obtain Image Pairs
• Experiments

Regularization based Feature Selection
• Given a list of similar/dissimilar image pairs (P1, P2)
[Figure: features F_P1 and F_P2 of each pair, and the resulting matrix X.]

Regularization based Feature Selection
$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \|Xw - Y\|_2^2$$
[Figure: X, w, and Y, with entries 1, -1, 1, 1 shown.]

Regularization based Feature Selection
• Annotation framework
[Figure: diagram with labels Weights, Similarity, Testing input, High similarity, Training data.]

Regularization based Feature Selection
$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \frac{1}{n}\|Xw - Y\|_2^2 + \lambda \|w\|_2^2$$
• L2 regularization
• Robust, solvable: $(X^T X + \lambda I)^{-1} X^T Y$
• No sparsity
[Figure: histogram of weights; axes: w, %.]

Regularization based Feature Selection
$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \frac{1}{n}\|Xw - Y\|_2^2 + \lambda \|w\|_1$$
• L1 regularization
• Convex optimization
• Basis pursuit, Grafting, Shooting, etc.
• Sparsity prior
[Figure: histogram of weights; axes: w, %.]

Regularization based Feature Selection
$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \frac{1}{n}\|Xw - Y\|_2^2 + \lambda \sum_{j=1}^{m} \|w_{G_j}\|_2$$
• Group sparsity[1]
• L2 inside the same group, L1 for different groups
• Benefits: removal of whole feature groups
• Projected-gradient[2]
[Figure: feature groups RGB and HSV, labeled = 0 and ≠ 0.]
[1] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 68:49–67, 2006.
[2] E. Berg, M. Schmidt, M. Friedlander, and K. Murphy. Group sparsity via linear-time projection. Technical report TR-2008-09, 2008. http://www.cs.ubc.ca/~murphyk/Software/L1CRF/index.html

Outline
• Regularization based Feature Selection
• Obtain Image Pairs
  – Only rely on keyword similarity
  – Also rely on feedback information
• Experiments

Obtain Image Pairs
• Previous method[1] solely relies on keyword similarity, which induces a lot of noise.
[Figure: distance histogram of similar pairs; distance histogram of all pairs.]
[1] A. Makadia, V. Pavlovic, and S. Kumar. A new baseline for image annotation. In ECCV, pages 316–329, 2008.

Obtain Image Pairs
• Inspired by the relevance feedback and the expectation maximization method.
[Figure: k1 nearest (candidates of similar pairs); k2 farthest (candidates of dissimilar pairs).]
$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \frac{1}{n}\|Xw - Y\|_2^2 + \lambda \sum_{j=1}^{m} \|w_{G_j}\|_2$$

Outline
• Regularization based Feature Selection
• Obtain Image Pairs
• Experiments
  – Experimental settings
  – Evaluation of regularization methods
  – Evaluation of generality
  – Some annotation results

Experimental Settings
• Data protocols
  – Corel5K (5k images)
  – IAPR TC12[1] (20k images)
• Evaluation
  – Average precision
  – Average recall
  – #keywords recalled (N+)
[1] M. Grubinger, P. D. Clough, H. Muller, and T. Deselaers. The IAPR TC-12 benchmark - a new evaluation resource for visual information systems. 2006.
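
Sketch: evaluation measures
The three measures listed under Evaluation can be illustrated with a short sketch. It assumes the per-keyword protocol that is common in the annotation literature: precision and recall are computed for each keyword, then averaged over the vocabulary, and N+ counts the keywords with non-zero recall. The slides do not spell out this protocol, so treat it as an assumption; the function name and the toy annotations are hypothetical.

```python
def annotation_metrics(predicted, ground_truth, vocabulary):
    """Return (average precision, average recall, N+) over the vocabulary.

    predicted / ground_truth: one set of keywords per test image.
    Assumed protocol: per-keyword precision and recall, averaged over keywords.
    """
    precisions, recalls = [], []
    n_plus = 0
    for word in vocabulary:
        n_pred = sum(1 for tags in predicted if word in tags)
        n_true = sum(1 for tags in ground_truth if word in tags)
        n_hit = sum(
            1 for p, g in zip(predicted, ground_truth) if word in p and word in g
        )
        # A keyword that is never predicted (or never present) scores 0.
        precisions.append(n_hit / n_pred if n_pred else 0.0)
        recalls.append(n_hit / n_true if n_true else 0.0)
        if n_hit > 0:
            n_plus += 1  # keyword recalled at least once
    k = len(vocabulary)
    return sum(precisions) / k, sum(recalls) / k, n_plus


# Toy data (made up for illustration only).
vocabulary = ["sky", "tree", "water", "car"]
ground_truth = [{"sky", "tree"}, {"sky", "water"}, {"car"}]
predicted = [{"sky", "tree"}, {"sky", "tree"}, {"water"}]

avg_precision, avg_recall, n_plus = annotation_metrics(
    predicted, ground_truth, vocabulary
)
print(avg_precision, avg_recall, n_plus)
```

Averaging per keyword rather than per image keeps frequent keywords from dominating the score, which is why N+ is usually reported alongside the two averages.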
Experimental Settings
• Features
  – RGB, HSV, LAB
  – Opponent
  – rghistogram
  – Transformed color distribution
  – Color from Saliency[1]
  – Haar, Gabor[2]
  – SIFT[3], HOG[4]
[1] X. Hou and L. Zhang. Saliency detection: A spectral residual approach. In CVPR, 2007.
[2] A. Makadia, V. Pavlovic, and S. Kumar. A new baseline for image annotation. In ECCV, pages 316–329, 2008.
[3] K. van de Sande, T. Gevers, and C. Snoek. Evaluating color descriptors for object and scene recognition. PAMI, 99(1), 2010.
[4] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, pages 886–893, 2005.

Evaluation of Regularization Methods
[Figure: precision, recall, and N+ on Corel5K and IAPR TC12; axis label: ||w||.]

Evaluation of Generality
• Weights computed from Corel5K, then applied to IAPR TC12.
[Figure: precision, recall, and N+; axis label: λ.]

Some Annotation Results

Conclusions and Future Work
• Conclusions
  – Proposed a feature selection framework that uses both sparsity and clustering priors to annotate images.
  – The sparse solution improves scalability.
  – Image pairs from relevance feedback perform much better than pairs obtained from keyword similarity alone.
• Future work
  – Different grouping methods.
  – Automatically find groups (dynamic group sparsity).
  – More priors (combine with other methods).
  – Extend this framework to object recognition.

Thanks for listening
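
Backup: group sparsity sketch
As a supplement to the group sparsity slide, here is a minimal sketch of how the objective (1/n)||Xw - Y||² + λ Σ_j ||w_Gj||₂ can zero out a whole feature group. It uses proximal gradient descent with block soft-thresholding, which is a swapped-in technique and not the projected-gradient solver cited in the talk. The function names, the synthetic ±1 data, the grouping, and the value of λ are all hypothetical, and the groups are assumed to partition the feature indices.

```python
import math


def group_soft_threshold(v, threshold):
    """Proximal operator of threshold * ||v||_2: shrink or zero the whole block."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= threshold:
        return [0.0] * len(v)  # the entire group is removed
    scale = 1.0 - threshold / norm
    return [scale * x for x in v]


def group_sparse_regression(X, Y, groups, lam, n_iter=300):
    """Minimize (1/n)||Xw - Y||^2 + lam * sum_j ||w_Gj||_2 by proximal gradient."""
    n, p = len(X), len(X[0])
    # Conservative step size: the Lipschitz constant of the gradient is
    # at most (2/n) * ||X||_F^2.
    step = n / (2.0 * sum(x * x for row in X for x in row))
    w = [0.0] * p
    for _ in range(n_iter):
        residual = [sum(X[i][j] * w[j] for j in range(p)) - Y[i] for i in range(n)]
        grad = [
            (2.0 / n) * sum(X[i][j] * residual[i] for i in range(n))
            for j in range(p)
        ]
        v = [w[j] - step * grad[j] for j in range(p)]
        for idx in groups:
            block = group_soft_threshold([v[j] for j in idx], step * lam)
            for j, value in zip(idx, block):
                w[j] = value
    return w


# Synthetic data: 8 pairs, 4 features in two groups (made up for illustration).
X = [
    [1, 1, 1, 1],
    [-1, 1, -1, 1],
    [1, -1, -1, 1],
    [-1, -1, 1, 1],
    [1, 1, 1, -1],
    [-1, 1, -1, -1],
    [1, -1, -1, -1],
    [-1, -1, 1, -1],
]
Y = [1, -1, 1, -1, 1, -1, 1, 1]  # +1 similar, -1 dissimilar (assumed labels)
groups = [[0, 1], [2, 3]]

w = group_sparse_regression(X, Y, groups, lam=1.0)
print(w)
```

With this toy data and λ = 1, the second group is only weakly correlated with Y, so block soft-thresholding sets both of its weights to exactly zero while the first group keeps non-zero weights. This mirrors the "removal of whole feature groups" benefit: the L2 norm inside a group ties its weights together, and the L1-style sum across groups selects among groups.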