Active hashing and its application to
image and text retrieval
Yi Zhen, Dit-Yan Yeung, Published in DMKD Feb 2012
Presented by
Arshad Jamal, Rajesh Dhania, Vinkal Vishnoi
Introduction
Computing similarity plays a fundamental role in many retrieval applications
Hashing-based methods have gained popularity for large-scale similarity search
[Figure: taxonomy of similarity-search methods]
Tree-based methods are suitable only for low dimensions
Hashing-based methods are either data independent or data dependent; data-dependent methods are unsupervised or semi-supervised
This paper proposes a novel framework for active hashing
Related work
Locality Sensitive Hashing [Andoni A, Indyk P (2006)]
Goal is to assign similar binary codes to data points that are close in feature space (random linear projection + thresholding)
Code length can become quite large
Spectral Hashing [Weiss Y, Torralba A, Fergus R (2008)]
Performs spectral decomposition to learn hash functions
Assumes data to be uniformly distributed
Active Learning
Identify and present the most informative unlabeled data to
human experts for labeling
Related Work: Semi-supervised
Hashing [Wang J, Kumar S, Chang S-F (2010a)]
Given N normalized data points of D dimensions
Learn K hash functions to generate a K-bit binary code: $h_k(x) = \mathrm{sgn}(w_k^\top x)$
Build two sets of point pairs: S (similar) and D (dissimilar)
Together they characterize the semantic similarity
Hash functions $H = \{h_k\}_{k=1}^{K}$ are learned by maximizing the objective function
$J(H) = \sum_{k=1}^{K} \Big[ \sum_{(x_i, x_j) \in S} h_k(x_i)\, h_k(x_j) - \sum_{(x_i, x_j) \in D} h_k(x_i)\, h_k(x_j) \Big]$
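To make the objective concrete, here is a minimal NumPy sketch (our own illustration; the names W, X, S_pairs, and D_pairs are assumed, not from the paper) that evaluates J(H) for the linear hash functions above:

```python
import numpy as np

def ssh_objective(W, X, S_pairs, D_pairs):
    """Evaluate the SSH objective J(H) for linear hash functions
    h_k(x) = sgn(w_k^T x).

    W       : (dim, K) projections, one column per hash function
    X       : (N, dim) data matrix, one row per point
    S_pairs : (i, j) index pairs that are semantically similar
    D_pairs : (i, j) index pairs that are dissimilar
    """
    H = np.sign(X @ W)           # (N, K) matrix of +/-1 hash bits
    J = 0.0
    for i, j in S_pairs:
        J += np.dot(H[i], H[j])  # reward similar pairs whose bits agree
    for i, j in D_pairs:
        J -= np.dot(H[i], H[j])  # penalize dissimilar pairs whose bits agree
    return J
```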
Limitations of SSH
Point pairs from both S and D are treated as equally important
For multi-class data, dissimilar pairs drawn from a nearby class and from a farther class contribute the same weight
Points that are already highly dissimilar add little information and can spoil the learned hash functions
[Figure: multi-class data with classes C1, C2, C3 at different distances]
Active Hashing (Greedy AH)
Tries to overcome the limitations of SSH by picking the most informative points
Algorithm: three main steps
Given labeled data points L, unlabeled data points U, and a candidate set C:
1. Select the most informative points A from C
2. Get A labeled by an expert; update L, U, C
3. Train the hash functions based on L and U
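A minimal Python sketch of this loop (our own illustration; select, label, and train_ssh are placeholder callables, not the paper's API):

```python
def greedy_active_hashing(L, U, C, select, label, train_ssh, rounds):
    """Skeleton of the greedy active hashing loop.

    L, U, C   : labeled set, unlabeled set, and candidate set of points
    select    : picks the most informative points A from C
    label     : oracle (human expert) returning the points of A with labels
    train_ssh : trains the SSH hash functions from labeled + unlabeled data
    """
    model = train_ssh(L, U)
    for _ in range(rounds):
        A = select(model, C)      # step 1: most informative points from C
        L = L | label(A)          # step 2: expert labels A, then
        U, C = U - A, C - A       #         update L, U, and C
        model = train_ssh(L, U)   # step 3: retrain the hash functions
    return model
```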
Greedy AH: Selecting data points
Based on the SSH model, the hash function is $h_k(x) = \mathrm{sgn}(w_k^\top x)$
Intuitively, the magnitude $|w_k^\top x|$ indicates the certainty of $x$ under the $k$-th hash function
Data certainty (DC): $f(H, x) = \| W^\top x \|_2$
Data points with the smallest $f$ are the most informative points
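A small NumPy sketch of this selection rule (our own illustration; W stacks the learned projections $w_k$ as columns):

```python
import numpy as np

def data_certainty(W, X):
    """Data certainty f(H, x) = ||W^T x||_2 for each row x of X."""
    return np.linalg.norm(X @ W, axis=1)

def select_informative(W, X_candidates, n):
    """Indices of the n candidates with the smallest certainty, i.e.
    the points the current hash functions are least sure about."""
    f = data_certainty(W, X_candidates)
    return np.argsort(f)[:n]
```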
Batch Mode Active Hashing
Selecting points one by one is inefficient and suboptimal
Instead, a set of points is selected and labeled in each round to learn the hash functions
$\min_{\mu}\; \mu^\top \tilde{f} + \mu^\top K \mu \quad \text{s.t.}\;\; \mathbf{1}^\top \mu = M,\;\; 0 \le \mu \le 1$
µ is an indicator vector deciding whether each candidate point is selected
$\tilde{f}$ is the vector of normalized certainty values over C
K is a positive semi-definite similarity matrix defined on C
Choose the M examples with the largest µ
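One simple way to realize this trade-off between low certainty and batch diversity is a greedy pass, sketched below; this is our own approximation with an assumed weight lam, not the paper's exact solver:

```python
import numpy as np

def bmah_select(f, K, M, lam=1.0):
    """Greedily pick M candidates that have low certainty and low
    similarity to the candidates already picked.

    f   : (n,) normalized certainty values over the candidate set C
    K   : (n, n) positive semi-definite similarity matrix on C
    M   : batch size
    lam : assumed trade-off weight between certainty and diversity
    """
    chosen = []
    cost = f.astype(float).copy()
    for _ in range(M):
        masked = cost.copy()
        masked[chosen] = np.inf        # never re-pick a point
        i = int(np.argmin(masked))     # lowest certainty + redundancy cost
        chosen.append(i)
        cost += lam * K[:, i]          # penalize candidates similar to i
    return chosen
```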
BMAH Algorithm
Algorithm: three main steps
Given labeled data points L, unlabeled data points U, and a candidate set C:
1. Select the most informative M points from C
2. Get them labeled by an expert; update L, U, C
3. Train the hash functions based on L and U
Experimental Evaluation I
Image retrieval (MNIST dataset): results reported for different parameter settings
Text retrieval (20 Newsgroups (NEWS) dataset)
Random vs BMAH: BMAH improves retrieval performance over random selection
Experimental Evaluation II
Image retrieval (MNIST dataset)
BMAH vs GAH: BMAH takes less time
References
Andoni A, Indyk P (2006) Near-optimal hashing algorithms for approximate
nearest neighbor in high dimensions. In: Proceedings of the 47th annual IEEE
symposium on foundations of computer science, FOCS ’06, IEEE Computer
Society, Washington, pp 459–468
Weiss Y, Torralba A, Fergus R (2008) Spectral hashing. In: Koller D, Schuurmans D, Bengio Y, Bottou L (eds) Advances in neural information processing systems 21, NIPS 21, The MIT Press, Cambridge, MA, pp 1753–1760
Wang J, Kumar S, Chang S-F (2010a) Semi-supervised hashing for scalable image retrieval. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 3424–3431
Salakhutdinov R, Hinton GE (2009) Semantic hashing. Int J Approx Reason
50:969–978
Thanks