
A Fast and Distortion Tolerant Hashing for Fingerprint
Image Authentication
Thi Hoi Le and The Duy Bui
Faculty of Information Technology
Vietnam National University, Hanoi
Abstract. Biometrics such as fingerprint, face, retina, and voice offer a means of reliable personal authentication and are now widely used in both forensic and civilian domains. In practice, however, it is difficult to design an accurate and fast biometric recognition system because biometric databases are large and biometric measures are complicated. In particular, fast fingerprint indexing is one of the most challenging problems faced by fingerprint authentication systems. In this paper, we advance the state of the art in this field by introducing a new robust indexing scheme that speeds up the fingerprint recognition process.
Keywords: fingerprint hashing, fingerprint authentication, error correcting code, image
authentication.
1 Introduction
With the development of the digital world, reliable personal authentication has become a major concern in human-computer interaction. National ID cards, electronic commerce, and access to computer networks are some scenarios where establishing a person's identity is crucial. Existing security measures rely on knowledge-based approaches such as passwords or token-based approaches such as magnetic cards and passports to control access to real and virtual societies. Though ubiquitous, such methods are not very secure. Worse, they may be shared or stolen easily. Passwords and PINs may even be stolen electronically. Furthermore, they cannot differentiate between an authorized user and a fraudulent impostor. In contrast, biometrics has the special characteristic that the user is the key; hence, it is not easily compromised or shared. Therefore, biometrics offers a means of reliable personal authentication that can address these problems and is gaining citizen and government acceptance.
Although significant progress has been made in fingerprint authentication systems, there are still a number of research issues that need to be addressed to improve system efficiency. Automatic fingerprint identification, which requires 1-N matching, is usually computationally demanding. For a small database, a common approach is to exhaustively match a query fingerprint against all the fingerprints in the database [21]. For a large database, however, this is not practical without an effective fingerprint indexing scheme.
There are two technical choices to reduce the number of comparisons and consequently to reduce the response time of the identification process: classification and
E. Corchado et al. (Eds.): CISIS 2008, ASC 53, pp. 266–273, 2009.
© Springer-Verlag Berlin Heidelberg 2009
springerlink.com
indexing techniques. Traditional classification techniques (e.g. [5], [16]) attempt to classify fingerprints into five classes: Right Loop (R), Left Loop (L), Whorl (W), Arch (A), and Tented Arch (T). The number of classes is small and real fingerprints are unevenly distributed among them: over 90% of fingerprints fall into only three classes (Loops and Whorl) [18]. As a result, such systems cannot reduce the search space enough. Indexing techniques perform better than classification in terms of the size of the space that needs to be searched. Fingerprint indexing algorithms select the most probable candidates and sort them by their similarity to the query. Many indexing algorithms have been proposed recently. A.K. Jain et al. [11] use the features around the core point of a Gabor-filtered image for indexing. Although this approach makes use of global information (the core point), the discrimination power of just one core is limited. In [2], the singular point (SP) is used to estimate the search priority, which results in a mean search space below 4% of the whole dataset. However, detecting singular points is a hard problem: some fingerprints do not even have SPs, and the uncertainty of the SP location is large [18]. Several other approaches to fingerprint indexing have also shown improvements. R. Cappelli et al. [6] proposed an approach that achieves reasonable performance and identification time. R.S. Germain et al. [9] use triplets of minutiae in their indexing procedure. J.D. Boer et al. [3] combine multiple features (orientation field, FingerCode, and minutiae triplets).
Another key point is that, to make an indexing algorithm more accurate, fingerprint distortion must be considered. There are two kinds of distortion: transformation distortion and system distortion. In particular, because fingerprint scanners can capture only partial fingerprints, some minutiae (the primary fingerprint features) are missing after the acquisition process. These distortions, including missing minutiae, cause several problems: (i) the number of minutia points available in such prints is small, which reduces their discrimination power; (ii) loss of singular points (core and delta) is likely [15]. Therefore, a robust indexing algorithm that is independent of such global features is required. However, most of the existing indexing schemes mentioned above perform indexing by using these global feature points.
In this paper, we propose a hashing scheme that achieves high efficiency by hashing localized features that can tolerate distortions. The scheme can operate on any localized features, such as minutiae or triplets. To avoid alignment, in this paper we use the triplet feature introduced by Tsai-Yang Jea et al. [15]. One of our main contributions is a definition of codewords for such fingerprint feature points; our hashing scheme operates on those codewords. By producing codewords for feature points, the scheme can tolerate the distortion of each point. Moreover, to reduce the number of candidates for the matching stage more efficiently, a randomized hash function scheme is used to reduce output collisions.
The paper is organized as follows. Section 2 introduces some notions and our definition of codeword for fingerprint feature points. Section 3 presents our hashing
scheme based on these codewords and a scheme to retrieve fingerprint from hashing
results. We present the experimental results of our scheme in Section 4.
2 Preliminaries
2.1 Error Correction Code
For a given choice of metric $d$, one can define error correction codes in the corresponding space $M$. A code is a subset $C = \{w_1, \ldots, w_K\} \subset M$. The set $C$ is sometimes called the codebook; its $K$ elements are the codewords. The (minimum) distance of a code is the smallest distance $d$ between two distinct codewords (according to the metric $d$). Given a codebook $C$, we can define a pair of functions $(C, D)$. The encoding function $C$ is an injective map from the elements of some domain of size $K$ to the elements of $C$. The decoding function $D$ maps any element $w \in M$ to the pre-image $C^{-1}(w_k)$ of the codeword $w_k$ that minimizes the distance $d(w, w_k)$. The error correcting distance is the largest radius $t$ such that for every element $w$ in $M$ there is at most one codeword in the ball of radius $t$ centered on $w$. For integer distance functions we have $t = \lfloor (d-1)/2 \rfloor$. A standard shorthand notation in coding theory is that of an $(M, K, t)$-code.
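As an illustration of these notions, the following minimal Python sketch decodes by nearest codeword under the Hamming metric. The 5-bit repetition codebook and the function names are assumptions chosen for the example; they do not come from the scheme proposed in this paper.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(codebook):
    """Smallest pairwise distance between distinct codewords."""
    return min(
        hamming(a, b)
        for i, a in enumerate(codebook)
        for b in codebook[i + 1:]
    )


def decode(w: str, codebook) -> int:
    """Index k of the codeword w_k nearest to w (ties go to the lowest index)."""
    return min(range(len(codebook)), key=lambda k: hamming(w, codebook[k]))


# Hypothetical codebook: the 5-bit repetition code with K = 2 codewords.
codebook = ["00000", "11111"]
d = min_distance(codebook)   # minimum distance of the code
t = (d - 1) // 2             # error correcting distance for an integer metric
```

With this codebook, any word with at most `t` flipped bits decodes back to the index of the codeword it came from.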
2.2 Fingerprint Feature Point
The primary feature point of a fingerprint is called a minutia. Minutiae are the various ridge discontinuities of a fingerprint. The two most widely used types of minutiae are bifurcations and endings (Fig. 1). A minutia contains only local information about the fingerprint. Each minutia is represented by its coordinates and orientation.
Fig. 1. (a) Ridge bifurcation. (b) Ridge ending [15].
The secondary feature is a five-element vector (Fig. 2). For each minutia $M_i(x_i, y_i, \theta_i)$ and its two nearest neighbors $N_0(x_{n0}, y_{n0}, \theta_{n0})$ and $N_1(x_{n1}, y_{n1}, \theta_{n1})$, the secondary feature is constructed by forming a vector $S_i(r_{i0}, r_{i1}, \varphi_{i0}, \varphi_{i1}, \delta_i)$, in which $r_{i0}$ and $r_{i1}$ are the Euclidean distances between the central minutia $M_i$ and its neighbors $N_0$ and $N_1$, respectively; $\varphi_{ik}$ is the orientation difference between $M_i$ and $N_k$, where $k$ is 0 or 1; and $\delta_i$ is the acute angle between the line segments $M_iN_0$ and $M_iN_1$. Note that $N_0$ and $N_1$ are the two nearest neighbors of the central minutia $M_i$ and are ordered not by their Euclidean distances but so as to satisfy $\overrightarrow{N_0M_i} \times \overrightarrow{N_1M_i} \ge 0$.
Fig. 2. Secondary feature of $M_i$, where $r_{i0}$ and $r_{i1}$ are the Euclidean distances between the central minutia $M_i$ and its neighbors $N_0$ and $N_1$, respectively; $\varphi_{ik}$ is the orientation difference between $M_i$ and $N_k$, where $k$ is 0 or 1; and $\delta_i$ is the acute angle between $M_iN_0$ and $M_iN_1$ [15].
N0 is the first and N1 is the second minutia encountered while traversing the angle
∠N0MiN1.
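The construction of the secondary feature can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the extraction algorithm of [15]: minutiae are taken to be distinct `(x, y, theta)` tuples with angles in radians, the orientation difference is assumed to be a signed difference wrapped to [-pi, pi), and the angle between the two segments is taken directly from `acos`, so it lies in [0, pi]. The function names are hypothetical.

```python
import math


def angle_diff(a: float, b: float) -> float:
    """Signed difference a - b wrapped to [-pi, pi) (an assumed convention)."""
    return (a - b + math.pi) % (2 * math.pi) - math.pi


def secondary_feature(m, minutiae):
    """Return (r0, r1, phi0, phi1, delta) for the central minutia m."""
    x, y, theta = m
    # The two nearest neighbours of m by Euclidean distance.
    others = sorted(
        (p for p in minutiae if p != m),
        key=lambda p: math.hypot(p[0] - x, p[1] - y),
    )
    a, b = others[:2]
    # Order the pair so that the cross product of the two segment
    # vectors is non-negative, as required by the text.
    cross = (a[0] - x) * (b[1] - y) - (a[1] - y) * (b[0] - x)
    n0, n1 = (a, b) if cross >= 0 else (b, a)

    v0 = (n0[0] - x, n0[1] - y)
    v1 = (n1[0] - x, n1[1] - y)
    r0, r1 = math.hypot(*v0), math.hypot(*v1)
    phi0 = angle_diff(theta, n0[2])
    phi1 = angle_diff(theta, n1[2])
    # Angle between the segments M-N0 and M-N1, clamped against rounding.
    cos_delta = (v0[0] * v1[0] + v0[1] * v1[1]) / (r0 * r1)
    delta = math.acos(max(-1.0, min(1.0, cos_delta)))
    return (r0, r1, phi0, phi1, delta)


# Hypothetical minutiae for illustration only.
points = [(0, 0, 0.0), (1, 0, 0.5), (0, 2, 1.0), (5, 5, 0.0)]
feature = secondary_feature(points[0], points)
```

Because only distances and angle differences are used, the resulting vector does not change when the whole fingerprint is translated or rotated, which is why no alignment step is needed.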
Given a matched reference minutiae pair $p_i$ and $q_j$, the feature point $p_i(r_{i,i'}, \Phi_{i,i'}, \theta_{i,i'})$ is said to match $q_j(r_{j,j'}, \Phi_{j,j'}, \theta_{j,j'})$ if $q_j$ is within the tolerance area of $p_i$. That is, for given threshold functions $Thld_r(\cdot)$, $Thld_\Phi(\cdot)$, and $Thld_\theta(\cdot)$, we require $|r_{i,i'} - r_{j,j'}| \le Thld_r(r_{i,i'})$, $|\Phi_{i,i'} - \Phi_{j,j'}| \le Thld_\Phi(\Phi_{i,i'})$, and $|\theta_{i,i'} - \theta_{j,j'}| \le Thld_\theta(\theta_{i,i'})$. Note that the thresholds are not predefined values but are adjustable according to $r_{i,i'}$ and $r_{j,j'}$. In this paper, we refer to this secondary fingerprint feature simply as a feature point.
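The matching condition can be written as a small predicate. In the sketch below, the threshold functions are hypothetical placeholders (the text does not specify them), and wrapping angle differences to [-pi, pi) is an added assumption.

```python
import math


def wrap(a: float) -> float:
    """Wrap an angle difference to [-pi, pi) (an assumed convention)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def features_match(p, q, thld_r, thld_phi, thld_theta) -> bool:
    """True if q lies within the tolerance area of p.

    p and q are (r, phi, theta) tuples; each threshold is a callable
    evaluated at the corresponding component of p.
    """
    r_p, phi_p, theta_p = p
    r_q, phi_q, theta_q = q
    return (
        abs(r_p - r_q) <= thld_r(r_p)
        and abs(wrap(phi_p - phi_q)) <= thld_phi(phi_p)
        and abs(wrap(theta_p - theta_q)) <= thld_theta(theta_p)
    )


# Hypothetical thresholds: the distance tolerance grows with r,
# the angular tolerances are constant.
thld_r = lambda r: 0.1 * r + 2.0
thld_phi = lambda phi: 0.2
thld_theta = lambda theta: 0.2

p = (50.0, 0.30, 1.0)
near = (54.0, 0.35, 1.1)
far = (60.0, 0.35, 1.1)
```

Here `near` falls inside the tolerance area of `p`, while `far` fails the distance test.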
2.3 Our Error Correction Code for Fingerprint Feature Point
Most existing error correction schemes are defined with respect to the Hamming distance and are used to correct messages at the bit level (e.g. parity check bits, repetition schemes, CRC). Therefore, we define a new error correction scheme for fingerprint feature points. In particular, we consider each feature x as a corrupted codeword and try to correct it to the correct codeword by using an encoding function C(x).
Definition 1. Let $x \in \mathbb{R}^D$, and let $C(x)$ be an encoding function of the error correction scheme. We define $C(x) = (q(x-t) \,|\, q(x) \,|\, q(x+t))$, where $q(x)$ is a quantization function of $x$ with quantization step $t$. We call the output of $C(x)$, $(c_{x-t}, c_x, c_{x+t})$, the codeword set of $x$, and $c_x = q(x)$ the nearest codeword of $x$, or the codeword of $x$ for short.
Lemma 1. Let $x \in \mathbb{R}$ and let $t \in \mathbb{R}$ be a given tolerance threshold. For every $y$ such that $|x-y| \le t$, the codeword of $y$, $c_y = q(y)$, is one of the three elements in the codeword set of $x$.
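Definition 1 and Lemma 1 can be checked numerically with a short Python sketch. The text does not fix a particular quantizer, so a uniform floor quantizer $q(x) = \lfloor x/t \rfloor$ is assumed here; the function names are hypothetical.

```python
import math


def q(x: float, t: float) -> int:
    """Assumed uniform quantizer: index of the cell [k*t, (k+1)*t) containing x."""
    return math.floor(x / t)


def codeword_set(x: float, t: float):
    """Codeword set (q(x - t), q(x), q(x + t)) of Definition 1."""
    return (q(x - t, t), q(x, t), q(x + t, t))


t = 4
x = 10
cset = codeword_set(x, t)   # (1, 2, 3)

# Lemma 1 on a small integer grid: every y with |x - y| <= t has its
# codeword inside the codeword set of x.
lemma_holds = all(
    q(y, t) in codeword_set(x0, t)
    for x0 in range(-20, 21)
    for y in range(x0 - t, x0 + t + 1)
)
```

Integer inputs are used for the check so that floating-point rounding at cell boundaries does not interfere.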
Lemma 1 can be proved easily by some algebraic transformations. In our approach, we generate the codeword set for every dimension of a template fingerprint feature point p, and only the codeword for every dimension of a query point q. From Lemma 1 and the definition of two matched feature points, we can see that if p and q are corresponding feature points of two versions of one fingerprint, the codeword of q will be an element of the codeword set of p.
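For completeness, one way to carry out the algebra behind Lemma 1 is sketched below. It assumes the floor quantizer $q(x) = \lfloor x/t \rfloor$ with $t > 0$, which the text does not state explicitly, and uses only that the floor function is non-decreasing and satisfies $\lfloor z \pm 1 \rfloor = \lfloor z \rfloor \pm 1$.

```latex
\begin{align*}
|x - y| \le t
  &\;\Longrightarrow\; \frac{x}{t} - 1 \;\le\; \frac{y}{t} \;\le\; \frac{x}{t} + 1 \\
  &\;\Longrightarrow\; \left\lfloor \frac{x}{t} \right\rfloor - 1
       \;\le\; \left\lfloor \frac{y}{t} \right\rfloor
       \;\le\; \left\lfloor \frac{x}{t} \right\rfloor + 1 \\
  &\;\Longrightarrow\; q(y) \in \{\, q(x) - 1,\; q(x),\; q(x) + 1 \,\}
       = \{\, q(x - t),\; q(x),\; q(x + t) \,\}.
\end{align*}
```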
3 Codeword-Based Hashing Scheme
We present a fingerprint hash generation scheme based on the codeword of the feature point. The key idea is to first "correct" the erroneous feature point to its codeword, and then use that codeword as the input of a randomized hash function. The hash function scatters the input set and ensures that the probability of collision of two feature points is closely related to the distance between their corresponding coordinate pairs (see the definition of two matched feature points).
3.1 Our Approach
Informal description. The fingerprint features stored in the database can be considered as a very large set. Moreover, the distribution of fingerprint feature points is unknown and not uniform. Therefore, we want to design a randomized hash scheme such that, for large input sets of fingerprint feature points, elements are unlikely to collide. Fortunately, some standard hash functions (e.g. MD5, SHA) can be randomized to satisfy the property of target collision resistance. Our scheme works as follows. First, the message bits are permuted by a random permutation, and then the hash of the resulting message is computed (by a compression function, e.g. SHA). A keyed permutation can be viewed as a block cipher whose output and input have the same length. Random permutations are widely used in cryptography because they possess two important properties: randomness, and efficient computation if the key is random. The basic idea is that, for any given set of inputs, the algorithm scatters the inputs well over the range of the function by means of a random permutation, so that the outputs are distributed more uniformly.
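The permute-then-hash step can be sketched in Python as follows. Several choices here are assumptions made for illustration: the permutation acts on bit positions, it is derived from the key with Python's `random.Random` (which is not a cryptographically secure generator), and SHA-256 from `hashlib` stands in for the unspecified SHA variant.

```python
import hashlib
import random


def bit_permutation(key: int, n: int):
    """Pseudo-random permutation of the bit positions 0..n-1, derived from key."""
    positions = list(range(n))
    random.Random(key).shuffle(positions)
    return positions


def permute_then_hash(bits: str, key: int) -> str:
    """Permute a '0'/'1' string with the keyed permutation, then hash it."""
    perm = bit_permutation(key, len(bits))
    permuted = "".join(bits[i] for i in perm)
    return hashlib.sha256(permuted.encode("ascii")).hexdigest()


digest = permute_then_hash("00110101", key=7)
```

The same key always yields the same permutation, so a template and a query that produce the same message also produce the same hash value. A production system would need a cryptographically secure keyed permutation instead of `random.Random`.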
To ensure the error-tolerance property, we perform hashing on the codeword (for query points) and on the whole codeword set (for template points) instead of on the feature point itself. By Lemma 1, if the query point y and the template point x match, then the codeword of y is an element of the codeword set of x. The converse holds only approximately: membership in the codeword set guarantees |x − y| < 2t rather than |x − y| ≤ t.
Formal description. We set up our hashing scheme as follows:
1. Choose random dimensions from (1, 2, …, D); in this way, we adjust the trade-off between the collisions of hash values and the size of our database.
2. Choose a tolerance threshold $t$ and an appropriate metric $l_d$ for the selected dimensions.
For each template point p:
1. Generate the codeword set for $p_d$ in each selected dimension $d$. These values are mapped to binary strings. The binary strings in the codeword sets of the selected dimensions are then concatenated to form $L^3$ binary strings $m_i$ for $i = 1, \ldots, L^3$, where $L$ is the number of selected dimensions.
2. Each $m_i$ is padded with zero bits to form an $n$-bit message $m_i$.
3. Generate a random key $K$ for a permutation $\pi$ of $\{0,1\}^n$.
4. $m_i$ is permuted by $\pi$ and the resulting message is hashed using a compression function such as SHA. Let $sh_i = SHA(\pi(m_i))$ for $i = 1, \ldots, L^3$.
5. There are at most $L^3$ hash values for p. These values can be stored either in a single hash table or in separate ones.
For query point q:
1. Compute the nearest codeword value for every selected dimension of q. Then map the value to a binary string.
2. The binary strings of the selected dimensions are then concatenated.
3. Perform steps 2 and 4 as for a template point, using the same permutation π. Note that there is only one binary string for query point q.
4. Return all the points that share an identical hash value with q.
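The template and query procedures above can be sketched end to end in Python. This is a minimal sketch under several assumptions, not the implementation used in the experiments: a floor quantizer with a single step `T_STEP` for all dimensions, a fixed choice of selected dimensions, 16-bit codewords, SHA-256 as the hash, and a permutation derived from `random.Random`. The sketch enumerates one codeword per selected dimension, which yields 3^L strings for L selected dimensions; that enumeration is itself an assumption about how the strings are formed. All constants and names are hypothetical.

```python
import hashlib
import itertools
import math
import random
from collections import defaultdict

T_STEP = 4.0       # tolerance threshold t (hypothetical value)
DIMS = (0, 1, 4)   # selected dimensions of the feature vector (hypothetical)
BITS = 16          # bits used to encode one codeword
N_BITS = 64        # padded message length n
KEY = 2008         # key K of the bit permutation

PERM = list(range(N_BITS))
random.Random(KEY).shuffle(PERM)   # same permutation for templates and queries


def q(x: float) -> int:
    """Assumed floor quantizer with step T_STEP."""
    return math.floor(x / T_STEP)


def hash_message(codewords) -> str:
    """Concatenate codewords as bits, zero-pad to n bits, permute, hash."""
    mask = (1 << BITS) - 1
    msg = "".join(format(c & mask, f"0{BITS}b") for c in codewords)
    msg = msg.ljust(N_BITS, "0")
    permuted = "".join(msg[i] for i in PERM)
    return hashlib.sha256(permuted.encode("ascii")).hexdigest()


def template_hashes(point):
    """Hash every combination of codewords drawn from the codeword sets."""
    # For a floor quantizer, (q(x) - 1, q(x), q(x) + 1) equals the
    # codeword set (q(x - t), q(x), q(x + t)) of Definition 1.
    per_dim = [(q(point[d]) - 1, q(point[d]), q(point[d]) + 1) for d in DIMS]
    return {hash_message(combo) for combo in itertools.product(*per_dim)}


def query_hash(point) -> str:
    """Hash the single string built from the nearest codewords."""
    return hash_message([q(point[d]) for d in DIMS])


table = defaultdict(list)   # hash value -> IDs of template points


def insert(point_id, point):
    for h in template_hashes(point):
        table[h].append(point_id)


def lookup(point):
    return table.get(query_hash(point), [])


# Hypothetical feature vectors (r0, r1, phi0, phi1, delta).
insert("tpl-0", (52.0, 37.5, 30.0, 80.0, 61.0))
hits = lookup((54.5, 35.0, 33.0, 70.0, 63.5))     # within t in the selected dims
misses = lookup((90.0, 37.5, 30.0, 80.0, 61.0))   # first dimension far away
```

The query differs from the template by 2.5 in each selected dimension, so it is retrieved with a single table lookup even though its codeword in the second dimension differs from the template's nearest codeword.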
For query evaluation, all candidate matches are returned by our hashing scheme for every query feature point. Hence, each query fingerprint is treated as a bag of points, simulating a multi-point query evaluation. To do this efficiently, we use the same framework as Ke et al. [17]: we maintain two auxiliary index structures, the File Table (FT) and the Keypoint Table (KT), to map feature points to their corresponding fingerprints; an entry in KT consists of the file ID (the index location in FT) and the feature point information.
The template fingerprints that have points sharing identical hash values (collisions) with the query are then ranked by the number of such shared points. Only the top T templates are selected for identifying the query fingerprint by matching. Thus, the search space is greatly reduced. The candidate selection process requires only linear computational cost, so it can be applied to online interactive querying on large image collections.
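The candidate selection step can be sketched as a simple voting procedure. The table contents below are hypothetical, the Keypoint Table is simplified to a mapping from point ID to file ID, and counting each template at most once per query point is an assumed design choice.

```python
from collections import Counter

# Hypothetical File Table (FT): file ID -> template fingerprint.
file_table = ["finger_A.png", "finger_B.png", "finger_C.png"]
# Hypothetical, simplified Keypoint Table (KT): point ID -> file ID.
keypoint_table = {0: 0, 1: 0, 2: 1, 3: 2, 4: 2, 5: 2}


def rank_templates(candidates_per_query_point, top_t):
    """Rank templates by the number of query points that collide with them."""
    votes = Counter()
    for candidates in candidates_per_query_point:
        # A template gets at most one vote per query point.
        for file_id in {keypoint_table[pid] for pid in candidates}:
            votes[file_id] += 1
    return [(file_table[f], n) for f, n in votes.most_common(top_t)]


# Point IDs returned by the hash lookup for each of four query points.
candidates = [[3, 0], [4], [5, 2], []]
best = rank_templates(candidates, top_t=1)
```

Only the templates returned by `rank_templates` would then be passed to the full matching algorithm.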
3.2 Analysis
Space complexity. For automated fingerprint identification, we must allocate extra storage for the hash values of template fingerprints. Assume that the number of $D$-dimensional feature points extracted from a template is $N$. With $L$ selected dimensions, the total extra space required for one template is $O(L^3 \cdot N)$. Thus, the hashing scheme requires a polynomial-sized data structure that allows sub-linear time retrieval of near neighbors, as shown in the following analysis.
Time complexity. For a query fingerprint $Y$ with $N$ feature points and a database of $M$ fingerprints, we must: compute the codeword for each feature point, which takes $O(N)$ because the quantization functions require constant time; compute the hash value, which takes $O(\mathrm{time}(f,x))$, where $f$ is the hash function used and $\mathrm{time}(f,x)$ is the time required by $f$ on input $x$; and compute the similarity scores of the templates that have any point sharing a hash value with the query fingerprint, which requires $O(M \cdot N / 2^n)$. Thus the key quantity is $O(M \cdot N / 2^n + \mathrm{time}(f,x))$, which is approximately $1/2^n$ of the computations of an exhaustive search.
4 Experiments
We evaluate our method on the FVC2004 database [19], which consists of 300 images, 3 prints each of 100 distinct fingers. The DB1_A database contains partial fingerprint templates of various sizes. These images were captured by an optical sensor with a resolution of 500 dpi, resulting in images of 300x200 pixels in 8-bit grayscale.
The full original templates are used to construct the database, while the remaining 7 impressions are used for hashing. We use the feature extraction algorithm described in [15] in our system. However, the authors do not describe in detail how to determine the error tolerance thresholds. Therefore, in our experiments, we assume that the error correction distance t is fixed for all fingerprint feature points. This assumption makes the implementation suboptimal: two feature points that are recognized as a "match" by the matching algorithm proposed in [15] may not share the same hash value in our experiments. The Euclidean distances between the corresponding dimensions of feature vectors are used in the quantization function.
Table 1 shows the Correct Index Power (CIP), which is defined as the percentage of correctly indexed queries as a function of the percentage of hypotheses that need to be searched in the verification step. Although our implementation is not optimal, the scheme still achieves good CIP results. As expected, the larger the search percentage, the better the results. This suggests that an optimal implementation could improve performance further.
Table 1. Correct Index Power of our algorithm

Search percentage    5%     10%    15%    20%
CIP                  80%    87%    94%    96%
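To make the CIP definition concrete, the sketch below computes it from the rank at which each query's true template appears in the candidate list. The ranks and the database size are made-up illustration values, not the experimental data behind Table 1, and the function name is hypothetical.

```python
import math


def correct_index_power(true_ranks, database_size, search_percentage):
    """Percentage of queries whose true template lies within the searched part.

    true_ranks holds, per query, the 1-based rank of the true template in the
    ranked candidate list, or None if it was not retrieved at all.
    """
    budget = math.floor(database_size * search_percentage / 100)
    hits = sum(1 for r in true_ranks if r is not None and r <= budget)
    return 100.0 * hits / len(true_ranks)


# Hypothetical ranks for ten queries against a database of 100 templates.
ranks = [1, 3, 12, None, 7, 25, 2, 9, 5, 40]
cip_5 = correct_index_power(ranks, 100, 5)
cip_10 = correct_index_power(ranks, 100, 10)
```

By construction the value can only grow as the search percentage increases, which matches the trend in Table 1.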
Compared with published experiments in the literature at a search percentage of 10%, [13] reports 84.5% CIP and [23] reaches 92.8% CIP. However, unlike previous works, our scheme is much simpler, and we expect that by adjusting t carefully it can reach 100% CIP at a low search percentage.
5 Conclusion
In this paper, we have presented a new robust approach to fingerprint indexing that is both accurate and fast. However, some work remains to be done to make the system more convincing and to obtain optimal results, such as studying optimal choices of the parameter t. Moreover, guaranteeing the privacy of fingerprint templates in any indexing scheme is another important problem that must be considered.
References
[1] Bazen, A.M., Gerez, S.H.: Fingerprint matching by thin-plate spline modeling of elastic
deformations. Pattern Recognition 36, 1859–1867 (2003)
[2] Bazen, A.M., Verwaaijen, G.T.B., Gerez, S.H., Veelenturf, L.P.J.: A correlation-based fingerprint verification system. In: ProRISC 2000 Workshop on Circuits, Systems and Signal Processing (2000)
[3] Boer, J., Bazen, A., Gerez, S.: Indexing fingerprint database based on multiple features. In: ProRISC 2001 Workshop on Circuits, Systems and Signal Processing (2001)
[4] Brown, L.: A survey of image registration techniques. ACM Computing Surveys (1992)
[5] Cappelli, R., Lumini, A., Maio, D., Maltoni, D.: Fingerprint Classification by Directional
Image Partitioning. IEEE Trans. on PAMI 21(5), 402–421 (1999)
[6] Cappelli, R., Maio, D., Maltoni, D.: Indexing fingerprint databases for efficient 1:n matching. In: Sixth Int. Conf. on Control, Automation, Robotics and Vision, Singapore (2000)
[7] Choudhary, A.M., Awwal, A.A.S.: Optical pattern recognition of fingerprints using distortion-invariant phase-only filter. In: Proc. SPIE, vol. 3805(20), pp. 162–170 (1999)
[8] Fingerprint verification competition, http://bias.csr.unibo.it/fvc2002/
[9] Germain, R., Califano, A., Colville, S.: Fingerprint matching using transformation parameter clustering. IEEE Computational Science and Eng. 4(4), 42–49 (1997)
[10] Gonzalez, Woods, Eddins: Digital Image Processing, Prentice Hall, Englewood Cliffs
(2004)
[11] Jain, A., Ross, A., Prabhakar, S.: Fingerprint matching using minutiae texture features. In:
International Conference on Image Processing, pp. 282–285 (2001)
[12] Jain, A., Prabhakar, S., Hong, L., Pankanti, S.: Filterbank-based fingerprint matching.
Transactions on Image Processing 9, 846–859 (2000)
[13] Jain, A.K., Prabhakar, S., Hong, L., Pankanti, S.: FingerCode: a filterbank for fingerprint
representation and matching. In: CVPR IEEE Computer Society Conference (2), pp. 187–
193 (1999)
[14] Jea, T., Chavan, V.K., Govindaraju, V., Schneider, J.K.: Security and matching of partial
fingerprint recognition systems, pp. 39–50. SPIE (2004)
[15] Jea, T.-Y., Govindaraju, V.: A minutia-based partial fingerprint recognition system. Pattern Recognition 38(10), 1672–1684 (2005)
[16] Karu, K., Jain, A.K.: Fingerprint Classification. Pattern Recognition 18(3), 389–404
(1996)
[17] Ke, Y., Sukthankar, R., Huston, L.: An efficient parts-based near duplicate and sub-image
retrieval system. In: MM International Conference on Multimedia, pp. 869–876 (2004)
[18] Liang, X., Asano, T.,, B.: Distorted Fingerprint indexing using minutiae detail and delaunay triangle. In: ISVD 2006, pp. 217–223 (2006)
[19] Maio, D., Maltoni, D., Cappelli, R., Wayman, J.L., Jain, A.K.: FVC2004: Third Fingerprint Verification Competition. In: Proc. ICBA, Hong Kong, July 2004, pp. 1–7 (2004)
[20] Nandakumar, K., Jain, A.K.: Local correlation-based fingerprint matching. In: Indian
Conference on Computer Vision, Graphics and Image Processing, pp. 503–508 (2004)
[21] Nist fingerprint vendor technology evaluation, http://fpvte.nist.gov/
[22] Bolle, R.M., Connell, J.H., Pankanti, S., Ratha, N.K., Senior, A.W.: Guide to Biometrics. Springer, Heidelberg (2003)
[23] Liu, T., Zhang, G.Z.C., Hao, P.: Fingerprint Indexing Based on Singular Point Correlation. In: ICIP 2005 (2005)