A Fast and Distortion Tolerant Hashing for Fingerprint Image Authentication

Thi Hoi Le and The Duy Bui
Faculty of Information Technology, Vietnam National University, Hanoi

Abstract. Biometrics such as fingerprint, face, retina, and voice offer a means of reliable personal authentication and are now widely used in both forensic and civilian domains. In practice, however, it is difficult to design an accurate and fast biometric recognition system because biometric databases are large and biometric measures are complicated. In particular, fast fingerprint indexing is one of the most challenging problems in fingerprint authentication systems. In this paper, we advance the state of the art in this field by introducing a new robust indexing scheme that speeds up the fingerprint recognition process.

Keywords: fingerprint hashing, fingerprint authentication, error correcting code, image authentication.

1 Introduction

With the development of the digital world, reliable personal authentication has become of great interest in human-computer interaction. National ID cards, electronic commerce, and access to computer networks are some scenarios in which establishing a person's identity is crucial. Existing security measures, which rely on knowledge-based approaches such as passwords or token-based approaches such as magnetic cards and passports, are used to control access to real and virtual societies. Though ubiquitous, such methods are not very secure. Worse, tokens and passwords can easily be shared or stolen; passwords and PINs can even be stolen electronically. Furthermore, these methods cannot differentiate between an authorized user and a fraudulent impostor. In contrast, biometrics has the special characteristic that the user is the key; hence, it is not easily compromised or shared. Biometrics therefore offers a means of reliable personal authentication that can address these problems, and it is gaining citizen and government acceptance.
E. Corchado et al. (Eds.): CISIS 2008, ASC 53, pp. 266–273, 2009. © Springer-Verlag Berlin Heidelberg 2009 springerlink.com

Although significant progress has been made in fingerprint authentication systems, a number of research issues still need to be addressed to improve system efficiency. Automatic fingerprint identification, which requires 1:N matching, is usually computationally demanding. For a small database, a common approach is to exhaustively match a query fingerprint against all the fingerprints in the database [21]. For a large database, however, this is not practical without an effective fingerprint indexing scheme. There are two technical choices for reducing the number of comparisons, and consequently the response time of the identification process: classification and indexing techniques. Traditional classification techniques (e.g. [5], [16]) attempt to classify fingerprints into five classes: Right Loop (R), Left Loop (L), Whorl (W), Arch (A), and Tented Arch (T). The number of classes is small, and real fingerprints are unevenly distributed among them: over 90% of fingerprints fall into only three classes (Loops and Whorl) [18]. As a result, such systems cannot reduce the search space enough. Indexing techniques perform better than classification in terms of the size of the space that needs to be searched. Fingerprint indexing algorithms select the most probable candidates and sort them by their similarity to the query. Many indexing algorithms have been proposed recently. A.K. Jain et al. [11] use the features around the core point of a Gabor-filtered image to realize indexing. Although this approach makes use of global information (the core point), the discrimination power of just one core is limited.
In [2], the singular point (SP) is used to estimate the search priority, which results in a mean search space below 4% of the whole dataset. However, detecting singular points is a hard problem. Some fingerprints do not even have SPs, and the uncertainty of the SP location is large [18]. Several other attempts at fingerprint indexing have shown improvements. R. Cappelli et al. [6] proposed an approach that reaches reasonable performance and identification time. R.S. Germain et al. [9] use triplets of minutiae in their indexing procedure. J.D. Boer et al. [3] combine multiple features (orientation field, FingerCode, and minutiae triplets).

Another key point is that, to make an indexing algorithm more accurate, fingerprint distortion must be considered. There are two kinds of distortion: transformation distortion and system distortion. In particular, because fingerprint scanners can capture only partial fingerprints, some minutiae (the primary fingerprint features) are missing during the acquisition process. These distortions, including missing minutiae, cause several problems: (i) the number of minutia points available in such prints is small, which reduces their discrimination power; (ii) loss of singular points (core and delta) is likely [15]. Therefore, a robust indexing algorithm that is independent of such global features is required. However, most of the existing indexing schemes mentioned above perform indexing using these global feature points.

In this paper, we propose a hashing scheme that achieves high efficiency by hashing localized features that can tolerate distortions. The scheme can operate on any localized features, such as minutiae or triplets. To avoid alignment, we use the triplet feature introduced by Tsai-Yang Jea et al. [15].
One of our main contributions is a definition of codewords for such fingerprint feature points; our hashing scheme operates on those codewords. By producing codewords for feature points, the scheme can tolerate the distortion of each point. Moreover, to reduce the number of candidates for the matching stage more efficiently, a randomized hash function scheme is used to reduce output collisions.

The paper is organized as follows. Section 2 introduces some notions and our definition of codewords for fingerprint feature points. Section 3 presents our hashing scheme based on these codewords and a scheme to retrieve fingerprints from the hashing results. We present experimental results for our scheme in Section 4.

2 Preliminaries

2.1 Error Correction Code

For a given choice of metric d, one can define error correction codes in the corresponding space M. A code is a subset $C = \{w_1, \ldots, w_K\} \subset M$. The set C is sometimes called the codebook; its K elements are the codewords. The (minimum) distance of a code is the smallest distance d between two distinct codewords (according to the metric d). Given a codebook C, we can define a pair of functions (C, D). The encoding function C is an injective map from the elements of some domain of size K to the elements of C. The decoding function D maps any element $w \in M$ to the pre-image $C^{-1}[w_k]$ of the codeword $w_k$ that minimizes the distance $d[w, w_k]$. The error correcting distance is the largest radius t such that for every element w in M there is at most one codeword in the ball of radius t centered on w. For integer distance functions we have $t = \lfloor (d-1)/2 \rfloor$. A standard shorthand notation in coding theory is that of an (M, K, t)-code.

2.2 Fingerprint Feature Point

The primary feature point of a fingerprint is called a minutia. Minutiae are the various ridge discontinuities of a fingerprint. Two types of minutiae are widely used: bifurcations and endings (Fig. 1). A minutia contains only local information about the fingerprint.
Each minutia is represented by its coordinates and orientation.

Fig. 1. (a) Ridge bifurcation. (b) Ridge ending [15].

The secondary feature is a vector of five elements (Fig. 2). For each minutia $M_i(x_i, y_i, \theta_i)$ and its two nearest neighbors $N_0(x_{n0}, y_{n0}, \theta_{n0})$ and $N_1(x_{n1}, y_{n1}, \theta_{n1})$, the secondary feature is constructed by forming a vector $S_i(r_{i0}, r_{i1}, \varphi_{i0}, \varphi_{i1}, \delta_i)$, in which $r_{i0}$ and $r_{i1}$ are the Euclidean distances between the central minutia $M_i$ and its neighbors $N_0$ and $N_1$ respectively, $\varphi_{ik}$ is the orientation difference between $M_i$ and $N_k$ (where k is 0 or 1), and $\delta_i$ represents the acute angle between the line segments $M_iN_0$ and $M_iN_1$. Note that $N_0$ and $N_1$ are the two nearest neighbors of the central minutia $M_i$ and are ordered not by their Euclidean distances but so as to satisfy $\overrightarrow{N_0M_i} \times \overrightarrow{N_1M_i} \ge 0$: $N_0$ is the first and $N_1$ is the second minutia encountered while traversing the angle $\angle N_0M_iN_1$.

Fig. 2. Secondary feature of $M_i$. $r_{i0}$ and $r_{i1}$ are the Euclidean distances between the central minutia $M_i$ and its neighbors $N_0$ and $N_1$ respectively. $\varphi_{ik}$ is the orientation difference between $M_i$ and $N_k$, where k is 0 or 1. $\delta_i$ represents the acute angle between $M_iN_0$ and $M_iN_1$ [15].

For a given matched reference minutiae pair $p_i$ and $q_j$, minutia $p_i(r_{i,i'}, \Phi_{i,i'}, \theta_{i,i'})$ is said to match $q_j(r_{j,j'}, \Phi_{j,j'}, \theta_{j,j'})$ if $q_j$ is within the tolerance area of $p_i$. Thus, for given threshold functions $\mathrm{Thld}_r(\cdot)$, $\mathrm{Thld}_\Phi(\cdot)$, and $\mathrm{Thld}_\theta(\cdot)$, we require $|r_{i,i'} - r_{j,j'}| \le \mathrm{Thld}_r(r_{i,i'})$, $|\Phi_{i,i'} - \Phi_{j,j'}| \le \mathrm{Thld}_\Phi(\Phi_{i,i'})$, and $|\theta_{i,i'} - \theta_{j,j'}| \le \mathrm{Thld}_\theta(\theta_{i,i'})$. Note that the thresholds are not predefined values but are adjustable according to $r_{i,i'}$ and $r_{j,j'}$. In this paper, we refer to this secondary fingerprint feature point simply as a feature point.

2.3 Our Error Correction Code for Fingerprint Feature Point

Most existing error correction schemes are defined with respect to the Hamming distance and are used to correct messages at the bit level (e.g. parity check bits, repetition schemes, CRC).
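Because the codewords defined in this section are computed per dimension of the secondary feature vector, a small sketch of how such a vector might be assembled can be useful. The Python below is an illustration under stated assumptions, not the extraction algorithm of [15]: the function name `secondary_feature`, the (x, y, θ) tuple layout, the direction and wrapping of the orientation difference, and the use of `atan2` for the angle between the two segments are all choices made here.

```python
import math


def secondary_feature(i, minutiae):
    """Return (r0, r1, phi0, phi1, delta) for the minutia at index i.

    minutiae is a list of (x, y, theta) tuples with theta in radians.
    """
    if len(minutiae) < 3:
        raise ValueError("need the central minutia and two neighbors")
    xi, yi, ti = minutiae[i]
    others = [m for j, m in enumerate(minutiae) if j != i]
    # Two nearest neighbors of the central minutia by Euclidean distance.
    n0, n1 = sorted(others, key=lambda m: math.hypot(m[0] - xi, m[1] - yi))[:2]

    def vec(m):
        # Vector from the central minutia to neighbor m.
        return (m[0] - xi, m[1] - yi)

    def cross(a, b):
        return a[0] * b[1] - a[1] * b[0]

    # Order the neighbors so that the cross product is non-negative.
    # (M_i->N0) x (M_i->N1) equals (N0->M_i) x (N1->M_i), the form in the text.
    if cross(vec(n0), vec(n1)) < 0:
        n0, n1 = n1, n0
    v0, v1 = vec(n0), vec(n1)
    r0, r1 = math.hypot(*v0), math.hypot(*v1)
    # Orientation differences, wrapped to [0, 2*pi) (assumed convention).
    phi0 = (n0[2] - ti) % (2 * math.pi)
    phi1 = (n1[2] - ti) % (2 * math.pi)
    # Angle between the two segments; it lies in [0, pi] because cross >= 0.
    dot = v0[0] * v1[0] + v0[1] * v1[1]
    delta = math.atan2(cross(v0, v1), dot)
    return (r0, r1, phi0, phi1, delta)
```

Because the neighbors are ordered by the sign of the cross product rather than by distance, `r0` can be larger than `r1`; the feature therefore does not depend on which neighbor happens to be closer.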
Because fingerprint feature points are real-valued vectors rather than bit strings, we define a new error correction scheme for them. In particular, we consider each feature x as a corrupted codeword and try to correct it to the right codeword by using an encoding function C(x).

Definition 1. Let $x \in \mathbb{R}^D$, and let C(x) be the encoding function of the error correction scheme. We define $C(x) = (q(x-t) \mid q(x) \mid q(x+t))$, where q(x) is a quantization function of x with quantization step t. We call the output $(c_{x-t}, c_x, c_{x+t})$ of C(x) the codeword set of x, and $c_x = q(x)$ the nearest codeword of x, or the codeword of x for short.

Lemma 1. Let $x \in \mathbb{R}$ and let $t \in \mathbb{R}$ be a given tolerance threshold. For every y such that $|x - y| \le t$, the codeword of y, $c_y = q(y)$, is one of the three elements of the codeword set of x.

Lemma 1 can be proved easily by some algebraic transformations. In our approach, we generate the codeword set for every dimension of a template fingerprint feature point q, and only the codeword for every dimension of a query point p. By Lemma 1 and the definition of two matched feature points, if p and q are corresponding feature points of two versions of one fingerprint, the codeword of p will be an element of the codeword set of q.

3 Codeword-Based Hashing Scheme

We present a fingerprint hash generation scheme based on the codeword of the feature point. The key idea is as follows: first, "correct" the erroneous feature point to its codeword; then use that codeword as the input of a randomized hash function that scatters the input set and ensures that the probability of collision of two feature points is closely related to the distance between their corresponding coordinate pairs (refer to the definition of two matched feature points).

3.1 Our Approach

Informal description. The fingerprint features stored in the database can be considered as a very large set. Moreover, the distribution of fingerprint feature points is non-uniform and unknown.
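As a concrete illustration of Definition 1 and Lemma 1, the following Python sketch builds the codeword set of a scalar and exhaustively checks the lemma on a small grid. It assumes a floor-based quantizer q(x) = ⌊x/t⌋, which the text does not specify; the names `q`, `codeword_set`, and `check_lemma1` are hypothetical, and exact rationals (`fractions.Fraction`) are used only to avoid floating-point effects at cell boundaries.

```python
from fractions import Fraction
import math


def q(x, t):
    """Quantizer with step t: index of the cell [k*t, (k+1)*t) containing x."""
    return math.floor(x / t)


def codeword_set(x, t):
    """Codeword set (q(x - t), q(x), q(x + t)) of Definition 1."""
    return (q(x - t, t), q(x, t), q(x + t, t))


def check_lemma1(t, lo=-10, hi=10, steps_per_unit=8):
    """Check Lemma 1 for every pair (x, y) on a rational grid in [lo, hi]."""
    grid = [Fraction(i, steps_per_unit)
            for i in range(lo * steps_per_unit, hi * steps_per_unit + 1)]
    for x in grid:
        cs = codeword_set(x, t)
        for y in grid:
            # Whenever y is within t of x, q(y) must be in the codeword set of x.
            if abs(x - y) <= t and q(y, t) not in cs:
                return False
    return True


# Template value 7 with step 2 has codeword set (2, 3, 4); a query value of
# 8.5 lies within the tolerance and quantizes to 4, which is in that set.
print(codeword_set(Fraction(7), Fraction(2)), q(Fraction(17, 2), Fraction(2)))
print(check_lemma1(Fraction(3, 2)))
```

With a floor-based quantizer the three codewords are consecutive integers, so any y within t of x falls into one of those three cells. A round-half-to-even quantizer (Python's built-in `round`) would not guarantee this at cell boundaries, which is why floor is used in the sketch.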
For such large, non-uniformly distributed input sets of fingerprint feature points, we want a randomized hash scheme under which elements are unlikely to collide. Fortunately, some standard hash functions (e.g. MD5, SHA) can be randomized to satisfy the property of target collision resistance. Our scheme works as follows. First, the message bits are permuted by a random permutation, and then the hash of the resulting message is computed (by a compression function, e.g. SHA). A permutation can be viewed as a special kind of block cipher whose output and input have the same length. Random permutations are widely used in cryptography because they possess two important properties: randomness, and efficient computation when the key is random. The basic idea is that, for any given set of inputs, the algorithm scatters the inputs well over the range of the function by means of the random permutation, so that the outputs are distributed more randomly. To ensure error tolerance, we perform hashing on the codeword (for query points) and on the whole codeword set (for template points) instead of on the feature point itself. By Lemma 1, if the query point y and the template point x match, then the codeword of y is an element of the codeword set of x.

Formal description. We set up our hashing scheme as follows:
1. Choose random dimensions from (1, 2, …, D); in this way, we adjust the trade-off between collisions of hash values and the space of our database.
2. Choose a tolerance threshold t and an appropriate metric $l_d$ for the selected dimensions.

For each template point p:
1. Generate the codeword set for $p_d$. These values are mapped to binary strings. The binary strings in the codeword sets of the selected dimensions are then concatenated to form $L^3$ binary strings $m_i$ for $i = 1, \ldots, L^3$, where L is the number of selected dimensions.
2. Each $m_i$ is padded with zero bits to form an n-bit message $m_i$.
3. Generate a random key K for a permutation π of $\{0,1\}^n$.
4. $m_i$ is permuted by π and the resulting message is hashed using a compression function such as SHA. Let $sh_i = \mathrm{SHA}(\pi(m_i))$ for $i = 1, \ldots, L^3$.
5. There are at most $L^3$ hash values for p. These values can be stored in a single hash table or in separate ones.

For each query point q:
1. Compute the nearest codeword value for every selected dimension of q. Then map each value to a binary string.
2. The binary strings of the selected dimensions are then concatenated.
3. Perform steps 2 and 4 as for a template point. Note that there is only one binary string for the query point q.
4. Return all the points that share an identical hash value with q.

For query evaluation, all candidate matches are returned by our hashing for every query feature point. Hence, each query fingerprint is treated as a bag of points, simulating a multi-point query evaluation. To do this efficiently, we use the same framework as Ke et al. [17]: we maintain two auxiliary index structures, the File Table (FT) and the Keypoint Table (KT), to map feature points to their corresponding fingerprints; an entry in KT consists of the file ID (the index location in FT) and the feature point information. The template fingerprints that have points sharing identical hash values (collisions) with the query are then ranked by the number of similar points. Only the top T templates are selected for identifying the query fingerprint by matching. Thus, the search space is greatly reduced. The candidate selection process requires only linear computational cost, so it can be applied to online interactive querying on large image collections.

3.2 Analysis

Space complexity. For automated fingerprint identification, we must allocate extra storage for the hash values of template fingerprints. Assume that the number of D-dimensional feature points extracted from a template is N.
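To make the per-template storage discussed here concrete, the scheme of Section 3.1 can be sketched in Python. This is a minimal illustration under several assumptions that are not taken from the paper: a floor-based quantizer, a fixed 16-bit encoding per codeword, a bit permutation derived from the key with `random.Random` (not cryptographically secure), SHA-256 as the hash, and a plain dictionary standing in for the hash table together with the File Table and Keypoint Table. The sketch enumerates every combination of one codeword per selected dimension, which gives up to 3^L strings per template point under this reading of step 1.

```python
import hashlib
import itertools
import math
import random
from collections import Counter, defaultdict

N_BITS = 64        # assumed padded message length n
BITS_PER_DIM = 16  # assumed width of one encoded codeword


def quantize(x, t):
    """Floor-based quantizer with step t (an assumption)."""
    return math.floor(x / t)


def encode(codewords):
    """Concatenate fixed-width codeword bits and zero-pad to N_BITS."""
    assert len(codewords) * BITS_PER_DIM <= N_BITS
    bits = "".join(format(c % (1 << BITS_PER_DIM), f"0{BITS_PER_DIM}b")
                   for c in codewords)
    return bits.ljust(N_BITS, "0")


def make_permutation(key):
    """Bit permutation of {0,1}^n derived from key K (illustrative, not secure)."""
    order = list(range(N_BITS))
    random.Random(key).shuffle(order)
    return order


def hash_bits(bits, perm):
    """Permute the message bits, then hash the result with SHA-256."""
    permuted = "".join(bits[i] for i in perm)
    return hashlib.sha256(permuted.encode("ascii")).hexdigest()


def template_hashes(point, dims, t, perm):
    """Hashes of all combinations of one codeword per selected dimension."""
    sets = [(quantize(point[d] - t, t), quantize(point[d], t),
             quantize(point[d] + t, t)) for d in dims]
    return {hash_bits(encode(combo), perm)
            for combo in itertools.product(*sets)}


def query_hash(point, dims, t, perm):
    """Single hash built from the nearest codeword of each selected dimension."""
    return hash_bits(encode([quantize(point[d], t) for d in dims]), perm)


def build_index(templates, dims, t, perm):
    """Map each hash value to the set of template IDs that produced it."""
    index = defaultdict(set)
    for file_id, points in templates.items():
        for point in points:
            for h in template_hashes(point, dims, t, perm):
                index[h].add(file_id)
    return index


def rank_candidates(index, query_points, dims, t, perm, top):
    """Rank templates by the number of query points that collide with them."""
    votes = Counter()
    for q in query_points:
        for file_id in index.get(query_hash(q, dims, t, perm), ()):
            votes[file_id] += 1
    return [file_id for file_id, _ in votes.most_common(top)]


# Hypothetical toy data: one three-dimensional feature point per template.
templates = {"A": [(10.0, 20.0, 0.5)], "B": [(100.0, 5.0, 2.0)]}
perm = make_permutation(key=42)
index = build_index(templates, dims=(0, 1, 2), t=2.0, perm=perm)
# The query is within t = 2.0 of template A in every dimension.
print(rank_candidates(index, [(11.5, 19.0, 1.0)], (0, 1, 2), 2.0, perm, top=1))
```

In this toy run each template point contributes 27 hash values for three selected dimensions, which shows why the number of selected dimensions controls the trade-off between storage and collisions mentioned in the formal description.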
With L selected dimensions and N feature points per template, the total extra space required for one template is $O(L^3 \cdot N)$. Thus, the hashing scheme requires a polynomial-sized data structure that allows sub-linear time retrieval of near neighbors, as shown in the following section.

Time complexity. For a query fingerprint Y with N feature points in a database of M fingerprints, we must: compute the codeword for each feature point, which takes O(N) because the quantization hashing functions require constant time; compute the hash value, which takes O(time(f, x)), where f is the hash function used and time(f, x) is the time required by f on input x; and compute the similarity scores of the templates that have any point sharing an identical hash value with the query fingerprint, which requires time $O(M \cdot N / 2^n)$. Thus the key quantity is $O(M \cdot N / 2^n + \mathrm{time}(f, x))$, which is approximately equivalent to $1/2^n$ of the computations of an exhaustive search.

4 Experiments

We evaluate our method on the FVC2004 database [19], which consists of 300 images, 3 prints each of 100 distinct fingers. The DB1_A database contains partial fingerprint templates of various sizes. These images were captured by an optical sensor with a resolution of 500 dpi, resulting in images of 300x200 pixels in 8-bit gray scale. The full original templates are used to construct the database, while the remaining 7 impressions are used for hashing. We use the feature extraction algorithm described in [15] in our system. However, the authors do not describe in detail how to determine the error tolerance thresholds. Therefore, in our experiments, we assume that the error correction distance t is fixed for all fingerprint feature points. This assumption makes the implementation non-optimal, in that two feature points recognized as a "match" by the matching algorithm proposed in [15] may not share the same hash value in our experiments.
The Euclidean distances between the corresponding dimensions of feature vectors are used in the quantization hashing function. Table 1 shows the Correct Index Power (CIP), which is defined as the percentage of correctly indexed queries as a function of the percentage of hypotheses that need to be searched in the verification step. Although our implementation is not optimal, the scheme still achieves good CIP results. As can be seen, the larger the search percentage, the better the results. This suggests that an optimal implementation could improve the results further.

Table 1. Correct Index Power (CIP) of our algorithm

Search percentage    5%     10%    15%    20%
CIP                  80%    87%    94%    96%

Compared with published experiments in the literature, at a search percentage of 10%, [13] achieves 84.5% CIP and [23] reaches 92.8% CIP. However, unlike previous works, our scheme is much simpler, and by adjusting t carefully it is promising that our scheme could reach 100% CIP at a low search percentage.

5 Conclusion

In this paper, we have presented a new robust approach to fingerprint indexing that is both accurate and fast. However, some work remains to make the system more persuasive and to obtain optimal results, such as studying optimal choices of the parameter t. Moreover, guaranteeing the privacy of fingerprint templates in any indexing scheme is another important problem that must be considered.

References

[1] Bazen, A.M., Gerez, S.H.: Fingerprint matching by thin-plate spline modeling of elastic deformations. Pattern Recognition 36, 1859–1867 (2003)
[2] Bazen, A.M., Verwaaijen, G.T.B., Gerez, S.H., Veelenturf, L.P.J.: A correlation-based fingerprint verification system.
In: ProRISC 2000 Workshops on Circuits, Systems and Signal Processing (2000)
[3] Boer, J., Bazen, A., Gerez, S.: Indexing fingerprint database based on multiple features. In: ProRISC 2001 Workshop on Circuits, Systems and Signal Processing (2001)
[4] Brown, L.: A survey of image registration techniques. ACM Computing Surveys (1992)
[5] Cappelli, R., Lumini, A., Maio, D., Maltoni, D.: Fingerprint Classification by Directional Image Partitioning. IEEE Trans. on PAMI 21(5), 402–421 (1999)
[6] Cappelli, R., Maio, D., Maltoni, D.: Indexing fingerprint databases for efficient 1:n matching. In: Sixth Int. Conf. on Control, Automation, Robotics and Vision, Singapore (2000)
[7] Choudhary, A.M., Awwal, A.A.S.: Optical pattern recognition of fingerprints using distortion-invariant phase-only filter. In: Proc. SPIE, vol. 3805(20), pp. 162–170 (1999)
[8] Fingerprint verification competition, http://bias.csr.unibo.it/fvc2002/
[9] Germain, R., Califano, A., Colville, S.: Fingerprint matching using transformation parameter clustering. IEEE Computational Science and Eng. 4(4), 42–49 (1997)
[10] Gonzalez, Woods, Eddins: Digital Image Processing. Prentice Hall, Englewood Cliffs (2004)
[11] Jain, A., Ross, A., Prabhakar, S.: Fingerprint matching using minutiae texture features. In: International Conference on Image Processing, pp. 282–285 (2001)
[12] Jain, A., Prabhakar, S., Hong, L., Pankanti, S.: Filterbank-based fingerprint matching. Transactions on Image Processing 9, 846–859 (2000)
[13] Jain, A.K., Prabhakar, S., Hong, L., Pankanti, S.: FingerCode: a filterbank for fingerprint representation and matching. In: CVPR IEEE Computer Society Conference (2), pp. 187–193 (1999)
[14] Jea, T., Chavan, V.K., Govindaraju, V., Schneider, J.K.: Security and matching of partial fingerprint recognition systems, pp. 39–50. SPIE (2004)
[15] Jea, T.-Y., Govindaraju, V.: A minutia-based partial fingerprint recognition system. Pattern Recognition 38(10), 1672–1684 (2005)
[16] Karu, K., Jain, A.K.: Fingerprint Classification. Pattern Recognition 18(3), 389–404 (1996)
[17] Ke, Y., Sukthankar, R., Huston, L.: An efficient parts-based near duplicate and sub-image retrieval system. In: MM International Conference on Multimedia, pp. 869–876 (2004)
[18] Liang, X., Asano, T.,, B.: Distorted Fingerprint indexing using minutiae detail and delaunay triangle. In: ISVD 2006, pp. 217–223 (2006)
[19] Maio, D., Maltoni, D., Cappelli, R., Wayman, J.L., Jain, A.K.: FVC2004: Third Fingerprint Verification Competition. In: Proc. ICBA, Hong Kong, July 2004, pp. 1–7 (2004)
[20] Nandakumar, K., Jain, A.K.: Local correlation-based fingerprint matching. In: Indian Conference on Computer Vision, Graphics and Image Processing, pp. 503–508 (2004)
[21] NIST fingerprint vendor technology evaluation, http://fpvte.nist.gov/
[22] Ruud, B., Connell, J.H., Pankanti, S., Ratha, N.K., Senior, A.W.: Guide to Biometrics. Springer, Heidelberg (2003)
[23] Liu, T., Zhang, G.Z.C., Hao, P.: Fingerprint Indexing Based on Singular Point Correlation. In: ICIP 2005 (2005)