Biometric System Design Under Zero and Non-Zero Effort Attacks
Ajita Rattani
University of Cagliari
Cagliari, Italy
Norman Poh
Dept of Computing, University of Surrey
Guildford, UK
Abstract
An increasing number of studies have reported that the
quality of biometric samples has a significant impact on the
performance of the system. However, to the best of our knowledge,
these studies are limited to impersonation attempts from different subjects, i.e., zero-effort attacks, and they do not take
into account the possibility of spoof attacks, also called non-zero effort attacks. One
way to thwart a spoof attack is to assess the likelihood of a spoof attempt by using biometric liveness measures. Since biometric sample quality and liveness measures are different, and possibly complementary, we propose an information fusion framework that
combines them under both zero- and non-zero effort (spoof)
attacks. We implemented this framework using three generative classifiers, namely, Gaussian Mixture Model, Gaussian
Copula, and Quadratic Discriminant Analysis. Experimental
results on the LivDet11 spoof fingerprint database demonstrate
that the proposed framework can reduce the error rate of the
baseline system by about 56%, under both types of attack.
1. Introduction
Nearly a decade of research has been directed towards enhancing the performance of biometric systems in the form of
multibiometrics, that is, combining information from multiple biometric modalities, e.g., face and fingerprint; user-specific schemes such as user-specific thresholds and fusion
schemes [16]; and recently, by incorporating quality measures [13, 7]. However, these studies often assume that an
attacker is simply a “casual” impostor, i.e., another subject
in the database, rather than another person who actively masquerades as someone else by falsifying the biometric of the
claimed identity using artificial materials. The attack carried out by a casual impostor is referred to
as a zero-effort attack, whereas the latter is referred to as a spoof or
non-zero effort attack.
Our study relies on two lines of research: biometric sample quality or quality measures, and biometric liveness measures. Quality measures quantify the degree of excellence or conformance of biometric samples to some predefined criteria known to influence biometric system performance¹. A family of techniques has emerged in which quality measures are used to weight the contribution of different biometric matchers or are heuristically included as meta-parameters in multibiometric fusion [4, 18]. Quality measures have also been interpreted as conditionally-relevant classification features and used jointly with other features to train statistical models for uni- and multi-modal biometric classification [10, 13, 7].
Recently, the vulnerability of biometric systems to various attacks has been identified [9, 2]. Among these attacks, a spoofing attack poses the most severe threat to biometric systems because it can be easily performed using commonly available materials and, furthermore, does not require any knowledge of the internal functionality of the system. For instance, a person can fool a fingerprint system by using artificial or gummy fingers of another person in order to gain unauthorized access [9]. An effective countermeasure to this attack is to use liveness measures, which aim to discriminate live biometric samples from spoofed (fake) ones [6, 17]. For example, liveness detection algorithms for fingerprints use different physiological properties, such as texture analysis, pore detection [6] and skin perspiration [17].
The goal of this manuscript is to investigate biometric system design under both zero-effort (impostor) and non-zero effort (spoof) attacks. To this end, we propose a framework that incorporates quality as well as liveness measures and evaluate the system under both zero-effort and non-zero effort spoof attacks. The framework has been implemented using three generative classifiers, namely, a Bayesian classifier based on Gaussian Mixture Model (GMM) as its density estimator, another Bayesian classifier based on Gaussian Copula, and Quadratic Discriminant Analysis (QDA). The effectiveness of the proposed framework is assessed on the LivDet 2011 fingerprint data set in two ways: comparison with the quality-based system, which exploits quality measures alone, and assessment of the impact of the type of spoof fabrication material, such as Ecoflex and Latex, on the proposed system.
¹ http://www.nist.gov/director/qualitystandards.cfm
In summary, our contributions are: (1) a novel information
fusion framework that combines both quality as well as liveness measures; (2) implementation of the framework using
three different algorithms; and, (3) assessment of the framework under both zero and non-zero effort attacks, as well as
design issues related to the use of different spoof materials.
Besides making a biometric system more robust as will be
supported by experiments, our proposed framework also has
an additional benefit: it circumvents the need for scale normalization and selection of optimal weights for the purpose
of information integration as part of a larger multimodal biometric system.
This paper is organized as follows: Section 2 presents the
proposed framework posed as a biometric classification task
under two types of attack. Section 3 elaborates on database,
tools and the adopted protocol. Section 4 explains the obtained experimental results. Conclusions are drawn in Section 5.
2. Proposed Biometric System Design Under Attack
Let the observation be $x = [s, l_t, l_i, q]$, where $s \in \mathbb{R}$ is a matching score, $l_t \in \mathbb{R}$ ($l_i \in \mathbb{R}$) denotes the liveness value of the template (input sample), and $q \in \mathbb{R}$ is a quality metric for a template-query pair of samples. Note that $q$ represents the quality of a comparison operation; it is defined as the average of two quality measures: one from the template ($q_t$) and another from the query biometric sample ($q_i$). Let $k \in \{C, I\}$ denote the class of matching, where $C$ and $I$ denote the genuine and impostor classes, respectively.
Using the above notation, a generative classifier based on the log-likelihood ratio test ($f^{llr}$) takes the following form:

$$f^{llr}(x) = \log \frac{p(x \mid C)}{p(x \mid I)} \tag{1}$$

where $p(x \mid C)$ and $p(x \mid I)$ are the joint class-conditional densities for $x$, given the genuine ($C$) and impostor ($I$) classes, respectively. Note that both zero-effort impostor and non-zero effort spoof attacks belong to the impostor class. The final decision is made using the following function:

$$\mathrm{decision}(f^{llr}(x)) = \begin{cases} \text{accept}, & \text{if } f^{llr}(x) > \eta \\ \text{reject}, & \text{otherwise} \end{cases}$$

where $\eta$ is the threshold set at a fixed false acceptance rate (FAR).
The optimality of the test in (1) is guaranteed by the Neyman-Pearson theorem [3], subject to the condition that the underlying class-conditional densities ($p(x \mid C)$ and $p(x \mid I)$) are well estimated.
The proposed framework for biometric system design under attack is illustrated in Figure 1. In this figure, the proposed module is labeled as “Joint Density Estimation”. This process considers three pieces of information: a matching score, an average quality measure, and a pair of liveness measures.
Figure 1. Proposed framework for biometric system design under zero-effort impostor and non-zero effort spoof attacks.
In the following subsections, we explain how the log-likelihood ratio classifier (1) can be implemented using Gaussian Mixture Model (GMM), Gaussian Copula (Copula), and Quadratic Discriminant Analysis (QDA).
2.1. Gaussian Mixture Model (GMM)
The Gaussian mixture model has been successfully used to estimate joint densities. The estimated joint density obtained using finite mixture models indeed converges to the true density when sufficient training samples are provided [5].
Let $\phi^N(x, \mu, \Sigma)$ be the $N$-variate Gaussian density with mean vector $\mu$ and covariance matrix $\Sigma$, i.e.,

$$\phi^N(x, \mu, \Sigma) = (2\pi)^{-N/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right) \tag{2}$$

The estimate of $p(x \mid k)$ for $k \in \{C, I\}$ is obtained as a mixture of Gaussians as follows:

$$p(x \mid k) = \sum_{j=1}^{M_k} w_{k,j}\, \phi^N(x, \mu_{k,j}, \Sigma_{k,j}) \tag{3}$$

where $M_k$ is the number of mixture components used to model the densities of the genuine (when $k = C$) and impostor (when $k = I$) classes, and $w_{k,j}$ is the weight assigned to the $j$-th mixture component in $p(x \mid k)$, with $\sum_{j=1}^{M_k} w_{k,j} = 1$. The selection of the appropriate number of components is one of the most challenging issues in mixture density estimation. The GMM fitting algorithm proposed in [5] automatically estimates the appropriate number of components and the component parameters using an EM algorithm and the minimum message length criterion. Hence, the GMM fitting algorithm in [5] has been used in this study.
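The likelihood-ratio test in (1), its threshold decision and the mixture density in (3) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the toy parameter values and the restriction to diagonal covariance matrices are hypothetical choices made to keep the example dependency-free, and the component parameters are set by hand rather than fitted with the algorithm of [5].

```python
import math

def diag_gaussian_pdf(x, mean, var):
    """Gaussian density of Eq. (2), restricted to a diagonal covariance (assumption)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * math.log(2.0 * math.pi * vi) - 0.5 * (xi - mi) ** 2 / vi
    return math.exp(log_p)

def gmm_pdf(x, components):
    """Eq. (3): weighted sum over (weight, mean, var) components; weights sum to one."""
    return sum(w * diag_gaussian_pdf(x, mean, var) for w, mean, var in components)

def llr(x, genuine_gmm, impostor_gmm):
    """Eq. (1): log-likelihood ratio of the genuine (C) and impostor (I) densities."""
    return math.log(gmm_pdf(x, genuine_gmm)) - math.log(gmm_pdf(x, impostor_gmm))

def decide(score, eta):
    """Accept when the log-likelihood ratio exceeds the threshold eta."""
    return "accept" if score > eta else "reject"

# Hypothetical hand-set models over x = [s, l_t, l_i, q]; not fitted to LivDet11.
GENUINE = [(1.0, [80.0, 0.9, 0.9, 50.0], [400.0, 0.02, 0.02, 9.0])]
IMPOSTOR = [
    (0.5, [10.0, 0.9, 0.9, 48.0], [100.0, 0.02, 0.02, 9.0]),  # zero-effort impostors
    (0.5, [70.0, 0.9, 0.2, 48.0], [400.0, 0.02, 0.02, 9.0]),  # spoofed query samples
]

live_query = [85.0, 0.92, 0.88, 51.0]   # high score, live-looking query
spoof_query = [85.0, 0.92, 0.15, 51.0]  # same score, low query liveness

print(decide(llr(live_query, GENUINE, IMPOSTOR), eta=0.0))
print(decide(llr(spoof_query, GENUINE, IMPOSTOR), eta=0.0))
```

In this toy setting the two queries share the same matching score and quality, so only the query liveness value separates them, which is the behaviour the joint density is meant to capture.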
2.2. Gaussian Copula (Copula)
Another way to estimate the joint density is by using a copula model. Let $X_1, X_2, \cdots, X_N$ be $N$ continuous distribution functions on the real line and $X$ be an $N$-dimensional distribution function with the $n$-th marginal given by $X_n$ for $n = 1, 2, \cdots, N$. Sklar's theorem [11] states that there exists a unique function $C(u_1, u_2, \cdots, u_N)$ from $[0, 1]^N$ to $[0, 1]$ satisfying

$$X(s_1, s_2, \cdots, s_N) = C(X_1(s_1), X_2(s_2), \cdots, X_N(s_N)) \tag{4}$$

where $s_1, s_2, \cdots, s_N$ are $N$ real numbers. The function $C$ is known as an $N$-copula function that couples the one-dimensional distribution functions $X_1, X_2, \cdots, X_N$ to obtain the $N$-variate function $X$. The family of copulas considered in this paper is the $N$-dimensional multivariate Gaussian copula [11]. These functions can represent a variety of dependence structures using an $N \times N$ correlation matrix $R$. The $(m, n)$-th entry of $R$, $\rho_{m,n}$, measures the degree of correlation between the $m$-th and $n$-th components for $m, n = 1, 2, \cdots, N$. Let $F_n$ be an estimate of the cumulative distribution function of $s_n$, such that $u_n = F_n(s_n)$. The $N$-dimensional Gaussian copula function with correlation matrix $R$ is given by

$$C_R^N(u_1, u_2, \cdots, u_N) = \Phi_R^N\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \cdots, \Phi^{-1}(u_N)\right) \tag{5}$$

where each $u_n \in [0, 1]$ for $n = 1, 2, \cdots, N$, $\Phi(\cdot)$ is the distribution function of the standard normal, $\Phi^{-1}$ is its inverse and $\Phi_R^N(Z)$ is the $N$-dimensional distribution function of a random vector $Z = (Z_1, Z_2, \cdots, Z_N)^T$ with component means and variances given by 0 and 1, respectively. Therefore, the joint density $p(x \mid k)$ is given by

$$p(x \mid k) = C_R^N(F_1(s_1), F_2(s_2), \cdots, F_N(s_N)) \tag{6}$$
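The two ingredients of the Gaussian copula, the marginal transforms u_n = F_n(s_n) and the correlation matrix R, can be estimated from data as sketched below. This is a minimal illustration, not the estimation procedure used in the paper: the rank-based empirical CDF, the helper names and the toy samples are all assumptions, and only a single off-diagonal entry of R is computed.

```python
import math
from statistics import NormalDist

def empirical_cdf_values(samples):
    """Rank-based estimate of u_n = F_n(s_n), scaled into (0, 1).

    Dividing by n + 1 keeps every value strictly inside (0, 1) so that the
    inverse normal below stays finite. Ties are broken by input order.
    """
    n = len(samples)
    order = sorted(range(n), key=lambda i: samples[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u

def normal_scores(samples):
    """Map samples to z_n = Phi^{-1}(F_n(s_n)), the arguments used in Eq. (5)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(u) for u in empirical_cdf_values(samples)]

def correlation(a, b):
    """Pearson correlation, used here as one entry rho_{m,n} of the matrix R."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy samples: matching scores and query liveness values.
scores = [12.0, 35.0, 20.0, 80.0, 55.0, 64.0]
liveness = [0.10, 0.40, 0.35, 0.90, 0.60, 0.95]

rho = correlation(normal_scores(scores), normal_scores(liveness))
print(f"estimated rho between score and liveness: {rho:.3f}")
```

Because the marginals are handled by the transforms, each component can keep its own scale, which is one reason a copula model needs no explicit score normalization.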
2.3. Quadratic Discriminant Analysis (QDA)
Quadratic discriminant analysis (QDA) is closely related
to linear discriminant analysis (LDA), where it is assumed
that the measurements from each class are normally distributed, and has a closed form solution [3]. However, unlike
LDA, in QDA there is no assumption that the covariance of
each of the classes is identical.
QDA can be easily implemented by setting the number of
Gaussians in GMM to one. In this case, the log-likelihood
ratio decision rule takes the following form (7):
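
$$\log\left(\frac{\left(\sqrt{2\pi|\Sigma_{k=C}|}\right)^{-1} \exp\left(-\frac{1}{2}(x-\mu_C)^T \Sigma_C^{-1}(x-\mu_C)\right)}{\left(\sqrt{2\pi|\Sigma_{k=I}|}\right)^{-1} \exp\left(-\frac{1}{2}(x-\mu_I)^T \Sigma_I^{-1}(x-\mu_I)\right)}\right) \tag{7}$$
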
As the decision boundary is quadratic in x, QDA allows more flexibility for the model to fit the data than linear discriminant analysis (LDA), because in the latter case the covariance matrices of the genuine and impostor classes are assumed to be equal. For this reason, LDA is not considered in this paper.
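A single-Gaussian log-likelihood ratio of this kind can be sketched as follows. To stay dependency-free the example is restricted to two features, here a matching score and a query liveness value, so that the 2 x 2 covariance can be inverted in closed form; the reduced feature set, the function names and the toy training samples are assumptions for illustration only. The density is normalized as in Eq. (2) with N = 2; the normalizing constant cancels in the ratio.

```python
import math

def fit_gaussian_2d(samples):
    """Sample mean and unbiased covariance (c00, c01, c11) of 2-D points."""
    n = len(samples)
    m0 = sum(p[0] for p in samples) / n
    m1 = sum(p[1] for p in samples) / n
    c00 = sum((p[0] - m0) ** 2 for p in samples) / (n - 1)
    c11 = sum((p[1] - m1) ** 2 for p in samples) / (n - 1)
    c01 = sum((p[0] - m0) * (p[1] - m1) for p in samples) / (n - 1)
    return (m0, m1), (c00, c01, c11)

def log_gaussian_2d(x, mean, cov):
    """Log of the bivariate Gaussian density, using the closed-form 2x2 inverse."""
    c00, c01, c11 = cov
    det = c00 * c11 - c01 * c01
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    mahalanobis = (c11 * d0 * d0 - 2.0 * c01 * d0 * d1 + c00 * d1 * d1) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * mahalanobis

def qda_llr(x, genuine_model, impostor_model):
    """Log-likelihood ratio with one Gaussian per class and class-specific covariances."""
    return log_gaussian_2d(x, *genuine_model) - log_gaussian_2d(x, *impostor_model)

# Hypothetical (matching score, query liveness) training samples.
genuine_train = [(80, 0.90), (90, 0.85), (70, 0.95), (85, 0.80), (75, 0.88)]
impostor_train = [(10, 0.90), (15, 0.80), (75, 0.20), (80, 0.10), (12, 0.85), (70, 0.15)]

genuine_model = fit_gaussian_2d(genuine_train)
impostor_model = fit_gaussian_2d(impostor_train)

print(qda_llr((82, 0.90), genuine_model, impostor_model))  # live-looking query
print(qda_llr((82, 0.12), genuine_model, impostor_model))  # spoof-looking query
```

Since each class keeps its own covariance, the impostor model can represent the negative score-liveness dependence of the pooled zero-effort and spoof samples, which a shared-covariance (LDA) model could not.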
3. Database, Tools and Protocol
3.1. Database and Tools
LivDet11: We shall use the same data set that was used to
evaluate fingerprint liveness detection algorithms in the Second International Competition on Fingerprint Liveness Detection (LivDet11) [1]. This data set consists of 1000 live and
1000 fake fingerprint images in each of the training and test sets. All images collected using the Biometrika sensor
have been used in this study. These live images are obtained
from 100 subjects with 10 samples from distinct finger per
subject for each set (training and test). The fake fingerprints
are fabricated using the following materials: gelatine, silicone, woodglue, ecoflex and latex. For each of these five materials, 200 images are fabricated from 20 subjects for each
set.
Three pieces of software are used. The NIST Bozorth3² software is used for obtaining a matching score between a pair of fingerprint images. In order to measure the quality of fingerprint impressions, we use the IQF developed by MITRE³, which has been used for various FBI applications. This quality factor (Q) ranges from 0 to 100, with 0 being the lowest and 100 being the highest quality. Finally, in order to assess fingerprint liveness, we implemented the anti-spoofing measure proposed by Nikam and Aggarwal, which is based on Local Binary Pattern (LBP) features [12]. The LBP features have been shown to outperform other competing liveness measures based on pore detection, Curvelet, power spectrum and wavelet energy signature [6] evaluated on the LivDet11 fingerprint database, giving an equal error rate (EER) of 10.95%. A two-class support vector machine (SVM) is trained using LBP features in order to classify live and fake fingerprint images. The output of this trained SVM is used as a liveness measure directly.
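For readers unfamiliar with LBP features, the basic operator can be sketched as follows. This is the plain 3 x 3 LBP with a 256-bin histogram; the exact LBP variant, neighbourhood and parameters of [12] are not specified here, so the code below is an assumption-laden illustration and the SVM training step is omitted. The function names and the toy patch are hypothetical.

```python
def lbp_code(image, r, c):
    """8-bit LBP code of pixel (r, c): each neighbour >= centre sets one bit."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist]

# Hypothetical 4x4 grey-level patch (a smooth intensity ramp).
patch = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]

features = lbp_histogram(patch)
print(lbp_code(patch, 1, 1), max(features))
```

A histogram of this kind, computed over a fingerprint image, would be the feature vector passed to the two-class SVM, whose output then serves as the liveness value.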
Figure 2 shows the probability density (pdf) of the liveness measures obtained for the live and fake fingerprints (fabricated using different types of material). This figure shows
that the obtained liveness measure for fake samples differs for
different types of material used for the fabrication of spoofed
fingerprint samples.
3.2. Protocol and Performance metrics
Following the LivDet2011 protocol as adopted in [1], we used 1,000 live and 1,000 fake images to train the proposed fusion classifiers, and the remaining 1,000 live and 1,000 fake images were reserved as the test set, which is used solely to gauge the generalization performance of the proposed framework.
² http://www.nist.gov/itl/iad/ig/nbis.cfm
³ http://www.mitre.org/tech/mtf/
[Figure 2: plot of probability density function (pdf) on the y-axis versus LBP based liveness measure on the x-axis, with one curve each for Live, Gelatine, Latex, Silgum, Ecoflex and WoodGlue.]
Figure 2. Probability density of the LBP based liveness measures for live as well as fake fingerprints, fabricated using different material types.
Table 1. The five possible events during the biometric system operation and the desirable classification decisions.
Event   Template   Query   Attack type       Classification
1       live       live    non-attack        genuine
2       live       live    zero-effort       impostor
3       live       fake    non-zero effort   impostor
4       fake       live    non-zero effort   impostor
5       fake       fake    non-zero effort   impostor
Attack types and events: Recall that the observation vector x consists of a matching score, a combined quality measure, and a pair of liveness measures extracted from the template and query samples, respectively, as explained in Section 2. Since a template-query pair is considered for each comparison, five possible events can occur during system operation, leading to different desirable classification decisions. These events can be described by the properties of the template and query samples, and categorized by their attack type and classification decision, as shown in Table 1. The first row shows a genuine access, which is characterized by live template and query samples. The second and third rows in Table 1 cover the cases where an attack is executed while the system is operational, constituting a zero-effort and a non-zero effort attack, respectively. The last two rows cover the cases where a genuine (enrolled) template has also been replaced by a spoofed sample. An attack on the enrolled template can happen when a biometric database is compromised. It can also happen when a biometric system adopts a template-update mechanism [14], where fake biometric samples classified with high confidence may be used to adapt or replace the enrolled template(s).
Following the categorization of these attacks, we consider the following two scenarios in our experiments: (1) when the attacks are directed only during the system operation, i.e., involving events 2 and 3; and (2) when the spoof attacks are also directed at the template level, i.e., involving events 4 and 5.
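The event categorization of Table 1 can be written as a small lookup function, which is convenient when labelling template-query pairs for training and testing. This is only a sketch: the function name and the `same_identity` flag are assumptions. Table 1 lists events 1 and 2 with identical template and query properties, so the flag is introduced here to separate a genuine access from a zero-effort attempt by another subject; it is ignored whenever a fake sample is involved.

```python
def classify_event(template_is_live, query_is_live, same_identity):
    """Return (event number, attack type, desired class) following Table 1."""
    if template_is_live and query_is_live:
        if same_identity:
            return 1, "non-attack", "genuine"
        return 2, "zero-effort", "impostor"
    if template_is_live:   # live template, fake query
        return 3, "non-zero effort", "impostor"
    if query_is_live:      # fake template, live query
        return 4, "non-zero effort", "impostor"
    return 5, "non-zero effort", "impostor"  # fake template, fake query

# Scenario 1 uses events 1-3; scenario 2 additionally uses events 4 and 5.
for args in [(True, True, True), (True, True, False), (True, False, True),
             (False, True, True), (False, False, True)]:
    print(args, "->", classify_event(*args))
```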
Performance assessment: In order to compare the performance of the proposed framework and three realizations of Bayesian classifiers (GMM, Copula and QDA), we used equal error rate (EER), false acceptance and false rejection rates (FAR and FRR), but in the context of zero-effort and non-zero effort attacks. This gives rise to attack-specific measures such as the false acceptance rate of impostor samples (IFAR) for zero-effort attacks and the false acceptance rate of spoofed samples (SFAR) for non-zero effort attacks. The false rejection rate (FRR), which is an estimate of the probability of rejecting a genuine user at a given threshold, remains the same in both cases.
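These measures can be computed from classifier outputs as sketched below. The sketch assumes that higher outputs mean "more genuine", that the overall FAR pools zero-effort and spoof attempts (both belong to the impostor class), and that the EER operating point is approximated by the observed score at which FAR and FRR are closest; the function names and toy scores are hypothetical.

```python
def rate_above(scores, threshold):
    """Fraction of scores strictly above the threshold, i.e. accepted."""
    return sum(s > threshold for s in scores) / len(scores)

def error_rates(genuine, zero_effort, spoof, threshold):
    """Return (FRR, FAR, IFAR, SFAR) at the given threshold."""
    frr = 1.0 - rate_above(genuine, threshold)
    far = rate_above(zero_effort + spoof, threshold)  # pooled impostor class
    ifar = rate_above(zero_effort, threshold)         # zero-effort attempts only
    sfar = rate_above(spoof, threshold)               # spoof attempts only
    return frr, far, ifar, sfar

def eer_threshold(genuine, zero_effort, spoof):
    """Observed score at which |FAR - FRR| is smallest (approximate EER point)."""
    def gap(t):
        frr, far, _, _ = error_rates(genuine, zero_effort, spoof, t)
        return abs(far - frr)
    return min(sorted(set(genuine + zero_effort + spoof)), key=gap)

# Hypothetical log-likelihood-ratio outputs for the three kinds of attempt.
GENUINE_SCORES = [5.0, 4.0, 3.5, 3.0, 0.5]
ZERO_EFFORT_SCORES = [-4.0, -3.0, -2.5, -2.0, -1.0]
SPOOF_SCORES = [2.0, 1.0, 0.0, -0.5, 3.2]

eta = eer_threshold(GENUINE_SCORES, ZERO_EFFORT_SCORES, SPOOF_SCORES)
frr, far, ifar, sfar = error_rates(GENUINE_SCORES, ZERO_EFFORT_SCORES, SPOOF_SCORES, eta)
print(f"eta={eta}, FRR={frr:.2f}, FAR={far:.2f}, IFAR={ifar:.2f}, SFAR={sfar:.2f}")
```

Reporting IFAR and SFAR separately at the same threshold shows which kind of attempt is responsible for the false acceptances, information that the pooled FAR hides.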
4. Experimental Results
In this section, we investigate two case studies: 1) when
the attacks are executed during the system operation and the
templates have not been tampered with, and 2) when spoof
attacks are directed both at the templates as well as during
the system operation. We distinguish these two cases in order
to study the additional effect caused by spoof attacks at the
enrolled templates.
4.1. Case 1: Attacks during the system operation
This section presents the case when the classifiers are trained and tested for the events comprising genuine access, zero-effort impostor and spoof attacks executed during the system operation.
Figure 3 shows the ROC curves of the proposed framework (incorporating quality as well as liveness with the matching scores) for GMM, Copula and QDA (labeled as Quality + Liveness). In these figures, we compare the performance of the same classifiers incorporating only quality (labeled as Quality) and the baseline without incorporating quality and liveness measures (labeled as Baseline). The baseline performance has been evaluated using the standard biometric performance evaluation technique based on score distribution. These figures show that the proposed framework reduces the EER over the baseline by 52%, 48% and 29% for the GMM, Copula and QDA classifiers, respectively. Among the implementation variants, GMM outperforms Copula and QDA.
[Figure 3, panels (a) and (b): ROC curves, Genuine Acceptance Rate [%] versus False Acceptance Rate [%].
(a) Performance of Gaussian Mixture Model based Classifier Under Attack: Baseline (EER = 17.8%), Quality (EER = 16.59%), Quality and Liveness (EER = 8.6%).
(b) Performance of Gaussian Copula based Classifier Under Attack: Baseline (EER = 17.8%), Quality (EER = 17.3%), Quality and Liveness (EER = 9.2%).]
In the above experiments, we can observe a common trend: classifiers incorporating only quality measures, which can enhance the performance under genuine operation and zero-effort impostor attacks [7], may obtain only limited performance enhancement, or may even degrade in performance, when both zero-effort and non-zero effort attacks are present. For instance, for the QDA classifier, the EER of the baseline under attack is 17.8% whereas the EER of the quality-based system under attack is 19.22%. Similarly, for GMM, the EER of the quality-based system under attack is 16.59%. The reason for this is that spoofed samples of higher quality are likely to yield high matching scores. This is illustrated in Figure 4, which shows a scatter plot of matching score (y-axis) versus quality (x-axis) for spoofed fingerprints fabricated using silgum material. The same observation has been noted for spoofed samples fabricated from other materials as well (not shown here for the sake of space).
In Table 2, we tabulate the SFAR and IFAR of the classifiers, when the decision threshold is tuned to the EER, for the following methods: the proposed framework (labeled as Proposed), the classifier incorporating quality measures and matching scores (labeled as Quality), and the baseline system which uses the matching scores alone (Baseline). The false rejection rate (FRR) of each classifier is equal to its respective EER value. SFAR (IFAR) is computed as the ratio between the number of spoofed (casual impostor) samples accepted and the total number of spoofed (casual impostor) samples presented to the biometric system.
First of all, it can be observed that for all systems, SFAR
is much higher than IFAR. This shows that if the biometric
system is under attack, the probability of false acceptance
due to spoof attack is significantly higher than that due to
zero-effort attack. Therefore, reducing the SFAR should be
given a top priority while keeping false rejection rate to an
acceptable level.
It can be seen that the proposed framework significantly
reduces SFAR. The average improvement of the proposed
framework over the baseline system is estimated to be 58%.
A 50% reduction implies halving the false acceptance of
an active, dedicated spoof attack. Among these systems,
the Copula-based classifier achieves the smallest SFAR, i.e.,
15%. Furthermore, the FRR of the proposed framework,
which is equal to its EER computed on genuine operation,
is also the smallest.
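The 58% figure can be reproduced from the SFAR values reported in Table 2, assuming it is the mean of the per-classifier relative reductions with respect to the baseline SFAR. The averaging convention and the variable names below are assumptions; the numbers are those of Table 2.

```python
# SFAR values [%] from Table 2 (decision threshold set at the EER).
BASELINE_SFAR = 44.14
PROPOSED_SFAR = {"GMM": 17.14, "Copula": 15.02, "QDA": 23.70}

def relative_reduction(baseline, value):
    """Relative reduction in percent: 100 * (baseline - value) / baseline."""
    return 100.0 * (baseline - value) / baseline

reductions = {name: relative_reduction(BASELINE_SFAR, sfar)
              for name, sfar in PROPOSED_SFAR.items()}
average_reduction = sum(reductions.values()) / len(reductions)

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% lower SFAR than the baseline")
print(f"average: {average_reduction:.1f}%")
```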
On the other hand, IFAR values of the proposed systems
increase slightly compared to the baseline system. However,
the significant reduction in SFAR would represent an important benefit that outweighs a slight increase in IFAR, considering that in practice, IFAR is many times smaller than
SFAR.
A head-to-head comparison between the SFAR of the proposed framework and that of the quality-based system shows that the former outperforms the latter for all the classifiers, with an average relative difference of about 52% (i.e., the SFAR of the quality-based system is roughly 2 times larger than that of the proposed system). This shows the important role of liveness measures in countering spoof attacks [8].
[Figure 3(c): ROC curve, Genuine Acceptance Rate [%] versus False Acceptance Rate [%], of the Quadratic Discriminant Analysis based classifier under attack: Baseline (EER = 17.8%), Quality (EER = 19.22%), Quality and Liveness (EER = 12.6%).]
Figure 3. ROC curves of a) GMM, b) Copula and c) QDA based classifiers implementing the proposed system design under attack. Comparative analysis has been made with these classifiers incorporating only quality measures and the baseline without incorporating quality and liveness measures.
Table 2. SFAR and IFAR of GMM, Copula and QDA implementing the proposed framework. Comparison has been made with those incorporating only quality and the baseline. In all cases, the decision threshold has been set at the Equal Error Rate.
Classifier   Type       SFAR [%]   IFAR [%]
GMM          Proposed   17.14      1.40
GMM          Quality    38.40      0.80
Copula       Proposed   15.02      3.20
Copula       Quality    38.30      1.20
QDA          Proposed   23.70      2.82
QDA          Quality    40.83      2.43
Baseline     N/A        44.14      0.07
[Figure 4: scatter plot with Quality Measure on the x-axis and Matching Score on the y-axis.]
Figure 4. Scatter plot of matching score (y-axis) versus quality (x-axis) for spoofed fingerprint samples fabricated using silgum material.
A shortcoming of the above experiment is that the classifiers have been trained using all spoof fabrication materials.
This is an overly optimistic scenario because in practice, it is
impossible to consider all types of spoof fabrication materials for the implementation of the proposed framework. For
this reason, in the next section, we shall investigate a scenario
where the fake fingerprint impressions made from different
types of spoof fabrication materials are used for implementing the proposed framework.
Efficacy of different spoof fabrication materials for the
proposed framework
In these experiments, a classifier is trained with spoofed
samples generated from one single type of material and tested
on samples from genuine and zero-effort impostor attacks,
as well as spoofed samples fabricated from all the available
materials. Therefore, the training set consists of only 200
spoofed samples for each material type (recalling that we
have five materials) whereas the test set consists of 1000
spoofed samples from all the spoof fabrication materials
available.
For instance, GMM, Copula and QDA based classifiers
are trained for data samples taken from events 1 and 2 as well
as event 3 using spoofed samples only from silicone based
material. The classifiers are then tested against events 1 and
2, as well as event 3 using spoofed samples fabricated from
all the available materials i.e., latex, silicone, woodglue, gelatine and ecoflex. By doing so, we can also evaluate the efficacy of different types of spoof fabrication material in training the proposed framework against general, unknown spoof
attacks.
Figure 5 plots the Equal Error Rate (EER) of the GMM, Copula and QDA based classifiers trained using different spoof fabrication materials (listed on the x-axis). As a control experiment, we also include the performance of the classifiers trained using all five available spoof materials (indicated by the keyword “All” on the x-axis in Figure 5). These figures consistently suggest the efficacy of the ecoflex spoof fabrication material in training the classifiers of our proposed scheme. Despite using only a very small training set, the EER of GMM trained with ecoflex is 9.38%, which is only slightly higher than the EER of the most optimistic scenario, where all possible spoof fabrication materials have been used for training, i.e., 8.6%.
Regarding the efficacy of ecoflex material, our conjecture
is that ecoflex material is able to produce fake fingerprint impressions of better quality than the other materials. Figure 6
shows the fake fingerprint image fabricated using ecoflex and
silgum material for the same finger.
4.2. Case 2: Attacks to the enrolled templates
The objective of the experiments here is to assess the impact of a compromised biometric database – one where fake
fingerprint templates have been introduced – on the proposed
framework. In this case study, classifiers are trained with
samples obtained from all the 5 events listed in Table 1. Table 3 lists the performance in terms of EER, SFAR and IFAR
of the following classifiers: those of the proposed framework
(labeled as Proposed), those incorporating only quality (labeled as Quality), and as a control experiment, the baseline (score only) system.
This table shows that the EER and SFAR of the baseline system increase by 26% and 53%, respectively, under case 2, in comparison to the performance of the baseline for case 1 (with all materials, as shown in Table 2). The FRR of each classifier is equal to its EER. The GMM classifier trained with the proposed framework continues to outperform the other classifiers in case 2. For instance, it reduces the EER of the baseline by 56% and the SFAR of the baseline by 60%. When comparing the result to case 1, we observe that the GMM-based Bayesian classifier also reduces the IFAR of the baseline by 50%. The above observations suggest the importance of considering all the possible events (1-5) when designing a fusion classifier that combines liveness measures, quality measures and a matching score.
Table 3. EER, SFAR and IFAR of GMM, Copula and QDA incorporating the proposed framework for case study two. Comparison has been made with those incorporating only quality and the baseline. In all cases, the decision threshold has been set at the Equal Error Rate.
Classifier   Type       EER [%]   SFAR [%]   IFAR [%]
GMM          Proposed   9.91      27.40      0.04
GMM          Quality    20.00     56.39      6.94
Copula       Proposed   12.35     45.50      0.09
Copula       Quality    20.00     47.77      6.15
QDA          Proposed   17.60     38.40      0.11
QDA          Quality    26.53     60.64      12.55
Baseline     N/A        22.38     67.60      0.08
[Figure 5: three bar-chart panels titled Gaussian Mixture Model (GMM), Gaussian Copula and Quadratic Discriminant Analysis (QDA); y-axis: Equal Error Rate [%]; x-axis: All, EcoFlex, Gelatine, Latex, Silgum, WoodGlue.]
Figure 5. The EER obtained for GMM, Copula and QDA trained with different types of spoof fabrication materials (listed on the x-axis) for the proposed framework. These classifiers are tested against spoofed samples from all the available materials.
Figure 6. Fake fingerprint image fabricated using a) Ecoflex and b) Silgum material (left to right).
All the trained classifiers continue to outperform those incorporating only quality and the baseline (without quality and liveness). Furthermore, the performance of GMM incorporating the proposed framework is superior to that of GMM incorporating only liveness measures with the biometric system (EER = 11.2%). Figure 7 shows the ROC curves of GMM incorporating the proposed framework, only quality and the baseline (without quality and liveness) for case study two.
In summary, the proposed system design incorporating
liveness as well as quality can reduce the error rate (EER) as
well as increase the robustness of the biometric system under
attack, in terms of SFAR and IFAR, directed at the templates
as well as during the system operation.
[Figure 7: ROC curve, Genuine Acceptance Rate [%] versus False Acceptance Rate [%], titled "Performance of Gaussian Mixture Model based classifier under attack (Case 2)": Baseline (EER = 22.38%), Quality (EER = 20%), Quality and Liveness (EER = 9.91%).]
Figure 7. ROC curves of the GMM based classifier incorporating the proposed framework, only quality and the baseline for case study two.
5. Conclusions
This paper investigates biometric system design under zero- and non-zero effort attacks by combining quality and liveness measures with a biometric system at the score level. The framework has been implemented using three generative classifiers based on GMM, Gaussian Copula and QDA. Experimental investigations on the LivDet11 database reveal the following findings:
• Quality-based biometric systems that enhance the performance under genuine and zero-effort impostor attack may degrade in performance under spoof attacks. This is particularly acute as advances in spoofing techniques may lead to the production of high-quality spoofed samples.
• Fortunately, the combined use of quality and liveness measures provides a possible countermeasure that thwarts both zero-effort and non-zero effort attacks.
• Ecoflex as a spoof fabrication material appears to be effective in training the proposed information fusion framework to counteract general spoof attacks.
• Finally, our experiments suggest that the proposed system design can reduce the EER, SFAR and IFAR of a biometric system by about 56%, 60% and 50%, respectively, over the baseline system under attack, directed at the templates as well as during the system operation.
The security of the proposed framework can be further enhanced by adopting multibiometrics [16], by incorporating user-specific characteristics for attacks [15], and by improving the sensitivity of its underlying liveness measure. To extend the proposed framework to a multibiometric system, one simply combines the joint class-conditional densities of $M$ biometric modalities by summing the log-likelihood ratios as $\sum_{i=1}^{M} \log \frac{p(x_i \mid C)}{p(x_i \mid I)}$, where $x_i$ is an observation vector for biometric modality $i$.
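The multibiometric extension amounts to adding the per-modality log-likelihood ratios, which corresponds to treating the modalities as conditionally independent given the class. A minimal sketch follows; the function name and the likelihood values are hypothetical, and in practice each pair would come from the density models of Section 2 evaluated on that modality's observation vector.

```python
import math

def fused_llr(per_modality_likelihoods):
    """Sum of log p(x_i|C) - log p(x_i|I) over the modalities.

    per_modality_likelihoods is a list of (p_genuine, p_impostor) pairs,
    one pair per biometric modality.
    """
    return sum(math.log(p_c) - math.log(p_i) for p_c, p_i in per_modality_likelihoods)

# Hypothetical densities for two modalities, e.g. fingerprint and face.
example = [(0.8, 0.2), (0.3, 0.6)]
print(fused_llr(example))  # compared against the threshold eta as in the single-modality case
```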
Acknowledgement: Poh was partially supported by Biometrics
Evaluation and Testing (BEAT), an EU FP7 project with grant no.
284989.
References
[1] LivDet 2011: Fingerprint liveness detection competition. http://people.clarkson.edu/projects/biosal/fingerprint/index.php.
[2] Z. Akhtar. Security of Multimodal Biometric Systems against
Spoof Attacks. PhD thesis, Dept. of Electrical and Electronic
Engineering, University of Cagliari, Cagliari, Italy, 2012.
[3] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification.
Wiley-Interscience Publication, 2000.
[4] J. Fierrez-Aguilar, Y. Chen, J. Ortega-Garcia, and A. K. Jain.
Incorporating image quality in multi-algorithm fingerprint verification. In Proc. of Intl. Conf. on Biometrics (ICB), pages
213–220, Hong Kong, 2006.
[5] M. Figueiredo and A. Jain. Unsupervised learning on finite
mixture models. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24, 2002.
[6] L. Ghiani, G. L. Marcialis, and F. Roli. Experimental results
on the feature-level fusion of multiple fingerprint liveness detection algorithms. In Proc. of 14th ACM Workshop on Multimedia and Security, pages 157–164, Coventry, UK, 2012.
[7] K. Kryszczuk, J. Richiardi, and A. Drygajlo. Impact of combining quality measures on biometric sample matching. In
Proc. of IEEE Intl. Conf. on BTAS, pages 133–138, Piscataway, NJ, USA, 2009.
[8] E. Marasco, Y. Ding, and A. Ross. Combining match scores
with liveness values in a fingerprint verification system. In
Proc. of IEEE Intl. Conf. on BTAS, pages 1–8, Washington,
USA.
[9] T. Matsumoto, H. Matsumoto, K. Yamada, and S. Hoshino.
Impact of artificial "gummy" fingers on fingerprint systems.
In Proc. of SPIE Opt. Sec. Counterfeit Deterrence Tech. IV,
pages 275–289, 2002.
[10] K. Nandakumar, Y. Chen, S. C. Dass, and A. K. Jain. Quality-based score level fusion in multibiometric systems. In Proc.
of Intl. Conf. on Pattern Recognition (ICPR), volume 4, pages
473–476, Hong Kong, China, August 2006.
[11] R. Nelsen. An Introduction to Copulas. Springer, 1999.
[12] S. Nikam and S. Aggarwal. Local binary pattern and wavelet-based spoof fingerprint detection. Intl. Journal of Biometrics,
1(2):141–159, 2008.
[13] N. Poh and J. Kittler. A unified framework for biometric expert
fusion incorporating quality measures. IEEE Trans. on Pattern
Analysis and Machine Intelligence, 34(1):3–18, 2012.
[14] A. Rattani. Adaptive biometric system based on template update procedures. PhD thesis, Dept. of Electrical and Electronic
Engineering, University of Cagliari, Cagliari, Italy, 2010.
[15] A. Rattani, N. Poh, and A. Ross. Analysis of user-specific
score characteristics for spoof biometric attacks. In Proc. of
IEEE Computer Society Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 124–129, Providence, USA, 2012.
[16] A. Ross, K. Nandakumar, and A. K. Jain. Handbook of Multibiometrics. Springer Verlag, 2006.
[17] S. Schuckers. Spoofing and anti-spoofing measures. Information Security Technical Report, 7:56–62, 2002.
[18] K. A. Toh, W.-Y. Yau, E. Lim, L. Chen, and C.-H. Ng. Fusion
of auxiliary information for multi-modal biometrics authentication. In Proc. of Intl. Conf. on Biometrics (ICB), pages
678–685, Hong Kong, China, 2004.