ISCAS 2000 - IEEE International Symposium on Circuits and Systems, May 28-31, 2000, Geneva, Switzerland
FACE RECOGNITION
Vinayadatt V Kohir and U. B. Desai
Department of Electrical Engineering
Indian Institute of Technology
Mumbai, INDIA. 400 076
email: {vvkohir,ubdesai}@ee.iitb.ernet.in
ABSTRACT
A transform domain face recognition approach is presented.
The DCT is coupled with the HMM to achieve a recognition
rate of 100% on the ORL face database of 40 subjects with 10
images per subject. The recognition time for the ORL database
is a little over 2 s. Five images of each subject are used to train
the HMM and the remaining five are used for the recognition test. The
proposed method is also tested on another face database of 249
subjects with 3 training images and 4 test images per subject,
for which the recognition rate is 90%. Recognition is also tested
at different resolutions, with the recognition rate varying from
100% to 95% depending on the resolution. Further, a simple
scheme is proposed to incorporate rejection of images of new
subjects. On the ORL database, 100% rejection occurs for the
images of new subjects.
1. INTRODUCTION
Face recognition has potential applications in areas like military, security, electronic line-up etc., and hence has been a
topic of interest in the last couple of decades [1, 2, 3, 4, 5, 6].
This paper presents two transform domain schemes for
face recognition with the basic block being the HMM (Hidden Markov Model). The proposed method combines DCT
(Discrete Cosine Transform) with HMM to exploit the best
of the two. The face recognition has two steps - HMM training, and then the actual face recognition.
For every subject to be recognized, an HMM is trained using the training face images and labeled accordingly. Recognition is carried out by computing the state optimized probability estimate $P(O, Q \mid \lambda_i)$ for every HMM $\lambda_i$, and then selecting
the HMM label with the highest state optimized probability estimate.
The proposed schemes are tested on the ORL (Olivetti Research Laboratory) database and the SPANN (Signal Processing And Artificial Neural Networks) laboratory database. For
comparison, the eigenface method of [2] is implemented;
its recognition rate on the ORL face database is 88%.
2. HIDDEN MARKOV MODEL [7]
A 1D HMM is associated with interconnected non-observable
(hidden) states manifested by the observable vector sequence.
An HMM $\lambda$ is characterized by three parameters $(A, B, \Pi)$.
Let $O = (o_1, o_2, \ldots, o_T)$, where each $o_t$ is a $D$-element
observation vector, be the observation sequence at $T$ different observation instances, and let the corresponding state sequence be $Q = (q_1, q_2, \ldots, q_T)$, where $q_t \in \{1, 2, \ldots, N\}$,
$N$ being the number of states in the model. Then the HMM
$\lambda = (A, B, \Pi)$ is defined as follows:
$A$: is the transition probability matrix. The elements of
$A$ are $a_{ij} = P(q_{t+1} = j \mid q_t = i)$.
$B$: is the emission probability matrix determining the
output observation given that the HMM is in a particular
state. Every element of this matrix, $b_j(o_t)$, $1 \le j \le N$ and $1 \le t \le T$, is the posterior density of observation
$o_t$ at time $t$ given that the HMM is in state $q_t = j$.
$\Pi$: is the initial state distribution matrix with $i$-th entry
$\pi_i = P(q_1 = i)$ being the probability of being in state $i$ at
the start of the observation.
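For reference, with these three parameters the joint probability of an observation sequence and one particular state sequence factorizes as shown below. This is the standard HMM identity (see [7]), written in the notation of this section; the symbol $Q^*$ for the maximizing state sequence is introduced here only for illustration. The state optimized probability estimate used for recognition is this quantity maximized over $Q$.

```latex
P(O, Q \mid \lambda) = \pi_{q_1}\, b_{q_1}(o_1) \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(o_t),
\qquad
Q^* = \arg\max_{Q} P(O, Q \mid \lambda)
```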
3. VECTOR SEQUENCE GENERATION
Since the proposed method uses HMMs for face recognition,
the 2D face image data must be converted to 1D data without losing significant information. A DCT based method
is proposed to generate a 1D vector sequence from the 2D images.
3.1. Subimage sequence generation
A square sampling window is slid over the entire face image in raster scan fashion, from the top left corner of the face
image to the bottom right corner, with a predefined overlap between successive window positions. The grey levels captured by the sampling window
form a subimage. Each face image thus generates a
subimage sequence.
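The raster scan described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `subimage_sequence`, the list-of-rows image representation and the rounding used to turn the overlap into a step size are all assumptions. The default window size and overlap are the values quoted in Section 5.

```python
def subimage_sequence(image, window=16, overlap=0.75):
    """Slide a square window over `image` (a list of pixel rows) in
    raster-scan order and return the list of subimages.

    Assumption: the step between window positions is
    window * (1 - overlap), rounded to a whole pixel (at least 1).
    """
    step = max(1, int(round(window * (1.0 - overlap))))
    rows, cols = len(image), len(image[0])
    subimages = []
    # Top-to-bottom, and left-to-right within each band of rows.
    for top in range(0, rows - window + 1, step):
        for left in range(0, cols - window + 1, step):
            subimages.append(
                [row[left:left + window] for row in image[top:top + window]]
            )
    return subimages
```

With a 16 x 16 window and 75% overlap the window advances 4 pixels at a time, so an image of 112 rows by 92 columns (the size of the ORL images) would give 25 x 20 = 500 subimages under this sketch.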
3.2. DCT based vector sequence
Since the DCT transforms spatial information to decoupled
frequency information in the form of DCT coefficients with
excellent energy compaction, it is used to obtain the transformed
vector sequence from the subimage sequence.
Each subimage is DCT transformed to obtain its DCT
coefficients. The low frequency coefficients of the DCT matrix are
arranged as a vector. An observation sequence is obtained
from the input face image after scanning and DCT transforming. This DCT vector sequence is used for
HMM training and/or testing.
0-7803-5482-6/99/$10.00 ©2000 IEEE
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY BOMBAY. Downloaded on December 3, 2008 at 02:25 from IEEE Xplore. Restrictions apply.
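The DCT step of Section 3.2 can be sketched as below. Several details are assumptions made only for illustration: the orthonormal DCT-II scaling, the zigzag order used to pick the low frequency coefficients (the text only says that low frequency coefficients are arranged as a vector), and the helper names `dct2`, `zigzag_indices` and `dct_vector`.

```python
import math

def dct2(block):
    """2D DCT-II (orthonormal scaling, an assumption) of a square block."""
    n = len(block)
    # basis[k][x] = c(k) * cos((2x + 1) * k * pi / (2n))
    basis = [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
              * math.cos((2 * x + 1) * k * math.pi / (2 * n))
              for x in range(n)] for k in range(n)]
    # Transform along one axis, then along the other.
    tmp = [[sum(basis[u][x] * block[x][y] for x in range(n))
            for y in range(n)] for u in range(n)]
    return [[sum(tmp[u][y] * basis[v][y] for y in range(n))
             for v in range(n)] for u in range(n)]

def zigzag_indices(n):
    """(row, col) pairs of an n x n matrix ordered from low to high frequency."""
    return sorted(((u, v) for u in range(n) for v in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def dct_vector(block, num_coeffs=10):
    """Observation vector: the first `num_coeffs` DCT coefficients in zigzag order."""
    coeffs = dct2(block)
    return [coeffs[u][v] for u, v in zigzag_indices(len(block))[:num_coeffs]]
```

Applying `dct_vector` to every subimage of a face image, in scan order, would give one observation sequence of the kind used for training and testing.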
4. HMM FOR FACE RECOGNITION
HMM Training: The following steps give a procedure for ergodic HMM training.
Step 1: Cluster all $R$ training vector sequences, generated
from the $R$ training face images of the subject to be
recognized, i.e. $\{O_w\}$, $1 \le w \le R$, each of length $T$, into $N$
clusters using some clustering algorithm, say the k-means clustering algorithm. Each cluster represents a state of the
training vectors.
Step 2: Assign the cluster number of the nearest cluster to each
training vector, i.e. the $t$-th training vector is assigned
the number $i$ if its distance, say Euclidean distance, to cluster $i$ is smaller
than its distance to any other cluster $j$, $j \ne i$.
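Step 3: Calculate the mean $\{\mu_i\}$ and covariance matrix $\{\Sigma_i\}$
for each state (cluster):
$$\mu_i = \frac{1}{N_i} \sum_{o_t \in i} o_t, \qquad \Sigma_i = \frac{1}{N_i} \sum_{o_t \in i} (o_t - \mu_i)(o_t - \mu_i)^T \quad \text{for } 1 \le i \le N,$$
where $N_i$ is the number of vectors assigned to state $i$.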
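Step 4: Calculate the $A$ and $\Pi$ matrices using event counting:
$$\pi_i = \frac{\text{No. of occurrences of } o_1 \in i}{\text{No. of training sequences } (= R)} \quad \text{for } 1 \le i \le N$$
$$a_{ij} = \frac{\text{No. of occurrences of } o_t \in i \text{ and } o_{t+1} \in j}{\text{No. of occurrences of } o_t \in i} \quad \text{for } 1 \le i, j \le N \text{ and for } 1 \le t \le T - 1$$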
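Step 5: Calculate the $B$ matrix of probability densities for each
training vector and each state. Here we assume that
$b_j(o_t)$ is Gaussian. For $1 \le j \le N$,
$$b_j(o_t) = \frac{1}{(2\pi)^{D/2} |\Sigma_j|^{1/2}} \exp\left[-\frac{1}{2} (o_t - \mu_j)^T \Sigma_j^{-1} (o_t - \mu_j)\right]$$
where $o_t$ is of size $D \times 1$.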
Step 6: Now use the Viterbi algorithm [7] to find the optimal
state sequence $Q^*$ for each training sequence. Here, the state
reassignment is done: a vector is assigned state $i$ if $q_t^* = i$.
Step 7: If there is any state reassignment, then repeat Steps
3 to 6; else STOP, and the HMM is trained for the given
training sequences.
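The event counting of Step 4 can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: states are numbered from 0, each training sequence is given as its list of state labels (from Step 2 or Step 6), the function name `estimate_pi_and_A` is hypothetical, and giving a uniform row to a state that is never left is a choice made here to avoid dividing by zero.

```python
def estimate_pi_and_A(state_sequences, num_states):
    """Estimate the initial distribution and transition matrix by counting.

    state_sequences: one list of state labels (0 .. num_states - 1) per
    training sequence.
    """
    R = len(state_sequences)
    pi = [0.0] * num_states
    counts = [[0.0] * num_states for _ in range(num_states)]
    for seq in state_sequences:
        # pi_i: fraction of training sequences whose first vector is in state i.
        pi[seq[0]] += 1.0 / R
        # Count transitions i -> j over t = 1 .. T-1.
        for s, s_next in zip(seq, seq[1:]):
            counts[s][s_next] += 1.0
    A = []
    for row in counts:
        total = sum(row)  # occurrences of state i for t = 1 .. T-1
        # Assumption: a state with no outgoing transitions gets a uniform row.
        A.append([c / total if total > 0 else 1.0 / num_states for c in row])
    return pi, A
```

For example, the two label sequences `[0, 0, 0, 1]` and `[0, 1, 1, 1]` give `pi = [1.0, 0.0]` and `A = [[0.5, 0.5], [0.0, 1.0]]`.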
Face Recognition: For the face image to be recognized, the
vector sequence is generated from the mean subtracted image
as described in Section 3. The trained HMMs are
used to compute the likelihood function as follows: Let $O$
be the DCT based vector sequence obtained from the mean
subtracted face image to be recognized.
1. Compute $Q_i^* = \arg\max_Q P(O, Q \mid \lambda_i)$ using the Viterbi algorithm [7].
2. The recognized face corresponds to that $i$ for which the
likelihood function $P(O, Q_i^* \mid \lambda_i)$ is maximum.
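The two recognition steps above can be sketched with a log-domain Viterbi recursion. The sketch assumes that the emission log-densities $\log b_j(o_t)$ have already been evaluated into a table `log_B[t][j]` for each model; working in logarithms, the function names and the dictionary layout of `models` are choices made for this illustration, not details given in the paper.

```python
import math

def viterbi_log_score(log_pi, log_A, log_B):
    """Return (max over Q of log P(O, Q | lambda), best state path).

    log_pi[j]   : log initial probability of state j
    log_A[i][j] : log transition probability from state i to state j
    log_B[t][j] : log b_j(o_t), precomputed for the observation sequence
    """
    num_states = len(log_pi)
    delta = [log_pi[j] + log_B[0][j] for j in range(num_states)]
    backpointers = []
    for t in range(1, len(log_B)):
        prev = delta
        delta, ptr = [], []
        for j in range(num_states):
            best_i = max(range(num_states), key=lambda i: prev[i] + log_A[i][j])
            ptr.append(best_i)
            delta.append(prev[best_i] + log_A[best_i][j] + log_B[t][j])
        backpointers.append(ptr)
    # Backtrack from the best final state.
    last = max(range(num_states), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

def recognize(models):
    """models: {label: (log_pi, log_A, log_B)}; return the best label and all scores."""
    scores = {label: viterbi_log_score(*params)[0]
              for label, params in models.items()}
    return max(scores, key=scores.get), scores
```

Logarithms are used because the product of many small densities underflows for long sequences; the ranking of the models is unchanged since the logarithm is monotonic.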
5. EXPERIMENTAL EVALUATION
Two face databases, the ORL database and the SPANN database,
are used here. The face databases consist of both male
and female subjects with some facial expressions and facial
accessories. No precise control over lighting, head orientation or facial expressions was exercised while capturing the
face images. All the images are 256 grey level images.
The training and recognition steps are as follows:
Training: Select the number of images per subject for training the HMM. Then the HMM is trained as follows:
Step 0: Construct the mean image from all the training images.
Step 1: Subtract the training image from the mean image to obtain
the mean subtracted image, and sample it with a square sampling
window, say of 16 x 16 size with 75% overlap, to generate
the sequence $O$ as described in Section 3. Take the DCT of each
subimage enclosed by this sampling window. Scan the DCT
matrix and select a few significant DCT coefficients, say 10, to
form a vector.
Step 2: Repeat Step 1 for all the training images. This step
gives the set of training vector sequences.
Step 3: Use the training algorithm described earlier to train
the 5 state ergodic HMM.
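The mean image construction and subtraction used in the training steps can be sketched as below. Images are represented as lists of pixel rows and the function names are hypothetical. The sign convention (mean minus image) follows the wording of the text; the opposite convention only flips the sign of every value, provided the same convention is used for training and recognition.

```python
def mean_image(images):
    """Pixel-wise mean of equally sized images (each a list of rows)."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]

def mean_subtracted(image, mean):
    """Subtract `image` from `mean`, pixel by pixel (mean - image)."""
    return [[m - p for p, m in zip(image_row, mean_row)]
            for image_row, mean_row in zip(image, mean)]
```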
Recognition: The recognition test is performed on face
images which are not part of the training set. Each face image
is recognized by following the steps outlined below. Care
is taken that the sampling window size, amount of overlap,
transformation and number of coefficients used in the recognition step are identical to those of the training step.
Step 1: Construct the mean subtracted face image by subtracting the test image from the mean image of the training set.
Generate the DCT based vector sequence from the mean subtracted face image, exactly as in the training phase.
Step 2: Use the Viterbi algorithm to decode the state sequence and find the state-optimized likelihood function for
all the stored HMMs, namely $P(O, Q_i^* \mid \lambda_i)$ for all $i$, where
$Q_i^* = \arg\max_Q P(O, Q \mid \lambda_i)$ and $O$ is the DCT-based vector sequence corresponding to the face image to be recognized.
Step 3: Select the label of the HMM for which the likelihood
function is maximum.
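Evaluating the likelihood in Step 2 requires the density of each observation vector under each state. Under the Gaussian assumption of Section 4, its logarithm can be computed as sketched below. The Cholesky-based evaluation, the pure-Python list representation and the function names are choices made for this illustration; the sketch also assumes that each state covariance matrix is symmetric positive definite.

```python
import math

def cholesky(S):
    """Lower-triangular L with L * L^T = S (S symmetric positive definite)."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def gaussian_log_density(o, mean, cov):
    """log of the D-dimensional Gaussian density with the given mean and covariance."""
    D = len(o)
    L = cholesky(cov)
    diff = [o[d] - mean[d] for d in range(D)]
    # Forward substitution L y = diff, so that y . y is the Mahalanobis distance.
    y = []
    for i in range(D):
        y.append((diff[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    mahalanobis = sum(v * v for v in y)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(D))
    return -0.5 * (D * math.log(2.0 * math.pi) + log_det + mahalanobis)
```

Calling this function for every observation $o_t$ and every state $j$ of a model would fill the table of emission log-densities needed by a log-domain Viterbi decoder.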
5.1. New Subject Rejection
In most face recognition work, a new subject's face image is not rejected. This feature is
essential for person (subject) authentication. The proposed
technique is slightly modified and tested for authentic face
recognition.
As earlier, for every subject to be recognized as 'AUTHORIZED' and allowed access, a 'SUBJECT HMM' is built.
In addition, a separate HMM, the 'COMMON HMM', is built
and trained using all mean subtracted training face images of
all the authorized subjects. The state optimized probability
estimate is calculated for all the HMMs, including the common
HMM, and the estimate of the common HMM is compared
with those of all the subject HMMs. If the state optimized probability of the common HMM is the highest, then the input face
image is rejected as 'UNAUTHORIZED'; otherwise the input
face image is treated as 'AUTHORIZED' and recognition is
performed using the recognition algorithm stated earlier.
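The rejection rule can be sketched as a small decision function. It assumes that the state optimized (log-)probability estimates have already been computed for every subject HMM and for the common HMM; the function name, the dictionary of scores and the use of `None` to signal 'UNAUTHORIZED' are illustrative choices. A tie with the common HMM is treated as authorized here, which the text does not specify.

```python
def authorize(subject_scores, common_score):
    """Return the recognized subject label, or None if the image is rejected.

    subject_scores: {label: state optimized score of that SUBJECT HMM}
    common_score  : state optimized score of the COMMON HMM
    """
    best_label = max(subject_scores, key=subject_scores.get)
    # Reject when the common HMM scores strictly higher than every subject HMM.
    if common_score > subject_scores[best_label]:
        return None
    return best_label
```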
5.2. Results and Discussion
The DCT-HMM method is tested on the ORL database, which has
40 subjects with 10 different images each. 5 poses of each subject
are used to train the HMM, and the remaining 5 poses are used
for recognition (see Fig. 1). A sampling window of 16 x 16 with
75% overlap is used to generate the 1D transformed vector sequences. The first 10 significant DCT coefficients are used to
form a vector from each DCT transformed subimage. A recognition rate of 100% is obtained. An average recognition time of
a little over 2 s is obtained on a Pentium 200 MHz machine.
When the recognition test is carried out on the full SPANN (in-house) database
of 249 subjects (3 training poses and 4 test poses, i.e. 249 x
4 test images), a recognition rate of 90% is obtained. Sample training
poses and test poses are in Fig. 2. With 20 randomly chosen subjects
from the SPANN database, the recognition rate is 95%. To substantiate the above findings, the eigenface
based method [2] is implemented. Its recognition rate for the
ORL face database is 88% and for the SPANN face database is
77.5% (see Table 1). Comparative results for the ORL face database, as reported by the
respective authors, are given in Table 2. Also, recognition at
different resolutions is investigated using the ORL face database; the results
are in Table 3. The images are converted to different resolutions using the pyramid algorithm proposed in
[9].
To validate new subject rejection, the ORL face database
is segmented into two different sets: (i) 20 subjects corresponding to the authorized (known) subject class and (ii) the remaining
20 subjects corresponding to the unauthorized (new) class. 5 poses of each authorized subject are used to train the HMMs. The remaining 5
poses of the respective authorized subject are used for subject validation (recognition - authorization). All the 10 poses
of each unauthorized subject are used for authentication. All
the 200 images of the unauthorized class are rejected as 'UNAUTHORIZED', and 17 of the 100 test images of the authorized class are also rejected
as 'UNAUTHORIZED', i.e. 100% rejection of new subjects
and 83% acceptance (17% rejection) of known subjects.
Table 2: Comparative recognition results of some of the
other methods as reported by the respective authors on the ORL
face database. The last three methods indicate the timings on a
Pentium 200 MHz machine in a multiuser environment.
Table 3: Results of recognition at different resolutions for
the proposed DCT-HMM based face recognition scheme on the
ORL face database.
6. PROPOSED FUTURE WORK
Towards achieving a full-fledged face recognition system, a
face detection system (for cluttered backgrounds) and the recognition system are to be combined. The face is first detected
in a given photograph and then cropped. The cropped face
image is then subjected to the proposed recognition technique.
Table 1: % Recognition for the DCT-HMM and eigenface methods.

Database                      DCT-HMM   Eigenface method
ORL Faces (40 subjects)       100       88
SPANN Faces (249 subjects)    90        77.5

Figure 1: Sample ORL face images. The first 5 columns are training face images and the last 5 columns are test images.
Figure 2: Sample SPANN face images. The first 3 columns are training face images and the last 4 columns are test images.
Figure 3: Graph showing dependence of recognition rate, average HMM training time and average recognition time for the SPANN database of 20 subjects.
Figure 4: Dependence of recognition rate, average HMM training time and recognition time on the number of HMM states for 20 SPANN subjects.

7. REFERENCES
[1] R. Chellappa, C. L. Wilson and S. A. Sirohey. "Human and Machine Recognition of Faces: A Survey". IEEE Proceedings, 83(3):704-740, May 1995.
[2] M. Turk and A. Pentland. "Eigenfaces for Recognition". Journal of Cognitive Neuroscience, MIT, 3(1), 1991.
[3] S. M. Lucas. "Face Recognition with Continuous n-tuple Classifier". BMVC'97, 1997.
[4] S. H. Lin, S. Y. Kung and L. J. Lin. "Face Recognition/Detection by Probabilistic Decision Based Neural Network". IEEE Trans. on Neural Networks, 8(1):114-132, Jan. 1997.
[5] F. S. Samaria. "Face Recognition using Hidden Markov Models". PhD thesis, University of Cambridge, UK, 1994.
[6] S. Lawrence, C. L. Giles, A. C. Tsoi and A. D. Back. "Face Recognition: A Convolutional Neural-Network Approach". IEEE Trans. on Neural Networks, 8(1):98-113, Jan. 1997.
[7] L. R. Rabiner. "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition". IEEE Proceedings, 77(2):257-285, 1989.
[8] Vinayadatt V. Kohir and U. B. Desai. "DWT-HMM Based Face Recognition". In Proceedings of ICVGIP'98, New Delhi, India, Dec. 1998.
[9] S. Peleg, O. Federbusch, and R. Hummel. "Custom Made Pyramids". In Leonard Uhr, editor, Parallel Computer Vision, pages 125-146. Academic Press, 1987.
[10] Vinayadatt V. Kohir and U. B. Desai. "Face Recognition based on Statistical Technique". In Proceedings of ICAPRDT'99, Calcutta, India, Dec. 1999.