FACE DETECTION USING PECULIAR POINT TECHNIQUE¹
O. S. Seredin², I. A. Krestinin²
² Tula State University, 300600, Tula, pr. Lenina, 92, RF
[email protected], [email protected]
This paper applies an image fragment localization algorithm based on
the notion of peculiar points to the face detection problem. The object to be
localized is the region of the eyes and nose, and a model of the pupil defines the
peculiar points. To localize several faces and to improve the results of the initial
localization algorithm, we use the principle of two-class pattern recognition:
test fragments are labeled as face or non-face.
Introduction
Identifying a person from his or her photograph
is a topical task. One stage of solving this
problem is face detection (localization) in an
image, usually a raster image.
Many papers are devoted to this topic [1-3, 4];
however, the results demonstrated by existing
algorithms are, as a rule, not good enough.
One reason for the reduced quality is that
some factors are not taken into account,
namely head slope and the varying size of the
face image, which depends on how close the
person is to the camera or other registering device.
We assume that applying the peculiar point
technique [5] will improve the quality of the
decision. The general localization algorithm
was designed to search for arbitrary fragments,
so when applied to face detection it does not
give perfect results. In this paper we discuss
some modifications that take into account the
specifics of the problem and are intended to
improve recognition quality.
1. Eye model building
Traditionally, many localization algorithms
use a model of the human eye as the principal
object of search. It is important that some
types of images restrict this model:
• small size of the face within the image: the
size of the pupil is comparable with the
resolution limit of the registering device, so
in a raster image the pupil may occupy a
single pixel;
• unknown pupil size: different distances
from the camera give different pupil sizes;
• closed eyes: formally the pupils are not
present in such images; however, for face
detection it is desirable that the model
cover this situation. For example, it is
possible to build a model of the closed
eyelid.
As a model satisfying the above requirements
we used the simplest description of the pupil:
a local minimum of the brightness function.
This model is simple enough to allow a quick
search and at the same time sufficiently
“invariant”, since it does not depend on the
pupil size in the picture. Moreover, this model
covers the situation of closed eyes: among the
closed eyelashes it is possible to find one
or more minima (Fig. 1).
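The pupil model described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows and treats a pixel as a peculiar point when it is strictly darker than all of its 8 neighbours. The neighbourhood, the function name local_minima and the toy image are illustrative assumptions, not details given in the paper.

```python
def local_minima(image):
    """Return (row, col) positions that are strict local minima of brightness.

    `image` is a list of equal-length rows of numbers. A pixel counts as a
    minimum when it is strictly darker than every existing 8-neighbour
    (an assumed definition; the paper does not fix the neighbourhood).
    """
    rows, cols = len(image), len(image[0])
    minima = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if all(image[r][c] < n for n in neighbours):
                minima.append((r, c))
    return minima


# Toy 5x5 region: two dark spots on a brighter, flat background.
toy = [
    [9, 9, 9, 9, 9],
    [9, 2, 9, 9, 9],
    [9, 9, 9, 9, 9],
    [9, 9, 9, 1, 9],
    [9, 9, 9, 9, 9],
]
print(local_minima(toy))  # [(1, 1), (3, 3)]
```

The search needs no knowledge of the pupil size, which is the "invariance" noted above, but it also returns every other dark spot in the image.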
However, the simplicity of this model is both
its advantage and its shortcoming. It is not
possible to distinguish minima corresponding
to pupils from other minima of the brightness
function. The number of these local minima
ranges from several dozen to tens of thousands,
depending on the image size and noisiness.
_______________________________________________________________________
¹ This work is supported by the Russian Foundation for Basic Research, Grants 05-01-00679, 06-01-08042, 06-01-00412, 06-07-89249.
An interesting approach to decreasing the number
of peculiar points is described in [6]; more
complex eye models are proposed in [7].
Fig. 1. Positions of local minima of the brightness function in
the case of half-closed and closed eyes
Fig. 2. Frequency of peculiar point location within a 15-pixel vicinity of the pupil center depending on the number of
peculiar points for the BioID Face Database, for several
window sizes l of the median filter (curves for l = 2, 4, 6, 8; frequency axis from 0 to 1, number of points from 100 to 700)
Fig. 3. Frequency of peculiar point location within a 15-pixel vicinity of the pupil center depending on the number of
peculiar points for the BioID Face Database (frequency axis from 0 to 1, number of points from 25 to 150)
To reduce the effect of noise, the image is
filtered beforehand. The filter parameters
affect the number of peculiar points in the
image. It is possible to estimate the necessary
number of peculiar points for images with
similar characteristics. For example, for the
BioID Face DB [8], according to Fig. 2, the
number of peculiar points may be chosen in the
range 200-300 when the median filter size is
l = 4.
However, in practical tasks, using filters with
a fixed window size leads to a very strong
dependence on image characteristics. Better
results are achieved with adaptive filtering,
which allows the number of peculiar points to
be decreased to the range 70-100 (see Fig. 3).
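The effect of preliminary filtering on the number of peculiar points can be sketched as follows. The code assumes a plain median filter with an l x l window clipped at the image border and strict 8-neighbour minima. The names median_filter and count_local_minima and the toy image are hypothetical, and the adaptive filtering mentioned in the text is not reproduced here.

```python
from statistics import median


def median_filter(image, l):
    """Median-filter a list-of-rows image with an l x l window clipped at the borders."""
    rows, cols = len(image), len(image[0])
    before, after = l // 2, l - l // 2  # window spans offsets -before .. after-1
    filtered = []
    for r in range(rows):
        filtered.append([
            median(
                image[rr][cc]
                for rr in range(max(r - before, 0), min(r + after, rows))
                for cc in range(max(c - before, 0), min(c + after, cols))
            )
            for c in range(cols)
        ])
    return filtered


def count_local_minima(image):
    """Count pixels strictly darker than all of their existing 8 neighbours."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if all(image[r][c] < n for n in neighbours):
                count += 1
    return count


# Flat 6x6 background with two isolated dark noise pixels.
noisy = [[10] * 6 for _ in range(6)]
noisy[1][1] = 0
noisy[4][4] = 0

print(count_local_minima(noisy))                    # 2 minima before filtering
print(count_local_minima(median_filter(noisy, 4)))  # 0 minima after filtering
```

On real images the filter removes only part of the minima, so the window size trades the number of candidate points against the chance of losing the pupil itself.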
2. Choosing the searching fragment model
The fragment being searched for must not vary
greatly between images, so the image of the
whole head, for example, would be a poor
model: hairstyle, beard and moustache are
changeable. Across several images of the same
person the shape of the mouth is also unstable,
especially while speaking. Because of its
invariability, the region of the eyes and nose
is an appropriate object of search. As a rule,
this part of the face varies little both between
different persons and between different photos
of the same person. The search template
obtained by averaging 1520 images is shown in Fig. 4.
Fig. 4. «Averaged» image of eyes and nose
At the previous stage of the research we solved
the localization task by searching for the
fragment with minimal difference from the
template. The results of this method were not
promising: in particular, when testing on
BioID only 60 percent of face positions were
found correctly.
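The earlier template-based search can be illustrated with a simplified sketch. It assumes that the difference is the sum of squared brightness differences and slides the template over every position without rotation or scaling, whereas the procedure in the paper takes candidate fragments from peculiar points. The function best_match and the toy arrays are hypothetical.

```python
def best_match(image, template):
    """Return (difference, row, col) of the window closest to the template.

    The difference is the sum of squared brightness differences
    (an assumed measure; the paper does not specify one).
    """
    t_rows, t_cols = len(template), len(template[0])
    best = None
    for r in range(len(image) - t_rows + 1):
        for c in range(len(image[0]) - t_cols + 1):
            diff = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(t_rows)
                for j in range(t_cols)
            )
            if best is None or diff < best[0]:
                best = (diff, r, c)
    return best


# Toy image: flat background with the template pasted at row 1, column 2.
template = [[1, 2], [3, 4]]
image = [[5] * 5 for _ in range(4)]
for i in range(2):
    for j in range(2):
        image[1 + i][2 + j] = template[i][j]

print(best_match(image, template))  # (0, 1, 2)
```

This search always returns exactly one fragment, so it cannot report how many faces an image contains.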
3. Applying the SVM procedure for face/non-face classification
One advantage of using a classifier is the
possibility of determining the number of faces
in the image. We also expect the classification
approach to give better results than the usual
comparison with a template.
3.1 Choice of feature space
The recognition object (after photometric
normalization) is projected onto a grid of
fixed size (12 × 10). The brightness values at
the grid nodes are used as numerical features.
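A minimal sketch of such a feature extraction is given below. It assumes nearest-neighbour sampling, a grid of 12 columns by 10 rows, and zero-mean, unit-variance scaling as the photometric normalization. None of these choices is specified in the paper, the normalization is applied to the sampled values for brevity, and the name grid_features is hypothetical.

```python
def grid_features(fragment, grid_cols=12, grid_rows=10):
    """Sample a fragment (list of rows) on a fixed grid and normalise brightness.

    Returns grid_rows * grid_cols values with zero mean and unit variance
    (assumed form of the photometric normalization).
    """
    rows, cols = len(fragment), len(fragment[0])
    samples = []
    for i in range(grid_rows):
        # Nearest source row for grid node i.
        r = round(i * (rows - 1) / (grid_rows - 1))
        for j in range(grid_cols):
            c = round(j * (cols - 1) / (grid_cols - 1))
            samples.append(float(fragment[r][c]))
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    std = variance ** 0.5 or 1.0  # flat fragment: avoid division by zero
    return [(s - mean) / std for s in samples]


# Toy 20x24 fragment with a brightness gradient, and a brighter,
# higher-contrast copy of the same fragment.
fragment = [[r + 2 * c for c in range(24)] for r in range(20)]
brighter = [[2 * v + 5 for v in row] for row in fragment]

features = grid_features(fragment)
print(len(features))  # 120
```

With this normalization a linear change of brightness and contrast leaves the feature vector unchanged, so the classifier sees the same 120 numbers for both toy fragments.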
3.2 Choice of SVM kernel
The use of complex separating surfaces is
undesirable: complex decision rules require
more computational resources and more
training data. So, despite the linear
non-separability of the sets in the chosen
feature space, we used a linear separating
hyperplane as the decision rule.
3.3 Training set structure
The training set was organized as follows: a
localization procedure using the peculiar
point technique, analogous to the one
described in [5], was applied to the BioID
database. The numerical features of each
located fragment were stored in a file. The
class label was taken from the eye position
field of the database. In this way we prepared
a set of 1 300 000 non-faces and about 3000
faces. Additionally we used 1 200 000
pseudo-faces, generated by various shifts of
the original faces.
3.4 Training method with selection of training subset
The full training set thus contains about 2.5
million fragments. However, the available
training procedures can process only a limited
number of objects in acceptable time, about
5-8 thousand. To overcome this obstacle we
used an iterative training approach based on a
particular feature of the SVM procedure: the
decision rule depends only on a subset of
objects, called support vectors. So it is
possible to build a decision rule from a
relatively small random set, fix its support
objects, and then add the next random set in
the hope that the new objects refine the
decision rule. The scheme of this procedure is
shown in Fig. 5.
Fig. 5. Iterative training procedure based on
selection of training subset (blocks of the scheme: full training set of 1 200 000 face and 1 300 000 non-face objects; randomly chosen objects, 2000 faces and 2000 «non-faces»; support objects from the previous iteration, about 200-400; SVM; decision rule)
It is possible to prove the convergence of this
procedure for separable training sets. Our
experiments have shown that the existing
training sets are not linearly separable;
nevertheless, this technique gives very good
results.
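The iterative scheme of Fig. 5 can be sketched on toy data. This is not the training procedure used in the paper: the linear SVM is fitted here by stochastic subgradient descent on the regularised hinge loss, chosen only to keep the example dependency-free. Support objects are taken to be those on or inside the margin, and all names, parameter values and the two-dimensional toy clusters are assumptions.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(xs, ys, lam=0.001, eta=0.01, epochs=100, seed=0):
    """Fit w, b by stochastic subgradient descent on the regularised hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(xs[0]), 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = ys[i] * (dot(w, xs[i]) + b) < 1
            w = [(1 - eta * lam) * wj for wj in w]  # regularisation shrink
            if violated:
                w = [wj + eta * ys[i] * xj for wj, xj in zip(w, xs[i])]
                b += eta * ys[i]
    return w, b


def iterative_training(xs, ys, per_class=10, iterations=5, seed=0):
    """Train on random subsets, carrying support objects between iterations."""
    rng = random.Random(seed)
    faces = [i for i, y in enumerate(ys) if y == 1]
    nonfaces = [i for i, y in enumerate(ys) if y == -1]
    kept = []  # indices of support objects from the previous iteration
    w, b = None, None
    for _ in range(iterations):
        fresh = rng.sample(faces, per_class) + rng.sample(nonfaces, per_class)
        working = sorted(set(kept) | set(fresh))
        w, b = train_linear_svm([xs[i] for i in working], [ys[i] for i in working])
        # Support objects: on or inside the margin of the current rule.
        kept = [i for i in working if ys[i] * (dot(w, xs[i]) + b) <= 1.0]
    return w, b, kept


# Toy stand-in for the feature vectors: two well-separated 2-D clusters.
data_rng = random.Random(1)
xs, ys = [], []
for _ in range(100):
    xs.append([3 + data_rng.uniform(-1, 1), 3 + data_rng.uniform(-1, 1)])
    ys.append(1)
    xs.append([-3 + data_rng.uniform(-1, 1), -3 + data_rng.uniform(-1, 1)])
    ys.append(-1)

w, b, kept = iterative_training(xs, ys)
accuracy = sum(1 for x, y in zip(xs, ys) if y * (dot(w, x) + b) > 0) / len(xs)
print(accuracy, len(kept))
```

Each iteration trains on at most the retained support objects plus one fresh random subset, which is what keeps the working set within the size a training procedure can handle.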
4. Using a priori information
Note that the general algorithm of localization
via peculiar points requires at least three
points; however, we can reliably mark only two
points, the centers of the pupils. To overcome
this problem we must abandon the general
affine transformation and restrict ourselves to
its particular form consisting only of
rotation, scaling (with the same scale factor
along both axes) and shift.
To accelerate the search it is possible to use
additional a priori information. For example,
if it is known that the camera location is
fixed and people are sitting or standing, it is
reasonable to assume that the slope of the face
image is no more than ±60 degrees. It may also
be known that people are located within some
range of distances from the camera, so the size
of the face image is largely predetermined and
additional restrictions on scaling can be
applied (in practice, a restriction on the
distance between the two peculiar points). All
these empirical assumptions restrict the set of
possible transformations A and therefore
decrease the number of image fragments to be
analyzed and substantially decrease the
processing time (Fig. 6).
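The restricted transformation and the a priori limits can be sketched as follows. Two point correspondences determine rotation, uniform scale and shift uniquely, which is convenient to write with complex numbers. The function names, the template pupil positions and the scale range are illustrative assumptions, and the sign of the angle depends on the image axis convention.

```python
import cmath
import math
from itertools import permutations


def similarity_from_pupils(p_left, p_right, q_left, q_right):
    """Rotation + uniform scale + shift mapping template pupils p to image points q.

    Returns (scale, angle in degrees, (shift_x, shift_y)).
    """
    p1, p2 = complex(*p_left), complex(*p_right)
    q1, q2 = complex(*q_left), complex(*q_right)
    a = (q2 - q1) / (p2 - p1)  # complex factor: |a| is scale, arg(a) is rotation
    t = q1 - a * p1            # shift that sends p1 to q1
    return abs(a), math.degrees(cmath.phase(a)), (t.real, t.imag)


def admissible_pairs(points, p_left, p_right,
                     max_slope_deg=60.0, scale_range=(0.5, 2.5)):
    """Keep ordered pairs of peculiar points whose transform obeys the limits."""
    result = []
    for q_left, q_right in permutations(points, 2):
        scale, angle, shift = similarity_from_pupils(p_left, p_right, q_left, q_right)
        if abs(angle) <= max_slope_deg and scale_range[0] <= scale <= scale_range[1]:
            result.append((q_left, q_right, scale, angle, shift))
    return result


# Hypothetical template pupils 10 pixels apart and three peculiar points.
template_left, template_right = (0, 0), (10, 0)
points = [(2, 3), (22, 3), (2, 23)]

pairs = admissible_pairs(points, template_left, template_right)
print(len(pairs), "of", len(points) * (len(points) - 1), "ordered pairs remain")
```

In this toy case only the pair (2, 3), (22, 3) survives: the other ordered pairs either imply a rotation beyond ±60 degrees or a scale outside the assumed range.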
Fig. 6. The set of fragments before and after taking into
account additional restrictions on scaling and rotation
5. Results of experimental study
While testing the proposed face detection
algorithm on the BioID Face Database, the
following feature was revealed: 10-100
fragments were classified as “face”, although
only one of them was actually a face. This is
a result of the insufficient quality of the
type of classifier used. Nevertheless, after
sorting the fragments by the value of the
scalar product with the decision rule, the
results were as follows:
• in 88.2% of images the algorithm correctly
finds the face fragment and this fragment is
first in the list;
• in 96.5% of images the position of the face is
among the 4 “best” fragments indicated by
our algorithm;
• in 98.3% of images the position of the face is
among the 16 “best” fragments indicated by
our algorithm.
Conclusion
As can be seen from the experimental results,
the quality of the algorithm is high enough.
However, the fact that in 12% of cases the
first candidate occupies a position not
corresponding to a face shows that it is
necessary to continue research aimed at
improving the classifier performance.
Using pattern recognition methods for image
comparison improves the quality of
localization in comparison with simple
matching.
Our future research aims at trying other kernel
functions and applying feature extraction
procedures.
References
1. Li Ma, Yunhong Wang, Tieniu Tan. Iris Recognition Based on Multichannel Gabor Filtering. ACCV2002: The 5th Asian Conference on Computer Vision, 23-25 January 2002, Melbourne, Australia.
2. A. Kostin, J. Kittler. SVM for quick search of faces and eye coordinates in image. Proceedings of 6th International Conference on Pattern Recognition and Image Analysis, PRIA-6-2002, Velikiy Novgorod, 2002. Vol. 2, pp. 316-320 (in Russian).
3. Zhiwei Zhu, Kikuo Fujimura, Qiang Ji. Real-Time Eye Detection and Tracking Under Various Light Conditions. ETRA'02, New Orleans, Louisiana, USA, 2002.
4. A. Sachenko, I. Paliy, Y. Kurylyak, V. Kapura, R. Sadykhov, D. Lamovsky. Face Detection Algorithm for Video Surveillance Systems. Pattern Recognition and Information Processing: Proceedings of the Ninth International Conference. Vol. II. Minsk: United Institute of Informatics Problems of National Academy of Science of Belarus, 2007, pp. 141-145.
5. Krestinin I.A., Seredin O.S. Peculiar point technique for object detection in image analysis. Paper in these Proceedings.
6. Morimoto C., Koons D., Amir A., Flickner M. Pupil detection and tracking using multiple light sources. Image and Vision Computing, special issue on Advances in Facial Image Analysis and Recognition Technology, No. 4, 2000, pp. 331-335.
7. Qui Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi. Accurate Eye Detection Using Elliptical Separability Filter. Proceedings of the Eighth IASTED International Conference on Signal and Image Processing, August 14-16, 2006, Honolulu, Hawaii, USA, pp. 207-211.
8. http://www.humanscan.de