International Journal of Applied Engineering Research ISSN 0973-4562 Volume 11, Number 9 (2016) pp 6740-6752
© Research India Publications. http://www.ripublication.com
Geometrical Modeling of Facial Regions and CUDA based Parallel Face
Segmentation for Emotion Recognition
Sabu George
Department of Information and Communication Technology,
Manipal Institute of Technology,
Manipal University, Manipal, Karnataka, India.
A parallel face segmentation algorithm is proposed for landmark detection, and a novel facial landmark based feature extraction technique for Action Unit (AU) detection and emotion recognition is also proposed.
An automatic emotion recognition system consists of acquisition of the face, extraction of facial data and classification of the facial expression. Face acquisition means detection and tracking of
the face automatically from the input video with cluttered
backgrounds [4]. The feature extraction step finds a specific
representation of the data which can highlight relevant
information [3]. Classification of facial expression is the
process of assigning the observed data to the predefined
category of facial expression [4].
CUDA accelerated parallel computing is the development trend of Graphics Processing Unit (GPU) based High-Performance Computing (HPC) [7, 8, 9]. The GPU is a massively parallel unit [7] with a highly parallel architecture containing several hundred cores, quite different from a conventional multi-core processor, and it is a computationally powerful engine for image processing and computer vision applications.
In facial landmark based feature extraction techniques
landmark features serve as anchor points on face graphs [10].
The points in the eye corners, eyebrow arcs, nose tip, nostril
corners, mouth corners, chin, ear lobes etc. are the facial
landmarks. According to the image feature extraction
techniques, the facial landmarks can be grouped as primary or
fiducial, and secondary or ancillary points. Using low level image features, the points at the nose tip and the corners of the mouth and eyes can be detected easily [11]; they are referred to as fiducial or primary landmarks. The secondary landmarks are the points on the nostrils, eyebrows, chin, lip midpoints, cheek contours and other non-extremity points, which are searched for under the guidance of the primary landmarks. The Active Shape Model (ASM) is one of the popular face alignment methods [12]. It helps to localize the landmark points on the face, such as the points on the contours of the face, mouth, eyes and nose [13]. These
points help to get facial features and from these features AUs
can be identified. Facial expressions to emotion mapping can
be done based on the AU combinations of facial expressions
which are defined in Facial Action Coding System (FACS)
[14].
The major contributions of this paper are: 1) geometrical modeling of the upper, middle and lower regions of the face separately for detecting the AUs associated with each region of the face for FACS based emotion recognition, 2) a CUDA based face segmentation algorithm implemented on a heterogeneous parallel HPC server and 3) an implementation of an edge feature based algorithm for facial landmark detection on the HPC server.
Abstract
Human emotions are expressed through body gestures, voice variations and facial expressions. Research in the area of facial expression recognition has been active for the last 20 years with the aim of improving system performance. This work proposes a novel feature extraction technique for emotion recognition based on geometrical modeling of facial regions. Most facial landmark based approaches use a common reference point for detecting facial variations. In such approaches a slight variation or shift of the reference point may result in errors which can lead to erroneous expression recognition. In order to reduce such errors a new method is proposed wherein three important reference points on the axis of symmetry of the face are fixed, and the angle variations associated with these reference points are used for detecting the upper and lower Action Units (AUs). Also, to increase the speed, the segmentation algorithm required for facial feature extraction is implemented in parallel in the Compute Unified Device Architecture (CUDA). Facial expressions of emotion are recognised as combinations of FACS AUs. The method is implemented on a Graphics Processing Unit (GPU) based High Performance Computing (HPC) Tesla K20 CUDA server and its performance is analysed as a massively parallel data processing tool. The results showed that the multithreaded GPU version of the face segmentation algorithm is much faster than the single-threaded CPU version.
Keywords: ASM, CUDA, FACS, facial expression
recognition, GPU, massively parallel data processing.
Introduction
A large number of studies have focused on the recognition of emotion through facial expression [1, 2], and it has been an active ongoing research area, with many challenges, in human-computer interfaces over the past several decades [3]. In human communication, the exchange of information does not take place only through words but also through facial expressions [4, 5]. Contractions of facial muscles result in the corresponding facial expressions [2]. A person's facial features vary from those of others because of variations in gender, age, cosmetic products, ethnicity, and occluding objects like hair, glasses, caps, etc. [6]. Also, the appearance of faces may differ due to lighting changes and pose variations. Most facial expression recognition research is focused on addressing these challenges. In this paper a method to enhance the performance of a facial expression recognition system is proposed which is based on the Compute Unified Device Architecture (CUDA).
The paper is organized as follows. A brief overview of previous work on edge detection, feature extraction, landmark detection with ASM and FACS based facial expression recognition is given in the section following the Introduction. The next section presents the concept of CUDA accelerated image processing. Then the proposed work is discussed; in this section edge based feature image formation using CUDA, detection of feature image regions, landmark identification, AU detection and emotion classification are presented. The next section gives the results and an analysis of the performance of the proposed method. Finally, conclusions and future avenues of research are discussed.
The Active Shape Model (ASM) and the Active Appearance Model (AAM) are examples of the model-based
methods [22]. In 1995, Leung et al. proposed a method in
which Gaussian filtered face image at multiple orientations
and scales are used which provides a set of landmarks [23]. A
probabilistic model which is used to express the geometrical
relationship between landmarks helps to reduce matching
complexity and eliminates irrelevant points. In 1997, Wiskott et al. proposed a method of elastically deformed multiple face graphs [24] for capturing head rotations and bunch graphs for capturing various appearances. Links in the labeled graph are the average distances between nodes, and the landmarks are considered as Gabor jets at candidate locations. In 1998,
Cootes et al. introduced AAM which models the texture
variation and shape of the fiducial points [25]. In 2003,
Cristinacce et al. proposed multiple landmark detectors which
locate the initial landmarks, boosted regression [26] for
improving the estimated locations and updated landmarks
used for fitting the shape model. In 2008, Cristinacce et al. proposed an algorithm with a global shape model in which the estimated locations are updated by a shape driven search to avoid implausible face shapes [27]. In 2008, Milborrow and Nicolls proposed learned profile models with two stacked ASMs [28] to enhance the initialization. They used a learned
global shape model with PCA and 2D profile search for the
primary and secondary fiducial points. In 2011, Belhumeur et
al. introduced a Bayesian framework which unifies a global
shape by collecting SIFT [29] features by using a local
detector. In 2012, Zhu & Ramanan proposed a tree-structured
[30], linearly-parameterized pictorial structure of the facial
landmark. In 2013, Qihui Wang et al. analysed the accuracy of
different face fitting ASM methods by computing the
displacement of pixel [31], i.e. the displacement between
fitting model points and hand-labeled landmarks. Jinwei
Wang et al. in 2014 proposed GPU based parallel AAM fitting
algorithm [32]. In the algorithm they distribute the texture
data in pixels to the thousands of parallel GPU threads for
computation and they used 16 AAM face models of different
dimensional textures in the range of 4096 pixels to 65536
pixels. In the algorithm they observed that the CPU time
increases 18.8 times when the data size changes from 4096
pixels to 65536 pixels whereas GPU time increases by only 3
times.
FACS is the most commonly used method for recognition of facial gestures in psychological research. FACS [33, 34] was developed to detect the variations in facial features. Facial landmarks help to identify the Action Units (AUs) [35, 36, 37]. The AU combination of a facial expression, as defined in FACS, helps to map the facial expression to the corresponding emotion [38, 39]. In 1976 Friesen and Ekman proposed that expressions of emotion can be represented as combinations of FACS action units; 44 action units are included in FACS. Table 1 and Table 2 show the FACS action units of the upper and lower face. A '*' against an AU indicates changed criteria for that AU; for example, AU 25, AU 26, AU 27, AU 41, AU 42 and AU 43 are based on intensity criteria. In 1999, G. Donato et al. attempted to detect AUs automatically in static face images [35]. In 2001, Takeo Kanade and Jeffrey F. Cohn developed a facial expression analysis system based on transient facial features (deepening of facial furrows) and permanent facial features (brows, eyes and mouth).
Previous Works
A number of facial feature extraction techniques are available.
Model based approaches are one of them. In most of the
image processing tasks image segmentation is the primary
step for feature extraction. Edge based image segmentation
techniques [15] help to get input feature image for the model
based approaches. Model based approach such as ASM helps
to find the landmarks of face image. The landmarks can be
used for getting the combinations of facial AUs defined in
FACS. The AU combinations of each facial expression help to
map facial expression to corresponding emotion. In emotion
recognition system processing speed is also an important
factor. Parallel implementation of the facial feature extraction
technique enhances the speed performance in facial
expression recognition system.
In human vision, on the basis of shapes, patterns, color,
texture, etc. a complex image is immediately segmented into
the simple objects. In computer vision system the same is
constructed by using image segmentation techniques [16, 17].
In image segmentation process a label is assigned to each
pixel and pixels with same labels share common visual
characteristics in an image. It helps to identify boundaries in
an image and it also helps to locate certain regions of an
image [18, 19]. For example, face based segmentation techniques help to locate the boundary of the face and facial regions such as the mouth, eyes and nose in a face image. Edge based segmentation is one of the important segmentation techniques. In 1980, Marr and Hildreth presented the analysis of edge detection in two parts. The first part is based on the intensity changes that occur in natural images at various scales, using the second derivative of a Gaussian to filter the image at a given scale; a two dimensional Gaussian distribution and the Laplacian operator are used for determining the intensity changes at that scale, and zero-crossing segments are used to represent them. The second part is based on the fact that the changes of intensity in an image arise from surface discontinuities or illumination boundaries. In 1986 Canny developed a technique for edge detection in which two filters representing derivatives in the vertical and horizontal directions are applied to an image with additive Gaussian white noise [20].
Model-based methods and texture-based methods are the two
different categories of facial feature detection methods.
Model-based methods try to fit the proper face shape to an
unknown face [21] by learning face shapes from labelled
training images.
Kanade and Cohn's system analyses face image sequences in nearly frontal view and recognizes fine-grained changes in the face as FACS AUs. They proposed multistate facial component models for modeling facial features such as
introduced an active appearance model (AAM)-based system
which detects the frames in a video automatically and
recognizes the spontaneous emotion [40] of patient’s reaction
to pain. Here pain is defined via facial AUs. In 2014, Bihan
Jiang et al. presented an approach that applies a dynamic appearance descriptor to model the temporal dynamics [41] of facial actions.
CUDA Accelerated Image Processing
The execution configuration of a kernel in a CUDA program determines the thread count in a thread block, the block count and the amount of shared memory. CUDA uses C programming tools and a C compiler, which gives better portability and compatibility for data-parallel computing.
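As a concrete illustration, the sketch below shows how such an execution configuration could launch a 2D grid of 16×16 thread blocks over a face image; the kernel name segmentFace and the tile size are illustrative placeholders, not values from the paper.

```cuda
// Illustrative kernel, assumed to be defined elsewhere.
__global__ void segmentFace(const unsigned char *in, unsigned char *out,
                            int width, int height);

void launchSegmentation(const unsigned char *d_in, unsigned char *d_out,
                        int width, int height)
{
    // Execution configuration: threads per block, blocks per grid,
    // and (optionally) dynamic shared memory per block in bytes.
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    size_t sharedBytes = 0;            // no dynamic shared memory in this sketch

    segmentFace<<<grid, block, sharedBytes>>>(d_in, d_out, width, height);
    cudaDeviceSynchronize();           // wait for the kernel to finish
}
```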
Many image processing algorithms have modules of common computation over many pixels, which makes these algorithms suitable for acceleration on the GPU by exploiting its processing units in parallel [44]. Many image and video processing applications have been ported to CUDA. Most of them follow data-based parallelization to port the sequential code into a CUDA implementation: the input is split across N threads and a master node then reassembles the results of each thread to provide the global result [45]. CUDA can provide highly data-parallel processing for a real-time emotion recognition system, which typically processes substantial pixel data of the face image in all frames of the required duration of video at a specified frame rate (fps). The computation of each thread block is shown in Fig. 1.
The main steps for the processing of face images with CUDA programming include:
1) Copy face images from host memory to device memory: host-to-device data transfer is a bottleneck because the limited bandwidth restricts the overall speed. Here data access through texture functions is an effective method.
2) Launch the kernel function from the CPU, which consists of the following:
a. Set the configuration of the kernel execution: allocate data blocks to the thread blocks by decomposing the input data.
b. Read face images into shared memory from global memory: this step exploits the high speed of shared memory.
c. Launch the computation of the kernel.
3) Write the result back to host memory.
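A minimal host-side sketch of these three steps is given below, assuming a single grayscale face image already held in a host buffer; the kernel body (a simple pixel inversion) stands in for the real segmentation kernel and is only a placeholder.

```cuda
#include <cuda_runtime.h>

// Placeholder kernel: inverts each pixel so the example is self-contained.
__global__ void processFace(const unsigned char *in, unsigned char *out,
                            int width, int height)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < height && col < width)
        out[row * width + col] = 255 - in[row * width + col];
}

void runOnGpu(const unsigned char *h_in, unsigned char *h_out,
              int width, int height)
{
    size_t bytes = (size_t)width * height;
    unsigned char *d_in = nullptr, *d_out = nullptr;

    // 1) Copy the face image from host memory to device memory.
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    // 2) Configure and launch the kernel (input decomposed into thread blocks).
    dim3 block(16, 16);
    dim3 grid((width + 15) / 16, (height + 15) / 16);
    processFace<<<grid, block>>>(d_in, d_out, width, height);

    // 3) Write the result back to host memory.
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
}
```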
Table 1: FACS Action Units (AU) for the upper face [33]
Table 2: Action Units of the lower face [33]
Parallelizing image processing algorithms has received growing research attention. NVIDIA developed a device architecture called CUDA which allows parallel computation on CUDA enabled GPU graphics cards [42]. The CUDA enabled GPU graphics cards contain a number of SIMD streaming multiprocessors (SMs) [7]. Each SM has access to texture memory, constant memory and global memory [8], and these memories can communicate with host memory. On-chip memories and cache [43] are used in the device to accelerate memory access. The GPU can get data from global memory or from any other location, but to avoid frequent global memory accesses the shared memory of a multiprocessor is used, which helps threads in the same multiprocessor to get data quickly; shared memory access is as fast as register access. The processors present in the graphics card schedule a large number of threads to execute concurrently. Threads in a thread group can synchronize, cooperate and communicate to solve complex problems. An execution configuration [8] is to be included while calling a GPU function from the CPU.
Figure 1: Each Thread Block Computation
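The reads from global memory into shared memory described above are typically done with a tiled load, as in the hedged sketch below; the tile size and the trivial per-pixel work are simplifying assumptions, not details taken from the paper.

```cuda
#define TILE 16

__global__ void tiledCopy(const unsigned char *in, unsigned char *out,
                          int width, int height)
{
    // Static shared memory: one tile of the face image per thread block.
    __shared__ unsigned char tile[TILE][TILE];

    int col = blockIdx.x * TILE + threadIdx.x;
    int row = blockIdx.y * TILE + threadIdx.y;

    if (row < height && col < width)
        tile[threadIdx.y][threadIdx.x] = in[row * width + col];

    __syncthreads();   // make the whole tile visible to every thread in the block

    // Subsequent per-pixel work would read from the fast shared-memory tile;
    // here the value is simply written back as a placeholder.
    if (row < height && col < width)
        out[row * width + col] = tile[threadIdx.y][threadIdx.x];
}
```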
Proposed Work
The block diagram of the proposed emotion recognition
system is shown in Fig. 2. A landmark based facial feature extraction technique is proposed in this work. In the landmark
based approach, landmarks help to get facial features and
from the landmark based facial features AUs can be
identified. Emotion recognition can be done based on the AU
combinations of facial expressions which are defined in FACS
[14].
Figure 3: Feature Images (Feature Image-1 and Feature Image-2)
Figure 2: Proposed Emotion Recognition System.
Facial Landmark Identification
Locating landmarks on faces is equivalent to locating the facial features. To detect the landmarks, a face detection step is first performed using the Haar cascade detector [46] of OpenCV. After that the face is segmented using CUDA, and then, with the help of the segmented image and the Active Shape Model (ASM), the landmarks are identified.
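This detect-segment-fit pipeline could be wired together roughly as in the sketch below; the OpenCV cascade calls are real API, while cudaFeatureImage and fitASM are hypothetical helper names standing in for the CUDA segmentation and ASM stages described later.

```cuda
#include <opencv2/objdetect.hpp>
#include <vector>

// Hypothetical helpers for the CUDA segmentation and ASM fitting stages.
cv::Mat cudaFeatureImage(const cv::Mat &faceGray, int threshold);
std::vector<cv::Point2f> fitASM(const cv::Mat &featureImage, const cv::Rect &faceBox);

std::vector<cv::Point2f> locateLandmarks(const cv::Mat &frameGray)
{
    // 1) Face acquisition with the OpenCV Haar cascade detector [46].
    cv::CascadeClassifier face;
    face.load("haarcascade_frontalface_default.xml");
    std::vector<cv::Rect> faces;
    face.detectMultiScale(frameGray, faces, 1.1, 3);
    if (faces.empty()) return {};

    // 2) CUDA accelerated edge/threshold segmentation of the face region.
    cv::Mat feature = cudaFeatureImage(frameGray(faces[0]), /*threshold=*/60);

    // 3) ASM search initialised on the segmented feature image.
    return fitASM(feature, faces[0]);
}
```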
Feature Image Formation using CUDA
Edge based segmentation techniques are used for getting feature images, which are used as input for detecting the landmarks. First derivative filters like Sobel are faster and simpler, while second order derivative operators give a better signal-to-noise ratio and sub-pixel resolution [20]. Sobel edge detection is computationally inexpensive; it convolves the image with two 3×3 convolution masks.
The GPU hardware allows fast pixel-wise operation of the edge detector, so the steps of edge detection are implemented pixel-wise in parallel. For feature extraction, the dimensions of the blocks in the CUDA kernel are specified for all regions of the face image. Each region is given an identification (ID) number denoted by blockIdx.x and blockIdx.y, where blockIdx.x refers to the X dimension and blockIdx.y to the Y dimension of the blocks in the CUDA kernel. The parallel edge detection process allows feature values at various locations of the face image and at various scales to be computed simultaneously by multiple threads. The images are initially stored in texture memory and then transferred to shared memory for faster access. To extract the facial features the face is segmented using threshold and edge based segmentation techniques, and the resultant image is called the feature image. For landmark identification two feature images are used, as shown in Fig. 3. The image obtained initially by applying the edge detection technique to the face image is called the first level feature image. A more strongly segmented form of the feature image, formed by varying the threshold, is called the second level feature image; it helps to identify the fiducial points of the eye, nose and mouth regions.

Figure 4: CUDA based Feature Image Formation

CUDA based feature image formation is shown in Fig. 4. In the CUDA kernel function, the global index of a thread is computed as row*width+col [7]. To implement feature extraction in parallel using CUDA the following steps are used:
1) Read the image in .PGM (Portable Gray Map) format.
2) Allocate device memory and copy the input image I to the GPU.
3) Declare the block and grid dimensions.
4) Execute the CUDA kernel function:
a. Compute the gradient in the horizontal direction: Gx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]] * I
b. Compute the gradient in the vertical direction: Gy = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]] * I
c. Compute the gradient magnitude: G = sqrt(Gx^2 + Gy^2)
d. Compute the gradient direction θ = arctan(Gy/Gx), with θ ∈ [-π/2, π/2].
e. Normalise the gradient to the range 0-255.
f. Set the new pixel values to get the feature image.
5) Copy the processed feature image back to the CPU.
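A minimal CUDA kernel for steps 4a-4f might look like the sketch below; the threshold used to binarise the gradient magnitude into a feature image is an assumed parameter, not a value given in the paper.

```cuda
__global__ void sobelFeatureKernel(const unsigned char *I, unsigned char *F,
                                   int width, int height, int threshold)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (col < 1 || row < 1 || col >= width - 1 || row >= height - 1) return;

    // Index of this thread's pixel: row * width + col.
    int idx = row * width + col;

    // 3x3 neighbourhood convolved with the Sobel masks.
    float gx = 0.0f, gy = 0.0f;
    const int sx[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
    const int sy[3][3] = { {-1, -2, -1}, { 0, 0, 0}, { 1, 2, 1} };
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            unsigned char p = I[(row + dy) * width + (col + dx)];
            gx += sx[dy + 1][dx + 1] * p;
            gy += sy[dy + 1][dx + 1] * p;
        }

    // Gradient magnitude, clipped to 0-255 and thresholded to form the feature image.
    float g = sqrtf(gx * gx + gy * gy);
    if (g > 255.0f) g = 255.0f;
    F[idx] = (g > threshold) ? 255 : 0;
}
```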
Facial Region Detection
The Haar cascade detectors are very useful for detecting the face and its regions such as the eyes, mouth and nose. The cascade detectors available in OpenCV can be used for getting the top-left co-ordinates, widths and heights of all the regions; from these parameters the other co-ordinates can be calculated. Initially face detection is performed and then the location and size of the
detected face are computed. After that the eyes, nose and
mouth are detected by using corresponding cascade detectors
by setting region of interest (ROI) for each detector. ROI
setting helps to concentrate the search only for that particular
region. For detecting the eyes the top portion of detected face
is set as ROI of eye, middle region of face is set as ROI of
nose and lower portion of face is set as ROI of mouth. Based
on the detector output the co-ordinates of the eyes, nose and
mouth are identified. These co-ordinates are used for
identifying the facial regions of the feature image. Table 3 illustrates the facial regions of the input image and the feature images. The regions of the feature image help to identify the primary landmarks or fiducial points, i.e. the reference points at the midpoints, tips or corners of the eyes, nose and mouth. All other landmarks are secondary landmarks which can be detected using the ASM algorithm.
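One way to realise this ROI based search with the OpenCV cascade detectors is sketched below; the cascade file names and the fractions used to split the face into upper, middle and lower ROIs are assumptions, not values stated in the paper.

```cuda
#include <opencv2/objdetect.hpp>
#include <vector>

// Detect eyes, nose and mouth inside a previously detected face rectangle.
void detectFacialRegions(const cv::Mat &gray, const cv::Rect &face,
                         std::vector<cv::Rect> &eyes, std::vector<cv::Rect> &noses,
                         std::vector<cv::Rect> &mouths)
{
    cv::CascadeClassifier eyeDet, noseDet, mouthDet;
    eyeDet.load("haarcascade_eye.xml");            // assumed cascade files
    noseDet.load("haarcascade_mcs_nose.xml");
    mouthDet.load("haarcascade_mcs_mouth.xml");

    // ROI setting: upper part of the face for the eyes, middle for the nose,
    // lower for the mouth (assumed split).
    cv::Rect eyeRoi(face.x, face.y, face.width, face.height / 2);
    cv::Rect noseRoi(face.x, face.y + face.height / 4, face.width, face.height / 2);
    cv::Rect mouthRoi(face.x, face.y + face.height / 2, face.width, face.height / 2);

    eyeDet.detectMultiScale(gray(eyeRoi), eyes);
    noseDet.detectMultiScale(gray(noseRoi), noses);
    mouthDet.detectMultiScale(gray(mouthRoi), mouths);

    // Returned rectangles are relative to their ROI; add the ROI offset
    // to express them in full-image co-ordinates.
    for (auto &r : eyes)   { r.x += eyeRoi.x;   r.y += eyeRoi.y; }
    for (auto &r : noses)  { r.x += noseRoi.x;  r.y += noseRoi.y; }
    for (auto &r : mouths) { r.x += mouthRoi.x; r.y += mouthRoi.y; }
}
```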
Table 3: Face Regions of the Input Image and Feature Images (eye, nose and mouth regions of the input image and of the corresponding feature images)

Landmark Detection using Feature Image Regions
The ASM algorithm is used to get the point distribution of the different regions of the feature images. The ASM algorithm starts the landmark search from the mean training face shape, which is aligned to the size and position of the detected regions of the feature image. ASM is a model-based algorithm derived from the Snake algorithm. Objects having a similar geometric shape can be represented by a shape vector obtained from several key feature points. ASM is based on two statistical models: the Point Distribution Model (PDM) and the Local Texture Model (LTM).

Point Distribution Model (PDM)
The PDM provides heuristic rules of the face shape. Assume that we have N training face images and each face image is described by k key facial fiducial points or landmarks. The landmark points of each face, identified manually for modeling, are (x1, y1), (x2, y2), ..., (xk, yk). Each point represents a boundary or a specific part of the object, and on every training example they must be placed in the same way. The vector a is used to represent the points of each image. The k points form a shape vector:
a = (x1, y1, x2, y2, x3, y3, ..., xk, yk)^T
The alignment operation of the PDM is based on the average face model ā. The basis for the match is the feature information near each key feature point [47]. Principal Component Analysis (PCA) is applied to the set of aligned shapes, each represented by its landmarks, to construct the PDM. PCA helps to find the modes of variation: the mean shape plus a weighted sum of deviations from the first M modes is used to approximate any shape in the training set.
It is essential to normalize or align the shapes before applying PCA by translating, rotating and scaling them using Procrustes analysis [47, 48]. The alignment process not only makes the model independent of size, orientation and position but also minimizes the sum of squared distances between the fiducial points and the Gaussian family of point distributions.
The steps for calculating some important parameters are as follows.
Calculate the mean shape vector:
ā = (1/N) Σ_{i=1..N} a_i
Calculate the covariance matrix of the N vectors:
S = (1/(N-1)) Σ_{i=1..N} (a_i - ā)(a_i - ā)^T
Form the shape vector: calculate the eigenvectors of the covariance matrix S, sort the eigenvalues in descending order and retain the eigenvectors corresponding to the M largest eigenvalues λ_i in the matrix P. Now any shape vector used for training can be approximated by the linear model
a ≈ ā + Pb    (1)
where b is a vector of M elements containing the shape parameters. According to (1), the shape of the model varies whenever the elements of b change, and based on the value of b it is possible to ensure that the generated shape stays within the permissible range of variation of the shape model [49]. The parameter vector of the deformable model is given by:
b = P^T (a - ā)
We need to ensure that the shape produced by b is similar to the shapes in the training set. The values of b are kept within the range ±β√λ_n while fitting the model to a set of points, and valid shapes satisfy
|b_in| ≤ β√λ_n,    1 ≤ i ≤ N,  1 ≤ n ≤ M
where β is a regularization constant usually set between 1 and 3, λ_n is the nth eigenvalue of the covariance matrix S and M is the number of retained eigenvectors. So the relation |b_n| ≤ 3√λ_n is used to limit the value of b.
There is a proportion called the scale coefficient (fv) of the training shape variance which decides the number of eigenvalues and eigenvectors to be retained; its range is 90% to 99.5%. The number of retained modes is the smallest M for which
Σ_{n=1..M} λ_n ≥ fv · Σ_{n=1..2k} λ_n
If the shape model is not aligned, the first few modes of variation may include position and size variations, and the modes that would otherwise appear first are shifted towards modes with lower eigenvalues compared with the aligned shapes. So the parameter fv must be greater for aligned shapes, in which there should not be any variation in position and size.

Local Texture Model (LTM)
The local texture around each fiducial point in an image is a Grey Level Profile (GLP). This grey level appearance model gives the appropriate image structure around each fiducial point. It is computed from a fixed number of pixels sampled using linear interpolation around each fiducial point, along a profile direction perpendicular to the contour. The direction perpendicular to the contour at the fiducial point (x_k, y_k) is calculated by rotating the vector from (x_{k-1}, y_{k-1}) to (x_{k+1}, y_{k+1}) over 90°. In the human emotion recognition application the face contours are closed, so for the first and last points the perpendicular direction is calculated from the neighbouring fiducial points; for the last fiducial point, the second-to-last and the first fiducial points are used. For the ith point, if n pixels are sampled on either side with a fixed step size, then we sample along a profile of 2n+1 pixels around the model points of the training images. In 2001, Cootes and Taylor used the normalized first derivative of these profiles as the feature vector to construct the local texture model. The differences between the (i-1)th and the (i+1)th points are used for computing the derivatives. The feature vectors of the training images are extracted and represented by the normalized derivative profiles, where normalization means that the sum of the absolute values of the elements of the derivative profile is 1. The normalized derivative profiles are denoted g_1, g_2, ..., g_N, and g_ij refers to the ith feature point of the jth sample image. For each landmark the mean profile ḡ and the covariance matrix S_i are calculated. For the ith feature point, the mean value is
ḡ_i = (1/N) Σ_{j=1..N} g_ij
and the covariance is
S_i = (1/N) Σ_{j=1..N} (g_ij - ḡ_i)(g_ij - ḡ_i)^T
To measure the similarity between the mean profile and a new profile, the Mahalanobis distance is used. It is defined as
f(g_new) = (g_new - ḡ) S_i^{-1} (g_new - ḡ)^T
The matching between a location in the test image and the corresponding model is obtained by minimizing the Mahalanobis distance from the feature vector of the fiducial point to the corresponding model mean. Minimizing f(g_new) means maximizing the probability that g_new originates from a multidimensional Gaussian distribution.

Multi-resolution Framework
A multi-resolution framework improves the robustness and efficiency of the algorithm. A typical multilevel model, a Gaussian pyramid, is constructed for each training and test image, which increases the speed of the algorithm. As shown in Fig. 5 the entire search is repeated over 4 levels, from coarse to fine resolution, in an image pyramid. The process of identifying the profile starts from level 3, i.e. the lowest level of the pyramid, and then gradually moves up to level 0, which is the highest level. The search result of ASM is sensitive to the initial size and location [28].

Figure 5: Multilevel Resolution Model

A global face detector is used initially to detect the size and position of the face in the image. After detecting the face we can operate on the shape model which represents the face to fit the model to the test face [50], corresponding to the primary landmarks or fiducial points of the segmented feature images as shown in Fig. 3. A loop on the initial shape helps to find suitable secondary landmarks based on the primary landmarks. At the lowest level the landmark fluctuations are highest, and at higher levels they are smaller. The finest resolution uses the original image; the image at scale σ = 1 with a two pixel step size is used for the next resolution, and subsequent levels are constructed by doubling the step size and the image scale. Doubling the step size means that the displacement of a landmark can span large deviations at coarser resolutions. Small structures may disappear due to blurring [28]; because of this, fitting at coarse resolution gives a good approximate location for the model based on global image structures, while the fine resolutions in the later stages allow the segmentation result to be refined. The steps of the ASM searching algorithm are:
1) Analyze each point āi of a region of the face image and find the most suitable match near the point i.
2) Update the required parameters to fit the new points a.
3) To confirm the shape, apply the constraints to the parameter b.
4) Repeat the steps until convergence.
Fig. 6 shows the identified landmark points and the landmark descriptions are given in Table 4.
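To make the PDM equations concrete, the sketch below reconstructs a shape from the mean shape, the retained modes and a clamped parameter vector b, following a ≈ ā + Pb with |b_n| ≤ 3√λ_n; the dense, loop-based representation is a simplification assumed here, not the paper's implementation.

```cuda
#include <vector>
#include <cmath>

// a_mean: 2k-dimensional mean shape; P: M modes, each a 2k-dimensional eigenvector;
// lambda: M retained eigenvalues; b: M shape parameters.
std::vector<double> reconstructShape(const std::vector<double> &a_mean,
                                     const std::vector<std::vector<double>> &P,
                                     const std::vector<double> &lambda,
                                     std::vector<double> b,
                                     double beta = 3.0)
{
    const size_t M = b.size();

    // Constrain each parameter to |b_n| <= beta * sqrt(lambda_n)
    // so the generated shape stays a plausible face shape.
    for (size_t n = 0; n < M; ++n) {
        double limit = beta * std::sqrt(lambda[n]);
        if (b[n] >  limit) b[n] =  limit;
        if (b[n] < -limit) b[n] = -limit;
    }

    // a ≈ a_mean + P * b   (equation (1))
    std::vector<double> a = a_mean;
    for (size_t n = 0; n < M; ++n)
        for (size_t j = 0; j < a.size(); ++j)
            a[j] += P[n][j] * b[n];     // P[n] holds the n-th mode of variation
    return a;
}
```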
Figure 6: Identified landmarks

Table 4: Identified Landmark Description

Sl. No. | Landmark Description
0 | Nose tip
1, 2 | Right cheek contours
3 | Chin tip
4, 5 | Left cheek contours
6 | Forehead upper centre
7-10 | Right eyebrow contours
11-14 | Left eyebrow contours
15 | Right eye inner corner
16 | Upper right eyelid centre
17 | Right eye outer corner
18 | Lower right eyelid centre
19 | Right eye centre
20 | Left eye centre
21 | Left eye inner corner
22 | Upper left eyelid centre
23 | Left eye outer corner
24 | Lower left eyelid centre
25 | Nose left contour
26 | Nose right contour
27 | Nostrils - left nose peak
28 | Nose contour centre
29 | Nostrils - right nose peak
30 | Right mouth corner
31 | Upper right mouth contour
32 | Upper mouth contour centre
33 | Upper left mouth contour
34 | Left mouth corner
35 | Lower left mouth contour
36 | Lower mouth contour centre
37 | Lower right mouth contour

Emotion Recognition
A FACS based emotion recognition approach is used in the proposed method. FACS, developed by Ekman and colleagues, segments the contraction of specific facial muscles into AUs as shown in Table 5. Each AU is related to facial muscles of a different type. FACS is an anatomically based system which helps to measure nearly all visually discernible facial movement; it describes 44 unique AUs based on the different facial activities. The contractions of specific facial muscles alter the locations of the facial landmarks, and by combining AUs based on the locations of the facial landmarks it is possible to produce almost any facial expression. In the proposed method, after detecting all landmarks as shown in Fig. 7, the landmarks are initialized. Then the AUs are detected using a novel facial geometric model by extracting distance and angle features of the corresponding facial expressions using the landmarks of the eyes, nose, mouth and boundary in the facial data [36].

Table 5: AU and FACS Description (Michel Valstar and Maja Pantic, 2006)

AU 1 Inner Brow Raiser | AU 2 Outer Brow Raiser
AU 4 Brow Lowerer | AU 5 Upper Lid Raiser
AU 6 Cheek Raiser | AU 7 Lid Tightener
AU 8 Lips Toward Each Other | AU 9 Nose Wrinkler
AU 10 Upper Lip Raiser | AU 11 Nasolabial Deepener
AU 12 Lip Corner Puller | AU 13 Sharp Lip Puller
AU 14 Dimpler | AU 15 Lip Corner Depressor
AU 16 Lower Lip Depressor | AU 17 Chin Raiser
AU 18 Lip Pucker | AU 19 Tongue Show
AU 20 Lip Stretcher | AU 21 Neck Tightener
AU 22 Lip Funneler | AU 23 Lip Tightener
AU 24 Lip Pressor | AU 25 Lips Part
AU 26 Jaw Drop | AU 27 Mouth Stretch
AU 28 Lip Suck | AU 29 Jaw Thrust
AU 30 Jaw Sideways | AU 31 Jaw Clencher
AU 32 [Lip] Bite | AU 33 [Cheek] Blow
AU 34 [Cheek] Puff | AU 35 [Cheek] Suck
AU 36 [Tongue] Bulge | AU 37 Lip Wipe
AU 38 Nostril Dilator | AU 39 Nostril Compressor
AU 41 Glabella Lowerer | AU 42 Inner Eyebrow Lowerer
AU 43 Eyes Closed | AU 44 Eyebrow Gatherer
AU 45 Blink | AU 46 Wink

Geometrical Modeling and Facial Action Detection
Most facial landmark based approaches use a common reference point for detecting the facial variations. The disadvantage of a fiducial point based approach with a common reference point is that a slight variation or shift of the reference point (for example the nose tip) from the neutral to another expression may introduce small deviation errors in the neighbouring points and larger errors in the other landmark points. This problem can be solved to some extent, or the errors can be reduced, by fixing the reference points in the regions where the changes occur. This approach also helps to detect the upper and lower AUs separately. In this paper, geometrical modeling of the upper, middle and lower regions of the face is performed separately for detecting the AUs associated with each region. Distance and angle based facial muscle movement features are extracted and normalized for detecting the AU combinations associated with a facial expression. Neutral faces and other face images from the extended Cohn-Kanade (CK+) database and the Japanese Female Facial Expression (JAFFE) database are used here for emotion recognition. In the CK+ database [51] each video sequence is temporally segmented from the neutral frame to the peak frame of the corresponding facial expression. In the case of the neutral facial expression, the features are treated as the base features used later for calculating feature differences, whereas the new features of facial expressions other than neutral are obtained by calculating the difference between the neutral features and the current features. The calculated feature vectors are Di = (f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19), i = 1, 2, ..., N, where N is the number of face images. The features f1 to f6 are distance features and the features f7 to f19 are angle features.
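A sketch of how one normalized distance feature and one angle feature could be computed from the 38 landmarks is given below; the landmark indices follow Table 4 and the eye-corner normalization follows the feature calculation steps, but the specific choice of points and the helper names are our assumptions.

```cuda
#include <cmath>

struct Pt { double x, y; };

static double dist(Pt a, Pt b) { return std::hypot(a.x - b.x, a.y - b.y); }

// f: the 38 landmarks indexed as in Table 4 (0 = nose tip, 3 = chin tip,
// 6 = forehead upper centre, 7/14 = eyebrow corners, 15/21 = inner eye corners).
void exampleFeatures(const Pt f[38], double &chinMove, double &browAngle)
{
    Pt origin = f[0];                              // nose tip taken as origin
    double scale = dist(f[15], f[21]) + 1e-9;      // inner eye-corner distance

    // Distance feature: normalized nose-tip to chin-tip distance (chin centre movement).
    chinMove = dist(origin, f[3]) / scale;

    // Angle feature: angle at the forehead centre between the vectors towards
    // the two eyebrow corner landmarks (cosine of the angle between two vectors).
    Pt u = { f[7].x - f[6].x,  f[7].y - f[6].y };
    Pt v = { f[14].x - f[6].x, f[14].y - f[6].y };
    double cosang = (u.x * v.x + u.y * v.y) /
                    (std::hypot(u.x, u.y) * std::hypot(v.x, v.y) + 1e-9);
    browAngle = std::acos(cosang);
}
```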
Distance Feature Calculation
The steps for fixing the origin, calculating the feature points with respect to the origin, normalizing the vectors and calculating the distance features are as follows.
Facial landmarks: Fi = (xi, yi), i = 0, 1, ..., 37
Origin (nose tip): O = F0 = (x0, y0)
To do the calculations in a common coordinate system, each feature point is transferred to that coordinate system by subtracting the origin: Pi = Fi - O. It is also essential to maintain the same scale by forming normalized vectors, dividing by the eye corner distance. For the distance calculation the Euclidean distance is used; one example feature is the Euclidean distance between the nose tip and the forehead upper centre point. The distance features calculated are:
1) Chin centre movement
2) Left cheek movement
3) Right cheek movement
4) Forehead centre movement
5) Left eye centre movement
6) Right eye centre movement
The analysis of the chin movement helps to detect AU17, and the displacement analysis between the cheek contours and the nose tip helps to detect the raised state of the cheek, which is AU6. Similarly all other AUs can be detected from the distance and angle features. The relaxed and closed states of the lips are treated as neutral for the lower face AUs. The angle features are calculated as the cosine angle between two vectors; all the angle calculations are similar to this.

Upper Facial Region Analysis

Figure 7: Geometrical Modeling of Upper Facial Region

The upper facial region analysis helps to detect the AUs associated with the upper region of the face. The forehead upper centre is taken as the reference point and the angle features are calculated. Fig. 7 shows the geometrical modeling of the upper facial region, which helps to compute whether the inner or outer portion of the brows is raised, or whether the brows are lowered and drawn together. The features calculated are:
1) Upper eyebrow raiser
2) Lower eyebrow raiser
3) Brow lower
This computation helps to detect AU1, AU2 and AU4.

Eye Region Analysis

Figure 8: Geometrical Modeling of Eye Region

The eye region analysis and computation help to detect the AUs associated with the status of the eyes. To detect the upper AUs associated with the eyes it is essential to detect whether the eyes are closed or open. Fig. 8 shows the geometrical modeling of the eye region. The features calculated are:
1) Opening of the left eye
2) Opening of the right eye
3) Width of the left eye
4) Width of the right eye
The computation of the raised state of the upper eyelids and lower eyelids helps to detect AU5 and AU7 respectively. The relaxed states of the brows, eyes and cheeks are treated as neutral for the upper face AUs.

Nose Region Analysis

Figure 9: Geometrical Modeling of Nose Region

The nose region analysis helps to detect the AUs associated with the middle region of the face. The nose tip is taken as the reference point and the associated angle features are calculated. Fig. 9 shows the geometrical modeling of the nose region for its state analysis. To detect AU9, which is associated with the nose, the following features are used:
1) Nose wrinkle
2) Nostril compressor

Mouth and Lower Region Analysis

Figure 10: Geometrical Modeling of Mouth and Lower Face Region

The lower facial region analysis helps to detect the AUs associated with the lower region of the face. The chin tip is taken as the reference point and the angle features are calculated. Fig. 10 shows the geometrical modeling of the mouth and lower face region. To detect the AUs associated with the mouth and lower facial regions the following features are used:
1) Opening of the mouth
2) Stretching of the lip (right)
3) Stretching of the lip (left)
4) Opening of the jaw

Emotion Classification
In the proposed system an SVM is used for training and classification of emotions. Emotion recognition is done based on the AU combinations defined in FACS. Examples of emotion recognition based on the AU combinations of the facial expressions anger, fear, disgust, sadness, happiness and surprise are illustrated in Table 6. The extended Cohn-Kanade (CK+) and Japanese Female Facial Expression (JAFFE) datasets are used for training. The neutral expression is characterised by the features of the relaxed state of the brows, eyes, cheeks and lips and the closed state of the lips. The feature vectors for training are formed from the displacement and angle features of each expression variation.

Table 6: Facial Expressions and Associated AUs

Emotion | AU | Descriptor
Anger | AU4 | Brow Lowerer
Anger | AU5 | Upper Lid Raiser
Anger | AU7 | Lid Tightener
Anger | AU24 | Lip Pressor
Disgust | AU9 | Nose Wrinkler
Disgust | AU15 | Lip Corner Depressor
Disgust | AU16 | Lower Lip Depressor
Fear | AU1 | Inner Brow Raiser
Fear | AU2 | Outer Brow Raiser
Fear | AU4 | Brow Lowerer
Fear | AU5 | Upper Lid Raiser
Fear | AU20 | Lip Stretcher
Fear | AU26 | Jaw Drop
Happiness | AU6 | Cheek Raiser
Happiness | AU12 | Lip Corner Puller
Happiness | AU25 | Lips Part
Sadness | AU1 | Inner Brow Raiser
Sadness | AU4 | Brow Lowerer
Sadness | AU15 | Lip Corner Depressor
Surprise | AU1 | Inner Brow Raiser
Surprise | AU2 | Outer Brow Raiser
Surprise | AU5 | Upper Lid Raiser
Surprise | AU26 | Jaw Drop
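A hedged sketch of the SVM training step using OpenCV's ml module is shown below; the kernel type, termination criteria and the layout of the 19-dimensional feature vectors are assumptions, since the paper does not specify them.

```cuda
#include <opencv2/ml.hpp>

// trainFeatures: one 19-dimensional row (f1..f19, type CV_32F) per training face,
// trainLabels: one integer emotion label (CV_32S) per row, e.g. 0 = neutral ... 6 = surprise.
cv::Ptr<cv::ml::SVM> trainEmotionSvm(const cv::Mat &trainFeatures,
                                     const cv::Mat &trainLabels)
{
    cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
    svm->setType(cv::ml::SVM::C_SVC);          // multi-class classification
    svm->setKernel(cv::ml::SVM::RBF);          // assumed kernel choice
    svm->setTermCriteria(cv::TermCriteria(cv::TermCriteria::MAX_ITER, 1000, 1e-6));
    svm->train(trainFeatures, cv::ml::ROW_SAMPLE, trainLabels);
    return svm;
}

// Predicting the emotion of one test feature vector (1 x 19, CV_32F):
//   float label = svm->predict(testFeature);
```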
Experimental Evaluations
Experiments are conducted to evaluate the performance of our
system. Sequential and parallel implementations of the
algorithms are tested and compared for the speed
performance.
Experimental Results
The extended Cohn-Kanade (CK+) database and the Japanese Female Facial Expression (JAFFE) database are used here for emotion recognition. In the CK+ database each video sequence is temporally segmented from the neutral frame to the peak frame of the corresponding facial expression; for the neutral facial expression the initial frames are used. From the CK+ dataset, 7 emotions including the neutral expression of different subjects are taken for training. 500 images which are not included in the training set are taken as test images; among them 80 are neutral images and the remaining are images of the other expressions, with 70 images per expression category. Table 7 shows the confusion matrix for emotion recognition using the CK+ database; the columns correspond to predicted emotions and the rows to actual emotions, and Fig. 11 shows the corresponding graph.

Table 7: Confusion Matrix for Emotion Recognition (CK+ Database)

%        | Neutral | Anger | Disgust | Fear | Happy | Sadness | Surprise
Neutral  | 98.75   | 0     | 0       | 0    | 0     | 1.25    | 0
Anger    | 0       | 87.2  | 2.85    | 0    | 4.25  | 5.7     | 0
Disgust  | 1.4     | 0     | 88.6    | 10   | 0     | 0       | 0
Fear     | 0       | 5.7   | 5.7     | 88.6 | 0     | 0       | 0
Happy    | 0       | 0     | 2.85    | 0    | 92.9  | 0       | 4.25
Sadness  | 0       | 7.1   | 0       | 0    | 0     | 92.9    | 0
Surprise | 0       | 0     | 0       | 1.4  | 2.85  | 0       | 95.75
Overall recognition rate: 92.1

Figure 11: Recognized Emotions (CK+ Database)

The JAFFE database features ten different Japanese women and contains a total of 213 images posing examples of the seven basic emotions; all characteristics of a basic emotion are inherited from the neutral pose. The 84 images which are not included in the training set are taken as test images; they cover the various expressions, with 12 images per expression category including neutral. Table 8 shows the confusion matrix for emotion recognition using the JAFFE database; the columns correspond to predicted emotions and the rows to actual emotions, and Fig. 12 shows the corresponding graph.

Table 8: Confusion Matrix for Emotion Recognition (JAFFE Database)

%        | Neutral | Anger | Disgust | Fear | Happy | Sadness | Surprise
Neutral  | 100     | 0     | 0       | 0    | 0     | 0       | 0
Anger    | 0       | 83.4  | 8.3     | 8.3  | 0     | 0       | 0
Disgust  | 0       | 16.6  | 83.4    | 0    | 0     | 0       | 0
Fear     | 0       | 0     | 0       | 91.7 | 0     | 8.3     | 0
Happy    | 0       | 0     | 0       | 0    | 100   | 0       | 0
Sadness  | 0       | 8.3   | 0       | 0    | 0     | 91.7    | 0
Surprise | 0       | 0     | 0       | 0    | 0     | 0       | 100
Overall recognition rate: 92.9

Figure 12: Recognized Emotions (JAFFE Database)

Performance Analysis
The performance comparison of the single-threaded CPU version and the multithreaded GPU version of the face segmentation algorithm is shown in Fig. 13, and Table 9 shows the average execution time on the GPU and the CPU for the feature extraction of a face image. The host configuration is an Intel(R) Xeon(R) CPU, 2 GHz, with 48 GB RAM. The device used is an NVIDIA Kepler GK110 Tesla K20 1 TF GPU. This GPU features 2048 multiprocessors, 48 KB shared memory per multiprocessor and 48 GB device memory. There can be a maximum of 1024 threads per block and 2048 active threads per multiprocessor.

Table 9: Speed comparison of CPU and GPU versions of the algorithm

                 | Average Execution Time (ms)
                 | CPU        | GPU
1 CK+ Database   | 43.817786  | 0.609375
2 JAFFE Database | 11.028583  | 0.356010

Figure 13: Speed comparison of CPU and GPU versions of the algorithms (CK+ and JAFFE databases)
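GPU timings such as those in Table 9 could be measured with CUDA events, as in the hedged sketch below; the kernel body and image dimensions are placeholders, not the paper's actual segmentation code.

```cuda
#include <cuda_runtime.h>

// Placeholder kernel so the timing example is self-contained.
__global__ void featureKernel(const unsigned char *in, unsigned char *out,
                              int width, int height)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < height && col < width)
        out[row * width + col] = in[row * width + col];
}

float timeKernelMs(const unsigned char *d_in, unsigned char *d_out,
                   int width, int height)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    dim3 block(16, 16);
    dim3 grid((width + 15) / 16, (height + 15) / 16);

    cudaEventRecord(start);
    featureKernel<<<grid, block>>>(d_in, d_out, width, height);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);   // elapsed GPU time in milliseconds
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}
```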
Conclusions and Future Work
In this research a novel facial landmark based feature extraction technique for a FACS based emotion recognition system is developed. Geometrical modeling of the face is used to detect facial variations in its different regions. Initially the face is detected using the Haar cascade face detection module. Then CUDA accelerated edge based face segmentation is performed, and based on the segmented face image 38 facial landmarks are located with the help of ASM. Three important reference points on the axis of symmetry of the face are then fixed, and distance and angle based facial muscle movement features are extracted. After that the upper and lower AUs are detected and an SVM is trained. The proposed method was evaluated on the CK+ and JAFFE databases, and the results showed that it can reach 92.9% accuracy. The speed difference between the CPU and GPU based algorithms was also compared: the multithreaded GPU version of the facial feature extraction algorithm is much faster than the single-threaded CPU version. The recognition efficiency can be improved by also extracting appearance based features. In future the algorithm will be extended to a hybrid approach combining geometric and appearance based facial features to improve the accuracy of emotion recognition. This work focused on recognizing emotions from face images; it will also be extended to recognizing emotions from video.
References

[1] P. Ekman and W.V. Friesen, 1971, "Constants across Cultures in the Face and Emotion", Journal of Personality and Social Psychology, Vol. 17, No. 2, pp. 124-129.
[2] P. Ekman and W.V. Friesen, 1980, "Facial Signs of Emotional Experience", Journal of Personality and Social Psychology, Vol. 39, No. 2, pp. 1125-1134.
[3] Maja Pantic and Leon J.M. Rothkrantz, 2000, "Automatic Analysis of Facial Expressions: The State of the Art", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 12.
[4] B. Fasel and Juergen Luettin, 2003, "Automatic facial expression analysis: a survey", Pattern Recognition, Vol. 36, pp. 259-275.
[5] M. Pantic and L.J.M. Rothkrantz, 2000, "Expert system for automatic analysis of facial expressions", Image and Vision Computing, Elsevier, Vol. 18, No. 11, pp. 881-905.
[6] Vinay Bettadapura, 2010, "Face Expression Recognition and Analysis: The State of the Art", Technical Report, Georgia Institute of Technology, pp. 1-27.
[7] David B. Kirk and Wen-Mei W. Hwu, 2010, "Programming Massively Parallel Processors: A Hands-on Approach", Morgan Kaufmann Publishers, ISBN: 978-0-12-381472-2.
[8] Nvidia, 2014, CUDA Programming Guide.
[9] Juan Gómez Luna, 2012, "Programming issues for video analysis on Graphics Processing Units", University of Cordoba.
[10] Sukno F.M., Ordas S., Butakoff C., Cruz S., Frangi A.F., 2007, "Active Shape Models with Invariant Optimal Features: Application to Facial Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 7, pp. 1105-1117.
[11] Seshadri K. and Savvides M., 2012, "An Analysis of the Sensitivity of Active Shape Models to Initialization When Applied to Automatic Facial Landmarking", IEEE Transactions on Information Forensics and Security, Vol. 7, No. 4, pp. 1255-1269.
[12] Ke Sun, Huiling Zhou, Kin Man Lam, 2014, "An Adaptive-Profile Active Shape Model for Facial-Feature Detection", 22nd International Conference on Pattern Recognition (ICPR), 24-28 Aug. 2014, Stockholm, pp. 2849-2854.
[13] Anastasios Koutlas, Dimitrios I. Fotiadis, 2008, "An Automatic Region Based Methodology for Facial Expression Recognition", IEEE International Conference on Systems, Man and Cybernetics, 12-15 Oct. 2008, Singapore, pp. 662-666.
[14] Ying-li Tian, Takeo Kanade and Jeffrey F. Cohn, 2001, "Recognizing Action Units for Facial Expression Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 2, pp. 1-19.
[15] Mohammed J. Islam, Saleh Basalamah, Majid Ahmadi and Maher A. Sid-Ahmed, 2011, "Capsule Image Segmentation in Pharmaceutical Applications Using Edge-Based Techniques", IEEE International Conference on Electro/Information Technology (EIT), 15-17 May 2011, Mankato, MN, pp. 1-5.
[16] Giancarlo Jannizzotto, Francesco La Rosa, Pietro Lanzafame, 2005, "An edge-based Segmentation Technique for 2D still-image with Cellular Neural Networks", 10th IEEE Conference on Emerging Technologies and Factory Automation, 19-22 Sept. 2005, Catania, pp. 211-218.
[17] Farmer M.E. and Jain A.K., 2005, "A wrapper-based approach to image segmentation and classification", IEEE Transactions on Image Processing, Vol. 14, No. 12, pp. 2060-2072.
[18] Kakumanu P. and Bourbakis N., 2006, "A Local-Global Graph Approach for Facial Expression Recognition", 18th IEEE International Conference on Tools with Artificial Intelligence, Nov. 2006, Arlington, VA, pp. 685-692.
[19] D. Marr and E. C. Hildreth, 1980, "Theory of edge detection", Proceedings of the Royal Society of London, Vol. B207, pp. 187-217.
[20] John Canny, 1986, "A Computational Approach to Edge Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-8, No. 6, pp. 679-698.
[21] Lei Xiong, Nanning Zheng, Shaoyi Du, Lan Wu, 2009, "Extended Facial Expression Synthesis Using Statistical Appearance Model", 4th IEEE Conference on Industrial Electronics and Applications, 25-27 May 2009, Xi'an, pp. 1582-1587.
[22] Oya Celiktutan, Sezer Ulukaya and Bulent Sankur, 2013, "A comparative study of face landmarking techniques", EURASIP Journal on Image and Video Processing, Vol. 13, pp. 1-27.
[23] Leung T.K., Burl M.C., Perona P., 1995, "Finding faces in cluttered scenes using random labeled graph matching", Fifth International Conference on Computer Vision, 20-23 Jun 1995, Cambridge, MA, pp. 637-644.
[24] Wiskott L., Fellous J.-M., Kruger N., von der Malsburg C., 1997, "Face recognition by elastic bunch graph matching", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, pp. 775-779.
[25] Timothy F. Cootes, Gareth J. Edwards, and Christopher J. Taylor, 2001, "Active Appearance Models", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 6.
[26] D. Cristinacce, T. Cootes, 2003, "Facial feature detection using AdaBoost with shape constraints", Proc. of British Machine Vision Conference, Vol. 1, pp. 231-240.
[27] D. Cristinacce, T. Cootes, 2008, "Automatic feature localisation with constrained local models", Pattern Recognition, Vol. 41, pp. 3054-3067.
[28] S. Milborrow, F. Nicolls, 2008, "Locating facial features with an extended active shape model", Proc. of European Conference on Computer Vision, Marseille, France, pp. 504-513.
[29] P.N. Belhumeur, D.W. Jacobs, D.J. Kriegman, N. Kumar, 2011, "Localizing parts of faces using a consensus of exemplars", Proc. of Conf. on Computer Vision and Pattern Recognition, Providence, RI, USA, pp. 545-552.
[30] X. Zhu, D. Ramanan, 2012, "Face detection, pose estimation and landmark localization in the wild", Proc. of Conf. on Computer Vision and Pattern Recognition, Providence, RI, USA, pp. 2879-2886.
[31] Qihui Wang, Lijun Xie, Bo Zhu, Tingjun Yang, Yao Zheng, 2013, "Facial Features Extraction based on Active Shape Model", Journal of Multimedia, Vol. 8, No. 6, pp. 747-754.
[32] Jinwei Wang, Xirong Ma, Yuanping Zhu, and Jizhou Sun, 2014, "Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU", The Scientific World Journal, Vol. 14, pp. 1-13.
[33] J. F. Cohn, Z. Ambadar, and P. Ekman, 2007, "Observer-based measurement of facial expression with the Facial Action Coding System", The Handbook of Emotion Elicitation and Assessment, Oxford University Press Series in Affective Science, New York: Oxford.
[34] J. F. Cohn and P. Ekman, 2005, "Measuring facial action by manual coding, facial EMG, and automatic facial image analysis", Handbook of Nonverbal Behavior Research Methods in the Affective Sciences, pp. 9-64.
[35] Gianluca Donato, Marian Stewart Bartlett, Joseph C. Hager, Paul Ekman, and Terrence J. Sejnowski, 1999, "Classifying Facial Actions", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 10, pp. 974-989.
[36] Jeffrey F. Cohn, Adena J. Zlochower, James Lien, and Takeo Kanade, 1999, "Automated face analysis by feature point tracking has high concurrent validity with manual FACS coding", Psychophysiology, Vol. 36, pp. 35-43.
[37] Irene Kotsia and Ioannis Pitas, 2007, "Facial Expression Recognition in Image Sequences Using Geometric Deformation Features and Support Vector Machines", IEEE Transactions on Image Processing, Vol. 16, No. 1, pp. 172-187.
[38] Michael A. Sayette, Jeffrey F. Cohn, Joan M. Wertz, Michael A. Perrott, and Dominic J. Parrott, 2002, "A Psychometric Evaluation of the Facial Action Coding System for Assessing Spontaneous Expression", Journal of Nonverbal Behavior, Vol. 25, pp. 167-186.
[39] Michel Valstar and Maja Pantic, 2006, "Fully Automatic Facial Action Unit Detection and Temporal Analysis", Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06), 17-22 June 2006.
[40] Patrick Lucey, Jeffrey F. Cohn, Iain Matthews, Simon Lucey, Sridha Sridharan, Jessica M. Howlett, and Kenneth M. Prkachin, 2011, "Automatically detecting pain in video through facial action units", IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, Vol. 41, No. 3, pp. 664-674.
[41] Bihan Jiang, Michel Valstar, Brais Martinez, and Maja Pantic, 2014, "A Dynamic Appearance Descriptor Approach to Facial Actions Temporal Modeling", IEEE Transactions on Cybernetics, Vol. 44, No. 2.
[42] Karunadasa N.P., Ranasinghe D.N., 2009, "Accelerating high performance applications with CUDA and MPI", International Conference on Industrial and Information Systems (ICIIS), 28-31 Dec. 2009, Sri Lanka, pp. 331-336.
[43] Jianqiang Lv, Chunfen Xia, 2010, "The Application of CUDA Architecture in Facial Expression Recognition", International Symposium on Intelligence Information Processing and Trusted Computing (IPTC), 28-29 Oct. 2010, Huanggang, pp. 180-183.
[44] Huang XianLou, Yu ShuangYuan, 2013, "Image segmentation based on Normalized Cut and CUDA parallel implementation", IET International Conference on Wireless, Mobile and Multimedia Networks, 22-25 Nov. 2013, Beijing, pp. 209-214.
[45] Roberto Di Salvo, Carmelo Pino, 2011, "Image and Video Processing on CUDA: State of the Art and Future Directions", Mathematical Methods and Techniques in Engineering and Environmental Science, ISBN: 978-1-61804-046-6, pp. 60-66.
[46] P. Viola and M. J. Jones, 2004, "Robust real-time face detection", International Journal of Computer Vision, Vol. 57, No. 2, pp. 137-154.
[47] Li Dang and Fanrang Kong, 2010, "Facial feature point extraction using a new improved Active Shape Model", 3rd International Congress on Image and Signal Processing (CISP), Vol. 2, pp. 944-948.
[48] Paola Campadelli and Raffaella Lanzarotti, 2002, "Localization of Facial Features and Fiducial Points", Department of Computer Science and Engineering, University of Milan, Italy.
[49] Jian Li, Yuqiang Lu, Bo Pu, Yongming Xie, Jing Qin, Wai-Man Pang, 2009, "Accelerating Active Shape Model Using GPU for Facial Extraction in Video", IEEE International Conference on Intelligent Computing and Intelligent Systems, Vol. 4, pp. 522-526.
[50] Bram van Ginneken, Alejandro F. Frangi, Joes J. Staal, Bart M. ter Haar Romeny, and Max A. Viergever, 2002, "Active Shape Model Segmentation With Optimal Features", IEEE Transactions on Medical Imaging, Vol. 21, No. 8, pp. 924-933.
[51] Takeo Kanade, Jeffrey F. Cohn and Yingli Tian, 2000, "Comprehensive Database for Facial Expression Analysis", 4th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2000), 26-30 March 2000, Grenoble, France, pp. 484-491.
Biography
Sabu George was an Asst. Professor in the Electronics and Communication Engineering Departments of Al-Azhar College of Engineering and Technology, Kerala, and Pankajakathuri College of Engineering and Technology, Kerala. He received his B.Tech. degree in Electronics and Communication Engineering from Cochin University of Science and Technology, India, in 2006 and his M.Tech. degree in Networking and Internet Engineering from Visvesvaraya Technological University, India, in 2010. He is currently working towards his Ph.D. degree at Manipal Institute of Technology (MIT), Manipal University, India. His research interests include Image Processing and Computer Vision. His current focus of interest is the analysis of the emotions of fraud. He is a life member of the Computer Society of India.