Targeting specific facial variation for different identification tasks

Forensic Science International 201 (2010) 118–124

Gillian Aeria a, Peter Claes a,*, Dirk Vandermeulen b, John Gerald Clement a

a Melbourne Dental School, The University of Melbourne, 4th floor, 720 Swanston Street, Carlton, 3053, Victoria, Australia
b K.U. Leuven, Medical Imaging Research Center (MIRC), Faculty of Engineering, Department of Electrical Engineering–ESAT, Center for Processing Speech and Images–PSI, Herestraat 49, Bus 7003, 3000 Leuven, Belgium

* Corresponding author. Tel.: +61 3 9341 1522; fax: +61 3 9341 1594. E-mail addresses: [email protected], [email protected] (P. Claes).
Article history:
Received 15 January 2010
Received in revised form 1 March 2010
Accepted 8 March 2010
Available online 31 March 2010

Abstract
A conceptual framework that allows faces to be studied and compared objectively with biological
validity is presented. The framework is a logical extension of modern morphometrics and statistical
shape analysis techniques. Three dimensional (3D) facial scans were collected from 255 healthy young
adults. One scan depicted a smiling facial expression and another scan depicted a neutral expression.
These facial scans were modelled in a Principal Component Analysis (PCA) space where Euclidean (ED)
and Mahalanobis (MD) distances were used to form similarity measures. Within this PCA space, property
pathways were calculated that expressed the direction of change in facial expression. Decomposition of
distances into property-independent (D1) and dependent components (D2) along these pathways
enabled the comparison of two faces in terms of the extent of a smiling expression. The performance of
all distances was tested and compared in two types of experiments: classification tasks and a recognition task. In the classification tasks, individual facial scans were assigned to one or more population groups of smiling or neutral scans. The property-dependent (D2) component of both the Euclidean and Mahalanobis distances performed best in the classification task, correctly assigning 99.8% of scans to the right population group. The recognition task tested if a scan of an individual
depicting a smiling/neutral expression could be positively identified when shown a scan of the same
person depicting a neutral/smiling expression. ED1 and MD1 performed best, and correctly identified
97.8% and 94.8% of individual scans respectively as belonging to the same person despite differences in
facial expression. It was concluded that decomposed components are superior to straightforward distances in achieving positive identifications, and that the decomposition presents a novel method for quantifying facial similarity. Additionally, although the undecomposed Mahalanobis distance often used in practice outperformed the Euclidean distance, the opposite was true for the decomposed distances.

Crown Copyright © 2010 Published by Elsevier Ireland Ltd. All rights reserved.
Keywords: Identification; 3D facial scanning; Morphometrics; PCA; Similarity measures; Property pathways
1. Introduction
Identification of a person can be achieved in two ways. Firstly, a person can be identified as male or female, infant or adult, or by ancestry affiliation, each a broad classification task. Alternatively, this same person can be identified as a particular individual, i.e. John Smith. Both approaches to identification are equally
important but it is the task at hand that determines the more
suitable approach. In some scenarios, identifying or classifying an
individual into a population group i.e. ancestry affiliation, is of
greater interest and requires an understanding of the most
dominant population-specific characteristics, henceforth termed
inter-population variation. In a simple tribal affiliation example,
inter-population variation would include visual cues like tattoos,
piercings or pigments that are shared between members of the
same tribe only.
On the other hand, legal and high security situations require
exact identification of an individual. Positive recognition of an
individual relies on facial characteristics that make a person
distinctively different from all others (with the possible exception
of identical twins), which is different to classifying an individual
into a population based on shared facial characteristics. Thus,
knowledge of the entire spectrum of each and every biologically
viable facial characteristic throughout the human population is
required. This type of variation that makes someone unique can be
termed inter-individual variation. To illustrate this, recognition of
family members and individual allies occurs regardless of tribal
markings that may or may not be shared.
As faces are non-rigid structures, successful classification and recognition depend on knowledge of natural facial deformations. The face can change over time (e.g. ageing or change in body
mass index) and with facial expressions [1]. Such changes that each
person undergoes follow a predictable pattern and can be
considered intra-individual variation. These types of differences
are extremely difficult to deal with, making positive identification
of individuals an even more challenging task. For example, long-term missing persons get older, and previous photographs of them that are used as references must be used with caution.
A lot of credence is placed on identifications made by
eyewitnesses in court because human perception and recollection
are presumed to be correct. However, numerous studies show that
people are not always reliable sources when comparing faces with
recollections, and are significantly affected by differences in
lighting [2], familiarity [3], expression [3] and viewpoint or pose
[4]. There is an urgent need to develop objective and quantitative
methods to accurately describe facial variation that can then be
used to compare individuals or groups of people.
Farkas [5] identified discrete landmarks between which linear
distances and angles could be measured in order to quantitatively
record and describe the facial surface during direct (measured on the
subject directly) anthropometry. The main shortcomings of this
technique are that measurements taken are time consuming and
rely on the cooperation of the subject, and skill of the clinician, which
varied widely. More importantly, equipment often distorted the soft
overlying tissue resulting in inaccurate measurements. There is also
only one opportunity for measurements to be recorded and once the
subject has left or grown older, it is impossible to calculate error or
repeat/obtain additional measurements [6].
Indirect (measured on images of the subject) anthropometry
provides solutions to the pitfalls of direct methods. In addition to
being significantly quicker, measurements are taken on 2D images
where no physical contact with the subject is required. Images of
faces can be archived allowing inter-examiner error to be
calculated. However, one critical disadvantage is the loss of depth
information when transforming a 3D object into a flat surface like a
2D photo. Furthermore, distortions due to lighting and the focal length of lenses are other problems that are hard to standardise.
Both methods of measurement, direct as well as indirect, have
inherent flaws resulting in neither being suitable to objectively
record or describe facial variation. This inadequacy can be resolved
by using 3D scanners. Volumetric scanners like computed tomography (CT) and magnetic resonance imaging (MRI) are
mainly reserved for patient use. Surface scanners are the preferred
choice when imaging healthy participants because images are
captured rapidly, safely, cost-efficiently and non-invasively.
Once facial scans are obtained, 3D morphometrics enables the
description of complex shapes using numerical data, which
facilitates quantitative comparisons and results in objective assessments of facial variation [7]. It also forms the
basis of how similarity scores are calculated to confirm or refute
one’s identity. Facial scans also need to be represented in such a
way that correspondence is maintained between them so that valid
comparisons can be made. To allow for this, faces can be
represented using roughly three methods: landmark-, curve- and full-surface-based representations [8].
Regardless of the way in which facial data is represented and the
kind of numerical data extracted, a statistical shape analysis can
always be performed. This is a geometrical analysis of a set of shapes
in which statistics are derived to describe geometrical properties
from similar shapes or different groups [9]. The result is a statistical
space representing typical variability over similarly shaped objects.
The most popular technique used for this purpose is Principal
Component Analysis (PCA), which enables the description of the
maximum amount of shape variation while using the minimum
number of variables. As such it redescribes the original dataset in
terms of the observed variations that are independent of one
another. All principal components (PC) describe a direction of
variation independent of all other PCs in the dataset. This is of great interest and value when the focus of facial comparisons is indeed the differences or variations between them. This method has been used
by Hammond et al. [10], Hennessy et al. [11] and Claes [12] to study
facial variation after dense correspondence had been achieved. It is
observed that faces that appear more similar are found closer together within this PCA space. Consequently, this
enables distance within the shape space to be used as a measure of
similarity. Two forms of distances in the face space have been used so
far, Euclidean in Refs. [13–15] and Mahalanobis in Refs. [1,16]. The
Euclidean distance is the simple straightforward linear distance,
whereas the Mahalanobis distance is the variation-normalised
distance.
The crucial disadvantage of PCA for studying a specific type of
variation observed in a dataset like facial expression is the inability
to control the type of variation extracted by each PC. The most
prominent spread of variation within the dataset is always
extracted by the first PC, and the second widest variation by the
second PC. This is entirely dependent on the distribution of the
dataset being studied and may not reflect the specific variation of
interest. An approach that deals with this problem is found in Refs.
[17,18] and describes the concept of property pathways. A
property is a term assigned to the cause of change that individuals
undergo and includes age, expression or weight. A property
pathway is the combination of PCs that cumulatively expresses the
direction of variation caused by the property. This makes it possible to focus on and study specific, property-related variation within the face (PCA) space.
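Schematically, and with notation chosen here for illustration rather than taken verbatim from Refs. [17,18], a property pathway can be written as a unit direction assembled from the PCs:

    \mathbf{p} = \sum_{i=1}^{k} w_i \, \mathbf{e}_i, \qquad \lVert \mathbf{p} \rVert = 1,

where e_i is the i-th principal component and the weight w_i expresses how strongly the property (e.g. the smiling expression) loads on that component. A distance between two faces can then be split into a component along p and a component orthogonal to it, which is the decomposition developed below.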
The aim of this work is to extend the use of distances as
measures of similarity within a PCA space with reference to
specific variations of interest, by incorporating the concept of
property pathways. Using the direction of a property pathway, distances can be decomposed into two components: a property-dependent and a property-independent component. The smiling facial expression is used here as an exemplar for a property; however, the decomposition can be applied to any kind of detectable variation within a dataset, such as age or BMI. In this smiling
example, the property-dependent component is defined as the
variation between faces that is caused by and so dependent on the
change in this particular expression. The expression-independent
component describes the dual variation between faces that is not
caused by and so independent of this expression. Depending on the
identification task at hand, a certain variation may be better at
discerning between different faces or detrimental towards
achieving positive identification. The decomposition of distances
is a framework to target specific variation. This framework is tested
on two different types of identification: a classification task
typically focussing on inter-population variation and a recognition
task typically focussing on inter- and intra-individual variation.
There also exist two traditional types of distance measurements: the Euclidean and a standardised version called the Mahalanobis
distance. The decomposed distances will be calculated using both
Euclidean and Mahalanobis distance types to test their effect on
measuring similarity with regards to a specific property.
2. Materials and methods
Ethics approval for the project ‘‘The Characterisation of 3-Dimensional Facial
Profile in Young Adult Western Australians'' was granted by the Princess Margaret Hospital for Children (PMH) ethics committee (PMHEC 1443/EP) in Perth, W.A. Scans of healthy young adults aged 18–26 years were collected using a 3dMD facial
scanning system. Participants were scanned twice, the first time exhibiting a neutral
face (no expression), while in the second scan participants were asked to smile. The
precision and repeatability of the 3dMDface™ (two-pod) system were tested by Aldridge et al. [19]. They defined precision as the ‘averaged absolute difference between repeated measures of the same object', which was reported to be 0.827 mm.
Repeatability was defined by Aldridge et al. [19] as the ‘measure of precision relative
to the magnitude of actual biological difference between individuals’ and reported to
be greater than 95% in 181 out of 190 linear distances between landmarks.
Each participant also filled out a questionnaire, which recorded his or her gender,
age, weight, height and Body Mass Index (BMI). Individuals whose scans contained
artefacts or were of poor quality were excluded from the study leaving 255
individuals whose scans were acceptable to use. In this study, the facial data collected was represented as a complete surface. Shape data consisted of a dense
number of points each with its own x, y and z component in 3D space. This collection
of 3D points can exist as a point cloud or wireframe. To enable the statistical
analysis of shape represented by point clouds, a method that automatically achieves
anatomical correspondence between faces was employed, because of the
impracticality of indicating thousands of corresponding points manually.
The establishment of dense correspondence across all 3D points in facial scans
presented a chicken-and-egg problem. A reference face (point cloud) against which all facial scans could be modelled needed to be created. Ideally, to
eliminate any type of bias towards specific individuals in a database, the average
face should be used as a reference. However this average cannot be obtained
without the knowledge of how each facial scan corresponded with other facial scans
in the dataset. This problem was resolved by applying a bootstrapping mapping process from Ref. [12].
Once the redistributed facial scans shared the same number of data points and
connectivity between the points, PCA was utilised. Since the amount of information
conveyed by each facial scan was standardised (same amount of points and
connectivity), each facial scan was represented as a single point in a multi-dimensional model space. The number of principal components (PCs) required increases rapidly with the total amount of variance that needs to be captured. Typically
the last one to two percent of the variance observed in the dataset is a result of
random errors or artefacts caused by the scanning and mapping process. As this
data was of no biological significance, it was omitted from the study.
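As a rough illustration only — the paper does not give an implementation, and all names below (e.g. build_face_space) are hypothetical — such a model space can be sketched in NumPy, assuming each of the N corresponded scans has been flattened into one row vector and the last one to two percent of variance is discarded:

    import numpy as np

    def build_face_space(faces, keep=0.98):
        # faces: N x 3P array; each row is a corresponded scan flattened
        # to (x1, y1, z1, x2, y2, z2, ...)
        mean_face = faces.mean(axis=0)
        centred = faces - mean_face
        U, s, Vt = np.linalg.svd(centred, full_matrices=False)
        var = s ** 2 / (len(faces) - 1)              # variance along each PC
        k = np.searchsorted(np.cumsum(var) / var.sum(), keep) + 1
        return mean_face, Vt[:k], np.sqrt(var[:k])   # mean, PCs, per-PC std

    def to_coefficients(face, mean_face, pcs):
        # a scan becomes a single point (coefficient vector) in the model space
        return pcs @ (face - mean_face)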
Within this model space linear distances measured the similarity between two
faces, F1 and F2. Two types of distances were measured, a Euclidean distance (ED)
and a Mahalanobis distance (MD). The Euclidean distance was the simple
straightforward linear ‘shortest’ distance between two points, whereas the
Mahalanobis distance was the statistically normalised distance between two points, obtained by dividing the coordinate along each PC by the standard deviation of that PC.
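Under the same hypothetical representation as above, the two distance types reduce to a few lines; the Mahalanobis version simply rescales each PC coordinate by that PC's standard deviation before taking the same linear distance:

    import numpy as np

    def euclidean(c1, c2):
        # ED: the straightforward 'shortest' linear distance in the model space
        return np.linalg.norm(c1 - c2)

    def mahalanobis(c1, c2, pc_std):
        # MD: the variation-normalised distance; each PC coordinate is divided
        # by that PC's standard deviation before measuring the linear distance
        return np.linalg.norm((c1 - c2) / pc_std)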
Since individual attributes like gender, BMI and expression were recorded for
each scan, linear directions called property pathways [17,18] could be established
within the PCA space. Hence, moving along an expression pathway linearly approximates and simulates an expression shift for that person's face, as depicted in Fig. 1.

Fig. 1. Moving an individual face along a property path. Moving a face along a property path alters the individual's face in terms of that property only. The images indicate a progression in the smiling facial expression of the same individual throughout.

Using this pathway, the decomposition of a distance between
faces F1 and F2 into a property-independent (D1) and a property-dependent (D2) component was obtained, which is the key contribution of this work. This decomposition is
illustrated in Fig. 2. The expression pathway was first plotted through F2. Moving F2
along this pathway would change the expression of F2. From F1, a line
perpendicular to this pathway was constructed. This resulted in a right-angled triangle connecting F1, F and F2.

Fig. 2. Distance decomposition in face space. For simplicity, the model space is shown in 2D. The decomposition of D into its two components, D1 and D2, is obtained via the right-angled triangle connecting F1, F and F2. Because D1 is perpendicular to the property path, it represents the difference between faces independent of the property being studied. Since D2 is parallel to the property pathway, it represents the difference between faces that is caused by the property.

According to Pythagoras' theorem, two sides of a right-angled triangle can be combined to describe the hypotenuse. The distances
(D1 and D2) of these two sides are the two components of the original distance (D).
The perpendicular nature of these two sides or distances implies the statistically
independent nature of these two similarity measures. As D2 is parallel to the
expression property pathway, it measures variation between two faces that can be
solely attributed to it. Similarly, because D1 is perpendicular to the expression
pathway, it measures variation between two faces that is linearly independent of
the difference in facial expression.
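A minimal sketch of this decomposition, assuming the expression pathway is already available as a direction vector in the PC space (its estimation follows Refs. [17,18] and is taken as given here; function and variable names are hypothetical):

    import numpy as np

    def decompose(c1, c2, pathway):
        # c1, c2: PC-coefficient vectors of faces F1 and F2;
        # pathway: property (expression) direction in the same PC space
        p = pathway / np.linalg.norm(pathway)    # unit pathway direction
        d = c1 - c2                              # difference vector D
        d2 = abs(d @ p)                          # property-dependent D2 (parallel)
        d1 = np.linalg.norm(d - (d @ p) * p)     # property-independent D1 (perpendicular)
        return d1, d2                            # d1**2 + d2**2 == ||d||**2 (Pythagoras)

    # Mahalanobis variants (MD1, MD2): divide c1, c2 and the pathway
    # element-wise by the per-PC standard deviations before calling decompose().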
In total, six similarity measures were examined, three Euclidean (E) distances:
ED, ED1 and ED2 and their Mahalanobis (M) versions: MD, MD1 and MD2. These
measurements formed the basis for the similarity scores used in the following tasks.
2.1. Classification and recognition tasks rationale
Two tasks were devised to test the ability of the similarity measures introduced
above to classify and recognise an individual with regards to a property, in this
study either a smiling or neutral facial expression. Both tasks represented two
methods for determining identity. The classification task aimed to identify an
individual in terms of the population(s) he or she might, or might not belong to, and
so aimed to isolate inter-population variation. To do so, similarity measures were
calculated and compared between scans of individuals and expression archetypes.
Archetypes [20] were constructed by averaging the re-sampled 3D coordinates of
every individual’s scan in the entire population. A neutral archetype was built using
the population of neutral scans and a smiling archetype from the population of
smiling scans. A leave-one-out scheme was applied to every test to remove any bias of the archetype towards the individual being tested. Closed-classification and open-classification test scenarios were conducted for all individuals.
In the closed-classification test scenario, both archetypes were presented with
the certainty that the individual in question matched at least one of them. Similarity
measures were calculated between the smiling scan and both archetypes for each
individual. The same was calculated for the individuals’ neutral scan. Classification
of a scan into one population and not the other was made according to the higher
similarity score with respect to each archetype.
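A sketch of this decision rule under the same assumptions (names hypothetical; a higher similarity corresponds to a smaller distance):

    import numpy as np

    def loo_archetype(population, i):
        # average of all coefficient vectors in the population except scan i,
        # removing any bias of the archetype towards the individual tested
        return (population.sum(axis=0) - population[i]) / (len(population) - 1)

    def classify_closed(scan, smiling_arch, neutral_arch, dist):
        # assign the scan to the expression population whose archetype it is closer to
        return "smiling" if dist(scan, smiling_arch) < dist(scan, neutral_arch) else "neutral"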
The open-classification test represented a more realistic scenario where only one
population possibility (archetype) was presented and questioned if an individual scan
belonged to it or not. Similarity scores were calculated for all individual scans with the
smiling archetype, and then with the neutral archetype. An operating threshold was
set to determine if a similarity score was high enough to indicate that an individual
scan belonged to that population. A classification decision was made based on
whether the similarity score was above or below the operating threshold. If the
classification decision was correct and the individual scan matched the facial expression of the archetype, then a true positive (correct) classification was recorded.
If the classification was incorrect and the individual scan did not match the archetype,
then a false positive (incorrect) classification was recorded. Altogether, each
individual scan took part in three tests: firstly in the closed-classification test where
it was tested simultaneously against both archetypes, secondly against the smiling
archetype, and finally against the neutral archetype in the open-classification tests.
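The threshold sweep behind the ROC evaluation of Section 3 can be sketched as follows (hypothetical names; distances holds each scan's distance to one archetype, and is_match flags the scans that truly share that archetype's expression):

    import numpy as np

    def roc_points(distances, is_match):
        # smaller distance = higher similarity; sweep every score as a threshold
        points = []
        for t in np.sort(distances):
            accepted = distances <= t            # scans classified into the population
            tp = (accepted & is_match).sum() / is_match.sum()
            fp = (accepted & ~is_match).sum() / (~is_match).sum()
            points.append((fp, tp))              # one point on the ROC curve
        return points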
Recognition tasks were performed, as described in Ref. [16], to test the ability of
each similarity measure to identify 3D facial scans belonging to an individual
displaying a neutral and smiling expression as one and the same individual.
Accordingly, isolation of inter- and intra-individual variation was of concern. Either
the individual’s smiling or neutral scan was assigned to serve as a probe and was
added to the model while the other or counterpart scan served as a target to be
matched. Scans belonging to the rest of the dataset regardless of expression were
added to act as possible matches. All scans including the probe’s counterpart were
presented to it and similarity scores between them were calculated and ranked in
decreasing order. Hence, the scan that was ranked first was assumed to bear the
most resemblance to the probe. Like the open-classification test, an operating
threshold was set to decide whether the similarity scores between scans was high
enough to be deemed the same person. If the similarity score of a scan was above
the operating threshold, the scan was described as being detected. If the
counterpart was detected and ranked first, then a true positive recognition (correct) was recorded. If the scan that was detected and ranked first was not the
counterpart to the probe then a false positive recognition (incorrect) was recorded.
This contributed to the false alarm rate. An alternative outcome called a false
negative recognition occurred (in conjunction with a false alarm) when the
counterpart scan was not detected even though it was presented to the probe.
Table 1
Closed-classification task results.

Similarity measure    % Correct    Decision difference    Standard deviation
ED                    81.3         102.90                 56.90
ED1                   33.3           0.00                  0.00
ED2                   99.8         242.30                 37.50
MD                    99.8           0.25                  0.06
MD1                   18.7           0.00                  0.00
MD2                   99.8           1.83                  0.28

Percentage of correct classifications and decision differences (with standard deviations) for all similarity measures in the closed-classification task.
Performances of similarity measures were further tested by comparing the ranks of
each counterpart detected for all probes.
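Under the same assumptions, the recognition protocol can be sketched as follows (names hypothetical): the gallery is ranked by similarity to the probe, the operating threshold decides detection, and a true positive requires the counterpart to be both detected and ranked first:

    import numpy as np

    def recognise(probe, gallery, counterpart, dist, threshold):
        # gallery: coefficient vectors of all candidate scans; counterpart: index
        # of the probe's true counterpart; threshold: operating detection threshold
        d = np.array([dist(probe, g) for g in gallery])
        best = int(np.argmin(d))                 # rank-1 (most similar) scan
        if d[counterpart] <= threshold and best == counterpart:
            return "true positive"               # counterpart detected and ranked first
        if d[best] <= threshold:
            return "false positive"              # a detected non-counterpart ranked first
        return "false negative"                  # counterpart not detected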
3. Results
The performance of the closed-classification task was expressed
as the percentage of individual cases that were classified as having
the correct expression (% correct). The decision difference was calculated by finding the absolute difference between the distance from an individual scan to the smiling archetype and the distance from that same scan to the neutral archetype. These absolute differences were calculated for each of the six distance measurements for all individual scans and the mean was found. The bigger
this difference, the greater the discriminating power of the
similarity measure. Results for the closed-classification task are
listed in Table 1. Of all the similarity measures, ED2, MD and MD2
achieved the highest rate of correct classifications (99.8%). Out of
510 scans, only one smiling scan was misclassified as belonging to
the neutral population. ED correctly classified 81.3% of scans while
ED1 and MD1 only correctly classified 33.3% and 18.7% of
individual scans respectively. As Mahalanobis distances are
normalised Euclidean distances, the scale of the decision
differences was not equal and so comparisons could only be
carried out within their respective groups. MD2 had a greater
decision difference of 1.83 than MD (0.25), while ED2 had a greater
decision difference of 242.30 compared to ED (102.90).
Fig. 3. Open-classification task graphs. Open-classification tasks are evaluated using the respective ROC curves per similarity measure: (a) ED, (b) ED1, (c) ED2, (d) MD, (e) MD1, (f) MD2.
Performances in the open-classification task were evaluated using Receiver Operating Characteristic (ROC) curves. These plot the correct classification rate (as a fraction of one) against the misclassification rate for a range of operating thresholds. This
curve represents the trade-off between the correct classifications
and misclassifications for a given operating threshold. A diagonal
connecting the lower left corner with the upper right corner
represents the line of chance (50%). The ROC curves graphed in
Fig. 3 indicate that at a certain operating threshold, both ED2 (c) and MD2 (f) achieve perfect classification scores (a correct classification rate of 100%) and so their curves lie along the y-axis. Both ED1 (Fig. 3(b)) and MD1 (Fig. 3(e)) performed at chance level, while ED (Fig. 3(a)) and MD (Fig. 3(d)) performed slightly better than ED1/MD1 because the net area of both graphs lies above the chance line.

Fig. 4. Recognition task graphs. Recognition curves showing identification rate (%) vs. rank (%) per similarity measure: (a) ED, (b) ED1, (c) ED2, (d) MD, (e) MD1, (f) MD2.

Table 2
Recognition task results.

Similarity measure    FAR 1%    FAR 50%    FAR 100%
ED                     0.0       7.0        32.7
ED1                   62.6      92.8        97.8
ED2                    0.0       1.4         4.8
MD                    32.3      87.1        94.8
MD1                   46.6      89.2        95.4
MD2                    0.0       1.6         4.8

Percentage of individuals identified correctly within the top 1% at various false alarm rates (FAR) for all similarity measures.
Recognition curves in Fig. 4 indicate that ED1 and MD1 performed best in their respective distance groups, achieving correct recognition of 62.6% and 46.6% of individuals respectively at an
extremely strict false alarm rate (less than 1%). At a false alarm rate
of 50%, an extremely high rate of counterparts (92.8%) were
identified correctly and ranked in the top 1% (Table 2) by ED1. At a
false alarm rate of 100%, 97.8% and 94.8% of individuals were
identified correctly by ED1 and MD1 respectively. Both ED2 and
MD2 performed poorly and never detected and identified more
than 5% of counterparts (Table 2) even when no operating
threshold was set (false alarm rate of 100%).
4. Discussion
In both classification tasks, similarity measures MD2 and ED2
successfully and accurately classified the greatest number of
individual scans into their correct expression groups. This is
expected because both ED2 and MD2 measured variation as a
distance due to a change in expression. Thus ED2 and MD2
measured inter-population variation. Likewise, ED1 and MD1 both performed the worst in their distance groups, because both these similarity measures were perpendicular to the expression axis and quantified variation that is independent of facial expression and not relevant to classification. MD achieved results equivalent to those of MD2. However, the discriminating power of MD2 is much higher than that of MD. This too is expected, because MD also contains the expression-independent variation measured by MD1. Furthermore, the performance of MD was poorer than that of MD2 in the open-classification task, which represents a more realistic and harder scenario.
For the same reasons that ED1 and MD1 performed poorly in
the classification tasks, they performed the best in the recognition
task. This was because these two similarity measures expressed
variation that was independent of expression and so measured
inter-individual variation only. The recognition curve in Fig. 4(b)
also shows that for ED1, all of the counterparts that were detected
at a false alarm rate of 1% were ranked in the top 1%. A false alarm
rate of 1% sets an extremely strict threshold to minimise the
number of false alarms. Even at this false alarm rate, ED1 still
managed to achieve a high percentage of positive identifications.
This is especially applicable in situations like security systems
where access has to be strict and the number of false alarms has to
be minimised as much as possible. At a false alarm rate of 50%, an extremely high rate of counterparts (92.8%) were identified correctly in the top 1% (Table 2); however, the cost is that for every probe correctly identified, another probe is identified incorrectly. For a security system, this cost may be too high and
the ideal operating threshold would be somewhere in between the
false alarm rates of 1% and 50%. Further increasing the false alarm
rate to 100% resulted in 97.8% of individuals correctly ranked
within the first percentile.
Alternatively, in the situation of long-term missing persons
where all possible leads have to be matched to a photo of the
individual, it may be of interest to maximise the number of leads so as to achieve a high probability of a positive identification.
Here, a false alarm rate of 100% may be chosen to maximise the
probability of finding a match deemed similar enough. In other
words, an operating threshold is no longer of importance because
all possible leads need to be investigated. Using the similarity measure ED1, the identities of all probes were ranked within the top 18%, which indicates the power of the ED1 similarity measure (Fig. 4(b)).
On the other hand, both ED2 and MD2 performed poorly in the
recognition task and never detected and identified more than 5% of
counterparts (Table 2) even when no operating threshold was set
(false alarm rate of 100%) because expression-dependent (inter-population) variation was measured. ED and MD performed in the middle because they are a combination of both components.
Both ED and MD are distances that have already been used in
previous studies. However, it is unclear which distance is the more appropriate for measuring similarity. Here, MD
outperformed ED in all tasks. This is because a Mahalanobis
distance is a normalised distance which results in the reduction of
large sources of variation and amplification of smaller sources of
variation. As a result, all kinds of variation represented by each PC
become equally significant. In the classification tasks, expression-independent variance was minimised, and since expression variance had a great impact on the similarity score, it caused MD to behave like MD2. MD minimises variation caused by gender,
age and BMI whereas ED places a larger emphasis on whether
gender, age and BMI of the individual scan matches that of the
archetype. In the Recognition task, a large source of variation
between a neutral scan of an individual and its smiling counterpart
is due to expression. MD reduces this type of variation, allowing it to
correctly recognise counterparts regardless of facial expression
and so performed similarly to MD1. Hence, in this study MD is the preferred distance compared to ED. However, the decomposed Euclidean distances (ED1 and ED2) outperform MD and its decomposed components. This is especially true in the case of ED1 vs. MD1 in the recognition task, because the Mahalanobis version caused a reduction in the remaining relevant variation. ED1 fully retained the relevant inter-individual variation, which aided in the identification of correct counterparts. Thus, where Mahalanobis distances fail, the decomposed Euclidean distances succeed.
5. Conclusion
The hardest challenge in achieving positive identification is that
under various circumstances, the same individual appears
different. In this work faces are modelled as single points in a
PCA space enabling distance to be used as a similarity measure.
Property pathways provide a means of isolating specific variation within this variation-based space. Accordingly, distance decomposition into a property-dependent and a property-independent component provides a novel approach for describing and dealing with such differences. Depending on the context, certain components prove better at discriminating between inter-population, inter-individual and intra-individual variation.
Two kinds of distances were calculated between faces in the
model space. MD performed better than ED but more importantly,
the decomposed Euclidean distances (ED1 and ED2) proved to be
more successful at achieving correct identification than the
Mahalanobis distances (MD1 and MD2). To the authors’ best
knowledge this concept of distance decomposition is novel to the
field and establishes the foundation for an entirely new technique
of comparing and quantifying facial variation.
Facial expression was used as a surrogate for more naturally
occurring facial characteristics because it was easy to acquire data
ethically and to categorise in a well-controlled test. However, such a scenario is fairly artificial, and so the results obtained here are better than would be expected in a real-life situation. Facial expression also follows a linear pattern and so is easily modelled using property pathways. The framework developed in this study is designed to address such linear variation and is of limited use for predicting non-linear variations. The accuracy of this face space framework is also highly dependent on the number of facial images in the dataset; if they are insufficient or unevenly distributed with respect to a particular characteristic, the system will make erroneous predictions. Nonetheless,
the developed framework and its similarity measures can be
applied to a range of studies including, but not limited to, suspect
identification and surveillance systems.
Further work needs to be carried out to determine the
behaviour of other properties, particularly the ageing process
and changes to body mass index and how they can be modelled
within the PCA space. The recent availability of the publicly accessible databases of facial data listed in Ref. [8] enables additional tests to be carried out, as well as the comparison of findings and model
systems developed by other groups within the field.
Acknowledgements
The authors would like to thank Miranda Norquay and Mark
Walters from the Princess Margaret Hospital for Children (PMH) in
Perth, W.A. for providing us with high-quality 3D scans, which were used to generate the results. This work was supported by
the Australian Research Council (ARC) grant DP0772650.
References
[1] A.M. Bronstein, M.M. Bronstein, R. Kimmel, Expression-invariant 3D face recognition, in: Proceedings of the 4th International Conference on Audio- and Video-based Biometric Person Authentication, Springer, 2003.
[2] C.H. Liu, C.A. Collin, A.M. Burton, A. Chaudhuri, Lighting direction affects recognition of untextured faces in photographic positive and negative, Vis. Res. 39 (24)
(1999) 4003–4009.
[3] V. Bruce, Z. Henderson, C. Newman, A.M. Burton, Matching identities of familiar
and unfamiliar faces caught on CCTV images, J. Exp. Psychol. Appl. 7 (3) (2001) 207–218.
[4] G.E. Pike, R.I. Kemp, N.A. Towel, K.C. Phillips, Recognizing moving faces: the
relative contribution of motion and perspective view information, Vis. Cogn. 4 (4)
(1997) 409–438.
[5] L.G. Farkas, Anthropometry of the Head and Face in Medicine, Elsevier North
Holland, Inc., New York, 1981.
[6] R. Taylor, Analysis of three-dimensional craniofacial images: applications in
forensic science, anthropology and clinical medicine, School of Dental Science, University of Melbourne, Melbourne, 2008, p. 326.
[7] M. Zelditch, D. Swiderski, D.H. Sheets, W. Fink, Geometric Morphometrics for
Biologists, Elsevier Academic Press (2004) ISBN 0127784608.
[8] D. Smeets, P. Claes, D. Vandermeulen, J.G. Clement, Objective 3D face recognition: evolution, approaches and challenges, Forensic Sci. Int., same issue (2010).
[9] I.L. Dryden, K.V. Mardia, Statistical Shape Analysis, John Wiley & Sons, 1998.
[10] P. Hammond, T.J. Hutton, J.E. Allanson, L.E. Campbell, R.C.M. Hennekam, S. Holden, M.A. Patton, A. Shaw, I.K. Temple, M. Trotter, K.C. Murphy, R.M. Winter, 3D analysis of facial morphology, Am. J. Med. Genet. 126A (2004) 339–348.
[11] R.J. Hennessy, S. McLearie, A. Kinsella, J.L. Waddington, Facial surface analysis by 3D laser scanning and geometric morphometrics in relation to sexual dimorphism in cerebral-craniofacial morphogenesis and cognitive function, J. Anat. 207 (2005) 283–295.
[12] P. Claes, A robust statistical surface registration framework using implicit function representations: application in craniofacial reconstruction, PhD thesis, Faculteit Ingenieurswetenschappen, Departement Elektrotechniek, afdeling PSI, K.U. Leuven, Leuven, Belgium, 2007.
[13] X. Li, H. Zhang, Adapting geometric attributes for expression-invariant 3D face recognition, in: Proceedings of the IEEE International Conference on Shape Modeling and Applications, Washington, DC, USA, IEEE Computer Society, 2007.
[14] I. Mpiperis, S. Malassiotis, M.G. Strintzis, 3-D face recognition with the geodesic polar representation, IEEE Trans. Inf. Forensics Secur. 2 (3) (2007) 537–547.
[15] C. Hesher, A. Srivastava, G. Erlebacher, A novel technique for face recognition
using range imaging, ISSPA 2003 2 (2003) 201–204.
[16] S.Z. Li, A.K. Jain, Handbook of Face Recognition, Springer-Verlag, New York, 2005.
[17] P. Claes, D. Vandermeulen, S. De Greef, G. Willems, P. Suetens, Statistically
deformable face models for cranio-facial reconstruction, CIT 14 (1) (2006) 21–30.
[18] P. Claes, D. Vandermeulen, S. De Greef, G. Willems, P. Suetens, Craniofacial reconstruction using a combined statistical model of face shape and soft tissue depth: methodology and validation, Forensic Sci. Int. 159 (1) (2006) S147–S158.
[19] K. Aldridge, S.A. Boyadjiev, G.T. Capone, V.B. DeLeon, J.T. Richtsmeier, Precision and error of three-dimensional phenotypic measures acquired from 3dMD photogrammetric images, Am. J. Med. Genet. A 138A (3) (2005) 247–253.
[20] A.I. Shaweesh, C.D.L. Thomas, A. Bankier, J.G. Clement, Delineation of facial
archetypes by facial averaging, Ann. Roy. Australas. Coll. Dent. Surg. 17 (2004)
73–79.