Poster#1 - People Server at UNCW

Can Computer Algorithms Guess Your Age and Gender?
Andrea Kaniuka (1, 2), William Smith (1), Nina Thigpen (1, 2)
Supervisor: Dr. Cuixian Chen
(1) UNCW Department of Mathematics and Statistics
(2) UNCW Department of Psychology
Introduction
Demography classification refers to perceptions of gender, age, and race in social situations. Of
particular interest is the perception of gender and age, as social communication requires that we
correctly identify gender and age when interacting with others. Gender classification has
received attention recently in human-computer interaction technology and artificial
intelligence, with research attempting to improve computer classification of gender to reach
identification rates comparable to those of humans. The current study examines 1) age classification
with computer algorithms and 2) gender classification with computer algorithms.
Statistical Techniques
Linear Discriminant Analysis (LDA) and Multiple Discriminant Analysis (MDA): Classification tools
used to linearly separate a data set into different groups. MDA is similar to LDA, but uses multiple
midpoints to better determine how to separate the groups. The current study used LDA and MDA
to classify the images from the FG-NET database into gender groups (male and female) and age
groups (under 20 and over 20).
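The idea of linearly separating two groups can be illustrated with a minimal sketch. It is not the code used in the study: the data are synthetic two-dimensional points rather than the image parameters, the function names `fit_lda_2d` and `predict` are hypothetical, equal group priors are assumed, and accuracy is measured on the training points rather than by cross-validation. MDA is not shown.

```python
import random
from statistics import mean


def fit_lda_2d(points, labels):
    """Fit two-class LDA on 2-D points. Returns (w, c): predict class 1 when w.x > c."""
    groups = {k: [p for p, lab in zip(points, labels) if lab == k] for k in (0, 1)}
    means = {k: (mean(p[0] for p in g), mean(p[1] for p in g)) for k, g in groups.items()}

    # Pooled within-class covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sxy = syy = 0.0
    for k, g in groups.items():
        mx, my = means[k]
        for x, y in g:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(points) - 2
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof

    # w = S^-1 (mean_1 - mean_0), using the closed-form inverse of a 2x2 matrix.
    det = sxx * syy - sxy * sxy
    dx = means[1][0] - means[0][0]
    dy = means[1][1] - means[0][1]
    w = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)

    # Threshold at the projected midpoint of the class means (equal priors assumed).
    c = w[0] * (means[0][0] + means[1][0]) / 2 + w[1] * (means[0][1] + means[1][1]) / 2
    return w, c


def predict(w, c, point):
    """Assign class 1 if the point lies on the class-1 side of the linear boundary."""
    return 1 if w[0] * point[0] + w[1] * point[1] > c else 0


# Synthetic stand-in data: two Gaussian clouds, 100 points each (an assumption).
random.seed(0)
class0 = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]
class1 = [(random.gauss(3, 1), random.gauss(3, 1)) for _ in range(100)]
points = class0 + class1
labels = [0] * 100 + [1] * 100

w, c = fit_lda_2d(points, labels)
predictions = [predict(w, c, p) for p in points]
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2%}")
```

In this sketch the decision boundary is the line w.x = c. The same rule extends to more than two features, but it then needs a general matrix inverse instead of the 2x2 formula used here.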
Results
Gender Recognition Rate
Method           Overall Accuracy (%)
LDA 5-Fold CV    80.24 ± 1.82
LDA LOPO CV      71.84 ± 23.74
MDA 5-Fold CV    78.14 ± 1.42
MDA LOPO CV      68.63 ± 22.05
Applications: marketing, video surveillance, photograph management
Confusion Matrix for LDA 5-Fold CV (Gender)
                    Actual Female    Actual Male
Predicted Female    465              90
Predicted Male      108              339
The confusion matrix shows that:
• 108 / 573 (18.85%) of female images were predicted to
be male
• 90 / 429 (20.98%) of male images were predicted to
be female
Data
The images used for the study were drawn from the FG-NET database, a publicly available
longitudinal collection of facial images. The database comprises 1002 face images, both
color and grayscale, from 82 subjects. Subject ages at the time the images were taken
range from 0 to 69 years.
Gender Distribution of 82 Facial Image Subjects
Gender    Subjects    Percent
Female    35          42.7
Male      47          57.3
Total     82          100
5-Fold Cross-Validation: Partitions the data set into 5 groups. Each group serves once as the
testing set while the remaining four groups form the training set, and the final prediction error
is the average over the 5 folds. For the current data set, the 1002 images are divided into four
groups of 200 and one group of 202.
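The partitioning can be sketched as follows, together with the leave-one-person-out (LOPO) scheme reported in the results tables. This is an illustration, not the study's code. The helper names are hypothetical and the folds are taken as contiguous index ranges with no shuffling, which is an assumption. The subject labels are placeholders assigned round-robin, because the poster does not list how many images belong to each subject.

```python
def five_fold_indices(n_items, n_folds=5):
    """Split range(n_items) into contiguous folds; the last fold takes the remainder."""
    base = n_items // n_folds
    folds = [list(range(i * base, (i + 1) * base)) for i in range(n_folds - 1)]
    folds.append(list(range((n_folds - 1) * base, n_items)))
    return folds


def lopo_indices(subject_ids):
    """Group image indices by subject, giving one fold per person."""
    by_subject = {}
    for index, subject in enumerate(subject_ids):
        by_subject.setdefault(subject, []).append(index)
    return list(by_subject.values())


def train_test_splits(folds):
    """Yield (train, test) index lists, using each fold exactly once as the test set."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


folds = five_fold_indices(1002)
fold_sizes = [len(fold) for fold in folds]
print(fold_sizes)  # [200, 200, 200, 200, 202]

# Placeholder subject labels (assumption): 82 subjects assigned round-robin.
subject_ids = [i % 82 for i in range(1002)]
lopo_folds = lopo_indices(subject_ids)
print(len(lopo_folds))  # 82

for train, test in train_test_splits(folds):
    # A classifier would be fitted on `train` and scored on `test` here;
    # the per-fold scores are then averaged.
    assert len(train) + len(test) == 1002
```

With real data, the images would usually be shuffled before the 5-fold split. LOPO folds differ in size because subjects contribute different numbers of images.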
Age Recognition Rate
Method           Overall Accuracy (%)
LDA 5-Fold CV    84.63 ± 3.21
LDA LOPO CV      82.78 ± 14.98
MDA 5-Fold CV    83.93 ± 3.05
MDA LOPO CV      80.65 ± 16.13
[Figure: Age Distribution of 1002 Facial Images]
Leave-One-Person-Out (LOPO) Cross-Validation: Partitions the data set into 82 groups by subject.
All of the images for one subject are used as the testing set, while the images for the remaining
subjects are the training set.
Each facial image in the FG-NET data set was computationally annotated with nodes (also known
as landmarks). Information from these nodes provided an annotated face, which was then
encoded and transformed via Active Appearance Modeling (AAM) into a textural representation of
the facial image. The difference between the current image (extraction) and the target image was
calculated and is the parameter of interest for the current study. The current study has 109
parameters.
[Figure: Sample Images]
Confusion Matrix for LDA 5-Fold CV (Age)
                  Actual > 20    Actual < 20
Predicted > 20    206            86
Predicted < 20    68             642
The confusion matrix shows that:
• 86 / 728 (11.81%) of images under 20 were
predicted to be over 20
• 68 / 274 (24.82%) of images over 20 were
predicted to be under 20
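The rates above follow directly from the confusion matrix counts, as the short sketch below illustrates. The dictionary layout, the labels `over20` and `under20`, and the function name `summarize` are choices made for this example only.

```python
# Image counts from the age confusion matrix (LDA, 5-fold CV).
# Keys are (predicted group, actual group).
counts = {
    ("over20", "over20"): 206,
    ("over20", "under20"): 86,
    ("under20", "over20"): 68,
    ("under20", "under20"): 642,
}


def summarize(counts):
    """Return overall accuracy and, for each actual group, its misclassification rate."""
    classes = sorted({actual for _, actual in counts})
    total = sum(counts.values())
    correct = sum(counts[(c, c)] for c in classes)
    error_rates = {}
    for actual in classes:
        # Column total: all images whose true group is `actual`.
        n_actual = sum(n for (_, act), n in counts.items() if act == actual)
        error_rates[actual] = (n_actual - counts[(actual, actual)]) / n_actual
    return correct / total, error_rates


accuracy, error_rates = summarize(counts)
print(f"overall accuracy: {accuracy:.2%}")
for actual, rate in error_rates.items():
    print(f"actual {actual}: {rate:.2%} misclassified")
```

The overall accuracy computed this way, 848 / 1002, agrees with the 84.63% reported for LDA 5-Fold CV in the Age Recognition Rate table.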
Conclusion
• The most accurate algorithm for gender classification was LDA with 5-Fold
Cross-Validation, with an overall accuracy of 80.24% ± 1.82%.
• The most accurate algorithm for age classification was LDA with 5-Fold
Cross-Validation, with an overall accuracy of 84.63% ± 3.21%.
• Future work can utilize computer algorithms to predict both the age and gender of facial
images. Additionally, future studies can refine the current study and predict more
precise ages, rather than large age groups. Last, a direction for future research is the
inclusion of facial images taken from a variety of angles.