
Introduction to Biometrics
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
Lecture #11
Biometric Technologies: Iris Scan
September 28, 2005
Outline
 Introduction
 Components
 Iris Scan Process
 Template generation and matching
 Market and Deployment
 Strengths and Weaknesses
 Research Directions
 Conclusions
 Appendix: Updated information on project
References
 Course Text Book, Chapter 6
 http://www.biometricsinfo.org//irisrecognition.htm
Introduction
 Iris scan biometrics employs the unique characteristics and
features of the human iris in order to verify the identity of an
individual.
 The iris is the area of the eye where the pigmented or colored
circle, usually brown or blue, rings the dark pupil of the eye.
 The iris-scan process begins with a photograph. A
specialized camera, typically no more than three feet from
the subject, uses an infrared imager to illuminate the
eye and capture a very high-resolution photograph.
 This process takes only one to two seconds and provides the
details of the iris that are mapped, recorded and stored for
future matching/verification.
Introduction (Continued)
 The inner edge of the iris is located by an iris-scan algorithm
which maps the iris's distinct patterns and characteristics.
 An algorithm is a series of directives that tell a biometric
system how to interpret a specific problem.
 Algorithms have a number of steps and are used by the
biometric system to determine whether a biometric sample
and a stored record match.
 Irises are formed before birth and, except in the event of an
injury to the eyeball, remain unchanged throughout an
individual's lifetime.
Introduction (Continued)
 Iris patterns are extremely complex, carry a large amount of
information and have over 200 unique spots.
 The fact that an individual's right and left eyes are different
and that patterns are easy to capture establishes iris-scan
technology as one of the biometrics most resistant to
false matching and fraud.
 The false acceptance rate for iris recognition systems is 1 in
1.2 million, statistically better than the average fingerprint
recognition system.
Introduction (Continued)
 Iris-scan technology has been piloted in ATM environments in
England, the US, Japan and Germany since as early as 1997.
 In these pilots the customer's iris data became the verification
tool for access to the bank account, thereby eliminating the
need for the customer to enter a PIN or password.
 When the customer presented their eye to the ATM
and the identity verification was positive, access was
allowed to the bank account.
 These applications were very successful: they eliminated the
concern over forgotten or stolen passwords and received
tremendously high customer approval ratings.
Introduction (Concluded)
 Airports have begun to use iris-scanning for such diverse
functions as employee identification/verification for
movement through secure areas.
 Iris-scanning also gives registered frequent airline passengers
fast and easy identity verification that expedites
their path through passport control.
 Other applications include monitoring prison transfers and
releases, as well as projects designed to authenticate on-line
purchasing, on-line banking, on-line voting and on-line stock
trading.
Components of the Iris Scan System
 Front-end acquisition hardware with central processing
software
 Software components: Image processing and matching
engines, Proprietary database
 Web-enabled iris scan applications are being developed
 Integration with middleware systems
Process
 Iris-Scan: How it Works: Dr. John Daugman's work in iris
recognition forms the basis of this information. Information
and images found on his website,
http://www.cl.cam.ac.uk/users/jgd1000, are presented below.
 Iris recognition leverages the unique features of the human
iris to perform identification and, in certain cases,
verification.
Process: The Iris
 Iris recognition is based on visible (via regular and/or infrared
light) qualities of the iris.
 A primary visible characteristic is the trabecular meshwork
(permanently formed by the 8th month of gestation), a tissue
which gives the appearance of dividing the iris in a radial
fashion.
 Other visible characteristics include rings, furrows, freckles,
and the corona.
Process: IrisCode (TradeMark)
 Iris recognition technology converts these visible characteristics
into a 512-byte IrisCode(tm), a phase-sequence template stored for
future identification attempts. From the iris's 11 mm diameter, Dr.
Daugman's algorithms provide 3.4 bits of data per square mm.
 This density of information is such that each iris has 266 'degrees
of freedom', as opposed to 13-60 for traditional biometric
technologies.
 After allowing for the algorithm's correlative functions and for
characteristics inherent to most human eyes, Dr. Daugman
concludes that 173 "independent binary degrees-of-freedom" can be
extracted from his algorithm - an exceptionally large number for a
biometric.
 A key differentiator of iris-scan technology is the fact that 512-byte
templates are generated for every iris, which facilitates match speed
(capable of matching over 500,000 templates per second); a matching
sketch follows below.
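 The match-speed claim makes sense once one notes that Daugman-style
systems compare two IrisCodes with bitwise exclusive-OR, i.e. a
fractional Hamming distance over the 512-byte templates. The C sketch
below is illustrative only: it assumes raw, unmasked templates (real
systems also AND in occlusion masks for eyelids and eyelashes), and the
0.32 decision threshold is a commonly cited figure rather than one
taken from this lecture.

  #include <stdint.h>
  #include <stdio.h>

  #define CODE_BYTES 512   /* one IrisCode template, per the slide */

  /* Fractional Hamming distance between two raw templates:
     the fraction of the 4096 bits that disagree. */
  double iris_distance(const uint8_t a[CODE_BYTES],
                       const uint8_t b[CODE_BYTES])
  {
      unsigned bits = 0;
      for (int i = 0; i < CODE_BYTES; i++)
          bits += __builtin_popcount((unsigned)(a[i] ^ b[i]));  /* GCC/Clang builtin */
      return (double)bits / (CODE_BYTES * 8);
  }

  int main(void)
  {
      uint8_t enrolled[CODE_BYTES] = {0}, probe[CODE_BYTES] = {0};
      probe[0] = 0x0F;   /* flip a few bits to simulate sensor noise */
      double d = iris_distance(enrolled, probe);
      /* 0.32 is a commonly cited decision threshold, assumed here */
      printf("distance %.4f -> %s\n", d, d < 0.32 ? "match" : "non-match");
      return 0;
  }

Because each comparison is a fixed, branch-light XOR-and-count over 512
bytes, scanning hundreds of thousands of templates per second on
commodity hardware is plausible, which is the point the slide makes.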
Process: Iris Acquisition
 The first step is location of the iris by a dedicated camera no
more than 3 feet from the eye.
 After the camera situates the eye, the algorithm narrows in
from the right and left of the iris to locate its outer edge.
 This horizontal approach accounts for obstruction caused by
the eyelids. It simultaneously locates the inner edge of the iris
(at the pupil), excluding the lower 90° because of inherent
moisture and lighting issues.
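 The slide does not spell out the boundary-finding math, but Daugman's
published segmentation step uses an integro-differential operator: it
searches over candidate circle centers (x0, y0) and radii r for the
circle that maximizes the blurred radial derivative of the image's
contour integral, skipping occluded arcs such as the lower 90 degrees
noted above. In LaTeX notation (from Daugman's papers, not this lecture):

  \max_{(r, x_0, y_0)} \left| G_\sigma(r) * \frac{\partial}{\partial r}
  \oint_{r, x_0, y_0} \frac{I(x, y)}{2\pi r} \, ds \right|

where I(x, y) is the eye image and G_\sigma is a Gaussian smoothing
kernel.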
Process: Iris Scan issues
 Iris-scan technology requires reasonably controlled and
cooperative user interaction - the enrollee must hold still in a
certain spot, even if only momentarily.
 In applications whose user interaction is frequent (e.g.
employee physical access), the technology grows easier to
use.
 Applications in which user interaction is infrequent (e.g.
national ID) may encounter ease-of-use issues. Over time,
with improved acquisition devices, this issue should grow
less problematic.
Process: Iris Scan issues (Concluded)
 The accuracy claims associated with iris-scan technology
may overstate the real-world efficacy of the technology.
 Because the claimed equal error rates are derived from
assessment and matching of ideal iris images (unlike those
acquired in the field), actual results may not live up to the
projections provided by leading suppliers of the technology.
 Since iris technology is designed to be an identification
technology, fallback procedures may not be as fully
developed as in a verification deployment (users accustomed
to identification may not carry necessary ID, for example).
Image Acquisition
 Kiosk-based systems
- User stands 2-3 feet from camera positioned at the height
of the user’s eye
 Physical access devices
- Small camera mounted behind a mirror acquires the
image. User locates his eye on the mirror
 Desktop cameras
- 18 inches from device; PC/Workstation-based
Image Processing
 The process of mapping the iris is the same for any acquisition device
 After the camera locates the eye, an algorithm narrows in from
the right and left of the eye to find the iris's outer edge
 The iris-scan algorithm locates the inner edge of the iris at the
pupil
- Challenging for very dark eyes
 Once the parameters of the iris have been defined, a black and
white image of the iris is used for feature extraction (a
normalization sketch follows below)
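 Before features are extracted, Daugman-style systems typically unwrap
the annular iris region into a fixed-size rectangle (the "rubber-sheet"
model), so that pupil dilation and camera distance no longer matter.
The sketch below is an assumption-laden illustration, not this
lecture's system: it assumes concentric circular pupil/limbus
boundaries already found by segmentation, and a simple grayscale image
stored row-major.

  #include <math.h>

  /* Hypothetical image type: 8-bit grayscale, row-major. */
  typedef struct { int w, h; const unsigned char *px; } Gray;

  #define RADIAL  64
  #define ANGULAR 256

  /* Unwrap the iris annulus into a RADIAL x ANGULAR rectangle.
     Assumes the pupil (radius r_pupil) and iris (radius r_iris)
     circles share a center (cx, cy); real irises are not quite
     concentric, so production code interpolates two boundaries. */
  void unwrap_iris(const Gray *img, double cx, double cy,
                   double r_pupil, double r_iris,
                   unsigned char out[RADIAL][ANGULAR])
  {
      for (int t = 0; t < ANGULAR; t++) {
          double theta = 2.0 * M_PI * t / ANGULAR;
          for (int r = 0; r < RADIAL; r++) {
              /* rho sweeps from the pupil edge out to the limbus */
              double rho = r_pupil + (r_iris - r_pupil) * r / (RADIAL - 1);
              int x = (int)(cx + rho * cos(theta));
              int y = (int)(cy + rho * sin(theta));
              if (x >= 0 && x < img->w && y >= 0 && y < img->h)
                  out[r][t] = img->px[y * img->w + x];
              else
                  out[r][t] = 0;   /* outside the frame: mark invalid */
          }
      }
  }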
Distinctive Characteristics
 Primary visible characteristic is Trabecular Meshwork, a
tissue that gives the appearance of dividing the iris in a
radial fashion
 Others include: Rings, Furrows, Freckles, and Corona
 The algorithm maps segments of the iris into hundreds of independent
vectors
 Characteristics derived from the iris features are the
orientation and spatial frequency of distinctive areas,
along with the position of those areas
 Not all of the iris is used
Template Creation/Generation
 Vectors located by the iris scan algorithm are used to
form enrollment and match templates
 Templates are generated in hexadecimal format (see the
encoding sketch below)
 Between one and four iris images are needed for
enrollment and template generation
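 Concretely, Daugman's published IrisCode encodes each local region by
quantizing the phase of a complex (Gabor) filter response into two
bits: one for the sign of the real part, one for the imaginary part.
Packed bytes are then conventionally printed as hexadecimal. The sketch
below assumes filter responses are already computed and only
illustrates that quantization step.

  #include <stdio.h>

  /* Hypothetical complex filter response for one iris region. */
  typedef struct { double re, im; } Cx;

  /* Pack 2-bit phase-quadrant codes (Daugman-style) into bytes:
     bit 1 = sign of the real part, bit 0 = sign of the imaginary
     part, four regions per output byte. */
  void encode_template(const Cx *resp, int n_regions, unsigned char *out)
  {
      for (int i = 0; i < n_regions; i++) {
          unsigned code = ((resp[i].re > 0) << 1) | (resp[i].im > 0);
          if (i % 4 == 0) out[i / 4] = 0;
          out[i / 4] |= code << (2 * (i % 4));
      }
  }

  int main(void)
  {
      Cx resp[8] = {{1,-2},{-3,4},{5,6},{-7,-8},{1,1},{-1,1},{1,-1},{-1,-1}};
      unsigned char tmpl[2];
      encode_template(resp, 8, tmpl);
      /* print the template in hexadecimal, as the slide describes */
      printf("%02X%02X\n", tmpl[0], tmpl[1]);
      return 0;
  }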
Template Matching
 Identification is performed more often than verification
- The template is matched against the ones in the database
to identify the person (see the 1:N search sketch below)
 Verification is performed infrequently
- Matches a person's iris template against the one stored
for him/her in the database
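 In a 1:N identification, the probe template is simply scanned against
every enrolled template, keeping the best (smallest) distance below the
decision threshold. A minimal sketch, reusing the hypothetical
iris_distance() and CODE_BYTES from the earlier matching example:

  /* 1:N identification sketch: scan the whole gallery and return
     the index of the best match under the threshold, or -1.
     Assumes iris_distance() and CODE_BYTES from the earlier sketch. */
  int identify(const uint8_t probe[CODE_BYTES],
               const uint8_t (*gallery)[CODE_BYTES],
               int n_enrolled, double threshold)
  {
      int best = -1;
      double best_d = threshold;
      for (int i = 0; i < n_enrolled; i++) {
          double d = iris_distance(gallery[i], probe);
          if (d < best_d) { best_d = d; best = i; }
      }
      return best;   /* -1 means "not enrolled" */
  }

Verification (1:1) is the degenerate case: a single call to
iris_distance() against the claimed identity's stored template.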
Application Market and Deployment
 Iris-scan technology has traditionally been deployed in high-
security employee-facing physical access implementations
 Iridian - the technology’s primary developer - is dedicated to
moving the technology to the desktop, and has had some
success in small-scale logical access deployments.
 The most prominent recent deployments of iris-scan
technology have been passenger authentication programs at
airports in the U.S., the U.K., the Netherlands (Amsterdam), and Iceland
 The technology is also used in corrections applications in the
U.S. to identify inmates.
 A number of developing countries are considering iris-scan
technology for national ID
 It is believed that the largest deployed Iridian database spans
under 100,000 enrollees.
Application Market and Deployment (Concluded)
 Iris-scan is set to grow substantially through 2007 and
beyond.
 Iris-scan offers low false match rates and hands-free
operation, and is the only viable alternative to fingerprint
technologies in 1:N applications where a single record must
be located.
 Iris-scan revenues are projected to grow from $16.2m in 2002
to $210.2m in 2007.
 Iris-scan revenues are expected to comprise approximately
5% of the entire biometric market.
Strengths of Iris Scan
 Resistance to False Matching
- 1 in 1,200,000 approx.
 Stability of Characteristic over lifetime
- Characteristics formed pre-birth; changed only by injury
 Can be used for both logical and physical access
Weaknesses of Iris Scan
 Difficult to use
- Acquisition systems are not straightforward
- User must be positioned correctly
 False nonmatching and failure to enroll
- False nonmatch rates need to improve
- Performance has been proven mainly on smaller databases
- Images can be difficult to capture
 User discomfort
- Users may be reluctant to have their irises captured
 Acquisition devices are mostly proprietary
Research Directions
 Improve False Nonmatch rates
 Better performance for larger databases
 Capture images when user is wearing glasses
 Techniques for very dark eyes
Technology Comparison

Method               Coded Pattern                                      Misidentification rate
Iris Recognition     Iris pattern                                       1/1,200,000
Fingerprinting       Fingerprints                                       1/1,000
Hand Shape           Size, length and thickness of hands                1/700
Facial Recognition   Outline, shape and distribution of eyes and nose   1/100
Signature            Shape of letters, writing order, pen pressure      1/100
Voiceprinting        Voice characteristics                              1/30
Summary
 Low false match rate: 1 in 1,200,000 approx.
 Usable for highly secure applications
 Need better acquisition techniques
 Need better performance
Introduction to Biometrics
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
Information Relevant to the Project – Version 2
September 28, 2005
Outline of the Unit
 Project Information
 Some Details of the Project
Project Information
 In this project, you are asked to do the following two tasks:
 Recognize one's face.
 Recognize one's face with different poses (i.e. straight,
left, right and up).
Project Information (Continued)
 PART1
 For this experiment you will use the neural network package
given in the code subdirectory at
- http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
 For training and testing, you will use the face images that are
listed in the trainset subdirectory at
- http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
- The face images listed in the trainset subdirectory are given
in the faces subdirectory at the same location.
Project Information (Continued)
 PART2
 (OPTIONAL) You will also use k-nearest neighbor classification
on the same dataset and compare its performance with the
neural network.
Project Information (Continued)
 RECOMMENDED
 You don't need to do a significant amount of coding for part 1
of the project. You only need to make small changes in the
files imagenet.c and facetrain.c given in the code
subdirectory at
- http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
 Training your datasets in part 1 will take time. It is
recommended that you read the materials at the above location
thoroughly (particularly section 6, "Documentation", in the
assignment document) and start early.
Project Information (Concluded)
 For part 2 (OPTIONAL), you will need to do some extensive
coding: you will use k-nearest neighbor instead of
backpropagation in the given code.
Some Details of the Project
 FACE IMAGES
 The image data can be found in the faces subdirectory at
- http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
 This subdirectory contains 20 subdirectories, one for each
person who volunteered for the photo shoot, named by
userid. Each of these subdirectories contains several
versions of the face images.
 For images the following naming convention is used:
<userid>_<pose>_<expression>_<eyes>_<scale>.pgm
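 The fields in the naming convention are easy to pull apart
programmatically. A tiny hedged sketch: the field names come from the
convention above, but the parsing helper and the example filename are
our own illustration, not part of the handout code (full-scale images
in the dataset may omit the scale suffix, which this sketch treats as
malformed).

  #include <stdio.h>
  #include <string.h>

  /* Split "<userid>_<pose>_<expression>_<eyes>_<scale>.pgm" into
     its five fields. Returns 1 on success, 0 on a name without
     all five fields. Illustrative only - not handout code. */
  int parse_face_name(const char *name, char userid[32], char pose[16],
                      char expr[16], char eyes[16], int *scale)
  {
      return sscanf(name, "%31[^_]_%15[^_]_%15[^_]_%15[^_]_%d.pgm",
                    userid, pose, expr, eyes, scale) == 5;
  }

  int main(void)
  {
      char u[32], p[16], e[16], y[16];
      int s;
      /* hypothetical example filename following the convention */
      if (parse_face_name("an2i_left_happy_open_4.pgm", u, p, e, y, &s))
          printf("userid=%s pose=%s expression=%s eyes=%s scale=%d\n",
                 u, p, e, y, s);
      return 0;
  }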
Some Details of the Project (Continued)
 For further details see the assignment document, section 2, at
the following link.
- http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
Some Details of the Project (Continued)
 HOW TO VIEW THE FACE IMAGES
 You will need View.java and View.class to view the images.
These two files will be placed on my web site by tomorrow.
 View.java handles a variety of image formats, including the
PGM format in which the face images are stored.
 To start View, just type on the command line:
java View ImageInput
Here ImageInput corresponds to the image you want to
view
Some Details of the Project (Continued)
 THE NEURAL NETWORK AND IMAGE ACCESS CODE
 C code is provided for a three-layer fully connected feedforward
neural network which uses the backpropagation algorithm to tune
its weights. You also get the top-level program (facetrain.c)
for training and recognition, as a skeleton for you to modify.
 The code is located in the code directory at the following
location
- http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
 Copy all the files in this area to your UTD home directory, and
type make.
Some Details of the Project (Continued)
 When the compilation is done, you should have one
executable program: facetrain.
 facetrain takes lists of image files as input and uses these as
training and test sets for a neural network.
 facetrain can be used for training and/or recognition, and it
also has the capability to save networks to a file.
 facetrain outputs a number of performance measures to the
output file at the end of each epoch, in the following format:
<epoch> <delta> <trainperf> <trainerr> <t1perf> <t1err> <t2perf> <t2err>
Some Details of the Project (Continued)
 For further details see the assignment document, sections 4
and 6.2, at the following link
- http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
Some Details of the Project (Continued)
 ASSIGNMENT
 Part1
1. Copy straight_train.list, straight_test1.list, straight_test2.list,
all_train.list, all_test1.list, and all_test2.list to your home
directory to obtain the training and test set data for this
assignment, from the trainset link at
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
2. Train with the default learning parameter settings (learning
rate 0.3, momentum 0.3) for 75 epochs, with the following
command:
facetrain -n you.net -t straight_train.list -1 straight_test1.list -2 straight_test2.list -e 75
Some Details of the Project (Continued)
 you.net is the name of the network file which will be saved
when training is finished.
 straight_train.list, straight_test1.list, and straight_test2.list
are text files which specify the training set (70 examples) and
two test sets (32 and 50 examples), respectively.
 This command creates and trains your net on a randomly
chosen sample of 70 of the 152 "straight" images, and tests it
on the remaining 32 and 50 randomly chosen images,
respectively.
 Report your train/test performance and error as a function of
epochs. If you had stopped training when the performance on
test1 leveled off, what would the performance have been on
test2? And vice versa?
Some Details of the Project (Continued)
3. Implement a face recognizer; i.e. implement a neural net
which, when given an image as input, indicates who is in the
image.
 To do this, you will need to implement a different output
encoding (since you must now be able to distinguish among
20 people). Describe your output encoding. (Hint: leave the
learning rate and momentum at 0.3, and use 20 hidden
units.)
 Train the network for 100 epochs:
facetrain -n face.net -t straight_train.list -1 straight_test1.list -2 straight_test2.list -e 100
 Report your train/test performance and error as a function of
epochs. If you had stopped training when the performance
on test1 leveled off, what would the performance have been
on test2? And vice versa?
Some Details of the Project (Continued)
4. Implement a pose recognizer; i.e. implement a neural net
which, when given an image as input, indicates whether the
person in the image is looking straight ahead, up, to the left, or
to the right.
 You will also need to implement a different output encoding
for this task. Describe your output encoding. (Hint: leave the
learning rate and momentum at 0.3, and use 6 hidden units.)
 Train the network for 100 epochs:
facetrain -n pose.net -t all_train.list -1 all_test1.list -2 all_test2.list -e 100
 Report your train/test performance and error as a function of
epochs. If you had stopped training when the performance
on test1 leveled off, what would the performance have been
on test2? And vice versa?
Some Details of the Project (Continued)
5. What changes you should make in the code:
 You will need to modify the routine load_target in imagenet.c
to set up appropriate target vectors for the output
encodings you choose, when implementing the face
recognizer and the pose recognizer (a hedged sketch follows
below).
 You will need to modify facetrain.c to change network sizes
and learning parameters, both of which are trivial changes.
 You will need to modify the two performance evaluation
routines performance_on_imagelist() and
evaluate_performance() in facetrain.c, when implementing
the face recognizer and the pose recognizer.
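 As an illustration of the load_target change, here is a hedged
sketch of a 1-of-20 output encoding for the face recognizer. The
0.9/0.1 target values follow a common backpropagation convention, and
the 1-indexed target array mirrors typical handout code, but the exact
array and structure names in imagenet.c may differ; treat this as a
sketch, not a drop-in patch.

  #include <string.h>

  /* Hypothetical userid table; fill in the 20 userids from the
     faces subdirectory. The first two entries are examples. */
  static const char *userids[20] = {
      "an2i", "at33", /* ... remaining 18 ids ... */
  };

  /* Set a 1-of-20 target vector: 0.9 for the matching person,
     0.1 for everyone else. Values away from 1/0 keep the sigmoid
     outputs out of their saturated regions during training.
     Assumes a 1-indexed target array, as in typical handout code. */
  void set_face_target(double *target, const char *userid)
  {
      for (int i = 0; i < 20; i++)
          target[i + 1] = (userids[i] &&
                           strcmp(userids[i], userid) == 0) ? 0.9 : 0.1;
  }

The pose recognizer follows the same pattern with a 1-of-4 encoding
over straight/left/right/up and 4 output units instead of 20.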
Some Details of the Project (Concluded)
6. For further assistance see the assignment document, sections
5-3, 5-6, 5-8 and 6, at the following link
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
 Part2
 (OPTIONAL) Use k-nearest neighbor to do all the tasks
above. Here you will determine the best k value. A hedged
classifier sketch follows below.
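 For orientation, a minimal k-nearest-neighbor classifier over raw
pixel vectors might look like the sketch below. It assumes images are
already loaded as double arrays with integer class labels; Euclidean
distance and majority voting are standard choices, and you would wire
this into the handout's image-loading code yourself.

  #include <float.h>
  #include <string.h>

  /* k-NN sketch: classify `probe` by majority vote among the k
     training vectors closest in (squared) Euclidean distance.
     Assumes n_train examples of dimension dim with labels in
     [0, n_classes), k <= 32, and n_classes <= 64. */
  int knn_classify(const double *train, const int *labels, int n_train,
                   int dim, int n_classes, const double *probe, int k)
  {
      double best_d[32];
      int best_l[32];
      for (int j = 0; j < k; j++) { best_d[j] = DBL_MAX; best_l[j] = -1; }

      for (int i = 0; i < n_train; i++) {
          double d = 0.0;
          for (int f = 0; f < dim; f++) {
              double diff = train[i * dim + f] - probe[f];
              d += diff * diff;
          }
          /* insert into the sorted list of the k nearest so far */
          for (int j = 0; j < k; j++) {
              if (d < best_d[j]) {
                  memmove(&best_d[j + 1], &best_d[j], (k - j - 1) * sizeof(double));
                  memmove(&best_l[j + 1], &best_l[j], (k - j - 1) * sizeof(int));
                  best_d[j] = d;
                  best_l[j] = labels[i];
                  break;
              }
          }
      }

      /* majority vote among the k nearest neighbors */
      int votes[64] = {0}, best_class = 0;
      for (int j = 0; j < k; j++)
          if (best_l[j] >= 0) votes[best_l[j]]++;
      for (int c = 1; c < n_classes; c++)
          if (votes[c] > votes[best_class]) best_class = c;
      return best_class;
  }

Sweeping k over small odd values (1, 3, 5, ...) and comparing accuracy
on test1/test2 is one simple way to determine the best k value, as the
task asks.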