Development of Counting Coins Program in
Computer Vision Approach using Matlab
Lu Zhang
Computer Information Systems
2006
Development of Counting Coins Program in Computer Vision
Approach using Matlab
Submitted by
Lu Zhang
COPYRIGHT
Attention is drawn to the fact that copyright of this dissertation rests with its author.
The Intellectual Property Rights of the products produced as part of the project belong
to the University of Bath (see http://www.bath.ac.uk/ordinances/#intelprop).
This copy of the dissertation has been supplied on condition that anyone who consults
it is understood to recognize that its copyright rests with its author and that no
quotation from the dissertation and no information derived from it may be published
without the prior written consent of the author.
Declaration
This dissertation is submitted to the University of Bath in accordance with the
requirements of the degree of Bachelor of Science in the Department of Computer
Science. No portion of the work in this dissertation has been submitted in support of
an application for any other degree or qualification of this or any other university or
institution of learning. Except where specifically acknowledged, it is the work of the
author.
Signed
This dissertation may be made available for consultation within the University Library and may be
photocopied or lent to other libraries for the purposes of consultation.
Signed
Abstract
The main purpose of this project is to apply computer vision techniques to develop a program
that recognizes the coins in an image and enumerates their value; that is, to have a computer
read the image and calculate the total value of the coins it shows. Several techniques are
involved, such as image colour segmentation, edge detection, noise filtering, and the Hough
transform. The key to accomplishing this project is the colour segmentation of the coins
together with edge enhancement, which separates coins by their colour differences and keeps
the method efficient. Once the computer programs were established, an experiment applying
them to UK coins showed that the approach works well, with the error depending on the
quality of the coin images. A database containing a large number of images is required by
this method.
Acknowledgements
Firstly, I would like to thank my project supervisor Dr. Peter Hall for all his time, help and
direction throughout this project's development. I also want to thank Dr. Alwyn Barry for his
kindness and understanding of the difficulties I faced during the project.
Contents
Abstract........................................................................................................................................... ii
Acknowledgements........................................................................................................................ iii
Contents ..........................................................................................................................................iv
Chapter 1- Introduction.....................................................................................................................1
1.1) Purpose: ...........................................................................................................................1
1.2) Aims: ................................................................................................................................1
1.2.1) Initial Goals............................................................................................................1
1.2.2) Motivation..............................................................................................................1
Chapter 2- Literature survey .............................................................................................................3
2.1) General background Information.......................................................................................3
2.2) Similar projects..................................................................................................................3
2.2.1) CIS-Benchmark project ..........................................................................................4
2.2.2) Automatic coin counter ...........................................................................................4
2.3) Basic strategy of digital image recognizing.......................................................................5
2.3.1) Feature Vector Classification ..................................................................................6
2.3.2) Fitting Models to Photometry .................................................................................6
2.3.3) Fitting Models to Symbolic Structure.....................................................................7
2.3.4) Combined Strategies ...............................................................................................8
2.4) Specific Background on relevant techniques .....................................................................9
2.4.1) Colour image analysis.............................................................................................9
2.4.2) Hough Transform..................................................................................................11
2.5) Summary..........................................................................................................................13
Chapter 3- Plan & Design ................................................................................................................14
3.1) Introduction......................................................................................................................14
3.2) Program working conditions............................................................................................14
3.2.1) Hardware & Software requirements .....................................................................14
3.2.2) Physical setup of the project .................................................................................14
3.3) Design Stages...................................................................................................................15
3.3.1) Possible strategies for detecting coins ..................................................................15
3.3.2) Stages of development ..........................................................................................16
3.4) Overall Algorithms Definition ......................................................................................17
3.4.1) Colour segmentation .............................................................................................18
3.4.2) Noise reduction .....................................................................................................20
3.4.3) Edge enhancement ................................................................................................21
3.4.4) Edge detection.......................................................................................................23
3.4.5) Hough Transform for the circles ...........................................................................25
3.5) Summary..........................................................................................................................26
Chapter 4- Implementation & Validation ........................................................................................27
4.1) Introduction......................................................................................................................27
4.2) Implementation Stages.....................................................................................................27
4.2.1) Colour segmentation .............................................................................................27
4.2.2) Edge enhancement ................................................................................................29
4.2.3) Edge detection.......................................................................................................30
4.2.4) Hough Transform..................................................................................................30
4.2.5) Calculating the values of coins .............................................................................33
4.3) Validation .........................................................................................................................33
Chapter 5- Testing ...........................................................................................................................34
5.1) Introduction......................................................................................................................34
5.2) Test plan...........................................................................................................................34
5.2.1) Initial setup for the program .................................................................................34
5.2.2) Group the coins with different color .....................................................................35
5.2.3) Detect edges and count coins separately by groups ..............................................35
5.2.4) Calculate the total value........................................................................................38
5.3) Discussion of the test result .............................................................................................38
Chapter 6- Review ..........................................................................................................................39
6.1) What went well & Why ...................................................................................................39
6.2) What could be improved & how......................................................................................39
6.3) Conclusion of the project.................................................................................................39
Chapter 7- Bibliography..................................................................................................................41
Chapter 1- Introduction
1.1) Purpose:
The basis of this project is to apply computer vision techniques to develop a program which
should recognize coins in an image, and enumerate their value. That is to have a computer
“watch” the image and then tell the user the total value of the coins which are on the image.
1.2) Aims:
1.2.1) Initial Goals
The main goals of this project are:
• Recognize the coins on the image.
• Count the coins, and then get the total value.
The possible sub-goals for the project are:
• Group the coins according to their features (such as colour and shape).
• Distinguish the coins within a group according to size or shape.
• Count the coins within each group.
• Calculate the total value of the coins displayed on the image.
1.2.2) Motivation
Nowadays banks usually use bill counting machines to enumerate the money they receive.
Obviously this is more efficient than counting the cash by hand. However, when a customer
wants to pay a large amount of cash into the bank, they still have to wait for quite a while,
particularly when it includes a large quantity of coins. It is also easy for bank staff to make
mistakes when counting. Some coins from different foreign currencies look similar, so it is
sometimes difficult to distinguish them by eye, especially when there is a large quantity of
coins.
Moreover, because of globalization, banks often receive foreign currency that the staff
may not recognize. Charities face the same situation as the banks, because their donors
come from all over the world. So it is necessary to develop a system that can help them
recognize and calculate the money they receive.
In 2002, twelve countries in Europe changed their local currencies to the Euro. That meant
great volumes of money had to be physically returned to the national banks of the member
states, which therefore needed a device that could sort and recirculate these high volumes
of coins. So in 2003 a coin sorting device called Dagobert was developed by ARC
Seibersdorf research GmbH according to the requirements of the national banks. Coins from
more than 100 countries were sorted by Dagobert within two years. In total, more than 2000
different coin faces of over 600 different coin types had to be trained, and these comprise the
backbone of Dagobert's recognition unit. This is a really good starting point for
developing an automatic coin counting system (Nölle 2004, p.284).
Some examples of the application of this project are:
• The program can be a guide for blind people when they pay money into, or withdraw
cash from, the bank.
• The program can be used by a bank to count coins. When the staff put the coins
under a camera, the program can read the image captured from the camera, and then
calculate the coins' value.
• The program could be the basis for another system, such as one that converts the total
value of the coins in the image into another currency.
• The program could also be used by others who want to extend it into a new system that
can distinguish coins from different currencies.
Chapter 2- Literature survey
2.1) General background Information
Computer vision refers to the application of human vision techniques to a computer,
teaching the computer to see. The subject itself has been around since the 1960s, but it is
only recently that it has become possible to build useful computer systems using ideas from
computer vision. The subject is driven by three main areas: computational geometry,
artificial intelligence, and image processing (Forsyth 2003).
Computational geometry is now widely used in every corner of science and engineering,
from design and manufacturing to astrophysics, molecular biology and fluid dynamics;
many problems in these areas can be solved by building 3D computer models.
In the artificial intelligence field, computer vision techniques are used as a tool for both
the acquisition and the processing of visual information. For instance, some companies
and research groups have recently focused on face recognition, which is widely used in
virtual reality, national ID schemes, securities trading terminals, CCTV control, and so
on (Zhao 2003).
Image processing is the basis of the other two areas. It refers to processing digital images by
means of a digital computer. About 80% of the information we get from the outside world is
captured by vision, so it is not surprising that images play the single most important role in
human perception. However, humans are limited to the visual band of the electromagnetic
spectrum. We therefore need imaging machines that cover almost the entire electromagnetic
spectrum, from gamma rays to radio waves, and that can operate on images generated by
sources humans are not accustomed to associating with images, such as ultrasound and
electron microscopy. Thus, digital image processing is applied in a wide variety of
fields (Gonzalez 2002, pp.1-2). For example, in medical imaging it can be used to
enhance imagery, or to identify important phenomena or events.
2.2) Similar projects
This project is based on the image processing area mentioned above. Its main goal is to
recognize the coins in an image. There are several similar projects around; we will
illustrate two of them that relate to ours. One is the Coin Images Seibersdorf-Benchmark
(CIS-Benchmark) project that we mentioned briefly in section 1.2.2. The second is called
the automatic coin counter, and was designed by J. Provine, Mike McClintock, Kristen
Murray, and Angela Chau, students at Rice University in Houston.
2.2.1) CIS-Benchmark project
The CIS-Benchmark project introduced in section 1.2.2 was designed to sort and
recirculate high volumes of coins according to the individual requirements of the national
banks. In this project, the sorting criteria were thickness, diameter and the images of both
sides of the coins. For every coin type, at least two coin classes were defined, one
for the obverse/front side and one for the reverse side of the coin. For each coin class there
are up to 30 training images; uncommon coin faces might be represented by only a few, or
even one, training image. There is also a parameter file containing the coin class, the angle
needed to rotate the learning image into the position of the average image, and the minimum
and maximum thickness and diameter of the coin. The number of images used to calculate
the smallest and biggest values is also included in the parameter file. This system basically
uses a matching method: for each coin in the image, it searches for the matching image in
the database, and the coin can then be identified. This is a really large, complex system. The
database consists of roughly 2000 patterns (classes) of coins from many different countries,
and 100,000 coin images were collected during the automatic sorting process. However,
there is no further information about the running time of this system, nor are there details
of the algorithm used for each step (Nölle 2004).
Considering our coin counting program, the size of each coin that we need to detect is
unique, but the texture of the coins can be the same. For example, 5 pence and 10 pence
coins both produced in 2004 have the same texture on the head side: the same picture of
the Queen's head and the same text. So the diameter, or radius, can be one sorting
criterion in our program.
2.2.2) Automatic coin counter
This project was constructed by J. Provine, Mike McClintock, Kristen Murray, and Angela
Chau, who had to write a program using feature extraction techniques for a piece of coursework.
The physical setup of the problem is that coins run along a conveyor belt and are filmed by
a digital video camera, from which frames are taken and analyzed at intervals to count the
change on the belt. The background of the test images was black and a shade of bright
yellow. They assumed the coin counting machine had a mechanical sweeper arm which
ensured coins were only lying flat. Four U.S. coins, the quarter, dime, nickel and penny,
were to be detected. They used median filters to smooth out the small edges and then used
the Roberts gradient method to work out the edges of the coins. The radii of each particular
coin were then calculated. Finally, for each pixel, the program checked in eight directions
for edges. If it found more than a certain threshold number of edges at the same distance
from the center pixel, it would mark that pixel as a possible center and then check in eight
more directions for that radius. If, within the entire set of sixteen checks, the number of
hits exceeded a certain threshold, it declared that pixel the center of a certain type of coin.
This program worked extremely well under all the assumptions they had made (Provine 1999).
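The sixteen-direction center test just described can be sketched as follows. This is an illustrative reconstruction in Python (not the original group's code, which this document does not reproduce); the synthetic edge map, threshold, and function names are assumptions made for the sketch.

```python
import math

def is_circle_center(edge, x, y, r, threshold=12, n_dirs=16):
    """Check whether (x, y) is plausibly the center of a circle of radius r
    by probing n_dirs evenly spaced directions for edge pixels, in the
    spirit of the automatic coin counter's discrete center test."""
    h, w = len(edge), len(edge[0])
    hits = 0
    for k in range(n_dirs):
        a = 2 * math.pi * k / n_dirs
        px = x + int(round(r * math.cos(a)))
        py = y + int(round(r * math.sin(a)))
        if 0 <= px < w and 0 <= py < h and edge[py][px]:
            hits += 1
    return hits >= threshold

# Synthetic edge map: a discrete circle of radius 5 centered at (10, 10).
H = W = 21
edge = [[False] * W for _ in range(H)]
for deg in range(360):
    a = math.radians(deg)
    edge[10 + int(round(5 * math.sin(a)))][10 + int(round(5 * math.cos(a)))] = True

print(is_circle_center(edge, 10, 10, 5))   # the drawn circle's true center
print(is_circle_center(edge, 3, 3, 5))     # an arbitrary non-center pixel
```

As the group themselves noted, the discreteness of the probes introduces slight inaccuracies in center-finding, which is visible here in the need for rounding.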
This project provides a guideline for our coin counting program. Firstly, we should set some
conditions under which the program is expected to work well. If we do not, there are
hundreds of possible conditions for the coins in the image; for example, the apparent size of
the same coin will differ according to the position of the camera. Although we could use
scaling to solve this problem, it requires extra computation. Likewise, under different
lighting conditions the images of the same coin can vary. Given the limited time for our
project, we cannot run through all these conditions. Secondly, we could consider using the
edge detection method they used, because our program will face the same problems they
met: we could smooth the texture of the coins and then find the edges. Finally, the method
used to detect coins in the automatic coin counter could be considered, even though only
four coin types were used as targets there, and even though the group members of that
project admit that, since the methods were discrete, there were slight inaccuracies in
center-finding (Provine 1999). Generally speaking, this project can be taken as a good
starting point for ours. However, because the sizes of the 10 pence and 2 pence coins are
nearly the same, the radius cannot be the only feature; we have to add the colours of the
coins as an additional feature.
2.3) Basic strategy of digital image recognizing
We can take some ideas for this project from the discussion in the sections above. Now we
will discuss object recognition algorithms. The program will work more effectively and
efficiently if an appropriate computational algorithm is chosen, and there are many methods
available. In Suetens and Fua's paper (1992), these methods are classified into four
categories according to the nature of the computational strategy used: feature vector
classification, fitting models to photometry, fitting models to symbolic structures, and
combined strategies (see figure 2.1).
[Figure 2.1, omitted here, plots the strategies in a two-axis classification space: suitability
for complex image data versus suitability for complex models, locating example methods
such as the Hough transform, Snake, 3DPO, MDL, HYPER and ACRONYM.]
Figure 2.1. Classification space for object recognition strategies (Suetens 1992)
From figure 2.1, we can see that the two main characteristics used to classify the
computational strategies are the suitability of a strategy for complex image data and for
complex models. Data complexity corresponds to the signal to noise ratio in a digital
image: if the image is free of noise, meaning the target object's characteristics are
represented unambiguously and completely, we call the data simple; otherwise, the data is
complex. For the models, if the model is defined by a simple criterion, like a single shape
template or the optimization of a single function implicitly containing a shape model, we
call it a simple model; otherwise, the model is complex.
2.3.1) Feature Vector Classification
This strategy is typically applied only to simple data with a simple model. It has been
described extensively in the literature (Duda 1973) and shown to be useful in many
industrial applications.
In this approach, objects are modeled as vectors of characteristic features, each of which
corresponds to a point in a multidimensional feature space. The features could be gray
value, colour, infrared response, area, compactness, and so on. To apply this method, we
first select the relevant features, then determine a way to measure them, and finally define
a criterion to distinguish the desired object from others. There are two example approaches
within this strategy. One is pixel classification, the simplest and most straightforward
application, based on the idea that each pixel is a member of some model class; pixels can
be classified by their intensities or frequency spectrum. The other is classification of
labels. This method requires images with simple photometric statistics, because it uses
features that characterize regions, which are typically obtained by some photometry-based
method (Suetens 1992).
In the coin counting program, we could use this strategy as a starting point. Using pixel
classification, we can divide the coins into three groups by colour. The first group contains
the two pound and one pound coins; the second, the two pence and one penny coins; and
the last, the fifty pence, twenty pence, ten pence and five pence coins. However, the coins
within a group cannot be distinguished by this method; for instance, two pence coins have
the same colour as one penny coins. This means another algorithm is needed.
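The grouping step can be sketched as a nearest-prototype pixel classification. The sketch below is in Python for illustration (the project itself uses Matlab), and the three RGB prototype values are invented placeholders, not measurements of real UK coins.

```python
# Hypothetical colour prototypes for the three coin groups; the RGB values
# here are illustrative assumptions, not measured from real coins.
PROTOTYPES = {
    "gold":   (200, 170, 80),    # two pound / one pound group
    "copper": (180, 110, 70),    # two pence / one penny group
    "silver": (170, 170, 170),   # 50p / 20p / 10p / 5p group
}

def classify_pixel(rgb):
    """Assign a pixel's (r, g, b) feature vector to the nearest prototype
    in RGB feature space (squared Euclidean distance)."""
    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(rgb, p))
    return min(PROTOTYPES, key=lambda name: dist2(PROTOTYPES[name]))

print(classify_pixel((190, 115, 75)))   # near the copper prototype
```

Applying this to every pixel yields three binary masks, one per colour group, which later stages can process independently.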
2.3.2) Fitting Models to Photometry
This class of techniques is used when the models are sufficient and simple but the
photometric data are noisy. Comparing with the feather vector classification, these methods
use incorporating model knowledge in their procedures and replace local pixel classification
by more global considerations. We divide the basic strategies into two categories: Rigid
Model Fitting and Flexible Model Fitting(Suetens 1992).
• Rigid Model Fitting
The shape or photometry of the target object is known, so the model can be either rigid or
parametric. These methods are effective only when the model is rigid: small changes in
scale, orientation and shape can strongly disturb the match.
An example method is the Hough Transform. The standard Hough Transform (Rosenfeld
1969) detects curves whose shape can be described analytically, such as lines and circles.
The details of this algorithm will be discussed later. The algorithm could be useful in our
project: it can be used to detect the centers and radii of the coins, and since the radii of the
coins within a group differ, all the coins can then be distinguished.
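The circle-detecting variant of the Hough Transform can be sketched in a few lines. This is a minimal Python illustration of the voting idea for a known radius (the project's actual implementation is in Matlab, and the edge points and vote threshold below are synthetic assumptions).

```python
import math
from collections import Counter

def hough_circle_centers(edge_points, radius, min_votes):
    """Every edge point votes for all integer centers lying at the given
    radius from it; accumulator cells with enough votes are circle centers."""
    acc = Counter()
    for (x, y) in edge_points:
        for deg in range(360):
            a = math.radians(deg)
            cx = int(round(x - radius * math.cos(a)))
            cy = int(round(y - radius * math.sin(a)))
            acc[(cx, cy)] += 1
    return [c for c, votes in acc.items() if votes >= min_votes]

# Synthetic edge points on a circle of radius 20 centered at (50, 50).
pts = [(50 + int(round(20 * math.cos(math.radians(d)))),
        50 + int(round(20 * math.sin(math.radians(d)))))
       for d in range(0, 360, 5)]
centers = hough_circle_centers(pts, 20, min_votes=30)
print((50, 50) in centers)   # True: the true center accumulates many votes
```

In practice the same accumulator is built over a range of candidate radii, so both the center and the radius of each coin fall out of the peak search.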
• Flexible Model Fitting
Template matching is restricted to rigid or parametric object models, so flexible model
fitting techniques were developed. The strategies in this category are specified by a set of
generic constraints on object characteristics, such as smoothness, curvature, compactness,
symmetry, and homogeneity. We will not discuss this strategy in detail: since the models
in our project are rigid, it will not be used. Details are given in Suetens's article (1992).
2.3.3) Fitting Models to Symbolic Structure
This strategy assumes a set of features that has been reliably extracted from the image data
by some preprocessing operation. The features are usually found by a local statistics-based
operator, without using shape information or contextual scene knowledge; this process is
often referred to as segmentation. The features may also be raw pixel intensities, or even
labels produced by a method such as template matching (Suetens 1992).
Because it builds on the strategies described in sections 2.3.1 and 2.3.2, it is a hybrid
strategy. It is applied when complex models are required but reliable symbolic structures
can be accurately inferred from simple data. Two examples of this strategy are the
HYPER (Ayache 1986) and ACRONYM (Brooks 1981; 1983) systems. There are two
major categories within it: graph matching and composite (hierarchical) model fitting.
• Graph Matching
In graph matching approach, objects are modeled as a relational structure or graph of
primitives. The nodes of the graph are components of the object or scene, and labels are
usually assigned by searching for the optimal match between the model graph and the graph
derived from the image data. Heuristic search can be used to speed up the optimization
procedure.
There are three different approaches in this area: direct search, dynamic programming and
relaxation labeling. Direct search is the most straightforward approach to graph matching:
it searches for sets of labels and relations that match subparts of the graph. Dynamic
programming is a process that recursively searches for an optimal path in the graph (Bellman
1962); it may increase computational efficiency by using additional storage during the
search. Relaxation labeling provides a parallel searching technique which makes the
computation feasible. It is widely used in grouping similar pixels, feature vectors, or data
structures, and it has been extended from discrete labeling to probabilistic labeling, in
which each label extracted from the image is assigned a probability that is iteratively
increased or decreased based on its compatibility or incompatibility with related labels in
the structure. One example of this approach is MSYS, a discrete relaxation labeling
system discussed by Tenenbaum and Barrow (1977). The system partitions an image into
meaningful regions by merging small initial regions in accordance with their candidate
interpretations.
This method could be used in the coin counting program to classify the coins into groups
according to their colours. The colour feature can be extracted from the image by the
segmentation process; after segmentation, the coins are separated into three groups. We can
then apply the Hough Transform to each group to find the radii and centers, by which the
coins within a group can be classified, and finally compute the total value. This is the most
appropriate strategy discussed so far.
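The final step of this pipeline, turning detected coins into a total value, can be sketched as follows. The sketch is in Python for illustration, and the radii and the detection tuples are approximate, assumed values rather than figures from this project.

```python
# Illustrative final classification step: detected coins arrive as
# (colour_group, radius_mm) pairs; within each colour group the nearest
# known radius decides the denomination. Radii are approximate UK values.
DENOMINATIONS = {
    "copper": [(10.15, 1), (12.9, 2)],                       # 1p, 2p
    "silver": [(9.0, 5), (12.25, 10), (10.65, 20), (13.65, 50)],
    "gold":   [(11.125, 100), (14.325, 200)],                # 1 pound, 2 pounds
}

def coin_value(group, radius):
    """Pick the denomination whose known radius is nearest the measured one."""
    return min(DENOMINATIONS[group], key=lambda rv: abs(rv[0] - radius))[1]

def total_pence(detections):
    return sum(coin_value(g, r) for g, r in detections)

coins = [("copper", 12.8), ("silver", 12.3), ("gold", 14.2)]
print(total_pence(coins))   # 2 + 10 + 200 = 212 pence
```

Grouping by colour first is what makes the nearest-radius rule safe: 10p and 2p coins have nearly the same radius but never compete within the same group.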
• Composite (Hierarchical) Model Fitting
In this approach, a reduction of the search space is obtained by working hierarchically: a
hierarchy of intermediate models is used to find partial matches, and the models are then
refined. This method is used to limit the range of labeling possibilities, or to limit the
number of search steps. Given the nature of our project, where only a few coin types need
to be found, we do not consider this method.
2.3.4) Combined Strategies
The strategies in this approach are used when both the data and the desired model are
complex. They combine the strategies of sections 2.3.1, 2.3.2 and 2.3.3. Two examples of
these methods are 3DPO (Bolles 1986) and minimal description length (Fua 1991). There
are three categories within this technique: refining matches by resegmentation, refining
matches by template matching, and refining matches by flexible model matching (Suetens
1992). Because this project is not complex enough to need this strategy, we will not
discuss it in detail here.
In this section we discussed the strategies that can be used for object recognition. Many
algorithms were mentioned only briefly, but references for all of them are given. We have
found an appropriate strategy and algorithm for this project. In the next section, we
introduce some specific background on the relevant techniques.
2.4) Specific Background on relevant techniques
2.4.1) Colour image analysis
Colour is a property of enormous importance to human visual perception: every object we
see has its own colour, so colour is a powerful descriptor that simplifies object
identification and extraction from a scene. Moreover, humans can discern thousands of
colour shades and intensities, compared to only about two dozen shades of gray. These two
factors motivate the use of colour in image processing.
Colour image processing divides into two major areas: full-colour and pseudo-colour
processing. Full-colour images are typically acquired with a full-colour sensor, such as a
colour TV camera or a scanner. In pseudo-colour processing, a colour is assigned to a
particular monochrome intensity or range of intensities in the image. Before colour sensors
and hardware for processing colour images became available at reasonable prices, most
digital colour image processing was done at the pseudo-colour level. Now full-colour
techniques are used in a broad range of applications, including publishing,
visualization, and the Internet (Gonzalez 2002, p.282).
• Colour image representation in Matlab
The programming language used in this project is Matlab. The Image Processing Toolbox
of Matlab handles colour images either as indexed images or as RGB (red, green, blue)
images. As images captured by a camera are usually stored as true colour, in other words
RGB images, we do not consider indexed images in this project.
An RGB colour image in Matlab is expressed as an M × N × 3 array of colour pixels,
where each colour pixel is a triplet corresponding to the red, green, and blue components of
the RGB image at a specific spatial location (see figure 2.2).
Figure 2.2 How pixels of an RGB colour image are formed from the corresponding pixels of the three component images (Mathworks 1998).
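To make the M × N × 3 layout concrete, the following sketch mimics it outside Matlab using plain nested lists. This is a Python illustration only; the image values and the helper `component` are invented for this example.

```python
# A 2 x 2 RGB image stored as rows of (R, G, B) triplets in [0, 1],
# mimicking Matlab's M x N x 3 array (illustrative values only).
image = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],   # a red pixel, a green pixel
    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],   # a blue pixel, a white pixel
]

def component(image, channel):
    """Extract one component image (0 = R, 1 = G, 2 = B)."""
    return [[pixel[channel] for pixel in row] for row in image]

red_plane = component(image, 0)
# each colour pixel is the triplet formed by the three component images
# at the same spatial location, e.g. image[0][0] == (1.0, 0.0, 0.0)
```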
• Colour models
As discussed above, a colour image can be expressed by three component images: red,
green, and blue. Red, green, and blue are called primary colours, because approximately 65%
of the cones in our eyes are sensitive to red light, 33% are sensitive to green light, and
only about 2% are sensitive to blue light (Gonzalez 2002, p.284). Colours can be seen as
variable combinations of these three colours. There are also other systems for specifying
colours, such as the CIE chromaticity diagram and the HSI system; we do not discuss them in
detail.
In order to analyze colour easily, we construct colour models that facilitate the
specification of colours in some standard way. Basically, a colour model is a specification of
a coordinate system and a subspace within that system where each colour is represented by a
single point. There are numerous colour models in use today. In terms of digital image
processing, the most commonly used models in practice are the RGB model, the CMY and
CMYK models, and the HSI model. The RGB model is used for colour monitors and a broad
class of colour video cameras. The CMY (cyan, magenta, and yellow) and CMYK (cyan,
magenta, yellow, and black) models are used for colour printing. The HSI model corresponds
closely with the way humans describe and interpret colour; it also has the advantage that it
decouples the colour and gray-scale information in an image. Since in this project we only use
the RGB model to analyze our pictures, we only discuss it in detail.
RGB Colour Model
A particular pixel may have associated with it a three-dimensional vector (r, g, b) which
gives the respective colour intensities, as shown in figure 2.3, where (0, 0, 0) is black,
(1, 1, 1) is white, (1, 0, 0) is 'pure' red, (0, 0, 1) is 'pure' blue, (0, 1, 0) is 'pure' green,
(1, 1, 0) is 'pure' yellow, (1, 0, 1) is 'pure' magenta, and (0, 1, 1) is 'pure' cyan.
Figure 2.3 RGB colour cube
The different colours in this model are points on or inside the cube, defined by vectors
extending from the origin. For convenience, the assumption is that all colour values have been
normalized so that the cube shown in figure 2.3 is the unit cube; that is, all values of R, G, and
B are assumed to be in the range [0, 1].
The RGB image can be represented in the RGB colour model by the three component images:
red, green, and blue. When fed into an RGB monitor, these three images combine on the
phosphor screen to produce a composite colour image. Turning this the other way round, we
can segment an object of a specified colour range in an RGB image. Firstly, we obtain an
estimate of the "average" colour that we wish to segment. Then, for each RGB pixel in the
given image, we check whether it is close to the average colour; if it is, we export this pixel.
All the exported pixels form the region that we want to segment. This technique can be used
in our project to classify the coins into groups by their colours.
2.4.2) Hough Transform
The Hough transform is a basic image processing method for finding the global relationship
between pixels; that is, it can help us determine whether pixels lie on a curve of a specified
shape. An example is given below to discuss this algorithm in detail.
Detecting lines in a binary image is perhaps the simplest application of the Hough
Transform. We illustrate it with the example below.
Suppose there is a line in the image, and we want to know its position. The equation of a
line is y = k·x + b, where k and b are parameters. If there is a point (x0, y0), then every line
that passes through this point satisfies y0 = k·x0 + b. This means that the point (x0, y0)
determines a family of lines. For example, when k = 1 and b = 2, the line is y = x + 2; when
k = −7/5 and b = 7, the line is y = −(7/5)·x + 7; and so on. Furthermore, on the k-b plane
(also called the parameter space) the relation b = −x0·k + y0 is itself a line; for instance, the
point (3, 4) gives the parameter-space line b = −3k + 4. So each point on the x-y plane
corresponds to one line in the parameter space (see figure 2.4).
Figure 2.4 A point (x0, y0) on the x-y plane, two of the lines through it (y = x + 2 and y = −(7/5)·x + 7), and the corresponding line b = −x0·k + y0 in the k-b parameter space.
For instance, suppose the binary image contains the line y = x. Firstly, we choose three
points from this line: A (0, 0), B (1, 1), and C (2, 2). The parameter-space line for point A is
b = 0; for point B it is 1 = k + b; and for point C it is 2 = 2k + b. These three equations
correspond to three lines in the parameter space, and the three lines intersect at the single
point k = 1, b = 0. For the other points on the line y = x (e.g. (3, 3), (4, 4)), the corresponding
lines in the parameter space also pass through the point (k = 1, b = 0) (see figure 2.5).
Figure 2.5 The points A (0, 0), B (1, 1), and C (2, 2) on the line y = x correspond to the parameter-space lines b = 0, 1 = k + b, and 2 = 2k + b, which intersect at (k, b) = (1, 0).
The above section discussed where the Hough Transform comes from. The algorithm of the
line detector is as follows. Firstly, we create an accumulator array over the k-b parameter
space, with every cell initialized to 0. For every point on the x-y plane, we find the
corresponding line in the parameter space and add 1 to every cell on that line. Finally, we find
the cell with the biggest value in the parameter space; this cell gives the parameters we are
looking for (see figure 2.7).
Figure 2.6 Two lines for the Hough Transform
Figure 2.7 The Hough matrix of the Hough Transform for the lines of the image shown in figure 2.6
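The voting scheme described above can be sketched in a few lines. This is a Python illustration rather than the project's Matlab code; the discrete k and b grids and the function name are assumptions made for this example.

```python
# Hough line detection by voting over a discrete (k, b) grid.
# Each image point (x0, y0) votes along its parameter-space line
# b = -x0 * k + y0; the most-voted cell gives the detected line.

def hough_lines(points, k_values, b_values):
    votes = {}
    for (x0, y0) in points:
        for k in k_values:
            b = -x0 * k + y0          # the parameter-space line of (x0, y0)
            if b in b_values:         # keep votes inside the accumulator
                votes[(k, b)] = votes.get((k, b), 0) + 1
    return max(votes, key=votes.get)  # the cell with the biggest value

points = [(0, 0), (1, 1), (2, 2), (3, 3)]   # all lie on the line y = x
k_values = range(-5, 6)
b_values = set(range(-5, 6))
print(hough_lines(points, k_values, b_values))  # -> (1, 0)
```

All four points vote for the cell (k, b) = (1, 0), so it collects the most votes, exactly as in the worked example above.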
For detecting circles, the only change to the algorithm is that we need three parameters, the
centre of the circle (a, b) and its radius r, instead of the two parameters (k, b) used in line
detection. We will use this algorithm in our project to detect the centres and radii of the coins,
so that we can determine the value of each coin.
2.5) Summary
The key aim of this chapter was to find a possible design approach based on the literature.
Firstly, we discussed two similar projects: the CIS-Benchmark project and an automatic coin
counter project. Then the basic strategies of digital image recognition were discussed. Finally,
we discussed two techniques that could be used in this project.
Chapter 3-Plan& Design
3.1) Introduction
In this section the planned development of the project will be discussed. The working
conditions of our coin counting program will be set up, and the development stages will be
specified. Then the algorithms used in the project will be discussed in detail. There will be no
technical definition of the algorithm used at each stage, since this will follow in the
implementation section.
3.2) Program working conditions
As discussed in section 2.2.2, working conditions should be set up to improve the efficiency
and effectiveness of the program, because there are countless conditions under which coins
can appear in an image. For instance, the size and shape of the same coin can vary with the
position of the camera that captures it. If the camera is placed directly above the coin, the
shape of the coin will be a circle; otherwise, it will be an ellipse. Likewise, if the camera is
placed near the coins, the coins in the captured image will appear relatively bigger than if the
camera is placed far from them. Although these two problems could be solved by scaling, the
time for this project is restricted, so the program is not designed to work under these kinds of
conditions.
3.2.1) Hardware & Software requirements
• Hardware requirements
I. A computer with Matlab software installed.
II. At least 512 MB of memory, since the program needs a large amount of computation.
• Software requirements
Matlab 6.5 is the development tool for this project, so the program will work under the Matlab
environment.
3.2.2) Physical setup of the project
This project is developed under the physical setup below:
• The camera should be placed directly above the coins.
• The distance between the camera and the coins should be fixed.
• The coins should be placed on a black background.
• The coins should be placed lying flat.
3.3) Design Stages
Although there was no real need for a comprehensive software lifecycle for this project, a
certain amount of planned design was still considered necessary. In this section, the
computation strategy is discussed first. According to this strategy, the development of the
project was divided into five main stages, and each main stage was broken down into substeps
according to the algorithms used. These stages are discussed in section 3.3.2.
3.3.1) Possible strategies for detecting coins
As discussed in section 2.3, the most appropriate computation strategy for this project is
graph matching. According to this strategy, the coins should be classified into different
groups by their features; the features used in this project are colour and radius.
If the colour feature is used first, the coins are classified into three groups: the first group
contains the 2 pound and 1 pound coins; the second group contains the 2 pence and 1 penny
coins; and all the other coins form the last group. After colour segmentation, the second
feature is used: since the radii of the coins within a colour group differ, all the coins can be
distinguished (see figure 3.1).
Figure 3.1 The computation strategy: colour segmentation first splits the input image into three groups (2 pound and 1 pound coins; 2 pence and 1 penny coins; 10p, 5p, 20p and 50p coins), and the Hough transform for circles then separates each group by radius into the individual denominations.
Finally, the total value of the coins could be calculated by using the number of coins in each
subgroup.
The computation strategy could also be constructed the opposite way round; in other words,
we could use the radius feature first and then the colour feature. The radii could not be used
by themselves, because the radius of the 2 pence coin is nearly the same as that of the 10
pence coin. However, considering the number of computations required, the first strategy was
chosen for this project.
3.3.2) Stages of development
The development of the coin counting algorithm was divided into five main stages according
to our computation strategy (see figure 3.2). Each main stage was then broken into substeps
according to the algorithm used. Every substep was planned as a distinct algorithm which
could be written and tested separately, and then incorporated into the main project.
Figure 3.2 The five main stages: input image → colour segmentation (classified coins) → edge enhancement (circles with clear edges) → edge detection (edge points) → Hough Transform (centres and radii) → calculate values (output value).
• Stage 1: Colour segmentation
In this stage, the coins in the input image are classified into groups according to their colours.
The output image contains the coins of the input image that belong to one colour group. This
stage was divided into four substeps:
1.1) Select the colour region.
1.2) Calculate the "average" colour of the selected region, which we call the mean.
1.3) Find the measure of similarity between each colour pixel in the selected region and the
mean, which we call the threshold.
1.4) Segment the coins in the input image with the same colour as the selected region, using
the mean and threshold found in steps 1.2 and 1.3.
• Stage 2: Edge enhancement
After the segmentation in stage 1, the image containing the coins of one colour group is
exported as a binary image. However, the pixels of the coins in this output image are not
connected, because not all the pixels of each coin can be found in stage 1. In order to improve
the accuracy of the Hough Transform, clear coin edges are required; in other words, edge
enhancement is required. The substeps are shown below:
2.1) Reduce the noise. Isolated pixels should be removed in this step.
2.2) Fill the regions. As the pixels of the coins are not connected, we need to fill the gaps
between them in order to recover the pixels lost in stage one. The edges of the coins should
then be clear.
• Stage 3: Edge detection
Stage 2 outputs the image containing the coins with clear edges. The next step is to detect the
edges of the coins and output them. No substeps were needed.
• Stage 4: Hough Transform
Now we can apply the Hough Transform algorithm to all the edge points. The sizes and
locations of the coins are then detected from the radii and centres.
• Stage 5: Calculate the total value of the coins in the input image
From the radii and centres obtained in stage 4, we know the number of coins in each group,
so the value of the coins in each group can be calculated. Finally, we add all the values
together to get the total value.
These five main stages covered the main counting coins algorithm. An additional algorithm
was added to improve the accuracy of Hough Transform.
Calculate the radius of each single coin
As discussed in section 3.2.2, the distance between the camera and the coins should be fixed.
So when the program is first installed, after the camera has been fixed, we place each single
coin under the camera and apply this algorithm to obtain that coin's radius. We put these
numbers into our Hough Transform function; the computation required by the Hough
transform algorithm is thereby reduced and its accuracy improved. We discuss the reasons in
the validation section in chapter 4.
3.4) Overall Algorithms Definition
As discussed in section 3.3, five main stages and twelve algorithms were used to develop the
project. We discuss the algorithms used in this project stage by stage.
3.4.1) Colour segmentation
Segmentation is known as dividing an image into parts that have a strong correlation with
objects or areas of the real world contained in the image (Sonka 1999, p.123). In this project,
we segment the coins in the input image based on their colours; this method is called colour
segmentation. Two methods are generally used in this area: segmentation in HSI colour space
and segmentation in RGB vector space.
As discussed in section 2.4.1, in this project all the input images are RGB images, and the
colour images handled by Matlab are either indexed images or RGB images. So segmentation
in RGB vector space was chosen.
The objective of the colour segmentation used in this project is to classify the coins into
groups by their colours. The UK coins can be split into three groups as discussed in section
3.3.1 (see figure 3.3). From figure 3.3, we can see that part of the two pound coin has a
similar colour to the 20 pence coin, which means we could also put it into group three.
However, if we did so, only the white part of the two pound coin would be segmented, and
the size of that part is similar to a twenty pence coin.
Figure 3.3 Grouped coins
Once the colour region in which we are interested has been selected, we can obtain an
estimate of the "average" colour in this region. Let this average colour be denoted by the RGB
vector a. The objective of segmentation is to classify each RGB pixel in a given image as
having a colour in the specified range or not, so a measure of similarity is needed to perform
this comparison (Gonzalez 2002, p.333). One of the simplest measures is the Euclidean
distance. Let z denote an arbitrary point in RGB space. We say z is similar to a if the distance
between them is less than a specified threshold D0. The Euclidean distance between z and a is
given by
D(z, a) = ||z − a|| = [(z − a)^T (z − a)]^{1/2} = [(zR − aR)² + (zG − aG)² + (zB − aB)²]^{1/2}
where the subscripts R, G, and B denote the RGB components of the vectors a and z. The
locus of points such that D(z, a) ≤ D0 is a solid sphere of radius D0, as illustrated in figure 3.4.
The points contained within or on the surface of the sphere satisfy the specified colour
criterion; the points outside the sphere do not.
Figure 3.4 The sphere of radius D0 in RGB space, centred on the average colour a
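The distance test above can be sketched as follows. This is a Python illustration; the "copper" average colour, the pixel values, and the threshold 0.15 are invented for this example.

```python
# Keep a pixel z when its Euclidean distance to the average colour a
# is at most D0, i.e. when z lies inside the sphere of figure 3.4.
import math

def in_sphere(z, a, d0):
    dist = math.sqrt(sum((zc - ac) ** 2 for zc, ac in zip(z, a)))
    return dist <= d0

average = (0.8, 0.5, 0.3)                       # invented "copper" estimate
pixels = [(0.82, 0.48, 0.31), (0.1, 0.1, 0.1)]  # a coin pixel, a background pixel
mask = [in_sphere(p, average, 0.15) for p in pixels]
print(mask)  # -> [True, False]
```

The coin-coloured pixel lies well inside the sphere, while the dark background pixel falls outside it and is rejected.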
Now we can export the selected regions by using the average colour a and the threshold D0;
the coins in the input image are thereby segmented. An example of colour segmentation for
the 2 pence and 1 penny group is shown below:
Figure 3.5. The original image
Figure 3.6. The image of figure 3.5 after colour segmentation
3.4.2) Noise reduction
After colour segmentation, the coins of one colour group are displayed in a binary image. As
discussed in section 3.3.2, the output pixels of the coins vary with the value of the threshold
D0. If we choose a small D0, only a narrow range of colours on the coins is exported; that is,
only some isolated points appear in the output image. If a big threshold is chosen, the range of
colours may be so wide that background points are exported as well. It was impossible for us
to choose a threshold such that the output image contained only the segmented coins, because
the colours of a coin differ across its surface due to the texture of the coin (and possibly for
other reasons, such as noise).
As such, a noise reduction algorithm was applied to the image exported by the colour
segmentation. The aim of this algorithm is to eliminate the random, isolated pixels, which we
call noise here, while keeping almost all the pixels that are part of the coins. To do so, the
median filter was chosen.
The median filter works by replacing the value of a pixel with the median of the gray levels
in the neighborhood Sxy of that pixel:
f̂(x, y) = median_{(s,t) ∈ Sxy} { g(s, t) }
The original value of the pixel is included in the computation of the median.
Since the input image at this stage is a binary image, the noise and the coins are all expressed
as 1s and all other pixels as 0s. Moving a mask over this image, the isolated pixels are
eliminated by computing the median of their neighbours, while the pixels of the coins are
kept (although the region containing the coins may be blurred). A median filter was applied
to figure 3.6, and the output image is shown below.
Figure 3.7. The image of figure 3.6 after applying the median filter
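A minimal sketch of 3 × 3 median filtering on a binary image is given below. It is a Python illustration of the idea, not Matlab's medfilt2, and the test image is invented: an isolated 1-pixel is outvoted by its 0-neighbours, while the interior of a solid block survives.

```python
# 3x3 median filtering of a binary image, padding outside pixels with 0.

def median_filter(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # collect the 3x3 neighbourhood (the pixel itself included)
            vals = [img[y + dy][x + dx]
                    if 0 <= y + dy < h and 0 <= x + dx < w else 0
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = sorted(vals)[4]   # the median of the 9 values
    return out

noisy = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],   # an isolated noise pixel
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],   # a solid block standing in for a coin
    [0, 0, 0, 1, 1, 1],
]
filtered = median_filter(noisy)
# the isolated pixel vanishes; the interior of the block survives
```

Note that the filter also nibbles at the corners of the block, which is exactly the slight edge erosion described in the text.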
Since the median filter function is already provided in Matlab, we do not discuss this
algorithm in the implementation.
3.4.3) Edge enhancement
From figure 3.7, we can see that the pixels of the coins shown in the image are not only edge
pixels, so we cannot use the Hough Transform algorithm directly. Since the intensity is not
distributed evenly over the coins, applying an edge detector directly to the image would find
many non-edge pixels. In order to improve the accuracy of the Hough Transform, edge
enhancement is necessary in this project.
Binary morphology was chosen to solve the problem faced at this stage. Binary morphology
is a subset of basic mathematical morphology, based on "the algebra of non-linear operators
operating on object shape" (Sonka 1999, p.559). Morphology is used in many different areas
of computer vision and image analysis, such as noise filtering, shape simplification,
skeletonising, thinning, thickening, convex hull calculation, object segmentation, and the
quantitative description of objects (Sonka 1999). The principle behind mathematical
morphology in computer vision is to treat the binary image as a point set, where each element
is a 2-D vector whose coordinates are the (x, y) coordinates of a black (or white, depending
on convention) pixel in the image. To apply a morphological operation, a particular
structuring element (see figure 3.8) is moved systematically across the image. Applying this
element means computing the relation between the input image (point set X) and the
structuring element (point set B), expressed with respect to a local origin; the result is stored
in the output image at the current pixel position.
1 1 1        0 1 0
1 1 1        1 1 1        1 1 1
1 1 1        0 1 0
Figure 3.8. Typical structuring elements: a 3 × 3 square, a 3 × 3 cross, and a 1 × 3 line
The binary morphological operations that will be used in this project are dilation, erosion,
opening, and closing.
• Dilation
The morphological transformation dilation ⊕ combines two sets using vector addition. The
dilation X ⊕ B is the point set of all possible vector additions of pairs of elements, one from
each of the sets X and B:
X ⊕ B = { p ∈ E² : p = x + b, x ∈ X and b ∈ B }
Dilation is used to fill small holes and narrow gulfs in objects; it increases the object size
(Sonka 1999, p.563). In this project, dilation was used to fill the spaces between the isolated
pixels of the coins. From figure 3.7, we can see that there are big gaps within the coins; after
dilation, the gaps are filled in and become smaller (see figure 3.9).
Figure 3.9 The image of figure 3.7 after applying the dilation
• Erosion
Erosion ⊖ combines two sets using vector subtraction of set elements and is the dual operator
of dilation. However, neither erosion nor dilation is an invertible transformation.
X ⊖ B = { p ∈ E² : p + b ∈ X for every b ∈ B }
Erosion is used to simplify the structure of an object: parts of an object with width equal to
one will disappear, so it may decompose complicated objects into several simpler ones
(Sonka 1999, p.565).
In this project, erosion is used to shrink the coins back to the size they had before being
enlarged by the dilation. An example is shown in figure 3.12.
• Opening & Closing
Opening and closing are two operations that are both constructed from erosion and dilation.
In the opening operation, erosion is applied to the image first, followed by dilation. The
opening of an image X by the structuring element B is denoted by X ∘ B and is defined as
X ∘ B = (X ⊖ B) ⊕ B
In the closing operation, the dilation is applied first, followed by erosion. The closing of an
image X by the structuring element B is denoted by X • B and is defined as
X • B = (X ⊕ B) ⊖ B
Opening generally smoothes the contour of an object, breaks narrow isthmuses, and
eliminates thin protrusions. Closing is usually used to connect objects that are close together,
fill up small holes that may appear in objects, and smooth object outlines. Figures 3.10-3.12
show how these morphology algorithms were used in this project.
Figure 3.10. The image of figure 3.9 after applying the closing algorithm
Figure 3.11. The image of figure 3.10 after applying the opening algorithm
Figure 3.12. The image of figure 3.11 after applying the erosion algorithm
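The dilation and erosion operations above, and the closing built from them, can be sketched on a toy binary image. This is a Python illustration under the set definitions given, not the project's Matlab imdilate/imerode; the gap image is invented.

```python
# Binary dilation and erosion with a 3x3 square structuring element;
# closing (dilation followed by erosion) bridges a one-pixel gap.

B = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]  # 3x3 square

def dilate(img, se=B):
    """X (+) B: a pixel is set if any translate of the element hits X."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= y - dy < h and 0 <= x - dx < w and img[y - dy][x - dx]
                     for dy, dx in se))
             for x in range(w)] for y in range(h)]

def erode(img, se=B):
    """X (-) B: a pixel is set only if the whole element fits inside X."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in se))
             for x in range(w)] for y in range(h)]

gap = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],   # two fragments separated by a one-pixel gap
    [0, 0, 0, 0, 0],
]
closed = erode(dilate(gap))   # closing: (X (+) B) (-) B
print(closed[1])  # -> [0, 1, 1, 1, 0]  (the gap has been filled)
```

This is exactly the effect used on the coins: the dilation connects nearby fragments, and the erosion then returns the object to roughly its original size.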
3.4.4) Edge detection
After applying the morphology algorithms to enhance the edges, the edges of the coins are
clearly shown in the image (see figure 3.12). Now we can use an edge detector to find the
edges of the coins. The Sobel edge detector was chosen for this project: it is not as slow as the
Canny edge detector, and its output is good enough for this purpose. Although it is slower to
compute than the Roberts cross operator, its larger convolution mask smoothes the input
image to a greater extent and so makes the operator less sensitive to noise. It also generally
produces considerably higher output values for similar edges compared with the Roberts
cross. The details of the algorithm are discussed below.
We know that edges are places in the image with strong intensity contrast, so they correspond
to strong illumination gradients. The Sobel operator is used to find the approximate absolute
gradient magnitude at each point of an input image. The operator consists of a pair of 3 × 3
masks, as shown in figure 3.13.
-1  0  1         1  2  1
-2  0  2         0  0  0
-1  0  1        -1 -2 -1
    Gx               Gy
Figure 3.13 Sobel convolution masks
These two masks shown in figure 3.13 are designed to respond maximally to edges running
vertically and horizontally relative to the pixel grid, one mask for each of the two
perpendicular orientations.
The masks can be applied separately to the input image to produce separate measurements of
the gradient component in each orientation (Gx and Gy in figure 3.13). These can then be
combined to find the absolute magnitude of the gradient at each point and the orientation of
that gradient. The gradient magnitude is given by:
|G| = sqrt(Gx² + Gy²)
In practice, this implementation is not always desirable because of the computational burden
of squares and square roots. A frequently used approach is to approximate the gradient by
absolute values:
|G| = |Gx| + |Gy|
which is much faster to compute. Only the absolute magnitude is shown to the user as the
output of the detector, because the two components of the gradient can be conveniently
computed and added in a single pass over the input image using the pseudo-convolution
operator shown in figure 3.14.
Z1 Z2 Z3
Z4 Z5 Z6
Z7 Z8 Z9
Figure 3.14 Pseudo-convolution mask
Using this mask, the approximate magnitude is given by:
|G| = |(Z1 + 2·Z2 + Z3) − (Z7 + 2·Z8 + Z9)| + |(Z3 + 2·Z6 + Z9) − (Z1 + 2·Z4 + Z7)|
An example of using the Sobel edge detector is shown in figure 3.15.
Figure 3.15 The image of figure 3.12 after applying the Sobel edge detector
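The mask responses and the |Gx| + |Gy| approximation above can be checked at a single pixel. This is a Python sketch for illustration, with an invented step-edge image; it is not the project's Matlab code.

```python
# Apply the Gx and Gy masks at one interior pixel and combine them with
# the |Gx| + |Gy| approximation of the gradient magnitude.

GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

def sobel_magnitude(img, y, x):
    """Approximate |G| = |Gx| + |Gy| at the interior pixel (y, x)."""
    gx = sum(GX[dy + 1][dx + 1] * img[y + dy][x + dx]
             for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    gy = sum(GY[dy + 1][dx + 1] * img[y + dy][x + dx]
             for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return abs(gx) + abs(gy)

# a vertical step edge: dark on the left, bright on the right
img = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(sobel_magnitude(img, 1, 1))  # strong response on the edge -> 4
print(sobel_magnitude(img, 1, 2))  # -> 4
```

On this vertical edge Gy is zero and Gx carries the whole response, matching the claim that each mask responds maximally to one of the two perpendicular edge orientations.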
3.4.5) Hough Transform for the circles
After applying the Sobel edge detector, the edges of the coins are exported. Now we can use
the Hough Transform algorithm to analyze these edge pixels and find the relationship
between them. The shapes of most coins in the UK are circles, except the 20 pence and 50
pence coins; however, after applying the morphology algorithms in the edge enhancement
stage, the 20 pence and 50 pence coins are reshaped into approximate circles. So the Hough
Transform algorithm is used to detect circles (the shapes of the coins). Once the centres and
the radii of the circles are found, the value of the coins in each group can be calculated.
In the last section of the literature survey, the Hough Transform for detecting lines was
discussed in detail. The Hough Transform for circles is similar; the only difference is that we
need three parameters to describe a circle instead of the two parameters used for lines.
The basic idea of the standard Hough Transform is that every pixel (x, y) in the image is
transformed to the parameter space (for circles, the centre (a, b) and the radius r) through the
relation between the pixel and the parameters used, just as, in the example of section 2.4.2,
points on a line were transformed to the slope and intercept of the line in parameter space.
Then, by "voting" in parameter space, patterns in the image data conspire to produce local
extrema at the most likely parameter values.
The relationship between the pixels of the image and the parameters (a, b) and r is shown
below:
a = x − r·cos θ
b = y − r·sin θ
where θ ∈ [0, 2π].
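Using this relation, the circle Hough Transform can be sketched as follows. This is a Python illustration; the synthetic edge points and the assumption that the radius r is known in advance (as in our installation step) are for this example only.

```python
# Circle Hough Transform with the radius r known in advance (as in this
# project, where each coin's radius is measured at installation time).
# Every edge pixel (x, y) votes for the candidate centres
# a = x - r*cos(theta), b = y - r*sin(theta); the most-voted cell wins.
import math

def hough_circle_centre(edge_points, r, steps=360):
    votes = {}
    for (x, y) in edge_points:
        # each edge pixel votes once per distinct candidate centre cell
        cells = {(round(x - r * math.cos(2 * math.pi * i / steps)),
                  round(y - r * math.sin(2 * math.pi * i / steps)))
                 for i in range(steps)}
        for cell in cells:
            votes[cell] = votes.get(cell, 0) + 1
    return max(votes, key=votes.get)

# lattice points lying exactly on the circle of radius 5 centred at (10, 10)
offsets = [(5, 0), (4, 3), (3, 4), (0, 5), (-3, 4), (-4, 3), (-5, 0),
           (-4, -3), (-3, -4), (0, -5), (3, -4), (4, -3)]
edge = [(10 + dx, 10 + dy) for dx, dy in offsets]
centre = hough_circle_centre(edge, 5)   # expected at or next to (10, 10)
```

Because r is fixed, the accumulator is 2-D; with unknown radii the same voting would run over a 3-D (a, b, r) space, which is the computational burden the Kimme and Ballard refinement discussed below addresses.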
The standard Hough Transform runs very slowly, because a huge number of computations is
required for every edge pixel in the image. Kimme and Ballard (1975) suggested that the
computation could be reduced by substituting for θ a function of r: if the cells of the
accumulator array are big enough, the pixel (x, y) remains within the same cell after θ is
increased by a small amount for a given r, so r × ∆θ can be made approximately equal to 1
(one cell unit). For example, in figure 3.16 we can see that for r = 3 and r = 5, r × ∆θ = 1,
where 1 means one unit cell. So "r∆θ is approximately equal to the spacing between adjacent
points of the accumulator array" (Kimme 1975).
Figure 3.16 ∆θ as a function of r (Kimme 1975)
The number of computation steps is thereby reduced. It could be reduced further by using the
gradient of the edge pixels, but as we did not implement this strategy in our program, it is not
discussed in this section.
3.5) Summary
In this chapter an introduction to the main stages of the development of the project was given,
and the problems, methods, and algorithms involved at each stage were described in general
terms. In the next section a more technically in-depth discussion of each stage will be given,
along with the relevant algorithm implementations.
Chapter 4- Implementation & Validation
4.1) Introduction
In this chapter, we discuss the implementation of the design strategy chosen in section 3.3.1.
There are five stages in total: colour segmentation, edge enhancement, edge detection, Hough
Transform, and calculating the total value, as discussed in section 3.3. The validation is
discussed at the end of this chapter.
4.2) Implementation Stages
4.2.1) Colour segmentation
In this stage, the implementation was divided into four steps.
• Input the image and select the regions of interest.
The function imread was used to read the image into a 3-dimensional array: the first plane
denotes the red component, the second the green component, and the third the blue
component, as mentioned in section 2.4.1. Then the function roipoly was used to select the
area of interest. Both functions are provided by Matlab.
• Calculate the average colour of the selected area (the mean) and the threshold, which is the
measure of similarity between each colour pixel and the mean.
The function 'threshold' was used to determine the average colour and the colour bias, which
we call the threshold. Once the region of interest is selected, the function calculates the
average colour of the pixels in the whole region for each RGB channel, and takes the colour
variance in the region as the bias.
The algorithm we used is displayed below:
Algorithm of threshold:
  Obtain the region of interest using the function roipoly, which produces a binary mask of an interactively selected region
  Get the colours of the region of interest
  Extract the coordinates of the points
  Rearrange the colour pixels of the region as rows of a matrix
  Find the rows (indices) of the colour pixels that are not black
  Compute the mean and covariance
  If no pixels were selected (the special case)
    Return a threshold of zero
  Else
    Compute an unbiased estimate:
      Subtract the mean from each row
      Compute a new unbiased estimate and convert it to a column vector, the bias
  End if
  Return the threshold, which contains the average colour and the bias
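The mean-and-bias computation in the algorithm above can be sketched as follows. This is a Python illustration of the logic, not the actual Matlab 'threshold' function, and the region values are invented.

```python
# Per-channel mean of the non-black pixels of a region, plus a "bias"
# (an unbiased per-channel standard deviation) around that mean.

def mean_and_bias(region_pixels):
    pixels = [p for p in region_pixels if p != (0, 0, 0)]  # drop black pixels
    if not pixels:                      # the special case: nothing selected
        return (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)
    n = len(pixels)
    mean = tuple(sum(p[c] for p in pixels) / n for c in range(3))
    bias = tuple((sum((p[c] - mean[c]) ** 2 for p in pixels)
                  / max(n - 1, 1)) ** 0.5 for c in range(3))
    return mean, bias

region = [(0.75, 0.5, 0.25), (0.25, 0.5, 0.25), (0.0, 0.0, 0.0)]
mean, bias = mean_and_bias(region)
print(mean)   # -> (0.5, 0.5, 0.25)
```

The black pixels introduced by the roipoly mask are excluded, so the mean and bias describe only the selected coin colours.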
• Segment the coins in the input image with the same colour as the selected area.
The function 'colorseg' was used to select the pixels whose colour is close to the colour of the
coins in RGB units. The key to this stage is to find a well-matched pair of average colour and
bias. After sampling a number of thresholds on different regions of coin pictures, a database
of the coins' average colours was obtained, together with the biases for selecting them. By
applying these data, the pixels similar in colour to the selected coins were easily found in the
original image and saved as a 'pre-edge-detection' image for the next stage.
Algorithm of colorseg:
  Input: image, average colour, bias
  If the input image is not RGB
    Return an error
  End if
  Convert the image to vector format
  Compute the Euclidean distance between every row of the image and the average colour
  Find all the pixels whose distances are less than the bias
  Reshape these pixels
  Output these pixels as the colorseg result, the 'pre-edge-detection' image
4.2.2) Edge enhancement
Before the edge-detection stage, the ‘pre-edge-detected’ image still contained all kinds of
noise and could not yield an accurate edge for the Hough Transform. This led to the
edge-enhancement stage.
• Reduce the noise.
A median filter was applied to remove isolated pixels from the image. The Matlab function
‘medfilt2’ performs median filtering of a matrix in two dimensions: each output pixel
contains the median value of the m-by-n neighbourhood around the corresponding input
pixel. This was quite useful for removing ‘salt and pepper’ noise, but it came at the price of
slightly eroding and disconnecting the coin edges.
To recover the edges, the function ‘imdilate’ was used. It dilated the pixels in the image with
a specified structuring element, a small line, and reconnected the edges. After the dilation
the edges were connected, but some rice-like noise remained both inside and outside the
edge region.
• Fill the region
The functions ‘imclose’ and ‘imopen’ were used for this; they performed very well in this
substep.
• Adjust the result from the last step.
Because the coins had been dilated in the last step, we needed to adjust their size back, so
the function ‘imerode’ was used here.
The algorithm of these steps is displayed below:
Algorithm of edge enhancement:
Remove the ‘salt and pepper’ noise with a median filter
Dilate the remaining pixels to fill the disconnections and gaps in the edges
Remove the ‘rice’ noise with the morphological open and close operations:
Close the image to remove the noise inside the edges
Open the image to remove the noise outside the edges
Erode the image back to its original size
Return the result
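The steps above can be sketched with SciPy's morphology operations standing in for ‘medfilt2’, ‘imdilate’, ‘imclose’, ‘imopen’ and ‘imerode’ (the 3x3 kernel sizes and the line-shaped structuring element are assumptions about parameters the text does not specify):

```python
import numpy as np
from scipy import ndimage

def enhance_edges(mask):
    """mask: 2-D boolean image produced by colour segmentation."""
    # 1) median filter removes isolated 'salt and pepper' pixels
    m = ndimage.median_filter(mask.astype(np.uint8), size=3).astype(bool)
    # 2) dilation with a small line-shaped structuring element reconnects edges
    line = np.ones((1, 3), dtype=bool)
    m = ndimage.binary_dilation(m, structure=line)
    # 3) closing removes 'rice' noise inside the region, opening removes it outside
    m = ndimage.binary_closing(m, structure=np.ones((3, 3), dtype=bool))
    m = ndimage.binary_opening(m, structure=np.ones((3, 3), dtype=bool))
    # 4) erosion shrinks the region back towards its original size
    m = ndimage.binary_erosion(m, structure=line)
    return m
```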
4.2.3) Edge detection
As discussed in section 3.3.4, the “Sobel” edge detector was chosen for this project. In
Matlab, the edge-detection function ‘edge’ is built in, so we could use it directly. Since the
coin edges had been well enhanced, the function traced them effectively.
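As an illustration of what the built-in ‘edge’ function does with the Sobel operator, here is a Python/SciPy sketch (the explicit threshold parameter is an assumption; Matlab's ‘edge’ can choose one automatically):

```python
import numpy as np
from scipy import ndimage

def sobel_edges(img, threshold):
    """Return a binary edge map: Sobel gradient magnitude above threshold."""
    gx = ndimage.sobel(img.astype(float), axis=1)  # horizontal derivative
    gy = ndimage.sobel(img.astype(float), axis=0)  # vertical derivative
    magnitude = np.hypot(gx, gy)                   # gradient magnitude
    return magnitude > threshold
```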
4.2.4) Hough Transform
In the last step, the edges of the coins were found. In this step, we analyse these edges in
order to recognize every single coin in the input image.
The implementation of the Hough Transform can be divided into two parts:
• Implement the standard Hough Transform
We discussed the standard Hough Transform in section 3.3.5. First, every pixel of the edge
must be transformed into its parameter space using the function that describes their
relationship. Then, the local extrema are found in the accumulator.
The algorithm of the Hough Transform:
Algorithm of Hough Transform:
Input: the image, the radii of the coins
Find all the non-background pixels in the image
Hough Transform:
Initialise the array of radii r
Initialise the array of directions theta as a function of the radii, so that delta theta
equals 1/r
Use the pixels to calculate the parameter space of circles with the specified arrays of
radii and theta
Store these points in the parameter space as the Hough Matrix, a three-dimensional
array containing the coordinates of the centres and the corresponding radii
Return the Hough Matrix
The algorithm of the accumulator array:
Algorithm of accumulator:
Input: the Hough Matrix
Add up the occurrences of points in the parameter space
Store the occurrence counts as the values of the Hough Matrix
Find the local maxima in the matrix and return their coordinates and radii as the
possible circles
Filter out the false centres and radii that are too close to each other
Return the remaining centres and radii
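The accumulator for a single known radius can be sketched as follows (a Python/NumPy illustration, not the dissertation's code; the original also looped over an array of radii, and the effect of rounding everything to integer indices is visible here too):

```python
import numpy as np

def hough_circles(edge, radius, threshold):
    """Vote for circle centres of a known radius.
    edge: 2-D boolean edge map. Each edge pixel votes along a circle of
    the given radius; the angular step is 1/r, as in the improved version."""
    h, w = edge.shape
    acc = np.zeros((h, w), dtype=int)
    thetas = np.arange(0.0, 2 * np.pi, 1.0 / radius)      # delta theta = 1/r
    dys = np.round(radius * np.sin(thetas)).astype(int)
    dxs = np.round(radius * np.cos(thetas)).astype(int)
    ys, xs = np.nonzero(edge)                             # non-background pixels
    for y, x in zip(ys, xs):
        cy, cx = y - dys, x - dxs                         # candidate centres
        ok = (cy >= 0) & (cy < h) & (cx >= 0) & (cx < w)
        np.add.at(acc, (cy[ok], cx[ok]), 1)               # cast the votes
    # every cell with enough votes is a possible centre
    return np.argwhere(acc >= threshold), acc
```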
The result of this function was not entirely satisfactory. Firstly, the program ran very
slowly; for a large image it could take several hours. Secondly, because every number used
in this program was rounded to an integer (they served as matrix indices), there were
systematic errors we could not eliminate. Moreover, the smallest unit of the accumulator
was one, since matrix indices must be integers; as these indices expressed the coordinates
of the centre and the radius of a coin, the output could only be an integer. The result was
that several outputs sometimes represented a single circle. To adjust the outputs, we
assumed that all outputs within a certain range represented the same circle. However, if the
distance between the centres of two coins fell within this range, the program would produce
erroneous output. This situation only occurred when two coins overlapped. In figures 4.1 to
4.3, the two overlapping coins could still be distinguished; but for the two coins overlapping
as in figure 4.4, the program could not distinguish them.
Figure 4.1. 5 pence and 50 pence
Figure 4.2. 50 pence was recognized
Figure 4.3. 5 pence was recognized
Figure 4.4. Two 1 penny coins overlapped
Figure 4.5. Left penny was not recognized
Figure 4.6. Right penny was recognized
• Improved Hough Transform
As discussed in section 3.3.5, the Hough Transform was improved by reducing the number
of computations: we replaced the step of θ with 1/r, and the running time became acceptable.
We also limited the number of radii through the program's initial settings, which increased
the accuracy and further reduced the running time.
The overall algorithm for Hough Transform is displayed below:
Algorithm of Hough Transform:
Input: the image, the radii of the coins
Find all the non-background pixels in the image
Hough Transform:
Use the pixels to calculate the parameter space of circles with the specified radius and
the array of directions
Store these points in the parameter space as the Hough Matrix
Accumulate the Hough Matrix to find the possible centres of circles:
Use an accumulator to add up the occurrences of points in the parameter space
If an occurrence count exceeds the threshold value
Mark the point as one possible centre
End if
Remove the false centres that are too close to an existing centre to be a new one.
Return the centres
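The final removal of near-duplicate centres can be sketched as a greedy non-maximum suppression (a Python illustration; the vote scores and the distance threshold `min_dist` are assumed inputs, not named in the dissertation):

```python
import numpy as np

def filter_centres(centres, scores, min_dist):
    """Keep the strongest centre first, then drop any later candidate
    closer than min_dist to a centre already kept."""
    order = np.argsort(scores)[::-1]          # strongest votes first
    kept = []
    for i in order:
        c = np.asarray(centres[i], dtype=float)
        if all(np.linalg.norm(c - k) >= min_dist for k in kept):
            kept.append(c)
    return [tuple(int(v) for v in k) for k in kept]
```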
4.2.5) Calculating the values of coins
Since the number of each kind of coin was determined in the last stage, it is easy to add up
the total value of the coins.
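This final step amounts to a weighted sum. As a sketch, with counts consistent with the test image described in chapter 5 (one of each red and white coin plus a two-pound coin; the counts here are illustrative):

```python
# counts per denomination (in pence) produced by the recognition stage;
# these particular counts reproduce the chapter-5 test image
counts = {1: 1, 2: 1, 5: 1, 10: 1, 20: 1, 50: 1, 100: 0, 200: 1}

# the total value is simply the sum of denomination * count
total = sum(value * n for value, n in counts.items())
print(total, "pence")  # -> 288 pence
```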
4.3) Validation
There are several points to note in the implementation. The first is that the colour average
and the bias are the key to the colour-segmentation stage. Since the colour and brightness of
a picture change whenever photos of the coins are taken, the colour average and bias change
too. This requires several rounds of sampling to establish a proper database from which to
select an accurate ‘threshold’ value.
The second is that we set the direction array as a function of the radii, which saves a great
deal of computation as well as memory. The last point is that edge enhancement is an
important bridge between colour segmentation and edge detection.
Chapter 5- Testing
5.1) Introduction
In this chapter, we discuss the testing of the program implemented in chapter 4. We use only
black-box testing, because the inside of each function was already tested during the
implementation stage. The test plan was divided into four steps: the initial setup of the
program, colour segmentation, edge detection and counting the coins in each colour group,
and calculating the total value of the coins in the input image.
5.2) Test plan
The test plan was designed according to the computation strategy discussed in chapter 3.
Under that strategy, the coins should be segmented into three groups. However, due to the
nature of colour segmentation as discussed in section 3.4.2, only colours in a specified range
can be segmented from the input image, and the thresholds obtained for one-pound and
two-pound coins carried a large error. We therefore decided to segment them into two
separate groups, which is why there are four groups in section 5.2.2. Also, according to the
working conditions set for the program in section 3.2.2, each time the position of the camera
changes, the initial data for the radii of the coins must be set up again. The test plan was
therefore organised into the four steps below.
5.2.1) Initial setup for the program
• All kinds of coins should be sampled to build a database of colour averages and biases as
well as coin radii. For example, to find these data for the two-pence coin, we load a picture
of a two-pence coin (2p_a.jpg), shown below.
Figure 5.1. The two-pence coin
• The threshold function was run to determine the average colour, which is 114.0884 (R),
66.8627 (G) and 30.27 (B), with a bias of 19.17347.
• The function ‘find_radii’ was applied to find the radius of the two-pence coin; the result
is 81 pixels. Repeating this procedure for all the other coins (one penny, ten pence and so
on) finally produced a database of colour thresholds and coin radii.
5.2.2) Group the coins by colour
• The coins were classified into four groups according to their colours:
A: Red-coin group, with the one-penny and two-pence coins.
B: One-pound coin group.
C: Two-pound coin group.
D: White-coin group, with the five-pence, ten-pence, twenty-pence and fifty-pence coins.
The groups are shown in the following picture.
Figure 5.2. Coins in the colour groups.
5.2.3) Detect edges and count coins separately by groups
• Test for group A
We ran the function ‘edge_red’ and obtained the edge image below:
Figure 5.3. The edges for the two-pence and one-penny coins.
The function ‘countcoin’ was run on the one-penny and two-pence coins respectively. The
results are shown in the figures, where the red circles mark the positions of the coin centres
found in the images.
Figure 5.4. Recognizing the one-penny coin
Figure 5.5. Recognizing the two-pence coin
The numbers of one-penny and two-pence coins were stored.
• For groups B, C and D, we ran the edge-detection and countcoin functions in the same
way. The results are as follows:
Group B: one-pound coin
Figure 5.6. The edges for the one-pound coin. There is actually no one-pound coin in the
test picture, so all the results are noise.
Figure 5.7. Recognizing the one-pound coin. The output of the countcoin function is that
there are zero such coins in the image.
Group C: two-pound coin
Figure 5.8. The edges for the two-pound coin.
Figure 5.9. Recognizing the two-pound coin
Group D: white coins
Figure 5.10. The edges for the white coins, including the five-pence, ten-pence, twenty-pence
and fifty-pence coins.
Figure 5.11. Recognizing the five-pence coin
Figure 5.12. Recognizing the ten-pence coin
Figure 5.13. Recognizing the twenty-pence coin
Figure 5.14. Recognizing the fifty-pence coin
All the coin counts were stored.
5.2.4) Calculate the total value
The result of the calculation is 288 pence, which is exactly the value of the coins shown in
the test picture.
5.3) Discussion of the test result
The output of the program for the test image was satisfactory with respect to the goals set
in section 1. Each type of coin could be recognized by the program, and the total value
output was correct. However, due to the method used in segmentation, it is not possible to
obtain a global threshold that works for every coin in the same group: coins of the same
value can vary in colour, and the colours of new coins are distinctly different from those of
old coins. So our program cannot recognize all coins; whenever coins of a given value look
different from the sample coins used to set up the program, the program must be reset, after
which those coins can be recognized.
Chapter 6- Review
6.1) What went well & Why
The testing in section 5.2 showed that the program is able to recognize all eight kinds of UK
coins and can calculate the total value of the coins in a given image. The main goals and
subgoals of this project were all achieved. However, the program works well for the test
image because we set working conditions for both the program and the input image; the
reasons for doing so were discussed in section 3.2.
In real life there are countless conditions under which coins may appear in an image; for
example, the size and shape of the same coin vary with the position of the camera that
captures it, as mentioned in section 3.2. The camera used to capture the coins was therefore
fixed. Also, there are relative errors in colour segmentation that depend on the chosen
threshold value, and these errors cannot be eliminated given the nature of our method, so
the background on which the coins were displayed was set to black. As the position of the
camera was fixed, the radii of the coins could be set up at the beginning of the program in
order to reduce its running time. The last limitation of the program was discussed in section
5.3 and is not repeated here.
As said at the beginning of this section, the performance of the program was quite
satisfactory under the conditions discussed above.
6.2) What could be improved & how
There are two areas of the program that could be improved.
• If we did not fix the position of the camera, we could add image rotation and scaling to
this program, so that it could be applied in a much wider range of situations.
• We could add the template-matching method to improve the accuracy of recognition, as
in the CIS-Benchmark project mentioned in section 2.2.1. We could create a large database:
the more images of coins with different colours are stored, the better the results the program
will achieve. However, the running time would be a problem if it were implemented in
Matlab.
6.3) Conclusion of the project
Generally speaking, the whole process of this project went well, although at the beginning it
was difficult to find a starting point because of the many conditions under which coins could
appear in an image.
Thanks to Dr. Hall, who provided a starting point for the project. After reading the literature
and analysing similar projects, a design strategy was chosen. Differing from previous
projects in this area, we chose a new approach to recognizing coins: colour segmentation
was used first to classify the coins into groups according to their colours, and then the
Hough Transform was used to detect the sizes of the coins, so that each kind of coin could
be distinguished. Finally, the total value was calculated from the coins found. Although the
program only worked well under the conditions we set up, the running time was quite
satisfactory.
In the future, once the work discussed in section 6.2 is done, the program could gain the
ability to recognize all UK coins under any conditions captured by the camera.
Chapter 7- Bibliography
Ayache, N., and Faugeras, O. (1986). "HYPER: A new approach for the recognition and positioning of
two-dimensional objects." IEEE Trans. Pattern Anal. PAMI 8(1): 44-54.
Bellman, R., and Dreyfus, S. (1962). Applied dynamic programming. Princeton, Princeton University
Press.
Bolles, R. C., and Horaud, P. (1986). "A three-dimensional part orientation system." Int. J. Rob. Res.
5(3): 3-26.
Brooks, R. A. (1981). "Symbolic reasoning among 3-D models and 2-D images." Artif. Intell. 17:
285-348.
Brooks, R. A. (1983). "Model-based three dimensional interpretations of two-dimensional images."
IEEE Trans. Pattern Anal. PAMI 5(2): 140-150.
Duda, R. O., and Hart, P. E. (1973). Pattern classification and scene analysis. New York, John Wiley &
Sons.
Forsyth, D. A., and Ponce, J. (2003). Computer Vision: A Modern Approach. London, Prentice
Hall.
Fua, P., and Hanson, A. J. (1991). "An optimization framework for feature extraction." Mach. Vision
Appl. 4: 59-87.
Gonzalez, R. C., and Woods, R. E. (2002). Digital image processing. London, Prentice Hall.
Kimme, C., Ballard, D., and Sklansky, J. (1975). "Finding circles by an array of accumulators."
Communications of the ACM 18(2): 120-122.
Mathworks (1998). RGB Images. http://visl.technion.ac.il/labs/anat//2-ImageTypes/#RGB%20Images.
Nölle, M., Jonsson, B., and Rubik, M. (2004). Coin Images Seibersdorf - Benchmark, ARC Seibersdorf
research GmbH: 1-8.
Provine, J., McClintock, M., Murray, K., and Chau, A. (1999). "Automatic coin counter." From
http://www.owlnet.rice.edu/~elec539/Projects99/JAMK/proj2/report.html.
Rosenfeld, A. (1969). Picture Processing by Computer. New York, Academic Press.
Sonka, M., Hlavac, V., and Boyle, R. (1999). Image processing, analysis and machine vision. London,
PWS.
Suetens, P., Fua, P., and Hanson, A. J. (1992). "Computational strategies for object recognition." ACM
Computing Surveys 24(1): 5-61.
Tenenbaum, J. M., and Barrow, H. G. (1977). "Experiments in interpretation-guided segmentation."
Artif. Intell. 8: 241-274.
Zhao, W., Chellappa, R., Phillips, P. J., and Rosenfeld, A. (2003). "Face recognition: a literature survey."
ACM Computing Surveys 35(4): 399-458.