
CALIFORNIA STATE UNIVERSITY, NORTHRIDGE
CHARACTER RECOGNITION TECHNIQUES
A graduate project submitted in partial satisfaction of
the requirements for the degree of Master of Science in
ELECTRICAL ENGINEERING
by
Saiful Islam Ahmad
May, 1986
The graduate project of Saiful I. Ahmad is approved:

DR. JAGDISH C. PRABHAKAR

PROF. WILLIAM MACDONALD
Committee Chairman

CALIFORNIA STATE UNIVERSITY, NORTHRIDGE
To my parents, for believing in me.
TABLE OF CONTENTS

Dedication
Table of Figures
Abstract

INTRODUCTION

Section 1  TEMPLATE MATCHING
    1.1  Gradient Picture
    1.2  Recognition System
    1.3  Error Calculation
    1.4  Similarity Measure

Section 2  CONVEX AND CONCAVE FEATURES
    2.1  Feature Extraction
    2.2  Features
    2.3  Matching
    2.4  Dictionary

Section 3  POLYGONAL APPROXIMATION
    3.1  Feature Generation
    3.2  Classification
    3.3  Performance

Section 4  RECOGNITION OF CONTOUR-TRACED CHARACTERS
    4.1  Feature Extraction
    4.2  Contour Tracing Algorithm
    4.3  Classification Algorithm
    4.4  Performance

Section 5  SYNTACTIC CHARACTER RECOGNITION
    5.1  The Grammar
    5.2  Output Of The First Parser
    5.3  Recognition Algorithm
    5.4  Performance

Section 6  CHARACTER RECOGNITION SIMULATING VISUAL NERVOUS SYSTEM
    6.1  Information Processing in the Visual System
    6.2  Mathematical Model
    6.3  Design Method For Property Filters
    6.4  Character Recognition Model
    6.5  Property Filter
    6.6  Property Discriminator
    6.7  Pattern Classifier
    6.8  Performance

Section 7  CONCLUSIONS AND DISCUSSION

REFERENCES

APPENDIX
TABLE OF FIGURES

FIGURE 1.1  Gradient Picture
FIGURE 1.2  Simple Template
FIGURE 1.3  Modified Template

FIGURE 2.1  Concavity And Convexity Emphasis
FIGURE 2.2  Convex And Concave Segments

FIGURE 3.1  Definition of ωi
FIGURE 3.2  Definition of Location Zones
FIGURE 3.3  Classification Tree (a), (b)
FIGURE 3.4  Characters for Classification Tree
TABLE 3.1   Topological Information

FIGURE 4.1  Contour Tracing
FIGURE 4.2  CODE & COORD Words
FIGURE 4.3  Closing Breaks

FIGURE 5.1  Data Base Numerals
FIGURE 5.2  Definition of Entry
FIGURE 5.3  Definition of Arc
FIGURE 5.4  Definition of Location Attributes
FIGURE 5.5  First Parser Encoding of Numerals

FIGURE 6.1  Human Visual Nervous System
FIGURE 6.2  LI Structure
FIGURE 6.3  Responses of LI Model
FIGURE 6.4  Property Filter Using LI Structure
FIGURE 6.5  Character Extracted By LI Structure
FIGURE 6.6  Pattern Recognition System
TABLE 6.1   Properties Used In Recognition
FIGURE 6.7  Nerve Connection Coefficients
ABSTRACT

CHARACTER RECOGNITION TECHNIQUES

by

Saiful I. Ahmad

MASTER OF SCIENCE IN ENGINEERING

The improvements in image processing techniques and the effectiveness of computers in processing and recognizing images have expanded the applications of image processing in industry. Character recognition is a part of image processing and recognition in which a computer is able to recognize characters (A to Z and 0 to 9). In this project several techniques of character recognition are described and their performance discussed. Computer subroutines have been developed for several of these techniques. Experimental results have verified the general character and effectiveness of these techniques in character recognition.
INTRODUCTION

The ability of computers to recognize objects has been incorporated into many applications. Character recognition can eliminate the step of inputting information through the keyboard, thus saving time and labour. Direct input can also reduce human error.

Character recognition has been a difficult problem because of the diversity in handwriting, type styles, etc. If a computer is trained to read one kind of script or style, it will not be able to recognize another type. Since there is an enormous variety of characters, at the present time recognition ability is limited to characters of a specific size and type.

Another problem in character recognition is the amount of time taken by an algorithm to process the information. This makes the process slow as compared to the way human beings do it.

There are several different techniques that are used in character recognition. They vary from template matching to contour tracing, to simulating the human visual nervous system. These and several other techniques will be discussed in the rest of the sections.
SECTION 1

TEMPLATE MATCHING

The template matching technique involves the process of determining if a scene contains the picture of a previously specified object. In character recognition, template matching is very useful; it is a simple way of finding whether a certain character or a set of characters exists in a scene being processed by the computer.

Template matching can be performed on both binary and gray level pictures. A binary picture contains only black (0) and white (1) elements, whereas gray level pictures contain various levels or shades of gray, including black and white. Template matching is first considered within the context of binary pictures, and then extended to the general case.
1.1 Gradient Picture

Before proceeding to template matching, there is a need to define the term "gradient picture". The process of obtaining the gradient picture involves spatial differentiation, edge enhancement, sharpening, and taking the gradient. The magnitude of the gradient at a point (i,j) can be approximated by

    ||∇g(i,j)|| ≈ R(i,j)
               = {[g(i,j) - g(i+1,j+1)]^2 + [g(i,j+1) - g(i+1,j)]^2}^(1/2)    (1)

The picture in Figure 1.1 shows a thresholded gradient picture. Thresholding a picture involves assigning a value of 1, for example, to a certain range of numbers or gray levels and assigning the rest of the numbers or gray levels a value of 0.
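Equation (1) and the thresholding step can be sketched directly in NumPy. This is a minimal illustration rather than code from the project; the function names and the inclusive threshold range are assumptions.

```python
import numpy as np

def roberts_gradient(g):
    """Approximate the gradient magnitude of equation (1):
    R(i,j) = {[g(i,j)-g(i+1,j+1)]^2 + [g(i,j+1)-g(i+1,j)]^2}^(1/2)."""
    g = g.astype(float)
    d1 = g[:-1, :-1] - g[1:, 1:]   # g(i,j) - g(i+1,j+1)
    d2 = g[:-1, 1:] - g[1:, :-1]   # g(i,j+1) - g(i+1,j)
    return np.sqrt(d1 ** 2 + d2 ** 2)

def threshold(picture, low, high):
    """Assign 1 to values in the range [low, high] and 0 to the rest."""
    return ((picture >= low) & (picture <= high)).astype(int)
```

Applying `threshold(roberts_gradient(g), low, high)` to a gray level picture yields a binary gradient picture like the one in Figure 1.1.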
1.2 Recognition System
Using the
Gradient picture in Figure 1.2, that shows the
outlines of various objects, and it is desirable to find
out if a
certain character appears in the picture. In
order to perform template matching template or mask of
the character as shown in Figure 1.2 is constructed. Then
scan
the
template
systematically
across
the
entire
picture. If a position for which the hole in the template
was filled with white is found,
letter
"0"
has
been found.
it is
assumed that
The draw back of
this
procedure is that any sufficiently large white region
will
fill
the hole in the template thus erroneously
giving the existence of the letter 0.
FIGURE 1.1 Gradient picture

This drawback of template matching can be overcome by not merely looking for a white region that fills the template, but for a white circular region surrounded by a black region. Figure 1.3 shows how this can be done with a template. Using this modified template, the letter 0 is considered to exist if each template region covers an area of the picture whose gray values correspond to the template label. In other words, the 0 (zero) regions of the template must show only gray values of 0, and the 1 region must show only gray values of 1. Note that the template in Figure 1.3 is itself a binary picture.

The size of the template is typically smaller than the size of the original picture, since the objective of template matching is to discover the presence of a "sub-picture" within the given picture. Mathematically, the domain of definition of the template is smaller than the domain of definition of the original picture.
1.3 Error Calculation

In most practical applications, a perfect template match is not possible. A better procedure is to consider a measure of how well a portion of the picture matches the template. One way is to consider the error in matching the template to various regions and to choose as the best match the region with the least amount of error which satisfies a previously defined threshold.

FIGURE 1.2 A simple template

FIGURE 1.3 Modified template

Define:

    g(i,j)  the digital picture
    t(i,j)  the template
    D       the domain of definition of the template

The domain of definition of the template is usually smaller than that of the picture. For example, the picture could be 256 x 256 pixels (picture elements) and the template 8 x 8 pixels. The measure of how well a portion of the picture matches the template can be defined as

    M(m,n) = Σ |g(i,j) - t(i-m, j-n)|    (2)

where the sum is over all i,j such that (i-m, j-n) is in D. This equation compares each template element t(i-m, j-n) at the match position (m,n) to the element of the picture superimposed by the template, and sets M(m,n) equal to the sum of the absolute differences. The value of M(m,n) for all template translations (m,n) would have to be computed, and the location for which the M(m,n) value is the smallest is noted as a possible match position.
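The error measure of equation (2) amounts to a sum of absolute differences computed at every template translation. A brute-force sketch follows; the function names are assumptions, and a real system would restrict the search region.

```python
import numpy as np

def match_error(g, t):
    """Equation (2): M(m,n) = sum of |g(i,j) - t(i-m, j-n)| over the
    template's domain of definition D, for every translation (m,n)."""
    H, W = g.shape
    h, w = t.shape
    M = np.empty((H - h + 1, W - w + 1))
    for m in range(M.shape[0]):
        for n in range(M.shape[1]):
            window = g[m:m + h, n:n + w]
            M[m, n] = np.abs(window - t).sum()
    return M

def best_match(g, t, thresh):
    """Report the translation with the smallest error, provided it
    satisfies the previously defined threshold."""
    M = match_error(g, t)
    m, n = np.unravel_index(np.argmin(M), M.shape)
    return (m, n) if M[m, n] <= thresh else None
```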
1.4 Similarity Measure

In terms of picture functions, the template matching process looks for a region of the picture plane such that the picture function in that region is similar to a previously specified picture function called the template. We need to measure the similarity between the template and the region of the picture plane. The Euclidean distance is a good measure to use for similarity; it finds the distance (or similarity) between the two picture functions. In general, the distance can be measured in the p-norm or Lp as:

    I(m,n) = {Σi Σj [g(i,j) - t(i-m, j-n)]^p}^(1/p)    (3)

Equation (2) is the distance measured in the L1 norm. The distance can also be measured in the L2 or 2-norm as:

    E(m,n) = {Σi Σj [g(i,j) - t(i-m, j-n)]^2}^(1/2)    (4)

Another way of measuring the distance is the ∞-norm or L∞:

    S(m,n) = max over i,j of |g(i,j) - t(i-m, j-n)|    (5)

From equation (4), by squaring both sides we get,
    E^2(m,n) = Σ [g^2(i,j) - 2g(i,j)t(i-m,j-n) + t^2(i-m,j-n)]    (6)

where the summation is over all i and j such that the arguments of "t" remain within the domain of definition. As the template is moved around the picture by varying m and n, the summation of the last term is unchanged, since for any m and n the range of the arguments of t^2 is the domain of definition of "t". The summation of the first term, called the picture energy in the window, generally varies with (m,n), since the range of i and j varies with m and n. If the picture energy variation is small, the term can be ignored; then E^2(m,n) will be small when the second term of the equation is large. Therefore, we can define the cross-correlation between the two picture functions g and t by

    Rgt(m,n) = Σi Σj g(i,j) t(i-m, j-n)    (7)

This definition of cross-correlation can be used as the measure of similarity between the template and the picture region in the vicinity of (m,n). The template and the picture region can be called similar if the cross-correlation is larger than a pre-determined threshold.
If the translated template falls on a region of the picture that is all white, the cross-correlation will be high, giving an erroneous measure of similarity. This can be resolved by computing the normalized cross-correlation:

    Ngt(m,n) = Rgt(m,n) / {Σi Σj g^2(i,j)}^(1/2)    (8)

where the summation is over the window covered by the template. The normalized cross-correlation has its maximum value when the template matches the underlying picture function.
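A sketch of this normalization: the cross-correlation of equation (7) is divided by the square root of the picture energy in the window, so a uniformly white region no longer scores spuriously high. The function name and the zero-energy guard are assumptions.

```python
import numpy as np

def normalized_cross_correlation(g, t):
    """Equation (7) normalized by the picture energy in the window."""
    h, w = t.shape
    out = np.zeros((g.shape[0] - h + 1, g.shape[1] - w + 1))
    for m in range(out.shape[0]):
        for n in range(out.shape[1]):
            window = g[m:m + h, n:n + w].astype(float)
            energy = np.sqrt((window ** 2).sum())   # picture energy term
            if energy > 0:
                out[m, n] = (window * t).sum() / energy
    return out
```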
Aside from the need for some measure of similarity, the selection of the template is also very important. The character being looked for can appear anywhere in the picture, so a family of templates may be required to anticipate various matching conditions. Since the templates have to be scanned across the whole picture, the computations, and thus the time needed, will be quite large. This problem can be solved by using a set of local templates instead of a global template. The local templates are designed to match specific characteristics of the objects. The advantage of using local templates is that an individual characteristic is simpler to detect than the entire object.
SECTION 2

CONVEX AND CONCAVE FEATURES

In many character recognition systems, extracted features of alphanumeric characters are used for the recognition. Convexity and concavity are two such features. The contour of a character is examined by the system and is divided into convex and concave segments by the difference of edge direction. Each character can also be categorized by such features as angular distribution, center of gravity, degree of concavity, and the opening direction of the segments. A dictionary of these attributes is used to match the characters.
2.1 Feature Extraction

The concave and convex features of a character are extracted by scanning the character. The concave features are emphasized by the outermost points, while the convex features are restricted at the endpoints of the branches, as shown in Figure 2.1a. Figure 2.1b shows the convex features emphasized while the concave features are restricted at cross points. Both convex and concave emphasized features can be extracted synchronously, as shown in Figure 2.1c.

FIGURE 2.1 Concavity and convexity emphasis

FIGURE 2.2 Convex and concave segments

Following is the procedure for feature extraction. The starting point ST (Figure 2.2) of each contour is found by scanning from the top left down to the bottom right of the input pattern. Then the following procedure is used to obtain the convex and concave segments:
1)  Trace the contour with a clockwise motion from the starting point ST. When the condition θi - θe > S1 is met at a point i from the starting point ST or before CS, then the point i is the starting point PS of a convex segment. When the condition θm - θi > S2 is met at a point i from PS, then the point i is the starting point CS of a concave segment. Repeat the above steps until the point i is found.

2)  Next, trace the contour with a counter-clockwise motion. When the condition θm - θi > S3 is met at a point i from PS, then this point i is the endpoint PE of a convex segment. Here the method collects a set of features for the convex segment that has been accumulated from PS to PE. When the condition θi - θe > S4 is true at a point i from CS, then this point i becomes the endpoint CE of a concave segment. Here the method collects a set of features for the concave segment that has been accumulated from CS to CE. Step 2 is repeated until the contour has been completely traversed.
θi is the angle perpendicular to the tracing direction at the point i of the contour of the character. The angle is measured counter-clockwise relative to the horizontal. θm is the largest angle between the point i-1 and the starting point ST, CS or PS. θe is the minimum angle between the point i-1 and the starting point ST, CS or PS. S1, S2, S3, and S4 are positive constants.
Concave segments from CS to CE and convex segments from PS to PE are extracted. When the method is applied to the character "L", the result shown in Figure 2.2 is obtained. PS is covered by PE when the starting point PS is in the same position as the endpoint PE of the last convex segment.
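The boundary tests of the procedure above can be sketched on a one-dimensional list of contour angles. This toy version only reports where segments open; tracking θe and θm as running extremes since the last boundary, and resetting them after each boundary, are assumptions about details the text leaves open.

```python
def segment_boundaries(angles, s1, s2):
    """Walk the contour angles theta_i, keeping the running minimum
    theta_e and maximum theta_m.  A rise of more than s1 above theta_e
    opens a convex segment (PS); a drop of more than s2 below theta_m
    opens a concave segment (CS)."""
    boundaries = []
    theta_e = theta_m = angles[0]
    for i, theta in enumerate(angles[1:], start=1):
        theta_e = min(theta_e, theta)
        theta_m = max(theta_m, theta)
        if theta - theta_e > s1:
            boundaries.append(('PS', i))
            theta_e = theta_m = theta   # restart the extremes (assumed)
        elif theta_m - theta > s2:
            boundaries.append(('CS', i))
            theta_e = theta_m = theta
    return boundaries
```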
2.2 Features

The following nine features are used to describe each convex or concave segment:

1)  C is the index of convexity, concavity or "hole" segment.

2)  G(Gx,Gy) is the center of gravity of the segment.

3)  L1 is the length of the segment along the contour from the starting point CS(Sx,Sy) (or PS on a convexity) to the endpoint CE(Ex,Ey) (or PE on a convexity) of the segment.

4)  L2 is the straight line distance between CS and CE.

5)  The degree of concavity L0 is defined by equation (9) as the minimum distance between the center point G and the line from CS to CE, scaled by the constant CB.

6)  The directional angle AG is the angle of the direction perpendicular to the line CS->CE.

7)  D12 is the ratio of L2 to L1.

8)  L12 is the difference between L1 and L2.

9)  H(r) is the angular distribution of θi for the segment. These angles are also quantized in eight directions.
2.3 Matching

Each template is composed of an upper limit Ui and a lower limit Li for each feature i of each segment. The distance di of each feature i and the distance jD of mask j are calculated as follows:

    If Li - ELi <= Ii <= Ui + EUi then
        di = max(Ii - Ui, Li - Ii, 0)    (10)
    otherwise di is set to a large penalty value.

    jD = Σ Wi * di,  i = 1 to M    (11)

where Ii is the input value of feature i, and EUi and ELi are positive constants. Wi is the weight of axis i, and M is the number of features. The computation results in finding the smallest of the jD values.
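Equations (10) and (11) can be sketched as follows; treating an out-of-range feature as an infinite distance is an assumption, since the original penalty value is not legible in the source.

```python
def feature_distances(inputs, lower, upper, el, eu):
    """Equation (10): per-feature distance d_i, zero when the input
    lies inside the template limits [L_i, U_i]."""
    ds = []
    for I, L, U, eL, eU in zip(inputs, lower, upper, el, eu):
        if L - eL <= I <= U + eU:
            ds.append(max(I - U, L - I, 0))
        else:
            ds.append(float('inf'))   # assumed out-of-range penalty
    return ds

def template_distance(ds, weights):
    """Equation (11): the weighted sum jD over the M features."""
    return sum(w * d for w, d in zip(weights, ds))
```

The template whose jD value is smallest is reported as the match.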
2.4 Dictionary

Each template is made from training data corresponding to each category. Each category is divided into sub-categories according to the grouping parameters (α, β, φ, UΘ), where α is the number of disconnected strokes, β is the number of holes, φ is the number of curve segments, and UΘ is the set of angles of the open directions of the concavities. Ui and Li are the upper and lower values of the sub-category. A template of sub-category j is composed as follows:

    (Ui, Li)  for i = 1 to M    (12)

Following is an example of a template contained in the dictionary for α = 2, β = 0 and φ = 3:
    N  C  CX  CY  L1  L2  AC  LO  DV  SA  H1  H2  H3  H4  H5  H6  H7  H8
    1  1  27  39  47   2  31  36   2  46  56  37  28  17  16  16  51  33
       1  17  31  21   1   0  25   0  21  17  19   5   3   1   1   0   5

Character recognition using the concave and convex features has a very high recognition rate (97.2%). But the templates generated for the dictionary have many multiples. This is due to the instability of the convex and concave features of characters when they come from different sources. The multiple templates can be averaged and stored in the dictionary to reduce memory space.
SECTION 3

POLYGONAL APPROXIMATION

Polygonal boundary approximation is used for shape recognition of characters in general and handwritten characters in particular. This process performs numerical data pre-processing followed by an analysis of the reduced data. Polygonal approximation is used to search for curvature extrema on the basis of the theory of splines with variable knots. According to this theory, corners in polygonal approximations are likely to occur near maxima of the second derivative, which often coincide with the curvature maxima. If the approximation is piecewise linear, then the minimization of the least square error will locate the vertices usually at or near curvature maxima with continuity and, in a few cases, at or near zeros of curvature or the inflection points without continuity.

3.1 Feature Generation

The polygonal approximation method relies on the concavity information of characters. The first step is the identification of the boundary points by a tracing algorithm. During this process the number of holes in the character is determined. Other parameters are also evaluated which indicate broken lines and 'filled' holes.
An ordered list of the boundary is obtained and a piecewise linear approximation of the external boundary is performed. The output of this step is a list of linear segments, each described by the three parameters

    φi, di, and Ri    (13)

where R(i-1) < x <= Ri, i = 1, 2, 3, ..., n, and R0 = Rn for a closed boundary. φi is the angle with the horizontal, di is the distance from the origin, and Ri is the abscissa of the right endpoint of the ith linear segment. The right endpoint of the ith segment is also taken as the location of the ith vertex, while the supplement of the angle at this vertex is given by:

    ωi = π - φ(i+1) + φi    (14)

as shown in Figure 3.1. If ωi < π the angle is concave, while if ωi > π the angle is convex.

A concave arc is an uninterrupted sequence of concave vertices. The number of these arcs, their locations, and their parameters define the main features used in character classification.

FIGURE 3.1 Definition of ωi

FIGURE 3.2 Definition of location zones
Polygonal boundaries are scanned in the clockwise direction to distinguish the left and the right arc of a character. In Figure 3.2, BCDEFGHJK is the left arc and MNPSQRTUV is the right arc. The uppermost and the lowermost points are convex vertices. The distance between the uppermost and the lowermost points is divided into zones, as shown in Figure 3.2. The location of a concave arc is described by the average of the y coordinates of its vertices.

Aside from the concavity, the presence and number as well as the location of holes are also used as classifying features. The location of holes is described by the uppermost (h1) and the lowermost (h2) y coordinates.
Following are the features used in the recognition of characters by polygonal approximation:

A. Integer Variables:

u1 = Number of holes, computed during the tracing algorithm.

u2 = Number of 'filled' holes, i.e., the number of regions of approximately square shape within the character whose area exceeds 1/12 of the matrix size, computed during tracing.

u3 = Hole description variable, taking the value 0, 1, or 2.
     u3 = 1 if 1/2(h1 + h2) is in zones 6 or 7 and h1 is in zones 5, 6 or 7 (low hole).
     u3 = 2 if 1/2(h1 + h2) is in zone 1 and h2 is in zones 3, 4, or 5 (high hole).
     u3 = 0 otherwise.

u4 = Number of concave arcs on the external boundary.

u5 = Number of concave arcs on the left side of the character.

u6, u7, u8 = Distance from the bottom of the character of the first, second, and third concave arcs respectively.

u9 = Length of the outside polygonal boundary.

u10 = Length of the outside polygonal boundary from the top to the first concave vertex.

u11, u12 = h1, h2 of the uppermost or the only hole.

u13, u14 = h1, h2 of the lowermost hole, used if u1 + u2 = 2.
B. Logical Variables:

v1 = True, if there is an indication of a broken loop at the top of the character, computed during tracing.

v2 = True, if an 'upward facing' concave arc lies in zones 3-6.

v3 = True, if a vertical line of length greater than Y/3 is encountered before or at the first concave angle.

v4 = True, if the vertical projection extending from the top vertex on the left side is greater than Y/3 or if the first concave angle exceeds 135°.

v5 = True, if the first concave arc occurs on the left side and in zones 3 or 4 and the angle ω of the first following segment exceeds 120°, or if the first concave arc consists of two angles lying entirely in zone 8.

v6 = True, if the first concavity occurs on the right side in zones 5, 6, or 7.

v7 = True, if the vertical projection of the first segment exceeds Y/3.

v8 = True, if before any concave arc, a very sharp convex angle of less than 45° is encountered.

It should be observed that most of the logical variables are classification functions. For example, v4 and v5 are used for the separation of the numerals 1 and 7.
From Figure 3.2, the character has the following features:

    u1 u2 u3 u4 u5 u6 u7 u8 u9 | v1 v2 v3 v4 v5 v6 v7 v8
     1  0  1  3  1  4  7  6  0 |  F  F  F  F  F  F  F  F

    u10 = Length of polygonal arc ABCDE.
3.2 Classification

The geometric diversity in shapes of the various numerals is such that linear classification is not practical. Piecewise linear classification is preferable, where the feature space is divided into linear regions, but with each class occupying more than one region. Piecewise linear classification can be implemented by a decision tree where the various features are examined sequentially. This decision tree can be set up beforehand on the basis of the intuitive description of the characters. It is then tested on a design set, where the following modifications can be made: 1) optimize the parameters at each node, which corresponds to linear classification; 2) add new nodes by selecting new features. Figure 3.3 shows the final decision tree, with boxes denoting classes. Figure 3.4 shows 'archetypes' of each subclass. The short branches of the tree were conceived beforehand, while most of the long branches, which involve more features, were developed during the tests with the design set.
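A decision tree of this kind examines one feature per node. The fragment below is purely illustrative (the real tree of Figure 3.3 is much larger); only the use of v4 to separate the numerals 1 and 7 is taken from the text, and the thresholds and group names are assumptions.

```python
def classify(f):
    """Toy piecewise linear decision tree over the features of 3.1.
    f is a dict holding the integer variables u1, u4 and the logical
    variable v4."""
    if f['u1'] == 0:            # no holes in the character
        if f['u4'] == 0:        # no concave arcs on the boundary
            return '1' if f['v4'] else '7'   # v4 separates 1 from 7
        return 'open group'     # hypothetical non-leaf outcome
    return 'hole group'         # e.g. characters such as 0, 8, 9
```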
FIGURE 3.3 Classification Tree (a), (b)

FIGURE 3.4 Typical characters for each leaf of classification tree
3.3 Performance

The polygonal approximation method using the decision tree, when tested, gave a maximum error of 14.7% (Ali and Pavlidis). Most of the error was caused by the non-stationarity of the data base. The samples from different sources are clustered, and there exists a wide variation from one source to another.

Another method of classification is the use of the decision tree together with separating hyperplanes. The variables u1, u4, u5, and v1, which describe major topological information, were used to separate the characters into groups, as shown in Table 3.1. For each group, about five integer and logical variables were chosen as features to be used with the linear classifier. One third of the characters were assigned beforehand, and about half of these in each group were used as a design set to determine the separating hyperplanes (using the incremental training rule). This classification method produced an error of about 15.9% (Ali and Pavlidis).

The polygonal approximation technique was found to be a powerful tool in character recognition. The integer variables (ui) are easy to form from the approximation. The logical variables require slightly more work. The decision tree is rather difficult to construct because most of it has to be constructed beforehand and intuitively. The error rate is a little higher than that of some other types of classifiers (Ali and Pavlidis).
TABLE 3.1 MAJOR CLASSES FOUND IN EACH GROUP

    GROUP  u1  MAJOR CLASSES FOUND        TOTAL
      1     0  1(46)                        47
      2     0  6(11), 7(45)                 59
      3     0  2(36), 4(9), 5(46), 9(5)     98
      4     0  3(13), 4(10)                 25
      5     0  4(2)                          4
      6     0  3(35), 4(8)                  47
      7     0  4(7)                          7
      8     1  0(8)                          8
      9     1  --                            1
     10     1  6(34), 8(7), 9(39)           84
     11     1  2(5), 4(3), 8(3), 9(3)       18
     12     1  8(2)                          2
     13     1  2(4), 4(6), 8(3)             14
     14     2  8(33), 0(40)                 76

The numbers in parentheses denote the number of samples per class; 200 samples in all were used in the design set.
SECTION 4

RECOGNITION OF CONTOUR-TRACED CHARACTERS

The contour tracing technique was first introduced by Clemens and Mason. Several modifications have been made to this technique, and it has been used with different classifiers for character recognition. This section deals with the recognition of contour-traced characters and the contour tracing algorithm.

4.1 Feature Extraction

Feature extraction and classification are common to all pattern recognition schemes. Let the features from a pattern Ci be described by a vector X = (x1, x2, ..., xD). If P(Ci) is the probability of Ci and P(X/Ci) the probability of X conditioned on Ci, then the probability of misclassification is minimized by choosing i to maximize any monotonic increasing function of

    Ri = P(X/Ci)P(Ci) = P(x1, x2, ..., xD/Ci)P(Ci)    (15)

In most character recognition schemes, the dimensionality D is large, typically greater than 50. Even if the components of X are binary, learning the statistics of the 2^D different probabilities P(X/Ci) for each i requires many training samples and large storage. However, when the D components xk of X are statistically independent, then

    P(X/Ci) = Π (k = 1 to D) P(xk/Ci)    (16)

in which case it is necessary to learn and store only the D component probabilities P(xk/Ci).
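Under the independence assumption of equation (16), the classifier only needs the component probabilities P(xk/Ci), and the product is conveniently computed as a sum of logarithms. A sketch follows; the function name and data structures are assumptions.

```python
import math

def classify_bayes(x, priors, component_probs):
    """Score each class by ln P(Ci) + sum_k ln P(x_k | Ci), i.e.
    equations (15)-(16) in log form, and return the maximizing class.
    component_probs[c][k] holds P(x_k = 1 | c) for binary features."""
    scores = {}
    for c, prior in priors.items():
        s = math.log(prior)
        for k, xk in enumerate(x):
            p1 = component_probs[c][k]
            s += math.log(p1 if xk == 1 else 1.0 - p1)
        scores[c] = s
    return max(scores, key=scores.get)
```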
The feature extraction algorithm design is based on two observations. First, reasonably legible Roman characters are recognizable solely on the basis of their external contour. Second, variation between characters is not random, but highly structured. Accordingly, a character's outside contour is used to generate a binary vector X which is then classified either as a single character, or as one of a group of characters not easily distinguished solely on the basis of X. The transformation of a character into the binary vector can easily be implemented.
4.2 Contour Tracing Algorithm

The character whose contours are to be transformed into vectors is traced by a 'scanning spot' applied to the outside contours. The scanning spot moves, point by point, from the bottom to the top of the leftmost column, and successively repeats this procedure on the column immediately to the right of the column previously scanned, until the first black point is found. Upon locating this point the scanner enters the CONTOUR mode. In this mode, the scanning spot moves right after encountering a white point and left after a black point. The contour mode terminates when the scanning spot completes its trace around the outside of the character and returns to its starting point. Figure 4.1 illustrates the contour tracing algorithm graphically.

FIGURE 4.1 Contour tracing, start at S

FIGURE 4.2 CODE & COORD words

After the character is scanned, it is divided into either four or six equal sized rectangles, whose size depends on the height and the width of the letter, as shown in Figure 4.2. A y threshold equal to one half of each rectangle's height and an x threshold equal to one half of each rectangle's width are defined. Whenever the x coordinate of the scanning spot reaches a local extremum and moves in the opposite direction to a point one threshold away from the resulting extremum, the resulting point is designated as either an xmax or xmin. After an xmax (xmin) has occurred, no additional xmax's (xmin's) are recorded until after an xmin (xmax) has occurred. In a similar manner, the ymax's and ymin's are recorded. The starting point of the contour mode is regarded as an xmin.
The CODE word for a character consists of a 1 followed by binary digits whose order coincides with the order in which extrema occur during the contour tracing. A 1 denotes a max or min in x, while a 0 denotes a max or min in y. The rectangles are designated by binary numbers, and the ordering of these numbers, according to the rectangles in which the extrema fall in event sequence, constitutes the COORD word. The rectangles are numbered in such a way that rectangles far away from one another differ in more binary digits than do rectangles close to each other. The feature vector consists of the CODE word followed by the COORD word. Figure 4.2 shows the CODE and COORD words for the letter 'c'.
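The CODE word construction can be sketched from an ordered list of detected extrema. Representing each extremum as an (axis, kind) pair and suppressing same-kind repeats per axis are assumptions about bookkeeping the text describes only in prose.

```python
def code_word(extrema):
    """Build the CODE word from an ordered list of (axis, kind) pairs,
    axis in {'x', 'y'}, kind in {'max', 'min'}: a leading 1, then 1 for
    each x extremum and 0 for each y extremum.  Within each axis,
    successive extrema of the same kind are not recorded, so max's and
    min's alternate (the contour-mode start counts as an x min)."""
    bits = ['1']
    last = {'x': 'min', 'y': None}   # last recorded kind per axis
    for axis, kind in extrema:
        if kind == last[axis]:
            continue                 # same-kind repeat: not recorded
        last[axis] = kind
        bits.append('1' if axis == 'x' else '0')
    return ''.join(bits)
```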
4.3 Classification Algorithm

In classification algorithm S, P(X/Ci) is estimated from the data, and unknown characters are then classified by choosing i to maximize Ri in equation (15). In algorithm T, which follows from equations (15) and (16) when these equations are suitably modified to account for the fact that the length D of the feature vector is variable, i is chosen to maximize Ti such that:

    Ti = Σ (k = 1 to D) ln P(xk/Ci, D) + ln P(D/Ci) + ln P(Ci)    (17)
In
algorithm
U and
(1 7)
V,
which are minimum decoding
distance algorithms, i is chosen to minimize :
1:>
~lxk - P(xk = 1/Ci, D)\
U·1 =
V·1
- ln P(Ci)
b
=£
- P(xk = 1/Ci, D)\ 2 - ln P(Ci)
IC: I \xk
Probability P(xk
= 1/Ci,
D~
( 18)
(19) •
is equal to the mean of xk
• •••'1..
conditioned on Ci and D.
When all characters are equiprobable and when the xk's are binary, then Ui is identical to the first term of Ti in equation 3 after taking the anti-logarithm of all remaining terms. It follows that Ui will never yield a lower error probability than T when the characters are equiprobable and the binary vector components xk are statistically independent. Algorithms U and V are optimum under some conditions.
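The decision rules for T, U, and V can be sketched as follows, assuming the probability tables have already been estimated from training data. The nested-dictionary layout and the toy numbers below are illustrative assumptions, not the data structures of Donaldson and Toussaint:

```python
import math

# Assumed table layout:
#   p_xk[i][D][k] = P(xk = 1 / Ci, D), p_d[i][D] = P(D / Ci), p_c[i] = P(Ci)

def t_score(x, i, p_xk, p_d, p_c):
    """Algorithm T (eq. 17): log-likelihood of the binary vector x."""
    D = len(x)
    s = sum(math.log(p_xk[i][D][k] if x[k] else 1.0 - p_xk[i][D][k])
            for k in range(D))
    return s + math.log(p_d[i][D]) + math.log(p_c[i])

def u_score(x, i, p_xk, p_c):
    """Algorithm U (eq. 18): minimum decoding distance, absolute value."""
    D = len(x)
    return sum(abs(x[k] - p_xk[i][D][k]) for k in range(D)) - math.log(p_c[i])

def v_score(x, i, p_xk, p_c):
    """Algorithm V (eq. 19): minimum decoding distance, squared."""
    D = len(x)
    return sum((x[k] - p_xk[i][D][k]) ** 2 for k in range(D)) - math.log(p_c[i])

# Two equiprobable toy classes, vectors of length D = 3:
p_xk = {0: {3: [0.9, 0.1, 0.9]}, 1: {3: [0.1, 0.9, 0.1]}}
p_d = {0: {3: 1.0}, 1: {3: 1.0}}
p_c = {0: 0.5, 1: 0.5}
x = [1, 0, 1]
best_t = max((0, 1), key=lambda i: t_score(x, i, p_xk, p_d, p_c))
best_u = min((0, 1), key=lambda i: u_score(x, i, p_xk, p_c))
print(best_t, best_u)   # 0 0
```

Note that T is maximized while U and V are minimized, which is why the toy example uses `max` for the first and `min` for the second.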
Let M be the maximum value of D. In algorithms W and Y, vectors of length D < M were made equal to M by setting xk = 0 for D < k <= M. Index i was chosen to maximize Wi and minimize Yi, where:

Wi = Σ ln P(xk/Ci) + ln P(Ci) ,   k = 1, ..., M    (20)

Yi = Σ |xk - P(xk = 1/Ci)| - ln P(Ci) ,   k = 1, ..., M    (21)
Probabilities P(X/Ci), P(xk/Ci), and P(xk/Ci, D) were learned by determining the relative number of times a vector X or component xk occurred, given the event Ci or the joint event Ci, D.
4.4 Performance

Algorithm S yielded the lowest probability of error at 23% (Donaldson and Toussaint). When the training and test data are identical, the difference between S and T is small. Algorithms T, U, and V all performed much better than S when the training and test data were disjoint. Results for T versus W and for U versus Y show that making all vectors equal in length by adding zeros made recognition more difficult.
In addition to yielding a feature vector of relatively
low dimensionality, the recognition scheme is insensitive
to character size, position and line thickness. It is
also relatively insensitive to variation in character
style. Touching characters are difficult to recognize by this method.

FIGURE 4.3 Closing breaks
Broken letters are also difficult to
recognize as shown in Figure 4.3. One corrective measure
is to close small breaks by blackening all white
squares
adjacent to a black square. A second alternative is to
move a bar of length b along the outside contour, keeping
one end of the bar against the contour and maintaining an
angle of 90° between the bar and the line tangent to the
contour at the bar-contour point of contact. Whenever the
other end of the bar encounters black, the line defined
by that portion of the bar between the two black points
is made a part of the character as shown in Figure 4.3.
The bar must be long enough to close most breaks but
short enough that intentional breaks are retained.
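The first corrective measure, blackening every white square adjacent to a black square, amounts to a one-step morphological dilation. A minimal sketch, assuming a 0/1 grid representation of the picture:

```python
def close_breaks(img):
    """Blacken every white square adjacent to a black square.
    img is a list of lists with 1 = black, 0 = white (assumed layout)."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]       # copy so tests are against img only
    for r in range(rows):
        for c in range(cols):
            if img[r][c] == 0:
                # Check the 8-neighbourhood for a black square.
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (dr or dc) and 0 <= rr < rows \
                                and 0 <= cc < cols and img[rr][cc] == 1:
                            out[r][c] = 1
    return out

broken = [[1, 1, 0, 1, 1]]    # a horizontal stroke with a one-pixel break
print(close_breaks(broken))   # [[1, 1, 1, 1, 1]]
```

Breaks wider than two pixels survive a single pass, which mirrors the text's caveat: the closing operation (like the bar of length b) must be large enough to close most breaks but small enough that intentional gaps are retained.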
SECTION 5
SYNTACTIC CHARACTER RECOGNITION
A classical approach to automatic character recognition is to fit a variety of templates upon the letters and numerals in order to select the template and class which fits best. A better approach is not to consider the observed patterns as a whole but to use the structure of the pattern. This approach describes the patterns in terms of the constituent basic elements and their relationships. Formal grammars are considered a useful procedure for this structural approach.
Polygonal approximation has been considered in an earlier section; in this section the approach is extended by using a two-level syntactic analysis of the polygonal approximation instead of the empirical features. Syntactic techniques have been widely used; however, this method of description of boundary plane objects has two disadvantages: i) the direct syntactic analysis is faced with the need to handle the effects of noise, which causes complications in developing strings; ii) the parsing (analysing grammatically and syntactically) of the whole boundary requires use of the very difficult context-sensitive grammar.

FIGURE 5.1 Data base numerals

The use of polygonal approximations as a preprocessing technique and the application of syntactic techniques in a hierarchical manner overcomes both problems. Figure 5.1 shows some of the data base after the application of polygonal approximation.
The polygonal approximations are parsed in two steps. In the first step a general shape parser, which is applicable to any type of plane contour, is used.
It is based on a
grammar which is described below. The second step uses a
set of finite automata, each one accepting strings from
exactly one class only. These automata are expressed in
terms of regular expressions which are described in the
recognition algorithm.
5.1 The Grammar
The shape grammar assumes that the boundaries of objects of interest consist of concatenations of entities; a formal name is assigned to each one of them for convenience.

* Arcs which can be approximated by a quadratic curve are called QUAD.
* Sharp protrusions or intrusions are called TRUS.
* Long linear segments are called LINE.
* Short segments having no regular shape are called BREAK.
The polygonal approximations encode the boundary in a
sequence of vectors X, which are considered the terminals
of the grammar. A QUAD is approximated by a subsequence
where both the length and the vertex angle are more or less uniform. A LINE is either a single vector or a set of mutually almost collinear vectors. A BREAK is approximated by one or two vectors of very short length. It is also assumed that a TRUS always consists of two LINEs, either separated by a BREAK or in direct sequence. If the LINEs are close to being parallel, then the TRUS is called a STROKE; otherwise it is called a CORNER.
Formally, the production rules include semantics and are listed below:

BDY    -> BDY + TRUS | BDY + QUAD | BDY + LINE | BDY + BREAK
BDY    -> TRUS | QUAD | LINE | BREAK
TRUS   -> STROKE | CORNER
STROKE -> LINE1 + BREAK + LINE2  (the angle of LINE1, LINE2 differs from 180 degrees by less than θ)
STROKE -> LINE1 + LINE2  (the angle of both lines differs from 180 degrees by less than θ)
CORNER -> LINE1 + LINE2  (the angle of both lines differs from 180 degrees by more than θ)
LINE   -> LINE + V  (the angle of LINE, V differs from 180 degrees by less than θ)
QUAD   -> QUAD + V  (variance of V-angle < δ; variance of V-length < δ)

FIGURE 5.2 Definition of entry
5.2 Output Of The First Parser
The output of the first parser is a representation of the object and is of the following form:

<output>      ->  <gen> <descriptors> <addn>
<gen>         ->  ddd/d/d/dd/dd:
<descriptors> ->  <descriptors> descriptor | descriptor
<addn>        ->  "8:" | "0:" | ""
<gen> stores a three-digit identification number of the source, the numeral, the number of loops (i.e. number of holes + 1), the reparsing status, and the number of segments. The first two entries are actually carried through from the original input and are used only during the design process. <addn> is a null string except when the number of loops is three or more. It is then set to "0:" if the holes overlap vertically, and to "8:" if they do not overlap vertically (see Figure 5.2). <addn> is used to overcome the difficulty of differentiating between 8's and 0's. The parsed strings for these two classes showed great variation, and the simplest way to differentiate between most members of these two classes is to use the relative position of the holes. Each descriptor consists of seven characters. The meaning of the characters is shown below.
TABLE 1
========================================================
CHARACTER  POSSIBLE VALUES  EXPLANATION

1          +,-              Whether the arc is convex (+) or concave (-).
2          /,<,~            Whether the segment is classified as a stroke (/), a corner (<), or a quadratic arc (~).
3          0-7              The orientation of the bisectrix of the segment (Figure 5.3).
4          p,q,r,s,t,u,V    The measure of the external angle between the first and last side of an arc, quantized in intervals of 45°, except V, which stands for any external angle greater than 270°.
5          h,n,m,l          The vertical location of the arc; the frame is divided into four equal regions for this purpose (Figure 5.4).
6          f,c,g            The horizontal location; the frame is divided horizontally into three equal regions: left (f), center (c), right (g) (Figure 5.4).
7          L,M,S,N          The length of the arc as a function of the total perimeter and the average segment length of the object: large (L), middle (M), small (S), negligible (N).
========================================================
From the table it can be seen that a descriptor gives a fairly complete description of the arc.

FIGURE 5.3 Definition of arc orientation

FIGURE 5.4 Definition of location attributes

For example, the descriptor +<3rhgL represents a large, sharp, convex corner facing NW located at the top left of the object. Figure 5.5 shows a typical example of an encoding.
5.3 Recognition Algorithm
The majority of the strings produced by the first parser
from various examples of a single numeral are virtually
identical. Furthermore, the string corresponded to the
intuitive description of the numeral. Therefore, it is
relatively easy to characterize each class in terms of
regular
expressions,
i.e.,
strings
which
can
be
recognized by a finite automaton. Such expressions can be
represented as strings of symbols with the following
special provisions. A dot (.) stands for any symbol. A
star (*) denotes any number of repetitions of the
previous symbol, so that ". *" denotes any sequence. The
square brackets denote the union of the enclosed symbols.
Thus [ abc ] matches a , b , or c.
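These provisions map almost directly onto conventional regular expressions, so a second parser could translate a shape expression into a host-language pattern by escaping every descriptor symbol that happens to be a regex metacharacter. The translation below is an illustrative sketch in Python, not the implementation used in the cited work:

```python
import re

def shape_expr_to_regex(expr):
    """Translate the restricted shape expressions into a Python regex:
    '.' stays 'any symbol', '*' stays repetition, '[...]' stays a
    symbol class; every other descriptor symbol (+, -, /, <, ...) is
    escaped so that it matches literally."""
    out = []
    in_class = False
    for ch in expr:
        if ch == '[':
            in_class = True
            out.append('[')
        elif ch == ']':
            in_class = False
            out.append(']')
        elif in_class:
            out.append(re.escape(ch))     # literal member of the class
        elif ch in '.*':
            out.append(ch)                # keep the special meaning
        else:
            out.append(re.escape(ch))     # literal descriptor symbol
    return ''.join(out)

# A fragment of the '3' classifier from the text:
pat = re.compile(shape_expr_to_regex('-.[345].[hn]'))
print(bool(pat.search('-<4shn')))   # True
```

The NOT conjunctions (`& NOT (...)`) would then be handled outside the regex engine, by requiring a match on the positive expression and a non-match on each negated one.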
This approach is explained by the following example. From the design set consisting of 480 numerals, it is found that many 3's can be represented as (/./1/.*-/.[345][lm][gc].*-/.[345].[hn]) & NOT (+/0.lg) (Figure 5.5). This string can be explained as: the object has no holes and the boundary has a concave arc facing W located in the right (or middle) lower half of the figure, followed by another concave arc facing W in the top half of the figure; in addition, the lower right of the object does not have a stroke facing E.

FIGURE 5.5 First parser encoding of a numeral
Two other common representations of '3' were found. The union of these three representations became the classifier for '3'. Such classifiers are constructed for all the numerals and are listed in Table 2. The confidence level of each expression was determined with the help of the design set. This is used to resolve conflicts in assigning a class if the string matches two or more classifiers. For example, if the string matches two classifiers, then it will be assigned to the one which has the higher confidence level.
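The conflict-resolution rule can be sketched as follows; the (digit, pattern, confidence) table is a hypothetical stand-in for Table 2:

```python
import re

def classify(string, classifiers):
    """Second-parser decision rule sketched from the text: try every
    class expression; if several match, keep the one with the highest
    confidence level; if none match, return None so the string can be
    reparsed by the first parser."""
    matches = [(conf, digit) for digit, pat, conf in classifiers
               if re.fullmatch(pat, string)]
    if not matches:
        return None              # caller reparses with the first parser
    return max(matches)[1]       # highest confidence wins

# Hypothetical two-entry table (real patterns would come from Table 2):
table = [('1', r'/\./1/.*', 0.90),
         ('7', r'/\./1/.*g.*', 0.95)]
print(classify('/./1/abgc', table))   # '7' — both match, higher confidence wins
print(classify('xyz', table))         # None — triggers reparsing
```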
5.4 Performance
The syntactic recognition scheme has an overall error
plus rejection rate of 5.37% (Fu). Thus the recognition
rate is about 94%. After an input sample is processed by
the polygonal approximation method and the first parser,
the reduced (string) data is compared by the second parser against the given strings of Table 2. If none of the expressions matches the string, the string is reparsed by the first parser; otherwise the string is assigned to the class which has the highest confidence level. This is the only place where any stochastic decisions are made. Virtually all the processing time is spent during the boundary tracing of the samples and the polygonal approximations. This makes the data very compact, and thus the later algorithms process only a small amount of data.
TABLE 2.
EXPRESSIONS USED FOR NUMERAL RECOGNITION
========================================================
"0" = t01 U t02 U t03 U t04
t 0 1 = ( I. I [ 2 3 4 ] • * - < 7 [ pq ] [ m 1 ] • + [ I< ] [ 5 6 7 ] [ r s ] l) & NOT (+I 0 )
t02=(1.1[234]1.*+[1<][567][rs]1) & (-.[23]) &
NOT(+I[70].1g)
t 0 3 = ( I. I [ 2 3 4 ] I. * + [ I < ] [ 5 6 7 ] [ r s ] 1 ) & NOT (- ••••• [ 1m s ] )
t04= <1-1[34].*0:)
"1" = t11 U t12 U t13
t11= (/.111.*+1[123].h[ML]+I{567]1[ML]) & NOT(-)
t 1 2 = ( I. I 1 I. *+I [ 1 2 ] • h { ML ] - < [ 0 1 ] [ 1m ] [ ML ]
+ [<I] [ 7 0 ]l •• + {I<] [54 ] l. . -. [ 3 4] • [1m] )
t 1 3 = ( I. I 1 I.*+ I [ 2 ] • h. [ M L] +I [ 6 ] .l. [ ML]- [ I<] [ 5 ] • [ mnh ••
+[1<][5] •• £.$)
"2" = t21 U t22 U t23 U t24
t21= (-.[01][1m] •• +[l~][701][rstu]lg[LMS].*+.[345][1m]*­
.[45]) & NOT (-.[345].*+.*-.[345])
t 2 2 = ( - • [ 0 1 ] [ 1m] ••• *+I [ 7 0 1 ] [ r s t ] [ 1m] g [ LMS } • * + • [ 3 4 5 ] [ 1 rn] *[456]) & NOT (-[345]*+.*-.[345])
t23= (+. [12]h • • -[1<] [01] [1m] [gc]+ •••••• + •••••• < [56] • [ 1 rn].. + [I<] [ 4 56 ] • if.*-. [ 3 4 5] )
t24= (+A[012]. ••• +A[345] •••• -~[345]) & NOT(-.*-)
"3" = t31 u t32 u t33
t31= (+ •••••• +. [2345] •••• -. [345]. *+. [345] •••• -. [3456])
t32= < I. I 1 I. * - • [ 3 4 5 1 [ 1 m J [ g c 1 • *-. [ 3 4 5 J [ h n 1 >
&
NOT (+I [ 01 ]lg)
t33= (~<[701] •••• + •••••• + .•.••• ~~[345]) & NOT(+IOs.g)
"4" = t41 u t42 u t43 u t44
t41= (1.1[12]1*+[1<] [12] [hn] •• [ <I] [ 12] [ hnm]. * + [ <I] [56 ] s l..- < [56 ] [ 1 mn] ) & NOT (• [ 4 56 7 0 ] • h ) & NOT ( +A [ 4 5 ] ) & NOT ( + < [ 3 4 5 ] .••• + < [ 3 4 5 ] )
t42= (+[1<][12][hn] •• -[<1][12][hnm].+[I<)[017] •.•• -<7 ••••
+ [<l][56][rs]l.-<56][1mn])
t43= <l.l[12]1*+[1<][12]h.*+[l<][70][mn]g.*+[l<][567][rs]1 •• <5 •••• +[<1][45] •• f) & NOT (-3<[pq))
t4 4= ( + [I<] [ 12] • h [ cg] [ ML] +I [56 7] • l. [ ML]<[qr] ••• +[l<][45].mf).
"5" = t51 U t52 U t53
t 5 1 = ( +I [ 0 1 2 ] • h [ g c ] • - • [ 7 0 ] [ hn • • + • [ 7 0 6 ] [ 1m ] . • + • • • • • • .[2345][1m) ••• ) & NOT(-.[345].*+.*-.[345])
t52= (l.l[12]1.*+[701)[rst]hg.• [ 7 0 ] [ pqr s t] [ hn]. * + [ 4 56 7] l. *-. [ 3 4 5 ] [ 1m 1 ] ) & NOT (.*+.*-.*+.*-)
t5 3= (+I [ 012] . h [ gc] • -. [ 7 0] [ • hn] •• .[01]. ••• +.[706][1m •• + •••••• -.[234][1m])
&
NOT(. [345] .*-. [345])
"6" = t61 u t62 u t63 u t64 u t65
t61= (:+I[012].h •• -[1<][7012] •••• *+A[4567]) & NOT(+A.*•• [qrstuV]) & NOT(-.*+.*-)
t62= (:+"[45]V •• L-[70])
t63= (/./[12]/.*+/[012]h •• -[012] & NOT(-.*+.*-)
t6 4= < :+I [ o1 2 J h •. - [ I< J [ o 1 2 J •••• + [ "I J [ 4 56 7 J •••• - < p f > &
NOT(-.*+.*-.*+.*-)
t65= ( :+/[012] .h •• -[/<] [012] .••• +[ "/] [701] ...• < [ 6 7] [ pq] [ f c ] ) & NOT (-. * +. *-. * +. *- )
"7" = t71 u t72 u t73
t71= (/./1/.*+.[0123][hn][g][ML]+/[56]1[ML].*• [456] [hn] [MSL]) & NOT(-.[0123] [rstuV] .• [LMS])
t72= (/./1/.*+.1 hg- [01] •••• +. [ 017] .• g.- [ 07] .... +/6sl..[54) •..• +[345] •.•. -]
t 7 3 = (I . I 1 I . * + • [ 0 12 3 ] . [ hn ] [ g ] [ ML ] .[701]p ... +/[56].l.[ML].*-.[456][hn][MLS]) & NOT(• [ 0 1 2 3 ] [ r s t u V ] • • [ LMS ] )
"8" = t81 u t82 u t83 u t84 u t85 u t86
t81= (/./[34]/.*8:)
t8 2= ( I. I [ 2 3 4 ] • * " [ 1 2 3 ] * " [ 5 6 7 ] ) & N 0 T ( ( +I [ 56 7 ] ) &
NOT(+/01g) & NOT(-.[345].*-.[345])
t8 3 = <I. I [ 2 3 4 J * + [ "< J [ 3 4 5 J f * + [ " < J [ 3 4 5 f > & < -. [ 3 4 5 J •• f > &
NOT(-/[345].*-.[345])
&
NOT(+/[710]1g)
&
NOT(+/[567])
t84= (+[/"][12]h[gc]- [70][pq] ••• + [76][rstu]) & NOT([345]*-.[345]) & NOT(+/[701]lg) & NOT(-.3[hn])
t85= (+[</"][012][rstuV][hn][gc].. [ 7 0 ] [ pq ] •.• +. [ 7 6 ] [ r stu ] ) & NOT (-. 5 • [ m n ) ) & NOT ( -. 3 )
& NOT(-.3) & NOT(-.4[rstu]) & NOT(+/[701].1g)
"9" = t91 u t92 u t93 u t94 u t95 u t96 u t97 u t98
t91= (/./[2]/.*+["/<][1230]{hn} •• +{/<][56][rs]l .. [/<][54][n])
t92= (/./[12]/.*+"[1230][hn] •• +[/"<][56][rst]1 •• [54}[nm]) & NOT(-.*+.*-)
t93= (+"[123].h.*+/[567].1..-<5[mn]) & NOT(-.3.h)
t 9 4 = < I . I 2 I . *+ [ 12 l [ hn J •• +I [ 56 J1. . <[45][mn] •• +["<][45][nh])
t95= (/./2/.*"[01267] ••.. *-[/<S[mn]) & NOT(-.*+.*-)
t96= < I . I [ 1 2 J I . * + I 6 . 1 .• - [ I < J [ 4 s J [ m n J )
&
( :+<[345] [mnh] .*+<[345] [mnh .. $)
t97= ((+"[34][hn]f.+[01567][hn] .••• *-.*+.[123)[hn] .. +
/[56] 1..-1/<] [45]n)
t98= (/./1/.*+.[123].[hn].*-.*+.[123][hn] •. +/[56].1 .. .[456][rs] .. +"[45][mnh])
SECTION 6
CHARACTER RECOGNITION SIMULATING VISUAL NERVOUS SYSTEM
The character recognition techniques described in the previous sections are relatively accurate, but they still have a five to ten percent error rate. While these techniques allow characters to be recognized individually, they are not recognizable when these characters are linked together. These techniques are also not very flexible; they only perform if the character is within a certain range of rotation, usually ±15°, the lines are of a certain thickness, or the character is of a certain height/width. The best recognition system is the human visual nervous system. A system simulating the characteristics of the human visual nervous system could possibly be applied to character recognition. This section discusses this possibility.
6.1 Information Processing In The Visual System
The human visual nervous system is depicted in Figure 6.1. Visual patterns are transformed into nervous impulse frequencies by the receptors. The information is transmitted through the lateral geniculate body and is recognized in the visual cortex.

FIGURE 6.1 Human visual nervous system

Each cell receives information directly or indirectly from its own region on the retina, called its 'receptive field'. There are two kinds of neurons in the retina, one having an 'on-center' receptive field and the other an 'off-center' receptive field. The on-center cell is excited when the central region of its field is stimulated by light, and is inhibited when the peripheral region is stimulated. The opposite is true for the off-center cell.
The neurons in the visual cortex respond to a white or a black line, or to an edge separating white from black; these cells detect the orientation of a line or an edge. When the line shifts to a new position without changing its orientation, the response of the cells termed 'simple' decreases. Other cells, called 'complex', respond to the line regardless of its position. Cells which are more organized than the complex cells are called 'hypercomplex'.
6.2 Mathematical Model Of The Visual System
The visual system is essentially constructed of lateral
inhibition (LI) structure. Nervous impulses evoked from a receptor by a spot of light are reduced by stimulating neighbouring receptors. The LI will be simulated by a mathematical model and its characteristics analyzed.

FIGURE 6.2 LI structure: a) forward inhibition, b) backward inhibition

Figure 6.2 shows the two elementary types of the LI structure. One is the 'forward inhibition' and the other the 'backward inhibition'. Circles and lines represent cell bodies and neural connections, respectively. The response q(y) of the forward and the backward inhibition structures to a visual pattern p(x) can be shown by the following integral equations, in which the integrals extend from -∞ to +∞:

q(y) = ∫ w(y-x) p(x) dx            (forward)     (22)

q(y) = p(y) - ∫ w(y-x) q(x) dx     (backward)    (23)

where the weighting function w(y-x) represents an interaction coefficient from a cell (x) to another cell (y). These equations show that the LI structure is a spatial filter. The backward inhibition structure corresponds to a feedback control system, and can be transformed equivalently to a forward inhibition structure.
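A discrete one-dimensional version of the forward inhibition of equation 22 is a short convolution. The center-surround weights below are illustrative, not values from the text:

```python
def forward_inhibition(p, w):
    """Discrete form of eq. 22: q(y) = sum over x of w(y-x) p(x).
    w is a dict mapping offsets (y-x) to weights, a finite-support
    stand-in for the weighting function."""
    n = len(p)
    return [sum(w.get(y - x, 0.0) * p[x] for x in range(n))
            for y in range(n)]

# Excitatory center, inhibitory nearest neighbours:
w = {0: 1.0, -1: -0.4, 1: -0.4}
step = [0, 0, 0, 1, 1, 1]                 # a step edge in the input
q = forward_inhibition(step, w)
print([round(v, 2) for v in q])           # [0.0, 0.0, -0.4, 0.6, 0.2, 0.6]
```

The response dips just before the edge and peaks just after it, which is the spatial-filtering (edge-enhancing) behaviour the text attributes to the LI structure.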
The LI structure is also considered as a model of the receptive field. For example, the on-center receptive field can be simulated by the following forward inhibition structure with the weighting function w(r):

q(x,y) = ∫∫ w(x-ξ, y-η) p(ξ,η) dξ dη                            (24)

w(r) = K1/(2πσ1²) exp(-r²/2σ1²) - K2/(2πσ2²) exp(-r²/2σ2²)     (25)

where r² = x² + y², K1 and K2 are the coefficients of the excitatory and inhibitory connections, respectively, and σ1 and σ2 are parameters representing the spread of these neural connections.
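Equation 25 is a difference of two Gaussians. A direct transcription, with illustrative parameter values (not taken from the text), shows the excitatory center and inhibitory surround:

```python
import math

def w(r, k1=1.0, k2=0.6, s1=1.0, s2=2.0):
    """Difference-of-Gaussians weighting function of eq. 25 for an
    on-center receptive field; k1, k2, s1, s2 are illustrative values
    for K1, K2, sigma1, sigma2."""
    excite = k1 / (2 * math.pi * s1 ** 2) * math.exp(-r ** 2 / (2 * s1 ** 2))
    inhibit = k2 / (2 * math.pi * s2 ** 2) * math.exp(-r ** 2 / (2 * s2 ** 2))
    return excite - inhibit

# Positive (excitatory) at the center, negative (inhibitory) in the surround:
print(w(0.0) > 0)    # True
print(w(3.0) < 0)    # True
```

Any choice with σ2 > σ1 and a suitable K2/K1 ratio gives this center-surround shape; the precise values determine where the response crosses zero.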
Figure 6.3a shows the theoretical response q(x,y) of the on-center neurons to a square, and Figure 6.3b shows their response to a T shape. Neurons are excited on the sides and the corners of the square, and are inhibited outside it. The response to the T shape increases at the intersecting portion and the end portions of the lines. Therefore, it can be concluded that the cell having an on-center receptive field extracts the important information for visual pattern recognition.
6.3 Design Method For Property Filters
Since the nerves connect uniformly in the LI structure, a property can be filtered regardless of the position and size of the presented pattern. Also, since the LI structure conducts spatial integration, it is not apt to be affected by distortion of the patterns.

FIGURE 6.3 Responses of the LI model: a) square, b) T
Two methods were developed based on the LI structure as a property filter. One method simulates the visual nervous system with a multi-layer LI structure. This method is advantageous in considering the non-linearity of each neuron in the visual system. However, it is difficult to regulate the connections of successive nerve layers. In the second method, if the non-linearity of each nerve layer is disregarded, it is possible to express the entire visual system by an equivalent one-layer LI structure. Method 2 will be considered further here.
When m kinds of properties (A, B, ..., M) are given to the LI structure (Figure 6.4a), the parameters of the LI structure which filter any one property can be obtained in the following manner. If the interval between properties is greater than the spread of the nerve connection, Figure 6.4a becomes Figure 6.4b. Therefore, the outputs qA, qB, ..., qM responding to the properties A, B, ..., M become:

qK = Σ pKi wi ,   i = 0, 1, ..., n ;   K = A, B, ..., M    (26)

where pKi is 1 or 0 according to whether the ith receptor is stimulated when property K is given to the LI structure. Also, wi is the coefficient of the nerve connection to be determined, and n is the spread of the connection.

FIGURE 6.4 Property filter using LI structure

When the spread of the connection is chosen
large, there are instances when the experimental results on property extraction do not accord with the specifications of the filter design, because it is assumed that the spread of the connection must be smaller than the interval between properties. However, when the spread is made small, the freedom of nerve connection declines and the property filtering function deteriorates.
When the nerve connection has symmetry, equation 26 becomes:

qK = Σ pKi wi = Σ Ki Wi ,   i = 0, 1, ..., n    (27)

where Ki = √Ni pKi , Wi = √Ni wi , and Ni is the number of receptors in the symmetrical positions. Since any coefficient of the nerve connection must be a finite value, the next restrictive condition is assumed:

(28)
Satisfying this condition and maintaining qB, qC, ..., qM as small values, the coefficient wi which filters only one property A is obtained as a ratio of sums of determinants formed from the property inputs:

(29)

where the sum in equation 29 extends over the n+1Cm-1 or n+1Cm determinants made by taking m-1 or m inputs without duplication from the n+1 inputs corresponding to i = 0, 1, 2, ..., n. Also, the relationship between qA, qB, ..., qM is given by an ellipsoid in m-dimensional space:

(30)

When the properties are linearly dependent together, regardless of i = 0, 1, 2, ..., n:

(31)

the denominator of equation 29 becomes 0. In this case, the outputs qA, qB, ..., qM are also linearly dependent, and each input cannot be freely determined:

(32)

FIGURE 6.5 Character extracted by LI structure
6.4 Character Recognition Model
A recognition model using the LI structure as the property filters is shown in Figure 6.6. This recognition system may be applied to various kinds of visual patterns by altering the property filters.

The character to be recognized is quantized by an artificial retina consisting of 20 x 20 receptors. This number corresponds to the number of receptors in the retina stimulated when the smallest letter is viewed from a distance of 30 cm. The character then enters the property filters, where each property is extracted. A classifier is also designed to compute the linear discriminant function Zi in relation to the position and number of the extracted properties. The character is identified as the one corresponding to the maximum of the functions Zi. After identification, the coefficients of the function are
adjusted on the basis of the correct answer given by a teacher. This recognition system is programmed on a computer.

FIGURE 6.6 Pattern recognition system using LI structure
6.5 Property Filters
Generally, if the patterns are linearly separable, it is possible to classify these patterns by adjusting the coefficients in the classifier. The purpose of the property filtering is to transform the character information into linearly separable patterns. The properties shown in Table 6.1 are used for the recognition of numerals. Figure 6.7 shows the coefficients of the property filters designed by the method described previously. The properties about + and X-type intersecting portions and the end portions are extracted isotropically, and the remaining properties are filtered anisotropically. The property extraction is not affected by position, but it is restricted in relation to the length and thickness of the line.
6.6 Property Discriminator
The cell outputs of each filter are scanned, and when a cell emitting output is found, a counter corresponding to the property is incremented by 1. The scanning procedure is then resumed, skipping over the vicinity of the discovered property positions. Each extracted property is processed in the following manner. The straight line group is classified according to whether each count is 0, 1, 2, or 3 or more. Curves are classified in the same manner. The + and X-type intersecting points are classified according to their presence or absence. In the case of end points, the center of gravity of the letter is computed and the image plane is divided into 4 quadrants. The properties about ends are classified as to whether the number appearing in each quadrant is 0, 1, 2, or 3 or more. Thus a 44-bit property vector is obtained.

TABLE 6.1 Properties used in recognition

FIGURE 6.7 Nerve connection coefficients of the property filters
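The counting-and-bucketing step can be sketched as below. The text does not spell out exactly how the 44 bits are allocated among the properties, so the grouping of the arguments here is an assumption for illustration:

```python
def one_hot_count(n):
    """Encode a property count as four bits for 0, 1, 2, and 3-or-more."""
    bucket = min(n, 3)
    return [1 if bucket == b else 0 for b in range(4)]

def property_vector(line_counts, curve_counts, has_plus, has_x,
                    end_quadrants):
    """Sketch of the property-vector construction described above:
    per-type line and curve counts are bucketed as 0/1/2/3+, the
    intersecting points contribute presence bits, and end points are
    bucketed per quadrant.  The argument grouping is hypothetical."""
    bits = []
    for n in line_counts + curve_counts:
        bits += one_hot_count(n)
    bits += [int(has_plus), int(has_x)]
    for n in end_quadrants:
        bits += one_hot_count(n)
    return bits

# Four line types, two curve types, two end-bearing quadrants:
v = property_vector([1, 2, 0, 0], [1, 0], True, False, [1, 1, 0, 0])
print(len(v))   # 4*6 + 2 + 4*4 = 42 bits for this illustrative grouping
```

With a different (and unstated) split of the curve and line families, the same scheme yields the 44 bits mentioned in the text.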
6.7 Pattern Classifier
After the character information has been concentrated into the property vector, the properties obtained from one kind of letter are still very diverse. Therefore, it is difficult to predict the identification logic. Then, assuming that the property vector X is a linearly separable pattern, the classifier is constructed by learning theory as follows. The linear discriminant function Zi is computed in relation to the property vector X = (x1, x2, ..., xn):

Zi = Σ aij xj ,   j = 1, ..., n ;   i = 0, 1, ..., 9 ;   n = 44    (33)

and the letter is identified based on the maximum among the functions Zi. Coefficients aij are adjusted as

aij = P(Ci/xj)    (34)

on the basis of the conditional probability P(Ci/xj) that the input character is Ci when a property xj is filtered. However, assuming that the characters are given with uniform probability, then

aij = P(xj/Ci)    (35)

may be used instead of equation 34 according to the Bayesian rule. Here, P(xj/Ci) is the conditional probability that the property xj will be filtered when the character Ci is given. If the property vector X is a linearly separable pattern, it is possible to construct a classifier which will make the correct identification.
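A frequency-count version of equations 33 and 35 might look like this; the toy four-property vectors are purely illustrative:

```python
def train(samples, n_props=44, n_classes=10):
    """Estimate a_ij = P(x_j / C_i) (eq. 35) by relative frequency over
    labelled property vectors; samples is a list of (class, vector)
    pairs with binary vector components."""
    counts = [[0] * n_props for _ in range(n_classes)]
    totals = [0] * n_classes
    for ci, x in samples:
        totals[ci] += 1
        for j, xj in enumerate(x):
            counts[ci][j] += xj
    return [[counts[i][j] / totals[i] if totals[i] else 0.0
             for j in range(n_props)]
            for i in range(n_classes)]

def classify(x, a):
    """Evaluate Zi = sum over j of a_ij x_j (eq. 33), pick the maximum."""
    z = [sum(aij * xj for aij, xj in zip(row, x)) for row in a]
    return z.index(max(z))

samples = [(0, [1, 1, 0, 0]), (0, [1, 0, 0, 0]),
           (1, [0, 0, 1, 1]), (1, [0, 1, 1, 1])]
a = train(samples, n_props=4, n_classes=2)
print(classify([1, 1, 0, 0], a))   # 0
print(classify([0, 0, 1, 1], a))   # 1
```

This batch estimate stands in for the teacher-driven adjustment described in the text; an online version would update the aij after each identification instead.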
6.8 Performance
The performance of the recognition system was examined for handwritten numerals. When 100 characters selected from the 400 learned numerals were tested, the recognition rate was 100% (Pratt). When numerals are small compared with the surface of the artificial retina, receptors are not utilized effectively. When the size of the numerals is reduced to 2/3, the number of effective receptors declines to about one half the total. However, the identification results are still 90% (Pratt). Numerals with 15° clockwise orientation had 90% recognition, whereas 15° counterclockwise had 80% (Pratt).
SECTION 7
CONCLUSIONS AND DISCUSSION
The different character recognition algorithms discussed in this project have their own advantages and disadvantages. The template matching algorithm can be very slow if the picture region is large compared to the template or if a family of templates is required. Its advantage is ease of implementation. Therefore, if the speed of recognition is not an important factor, template matching can be a simple and efficient solution.

Use of concave and convex features for character recognition can provide a high recognition rate, but the templates generated for the dictionary can have many multiples. Therefore, if there are many different variations in the characters, this may not be the best algorithm to use. On the other hand, if there is only one set of characters with little variation, the template dictionary will be small, resulting in a fast algorithm with a high recognition rate.
The polygonal approximation technique is a powerful tool in character recognition. The integer variables are very easy to form from the approximation, but the logical variables require slightly more work. The decision tree is also difficult to construct, thus contributing to the complexity of the algorithm. The recognition rate with polygonal approximation is high, which makes it very useful.

Contour tracing for character recognition is useful for handprinted characters. Commonly confused characters are differentiated very easily by this algorithm, giving it a big advantage over other procedures. The syntactic recognition scheme is a step above contour tracing and polygonal approximation. It uses data from the boundary tracing of samples and the polygonal approximation, which makes the data very compact. Most of the time is spent during the boundary tracing and polygonal approximation. Syntactic recognition has a high confidence level because of all the preliminary processing used in the procedure.
The most efficient character recognition device is the human eye and brain, so it is not very surprising that simulating the visual nervous system provides one of the best methods for character recognition. The big drawback of this procedure is the complexity of the hardware and software requirements, which makes this simulation presently not very practical.
REFERENCES

F. ALI AND T. PAVLIDIS, "Syntactic Recognition of Handwritten Numerals," IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-7, No. 7, July 1977, pp. 537-541.

F. ALI AND T. PAVLIDIS, "Computer Recognition of Handwritten Numerals by Polygonal Approximations," IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-5, No. 6, November 1975, pp. 610-614.

W. DONALDSON AND G.T. TOUSSAINT, "Algorithm for Recognizing Contour-traced Handprinted Characters," IEEE Transactions on Computers, June 1970, pp. 541-556.

R.O. DUDA AND P.E. HART, Pattern Classification and Scene Analysis, John Wiley and Sons, New York, 1973.

K.S. FU, Syntactic Pattern Recognition and Applications, Prentice-Hall, Inc., New Jersey, 1982.

E.L. HALL, Computer Image Processing and Recognition, John Wiley and Sons, New York, 1970.

K.K. PINGLE, Automatic Interpretation and Classification of Images, A. Grasselli, Ed., Academic Press, New York.

W.K. PRATT, Digital Image Processing, John Wiley and Sons, New York, 1978.

R.Y. WONG, Computer Pattern Classification And Scene Matching, California State University, Northridge, 1981.

K. YAMAMOTO, "Recognition of Handprinted Characters By Convex and Concave Features," Proceedings of IEEE, Vol. CH1499, March 1980, pp. 708-711.
APPENDIX A
The computer subroutines given in this appendix are
written in Sanyo Basic [MS-DOS] version 1.31 on a Sanyo
MBC-555-2 with 520K bytes free. These subroutines
classify the characters from the data provided by the
main programs. At the beginning of each subroutine, the
type of data required is explained.
TEMPLATE MATCHING
This subroutine requires the values of all the pixels in
the picture [g(i,j)] and the template [t(i,j)]. It
translates the template through the whole picture. To
save memory and processing time, the following
assumptions are made:
The picture size is an integer multiple of the template size.
The character is within the template size.
The character does not overflow out of the m,n grid.
1000 'THIS SUBROUTINE IS WRITTEN FOR A PICTURE OF 100 X 100 PIXELS AND A TEMPLATE OF 20 X 20 PIXELS.
1001 'IT CAN BE MODIFIED TO ANY SIZE AS LONG AS THE PICTURE IS AN INTEGER MULTIPLE OF THE TEMPLATE.
1010 DIM G(100,100), T(20,20), M(5,5), L(5,5)
1020 'PIXEL VALUES ARE PROVIDED BY THE MAIN PROGRAM.
1030 'TEMPLATE IS BEING TRANSLATED.
1040 M(M,N) = 0 : L(M,N) = 0
1050 FOR M = 1 TO 5
1060 FOR N = 1 TO 5
1070 FOR I = 1 TO 20
1080 FOR J = 1 TO 20
1090 M(M,N) = M(M,N) + ABS(G(I,J) - T(I-M,J-N))
1100 NEXT J
1110 NEXT I
1120 L(M,N) = M(M,N)
1130 M(M,N) = 0
1140 NEXT N
1150 NEXT M
1160 'CHECKING FOR SMALLEST M(M,N)
1165 MS = L(1,1) : A = 1 : B = 1
1170 FOR M = 1 TO 5
1180 FOR N = 1 TO 5
1190 IF L(M,N) < MS THEN MS = L(M,N) : A = M : B = N
1210 NEXT N
1220 NEXT M
1230 PRINT "SMALLEST M(M,N) ="MS" AT COORD M="A" AND N="B
1240 RETURN
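For readers more comfortable with a modern language, the same error-calculation search can be sketched in Python. This is a hypothetical re-sketch, not part of the original project: `best_match` and its arguments are illustrative names, and unlike the listing above (which only tries a 5 x 5 set of shifts) it scans every valid offset.

```python
def best_match(picture, template):
    """Slide `template` over `picture` (both lists of rows) and return
    the (row, col) offset with the smallest sum of absolute differences,
    together with that smallest error value."""
    ph, pw = len(picture), len(picture[0])
    th, tw = len(template), len(template[0])
    best_off, best_err = None, None
    for m in range(ph - th + 1):
        for n in range(pw - tw + 1):
            # Accumulated absolute error of the template at offset (m, n),
            # matching the M(M,N) accumulation in the BASIC subroutine.
            err = sum(abs(picture[m + i][n + j] - template[i][j])
                      for i in range(th) for j in range(tw))
            if best_err is None or err < best_err:
                best_off, best_err = (m, n), err
    return best_off, best_err

# Tiny demonstration: a 3x3 block of ones inside a 10x10 picture of zeros;
# using that block itself as the template gives zero error at offset (3, 4).
picture = [[0] * 10 for _ in range(10)]
for i in range(3, 6):
    for j in range(4, 7):
        picture[i][j] = 1
template = [row[4:7] for row in picture[3:6]]
offset, error = best_match(picture, template)
```

The exhaustive double loop mirrors the BASIC structure directly; the error at the correct offset is exactly zero only when the template is an exact copy of the window, which is why the measure is used for similarity ranking rather than exact matching.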
POLYGONAL APPROXIMATION
This subroutine implements the polygonal approximation's
classification trees. Both trees are in this subroutine.
The following data is assumed to be available:
Integer variables.
Logical variables.
Reference table.
1000 'TREE A
1005 IF U1 + U2 > 0 THEN GOTO 1300 : 'GO TO TREE B
1010 IF U4 = 0 THEN PRINT "1A" : RETURN
1020 IF U4 => 4 THEN PRINT "4D" : RETURN
1030 IF U4 = 3 THEN GOTO 1200
1040 IF U4 = 1 THEN GOTO 1270
1050 'FOR LOGICAL VARIABLES 1 IS TRUE AND 0 IS FALSE
1060 IF V2 = 1 AND U5 = 2 THEN GOTO 1070 ELSE PRINT "3A" : RETURN
1070 IF U6 = 3 AND U7 = 3 THEN PRINT "7C" : RETURN
1080 IF V1 = 1 THEN PRINT "9A" : RETURN
1090 IF V2 = 1 THEN PRINT "4A" : RETURN
1100 IF V7 = 1 THEN PRINT "5A" : RETURN
1110 IF V3 = 1 THEN PRINT "1C" : RETURN ELSE PRINT "2A" : RETURN
1200 'U4 = 3 BRANCH
1210 IF V2 = 1 THEN GOTO 1220 ELSE GOTO 1230
1220 IF V1 = 1 THEN PRINT "9B" : RETURN ELSE PRINT "4B" : RETURN
1230 IF U5 <> 2 THEN PRINT "4C" : RETURN
1240 IF U8 = 3 THEN PRINT "5B" : RETURN
1250 IF U8 = 4 OR U8 = 5 THEN PRINT "3B" : RETURN
1260 IF U8 = 6 OR U8 = 7 THEN PRINT "2B" : RETURN
1270 IF U5 = 0 THEN PRINT "6A" : RETURN
1280 IF V5 = 1 THEN PRINT "7A" : RETURN
1290 IF (U10/U9) > 6 AND V4 = 0 THEN PRINT "7B" : RETURN ELSE PRINT "1B" : RETURN
1300 'CLASSIFICATION TREE B
1310 IF U1 + U2 => 2 THEN 1320 ELSE 1330
1320 IF U11 = 5 OR U11 = 6 OR U11 = 7 OR (U12+1) => U13 OR U14 = 3 OR U14 = 4 OR V8 = 1 THEN PRINT "0E" : RETURN ELSE PRINT "8G" : RETURN
1330 'FOR U1 + U2 = 1
1340 IF U4 <= 2 THEN GOTO 1400
1350 IF V1 = 1 THEN PRINT "8E" : RETURN
1360 IF U3 = 0 THEN PRINT "4G" : RETURN
1370 IF U3 = 1 THEN GOTO 1390
1380 IF V2 = 1 THEN PRINT "0D" : RETURN
1390 IF U6 = 8 THEN PRINT "2D" : RETURN ELSE PRINT "8D" : RETURN
1400 'U4 <= 2
1410 IF U3 = 0 THEN PRINT "0A" : RETURN
1420 IF V7 = 1 AND U6 = 1 AND U11 = 1 THEN PRINT "4E" : RETURN ELSE GOTO 1450
1430 IF U4 = 0 THEN GOTO 1440 ELSE 1450
1440 IF U3 = 2 THEN PRINT "8A" : RETURN ELSE PRINT "0B" : RETURN
1450 IF U4 = 2 THEN GOTO 1480
1460 IF U5 > 0 THEN PRINT "9C" : RETURN
1470 IF U3 = 2 THEN PRINT "8B" : RETURN ELSE PRINT "6B" : RETURN
1480 IF V1 = 1 THEN PRINT "8C" : RETURN
1490 IF U3 <> 1 THEN GOTO 1500
1500 IF U6 = 8 THEN PRINT "2C" : RETURN ELSE PRINT "8D" : RETURN
1510 IF V2 = 1 THEN PRINT "0C" : RETURN
1520 IF U7 <> 3 AND U2 = 1 THEN PRINT "1D" : RETURN
1530 IF U6 = 3 OR U6 = 4 THEN PRINT "9D" : RETURN ELSE PRINT "4F" : RETURN
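The IF/GOTO chains above are a hand-coded decision tree over integer features U1-U14 and logical features V1-V8. As an illustration, the U4 = 3 branch of tree A (BASIC lines 1200-1260) can be transcribed into Python. This is a hypothetical re-sketch, not part of the original project; the dictionaries U and V are illustrative stand-ins for the BASIC variables.

```python
def branch_u4_eq_3(U, V):
    """Tree A, U4 = 3 branch (BASIC lines 1200-1260), transcribed.

    U maps integer-feature index -> value; V maps logical-feature
    index -> 1 (true) or 0 (false).  Returns a class label, or None
    where the BASIC code falls through into the next branch (line 1270).
    """
    if V[2] == 1:                      # lines 1210/1220
        return "9B" if V[1] == 1 else "4B"
    if U[5] != 2:                      # line 1230
        return "4C"
    if U[8] == 3:                      # line 1240
        return "5B"
    if U[8] in (4, 5):                 # line 1250
        return "3B"
    if U[8] in (6, 7):                 # line 1260
        return "2B"
    return None                        # fall through to line 1270
```

Reading the GOTO targets as nested conditionals like this makes the tree structure of Figures 3.3 and 3.4 visible directly in the code.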
TEMPLATE MATCHING
The following program is a complete template matching
procedure. It gathers data from the Videometrix video
processing unit. The information is processed on an HP 85
computer. First, the program learns the template which is
used for matching; then it reads the whole picture. The
template is matched to different windows in the picture
and then classified into the class it belongs to.
Characters or any other objects can be recognized by this
program.
110 CLEAR
120 DIM I$[45]
130 DIM S1(20,2)
140 DIM S2(20,2)
150 DIM S(10,2)
160 DIM V(2,2)
170 DIM W(2)
180 DIM B(2)
190 DIM O$[45]
200 ! INITIALIZE***************
220 RESET 80
230 CONTROL 80,4 ; 26
240 CONTROL 80,3 ; 15
250 CONTROL 80,2 ; 2
260 ! SET WINDOW***************
270 O$="AG,0,230,8,240" @ E=0
280 GOSUB 980
290 ! SET CROSS HAIR***********
300 O$="CS/H,115" @ E=0
310 GOSUB 980
320 O$="CS/V,120" @ E=0
330 GOSUB 980
340 ! DIGITIZE VIDEO***********
350 O$="DV/E" @ E=0
360 GOSUB 980
370 ! SET THRESHOLD************
380 O$="TH/40" @ E=0
390 GOSUB 980
400 ! DISPLAY DIGITIZED PIX****
410 O$="DM/D" @ E=0
420 GOSUB 980
430 ! READ LABELED SAMPLE S1***
440 FOR I=0 TO 9
450 ! READ # OF BLACK PIX******
460 O$="AR/B" @ E=1
470 GOSUB 980
480 X=VAL(I$)
490 S1(I,1)=X
500 ! READ # OF WHITE PIX******
510 O$="AR/W" @ E=1
520 GOSUB 980
530 Y=VAL(I$)
540 S1(I,2)=Y
550 ! PRINT S1*****************
560 DISP "S1(";I;")";S1(I,1);",";S1(I,2)
570 NEXT I
580 DISP "******************"
590 ! SET WINDOW FOR S2********
600 O$="AG,236,400,8,240" @ E=0
610 GOSUB 980
620 O$="CS/H,320" @ E=0
630 GOSUB 980
640 ! READ LABELED SAMPLE S2***
650 FOR J=0 TO 9
660 ! READ # OF BLACK PIX******
670 O$="AR/B" @ E=1
680 GOSUB 980
690 S2(J,1)=VAL(I$)
700 ! READ # OF WHITE PIX******
710 O$="AR/W" @ E=1
720 GOSUB 980
730 S2(J,2)=VAL(I$)
740 ! PRINT S2*****************
750 DISP "S2(";J;")";S2(J,1);",";S2(J,2)
760 NEXT J
770 ! GEN. CLASSIF. RULES******
780 GOSUB 1860
790 ! GO TO UNLABELED SAMPLES**
800 CLEAR
810 DISP "MOVE CAMERA TO THE UNLABELED SAMPLES"
820 DISP "THEN INPUT 'C'"
830 INPUT A$
840 IF A$<>"C" THEN GOTO 810
850 GOSUB 1020
860 U=1
870 GOSUB 1940
880 GOSUB 1230
890 U=2
900 GOSUB 1940
910 GOSUB 1440
920 U=3
930 GOSUB 1940
940 GOSUB 1650
950 U=4
960 GOSUB 1940
970 END
980 ! TALK TO VPU**************
990 OUTPUT 80 ;O$
1000 IF E=1 THEN ENTER 80 ; I$
1010 RETURN
1020 ! SAMPLE WINDOW***********
1030 O$="AG,0,230,8,240" @ E=0
1040 GOSUB 980
1050 O$="CS/H,115" @ E=0
1060 GOSUB 980
1070 O$="CS/V,120" @ E=0
1080 GOSUB 980
1090 O$="DV/E" @ E=0
1100 GOSUB 980
1110 O$="TH/40" @ E=0
1120 GOSUB 980
1130 O$="DM/D" @ E=0
1140 GOSUB 980
1150 ! READ UNLABELED SAMPLE***
1160 O$="AR/B" @ E=1
1170 GOSUB 980
1180 S(1,1)=VAL(I$)
1190 O$="AR/W" @ E=1
1200 GOSUB 980
1210 S(1,2)=VAL(I$)
1220 RETURN
1230 ! SAMPLE WINDOW***********
1240 O$="AG,236,400,8,240" @ E=0
1250 GOSUB 980
1260 O$="CS/H,320" @ E=0
1270 GOSUB 980
1280 O$="CS/V,124" @ E=0
1290 GOSUB 980
1300 O$="DV/E" @ E=0
1310 GOSUB 980
1320 O$="TH/40" @ E=0
1330 GOSUB 980
1340 O$="DM/D" @ E=0
1350 GOSUB 980
1360 ! READ UNLABELED SAMPLE***
1370 O$="AR/B" @ E=1
1380 GOSUB 980
1390 S(1,1)=VAL(I$)
1400 O$="AR/W" @ E=1
1410 GOSUB 980
1420 S(1,2)=VAL(I$)
1430 RETURN
1440 ! SAMPLE WINDOW***********
1450 O$="AG,236,400,240,503" @ E=0
1460 GOSUB 980
1470 O$="CS/H,320" @ E=0
1480 GOSUB 980
1490 O$="CS/V,372" @ E=0
1500 GOSUB 980
1510 O$="DV/E" @ E=0
1520 GOSUB 980
1530 O$="TH/40" @ E=0
1540 GOSUB 980
1550 O$="DM/D" @ E=0
1560 GOSUB 980
1570 ! READ UNLABELED SAMPLE***
1580 O$="AR/B" @ E=1
1590 GOSUB 980
1600 S(1,1)=VAL(I$)
1610 O$="AR/W" @ E=1
1620 GOSUB 980
1630 S(1,2)=VAL(I$)
1640 RETURN
1650 ! SAMPLE WINDOW***********
1660 O$="AG,0,230,240,503" @ E=0
1670 GOSUB 980
1680 O$="CS/H,115" @ E=0
1690 GOSUB 980
1700 O$="CS/V,372" @ E=0
1710 GOSUB 980
1720 O$="DV/E" @ E=0
1730 GOSUB 980
1740 O$="TH/40" @ E=0
1750 GOSUB 980
1760 O$="DM/D" @ E=0
1770 GOSUB 980
1780 ! READ UNLABELED SAMPLE***
1790 O$="AR/B" @ E=1
1800 GOSUB 980
1810 S(1,1)=VAL(I$)
1820 O$="AR/W" @ E=1
1830 GOSUB 980
1840 S(1,2)=VAL(I$)
1850 RETURN
1860 ! CLASSIFICATION RULES****
1870 ! CALCULATE AVERAGE VECT**
1880 GOSUB 2010
1890 ! CALCULATE DIREC. VECT.**
1900 GOSUB 2190
1910 ! CALCULATE OFFSET b******
1920 GOSUB 2240
1930 RETURN
1940 ! CLASSIFIER SUBROUTINE***
1950 ! CALCULATE DISCR. FUNC***
1960 GOSUB 2330
1970 ! APPLY THRESHOLD*********
1980 IF G>=0 THEN GOSUB 2410
1990 IF G<0 THEN GOSUB 2450
2000 RETURN
2010 ! AVERAGE SUBROUTINE******
2020 K=0
2030 FOR I=1 TO 2
2040 V(K,I)=0
2050 FOR J=0 TO 9
2060 V(K,I)=V(K,I)+S1(J,I)
2070 NEXT J
2080 V(K,I)=V(K,I)/10
2090 NEXT I
2100 K=1
2110 FOR I=1 TO 2
2120 V(K,I)=0
2130 FOR J=0 TO 9
2140 V(K,I)=V(K,I)+S2(J,I)
2150 NEXT J
2160 V(K,I)=V(K,I)/10
2170 NEXT I
2180 RETURN
2190 ! DIRECTION VECTOR********
2200 FOR I=1 TO 2
2210 W(I)=V(0,I)-V(1,I)
2220 NEXT I
2230 RETURN
2240 ! OFFSET b****************
2250 FOR K=0 TO 1
2260 B(K)=0
2270 FOR I=1 TO 2
2280 B(K)=B(K)+V(K,I)*V(K,I)
2290 NEXT I
2300 NEXT K
2310 D=(B(0)-B(1))/2
2320 RETURN
2330 ! DISCRIMINANT FUNC*******
2340 G=0
2350 Y=0
2360 FOR I=1 TO 2
2370 Y=Y+W(I)*S(1,I)
2380 NEXT I
2390 G=Y-D
2400 RETURN
2410 ! MESSAGE FOR CLASS1******
2420 DISP "SAMPLE";U;"BELONGS TO CLASS 1"
2430 PRINT "SAMPLE";U;"BELONGS TO CLASS 1"
2440 RETURN
2450 ! MESSAGE FOR CLASS2******
2460 DISP "SAMPLE";U;"BELONGS TO CLASS 2"
2470 PRINT "SAMPLE";U;"BELONGS TO CLASS 2"
2480 RETURN
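Taken together, subroutines 1860-2480 implement a minimum-distance linear classifier: each class is represented by its mean feature vector, the weight vector is w = m1 - m2 (DIRECTION VECTOR), the offset is d = (|m1|^2 - |m2|^2)/2 (OFFSET b), and a sample x is assigned to class 1 exactly when w.x - d >= 0 (the G >= 0 test). A hypothetical Python sketch of the same arithmetic, with illustrative names, is:

```python
def train(samples1, samples2):
    """Learn (w, d) from two labeled sample sets: w = m1 - m2 and
    d = (|m1|^2 - |m2|^2) / 2, where m1, m2 are the class mean vectors.
    Mirrors the AVERAGE, DIRECTION VECTOR, and OFFSET subroutines."""
    def mean(samples):
        return [sum(s[i] for s in samples) / len(samples)
                for i in range(len(samples[0]))]
    m1, m2 = mean(samples1), mean(samples2)
    w = [a - b for a, b in zip(m1, m2)]
    d = (sum(a * a for a in m1) - sum(b * b for b in m2)) / 2
    return w, d

def classify(w, d, x):
    """Class 1 if g = w.x - d >= 0 (the program's G >= 0 test), else class 2."""
    g = sum(wi * xi for wi, xi in zip(w, x)) - d
    return 1 if g >= 0 else 2

# Toy (black-pixel, white-pixel) counts for two classes of characters.
w, d = train([[10, 2], [12, 2]], [[2, 10], [2, 12]])
```

The decision rule is equivalent to assigning x to whichever class mean is nearer in Euclidean distance, since |x - m1|^2 <= |x - m2|^2 reduces algebraically to (m1 - m2).x >= (|m1|^2 - |m2|^2)/2.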