
Traffic Lights Recognition
in a Scene using a PDA
Daniel MORENO EDDOWES, [email protected]
ETSE (UAB) Escola Tècnica Superior d’Enginyeria
(Universitat Autonoma de Barcelona)
and UFR 6 (Paris 8).
Labo GRAII, A-145,
Université Paris 8, 93526 St. Denis
http://ufr6.univ-paris8.fr/desshandi/
Jaime LOPEZ KRAHE. [email protected]
Director of UFR6 (Paris 8).
Labo GRAII, A-145,
Université Paris 8, 93526 St. Denis
http://ufr6.univ-paris8.fr/desshandi/
Abstract:
We present a project that consists of interpreting pedestrian traffic lights for blind people. It is implemented on a mobile and autonomous system: a PDA with a built-in video camera. The traffic-light interpretation is based on the analysis of the images taken by the video camera, and the general problem can be defined as finding a certain object in a complex scene.
The complete system integrates other functions adapted for disabled people, for example a vocal agenda featuring word recognition for appointments.
The problem is subdivided into two different processes: the segmentation process and the recognition process. In the first stage, the image is processed to find the contours of the objects of interest using a colour segmentation. In the second stage, we use structural methods on the selected patterns to decide whether the pedestrian traffic light is green (moving figure) or red (stopped figure).
In the medium term, the system should allow the learning and recognition of multiple images specific to the user (multiple objects), with the aim of becoming an adaptable technical aid.
We thank « Pocket Entreprises » for providing the equipment used.
Keywords:
Scene Analysis, Image Segmentation, Pattern Recognition, Technical Help, Edit Distance, Compensating Help for Handicapped.
1- Introduction:
The objective of this project is to provide a solution for one of the problems that visually handicapped persons encounter in an urban environment: crossing a road governed by a traffic light.
The use of a PDA (Personal Digital Assistant) is a good solution because it is a portable, autonomous and generic system at an affordable price.
Nowadays, PDAs are powerful enough to support the high computational cost that this project requires (400 MHz processor). Their evolution is quick, and in the medium term we will find PDAs with features similar to those of current PCs.
To end the introduction, we will mention that this application is included in a larger project of adaptable technical aids for handicapped persons, implemented entirely on a PDA.

2- Problem Presentation:
The recognition problem is difficult because of the complexity of the scene: the variability of the position of the traffic lights, their different sizes depending on the distance or the optics, the different shapes of the silhouette, and the large variation in the brightness of the colours (Fig. 1, Fig. 2). Other cases also confront us with interesting theoretical (or heuristic) problems: road crossings made in two stages, or contradictory information (Fig. 10).
A database of 200 images is freely available at:
http://ufr6.univ-paris8.fr/desshandi/bdfeux
so that other comparative results can be obtained.
Fig. 1 Difference between the night and day images.
Fig. 2 Zebra-crossing and traffic-light post situation.
We have divided the images into two different types, according to contrast and brightness: the night images and the day images. The day images show a large intensity variation on the RGB planes, which means that colour is not a determining factor. On the other hand, in the night images colour can be used as a guiding factor that allows us to discard the regions that are not of interest.
According to this criterion, we can produce two different implementations of the segmentation stage. In the first place, a colour segmentation (green, red) is a necessary condition to define the regions of interest. For the day images, the colour space is harder to define, and the processing time and the error percentage are higher. The number of false alarms can be very high (not all the selected objects are traffic lights), so a colour process alone is not enough.
The second stage produces a binary segmented image. This stage is the same for the day and night images.
The objective is to guarantee 100% recognition of the green traffic light, in order to ensure the safety of the crossing. That implies an increase in the reject class. Even so, we think that 100% recognition cannot be assured with the current knowledge. We have detected some ambiguous cases that cannot be solved. The case shown in (Fig. 11), with two traffic lights next to each other, is an example. The right traffic light, which is green, corresponds to the perpendicular street! That means we cannot cross the street. This pathology got worse a few days later: in (Fig. 12) we can see the same traffic light after an accident. Now the left traffic light has become nearly invisible!
3- System Implementation:
We present in this section an approximation of the procedure used in the two stages: the segmentation stage and the recognition stage.

3.1- Segmentation Stage:
The Segmentation Stage finds the contours of the possible forms that satisfy the given criteria. The segmentation is divided into two processes:
3.1.1- Elimination of the regions without interest:
Night: In the night images (Fig. 1), a colour threshold lets us isolate the objects that are very luminous. The threshold value can be high, between 90% and 95% of the maximum intensity value in the R or G plane, depending on the silhouette searched for. This process allows us to isolate the relevant forms.
Day: The colour threshold is less efficient. It is completed with other knowledge criteria that allow us to find the traffic light in the scene, such as the position of the zebra crossing and the traffic-light post (Fig. 2).
In both cases, moreover, if we observe the scenes, it is not necessary to use the top and the bottom parts of the image, because the traffic light will not be there. We therefore discard 35% of the image: 10% at the top and 25% at the bottom.
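As an illustration of this pre-segmentation, a minimal sketch in Python is given below. It assumes an RGB image stored as a NumPy array; the function name and the 0.92 factor (taken inside the 90%-95% range quoted above) are illustrative choices, not the implementation running on the PDA.

import numpy as np

def night_candidate_mask(rgb, channel, ratio=0.92):
    """rgb: H x W x 3 array; channel: 0 for the R plane (red figure), 1 for G (green figure)."""
    h = rgb.shape[0]
    roi = rgb[int(0.10 * h):int(0.75 * h)]           # discard 10% at the top, 25% at the bottom
    plane = roi[:, :, channel].astype(np.float32)    # work on the chosen colour plane
    threshold = ratio * plane.max()                  # 90%-95% of the maximum intensity
    return plane >= threshold                        # binary mask of the very bright regions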
3.1.2- Contour Detection:
Night: The contour detection is implemented using mathematical morphology [2]. On the binary image received from the previous step we use Dilation (D) and Erosion (E) operations:

D( E( D(image) ) ) - E( D(image) )

with this structuring element:

M = { 1 1 1
      1 1 1
      1 1 1 }

Using this structuring element for dilation and erosion, we obtain 4-connected, closed and discrete curve contours.
Day: The best results have been obtained using the gradient [3], [7]. At this point we also use « thinning » algorithms to get simple contours.
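A minimal sketch of this contour extraction, assuming a boolean NumPy mask as input and using SciPy's morphological operators (the on-device implementation is of course different), could be:

import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def morphological_contours(binary_img):
    """Compute D(E(D(I))) - E(D(I)) with a 3x3 structuring element of ones."""
    se = np.ones((3, 3), dtype=bool)                          # structuring element M
    ed = binary_erosion(binary_dilation(binary_img, se), se)  # E(D(I))
    return binary_dilation(ed, se) & ~ed                      # D(E(D(I))) minus E(D(I))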
In (Fig. 3) we can see the result of the segmentation stage. The left silhouette corresponds to the contour of the forms, and the right one represents its « thinning ».

Fig. 3 Results after the segmentation and the transformation into discrete contours.

3.2- Recognition Stage:
In this stage we receive the binary image of the 4-connected, closed and discrete curve contours.
The first approximation uses the Freeman Code [4] to code the silhouette models (examples of the models in Fig. 4), which differ depending on the size (height). After this, we code all the figures found in the segmented image.

Fig. 4 Example of the models used.

With these strings, we use multiple detection criteria:
- the Edit Distance, to find the most similar string;
- the determination of an axial symmetry (red silhouette);
- structure detection using structural methods (a grammar), to decide whether we have found a traffic-light silhouette.

3.2.1- Freeman Code:
The Freeman Code allows us to transform each object we have found into a string. (Fig. 5) shows the alphabet for 4-connectivity:

Fig. 5 Freeman Alphabet, 4-connected.

To reduce the number of strings to compare, we reject the forms larger than a given size, because they cannot be silhouettes.
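A minimal sketch of this coding step is given below. It assumes the contour pixels have already been traced in order; the mapping of the four moves onto the symbols {0, 1, 2, 3} is an illustrative choice, since Fig. 5 fixes the exact alphabet.

# (delta_row, delta_col) -> symbol
FREEMAN_4 = {(0, 1): '0', (-1, 0): '1', (0, -1): '2', (1, 0): '3'}

def freeman_code(contour):
    """contour: list of (row, col) pixels of a closed, 4-connected contour, in tracing order."""
    pairs = zip(contour, contour[1:] + contour[:1])            # close the loop
    return ''.join(FREEMAN_4[(r1 - r0, c1 - c0)] for (r0, c0), (r1, c1) in pairs)

# Example: a 2 x 2 square traced clockwise from its top-left pixel gives '0321'.
# freeman_code([(0, 0), (0, 1), (1, 1), (1, 0)])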
3.2.2- Approximation of the curves:
We have defined a non-terminal alphabet: X = { a, b, c, d }. In (Fig. 6) we can see an example of the transformation. This alphabet is defined by the grammar shown below:

S → a | b | c | d

a → a1 | a2 | a3 | a3' | a4 | a4'
a1 → a1d | a1i
a1d → 0 a1d | 0 | λ
a1i → 2 a1i | 2 | λ
a2 → a2ab | a2ar
a2ab → 3 a2ab | 3 | λ
a2ar → 1 a2ar | 1 | λ
a3 → 0 a3d | 3 a3ab
a3d → 3 a3ab | 3 | λ
a3ab → 0 a3d | 0 | λ
a3' → 2 a3i | 1 a3ar
a3i → 1 a3ar | 1 | λ
a3ar → 2 a3i | 2 | λ
a4 → 0 a4d | 1 a4ar
a4d → 1 a4ar | 1 | λ
a4ar → 0 a4d | 0 | λ
a4' → 2 a4i | 3 a4ab
a4i → 3 a4ab | 3 | λ
a4ab → 2 a4i | 2 | λ

b → b1 | b2 | b3 | b4
b1 → a4 b11 a3 | a1d b111 a1d
b11 → a1d b111 a1d | λ
b111 → a4 b11 a3 | λ
b2 → a2ab b22 a2ab | a3 b222 a4'
b22 → a3 b222 a4' | λ
b222 → a2ab b22 a2ab | λ
b3 → a1i b33 a1i | a4' b333 a3'
b33 → a4' b333 a3' | λ
b333 → a1i b33 a1i | λ
b4 → a2ar b44 a2ar | a3' b444 a4
b44 → a3' b444 a4 | λ
b444 → a2ar b44 a2ar | λ

c → c1 | c2 | c3 | c4
c1 → a3' c11 a4' | a1i c111 a1i
c11 → a1i c111 a1i | λ
c111 → a3' c11 a4' | λ
c2 → a2ar c22 a2ar | a4 c222 a3'
c22 → a4 c222 a3' | λ
c222 → a2ar c22 a2ar | λ
c3 → a1d c33 a1d | a3 c333 a4
c33 → a3 c333 a4 | λ
c333 → a1d c33 a1d | λ
c4 → a2ab c44 a2ab | a4' c444 a3
c44 → a4' c444 a3 | λ
c444 → a2ab c44 a2ab | λ

d → d1 | d2 | d3 | d4
d1 → a2ar a1i a2ab | a2ar a1d a2ab
d2 → a1i a2ab a1d | a1i a2ar a1d
d3 → a2ab a1d a2ar | a2ab a1i a2ar
d4 → a1d a2ab a1i | a1d a2ar a1i

Fig. 6 Alphabet and example.

The code for figure 6 is:
Silhouette = abcbabcababadabbadbdacacb
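The a-family of productions is regular: a1 and a2 generate straight runs of a single symbol, while a3, a3', a4 and a4' generate strictly alternating pairs of symbols (diagonal staircases). As an illustration only, these productions can therefore be matched with regular expressions; the helper names below are illustrative and this is not the parser used on the PDA.

import re

def run(sym):                        # a1d, a1i, a2ab, a2ar: zero or more copies of one symbol
    return f"{sym}*"

def alternating(x, y):               # a3, a3', a4, a4': non-empty strings alternating x and y
    return f"(?:{x}(?:{y}{x})*{y}?|{y}(?:{x}{y})*{x}?)"

A_PATTERNS = {
    "a1d": run("0"), "a1i": run("2"), "a2ab": run("3"), "a2ar": run("1"),
    "a3": alternating("0", "3"), "a3'": alternating("2", "1"),
    "a4": alternating("0", "1"), "a4'": alternating("2", "3"),
}

def derives(nonterminal, s):
    """True if the whole string s can be derived from the given a-family non-terminal."""
    return re.fullmatch(A_PATTERNS[nonterminal], s) is not None

# e.g. derives("a3", "0303") is True, derives("a3", "033") is False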
3.2.3- Similarity Measure:
The similarity measure between strings is calculated with the Edit Distance [4]. The Edit Distance defines three basic operations that allow us to change one string into another. We associate a cost with each operation:

Substitution: a → b, cost γ(a, b)
Insertion: λ → a, cost γ(λ, a)
Elimination: a → λ, cost γ(a, λ)

The edit distance is the total cost of the basic operations made to convert one string into the other. The additivity of this criterion ensures that we can use Dynamic Programming.
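A minimal sketch of this computation, with unit costs for γ (the costs actually used are a tuning choice), is:

def edit_distance(s, t,
                  gamma_sub=lambda a, b: 0 if a == b else 1,
                  gamma_ins=lambda a: 1,
                  gamma_del=lambda a: 1):
    """Minimum total cost of substitutions, insertions and eliminations turning s into t."""
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):                        # eliminate the first i symbols of s
        d[i][0] = d[i - 1][0] + gamma_del(s[i - 1])
    for j in range(1, m + 1):                        # insert the first j symbols of t
        d[0][j] = d[0][j - 1] + gamma_ins(t[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + gamma_del(s[i - 1]),                   # elimination
                d[i][j - 1] + gamma_ins(t[j - 1]),                   # insertion
                d[i - 1][j - 1] + gamma_sub(s[i - 1], t[j - 1]),     # substitution
            )
    return d[n][m]

# e.g. edit_distance("0123", "013") == 1 with unit costs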
We can also use another criterion: the symmetry of the red silhouette. In (Fig. 7) we can see this axial symmetry.

Fig. 7 Red Silhouette Symmetry.

The silhouette is represented by the string:
Silhouette = abcbcabdbacbcb

We can observe the symmetry between the marked cells (inverted values). The evaluation of this criterion is quick and orients the search.
We speak of an axial, left-right symmetry. This allows us to search for a symmetric form: if we find one, we compare it with the red silhouette model; if not, we compare it with the green one.
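A minimal sketch of this decision step is given below. The symmetry is tested here on the binary mask of the candidate (rather than on the coded cells of Fig. 7), and the model strings, the tolerance and the cost threshold are illustrative placeholders.

import numpy as np

def is_left_right_symmetric(mask, tolerance=0.9):
    """True if the candidate form roughly equals its own left-right mirror inside its bounding box."""
    if not mask.any():
        return False
    rows, cols = np.any(mask, axis=1), np.any(mask, axis=0)
    box = mask[np.ix_(rows, cols)]                   # crop to the bounding box of the form
    overlap = np.logical_and(box, box[:, ::-1]).sum()
    return overlap >= tolerance * box.sum()

def classify(mask, code, red_model, green_model, max_cost):
    """mask: binary image of the candidate; code: its coded string (sections 3.2.1-3.2.2)."""
    model = red_model if is_left_right_symmetric(mask) else green_model
    cost = edit_distance(code, model)                # sketch from section 3.2.3
    if cost > max_cost:
        return "not classified"                      # reject class (cf. the results table)
    return "red" if model is red_model else "green"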
4- Results:
This solution is a good approach to the problem. The results are shown in (Table 8). The segmentation stage runs into the classic image-processing problems. There are plenty of different approaches to segmentation in complex scenes, especially for the day images, but no optimal method can be found.

Fig. 9 Far traffic light and zoom.

In the table the results are given as percentages, classed as: Red, Green, Not Classified.
Table 8: Results (percentage)

                 Red-Day   Green-Day   Red-Night   Green-Night
Red                 70          3           90            2
Green                5         64            1           92
Not Classified      25         33            9            6
5- Conclusion:
The use of non-specific equipment to provide technical aids for the sensory handicapped is a good solution for their integration.
The problem treated is complex and depends on different parameters with a large variability (brightness, different forms, distance, etc.). The implementation results, in particular in the night case, are really encouraging.
The technology used is also a source of problems. The video camera used does not offer an optical zoom to take a better image of the region of interest, so we face a resolution problem in the recognition of the small traffic lights in the scene (Fig. 9); this kind of obstacle will be solved as the technology advances.

Fig. 10 Two traffic lights in the same scene.

We have worked on static images taken at the PDA's resolution. We have not yet addressed the question of camera movement, which could bring new complications.
Finally, we have found other problems, such as the one shown in (Fig. 10). In this image there are two independent traffic lights in the same scene. In this case the nearest traffic light (the biggest one in the image) has priority. So if the nearest traffic light is red and the other one is green, the system will find the right answer. But if the nearest traffic light is red, the other one is green, and something (a car, a person, ...) hides the red one, the system could wrongly consider the recognition valid.
We have also found a peculiar case (Fig. 11 and Fig. 12). This is a pathological case, where the recognition can be nearly random.
Fig. 11 Ambiguous case.
Fig. 12 Pathological case.
6- References:
[1] PDAs, http://www.lepdashop.com/
[2] Jean Serra. Image Analysis and Mathematical Morphology. Academic Press, London, 1982.
[3] John C. Russ. The Image Processing Handbook. 2nd edition, CRC Press, 1995.
[4] L. Miclet. Méthodes structurelles pour la reconnaissance des formes. Éditions Eyrolles, Cnet-ENST, 1984.
[5] J.-M. Chassery, A. Montanvert. Géométrie discrète. Hermès, Paris, 1991.
[6] A. Cornuéjols, L. Miclet. Apprentissage artificiel. Eyrolles, Paris, 2002.
[7] H. Maître. Le traitement des images. Hermès, Paris, 2003.