Traffic Lights Recognition in a Scene using a PDA

Daniel MORENO EDDOWES, [email protected]
ETSE (UAB), Escola Tècnica Superior d'Enginyeria (Universitat Autònoma de Barcelona), and UFR 6 (Paris 8), Labo GRAII, A-145, Université Paris 8, 93526 St. Denis. http://ufr6.univ-paris8.fr/desshandi/

Jaime LOPEZ KRAHE, [email protected]
Director of UFR 6 (Paris 8), Labo GRAII, A-145, Université Paris 8, 93526 St. Denis. http://ufr6.univ-paris8.fr/desshandi/

Abstract: We present a project for the interpretation of pedestrian traffic lights, oriented towards blind people. It is implemented on a mobile and autonomous system: a PDA with an integrated video camera. The interpretation of the traffic lights is based on the analysis of the images taken by the video camera, and the general problem can be defined as finding a given object in a complex scene. The complete system integrates other functions adapted for disabled people, for example a vocal agenda featuring word recognition for appointments. The problem is subdivided into two processes: the segmentation process and the recognition process. In the first stage, the image is processed to find the contours of the objects of interest using a colour segmentation. In the second stage, we use structural methods on the selected patterns to decide whether the pedestrian traffic light is green (walking figure) or red (standing figure). In the medium term, the system should allow the learning and recognition of multiple images specific to the user (multiple objects), with the goal of becoming an adaptable technical aid. We thank « Pocket Entreprises » for the facilities offered with the material used.

Keywords: Scene Analysis, Image Segmentation, Pattern Recognition, Technical Help, Edit Distance, Compensating Help for Handicapped.

1-Introduction:

The objective of this project is to provide a solution for one of the problems that visually handicapped persons face in an urban environment: crossing a road governed by a traffic light. The use of a PDA (Personal Digital Assistant) is a good solution because it is a portable, autonomous and generic system at an affordable price. Current PDAs are powerful enough to handle the high computational cost this project requires (processor at 400 MHz). Their evolution is quick, and in the medium term we will find PDAs with features similar to those of today's PCs. To end the introduction, we mention that this application is part of a larger project of adaptable technical help for handicapped persons, entirely implemented on a PDA.

2-Problem Presentation

The recognition problem is difficult because of the complexity of the scene: the variability of the position of the traffic light, the different sizes depending on the distance or the optics, the different shapes of the silhouette, and the large variation in the brightness of the colours (Fig.1, Fig.2). Other cases confront us with interesting theoretical (or heuristic) problems: road crossings in two stages, or contradictory information (Fig.10). A database with 200 images is freely available at http://ufr6.univ-paris8.fr/desshandi/bdfeux, so as to allow comparative results.

Fig. 1 Difference between night and day images.
Fig. 2 Zebra-crossing and traffic light post situation.

We have divided the images into two types, according to contrast and brightness: the night images and the day images. The day images have a large intensity variation on the RGB planes, which means that colour is not a determining factor.
On the other hand, in the night images colour can be used as a guiding factor that allows us to discard the regions of no interest. According to this criterion, we implement two different versions of the segmentation stage. In the first place, a colour segmentation (green, red) is a necessary condition to define the regions of interest. For the day images, the definition of the colour space, the processing time and the error percentage are all higher; the number of false alarms can be very high (not all the selected objects are traffic lights), so a colour process alone is not enough. The second stage produces a binary segmented image; this stage is the same for the day and night images.

The objective is to guarantee 100% recognition of the green traffic light, so as to ensure the safety of the crossing. That implies an increase of the reject class. Even so, we think that 100% recognition cannot be assured with current knowledge. We have detected some ambiguous cases that cannot be solved. The case shown in (Fig.11), with two traffic lights next to each other, is an example: the right traffic light, which is green, corresponds to the perpendicular street, which means that we cannot cross. This pathology got worse a few days later: in (Fig.12) we can see the same traffic lights after an accident. Now the left traffic light has become nearly invisible!

3- System Implementation:

We present in this section an overview of the procedure used in the two stages, the segmentation stage and the recognition stage.

3.1- Segmentation Stage

The segmentation stage finds the contours of the possible forms that satisfy the given criteria. The segmentation is divided into two processes:

3.1.1- Elimination of the regions without interest:

Night: In the night images (Fig.1), a colour threshold isolates the objects that are very luminous. The threshold value can be high, between 90% and 95% of the maximum intensity value in the R or G plane, depending on the silhouette searched. This process allows us to isolate the relevant forms.

Day: The colour threshold is less efficient. It is complemented with other knowledge criteria that help locate the traffic light in the scene, such as the position of the zebra crossing and of the traffic light post (Fig.2).

In both cases, if we observe the scenes, it is not necessary to use the top and bottom parts of the image, because the traffic light will not be there. We therefore discard 35% of the image: 10% at the top and 25% at the bottom.

3.1.2- Contour Detection:

Night: The contour detection is implemented using mathematical morphology [2]. On the binary image received from the previous step we apply dilation (D) and erosion (E) operations:

D( E( D(image) ) ) − E( D(image) )

with the structuring element:

M = 1 1 1
    1 1 1
    1 1 1

Using this structuring element for dilation and erosion we obtain 4-connected, closed, discrete curve contours.

Day: The best results have been obtained using the gradient [3], [7]. At this point we also use « thinning » algorithms to obtain simple contours. In (Fig.3) we can see the result of the segmentation stage: the left silhouette corresponds to the contour of the forms, and the right one represents its « thinning ».
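As a concrete illustration of the night-time processing just described, the sketch below (Python with numpy and scipy, which are assumptions of this example rather than the libraries used on the PDA) applies the colour threshold at 90-95% of the maximum intensity in the chosen plane, discards the top 10% and bottom 25% of the image, and extracts the contours with the morphological expression D(E(D(image))) − E(D(image)) over the 3×3 structuring element of ones. The function name and default values are illustrative.

    import numpy as np
    from scipy.ndimage import binary_dilation, binary_erosion

    # 3x3 structuring element of ones, as given in the text.
    M = np.ones((3, 3), dtype=bool)

    def segment_night(rgb, channel=1, threshold_fraction=0.92):
        """Night-time segmentation sketch.

        rgb                : HxWx3 uint8 image.
        channel            : 1 for the G plane (green figure), 0 for R (red figure).
        threshold_fraction : 0.90-0.95 of the maximum intensity, as in the text.
        Returns a binary image of 4-connected, closed, discrete contours.
        """
        h, w, _ = rgb.shape
        plane = rgb[:, :, channel].astype(np.float32)

        # Discard the top 10% and bottom 25% of the image, where the
        # pedestrian traffic light is not expected to appear.
        roi = np.zeros((h, w), dtype=bool)
        roi[int(0.10 * h):int(0.75 * h), :] = True

        # Colour threshold: keep only the very luminous pixels.
        binary = (plane >= threshold_fraction * plane.max()) & roi

        # Contour extraction by mathematical morphology:
        #   D( E( D(image) ) ) - E( D(image) )
        d1 = binary_dilation(binary, structure=M)
        e1 = binary_erosion(d1, structure=M)
        d2 = binary_dilation(e1, structure=M)
        return d2 & ~e1  # set difference of the two binary images

For the day images, the same contour step would be preceded by the gradient and « thinning » front end described above instead of the simple threshold.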
To reduce the number of strings to compare, we also reject the forms larger than a given size, since they cannot be silhouettes.

Fig. 3 Results after the segmentation and the transformation into discrete contours.

3.2- Recognition Stage:

In this stage we receive the binary image of the 4-connected, closed, discrete curve contours. The first step is to use the Freeman code [4] to code the silhouette models (examples of the models in Fig.4), which differ depending on the size (height). After this, we code all the figures found in the segmented image. With these strings, we use multiple detection criteria to determine whether we have found a traffic light silhouette: the edit distance, to find the most similar string; the determination of axial symmetry (red silhouette); and structure detection using structural methods (grammar).

Fig. 4 Example of the models used.

3.2.1- Freeman Code:

The Freeman code allows us to transform each object we have found into a string. (Fig.5) shows the alphabet for 4-connectivity.

Fig. 5 Freeman alphabet, 4-connected.

3.2.2- Approximation Curves:

We have defined a non-terminal alphabet X = { a, b, c, d }. In (Fig.6) we can see an example of the transformation. This alphabet is defined by the grammar shown below:

S → a | b | c | d

a → a1 | a2 | a3 | a3' | a4 | a4'
a1 → a1d | a1i
a1d → 0 a1d | 0 | λ
a1i → 2 a1i | 2 | λ
a2 → a2ab | a2ar
a2ab → 3 a2ab | 3 | λ
a2ar → 1 a2ar | 1 | λ
a3 → 0 a3d | 3 a3ab
a3d → 3 a3ab | 3 | λ
a3ab → 0 a3d | 0 | λ
a3' → 2 a3i | 1 a3ar
a3i → 1 a3ar | 1 | λ
a3ar → 2 a3i | 2 | λ
a4 → 0 a4d | 1 a4ar
a4d → 1 a4ar | 1 | λ
a4ar → 0 a4d | 0 | λ
a4' → 2 a4i | 3 a4ab
a4i → 3 a4ab | 3 | λ
a4ab → 2 a4i | 2 | λ

b → b1 | b2 | b3 | b4
b1 → a4 b11 a3 | a1d b111 a1d
b11 → a1d b111 a1d | λ
b111 → a4 b11 a3 | λ
b2 → a2ab b22 a2ab | a3 b222 a4'
b22 → a3 b222 a4' | λ
b222 → a2ab b22 a2ab | λ
b3 → a1i b33 a1i | a4' b333 a3'
b33 → a4' b333 a3' | λ
b333 → a1i b33 a1i | λ
b4 → a2ar b44 a2ar | a3' b444 a4
b44 → a3' b444 a4 | λ
b444 → a2ar b44 a2ar | λ

c → c1 | c2 | c3 | c4
c1 → a3' c11 a4' | a1i c111 a1i
c11 → a1i c111 a1i | λ
c111 → a3' c11 a4' | λ
c2 → a2ar c22 a2ar | a4 c222 a3'
c22 → a4 c222 a3' | λ
c222 → a2ar c22 a2ar | λ
c3 → a1d c33 a1d | a3 c333 a4
c33 → a3 c333 a4 | λ
c333 → a1d c33 a1d | λ
c4 → a2ab c44 a2ab | a4' c444 a3
c44 → a4' c444 a3 | λ
c444 → a2ab c44 a2ab | λ

d → d1 | d2 | d3 | d4
d1 → a2ar a1i a2ab | a2ar a1d a2ab
d2 → a1i a2ab a1d | a1i a2ar a1d
d3 → a2ab a1d a2ar | a2ab a1i a2ar
d4 → a1d a2ab a1i | a1d a2ar a1i

Fig. 6 Alphabet and example.

The coded string for the silhouette of figure 6 is:

Silhouette = abcbabcababadabbadbdacacb

3.2.3- Similarity Measure:

The similarity between strings is measured with the edit distance [4]. The edit distance defines three basic operations that allow us to change one string into another, and a cost is associated with each operation:

Substitution: a → b, cost γ(a, b)
Insertion: λ → a, cost γ(λ, a)
Deletion: a → λ, cost γ(a, λ)

The edit distance is the minimal total cost of the basic operations needed to convert one string into the other. The additivity of the criterion ensures that we can use dynamic programming.

We can also use another criterion: the symmetry of the red silhouette. In (Fig.7) we can see the axial symmetry. The silhouette is represented by the string:

Silhouette = abcbcabdbacbcb

Fig. 7 Red silhouette symmetry.

We can observe the symmetry between the marked cells (inverted values). The evaluation of this criterion is quick and guides the search. We speak of a left-right axial symmetry. This allows us to search for a symmetric form: if we find one, we compare it with the red silhouette model; if not, we compare it with the green one.
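To make the two matching criteria concrete, the sketch below (Python) computes the edit distance by dynamic programming over the coded silhouette strings and tests the left-right axial symmetry of a candidate. The unit costs, the tolerance values, the MIRROR mapping of "inverted" symbols and the classify helper are illustrative assumptions; the paper does not specify them.

    def edit_distance(s, t, sub_cost=1, ins_cost=1, del_cost=1):
        """Edit distance between two coded silhouette strings.

        The three basic operations carry the costs gamma(a,b), gamma(lambda,a)
        and gamma(a,lambda); unit costs are used here for illustration.
        """
        n, m = len(s), len(t)
        # d[i][j] = minimal cost of transforming s[:i] into t[:j]
        d = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            d[i][0] = i * del_cost
        for j in range(1, m + 1):
            d[0][j] = j * ins_cost
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                subst = 0 if s[i - 1] == t[j - 1] else sub_cost
                d[i][j] = min(d[i - 1][j - 1] + subst,   # substitution / match
                              d[i - 1][j] + del_cost,    # deletion
                              d[i][j - 1] + ins_cost)    # insertion
        return d[n][m]


    # Hypothetical left-right inversion of the curve symbols; the paper only
    # states that mirrored cells carry "inverted values".
    MIRROR = {"a": "c", "c": "a", "b": "d", "d": "b"}

    def is_axially_symmetric(code, mirror=MIRROR, tolerance=0):
        """Left-right axial symmetry test: the string read backwards, with
        each symbol inverted, should be (almost) equal to the string itself."""
        mirrored = "".join(mirror.get(ch, ch) for ch in reversed(code))
        return edit_distance(code, mirrored) <= tolerance


    # Sketch of the decision described in the text: a symmetric form is
    # compared with the red (standing) models, otherwise with the green
    # (walking) models; max_distance is an illustrative acceptance threshold.
    def classify(code, red_models, green_models, max_distance):
        models = red_models if is_axially_symmetric(code, tolerance=2) else green_models
        best = min(edit_distance(code, m) for m in models)
        return best <= max_distance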
4-Results:

This solution is a good approach to the problem. The results are shown in (Table 1). The segmentation stage faces the classic image processing problems: there are plenty of different approaches to segmentation in complex scenes, especially for the day images, but no optimal method can be found.

Fig. 9 Far traffic light and zoom.

In the table we can see the results, classified as: Red, Green, Not classified.

Table 1: Result table (percentage)

                    Red-Day   Green-Day   Red-Night   Green-Night
    Red               70          3           90            2
    Green              5         64            1           92
    Not classified    25         33            9            6

5-Conclusion:

The use of non-specific equipment to provide technical help for the sensorially handicapped is a good solution for their integration. The treated problem is complex and depends on several parameters with large variability (brightness, different forms, distance, etc.). The implementation results, in particular in the night case, are really encouraging.

The technology used is also a source of problems. The video camera does not offer an optical zoom to take a better image of the region of interest, so we face a resolution problem in the recognition of the small traffic lights in the scene (Fig.9); this kind of obstacle will be solved with the advance of technology. We have worked on static images taken at the PDA resolution. We have not yet addressed the question of camera movement, which could bring new complications.

Fig. 10 Two traffic lights in the same scene.

Finally, we have found other problems such as the one shown in (Fig.10). In this image there are two independent traffic lights in the same scene. In this case the nearest traffic light (the biggest one in the image) has priority. So if the nearest traffic light is red and the other one is green, the system will find the right solution. But if the nearest traffic light is red, the other one is green, and something (a car, a person, ...) covers the red one, the system could give an erroneous recognition. We have also found a peculiar case (Fig. 11 and Fig. 12). This is a pathological case, where the recognition can be nearly random.

Fig. 11 Ambiguous case.
Fig. 12 Pathological case.

6- References:

[1] PDAs. http://www.lepdashop.com/
[2] J. Serra. Image Analysis and Mathematical Morphology. Academic Press, London, 1982.
[3] J. C. Russ. The Image Processing Handbook. 2nd edition, CRC Press, 1995.
[4] L. Miclet. Méthodes structurelles pour la reconnaissance des formes. Éditions Eyrolles, Cnet-Enst, 1984.
[5] J. M. Chassery, A. Montanvert. Géométrie discrète. Hermès, Paris, 1991.
[6] A. Cornuéjols, L. Miclet. Apprentissage artificiel. Eyrolles, Paris, 2002.
[7] H. Maître. Le traitement des images. Hermès, Paris, 2003.