Navigation through Crosswalks with the Bionic Eyeglass

Mihály Radványi and Kristóf Karacs
Faculty of Information Technology, Pázmány Péter Catholic University, Budapest, Hungary
[email protected], [email protected]

Abstract—In this paper we present an algorithm that helps blind and visually impaired people navigate through urban environments by detecting pedestrian crosswalks. In addition to detecting the presence of a crosswalk, its orientation and position with respect to the camera are also determined, to help users approach intersections and traverse them.

Keywords – bionic, CNN, crosswalk, navigation

I. INTRODUCTION

The Bionic Eyeglass [1] is a portable device recently proposed to aid blind and visually impaired people in everyday navigation, orientation, and recognition tasks. In our previous work we introduced the concept and the prototype of the Bionic Eyeglass [2], built using the Bi-i visual computer [3] as its main computational platform. The Bi-i is based on the Cellular Neural/Nonlinear Network Universal Machine (CNN-UM) [4] and the underlying Cellular Wave Computing principle. All algorithms for spatio-temporal processing proposed in this paper can be realized on Cellular Wave Computers; references for the instruction templates can be found in [5]. Our experiments showed good results in detecting and recognizing pedestrian crosswalks, as Fig. 1 shows.

Figure 1. Crosswalk detection results marked on sample input frames.

II. TOWARDS NAVIGATION

By position we mean the position within the frame where the crosswalk appears (left, right, middle, up, or down). Orientation gives information on the direction in which the user should proceed. Position and orientation are the two most important factors in guiding the user across the crosswalk correctly.
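The five position categories above can be illustrated with a simple mapping. This is a hypothetical sketch, not part of the original system: the function name and the 0.35/0.65 thresholds are illustrative assumptions, and the crosswalk's center of mass is assumed to be given in normalized frame coordinates.

```python
# Hypothetical sketch: map a normalized center-of-mass point to one of
# the five position categories (left, right, up, down, middle).
# The 0.35 / 0.65 thresholds are illustrative assumptions.

def position_category(cx: float, cy: float) -> str:
    """Classify a normalized (0..1) center point into a feedback category."""
    if cx < 0.35:
        return "left"
    if cx > 0.65:
        return "right"
    if cy < 0.35:
        return "up"
    if cy > 0.65:
        return "down"
    return "middle"

print(position_category(0.2, 0.5))   # left
print(position_category(0.5, 0.5))   # middle
```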
Recognizing crosswalks and detecting their presence is important, but to create a fully functional system that navigates and helps the user reach a crosswalk and traverse an intersection, new methods were needed for estimating the orientation, position, and direction of the crosswalk.

A. Estimating position

In order to help the user move the camera towards the best position, the position of the crosswalk within the frame is estimated. To calculate the center of mass of the crosswalk, we use the convex hull fitted on the stripes, obtained through a series of three CNN templates: first a vertical shadow operator is applied upwards and downwards separately, and then the intersection of the two results is taken with the LOGAND template. The center of mass can then easily be calculated and the position within the frame determined. Based on the position of the crosswalk, a feedback command tells the user in which direction to move the camera: left, right, up, or down.

B. Estimating orientation

After the proper camera position has been reached, the orientation of the crosswalk is calculated. The arrow in Fig. 2 marks the lengthwise center line of the crosswalk, which can be thought of as an ideal pass-through path. It is obtained by fitting lines on the left-side and right-side pixels of the stripes and taking the angle bisector. Our experiments show that analyzing the tangent of the closest (base) line is sufficient for satisfactory results. First the images are downsampled and the coordinates of the bottommost black pixels are collected. A line with a minimal number of outliers is fitted on these coordinates by taking the slope from the mode of the coordinate differences. Based on the slope of the baseline, the system can identify a suggested path for the user to follow. Fig. 3 shows the results of position and orientation estimation for sample image frames.
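The baseline fitting described above can be sketched as follows: collect the bottommost black pixel per column and take the mode ("modus") of the differences between neighboring columns as an outlier-tolerant slope. This is a minimal NumPy illustration, not the original CNN-UM implementation; the convention that black pixels are stored as 1 in the binary image is an assumption.

```python
# Sketch of baseline fitting via the mode of coordinate differences.
# Assumption: `binary` is a 2-D NumPy array with black pixels == 1.
from collections import Counter
import numpy as np

def baseline_slope(binary: np.ndarray) -> float:
    """Estimate the slope of the bottommost stripe edge (the baseline)."""
    h, w = binary.shape
    pts = []
    for x in range(w):
        rows = np.flatnonzero(binary[:, x])
        if rows.size:
            pts.append((x, rows.max()))          # bottommost black pixel
    if len(pts) < 2:
        return 0.0
    # Per-column slope estimates between neighboring sample points
    diffs = [(y2 - y1) / (x2 - x1)
             for (x1, y1), (x2, y2) in zip(pts, pts[1:])]
    # The mode of the differences ignores the few outlier columns
    mode, _ = Counter(round(d, 2) for d in diffs).most_common(1)[0]
    return mode
```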
The detection method described previously [6,7] gives a confidence value indicating whether there is a crosswalk in front of the user, but carries no additional information about its position and orientation; the estimates described above supply this missing information. (This work was supported by the Bolyai János Research Scholarship.)

Figure 2. The basic concept of crosswalk direction estimation.

Figure 3. Visualization of orientation and position estimates. The arrows show how to move the camera; the stars represent the sample points for baseline fitting. The baselines are shown shifted for better visibility.

C. Using key frames

Although processing each frame with the whole algorithm gives robust results, it wastes time and processing power, because consecutive frames are highly correlated and similar. By selecting key frames that are fully processed, it suffices to carry out much more restricted operations on the intermediate ones, using the hypotheses obtained earlier. Our key-framing method works as follows. If a crosswalk is found, a bounding rectangle can easily be determined as the Region of Interest (ROI) for the following frame. In that candidate area only a quick process is carried out – such as a THRESHOLD followed by an EDGE template [8] – to obtain zebra confidence values. Since these calculations are fast enough, we are still able to estimate the position and the orientation of the crosswalk on these frames. This process is repeated until the confidence value falls below a threshold, or the end of a five-frame cycle is reached.
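The key-frame cycle above can be sketched schematically. The helper functions `full_detect` and `quick_confidence` are hypothetical stand-ins for the full detection algorithm and for the quick THRESHOLD + EDGE check inside the ROI; the five-frame cycle length and the confidence cutoff follow the text, though the numeric threshold is an assumption.

```python
# Schematic of the key-frame scheme: run the full algorithm on key
# frames, and only a cheap ROI check on the frames in between.
CYCLE = 5          # a full detection at least every 5 frames
MIN_CONF = 0.5     # hypothetical confidence threshold

def process_stream(frames, full_detect, quick_confidence):
    """full_detect(frame) -> (confidence, roi); quick_confidence(frame, roi) -> confidence."""
    results = []
    roi = None
    since_key = 0
    for frame in frames:
        if roi is None or since_key >= CYCLE:
            conf, roi = full_detect(frame)        # key frame: full algorithm
            since_key = 0
        else:
            conf = quick_confidence(frame, roi)   # quick check inside the ROI
            if conf < MIN_CONF:
                roi = None                        # force re-detection next frame
        since_key += 1
        results.append(conf)
    return results
```

With stub detectors, a seven-frame stream yields full detections on frames 0 and 5, i.e. one key frame per five-frame cycle.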
This can be done in a fraction of the running time of the original algorithm, which makes it possible to run algorithms in parallel, in real time, on an image flow. Fig. 4 illustrates the concept.

Figure 4. (a) Crosswalk detected and its bounding rectangle; (b) ROI indicated on the following frame.

III. RESULTS AND DISCUSSION

The improved algorithm has been tested on a much broader range of possible inputs, with an increased number of test images. The performance of crosswalk detection on five prerecorded video flows with a total of over 1700 frames is shown in Table I. The proposed method performed well in 73.4% of the cases, and false positive (misclassified non-crosswalk) results account for only 1% of them. Considering that two of the videos were recorded in bad lighting conditions, we made a separate summary containing only the better quality ones; there the ratio of correct answers was 82.5%. From a practical point of view, false positives are much more dangerous than false negatives, because they may induce the person to cross at a point where no crosswalk is present, so this value has to be minimized at all costs. False negative (missed crosswalk) results appeared in 25% of the cases over all five videos and in 16% on the good quality ones. A high percentage of the misclassified cases is caused by the slow fade-in or fade-out effect during the adaptation of the built-in auto-gain function of the mobile camera.

Results on position estimation were compared to human observations. Supervisors were asked to sort the correctly classified crosswalk frames into five groups based on the position within the frame (strong-left, left, center, right, strong-right). These observations were compared to the three-valued – left, center, right – algorithmic outputs, yielding two types of misclassification, hard and smooth. Hard misclassifications appeared in only 3% of the more than 600 frames containing a crosswalk, and smooth ones in 17%.

TABLE I. DETECTION RESULTS ON VIDEO FRAMES

                         Crosswalk present    No crosswalk present
Crosswalk detected       613                  15
Crosswalk not detected   447                  666

IV. CONCLUSIONS

Optimizing the processing of the video flows allowed us to estimate the position and orientation of crosswalks. This additional information provides users with useful feedback and enables them to build a more reliable and more complete model of their environment.

REFERENCES

[1] K. Karacs, A. Lázár, R. Wagner, D. Bálya, T.
Roska, and M. Szuhaj, “Bionic Eyeglass: an Audio Guide for Visually Impaired,” in Proc. of the First IEEE Biomedical Circuits and Systems Conference (BIOCAS 2006), London, UK, Dec. 2006, pp. 190–193.
[2] K. Karacs, A. Lázár, R. Wagner, B. Bálint, T. Roska, and M. Szuhaj, “Bionic Eyeglass: The First Prototype, A Personal Navigation Device for Visually Impaired,” in Proc. of the First Int’l Symp. on Applied Sciences in Biomedical and Communication Technologies (ISABEL 2008), Aalborg, Denmark, 2008.
[3] Á. Zarándy and C. Rekeczky, “Bi-i: a standalone ultra high speed cellular vision system,” IEEE Circuits Syst. Mag., vol. 5, no. 2, pp. 36–45, 2005.
[4] T. Roska and L. O. Chua, “The CNN universal machine: an analogic array computer,” IEEE Trans. Circuits Syst. II, vol. 40, pp. 163–173, Mar. 1993.
[5] K. Karacs, Gy. Cserey, and Á. Zarándy, Software Library for Cellular Wave Computing Engines, 2010. [Online]. Available: http://cnntechnology.itk.ppke.hu/Template_library_v3.1.pdf, visited on 24-09-2010.
[6] K. Karacs, M. Radványi, M. Görög, and T. Roska, “A Mobile Visual Navigation Device: new algorithms for crosswalk and pictogram recognition,” in Proc. of the 2nd Int’l Symp. on Applied Sciences in Biomedical and Communication Technologies (ISABEL 2009), Bratislava, Slovakia, 2009.
[7] M. Radványi, G. Pazienza, and K. Karacs, “Crosswalk Recognition through CNNs for the Bionic Camera: Manual vs. Automatic Design,” in Proc. of the European Conf. on Circuit Theory and Design (ECCTD 2009), Antalya, Turkey, 2009.