
Navigation through Crosswalks with the Bionic Eyeglass

Mihály Radványi
Faculty of Information Technology
Pázmány Péter Catholic University
Budapest, Hungary
[email protected]

Kristóf Karacs
Faculty of Information Technology
Pázmány Péter Catholic University
Budapest, Hungary
[email protected]
Abstract— In this paper we present an algorithm that helps blind and visually impaired people navigate through urban environments by detecting pedestrian crosswalks. In addition to detecting the presence of a crosswalk, its orientation and position relative to the camera are also determined, to support both the approach to intersections and their traversal.
Keywords – bionic, CNN, crosswalk, navigation
I. INTRODUCTION
The Bionic Eyeglass [1] is a portable device recently
proposed to aid blind and visually impaired people in everyday
navigation, orientation, and recognition tasks.
In our previous work we introduced the concept and the prototype of the Bionic Eyeglass [2], which was built using the Bi-i visual computer [3] as its main computational platform. The Bi-i is based on the Cellular Neural/Nonlinear Network – Universal Machine (CNN-UM) [4] and the underlying Cellular Wave Computing principle. All algorithms for spatio-temporal processing proposed in this paper are realizable on Cellular Wave Computers, and references for the instruction templates can be found in [5].
Our experiments showed good results in detecting and recognizing pedestrian crosswalks, as shown in Fig. 1.
Figure 1. Crosswalk detection results marked on sample input frames.
II. TOWARDS NAVIGATION

Recognizing crosswalks and detecting their presence is important, but to create a fully functional system that navigates the user to a crosswalk and helps them traverse an intersection, new methods were needed to estimate the position, orientation, and direction of the crosswalk.

The method described previously [6,7] gives a confidence value indicating whether there is a crosswalk in front of the user, but does not provide any additional information on its position and orientation. By position we mean the position within the frame where the crosswalk appears (left, right, middle, up, or down). Orientation gives information on the direction in which the user should proceed. Position and orientation are the two most important factors in guiding the user across the crosswalk correctly.

A. Estimating position
In order to help the user move the camera towards the best position, the position of the crosswalk within the frame is estimated. To calculate the center of mass of the crosswalk we use the convex hull fitted on the stripes. This is done through a series of three CNN templates: first we apply a vertical shadow operator upwards and downwards separately, and then we take the intersection of the two results using the LOGAND template. The center of mass can then easily be calculated and the position within the frame determined. Based on the position of the crosswalk, a feedback command is given to the user indicating which direction to move his hand: left, right, up, or down.

B. Estimating orientation
After the proper position has been reached with the camera, the orientation of the crosswalk is calculated. The arrow in Fig. 2 marks the lengthwise center line of the crosswalk, which can be thought of as an ideal pass-through path. We obtain it by fitting lines on the left-side and right-side pixels of the stripes and taking their angle bisector.
Our experiments show that it is enough to analyze the tangent of the closest (base) line to arrive at satisfactory results. First the images are downsampled, and the coordinates of the bottommost black pixels are collected. A line with a minimal number of outliers is fitted on these coordinates by calculating the slope from the mode of the coordinate differences. Based on the slope of the baseline, the system can identify a suggested path for the user to follow.
Fig. 3 shows the results of position and orientation estimation for sample image frames.

This research was supported by the Bolyai János Research Scholarship.
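The position-estimation step of Sec. II.A has a direct array equivalent. The sketch below is ours, not the CNN-UM implementation: it assumes a binary stripe mask (True where stripe pixels were detected) and mimics the two vertical shadow templates with cumulative projections, combined by a logical AND, before taking the center of mass.

```python
import numpy as np

def crosswalk_position(mask: np.ndarray) -> tuple[float, float]:
    """Estimate the crosswalk's center of mass within the frame.

    mask: 2-D boolean array, True where stripe pixels were detected.
    Mimics the three-template CNN sequence: a shadow cast downwards,
    a shadow cast upwards, and their intersection (LOGAND), which
    fills the vertical gaps between the stripes.
    """
    shadow_down = np.cumsum(mask, axis=0) > 0            # shadow propagated downwards
    shadow_up = np.cumsum(mask[::-1], axis=0)[::-1] > 0  # shadow propagated upwards
    filled = shadow_down & shadow_up                     # LOGAND of the two shadows
    ys, xs = np.nonzero(filled)
    return float(ys.mean()), float(xs.mean())            # center of mass (row, col)

def feedback(mask: np.ndarray) -> str:
    """Map the center of mass to a camera-movement command (our thresholds)."""
    h, w = mask.shape
    cy, cx = crosswalk_position(mask)
    if cx < w / 3:
        return "left"
    if cx > 2 * w / 3:
        return "right"
    if cy < h / 3:
        return "up"
    if cy > 2 * h / 3:
        return "down"
    return "center"
```

The thirds-based thresholds in `feedback` are an assumption for illustration; the paper only states that left/right/up/down commands are issued.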
Figure 2. The basic concept of crosswalk direction estimation.
From a practical point of view, false positives are much more dangerous than false negatives, because they may induce the person to cross at a point where no crosswalk is present; thus this value has to be minimized at all costs. False negative (misclassified crosswalk) results appeared in 25% of the cases over all five videos, and in 16% of the cases on the good-quality videos. A high percentage of the misclassified cases is caused by the slow fade-in or fade-out effect during the adaptation of the built-in auto-gain function of the mobile camera.
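The percentages quoted in this section follow directly from the Table I counts; as a sanity check (a sketch with the counts copied from Table I, variable names ours):

```python
# Confusion counts from Table I (five test videos, over 1700 frames)
tp, fn = 613, 447   # crosswalk present: detected / missed
fp, tn = 15, 666    # no crosswalk: falsely detected / correctly rejected

total = tp + fn + fp + tn            # 1741 frames
correct = (tp + tn) / total          # overall share of correct answers
fp_rate = fp / total                 # false positives, share of all frames
fn_rate = fn / total                 # false negatives, share of all frames

print(f"{correct:.1%} correct, {fp_rate:.1%} FP, {fn_rate:.1%} FN")
```

This reproduces the quoted rates up to rounding (the computed accuracy is 73.5% versus the quoted 73.4%).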
Figure 3. Visualization of orientation and position estimates. The arrows
show how to move the camera, the stars represent the sample points for
baseline fitting. The baselines are shown shifted for better visibility.
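The baseline-fitting step of Sec. II.B (bottommost stripe pixels per column, slope from the mode of the coordinate differences) can be sketched as follows; the input is assumed to be an already-downsampled binary stripe mask:

```python
import numpy as np
from collections import Counter

def baseline_slope(mask: np.ndarray) -> float:
    """Estimate the slope of the crosswalk's base line.

    mask: 2-D boolean array (downsampled), True = stripe pixel.
    For each column containing stripe pixels, the bottommost True row
    is collected; the slope is taken as the mode of the row differences
    between neighbouring sample columns, which suppresses outliers.
    """
    cols = [x for x in range(mask.shape[1]) if mask[:, x].any()]
    bottoms = {x: np.flatnonzero(mask[:, x])[-1] for x in cols}
    diffs = [
        (bottoms[b] - bottoms[a]) / (b - a)   # per-column slope sample
        for a, b in zip(cols, cols[1:])
    ]
    # mode ("modus") of the coordinate differences
    return float(Counter(diffs).most_common(1)[0][0])
```

Taking the mode rather than, say, a least-squares fit means a few outlier columns cannot drag the estimate, which matches the paper's robustness argument.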
C. Using key frames
Although processing each frame with the whole algorithm gives robust results, it wastes time and processing power, because consecutive frames are highly correlated. By selecting key frames that are fully processed, it suffices to carry out much more restricted operations on the intermediate ones, using the hypotheses obtained earlier.
Our key-framing method works as follows. If a crosswalk is found, a bounding rectangle can easily be determined as the Region of Interest (ROI) for the following frame. In that candidate area only a quick process is carried out – such as a THRESHOLD followed by an EDGE template [8] – to get zebra confidence values. Since these calculations are fast enough, we are still able to estimate the position and the orientation of the crosswalk on these frames. This process is repeated until the confidence value falls below a threshold, or the end of a five-frame cycle is reached. This can be done in a fraction of the running time of the original algorithm, which makes it possible to run parallel algorithms in real time on an image flow. Fig. 4 illustrates the concept.
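The key-frame scheme above can be sketched as a simple control loop. The `full_detect` and `quick_check` callables are hypothetical stand-ins for the full crosswalk algorithm and the cheap ROI check (e.g. threshold plus edge detection); the five-frame cycle and the confidence fallback follow the text:

```python
FULL_PERIOD = 5          # run the full detector at least every fifth frame
CONF_THRESHOLD = 0.5     # assumed confidence threshold (not given in the paper)

def track(frames, full_detect, quick_check):
    """Key-frame scheme: full detection on key frames, a cheap check
    inside the previous bounding box on intermediate frames.

    full_detect(frame) -> (confidence, roi)
    quick_check(frame, roi) -> confidence
    """
    roi, results = None, []
    for i, frame in enumerate(frames):
        if roi is None or i % FULL_PERIOD == 0:
            conf, roi = full_detect(frame)        # key frame: full algorithm
        else:
            conf = quick_check(frame, roi)        # intermediate frame: ROI only
            if conf < CONF_THRESHOLD:
                conf, roi = full_detect(frame)    # hypothesis lost, re-detect
        results.append((conf, roi))
    return results
```

Because the quick check is far cheaper than the full algorithm, the loop frees up processing time that other recognition tasks can use in parallel on the same image flow.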
III. RESULTS AND DISCUSSION

The improved algorithm has been tested on a much broader range of possible inputs, with an increased number of test images. The performance of crosswalk detection on five prerecorded video flows with a total of over 1700 frames is shown in Table I.
In 73.4% of the cases the proposed method performed well, while false positive (misclassified non-crosswalk) results account for only 1% of the cases. Considering that two of the videos were recorded in bad lighting conditions, we made a separate summary containing only the better-quality ones; in that case the ratio of correct answers was 82.5%.
TABLE I.        DETECTION RESULTS ON VIDEO FRAMES

                        Crosswalk detected    Crosswalk not detected
Crosswalk present              613                     447
No crosswalk present            15                     666

Results on position estimation were compared to human observations. Supervisors were asked to separate the correctly classified crosswalk frames into five groups based on the position within the frame (strong-left, left, center, right, strong-right). These observations were compared to the three-valued (left, center, right) algorithmic outputs, thus generating two types of misclassification: hard and smooth. Hard misclassifications appeared in only 3% of the over 600 frames containing a crosswalk, and smooth ones in 17% of them.

Figure 4. (a) Crosswalk detected and its bounding rectangle; (b) ROI indicated on the following frame.

IV. CONCLUSIONS

Optimization of the processing of video flows allowed us to estimate the position and orientation of crosswalks. The additional information provides users with useful feedback, and enables them to obtain a more reliable and more complete model of their environment.

REFERENCES
[1] K. Karacs, A. Lázár, R. Wagner, D. Bálya, T. Roska, and M. Szuhaj, "Bionic Eyeglass: an Audio Guide for Visually Impaired," in Proc. of the First IEEE Biomedical Circuits and Systems Conference (BIOCAS 2006), London, UK, Dec. 2006, pp. 190–193.
[2] K. Karacs, A. Lázár, R. Wagner, B. Bálint, T. Roska, and M. Szuhaj, "Bionic Eyeglass: The First Prototype, A Personal Navigation Device for Visually Impaired," in Proc. of the First Int'l Symp. on Applied Sciences in Biomedical and Communication Technologies (ISABEL 2008), Aalborg, Denmark, 2008.
[3] Á. Zarándy and C. Rekeczky, "Bi-i: a standalone ultra high speed cellular vision system," IEEE Circuits Syst. Mag., vol. 5, no. 2, pp. 36–45, 2005.
[4] T. Roska and L. O. Chua, "The CNN universal machine: an analogic array computer," IEEE Trans. Circuits Syst. II, vol. 40, pp. 163–173, Mar. 1993.
[5] K. Karacs, Gy. Cserey, and Á. Zarándy. (2010) Software Library for Cellular Wave Computing Engines. [Online]. Available: http://cnntechnology.itk.ppke.hu/Template_library_v3.1.pdf, visited on 24-09-2010.
[6] K. Karacs, M. Radványi, M. Görög, and T. Roska, "A Mobile Visual Navigation Device: new algorithms for crosswalk and pictogram recognition," in Proc. of the 2nd Int'l Symp. on Applied Sciences in Biomedical and Communication Technologies (ISABEL 2009), Bratislava, Slovakia, 2009.
[7] M. Radványi, G. Pazienza, and K. Karacs, "Crosswalk Recognition through CNNs for the Bionic Camera: Manual vs. Automatic Design," in Proc. of the European Conf. on Circuit Theory and Design (ECCTD 2009), Antalya, Turkey, 2009.