Binocular tracking and accommodation controlled by retinal motion flow
Abstract
In this paper we describe a tracking and accommodation system controlled by retinal motion flow. The tracking system is decomposed into two basic behaviors: smooth pursuit, to track target movements along the horopter, and vergence, to track movements along the cyclopean axis. Both behaviors are optical flow based. The eyes' accommodation (focusing) is based on a focusing-by-vergence approach, combining a flow-based vergence process with an off-line pre-calibration of the lens focusing odometry. The vergence process is controlled by the retinal optical flow disparity, and the target depth velocity is directly obtained from this disparity. The relationship between the target depth (focused target distance) and the lens focusing odometry is obtained by an off-line focusing calibration, using a bivariate polynomial to model the relationship. The focusing and tracking systems run in parallel and are both velocity controlled.
1. Introduction
If a visual object is in movement, the observer's eyes
usually move in pursuit of it. The aim of this pursuit
is to make the retinal image of the object as stationary
as necessary relative to the retina. By this, we mean a
velocity of movement of the retinal image so small that
the eyes retain a high level of resolving power. When an
object of perception, moving in space, moves nearer to
or further from the observer, pursuit is accompanied by
convergence or divergence of the optical axes (vergence
movements).
Tracking an object is the combination of these two movements: sliding the fixation point along the axis of the cyclopean eye (vergence) and displacing the axis of the cyclopean eye in space (smooth pursuit). Different cues are used to control these movements. The vergence movement basically uses retinal disparity combined with accommodation cues (focus or defocus measures). Smooth pursuit is mainly controlled by retinal velocity.
Retinal velocity is the main cue used to control the tracking system presented in this paper. The binocular head joints involved in the tracking process are velocity controlled: the angular velocity of the joints involved in pursuit is obtained from the average retinal velocities, and the angular velocity of the joints involved in vergence is obtained from the retinal motion flow disparity.
The integration of accommodation cues in active vision is mostly oriented to approaches that use accommodation as a cue for vergence control, combining depth from accommodation with disparity vergence. In these approaches the main objective of the accommodation is depth estimation [6]. Extracting depth from accommodation can be done by using image in-focus [1, 8, 9, 13, 15, 5, 12] or defocus [7, 6, 10, 11] measurements.
In the proposed focusing method [3, 4], the main purpose is to maintain the images in focus during the visual tracking of moving objects, using the estimated spatial target velocity obtained during the vergence control to control the focus motor velocity (focusing-by-vergence). The target depth velocity is expressed as a function of the angular velocity of the eye azimuth joints, and this angular velocity is controlled by the retinal motion flow disparity. Computing the retinal motion disparity for a symmetric vergence geometry, we are able to obtain the target depth velocity. Using the estimated target depth velocity combined with a pre-calibration of the focus motor setting as a function of the accommodation depth, the proposed approach is able to control the velocity of the focus motor of the lens to maintain the optical system in focus. The focus motor velocity depends on the focal length of the lens, the retinal motion disparity and the present vergence geometry. The relationship between the focus motor velocity and the target depth velocity is obtained by differentiating the bivariate polynomial function that models the relationship between focus motor setting and accommodation depth.
2. The Tracking System
Consider the existence of a point P_c with coordinates (X_c, Y_c, Z_c) in the cyclopean eye coordinate system (Fig. 1), moving with velocity V_c = Ω_c × P_c + t_c, being Ω_c = [Ω_1, Ω_2, Ω_3]^T the angular velocity and t_c = [t_x, t_y, t_z]^T the translational velocity. This velocity, V_c = [V_x, V_y, V_z]^T, can be expressed in each one of the retina coordinate systems, V_{left/right}, by

\[
V_{l/r} = R_{l/r}\,(\Omega_c \times P_c + t_c) \qquad (1)
\]

representing R_{l/r} the rotation matrix between the cyclopean and the left/right retinal coordinate systems.

Figure 1. Symmetric fixation geometry. The target velocity V has two main components: V_z, the velocity along the cyclopean axis, and V_x, the velocity perpendicular to the cyclopean axis.

Figure 2. Motion flow and retinal motion flow disparity for targets moving along (top) and outside (bottom) the horopter. (-: left retinal velocity, - -: right retinal velocity, ·: retinal disparity.)
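As an illustrative sketch of equation (1) (our code, not from the paper), the cyclopean-to-retinal velocity transfer can be evaluated numerically. Here `rot_y` stands in for R_{l/r}, under the assumption that each retinal frame is rotated about the vertical axis by its vergence angle; the 5-degree angles and velocity values are placeholders:

```python
import math

def rot_y(theta):
    # Rotation about the vertical axis by theta (radians): a stand-in for R_{l/r}.
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def cross(a, b):
    # Cross product a x b for 3-vectors given as lists.
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def retinal_velocity(omega_c, t_c, p_c, theta):
    # Equation (1): V_{l/r} = R_{l/r} (Omega_c x P_c + t_c).
    v_c = [cross(omega_c, p_c)[i] + t_c[i] for i in range(3)]
    r = rot_y(theta)
    return [sum(r[i][j] * v_c[j] for j in range(3)) for i in range(3)]

# A target on the cyclopean axis, translating in depth only (arbitrary values).
v_left = retinal_velocity([0, 0, 0], [0, 0, -0.1], [0, 0, 2.0], math.radians(5.0))
v_right = retinal_velocity([0, 0, 0], [0, 0, -0.1], [0, 0, 2.0], math.radians(-5.0))
```

For this pure depth motion the lateral components induced in the two retinas are symmetric with opposite signs, which is exactly the disparity the vergence process exploits.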
The retinal image velocity is defined by

\[
v_x^{l/r} = \frac{f_x}{P_{l/r}\cdot\hat z}\,(V_{l/r}\cdot\hat x) - \frac{f_x}{(P_{l/r}\cdot\hat z)^2}\,(P_{l/r}\cdot\hat x)(V_{l/r}\cdot\hat z)
\]
\[
v_y^{l/r} = \frac{f_y}{P_{l/r}\cdot\hat z}\,(V_{l/r}\cdot\hat y) - \frac{f_y}{(P_{l/r}\cdot\hat z)^2}\,(P_{l/r}\cdot\hat y)(V_{l/r}\cdot\hat z) \qquad (2)
\]

with P_{l/r} = R_{l/r} P_c + T_{l/r}.

For a symmetric vergence geometry (θ_l = −θ_r), which means that the target coordinates are P_c = [0, 0, Z_c] and its image projections are [0, 0]_{l/r}, equation 2 reduces to

\[
v_x^{l/r} = \frac{f_{l/r}}{P_{l/r}\cdot\hat z}\left[\cos\theta_{l/r}\,(V\cdot\hat x) + \sin\theta_{l/r}\,(V\cdot\hat z)\right]
\]
\[
v_y^{l/r} = \frac{f_{l/r}}{P_{l/r}\cdot\hat z}\,(V\cdot\hat y) \qquad (3)
\]

2.1. The Smooth-Pursuit Process

Taking the average velocity between the left and right retinal velocities for a symmetric geometry results in

\[
\bar v_x = \frac{v_x^l + v_x^r}{2} = \frac{f\,(V\cdot\hat x)\cos^2\theta}{P_c\cdot\hat z} \qquad (4)
\]
\[
\bar v_y = \frac{v_y^l + v_y^r}{2} = \frac{f\,(V\cdot\hat y)\cos\theta}{P_c\cdot\hat z} \qquad (5)
\]

that are functions of the focal length of the lenses (f_l = f_r = f), the horizontal/vertical component of the 3D target velocity in the cyclopean eye coordinate system and the actual vergence geometry (θ = θ_{l/r}).

Representing the angular velocity of the joints involved in the pursuit process (cyclopean eye pan and tilt) by Ω_p = ∂θ_p/∂t = V_x/(P_c·ẑ) and Ω_t = ∂θ_t/∂t = V_y/(P_c·ẑ), and taking equations 4 and 5, results in

\[
\Omega_p = \frac{v_x^l + v_x^r}{2 f \cos^2\theta} \qquad (6)
\]
\[
\Omega_t = \frac{v_y^l + v_y^r}{2 f \cos\theta} \qquad (7)
\]

that represent the angular velocity of the joints involved in the pursuit process. These velocities are functions of the present vergence geometry (θ) and the average retinal velocity.

2.2. The Vergence Process

Figure 2 shows the motion flow and retinal motion flow disparity for a target moving along and outside the horopter. As can be seen in the figure, the velocity induced on the retinas (images) is almost the same when the target moves along the horopter (constant cyclopean depth), resulting in a small motion flow disparity. However, if the target is moving outside the horopter, the velocity induced on the retinas is no longer the same, and the retinal motion flow disparity can be used to control the vergence of the retinas.
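Equations (6) and (7) translate directly into a small routine. This is our sketch, not the paper's code; the unit bookkeeping (flow in pixels/frame, focal length in pixels) is an assumption:

```python
import math

def pursuit_velocities(vx_l, vx_r, vy_l, vy_r, f, theta):
    # Equations (6)-(7): pan/tilt angular velocities from the summed
    # left/right retinal flow, for vergence angle theta and focal length f.
    omega_p = (vx_l + vx_r) / (2.0 * f * math.cos(theta) ** 2)  # pan
    omega_t = (vy_l + vy_r) / (2.0 * f * math.cos(theta))       # tilt
    return omega_p, omega_t
```

Equal horizontal flow in both retinas (a target sliding along the horopter) yields a pure pan command and no tilt.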
Representing the retinal image motion flow disparity by ∆v = v_l − v_r and, after some mathematical manipulation, considering that the target is verged with equal vergence angles (θ_l = −θ_r, θ_v = 2θ), results in

\[
\Delta v = \begin{bmatrix}\Delta v_x \\ \Delta v_y\end{bmatrix} = \begin{bmatrix} -\dfrac{f\,t_z \sin 2\theta}{Z_c} \\[1ex] 0 \end{bmatrix} \qquad (8)
\]

Since the binocular system is verged on the target with equal vergence angles, Z_c = B(tan θ)^{−1}, which results for the Z component of the translational velocity t_z in the equation

\[
t_z = \frac{-B\,\Delta v_x}{2 f \sin^2\theta},
\]

that is a function of two important parameters: the horizontal retinal optical flow disparity ∆v_x and the actual vergence geometry θ.

Considering that ∂Z_c/∂t = t_z = −B∆v_x/(2f sin²θ) and combining this equation with the result of differentiating Z_c with respect to time, ∂Z_c/∂t = (−B/sin²θ) ∂θ/∂t, results in

\[
\Omega_v = \frac{\partial\theta}{\partial t} = \frac{\Delta v_x}{2 f} \qquad (9)
\]

that represents the angular velocity of the vergence joints required to maintain vergence on the moving target. For this particular vergence geometry, only the horizontal motion flow disparity is required to control the vergence velocity of both retinas.
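A minimal sketch of equations (8)-(9) (our code, with assumed units: flow in pixels/frame, f in pixels, baseline and depth in metres):

```python
import math

def vergence_update(vx_l, vx_r, f, theta, baseline):
    # Horizontal retinal flow disparity (eq. 8) gives the vergence joint
    # velocity (eq. 9) and the target depth translational velocity t_z.
    d_vx = vx_l - vx_r                       # retinal motion flow disparity
    omega_v = d_vx / (2.0 * f)               # eq. (9)
    t_z = -baseline * d_vx / (2.0 * f * math.sin(theta) ** 2)
    return omega_v, t_z
```

Equal flow in both retinas means the target stays on the horopter, so no vergence motion is commanded and the estimated depth velocity is zero.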
3. Focus motor setting calibration

To model the non-linear relationship between the in-focus motor setting (focus odometry) and the focusing distance, we calibrated off-line the focus motor settings m_f across a range of zoom settings m_z and focusing distances D.

To get the image in focus we used as sharpness criterion function the modified Laplacian proposed by Nayar [13], monitoring its behavior as the focus motor changed its settings.

We used bivariate polynomials [18] to describe these functional relationships. The general formula for an nth-order bivariate polynomial is

\[
BP(D, m_z) = \sum_{i=0}^{n}\sum_{j=0}^{n-i} a_{ij}\,D^i m_z^j
\]

and the number of coefficients required by the polynomial is NC = (n+1)(n+2)/2.

Figure 3. The focus motor setting as a function of the zoom setting and focusing distance.

Three main relationships were modeled with direct influence on the auto-focusing mechanism implemented: focus motor setting m_f, effective focal length f and image magnification M.

\[
m_f = g(D, m_z) \qquad f = h(m_f, m_z) \qquad M = M(m_f, m_z) \qquad (10)
\]

A third-order bivariate polynomial function g was used to model the focus motor setting m_f:

\[
m_f = g(D, m_z) = a_{00} + a_{01} m_z + a_{02} m_z^2 + a_{03} m_z^3 + a_{10} D + a_{11} D m_z + a_{12} D m_z^2 + a_{20} D^2 + a_{21} D^2 m_z + a_{30} D^3, \qquad (11)
\]

being the modeled data presented in figure 3. The focusing distance was measured using stereo triangulation from verged foveated images.
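The calibration fit itself is a linear least-squares problem in the coefficients a_ij. The sketch below is our illustration, using hypothetical, normalized calibration samples rather than the paper's data:

```python
import numpy as np

# Monomial exponents (i on D, j on m_z) of the 3rd-order bivariate
# polynomial of equation (11): NC = (3+1)(3+2)/2 = 10 coefficients.
TERMS = [(i, j) for i in range(4) for j in range(4 - i)]

def fit_focus_polynomial(D, mz, mf):
    # Least-squares estimate of the a_ij in m_f = g(D, m_z) from
    # calibration samples (D, mz, mf are equal-length 1-D arrays).
    A = np.column_stack([D ** i * mz ** j for i, j in TERMS])
    coeffs, *_ = np.linalg.lstsq(A, mf, rcond=None)
    return coeffs

def eval_focus_polynomial(coeffs, D, mz):
    # Evaluate g(D, m_z) with the fitted coefficients.
    return sum(a * D ** i * mz ** j for a, (i, j) in zip(coeffs, TERMS))
```

In practice raw zoom counts of the magnitude shown in figure 3 (around 10^4-10^5) should be rescaled before fitting, otherwise the cubic terms make the least-squares system badly conditioned.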
4. Focusing by vergence

In a focusing system conducted by vergence, the angular velocity of the eye joints controls the fixation point (target) depth velocity, and this depth velocity determines the velocity of the focusing system.

Representing the focusing distance D as a function of the fixation geometry (symmetric geometry) by D = B(sin θ)^{−1} (fig. 4), the velocity of the focusing distance as a function of the angular velocity of the eye azimuth joint is given by

\[
\frac{\partial D}{\partial t} = \frac{\partial D}{\partial\theta}\,\frac{\partial\theta}{\partial t} = \frac{-B\cos\theta}{\sin^2\theta}\,\frac{\partial\theta}{\partial t}. \qquad (12)
\]

Being the angular velocity of the eye azimuth joint defined by equation 9, the focusing distance velocity is given by

\[
\dot D = \frac{\partial D}{\partial t} = \frac{B\cos\theta\,(v_x^r - v_x^l)}{2 f \sin^2\theta} \qquad (13)
\]

representing ∆v_x = v_x^r − v_x^l the retinal motion flow disparity.

Figure 4. The focusing-by-vergence geometry.

Representing the relationship between the focus lens setting and the focusing distance by m_f = g(D, m_z) and differentiating m_f with respect to time, we obtain the focus motor velocity ṁ_f as a function of the focusing distance velocity Ḋ and of the zoom motor velocity ṁ_z:

\[
\dot m_f = \frac{\partial m_f}{\partial t} = \frac{\partial g}{\partial m_z}\,\dot m_z + \frac{\partial g}{\partial D}\,\dot D \qquad (14)
\]

Assuming a constant zoom value during the focusing process (ṁ_z = 0), the focus motor velocity is only a function of the focusing distance velocity, resulting in the relationship

\[
\dot m_f = \frac{\partial g}{\partial D}\,\frac{B\cos\theta\,(v_x^r - v_x^l)}{2 f \sin^2\theta} \qquad (15)
\]

being ∂g/∂D = b_1 + b_2(sin θ)^{−1} + b_3(sin θ)^{−2}, with the coefficients b_1 = a_{10} + a_{11} m_z + a_{12} m_z², b_2 = 2B(a_{20} + a_{21} m_z) and b_3 = 3B² a_{30}.

Apart from the retinal optical flow disparity and fixation geometry, the focus motor velocity (eq. 15) is also a function of the effective focal distance of the lens. Being the effective focal length of the lens a function of the zoom and focus settings of the lens, the focusing process changes the effective value of the focal length. A solution to this problem can be obtained by using two different strategies:

1. Maintaining the invariance of the focal length, compensating the variation of the focal length with the zoom motor.

2. Modeling the focal length f as a function of the lens zoom and focus settings, f = h(m_f, m_z), resulting in

\[
\dot m_f = \frac{\partial g}{\partial D}\,\frac{B\cos\theta\,(v_x^r - v_x^l)}{2\,h(m_f, m_z)\sin^2\theta}. \qquad (16)
\]

Maintaining the invariance of the focal length makes the focus motor velocity also dependent on the zoom motor velocity (ṁ_z ≠ 0), increasing the complexity of the function that models the focus motor velocity. Modeling the effective focal length as a function of the zoom and focus settings maintains the focus motor velocity only as a function of the focusing distance velocity Ḋ, but this solution doesn't maintain the invariance of the focal length, being the velocity induced on the images dependent on the velocity of the focal length variation.

Considering a focal length f that changes its value with a velocity ḟ, the velocity induced on the image (ẋ, ẏ) by a point P that moves with a spatial velocity V = (Ẋ, Ẏ, Ż) is represented by

\[
\dot x = \frac{P\cdot\hat x}{P\cdot\hat z}\,\dot f + \frac{f}{P\cdot\hat z}\,\dot X - \frac{f}{(P\cdot\hat z)^2}\,(P\cdot\hat x)\,\dot Z
\]
\[
\dot y = \frac{P\cdot\hat y}{P\cdot\hat z}\,\dot f + \frac{f}{P\cdot\hat z}\,\dot Y - \frac{f}{(P\cdot\hat z)^2}\,(P\cdot\hat y)\,\dot Z.
\]

For a fixation point P with camera coordinates P = (0, 0, D), the velocity induced on the image is independent of the velocity of the focal length ḟ, being the image velocity only dependent on the velocity of the target V.

5. Real-time Tracking and Accommodation

The focusing accommodation runs in parallel with the visual behaviors of fixation and tracking and is accomplished in two major steps:

1. The initial focusing is performed in parallel with the front-symmetric cyclopean fixation of the system (see [4]). The focus motor is position controlled and the odometric position for focus is defined by

\[
m_f = g\!\left(m_z,\, B(\sin\theta)^{-1}\right) \qquad (17)
\]

with θ = |θ_l| = |θ_r|.

2. Assuming that both lenses are in focus after the cyclopean fixation of the system, the focusing accommodation of the lenses is performed during the smooth pursuit of the system using the approach presented in the paper. During this period the focus motor is velocity controlled, being the focus motor velocity defined by equation 16.

To take into account the position error that results from inaccurate retinal velocity estimation, motor inertia or inaccurate focus motor setting calibration, the focus motor velocity that maintains the optical system in focus is obtained by

\[
\dot u = \dot m_f + K\,\Delta m_f \qquad (18)
\]
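Equations (13) and (15) combine into a single velocity law. The sketch below is our reading of it, with b1, b2, b3 taken as the precomputed calibration coefficients defined after equation (15); all numeric values used with it are placeholders:

```python
import math

def focus_motor_velocity(vx_l, vx_r, f, theta, baseline, b1, b2, b3):
    # Equation (15): focus motor velocity from the horizontal retinal flow
    # disparity at a fixed zoom setting, where
    # dg/dD = b1 + b2 / sin(theta) + b3 / sin(theta)**2.
    s = math.sin(theta)
    dg_dD = b1 + b2 / s + b3 / s ** 2
    d_dot = baseline * math.cos(theta) * (vx_r - vx_l) / (2.0 * f * s ** 2)  # eq. (13)
    return dg_dD * d_dot
```

For the variable-focal-length variant of equation (16), f would simply be replaced by h(m_f, m_z) evaluated at the current lens settings.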
Figure 5. Block diagram of the tracking and focusing-by-vergence system. (p_{m_f} and v_{m_f} represent respectively the position and velocity of the focus motor.)
being ∆m_f = g(D, m_z) − m_f the position error measured between the actual focus motor position m_f and the modeled focus position for the target depth D. K represents a proportional gain component.

This approach was also used to control the angular velocities of the joints involved in the tracking process, being the angular velocities defined by

\[
\dot\theta = \Omega + K\,\Delta p \qquad (19)
\]

being the second term (∆p) used to enforce the positional constraint for fixation (targets located at the image center).

The angular velocity of the tracking joints and the focus motor velocity are filtered by a Kalman filter before they are sent to the PID servo-controllers. The Kalman filter is used for two main reasons: 1) to filter the noisy velocity components; 2) to maintain synchronous velocity information to the PID servo-controller. A velocity command is sent to the PID controller every 10 ms. Figure 5 shows the block diagram of the tracking and focusing accommodation system controlled by retinal motion flow.
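The velocity-plus-position correction of equations (18) and (19) is a feedforward term with a proportional error term added. A minimal sketch (our code; the default gain is an arbitrary placeholder, not the paper's value):

```python
def focus_command(mf_vel_model, mf_measured, mf_modeled, K=0.5):
    # Equation (18): u = m_f_dot + K * delta_m_f, where
    # delta_m_f = g(D, m_z) - m_f is the focus position error.
    delta_mf = mf_modeled - mf_measured
    return mf_vel_model + K * delta_mf
```

The same structure applies to the tracking joints in equation (19), with ∆p (the target's offset from the image center) as the error term.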
The behavior of the proposed focusing-by-vergence method was analyzed under real environment conditions. For that purpose, we used a highly textured pattern that was moved forward and backward along the cyclopean axis of the MDOF system. Two different tests were performed, considering different initial locations for the target (target depth): 2.0 and 2.5 meters away from the cyclopean eye. The focal length of both lenses was set to 15.0 mm, which corresponds to a zoom motor setting equal to m_z = 80000. In both tests the target was moved 1.0 m forward and backward from its initial position, and the behavior of the proposed focusing system is presented in figure 6.

Figure 6. Behavior of the proposed focusing-by-vergence method. The top figure corresponds to the closest target setup and the bottom figure to the farther target setup. Each figure presents three plots: (*) target depth (D) estimated by the vergence process (m/sec), (**) filtered retinal motion flow disparity and (***) filtered focus motor angular velocity (°/sec). The y-axis scaling applies to all the plots.

The estimated target depth was obtained during the vergence process and is in agreement with the target movement performed. Good performance can be observed for the proposed symmetric vergence tracking process. The retinal motion flow disparity used to control the angular vergence velocity and the focus motor velocity presents higher values for short target depths, and the focus motor velocity required to maintain the optical system in focus also increases when the target approaches the binocular system.
6. Conclusions

In this paper we presented a binocular tracking and accommodation system controlled by retinal motion flow. The vergence control is based on the concept of binocular retinal image motion disparity, which allows the computation of the angular velocity of the vergence joints and the computation of the target depth velocity. The proposed focusing system is vergence controlled, fusing the information supplied by the vergence process and a pre-calibration of the lens. The pre-calibration of the lens parameters was done using bivariate polynomials to model the relationship between the calibration parameters and the variable lens parameters.

References

[1] Abbott, L., Ahuja, N.: Surface reconstruction by dynamic integration of focus, camera vergence and stereo. IEEE Int. Conf. on Computer Vision, Florida, December 1988.
[2] Aloimonos, Y., Weiss, Y., Bandopadhay, A.: Active Vision. Intern. J. Comput. Vision 7 (1988) 333-356.
[3] Batista, J.: Active Vision Systems: Behaviors and Calibration. PhD Thesis, DEE-FCTUC, Coimbra, 1999.
[4] Batista, J., Peixoto, P., Araujo, H.: Real-Time Visual Behaviors with a Binocular Active Vision System. ICRA97 - IEEE Int. Conf. on Robotics and Automation, Albuquerque, New Mexico, USA, April 1997.
[5] Andersen, C.: A framework for control of a camera head. PhD Thesis, LIA, Aalborg, 1996.
[6] Pahlavan, K.: Active Robot Vision and Primary Ocular Processes. PhD Thesis, CVAP, KTH, Sweden, 1993.
[7] Horri, A.: Depth from defocusing. CVAP Technical Report TRITA-NA-P116, KTH, Sweden, 1992.
[8] Horn, B.P.: Focusing. MIT Artificial Intelligence Laboratory, May 1968.
[9] Krotkov, E.P.: Focusing. Int. Journal of Computer Vision 3(1), 1987.
[10] Pentland, A.: A new sense for depth of field. IEEE PAMI 9(4), July 1987.
[11] Surya, G., Subbarao, M.: Depth from defocus by changing camera aperture: A spatial domain approach. IEEE Int. Conf. on Computer Vision and Pattern Recognition, New York, June 1993.
[12] Tennenbaum, J.: Accommodation in Computer Vision. PhD Thesis, Stanford University, 1970.
[13] Nayar, S.: Shape from Focus. Carnegie Mellon University, 1989.
[14] Murray, D., Bradshaw, K., MacLauchlan, P., Reid, I., Sharkey, P.: Driving Saccade to Pursuit Using Image Motion. Intern. Journal of Computer Vision 16(3), November 1995, 205-228.
[15] Das, S., Ahuja, N.: A comparative study of stereo, vergence and focus as depth cues for active vision. IEEE Int. Conf. on Computer Vision and Pattern Recognition, New York, June 1993.
[16] Brown, C., Coombs, D.: Real-Time Binocular Smooth Pursuit. Intern. Journal of Computer Vision 11(2), October 1993, 147-165.
[17] Burt, P., Bergen, J., Hingorani, R., Kolczynski, R., Lee, W., Leung, A., Lubin, J., Shvaytser, H.: Object tracking with a moving camera. Proc. IEEE Workshop on Visual Motion, Irvine, 1989.
[18] Willson, R.: Modelling and Calibration of Automated Zoom Lenses. CMU-RI-TR-94-03, Carnegie Mellon University, 1994.
[19] Carpenter, R.H.S.: Movements of the Eyes. Pion, 1988.