Enhancement of Local Methods by Using Spatial Support Weight
Hong Phuc Nguyen1, Thi Dinh Tran2, Vinh Dinh Quang3
1Department of Computer Engineering, University of Science, Vietnam
2Department of Computer Engineering, University of Agriculture and Forestry, Vietnam
3Department of Electrical and Computer Engineering, Sungkyunkwan University, Korea
{nguyenhongphuc1505, trandinh0013, dinhquangvinh77}@gmail.com
Abstract-In window-based stereo matching algorithms, a local
window is used to measure the similarity (or dissimilarity) between
pixels of a stereo pair. An implicit assumption in local stereo
matching is that all pixels within a support window have the same
(or similar) disparity, and many local methods have been proposed
to satisfy this assumption. However, objects in real-world images have
arbitrary sizes and shapes, and hence this assumption is frequently
violated. In this paper, we propose the spatial fixed window, spatial
shiftable window, spatial multiple window, and spatial variable window
methods, which improve on the fixed window, shiftable window, multiple
window, and variable window methods, respectively. We also evaluate
these improved algorithms on gray and color images, and the
experimental results on the Middlebury images show that the
proposed methods outperform their corresponding original methods.
Keywords-stereo matching, spatial weighted window, window-based
method
I. INTRODUCTION AND RELATED WORK
Stereo matching algorithms can be classified in many ways,
but the most popular way is to categorize them into local and
global algorithms [1, 11]. Global methods explicitly assume an
energy function with data and smoothness energy terms;
almost all of them are computationally expensive, and some
require many parameters that are difficult to determine. Local
methods compute each pixel's disparity value from the intensity
values within a window of finite size, and a commonly
used smoothness assumption in local stereo matching is that
all pixels within a support window have the same (or similar)
disparity. Thus, for each pixel p in the reference image, the
costs for p are typically computed by moving a window of
pixels around pixel p along the corresponding epipolar line in
the target image and computing the differences between the
pairs of windows. Pixel p is then assigned the disparity with the
minimum cost. Popular methods for computing window-based
matching costs include the sum of absolute differences
(SAD), the sum of squared differences (SSD), and normalized
cross correlation (NCC). Matching costs that remain robust
under radiometric variations
include adaptive normalized cross correlation (ANCC) [7],
rank and census transforms [8], mutual information [9], and
local binary patterns [10]. However, although the smoothness
assumption typically holds, it is frequently violated
when support windows are near depth discontinuities or object
boundaries.
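As a rough illustration of the window costs named above, the following minimal sketch computes SAD, SSD, and NCC for two windows given as flat lists of intensities, and picks a disparity by minimum cost. The function names and the list-based window representation are assumptions made for this example only.

```python
import math

def sad(a, b):
    """Sum of absolute differences (lower means more similar)."""
    return sum(abs(p - q) for p, q in zip(a, b))

def ssd(a, b):
    """Sum of squared differences (lower means more similar)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def ncc(a, b):
    """Normalized cross correlation (higher means more similar)."""
    denom = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return sum(p * q for p, q in zip(a, b)) / denom if denom else 0.0

def best_disparity(ref_window, candidates, cost=sad):
    """Winner-take-all: index of the candidate window with minimum cost.

    candidates[d] is assumed to hold the target-image window at disparity d.
    """
    costs = [cost(ref_window, c) for c in candidates]
    return costs.index(min(costs))
```

Note that SAD and SSD are dissimilarity measures, so the minimum is taken, whereas NCC is a similarity measure and would require taking the maximum instead.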
Furthermore, choosing the size of the support window is
very important because window-based matching methods often
mismatch in textureless regions. A support window should
therefore be small enough to include only pixels that have
the same (or similar) disparity and large enough to contain
pixels having sufficient variation in intensity. For that reason,
many local methods have been proposed.
In the shiftable window method [11], the reference point
can be located at any position in the support window, and
the position with minimum cost is selected. The multiple
window method [12] chooses the optimal support window
from a number of windows having the best cost. In the multiple
window method, support windows have various shapes, but
the size is unchanged. The support windows in the shiftable
window method have a fixed size and shape, while in the multiple
window method they have a fixed size and a limited number of shapes;
as a result, these methods can still produce mismatches on real-world
images, since objects usually have arbitrary sizes and shapes.
The variable window method [2] enhances the above two
methods by using a useful range of window sizes and shapes
and creating a window cost that is suitable for comparing
windows of different sizes. Even though a number of different
sizes and shapes are deployed in the variable window
method, the number is still too small to satisfy the
smoothness assumption that all pixels in a support window
have the same (or similar) disparity. Adaptive support weight
methods [3, 13] implicitly deploy a type of segmentation to
estimate the support window for each pixel. One previously
described method [3] assumes that pixels that are similar in
color and close in Euclidean distance are more likely to have
similar disparities, and then uses photometric information
and geometric information to build a weighted window for
each pixel, while another method [13] computes a weighted
window based on the color information and geodesic distance.
In this paper, we propose four stereo matching algorithms: the
spatial fixed window (SFW), spatial shiftable window (SSW),
spatial multiple window (SMW), and spatial variable window
(SVW) algorithms, which enhance the fixed window (FW),
shiftable window (SW), multiple window (MW), and variable
window (VW) algorithms, respectively. We also show how to construct a
spatially weighted window and analyze why applying a spatial
support weight to each pixel in a support window makes
these enhanced methods work more robustly than their original
methods.
Post-processing is an important step in local
stereo matching algorithms and can make disparity maps
much more accurate. Commonly used post-processing
techniques include the left-right consistency check [4], sub-pixel
interpolation [5], and image filtering techniques such as median
filtering or bilateral filtering [6]. However, in this study
we do not use any post-processing techniques in testing our
methods, so that the tests are fair and the performance gains of
our proposed methods are clearly visible.
978-1-4673-5604-6/12/$31.00 ©2012 IEEE
The remainder of this paper is structured as follows. In
Section II, we present details of our proposed algorithms. The
experimental results of our algorithms are reported in Section
III. Finally, conclusions are presented in Section IV.
II. SPATIAL WEIGHT FOR LOCAL METHODS IN GRAY AND COLOR IMAGES
In this section, we present how to apply spatial weights to
several window-based stereo matching methods, namely fixed
window (FW), shiftable window (SW) [11], multiple window
(MW) [12], and variable window (VW) [2], in gray and color
images. We also show why these methods, once combined
with spatial weights, work more robustly than their original
methods.
The most popular, and simplest, matching cost in stereo
matching is the absolute difference, which is used in the SW and MW
methods. Suppose $L(x,y)$ and $R(x,y)$ are the intensity values
of pixel $(x,y)$ in the left and right images, respectively. When
working with gray images, the matching cost of absolute
difference can be expressed as follows:
$$ e^{g}_d(x,y) = |L(x,y) - R(x-d,y)|, \tag{1} $$
where $d$ is some disparity. When working with RGB images, the
matching cost can be defined as:
$$ e^{RGB}_d(x,y) = \sum_{c \in \{r,g,b\}} |L^c(x,y) - R^c(x-d,y)|, \tag{2} $$
where $L^c(x,y)$ and $R^c(x,y)$ are the intensity values of the
color band $c$ at pixel $(x,y)$ in the left and the right images,
respectively.
Some algorithms, such as the VW method, use the measurement
error developed in [14] as the matching cost, which can be
expressed as:
$$ e_d(x,y) = \min\big( |I_r(x,y) - I_t(x-d-\tfrac{1}{2},y)|,\; |I_r(x,y) - I_t(x-d+\tfrac{1}{2},y)|,\; |I_r(x-\tfrac{1}{2},y) - I_t(x-d,y)|,\; |I_r(x+\tfrac{1}{2},y) - I_t(x-d,y)|,\; |I_r(x,y) - I_t(x-d,y)| \big). \tag{3} $$
As mentioned above, a smoothness assumption in local
stereo matching is that all pixels within a support window
have the same (or similar) disparity. The simple fixed window
method does not take this assumption into account, and
hence its performance is poor. However, many
local algorithms, such as SW, MW, and VW, have been developed
to satisfy this assumption.
A. Shiftable Window Algorithm
In order to reduce violations of the smoothness assumption,
the reference pixel in the SW method [11] can be located at
any position in the support window. In other words, different
support windows can have different positions of the reference
pixel.
Suppose $S_s$ is the support window in which the reference pixel
$(x_r,y_r)$ is located at the center, and $L(x,y)$ and $R(x-d,y)$ are the
intensity values of the pixels $(x,y)$ and $(x-d,y)$, respectively.
The window cost of the reference pixel $(x_r,y_r)$ in the left
image can be defined as:
$$ C_d(x_r,y_r) = \sum_{(x,y) \in S_s} e'_d(x,y), \tag{4} $$
where $d$ is some disparity and $e'_d(x,y)$ is the matching cost of
pixel $(x,y)$ (equation (1) for gray images or equation (2) for RGB
images). The SW method [11] then computes the window costs for the
other positions of the reference pixel in the support window, and
finally pixel $(x_r,y_r)$ in the left image is assigned the disparity
at which the window cost is minimal.
B. Multiple Window Algorithm
The MW method [12] tries to satisfy the smoothness assumption
by choosing the optimal window from a number of
windows that have the best costs.
Fig. 1. Choosing an optimal support window in the MW algorithm. The small
black squares in (a) and (b) mark the positions of the reference pixels.
(a) shows the support window divided into 9 sub-windows. (b) shows the
optimal support window after eliminating the 4 sub-windows that have the worst
costs.
Suppose $S_w$ is a support window. The MW method first
divides it into nine sub-windows, as shown in Fig. 1a. For each
sub-window $S_{wi}$, except for the center sub-window containing
the reference pixel, we compute its sub-window cost, and the 4
sub-windows that have the worst costs are eliminated from
the support window $S_w$, as shown in Fig. 1b. The sub-window
cost can be computed as:
$$ C_d(S_{wi}) = \sum_{(x,y) \in S_{wi}} e'_d(x,y), \tag{5} $$
where $S_{wi}$ is the $i$-th sub-window. After choosing the optimal
window $S_{op}$, the window cost of the optimal window can be
computed as:
$$ C'_d(x_r,y_r) = \sum_{(x,y) \in S_{op}} e'_d(x,y), \tag{6} $$
and the reference pixel $(x_r,y_r)$ in the left image is assigned
the disparity with the minimum window cost.
C. Variable Window Algorithm
The VW method [2] deploys a range of sizes for the support
window, while the shape of the support window is constrained to
be square.
The window cost in the variable window algorithm can be
defined as:
$$ C_d(S_v) = \bar{e} + \alpha \cdot \mathrm{var}(e) + \frac{\beta}{\sqrt{|S_v|} + \gamma}, \tag{7} $$
with
$$ \bar{e} = \frac{\sum_{(x,y) \in S_v} e'_d(x,y)}{|S_v|} \tag{8} $$
and
$$ \mathrm{var}(e) = \overline{e^2} - (\bar{e})^2 = \frac{\sum_{(x,y) \in S_v} \big(e'_d(x,y)\big)^2}{|S_v|} - \left( \frac{\sum_{(x,y) \in S_v} e'_d(x,y)}{|S_v|} \right)^2, \tag{9} $$
where $S_v$ is a square set of pixels, $|S_v|$ is the window size,
$d$ is some disparity, $e'_d(x,y)$ is the measurement error of
pixel $(x,y)$ in the reference image with disparity $d$, $\bar{e}$ is the
average measurement error in the support window, $\mathrm{var}(e)$ is
the variance of the errors in the support window, and $\alpha$, $\beta$, and
$\gamma$ are parameters. The last term in equation (7) is larger for
smaller support windows [2]. Therefore, in textureless regions,
where the first and the second terms of equation (7) are similar
for all support windows, larger support windows are more
likely to be selected.
Fig. 2. In all three cases, the support windows do not fit the lamp object, so
there are some pixels in the support windows which have different disparities
compared to the disparity of the lamp.
D. Spatial Support Weight
Even though the above methods attempt to satisfy the
smoothness assumption by changing the size and the shape
of the support window, and even the position of the reference
pixel in the support window, the assumption is still violated
frequently because objects in real-world images can have
arbitrary sizes and shapes, as shown in Fig. 2. To reduce the
violations, we apply a spatial support weight to each pixel in
a support window. In doing so, we assume that pixels which are
spatially closer to the reference pixel $(x_r,y_r)$ in the support
window are more likely to have the same (or similar)
disparity as $(x_r,y_r)$. Suppose $(x_r,y_r)$ is the
reference pixel in the support window, and $(x,y)$ is any pixel
in the support window. The spatial weight of pixel $(x,y)$ can
be computed as:
$$ w_s((x_r,y_r),(x,y)) = \exp\left( -\frac{d((x_r,y_r),(x,y))}{\lambda_s} \right), \tag{10} $$
where $d((x_r,y_r),(x,y))$ is the Euclidean distance between the
pixels $(x_r,y_r)$ and $(x,y)$, and $\lambda_s$ is a constant: the smaller
it is, the faster the spatial weight of pixel $(x,y)$ falls off
from the spatial weight of the reference pixel.
Fig. 3. Spatially weighted windows when the reference point is at two
different positions in the support window. (a) shows the position of the
reference pixel in the top-left, and the corresponding spatially weighted
window is shown in (b). Similarly, when the reference pixel is located
at the middle-right, as shown in (c), the corresponding spatially weighted
window is shown in (d).
Each support window has its corresponding spatially
weighted window where each position is a spatial support
weight. Different support windows have different spatially
weighted windows due to the position of the reference pixel
in the support window, as shown in Fig. 3.
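The construction of a spatially weighted window can be sketched as follows. This is a minimal illustration, assuming the weight decays as exp(-d / lambda_s) with Euclidean distance d from the reference pixel, as the description of the constant in equation (10) suggests; the function name and the row/column argument order are choices made for this example.

```python
import math

def spatial_weight_window(height, width, ref_row, ref_col, lambda_s):
    """Return a height x width grid of spatial support weights.

    Each entry is exp(-d / lambda_s), where d is the Euclidean distance
    between that position and the reference pixel (ref_row, ref_col).
    """
    return [
        [math.exp(-math.hypot(col - ref_col, row - ref_row) / lambda_s)
         for col in range(width)]
        for row in range(height)
    ]

# Reference pixel in the top-left corner, as in Fig. 3a.
top_left = spatial_weight_window(9, 9, 0, 0, 10.0)
# Reference pixel at the middle-right, as in Fig. 3c.
middle_right = spatial_weight_window(9, 9, 4, 8, 10.0)
```

The weight is 1 at the reference pixel and decreases monotonically with distance, so the two windows above differ only in where their peak lies.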
E. Spatial Support Weight for Local Methods
In this subsection, we present improved methods for local
stereo matching that apply a spatial support
weight to each pixel in the support window of the FW, SW, MW, and
VW methods. We call them the spatial fixed window (SFW), spatial
shiftable window (SSW), spatial multiple window (SMW), and
spatial variable window (SVW) algorithms, respectively. These
improved methods are essentially the same as the original
methods except for their window costs.
For the SSW method, the window cost can be redefined
from equation (4) as follows:
$$ C_d(x_r,y_r) = \sum_{(x,y) \in S_s} w_s((x_r,y_r),(x,y)) \times e'_d(x,y). \tag{11} $$
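A minimal sketch of the SSW cost in equation (11), followed by a winner-take-all disparity choice, might look as follows. It assumes gray images stored as lists of rows, the absolute-difference cost of equation (1), a weight of exp(-distance / lam), and border handling by clamping coordinates; the clamping and all function names are assumptions of this example rather than details given in the text.

```python
import math

def abs_diff(left, right, x, y, d):
    """|L(x, y) - R(x - d, y)| with coordinates clamped to the image."""
    h, w = len(left), len(left[0])
    y = min(max(y, 0), h - 1)
    xl = min(max(x, 0), w - 1)
    xr = min(max(x - d, 0), w - 1)
    return abs(left[y][xl] - right[y][xr])

def ssw_cost(left, right, xr, yr, d, half, lam):
    """Minimum weighted window cost over all windows containing (xr, yr)."""
    best = math.inf
    for oy in range(-half, half + 1):          # window centre offset (rows)
        for ox in range(-half, half + 1):      # window centre offset (columns)
            cost = 0.0
            for y in range(yr + oy - half, yr + oy + half + 1):
                for x in range(xr + ox - half, xr + ox + half + 1):
                    # Weight depends on distance to the reference pixel.
                    w = math.exp(-math.hypot(x - xr, y - yr) / lam)
                    cost += w * abs_diff(left, right, x, y, d)
            best = min(best, cost)
    return best

def ssw_disparity(left, right, xr, yr, max_d, half=1, lam=12.0):
    """Disparity in [0, max_d] with the minimum SSW cost."""
    costs = [ssw_cost(left, right, xr, yr, d, half, lam)
             for d in range(max_d + 1)]
    return costs.index(min(costs))
```

Because the weights are centred on the reference pixel rather than on the window centre, each shifted window yields a different weighted window, in line with Fig. 3.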
Similarly, the window cost of the SMW method can be
re-expressed from equation (6) as:
$$ C'_d(x_r,y_r) = \sum_{(x,y) \in S_{op}} w_s((x_r,y_r),(x,y)) \times e'_d(x,y). \tag{12} $$
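The SMW cost can be sketched along the following lines. The example takes a precomputed per-pixel cost map for one disparity, splits the support window into 3 x 3 sub-windows, ranks the eight outer sub-windows by their unweighted cost as in equation (5), keeps the centre and the four best, and sums the spatially weighted costs over the retained pixels as in equation (12). Ranking by the unweighted cost, requiring the window to lie fully inside the map, and the function name are assumptions of this sketch.

```python
import math

def smw_cost(cost, xr, yr, sub, lam):
    """Weighted cost of the optimal window around reference pixel (xr, yr).

    cost[y][x] holds the per-pixel matching cost for one disparity.
    sub is the (odd) side length of each of the 3 x 3 sub-windows.
    """
    side = 3 * sub
    x0, y0 = xr - side // 2, yr - side // 2    # top-left corner of the window
    centre, others = 0.0, []
    for by in range(3):
        for bx in range(3):
            pixels = [(x0 + bx * sub + i, y0 + by * sub + j)
                      for j in range(sub) for i in range(sub)]
            plain = sum(cost[y][x] for x, y in pixels)
            weighted = sum(
                math.exp(-math.hypot(x - xr, y - yr) / lam) * cost[y][x]
                for x, y in pixels)
            if (bx, by) == (1, 1):
                centre = weighted              # centre sub-window is always kept
            else:
                others.append((plain, weighted))
    others.sort(key=lambda t: t[0])            # best (lowest) costs first
    return centre + sum(w for _, w in others[:4])
```

Dropping the four worst sub-windows lets the retained window avoid a region with a different disparity on one side of the reference pixel.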
And in the same manner, the window cost of the SVW
method can be redefined from equation (7) as follows:
$$ C_d(S_v) = \bar{e} + \alpha \cdot \mathrm{var}(e) + \frac{\beta}{\sqrt{|S_v|} + \gamma}. \tag{13} $$
FW
SFW (gray)
SFW (RGB)
SW
SSW (gray)
SSW (RGB)
MW
SMW (gray)
SMW (RGB)
VW
SVW (gray)
SVW (RGB)
Tsukuba (RMS)
Venus (RMS)
Teddy (RMS)
Cones (RMS)
3.0673
3.1317
13.5887
9.5218
2.6785
2.7433
11.8451
9.0381
2.7428
2.9024
12.3947
1.7347
1.9573
9.1708
2.0362
1.9739
9.6321
9.2503
9.3360
8.9632
8.8034
8.6074
8.4981
8.0982
1.7373
1.7494
8.5656
7.8411
1.5992
1.7662
8.8435
1.9648
1.8561
1.7852
1.3637
1.1825
1.2548
2.1636
1.7238
1.6753
1.6927
8.0042
7.5812
7.2586
7.4766
8.1156
7.5961
7.9415
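Table 1. Performance comparison of our proposed algorithms with other
local test algorithms in the Tsukuba, Venus, Teddy, and Cones images.
In equation (13), the average weighted error is
$$ \bar{e} = \frac{\sum_{(x,y) \in S_v} w_s((x_r,y_r),(x,y)) \times e'_d(x,y)}{|S_v|} \tag{14} $$
and the variance is
$$ \mathrm{var}(e) = \overline{e^2} - (\bar{e})^2 = \frac{\sum_{(x,y) \in S_v} \big( w_s((x_r,y_r),(x,y)) \times e'_d(x,y) \big)^2}{|S_v|} - \left( \frac{\sum_{(x,y) \in S_v} w_s((x_r,y_r),(x,y)) \times e'_d(x,y)}{|S_v|} \right)^2. \tag{15} $$
In equations (11), (12), (14), and (15), the measurement
error $e'_d(x,y)$ depends on the type of images the method
is working on. If the images are gray images, the measurement
error is $e'_d(x,y) = e^{g}_d(x,y)$. Otherwise, if the images are RGB
images, the measurement error is $e'_d(x,y) = e^{RGB}_d(x,y)$.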
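To make equations (13) to (15) concrete, the sketch below evaluates the SVW window cost from a flat list of per-pixel errors and their spatial weights, together with a per-pixel cost that switches between gray and RGB inputs. The function names, the flat-list representation, and the parameter values in the usage lines are illustrative assumptions.

```python
import math

def pixel_cost(left_px, right_px):
    """Absolute difference for gray values, summed over bands for RGB tuples."""
    if isinstance(left_px, (int, float)):
        return abs(left_px - right_px)
    return sum(abs(a - b) for a, b in zip(left_px, right_px))

def svw_cost(errors, weights, alpha, beta, gamma):
    """Window cost: mean + alpha * variance + beta / (sqrt(size) + gamma)."""
    n = len(errors)                                      # window size |S_v|
    weighted = [w * e for w, e in zip(weights, errors)]
    mean = sum(weighted) / n                             # equation (14)
    var = sum(v * v for v in weighted) / n - mean ** 2   # equation (15)
    return mean + alpha * var + beta / (math.sqrt(n) + gamma)  # equation (13)

# Illustrative values only: error-free 5 x 5 and 7 x 7 windows.
small = svw_cost([0.0] * 25, [1.0] * 25, alpha=0.7, beta=18.0, gamma=20.0)
large = svw_cost([0.0] * 49, [1.0] * 49, alpha=0.7, beta=18.0, gamma=20.0)
```

With equal error statistics, the larger window receives the lower cost because of the last term, which is the behavior described for textureless regions in Section II-C.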
III. EXPERIMENT
In this section we report the experimental results of the
SFW, SSW, SMW, and SVW methods using the Middlebury
stereo pairs [11] in gray and color images. For all experiments,
we fixed the parameters of the SVW method as follows:
$\lambda_s$ = 20, $\alpha$ = 0.7, $\beta$ = 18, $\gamma$ = 2 0, with window sizes
ranging from (5 x 5) to (35 x 35). For the SMW method, we
set $\lambda_s$ = 15 and a support window size of (15 x 15). For the
SSW method, we set $\lambda_s$ = 12 and a support window size of
(9 x 9), and for the SFW method, we set $\lambda_s$ = 10 and a support
window size of (9 x 9). We found these parameter
values for the test algorithms empirically.
We tested the performance of our proposed algorithms using
images with ground truth, and then compared the performance
of the proposed methods with the original methods: FW, SW
[11], MW [12], and VW [2]. In order to make the testing fair,
we set the parameters for the original methods to the values
described in the original papers, and we do not use any
post-processing technique for any of the methods tested, including
our proposed methods.
Table 1 summarizes the performance of the test stereo methods
for the test images. We use the root-mean-square (RMS)
error measure [11], computed over all pixels from the estimated
depth maps and the ground truth, to quantify bad matching pixels. As
shown in Table 1, the proposed algorithms demonstrate better
performance than the corresponding original algorithms.
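Reference [11] defines two quality measures, the RMS error and the percentage of bad matching pixels. As a hedged illustration, both can be computed from a disparity map and its ground truth as sketched below; the function names, the list-of-rows representation, and the default threshold are assumptions of this example.

```python
import math

def _pairs(disp, truth):
    """Yield (computed, ground-truth) disparity pairs over all pixels."""
    for row_d, row_t in zip(disp, truth):
        for a, b in zip(row_d, row_t):
            yield a, b

def rms_error(disp, truth):
    """Root-mean-square difference between computed and true disparities."""
    sq = [(a - b) ** 2 for a, b in _pairs(disp, truth)]
    return math.sqrt(sum(sq) / len(sq))

def bad_pixel_rate(disp, truth, delta=1.0):
    """Fraction of pixels whose disparity error exceeds delta."""
    bad = [abs(a - b) > delta for a, b in _pairs(disp, truth)]
    return sum(bad) / len(bad)
```

The two measures can rank methods differently: RMS penalizes a few large errors heavily, whereas the bad-pixel rate only counts how many pixels exceed the tolerance.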
Fig. 4 shows the resulting curves, which compare the improved
methods with their corresponding original methods
on the Tsukuba, Venus, Teddy, and Cones images. As can
be seen in Fig. 4, all the improved algorithms have smaller
RMS errors than their original algorithms. Fig. 5 depicts the
resulting curves, which compare the improved methods with one
another on the Tsukuba, Venus, Teddy, and Cones images; in both gray
and RGB images, the SVW method has superior performance.
Fig. 6 depicts the results of SFW (RGB), SSW (RGB), SMW
(RGB), and SVW (RGB) for the Tsukuba, Venus, Teddy, and
Cones images. Figs. 6a, 6b, 6c, and 6d depict the ground truth
images of Tsukuba, Venus, Teddy, and Cones, respectively.
Figs. 6e, 6f, 6g, and 6h show the disparity maps of the SFW
(RGB) method for the stereo image pairs. Figs. 6i, 6j, 6k, and
6l show the disparity maps of the SSW (RGB) method for the
test stereo images. Figs. 6m, 6n, 6o, and 6p depict the disparity
maps of the SMW (RGB) method for the stereo image pairs.
Figs. 6q, 6r, 6s, and 6t depict the disparity maps of the SVW
(RGB) method for the test stereo images.
IV. CONCLUSIONS AND FUTURE WORK
In this paper, we presented four enhanced block-based
matching methods for stereo matching: the spatial fixed window,
spatial shiftable window, spatial multiple window, and spatial
variable window methods. The advantages of the proposed
methods are that they work robustly and overcome the weaknesses
of the original methods. We did not use any
post-processing techniques for any of the test methods in our
experiment, and the experimental results show that our algorithms
outperform the original methods. Our methods can be
improved in the future to work more robustly in slanted and/or
very large textureless regions.
REFERENCES
[1] B. Cyganek and J. P. Siebert, An Introduction to 3-D Computer Vision
Techniques and Algorithms, New York: Wiley-Blackwell, 2009.
[2] O. Veksler, "Fast Variable Window for Stereo Correspondence Using
Integral Images," Proc. of the Int. Conf. on Computer Vision and Pattern
Recognition, pp. 556-561, 2003.
[3] K.-J. Yoon and I.-S. Kweon, "Locally Adaptive Support-Weight
Approach for Visual Correspondence Search," Proc. of Conf. on Computer
Vision and Pattern Recognition, pp. 924-931, 2005.
[4] P. Fua, "A Parallel Stereo Algorithm that Produces Dense Depth Maps
and Preserves Image Features," Machine Vision and Applications, vol.
6, pp. 35-49, 1993.
Fig. 4. Comparison of the improved methods with their corresponding original methods in the Tsukuba, Venus, Teddy, and Cones images. (a) compares
the three methods: FW, SFW (gray), and SFW (RGB). (b) shows the comparison of the SW, SSW (gray), and SSW (RGB) methods. (c) compares the three
algorithms: MW, SMW (gray), and SMW (RGB). (d) shows the comparison of the VW, SVW (gray), and SVW (RGB) algorithms.
Fig. 5. Comparison of the improved methods in the Tsukuba, Venus, Teddy, and Cones images. (a) compares the SFW (gray), SSW (gray), SMW
(gray), and SVW (gray) methods. (b) shows the comparison of the SFW (RGB), SSW (RGB), SMW (RGB), and SVW (RGB) methods.
[5] Qingxiong Yang, Ruigang Yang, James Davis, and David Nistér, "Spatial-Depth
Super Resolution for Range Images," Proc. IEEE Conf. on
Computer Vision and Pattern Recognition, 2007.
[6] C. Tomasi and R. Manduchi, "Bilateral Filtering for gray and color
images," Int. Conf. on Computer Vision, pp. 839-846, 1998.
[7] Yong Seok Heo, Kyoung Mu Lee, and Sang Uk Lee, "Robust Stereo
Matching using Adaptive Normalized Cross Correlation," IEEE Trans.
Pattern Analysis and Machine Intelligence, vol. 33, no. 4, pp. 807-822,
2011.
[8] Ramin Zabih and John Woodfill, "Non-parametric local transforms for
computing visual correspondence," European Conf. on Computer
Vision, pp. 151-158, 1994.
[9] G. Egnal, "Mutual information as a stereo correspondence measure,"
Technical Report MS-CIS-00-20, Comp. and Inf. Science, U. of
Pennsylvania, 2000.
[10] T. Ojala, M. Pietikäinen, and D. Harwood, "Performance evaluation of
texture measures with classification based on Kullback discrimination
of distributions," Proc. Int. Conf. on Pattern Recognition, vol. 1, pp.
582-585, 1994.
[11] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense
two-frame stereo correspondence algorithms," Int. Journal of Computer
Vision, vol. 47, pp. 7-42, 2002.
[12] H. Hirschmüller, P. Innocent, and J. Garibaldi, "Real-time correlation based
stereo vision with reduced border errors," Int. Journal of Computer
Vision, vol. 47, pp. 229-246, 2002.
[13] A. Hosni, M. Bleyer, M. Gelautz, and C. Rhemann, "Local stereo
matching using geodesic support weights," IEEE Int. Conf. on Image
Processing, pp. 2093-2096, 2009.
Fig. 6. Results of the test stereo methods on the Tsukuba, Venus, Teddy, and Cones images. Figs. 6a, 6b, 6c, and 6d show the ground truth images of
Tsukuba, Venus, Teddy, and Cones, respectively. Figs. 6e-6h show the disparity maps of the SFW (RGB) method for the stereo image pairs. Figs. 6i, 6j,
6k, and 6l show the disparity maps of the SSW (RGB) method for the test stereo images. Figs. 6m, 6n, 6o, and 6p depict the disparity maps of the
SMW (RGB) method for the stereo image pairs. Figs. 6q, 6r, 6s, and 6t depict the disparity maps of the SVW (RGB) method for the test stereo images.
[14] S. Birchfield and C. Tomasi, "A pixel dissimilarity measure that is
insensitive to image sampling," IEEE Trans. on Pattern Analysis and
Machine Intelligence, vol. 20, no. 4, pp. 401-406, 1998.