Effective Spatial Resolution of Temporally and Spatially Interlaced Stereo
3D Televisions
Joohwan S. Kim, Martin S. Banks
Vision Science Program, University of California, Berkeley
Abstract
We measured the effective spatial resolution of temporally and
spatially interlaced stereo three-dimensional (S3D) televisions
at three viewing distances: 1.5, 3, and 6 times picture
height. The temporally interlaced S3D television has significantly
better effective resolution at viewing distances of 1.5 and 3 times
picture height.
Author Keywords
Effective spatial resolution; S3D display; temporal interlacing;
spatial interlacing
1. Objective and Background
We investigate the effective spatial resolution of two types of
stereo 3D (S3D) televisions: one with temporal interlacing and
one with spatial interlacing. The temporal-interlacing approach is
schematized in Figure 1. It involves presenting full-resolution
images to the two eyes at different times. Using active shutter
eyewear synchronized to the television, the left eye receives a
full-resolution image while the right eye receives a dark image,
and then the right eye receives a full-resolution image while the
left eye receives a dark image.
Figure 1. Temporal interlacing. The left and right panels
schematize the display and viewer at two times (t1 and t2).
Two objects are displayed on the screen: A red rectangle at
the distance of the display and a blue rectangle behind it.
The images to the left and right eyes alternate in time: first
the image shown in the left panel is delivered to the right
eye and then the image in the right panel to the left eye.
This is achieved by synchronizing liquid-crystal shutter
glasses to the display. The retinal images created by this
technique are shown at the bottom.
The spatial-interlacing approach is schematized in Figure 2. It
involves presenting half-resolution images to the two eyes
simultaneously. The odd rows on the television are polarized in
one direction and the even rows in the orthogonal direction.
Passive polarization eyewear directs the odd rows to one eye
and the even rows to the other.
Figure 2. Spatial interlacing. The display panel is shown at
the top and the viewer’s eyes at the bottom including the
retinal images produced by viewing the display. Two objects
are shown on the screen: a red rectangle that should be
perceived at the distance of the display and a blue rectangle
that should be perceived as farther. In spatial interlacing,
the even rows are presented to one eye (here the left) and
the odd rows to the other eye (here the right). This is
achieved by circularly polarizing even and odd rows in
opposite directions. The viewer wears passive polarizers
that transmit one direction to the left eye and the opposite
direction to the right eye. The panels at the bottom show
roughly what the retinal images would be in this case. The
white lines represent transmitted rows and the black lines
non-transmitted rows.
Presumably, the effective resolution in the temporal-interlacing
approach is the full resolution of the television because both eyes
receive information from all the pixels. It is not obvious what the
effective resolution is in the spatial-interlacing approach. There
are two possibilities. 1) The resolution could be determined solely
by the number of pixels received by either the left or right eye; in
that case, the effective vertical resolution would be half the
vertical resolution of the television (the horizontal resolution
should be unaffected). 2) The resolution could be determined by
the sum of the pixels received by the two eyes; that is, the brain
might sum the two eyes’ images to have in effect a full-resolution
representation. In that case, the effective resolution (vertically and
horizontally) would be the full resolution of the television. This
second possibility seems somewhat unlikely, however, when one
considers how the eyes are likely to align relative to each other in
order to fuse the display binocularly. If the odd and even rows are
visible (as they would be at normal viewing distances), the eyes
would very likely make a small vertical vergence movement to
align the bright rows in the two eyes’ images. Once aligned that
way, the binocular image could only have half vertical resolution.
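The row-counting argument above can be sketched in a few lines. This is only an illustration under stated assumptions: rows are numbered from 1, even rows go to the left eye and odd rows to the right (as drawn in Figure 2), and vertical vergence is modeled as a shift of exactly one row pitch. The function name visible_rows is hypothetical.

```python
def visible_rows(n_rows, eye):
    """Return the 1-indexed display rows transmitted to one eye.

    Assumption (following Figure 2): even rows reach the left eye,
    odd rows reach the right eye.
    """
    start = 2 if eye == "left" else 1
    return set(range(start, n_rows + 1, 2))


N_ROWS = 1080  # vertical pixel count of the HD panels described in the text

left = visible_rows(N_ROWS, "left")
right = visible_rows(N_ROWS, "right")

# Possibility 1: resolution is set by the rows reaching a single eye.
rows_per_eye = len(left)

# Possibility 2: the two eyes' rows are summed with the eyes in register.
rows_summed = len(left | right)

# Vertical vergence of one row pitch (modeling assumption): the right eye's
# odd rows now land on the same positions as the left eye's even rows,
# so summing the two eyes adds no new row positions.
right_after_vergence = {row + 1 for row in right}
rows_after_vergence = len(left | right_after_vergence)

print(rows_per_eye, rows_summed, rows_after_vergence)  # prints "540 1080 540"
```

Under these assumptions, the full 1080 rows are available only if the eyes stay in register; once the bright rows are aligned, the binocular image has the same 540 row positions as either monocular image.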
We measured the effective spatial resolution of two types of S3D
televisions. They both have HD resolution (1080 x 1920 pixels)
and are the same size (68 x 122 cm). The televisions were run in
stereo 3D mode.
2. Stimuli
The stimuli were black letters presented on a white background.
Examples are provided in Figure 3. All 26 letters in the English
alphabet were used. We used the Bailey-Lovie letter acuity chart
[1] as a guideline for creating the letter stimuli. We did so because
the Bailey-Lovie test is commonplace in optometric assessment.
Most of the letters were drawn in Arial font because that is most
similar to the Bailey-Lovie letters. The other letters were designed
according to Bailey and Lovie’s description and were also similar
to Arial font. Seven letter sizes were presented. Stroke width was
always 1/5 of the letter height. Letters were presented three at a
time, as shown in Figure 4, with a spacing of twice the letter width.
Figure 3. Ten of the letters used in the acuity test. A total of
26 different letters were used.
Figure 4. Letter dimensions and layout during testing. Letter
height was varied from trial to trial. Letter width and stroke
width were proportional to letter height. The horizontal
spacing between letters was twice the letter width.
3. Procedure
The letters were presented with three disparities: 10 pixels
uncrossed (i.e., behind the screen), 0 pixels (at the screen), and 10
pixels crossed (in front of the screen). There were three viewing
distances: 6H (six times screen height), 3H, and 1.5H; those
distances correspond respectively to 408, 204, and 102 cm. The
Nyquist frequencies (the highest spatial frequencies that can be
represented without aliasing by the pixel grid) at those distances
were 56.5, 28.3, and 14.1 cycles/deg. Thus, the pixel grid was not
visible at 6H (because the contrast of the pixel grid at such a
high spatial frequency is too low to be seen [2]), was barely visible
at 3H, and was quite visible at 1.5H. At the viewing distance of 6H,
the seven letter sizes corresponded to 6, 8, 10, 13, 16, 20, and 25
pixels. At 3H, the sizes corresponded to 3, 4, 5, 6, 8, 10, and 13
pixels, so the letters had the same angular size as those presented
at 6H. At 1.5H, the sizes also corresponded to 3, 4, 5, 6, 8, 10, and
13 pixels, so the letters were the same size on the display screen
(not the same angular size) as those presented at 3H; we did this to
avoid presenting letters with fewer than 3 pixels per letter height.
On each trial, three letters were presented for 600 msec. The
subject then reported the three letters seen by typing on a
keyboard. A subject who did not see the letters clearly still had to
respond, even if it meant guessing. A correct response was typing
the correct letter in the correct position. Thus, if the stimulus was
A B C, responses of “A” “B” “C” were three correct responses, a
proportion correct of 1. “B” “C” “X” were three incorrect
responses, a proportion correct of 0. “A” “C” “X” were one
correct response, a proportion correct of 1/3. Subjects were not
given feedback about the correctness of their responses.
We tested six young adults with good visual acuity and good
stereopsis. If a subject normally wore an optical correction (i.e.,
spectacles or contact lenses), he/she wore it during testing.
Subjects were paid on an hourly basis for their participation. All
but one (JK) were unaware of the hypotheses under consideration.
They viewed the stimuli binocularly.
The ordering of letter type, letter size, and disparity was
randomized from trial to trial. The ordering of television type and
viewing distance was randomized across six experimental
sessions. Subjects wore liquid-crystal shutter glasses when we
were testing the temporally interlaced display and passive
polarizing glasses when we were testing the spatially interlaced
display.
4. Results
The results for the six subjects are shown in Figure 5. Each panel
plots proportion of correct responses as a function of letter stroke
width. The solid lines represent data for the temporally interlaced
television. The dashed lines represent data for the spatially
interlaced television. Green, red, and blue represent data at
viewing distances of 6H, 3H, and 1.5H, respectively. The data
from the three different disparities have been averaged because
there was no effect of disparity.
Figure 5. Individual visual acuity data. Each panel shows
the data from one subject; it is the proportion of correct
letter identifications as a function of letter stroke width
expressed as the logarithm of stroke width in arcmin. In these units,
clinically normal visual acuity of 20/20 is 0 (the logarithm of
1 arcmin). The solid lines represent data obtained with the
temporally interlaced television and dashed lines data
obtained with the spatially interlaced television. The green, red,
and blue lines and symbols represent data obtained at
viewing distances of 6, 3, and 1.5 times picture height,
respectively. The dashed horizontal lines correspond to a
proportion correct of 0.75.
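The horizontal axis used in these plots can be reproduced with a short conversion from pixels to log arcmin. The sketch below is a hedged illustration, not the authors' analysis code: it assumes the 68 cm picture height and 1080 rows reported for the televisions, treats the listed letter sizes as letter heights in pixels, and uses the stated 1/5 ratio of stroke width to letter height. The function names are hypothetical.

```python
import math

SCREEN_HEIGHT_CM = 68.0  # picture height H reported in the text
SCREEN_ROWS = 1080       # vertical pixel count reported in the text


def pixels_to_arcmin(n_pixels, distance_in_h):
    """Visual angle (arcmin) of n_pixels at a distance given in picture heights."""
    size_cm = n_pixels * SCREEN_HEIGHT_CM / SCREEN_ROWS
    distance_cm = distance_in_h * SCREEN_HEIGHT_CM
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm))) * 60


def log_stroke_width(letter_height_px, distance_in_h):
    """log10 of stroke width in arcmin, with stroke width = letter height / 5.

    Assumption: the letter sizes listed in the text are letter heights.
    """
    return math.log10(pixels_to_arcmin(letter_height_px / 5, distance_in_h))


# Angular size of one pixel at 6H, 3H, and 1.5H.
for distance_in_h in (6, 3, 1.5):
    print(distance_in_h, round(pixels_to_arcmin(1, distance_in_h), 2))
# prints 0.53, 1.06, and 2.12 arcmin, respectively
```

For example, log_stroke_width(25, 6) gives the axis position of the largest letters at 6H under these assumptions, and a value of 0 corresponds to a 1-arcmin stroke (20/20 acuity).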
The data were very similar across subjects, so we averaged them
to produce Figure 6. Again solid and dashed lines represent data
from the temporally interlaced television and the spatially
interlaced television, respectively. And again, green, red, and blue
represent the data from viewing distances of 6H, 3H, and 1.5H,
respectively. The data from different disparities have been
averaged. Smaller letters could be identified with the temporally
interlaced television at the shorter viewing distances (1.5H and 3H).
There was no difference at the longest viewing distance (6H).
Thus, the temporally interlaced television has better effective
spatial resolution than the spatially interlaced television at the
short and medium viewing distances. (Note that the recommended
viewing distance for HD resolution [3] is three times picture height,
the medium distance we tested.) The cause of the difference in
effective resolution is undoubtedly the reduction in vertical
resolution when one presents only half the rows to one eye as
occurs with the spatially interlaced television. At the longest
distance, the effective resolution is the same for the two devices
because at that distance the pixel grid is much finer than the
highest visible spatial frequency, and therefore the limit to
performance becomes the optics and neural mechanisms of the
viewer’s visual system rather than the pixel grid.
Figure 6. Average visual acuity data. The data from Figure
5 have been averaged across subjects. Again the
proportion of correct letter identifications is plotted as a
function of letter stroke width (expressed as the logarithm of
stroke width in arcmin). The solid and dashed lines
represent data from the temporal- and spatial-interlaced
televisions, respectively. The green, red, and blue lines and
symbols represent data with viewing distances of 6, 3, and
1.5 times picture height, respectively. The dashed horizontal
line represents a proportion correct of 0.75. The vertical
dashed lines represent the size of individual pixels
expressed as the logarithm of pixel size in arcmin; green, red, and blue
for viewing distances of 6, 3, and 1.5 times picture height,
respectively.
We can estimate visual acuities from the data in Figure 6 by
finding the letter size that yields a proportion correct of 0.75.
Figure 7 shows the resulting acuities. Better visual acuities were
obtained with the temporally interlaced television at viewing
distances of 1.5H and 3H (p < 0.01, two-tailed t test). The acuities
did not differ significantly at 6H (p > 0.10, two-tailed t test).
Therefore, as we observed
in Figure 6, the temporally interlaced television provides better
effective spatial resolution at short and medium viewing
distances; it never provides lower resolution.
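The text does not say how the 0.75 point was located on each curve. As a simple stand-in, the sketch below uses linear interpolation between the two measured sizes that bracket the criterion; a fitted psychometric function would be an equally plausible choice. The function name threshold_at and the data values are hypothetical and are not taken from the paper.

```python
def threshold_at(criterion, log_sizes, proportions):
    """Interpolate the log size at which proportion correct first reaches criterion.

    log_sizes must be in ascending order. Returns None if no pair of
    neighbouring points brackets the criterion.
    """
    points = list(zip(log_sizes, proportions))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 < criterion <= p1:
            # Linear interpolation between the two bracketing points.
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    return None


# Hypothetical psychometric data for illustration only (not from the paper):
# log10 stroke width in arcmin and the matching proportion correct.
log_sizes = [-0.2, -0.1, 0.0, 0.1, 0.2]
proportions = [0.10, 0.30, 0.60, 0.90, 1.00]

acuity = threshold_at(0.75, log_sizes, proportions)
print(acuity)
```

With these made-up values the criterion is crossed halfway between 0.0 and 0.1, so the estimate is close to 0.05 log arcmin. Smaller thresholds correspond to better acuity.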
Figure 7. Average visual acuity as a function of viewing
distance and interlacing technique. The letter size
associated with a proportion correct of 0.75 was estimated
from Figure 6. Different viewing distances—1.5, 3, and 6
times picture height—are represented on the horizontal
axis. Blue and red represent the acuity estimates for
temporal and spatial interlacing, respectively. The error bars are
standard deviations. ** indicates a statistically significant
difference between the two acuities (p < 0.01) and n.s.
indicates no statistically significant difference.
These results show that effective spatial resolution is better with
the temporally interlaced television than with the spatially
interlaced television except at long viewing distance (six times
picture height) where the effective resolutions are the same. We
note that the recommended viewing distance is usually three times
picture height, so the temporally interlaced television will provide
better effective resolution at that distance. These results are not
surprising because, as we said earlier, the viewers of spatially
interlaced televisions are very likely to make vertical vergence
eye movements to align the bright rows (i.e., to align the odd rows
in one eye with the even rows in the other). Once this occurs,
there is no way the binocular image could have more than half
resolution vertically.
5. Binocular Summation
We also looked at how the effective spatial resolution is affected
by binocular presentation. If the limit to effective resolution is the
resolution of the images presented to the left or right eye,
performance should be similar with binocular and monocular
viewing. To test this possibility, we reran the experiment with
binocular and monocular viewing. The results are shown in Figure
8. Proportion correct is plotted as a function of letter size. The
solid and dashed lines represent the data obtained with binocular
and monocular viewing, respectively. As you can see,
performance was very similar in those two viewing conditions for
all viewing distances and both types of television. The small
improvement in acuity with binocular viewing has been observed
many times before [4] and is almost certainly the result of having
more information presented to the visual brain when two eyes
are viewing rather than one. We conclude that the limit
to effective spatial resolution is the resolution of the images
presented to either eye. There is only a small improvement due to
binocular summation.
Figure 8. Average proportion correct as a function of letter
size for different display protocols and for binocular and
monocular viewing. The solid and dashed lines represent
the data for binocular and monocular viewing, respectively.
The green, red, and blue lines and symbols represent the
data for the temporally interlaced television. The cyan,
purple, and black lines and symbols represent the data for
the spatially interlaced television. The vertical lines
represent pixel size at viewing distances of 6, 3, and 1.5
times picture height. The dashed horizontal line represents
a proportion correct of 0.75.
A viewing
distance of six times picture height, where the effective
resolutions are the same, is much farther than the recommended
and observed viewing distance [3].
6. Summary & Conclusion
We examined the effective spatial resolution of two televisions:
temporally interlaced television and spatially interlaced television.
We found that the temporally interlaced television has
significantly better effective resolution at viewing distances of 1.5
and 3 times picture height, and that the two televisions have the
same effective resolution at a distance of 6 times picture height.
These results are very sensible. At 1.5 and 3 times picture height,
the Nyquist frequencies of the televisions are 14.1 and 28.3
cycles/deg, which means that the pixel grid is easily visible at the
former distance and somewhat visible at the latter. Thus, one
expects to observe a difference in effective resolution at those
distances because the spatially interlaced television presents to
each eye half the number of pixels per degree of vertical visual
angle as the temporally interlaced television does. If the pixels are
visible, the number of pixels presented per eye has an effect on
performance. At 6 times picture height, the Nyquist frequency is
56.5 cycles/deg, which would not be visible to the human eye
given the relatively low contrast of the pixel grid. Thus, one
expects to find no difference in effective resolution at that
distance because the eye cannot resolve the pixels in that case. We
also examined the claim that the effective spatial resolution of the
spatially interlaced television is twice the resolution of the images
presented to each eye. We tested this by comparing binocular and
monocular presentations. We found only a small difference, so there is no
support for the idea that the brain integrates the two monocular
images in a way that provides better effective spatial resolution
binocularly.
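The Nyquist frequencies quoted above follow from the display geometry. The following is a minimal sketch, assuming the 68 cm picture height and 1080 rows reported for the televisions; the rows_per_eye parameter is a hypothetical addition used to show the halved vertical sampling that spatial interlacing delivers to each eye.

```python
import math

SCREEN_HEIGHT_CM = 68.0  # picture height H reported in the text
SCREEN_ROWS = 1080       # vertical pixel count reported in the text


def nyquist_cycles_per_deg(distance_cm, rows_per_eye=SCREEN_ROWS):
    """Nyquist frequency of the row grid: half the number of rows per degree."""
    row_pitch_cm = SCREEN_HEIGHT_CM / rows_per_eye
    deg_per_row = math.degrees(2 * math.atan(row_pitch_cm / (2 * distance_cm)))
    return 0.5 / deg_per_row


for distance_cm in (102, 204, 408):  # 1.5H, 3H, and 6H
    full = nyquist_cycles_per_deg(distance_cm)
    # Assumption: each eye sees every other row, i.e. 540 rows over the same height.
    half = nyquist_cycles_per_deg(distance_cm, rows_per_eye=SCREEN_ROWS // 2)
    print(distance_cm, round(full, 1), round(half, 1))
# prints:
# 102 14.1 7.1
# 204 28.3 14.1
# 408 56.5 28.3
```

The full-resolution column reproduces the 14.1, 28.3, and 56.5 cycles/deg given in the text. The per-eye column is an illustrative extension showing the vertical limit if only half the rows reach each eye.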
7. Acknowledgements
This research was supported by Samsung America and NIH Grant
R01EY012851.
8. References
[1] I. L. Bailey, J. E. Lovie, “New Design Principles for Visual
Acuity Letter Charts,” American Journal of Optometry and
Physiological Optics 53(11), 740-745 (1976).
[2] F. W. Campbell, D. G. Green, “Optical and Retinal Factors
Affecting Visual Resolution,” Journal of Physiology 181,
576-593 (1965).
[3] Recommendation ITU-R BT.709-5, “Parameter
Values for the HDTV Standards for Production and
International Programme Exchange,” (2002).
[4] F. W. Campbell, D. G. Green, “Monocular versus Binocular
Visual Acuity,” Nature 208, 191-192 (1965).