
BEYOND THE KODAK IMAGE SET:
A NEW REFERENCE SET OF COLOR IMAGE SEQUENCES
Stefano Andriani, Harald Brendel, Tamara Seybold and Joseph Goldstone
ARRI - Arnold and Richter Cine Technik GmbH
Image Science Group, R&D Department
Türkenstraße 89, 80799 Munich, Germany
ABSTRACT
In 1991 Kodak released a set of 24 digital color images derived
from a variety of film source materials. Since then, most image
processing algorithms have been developed, optimized, tested and
compared using this set. Until a few years ago it was considered
“the” image set; however, today it shows its limitations. Researchers
have expressed their need for better, more up-to-date material. We
present a new set of high quality color image sequences captured
with our professional digital cinema camera. This camera stores uncompressed raw sensor data, and the set is freely available via FTP at
ftp://[email protected]/
password: imageset.
Index Terms— Kodak set, ARRI set, sensor data, color demosaicking
1. INTRODUCTION
In the last twenty-five years, use of digital cameras has expanded
from laboratory researchers, to consumers, and now to the most
demanding professional photographers. Contemporary sensors can
capture images with very high spatial resolution, high dynamic range
and relatively low noise. This allows photographers to benefit from
the flexibility and low costs of a digital workflow without suffering reduced image quality or compromising their artistic intent. Improvements in processor and storage performance and system architecture have led to completely new image processing algorithms, and
digital images today can be managed and reproduced with very high
fidelity. As a result, digital cameras have almost completely displaced film cameras in professional photography. Over the last few
years the same move to digital is occurring in cinematography, albeit
at a different pace.
These advances have been obtained through tightly coupled development of hardware and software components. In addition, after camera manufacturers develop or adapt specific image processing algorithms for a camera, third-party companies often collaborate
with the camera manufacturer to offer additional capabilities. This
latter type of development is typically done under a non-disclosure
agreement, allowing the camera manufacturer to divulge the camera’s most intricate design decisions while safeguarding that manufacturer’s intellectual property.
Outside the world of industry, on the other hand, the academic
community still typically develops, tests and compares image processing algorithms using the “Kodak image set”. This set was
released as 8-bit full RGB images in three different formats: “Full
detail” 3072×2048, “HDTV” 1536×1024 and “TV-comparable”
768×512, where the two lowest-resolution formats were computed
978-1-4799-2341-0/13/$31.00 ©2013 IEEE
as subsampled versions of the “Full detail” format. The two larger
formats were used only sporadically because the computational
power of that era’s computers was so limited that even simple image
processing algorithms required extensive computational resources.
Thus, most of the algorithms present in the literature have been
developed using the “TV Comparable” format. Although the computational
power of today’s computers is enough to execute
most of the older algorithms in real time, the community still uses
the smallest format so as to compare most directly any new results
against older ones. That said, the scientific community has come to
see the limits of the Kodak set in many fields of image processing
research, and the need for a new set.
Corporately as a camera manufacturer and individually as researchers we would like to share some of our test sequences to satisfy
this need and help advance the state of the art. Our digital camera, introduced in Section 2, can capture unprocessed data directly from the
sensor and store it uncompressed in its original 2880×1620 (16:9)
format. This uncompressed output gives cinematographers the opportunity to extract the highest possible quality image data from the
camera, deferring all color processing until post-production. Such
uncompressed output is rare, as almost all professional video cameras compress both sensor data (if they make sensor data available
at all) and reconstructed HDTV-format imagery prior to storage.
Through this uncompressed imagery, the image processing community will have free access to high quality reference sequences which
can be used to develop and test new algorithms in a wide range of
image processing fields, including:
• demosaicking, by working with actual uncompressed color
filter array imagery;
• denoising, by analyzing real noise and its impact at each step
of the image processing chain;
• video compression, by starting from high resolution uncompressed video material; and
• optical flow, by analyzing the correlation among frames in
either Bayer pattern data or reconstructed color imagery.
Furthermore, the possibility to test all these algorithms on image
sequences (rather than the stills of the Kodak set) offers the opportunity to look at an algorithm’s results extended across the temporal
domain where introduced artifacts are usually more visible.
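For the demosaicking and denoising items above, work starts from the raw mosaic itself. A minimal NumPy sketch of separating a Bayer frame into its four subsampled color planes follows; the GRBG phase assumed here is illustrative only, not a documented property of the set.

```python
import numpy as np

def split_bayer(mosaic):
    """Split a single-channel Bayer mosaic into its four subsampled planes.

    A GRBG phase is assumed (G, R on even rows; B, G on odd rows); the
    actual phase of a given camera file must be verified before use."""
    g1 = mosaic[0::2, 0::2]  # green samples on the red rows
    r  = mosaic[0::2, 1::2]  # red samples
    b  = mosaic[1::2, 0::2]  # blue samples
    g2 = mosaic[1::2, 1::2]  # green samples on the blue rows
    return r, g1, g2, b
```

Each returned plane has half the height and width of the mosaic; demosaicking algorithms then estimate the two missing colors at every photosite.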
2. ARRI ALEXA CAMERA
The sequences in the set were captured by the ARRI ALEXA camera as it was tested on a variety of subject material and in a variety
of capture environments. The ALEXA camera family has been developed for digital cinema acquisition and comprises four
ICIP 2013
Fig. 1. ARRI ALEXA family.
Fig. 2. ARRI ALEXA equipped with the color wheel.
models: ALEXA, ALEXA Plus, ALEXA M and ALEXA Studio
(see Figure 1), all of which are built around the “ALEV III” CMOS
sensor which has the following characteristics:
• Sensor size: 35 mm film frame;
• Sensitivity: 160–3200 ISO;
• Dynamic range: > 14 stops;
• Resolution: 2880×1620;
• Dual Gain Architecture (DGA).
The DGA technology simultaneously provides two separate readout paths, with differing amplification, from each pixel. The first path contains a highly amplified signal. The second path contains a signal with lower amplification, to capture any information clipped in the first higher-gain path. Both paths feed into the A/D converters of the camera, each delivering a 14 bit image. These images are then combined into a single 16 bit higher dynamic range image. This arrangement enhances low light performance and significantly extends the dynamic range of the resulting image. The shadow areas are reconstructed from the high gain path and the highlights are reconstructed from the low gain path, producing an image containing meaningful luminance information in all 16 bits. This output is then logarithmically compressed into 12-bit form, decreasing required bandwidth without reduction of overall quality.
A filter pack situated in front of the sensor contains an optical low pass filter, an infrared (IR) cut-off filter and an ultraviolet (UV) cut-off filter. The low pass filter blocks high image frequencies that would lead to artifacts when captured by the sensor. Finally, between the filter pack and the sensor there is a Bayer pattern color filter array [1] allowing the capture of color images with a single sensor through demosaicking.
The ALEXA camera provides its imagery in a variety of formats, from ARRIRAW (12-bit logarithmic uncompressed unencrypted data from the sensor) as the highest quality output on the one hand, to compressed or uncompressed HD on the other. More information is available at the ARRI website: www.arri.com.
3. CONTENTS OF THE NEW SET
The proposed set comprises twelve sequences captured in different locales under different lighting conditions. The first eleven sequences were captured by normal ALEXA cameras equipped with a standard Bayer pattern color filter array, and require color reconstruction with a demosaicking algorithm such as those proposed in the literature [2–6]. The twelfth and final sequence was captured by a special ALEXA research camera in which the Bayer color filter array has been removed, providing a “gray-scale” digital camera. Along with this camera we also constructed a color wheel, positioned in front of the lens, that can be equipped with up to four different color filters. Its rotation is driven by a stepper motor synchronized with the camera’s recording processes (see Figure 2). The color wheel is fixed when recording is active; during the time the color wheel rotates to change the color filter in front of the camera, recording is deactivated. In this way, we can obtain full color images from a single-sensor camera without an integrated color filter array.
We have developed this system because of the utility of having full individual red, green and blue images as a reference, or as a starting point of a research investigation, e.g. in examining new patterns for a color filter array. The system is useful and was cheap and easy to realize. Its main drawback is the impossibility of shooting real sequences of arbitrary image content with it; in fact, the scene captured during the rotation of the wheel must be motionless lest the images associated with each color filter contain relative spatial variation that could introduce unwanted color artifacts. This problem can be seen in the “Color Wheel” sequence if one focuses on the peacock feathers in the center of the scene.
The main characteristics of the proposed set are reported in Table 1 and a preview of its content is shown in Figure 3.
4. COLOR PROCESSING THE IMAGES
We provide the images as linear 16 bit TIFF files reconstructed from
the 12-bit logarithmic format. Linear, in this context, means that
the digital numbers are proportional to the amount of light collected
by each pixel. The sensor has a read-out noise of approximately
2.1 digital code values and, as with most commercial cameras, its
spectral responsivities are not colorimetric: they do not meet the
Luther-Ives condition. In this section, we present color correction
matrices that are derived from minimizing the color error over a set
of selected test colors. Those matrices include a chromatic adaptation transform from the scene illumination to the white point of the
target color space (D65).
To render the images for display on a monitor one could apply
several of the tone-mapping techniques described in the literature,
such as those of Reinhard et al. in [7]. We provide here a global,
spatially-invariant transform that is very similar to the way the camera generates a high definition video image. This processing, appropriately enough for its typical application, is inspired by traditional
motion-picture film processing.
First, the linear data is white balanced and color reconstructed.
Then, the color correction matrix transforms the camera RGB values into values for a wide-gamut color space. The encoding primaries for this color space are chosen to avoid clipping in all but the most extreme cases. These wide-gamut RGB values are then non-linearly transformed by a function named Log C, which mimics the results obtained by scanning color negative film. The Log C function is basically a logarithmic transform of the data, with a small offset added to create a “toe” at the lower end of the curve. The next step is a tone-mapping curve, similar to what would be found in motion-picture print film. Compared to a motion-picture print film tone-mapping curve, ours induces less contrast and maintains more shadow detail, making the resulting images easier to use in an electronic viewfinder or to view on an on-set monitor. This tone-mapped data is then matrixed into the color space defined in ITU Recommendation 709 [8], which has the same primaries as the sRGB color space. The final step in color processing is compensation for the non-linear electro-optical transfer function of the monitor, assumed to be a power function with an exponent of 2.4 [9].
The chromaticity coordinates of the primaries and of the white point of the wide-gamut color space are given in Table 2. Note that blue is a virtual primary. This results from a desire to encode all colors with positive numbers. The colorimetric error of the camera will map certain colors outside of the spectral locus. This behavior is not unique to the ALEXA, and can often be seen in digital still cameras as well [10].

Fig. 3. Proposed sequence set.

Table 1. Recording information for the proposed set.
 #   Name                  Exp. Index   Color Temp.   Frames
 01  Akademie              400          5600 K        240
 02  Arri                  400          5600 K        240
 03  Church                800          5000 K         48
 04  Color Test Chart      800          3200 K         10
 05  Face                  400          5600 K         48
 06  Lake Locked           800          5600 K         48
 07  Lake Pan              800          5600 K         48
 08  Night at Odeonplatz   800          5600 K        240
 09  Swimming Pool         800          5600 K         48
 10  Sharpness Chart       400          3200 K         10
 11  Night at Siegestor    1600         5600 K        240
 12  Color Wheel           400          3200 K         24

Table 2. Chromaticity coordinates of the primaries and the white point of the wide-gamut color space.
          x         y
 Red     0.6840    0.3130
 Green   0.2210    0.8480
 Blue    0.0861   −0.1020
 White   0.3127    0.3290

Several other approaches could be used to display the images. A general procedure is outlined below.
First, normalize the data for the exposure:

    R = (18 · EI / 400) · (R_raw − 256) / (65535 − 256)    (1)

where EI is the exposure index of the scene (see Table 1) and R_raw is a 16 bit pixel value from the TIFF file. G and B are likewise normalized. This converts the raw pixel values into normalized (but still unreconstructed) values scaled such that a captured gray card would be represented by RGB values of (0.18, 0.18, 0.18).
Second, apply a color correction matrix. Conversion matrices for arbitrary target color spaces can be derived from the chromaticity coordinates of the wide-gamut color space mentioned above combined with the color correction matrix for the scene. For example, the color correction matrix for a scene photographed in daylight (D5600) is:

    [R_wg]   [  1.1766  −0.1190  −0.0576 ] [R]
    [G_wg] = [ −0.0194   1.0606  −0.0412 ] [G]    (2)
    [B_wg]   [  0.0368  −0.2019   1.1652 ] [B]

The conversion from the wide-gamut color space to ITU Rec. 709 or sRGB is done by the following matrix:

    [R_s]   [  1.6175  −0.5373  −0.0802 ] [R_wg]
    [G_s] = [ −0.0706   1.3346  −0.2640 ] [G_wg]    (3)
    [B_s]   [ −0.0211  −0.2270   1.2481 ] [B_wg]

where [R_s G_s B_s] is the red, green, blue triplet in the sRGB color space.
Finally, apply the correction for the non-linear EOCF of the display, e.g. the transform described in Amendment 1 of IEC 61966-2-1 [11]:

    C′ = 12.92 · C                      if C ≤ 0.0031308
    C′ = 1.055 · C^(1/2.4) − 0.055      if C > 0.0031308    (4)

where C is R, G, or B, respectively.

Fig. 4. Image processing chains for the tests on the “Lighthouse” image and the ARRI sequences.

Fig. 5. Comparison between the MAC and the ADA-3 debayering algorithms on the Kodak Lighthouse image. The image contrast has been enhanced to make the differences more visible.

Fig. 6. Comparison between the MAC and the ADA-3 debayering algorithms on two images from the proposed set. No image contrast enhancement has been applied.
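The display-rendering steps of this section can be sketched in NumPy. The function and variable names below are ours, and the sketch assumes already white-balanced, demosaicked camera RGB stored as 16-bit code values; the numerical constants are the ones given in equations (1)–(4).

```python
import numpy as np

def normalize(raw, ei):
    """Eq. (1): map 16-bit linear TIFF code values to normalized exposure.
    raw: array of code values; ei: exposure index of the scene (Table 1)."""
    return (18.0 * ei / 400.0) * (raw - 256.0) / (65535.0 - 256.0)

# Eq. (2): camera RGB -> wide-gamut RGB, daylight (D5600) scene.
M_WG = np.array([[ 1.1766, -0.1190, -0.0576],
                 [-0.0194,  1.0606, -0.0412],
                 [ 0.0368, -0.2019,  1.1652]])

# Eq. (3): wide-gamut RGB -> ITU Rec. 709 / sRGB primaries.
M_709 = np.array([[ 1.6175, -0.5373, -0.0802],
                  [-0.0706,  1.3346, -0.2640],
                  [-0.0211, -0.2270,  1.2481]])

def srgb_encode(c):
    """Eq. (4): piecewise sRGB encoding (IEC 61966-2-1), after clipping."""
    c = np.clip(c, 0.0, 1.0)
    return np.where(c <= 0.0031308,
                    12.92 * c,
                    1.055 * np.power(c, 1.0 / 2.4) - 0.055)

def render(raw_rgb, ei):
    """raw_rgb: (..., 3) demosaicked camera RGB in 16-bit code values."""
    lin = normalize(raw_rgb.astype(np.float64), ei)
    srgb_lin = lin @ (M_709 @ M_WG).T   # apply (2), then (3)
    return srgb_encode(srgb_lin)
```

Note that both matrices have rows summing to one, so a neutral camera triplet stays neutral through the chain; gamut clipping is deferred to the final encoding step.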
5. DEBAYERING COMPARISONS BETWEEN KODAK
AND PROPOSED SET.
In this section we show two sets of comparisons between the results
of demosaicking algorithms, showing the importance of evaluating
every algorithm in its proper position inside the image processing
chain. In the first set, the comparison is made with images from the
“Kodak image set”; in the second, the new set is used. The first
demosaicking algorithm, called MAC, is presented by Menon et al.
in [6]. It is a well known algorithm developed using the “Kodak” image set. The second demosaicking algorithm is called ADA-3 (ARRI
Debayering Algorithm version 3) and it has been developed and optimized using the images of the proposed set.
The first comparison is made with the “Lighthouse” image using the two image processing chains shown in Figure 4. In the TVC
“A” path, the “Full Detail” format is first downscaled into the “TV
Comparable” format, then downsampled according to the Bayer pattern and finally color reconstructed. In the TVC “B” path, the “Full
Detail” format is first downsampled according to the Bayer pattern, then color reconstructed, and finally downscaled into the “TV
Comparable” format. This second path is much more similar to real
camera color processing than the first, even though the color reconstruction is performed after and not before the color processing. Figure 5 presents cropped versions of the resulting images, with contrast
enhancement making differences more visible. Against the Kodak
set, the MAC algorithm appears to work slightly better than ADA3; in the TVC “A” path, the fence presents less color aliasing, and
the reconstruction of the edges of the life preserver is slightly better. When the results of the TVC “B” path are considered, however,
the visual quality of the products of the two algorithms is almost the
same. In any case, the visual quality of the TVC “B” path is much
higher than that of the TVC “A” path.
In Figure 6 we have applied the two algorithms to two sequences
of the proposed set: “Night at Odeonplatz” and “Sharpness Chart”.
Here the MAC algorithm introduces strong visual artifacts due to
excessive over- and undershoot in its reconstruction filter and,
in the “Sharpness Chart” image, it also presents zippering artifacts.
The ADA-3 algorithm presents none of these problems and visually
outperforms the MAC algorithm. This result can be seen without
any contrast enhancement.
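Neither algorithm is reproduced here (MAC is specified in [6]; ADA-3 is ARRI-internal). As an illustrative baseline for the class of algorithms compared above, a naive bilinear demosaicking of an assumed GRBG mosaic might look like the following sketch; it is the simplest reference, not either of the algorithms in the comparison.

```python
import numpy as np

def conv2_same(img, k):
    """'Same'-size 2-D convolution with a 3x3 kernel, zero padding."""
    p = np.pad(img, 1)
    out = np.zeros_like(img)
    h, w = img.shape
    for dy in range(3):
        for dx in range(3):
            out += k[dy, dx] * p[dy:dy + h, dx:dx + w]
    return out

def bilinear_demosaic(mosaic):
    """Naive bilinear demosaicking of a GRBG Bayer mosaic (assumed phase).
    Each missing sample becomes the average of its sampled neighbors;
    zero padding slightly darkens the one-pixel border."""
    h, w = mosaic.shape
    r_m = np.zeros((h, w))
    r_m[0::2, 1::2] = 1.0            # red sites
    b_m = np.zeros((h, w))
    b_m[1::2, 0::2] = 1.0            # blue sites
    g_m = np.zeros((h, w))
    g_m[0::2, 0::2] = 1.0            # green sites, red rows
    g_m[1::2, 1::2] = 1.0            # green sites, blue rows
    k_g  = np.array([[0., 1., 0.], [1., 4., 1.], [0., 1., 0.]]) / 4.0
    k_rb = np.array([[1., 2., 1.], [2., 4., 2.], [1., 2., 1.]]) / 4.0
    rgb = np.zeros((h, w, 3))
    for ch, mask, k in ((0, r_m, k_rb), (1, g_m, k_g), (2, b_m, k_rb)):
        rgb[..., ch] = conv2_same(mosaic * mask, k)
    return rgb
```

The over/undershoot and zippering discussed above arise precisely when the simple averaging here is replaced by sharper directional filters, which is why evaluation on real sensor data matters.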
6. CONCLUSIONS
In this paper, we presented a set of color image sequences as a reference for the development of new image processing algorithms. Its
main advantage is in the opportunity to work directly with sensor
data that has not been compressed or color processed, allowing different image processing algorithms to be developed and tested in
their proper position in the image processing chain. Furthermore,
these are image sequences, so temporal image processing techniques
such as optical flow or inter-frame compression can be studied and
compared. We demonstrated the benefit of the new set by comparing two demosaicking algorithms twice, once using the Kodak set,
and once using the proposed set, showing how the new set allows for
easier and more straightforward evaluation of algorithm results.
7. REFERENCES
[1] B.E. Bayer, “Color imaging array,” US Patent 3,971,065, Jul
1976, Eastman Kodak Company.
[2] R. Kimmel, “Demosaicing: image reconstruction from color CCD
samples,” Trans. Image Processing, vol. 8, no. 9, pp. 1221–
1228, Sep 1999.
[3] R. Lukac, K.N. Plataniotis, D. Hatzinakos, and M. Aleksic, “A
novel cost effective demosaicking approach,” Trans. Consumer
Electronics, vol. 50, no. 1, pp. 256–261, Jan 2004.
[4] D.D. Muresan and T.W. Parks, “Demosaicking using optimal
recovery,” Trans. Image Processing, vol. 14, no. 2, pp. 267–
278, Feb 2005.
[5] K. Hirakawa and T.W. Parks, “Adaptive homogeneity-directed
demosaicking algorithm,” Trans. Image Processing, vol. 14,
no. 3, pp. 360–369, Mar 2005.
[6] D. Menon, S. Andriani, and G. Calvagno, “Demosaicking with
directional filtering and a posteriori decision,” Trans. Image
Processing, vol. 16, no. 1, pp. 132–141, Jan 2007.
[7] E. Reinhard, W. Heidrich, P. Debevec, S. Pattanaik, G. Ward,
and K. Myszkowski, High Dynamic Range Imaging, Second Edition: Acquisition, Display, and Image-Based Lighting,
Morgan Kaufmann, Waltham, MA, USA, 2010.
[8] ITU, “Rec. ITU-R BT.709-4: Parameter values for the HDTV
standards for production and international programme exchange,” Mar 2000.
[9] ITU, “Rec. ITU-R BT.1886: Reference electro-optical transfer
function for flat panel displays used in HDTV studio production,” Mar 2011.
[10] J. Holm, “Capture color analysis gamuts,” in Proc. 14th Color
Imaging Conference. IS&T, 2006.
[11] IEC, “IEC/4WD 61966-2-1: Colour measurement and management in multimedia systems and equipment - Part 2.1: Default colour space - sRGB.,” May 2006.