BEYOND THE KODAK IMAGE SET: A NEW REFERENCE SET OF COLOR IMAGE SEQUENCES

Stefano Andriani, Harald Brendel, Tamara Seybold and Joseph Goldstone
ARRI - Arnold and Richter Cine Technik GmbH
Image Science Group, R&D Department
Türkenstraße 89, 80799 Munich, Germany

ABSTRACT

In 1991 Kodak released a set of 24 digital color images derived from a variety of film source materials. Since then, most image processing algorithms have been developed, optimized, tested and compared using this set. Until a few years ago it was considered “the” image set; today, however, it shows its limitations, and researchers have expressed their need for better, more up-to-date material. We present a new set of high quality color image sequences captured with our professional digital cinema camera. This camera stores uncompressed raw sensor data, and the set is freely available via FTP at ftp://[email protected]/ (password: imageset).

Index Terms— Kodak set, ARRI set, sensor data, color demosaicking

1. INTRODUCTION

In the last twenty-five years, use of digital cameras has expanded from laboratory researchers to consumers, and now to the most demanding professional photographers. Contemporary sensors can capture images with very high spatial resolution, high dynamic range and relatively low noise. This allows photographers to benefit from the flexibility and low cost of a digital workflow without suffering reduced image quality or compromising their artistic intent. Improvements in processor and storage performance and in system architecture have led to completely new image processing algorithms, and digital images today can be managed and reproduced with very high fidelity. As a result, digital cameras have almost completely displaced film cameras in professional photography. Over the last few years the same move to digital has been occurring in cinematography, albeit at a different pace. These advances have been obtained through tightly coupled development of hardware and software components.
In addition, after camera manufacturers develop or adapt specific image processing algorithms for a camera, third-party companies often collaborate with the camera manufacturer to offer additional capabilities. This latter type of development is typically done under a non-disclosure agreement, allowing the camera manufacturer to divulge the camera’s most intricate design decisions while safeguarding that manufacturer’s intellectual property.

Outside the world of industry, on the other hand, the academic community still typically develops, tests and compares image processing algorithms using the “Kodak image set”. This set was released as 8-bit full RGB images in three different formats: “Full detail” 3072×2048, “HDTV” 1536×1024 and “TV-comparable” 768×512, where the two lower-resolution formats were computed as subsampled versions of the “Full detail” format. The two larger formats were used only sporadically because the computational power of that era’s computers was so limited that even simple image processing algorithms required extensive computational resources. Thus, most of the algorithms present in the literature have been developed using the “TV-comparable” format. Although the computational power of today’s computers is enough to execute most of the older algorithms in real time, the community still uses the smallest format so as to compare any new results most directly against older ones.

That said, the scientific community has come to see the limits of the Kodak set in many fields of image processing research, and the need for a new set. Corporately as a camera manufacturer and individually as researchers, we would like to share some of our test sequences to satisfy this need and help advance the state of the art. Our digital camera, introduced in Section 2, can capture unprocessed data directly from the sensor and store it uncompressed in its original 2880×1620 (16:9) format.
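As an aside, the relation between the three Kodak formats can be reproduced with a few lines of code. The exact resampling filter originally used to derive the smaller formats is not documented here, so plain block averaging is assumed in this sketch:

```python
import numpy as np

def block_average(img, factor):
    """Integer-factor downsampling by block averaging.

    Block averaging is an assumption; the actual filter behind the
    smaller Kodak formats is not documented in this paper.
    """
    h, w, c = img.shape
    img = img[:h - h % factor, :w - w % factor]
    return img.reshape(img.shape[0] // factor, factor,
                       img.shape[1] // factor, factor, c).mean(axis=(1, 3))

# "Full detail" 3072x2048 -> "HDTV" 1536x1024 and "TV-comparable" 768x512
full_detail = np.zeros((2048, 3072, 3))   # rows x columns x RGB
hdtv = block_average(full_detail, 2)      # 1024 x 1536
tvc = block_average(full_detail, 4)       # 512 x 768
```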
This uncompressed output gives cinematographers the opportunity to extract the highest possible quality image data from the camera, deferring all color processing until post-production. Such uncompressed output is rare, as almost all professional video cameras compress both sensor data (if they make sensor data available at all) and reconstructed HDTV-format imagery prior to storage. Through this uncompressed imagery, the image processing community will have free access to high quality reference sequences which can be used to develop and test new algorithms in a wide range of image processing fields, including:

• demosaicking, by working with actual uncompressed color filter array imagery;
• denoising, by analyzing real noise and its impact at each step of the image processing chain;
• video compression, by starting from high resolution uncompressed video material; and
• optical flow, by analyzing the correlation among frames in either Bayer pattern data or reconstructed color imagery.

Furthermore, the possibility to test all these algorithms on image sequences (rather than the stills of the Kodak set) offers the opportunity to look at an algorithm’s results extended across the temporal domain, where introduced artifacts are usually more visible.

2. ARRI ALEXA CAMERA

The sequences in the set were captured by the ARRI ALEXA camera as it was tested on a variety of subject material and in a variety of capture environments. The ALEXA camera family has been developed for digital cinema acquisition and comprises four models: ALEXA, ALEXA Plus, ALEXA M and ALEXA Studio (see Figure 1), all of which are built around the “ALEV III” CMOS sensor, which has the following characteristics:

• Size: 35mm film frame;
• Sensitivity: 160 - 3200 ISO;
• Dynamic Range: > 14 stops;
• Resolution: 2880 x 1620;
• Dual Gain Architecture (DGA).

Fig. 1. ARRI ALEXA family.

The DGA technology simultaneously provides two separate readout paths, with differing amplification, from each pixel. The first path contains a highly amplified signal. The second path contains a signal with lower amplification, to capture any information clipped in the first, higher-gain path. Both paths feed into the A/D converters of the camera, each delivering a 14 bit image. These images are then combined into a single 16 bit higher dynamic range image. This arrangement enhances low light performance and significantly extends the dynamic range of the resulting image. The shadow areas are reconstructed from the high gain path and the highlights are reconstructed from the low gain path, producing an image containing meaningful luminance information in all 16 bits. This output is then logarithmically compressed into 12-bit form, decreasing the required bandwidth without reducing overall quality.

A filter pack situated in front of the sensor contains an optical low pass filter, an infrared (IR) cut-off filter and an ultraviolet (UV) cut-off filter. The low pass filter blocks high image frequencies that would lead to artifacts when captured by the sensor. Finally, between the filter pack and the sensor there is a Bayer pattern color filter array [1] allowing the capture of color images with a single sensor through demosaicking.

The ALEXA camera provides its imagery in a variety of formats, from ARRIRAW (12-bit logarithmic uncompressed unencrypted data from the sensor) as the highest quality output on the one hand, to compressed or uncompressed HD on the other. More information is available at the ARRI website: www.arri.com.

3. CONTENTS OF THE NEW SET

The proposed set comprises twelve sequences captured in different locales under different lighting conditions. The first eleven sequences were captured by normal ALEXA cameras equipped with a standard Bayer pattern color filter array, and require color reconstruction with a demosaicking algorithm such as those proposed in the literature [2–6]. The twelfth and final sequence was captured by a special ALEXA research camera in which the Bayer color filter array has been removed, providing a “gray-scale” digital camera.

Along with this camera we also constructed a color wheel, positioned in front of the lens, that can be equipped with up to four different color filters. Its rotation is driven by a stepper motor synchronized with the camera’s recording process (see Figure 2). The color wheel is fixed when recording is active; during the time the color wheel rotates to change the color filter in front of the camera, recording is deactivated. In this way, we can obtain full color images from a single sensor camera without an integrated color filter array. We have developed this system because of the utility of having full individual red, green and blue images as a reference, or as a starting point of a research investigation, e.g. in examining new patterns for a color filter array. The system is useful, and was cheap and easy to realize. Its main drawback is the impossibility of shooting real sequences of arbitrary image content with it; the scene captured during the rotation of the wheel must be motionless, lest the images associated with each color filter contain relative spatial variation that could introduce unwanted color artifacts. This problem can be seen in the “Color Wheel” sequence if one focuses on the peacock feathers in the center of the scene.

Fig. 2. ARRI ALEXA provided with the color wheel.

The main characteristics of the proposed set are reported in Table 1 and a preview of its content is shown in Figure 3.

4. COLOR PROCESSING THE IMAGES

We provide the images as linear 16 bit TIFF files reconstructed from the 12-bit logarithmic format. Linear, in this context, means that the digital numbers are proportional to the amount of light collected by each pixel.
The sensor has a read-out noise of approximately 2.1 digital code values and, as with most commercial cameras, its spectral responsivities are not colorimetric: they do not meet the Luther-Ives condition. In this section, we present color correction matrices that are derived by minimizing the color error over a set of selected test colors. Those matrices include a chromatic adaptation transform from the scene illumination to the white point of the target color space (D65).

To render the images for display on a monitor, one could apply several of the tone-mapping techniques described in the literature, such as those of Reinhard et al. in [7]. We provide here a global, spatially-invariant transform that is very similar to the way the camera generates a high definition video image. This processing, appropriately enough for its typical application, is inspired by traditional motion-picture film processing.

First, the linear data is white balanced and color reconstructed. Then, the color correction matrix transforms the camera RGB values into values for a wide-gamut color space. The encoding primaries for this color space are chosen to avoid clipping in all but the most extreme cases. These wide-gamut RGB values are then non-linearly transformed by a function named Log C, which mimics the results obtained by scanning color negative film. The Log C function is basically a logarithmic transform of the data, with a small offset added to create a “toe” at the lower end of the curve. The next step is a tone-mapping curve, similar to what would be found in motion-picture print film. Compared to a motion-picture print film tone-mapping curve, ours induces less contrast and maintains more shadow detail, making the resulting images easier to use in an electronic viewfinder, or to view on an on-set monitor. This tone-mapped data is then matrixed into the color space defined in ITU Recommendation 709 [8], which has the same primaries as the sRGB color space. The final step in color processing is compensation for the non-linear electro-optical transfer function of the monitor, assumed to be a power function with an exponent of 2.4 [9].

The chromaticity coordinates of the primaries and of the white point of the wide-gamut color space are given in Table 2. Note that blue is a virtual primary. This results from a desire to encode all colors with positive numbers. The colorimetric error of the camera will map certain colors outside of the spectral locus. This behavior is not unique to the ALEXA, and can often be seen in digital still cameras as well [10].

Table 2. Chromaticity coordinates of the primaries and the white point of the wide-gamut color space.

        Red      Green    Blue      White
  x     0.6840   0.2210   0.0861    0.3127
  y     0.3130   0.8480   -0.1020   0.3290

Several other approaches could be used to display the images. A general procedure is outlined below. First, normalize the data for exposure:

  R = 18 · (EI / 400) · (Rraw − 256) / (65535 − 256)    (1)

where EI is the exposure index of the scene (see Table 1) and Rraw is a 16 bit pixel value from the TIFF file. G and B are likewise normalized. This converts the raw pixel values into normalized (but still unreconstructed) values scaled such that a captured gray card would be represented by RGB values of (0.18, 0.18, 0.18).

Fig. 3. Proposed sequence set.

Table 1. Recording information for the proposed set.

  #    Name                  Exp. Index   Color Temp.   Number of frames
  01   Akademie              400          5600 K        240
  02   Arri                  400          5600 K        240
  03   Church                800          5000 K        48
  04   Color Test Chart      800          3200 K        10
  05   Face                  400          5600 K        48
  06   Lake Locked           800          5600 K        48
  07   Lake Pan              800          5600 K        48
  08   Night at Odeonplatz   800          5600 K        240
  09   Swimming Pool         800          5600 K        48
  10   Sharpness Chart       400          3200 K        10
  11   Night at Siegestor    1600         5600 K        240
  12   Color Wheel           400          3200 K        24

Second, apply a color correction matrix. Conversion matrices for arbitrary target color spaces can be derived from the chromaticity coordinates of the wide-gamut color space mentioned above combined with the color correction matrix for the scene. For example, the color correction matrix for a scene photographed in daylight (D5600) is:

  [Rwg]   [ 1.1766  -0.1190  -0.0576] [R]
  [Gwg] = [-0.0194   1.0606  -0.0412] [G]    (2)
  [Bwg]   [ 0.0368  -0.2019   1.1652] [B]

The conversion from the wide-gamut color space to ITU Rec. 709 or sRGB is done by the following matrix:

  [Rs]   [ 1.6175  -0.5373  -0.0802] [Rwg]
  [Gs] = [-0.0706   1.3346  -0.2640] [Gwg]    (3)
  [Bs]   [-0.0211  -0.2270   1.2481] [Bwg]

where [Rs Gs Bs] is the red, green, blue triplet in the sRGB color space. Finally, apply the correction for the non-linear EOCF of the display, e.g. the transform described in Amendment 1 of IEC 61966-2-1 [11]:

  C′ = 12.92 · C                      if C ≤ 0.0031308
  C′ = 1.055 · C^(1/2.4) − 0.055      if C > 0.0031308    (4)

where C is R, G, or B, respectively.

Fig. 4. Image processing chains for the tests on the “Lighthouse” image (TVC “A” and TVC “B” paths) and the ARRI sequences (white balance, MAC/ADA-3 demosaicking and color post-processing at 3K).

Fig. 5. Comparison between the MAC and the ADA-3 debayering algorithms on the Kodak Lighthouse image. The image contrast has been enhanced to make the differences more visible.

Fig. 6. Comparison between the MAC and the ADA-3 debayering algorithms on two images from the proposed set. No image contrast enhancement has been applied.

5. DEBAYERING COMPARISONS BETWEEN KODAK AND PROPOSED SET

In this section we show two sets of comparisons between the results of demosaicking algorithms, showing the importance of evaluating every algorithm in its proper position inside the image processing chain.
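Before turning to the comparisons, the display procedure of equations (1)–(4) can be collected into a single sketch. This is a straightforward transcription of those equations, assuming D5600 daylight material; the clipping step is a practical addition of this sketch, not a step stated in the text.

```python
import numpy as np

# Eq. (2): camera RGB -> wide-gamut RGB, for D5600 daylight material.
M_WG = np.array([[ 1.1766, -0.1190, -0.0576],
                 [-0.0194,  1.0606, -0.0412],
                 [ 0.0368, -0.2019,  1.1652]])

# Eq. (3): wide-gamut RGB -> ITU Rec. 709 / sRGB primaries.
M_709 = np.array([[ 1.6175, -0.5373, -0.0802],
                  [-0.0706,  1.3346, -0.2640],
                  [-0.0211, -0.2270,  1.2481]])

def display_render(raw, ei):
    """raw: H x W x 3 uint16 array from one of the linear TIFF files.
    ei: exposure index of the scene (see Table 1)."""
    # Eq. (1): normalize for exposure so a gray card lands at 0.18.
    lin = 18.0 * (ei / 400.0) * (raw.astype(np.float64) - 256.0) / (65535.0 - 256.0)
    # Eqs. (2)-(3): color correction, then conversion to sRGB primaries.
    srgb = lin @ M_WG.T @ M_709.T
    # Clip to [0, 1] before encoding (practical addition, see lead-in).
    srgb = np.clip(srgb, 0.0, 1.0)
    # Eq. (4): piecewise sRGB encoding (IEC 61966-2-1, Amendment 1).
    return np.where(srgb <= 0.0031308,
                    12.92 * srgb,
                    1.055 * srgb ** (1.0 / 2.4) - 0.055)

# A uniform patch at the gray-card level (raw code ~909 at EI 400)
# should come out near mid-gray on the display.
patch = np.full((2, 2, 3), 909, dtype=np.uint16)
rendered = display_render(patch, 400)
```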
In the first set, the comparison is made with images from the “Kodak image set”; in the second, the new set is used. The first demosaicking algorithm, called MAC, is presented by Menon et al. in [6]. It is a well known algorithm developed using the “Kodak” image set. The second demosaicking algorithm is called ADA-3 (ARRI Debayering Algorithm version 3); it has been developed and optimized using the images of the proposed set.

The first comparison is made with the “Lighthouse” image using the two image processing chains shown in Figure 4. In the TVC “A” path, the “Full detail” format is first downscaled into the “TV-comparable” format, then downsampled according to the Bayer pattern, and finally color reconstructed. In the TVC “B” path, the “Full detail” format is first downsampled according to the Bayer pattern, then color reconstructed, and finally downscaled into the “TV-comparable” format. This second path is much more similar to real camera color processing than the first, even though the color reconstruction is performed after and not before the color processing. Figure 5 presents cropped versions of the resulting images, with contrast enhancement making differences more visible. Against the Kodak set, the MAC algorithm appears to work slightly better than ADA-3; in the TVC “A” path, the fence presents less color aliasing, and the reconstruction of the edges of the life preserver is slightly better. When the results of the TVC “B” path are considered, however, the visual quality of the products of the two algorithms is almost the same. In any case, the visual quality of the TVC “B” path is much higher than that of the TVC “A” path.

In Figure 6 we have applied the two algorithms to two sequences of the proposed set: “Night at Odeonplatz” and “Sharpness Chart”.
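The two evaluation chains of Figure 4 can be sketched in a few lines. Simple bilinear interpolation stands in here for MAC and ADA-3, whose implementations are not reproduced; the RGGB layout and 2×2 block-average downscaling are assumptions of this sketch.

```python
import numpy as np

def conv3(x, k):
    """'Same'-size 3x3 convolution with zero padding."""
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def bayer_sample(rgb):
    """Keep one sample per pixel according to an (assumed) RGGB Bayer layout."""
    h, w, _ = rgb.shape
    cfa = np.zeros((h, w))
    cfa[0::2, 0::2] = rgb[0::2, 0::2, 0]   # red sites
    cfa[0::2, 1::2] = rgb[0::2, 1::2, 1]   # green sites
    cfa[1::2, 0::2] = rgb[1::2, 0::2, 1]   # green sites
    cfa[1::2, 1::2] = rgb[1::2, 1::2, 2]   # blue sites
    return cfa

def bilinear_demosaic(cfa):
    """Toy normalized-convolution bilinear demosaic (stand-in for MAC/ADA-3)."""
    h, w = cfa.shape
    mask = np.zeros((h, w, 3))
    mask[0::2, 0::2, 0] = 1
    mask[0::2, 1::2, 1] = 1
    mask[1::2, 0::2, 1] = 1
    mask[1::2, 1::2, 2] = 1
    k = np.array([[0.25, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 0.25]])
    out = np.zeros((h, w, 3))
    for c in range(3):
        out[..., c] = conv3(cfa * mask[..., c], k) / conv3(mask[..., c], k)
    return out

def downscale2(rgb):
    """Halve resolution by 2x2 block averaging."""
    h, w, c = rgb.shape
    return rgb[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

rng = np.random.default_rng(0)
full = rng.random((64, 96, 3))
# TVC "A": downscale, Bayer-sample, then demosaic.
chain_a = bilinear_demosaic(bayer_sample(downscale2(full)))
# TVC "B": Bayer-sample and demosaic at full resolution, then downscale.
chain_b = downscale2(bilinear_demosaic(bayer_sample(full)))
```

Even with this toy demosaicking step, the two chains produce measurably different images from the same source, which is exactly why the position of the algorithm in the chain matters.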
Here the MAC algorithm introduces strong visual artifacts due to the excessive over- and under-shooting of its reconstruction filter and, in the “Sharpness Chart” image, it also presents zippering artifacts. The ADA-3 algorithm presents none of these problems and visually outperforms the MAC algorithm. This result can be seen without any contrast enhancement.

6. CONCLUSIONS

In this paper, we presented a set of color image sequences as a reference for the development of new image processing algorithms. Its main advantage is the opportunity to work directly with sensor data that has not been compressed or color processed, allowing different image processing algorithms to be developed and tested in their proper position in the image processing chain. Furthermore, these are image sequences, so temporal image processing techniques such as optical flow or inter-frame compression can be studied and compared. We demonstrated the benefit of the new set by comparing two demosaicking algorithms twice, once using the Kodak set and once using the proposed set, showing how the new set allows for easier and more straightforward evaluation of algorithm results.

7. REFERENCES

[1] B.E. Bayer, “Color imaging array,” US Patent 3,971,065, Jul 1976, Eastman Kodak Company.
[2] R. Kimmel, “Demosaicing: image reconstruction from color CCD samples,” Trans. Image Processing, vol. 8, no. 9, pp. 1221–1228, Sep 1999.
[3] R. Lukac, K.N. Plataniotis, D. Hatzinakos, and M. Aleksic, “A novel cost effective demosaicking approach,” Trans. Consumer Electronics, vol. 50, no. 1, pp. 256–261, Jan 2004.
[4] D.D. Muresan and T.W. Parks, “Demosaicking using optimal recovery,” Trans. Image Processing, vol. 14, no. 2, pp. 267–278, Feb 2005.
[5] K. Hirakawa and T.W. Parks, “Adaptive homogeneity-directed demosaicking algorithm,” Trans. Image Processing, vol. 14, no. 3, pp. 360–369, Mar 2005.
[6] D. Menon, S. Andriani, and G. Calvagno, “Demosaicking with directional filtering and a posteriori decision,” Trans. Image Processing, vol. 16, no. 1, pp. 132–141, Jan 2007.
[7] E. Reinhard, W. Heidrich, P. Debevec, S. Pattanaik, G. Ward, and K. Myszkowski, High Dynamic Range Imaging, Second Edition: Acquisition, Display, and Image-Based Lighting, Morgan Kaufmann, Waltham, MA, USA, 2010.
[8] ITU, “Rec. ITU-R BT.709-4: Parameter values for the HDTV standards for production and international programme exchange,” Mar 2000.
[9] ITU, “Rec. ITU-R BT.1886: Reference electro-optical transfer function for flat panel displays used in HDTV studio production,” Mar 2011.
[10] J. Holm, “Capture color analysis gamuts,” in Proc. 14th Color Imaging Conference, IS&T, 2006.
[11] IEC, “IEC/4WD 61966-2-1: Colour measurement and management in multimedia systems and equipment - Part 2.1: Default colour space - sRGB,” May 2006.