UNIVERSITY OF CALIFORNIA
Santa Barbara

"The Role of Illumination Perception in Color Constancy"

This Dissertation submitted in partial satisfaction of the requirements for the degree of Doctorate of Philosophy in Psychology
by
Melissa Drake Rutherford

Committee in charge:
Professor David H. Brainard, Chairperson
Professor John M. Foley
Professor Jack M. Loomis
Professor Russell Revlin

August 2000

The dissertation of Melissa Drake Rutherford is approved
Professor John M. Foley
Professor Jack M. Loomis
Professor Russell Revlin
Professor David H. Brainard, Committee Chairperson
July 2000

August, 2000
Copyright by Melissa Drake Rutherford 2000

VITA
December 20, 1968: Born, Portland, Oregon
1992: B.A., Yale College
1997-1998: Fulbright Scholar, University of Cambridge, England
1998-1999: Assistant Professor, Reed College, Portland, Oregon
1999-2000: Research Assistant, University of California at Santa Barbara

PUBLICATIONS
Rutherford, M.D. & Brainard, D.H. (2000). The Role of Illumination Perception in Color Constancy. Investigative Ophthalmology & Visual Science, 41, S525.
Baron-Cohen, S., Wheelwright, S., Stone, V., & Rutherford, M. (1999). A mathematician, a physicist, and a computer scientist with Asperger Syndrome: performance on folk psychology and folk physics tests. Neurocase, vol 5, pp. 475-483.
Brainard, D.H., Rutherford, M.D. & Kraft, J.M. (1997). Color constancy compared: Experiments with real images and color monitors. Investigative Ophthalmology & Visual Science, 38, S476.
Rutherford, M.D., Tooby, J., & Cosmides, L. (1997). The effects of power on social reasoning. In M.G. Shafto & P. Langley (Eds.), Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society, p. 1029.
FIELDS OF STUDY
Major Field: Psychology
Studies in Perception and Color Constancy. Professor David H. Brainard
Studies in Theory of Mind and Social Cognitive Development. Dr. Simon Baron-Cohen
Studies in Evolutionary Psychology and Social Reasoning. Professor Leda Cosmides

ABSTRACT
"The Role of Illumination Perception in Color Constancy"
by
Melissa Drake Rutherford

According to the albedo hypothesis, the visual system estimates the illuminant of a scene and uses this estimate and the luminance reflected from surfaces to determine the color appearance of those surfaces. This hypothesis is common to many models of color constancy. In the eight experiments reported here, observers viewed a standard scene and an experimental scene in alternation. The illumination for each scene was under independent computer control. Each scene contained a test region that consisted of a computer-controlled display, masked so that it appeared to be an illuminated surface. On each trial of the experiments, observers made two adjustments: they adjusted the illumination in the experimental scene so that it appeared the same as the illumination in the standard scene, and adjusted the test region in the experimental scene so that it appeared to have the same lightness as the test region in the standard scene. The albedo hypothesis predicts that when both the illuminant and the test regions in the two scenes appear the same, the physical luminance of the test patch will be the same. The experiments did not bear out this prediction. When the surfaces in the experimental scene were chosen to be systematically less reflective than those in the standard scene, the illumination matches that observers set were not veridical. This bias in illumination matches was accompanied by a bias in the reflectance matches, as predicted by the albedo hypothesis. The two biases, however, were not completely complementary, and the luminance measurements falsify the albedo hypothesis.
In addition, manipulating the immediate surround of the test patch affected the matched surface lightness without affecting the matched illuminant. Together, these eight experiments rule out the possibility that the perceived illuminant, as measured by matching, is the only variable that governs the relation between physical luminance and perceived surface reflectance.

TABLE OF CONTENTS
Chapter 1
  The problem of color constancy; theoretical background
  Lightness constancy
  Methodologies in lightness constancy
  Illumination perception
  The relationship between surface color and illumination perception
  Tests and demonstrations of the proposed relationship
  Edge Classification
  Quantitative Tests
    Oyama, 1968
    Logvenenko & Menshikova, 1994
    Kozaki & Noguchi, 1976
    Noguchi & Kozaki, 1985
Chapter 2
  The logic of the current experiments
  General Methods
  Experiment 1A: Symmetric Matches
  Experiment 1B: Symmetric Matches over a Range of Reflectances
  Experiment 2A: Asymmetric Matching
  Experiment 2B: Asymmetric Matching with Change in Surround
Chapter 3
  General Discussion
  Conclusion
References
Appendix 1
Appendix 2
Appendix 3
Appendix 4

Chapter 1

The purpose of this project was to investigate the perception of illumination and to consider the role of illumination perception in surface color constancy. There are two reasons why illumination perception may be interesting. First, the perception of illumination might itself be functional: one may need to estimate the relative warmth or visibility of two possible paths, predict the weather, or estimate the time of day (Zaidi, 1998; Jameson & Hurvich, 1989). The second possible reason for illumination perception, the primary focus of this project, is that it may be necessary to estimate the illuminant in order to see surfaces as having constant colors.
Indeed, many computational models of surface color perception assume that the perceived illuminant plays a central role.

Color constancy: theoretical background

Color may be an important clue in object recognition. Human vision allows the observer to create a stable perceptual mapping between a particular surface and a given color, even as the proximal stimulus [1] changes from context to context. This is remarkable, given that color perception involves the parsing of an inherently ambiguous proximal stimulus. The light reaching the eye from an object is a product of the object’s surface reflectance function (the reflectance at each wavelength) and the intensity and spectral distribution (light energy at each wavelength) of the light source. Figure 1 illustrates the inherent ambiguity of color perception. Notice that both the illuminant and the surface reflectance contribute at each wavelength to the light reaching the eye. Because more than one factor influences the proximal stimulus, it is possible for the same object to give rise to a very different proximal stimulus as the lighting changes, for example at mid-day compared to at sunset. This effect can be so extreme that the proximal stimulus resulting from a blue color chip in a tungsten light can be the same as that from a yellow color chip in sunlight (Jameson, 1985). Figure 2 shows an example of the physical difference between two scenes that would appear very similar to an observer who was immersed in either scene.

[1] The proximal stimulus is the stimulus on the retina: the exact pattern of excitation of the retina. It must somehow be transformed into a perception of something in the world. The proximal stimulus is often contrasted with the distal stimulus: the physical objects and illuminants in the world that give rise to the proximal stimulus.
Figure 1: The light reaching the eye from a surface is the product of the surface reflectance and the spectral power of the illumination at each wavelength. In this figure, each row depicts a surface reflectance function (left), an illuminant spectral power distribution (center), and their product (right), each plotted against wavelength (panel labels: a1 x i1 = e1 and a2 x i2 = e2). Notice that one cannot determine either the surface reflectance or the illuminant given the product.

[This color plate not included in the electronic version. See Evans’ book for the original.]
Figure 2: These two pictures show the same scene under different illuminations. Although the two pictures look very different when shown together, they would look almost the same to a viewer of the upper picture adapted to daylight, and a viewer of the lower picture adapted to an ordinary tungsten light. Photos from Evans, 1948, plate X.

Even with these different proximal stimuli, however, a human observer sees the object as being roughly the same color, that is, humans show a great deal of color constancy (e.g. Burzlaff, 1931; Arend and Reeves, 1986; Brainard & Wandell, 1992; Brainard, Brunt & Speigle, 1997; Brainard, 1998). This means that across a range of conditions, the visual system produces a color representation that is better predicted by the distal stimulus (the surface reflectance) than the proximal stimulus (the light reaching the eye). Thus, the observer is able to maintain a mapping between an object and a perceived color, which is exactly what is needed given the assumption that color perception is important in object recognition. How can an observer perceive the surface of an object as a given color through changes in the proximal stimulus which result from changes in illumination? How can an observer maintain a constant representation of an object’s color across a change in background color? These are the central problems in color constancy.
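The ambiguity illustrated in Figure 1 can be made concrete with a small numerical sketch. The three sample wavelengths, the reflectance and illuminant values, and the function name `light_to_eye` are all invented for illustration and are not measurements from the experiments reported here. The sketch multiplies reflectance by illuminant power at each wavelength and shows that two different surface and illuminant pairs can send the same light to the eye.

```python
import math

def light_to_eye(reflectance, illuminant):
    """Wavelength-by-wavelength product of surface reflectance and illuminant power."""
    return [a * i for a, i in zip(reflectance, illuminant)]

# Hypothetical values at three sample wavelengths (short, middle, long).
surface_1 = [0.2, 0.4, 0.8]
illuminant_1 = [100.0, 100.0, 50.0]

# A different surface viewed under a different illuminant.
surface_2 = [0.4, 0.5, 0.5]
illuminant_2 = [50.0, 80.0, 80.0]

e1 = light_to_eye(surface_1, illuminant_1)
e2 = light_to_eye(surface_2, illuminant_2)

# True when the two scenes deliver the same light at every sample wavelength.
same_light = all(math.isclose(x, y) for x, y in zip(e1, e2))
print(same_light)
```

Because only the product reaches the eye, nothing in `e1` alone indicates which of the two pairs produced it; additional information about the illuminant or the surface is needed to separate them.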
Human color constancy is not perfect, and there are both biological and physical sources of its limitation. There are biological limitations in cell response ranges: One could not represent colors that are too bright or too dark for the nervous system’s range. (More generally, there are neural limits to the perception of chromaticity as well as limits in resolution caused by the limited density of the neurons.) Furthermore, the human eye (and any other natural eye) has a limited number of types of photoreceptor, each maximally sensitive at a particular wavelength. These biological limits might prevent the visual system from maintaining a stable mapping between reflectance and color appearance, and thus interfere with color constancy. Physical limitations in color constancy (limitations inherent to the problem) include the inherent ambiguity in the distal stimulus. Since there are an infinite number of possible reflectance functions of a surface that could give rise to the same proximal stimulus, it is not possible to resolve the exact reflectance function with any finite number of photoreceptors. For example, if one wanted to represent a particular color using paint or a CRT screen, that percept could be created in a number of different ways, mutually indistinguishable to the human visual system. Another physical limit to color constancy is the spectral power distribution of the illuminant itself. In normal situations, the illuminant provides a wide enough spectrum to test the reflectance of a surface at each wavelength. However, it is possible to create an artificial illumination that is too narrow to reveal the full reflectance spectrum of the surface. For example, the artificial light of the parking garage may make it difficult for you to recognize your blue car because it appears to be yellow; there may be no part of the illuminant that tests the right part of the visible spectrum to reveal a blue color. 
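The point that a finite number of photoreceptor types cannot resolve the full reflectance function can be sketched in a toy setting. The three sensor sensitivity curves, the four sample wavelengths, and the reflectance values below are hypothetical and chosen only so the arithmetic is easy to follow; they do not model human cones. The sketch builds two different reflectance functions that produce identical responses in every sensor class.

```python
import math

def sensor_responses(sensors, reflectance, illuminant):
    """Each sensor sums sensitivity * reflectance * illuminant over wavelengths."""
    return [
        sum(s * a * i for s, a, i in zip(sensitivity, reflectance, illuminant))
        for sensitivity in sensors
    ]

# Hypothetical sensitivities of three sensor classes at four sample wavelengths.
sensors = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
flat_illuminant = [1.0, 1.0, 1.0, 1.0]

gray = [0.5, 0.5, 0.5, 0.5]
# A perturbation that every sensor above sums to zero, so no sensor can register it.
invisible = [0.2, -0.2, 0.2, -0.2]
other = [a + d for a, d in zip(gray, invisible)]

r_gray = sensor_responses(sensors, gray, flat_illuminant)
r_other = sensor_responses(sensors, other, flat_illuminant)

# True when the two physically different surfaces are indistinguishable to the sensors.
indistinguishable = all(math.isclose(x, y) for x, y in zip(r_gray, r_other))
print(indistinguishable)
```

With more sample wavelengths than sensor classes, such invisible perturbations always exist, which is one way to see why the exact reflectance function cannot be recovered from the sensor responses alone.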
Lightness constancy

Color constancy applies to both chromatic and achromatic color perception. Lightness constancy is a special case of color constancy. It is the stable representation of lightness across illumination and background changes, without reference to spectral distribution or hue. The question is: how do observers judge “white” or “black” or intermediate shades of gray, as the intensity of the illuminant varies? Lightness constancy is illustrated by Figure 3. In this figure, the visual system perceives the checkbox in the shadow as being similar in lightness to the checkbox in the top corner, even though they are in fact physically different, as illustrated by the color chips. The chip on the right is physically the same as the checkbox in the shadow, and the chip on the left is the same as the checkbox in the top corner. The visual system processes the light reaching the eye from inside and outside the shadow differently with the effect of stabilizing perceived surface lightness against changes in illumination.

Although lightness constancy is a special case of color constancy, it shares the same central feature: an infinite number of combinations of surface and illuminant can produce identical proximal stimuli. There are also some differences between the two cases: in the case of achromatic color constancy one is interested in how the stimulus varies in one dimension, whereas in tri-chromatic color constancy, one is interested in the independent intensities for at least three different wavelengths. The experiments described in this project all deal with lightness constancy.

Methodologies in lightness constancy

According to Koffka (1935), Katz published the first work in the field of color constancy in 1911. Hering introduced the name “memory color” to describe the phenomenon in 1920. One of the earliest experimental demonstrations of lightness constancy was that of Burzlaff (1931). In this experiment, there were two displays of 48 shades of gray.
One display was placed near the window and the other was placed deep in the interior of the room where it only got 5% as much illumination. The observer, who was next to the window, sequentially matched a square of a particular lightness on the near display (called the test patch) to a square on the far display that appeared to be the best lightness match (see Figure 4). The results showed striking (though imperfect) lightness constancy.

Katz (1935) measured color constancy using a matching paradigm in which observers had to set the black to white ratio of a spinning color wheel to match a gray paper, when the color wheel was well illuminated and the standard patch was in shadow (see Figure 5). He found that it was impossible for an observer to exactly perceptually equate the color of the light gray paper with the spinning wheel (represented by a circle on the back wall of the apparatus in Figure 5). The observer was never satisfied that the two looked alike, even once the match had been made.

Figure 3: This figure illustrates lightness constancy. The checkbox in the shadow looks similar in lightness to the checkbox in the top corner, even though the lighter chip on the left is physically the same as the checkbox in the shadow, and the chip on the right is the same as the checkbox in the top corner. The visual system processes the light reaching the eye from inside and outside the shadow differently to create lightness constancy. Image courtesy of Ted Adelson, http://wwwbcs.mit.edu/people/adelson/adelson.html

Figure 4: An example of the matching method. Burzlaff (1931) had two displays, one placed near the window and the other placed deep in the room where it was darker. The observer sat next to the window and selected a color chip to match the test patch. Adapted from Gilchrist et al., 1999.

These two early experiments both test lightness constancy across two different contexts.
In this general method, called asymmetric matching, observers must match one aspect of the visual scene in one context (say the surface color or the illumination of the scene) with the same aspect in a different scene. These two scenes can be side by side boxes, as in the Katz example, or they can be near and far displays, as in the Burzlaff example. As Katz found, this matching method does not always yield perceptually satisfying results.

Figure 5: Katz (1935) had observers match perceived color in a side by side display where one side was illuminated and the other one shaded. Adapted from Gilchrist et al., 1999.

As an alternative to the matching paradigm, it is possible to have observers rate, scale, or name colors in one or more contexts (see the descriptions of Kozaki & Noguchi (1976) and Logvenenko & Menshikova (1994) below). These methods are slightly more difficult to interpret because they introduce additional cognitive and linguistic elements, and rely on the observer to accurately describe the percept. Scaling data is also difficult to interpret because one does not know whether the difference between two adjacent points on the scale, for example, “dark gray” and “very dark gray,” is equal to the difference between another pair of adjacent points, for example “gray” and “light gray.” (See Speigle, 1998 for a discussion and comparison of different methods for assessing appearance.)

Quantitatively, there have been several proposed measures of achromatic color constancy, starting with Katz (1911, 1935). Today, a measure of color constancy proposed by Brunswik in 1933 is widely employed (e.g. Brainard, Brunt & Speigle, 1997; Brainard, 1998; Arend & Reeves, 1986).

For the purposes of the current experimental series, it is important to note that the vast majority of color constancy studies to date have focused on the manipulation and perception of the surface color.
The perception of surface color has been of interest largely because of the functional role it plays in object recognition. This interest in surface colors as opposed to illuminant color is illustrated by a quote from Helmholtz, who suggested “In visual observation we constantly aim to reach a judgment on the object colors and to eliminate differences of illumination.” (1962/1866 p. 408). Relatively few studies have focused on (or even addressed) the perception of the illuminant. Those few studies that have are described below (see also Beck, 1959).

Illumination perception

There are a number of reasons to believe that humans can perceive illumination. Early support for this idea stems mainly from theoretical arguments or informal observations (but see the experimental work reviewed in the next section). First, a number of authors have suggested, based on informal observations, that one’s sense of an environment or scene includes some sense of the illuminant (e.g. Katz, 1935; Woodworth, 1938; Adelson & Pentland, 1991). Second, the ability to judge illuminant properties could itself be perceptually important (e.g. Zaidi, 1998, see below; Jameson & Hurvich, 1989). Third, estimation of the illumination may be an important step in achieving surface color constancy (e.g. Helmholtz, 1962/1866; Koffka, 1935; Beck, 1972; Epstein, 1973). This third possibility is the central idea throughout this project. Finally, the perception of illumination has been of great interest to artists (e.g. Caravaggio, Rembrandt, Pissarro, Monet), suggesting that its representation is an important part of visually comprehending a scene.

It is worth noting that there may be a difference between perceiving the overall illumination in a visual scene and perceiving the illuminant at a single scene location.
Certainly, one could have a sense of an overall illumination in a room, for example, but still be able to perceive that some recesses in the scene are shadowed such that the illuminant at various surfaces differs. Of the experiments described here, the earlier four ask observers to assess the overall illumination, while the later four explicitly instruct observers to focus on the amount of light falling on a given point. For the diffusely illuminated scenes used here, the two judgements do not seem to differ.

The first author to propose that we have some ability to perceive illumination may have been Katz (1935). Katz asserted on subjective grounds that empty spaces appear to the observer to be illuminated. He further claimed that the impression of illumination is stronger even than the impression of surface colors. Katz also observed that the illumination of any given empty space does not need to be uniform but can contain areas of different distinct perceptible illumination. In other words, a scene can contain multiple frameworks. (Here “framework” is used in the Gilchrist et al. (1999) sense, meaning an area that is “grouped” together and seen as having the same illuminant.) Such differences in illumination in a visual scene can be side by side or can be one behind the other, as when the observer looks down a dark hallway into a well-lit room.

Woodworth (1938) also thought that the visual system was sensitive to information regarding the illuminant, but suggested that we use the term “registering” rather than “perceiving” the illumination, since, he suggested, there was not necessarily an explicit representation of the illumination. He felt that only an explicit representation should be called a perception. [2] There can be an explicit perception according to Woodworth, as when a light is turned off or the sun goes behind a cloud, but such an explicit representation is not necessary in order for illumination to play a role in color constancy.
As to the question of whether illumination can be seen as different in different parts of the visual field, Woodworth assures us that “nothing is more certain,” offering as an example the obvious flecks of direct sunlight under a shady tree (1938, p.432).

Adelson & Pentland (1991) oppose restricting the investigation of lightness perception to 2D images, arguing that perception of lightness is dependent on the perceived 3D structure of the scene. Their suggestion (consistent with Gilchrist’s early models described below) emphasizes the importance of seeing each change in luminance as either a change in shape, a change in lighting or a change in shading. [3] Adelson & Pentland propose a computer model with three “specialists”: the set builder (who determines shape), the painter (who determines color), and the lighting expert (who determines illumination). Their model involves making a Bayesian estimation of the 3D structure, the color and the illumination, given the retinal image, where the Bayesian “cost” is represented as the inverse of probability. In short, their model agrees with others that illumination perception is important and necessarily related to color perception.

Zaidi (1998) proposed that observers are able to encode both the surface colors and the illuminant. He suggests that the problem for the visual system to solve is not to bring about stable color appearance under different illuminants by discounting the illuminant, but to recognize that objects are indeed being viewed under different illuminants and to discover what the illuminant properties are. He opposes recent models that propose that the illuminant is “discounted” via adaptation or other processes early in perception, suggesting instead that failures in color constancy are by design, and are evidence that observers can extract information about the illuminant.
Thus people can and do perceive differences in illumination, for example between a sunny and a shaded path, and such information is important, for example for the hiker who seeks warmer or cooler trails. According to Zaidi, perceived object colors do change with illuminants and this change in color can be used to extract information about illuminants. Zaidi also points out that various painters such as Monet and Corot exploit this relationship to provide information about the illuminant. One might question, however, whether “discounting” the illuminant (or taking it into account in the process of color perception) truly precludes illumination perception. In fact, illumination perception or registration might be necessary in order to calculate surface color. Perhaps failures in color constancy reflect failures of accurate illumination estimation rather than revealing illumination perception, as Zaidi suggests.

[2] In spite of my appreciation for Woodworth’s suggestion, I will use the more familiar term “perceive” throughout, but do not intend it to necessarily imply a conscious awareness or the ability to describe the percept. The percept may be explicitly represented, as it is in the current project, or it may not be.

[3] In Gilchrist and colleagues’ discussion of color constancy, the changes in shape and illumination are not distinguished, since the shading is produced by an “attached illumination edge” in the object.

The relationship between surface color and illumination perception

Does illumination perception have a role in the perception of surface color? The fact that human observers have (some degree of) color constancy suggests that different proximal stimuli can give rise to the perception of (approximately) the same color. For example, imagine the same gray paper in first dim then bright light. It is seen as the same middle gray in both cases.
One suggestion is that in order for this process to work, the illumination difference (which accounts for the difference in the proximal stimuli) must be a factor in the calculation of the color. In order to be a factor, the illumination must be perceived (or at least registered) by the visual system. Another way to say this is that when the observer looks at a gray sheet of paper in a given light, the retinal stimulus gives rise to two (not necessarily conscious) percepts: the surface color and the illumination. The exact mathematical relationship between these two percepts could take one of a number of different forms, as discussed below. Helmholtz (1962/1866) proposed that the judgments of color (or lightness) and illumination must be psychologically coupled, in the sense that the perception of one (lightness) is based on perceiving and taking the other (illumination) into account. “In visual observation we constantly aim to reach a judgment on the object colors and to eliminate differences of illumination.” (1962/1866 p. 408). He suggested that the luminance of a particular test field was compared with the perceived illumination of the overall framework (which may or may not have been the complete visual scene). The surface reflectance, he suggested, was calculated by dividing the luminance of the retinal image by the perceived illumination. (This is the classic form of the albedo hypothesis as discussed below.) This particular operation was chosen because if one were dealing strictly with the physics of reflectance, the luminance would be equal to the surface reflectance times the illumination. Whether this is the actual psychological relationship is an empirical question, tested in this project. Hering (1907/1920) raised an objection to any proposal (such as that of Helmholtz) suggesting that from a single known quantity (the light reflected to the eye) we could reliably derive two different perceived quantities. 
He suggested that it would be logically impossible to know which factor was contributing more: was the luminance high because of a high reflectance or because of a high illumination? Indeed, this is a major puzzle in color constancy. This general problem, the problem of anchoring, is discussed further by Gilchrist et al. (1999). Perhaps this seeming paradox can be solved by one or more of the following: First, under natural viewing conditions, the visual field has multiple objects, each of which provides some cue to the illumination (Kardos, 1929). Second, the “field of indirect vision” or the periphery may provide information about the illuminant (Woodworth, 1938). Third, the visual system may simply make a guess (although the data seem to suggest somewhat more accuracy than guessing would predict). Finally, the computational approach to color constancy suggests that one can make a principled estimate of the illuminant based on certain regularities in the world (see, e.g. Maloney & Wandell, 1986; D’Zmura, 1992; Brainard & Freeman, 1997; Buchsbaum, 1980).

Earlier this century, Koffka (1935) also suggested that there was an invariant relationship between perceived lightness and perceived illumination in any case where there is color constancy. He believed that there would be no other logical way that color constancy would be possible. He did not firmly advocate any particular relationship between perceived lightness and perceived illumination, but did assert that there must be some invariant relationship. Specifically, he suggested “a combination of whiteness and brightness, possibly their product, is an invariant for a given local stimulation under a definite set of total conditions.” (p.244).

Woodworth (1938) also proposed that there was an invariant causal relationship between perceived color and perceived illumination.
According to him, the visual system somehow inferred the illumination, based on various cues in the visual field, and then judged lightness based on this “registration” of the illumination.

In the late 1960’s and 1970’s there was a revival of the idea that an observer used an estimate of the illumination to perceive lightness. This was called the albedo hypothesis (Beck, 1972). Epstein (1973) called it the “taking-into-account hypothesis,” a term which is no longer used. Beck formalized the hypothesis, describing it as “the view that an observer discounts the intensity of the illumination in perceiving lightness.” (1972, p.99). As the term is used in the literature today, the albedo hypothesis suggests that the visual system first estimates the illuminant, and then uses that estimate to calculate the surface reflectance for a given surface luminance. [4] Beck suggested that the albedo hypothesis required a strictly invariant relationship between the perceived lightness and perceived illumination. One possible relationship is that the light reaching the eye, or luminance (e), equals the perceived lightness or albedo (â) times the perceived illumination (î):

e = â × î

(He later tested and rejected this hypothesis.)

[4] Albedo is here used as a synonym for reflectance.

According to the albedo hypothesis there is a causal relationship between the two percepts: the perceived illumination has a causal role in the perception of lightness. Thus, a change in the perceived illumination is a sufficient condition for a change in perceived lightness provided that the luminance remains unchanged. The above formulation is the classical form of the albedo hypothesis, but in principle there are a number of possible formal relationships describing an invariant relationship between perceived illumination and perceived albedo.
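The classical form above can be worked through with a short sketch. The function names and the numbers (arbitrary units) are hypothetical and serve only to show what the relation e = â × î implies; this is not the analysis code used in the experiments. The sketch computes the perceived albedo from luminance and perceived illumination, then checks two consequences: scenes matched in both perceived illumination and perceived lightness must have equal luminance, and raising the perceived illumination at fixed luminance lowers the perceived albedo.

```python
import math

def perceived_albedo(luminance, perceived_illumination):
    """Classical albedo hypothesis solved for albedo: a_hat = e / i_hat."""
    return luminance / perceived_illumination

def predicted_luminance(albedo_hat, perceived_illumination):
    """The same relation solved for luminance: e = a_hat * i_hat."""
    return albedo_hat * perceived_illumination

# Hypothetical standard scene, in arbitrary units.
e_standard = 30.0
i_hat_standard = 100.0
a_hat_standard = perceived_albedo(e_standard, i_hat_standard)

# Suppose the experimental scene is adjusted until both percepts match the standard.
i_hat_experimental = i_hat_standard
a_hat_experimental = a_hat_standard
e_predicted = predicted_luminance(a_hat_experimental, i_hat_experimental)

# With luminance held fixed, a higher perceived illumination implies a darker surface.
a_hat_brighter = perceived_albedo(e_standard, 200.0)

print(a_hat_standard, e_predicted, a_hat_brighter)
```

Under this form of the hypothesis, `e_predicted` has to equal `e_standard` whenever both percepts are matched, so a measured luminance difference between perceptually matched scenes counts as evidence against the relation.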
The important requirement is that there be a regular and causal relationship between the two, such that for a given luminance, the perceptual system is able to uniquely determine the albedo, given the perceived illumination. Variations of this possible equation are discussed below (see the subsection entitled “Quantitative Tests”).

Tests and demonstrations of the proposed relationship

If it is the case that perceived surface color depends on both the surface luminance and the perceived illuminant, then a prediction would be that by manipulating cues to the illuminant, one could influence perceived color. One very simple demonstration of the relationship between perceived illumination and perceived surface color is Hering’s (1907/1920) ringed-shadow demonstration: place an object on a white sheet of paper in a room with a single light source. Take a thick felt pen and trace the penumbra (the lighter gray area) of the shadow on the paper, so that it no longer appears to be a penumbra. Without a penumbra the shadow will not appear to be a shadow, and the paper will appear stained; the perceived color will have been changed by a manipulation of the perceived illuminant. [5]

A second, better-known demonstration of deceptive illumination influencing perceived color is the “Gelb effect” (Gelb, 1929). In this demonstration, the room was dimly lit and the walls were covered with an assortment of objects. The experimenter presented a black disk suspended from the ceiling, which was illuminated by a hidden light source. In this arrangement, the observer was unaware of the light source, and there was no penumbra. Thus, the disk was not seen as highly illuminated but was seen as white. Importantly, being told about the light source did not change the percept; the observer still saw the black paper as white.

The exact inverse of this demonstration has been shown (Kardos, 1934). In this case, the room was very well lit and contained a variety of objects.
A disk of white paper was suspended from the ceiling and a concealed shadow caster prevented light from falling directly on the disk. Observers reported seeing a black disk. As before, subsequent manipulations revealed that the effect is not cognitively penetrable: knowing that the shadow caster was there did not change the percept. Only cues of shading in the visual field changed the perceived color. Again, this suggests that it is possible to manipulate perceived color by manipulating perceived illumination alone.

5 Notice, however, that this demonstration alone does not require an invariant relationship between the illuminant and the surface color, since other aspects of the retinal image, like the “crispness” of the edge, have necessarily changed as well (see also Beck, 1971, described below). It is, however, consistent with there being such a relationship, and thus may be a suggestive demonstration.

Beck (1971) offered a more recent replication and improvement on this demonstrative paradigm. He noticed that it was not logically possible to distinguish between the effect of illumination perception and the effect of simultaneous contrast (or lateral inhibition) in the above demonstrations. (Indeed, it was Woodworth and Schlosberg (1954), not Gelb, who suggested that the original demonstration revealed the effect of illumination perception.) Beck projected a bright beam of light onto a white background such that it fell halfway onto a black surface in the foreground. Because of the angle of observation, it was in one case possible to see the shadow (an obvious cue to illumination) and in the other case not (see Figure 6). In these two cases the contrast effect was held constant since the reflectance and perimeter of the edge were equated.
The results were in agreement with the earlier demonstrations: the majority of the observers rated the target (the illuminated area of the black foreground surface) as darker in the shadow condition than in the non-shadow condition. Thus, even with adjacent contrast equated, a visible cue to illumination affected the perceived surface color.

Figure 6: Experimental setup from Beck (1971). A bright beam of light shines on a white background and falls halfway onto a black surface in the foreground. In one case (shown on the left) it was possible to see the shadow (an obvious cue to illumination) and in the other case (shown on the right) it was not. The contrast effect was held constant. (Labeled elements: white background, illuminated area, shadow, black surfaces.) Adapted from Beck, 1971.

Edge Classification

Gilchrist and his colleagues (e.g. Gilchrist, 1988; Gilchrist & Jacobsen, 1984) organized and explained these observations using the important concept of edge classification. According to this perspective, another way to characterize the “deceptive illumination” in the above demonstrations is to say that the observer has (perceptually) misclassified an abrupt change in luminance. Gilchrist and colleagues proposed that the perceptual system automatically classifies any abrupt change in luminance as one of two types of edges. A “reflectance edge” is a change in color caused by a change in surface reflectance, e.g. by a stripe of paint or a change in material. An “illumination edge” is a change in the amount of light reaching the surface, either because of shadowing or because of a bend in the surface. In the case of Hering’s ringed shadow, the edge is seen as a reflectance edge, a change in the color of the paper, rather than as an illumination edge or a shadow. Gilchrist and Jacobsen (1984) measured observers’ ability to judge color and illumination in two achromatic scenes.
Gilchrist and Jacobsen constructed two identical miniature rooms that differed only in the reflectance of their surfaces. Each room contained the same objects: a milk carton, two paint cans, a wooden cube, and an egg carton. Each room was painted uniformly such that all the surfaces, including the walls and the objects, were of the same reflectance. One was matte black (with a reflectance of 4.6%) and the other a matte white (with a reflectance of 84%). Each chamber was illuminated with a bulb that was hidden by a baffle, such that it was not visible to the observer. In one condition the rooms were equally illuminated. Here the luminance level was much higher in the white room than in the black room, but the authors argue that this alone should not change the appearance of the room, unless one expected that every illumination change would also change the lightness appearance of the room. In another condition, the illumination in the white room was adjusted so that the light reaching the eye was actually less in the white room than in the black room (both in total and at every measured point). Observers were first asked to describe what they saw and were then asked whether the illumination appeared to be the same everywhere and whether the surfaces all appeared to be the same shade. Next, observers were asked to make 8 illumination matches: they were asked to adjust the illumination on a Munsell chart until it matched the illumination they saw at 8 different points in each room. In a second experiment, the experimenters asked for a Munsell match of the reflectance at each of the 8 test points. By this method, Gilchrist and Jacobsen hoped to answer the following questions: First, will the different surfaces within a single-reflectance room look different? This first question is a test of the current contrast account of color constancy, which suggests that the perceived color of a surface is influenced by the color of its adjacent surfaces.
Second, what is the perceived illumination at the different test spots in a room with uniform illumination? Third, will the two single-reflectance rooms look different from one another?

The results suggested that the various surfaces within each room had roughly the same apparent reflectance or color. The within-observer differences in color judgment were very small compared to the differences in illumination judgments. Results also showed that the differences in illumination levels throughout the array were perceived, and the illumination judgments closely paralleled the true illumination levels. The exact match was not always veridical (the matches in the white room showed a consistent error), but the ratios between the 8 test spots were perceived veridically. Finally, results showed that the two rooms appeared to the observers to be different from each other. Even the brightly lit black room was seen as darker (Munsell match 5.5) than the dimly lit white room (Munsell match 7.5), although one might have predicted the opposite based on the intensity of the light reaching the eye.

They drew two conclusions: First, observers were judging color differently, and more accurately, than simultaneous contrast theory would predict. Second, observers were remarkably good at judging the illumination at eight different points in the visual scene. The authors intend the experiment as a demonstration that observers are able to classify edges as either reflectance edges or illumination edges, and take these results to support this view. They point out that sensory theories of color constancy ignore illumination perception, or assume that illumination is poorly perceived. Given these results, the authors concluded that contrast theories could not entirely account for color constancy, nor was the “photometer metaphor”6 accurate in describing human color vision. They offer as an alternative the view that what the visual system needs to do is to categorize edges.
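The luminance arithmetic behind the two-room comparison can be made concrete with a short sketch. It assumes that luminance is simply the product of reflectance and illumination, and it uses the two reflectances reported for the rooms (4.6% and 84%). The illumination values and the dimming factor are arbitrary illustrative numbers, not values from Gilchrist and Jacobsen's study.

```python
# Reflectances reported for the two rooms, expressed as fractions.
BLACK_ROOM_REFLECTANCE = 0.046
WHITE_ROOM_REFLECTANCE = 0.84


def luminance(reflectance, illumination):
    """Luminance as reflectance times illumination (arbitrary units)."""
    return reflectance * illumination


# Condition 1: equal illumination (100 units is a hypothetical value).
equal_white = luminance(WHITE_ROOM_REFLECTANCE, 100.0)
equal_black = luminance(BLACK_ROOM_REFLECTANCE, 100.0)

# The white room stops being the more luminous one only when its
# illumination is lower than the black room's by more than this ratio.
ratio = WHITE_ROOM_REFLECTANCE / BLACK_ROOM_REFLECTANCE

# Condition 2: white room dimmed by a factor of 20 (illustrative choice,
# picked only because it exceeds the ratio above).
dim_white = luminance(WHITE_ROOM_REFLECTANCE, 100.0 / 20.0)

print(f"equal illumination: white={equal_white:.1f}, black={equal_black:.1f}")
print(f"critical illumination ratio: {ratio:.1f}")
print(f"white room dimmed 20x: white={dim_white:.1f}, black={equal_black:.1f}")
```

On this arithmetic the white room must receive less than roughly one eighteenth of the black room's illumination before it sends less light to the eye, which is why the finding that it still looked lighter is informative.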
Contrast theories would suggest that an area of higher luminance would always appear to be whiter than an adjacent area of lower luminance. Gilchrist and Jacobsen suggest that this is only the case if the border between the areas is seen as a reflectance edge; the same inference cannot be made if the edge is an illumination edge.

6 The photometer metaphor suggests the photometer as a model of human lightness perception. A photometer measures luminance, which is the product of illumination and reflectance, and has no way to disambiguate the two. This metaphor has previously been discredited by the work of Helmholtz, Hering, and the contrast effects literature.

Subsequently, Gilchrist (1988) proposed that edges are categorized by whether the ratio or luminance difference remains the same at an intersection. Based on this view, he was able to experimentally manipulate whether the observer saw a luminance gradient as a reflectance edge or an illumination edge, thus manipulating the perceived color. Gilchrist had the observer look into a small room. On the back wall of this room was a piece of white (90% reflectance) paper. Illumination came from a hidden point light source, and the paper was half shaded by a shadow caster. The target piece of gray paper was attached to the shaded side of the white background, and a Munsell chart, from which observers were asked to choose the matching chip, hung in the illuminated region. In one condition, the observers’ view was unobstructed, such that they could see the context of the room, the paper, and the shadow. In another condition, observers viewed the room through a hole in a sheet of black paper, such that they could only see the target, the Munsell grid, and part of the background.
According to Gilchrist, the case in which the observers could see the context unobscured was an experiment on color constancy, and the case in which the observer’s view was obscured by the baffle, such that the illumination difference was not apparent and the shaded area appeared to be darker paper, was an experiment on the contrast effect. Results clearly show a difference between the two conditions: when the observers could see the context, they judged the target paper to be much lighter than when they could not see the context. This is a very important effect, given that observers are looking at the same target; the retinal stimuli would have been exactly the same in the two conditions, at least in the center of the field of view. These results suggest that perceived illumination can have an effect on perceived color, given the same proximal (and distal) stimulus. In addition, Gilchrist concluded that constancy effects are far greater (six times larger) than contrast effects. Based on these data, Gilchrist suggests that contrast effects represent failures of constancy, contrary to some current authors who have suggested that constancy and contrast are examples of the same phenomenon.

Quantitative tests

The previous section provides a review of the evidence that humans perceive illumination and some demonstrations suggesting a relationship between perceived illumination and perceived surface color. Next it would be interesting to know whether there is a consistent, quantifiable relationship. Below is a review of some attempts to quantitatively test the relationship between perceived illumination and perceived reflectance. In addition to the original formulation of the albedo hypothesis, a more general possible quantitative relationship will be considered and tested in this experimental project, as well as an even more specific hypothesis.
The most specific hypothesis suggests that the percept is a function of and uniquely determined by the physical stimulus:

î = f(i)
â = g(a)

According to this hypothesis, the perceived illuminant intensity (î) at any point is determined by the physical illuminant intensity (i) at that point, and the perceived albedo (â) of any surface is determined by the physical albedo (a) or reflectance of that surface. It is known that this form of the albedo hypothesis is not always true (e.g. Hering, 1907/1920; Gelb, 1929; Kardos, 1934; Beck, 1971; Logvinenko & Menshikova, 1994).

In its classic form (Helmholtz, 1866; Koffka, 1935; Beck, 1972), the albedo hypothesis is7:

â = e / î

Here e is the amount of light energy reaching the eye (or the luminance) from the test patch, â is the perceived albedo of the test patch, and î is the perceived illuminant intensity at the test patch. According to this model, color perception involves the following steps: 1) the retinal image is formed as the light is reflected from surfaces, 2) the visual system calculates the illuminant based on information available in the entire scene, and 3) the perceived reflectance is calculated according to this equation. This original form is the most commonly tested form of the albedo hypothesis.

A more general formulation of the albedo hypothesis is

â = f(e, î)

where f() is a function that is unknown but which is consistent across surface reflectances and contexts. This general form is similar to and not mutually exclusive of the classic one. Here the perceived albedo (â) is determined by the light reaching the eye in a manner that depends on the perceived illumination (î).
Again, this is a multi-stage model involving 1) the formation of the retinal image based on the luminance, 2) an estimation of the illuminant based on the entire scene, and 3) the calculation of the reflectance by some hypothetical reflectance calculation function based on the luminance and the estimated illuminant. Notice that, according to this model, this function takes as input only the perceived illuminant and the actual physical luminance reflected from the surface of interest. The output is the perceived surface reflectance, as illustrated in the top panel of Figure 7. This suggestion is appealing, because it relaxes the requirement that the psychological relationship parallel the laws of physics, i.e. that

luminance intensity (e) = albedo (a) * illuminant intensity (i)

This form of the albedo hypothesis is also the most general formulation of the three-stage model of surface perception described above. It is (nearly) universal among computational models of surface color perception. Furthermore, this variant is consistent with some extant data (e.g. Oyama, 1968; Logvinenko & Menshikova, 1994).

7 This notation may strike the reader as odd, since it is equating a physical measurement with a psychological representation. This makes sense mathematically only if one assumes that the functions f and g mentioned above are identity functions, that is, that i = î and a = â. Note that even if this assumption fails, the following experimental predictions are the same. In that case, call the registered, inaccessible percepts î’ and â’, and the consciously accessible explicit representations î and â. Now as long as î and â are some fixed functions of î’ and â’ and each function has a one-to-one relationship, the experimental predictions in the matching experiments that follow will be the same.

Alternatively, it is possible that none of these forms of the albedo hypothesis is correct. In this case, there are at least two alternative possibilities.
First, it is possible that perceived illumination affects perceived reflectance, but does not uniquely determine it. Perhaps it is one of multiple factors used by the hypothetical albedo calculation function to mediate the relationship between luminance and perceived reflectance:

â = f(e, î, x)

Here, the variable x represents an unknown factor used in the calculation of the perceived surface reflectance. As seen below, this factor could be the reflectance of the immediate surround of the test patch or the ratio between the luminance of the test patch and that of the surround, for example. The second panel of Figure 7 illustrates the first alternative to the albedo hypothesis. The luminance and the perceived illuminant are factors in the calculation of the perceived reflectance, but so is some other factor.

As a second alternative, it is possible that both perceived illumination and perceived albedo are represented, but that there is no relationship between the two. This alternative is the proposal that Beck (1959; 1961; 1971; 1972) promotes. The third panel of Figure 7 shows the perception of both î and â with no fixed relationship between the two.

Figure 7: This figure illustrates the three alternatives: the top panel illustrates a model consistent with the albedo hypothesis: the stimulus yields an estimate of the illuminant (î), which is available for computing the albedo (â) given the stimulus. The perceived illuminant (î) governs the relationship between stimulus and perceived surface. The second panel shows this same model, but with additional factors influencing the relationship between physical luminance and perceived reflectance. The third panel shows the perception of both î and â with no fixed relationship between the two.
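The difference between the classic form, the general form, and the first alternative can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names are hypothetical, and the particular general function (a square-root compression) and the dependence on the extra factor x are invented for the example. The hypothesis itself does not specify them. The second alternative, in which î and â are unrelated, has no function to write down and so is not shown.

```python
def classic_form(e, i_hat):
    """Classic form: perceived albedo = luminance / perceived illumination."""
    return e / i_hat


def general_form(e, i_hat):
    """General form: some fixed function of e and i_hat.

    The square-root compression is purely hypothetical; the hypothesis only
    requires that the same function apply across surfaces and contexts.
    """
    return (e / i_hat) ** 0.5


def first_alternative(e, i_hat, x):
    """First alternative: an additional factor x also enters the computation.

    The form of the dependence on x is a made-up illustration.
    """
    return (e / i_hat) * (1.0 + 0.5 * x)


# Two contexts that share the same luminance and perceived illumination
# (illustrative values in arbitrary units).
e, i_hat = 20.0, 100.0

# Under the classic and general forms, these inputs fix the perceived albedo.
a_classic = classic_form(e, i_hat)
a_general = general_form(e, i_hat)

# Under the first alternative, the same inputs can yield different perceived
# albedos when the extra factor differs between the two contexts.
alt_context_1 = first_alternative(e, i_hat, x=0.0)
alt_context_2 = first_alternative(e, i_hat, x=1.0)

print(a_classic, a_general)
print(alt_context_1, alt_context_2)
```

The design point is that the first two functions take only e and î as arguments, so equal inputs cannot produce unequal outputs; only the third signature leaves room for a difference.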
The experiments described below were designed to select between the albedo hypothesis and these two alternatives to it. The quantitative relationship between perceived surface color and perceived illumination (given a particular retinal image) has been investigated, for example, by Beck (1959, 1961), Kozaki (1973), Oyama (1968), and Kozaki & Noguchi (1976; Noguchi & Kozaki, 1985), as reviewed below. Many of these studies reject the most specific hypothesis and the classic form of the albedo hypothesis, and some of the data may reject even the more general form of the albedo hypothesis for certain contexts. Nevertheless, another, more rigorous test of the albedo hypothesis is justified, since all of these tests used simple, poorly articulated scenes, which may have an effect on color perception.

Oyama, 1968

One experimental test of the hypothesis that perceived albedo is inferred by dividing the luminance by the perceived illumination (a re-arrangement of the classic form of the albedo hypothesis: â = e / î) was conducted by Oyama (1968). In this experiment, there were three boxes, the standard box, comparison box I and comparison box II, each with a rectangular aperture cut in the front of it. Each box was illuminated by a light source from the top front of the box. The standard box was lined with gray paper, and had a hole in the back of the box beyond which there was a very low reflectance surface. A standard disk (one of 5) was presented in isolation by hanging it down in this opening for the observer to see. Both the illuminant and the reflectance of the standard disk could be set experimentally. Comparison box I was lined with black paper on all sides and had a comparison disk on the back wall with an adjustable white-black ratio. The illuminant in this box was set near the upper range of those used in the standard box. Comparison box II was lined with white paper, and on the back wall was a white square (the target) mounted on a black disk.
An observer adjusted the white-black ratio of the disk in comparison box I to match the surface color of the standard disk in the standard box. Then she adjusted the illumination in comparison box II to match that in the standard box. Notice that the surface color match and the illumination match were made in different boxes with different wall colors, so the most general form of the albedo hypothesis could not be tested with these methods.

From the matching of the surface color of comparison box I to the test disk in the standard box, the main empirical result is that the illuminant in the standard box as well as the surface color in the standard box influenced the matched surface in comparison box I. The fact that the illuminant had an effect on matched surface color indicates a deviation from perfect surface color constancy.

Perhaps more pertinent to the current question are the illumination matches in comparison box II. One important finding is that observers were able to match the illumination between the two boxes rather well. That is, there was a linear relationship between standard illuminant and matched illuminant with each standard disk. The matched illuminant, however, depended not only on the standard illuminant but also on the standard surface. The dependence on the test surface indicates a deviation from perfect illumination color constancy: the surfaces in the standard box affect the illuminant matches, but this deviation is quite small.

Of interest from this study is not only the question of whether observers can match the illuminant intensity, but also whether Oyama’s data can be used to test any form of the albedo hypothesis. In fact, these data can be used to test the classic form of the albedo hypothesis: e = â * î. As the step by step analysis below will show, this form of the albedo hypothesis predicts a linear relationship between the luminance of the test patch in the standard box (e1) and the test box (e2).
The logic is as follows: If the albedo hypothesis were correct in each of the two contexts, then

e1 = â1 * î1 and e2 = â2 * î2

which gives us

e1 / î1 = â1 and e2 / î2 = â2.

Observers matched the surface reflectance of the test disk to the standard disk, so â1 = â2. From this, one can derive the prediction that the light from the surface at the two matched test patches should have a linear relationship. Given the matched reflectances, one can derive an equality in the previous two equations, yielding

e1 / î1 = e2 / î2

which can be stated:

e2 = e1 * î2 / î1

or

e2 = e1 * c

where c is a scalar constant, which represents the relationship between the perceived illumination in the two contexts. In other words, the classic form of the albedo hypothesis says that when the surface reflectances of the test patches in the two chambers are subjectively matched, there should be a linear relationship between the luminances measured at the two test patches. Notice that the two perceived illuminants, î1 and î2, do not have to be the same; they just have to have a consistent relationship during the test.

Data from the surface color matching part of this experiment do not show this result, and thus falsify the classic form of the albedo hypothesis for Oyama’s context. A plot (see Figure 8) of log matched luminance against log standard luminance shows a slope of greater than one (as opposed to the predicted 1), which falsifies the albedo hypothesis in its classic form.

Notice that although the two perceived illuminants do not have to be equal, they may be. If î1 = î2, then î2 / î1 = 1, so

e1 = e2

In other words, according to the classic form of the albedo hypothesis, if â1 = â2 and î1 = î2 then it must be the case that e1 = e2. This will also be true of the more general form of the albedo hypothesis, f(e, î) = â. This prediction is crucial to the logic of the experiments in the current project, especially experiment 2A, described below.
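The slope prediction can be checked numerically with a small sketch. It assumes only the proportional relationship e2 = e1 * c derived above; the luminance values, the value of c, and the exponent used for the contrasting case are all hypothetical and are not Oyama's data.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) regressed on log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Hypothetical standard luminances (arbitrary units).
standard = [0.1, 0.3, 1.0, 3.0, 10.0]

# Classic form with matched albedo: matched luminance is a constant
# multiple of the standard luminance. The value of c is arbitrary.
c = 0.5
matched_classic = [e1 * c for e1 in standard]

# A hypothetical departure from the prediction: matched luminance grows
# as a power of standard luminance with exponent 1.3.
matched_departure = [e1 ** 1.3 for e1 in standard]

slope_classic = loglog_slope(standard, matched_classic)
slope_departure = loglog_slope(standard, matched_departure)

print(f"classic form slope: {slope_classic:.3f}")
print(f"departure slope:    {slope_departure:.3f}")
```

Because log(e2) = log(e1) + log(c), the constant c shifts the line vertically but leaves its slope at 1, so any slope reliably different from 1 counts against the classic form regardless of how the two perceived illuminants are related.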
With Oyama’s data it is not possible to test the general form of the albedo hypothesis. The illumination matches were not set in the same box as the surface matches, so one cannot test the hypothesis that there is some invariant relationship between â, î and e. One cannot test the idea that the perceived reflectance is determined by the surface light reaching the eye, taking perceived illumination into account via some as yet unspecified relationship.

Figure 8: a) “Effect of luminance and reflectance on matched reflectance.” Schematic representation of the major finding from Oyama (1968) in log coordinates, with matched reflectance (%) plotted against standard luminance (e). The reflectance of the matched surface depends both on the surface color and the luminance at the standard disk. (Each line represents a different standard disk; standard reflectances were 88%, 46%, 24%, 12%, and 5.8%.) That the illuminant has an effect on perceived surface color is a deviation from color constancy. b) “The relationship between Standard and Matched Luminance.” The second panel, based on Oyama’s Figure 4, shows the luminance of the matched disk (mL) as a function of the luminance of the standard disk (mL), both plotted in log coordinates, with one curve per standard reflectance and the predicted line. It provides a critical test of form 2 (the classic form) of the albedo hypothesis. The hypothesis predicts a slope of 1 for this graph; thus the data clearly challenge the hypothesis.

In considering Oyama’s data, it may be relevant to consider some factors in the display. This display was very simple and not very well articulated. The test patch was large (11° diameter) and was seen against a large (40° by 30°) black background. Gilchrist et al. (1999) would thus predict that the test patch itself would have a large influence on anchoring.
The brighter the test patch, the bigger this self-adaptation effect might be, and thus lightness of the test patch would grow less rapidly in response to physical reflectance than if it were viewed in a scene where other factors dominated the state of adaptation.

Logvinenko & Menshikova, 1994

More recently, Logvinenko & Menshikova (1994) tested and claimed to have falsified the classic form of the albedo hypothesis. Their data are inconsistent with the idea that the retinal image is exactly equal to the perceived illuminant times perceived surface reflectance. However, they suggest that there is an invariant relationship between perception of illumination and perception of surface reflectance, but not the simple relationship of the classic albedo hypothesis. Thus, their data are consistent with the most general form of the albedo hypothesis, â = f(e, î).

Logvinenko & Menshikova started by developing a methodology that would allow them to compare perceptions of shaded regions to perceptions of painted regions. They used a bisection task (after Torgerson, 1958; Pfanzagl, 1968) to develop scales relating physical and perceived qualities: one scale related perceived lightness to surface reflectance and another scale related perceived illumination to illuminant intensity. Observers saw one black chip and one white chip and had to set a third chip to be of a mid-level lightness, exactly equally different from the white chip and the black chip. Then a chip that was equal to their midpoint judgment (actually the median of nine trials) was shown with first the white, then the black chip, and the observer again had to choose the midpoint between the two. Thus, a scale of perceptual increments was created. An analogous procedure was used to create a scale for an illuminant. These first two experiments yield a relationship between the perceived and the physical quantities such that one can compute one as a (non-linear) function of the other.
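The bisection procedure described above can be illustrated with a simulated observer. This is a minimal sketch under loudly stated assumptions: the observer's lightness function (a cube root of reflectance), the trial-to-trial noise, and the anchor reflectances of 0.03 and 0.90 are all hypothetical choices made for the example, not values or models from the study. Only the structure (repeated bisection, taking the median of nine settings) follows the description.

```python
import random
import statistics


def perceived(reflectance):
    """Hypothetical observer: lightness grows as the cube root of reflectance."""
    return reflectance ** (1.0 / 3.0)


def inverse_perceived(lightness):
    """Reflectance that would produce a given perceived lightness."""
    return lightness ** 3


def bisection_trial(low, high, rng, noise_sd=0.01):
    """One setting: the chip judged midway between the low and high chips."""
    target = (perceived(low) + perceived(high)) / 2.0
    return inverse_perceived(target + rng.gauss(0.0, noise_sd))


def median_bisection(low, high, rng, trials=9):
    """Median of several settings, as in the procedure described above."""
    return statistics.median([bisection_trial(low, high, rng) for _ in range(trials)])


def build_scale(black, white, rng):
    """Bisect black-white, then bisect each half, giving five scale points."""
    mid = median_bisection(black, white, rng)
    lower = median_bisection(black, mid, rng)
    upper = median_bisection(mid, white, rng)
    return [black, lower, mid, upper, white]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
scale = build_scale(0.03, 0.90, rng)
for reflectance in scale:
    print(f"reflectance {reflectance:.3f} -> perceived {perceived(reflectance):.3f}")
```

The resulting points are roughly equally spaced in perceived lightness but unequally spaced in reflectance, which is the sense in which such a scale relates the perceived quantity to the physical one by a non-linear function.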
These functions would, however, only be valid in the context in which the original experiment was conducted. At this point, it would be possible to test a relationship between perceived surface lightness and perceived illumination by asking whether the two functions have the same form. The authors did not do this comparison, but the graphs of the two functions (see Figure 9) reveal a similar shape. The psychophysical functions between physical illumination and perceived illumination, and between physical lightness and perceived lightness, were both non-linear, and were similar in shape.

The authors interpret this non-linearity as evidence against the classic form of the albedo hypothesis. That is, if human color perception were like a photometer, then there would be a perfectly linear relationship between physical illumination and perceived illumination, and this would determine a linear relationship between surface lightness and perceived surface lightness. However, it may not be necessary to make this inference from these data: These are scaling data, and one could imagine that the result of scaling captures the true perceived lightness or illuminant only after a non-linear output transformation (see Foley, 1977; Philbeck & Loomis, 1997). It is still possible that there are inaccessible variables, perceived lightness and illumination, that have a linear relationship to their real world analogs.

In their third experiment, the authors manipulated whether the observers were perceiving a shadow or a colored region by having them look through a pseudoscope. Originally, observers saw a cone on a white sheet of paper, which cast a shadow. Through the pseudoscope, the cone appeared as a hole in the paper, so the shadow looked like a stain. While looking at this inverted scene, observers were asked to match the “colored surface” to a chip. When the scene was then seen in normal depth the observers were asked to match the shadow to a real shadow.
The authors used the scales created in the first two experiments to relate the matches to perceived lightness and perceived illumination. The authors used this third experiment to consider the general form of the albedo hypothesis, that perceived surface lightness is a function of the light reflected to the eye and perceived illumination (rather than the physical analogs of these percepts). If this were true, the authors argue, then a plot of perceived illumination versus perceived surface lightness should be linear. In fact, Logvinenko and Menshikova’s data seem to show this linearity. Unfortunately, the authors failed to measure color and perceived illumination in the same condition. Thus, in order to draw this conclusion, one must assume that when judging the illumination at different shadow intensities, the perceived lightness at that location (perceived surface reflectance) does not change, and vice-versa when judging the perceived lightness. The authors assure us that this assumption is valid. Ultimately, Logvinenko & Menshikova (1994) do not reject the general form of the albedo hypothesis.

Figure 9: This sketch of Logvinenko & Menshikova’s (1994) data shows a non-linear relationship between a) surface reflectance (a) and perceived surface reflectance (â), plotted as perceived “grayness” against reflectance, and between b) illumination (i) and perceived illumination (î). Both panels show data averaged over two observers. Notice, however, that the shapes of the curves are similar.

Kozaki & Noguchi, 1976

Another pair of studies which may provide evidence against the albedo hypothesis in its classic form is that of Kozaki and Noguchi (1976; Noguchi & Kozaki, 1985). In the first of these studies, observers made categorical judgments of both lightness and illumination, choosing one of nine category labels for each (e.g. very blackish gray, blackish gray, rather blackish gray, etc.).
Illumination and lightness were judged in independent sessions. In each session, observers made judgments for a number of different experimentally set test patches, backgrounds and illuminations, for a total of 376 judgments for each session. The data that these sessions yielded can be used to test the various forms of the albedo hypothesis. The experimenters measured, for each trial, the luminance of the test patch. Thus, one can compare judgments of illumination and lightness since the light reaching the eye was held constant. All three forms of the albedo hypothesis would predict that if the light reaching the eye is the same, and the illumination is judged to be the same, then perceived albedo must also be the same. This prediction is falsified, and the albedo hypothesis rejected, given Kozaki and Noguchi’s data. One should be cautious about interpreting these data, given that the dependent measure was scaling data. Furthermore, it should be noted that even if the albedo hypothesis could be rejected in this study, it could only be rejected for these particular stimulus conditions. The stimulus conditions used in this study were rather simple and not well articulated, which may be important (see Gilchrist et al., 1999).

Noguchi & Kozaki, 1985

The same authors later replicated the experiment, this time testing to see whether there was any effect of the test patch being seen as background, in two conditions: one in which small black squares were attached to the test patch, and one in which small white squares were attached to the test patch. Again, results from this study contradict predictions of all three forms of the albedo hypothesis. Although there is a reciprocal relationship between lightness and perceived illumination (specific to each condition), judgments of illumination were influenced by the co-existence of higher luminance regions.
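The prediction tested against both sets of categorical data (same luminance and same judged illumination imply the same judged lightness) can be expressed as a simple consistency check. The sketch below is an assumption-laden illustration: the function name, the trial format, and the data are hypothetical and do not reproduce the analysis or the values from either study.

```python
from collections import defaultdict


def find_violations(trials):
    """Return stimulus groups that break the albedo-hypothesis prediction.

    Each trial is (luminance, illumination_category, lightness_category).
    Trials are grouped by luminance and judged illumination; a group is a
    violation if it contains more than one judged lightness category.
    """
    groups = defaultdict(set)
    for luminance, illumination_cat, lightness_cat in trials:
        groups[(luminance, illumination_cat)].add(lightness_cat)
    return {key: sorted(cats) for key, cats in groups.items() if len(cats) > 1}


# Hypothetical trials; categories are coded 1 to 9, as on a nine-label scale.
trials = [
    (10.0, 5, 4),
    (10.0, 5, 4),
    (10.0, 5, 7),  # same luminance and judged illumination, different lightness
    (20.0, 6, 5),
    (20.0, 3, 8),  # same luminance but different judged illumination: allowed
]

violations = find_violations(trials)
print(violations)
```

Note that different lightness judgments at the same luminance are not violations in themselves; they count against the hypothesis only when the judged illumination is also the same.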
The albedo of the foreground patches affected the relationship between i and î and the relationship between a and â. Since illumination judgments were influenced by the albedo of the test field and its interaction with the albedo of the patches, there was no simple relationship between perceived lightness and perceived illumination, according to the authors. The authors conclude that although the visual system can use the equation e = î * â, it only does so for a particular, fixed e, and the relationship changes across context.

In sum, the studies described in this section attempted to test the albedo hypothesis, and some (e.g. Logvinenko & Menshikova, 1994; Kozaki & Noguchi, 1976; see also Beck, 1961) disprove the albedo hypothesis in its classic form. That is, it is not the case that both lightness and illumination judgments have a linear relationship to their physical analogs, and that these have an unchanging multiplicative relationship to light reaching the eye. The eye is not a light meter. Some of the above authors even reject the most general form of the albedo hypothesis, but they do so relying on scaling data, and only for poorly articulated scenes.

Chapter 2
The logic of the current experiments

In this chapter, four experiments are presented that were designed to examine the role of illumination perception in color constancy. Four more control experiments are included in appendix 1, and essentially replicate the findings in this chapter. These experiments employed a matching paradigm; the observer sat between two experimental chambers while a computer-controlled motorized mirror rotated between the two chambers to change the view. In each chamber there was a mirror-reversed complex scene that included an LCD panel on the back wall that served as the test patch. The illumination in each chamber was controlled by a bank of diffused overhead lights. The intensity in the match chamber was controlled by the observer via the computer.
The observer's first task was to match the illuminant in the match chamber to the illuminant in the standard chamber. The observer's second task was to match the surface reflectance of the test patch on the back wall of each chamber. Observers matched the surface reflectance in the same trial as they matched the illuminant. That is, once the observer had completed the two tasks, both the perceived illuminant and the perceived surface reflectance matched for that observer, before they went on to the next trial.

The first two experiments measured the veridicality of illuminant and surface matching in this paradigm, and the efficacy of this method. The next two tested the albedo hypothesis in both its classic and most general forms. Experiment 2A did so by creating a bias in illuminant matching; the albedo hypothesis suggests that the perceived illuminant determines the perceived reflectance given the physical luminance. Although the judgment of illumination can be “incorrect” (i.e. not determined solely by the actual illumination), whatever that perception is should determine how the surface lightness is perceived. In Experiment 2B, the stimuli were designed to manipulate the surface reflectance matches without any change in the illuminant match. If one can be manipulated independently of the other, then the albedo hypothesis in its most general form is false. Experiments 3A and 3B are replications and control experiments that are designed to ensure that the relationship between illuminant and surface reflectance is localized to the test patch. Experiments 4A and 4B are also replications; they control for the possibility that the causal relationship between illuminant perception and reflectance perception might run in the opposite direction from that commonly supposed.

The experiments, particularly the last 6 (including the 4 replications in Appendix 1), were designed to provide a rigorous test of the most general form of the albedo hypothesis.
The logic of how Experiment 2A might falsify the classic form of the albedo hypothesis is as follows. Assume that the albedo hypothesis is true in both chamber 1 and chamber 2:

    e1 = î1 * â1 and e2 = î2 * â2

then

    e1 / â1 = î1 and e2 / â2 = î2.

Since, after the illumination match,

    î1 = î2

then

    e1 / â1 = e2 / â2

and, since after the surface reflectance match

    â1 = â2

then

    e1 = e2

In other words, the albedo hypothesis makes a prediction about the physical luminance of the regions that appear to match.

Furthermore, the most general form of the albedo hypothesis also predicts that the physical luminance must match after the perceived illuminant and the perceived surface reflectance are matched. Remember that the most general form of the albedo hypothesis, the assumption of most computational models of color perception, is that there is a reflectance calculation function that takes as input only the physical luminance of the test patch and the perceived illuminant, calculated across the entire scene. If the albedo hypothesis is correct, then there is an important prediction for our matching paradigm: once the perceived illuminant is matched in the two chambers, and the perceived surface reflectance is matched in the two chambers, the physical luminance, or the light coming off the test patch, must match in the two chambers. Again, assume that the hypothesis, (f(e, î)= â), is true in each of our two experimental chambers. Only e and î influence the calculation of â. After the two matches, î will be the same in the two chambers and â will be the same in the two chambers. Thus, it must be the case that e, the only other factor that influences the calculation of â, must be the same in the two chambers. The general form of the albedo hypothesis makes a measurable prediction for these experiments. Notice that this prediction is true no matter what the exact form of the reflectance calculation function is.
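The prediction can be illustrated with a minimal sketch. It makes one assumption beyond what is stated above, namely that the reflectance calculation function is strictly increasing in e for a fixed î. The two functions used here (`classic` and `compressive`) and all numerical values are hypothetical illustrations, not forms or data from this work. The sketch shows that, once î is matched, unequal luminances would yield unequal predicted reflectances under either function, so a perceptual reflectance match with unequal luminances would contradict the hypothesis.

```python
def classic(e, i_hat):
    """Classic form: perceived reflectance is luminance over perceived illuminant."""
    return e / i_hat


def compressive(e, i_hat):
    """A hypothetical nonlinear alternative that is still increasing in e."""
    ratio = (e / i_hat) ** 0.5
    return ratio / (1.0 + ratio)


def predicted_reflectances(f, e1, e2, i_hat):
    """Perceived reflectance predicted in each chamber when the perceived
    illuminant i_hat has already been matched across chambers."""
    return f(e1, i_hat), f(e2, i_hat)


# Hypothetical values: matched perceived illuminant, two candidate luminances.
i_hat = 50.0
for f in (classic, compressive):
    equal = predicted_reflectances(f, 15.0, 15.0, i_hat)
    unequal = predicted_reflectances(f, 15.0, 12.0, i_hat)
    # Equal luminances give equal predicted reflectances; unequal luminances do not.
    print(f.__name__, equal, unequal)
```

Because the argument depends only on f being invertible in e, the same check applies to any other candidate function substituted for the two shown.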
These experiments can test the hypothesis that the only way that context affects perceived surface reflectance is by a change in the perceived illuminant. This set of experiments can also test whether illuminant matching is possible, and whether there is more reliability in surface lightness matching or illuminant matching. As mentioned earlier, little is known about illumination perception relative to surface color perception, so any preliminary measurement of illumination perception is of interest.

General Methods

The general paradigm employed in these experiments was a matching paradigm in which observers were asked to adjust one display until they saw some particular aspect of it (e.g. illumination intensity) as perceptually indistinguishable from the same aspect in another display. This matching method has often been used in color constancy research. In surface color constancy experiments, the extent to which the change in the illuminants (or in some cases backgrounds) perturbs the matched surface color is measured. Illuminant matching is analogous; one can quantify the veridicality of the matches and manipulate the reflectance of the surfaces in the scene.

In the following experiments, observers saw two scenes, each in a separate experimental chamber as illustrated in Figure 10. The independent variable was the illuminant or the test patch reflectance in the "standard" chamber, and the dependent variable was the illuminant or test patch reflectance in the "match" chamber after the observer made a match. Observers were asked to match the illuminant and the reflectance of the test patch in the two chambers. The surface reflectances of the walls and objects in the chambers could be manipulated between experiments. Since both the illuminants and the surface reflectance of the test patch were matched by the observer, both illuminant and surface color constancy were tested in these experiments.
Figure 10: Overhead schematic shows the two experimental chambers, motorized mirror, and observer.

Apparatus

There were two identical side-by-side chambers, either of which could be used as the standard or the match chamber. The chambers were built out of plywood, and the floor of each was 36" deep by 31" wide. The ceiling, which was out of view of the observer, was 35.5" above the floor, and was made of two layers of diffuser paper (Rosco 3026) separated by 1.5". The interiors of the two chambers were identical to each other except for one being mirror reversed, and for the surface reflectances of the walls and objects, which varied across experiments. In the rear third of each chamber (that is, what appears to the observer to be the rear after mirror reflection) was an array of objects painted in shades of monochromatic paint, creating a rich, naturalistic viewing environment, identical in the two chambers. The objects were a 1/2 gallon milk carton, a large Styrofoam cup, a small paper cup, a roll of toilet paper, a mason jar, a cardboard cup holder, a plastic cup lid, a cardboard box measuring 4 1/4" cubed, an egg carton, and a cardboard cylindrical container measuring 5 1/4" high and 5" in diameter. Figure 11 shows what the view looked like in one experiment from the observers’ point of view.

The observer sat between the two chambers and was able to view the interior of each via a rotating mirror. The observer could only see one chamber at a time, and a shutter in the viewing screen occluded the observer's view while the mirror rotated. The mirror measured 16" by 16" and sat beyond and between the two chambers seen in Figure 10. The opening in the chamber through which the observers looked was 24" wide by 17" tall. The aperture in the viewing screen through which observers looked was 4 1/2" by 3 1/2" and was 14 1/4" in front of the observer's right eye.
Observers used a chin rest to restrict movement, and viewing was monocular (right eye for every observer). The chin rest was adjusted for each observer before each session to standardize the position of the right eye and thus the view.

Figure 11: One experimental chamber from the observers’ point of view. The test patch is visible on the back wall.

Directly above the diffuser paper in each chamber was a bank of 6 stage lamps (SLD Lighting 6" Fresnel #3053, with BTL 500 watt bulbs) arranged in concentric triangles. Each lamp was covered by a 6.3" round filter that was red (Rosco 6100 "flame red"), green (Rosco 1959, "light green") or blue (Rosco 4600 "blue"). There were three blue lamps, two green lamps, and one red lamp in each chamber. Using the three primaries made it possible to maintain an achromatic illuminant by adjusting the intensity of each. The number of lamps for each primary was chosen because it allowed the maximum range of luminances given the requirement that the illumination be achromatic (approximately x = .31, y = .34 in CIExyY coordinates). Each lamp could be controlled individually by a Power Macintosh 7200/120. The lamp intensities were controlled by varying the RMS voltage across the bulbs (NSI 5600 Dimmer Packs, NSI OPT-232 interface card, 256 quantization levels).

Each chamber also had a flat vertical monitor placed in the back wall of the chamber which was visible through a 1 3/8" by 2 3/8" rectangular opening cut in the cardboard. The monitor was a High Resolution Active Matrix Color LCD Panel (Marshall, product number V-LCD5V), and was under the control of the Power Macintosh computer. The monitor appears as the rectangular test patch in Figure 11. This monitor served as a test patch for surface matching. Each panel was covered by a gray gel (the exact gel varied across the experiments) and a layer of translucent plastic, which was added to reduce the angular dependence and make the patch look like a piece of paper.
Each patch could be computer-adjusted as needed. The monitors and the software controlling them were identical, so either monitor could serve as standard or match surface.

Calibration

Both the screens and the bank of overhead lamps were calibrated so that the computer could accurately control the intensity levels and chromaticities. For the monitors, the procedure involved measuring each of the three primaries (red, green and blue) at 35 intensity levels, and then creating a model fit that predicts chromaticity and luminance of the monitors with the three primaries acting in concert. For the overhead lamps, the procedure was similar: each primary (the one red lamp, the two green lamps or the three blue lamps) was measured at 25 levels of intensity and the computer calculated what intensity would be needed from each primary to produce an achromatic light of any given intensity. For a more complete discussion of calibration procedures, see Brainard et al. (1997).

Because of building-wide fluctuations in the power supply, a standardized illumination measurement was taken before each experiment, once the lamps were warmed up. The calibration data were then scaled to match the current power level so that intensities would more closely approximate the desired levels. Ultimately, the independent variables were the measurements taken at the end of the experiment, rather than the nominal intensities, so any failure in these procedures should not affect the conclusions one can draw from this study.

During the matching experiment, as the illumination changed in the match chamber, care had to be taken to keep the simulated reflectance of the test patch perceptually the same. The surface luminance (e) of the test patch was a result of two independent components: 1) the actual reflectance of the surface of the unit times the illuminant in the chamber, and 2) the light generated by the LCD panel.
In order to create the appearance of a constant reflectance as the illuminant changed, the LCD’s surface reflectance (when off) was measured independently during the calibration, so that the two independent components could be summed to produce the appropriate surface luminance. Then, for each step that the observer took to increase or decrease the illuminant, the computer was programmed to take six interleaved steps. That is, the illuminant would change one sixth of the total adjustment, then the reflectance would change one sixth of the total adjustment, until the entire adjustment had been made. This process happened in less than a second, and created the appearance of a gradually changing illuminant with little or no perceptual change of the surface reflectance.

Procedure

First, observers were led into the viewing booth and the chair and chin rest were adjusted for comfort and to standardize the eye position. Observers had to align a small thread hanging from the near aperture with a specific point in the viewing scene by adjusting the chin rest, in order to standardize the view. Next, observers were given instructions orally (see appendix 1 for complete instructions). They were told that there would be a series of 16 trials, each consisting of an illumination match followed by a surface reflectance match. They were told not to worry about whether the surface colors looked the same during the illumination match. Conversely, observers were told that during the surface matches, they should not worry about whether the illumination levels matched, but that they should make the two test patches look like they were cut out of the same piece of paper. They were given the details about how to adjust illuminations and surface levels and how to accept or reject the matches using a Gravis Game Pad. The Game Pad had a joystick on the left side and a set of four buttons on the right side.
The observers moved the joystick up to increase the illumination or reflectance and down to decrease the illumination or reflectance. In order to accept a match, observers had to push a blue button on the right side of the Game Pad. If observers could not make a satisfying match, they could push the yellow button to reject the match and move on to the next trial. Observers were encouraged to reject a match if they were unable to adjust conditions in the match chamber to the point where there was a perceptual match. A computerized speech simulator cued observers with the words "Do an illumination match" or "Do a surface match," to help them remember which match to do.

The intensity of illumination and the surface reflectance in the standard chamber were pre-programmed and set by computer. Intensities and reflectances were equally spaced within the range of possible intensities that could be produced achromatically in the chamber and matched in the match chamber, and the range varied across experiments (see below). The starting points for the illuminant and the reflectance in the match chamber were selected randomly by the computer from any point in the possible achromatic range.

Illumination was physically measured in the standard and the match chamber after the observer had completed the entire session. The illuminant level was measured with a Photo Research PR-650 spectrometer measuring light off a highly reflective standard, which was placed immediately in front of the test patch. Once the observer left, the computer replayed the settings of both the standard chamber and the match chamber at the end of each trial and took the measurements. Similarly, reflectance data came from measurements taken after the observer had completed the entire session. The spectrometer measured light energy from the test patch as the illuminant and test patch settings were replayed by the computer. The reflectance, represented as a proportion, was derived from these measurements.
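The two-component account of test patch luminance given under Calibration, the six-step adjustment, and the derivation of reflectance from the replayed measurements can be sketched as follows. This is a minimal illustration and not the software used in the experiments. The function names and example numbers are hypothetical, and the sketch assumes that the illuminant is expressed as the luminance (cd/m²) measured off a perfectly reflecting standard, so that luminance equals reflectance times illuminant.

```python
def lcd_emission_needed(target_reflectance, illuminant, off_reflectance):
    """Light (cd/m^2) the LCD must emit so that the patch has the luminance of a
    surface with target_reflectance under the given illuminant."""
    emission = (target_reflectance - off_reflectance) * illuminant
    if emission < 0:
        raise ValueError("target reflectance is below the panel's own reflectance")
    return emission


def patch_luminance(illuminant, off_reflectance, emission):
    """Total luminance e: reflected component plus emitted component."""
    return off_reflectance * illuminant + emission


def interleaved_steps(start_illuminant, end_illuminant, target_reflectance,
                      off_reflectance, n_steps=6):
    """Return (illuminant, emission) pairs that move the illuminant in n_steps
    equal increments, updating the LCD after each increment so that the
    simulated reflectance stays constant at the end of every step."""
    steps = []
    for k in range(1, n_steps + 1):
        fraction = k / n_steps
        illuminant = start_illuminant + (end_illuminant - start_illuminant) * fraction
        emission = lcd_emission_needed(target_reflectance, illuminant, off_reflectance)
        steps.append((illuminant, emission))
    return steps


def derived_reflectance(measured_patch_luminance, measured_illuminant):
    """Reflectance as a proportion, from the two replayed measurements."""
    return measured_patch_luminance / measured_illuminant


# Hypothetical example: raise the illuminant from 40 to 52 cd/m^2 while
# simulating a reflectance of .37 on a panel whose own reflectance is .05.
for illuminant, emission in interleaved_steps(40.0, 52.0, 0.37, 0.05):
    e = patch_luminance(illuminant, 0.05, emission)
    print(illuminant, emission, derived_reflectance(e, illuminant))
```

If the standard used for the illuminant measurement reflects less than all of the incident light, the derived proportion would need to be scaled by that standard's reflectance.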
Experiment 1A: Symmetric Matches

Experiment 1A was designed to measure illumination matching under very simple conditions. It was also intended as a baseline experiment using a new methodology, to make sure that the observers could do the task, that the apparatus and the procedure worked as expected, and that there was no bias between the two chambers. The surface reflectances in the two chambers were identical and monochromatic. Thus, this was a symmetric matching experiment. It is possible that this experiment might falsify the albedo hypothesis if matching is not veridical or luminance does not match in the two chambers after the two matches, but such an outcome is not expected for symmetric matching. This experiment was not designed as a test of the albedo hypothesis.

Apparatus: 1A

In this experiment, the walls and floor of each chamber were lined with cardboard of the same mid-level reflectance. The objects in each chamber were painted with the same middle-reflectance paint. The gel on the LCD panels was a dark gray (Rosco 98) in each chamber. Thus, the two chambers were identical in surface reflectance, size, lighting hardware, and the array of objects inside. The array of objects and the "views" of the chambers were mirror images of each other, so that the retinal stimuli would not be spatially identical, and observers would know unambiguously which chamber they were looking in.

Observers: 1A

There were 8 observers. They included two females in their early 20s, one female in her early 30s, and five males in their early 20s. All observers were naïve except for the author. Observers were paid $10 for each session, except for the author.

Procedure: 1A

Each chamber served as the standard chamber in half of the sessions, but the standard chamber did not change during a particular session. The illumination in the standard chamber was set at the beginning of each trial to one of four different standard starting points by the computer (15, 35, 55, and 74 cd/m²).
Observers were told that in order to make an illumination match, they should match the amount of light hitting any two corresponding points in the two chambers, for example the amount of light hitting the test patches. The surface reflectance was also set to one of four predetermined levels (.13, .24, .37, and .50), and the four illuminant levels were crossed with the four surface levels, such that each of the 16 trials was a unique combination of illuminant level and reflectance. Each observer completed two independent sessions on different days.

Results: 1A

Were illuminant matches veridical? The maximum number of illuminant matches accepted by each observer was 32, 16 for each of two sessions. (There were few rejected matches during this experiment.) Figure 12 shows all illuminant matches for one observer. The standard illuminants were plotted on the X-axis and the matched illuminants on the Y-axis. Notice that for this observer, illuminant matching was nearly veridical, and the data fall along the diagonal.

The relationship between the illumination in the standard chamber and the illumination in the match chamber may be characterized by the slope of the regression line when the two are plotted against each other. For each observer, data from both sessions were used to calculate a slope, with the intercept constrained to zero. Veridical matching would produce a slope of one, since the physical illuminant in the match chamber would be the same as the physical illuminant in the standard chamber. Figure 13 shows the slopes for all observers. The average slope for all 8 observers was .95. A paired two-tailed t test on the individual matches showed a significant difference between standard illuminant and matched illuminant for one observer (SIM: t(31) = 4.45, p = .0001). The difference was not significant for any other observer.
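The slope and paired comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for this work: the function names are hypothetical, the slope is the ordinary least-squares slope with the intercept fixed at zero, and the example values are invented rather than taken from any observer.

```python
import math
import statistics


def slope_through_origin(standard, matched):
    """Least-squares slope of matched on standard with the intercept fixed at zero."""
    numerator = sum(x * y for x, y in zip(standard, matched))
    denominator = sum(x * x for x in standard)
    return numerator / denominator


def paired_t(standard, matched):
    """Paired t statistic and degrees of freedom for standard minus matched."""
    diffs = [x - y for x, y in zip(standard, matched)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical matches (cd/m^2), not data from this experiment.
standard = [15.0, 35.0, 55.0, 74.0]
matched = [14.0, 36.0, 52.0, 70.0]
print(slope_through_origin(standard, matched))
print(paired_t(standard, matched))
```

A slope of one corresponds to veridical matching; the t statistic would be compared against the two-tailed critical value for the returned degrees of freedom.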
Another comparison uses difference scores, calculated by subtracting the matched value from the standard value and averaging the differences, giving each observer a difference score. A one sample t test for all observers’ average difference scores showed that these scores could not be distinguished from zero (t(7)=1.73, n.s.).

[Figure 12 plot: Matched Illuminant (cd/m²) against Standard Illuminant (cd/m²), observer MDR.]
Figure 12: All illuminant matches for one observer for experiment 1A. The data points lie roughly along the diagonal, indicating that the illuminant in the match box was approximately equal to the illuminant in the standard box. Matching was veridical.

[Figure 13 plot: Illuminant Slopes by observer (MDR, SIM, MBG, ISH, JAB, BGS, JXK, JSB).]
Figure 13: Illuminant match slopes for all observers for experiment 1A. Slopes are close to one for all observers, indicating that matches were approximately veridical for all observers.

Are surface matches veridical? Reflectance matches are also of interest. The maximum number of data points for each observer was again 32. Figure 14 shows the reflectance matches for one observer. The standard reflectances are plotted on the X-axis and the matched reflectances on the Y-axis. Veridical matching would mean that the data fall along the diagonal. Notice that for this observer, reflectance matching was nearly veridical, and the data lie close to the diagonal.

[Figure 14 plot: Matched Reflectance against Standard Reflectance, observer MDR.]
Figure 14: All reflectance matches for one observer for experiment 1A. The data points lie roughly along the diagonal, indicating that the simulated reflectance in the match box was approximately equal to the simulated reflectance in the standard box. Matching was veridical.

As with the illuminants, the relationship between the reflectances in the standard chamber and the match chamber was plotted, a regression line calculated, and the slope taken for each observer.
Figure 15 shows the reflectance slopes for each observer. The average slope for all 8 observers was 1.02. A paired two-tailed t test showed a significant difference between standard reflectances (mean .24) and matched reflectances (mean .27) for one observer (SIM: t(31) = 4.35, p = .0001). The difference was not significant for any other observer. A one sample t test for all observers’ average difference scores showed that these scores could not be distinguished from zero (t(7)=1.14, n.s.).

[Figure 15 plot: Reflectance Slopes by observer (MDR, SIM, MBG, ISH, JAB, BGS, JXK, JSB).]
Figure 15: Reflectance match slopes for all observers for experiment 1A. Slopes are close to one for all observers, indicating that matches were approximately veridical for all observers.

Does the albedo hypothesis hold for these stimuli? These data are qualitatively consistent with the albedo hypothesis. However, remember that in order to test either the classic or the more general form of the albedo hypothesis, the relationship between the physical luminance in the two chambers (measured at the two test patches after both matches were made) is of interest. Any form of the albedo hypothesis predicts that once the perceived illuminant and the perceived reflectance are matched in the two chambers, the physical luminance will match as well. Luminance was measured directly from the test patches, as described above. Figure 16 shows the luminance data for one observer. Notice that the data lie along the diagonal, on average. Generally, the luminance of the test patch in the match chamber was close to the luminance of the test patch in the standard chamber for this observer. This is consistent with the albedo hypothesis.

[Figure 16 plot: Matched Luminance (cd/m²) against Standard Luminance (cd/m²), observer MDR.]
Figure 16: All surface luminance data for one observer for experiment 1A. The data points lie roughly along the diagonal, indicating that the physical luminance measured at the test patch in the match box was approximately that in the standard box.

[Figure 17 plot: Luminance Slopes by observer (MDR, SIM, MBG, ISH, JAB, BGS, JXK, JSB).]
Figure 17: Surface luminance slopes for all observers for experiment 1A. Slopes are close to one for all observers. Physical luminance at the test patch was approximately the same in the standard chamber and the match chamber after the illuminant match and reflectance match were both made.

Figure 17 shows the luminance slopes for all observers. The average slope for all 8 observers was .97. A paired two-tailed t test showed no significant difference between luminance of the standard test patch and luminance of the match test patch for any observer. A one sample t test for all observers’ average difference scores showed that these scores could not be distinguished from zero (t(7)=1.02, n.s.). This is consistent with the albedo hypothesis. Although one cannot reject the albedo hypothesis with these data, this symmetric matching experiment was not intended as a strong test of the hypothesis.

Are the two chambers the same? In order to test whether there was any physical difference between the two chambers, a two-tailed paired t test was performed between slopes from sessions in which chamber 1 was the standard, and sessions in which chamber 2 was the standard, paired by observer. The averages of the illuminant slopes were .98 and .94 respectively, and a paired two-sample t test showed that they were not significantly different (t(7)= .90, n.s.). The averages of the reflectance slopes were 1.00 and 1.00 respectively, and a paired two-sample t test showed that they were not significantly different (t(7)= .09, n.s.).

Discussion: 1A

One of the most important conclusions from this study is that our observers were able to understand and perform the tasks of illumination matching and surface reflectance matching in this apparatus. All observers found the task reasonable.
The fact that the slope of the illuminant matches was close to 1 suggests that for these very simple conditions, illuminant matching is nearly veridical.

One may doubt that the data truly show veridical illuminant and surface matches. Indeed, there is an unexpected bias that reaches significance for one observer. The slopes of the illuminant matches may have been slightly less than one, on average. In fact, one may not need to explain any bias in order to test the albedo hypothesis. The magnitude of the bias should simply be taken as a baseline against which to compare biases that will be induced in future experiments. Nonetheless, one possible explanation for the bias is the following: if one assumes that there is more variance in matching for higher standard illuminations than lower (consistent with both Weber's law and the data), then the fact that the starting point in the match chamber is chosen randomly (rather than constrained by the illuminant in the standard chamber) might be relevant. The starting illuminant in the match chamber is more likely to be lower than the veridical point for the higher illuminants. It is known that the starting point of the match stimuli can bias the match slightly (Brainard, 1998). Perhaps this is an error that the observer can overcome for lower illuminants, but not for higher illuminants, since lower illuminants are easier to match accurately.

Another practically, if not theoretically, important result is that there is no measurable difference in matching when either chamber is used as the standard. This is important because this method of illuminant matching is new, and it is important to ensure that there are not any artificial biases in the hardware or the software.

This experiment was not designed to be a strong test of the albedo hypothesis, and with these results one cannot reject either the classic or the general form of the albedo hypothesis (e = î * â) or (f(e, î)= â).
These data are consistent with even the strongest form of the hypothesis, i = î and a = â.

Experiment 1B: Symmetric Matches over a Range of Reflectances

Experiment 1B was designed to measure accuracy and reliability of illumination matching in the context of a range of surface colors. The two chambers were identical to each other, but in this experiment, objects and wall reflectances ranged from high to medium to low. The back wall was split vertically in reflectance such that 1/2 of the visible wall was covered with white cardboard and 1/2 with black cardboard. As in experiment 1A, the matches were symmetrical.

Apparatus: 1B

The walls ranged in reflectances from high to medium to low, perceptually white, gray and black. The floor of each chamber was gray. The side wall that was visible to the observer was black and the side wall opposite (out of view of the observer) was white. The visible portion of the back wall was split in half such that the immediate surround of the LCD panel was white and the other half of the back wall was black. Each LCD panel had a light gray gel (Rosco 97). The objects that were painted black were the Java holder, the small Dixie cup, the cup top, and the cylindrical container. The objects that were painted gray were the Mason jar, the toilet paper roll, and the egg carton. The objects that were painted white were the milk carton, the box and the Styrofoam cup. The object positions were mirror images of each other in the two chambers. Figure 18 shows each chamber from the observer’s point of view.

Observers: 1B

Observers were 7 of the 8 observers from experiment 1A. They included two females in their early 20s, one female in her early 30s, and four males in their early 20s. Observers were paid $10 for each session, except for the author.

Procedure: 1B

The procedure for this experiment was identical to that for experiment 1A.
Again, the illumination in the standard chamber was set at the beginning of each trial to one of four different standard starting points, this time 8, 27, 45 and 64 cd/m². The surface reflectance was also set to one of four levels (.28, .41, .54, and .67), and the four illuminant levels were crossed with the four surface levels, such that each of the 16 trials was a unique combination of illuminant level and reflectance.

Figure 18: Stimuli for experiment 1B. There was a range of reflectances in each box, from high reflectance to medium reflectance to low reflectance. Each object was painted with the same paint as the corresponding object in the other chamber. Thus, this experiment involved essentially symmetric matches.

Results: 1B

Are illuminant matches veridical? Again, for each observer, a slope was calculated using data from both sessions. The maximum number of data points for each observer was therefore 32. (There were few rejected matches during this experiment.) Figure 19 shows the illuminant matches for one observer. The standard illuminants were plotted on the X-axis and the matched illuminants on the Y-axis. Notice that for this observer, illuminant matching was nearly veridical, and the data fall along the diagonal.

Figure 20 shows the slopes for each observer. The average slope for all 7 observers was .96. Paired two-tailed t tests revealed a significant difference between standard illuminants (mean 40.01 cd/m²) and matched illuminants (mean 32.98 cd/m²) for SIM (t(31)= 7.09, p = 5.7E-8) but not for any other observers. A one sample t test showed that the average difference scores (standard minus match) for the 7 observers were not different from zero (t(6)=1.73, n.s.).

Were surface matches veridical? The maximum number of data points for each observer was again 32. (There were few rejected matches during this experiment.) Figure 21 shows the reflectance matches for one observer.
Again, for each observer, a slope was calculated by using surface reflectance matches from both sessions. The standard reflectances were plotted on the X-axis and the matched reflectances on the Y-axis. Notice that for this observer, illuminant matching was nearly veridical, and the data fall along the diagonal. Matched Illuminant (cd/m 2 ) 100 80 60 40 20 MDR 0 0 20 40 60 80 Standard Illuminant (cd/m 100 2 ) Figure 19: All illuminant matches for one observer for experiment 1B. The data points lie roughly along the diagonal, indicating that the illuminant in the match box was approximately equal to the illuminant in the standard box. Matching was veridical. 53 Illuminant Slopes 1.2 1 0.8 0.6 0.4 0.2 0 MDR SIM MBG ISH JAB BGS JXK Figure 20: Illuminant match slopes for all observers for experiment 1B. Slopes are close to one for all observers, indicating that matches were approximately veridical for all observers. 54 Matched Reflectance 1 0.8 0.6 0.4 0.2 MDR 0 0 0.2 0.4 0.6 0.8 1 Standard Reflectance Figure 21: All reflectance matches for one observer for experiment 1B. The data points lie roughly along the diagonal, indicating that the simulated reflectance in the match box was approximately equal to the simulated reflectance in the standard box. Matching was veridical. Figure 22 shows the slopes for each observer. The average slope for all 7 observers was .97. Paired two-tailed t tests revealed a significant difference between standard reflectance (mean .345) and matched reflectance (mean .366) for SIM (t(28)= 3.33, p = .002) but not for any other observers. A one sample t test showed that the average difference scores (standard minus match) for the 7 observers were not different from zero (t(6)=8*E-17, n.s.). 55 Reflectance Slopes 1.2 1 0.8 0.6 0.4 0.2 0 MDR SIM MBG ISH JAB BGS JXK Figure 22: Reflectance match slopes for all observers for experiment 1B. 
Slopes are close to one for all observers, indicating that matches were approximately veridical for all observers.

Does the albedo hypothesis hold for these stimuli? To test the albedo hypothesis, the relationship between the luminances of the two test patches is of interest. The albedo hypothesis predicts that the surface luminance in the standard chamber and the surface luminance in the match chamber should be the same after the illuminant match and reflectance match are completed. Figure 23 shows the surface luminance data for one observer. Notice that the data lie along the diagonal, on average. Generally, the luminance of the test patch in the match chamber was close to the luminance of the test patch in the standard chamber for this observer. This is consistent with the albedo hypothesis.

[Figure 23 plot: matched luminance (cd/m2) against standard luminance (cd/m2), observer MDR.]

Figure 23: All surface luminance data for one observer for experiment 1B. The data points lie roughly along the diagonal, indicating that the physical luminance measured at the test patch in the match box was approximately that in the standard box.

[Figure 24 plot: luminance slopes for observers MDR, SIM, MBG, ISH, JAB, BGS, and JXK.]

Figure 24: Surface luminance slopes for all observers for experiment 1B. Slopes are close to one for all observers. Physical luminance at the test patch was approximately the same in the standard chamber and the match chamber after the illuminant match and reflectance match were both made.

Figure 24 shows the slopes for all 7 observers. If the surface luminance slopes are not 1, then the albedo hypothesis is false. The average slope for all 7 observers was .94.
Paired two-tailed t tests revealed a significant difference between standard chamber luminance (mean 11.30 cd/m2) and matched chamber luminance (mean 9.88 cd/m2) for SIM (t(28) = 4.03, p = .0004) and for JXK (mean 12.77 cd/m2 standard compared to mean 11.04 cd/m2 match; t(20) = 2.40, p = .03), but not for any other observer. A one sample t test showed that the average difference scores (standard minus match) for the 7 observers were not different from zero (t(6) = 1.22, n.s.). Although the albedo hypothesis does not hold for two observers, notice that the effect size is quite small (see Figure 24) and should be regarded as a baseline effect size for comparison with future experiments.

Are the two chambers the same? To test whether there was any physical difference between the two chambers, a two-tailed independent samples t test was performed between slopes from sessions in which chamber 1 was the standard and sessions in which chamber 2 was the standard. The illuminant slopes were .95 and .99 respectively, and were not significantly different (t(14) = .61, n.s.). The reflectance slopes were .98 and .97 respectively, and were not significantly different (t(14) = .11, n.s.).

Discussion: 1B

Again, illuminant matching is near veridical in this situation, where the surfaces in the two chambers have the same (mirror reversed) reflectances. Overall, these results do not reject any form of the albedo hypothesis, including the strongest form. Notice, however, that since these were symmetric matches, this experiment was not a strong test of any form of the albedo hypothesis. This experiment could have falsified some form of the albedo hypothesis, but was not expected to. Again, these data suggest that there are no measurable differences between the hardware and software in the two chambers. Henceforth, it will be assumed that the two chambers are interchangeable, and they are not counterbalanced in the rest of the experiments.
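The per-observer summaries used in experiments 1A and 1B (a match slope and a paired t test on standard versus match) can be sketched in a few lines of Python. This is an illustration only, not the analysis code used for the dissertation. The text does not say how the slopes were fit, so the sketch assumes a least-squares line forced through the origin. The function names and the example numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def slope_through_origin(standard, matched):
    """Least-squares slope of matched on standard, line forced through (0, 0).

    Assumption: the dissertation does not state the fitting method.
    """
    numerator = sum(s * m for s, m in zip(standard, matched))
    denominator = sum(s * s for s in standard)
    return numerator / denominator


def paired_t(standard, matched):
    """Paired t statistic and degrees of freedom for (standard minus match)."""
    diffs = [s - m for s, m in zip(standard, matched)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical data: the four 1B standard illuminants (cd/m2) and made-up matches.
standard = [8.0, 27.0, 45.0, 64.0]
matched = [7.5, 27.5, 43.0, 62.0]

print("slope:", slope_through_origin(standard, matched))
print("t, df:", paired_t(standard, matched))
```

A slope near 1 together with a non-significant paired t corresponds to the "approximately veridical" pattern reported for most observers in experiment 1B.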
Experiment 2A: Asymmetric Matching

In experiments 2A and 2B, the constancy of illuminant matching across scenes composed of different surface reflectances was measured. In the standard chamber (light chamber), all surfaces were either high reflectance (perceptually white) or medium reflectance (perceptually gray). In the match chamber (dark chamber), all surfaces were either medium reflectance (perceptually gray) or low reflectance (perceptually black). These surfaces were spatially arranged so that there was an isomorphism between the two chambers. That is, objects that were middle reflectance in the light chamber were low reflectance in the dark chamber, and objects that were high reflectance in the light chamber were middle reflectance in the dark chamber.

Although results from experiments 1A and 1B suggest that observers are able to match illuminants, the design of those experiments still leaves open the possibility that the matches were made based on a low-level cue (e.g. matching the retinal stimulus) rather than on a perception of the illuminant. Because in Experiment 1 the surface reflectances in the two chambers were the same, when the illuminants matched, the retinal images matched as well. Experiments 2A and 2B extend the test to cases where the luminance, and thus the retinal image, differs between the two chambers.

Experiments 2A and 2B were designed to provide a stronger test of the albedo hypothesis than experiments 1A and 1B. Pilot testing suggested that matching across chambers with different reflectance ranges would bias the perception of the illuminant. Thus, it was expected that people would set the illumination in the dark chamber higher than veridical. If these stimuli can induce a bias in illumination judgments, then illuminant color constancy is imperfect. Furthermore, inducing a bias in perceived illumination would make it possible to test whether perceived illumination alone influences matched surface reflectances.
Thus, any form of the albedo hypothesis in which perceived illumination mediates the relationship between physical luminance and perceived surface reflectance can be tested by this experiment.

Apparatus: 2A

In the standard chamber, the floor was covered with gray cardboard, and the back wall was split vertically, such that the cardboard around the test patch was gray, and the other half was white. The side walls were covered with white cardboard. The objects that were painted white were the cup top, the egg carton, the cardboard box, the toilet paper roll, and the Mason jar. The objects that were painted gray were the cylindrical container, the small Dixie cup, the Styrofoam cup, the Java holder, and the milk carton. The gel on the LCD panel was light gray (Rosco 97).

In the match chamber, the floor was covered with black cardboard, and the back wall was split vertically, such that the cardboard around the test patch was black, and the other half was gray. The side walls were covered with gray cardboard. The objects that were painted gray were the cup top, the egg carton, the cardboard box, the toilet paper roll, and the Mason jar. The objects that were painted black were the cylindrical container, the small Dixie cup, the Styrofoam cup, the Java holder, and the milk carton. The gel on the LCD panel in the dark chamber was dark gray (Rosco 98). The gels in the two chambers were different in order to bring the perceptual ranges closer together and allow most of the standard reflectances to be perceptually matched. The object positions were mirror reversed in the two chambers. Figure 25 shows the standard and match chambers under equal illumination from the observer’s point of view.

Figure 25: Stimuli for Experiment 2A under equal illumination. The surfaces in the standard chamber were all high or medium reflectance. The surfaces in the match chamber were all medium or low reflectance. These stimuli were intended to create a bias in illuminant matching.
Observers: 2A

The same 7 observers from the previous experiment served in this experiment.

Procedure: 2A

The procedure for this experiment was the same as for previous experiments. Again, the illumination in the standard chamber was set at the beginning of each trial to one of four standard starting points, this time 8, 19, 30, and 40 cd/m2. The surface reflectance was also set to one of four predetermined levels (.38, .46, .53, and .60). The illuminant levels were low to compensate for the bias so that a perceptual match would be possible. The highest levels of illuminants and lowest reflectances were determined by multiplying the highest possible achromatic level in the match chamber by the slope of the standard versus the match levels in the pilot data. This should have allowed the average observer to make a perceptually satisfying match for any standard, while still taking advantage of the range of possible standards. The four illuminant levels were crossed with the four surface levels, such that each of the 16 trials was a unique combination of illuminant level and reflectance.

Results: 2A

Did observers show illuminant color constancy? Illumination constancy is analogous to surface color constancy, described in the introduction. Perfect constancy would mean that observers see two physically identical illuminants as the same, in spite of changes in the surface reflectances in the scene. Complete lack of constancy would mean that the matched illuminant was entirely determined by the total luminance in the whole scene, rather than by the physical illuminant in the standard chamber.

[Figure 26 plot: matched illuminant (cd/m2) against standard illuminant (cd/m2), observer MBG, with separate symbols for standard reflectances 0.38, 0.46, 0.53, and 0.60.]

Figure 26: All illuminant matches for one observer for experiment 2A. The slope is greater than 1 for this observer.
For any given data point, the illuminant in the match chamber is much higher than the illuminant in the standard chamber. The stimuli in this experiment were designed to induce a bias in illuminant matching.

All data for one observer are shown in Figure 26. The dashed line indicates the diagonal, on which the data would lie if the illuminant in the match chamber was physically the same as the illuminant in the standard chamber at the end of each trial. Notice that the data clearly lie above the line; this observer does not show perfect illuminant color constancy.

[Figure 27 plot: illuminant slopes for observers MDR, SIM, MBG, ISH, JAB, BGS, and JXK.]

Figure 27: Illuminant match slopes for all observers for experiment 2A. The average slope for the illuminant match was 1.84. The fact that the illuminant slopes are not 1 demonstrates a failure of illuminant color constancy, induced by the manipulation of the stimuli in the scene.

Again, the data are summarized for each observer with a slope, derived from the data from all accepted matches in two sessions. The slopes for all 7 observers are shown in Figure 27. The average slope for all observers was 1.84. In other words, on the average trial, an observer would set the illumination in the darker match chamber to 1.84 times the lighter standard chamber illuminant level. The slopes are not 1, on average, so illuminant matching is not always veridical; these observers did not show perfect illuminant color constancy. Paired two-tailed t tests showed that the difference between standard illuminants and matched illuminants was significant for each observer. A one sample t test showed that the average differences between the standard and the matched illuminant for the 7 observers were significantly different from zero (t(6) = 10.16, p < .0001, two-tailed).

It is possible that the average slope of the illuminant matches can be explained by the observers making some physical match.
Observers might be matching the luminance of a medium reflectance object in each chamber. They might be matching the luminance of a medium reflectance object in the standard chamber to that of a low reflectance object in the match chamber, or matching the luminance of a high reflectance object in the standard chamber to that of a medium reflectance object in the match chamber. Or, observers might be matching the total light energy passing through the aperture from each chamber. In fact, the following tests revealed no physically measurable quantity that subjects were matching. That is to say, no physical measurement predicted the observers’ performance. One possibility was that observers were matching the physical luminance of objects of like reflectance. If this had been the case, illumination matches would have fallen along the diagonal, so this possibility can be rejected by data already presented. To formally test this, however, luminance measurements were taken with the illumination in the standard chamber set at the four standard levels used in experiment 2A. The illumination in the match box was set at the corresponding standard times the average slope. Luminance measurements were taken at two corresponding points of medium luminance (the back wall), but the luminance was not matched in the two chambers. The average across the four levels was 7.83 cd/m2 in the standard chamber and 13.55 cd/m2 in the match chamber. This seems reasonable, given that the two walls had the same reflectance and the illumination was higher in the match chamber. The surface in the match chamber must show a higher luminance. More revealing comparisons might be between a medium reflectance surface in the standard chamber and a low reflectance surface in the match chamber. Is it possible that a match of physical luminance of these objects predicts observers’ performance? 
Figure 28 shows the slope that would be predicted if this were the case (11.72) as a solid line. The diagonal representing veridical performance is shown as a broken line, the four dots represent the average matches for the four standard illuminants (average slope 1.84), and the shaded area represents the region between the highest slope (2.16) and the lowest slope (1.53) for the 7 observers. Performance was not predicted by a luminance match between medium reflectance surfaces in the standard chamber and low reflectance surfaces in the match chamber. After setting the illuminant in the standard chamber to each of the four standards and the illuminant in the match chamber to the corresponding average match, the luminance was measured at the milk carton in each chamber. The medium reflectance milk carton in the standard chamber measured 8.38 cd/m2 averaged over the four levels, compared to the low reflectance milk carton in the match chamber, which measured 1.06 cd/m2.

[Figure 28 plot: match illuminant (cd/m2) against standard illuminant (cd/m2).]

Figure 28: The solid line shows where the data are predicted to fall if observers were matching the luminance of the low reflectance objects in the match chamber to the medium reflectance objects in the standard chamber. The broken line shows the prediction if the illuminant matching were veridical. The circles show the observers’ average matches at the four standards, and the shaded area represents the area between the highest and lowest observers’ slopes.

Similarly, one could compare a high reflectance surface in the standard chamber to a medium reflectance surface in the match chamber. Is it possible that a match of physical luminance of these objects predicts observers’ performance? This is what the highest luminance rule would predict (see, e.g., Gilchrist et al., 1999). Figure 29 shows the slope that would be predicted if this were the case, as a solid line with a slope of 4.22.
The diagonal representing veridical performance is a broken line, and four dots represent the average matches for the four standard illuminants. Performance was not predicted by a luminance match between high reflectance surfaces in the standard chamber and medium reflectance surfaces in the match chamber. After setting the illuminant in the standard chamber to each of the four standards and the illuminant in the match chamber to the corresponding average match, the luminance was measured at the toilet paper roll in each chamber. Again, there was no physical match: the luminance of the high reflectance toilet paper roll in the standard chamber was 14.1 cd/m2 on average, compared to the medium reflectance toilet paper roll in the match chamber, which measured 5.97 cd/m2.

Finally, one might ask whether the luminance averaged across the whole scene predicts the matches of the observers. In order to measure the average luminance in each scene at each of the four standard and match illuminations, a digital photograph of the open aperture was taken at each setting. These images were taken with a high quality monochrome CCD camera (Photometrics PXL) with a linear intensity-response function. Three images were taken for each scene measured, one each with 500 nm, 550 nm, and 600 nm interference filters placed in the optical path of the camera. Each image was corrected by subtracting a dark image (taken without opening the shutter) of the same exposure duration. The images were then scaled so that the image data at three chosen locations matched direct luminance measurements of the same three locations. The scaled image data were then averaged over the viewing aperture, but excluding the area of the test patch. This provided an estimate of the average luminance of the scene. Estimates from the three separate monochromatic images were then averaged to produce the final estimate used.
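The image-based estimate of average scene luminance described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software actually used. Images are represented as nested lists of pixel values. The text does not say how the scaling to the three reference locations was computed, so a single least-squares gain per image is assumed here. All function and parameter names are hypothetical.

```python
def dark_subtract(image, dark):
    """Subtract a dark frame (same exposure duration) pixel by pixel."""
    return [[p - d for p, d in zip(row, dark_row)]
            for row, dark_row in zip(image, dark)]


def scale_factor(image, locations, measured):
    """Single gain mapping pixel values to luminance at the reference points.

    Assumption: one least-squares gain is fit to the three locations.
    """
    values = [image[r][c] for r, c in locations]
    return (sum(v * m for v, m in zip(values, measured))
            / sum(v * v for v in values))


def mean_scene_luminance(image, dark, locations, measured, excluded):
    """Average scaled luminance over the aperture, skipping excluded pixels."""
    corrected = dark_subtract(image, dark)
    gain = scale_factor(corrected, locations, measured)
    values = [gain * p
              for r, row in enumerate(corrected)
              for c, p in enumerate(row)
              if (r, c) not in excluded]  # e.g. pixels covering the test patch
    return sum(values) / len(values)


def scene_estimate(filter_images, dark, locations, measured, excluded):
    """Average the per-filter estimates (e.g. 500, 550, and 600 nm images)."""
    estimates = [mean_scene_luminance(img, dark, locations, measured, excluded)
                 for img in filter_images]
    return sum(estimates) / len(estimates)
```

In use, `excluded` would hold the pixel coordinates of the test patch, and `locations` and `measured` would hold the three reference positions and their directly measured luminances.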
Although some error in the estimates is introduced by not measuring the full spectrum of the scene at every pixel, this error should be small as the scenes used were approximately isochromatic. Figure 30 shows what a match of the total scene luminance would predict as a solid line. This line does not predict observers’ data, shown as averages for each of the four matches. Incidentally, the highest possible achromatic setting in the match chamber was close to 70 cd/m2, so using this strategy would have allowed a satisfying match on only the lowest standard illuminant trials. Remember that the standard illuminants were decided upon based on pilot data and were intended to allow the average observer to make a satisfying match.

[Figure 29 plot: match illuminant (cd/m2) against standard illuminant (cd/m2).]

Figure 29: The solid line shows where the data are predicted to fall if observers were matching the luminance of the medium reflectance objects in the match chamber to the high reflectance objects in the standard chamber. The broken line shows the prediction if the illuminant matching were veridical. The circles show the observers’ average matches at the four standards, and the shaded area represents the area between the highest and lowest observers’ slopes.

[Figure 30 plot: match illuminant (cd/m2) against standard illuminant (cd/m2).]

Figure 30: The solid line shows where the data are predicted to fall if observers were matching the total scene luminance of the match chamber to total scene luminance of the standard chamber. The broken line shows the prediction if the illuminant matching were veridical. The circles show the observers’ average matches at the four standards, and the shaded area represents the area between the highest and lowest observers’ slopes.

Did observers show surface color constancy? All data for one observer are shown in Figure 31.
The dashed line indicates the diagonal, on which the data would lie if the simulated surface reflectance in the match chamber was the same as the simulated surface reflectance in the standard chamber at the end of each trial. Notice that the data clearly lie below the line; this observer does not show surface color constancy.

[Figure 31 plot: matched reflectance against standard reflectance, observer MBG, with separate symbols for standard illuminants of 8, 19, 30, and 40 cd/m2.]

Figure 31: All reflectance matches for one observer for experiment 2A. The slope is less than 1 for this observer. For any given data point, the surface reflectance in the match chamber is lower than the surface reflectance in the standard chamber.

The reflectance match slopes were again calculated using all accepted matches. The maximum number of data points for any observer is the number of illuminant matches accepted, although the number could be lower if an observer accepted an illuminant match but rejected the surface match in a given trial. All reflectance match slopes for the 7 observers are shown in Figure 32. The average slope for all observers was .39. In other words, on the average trial, an observer would set the reflectance in the match chamber to less than half the reflectance of the standard chamber test patch. The slopes are not 1, indicating that surface matching is not veridical; these observers did not show perfect surface color constancy. The difference between standard reflectances and matched reflectances was significant for each observer. A one sample t test showed that the average differences between the standard and the match reflectance for the 7 observers were significantly different from zero (t(6) = 24.88, p < .0001, two-tailed).

[Figure 32 plot: reflectance slopes for observers MDR, SIM, MBG, ISH, JAB, BGS, and JXK.]

Figure 32: Reflectance match slopes for all observers for experiment 2A.
For this experiment the average slope for the surface reflectance matches was .38 for 7 observers. The fact that these slopes were not 1 demonstrates a failure of surface color constancy. Setting the surface reflectance low is what you might expect given the albedo hypothesis, but see below for a quantitative test.

As with the illumination matches, we can ask whether the surface matches were determined by any obvious physical measurement. The surface reflectance in the match chamber might vary, for example, with the highest luminance surface in the scene, the luminance of a medium reflectance surface, the lowest luminance surface in the scene, or with the average luminance coming from the scene.

First, there is the possibility that the surface reflectance was determined by the highest luminance surface in each scene. In order to compare this hypothesis with the data, four measurements were taken of the highest luminance surface in each box with the illuminant in the standard chamber set at the four standards and the illuminant in the match chamber set at the average match (that is, the standard times the average slope). Then a slope was calculated representing the relationship between the highest luminance surface in the standard chamber and the highest luminance surface in the match chamber. This slope was .42. One can compare this to the surface match slopes for this experiment, which ranged between .30 and .44. The predicted slope does not fall outside the range of data for these observers, so one cannot reject the hypothesis that surface reflectance in the match chamber is determined by the highest luminance surface. Figure 33 illustrates these relationships; the solid line represents the predicted slope, the broken line represents veridical matching, and the dots represent average data for the four standard reflectances.

Second, consider the possibility that the surface reflectance matches were determined by the lowest luminance in the scene.
As above, four measurements were taken of the lowest luminance surface in each box with the illuminant in the standard chamber set at the four standards and the illuminant in the match chamber set at the average match. Then a slope was calculated representing the relationship between the lowest luminance surface in the standard chamber and the lowest luminance surface in the match chamber. This slope was .128. One can compare this to the surface match slopes for this experiment, which ranged between .30 and .44. The hypothetical slope falls outside the range of data for these observers, so one can reject the hypothesis that surface reflectance in the match chamber is determined by the lowest luminance surface. Figure 34 illustrates these relationships; the solid line represents the predicted slope, the broken line represents veridical matching, and the dots represent average data for the four standard reflectances.

Third, consider the possibility that the surface matches are determined by the relationship between the luminances of the same middle reflectance surface in the two chambers once the illumination match has been made. This possibility seems unlikely, since they would have different luminances after the illuminant match, but it was formally tested in the same way as described above. The slope between a medium reflectance surface under the standard illuminant and the corresponding medium reflectance surface under the matched illuminant was 1.74. (Note that the slope used here was based on actual measurements of these surfaces, rather than the nominal slope.) This is out of range of the actual data, as illustrated in Figure 35.

[Figure 33 plot: match reflectance against standard reflectance.]

Figure 33: The solid line shows where the data are predicted to fall if the highest luminance surface in the scene determined surface matching. The broken line shows the prediction if the reflectance matching were veridical.
The circles show the observers’ average matches at the four standards, and the shaded area represents the area between the highest and lowest observers’ slopes. Notice that the predicted slope falls within the range of actual data.

[Figure 34 plot: match reflectance against standard reflectance.]

Figure 34: The solid line shows where the data are predicted to fall if the lowest luminance surface in the scene determined surface matching. The broken line shows the prediction if the reflectance matching were veridical. The circles show the observers’ average matches at the four standards, and the shaded area represents the area between the highest and lowest observers’ slopes. The prediction does not fall within the range of actual data.

[Figure 35 plot: match reflectance against standard reflectance.]

Figure 35: The solid line shows where the data are predicted to fall if the luminance of two middle reflectance surfaces in the two chambers determined surface matching. The broken line shows the prediction if the reflectance matching were veridical. The circles show the observers’ average matches at the four standards, and the shaded area represents the area between the highest and lowest observers’ slopes. The prediction does not fall within the range of actual data.

Finally, one could consider the possibility that the reflectance of the matched surface is determined by the average luminance in the whole scene. The average luminance was calculated as described above in the discussion of illumination constancy. The slope between the total luminance in the standard chamber under standard illuminant and the total luminance in the matched chamber under the matched illuminant was .44. One can compare this to the surface match slopes for this experiment, which ranged between .30 and .44.
The hypothetical slope does not fall outside the range of data for these observers, so one cannot reject the hypothesis that surface reflectance in the match chamber is determined by the average luminance in the entire scene. Figure 36 illustrates these relationships.

[Figure 36 plot: match reflectance against standard reflectance.]

Figure 36: The solid line shows where the data are predicted to fall if the average luminance in the entire scene determined surface matching. The broken line shows the prediction if the reflectance matching were veridical. The circles show the observers’ average matches at the four standards, and the shaded area represents the area between the highest and lowest observers’ slopes. Notice that the predicted slope falls within the range of actual data.

Does the albedo hypothesis hold for these stimuli? Finally, remember that the critical test of the albedo hypothesis is whether the measured physical luminances in the two chambers match after both the illuminant and the surface reflectance are perceptually matched. Even the most general form of the albedo hypothesis predicts a slope of one. All luminance data for one observer are shown in Figure 37. On the average trial, once the illuminant and the surface reflectance were perceptually matched in the two chambers, the luminance was not equal; it was lower in the match chamber. The dashed line indicates the diagonal, on which the data would lie if the physical luminance measured at the test patch in the match chamber was the same as the luminance of the test patch in the standard chamber at the end of each trial. Notice that most of the data clearly lie below the line. A paired two-tailed t test revealed a significant difference between the standard and matched luminance levels for this observer (t(30) = 4.31, p = .00016). The data for this observer falsify the albedo hypothesis. All luminance slopes for the 7 observers are shown in Figure 38.
The average slope for all observers was .67. For 6 out of 7 of the observers, a paired two-tailed t test showed that the measured luminance in the match chamber was significantly different from the measured luminance in the standard chamber (see Table 1 for t tests). A one sample t test revealed that the average difference between the match and standard luminance for each observer was significantly different from zero (t(6) = 4.82, p = .003, two-tailed). Since the relevant question is whether the luminance data fall on the diagonal (i.e. whether the slope between the luminance of the two test patches is 1), all luminance data for all observers are plotted in Figure 39 for easy reference.

Observer    T value, paired t test for difference     p value, two-tailed
            between standard and matched luminance    paired t test
MDR         t(31) = 9.24                              2.01498E-10
SIM         t(31) = 9.36                              1.48565E-10
MBG         t(31) = 4.30                              0.00016125
ISH         t(27) = 5.09                              2.39029E-05
JAB         t(25) = 6.76                              6.8041E-07
BGS         t(20) = 1.18                              0.250075254 (n.s.)
JXK         t(18) = 3.86                              0.001134842

Table 1: Experiment 2A table of surface luminances

[Figure 37 plot: matched luminance (cd/m2) against standard luminance (cd/m2), observer MBG.]

Figure 37: All surface luminance data for one observer for experiment 2A. The quantitative test of the albedo hypothesis is the relationship between the luminance in the standard chamber and the luminance in the match chamber. Once the observer has perceptually matched the illuminants and the surface reflectances in the two chambers, even if each is non-veridical, the albedo hypothesis predicts that the physical luminance will match. Notice that for this one observer, the slope of the regression line is less than one.

[Figure 38 plot: luminance slopes for observers MDR, SIM, MBG, ISH, JAB, BGS, and JXK.]

Figure 38: Surface luminance slopes for all observers for experiment 2A. For this experiment the average slope was .68. The expected value for each of the slopes would be 1, if the albedo hypothesis were true.
These data falsify the albedo hypothesis.

[Figure 39: plot of Matched Luminance (cd/m²) against Standard Luminance (cd/m²), both 0 to 25.]

Figure 39: These are all of the accepted matches from all observers for experiment 2A. Compare the data with the dashed line, which represents the prediction of the albedo hypothesis. These data differ from the prediction, and thus falsify the albedo hypothesis.

Discussion: 2A

The most striking and important result from this experiment is that the albedo hypothesis, even in its most general form, is false. The classic form (e = î * â) and the general form (f(e, î) = â) of the albedo hypothesis both predict that the surface luminance (e) must match after the illuminant and the surface reflectance have been perceptually matched. Notice that according to the general form of the albedo hypothesis, the function that calculates the perceived reflectance takes as input the perceived illuminant and the actual physical luminance from the test patch. If the perceived illuminant were the only variable mediating the relationship between the physical luminance and the perceived reflectance, then once the perceived reflectance and the perceived illuminant were both matched, the physical luminance would have to match. It does not. Therefore, even the most general form of the albedo hypothesis is false. Perceived illuminant does not uniquely determine perceived surface reflectance for a given luminance level. These results also falsify the hypothesis that i = î and that a = â. If this were true, the physically measured illuminants would have to match in the standard and the match chambers, and the physically measured surface reflectances would have to match in the standard and the match chambers once the perceptual matches were made. This design, with the "light" chamber as the standard and the "dark" chamber as the match chamber, induced a bias in the illuminant matches.
If the illuminant matching were veridical, the illuminant matching slope for the average observer would be 1, yet it was 1.84. This falsifies the hypothesis that i = î. These observers did not show illuminant color constancy; the difference between standard illuminants and matched illuminants was significant for each observer. The illuminant match data are in between veridical matches and true luminance matches for the whole scene, so observers show neither perfect illuminant color constancy nor a complete lack of it. Notice that one could not, in fact, make the dark chamber look exactly like the light chamber by adjusting the illumination, because of inter-reflections. The difference in surface inter-reflectance within the scene, as well as a difference in ratios between the paints in the dark chamber and the paints in the light chamber, made it impossible to make an exact retinal match. The reflectance matches were not veridical either, and the average slope for reflectance matches was not 1. This falsifies the hypothesis that a = â. These observers did not show perfect surface color constancy; the difference between standard reflectances and matched reflectances was significant for each observer.

Experiment 2B: Asymmetric Matching with Change in Surround

Results from experiment 2A suggest that the albedo hypothesis fails for at least the conditions created for that experiment. Experiment 2B is designed to further challenge the most general form of the albedo hypothesis by testing whether reflectance matches can be manipulated while illuminant matches stay constant. Consider the general form of the hypothesis: f(e, î) = â. If there are experimental conditions which affect perceived surface reflectance (â) without affecting perceived illumination (î), then it cannot be the case that perceived illumination uniquely determines perceived surface reflectance for a given proximal stimulus.
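The inference behind the luminance test can be written out explicitly. The following is a sketch rather than a derivation taken from the text: the subscripts s and m (standard and match chamber) are notation introduced here, and the final step assumes that f is one-to-one in e for a fixed î, an assumption the text does not state but which the argument needs.

```latex
\begin{align*}
\hat{a}_s &= f(e_s, \hat{\imath}_s), \qquad \hat{a}_m = f(e_m, \hat{\imath}_m) \\
\hat{\imath}_s = \hat{\imath}_m,\ \hat{a}_s = \hat{a}_m
  &\implies f(e_s, \hat{\imath}_s) = f(e_m, \hat{\imath}_s) \\
  &\implies e_s = e_m
  \quad \text{(if $f$ is one-to-one in $e$ for fixed $\hat{\imath}$)}
\end{align*}
```

In the classic form, e = î * â, the same conclusion follows directly, since equal î and equal â give equal products. Read in the other direction, the general form also implies that a fixed pair (e, î) yields a single â, which is why a manipulation that changes â while leaving î unchanged counts against it.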
In experiment 2B, the surfaces are nearly all the same as in experiment 2A, and the average reflectance in the view is approximately the same, so the illuminant matches are expected to be about the same. The difference between this and the previous experiment is that the two halves of the back wall have switched positions in the match chamber. The standard chamber was identical to that in experiment 2A. Thus, in experiment 2B, the immediate surround of the test patch is the middle reflectance (gray) cardboard in each chamber. With this design, the albedo hypothesis can be further challenged. If it is possible to manipulate the luminance slope (e.g. by changing the reflectance of the immediate surround), this would falsify the albedo hypothesis. According to the hypothesis, the slope should be 1. Matched illumination is likely to be unaffected by the manipulation, since mean reflectance of all surfaces is unchanged. Any manipulation that has an effect on surface color matching but little or no effect on illumination matching would disprove the albedo hypothesis.

Apparatus: 2B

The apparatus here was identical to that of experiment 2A, with one exception. The two cardboard halves of the back wall were reversed in the match chamber only. In the match chamber, half of the visible wall was low reflectance (black) and the other half medium reflectance (gray), with the immediate surround of the test patch being medium reflectance. See Figure 40 for a view of the chambers.

Observers: 2B

Observers were the same 7 observers from experiment 2A.

Procedure: 2B

The procedure was identical to the previous experiment. The illuminant levels in the standard chamber were again 8, 19, 30 and 40 cd/m². The reflectance levels of the test surface in the standard chamber were .17, .31, .46, and .60.
As before, the highest levels of illuminants and lowest reflectances were determined by multiplying the possible achromatic levels in the match chamber by the slope of the standard versus the match levels in the pilot data. Again, the four illuminant levels were crossed with the four surface levels, such that each of the 16 trials was a unique combination of illuminant level and reflectance.

Figure 40: Standard and Match chamber for experiment 2B. The Standard chamber, shown on the left, was identical to that in the previous experiment. In the match chamber, shown on the right, the only difference was in the placement of the cardboard on the back wall: now the medium reflectance rather than the low reflectance cardboard immediately surrounded the test patch.

Results: 2B

Were illuminant matches different from experiment 2A?

All illuminant match slopes for the 7 observers are shown in Figure 41. The average slope for all observers was 1.86, compared to 1.84 for experiment 2A. As in experiment 2A, observers did not show illuminant color constancy. A paired two-tailed t test revealed that the slopes for the seven observers were not significantly different from the slopes from experiment 2A (t(6)= .146, n.s.). There was no significant difference in difference scores across the two experiments, either (t(6)= .902, n.s.). Thus, there was no significant difference between illuminant matching slopes in experiment 2A and experiment 2B.

Were surface reflectance matches different from experiment 2A?

All surface reflectance match slopes for the 7 observers are shown in Figure 42. The average slope for all observers was .55, compared to .38 for experiment 2A. As in experiment 2A, observers did not show perfect surface color constancy. However, a paired two-tailed t test revealed that the slopes for the seven observers were significantly different from the slopes from experiment 2A (t(6)= 6.79, p=.0005).
There was a difference between surface reflectance matching slopes in experiment 2A and experiment 2B. Average difference scores were also different across the two experiments (t(6)= 7.52, p=.0003).

[Figure 41: bar chart of Illuminant Slopes (0 to 2.5) for observers MDR, SIM, MBG, ISH, JAB, BGS, JXK.]

Figure 41: The black bars show illuminant match slopes for all observers for experiment 2B. The average slope was 1.86, compared to 1.84 in experiment 2A, represented by gray bars. Changing the immediate surround of the test patch did not affect the illuminant matches.

[Figure 42: bar chart of Reflectance Slopes (0 to 0.8) for observers MDR, SIM, MBG, ISH, JAB, BGS, JXK.]

Figure 42: The black bars show reflectance match slopes for all observers for experiment 2B. The average slope was .55, compared to .38 in the previous experiment, represented by gray bars. Changing the immediate surround of the test patch did have an effect on the surface reflectance matches. Since perceived illumination and perceived surface reflectance can be manipulated independently, perceived illumination cannot uniquely determine perceived surface reflectance.

Were surface luminance slopes different from experiment 2A?

The interesting question addressed by this experiment is whether one can manipulate the surface luminance slope simply by changing the immediate surround of the test patch. Thus, it is important to note that the relationship between the surface luminance measured at the test patch in the two chambers in experiment 2B was different from that relationship in experiment 2A. The average slope for all observers was 1.03 in experiment 2B, compared to .68 in the previous experiment, as illustrated by Figure 43.

[Figure 43: bar chart of Surface Luminance Slopes (0 to 1.2) for observers MDR, SIM, MBG, ISH, JAB, BGS, JXK.]

Figure 43: The black bars show luminance slopes for all observers for experiment 2B. The average slope was 1.03, compared to .69 in the previous experiment, represented by gray bars.
Changing the immediate surround of the test patch had an effect on the luminance slopes, whereas the albedo hypothesis predicts a constant.

A paired two-tailed t test revealed that the surface luminance slopes were different between these two experiments (t(6)= 5.87, p=.001). Average difference scores were also different across the two experiments (t(6)= 5.81, p=.001). This difference is best illustrated by Figure 44, which shows all of the luminance data from these two experiments, plotted in different colors.

[Figure 44: plot of Matched Luminance (cd/m²) against Standard Luminance (cd/m²), both 0 to 25.]

Figure 44: These are all of the accepted matches from all observers for experiment 2A and experiment 2B. The albedo hypothesis predicts that after illuminant and surface matches, the slope of the relationship between luminance in the standard chamber and that in the match chamber should be a constant, and should be 1. These experiments show that the slope is not always 1, and is not even a constant; it can be experimentally manipulated.

Discussion: 2B

The novel conclusion that can be drawn from the above experiment is that changing the reflectance of the surface surrounding the test patch can change the slope of the measured luminance while the perceived illuminant and perceived reflectance match. Remember that the prediction of the most general form of the albedo hypothesis was that after the two perceptual matches, the luminance slope should be 1. Data from experiment 2A suggested that in general the slope is not always 1, and data from these two experiments suggest that it is not even a constant. The difference in results between experiment 2A and experiment 2B suggests that it is possible to manipulate the relationship between luminance, perceived illumination, and perceived reflectance.
With this information, it is theoretically possible to create conditions in which the albedo hypothesis in its general form holds true, and it is likewise possible to create conditions in which the relationship is strongly violated. Other conclusions from this experiment are in agreement with conclusions from experiment 2A. Again, the strongest form of the albedo hypothesis (i = î and a = â) is falsified by these results, since neither the illuminant matching slopes nor the surface matching slopes were 1. In addition, the classic form of the albedo hypothesis, e = î * â, and the weaker form of the albedo hypothesis, f(e, î) = â, are falsified by these results. If perceived illuminant and perceived surface lightness can be manipulated independently, then it cannot be the case that perceived illumination uniquely determines the relationship between physical luminance and perceived surface lightness. Are observers better at illuminant matching or surface reflectance matching? One question that can be addressed by this matching paradigm is whether observers perform more consistently during illuminant matching or surface reflectance matching. The slopes of the regression line, discussed above, give a quantitative estimate of the veridicality of the matches. Likewise, the standard deviations can give us an estimate of the consistency of the matches. Each observer completed two sessions for each of the above experiments. Each trial in each experiment was unique, but one can compare the two corresponding trials in the two sessions. The standard deviation of the two like trials across the two sessions for a given observer gives an estimate of consistency. The proper comparison is between the illuminant measurements and surface luminance measurements, since both are measured in candelas per meter squared. 
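The per-pair consistency estimate described above can be sketched in a few lines. The values and function names below are hypothetical, and the sketch assumes the sample standard deviation (n - 1 in the denominator), which the text does not specify. It also fits a least-squares slope of the pair standard deviation against the pair mean, one simple way of relating spread to the magnitude of the measurement.

```python
from statistics import mean, stdev

def pair_stats(session1, session2):
    """For each pair of corresponding trials, return (pair mean, pair SD)."""
    return [(mean(pair), stdev(pair)) for pair in zip(session1, session2)]

def least_squares_slope(x, y):
    """Slope of the least-squares line of y on x (intercept included)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical matches in cd/m^2 for the same four trials in two sessions
# (illustrative values only, not data from these experiments).
session1 = [10.0, 22.0, 41.0, 60.0]
session2 = [12.0, 19.0, 46.0, 52.0]

stats = pair_stats(session1, session2)
pair_means = [m for m, _ in stats]
pair_sds = [s for _, s in stats]
sd_slope = least_squares_slope(pair_means, pair_sds)
print(f"SD-versus-mean slope: {sd_slope:.3f}")
```

A smaller slope indicates that variability grows more slowly with the size of the match, so the slope can be compared across measurement types whose absolute magnitudes differ.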
However, the average illuminant measurements are on the order of 4 times greater than surface luminance measurements, so Weber’s law suggests that a direct comparison of the standard deviations would likely lead to the conclusion that illuminant matching was more variable, even if that were not true. A better estimate of the consistency of the matches would be the slopes of the best fitting line fitted to the standard deviation plotted against mean illuminant or mean luminance (in candelas per meter squared), for each trial pair. Notice, however, that the measures of surface luminance include differences in illumination matching as a source of error. In order to correct for this confound and get a better estimate of the variance in reflectance matching alone, the luminance of the two trials in a pair was calculated by multiplying the reflectance by the average illumination match set in that pair of trials. This removes the effect of variability in illuminant matching from the surface matching standard deviations. See Table 2 for these data for each of the four experiments described above. The aggregate slope of the regression line when the (two trial) illuminant mean was plotted against the (two trial) standard deviation for illuminant matches was .118 and for surface matches was .151. These differences are not big enough to conclude that human observers are measurably better at either illuminant matching or surface lightness matching. Figure 45 shows the matched measurements in cd/m² on the x-axis and the standard deviation on the y-axis; data from all four experiments are combined, and surface data and illuminant data are plotted in different colors for comparison.

Experiment    Standard Deviation v. Illuminant Slope    Standard Deviation v. Reflectance Slope
1A            .0785                                     .0522
1B            .1300                                     .1869
2A            .1129                                     .2016
2B            .1631                                     .2277

Table 2: Standard deviation slopes for illuminant and surface reflectance matches calculated by taking the slope of the regression line when mean matches were plotted on the x-axis and standard deviations plotted on the y-axis, and correcting for variance in illumination matches, as described in the text.

This analysis was of interest since there has been very little experimental work done on illuminant perception or illuminant matching, compared to the work done on surface color perception and matching. Although any difference found was small enough to be inconclusive, it does not seem that illuminant matching is less consistent.

Appendix 1 shows two pairs of control experiments that were conducted subsequently. These experiments ensure that the results are the same when observers are instructed to match the illuminant just at the point of the test patch, and when observers adjust the illuminant and the surface reflectance simultaneously. The results and conclusions from these replications are consistent with experiments 2A and 2B (see Appendix 1).

[Figure 45: plot of Standard Deviation (cd/m²) against Matched Luminance (cd/m²); legend: Illumination, Surface.]

Figure 45: Standard deviation of matched trials plotted against the mean match for the same trial. Illuminant matches are plotted in green, and surface matches plotted in red. The difference between the two is not big, but illuminant matching is at least no less consistent than surface matching.

Chapter 3
General Discussion

The primary purpose of this project was to test the albedo hypothesis, in its classic and more general forms. Other important aims were to provide one of the first tests of observers’ ability to match illuminations, to compare consistency in illumination matching to that in surface color matching, and to explore illumination color constancy.
The main conclusions that can be drawn from the results of this project are as follows: The albedo hypothesis does not hold, even in its most general form. Perceived illumination does not uniquely determine perceived surface lightness for a given luminance. People are able to make illumination matches. These matches are (nearly) veridical in the case of symmetric matching, but can be biased by manipulating the stimuli. People do not always show illumination color constancy. These data do not show conclusively whether people are more consistent when matching illuminants or surface colors. The conclusion that the albedo hypothesis is false rests on results from experiments 2A and 2B, and on the replications of these experiments. In experiment 2A, once the perceived illuminant and the perceived surface reflectance were matched, the physical luminance was not equal when measured at the two test patches. Thus, the perceived illuminant cannot uniquely determine the perceived reflectance for a given luminance, as the albedo hypothesis suggests. Results from experiment 2B suggest that it is possible to manipulate surface reflectance matches without affecting illuminant matches. Again, this would not be possible if there were a consistent relationship between the three variables across contexts. The question of whether the albedo hypothesis is correct is of broad theoretical interest. Many current models of color perception and color constancy rely on the assumption that we can understand surface color perception as driven entirely by the visual system's estimate of the illuminant. The hypothesis was important to test, because if it had held, then one could have usefully linked human visual performance to physics-based computational models of vision (e.g. Landy and Movshon, 1991; Gilchrist & Jacobsen, 1984; Knill and Richards, 1996). These data suggest that a different approach is required. In the introduction, two alternatives to the albedo hypothesis were discussed. 
The first was that perceived surface reflectance is a function of perceived illuminant and physical luminance, but there are other factors that influence perceived reflectance. The second alternative is that there is no relationship between perceived illumination and perceived surface reflectance. These experiments were not designed to test these two alternatives, and the current data do not require favoring one alternative over the other. Certainly, these data do not reject the idea that perceived illumination influences perceived reflectance. Notice that in experiment 2A, the illuminant matches are higher and the surface reflectance matches are lower, which is qualitatively consistent with there being some compensation. Furthermore, the literature reviewed in the introduction (see especially Gilchrist, 1988) suggests a relationship between these two percepts. One cannot reject the idea that î influences â. Although the current study rules out the possibility that it is the only factor that influences perceived reflectance, perceived illumination may be one factor. Other factors that may influence perceived surface reflectance apparently have to do with the immediate surround. Experiment 2B (see also 3B and 4B in Appendix 1) shows that changing the immediate surround changes the relationship between the perceived illuminant and the physical luminance. It may be that the perceived reflectance of the immediate surround or the ratio of the luminance between the immediate surround and the test patch are taken as input by the reflectance calculation function. Surface color constancy was imperfect in these experiments. In some cases, surface reflectance in the match chamber was set to less than half what it was in the standard chamber. This seems striking in light of the large literature on the human visual system’s high degree of color constancy. Notice, however, that human color constancy can be challenged in contrived situations.
The Gelb effect, for example (see also Gilchrist, 1988; Logvinenko & Menshikova, 1994), shows an experimental situation in which color constancy fails. It is also possible to make color constancy based optical illusions that rely on simultaneous contrast. The fact that color constancy failed in the above experiments is not new or unique; what is interesting is the stimulus correlates of its failure. The standard deviation was used in this study as an estimate of the consistency of matches. Just comparing standard deviations across trial type would have made it appear that illuminant matches were noisier, since the absolute values of these measurements were higher than surface luminance measurements. The standard deviation was plotted against the matched illuminant or surface luminance, so that the magnitude of the mean measurement (in cd/m²) would be taken into account. There was no remarkable difference in consistency between illuminant and surface matches. However, for each experiment, the illuminant matches are (qualitatively) more consistent than surface matches, so one can conclude that illuminant matching is no less consistent than surface reflectance matching. This is interesting, since illuminant perception has not been widely studied in the past. Illuminant perception and illuminant matching are reasonable tasks for human observers. If observers were better at illumination matching than surface matching, this would be consistent with Katz’s suggestion that an observer’s impression of illumination is stronger than the impression of surface colors (Katz, 1935). It would be inconsistent with Gilchrist and Jacobsen’s (1984) finding that within-observer differences in color judgments were small compared to the differences in illumination judgments. The albedo hypothesis, as described in the literature, is very clearly a hypothesis about a causal relationship.
The model suggests that the illuminant is estimated first, and then the surface reflectance is calculated based on this estimate. Although this causal relationship was what these experiments were designed to test, notice that it is possible to eliminate a “correspondence” relationship as well. Even models that propose that the causal relationship goes the other way (i.e. the perceived surface reflectance determines the perceived illuminant) or that the two percepts mutually influence each other in a consistent way are falsified by these results. Experiment 2A (and 3A and 4A, see Appendix 1) shows that î, â and e do not have any consistent relationship, and in experiment 4A this was tested without relying on assumptions about the causal relationship between the factors. Perhaps a few caveats are in order while considering the conclusions one can draw from this project. First, one may be tempted to draw conclusions from these data about the precise nature of the perceived variables, î and â. In fact, it would be impossible, based on this study, to make any strong claims about these representations. It could be the case that any relationship that one could calculate between the matched reflectance and the matched illuminant captures the true perceived lightness (or brightness) only after a non-linear output transformation (see Foley, 1977; Philbeck & Loomis, 1997). It is still possible that there are inaccessible variables, î and â, that have a non-linear relationship to the matched illuminant and matched surface lightness that the observers produced. Nonetheless, with this matching paradigm it is possible to falsify certain hypotheses about î and â, and to test the albedo hypothesis in general. Specifically, Experiments 2 through 4 falsify the hypotheses that i = î and that a = â, and show that perceived illumination does not uniquely determine the relationship between luminance and perceived albedo. 
This matching paradigm is the best current methodology for testing the albedo hypothesis. A second question to consider when drawing conclusions from these data is: Is this really about constancy? One of the major themes of this work (indeed, the title) is the role of illumination perception in color constancy. However, the phenomena observed here may or may not be about color constancy. Color constancy, or any perceptual constancy, deals with the invariant relationship between a real world attribute and a percept representing that attribute. Thus, one might suggest that the albedo hypothesis is not necessarily about a perceptual constancy, since if the perceived illumination is wrong, then the perceived surface color will be wrong as well. Still, the albedo hypothesis is about constancy in the sense that to the extent that people are color constant, the perception of the illuminant is supposed to play a role in producing that constancy. The fact that a non-veridical percept at one stage of the process predicts a non-veridicality in a later stage does not mean that these models and hypotheses may not play a role in real world perception. In the real world perceived reflectance does have a fairly consistent and regular relationship to actual physical reflectance.

Conclusion

According to the albedo hypothesis, surface color perception is accomplished when the visual system first estimates the illuminant of a scene and then uses this estimate to determine the color of a particular surface given the luminance reflected from the surface. Results from these matching experiments falsify this hypothesis. On each trial, observers matched the illumination in the scene and the color of the test patch. The albedo hypothesis predicts that when both the illuminant and the color of the test patch in the two scenes appear the same, the physical luminance of the two test patches will be the same. It was not, at least with stimuli that induced a bias in illumination matching.
Manipulating the immediate surround of the test patch affected the matched surface lightness without affecting matched illuminant, which also rules out the possibility that the perceived illuminant is the only variable that governs the relation between physical luminance and perceived surface reflectance.

References

Adelson, E.H. & Pentland, A.P. (1991). The perception of shading and reflectance. In Blum, B. (ed.) Channels in the visual nervous system: Neurophysiology, psychophysics and models. London, England UK: Freund Publishing House, Ltd.
Arend, L. E., & Reeves, A. (1986). Simultaneous color constancy. Journal of the Optical Society of America A, 3, 1743-1751.
Beck, J. (1959). Stimulus Correlates for the judged illumination of a surface. Journal of Experimental Psychology, 58 (4) 267-274.
Beck, J. (1961). Judgments of surface illumination and lightness. Journal of Experimental Psychology, 61: 368-373.
Beck, J. (1971). Surface lightness cues for the illumination. American Journal of Psychology, 84, 1-11.
Beck, J. (1972). Surface Color Perception. Cornell University Press.
Brainard D. H., Brunt W. A., Speigle J. M. (1997). Color constancy in the nearly natural image. 1. asymmetric matches. Journal of the Optical Society of America A Vol 14, 2091-2110.
Brainard D. H. (1998). Color constancy in the nearly natural image. 2. achromatic loci. Journal of the Optical Society of America A Vol 15, 307-325.
Brainard D. H. & Freeman, W. T. (1997). Bayesian color constancy. Journal of the Optical Society of America A, 14 (7): 1393-1411.
Brainard, D. H., & Wandell, B. A. (1992). Asymmetric color-matching: how color appearance depends on the illuminant. Journal of the Optical Society of America A, 9(9), 1433-1448.
Brunswik, E. (1933). Die Zugänglichkeit von Gegenständen für die Wahrnehmung. Arch. Fur der Ges. Psychologie., 88: 377-418.
Buchsbaum, G. (1980). A spatial processor model for object colour perception. Journal of the Franklin Institute, 310: 1.
Burzlaff, W. (1931). Methodologische Beiträge zum Problem der Farbenkonstanz. Z. Ps, 119, 177-235.
D’Zmura, M. (1992) Color constancy: surface color from changing illumination. Journal of the Optical Society of America A Vol 9, No. 3. 490-492.
Epstein, W. (1973). The process of "taking-into-account" in visual perception. Perception. 2(3): 267-285.
Evans, R.M. (1948). An Introduction to Color. New York: John Wiley & Sons, Inc.
Foley, J.M. (1977). Effect of distance information and range of two indices of visually perceived space, Perception, 6 449-460.
Gelb, A. (1929). Die Farbenkonstanz der Sehdinge. Handbuch der normalen und pathologischen Physiologie, 12: 594-678.
Gilchrist, A.L. (1988). Lightness contrast and failures of constancy: A common explanation. Perception & Psychophysics. Vol 43(5), 415-424.
Gilchrist, A. & Jacobsen, A. (1984). Perception of lightness and illumination in a world of one reflectance. Perception, 13(1): 5-19.
Gilchrist, A., Kossyfidis, C., Bonato, F., Agostini, T., Cataliotti, J., Li, X., Spehar, B., Szura, J., Annan, V., & Economou, E. (1999). An anchoring theory of lightness perception. Psychological Review, 106 (n4):795-834.
Helmholtz, H. von. (1962/1866). Helmholtz’s treatise on physiological optics. Edited by J.P. Southall, translated from the 3rd German edition, vol.2. New York: Dover.
Hering (1907/1920). Grundzüge der Lehre vom Lichtsinn. 1st ed, Leip., Englemann; 2nd ed Berlin, Springer.
Jameson D (1985). Opponent-color theory in light of physiological finding. In D. Ottoson & S. Zeki (Eds.), Central and peripheral mechanisms of color vision, (pp.8102). New York: Macmillan.
Jameson, D. & Hurvich, L.M. (1989). Essay concerning color constancy. Annual Review of Psychology, 40:1-22.
Katz, D. (1911). Die Erscheinungsweisen der Farben und ihre Beeinflussung durch die individuelle Erfahrung. Leipzig: J.A. Barth.
Katz, D. (1935). World of Colour, New York, Johnson Reprint Corp.
Kardos, L. (1929). Die “Konstanz” phänomenaler Dingmomente. Beitr Problemgeschichte Ps (Bühler Festschr) 1-77. Jena, Fischer.
Kardos, L. (1934). Ding und Schatten: Eine experimentelle Untersuchung über die Grundlagen des Farbensehens. Zeitschrift für Psychologie, 23.
Knill, D. & Richards, W. (1996). (Eds.) Perception as Bayesian Inference. Cambridge University Press, Cambridge, MA.
Koffka, K. (1935). Principles of Gestalt Psychology. New York: Harcourt, Brace.
Kozaki, A. (1973). Perception of lightness and brightness of achromatic surface color and impression of illumination. Japanese Psychological Research, 15, 194-203.
Kozaki, A. & Noguchi, K. (1976). The relationship between perceived surface-lightness and perceived illumination. Psychological Research, 39 (1): 1-16.
Landy, M.S. & Movshon, J.A. (1991). Computational Models of Visual Processing. Cambridge, MA: MIT Press.
Logvinenko, A. & Menshikova, G. (1994). Trade-off between achromatic colour and perceived illumination as revealed by the use of pseudoscopic inversion of apparent depth. Perception, 23(9): 1007-1023.
Maloney, L.T. & Wandell, B.A. (1986) Color constancy: a method for recovering surface spectral reflectance. Journal of the Optical Society of America A Vol 3, No. 1. 29-33.
Noguchi, K. & Kozaki, A. (1985). Perceptual scission of surface-lightness and illumination: An examination of the Gelb effect. Psychological Research, 47(1): 19-25.
Oyama, T. (1968). Stimulus determinants of brightness constancy and the perception of illumination. Japanese Psychological Research. 10(3): 146-155.
Pfanzagl, J. (1968). Theory of Measurement. New York: John Wiley.
Philbeck, J. & Loomis, J.M. (1997). Comparison of two indicators of perceived egocentric distance under full-cue and reduced-cue conditions. Journal of Experimental Psychology: Human Perception & Performance, 23 (1): 72-85.
Speigle, J.M. (1998). Univariance and Constancy: Color Appearance Assessed by Scaling, Matching, and Achromatic Adjustment. Doctoral Dissertation. University of California, Santa Barbara.
Torgerson, W.S. (1958). Theory and Methods of Scaling. New York: John Wiley.
Woodworth, R.S. (1938). Experimental Psychology. London: Methuen.
Woodworth, R.S. & Schlosberg, H. (1954). Experimental Psychology. New York: Holt.
Zaidi, Q. (1998). Identification of illuminant and object colors: Heuristic-based algorithms. Journal of the Optical Society of America, 15(7): 1767-1776.

Appendix 1

Experiment 3A: Asymmetric Matching and Location Specific Instructions

The following four experiments were replications of and control experiments for experiments 2A and 2B. The purpose of experiments 3A and 3B was to replicate experiments 2A and 2B using instructions that localized the point of illuminant matching to the test patch. The purpose of this entire project is to test the relationship between illumination perception and surface color perception at the point of a given surface of interest. In the previous experiments, if observers were matching the illumination in the chamber as a whole but matching the surface reflectances of just the test patches, the method may not have been testing the intended relationship. These two experiments thus replicate 2A and 2B, but now observers are specifically instructed to match the illuminants at the test patches.

Apparatus: 3A

The apparatus for experiment 3A was identical to that in experiment 2A.

Observers: 3A

There were 4 naïve observers, none of whom had participated in any of the previous experiments. They included one female in her early 20s and three males in their early 20s. Observers were paid $10 for each session.

Procedure: 3A

The procedure for this experiment was the same as for experiment 2A. The only difference between this and experiment 2A was the wording of the instructions observers were given. For illuminant matches, observers were told:

    Your job is to match the amount of light that is falling on the two test patches. When you do illuminant matching, it would be possible to think about matching the illuminant in the whole scene or matching the illuminant at a particular point in the scene. We want you to do the latter, and in particular to match the illuminant at the test patch.

All other instructions were essentially the same, except for reminders of this specific task. The complete instructions can be found in appendix 3.

Results: 3A

Again, the data are summarized for each observer with a slope, using the data from all accepted matches in the two sessions. For illuminant match slopes for all 4 observers, see table 3. The average slope for all observers was 2.03. Paired two-tailed t tests showed that the standard and matched illuminants were different for each observer. A one-sample t test on the average difference scores (standard illuminant minus match illuminant) for the four observers revealed that the differences were different from zero (t(3)=14.52, p<.0001, two-tailed). An unmatched t test between the illuminant slopes from experiments 2A and 3A showed no difference between the two experiments (t(9)=.923, n.s.).

All reflectance match slopes are also shown in table 3. The average slope for all observers was .37. Paired two-tailed t tests showed that the standard and matched reflectances were different for each observer. A one-sample t test on the average difference scores (standard reflectance minus match reflectance) for the four observers revealed that the differences were different from zero (t(3)=31.91, p<.0001, two-tailed). An unmatched t test between the reflectance slopes from experiments 2A and 3A showed no difference between the two experiments (t(9)=.912, n.s.).

Finally, the average luminance slope for all observers was .76 (shown in table 3). Paired two-tailed t tests showed that the luminance measurements in the standard chamber and the luminance measurements in the match chamber were different for three of the four observers (see table 4). A one-sample t test on the average difference scores (standard luminance minus match luminance) for the four observers revealed that the differences were different from zero (t(3)=3.92, p=.03, two-tailed). As in experiment 2A, on the average trial, once the illuminant and the surface reflectance were perceptually matched in the two chambers, the luminance was not equal, and was lower in the match chamber.

Experiment 3A
Observer    Illuminant slopes    Reflectance slopes    Luminance slopes
DCB         1.63                 .41                   .76
JLM         2.70                 .35                   .92
MJH         1.91                 .34                   .66
LVK         1.89                 .39                   .71
Average     2.035                .37                   .76

Experiment 3B
Observer    Illuminant slopes    Reflectance slopes    Luminance slopes
DCB         1.65                 .53                   .91
JLM         2.20                 .52                   1.17
MJH         2.08                 .50                   1.06
LVK         1.48                 .61                   1.00
Average     1.85                 .54                   1.03

Table 3: Illuminant, reflectance, and luminance slopes for each observer in experiments 3A and 3B

Observer    t value, paired t test for difference between       p value, one-tailed
            standard and matched luminance                      paired t test
DCB         t(30) = 2.56                                        .0159
JLM         t(12) = 1.39                                        .191 (n.s.)
MJH         t(27) = 7.05                                        1.41E-07
LKV         t(25) = 4.00                                        .0005

Table 4: Experiment 3A table of surface luminances

Discussion: 3A

These results essentially replicate experiment 2A. As in experiment 2A, these data show a failure of both illuminant color constancy and surface color constancy. Neither the average illuminant slope nor the average surface reflectance slope was 1, which they would be if the matches were veridical. Because the luminance slope (the relationship between luminance measured at the test patch in the standard chamber and at the test patch in the match chamber) is not 1, the albedo hypothesis is false. The logic is outlined in experiment 2A.

This replication is important given the new instructions. Because observers were instructed to match the illuminants specifically at the test patch, one can have more confidence that this experiment tests the relationship between surface reflectance and illumination at the location of the surface.
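As a rough consistency check on table 3, the per-observer slopes for experiment 3A can be compared with one another. The Python sketch below rests on two assumptions that are not stated in this appendix: that the physical luminance at the test patch is proportional to illuminant intensity times (simulated) surface reflectance, and that each slope behaves like a ratio of match to standard values, so that the luminance slope should be roughly the product of the illuminant and reflectance slopes. The variable names are illustrative, and the slopes themselves are defined with experiment 2A, not here.

```python
# Per-observer slopes transcribed from table 3, experiment 3A.
observers = ["DCB", "JLM", "MJH", "LVK"]
illuminant = [1.63, 2.70, 1.91, 1.89]
reflectance = [0.41, 0.35, 0.34, 0.39]
luminance = [0.76, 0.92, 0.66, 0.71]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Assumption (not from the source): luminance slope is approximately
# illuminant slope times reflectance slope.
predicted = [i * r for i, r in zip(illuminant, reflectance)]

for name, pred, obs in zip(observers, predicted, luminance):
    print(f"{name}: product = {pred:.2f}, reported luminance slope = {obs:.2f}")

print(f"mean product = {mean(predicted):.2f}, "
      f"mean luminance slope = {mean(luminance):.2f}")
```

Under these assumptions the products stay within about 0.1 of the reported luminance slopes for every observer, and both sets of values fall below the slope of 1 predicted by the albedo hypothesis.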
Experiment 3B: Surround Change with Location Specific Instructions

Again, this experiment is a replication of experiment 2B, with the new, more specific instructions, to ensure that the method tests the relationship between perceived surface color and the perceived illumination at the point of the surface in question.

Apparatus: 3B

The apparatus here was identical to that of experiment 2B.

Observers: 3B

Observers were the same four observers from experiment 3A.

Procedure: 3B

The procedure was identical to the previous experiment. Again observers were instructed to match the illuminant at the point of the test patch; instructions were the same as for experiment 3A.

Results: 3B

Were illuminant matches different from experiment 3A?

For illuminant match slopes for all 4 observers, see table 3. The average slope for all observers was 1.85, compared to 2.03 for experiment 3A. Again, observers did not show illuminant color constancy, and their deviation from perfect constancy is, on average, about the same as that in experiments 2A and 2B. A paired two-tailed t test revealed that the slopes for the 4 observers were not significantly different from the slopes from experiment 3A (t(3)=1.11, p=.35, n.s.), nor were the difference scores (t(3)=.84, p=.46, n.s.). Thus, there was no measurable difference between illuminant matching slopes in experiment 3A and experiment 3B. An unmatched t test between the illuminant slopes from experiments 2B and 3B showed no difference between the two experiments (t(9)=.15, n.s.).

Were surface reflectance matches different from experiment 3A?

All reflectance match slopes are also shown in table 3. The average slope for all observers was .54, compared to .37 for experiment 3A. As in experiment 3A, the observers did not show surface color constancy. A paired two-tailed t test revealed that the slopes for the 4 observers were significantly different from the slopes from experiment 3A (t(3)=8.03, p=.004), as were the difference scores (t(3)=15.54, p=.0006). Thus, there was a significant difference between surface reflectance matching slopes in experiment 3A and experiment 3B, just as there was between experiments 2A and 2B. An unmatched t test between the reflectance slopes from experiments 2B and 3B showed no difference between the two experiments (t(9)=.124, n.s.).

Were surface luminance slopes different from experiment 3A?

As in experiments 2A and 2B, the interesting question is whether it is possible to manipulate the luminance slope simply by changing the immediate surround of the test patch. Again, the relationship between the surface luminance measured at the test patch in the two chambers in experiment 3B was different from that relationship in experiment 3A. The average slope for all observers was 1.03 (shown in table 3), compared to .76 in experiment 3A. A paired two-tailed t test revealed that the surface luminance slopes were different between these two experiments (t(3)=5.25, p=.01), as were the difference scores (t(3)=7.34, p=.005).

Discussion: 3B

Results from experiments 3A and 3B are in complete agreement with results from experiments 2A and 2B. Changing the reflectance of the surface surrounding the test patch can independently change the slope of the measured luminance after perceptual illuminant and reflectance matches. The albedo hypothesis predicts that after the two perceptual matches, the luminance slope should be 1. The slope is not always 1, and again these data suggest that it is possible to manipulate the relationship between luminance, perceived illumination, and perceived reflectance. Importantly, this is true even when observers are specifically instructed to match the illumination at the same point where they are matching the surface reflectance.

This replication is important given the new instructions: because observers were instructed to match the illuminants specifically at the test patch, one can have more confidence that this experiment tests the relationship between illumination and surface reflectance.

Other conclusions from this experiment are also in agreement with conclusions from previous experiments. Again, the strongest form of the albedo hypothesis, that i = î and a = â, is falsified by these results, since neither the illuminant matching slopes nor the surface matching slopes were 1.

Experiment 4A: Alternating Illuminant and Surface Matches

Experiments 4A and 4B were additional replications and control experiments for 2A and 2B. In 4A and 4B, observers were able to adjust both the illuminant and the surface reflectance simultaneously. They did not have to first accept the illuminant match before doing the surface match. Instead, they could make adjustments to one, then the other, and then the first again. They did not accept the match until both the illuminant and the surface reflectance were perceptually matched. This method ensures that any influence of perceived surface reflectance on perceived illuminant is taken into account in the test of the albedo hypothesis. It relaxes the assumption of the albedo hypothesis that there is a one-way causal relationship.

Apparatus: 4A

The apparatus for experiment 4A was identical to that in experiment 2A.

Observers: 4A

There were 4 observers, all of whom had participated in the first four experiments. They included one female in her early 20s, one female in her early 30s, and two males in their early 20s. Observers were paid $10 for each session.
Procedure: 4A

The procedure for this experiment was largely the same as for experiment 2A, except that observers had two GamePads and could adjust both the illuminant and the surface reflectance before going on to the next trial. Observers were told:

    Unlike the experiment you did before, you will be able to adjust the illumination and the surface lightness at the same time. We would like you to try to adjust both the illuminant and the surface lightness a little bit each time the mirror moves, to get both into the right ballpark before you start making your final adjustments. As you make your final adjustments, continue to alternate between the two judgements. …Remember, you won't accept the matches until you have adjusted both the illuminant and the surface lightness.

The rest of the instructions were essentially the same. The complete instructions can be found in appendix 4.

Results: 4A

Again, the data are summarized for each observer with a slope, using the data from all accepted matches in the two sessions. For illuminant match slopes for all 4 observers, see table 5. The average slope for all observers was 1.39. Paired two-tailed t tests showed that the standard and matched illuminants were different for three of the four observers. A one-sample t test on the average difference scores (standard illuminant minus match illuminant) for the four observers revealed that the differences were different from zero (t(3)=3.20, p=.049, two-tailed). A paired two-tailed t test including just those observers included in both experiments 2A and 4A showed no difference in illuminant slopes between those two experiments (t(3)=2.11, n.s.).

All reflectance match slopes are also shown in table 5. The average slope for all observers was .38. Paired two-tailed t tests showed that the standard and matched reflectances were different for each observer. A one-sample t test on the average difference scores for the four observers revealed that the differences were different from zero (t(3)=23.98, p<.0001, two-tailed). A paired two-tailed t test including just those observers included in both experiments 2A and 4A showed no difference in reflectance slopes between those two experiments (t(3)=.11, n.s.).

Finally, the average luminance slope for all observers was .51. Paired two-tailed t tests showed that the standard and matched luminances were different for each of the four observers (see table 6). A one-sample t test on the average difference scores for the four observers revealed that the differences were different from zero (t(3)=8.57, p=.003, two-tailed). As in experiment 2A, on the average trial, once the illuminant and the surface reflectance were perceptually matched in the two chambers, the luminance was not equal, and was lower in the match chamber. In this replication, the results again reject the albedo hypothesis.

Discussion: 4A

These results essentially replicate experiment 2A. As in experiment 2A, these data show a failure of both illuminant color constancy and surface color constancy, and they falsify the albedo hypothesis. Neither the average illuminant slope nor the average surface reflectance slope was 1, which they would be if the matches were veridical. Because the luminance slope is not 1, the albedo hypothesis is false. (The logic is outlined in experiment 2A.)
Experiment 4A
Observer    Illuminant slopes    Reflectance slopes    Luminance slopes
MDR         1.45                 .42                   .57
SIM         1.40                 .29                   .41
MBG         1.67                 .39                   .63
ISH         1.04                 .42                   .44
Average     1.39                 .38                   .51

Experiment 4B
Observer    Illuminant slopes    Reflectance slopes    Luminance slopes
MDR         1.54                 0.688                 1.13
SIM         1.47                 0.544                 0.86
MBG         2.00                 0.568                 1.17
ISH         1.35                 0.587                 0.76
Average     1.59                 .597                  .98

Table 5: Illuminant, reflectance, and luminance slopes for each observer in experiments 4A and 4B

Observer    t value, paired t test for difference between       p value, one-tailed
            standard and matched luminance                      paired t test
MDR         t(31) = 10.56                                       8.57E-12
SIM         t(31) = 11.22                                       1.92E-12
MBG         t(31) = 9.64                                        7.55E-11
ISH         t(31) = 11.33                                       1.48E-12

Table 6: Experiment 4A table of surface luminances

Experiment 4B: Alternating Illuminant and Surface Matches

Again, this experiment is a replication of experiment 2B, but now the observers adjusted both the illuminant and the surface reflectance before going on to the next trial, as in experiment 4A.

Apparatus: 4B

The apparatus here was identical to that of experiment 2B.

Observers: 4B

Observers were the same four observers from experiment 4A.

Procedure: 4B

The procedure was identical to that of experiment 4A.

Results: 4B

Are illuminant matches different from experiment 4A?

For illuminant match slopes for all 4 observers, see table 5. The average slope for all observers was 1.59, compared to 1.39 for experiment 4A. Again, observers did not show illuminant color constancy, and their deviation from perfect constancy is, on average, about the same as that in experiments 2A and 2B. A paired two-tailed t test revealed that the slopes for the 4 observers were not significantly different from the slopes from experiment 4A (t(3)=2.73, n.s.), nor were the difference scores (t(3)=3.01, n.s.). Thus, there was no significant difference between illuminant matching slopes in experiment 4A and experiment 4B. An unmatched t test between all observers in experiment 2B and the four in 4B showed no difference between the illuminant slopes in the two experiments (t(9)=1.87, n.s.).
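The paired comparison of illuminant slopes reported above can be approximated from the per-observer slopes in table 5. The Python sketch below is illustrative only: the helper name `paired_t` is hypothetical, the analysis software actually used is not described here, and because the tabled slopes are rounded the statistic is not expected to reproduce the reported t(3)=2.73 exactly.

```python
import math
import statistics

# Per-observer illuminant slopes transcribed from table 5
# (observers MDR, SIM, MBG, ISH, in that order).
illuminant_4a = [1.45, 1.40, 1.67, 1.04]
illuminant_4b = [1.54, 1.47, 2.00, 1.35]


def paired_t(first, second):
    """Return (t, df) for a paired t test on two equal-length samples."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, using the sample (n - 1) SD.
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1


t_value, df = paired_t(illuminant_4a, illuminant_4b)

# 3.182 is the standard two-tailed .05 critical value of t for df = 3.
CRITICAL_T_DF3 = 3.182
verdict = "significant" if abs(t_value) > CRITICAL_T_DF3 else "n.s."
print(f"t({df}) = {t_value:.2f}, {verdict} at the .05 level (two-tailed)")
```

With only four observers the critical value is large, so even a consistent increase in every observer's slope does not reach significance, which agrees with the conclusion drawn in the text.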
Are surface reflectance matches different from experiment 4A?

All reflectance match slopes are also shown in table 5. The average slope for all observers was .60, compared to .38 for experiment 4A. As in experiment 4A, the observers did not show surface color constancy. A paired two-tailed t test revealed that the slopes for the four observers were significantly different from the slopes from experiment 4A (t(3)=8.24, p=.004), as were the difference scores (t(3)=11.04, p=.002). Thus, there was a significant difference between surface reflectance matching slopes in experiment 4A and experiment 4B, just as there was between experiments 2A and 2B. An unmatched t test between all observers in experiment 2B and the four in 4B showed no difference between the reflectance slopes in the two experiments (t(9)=1.20, n.s.).

Are surface luminance slopes different from experiment 4A?

As in experiments 2A and 2B, the interesting question is whether it is possible to manipulate the luminance slope simply by changing the immediate surround of the test patch. Again, the relationship between the surface luminance measured at the test patch in the two chambers in experiment 4B was different from that relationship in experiment 4A. The average slope for all observers was .98 (shown in table 5), compared to .51 in experiment 4A. A paired two-tailed t test revealed that the surface luminance slopes were different between these two experiments (t(3)=8.49, p=.003), as were the difference scores (t(3)=32.30, p=.00007).

Discussion: 4B

These results replicate findings from experiment 2A. It is possible to manipulate the relationship between the luminance of the standard test patch and the luminance of the match test patch after both illumination and reflectance are perceptually matched. Since the reflectance matches changed (relative to experiment 4A) and illuminant matches did not, perceived illumination cannot be the only factor determining perceived reflectance. Even when one relaxes the assumption about the causal relationship between perceived illuminant and perceived reflectance, there is not a consistent relationship.

Appendix 2: Instructions used in 1A, 1B, 2A and 2B.

Thank you for participating in this experiment. There will be a total of sixteen trials. In each trial you will first do an illuminant match, and then a surface lightness match. A computer voice will tell you when to do an illumination match and when to do a surface match, and will tell you which trial you are on. The surfaces you will match are the little patches you see on the back wall of each chamber. You will be able to control the illuminant in one of the two boxes during the illuminant match, and you will be able to control the surface lightness when you are doing the surface match.

When you are doing an illuminant match, do not worry about how any of the surfaces of the walls or objects look. They may look the same or they may look different when the illuminations match. Just try to match the amount of light that is falling at any two points in the two boxes; for example, think about how much light is falling on the test patch, or how much light is falling on the egg carton or the milk carton. Likewise, when you are matching the surfaces in the two boxes, don't worry about how the illumination levels look. Just try to make the two test patches look like they were made out of the same piece of paper.

To match the illumination, you will use this GamePad. Move the joystick up to increase the illumination, and down to decrease the illumination. Each trial will start with the biggest changes, so when you think that you are in the right ballpark, and want to decrease the change in illumination with each movement of the joystick, you can move the joystick to the left. That will decrease the increment size. There are three different increments, and they cycle through, so once you get beyond the smallest increment, you'll be back to the largest. You will hear a beep each time you change the size of the increment. Finally, when you want to accept the match, that is, when it looks like the illumination in the two boxes is the same, press the blue button and you'll go on to the surface match. If you get to the top or the bottom of the range, you'll hear a beep, and you won't be able to adjust the illumination any farther. If this happens, you can either accept the match by pressing the blue button, or reject the match by pressing the yellow button. You should reject the match if you don't think the illumination in the two boxes looks the same. In either case, the surface match will start.

To match the surfaces, you will again use this GamePad. Just like in the illumination match, you will move the joystick up to increase the lightness, and down to decrease it. When you think that you are in the right ballpark, and want to decrease the change in lightness with each movement of the joystick, you can move the joystick to the left, and again you'll hear a beep as the increments change. Finally, when you want to accept the match, if you think that the lightness of the test patch is the same in the two boxes, press the blue button. If you reach the end of the range, you'll hear a beep and you can either accept the match by pressing the blue button, or reject the match by pressing the yellow button. In either case, you will go on to the next trial.

Appendix 3: Instructions used in 3A and 3B.

Thank you for participating in this experiment. There will be a total of sixteen trials. In each trial you will first do an illuminant match, and then a surface lightness match. A computer voice will tell you when to do an illumination match and when to do a surface match, and will tell you which trial you are on. The surfaces you will match are the little patches you see on the back wall of each chamber.
You will be able to control the illuminant in one of the two boxes during the illuminant match, and you will be able to control the surface lightness when you are doing the surface match.

When you are doing an illuminant match, do not worry about how any of the surfaces of the walls or objects look. They may look the same or they may look different when the illuminations match. Your job is to try to match the amount of light that is falling on the two test patches. When you do illuminant matching, it would be possible to think about matching the illuminant in the whole scene or matching the illuminant at a particular point in the scene. We want you to do the latter, and in particular to match the illuminant at the test patch. Likewise, when you are matching the surfaces in the two boxes, don't worry about how the illumination levels look. Just try to make the two test patches look like they were made out of the same piece of paper.

To match the illumination, you will use this GamePad. Move the joystick up to increase the illumination, and down to decrease the illumination. Each trial will start with the biggest changes, so when you think that you are in the right ballpark, and want to decrease the change in illumination with each movement of the joystick, you can move the joystick to the left. That will decrease the increment size. There are three different increments, and they cycle through, so once you get beyond the smallest increment, you'll be back to the largest. You will hear a beep each time you change the size of the increment. Finally, when you want to accept the match, that is, when it looks like the illumination is the same at the two test patches, press the blue button and you'll go on to the surface match. If you get to the top or the bottom of the range, you'll hear a beep, and you won't be able to adjust the illumination any further. If this happens, you can either accept the match by pressing the blue button, or reject the match by pressing the yellow button. You should reject the match if you don't think the illumination is the same at the two test patches. In either case, the surface match will then start.

To match the surfaces, you will again use this GamePad. Just like in the illumination match, you will move the joystick up to increase the lightness, and down to decrease it. When you think that you are in the right ballpark, and want to decrease the change in lightness with each movement of the joystick, you can move the joystick to the left, and again you'll hear a beep as the increments change. Finally, when you want to accept the match, if you think that the two test patches look like they are cut from the same piece of paper, press the blue button. If you reach the end of the range, you'll hear a beep and you can either accept the match by pressing the blue button, or reject the match by pressing the yellow button. In either case, you will go on to the next trial.

Appendix 4: Instructions used in 4A and 4B.

Thank you for participating in this experiment. There will be a total of sixteen trials. In each trial you will do an illuminant match and a surface lightness match simultaneously. Unlike the experiment you did before, you will be able to adjust the illumination and the surface lightness at the same time. We would like you to try to adjust both the illuminant and the surface lightness a little bit each time the mirror moves, to get both into the right ballpark before you start making your final adjustments. As you make your final adjustments, continue to alternate between the two judgements. You will be able to control the illuminant and the surface lightness in the match box using separate joysticks.

When you are doing an illuminant match, do not worry about how any of the surfaces of the walls or objects look. They may look the same or they may look different when the illuminations match. Your job is to try to match the amount of light that is falling on the two test patches. When you do illuminant matching, it would be possible to think about matching the illuminant in the whole scene or matching the illuminant at a particular point in the scene. This time, we want you to do the latter, and in particular to match the illuminant at the test patch. Likewise, when you are matching the surfaces in the two boxes, don't worry about how the illumination levels look. Just try to make the two test patches look like they were made out of the same piece of paper.

To match the illumination, you will use this GamePad on the left. Move the joystick up to increase the illumination, and down to decrease the illumination. Each trial will start with the biggest changes, so when you think that you are in the right ballpark, and want to decrease the change in illumination with each movement of the joystick, you can move the joystick to the left. That will decrease the increment size. There are three different increments, and they cycle through, so once you get beyond the smallest increment, you'll be back to the largest. You will hear a beep each time you change the size of the increment.

To match the surfaces, you will use this GamePad on the right. Again, you will move the joystick up to increase the lightness, and down to decrease it. When you think that you are in the right ballpark, and want to decrease the change in lightness with each movement of the joystick, you can move the joystick to the left, and again you'll hear a beep as the increments change.

Remember, you won't accept the matches until you have adjusted both the illuminant and the surface lightness. When you want to accept the match, that is, when it looks like the illumination and the surface lightness in the two boxes are the same, press the blue button on either GamePad and you will go on to the next trial. If you get to the top or the bottom of the range, you'll hear a beep, and you won't be able to adjust the illumination any farther. If this happens, you can either accept the matches by pressing the blue button, or reject the matches by pressing the yellow button on either GamePad. You should reject the match if you don't think the illumination in the two boxes looks the same or if the test patches don't look like they are cut out of the same piece of paper. In either case, you will go on to the next trial.