PSYCHOLOGICAL SCIENCE
Research Article

EXAGGERATING TEMPORAL DIFFERENCES ENHANCES RECOGNITION OF INDIVIDUALS FROM POINT LIGHT DISPLAYS

Harold Hill¹ and Frank E. Pollick¹,²
¹Department of Psychology, University of Glasgow, Glasgow, Scotland, United Kingdom, and ²ATR Human Information Processing Laboratories, Kyoto, Japan

Abstract: Humans are very good at perceiving each other’s movements. In this article, we investigate the role of time-based information in the recognition of individuals from point light biological motion sequences. We report an experiment in which we used an exaggeration technique that changes temporal properties while keeping spatial information constant; differences in the durations of motion segments are exaggerated relative to average values. Participants first learned to recognize six individuals on the basis of a simple, unexaggerated arm movement. Subsequently, they recognized positively exaggerated versions of those movements better than the originals. Absolute duration did not appear to be the critical cue. The results show that time-based cues are used for the recognition of movements and that exaggerating temporal differences improves performance. The results suggest that exaggeration may reflect general principles of how diagnostic information is encoded for recognition in different domains.

Humans and other animals are very good at perceiving each other’s movements, or what is called biological motion (Johansson, 1973). Biological motion can be represented by moving point light displays, which are very limited displays consisting only of the motion of points corresponding to the joints of the actor. How the complex patterns of relative motion in these displays are perceived and interpreted remains underspecified. Like other moving stimuli, point light displays have spatial, temporal, and spatiotemporal properties.
In this article, we investigate the effects of varying temporal properties of simple motion sequences, leaving spatial cues constant.

The movements of the points in displays of biological motion allow people to immediately recognize the action depicted (e.g., walking). Further, the same limited information allows people to make more specific categorizations with a reasonable degree of accuracy; for example, observers may be able to determine the direction of walking, the sex of the walker, and even the identity of the walker (e.g., Cutting & Kozlowski, 1977; Mather & Murdoch, 1994). The information that allows such decisions is not yet fully understood, but there are at least two possibilities. These can be illustrated through the example of categorizing sex from a moving point light display of someone walking. One possibility is that people use motion as a cue to three-dimensional shape. In the case of the walker, the relative movements of shoulder and hip joints specify the center of moment, and this captures differences in torso shape between males and females (Cutting, Proffitt, & Kozlowski, 1978). Alternatively, motion itself may directly inform categorization. For example, the velocities of shoulder and hip joints of the walker appear to specify sex even when differences in torso shape have been eliminated (Mather & Murdoch, 1994). In this case, a spatiotemporal motion cue, body sway, is used directly.

Address correspondence to Harold Hill, Department of Psychology, University College London, Gower St., London WC1E 6BT, United Kingdom; e-mail: [email protected]. VOL. 11, NO. 3, MAY 2000

Evidence that spatiotemporal properties are particularly important for the perception of biological motion comes from masking studies.
For example, noise masks with the same properties as the motion being masked are particularly effective for disrupting the perception of biological motion, and this cannot be accounted for in terms of spatial properties alone (Cutting, Moore, & Morrison, 1988). Purely temporal properties are also important. For example, disrupting the temporal synchronization of points in a biological motion display by starting them at different places in their cycle disrupts perception of the motion (Bertenthal & Pinto, 1994). Temporal presentation properties (e.g., frame durations and interframe intervals) also affect the perception of biological motion. Further, the effects of these parameters on biological motion are distinct from their effects on other motion displays, such as displays of simple translational motion (Mather, Radford, & West, 1992; Shiffrar & Freyd, 1990, 1993; Thornton, Pinto, & Shiffrar, 1998). In particular, biological motion information is integrated over longer time periods than nonbiological motion and adapts to stimulus-based properties, such as number of cycles, rather than stimulus-independent properties, such as absolute time (Neri, Morrone, & Burr, 1998). In this study, we wished to test whether exaggerating temporal properties can facilitate the recognition of biological motion. In the spatial domain, exaggerating differences from the average—often referred to as caricaturing—facilitates recognition (e.g., of faces; Brennan, 1985; Rhodes, 1996; Rhodes, Brennan, & Carey, 1987). Investigators have argued that the enhanced recognition of exaggerations suggests either that faces may actually be stored as exaggerations or that exaggerations provide better access to the stored representation than their unexaggerated originals (Rhodes et al., 1987). Examples of such “supernormal” stimuli that produce a stronger response than the actual stimuli they replace are common in the animal world (Tinbergen, 1953, pp. 206–210). 
These have in common that some property or properties of the stimuli appear critical, and supernormal stimuli are produced only when these properties are exaggerated. It is possible to exaggerate movement by exaggerating positions in individual frames in a manner analogous to the way faces are exaggerated (Fidopiastis & Pollick, 1998). However, in this research, we calculated and exaggerated differences in the temporal rather than the spatial domain, that is, in terms of durations rather than distance. Any exaggeration increases differences between stimuli, “stretching” the underlying dimensions of variability. However, as with supernormal stimuli, the stretching will affect performance only if these dimensions are critical for the task.

Copyright © 2000 American Psychological Society

We used point light displays of the right sagittal view of the arm movement involved in picking up, drinking from, and putting down a glass. These sequences were divided into segments at key frames specifying the start and end of each segment (see Fig. 1). Animators use similar key frames, defining spatial positions in these frames manually while generating intermediate frames automatically (Thomas & Johnson, 1981). As Figure 2a shows, all but one of the key frames that we used fell at local minima on the velocity profile of the wrist, that is, at relatively stationary points between periods of movement. We used the velocity profile of the wrist to determine the key frames, as extremities of this kind are known to be important for the perception of biological motion (Mather et al., 1992). The segments between the key frames correspond to distinct parts of the motion sequence; for example, the first segment corresponds to picking up the glass. The key frame not at a minimum is at a point where the velocity profile changed markedly and corresponds to the onset of actual drinking.
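As an illustration of how key frames of this kind could be located, the following Python sketch computes tangential speed from sampled wrist positions and returns the indices of local minima. The function names, the coordinate-tuple sample format, and the fixed sampling interval `dt` are assumptions made for this example, not details of the original analysis. The start and end frames, and the key frame at the onset of drinking (which is not a minimum), would have to be added separately.

```python
import math


def tangential_speed(positions, dt):
    """Speed between consecutive position samples taken dt seconds apart.

    positions: list of coordinate tuples (hypothetical format, e.g. (x, y)).
    Returns one speed value per pair of consecutive samples.
    """
    return [math.dist(p, q) / dt for p, q in zip(positions, positions[1:])]


def local_minima(speed):
    """Indices where speed drops below the previous sample and does not
    exceed the next one; on a flat plateau only the first index is kept."""
    return [
        i
        for i in range(1, len(speed) - 1)
        if speed[i] < speed[i - 1] and speed[i] <= speed[i + 1]
    ]


# Illustrative speed profile (made-up numbers, not data from the experiment).
profile = [0.0, 2.0, 5.0, 1.0, 4.0, 6.0, 0.5, 3.0]
candidate_key_frames = local_minima(profile)
```

In this sketch the candidate key frames are simply the near-stationary samples between periods of movement; a real velocity profile would probably need smoothing before the minima are meaningful.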
We temporally exaggerated sequences by scaling the durations of the movement segments relative to average values. (See Fig. 2b for the exaggeration of the velocity profile shown in Fig. 2a.) The actor whose movement data the velocity profiles in Figure 2 were derived from picked up, lifted, and drank from the glass in a shorter time than average. In the exaggeration, the durations of these segments became even shorter. Spatial properties like the distance traveled by points between the consecutive key frames shown in Figure 1 remained the same, so spatiotemporal properties, including peak and average velocity of segments, also changed when temporal properties changed. The last two segments of this actor’s movement had durations close to average and were therefore not much affected by exaggeration. (Additional technical details and examples of the animations used are available on the World Wide Web at http://www.psy.gla.ac.uk/harry/tempexag.html.) Temporal exaggerations allowed us to test the effect of varying temporal properties of movement while leaving spatial properties constant.

EXPERIMENT

In this experiment, participants learned to recognize individuals from their arm movements. The task and stimuli chosen were relatively arbitrary, but had the advantage that the task demands were clearly specified and the information available was controllable. We compared recognition performance for learned and exaggerated sequences, thus determining whether the properties affected by temporal exaggeration are important for this task.

Fig. 1. Single frames taken from an animation sequence at points corresponding to the key frames shown in Figure 2. The frames represent the maximum spatial extents of the different phases of movement for both unexaggerated and exaggerated sequences. The images are shown in reverse polarity with enlarged dots; they are framed for illustrative purposes.
The individual frames represent (a) the start position, (b) picking up the glass, (c) the beginning of drinking, (d) the end of drinking, (e) replacing the glass, and (f) the final position.

Fig. 2. Example velocity profile for one sequence from one actor (a) and the same sequence after temporal exaggeration by a factor of +1 (b). The profiles are based on tangential wrist velocity. Dotted lines indicate the positions of key frames. The profile in (b) illustrates a time-varying exaggeration in which total duration was not normalized.

We tested both time-varying and time-normalized exaggerations. For both types of exaggerations, the relative durations of segments were produced in the same manner; that is, differences from average values were increased by a scaling factor. However, for time-normalized exaggerations, we normalized the overall length of the sequences so that this was always equal to the average value for the actor involved. Comparing these two types of exaggeration provides a test of whether the total duration of movement sequences is itself the critical cue to identifying the actor. We also varied the level of exaggeration by using different scaling factors. We used two positive levels of exaggeration, with differences increased by factors of +0.5 (+50%) and +1 (+100%); an unexaggerated condition, with differences unchanged (0%); and a negative level of exaggeration, which was scaled by −0.5 (−50%). Positive exaggerations should be better recognized and negative exaggerations worse recognized if exaggeration enhances dimensions of variability important for the task. The original learned sequences correspond to the zero-exaggeration level of the time-varying exaggerations and provide a baseline measure of performance.

In summary, the experiment tested whether changing temporal properties affects recognition of biological motion even when spatial properties are constant. We predicted that temporal exaggeration will improve performance if it affects dimensions of psychological space that are important for recognition by stretching them in a way that facilitates discrimination between individuals.

Method

Participants

Sixteen students from the University of Glasgow took part in the experiment. They were paid for their participation, which took about an hour.

Procedure

We presented sequences as QuickTime animations using PsyScope (Cohen, MacWhinney, Flatt, & Provost, 1993). In the initial stage, for each sequence observers saw the first name of the person followed by an original, unexaggerated example of the movement. Three different sequences were shown for each of six individuals, making a total of 18 sequences. Participants did not make any responses during this stage of the experiment.

The second stage of the experiment involved training with feedback. Participants saw 1 of the 18 learned sequences at random and indicated who they thought it was. Responses were made using the keys 1 to 6, each corresponding to a particular actor. A list of which number corresponded to which actor was available throughout the experiment. Automatic feedback indicated whether the response was right or wrong. If the response was wrong, the correct name and number were shown, and that sequence was shown again later in this stage of the experiment. Participants trained until they had correctly identified all 18 sequences twice each. The number of trials to criterion was recorded for each participant.

In the third and final stage of the experiment, participants saw a mixture of the learned sequences and previously unseen, exaggerated sequences in a random order. They responded in the same way as in the second training stage, but there was no feedback and incorrectly identified trials were not repeated.
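The temporal exaggeration applied to the test sequences can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each segment duration is moved away from the average by (1 + k) times its original difference, and the time-normalized variant is assumed here to rescale all segments uniformly so that the total matches a target duration.

```python
def exaggerate(durations, average, k):
    """Scale each segment's difference from the average by (1 + k).

    k = 0 leaves the sequence unchanged, k = +1 doubles every difference,
    and k = -0.5 halves it (a negative exaggeration).
    """
    result = [a + (1.0 + k) * (d - a) for d, a in zip(durations, average)]
    if any(d <= 0 for d in result):
        # A large k can push a short segment to zero or below.
        raise ValueError("exaggeration produced a non-positive duration")
    return result


def normalize_total(durations, target_total):
    """Rescale all segments uniformly so they sum to target_total.

    Uniform rescaling is an assumption of this sketch; the article only
    states that overall length was set to the actor's average.
    """
    scale = target_total / sum(durations)
    return [d * scale for d in durations]


# Made-up segment durations in seconds, not data from the experiment.
actor = [1.5, 2.0, 1.0]
grand_average = [2.0, 2.0, 1.0]

time_varying = exaggerate(actor, grand_average, 1.0)
time_normalized = normalize_total(time_varying, sum(actor))
```

Here the first segment, which was shorter than average, becomes shorter still, while segments that matched the average are untouched. In the time-normalized version the relative durations are preserved but the total is restored to the chosen target.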
There were 144 trials in this stage (6 actors × 2 types of exaggeration × 4 levels of exaggeration × 3 sequences).

Design

The core design was a 2 (exaggeration type: time-varying or time-normalized) × 4 (exaggeration level: −0.5, 0, +0.5, or +1) × 6 (actor) within-participants factorial design. Exaggeration levels were the scaling factors, which were applied to the initial differences between the sequence and the grand average. We had recorded 12 repetitions for each actor when creating the stimuli. In order to make use of all the repetitions and avoid effects due to idiosyncrasies of particular sequences, we produced four versions of the experiment. Each version used a different random triplet of the 12 available sequences for each actor. Version was included as an additional between-participants variable. Hits and false alarms for each actor in each condition were combined into a single measure of sensitivity, dL, an analogue of d′ based on a logistic distribution (Snodgrass & Corwin, 1988). A dL of zero corresponds to chance performance, and a dL of 5.4 corresponds to perfect performance (3/3 hits and 0/15 false alarms) in any condition of this experiment.

Fig. 3. Performance as a function of exaggeration. Error bars show standard errors.

Results

The average number of trials to correctly recognize each sequence twice in the training-with-feedback stage was 87, with a standard deviation of 33. The minimum possible would have been 36 trials (18 × 2). Clearly, observers took time to learn these discriminations, and there was considerable variation among observers.

A 2 (type of exaggeration) × 4 (level of exaggeration) × 6 (actor) × 4 (version) analysis of variance showed a main effect of level of exaggeration, F(3, 36) = 9.3, p < .001, and an interaction between actor and version, F(15, 60) = 2.1, p < .05. There was no effect of exaggeration type, nor were there any higher-order interactions (all ps > .1).
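The sensitivity measure dL described in the Design section can be sketched in Python as follows. The log-odds form follows Snodgrass and Corwin (1988); the correction that adds 0.5 to each count and 1 to each total is an assumption here, adopted because it keeps the measure finite at 3/3 hits and 0/15 false alarms and reproduces the ceiling value quoted above. The function name is hypothetical.

```python
import math


def d_l(hits, n_targets, false_alarms, n_lures):
    """Logistic sensitivity: ln of the odds ratio of hits to false alarms.

    Rates are corrected as (count + 0.5) / (total + 1) so that perfect
    scores do not produce a division by zero or log of zero.
    """
    h = (hits + 0.5) / (n_targets + 1)
    fa = (false_alarms + 0.5) / (n_lures + 1)
    return math.log(h * (1 - fa) / ((1 - h) * fa))


# Perfect performance in one condition: 3/3 hits, 0/15 false alarms.
ceiling = d_l(3, 3, 0, 15)  # about 5.38, which rounds to 5.4

# Equal corrected hit and false-alarm rates give zero sensitivity.
chance = d_l(2, 4, 2, 4)
```

With three sequences per actor and fifteen sequences from the other five actors in each condition, the ceiling follows directly from the corrected rates of 0.875 and 0.03125, whose odds ratio is 217.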
The effect of level of exaggeration is illustrated in Figure 3. It is clear that performance increased with increasing level of exaggeration, with positively exaggerated examples recognized better than the learned examples. A post hoc Newman-Keuls test showed significant differences (α = .05) between the −0.5 level of exaggeration and all other levels of exaggeration and between the 0 and +1 levels of exaggeration. A post hoc orthogonal linear contrast was consistent with a linear relationship over the range tested, F(1, 36) = 26.8, p < .05, with no significant nonlinear residual, F(2, 36) = 0.6, p > .1.

The interaction between version and actor shows that some of the actors were better recognized depending on the particular set of sequences. As this effect was not strong, was independent of the main effect of level of exaggeration, and was not itself of theoretical interest, it was not pursued further.

DISCUSSION

The results confirm that changing temporal properties affects the recognition of identity from biological motion¹ even when the spatial properties remain constant. Exaggerating temporal differences improved performance on a task involving discriminating among people. This suggests that the physical properties that we altered (duration and properties based on duration) are psychologically important for the task of recognizing different individuals from their movements. Exaggerating temporal differences from mean values facilitated recognition although participants had learned unexaggerated versions. Although exaggerations were based on differences in duration, total duration does not appear to be the critical cue, as there was no difference between time-varying and time-normalized exaggerations.

1. The stimuli used are convincing depictions of biological motion although we used only six points and represented only part of the body. To test whether the results are specific to the perception of biological motion or if they reflect general properties of the perception of moving point stimuli, we conducted a shortened version of the experiment in which all sequences, both learned and tested, were played backward. This manipulation does not affect low-level properties of motion: points still move the same distances in the same time. However, sequences played backward look jerky, perhaps because they disrupt the rhythm of movement and violate implicit knowledge of how people move. In the backward experiment, there was an effect of level of exaggeration, F(3, 33) = 5.1, p < .05, but the pattern was very different from that reported for forward sequences. Mean dLs (with standard errors in parentheses) were 2.9 (0.3) for the −0.5 exaggeration, 2.9 (0.3) for no exaggeration, 3.2 (0.3) for the +0.5 exaggeration, and 2.6 (0.3) for the +1.0 exaggeration. (These means are higher than in the forward condition because participants only had to distinguish between four rather than six people.) A Newman-Keuls test showed a significant difference (α = .05) between +0.5 and +1 exaggerations (performance was worse for +1 exaggerations), but no other significant differences. An orthogonal contrast was not consistent with a linear relationship, F(1, 33) = 1.1, p > .1, the nonlinear residual being significant, F(2, 33) = 5.3, p < .05. There was a main effect of actor in this experiment, F(3, 33) = 6.2, p < .05, but this was independent of level of exaggeration, p > .1. In summary, the very different pattern of results found when sequences were played backward is consistent with interpreting the results of the main experiment as being specific to the perception of biological motion.

Although theories of object recognition (e.g., Biederman, 1987; Marr & Nishihara, 1978) emphasize the importance of spatial information, primarily shape, it appears that temporally derived information is also useful for recognition. The effect of temporal exaggeration shows that spatial cues, which were available (e.g., the distances between points and the motion path), do not fully determine performance for this task. Participants reported using spatial cues (e.g., the height to which the glass was lifted), but performance was not determined by spatial cues alone, as it was also sensitive to the temporal or spatiotemporal cues altered by temporal exaggeration. This result is consistent with previous evidence that temporal and spatiotemporal properties of the signal and noise are critical in determining performance on tasks based on the perception of biological motion (Bertenthal & Pinto, 1994; Cutting et al., 1988; Thornton et al., 1998).

There was no effect of whether the absolute duration of the sequence was allowed to vary or was time normalized, ruling out absolute duration as the critical cue. A purely temporal cue like relative duration might still be important, however. For example, the relative durations of the segments of a movement may capture the rhythm of movement and be useful for recognition. Temporal exaggeration might increase individual differences in rhythm and thereby facilitate recognition. However, spatiotemporal cues, such as peak velocity, are also affected by temporal exaggeration, and it may be one or more of these cues that is critical for recognition. Participants mentioned differences in both speed and duration as cues that they had used. The lack of an effect of time normalizing suggests that whatever the critical cue is, it is encoded in a relative rather than an absolute way.
It appears characteristic of cyclic biological motions that critical properties are specified in terms of the number of stimulus cycles rather than in terms of absolute duration (Neri et al., 1998). The results reported here suggest that noncyclic biological motion is also encoded in relative, stimulus-based terms. In summary, there are important temporal or spatiotemporal cues diagnostic for recognition, and these are encoded in relative, not absolute, terms. Recognition performance increased with level of exaggeration. Negative exaggerations were more poorly recognized than the unaltered sequences, and +1 exaggerations were recognized better than the originals. The relationship appeared linear over the range tested. Any exaggeration necessarily makes stimuli more discriminable by increasing the differences between examples, and exaggeration would therefore be expected to facilitate discrimination between exaggerated sequences. However, the current results go further in that exaggeration facilitated recognition of previously learned, unexaggerated sequences. This suggests that temporal exaggeration enhanced psychologically important information about motion that was encoded from the learned, unexaggerated sequences. In the case of face recognition, it has been suggested that exaggerations are associated with advantages because faces are stored as deviations from the average in a hypothetical “face space” (Valentine, 1991; for a review, see Rhodes, 1996). The representations might themselves be exaggerations; this would directly explain improved performance, as exaggerations would be a closer match than nonexaggerated examples. Alternatively, exaggerations might provide better access than unexaggerated examples to representations that are themselves veridical (Rhodes et al., 1987); this would be possible within a scheme in which both the local density of the representation space and the distance between representations are important (Krumhansl, 1978). 
The results presented here are consistent with the possibility that movements are encoded relative to an average spatiotemporal template or prototype (Posner & Keele, 1968). However, the effects of exaggeration can be accounted for within other representational schemes, for example, an exemplar-based account (Medin & Schaffer, 1978; Nosofsky, 1986). It is beyond the scope of this article to determine the nature of the representation involved, but the results do show that temporal information is diagnostic and must be accessible from whatever representation is used (cf. Schyns, 1998). The current results also extend the face-caricature effect to a different domain and suggest that exaggeration may reflect general principles of how information is encoded for within-class discriminations.

For the most part, participants did not report being aware that the timing was being varied. With sequential presentation, as used in this experiment, differences are not, in general, readily apparent, although they can be seen when sequences are presented simultaneously. In the cases in which there was an obvious difference, such as when the glass was picked up particularly fast, this was reliably associated with a particular person and so provided a cue to recognition regardless of how natural it looked. Similarly, even when facial caricatures are obviously distorted, they are still clearly recognizable.

Although the movement, task, and stimuli used here were relatively arbitrary, we think that the results reflect general properties of how biological motion is perceived. It should be possible to temporally exaggerate any movement that can be divided into segments. Indeed, the widespread use of key frames by animators suggests that this would normally be the case. Exaggerating the durations of these segments would be expected to enhance performance on tasks requiring discrimination between the motions involved, at least when temporal or spatiotemporal properties are diagnostic. Our recognition task is just a particular case of a within-category discrimination, and we would expect that the physical and psychological dimensions varied here would be important for other tasks involving the categorization of biological motion.

In conclusion, temporal properties are important to how biological motion is perceived, and exaggerating differences appears to facilitate recognition on the basis of biological motion. The results generalize the effect of exaggeration to another domain and are consistent with exaggeration reflecting general properties of the way information for within-class categorizations is encoded or accessed.

Acknowledgments: We would like to thank the actors and participants who took part in the experiment. We also thank Pascal Mamassian, Philippe Schyns, Donald Morrison, and the two reviewers for helpful and insightful comments on an earlier draft of the article, and Alan Johnston for related discussions. The work was supported by Engineering and Physical Science Research Council Grant GR/M36052 to Frank Pollick.

REFERENCES

Bertenthal, B.I., & Pinto, J. (1994). Global processing of biological motions. Psychological Science, 5, 221–225.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115–147.
Brennan, S.E. (1985). The caricature generator. Leonardo, 18, 170–178.
Cohen, J.D., MacWhinney, B., Flatt, M., & Provost, J. (1993). PsyScope: A new graphic interactive environment for designing psychology experiments. Behavior Research Methods, Instruments, & Computers, 25, 257–271.
Cutting, J.E., & Kozlowski, L.T. (1977). Recognizing friends by their walk: Gait perception without familiarity cues. Bulletin of the Psychonomic Society, 9, 353–356.
Cutting, J.E., Moore, C., & Morrison, R. (1988). Masking the motions of human gait. Perception & Psychophysics, 44, 339–347.
Cutting, J.E., Proffitt, D.R., & Kozlowski, L.T. (1978). A biomechanical invariant for gait perception. Journal of Experimental Psychology: Human Perception and Performance, 4, 357–372.
Fidopiastis, C.M., & Pollick, F.E. (1998). Recognition of exaggerated human movement. Investigative Ophthalmology and Visual Science, 39, S1094.
Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception & Psychophysics, 14, 201–211.
Krumhansl, C. (1978). Concerning the applicability of geometric models to similarity data: The interrelationship between similarity and spatial density. Psychological Review, 85, 445–463.
Marr, D., & Nishihara, H. (1978). Representation and recognition of the spatial organization of three-dimensional shapes. Proceedings of the Royal Society of London B, 200, 269–294.
Mather, G., & Murdoch, L. (1994). Gender discrimination in biological motion displays based on dynamic cues. Proceedings of the Royal Society of London B, 258, 273–279.
Mather, G., Radford, K., & West, S. (1992). Low-level visual processing of biological motion. Proceedings of the Royal Society of London B, 249, 149–155.
Medin, D.L., & Schaffer, M.M. (1978). Context theory of classification learning. Psychological Review, 85, 207–238.
Neri, P., Morrone, M., & Burr, D. (1998). Seeing biological motion. Nature, 395, 894–896.
Nosofsky, R. (1986). Attention, similarity and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–57.
Posner, M., & Keele, S. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77, 353–363.
Rhodes, G. (1996). Superportraits: Caricatures and recognition. Hove, England: Psychology Press.
Rhodes, G., Brennan, S., & Carey, S. (1987). Identification and ratings of caricatures: Implications for mental representations of faces. Cognitive Psychology, 19, 473–497.
Schyns, P.G. (1998). Diagnostic recognition: Task constraints, object information and their interactions. Cognition, 67, 147–179.
Shiffrar, M., & Freyd, J.J. (1990). Apparent motion of the human body. Psychological Science, 1, 257–264.
Shiffrar, M., & Freyd, J.J. (1993). Timing and apparent motion path choice with human body photographs. Psychological Science, 4, 379–384.
Snodgrass, J.G., & Corwin, J. (1988). Pragmatics of measuring recognition memory: Applications to dementia and amnesia. Journal of Experimental Psychology: General, 117, 34–50.
Thomas, F., & Johnson, O. (1981). The illusion of life: Disney animation. New York: Hyperion.
Thornton, I.M., Pinto, J., & Shiffrar, M. (1998). The visual perception of human locomotion. Cognitive Neuropsychology, 15, 535–552.
Tinbergen, N. (1953). The herring gull’s world. London: Collins.
Valentine, T. (1991). A unified account of the effects of distinctiveness, inversion, and race upon face recognition. Quarterly Journal of Experimental Psychology, 43A, 161–204.

(RECEIVED 4/28/99; ACCEPTED 8/16/99)