Social Role-Taking: A Review of the Constructs, Measures, and Measurement Properties
Author(s): Robert D. Enright and Daniel K. Lapsley
Source: Review of Educational Research, Vol. 50, No. 4 (Winter, 1980), pp. 647-674
Published by: American Educational Research Association
Stable URL: http://www.jstor.org/stable/1170298

Robert D. Enright and Daniel K. Lapsley
University of Wisconsin-Madison

Social role-taking is examined psychometrically through a review of the constructs, measures developed to represent the constructs, and the reliability and validity of the measures. The construct is described in several different ways in the literature. Most measures show adequate interrater reliabilities, but there is less evidence regarding temporal stability or internal consistency of the scales. The validation efforts have primarily been on the age-stage relationship rather than on other aspects of the construct, although a more expanded focus has begun within the last few years. The review shows that Chandler's cognitive, Selman's sociomoral, and Flavell's nickel-dime tasks possess the best psychometric properties.
Recommendations for improving measurement in the social role-taking area are discussed.

Special thanks to Daniel Keating, Royal Grueneich, Michael Subkoviak, and two anonymous reviewers for their helpful comments.

The construct of role-taking has begun receiving increased attention recently from educators and researchers in child development. Role-taking or perspective taking represents the child's cognitive abilities to understand another person's thoughts or feelings from the other's point of view. Despite G. H. Mead's (1934) early discussion of role-taking and Piaget's (1926) pioneering research on egocentrism, the domain remained relatively untapped until Feffer and Gourevitch's (1960) research. Later, Flavell, Botkin, Fry, Wright, and Jarvis (1968) independently formulated research under the role-taking construct. What we now have is almost two decades of research using a variety of constructs and measures seen as related under the general heading of role-taking development.

There have been two recent developments in this field within the last several years. The first is the increasing popularity of social education programs which attempt to promote role-taking development in children (Chandler, 1973; Chandler, Greenspan, & Barenboim, 1974; Elardo, Note 1; Wentink, Smits-van Sonsbeck, Leckie, & Smits, Note 2). The overall results of the programs show an inconsistent picture with some showing growth in the children and others showing no growth at all. The second, more recent development is the call by researchers for more attention to the psychometric properties of the role-taking instruments. For instance, Kurdek (1978), Rubin (1978), and Hudson (1978) have recently concluded that we need to examine far more closely the measurement error in role-taking scales. Kurdek (1977a) and
Rubin (1978) have further recommended an examination of the convergent-discriminant validation of these scales because the task performance may be affected by children's verbal abilities.

It is not surprising that these recommendations for psychometric analyses have waited for over two decades when it is realized that the theoretical orientation of role-taking is Piagetian. It is often mistakenly assumed (Elkind, 1969) that Piagetian and psychometric orientations are conceptually distinct. What is confounded in such discussions is a theoretical orientation (such as individual difference IQ testing that uses psychometrics) and a methodological orientation (psychometric theory). The confound leads to such conclusions as: (1) Piagetian theory is concerned with stage development across age and (2) psychometric theory is concerned with individual differences within age. Although the constructs of stage development and individual differences are distinct, stage theory is not incompatible with the psychometric methodology that has been traditionally used to study individual differences. Psychometrics, quite simply, is concerned with test construction, the reliability of one's measuring tool, and validation of the theoretical construct. Claiming that Piagetian measuring tools are exempt from such precise methodology would seem to weaken the scientific precision of proposed findings.

For example, suppose a researcher correlated age with performance on Piagetian task A and did not find a significant relationship. What is the conclusion here? An exclusive focus on theory could lead to the conclusion that construct A is not developmental. A psychometric focus, on the other hand, would lead to such a conclusion only after examining the test construction and the reliability properties of task A. Possibly this tool is so filled with measurement error that it does not accurately represent or measure the underlying construct.
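The point about measurement error can be made concrete with a small sketch. It uses Spearman's classical attenuation formula, which is not cited in the article itself and is brought in here only as an illustration: the expected observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The function name and all numeric values below are hypothetical.

```python
import math


def attenuated_correlation(true_r: float, reliability_x: float, reliability_y: float) -> float:
    """Expected observed correlation under Spearman's attenuation formula.

    observed_r = true_r * sqrt(reliability_x * reliability_y)
    """
    for reliability in (reliability_x, reliability_y):
        if not 0.0 <= reliability <= 1.0:
            raise ValueError("reliabilities must lie in [0, 1]")
    return true_r * math.sqrt(reliability_x * reliability_y)


# Hypothetical illustration: a fixed "true" age-task relationship,
# age assumed to be measured without error, and task A measured
# with progressively lower reliability.
TRUE_R = 0.60            # assumed value, not taken from the article
AGE_RELIABILITY = 1.00   # assumed value

for task_reliability in (0.90, 0.60, 0.30):
    observed = attenuated_correlation(TRUE_R, AGE_RELIABILITY, task_reliability)
    print(f"task reliability {task_reliability:.2f}: expected observed r = {observed:.2f}")
```

Under these assumptions, the same underlying developmental relationship yields a progressively smaller observed correlation as task reliability drops, which is why a nonsignificant age correlation by itself does not show that a construct is nondevelopmental.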
Psychometrics, then, can work in conjunction with such a theory to help us more precisely measure our theoretical constructs.

Not only is it time to examine the psychometric properties of role-taking tasks to aid basic research as Kurdek (1978), Rubin (1978), and Hudson (1978) suggest, but also it would seem necessary to do so to aid educators in their social education programs. As stated above, the results have been varied across these programs. Are we to conclude when a program fails that the educational methods were inappropriate? A competing conclusion at the present time is that the measures selected were either unreliable or measured a different aspect of role-taking than was emphasized in the program. If such educational programs are to be successful, it would seem necessary to have, as far as possible, error-free instruments as well as a clear understanding of the subtle conceptual differences underlying the many scales that have now appeared in the role-taking literature.

The purpose of the present article, then, is to address two issues with regard to psychometric theory merging with Piagetian theory in the role-taking area. First, the existing psychometric properties of the role-taking scales will be examined. This will include an examination of the constructs as they are described in the literature, a description of the measures used to operationalize the constructs, and an examination of the reliability and validity of each scale. The second purpose is to draw conclusions from these findings and to examine the ways of improving reliability and validity of the scales. Although other reviews of role-taking do exist (Chandler, 1977; Ford, 1979; Hill & Palmquist, 1978; Shantz, 1975), none has analyzed each measure in detail, allowing for a direct evaluation of the relative validation strengths of each. Only this latter analysis will allow social educators to choose the most appropriate scale for their programs.
As a final point before we present the psychometric analysis, it should be noted that most investigators in this area have not had as their purpose the development of psychometric instruments. Instead, many have attempted a descriptive analysis of social role-taking stages through clinical interview procedures. For the most part, these procedures have been successful in elucidating a stage sequence. Because the field is relatively new, there has not been an attempt to develop only one role-taking construct, to develop measures from psychometric theory, or to develop comparable measures. This should not be seen as a fault or shortcoming of the researchers, but instead should be seen as the current state of the art.

The Role-taking Construct

On the most general level, all role-taking constructs refer to the process of "stepping inside the other's shoes" and predicting the other's thoughts, feelings, or what the other is seeing. The latter ability of visual or spatial role-taking will not be discussed here because it does not necessarily require a social ability. That is, when the child is at point A in a room and is asked how the room would look at point B, the child has only to imagine the spatial arrangement of the room from that other perspective. Understanding qualities of other people does not seem to be necessary to solve the task. At the same time, the educational programs have not concerned themselves with this dimension. Therefore, only cognitive and affective role-taking constructs will be considered. For those interested in spatial role-taking, see Fehr's (1978) review.

Social role-taking has generally been classified into two different constructs: cognitive role-taking and affective role-taking. Cognitive role-taking refers to the child's ability to think about what the other is thinking. Similarly, affective role-taking is the child's ability to understand another's internal, subjective, or feeling states.

Cognitive Role-taking

Within the cognitive role-taking domain, there have been many researchers who have described levels of development (see, e.g., Byrne, 1974; Chandler, 1972; DeVries, 1970; Feffer & Gourevitch, 1960; Flavell et al., 1968; Selman, 1971a; Selman & Byrne, 1974; Kuhn, Note 3). Most of the researchers' levels show commonalities with the other researchers' descriptions as well as with Mead's (1934) observations. A summary of the stage sequences of four cognitive role-taking theorists is described in Table 1. The parallel trends in task performance described in the table are conceptual, not empirical.

An abstraction of the commonalities across the various sequences is as follows: At first, the child does not consider another's viewpoint. For example, if the child likes candy, then he/she thinks everybody must like candy. The child next realizes that others may think differently than the child about a given situation. This is seen in Chandler's (1972) nonegocentrism stage in which the child realizes he/she may have privileged information. As a typical example, the child under observation may see Child A wave good-bye to his/her father as he leaves in the car.
Child B then comes along and offers a toy car to A. Child A starts to cry. If the experimental child has this ability, he/she should understand that Child B does not understand exactly why Child A is crying because B did not see the previous incident between A and the father. In other words, the experimental child and Child B have two different understandings of the situation. If the child realizes this and does not attribute the previous knowledge of A and the father to Child B, then the child has acquired this level of cognitive role-taking. Selman and Byrne's (1974) subjective role-taking describes a similar phenomenon, but not in a privileged information context.

TABLE 1
Summary of Developmental Levels for Role-taking Constru[...]
[The right-hand portion of the table is cut off in the source; entries containing [...] are incomplete.]

Chandler (privileged information)
  Total egocentrism in that the child confuses his/her thoughts with the thoughts of others
  Child attributes his/her knowledge to others in a probabilistic way
  Tendency for egocentric response to change to nonegocentric
  Nonegocentrism: the child is aware that he/she can have privileged information in that S and O view the same solution differently because they have different data about it

Selman and Byrne (sociomoral task)
  Level 0: Egocentric role-taking; no differentiation of viewpoints
  Level 1: Subjective role-taking; S realizes that O can think differently than S because both may have different data regarding a given situation
  Level 2: Self-reflective role-taking; the child can view him-/herself from O's viewpoint. This is a sequential ability
  Level 3: A "generalized other" perspective in which S can simultaneously consider the viewpoints of self and O

Flavell (nickel-dime game)
  Level 0: S unable to impute [...] or offers one without justifi[...]
  Level 1: S is aware that O ha[...]
  Level 2: S thinks that O is a[...] thoughts
  Level 3: Infinite regress, S kn[...] aware that S knows O's str[...]

Note. S = the experimental child or subject; O = the other person being considered.
Selman's sociomoral procedure has shown that the child realizes another person has thoughts and feelings different from the self's, but the child cannot yet simultaneously consider the self's and the other's viewpoints in resolving a moral dilemma. Flavell et al.'s nickel-dime game describes this stage as level 1, in which the child realizes the other person has a game strategy, but the child does not yet realize that the other is thinking about the child's strategy. This stage is also seen in Feffer and Gourevitch's (1960) simple refocusing level, in which the child changes a story when retelling it from different characters' viewpoints. Although this level shows that the child is aware of differences between characters, he/she still confuses roles, showing that the child is not yet aware that characters take each other's perspectives.

The next stage seems to occur when the child can switch roles cognitively and view the world, including the self, from the other's viewpoint. This is described in Selman and Byrne's self-reflective role-taking, Flavell et al.'s level 2, and Feffer and Gourevitch's consistent elaboration stages. Finally the child can step back from a situation and view all perspectives simultaneously. This is described in Selman and Byrne's level 3, Flavell et al.'s infinite regress level 3, and Feffer and Gourevitch's change of perspective level.

It should be noted that not all theorists study this entire stage sequence. The privileged information construct, for example, ends with an awareness of differences. Also, some theorists such as Chandler (1972) and Feffer and Gourevitch (1960) postulate transitions between stages. For example, before the child understands privileged information he/she has a tendency to fluctuate between egocentrism and nonegocentrism (Chandler, 1972).
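The conceptual parallels just described can be laid out as a small lookup structure. This is only a sketch of the abstraction given in the text, not a scoring scheme from any of the cited manuals: the class name, field names, and helper function are hypothetical, and `None` marks a level for which the text names no corresponding stage for that theorist.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AbstractedLevel:
    """One level of the abstracted sequence, with each theorist's label (if any)."""
    summary: str
    chandler: Optional[str]
    selman_byrne: Optional[str]
    flavell: Optional[str]
    feffer_gourevitch: Optional[str]


# Labels follow the wording of the text; None = no stage named in the text.
LEVELS: List[AbstractedLevel] = [
    AbstractedLevel("does not consider another's viewpoint",
                    "total egocentrism", "Level 0 (egocentric)", "Level 0", None),
    AbstractedLevel("realizes that others may think differently",
                    "nonegocentrism", "Level 1 (subjective)", "Level 1",
                    "simple refocusing"),
    AbstractedLevel("views the self from the other's viewpoint",
                    None, "Level 2 (self-reflective)", "Level 2",
                    "consistent elaboration"),
    AbstractedLevel("views all perspectives simultaneously",
                    None, "Level 3", "Level 3 (infinite regress)",
                    "change of perspective"),
]


def theorists_describing(level: AbstractedLevel) -> List[str]:
    """Return the theorists for whom the text names a stage at this level."""
    labels = {
        "Chandler": level.chandler,
        "Selman and Byrne": level.selman_byrne,
        "Flavell et al.": level.flavell,
        "Feffer and Gourevitch": level.feffer_gourevitch,
    }
    return [name for name, label in labels.items() if label is not None]


for number, level in enumerate(LEVELS, start=1):
    print(number, level.summary, theorists_describing(level))
```

The missing Chandler entries at the two highest levels reflect the statement that the privileged information construct ends with an awareness of differences.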

In explaining what underlies or is responsible for reasoning on a particular level, theorists have chosen one of three interrelated ideas: egocentrism, operative knowing, and role-taking structure. Egocentrism is a Piagetian (1926) concept which refers to the child's inability to switch the focus of interpreting the world from a self-referenced system to an other-directed system. This would be analogous to the first level abstracted from Table 1. Egocentrism is no longer the prevailing cognitive style when the child is able to decenter his/her cognitions. That is, when the child has acquired operational structures which allow him/her to consider the self's own viewpoint and those of others in a balanced fashion, then egocentrism in role-taking is no longer observed. From this developmental point, the child begins to increasingly decenter his/her perspective until he/she can shift foci from self to other in a coordinated and flexible fashion as the situations demand. In this regard the concept of decentration is not viewed as a unidimensional construct, in that one is either "centered" or "decentered." Rather the progress from egocentrism to mature perspective-taking takes place along a decentering continuum which culminates in successive and then simultaneous apprehension of self-other viewpoints (Selman, 1971a; Urberg & Docherty, 1976).

To further explain what underlies role-taking abilities across the developmental levels, Youniss (1975) has suggested the concept of operative knowing. This concept brings explanations in social cognition more in line with Piaget's later theoretical work as Furth (1969) remarks. Operative knowing refers to the use of cognitive actions or schemes in coming to know an object or a person. The action here is meant to be taken literally just as one has a physical action when reaching for an object.
In the case of role-taking, when the child is faced with the possibility of understanding another person, that child will operate via mental action systems on the other person. Knowing the other person, then, is a result of two things: (a) the characteristics of the other person and (b) the particular cognitive actions performed by the child. The resulting knowledge of the other will differ among children who perform different cognitive action systems on the other. The differential knowledge defines the different role-taking stages. The differential mental action systems explain in theory the existence of progressive stages.

As with egocentrism, operative knowing is still vague in theory if we are not given a precise description of just what the mental action systems are at each level. Selman's (1971a; Selman & Byrne, 1974) construct of role-taking structure is an attempt to describe such cognitive actions. As such, Selman's work is a specialized case of the more general notion of operative knowing. To simplify, let us take only one level and show the underlying structure responsible for social knowing on that level. On level two, Selman symbolizes the mental action system as S <= O. In words, the subject (S) can take the perspective of the other (O). At the same time, the subject is aware that O is considering S's perspective. This action, in theory, leads to the role-taking ability of the child to consider the other's thoughts about the child. For example, the child may be able to reason, "I wonder if Billy thinks that I'm a nice person?"

Whether one accepts the explanations of either egocentrism/decentration or operative knowing, theorists interested in cognitive role-taking usually adhere to certain assumptions about the developmental levels. First, the Piagetian levels of concrete and formal operations (developed to explain logical reasoning) are considered to be necessary, but not sufficient for the parallel levels of role-taking.
For example, a child is considered to be in concrete operations if he/she demonstrates reversibility in conservation tasks. Selman (1976) maintains that without the reversibility structure, the child cannot develop the reciprocity structure of level 2 role-taking, which, in effect, is a reversibility of perspectives. Second, there should be an internal consistency of responding across tasks because one's ability constitutes an organized, consistent way of reasoning about the social world. Third, the levels are considered invariant in that the child cannot skip from level 1 to level 3. Likewise, regression should not occur in level of reasoning. Fourth, the levels are considered hierarchical. They are characterized by increased complexity and integration as one moves up the stage ladder. For instance, lower levels are characterized by a focus on the self. Later, the child integrates this perspective with the perspective of the other. The added complexity of self and other leads to reciprocal role-taking. Fifth, the levels are considered in theory to be universal. Presumably, the sequence of development can be found in all cultures. Sixth, although stage theory is not normative, in that all six-year-olds are expected to be on a given level, the stages are linked to the idea of competence or social adequacy (White, 1959), so that children lagging far behind peers would be expected to show inadequate social interactions. Finally, the cognitive domain, like any role-taking domain, is considered to measure abilities other than only verbal or logical ability.

It is important to keep in mind that not all researchers and theorists in this area have described their construct so as to include all of the above points. Most, however, do suggest a link between their constructs and Piagetian structural theory.
Therefore, while the above stage assumptions are not necessarily made explicit in all manuscripts discussing theory, those assumptions are certainly implicit.

Affective Role-taking

Affective role-taking, the ability to infer the feeling states of others, is often confused with empathy, defined as the ability to share the feelings of others. Consequently, much of the research in this area has been devoted to resolving the conceptual distinctions between these concepts (Borke, 1971, 1972, 1973; Chandler & Greenspan, 1972; Feshbach & Roe, 1968; Flapan, 1968; Hoffman, 1976; Hoffman & Levine, 1976; Rothenberg, 1970). In theory the affective role-taking dimension should differ from cognitive role-taking in subject matter only. Although the tasks employed to index empathic awareness (Borke, 1971; Feshbach & Roe, 1968) in very young children seem to require only the primitive skills of identifying the affective state of another, or in projecting one's own feeling to the other in an emotionally stereotypic situation, these skills nonetheless represent developmentally prior abilities which serve as important precursors to genuine social decentration and affective role-taking. An integration of the available research in this area, then, would seem to suggest the following developmental sequence:

(1) At first, young children are aware that emotional states originate in other people (Hoffman, 1976);
(2) Next, children are able to correctly anticipate the affective reactions of others through the mechanism of projection, identification, and stereotyping (Borke, 1972);
(3) When asked to adopt different roles in affect-arousing situations, children confuse their own point of view with those of others (Chandler & Greenspan, 1972);
(4) Children can assume multiple viewpoints relative to the affective states of others in a flexible and coordinated fashion (Chandler & Greenspan, 1972).

Explanations of performance at the latter two levels of this sequence rely primarily on the decline of egocentrism and the ascent of perspectivistic thought, an achievement of middle childhood in the Piagetian conceptualization of cognitive development. Hoffman (1976; Hoffman & Levine, 1976) would argue that the first level described above may signify an innate precursor of empathic distress, itself a component of altruistic motivation which may have survival value for the species. Not until the highest level does the child show decentering ability that would characterize actual role-taking. Because this domain is also Piagetian, it would share the structuralist assumptions described for the cognitive domain.

The Role-taking Measures

Just as there are several different constructs which supposedly define role-taking, there are many measures now in use which supposedly measure the construct. For example, in adding the measures of two studies (Zahn-Waxler, Radke-Yarrow, and Brady-Smith, 1977; Wentink et al., Note 2) there are a total of 12 different tasks or measures used to assess role-taking. Because of space limitations, only the most popular measures and those for which psychometric data have been obtained will be described. This will be done for both cognitive and affective role-taking domains.

Cognitive Role-taking

Based on the number of researchers using the task, Flavell et al.'s (1968) cognitive measure, usually referred to as the "apple-dog" story, is probably the most popular of all role-taking tasks. It is a task employing privileged information in which the child knows more about a story than someone else. It assesses the extent to which the child realizes that the other person views the situation differently because the other lacks some knowledge about the story. The child is first shown a seven-picture sequence.
The cards are as follows:

(a) A boy is walking along a sidewalk;
(b) The boy looks frightened as he sees an angry looking dog running toward him;
(c) The boy, looking over his shoulder, runs from the dog who continues chasing him;
(d) The boy runs toward an apple tree a few feet away. The dog is not shown. The boy's face is hidden by some leaves; therefore, no anxiety is apparent;
(e) The boy is climbing up the tree while the dog stands below, apparently barking;
(f) The boy is shown in the tree, not looking at the dog, nor appearing frightened. The dog is pictured as walking across the street with its back to the boy. No apparent expression is on the dog's face;
(g) The boy is in the tree eating an apple. The dog is not in the picture.

After the child tells the story, cards b, c, and e, all of which showed a ferocious dog, are removed. Another person then enters the room and the child must predict how the new person will tell the story. If the child can decenter from his/her own perspective, then the child should accurately predict the other's conception of a new story. The following researchers have incorporated this measure into their work: Hollos (1975), Hollos and Cowan (1973), Hudson (1978), Kurdek and Rodgon (1975), Marvin, Greenberg, and Mossler (1976), Mossler, Marvin, and Greenberg (1976), Selman (1971a), West (1974), Zahn-Waxler et al. (1977), Kurdek (Note 4), and Olejnik (Note 5).

Chandler's (1973) role-taking measure also assesses cognitive role-taking. As with the Flavell et al. task, it too deals with privileged information. With Chandler's measure, the child is exposed to two story characters, one who has privileged information and another who does not. It assesses the extent to which the child understands that each character, because of differential knowledge, views a given situation differently.
The child is given 10 to 12 items, each consisting of a series of cartoon sequences which show a character psychologically influenced by a series of events. A new character who did not observe the prior events witnesses the subsequent behavior. For example, a boy playing baseball accidentally breaks a window. Afterwards a knock at the boy's door produces a reaction of fear in him. His father, not knowing of the broken window, looks puzzled. In theory, if the experimental child is capable of adopting a perspective other than his/her own, then the child should realize and report that the father does not know why the boy appears afraid. Because such an ability develops by middle childhood (Chandler, 1972), ceiling effects are possible with this measure during middle childhood and beyond. A low score represents low egocentrism or high ability to decenter. This measure has been employed by Chandler (1972), Chandler et al. (1974), Kurdek (1977a,b), Leahy and Huard (1976), Rubin (1978), and Urberg and Docherty (1976). Similar kinds of tasks have been developed by Rotenberg (1974) and Mossler, Greenberg, and Marvin (Note 6).

Flavell et al.'s (1968) nickel-dime game assesses the child's reasoning about an opponent's thoughts in a game of strategy. In theory, the more complexly the child can reason about the other's thoughts, including the other's thoughts about the child, the more successful the child will be. The child is first presented with two plastic cups. Under one cup is a nickel and under the other is a dime. The contents of both cups are easily identified because there is an extra nickel taped to the top of the cup housing the game nickel. A dime is taped to the other cup. In its original version, the opponent is to leave the room. The child is to remove one of the coins under a cup and the opponent is to reenter and choose one of the cups which he or she believes still houses some money.
If the opponent picks the cup with money still under it, then he/she keeps the money. While the opponent is out of the room, the experimenter asks the child which cup he/she thinks the opponent will pick and why. If the child decides that the opponent will pick the dime cup because it has more money, this would represent a lower level of role-taking activity. Note there is little role-taking on the child's part beyond the perspective that the opponent is capable of thinking about the game materials. A higher level response occurs when the child reflects on the opponent's thoughts about the child. For instance, the child may say, "He would want the dime. But he knows that I know this. So he may pick up the nickel cup." Further recursive thinking is also measured. The nickel-dime game or similar guessing games have been used in the research of Byrne (1974), DeVries (1970), Iannotti (1978), Kurdek (1977b), Moir (1974), Selman (1971a,b), Kuhn (Note 3), and O'Connor (Note 7). In addition to Flavell et al., a scoring manual for the nickel-dime game can be found in Selman and Byrne (1973).

Selman and Byrne (Note 8) have described a relatively new method for assessing cognitive role-taking development. An open-ended dilemma is read or shown via filmstrip to the child. The child is then interviewed with a series of questions about the story. As an example of a story, Holly was asked by her dad not to climb trees. Holly's friend, Sean, asks Holly if she would help him retrieve his kitten caught in a nearby tree. Holly must decide whether or not to climb the tree to save the kitten. If she does, she will be breaking her promise to her dad. To assess role-taking activity, questions are asked such as "Why might Sean think Holly won't climb the tree?" and "What does Holly think her father will think of her if he finds out?" As with the Flavell et al. task, this task assesses the extent to which the child can take multiple perspectives.
The ability to take multiple perspectives is seen by Selman and Byrne as a characteristic of operative thinking/structuralism. As a scoring example, if the child understands that Holly can reflect on her dad's thoughts about her, the child is scored on level 2 for that response. The highest level exhibited throughout the interview constitutes that child's final score. Such sociomoral interviews have been employed in the work of Byrne (1974), Kurdek (1977b), Selman and Byrne (1974), Selman and Damon (1976), Selman and Lieberman (1975), and Gordon, Damon, and Selman (Note 9).

Miller, Kessel, and Flavell (1970) have described another cognitive task to assess recursive thinking. This task assesses the extent to which a child can reason about another's thoughts in four categories: (a) the other thinking about another person; (b) the other thinking about action between people (e.g., a conversation); (c) the other thinking about another's thoughts (one-loop recursion); and (d) the other thinking about thinking (two-loop recursion). The child is first shown a series of people in cartoon-type drawings. The child is taught to distinguish between scalloped cartoon clouds representing thought in the cartoon character and smooth clouds representing talking. Once this is grasped, the child is shown four characters: a boy who is the main character doing the thinking, a girl, a mother, and a father, all of whom will be the object of the boy's thoughts or conversation. The experimenter explains that the child is to orally trace the thinking of the cartoon boy. The boy's thinking is to be in a big "thinking cloud." Up to 18 cards randomly ordered are shown to the child. Some cards represent lower levels of recursive thinking while others are quite complex. For example, in the first card, the boy is thinking of an apple. If the child says, "The boy is thinking of an apple," (some fruit, etc.) it is correct.
On a higher level of complexity, the cartoon boy may be thinking about the father's thoughts about the boy. For the experimental child to understand this cartoon, he/she would in theory have to be aware that the cartoon boy can think of another's thoughts about himself. One point is given for each correct answer. This measure has been used by Rubin (1973, 1978) and by Wentink et al. (Note 2).

Affective Role-taking Measures

There have been several measures of affective role-taking (Borke, 1971, 1972; Burns and Cavey, 1957; Chandler and Greenspan, 1972; Feshbach and Roe, 1968; Flapan, 1968; Gottman, Gonso, and Rasmussen, 1975; Rotenberg, 1974; Rothenberg, 1970; Feshbach and Kuchenbecker, Note 10; Shantz, Note 11; Watson, Note 12). Only those developed by Rothenberg, Flapan, and Chandler and Greenspan will be detailed here. The Borke procedure, though popular and controversial, will not be discussed because her task is only appropriate for preschoolers. Thus, it would not be used in elementary or secondary social education programs.

In Rothenberg's task, four tape recordings of a male and female engaged in interactions which depict adult concerns are played to the child. Each story portrays one affect; across the four stories these are happiness, anger, anxiety, and sadness. For example, in the anxiety story the man tells the woman that he has invited some friends to come for dinner. The woman does not feel she can be ready and does not know where to start. Adult problems are presented to avoid the problem of the experimental child's identifying with the characters rather than understanding them from the other's point of view. Before the story is played, the child is instructed to focus on only one character to provide a similar point in listening for all subjects. After the story, the child is asked how the character felt and why he/she felt that way. To prevent carryover from a previous story, each story uses a different pair of male-female actors.
To receive the highest score, the child must be aware of changes in feelings of the focal character. In the previous example, the woman changed from calm or happy as the man came home to anxious after he told of the plans. The measure has also been used by Hudson (1978), Johnson (1975), Moir (1974), and Rubin (1978).

The Flapan (1968) procedure entails the presentation of two sound-film clips which portray episodes of social interaction. Each film contains "a sequence of events that constituted a complete self-contained story, with an introductory scene establishing a 'problem,' intervening scenes on the same theme, and a scene concluding the action dealing with the theme" (p. 11). Each of the two film selections is shown in five episodes, with each episode lasting between 1.5 and 3 minutes. At the conclusion of each episode in each film the child is asked to give an accounting of what occurred during the scene portrayed. Responses to this unstructured interview are scored for affective sensitivity if the subject is able to infer or interpret feelings, interpersonal perceptions, and intentions or expectations that are not obviously expressed or labeled. Following this, standardized questions are asked of the child for each episode to determine the kinds of interpretations that are given to the feeling when attention is directed to specific events in the episode. This has not been a widely used task.

Chandler and Greenspan's (1972) task consists of two parts. The first half of the procedure utilizes three cartoon sequences which portray the story characters in affect-arousing situations "leading inevitably to feelings of anger, fear, and sadness" (p. 105). Subjects are questioned at this point to determine their anticipation of the emotional reaction of the story characters. A sequel to these stories is included in the second phase of the assessment procedure.
The central characters, who are shown to behave in a manner consistent with their previously aroused affective state, are joined by a late-arriving character who witnesses only the demonstration of the emotional behavior, but not the antecedent circumstances that caused it. The child is required first to relate the entire story from his/her own point of view and then to reinterpret the events from the limited perspective of the partially informed bystander. If the privileged information available only to the child is part of the account offered as descriptive of the point of view of the late-arriving bystander, then such a response is considered egocentric. This two-step assessment procedure adopted by Chandler and Greenspan allows an educator to distinguish developmentally between less mature affective sensitivity and flexible role-taking.

Cognitive and Affective Role-taking Measures

Feffer and Gourevitch's (1960) projective role-taking task (RTT) was the first role-taking measure to be developed. Both cognitive and affective responses are elicited. The child is first shown a variety of cut-out figures of people and two different scenes. The child is asked to tell a story about each scene using any of the figures. Usually three figures are used per story. Next, the child must retell each story several times, each time from the viewpoint of one character who appeared in the child's initial story of that scene. In theory, each new role that the child takes adds to the perspective-taking complexity of the situation. A successful performance is evaluated by the child's ability to decenter or refocus attention to the new character's point of view. If the child refocuses on the character's perspective while retelling the story, but does not decenter entirely, the child is scored on the simple refocusing level. For example, suppose in the initial story the child described the father as having a "terrible day at the office."
In the retelling, the child in the father's role might say, "It's 5:00 and I'm hungry," without mentioning the office. Although he/she could take the perspective of the father, the child did not consistently refocus on the father's concerns in the initial story. If the child can change perspective at will, that is, take the role of the father and of a second character, the mother, who is aware of father's sadness, then the child is given an even higher score. For more details on the scoring system see Schnall and Feffer (Note 13). This measure has been used in the following research: Keller (1976), Kurdek (1977a,b), Piche, Michlin, Rubin, and Johnson (1975), Turnure (1975), Wolfe (1963), and Marsh and Serafica (Note 14). The above measures' psychometric properties will now be described.

Reliability of Role-taking Measures

Reliability can be thought of as consistency. There are two different ways a measure can show consistency: across items and across time. The former is referred to as internal consistency while the latter is temporal stability. Interrater reliability is a consistency check on two or more raters or scorers. The coefficient tells the extent to which the judges' scores are proportional when expressed as deviations from their means. Interrater agreement gives the degree to which one judge gives the exact same scores as the other judge. For example, suppose two raters independently rated three items. Rater No. 1 assigned the scores 1, 3, 5 while Rater No. 2 assigned the scores 2, 4, 6 to the same items. Their interrater reliability would be r = 1.00, while their exact agreement would be 0 percent. See Tinsley and Weiss (1975) for a discussion of these two interrater consistency checks.

Tables II and III give the existing reliability data in role-taking. Table II shows the cognitive tasks; Table III shows the affective measures. Most, although not all, of the values represented in both tables were derived from Pearson product moment correlation coefficients.
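The two-rater example above can be checked numerically. The following is a minimal sketch in Python; the function names `pearson_r` and `exact_agreement` are illustrative choices, not part of any procedure cited here. It treats the Pearson correlation as the index of interrater reliability and the proportion of identical scores as the index of exact agreement.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def exact_agreement(x, y):
    """Proportion of items on which the two raters gave identical scores."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

# Scores from the example in the text.
rater_1 = [1, 3, 5]
rater_2 = [2, 4, 6]

reliability = pearson_r(rater_1, rater_2)
agreement = exact_agreement(rater_1, rater_2)
print(f"interrater reliability r = {reliability:.2f}")  # 1.00
print(f"exact agreement = {agreement:.0%}")             # 0%
```

Because Rater No. 2 is always exactly one point higher, the deviations from each rater's mean are identical, which is why the correlation is perfect even though no score matches.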
Although reliabilities such as internal consistency have specific techniques for deriving a coefficient (Cronbach's alpha, Hoyt's reliability, Spearman-Brown formula, Kuder-Richardson formula), these have not been traditionally used in role-taking. The exceptions follow. Feffer and Gourevitch (1960) performed a χ² test of independence rather than a Pearson r to derive their internal consistency estimate. Chandler (1973) used a Spearman-Brown split-half reliability treating five of the cartoon sequences as one item and the other five sequences as another item. Piche et al.'s (1975) interrater reliability coefficient was derived by Scott's r.

[TABLE II. Reliability for Cognitive Role-taking Measures. Columns: Measure; Ages/Grades; N at each age; Temporal Stability; Internal Consistency. Rows, by measure:
Apple-dog story: Olejnik (Note 5).
Chandler's measure: Chandler (1971)ᵃ; Chandler & Greenspan (1972); Chandler (1973); Olejnik (Note 5); Piche et al. (1975); Leahy & Huard (1976); Urberg & Docherty (1976); Kurdek (1977b); Rubin (1978).
Nickel-dime game and other guessing games: Byrne (1974); O'Connor (Note 7); Kurdek (1977b).
Sociomoral dilemmas: Selman & Byrne (1974); Byrne (1974); Selman & Lieberman (1975); Kurdek (1977b).
Recursive thinking: Rubin (1973); Rubin (1978).
Other cognitive measures: Mossler et al. (1976).
ᵃ Reported in Chandler et al. (1974).]

[TABLE III. Reliability for Affective and Cognitive-Affective Role-taking Measures. Columns: Measure; Ages/Grades; N at each age; Temporal Stability; Internal Consistency. Rows, by measure:
Rothenberg's measure: Rothenberg (1970); Moir (1974); Rubin (1978); Hudson (1978).
Flapan's task: Flapan (1968).
Chandler & Greenspan: Chandler & Greenspan (1972).
Feffer's measure: Feffer (1959); Feffer & Gourevitch (1960); Wolfe (1963); Piche et al. (1975); Turnure (1975); Keller (1976); Marsh & Serafica (Note 14); Kurdek (1977b).]

What becomes immediately apparent when examining both tables is how few studies are represented compared with the total number of role-taking studies undertaken. Fewer than half of the studies are represented. Of those represented, few cite all reliabilities. Such information would be helpful in interpreting the usefulness of a given measure because each reliability answers a different question about that measure. By far the most represented reliabilities encompass interrater agreements and reliabilities. It seems that consensus has set a precedent in that these reliabilities are seen as sufficient for reporting a measure's consistency. Although these data give valuable information, they tell us more about the judges' accuracy than something about the measure's consistency. When interpreting any given value, it is possible to subtract the value from the integer 1 to derive an estimate of the amount of error in the scores.
For instance, a coefficient of .97 represents a measure which produces scores with only .03 or 3 percent error. This means that any given score is probably very close to its theoretically true score (see Stanley, 1969). With this in mind, it is clear that all of the measures represented can be scored with a minimum of judges' errors. The consistency of the other aspects of reliability for most measures, however, either is not as strong or is unclear because of lack of data. The exceptions are Chandler's task, Selman's sociomoral dilemmas, and the nickel-dime game, all of which appear to have an adequate degree of internal homogeneity and temporal stability. Poor internal consistency, on the other hand, is apparent in the Rothenberg affective task, and in the Feffer scale. It should be noted also that Rubin's (1978) technique of examining internal consistency within age leads to quite different conclusions than when that statistic is obtained by collapsing across age. In both instances in which he did this, Rubin obtained poor internal consistencies within each age, but minimally acceptable values when collapsing across age. One must be cautious in interpreting the within-age values because the restricted range may be producing spuriously low correlations. The lack of adequate reliability values in many measures creates a problem in turning to the validity results. One usually does not draw conclusions about a measure's relationship with other variables until that measure has been first found to be reliable (Nunnally, 1967). This is the case since, if measurement error is responsible for the scores on one of the measures, acceptance of a null hypothesis may have come about anomalously through random fluctuations in that measure. False negatives, then, and at times false positives, may be the result. Unfortunately, the focus in role-taking has been predominantly on validity outcomes without a concomitant focus on reliability. 
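Two points in the preceding discussion lend themselves to a short numerical illustration: subtracting a reliability coefficient from 1 to estimate error, and the caution that a restricted range can depress correlations. The Python sketch below is illustrative only. The simulated scores, the population correlation of .70, the sample size of 5,000, and the cutoff of half a standard deviation on either side of the mean are all assumptions chosen for the demonstration, not values from any study reviewed here.

```python
import math
import random

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def error_proportion(reliability):
    """Estimated proportion of error in the scores: 1 minus the coefficient."""
    return 1.0 - reliability

def simulate_range_restriction(n=5000, rho=0.70, band=0.5, seed=0):
    """Correlate two simulated scores in the full sample and in a narrow band.

    All parameter values are hypothetical. The narrow band on x stands in
    for a single age group whose scores cover only part of the full range.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * noise)
    full_r = pearson_r(xs, ys)
    kept = [(x, y) for x, y in zip(xs, ys) if -band <= x <= band]
    restricted_r = pearson_r([x for x, _ in kept], [y for _, y in kept])
    return full_r, restricted_r

print(f"error for a coefficient of .97: {error_proportion(0.97):.2f}")
full_r, restricted_r = simulate_range_restriction()
print(f"full-range r = {full_r:.2f}, restricted-range r = {restricted_r:.2f}")
```

Under these assumed settings the correlation computed within the narrow band is clearly smaller than the full-sample value, even though the underlying relationship is the same in both.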
Construct Validity

All social cognitive developmental scales are theory-based. That is, the scales are developed to reflect the construct underlying that scale. Whenever a scale is so constructed, then the researcher must demonstrate through empirical evidence that the scale does represent that particular construct. This is done by studying the relationships between the scale and other variables which would be expected in theory to either show a relationship or not. Such a procedure to elucidate the connection between a scale and its underlying construct by studying relationships between the scale and outside variables is construct validation. This section will consider the construct validity of role-taking.

Cognitive Role-taking

Given that the cognitive and affective constructs as outlined previously are structural/developmental, then there are at least 11 validation criteria for a construct valid scale. These criteria are in Table IV.

TABLE IV
Summary of Evidence Needed to Validate the Role-taking Constructs as Structural Developmental

Validation Criteria
1. Stages should increase with age to reflect the developmental nature of the construct.
2. Criteria should be established which clearly state the kinds of evidence necessary to support one and refute the other theory of egocentrism/decentration or operative knowing, if these two are seen as conceptually distinct.
3. It should be demonstrated that the necessary but not sufficient Piagetian stages do precede and are in part responsible for a corresponding role-taking level.
4. High internal consistency is needed to show that a child's reasoning represents a structured whole.
5. High temporal stability with no regression to lower levels is needed to support the invariance construct.
6. Criteria must be established with empirical support to show that the stages are hierarchical.
7. Cross-cultural evidence must be obtained to demonstrate the universality of the stages.
8. For the cognitive tasks, significant correlations with other cognitive role-taking scales must be obtained to demonstrate homogeneity within the domain. Affective tasks should also relate to cognitive tasks if there is a general role-taking domain.
9. For the affective tasks, significant correlations with other affective role-taking scales must be obtained to demonstrate homogeneity within this domain. Cognitive tasks should also relate to affective tasks if there is a general role-taking domain.
10. Differences on the scale between groups reflecting different levels of social adjustment would be expected to show the reasoning and behavior relationship.
11. There should be a higher within-scale or within-domain correlation than correlations between the scale and general intelligence.

Turning to the cognitive scales first, we see that the first criterion is clearly supported with all scales because all previously cited stage theorists have shown the levels to appear in the expected order with their scales. Validation criteria two and three have no data for any of the measures. With regard to the fourth validation criterion, the apple-dog story, because it is a one-item test, is not amenable to internal consistency analyses. Chandler's (Chandler, 1972, 1973; Olejnik, Note 5) measure and Mossler et al.'s (Note 6) adaptation of that measure have shown evidence to support the conclusion of a homogeneous domain. This is further supported by Urberg and Docherty's (1976) cluster analysis in which Chandler's tasks all clustered together showing common variance. Both the nickel-dime game and Selman's sociomoral dilemmas also seem to have support (see Table II). For the fifth criterion, while Chandler's, the nickel-dime, Selman's sociomoral dilemmas, and the recursive thinking scales all show moderate stability (see Table II), no study to date has examined the invariance assumption.
No scale has been examined via criteria six or seven. For the eighth criterion, when comparing Chandler's measure with the apple-dog story, Olejnik (Note 5) and Kurdek (1977a) found relationships between the measures. Kurdek's findings, however, may be sample specific because he used a stepwise multiple regression procedure without cross-validation. The maximization procedure used may have produced spurious results. In a chi-square analysis, Selman (1971a) found a significant relationship between the apple-dog story and the nickel-dime game. Although Chandler's task did correlate with both Miller et al.'s recursive thinking task and a guessing game similar to the nickel-dime game (Rubin, 1978), it did not relate to Feffer's task. Either the poor internal consistency or the affective components of the Feffer measure could have been partly responsible for the lack of relationship between the two. The sociomoral dilemmas in a principal components analysis have been related to Feffer's task (Kurdek, 1977b). The principal components solution, however, is probably inappropriate in light of the minimal internal consistency of Feffer's scale. For the ninth criterion, the correlations across cognitive and affective role-taking domains are low with no significant relationships in most cases (see Kurdek & Rodgon, 1975; Moir, 1974; Rotenberg, 1974; Kurdek, Note 4). The only significant relationship (.34) has been between the apple-dog story and Rothenberg's tasks (Hudson, 1978). The latter did not relate significantly to the nickel-dime game (Moir, 1974). For the 10th criterion, Chandler (1972, 1973; Chandler et al., 1974) has shown that emotionally disturbed and delinquent children score lower on his measure than do normal children. In a similar way, Selman (1976) has demonstrated that his sociomoral dilemmas can discriminate emotionally disturbed from normal children and adolescents.
Finally, for criterion 11, Chandler's measure shows adequate convergent-discriminant validity. Chandler (1973) reports an internal consistency of .92 and a correlation of -.30 between role-taking and IQ as measured by the Peabody Picture Vocabulary Test (PPVT). The higher within-domain correlation suggests a distinct domain apart from IQ. Similarly, Kurdek (1977b) has shown that the internal consistency for the Chandler tasks is in the .60's whereas the correlation between the Chandler tasks and IQ (via Raven's Progressive Matrices) is .51. An r to z transformation, however, should be performed on the latter values to test for a statistically significant difference. Only Rubin's (1978) data contradict the above. He has demonstrated that a composite of the Chandler, Miller et al., and guessing game role-taking tasks correlated .18 with the PPVT and only .20 within themselves. This could be a function, however, of the unreliability of any of the tasks or of a lack of construct validity within a larger cognitive role-taking domain rather than a lack of discriminant validity for the Chandler measure. The nickel-dime game via a parallel form correlated .69 internally, but correlated .40 with the Raven's test (Kurdek, 1977b), thus satisfying the 11th criterion. The latter study further reports a higher internal consistency (.62) for Selman's dilemmas than between that scale and Raven's (.38).

In summary, 11 criteria would be needed if any given cognitive role-taking measure is to demonstrate construct validity. The criteria and their evidence for each measure are in Table V. All measures passed the first and eighth criteria, showing that all are developmental and suggesting that all tap a similar domain. No measure passed all criteria. Caution should be used, as stated previously, in judging any criterion that requires a correlation with another role-taking measure because not all measures have demonstrated adequate reliability.
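The r to z transformation recommended above for Kurdek's (1977b) values can be sketched as follows. This minimal Python example uses Fisher's transformation to compare two correlations. It assumes the two coefficients come from independent samples, which is a simplification: in the case discussed both coefficients come from the same children, so a test for dependent correlations would be more appropriate. The sample size of 100 is hypothetical, and .65 is an assumed stand-in for a value "in the .60's."

```python
import math

def fisher_z(r):
    """Fisher's r to z transformation."""
    return math.atanh(r)

def compare_independent_correlations(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two independent correlations.

    Returns the z statistic and its two-tailed p value under the normal
    approximation. Independence of the two samples is an assumption.
    """
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical inputs: within-domain r of .65 versus r of .51 with IQ,
# each treated as if based on 100 children.
z_stat, p_value = compare_independent_correlations(0.65, 100, 0.51, 100)
print(f"z = {z_stat:.2f}, two-tailed p = {p_value:.3f}")
```

With these assumed inputs the difference does not reach the conventional .05 level, which illustrates why a descriptive comparison of two coefficients is not sufficient on its own.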
An examination of the table reveals that both Chandler's and Selman's tasks show the best construct validity. When this is coupled with the adequate reliability of these tasks, these measures seem appropriate for use in scientific investigations. It should be stressed, however, that the two scales measure different aspects of cognitive role-taking as seen in Table I. The one drawback of the Chandler scale is its potential for a ceiling effect. The drawbacks of the Selman scale are its interview format with nonstandardized probe questions and the subjective scoring criteria (see the Role-taking Measures section). The nickel-dime game also may be appropriate for educational programs because it passed five validation criteria, and failed only one, the latter being a lack of relationship with an unreliable scale. The one rather surprising set of results is that for the apple-dog story. Although the apple-dog task is very popular with researchers, it has yet to demonstrate adequate reliability and construct validity. This is not because the scale has failed any criteria, but rather because there have been few tests of its reliability and validity.

TABLE V
Summary of Available Construct Validation Evidence for Cognitive Measures

                                                              Evidence for Scales
Validation Criterion                          Apple-Dog   Chandler          Nickel-Dime       Sociomoral        Recursive Thinking
1. Relationship with age                      yes         yes               yes               yes               yes
2. Egocentrism or operative knowing           naᵃ         na                na                na                na
3. Piagetian stages as necessary              na          na                na                na                na
4. Homogeneity                                na          yes               yes               yes               na
5. Temporal stability; invariance             na          partial support   partial support   partial support   partial support
6. Hierarchization                            na          na                na                na                na
7. Universality                               na          na                na                na                na
8. Relationship with other cognitive scales   yes         yes               yes               yesᵇ              yes
9. Relation with affective scales             yes         naᶜ               not supportedᵈ    na                na
10. Relation with behavior                    na          yes               na                yes               na
11. Convergent-discriminant validity          na          yes               yes               yes               na

ᵃ na = not available.
ᵇ The relationship is with the Feffer task, which is difficult to classify as either cognitive or affective.
ᶜ There was no relationship with the Feffer task, which is difficult to classify as either cognitive or affective. For these purposes, it is given more weight as a cognitive variable because if there were a relationship, it might be attributable to the overlap between the cognitive components of the two constructs.
ᵈ The nonsignificant relationship was with the Rothenberg scale, which has shown questionable internal consistency reliability.

Affective Measures

The affective measures of role-taking developed by Rothenberg (1970), Chandler and Greenspan (1972), and Flapan (1968) all satisfy the first validation criterion. For criteria two and three there are no available data. The fourth criterion regarding high internal consistency was not met by all the scales. Rothenberg's (1970) within-domain correlations of .28-.47 do not produce strong evidence of a homogeneous domain. Hudson's (1978) data, however, show homogeneity for this scale. Chandler and Greenspan (1972) offer no evidence that a child's reasoning represents a structured whole. Flapan (1968), while reporting only moderate reliability estimates for her younger subjects, calculates a coefficient of .82 for the eldest subjects sampled in her study. The Flapan and Chandler and Greenspan procedures have no evidence for the remaining validation criteria. The Rothenberg measure has no evidence for criteria five, six, and seven, though it does have evidence for the remaining four criteria. Regarding the eighth criterion, a relationship between the Rothenberg task and the apple-dog story was significant (Hudson, 1978) although its relation with the nickel-dime game, a cognitive role-taking task, has been negligible (Moir, 1974).
Rubin's (1978) correlation of the Rothenberg scale and one intended to tap empathy (Borke, 1971), a construct similar to affective role-taking, yielded no significant relationships, a test of the ninth criterion. The Rothenberg task seems to satisfy the 10th criterion in that high scores on affective role-taking have been related to peer and teacher ratings of one's interpersonal skill. The final criterion is not satisfied. The Rothenberg task does not appear to be a domain separate from general intelligence, as both sets of correlations were reported in the .20's. A summary of the construct validity of each measure is in Table VI. It seems clear that the Rothenberg scale does not demonstrate adequate construct validity. The validity of the other two scales is still unknown.

Cognitive and Affective Measures

As with all other measures, the Feffer scale is related to age, thus satisfying criterion one (see, e.g., Feffer & Gourevitch, 1960; Kurdek, 1977b; Turnure, 1975). There is no evidence for the next two criteria. It is clear that criterion four is not supported as the values in Table III indicate. Criterion five has partial support because one study (Kurdek, 1977b) found adequate temporal stability. While criteria six and seven have no evidence, criterion eight has partial support because the scale has been related to the sociomoral dilemmas (Kurdek, 1977b) but not to a task similar to the nickel-dime game (Rubin, 1978). The only other criterion evaluated has been the 11th (see Table VI), which has partial support. Two sets of data for Feffer's scale suggest that the individual stories of that measure share more common variance with IQ than with parallel forms of the stories. For instance, Turnure (1975) reports an internal consistency for the Feffer scale of .40, but a correlation with the Kuhlmann and Anderson IQ test of .60. Similarly, Keller (1976) found higher correlations between the scale and IQ than between two of Feffer's stories.
Only Kurdek (1977b) has found a higher within-scale r (.40) than an r (.19) between the scale and IQ via Raven's test. Although this is a popular scale with researchers, it does not warrant use in social education programs for several reasons: (1) the construct is not clearly defined; (2) the internal consistency reliability is low; and (3) there is only one validation criterion, the first one, that has clear support. Any educational program that failed with this as the dependent measure would be left with too many competing hypotheses regarding the failure.

TABLE VI
Summary of Available Construct Validation Evidence for Affective and Cognitive-Affective Measures

                                                              Evidence for Scales
Validation Criterion                          Rothenberg        Chandler & Greenspan   Flapan   Feffer
1. Relationship with age                      yes               yes                    yes      yes
2. Egocentrism or operative knowing           naᵃ               na                     na       na
3. Piagetian stages as necessary              na                na                     na       na
4. Homogeneity                                partial support   na                     yes      not supported
5. Temporal stability; invariance             na                na                     na       partial support
6. Hierarchization                            na                na                     na       na
7. Universality                               na                na                     na       na
8. Relationship with cognitive scales         partial support   na                     na       partial support
9. Relation with other affective scales       not supported     na                     na       na
10. Relation with behavior                    yes               na                     na       na
11. Convergent-discriminant validity          not supported     na                     na       partial support

ᵃ na = not available.

A Comparison of the Measures Across Domains

It is clear from Tables V and VI that cognitive role-taking is the most developed domain regarding validity. It appears that this domain is both developmental and cognitive (see criteria one and eight, Table V). The cognitive tasks with the most validity also have evidence to suggest that even though they share variance with general intelligence, they can be discriminated from it. It should be kept in mind, however, that a comparison between internal consistency within role-taking and a correlation with intelligence offers only a rough estimate of discriminant validity.
Even so, the evidence here is in direct opposition to Ford's (1979) conclusion regarding cognitive egocentrism in general. This discriminant validity is an advantage for social educators, because gain on these measures, most likely, will not lead to the competing conclusion that the educational program promoted only verbal ability or general intelligence rather than cognitive role-taking per se. Although the usual educational procedure is to develop a program and to then choose a relevant measure, such a strategy may not work effectively in role-taking education. Affective role-taking programs may be difficult to do at present. Not only are the measures questionable, but the constructs are currently confounded with empathy. Carefully designed and researched affective role-taking programs may have to wait for further psychometric development of the domain. For those interested in cognitive role-taking programs, Chandler's and Selman's tasks and the nickel-dime game are recommended, given that these are relevant to the skills chosen in the program.

Conclusion and Recommendations in Role-taking

The above analysis leads to four general conclusions. First, there are conclusions with regard to the construct itself. The construct of social role-taking is used in the literature in several different (but possibly overlapping) ways. There is cognitive role-taking with variations within that domain (see Table I) and there is affective role-taking with a controversy of empathy and role-taking existing within it. It is recommended that future role-taking studies clearly define the subconstruct and assumptions of that subconstruct. This is not always done at present. For example, in one study (Ambron & Irwin, 1975) 32 role-taking items are correlated with moral judgment but no theoretical classification of role-taking is reported. Without knowing into which role-taking construct to place the various items, the reader cannot judge the theoretical usefulness of the data.
As a second point with regard to the construct, most theory at present contains untestable assumptions. It would seem that such assumptions as invariance and regression, while philosophically useful, do not lend themselves to clear scientific conclusions if the theorist maintains that even one or two subjects regressing violates the assumption. Because measurement error occurs in all tests, it may be unclear whether observed regressions are due to error or to a true developmental regression.

A second general conclusion regards the construction and choice of measures. Decisions, at present, as to which measure to use in a study seem to be based more on consensus than on whether the measure is reliable. For instance, the apple-dog story has been used extensively, but the measure, as stated previously, has yet to demonstrate psychometric value. Similarly, the Rothenberg scale has limited psychometric value. To begin alleviating the measurement problems it is recommended that new role-taking measures be developed because most existing role-taking tasks simply do not have the number of items needed for rigorous psychometric analyses. It is recommended that the classical approach to test construction be taken because it has proven so successful in the past with other areas such as personality psychology. Although this approach is not new, it is new for the role-taking area. It is recommended, then, that the following steps be taken:

1. Define a domain and specify the hypothetical constructs of that domain.

2. Develop what appears to be a homogeneous item pool reflecting the domain. For example, the number of Chandler's bystander cartoons could be increased using existing ones as a guide. Because children's responses to any item are brief, a 20-item test may not increase fatigue.

3a. Do an item analysis to empirically test for homogeneity.
It may be difficult to analyze each item on the clinical measures common in role-taking because not all children are given the same questions. Yet, the researcher can still select those questions that consistently lead to responses by most or all subjects and analyze the items via item discrimination (Nunnally, 1967). The latter concerns the average intercorrelation among items and could be used for any of the scales. One way to analyze discrimination is to correlate each item with the total score. Because role-taking scales have so few items, it may be best to do an item-total correlation with that item removed from the total. Otherwise, the correlation may be inflated. This will tell the researcher whether any given item differentiates between the average total score of people receiving high scores and the average total score of people receiving low scores. Another procedure, item difficulty, is relevant only for items that have objective right or wrong answers. A measure such as Chandler's could be adapted so that a composite of the correct answers rather than a stage score represents the total score. Difficulty refers to how hard or easy an item is. For instance, a p-value of .5 for an item means that 50 percent of the sample answered the item correctly. Selecting items with difficulty halfway between chance and 1.00 leads to maximum spread of scores. The use of item difficulty and discrimination together can lead to the selection of the most reliable items which tend to discriminate among people as much as possible.¹
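The item statistics described in step 3a can be sketched in a few lines of Python. The response matrix below is entirely hypothetical (six children, four dichotomously scored items), and the function names are illustrative. The sketch computes each item's difficulty (p-value), the item-total correlation with and without the item included in the total, and the target difficulty halfway between chance and 1.00.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: rows are children, columns are items (1 = correct, 0 = incorrect).
RESPONSES = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

def item_difficulty(responses, item):
    """p-value: proportion of the sample answering the item correctly."""
    return sum(row[item] for row in responses) / len(responses)

def item_total_r(responses, item, corrected=True):
    """Item-total correlation; if corrected, the item is removed from the total."""
    item_scores = [row[item] for row in responses]
    totals = [sum(row) - (row[item] if corrected else 0) for row in responses]
    return pearson_r(item_scores, totals)

def target_difficulty(chance):
    """Difficulty halfway between the chance level and 1.00."""
    return (chance + 1.0) / 2.0

for i in range(len(RESPONSES[0])):
    print(
        f"item {i + 1}: p = {item_difficulty(RESPONSES, i):.2f}, "
        f"item-total r = {item_total_r(RESPONSES, i, corrected=False):.2f}, "
        f"corrected r = {item_total_r(RESPONSES, i, corrected=True):.2f}"
    )
print(f"target p, open-ended item (chance 0): {target_difficulty(0.0):.2f}")
print(f"target p, dichotomous item (chance .5): {target_difficulty(0.5):.2f}")
```

In this made-up matrix the uncorrected item-total correlation for the second item is larger than the corrected one, which is the inflation that removing the item from the total is meant to avoid when a scale has few items.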
¹ For open-ended questions the p-value desired is usually .5 for each item because it is halfway between 0 (or chance) and 1.00. For a dichotomous item such as the adapted Chandler measure (in which the child role-takes or not) chance is probably .5 (the child has one chance in two of obtaining the right answer). The midpoint between chance and 1.00 would then be .75, which would be the p-value sought. The open-ended and dichotomous distributions are represented below:

Open-ended item: 0 (chance) | .5 | 1.00
Dichotomous item: .5 (chance) | .75 | 1.00

3b. Even if a researcher does not wish to spend the time required in 3a, he or she could still construct parallel form tests and, treating each score as an item, do an internal consistency analysis on the two or three forms.
4. With the more easily scored scales like Chandler's cognitive scale, eliminate those items that are too difficult or too easy for a given age group. With any of the scales, eliminate those items that are not highly correlated with total score. In both the difficulty and discrimination procedures, new items could then be added and tested via 3(a, b) above because fewer items tend to lower reliability. Again, the purpose is to maximize internal consistency. This is the case because the domains are theory-based, representing a distinct ability, and therefore should have high internal consistency.
5. For those particularly interested in educational uses, the next step would be to retest a sample and, via a time 1 to time 2 correlation on each item, select only those already internally consistent items that are temporally stable. This latter property is especially important for those measuring change because the investigator certainly does not want spurious change due to poor stability in the dependent measure following an experimental procedure.
6. Replicate by sampling from a different population to eliminate results that are sample specific.
7.
Use the measure to validate the construct and the items with the possibility of adding or eliminating items depending on correlational results with other domains. For instance, referring to Table IV, one would expect one of Chandler's cognitive role-taking items to correlate positively with Selman's cognitive measure. On the other hand, one would probably eliminate an item that shares much variance with general intelligence.

What has happened is that researchers have gone from steps 1 and 2 to step 7, bypassing test construction. The result is few significant relationships with other measures and no confidence in those findings that are significant because of possible sample specificity and measurement error. Ford (1979), too, has overlooked this test construction point, calling for a reevaluation of the egocentrism construct based on the lack of relationships, even before well-constructed scales have been developed.

Some (Zahn-Waxler et al., 1977) have suggested that to alleviate role-taking measurement problems one should administer a battery of role-taking tasks and derive one score which would be a "stable indicator." The resulting variate is a composite of the various tasks. Such procedures would seem, however, to confuse the issue more than rectify it, because the variate may represent multiple constructs and may also carry the uncertain reliabilities of many of its items. Multiple scales might be used without a composite if it is eventually found with adequate measures that people vary in their role-taking abilities across the various role-taking areas. Until systematic attempts are made to resolve the methodological questions regarding test construction, however, such a conclusion may be premature.

A third general conclusion regards the existing reliabilities of the various role-taking scales.
One factor which might contribute to the unstable reliabilities of some role-taking measures is the influence of practice effects. Researchers have not investigated whether subjects demonstrate improved performance on later trials in a multitrial assessment procedure, though conceivably some role-taking tasks would be amenable to such a response bias. In the Chandler (1972) procedure, for example, is it not possible that children would respond differentially to the later cartoon sequences after having been exposed to earlier cartoon trials? Perhaps it is the influence of practice effects which accounts for the disparate estimates of internal consistency reported by Chandler (1973), Kurdek (1977b), and Rubin (1978). Such an explanation has yet to be ruled out.

Further, in addition to poorly constructed test items or practice effects, low internal consistency estimates may also be related to the content of role-taking measures. If the content of such measures varies in terms of such variables as item complexity or familiarity, or other stimulus properties, then the resulting test performance may not be an accurate reflection of role-taking ability. An analogous situation exists in the spatial perspective-taking domain. In a recent review, Fehr (1978) identified numerous methodological variables such as stimulus dimensionality, the number of objects presented in the spatial arrays, the presence or absence of landmarks, the familiarity of the stimuli, the orientation and nature of the other observer, and others, all of which contribute to wide fluctuations relative to the onset and decline of egocentric responding. A similar concern for content variables in the social role-taking domains may reduce the occurrence of false positive and false negative diagnoses, and improve reliability estimates.
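The two concerns raised above, internal consistency and possible practice effects across trials, can both be inspected from the same item-by-child score matrix. The sketch below is illustrative only. The text does not name a particular internal consistency index, so coefficient alpha is assumed here; comparing mean scores by trial position is likewise an assumed, informal way of looking for a practice trend, not a method drawn from the studies cited. The function names and data layout are hypothetical.

```python
from statistics import mean, pvariance


def coefficient_alpha(scores):
    """Coefficient alpha for one list of item scores per child."""
    k = len(scores[0])
    item_vars = [pvariance([child[i] for child in scores]) for i in range(k)]
    total_var = pvariance([sum(child) for child in scores])
    if k < 2 or total_var == 0:
        return float("nan")  # alpha is undefined in these cases
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def mean_by_trial(scores):
    """Mean score at each trial position, in order of administration."""
    k = len(scores[0])
    return [mean([child[i] for child in scores]) for i in range(k)]


if __name__ == "__main__":
    # Hypothetical data: five children, three cartoon trials in the order given.
    trials = [
        [0, 1, 1],
        [0, 0, 1],
        [1, 1, 1],
        [0, 0, 0],
        [0, 1, 1],
    ]
    print(coefficient_alpha(trials))
    print(mean_by_trial(trials))
```

A steady rise in the trial means would be consistent with a practice effect, although it could equally reflect easier items placed late in the sequence. Counterbalancing the order of trials across children would be needed to separate the two.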
Yet a third factor bearing on internal consistency, and one still to be considered, is that consistent performance on role-taking tasks may itself be a developmental phenomenon. In the Flapan (1968) study, for example, age trends in both the structured and unstructured phases of the assessment procedure are evident, indicating that older subjects are more consistent responders than younger subjects. Internal consistency also shows changes with age in the Rubin (1978) study, with reliability estimates ranging from .18 for subjects in the first grade to .39 for fifth-graders, the eldest subjects sampled. However, caution must be exercised in evaluating the trends reported in these studies because the reliability coefficients were calculated with a restricted range of scores. One possible way to overcome the restricted range problem would be to assess three groups of younger subjects, say first, second, and third graders, and three groups of older subjects in grades five, six, and seven. One could then compare the pattern of internal consistency responding of younger and older subjects with an expanded range of scores.

A fourth general conclusion which can be reached is that the validation focus in role-taking is, at present, too narrow. The social role-taking constructs are highly complex and yet the validation focus has primarily been on the age-stage relationship. If a true validation picture is to emerge, it would seem that researchers must begin to ask questions about necessary conditions for growth, hierarchization, cross-cultural generalizations, the relationship with behavioral competence, and expanded discriminant validation studies.

As a way to begin examining necessary conditions for growth, role-taking researchers could follow Walker and Richards' (1979) approach of examining such necessary conditions in the moral judgment area via training programs.
For example, these researchers found it easier to train stage 3 subjects (who take a group perspective) to reason at stage 4 (a societal, abstract perspective) if the subjects already showed some evidence of abstract thinking at Piaget's formal operational stage. Not only could such findings have utility for basic research, but also for education, since educators would know how to maximize the probabilities for role-taking change.

To examine the hierarchization assumption, role-taking researchers could follow Rest's (1973) lead in the moral area by asking comprehension questions for each role-taking level. The assumption here is that if the child understood stage three statements, he or she should also be able to understand stage two and one statements. Further, the child's lack of understanding of the stage two statements should lead to the same lack of comprehension for higher stages. Guttman analyses could then be used to assess the hierarchical nature of the stages.

The examination of a role-taking relationship with behavioral competence could be expanded to include ecological validity or naturalistic observation of behavioral reciprocity. Such analyses have begun in the area of moral judgment with successful results (Enright & Sutterfield, 1980). Knowledge of a role-taking and social behavior relationship in school children could give educators a clue as to how newly acquired role-taking abilities could aid the child in his or her everyday social interactions.

As a validation suggestion for discriminant validity, the current convergent-discriminant validations could possibly be improved via a multitrait-multimethod analysis (Campbell & Fiske, 1959). This could be done by sampling a role-taking domain in two different ways, such as Selman's interview, which relies on verbal production, and a paired-comparisons version of the same trait, which relies on recognition.
For the latter, a procedure similar to Enright, Franklin, and Manheim (1980) could be used. This would provide the multimethod component. The within-domain relationship, then, could be compared with IQ, again assessed in two different ways, one a production and the other a recognition task. This would provide for the multitrait component. The expectation, of course, is that both Selman procedures would share greater variance than that between the Selman and IQ traits.

Two assumptions which appear untestable to us are criterion two and part of criterion five. Regarding the former, if egocentrism and operative knowing are two distinct explanations, as Youniss (1975) insists, then there should be concomitant evidence or discussion of why operative knowing as opposed to egocentrism should be considered the underlying cognitive mechanism responsible for the level observed in the child. The invariance assumption of the fifth criterion appears to have no empirical support because it was derived from a philosophical assumption (see, e.g., Selman, 1976) that is not easily amenable to empirical investigation. That is, as stated previously, if a child regresses there is no way to unconfound the two possibilities of either a true regression or measurement error which has led to temporal instability in the test. If a theorist wishes to include untestable assumptions in his or her role-taking model, it should be clearly stated that these, indeed, are untestable. It is our view, however, that assumptions in scientific work should eventually lead to testable hypotheses.

Given both the reliability and validity difficulties that exist at present in role-taking, it is not at all surprising to find inconsistent results in role-taking education programs. It is quite possible for an educator to develop a program to promote a privileged information ability, but assess that program with, for instance, Feffer's measure which is more concerned with cognitive and affective role-taking.
Also, the tasks chosen as the dependent measure may be so internally inconsistent or temporally unstable as to blur actual change. A careful choice of measure by potential educators could alleviate some of these problems.

In conclusion, the development of any science usually proceeds from vague, intuitive ideas, to imperfect measurement, to standardization. It seems that role-taking is at the point where the field must move toward greater refinement of measurement. Given the philosophical tradition of this area, however, some may claim that in doing so we are sacrificing a richness of theory for precision. This does not have to be the case. The constructs could remain as they are with a few alterations in the untestable assumptions. Measurement need not impair a detailed construct, but rather requires that the construct is indeed testable. If it is realized that the role-taking instruments are subject to the same measurement laws as are intelligence and personality tests then the transition from imperfect to more precise measurement may be easier. At the same time, if it is realized that psychometrics is blind to the notion of individual differences, stage-related properties, or any other theoretical approach, progress may be realized in linking measurement methodology and social cognitive stage theory. Social education programs could only benefit from such precision.

Reference Notes

1. Elardo, P. Project AWARE. Paper presented at the fourth annual H. Blumberg Symposium, Chapel Hill, North Carolina, 1974.
2. Wentink, W., Smits-van Sonsbeck, B., Leckie, G., & Smits, P. The effect of a social perspective-taking training on role-taking ability and social interaction in preschool and elementary school children. Paper presented at the meeting of the International Society for the Study of Behavioral Development, Guilford, Great Britain, March 1975.
3. Kuhn, D. The development of role-taking ability. Unpublished manuscript, Columbia University, 1972.
4. Kurdek, L. Perceptual, cognitive, and affective perspective-taking and empathy in kindergarten through third-grade children. Paper presented at the meeting of the Society for Research in Child Development, Denver, April 1975.
5. Olejnik, A. Developmental changes and interrelationships among role-taking, moral judgments and children's sharing. Paper presented at the meeting of the Society for Research in Child Development, Denver, April 1975.
6. Mossler, D., Greenberg, M., & Marvin, R. The early development of conceptual perspective taking. Paper presented at the meeting of the Society for Research in Child Development, Denver, April 1975.
7. O'Connor, M. Decentration revisited: A two-factor model for role-taking development in young children. Paper presented at the meeting of the Society for Research in Child Development, Denver, April 1975.
8. Selman, R., & Byrne, D. Manual for scoring social role-taking in social dilemmas. Unpublished manuscript, Harvard University, 1973. (Available from [R. L. Selman, Harvard Graduate School of Education, Larsen Hall, Appian Way, Cambridge, MA 02138]).
9. Gordon, A., Damon, W., & Selman, R. Social perspective taking and the development of the concept of moral intentionality. Unpublished manuscript, Harvard University, 1974.
10. Feshbach, N., & Kuchenbecker, S. A three component model of empathy. Paper presented at the meeting of the American Psychological Association, New Orleans, September 1974.
11. Shantz, C. Empathy in relation to social cognitive development. Paper presented at the meeting of the American Psychological Association, New Orleans, September 1974.
12. Watson, M. A developmental study of empathy: Egocentrism to sociocentrism or simple to complex reasoning? Paper presented at the meeting of the Society for Research in Child Development, Denver, April 1975.
13. Schnall, M., & Feffer, M. Manual for the scoring of the role-taking task. Unpublished manuscript (No. 9010), n.d.
(Available from [ADI Auxiliary Publications Project, Photoduplication Service, Library of Congress, Washington, D.C.]).
14. Marsh, D., & Serafica, F. Perspective taking and moral judgment: A developmental analysis. Paper presented at the meeting of the Society for Research in Child Development, New Orleans, April 1977.

References

Ambron, S., & Irwin, D. Role-taking and moral judgment in five- and seven-year-olds. Developmental Psychology, 1975, 11, 102.
Borke, H. Interpersonal perception of young children: Egocentrism or empathy? Developmental Psychology, 1971, 5, 263-269.
Borke, H. Chandler and Greenspan's "Ersatz Egocentrism": A rejoinder. Developmental Psychology, 1972, 7, 107-109.
Borke, H. The development of empathy in Chinese and American children between three and six years of age: A cross-cultural study. Developmental Psychology, 1973, 9, 102-108.
Burns, M., & Cavey, L. Age differences in empathic ability among children. Canadian Journal of Psychology, 1957, 11, 227-230.
Byrne, D. The development of role-taking in adolescence. Unpublished doctoral dissertation, Harvard University, 1974.
Campbell, D., & Fiske, D. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 1959, 56, 81-105.
Chandler, M. Egocentrism in normal and pathological childhood development. In W. Hartup & J. DeWitt (Eds.), Determinants of behavioral development. New York: Academic Press, 1972.
Chandler, M. Egocentrism and anti-social behavior: The assessment and training of social perspective taking skills. Developmental Psychology, 1973, 9, 326-332.
Chandler, M. Social cognition: A selective review of current research. In W. Overton & J. Gallagher (Eds.), Knowledge and development, Vol. 1, Advances in research and theory. New York: Plenum Press, 1977.
Chandler, M., & Greenspan, S. Ersatz egocentrism: A reply to H. Borke. Developmental Psychology, 1972, 7, 104-106.
Chandler, M., Greenspan, S., & Barenboim, C.
Assessment and training of role-taking and referential communication skills in institutionalized emotionally disturbed children. Developmental Psychology, 1974, 10, 546-553.
Cooney, E. Social cognitive development: Applications to intervention and evaluation in the elementary grades. The Counseling Psychologist, 1977, 6, 6-9.
DeVries, R. The development of role-taking as reflected by the behavior of bright, average, and retarded children in a social guessing game. Child Development, 1970, 41, 759-770.
Elkind, D. Piagetian and psychometric conceptions of intelligence. Harvard Educational Review, 1969, 39, 319-337.
Enright, R., Franklin, C., & Manheim, L. Children's distributive justice reasoning: A standardized and objective scale. Developmental Psychology, 1980, 16, 193-202.
Enright, R., & Sutterfield, S. An ecological validation of social cognitive development. Child Development, 1980, 51, 156-161.
Feffer, M. The cognitive implications of role taking behavior. Journal of Personality, 1959, 27, 152-168.
Feffer, M., & Gourevitch, V. Cognitive aspects of role-taking in children. Journal of Personality, 1960, 28, 383-396.
Fehr, L. Methodological inconsistencies in the measurement of spatial perspective-taking ability: A cause for concern. Human Development, 1978, 21, 302-315.
Feshbach, N., & Roe, K. Empathy in six and seven year olds. Child Development, 1968, 39, 133-145.
Flapan, D. Children's understanding of social interaction. New York: Columbia University Press, 1968.
Flavell, J., Botkin, P., Fry, C., Wright, J., & Jarvis, P. The development of role-taking and communication skills in children. New York: Wiley, 1968.
Ford, M. The construct validity of egocentrism. Psychological Bulletin, 1979, 86, 1,169-1,188.
Furth, H. Piaget and knowledge. Englewood Cliffs, N.J.: Prentice-Hall, 1969.
Gottman, J., Gonzo, J., & Rasmussen, B. Social interaction, social competence, and friendship in children. Child Development, 1975, 46, 706-718.
Hill, J., & Palmquist, W. Social cognition and social relations in early adolescence. International Journal of Behavioral Development, 1978, 1, 1-36.
Hoffman, M. Empathy, role-taking, guilt and development of altruistic motives. In T. Lickona (Ed.), Moral development and behavior: Theory, research and social issues. New York: Holt, Rinehart, & Winston, 1976.
Hoffman, M., & Levine, L. Early sex differences in empathy. Developmental Psychology, 1976, 12, 557-558.
Hollos, M. Logical operations and role-taking abilities in two cultures, Norway and Hungary. Child Development, 1975, 46, 638-649.
Hollos, M., & Cowan, P. Social isolation and cognitive development: Logical operations and role-taking abilities in three Norwegian social settings. Child Development, 1973, 44, 630-641.
Hudson, L. On the coherence of role-taking abilities: An alternative to correlation analysis. Child Development, 1978, 49, 223-227.
Iannotti, R. The effects of role-taking experiences on role-taking, altruism, empathy, and aggression. Developmental Psychology, 1978, 14, 119-124.
Johnson, D. Affective perspective taking and cooperative predisposition. Developmental Psychology, 1975, 11, 869-870.
Keller, M. Development of role-taking ability: Social antecedents and consequences for school success. Human Development, 1976, 19, 120-132.
Kurdek, L. Convergent validation of perspective taking: A one year follow-up. Developmental Psychology, 1977, 13, 172-173. (a)
Kurdek, L. Structural components and intellectual correlates of cognitive perspective taking in first- through fourth-grade children. Child Development, 1977, 48, 1,503-1,511. (b)
Kurdek, L. Perspective taking as the cognitive basis of children's moral development: A review of the literature. Merrill-Palmer Quarterly, 1978, 24, 3-28.
Kurdek, L., & Rodgon, M. Perceptual, cognitive, and affective perspective taking in kindergarten through sixth-grade children. Developmental Psychology, 1975, 11, 643-650.
Leahy, R., & Huard, C.
Role taking and self-image disparity in children. Developmental Psychology, 1976, 12, 504-508.
Looft, W. Egocentrism and social interaction across the life span. Psychological Bulletin, 1972, 78, 93-102.
Marvin, R., Greenberg, M., & Mossler, D. The early development of conceptual perspective taking: Distinguishing among multiple perspectives. Child Development, 1976, 47, 511-514.
Mead, G. H. Mind, self, and society. Chicago: The University of Chicago Press, 1934.
Miller, P., Kessel, F., & Flavell, J. Thinking about people thinking about people thinking about ... A study of social cognitive development. Child Development, 1970, 41, 613-623.
Moir, D. Egocentrism and the emergence of conventional morality in preadolescent girls. Child Development, 1974, 45, 299-304.
Mossler, D., Marvin, R., & Greenberg, M. Conceptual perspective taking in 2-to-6-year-old children. Developmental Psychology, 1976, 12, 85-86.
Nunnally, J. Psychometric theory. New York: McGraw-Hill, 1967.
Piaget, J. The language and thought of the child. New York: Harcourt, Brace, & World, 1926.
Piche, G., Michlin, M., Rubin, D., & Johnson, F. Relationships between fourth graders' performances on selected role-taking tasks and referential communication accuracy. Child Development, 1975, 46, 965-969.
Rest, J. The hierarchical nature of moral judgment: A study of patterns of comprehension and preferences of moral stages. Journal of Personality, 1973, 41, 86-109.
Rotenberg, M. Conceptual and methodological notes on affective and cognitive role taking (sympathy and empathy): An illustrative experiment with delinquent and nondelinquent boys. Journal of Genetic Psychology, 1974, 125, 177-185.
Rothenberg, B. Children's social sensitivity and the relationship to interpersonal competence, intrapersonal comfort, and intellectual level. Developmental Psychology, 1970, 2, 335-350.
Rubin, K. Egocentrism in childhood: A unitary construct? Child Development, 1973, 44, 102-110.
Rubin, K.
Role-taking in childhood: Some methodological considerations. Child Development, 1978, 49, 428-433.
Selman, R. The relation of role-taking to the development of moral judgment in children. Child Development, 1971, 42, 79-91. (a)
Selman, R. Taking another's perspective: Role-taking development in early childhood. Child Development, 1971, 42, 1,721-1,734. (b)
Selman, R. Toward a structural analysis of developing interpersonal relations concepts: Research with normal and disturbed pre-adolescent boys. In A. Pick (Ed.), Minnesota symposium on child psychology (Vol. 10). Minneapolis: University of Minnesota Press, 1976.
Selman, R., & Byrne, D. A structural-developmental analysis of levels of role-taking in middle childhood. Child Development, 1974, 45, 803-806.
Selman, R., & Damon, W. The necessity (but insufficiency) of social perspective taking for conceptions of justice at three early levels. In D. DePalma & J. Foley (Eds.), Contemporary issues in moral development. Potomac, Md: Lawrence Erlbaum, 1976.
Selman, R., & Lieberman, M. Moral education in the primary grades: An evaluation of a developmental curriculum. Journal of Educational Psychology, 1975, 67, 712-716.
Shantz, C. The development of social cognition. In E. M. Hetherington (Ed.), Review of child development research, Vol. 5. Chicago: University of Chicago Press, 1975.
Stanley, J. Reliability. In R. Thorndike & E. Hagen (Eds.), Measurement and evaluation in psychology and education. New York: Wiley, 1969.
Tinsley, H., & Weiss, D. Interrater reliability and agreement of subjective judgments. Journal of Counseling Psychology, 1975, 22, 358-376.
Turnure, C. Cognitive development and role-taking ability in boys and girls from 7 to 12. Developmental Psychology, 1975, 11, 202-209.
Urberg, K., & Docherty, E. Development of role-taking skills in young children. Developmental Psychology, 1976, 12, 198-203.
Walker, L., & Richards, B.
Stimulating transitions in moral reasoning as a function of stage of cognitive development. Developmental Psychology, 1979, 15, 95-103.
West, H. Early peer-group interaction and role-taking skills: An investigation of Israeli children. Child Development, 1974, 45, 1,118-1,121.
White, R. Motivation reconsidered: The concept of competence. Psychological Review, 1959, 66, 297-333.
Wolfe, R. The role of conceptual systems in cognitive functioning at varying levels of age and intelligence. Journal of Personality, 1963, 31, 108-123.
Youniss, J. Another perspective on social cognition. In A. Pick (Ed.), Minnesota symposium on child psychology, Vol. 9. Minneapolis: University of Minnesota Press, 1975.
Zahn-Waxler, C., Radke-Yarrow, M., & Brady-Smith, J. Perspective-taking and pro-social behavior. Developmental Psychology, 1977, 13, 87-88.

AUTHORS

ROBERT D. ENRIGHT, Assistant Professor, Department of Educational Psychology, University of Wisconsin, Madison, WI 53706. Specialization: Social cognitive development.
DANIEL K. LAPSLEY, Department of Educational Psychology, University of Wisconsin, Madison, WI 53706. Specialization: Social cognitive development.