Theory of Mind and Executive Function Impairments in Autism Spectrum Disorders and their Broader Phenotype: Profile, Primacy, and Independence

Dana Wong, B.Sc. (Hons.)

This thesis is presented in partial fulfilment of the degree of Doctor of Philosophy/Master of Psychology (Clinical Neuropsychology) of the University of Western Australia

School of Psychology, 2004

ABSTRACT

Impairments in both theory of mind (ToM; the ability to attribute mental states to oneself and others) and executive function (EF; a group of high-level cognitive functions which help guide and control goal-directed behaviour) have been demonstrated in individuals with autism spectrum disorders (ASDs). Both deficits have been proposed by different groups of researchers as being the single primary cognitive deficit of autism, which can subsume the other deficit as secondary or artefactual. However, few studies have examined the nature of the relationship between ToM and EF in ASDs or conducted a systematic investigation of their relative primacy. This research principally sought to establish the primacy and independence of impairments in ToM and EF in ASDs and thereby evaluate the validity of single versus multiple primary deficit models of autism. These aims were addressed in two studies, both broad in scope.

The first study was an investigation of the profile, primacy, and independence of ToM and EF impairments in individuals with ASDs. The sample included 46 participants with ASDs and 48 control participants matched on age and non-verbal ability. The profile of impairments was examined by measuring ToM and a range of EF components using tasks employing, wherever possible, process-pure indices of performance.
Primacy was measured by focussing on i) whether or not the deficits observed were universal among individuals with ASDs; ii) whether the deficits were able to discriminate individuals with ASDs from matched controls (i.e., predict group membership); and iii) the ability of ToM and EF deficits to explain the full range of autistic symptomatology, as measured by correlating cognitive performances with behavioural indices. The relationship between ToM and EF impairments was investigated by conducting correlations between ToM and EF variables as well as analysing the incidence of dissociations between impairments in the two domains. The ASD group was found to demonstrate significant impairments in ToM and several components of EF including planning, verbal inhibition, working memory (in a context where inhibitory control was required), and both verbal and non-verbal generativity. However, neither ToM nor EF impairments were able to meet all of the criteria for a primary deficit in ASDs. EF deficits were found to be more primary, but could not account for ToM as a secondary deficit, as ToM and EF were found to be independent (i.e., uncorrelated and dissociable) deficits in the ASD group. This pattern of results suggested that a multiple deficits model involving at least two independent impairments appeared to best characterise ASDs, but the data were compatible with several variants of such a model (e.g., involving distinct subtypes versus a multidimensional spectrum).

The second study was an investigation of ToM and EF impairments in siblings of individuals with ASDs, who have previously been found to demonstrate a subclinical “broad autism phenotype”.
The main aims of this study were i) to identify whether ToM or EF deficits could meet criteria for an “endophenotype” or vulnerability marker for the autism genotype in unaffected relatives, which would have further implications about the primacy of ToM and EF in ASDs; and ii) to further investigate the validity of various multiple deficits models of ASDs by examining the pattern of ToM and EF performance in those showing the broad phenotype. Participants were 108 siblings of individuals with ASDs and 67 siblings of controls, tested on the same ToM and EF tasks used in the first study. Confirming the superior primacy of EF deficits found in Study One, there was no significant difference in ToM performance between ASD and control siblings, but ASD siblings showed weaknesses on two measures of EF. Furthermore, there appeared to be different subgroups of siblings demonstrating different cognitive profiles, consistent with the heterogeneity evident in the first study.

This research indicated that ASDs cannot be explained by a single primary cognitive deficit. These findings hold important theoretical and empirical implications and highlight further questions about which type of multiple deficits model might best explain ASDs.

TABLE OF CONTENTS

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS

CHAPTER 1. General Introduction: Explaining Autism
1.1 Autism: Diagnosis and epidemiology
1.2 Explaining autism: The cognitive level of explanation
1.3 Overview of the thesis
1.3.1 Rationale and aims
1.3.2 Thesis structure

CHAPTER 2. Literature Review: Theory of Mind and Executive Function in Typical Development and in Autism
2.1 Theory of mind (ToM)
2.1.1 Defining and measuring ToM
2.1.2 Models of ToM and its development
2.1.3 ToM in autism
2.2 Executive function (EF)
2.2.1 Defining and measuring EF
2.2.2 Models of EF and its development
2.2.3 EF in autism
2.3 The ToM-EF relationship
2.3.1 Models of the ToM-EF relationship
2.3.1.1 Expression accounts
2.3.1.2 Common conceptual requirements of ToM and EF
2.3.1.3 Emergence accounts
2.3.1.4 Common neuroanatomical bases for ToM and EF
2.3.2 The ToM-EF relationship in autism

CHAPTER 3. Selection and Description of Measures
3.1 Diagnostic measures
3.1.1 Autism Screening Questionnaire
3.1.2 Autism Diagnostic Interview – Revised
3.2 IQ measures
3.3 ToM measures
3.3.1 Simple false belief task
3.3.2 First-order false belief task
3.3.3 Second-order false belief task
3.3.4 Dewey stories
3.4 EF measures
3.4.1 Tower of London
3.4.2 Intra-dimensional, Extra-dimensional Set-shifting task
3.4.3 Response Inhibition and Load task
3.4.4 Opposite Worlds
3.4.5 Relational Complexity
3.4.6 Pattern Meanings
3.4.7 Uses of Objects
3.4.8 Stamps task
3.5 Behavioural measures
3.5.1 Measures of repetitive behaviour
3.5.1.1 Repetitive Behaviours Questionnaire
3.5.1.2 Repetitive Behaviours Interview
3.5.2 Measures of social behaviour and communication
3.5.2.1 Social Behaviour Questionnaire
3.5.2.2 Social and communication ADI-R domains

CHAPTER 4. Study One: Profile, Primacy, and Independence of Theory of Mind and Executive Function Impairments in Autism Spectrum Disorders
4.1 Introduction
4.1.1 Aims
4.1.2 Hypotheses
4.2 Method
4.2.1 Participants
4.2.2 Procedure
4.3 Results
4.3.1 Data screening
4.3.2 Group comparisons on ToM and EF tasks
4.3.2.1 False belief tasks
4.3.2.2 Dewey Stories
4.3.2.3 Tower of London
4.3.2.4 IDED set-shifting task
4.3.2.5 Response Inhibition and Load task
4.3.2.6 Opposite Worlds task
4.3.2.7 Relational Complexity
4.3.2.8 Pattern Meanings
4.3.2.9 Uses of Objects
4.3.2.10 Stamps task
4.3.2.11 Summary and effect sizes of group comparisons
4.3.3 Universality of ToM and EF deficits
4.3.4 Ability of ToM and EF variables to predict group membership
4.3.5 Behavioural measures: Group comparisons and derivation of indices used in correlational analyses
4.3.5.1 Repetitive Behaviours Interview
4.3.5.2 Social and communicative functioning
4.3.6 Correlations between ToM/EF and behavioural measures
4.3.7 Relationship between ToM and EF
4.3.7.1 Correlations between ToM and EF
4.3.7.2 Dissociations between ToM and EF
4.4 Discussion
4.4.1 Profile of ToM and EF deficits
4.4.2 Primacy of ToM and EF deficits
4.4.3 Independence of ToM and EF deficits
4.4.4 Towards a “multiple primary deficits” model of ToM and EF in ASDs

CHAPTER 5. Literature Review: The Broad Autism Phenotype
5.1 Autism as a genetic disorder
5.2 The broad phenotype
5.2.1 The behavioural phenotype
5.2.2 The cognitive phenotype
5.2.2.1 General intellectual ability
5.2.2.2 Specific cognitive deficits

CHAPTER 6. Study Two: Theory of Mind and Executive Function in Siblings of Individuals with Autism Spectrum Disorders
6.1 Introduction
6.1.1 Aims
6.1.2 Hypotheses
6.2 Method
6.2.1 Participants
6.2.2 Procedure
6.3 Results
6.3.1 Sibling group comparisons on ToM and EF tasks
6.3.1.1 False belief tasks
6.3.1.2 Tower of London
6.3.1.3 IDED Set-shifting task
6.3.1.4 Response Inhibition and Load task
6.3.1.5 Opposite Worlds task
6.3.1.6 Pattern Meanings
6.3.1.7 Uses of Objects
6.3.1.8 Stamps task
6.3.1.9 Summary of sibling group comparisons
6.3.2 Comparisons between ASD siblings and ASD probands
6.3.3 Ability of cognitive variables to predict sibling group membership
6.3.4 Proband-sibling relationships within the ASD families
6.3.4.1 Correlations between proband IQ and siblings’ cognitive performances
6.3.4.2 Correlations between probands’ and siblings’ cognitive performances
6.3.5 Prevalence of deficits in ASD siblings
6.3.6 Correlations between ToM and EF
6.3.7 Dissociations between ToM and EF
6.3.8 Results from behavioural measures
6.4 Discussion
6.4.1 Endophenotype status of ToM and EF impairments
6.4.2 Differentiating the multiple deficits models

CHAPTER 7. General Discussion: Constructing an Explanatory Model for ASDs
7.1 Summary of the findings
7.2 Methodological strengths and limitations
7.3 Conclusions on constructing an explanatory model for ASDs
7.4 Future directions

REFERENCES

APPENDIX A. Repetitive Behaviours Interview – Current Version
APPENDIX B. Correlations between EF task variables in the control group (Study One)
APPENDIX C. Separate ToM-EF correlations for young and old age subgroups within the control sample (Study One)
APPENDIX D. Separate group comparisons for young and old age subgroups on EF tasks (Study One)

LIST OF TABLES

Table:
1. The five scores computed for each item on the Tower of London
2. Demographic characteristics of the samples
3. Order of test battery and age range for each test
4. False belief task results: Percentage of participants in each group with perfect scores [or high scores in the case of the alternative aggregate score] on belief questions, and significance of group comparisons
5. IDED Set-shifting task results: Percentage of low error scorers in each group for each stage of each task condition, and significance of group comparisons
6. RIL task results: Mean (and SD) of each group, and significance of group comparisons, for error and RT difference scores and the shape error score
7. Opposite Worlds results: Mean (and SD) of each group for error/time scores in each condition and difference scores, and significance of group comparisons
8. Pattern Meanings results: Mean (and SD) of each subgroup [or the percentage of low error scorers for dichotomous variables], and significance of group comparisons
9. Uses of Objects results: Mean (and SD) of each group [or the percentage of low error scorers for dichotomous variables], and significance of group comparisons
10. Stamps task results: Mean (and SD) of each group [or the percentage of low scorers for dichotomous variables], and significance of group comparisons
11. Summary and effect sizes of significant group differences
12. Universality of ToM and EF deficits in the ASD group
13. Logistic regression analysis of group membership as a function of VIQ, ToM and EF variables
14. Median (and range) of RBI severity summary scores for the ASD and control groups
15. Factor loadings of RBI severity summary scores
16. Raw and partial correlations between cognitive measures and behavioural factors within the ASD group
17. Raw and partial correlations between cognitive measures and RBI composite scores within the ASD group
18. Raw and partial correlations between ToM and EF measures within the control group
19. Raw and partial correlations between ToM and EF measures within the ASD group
20. Summary of significant partial correlations between ToM and EF variables in the control and ASD groups
21. The incidence of ToM-EF dissociations in the ASD group
22. Demographic characteristics of the sibling samples
23. False belief task results: Percentage of siblings in each group with perfect scores [or high scores for the alternative aggregate] on belief questions, and significance of group comparisons
24. IDED Set-shifting task results: Percentage of low error scorers in each sibling group for each stage of each task condition, and significance of group comparisons
25. RIL task results: Mean (and SD) of each sibling group, and significance of group comparisons, for error and RT difference scores and the shape error score
26. Opposite Worlds results: Mean (and SD) and significance of group comparisons for each sibling group for error/time scores in each condition and difference scores, and for each gender for time scores
27. Uses of Objects results: Mean (and SD) of each sibling group, and significance of group comparisons
28. Stamps task results: Mean (and SD) of each sibling group [or the percentage of low scorers for dichotomous variables], and significance of group comparisons
29. Effect sizes, r (and d), of significant group differences between sibling groups and between proband groups
30. Results of logistic regression analysis of sibling group membership
31. Raw and partial correlations between proband PIQ and VIQ and siblings’ scores on ToM and EF measures
32. Raw and partial correlations between ToM and EF variables within control siblings
33. Raw and partial correlations between ToM and EF variables within ASD siblings
34. Summary of partial correlations between ToM and EF variables in the control and ASD probands and siblings
35. The incidence of ToM-EF dissociations in the ASD siblings

LIST OF FIGURES

Figure:
1. A single primary cognitive deficit model of autism
2. A multiple cognitive deficits model of autism, in which each cognitive deficit underlies a different domain of symptomatology
3. An example of a Dewey Story
4. The starting configuration for the Tower of London stimuli
5. Stimuli for the Perseveration condition of the IDED set-shifting task
6. Stimuli for the Learned Irrelevance condition of the IDED set-shifting task
7. Example of a Relational Complexity item with 1 relational change
8. Example of a Relational Complexity item with 4 relational changes
9. Example of a more difficult Relational Complexity item without consistent relational changes
10. One of the five test stimuli for the Pattern Meanings task
11. The practice stimulus for the Pattern Meanings task

ACKNOWLEDGEMENTS

First and foremost credit clearly goes to my principal supervisor Murray Maybery, who is a rare treasure in putting his students’ needs before his own. He is unfailingly patient, encouraging, logical, and sensible. Thanks also to my co-supervisor Joachim Hallmayer, whose expertise in autism and genetics and constructive feedback on a draft improved the clarity, accuracy, and coherence of the thesis.

This PhD research formed part of a larger project on the broad autism phenotype, the Western Australia Family Study of Autistic Spectrum Disorders (WAFSASD), which was funded by a National Health and Medical Research Council grant. Alana Maley, research assistant extraordinaire on the WAFSASD, put in countless hours of recruiting families, driving to opposite ends of the city and state, and interviewing and testing a seemingly endless number of participants. My hugest appreciation for all that you contributed. Dorothy Bishop, one of the WAFSASD’s chief investigators, offered expert guidance throughout the project. Wayne Hill put together a monstrous database as well as doing a number of the ADI-Rs. Sarah Davenport, Isabel Fernandez, Kate Fitzpatrick, Elise Mengler, Sarra Miller, Bronny Morgan, Nicole Petterson and Keira Thomson all helped with testing and/or data entry for the WAFSASD. Valued assistance was also provided by Matt Huitson, whose task programming skills saved me a lot of time and frustration, and Herb Jurkiewicz, who helped with some of the stimuli.

Liz Pellicano shared with me the questions, ideas, and bafflement that go along with doing autism research, and in doing so managed to help rekindle my enthusiasm for not only my own research but also research in general, right when it was needed.
My officemates, co-whingers, and distractors Kate Harwood and Mark Woodman served proficiently as my credibility meters (as well as keeping me up to date on world affairs). Opinions, grievances, ridicule, coffee, and gossip were also shared with Kate Frencham, Keira Thomson, Flavie Waters, and Allyson Browne. I was kept fed and financed by my generous family, particularly in the later stages after my scholarship had run out. My gorgeous Glen saw me through to the finishing line with a constant supply of comfort, silliness, and (bad) humour.

Finally, my humble and sincere gratitude to the participants of this research – the kids both with and without ASDs, their brothers and sisters, and mums and dads – who gave their time and effort so generously. May this thesis be a step forward in understanding the puzzle of autism.

CHAPTER 1
General Introduction: Explaining Autism

1.1 Autism: Diagnosis and epidemiology
1.2 Explaining autism: The cognitive level of explanation
1.3 Overview of the thesis
1.3.1 Rationale and aims
1.3.2 Thesis structure

1.1 Autism: Diagnosis and epidemiology

Autism is classified as a pervasive developmental disorder and is defined and diagnosed by its clinical symptomatology, rather than biological markers or aetiology. Current diagnostic criteria, as specified by the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV; APA, 1994) and the International Classification of Diseases, 10th edition (ICD-10; WHO, 1992) require the presence of symptoms in three categories: i) impairment in social interactions, ii) abnormal development of language and nonverbal communication, and iii) restricted and repetitive patterns of behaviour, interests and activities.
Examples of specific symptoms listed in DSM-IV in the social domain include impaired use of nonverbal behaviours such as eye contact, facial expressions and gestures, failure to develop appropriate relationships with peers, and a lack of spontaneous seeking to share enjoyment and interests; the communication domain lists features such as a delay in or total lack of language development, pragmatic difficulties, stereotyped use of language, and impaired pretend play and imitation; and examples of repetitive behaviours include intense preoccupations, rigid adherence to routines and rituals, and stereotyped motor mannerisms such as hand flapping. The DSM-IV criteria specify that six of the twelve symptoms listed must be present, with at least two from the social domain and one from each of the other two domains. Delayed or abnormal functioning in at least one of the three domains must also have been present prior to the age of three years. While it is possible for autism to be identified as young as 18 months (Baron-Cohen, Allen, & Gillberg, 1992; Johnson, Siddons, Frith, & Morton, 1992), it is more commonly and reliably diagnosed at around the age of three years or older.

Other pervasive developmental disorders[1] such as Asperger syndrome (individuals with autistic symptomatology who have normal intelligence and adaptive skills and no delay in the onset of speech) and Pervasive Developmental Disorder Not Otherwise Specified (PDDNOS; individuals who show significant symptomatology but who do not meet full criteria for a specific PDD) are generally considered related but distinct entities on the autism spectrum, although the boundaries and validity of each diagnosis remain a matter of current debate (Bishop, 2000; Macintosh & Dissanayake, 2004; Miller & Ozonoff, 2000; Ozonoff, South, & Miller, 2000; Rapin, 1997). One of the characteristics of autism is its variability, with symptom severity, intellectual ability, and degree of language impairment varying widely across individuals. Most studies estimate that around 70% of individuals with autism are mentally retarded – that is, have an IQ below 70 (see Fombonne, 2003). When individuals with more broadly defined autism spectrum disorders (ASDs) are included, the proportion of affected individuals with comorbid mental retardation decreases substantially; for example, Chakrabarti and Fombonne (2001) found that less than half of children with ASDs have Performance IQs less than 70.

[1] The term “pervasive developmental disorder” refers to the DSM-IV/ICD-10 category which includes autism, Asperger syndrome, Pervasive Developmental Disorder Not Otherwise Specified, Rett’s disorder, and Childhood Disintegrative disorder. Throughout this thesis, the term “autism spectrum disorder” will be used to refer to the first three of these diagnoses.

Conservative prevalence estimates for autism currently stand at 10/10,000, with estimates for Asperger syndrome at 2.5/10,000, and at 15/10,000 for PDDNOS, making a combined prevalence for all ASDs of 27.5/10,000 (Fombonne, 2003). The prevalence of ASDs has reportedly increased in recent years, with three of the latest surveys providing estimates around twice as high as the above figures (Baird et al., 2000; Bertrand et al., 2001; Chakrabarti & Fombonne, 2001). Fombonne (2003) reports that the median prevalence rate for autism in 16 surveys published between 1966 and 1991 was 4.4/10,000, whereas the median rate for 16 surveys published in the period 1992-2001 was 12.7/10,000.
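Two quantitative statements in this section, the DSM-IV symptom-count threshold and the combined prevalence figure, can be made explicit in a few lines. The following is a minimal illustrative sketch only: the function and variable names are hypothetical, the threshold check encodes nothing beyond the counting rule as summarised above, and it is in no sense a diagnostic instrument.

```python
def meets_dsm_iv_threshold(social: int, communication: int, repetitive: int,
                           onset_before_age_three: bool) -> bool:
    """Check the DSM-IV counting rule as summarised in the text.

    Each count is the number of listed symptoms present in that domain
    (hypothetical interface; clinical judgement is not modelled here).
    """
    total = social + communication + repetitive
    return (
        total >= 6                  # six of the twelve listed symptoms
        and social >= 2             # at least two from the social domain
        and communication >= 1      # at least one from communication
        and repetitive >= 1         # at least one from repetitive behaviour
        and onset_before_age_three  # delay or abnormality before age three
    )


# Conservative prevalence estimates per 10,000, as cited from Fombonne (2003).
PREVALENCE_PER_10K = {"autism": 10.0, "Asperger syndrome": 2.5, "PDDNOS": 15.0}


def combined_prevalence(rates: dict) -> float:
    """Sum the per-diagnosis rates to obtain the combined ASD rate."""
    return sum(rates.values())


if __name__ == "__main__":
    print(meets_dsm_iv_threshold(2, 2, 2, True))    # six symptoms, minima met
    print(meets_dsm_iv_threshold(1, 3, 2, True))    # only one social symptom
    print(combined_prevalence(PREVALENCE_PER_10K))  # combined rate per 10,000
    print(round(12.7 / 4.4, 1))                     # ratio of the two median rates
```

The last line simply divides the 1992-2001 median by the 1966-1991 median reported above, which makes the size of the apparent increase concrete without saying anything about its causes.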
While this apparent increase has led some to propose various environmental aetiologies for autism, other possible contributing factors include changes in diagnostic practice, increased awareness, “diagnostic substitution” (e.g., choosing a diagnosis of autism instead of mental retardation for the purposes of educational placement or funding), earlier diagnosis, and methodological issues (see Volkmar, Lord, Bailey, Schultz, & Klin, 2004).

Autism is more common in boys than in girls, with a mean sex ratio of 4.3:1 across epidemiological studies; the ratio is higher for non-retarded individuals with autism, with a median of 5.75:1 across studies (Fombonne, 2003). High socioeconomic status and immigrant status have been associated with higher rates of autism in some small samples, but larger, well-designed studies have not supported these associations (Fombonne, 2003). A number of comorbid medical conditions have also been commonly associated with autism, with the most prevalent being epilepsy (Fombonne’s review estimates that 16.8% of individuals with autism also have epilepsy, but this may be an underestimate given that the median age of the samples is lower than the usual age of onset of seizures in autism). Proposed associations with other conditions such as Fragile X, tuberose sclerosis, neurofibromatosis, and phenylketonuria (PKU) are less well established as many studies do not provide evidence that the prevalence is higher than predicted by chance (Fombonne, 2003; Volkmar et al., 2004).

1.2 Explaining autism: The cognitive level of explanation

The construction of a causal model of autism (and ASDs more broadly) has proven an extremely complex and challenging task at all levels of explanation: genetics, neurobiology, cognition, and behaviour[2]. While we are now confident that autism has a genetic basis (see Chapter 5), the genetic mechanisms and specific genes involved are still not understood and non-genetic factors have also been implicated.
Attempts to identify key neuroanatomical and neurobiological abnormalities have resulted in a variable array of inconsistent findings, with almost all areas of the brain proposed as being abnormal in autism at one time or another. At the level of behaviour, it remains unclear whether autism is best conceived of as a unitary syndrome, a set of related but distinct subtypes, or a continuum or spectrum of abnormalities (Boucher, 1996; this is discussed further below). Paralleling this search for convergence at the genetic, neurobiological, and behavioural levels of explanation has been the pursuit of a core marker or single primary deficit at the level of cognition. In the absence of a unique biological marker for autism, the identification of a primary cognitive deficit could help to both define the boundaries of the disorder and highlight possible neurobiological substrates. The notion of a single primary cognitive deficit is attractive because it is parsimonious and provides unity – that is, it is a way of explaining the regular co-occurrence of the triad of impairments which characterise autism and it justifies the use of a single label, “autism”. For these reasons, Morton and Frith (1995, 2001; see also Frith, Morton, & Leslie, 1991) have argued strongly that autism may be explained by a single primary cognitive deficit which underlies the whole range of autistic symptomatology. The basic structure of this kind of model is presented in Figure 1. The notion of a primary or core deficit has been crucial in guiding and constraining cognitive theories of autism. Michael Rutter was one of the first autism researchers to promote the idea of a primary deficit, with his treatment of the term implying that he considered universal manifestation, early appearance, prognostic significance, and ability to account for performance on a range of tasks to be important signs of primacy (Rutter, 1968). 
More recently, a primary cognitive deficit has been defined as “universal, specific, and necessary and sufficient to cause the symptoms of the disorder...in other words,...the proximal cognitive cause of the behavioural symptoms of the disorder” (Pennington & Ozonoff, 1996, p. 57). These three criteria of universality, uniqueness to autism, and ability to explain the behavioural symptoms of autism have consistently recurred in recent definitions of primacy (e.g., Hughes, 2001; Ozonoff & McEvoy, 1994; Turner, 1997). An additional criterion commonly used to assess primacy is that of causal precedence, or the ability of the proposed deficit to predate and explain the earliest symptoms of autism (Boucher, 1996; Happé, 1994b; Pennington & Ozonoff, 1991; Pennington & Welsh, 1995; Tager-Flusberg, 2001). Four key criteria³ for judging the primacy of a cognitive deficit in autism may therefore be identified as:

1. Its universality among individuals with autism;
2. Its uniqueness to individuals with autism;
3. Its causal precedence, or ability to account for the earliest symptoms of autism; and
4. Its explanatory value, or ability to explain the full range of autistic symptomatology.

[Figure 1 diagram; labels: non-genetic factors, genetic liability, brain abnormalities (two boxes), cognitive deficit (one box), behavioural symptom (three boxes).]

Figure 1. A single primary cognitive deficit model of autism.

² This distinction of four broad levels of explanation follows Pennington and Welsh (1995), among others, and should be considered provisional. Other divisions are possible; for example, Morton and Frith (1995) collapse genetic and neurobiological factors under one heading, “biological”. Finer divisions are also possible, for example within the level of neurobiology.

³ These four criteria will be used to evaluate primacy throughout this thesis, although the list is not claimed to be comprehensive or definitive. Other features frequently cited as signifying a primary deficit include persistence or stability throughout development (e.g., Ozonoff & McEvoy, 1994; Pennington & Welsh, 1995; Rutter, 1983) and existence in the broad phenotype of autism (Bailey, Phillips, & Rutter, 1996; Hughes, 2001; see Chapter 5).

It could be argued that these criteria for primacy are too stringent, due to the phenotypic variability which exists in any syndrome (Tager-Flusberg, 1999a) and the possibility of subgroups within the autism spectrum. However, any single primary cognitive deficit model of autism should theoretically be able to meet the criterion of universality and be able to account for the range of symptoms displayed by individuals with autism (multiple deficits models are discussed further below).

Over the years, cognitive theories of autism have adopted many different forms. The pioneering work of Hermelin and O’Connor (1970) demonstrated that neither general mental retardation nor peripheral (i.e., sensory or motor) processing could explain the specific pattern of impairments displayed by individuals with autism, instead finding evidence of abnormal “central” processes such as sequencing, concept formation, and abstraction. Around the same time, Rutter (1968) proposed that language or “coding” deficits were primary to autism. Subsequent hypotheses regarding the nature of the primary impairment in autism have included aberrant sensory processing (Ornitz, 1969, 1988), deficits in arousal modulation and attention (Dawson, 1991; Dawson & Lewy, 1989; Hutt & Hutt, 1968), impaired complex information processing (Minshew, Goldstein, Muenz, & Payton, 1992; Minshew, Johnson, & Luna, 2001), lack of socio-affective or interpersonal relatedness (Hobson, 1989, 1993), and abnormal social responsiveness or orienting to social information (Klin & Volkmar, 1993; Mundy & Neal, 2001; Mundy & Sigman, 1989).
However, difficulties meeting the various criteria for primacy (particularly the criterion of explanatory value for the full range of symptoms) have meant that none of these theories has established itself as a widely accepted candidate for a single primary deficit. Current research is dominated by three main theories of the primary cognitive deficit in autism: i) lack of theory of mind (inability to attribute mental states to oneself and others), ii) executive dysfunction (impairment in high-level cognitive functions which guide and control behaviour toward attainment of a goal), and iii) weak central coherence (tendency for piecemeal or local information processing). Significant impairments in these areas have been established in numerous studies of individuals with ASDs⁴. Proponents of these theories, in particular the first two, have strongly asserted that the impairment in question is the single primary cognitive deficit in autism. Additional impairments in other domains are usually accounted for as secondary, correlated, or artefactual consequences of the single primary deficit.

⁴ Studies of theory of mind and executive function in ASDs are reviewed extensively in Chapter 2. Central coherence studies are briefly discussed in Section 4.4.4 of Chapter 4.

The idea of a single primary deficit has been subjected to increasing criticism, however. Goodman (1989) is often cited as an advocate of the multiple primary deficits approach, arguing that genetic and environmental insults may act upon several distinct neural systems which share in common a vulnerability to those insults. These multiple neurological abnormalities then create simultaneous impairments in several cognitive domains, and “synergistic interactions” between these impairments result in a distinct syndrome. In this model, the shared vulnerability of several neural systems (e.g., through shared blood supply or neurotransmitters) is the unifying factor in creating a unitary syndrome.
In a similar vein, Pennington et al. (1997) proposed that the unifying explanation may occur at the level of neurochemistry (e.g., a dopaminergic deficit), which would result in multiple cognitive impairments that were not necessarily connected at a cognitive level. Others in favour of multiple primary cognitive deficits have argued against the notion that autism is a unitary syndrome which requires a single unifying level of explanation. As mentioned earlier, at least two alternative conceptions are possible, which also incorporate ASDs besides autism. One is the notion of related but distinct subgroups, or a “categorical” system of subtyping. Categorical systems are “intended to divide populations into subgroups that share a common aetiology, symptom presentation, and course that is distinct from those of other subgroups” (Beglinger & Smith, 2001, p. 412). Subgroup divisions in ASDs could be defined in a number of different ways, such as according to PDD subtype (i.e., autism, Asperger syndrome, PDDNOS), the domains in which symptoms are present, symptom severity, or level of (intellectual/adaptive) functioning, with the last of these variables appearing to hold the best discriminative and predictive validity in studies employing cluster analysis (Fein et al., 1999; Prior et al., 1998; Stevens et al., 2000). If ASDs are conceptualised as a group of distinct subtypes, then a single primary deficit model would not be plausible, and instead there would need to be as many primary deficits as there were subgroups (unless more extreme or severe subgroups were characterised by a larger number of primary deficits and other milder subgroups were characterised by fewer primary deficits). Therefore, across ASDs as a whole, primary deficits would not meet the criteria of universality or explanatory value (although they should meet these criteria within the relevant subgroup).
The other major alternative model of ASDs is that of a multidimensional spectrum, where dimensions such as symptom severity or level of functioning are conceptualised as a continuum ranging from “normal” to severe or extreme, rather than forming discrete subgroups. The idea of autism as a unitary syndrome is also compatible with the notion of a spectrum, but in that case it would be unidimensional in nature. In a multiple primary deficits model, there would be more than one cognitive deficit, each underlying a different dimension. Again, the various dimensions could be defined in different ways; for example, each symptom domain could be a dimension, or there could be one dimension for symptom number and severity, and another for level of functioning (Szatmari et al., 2002). In the version where the dimensions are symptom domains, there would need to be as many primary deficits as there were symptom domains⁵ (thus, a minimum of three independent cognitive deficits of varying severity would need to underlie the triad of impairments in autism, whereas individuals with PDDs showing symptoms in only two domains would show two primary deficits). Therefore, the criteria of universality and explanatory value across all individuals with ASDs would not be met by primary deficits in this model either (although these criteria should be met by anyone displaying the relevant symptom, with differing degrees of impairment according to the severity of the symptomatology). Figure 2 presents an example of a multiple primary cognitive deficit model of autism based on the concept of autism as a continuum with three dimensions, with each dimension corresponding to a symptom domain.

[Figure 2 diagram; labels: non-genetic factors, genetic origins, brain abnormalities (two boxes), cognitive deficit (three boxes), behavioural symptom (three boxes).]

Figure 2. A multiple cognitive deficits model of autism, in which each cognitive deficit underlies a different domain of symptomatology.

⁵ This assumes that the symptom domains are dissociable, such that each symptom could potentially be displayed in isolation.

While these multiple cognitive deficits models of ASDs represent plausible alternatives to the notion of autism as a unitary syndrome with a single primary cognitive deficit, strong claims about singular primacy are still being made by proponents of the major current cognitive hypotheses. The validity of these claims rests not only on how well the proposed primary deficit can meet the four criteria for primacy, but also on whether the deficit can explain or subsume the other cognitive impairments which characterise ASDs. The construction of an integrated explanatory model of ASDs requires identification of which cognitive processes are the most primary in ASDs and how they relate both to each other and to the genetic, neurobiological, and behavioural levels of explanation⁶.

1.3 Overview of the thesis

1.3.1 Rationale and aims

The overarching aim of the current research is to contribute to an explanatory model of ASDs, primarily by investigating the structure of the cognitive level of explanation, but also by examining its relationships with other levels of explanation (mainly the behavioural, but also the genetic in an indirect sense) – and thereby to evaluate the validity of a single versus multiple primary cognitive deficit model of ASDs. More specifically, this thesis focusses on two of the major current cognitive theories of primary deficits in ASDs: lack of theory of mind and executive dysfunction. These two theories represent the most fertile ground for debate regarding the primacy of and relationship between cognitive deficits in ASDs.
This is firstly because proponents of these theories have made the strongest claims, as well as presenting the most convincing yet controversial evidence, about the deficit in question being the single primary deficit in autism (whereas those arguing for weak central coherence have tended to more often present it as one of multiple deficits); and secondly because the relationship between theory of mind (ToM) and executive function (EF) has been the subject of considerable theoretical and empirical scrutiny in typical development, but has been less well studied in ASDs (although several claims and assumptions have been made about their relatedness in ASDs). This lack of empirical attention is somewhat surprising, as any proponent of a single primary deficit model must show that the primary deficit (e.g., in ToM) causes any other deficit (e.g., in EF) demonstrated by individuals with ASDs. Moreover, most multiple deficits models would need to show that ToM and EF were independent impairments (either characterising different subgroups or underlying different dimensions of ASDs).

⁶ Of course, this assumes that the cognitive level of analysis is necessary and/or useful in explaining autism. The importance of cognition in constructing causal models for developmental disorders has been justified persuasively by Morton and Frith (2001) and Tager-Flusberg (1999a), who argue that cognition is necessary to bridge the gap between brain and behaviour in a parsimonious and theory-driven manner. Postulating areas of strength and weakness at the mediating level of cognition allows us to form sensible, coherent interpretations of apparently unrelated behavioural and biological observations.

The current research consists of two studies, both broad in scope. The first study examined the profile, primacy, and independence of ToM and EF impairments in individuals with ASDs.
This is only the second study to examine these issues together in one large investigation, with the first (Ozonoff, Pennington, & Rogers, 1991) containing several limitations which were addressed in this study (see Chapter 4). The three central aims of Study One were to determine i) the specific profile of ToM and EF deficits which characterises ASDs (as a necessary first step before further examining primacy and independence); ii) whether impairments in ToM and/or EF can meet the criteria for a primary cognitive deficit in ASDs (as assessed by its universality, uniqueness, and explanatory value), and, should no impairment meet the criteria fully, which appears to be the most primary; and iii) whether or not ToM and EF impairments are related in ASDs, and if so, what the nature of that relationship might be. Several competing hypotheses about the relative primacy of and relationship between ToM and EF were tested, with each having different implications for which type of single or multiple deficit model could best explain ASDs. These aims and hypotheses and the way in which they were addressed are elaborated in Chapter 4. The second study attempted to confirm and extend the results of Study One by investigating ToM and EF impairments in siblings of individuals with ASDs. As ASDs are genetic disorders (see Chapter 5), examining cognitive weaknesses in relatives of individuals with ASDs can be a useful method of identifying potential markers of genetic vulnerability as well as testing models of primary deficits in ASDs. 
The main aims of Study Two were i) to identify whether ToM or EF performance can meet criteria for an “endophenotype” or vulnerability marker for the autism genotype, and thereby seek confirmation of the results of Study One regarding the relative primacy of ToM and EF in ASDs; and ii) to further investigate the validity of various single/multiple deficits models of ASDs by examining the pattern of ToM and EF performance in individuals showing the broad phenotype. Again, the aims and hypotheses of this second study and its extensions to previous research are further discussed in Chapter 6.

1.3.2 Thesis structure

In Chapter 2, the constructs of ToM and EF are reviewed with reference to both typical development and autism. Each ability is defined; its methods of measurement are discussed; relevant models of its structure and typical development are presented; and evidence for its impairment in and primacy to autism is critically reviewed. Next, the various hypotheses about the nature of the relationship between ToM and EF in typical development are considered, and these hypotheses are then re-examined with regard to the relationship in autism. This critical analysis of previous research on the nature, primacy, and independence of ToM and EF in typical development and autism provides the context for the thesis and for Study One in particular. A large range of diagnostic, IQ, cognitive, and behavioural measures were used in both of the studies in the thesis. Chapter 3 is devoted to the description and rationale for selection of these measures. For each questionnaire, interview, and task, the basis for its inclusion in the research and a thorough description are both provided. This reflects a general emphasis on the use of appropriate assessment tools, particularly in the area of EF, which has suffered from a history of poor measurement precision. Chapter 4 contains the major study of the thesis. The main aims of Study One were outlined in the previous section.
The broader phenotype of autism is reviewed in Chapter 5, as a background for the second study of the thesis. This briefer review covers the genetic basis for autism and the behavioural and cognitive characteristics of first-degree relatives of individuals with autism. Chapter 6 contains Study Two, the central aims of which were also described in the previous section. In the General Discussion in Chapter 7, the results of both studies are summarised and their implications for conceptual models of ASDs are discussed. The importance of integration between the various levels of explanation is highlighted and emphasis is placed on the need to consider the process of development in constructing explanatory models of developmental disorders.

CHAPTER 2
Literature Review: Theory of Mind and Executive Function in Typical Development and in Autism

2.1 Theory of mind (ToM)
2.1.1 Defining and measuring ToM
2.1.2 Models of ToM and its development
2.1.3 ToM in autism
2.2 Executive function (EF)
2.2.1 Defining and measuring EF
2.2.2 Models of EF and its development
2.2.3 EF in autism
2.3 The ToM-EF relationship
2.3.1 Models of the ToM-EF relationship
2.3.1.1 Expression accounts
2.3.1.2 Common conceptual requirements of ToM and EF
2.3.1.3 Emergence accounts
2.3.1.4 Common neuroanatomical bases for ToM and EF
2.3.2 The ToM-EF relationship in autism

This chapter reviews previous research on the constructs of ToM and EF both in typical development and in autism, providing a context for three of the central concerns of the current research – the profile, primacy, and independence of ToM and EF impairments in ASDs. ToM is discussed in the first section, followed by EF in the second section.
Each of these sections contains i) a solid background on how the construct is defined and measured, reflecting a general emphasis on measurement precision, particularly in the area of EF; ii) a review of relevant models of the typical development of ToM/EF, in order to provide a theoretical context within which both evidence of the impairment of ToM/EF in ASDs and models of the ToM-EF relationship may be evaluated; and iii) a review of evidence for the impairment of ToM/EF in autism and the specific profile of that impairment, followed by a critical analysis of evidence for the primacy of the ToM/EF impairment to autism. The third section of the review addresses the relationship between ToM and EF, covering both i) theories of the nature of the relationship in typical development, which are outlined in detail as each makes different predictions about the ToM-EF relationship in autism; and ii) evidence for the nature of the relationship in autism, which not only has implications for the validity of theories of the relationship based on typical development, but more importantly is relevant for the question of primacy (i.e., can a primary deficit in ToM explain or subsume a secondary deficit in EF, or vice versa?). This review of methodology, theory, and evidence in the fields of ToM and EF is therefore intended as a backdrop against which findings from the current research may be appraised and interpreted.

2.1 Theory of mind (ToM)

2.1.1 Defining and measuring ToM

The term “theory of mind” refers to the ability to attribute mental states, such as desires, beliefs, and intentions, to oneself and others in order to explain and predict actions.
The phrase was first used by Premack and Woodruff (1978), who stated that:

In saying that an individual has a theory of mind, we mean that the individual imputes mental states to himself and to others...A system of inferences of this kind is properly viewed as a theory, first, because such states are not directly observable, and second, because the system can be used to make predictions, specifically about the behavior of other organisms (p. 515).

After noting flaws in the methodology used by Premack and Woodruff (1978) to examine whether or not chimpanzees have a ToM, Dennett (1978) pointed out that ToM could be demonstrated conclusively only by predicting the way another person will behave on the basis of a false belief (otherwise the actual situation, habitual or regular aspects of the other person’s behaviour, or the subject’s own true beliefs could be used to predict the person’s actions, without the need to appeal to mental states). This proposal was first employed with humans by Wimmer and Perner (1983), who tested typically developing children on what has now become a classic false belief task, sometimes called the “unexpected transfer” test. A scenario is presented in which a story character, Maxi, puts a chocolate in cupboard A before he goes out to play. While he is gone, his mother moves the chocolate to cupboard B. Maxi then returns, and participants are asked, “Where will Maxi look for the chocolate?”. Two control questions testing knowledge of the chocolate’s original and current location ensure that the child recalls the story and has followed the sequence of events.
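The logical structure of the unexpected transfer task can be sketched in a few lines of code. This is a purely illustrative model and not part of Wimmer and Perner's (1983) procedure: the `Agent` class and the function names are hypothetical, and the sketch makes no claim about how children actually reason. It separates the state of the world from a character's belief about it, with the belief updated only by events the character witnesses.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A story character whose beliefs change only through witnessed events."""
    name: str
    present: bool = True
    beliefs: dict = field(default_factory=dict)

    def observe(self, obj: str, location: str) -> None:
        # An absent character does not see the event, so the belief is unchanged.
        if self.present:
            self.beliefs[obj] = location


def run_unexpected_transfer():
    """Play out the Maxi story; return the final world state and Maxi."""
    world = {}  # the actual state of affairs
    maxi = Agent("Maxi")

    # Maxi puts the chocolate in cupboard A and sees himself do so.
    world["chocolate"] = "cupboard A"
    maxi.observe("chocolate", "cupboard A")

    # Maxi goes out to play; his mother moves the chocolate to cupboard B.
    maxi.present = False
    world["chocolate"] = "cupboard B"
    maxi.observe("chocolate", "cupboard B")  # no effect: Maxi is absent

    # Maxi returns, still holding his earlier (now false) belief.
    maxi.present = True
    return world, maxi


def belief_based_prediction(agent: Agent, obj: str) -> str:
    """Predict the search location from what the character believes."""
    return agent.beliefs[obj]


def reality_based_prediction(world: dict, obj: str) -> str:
    """Predict the search location from where the object actually is."""
    return world[obj]


world, maxi = run_unexpected_transfer()
print("Belief-based answer:", belief_based_prediction(maxi, "chocolate"))
print("Reality-based answer:", reality_based_prediction(world, "chocolate"))
```

In this sketch the two prediction functions diverge only because the character's belief is stored separately from the world state, which is the distinction the belief question is designed to probe.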
In order to answer the belief question correctly (that Maxi will look in cupboard A), the child requires an understanding that Maxi holds a false belief, which is different from the child’s own knowledge of the actual situation, and which will lead Maxi to behave in a way which contradicts the actual situation (i.e., Maxi’s behaviour is a product of what he believes to be true rather than what is really true). An accurate answer on the belief question therefore suggests that the participant appreciates the distinction between mind (the internal and mental) and world (events, situations or behaviours).

In another commonly used false belief task, variously called the “Smarties task”, the “unexpected contents” task, or the “deceptive box test” (Perner, Leekam, & Wimmer, 1987), the child is shown a box of Smarties and asked what s/he thinks is inside. After responding “Smarties”, the child is shown that the box actually contains a pencil. The pencil is then put back in the box and the box is closed. The child is asked a control question about the actual content of the box (i.e., a pencil). The child’s ability to attribute false beliefs to others is then assessed by asking what another child (or family member) would think was in the box. In some versions, the child is also asked about his/her own previous belief about the content of the box, when s/he was first shown it. Gopnik and Astington (1988) found that children who fail the false belief question also incorrectly answer that they thought the box contained pencils when they themselves first saw it, consistent with the view that the development of ToM pertains to knowledge of one’s own mind as well as the minds of others.

A variation on the Smarties task is a test of the “appearance-reality distinction”, or the “unexpected identity” task (Flavell, Flavell, & Green, 1983). In this task, the child is shown an object with a deceptive identity, such as a sponge that looks like a rock, and is asked what it looks like.
The real nature of the object is then demonstrated to the child (e.g., by squeezing the sponge), and the child is asked what it really is. The subsequent two questions follow the same structure as the Smarties task, with the child being asked what s/he thought the object was when first shown it, and what another child would think the object was. Such false belief tasks are now central to current developmental research on social cognition, serving as a marker for ToM in both typically developing and disordered populations (Wellman, Cross & Watson, 2001). However, ToM has been measured in many other ways, some of which also exploit the false belief concept, and others of which measure other types of mentalistic understanding. Baron-Cohen (2000) reviews 20 kinds of tasks which are purported to measure ToM, including tests of deception, the mental-physical distinction, recognition and expression of mental-state words, decoding mental states from the eyes, and understanding the mental functions of the brain.

2.1.2 Models of ToM and its development

The timing and mechanisms of the normal development of ToM have been studied extensively (see Astington, Harris & Olson, 1988; Carruthers & Smith, 1996; Lewis & Mitchell, 1994; Mitchell & Riggs, 2000; Perner, 1991; Wellman, 1990; Wellman et al., 2001; Whiten, 1991). The large majority of studies demonstrate a definitive improvement in performance on false belief (and other ToM) tasks between the ages of 3 and 5 years, with 3-year-olds consistently making errors suggesting that they are unable to separate belief from reality (e.g., in the unexpected transfer task they will assert that Maxi will look for the chocolate in cupboard B, its actual location). In their meta-analysis of 178 studies measuring young children’s performance on false belief tasks, Wellman et al. (2001) found that average false belief performance changes rapidly between 3 and 4.5 years from significantly incorrect (i.e., below chance) to significantly correct (above chance). Although ToM development through middle childhood to adolescence is much less well studied, evidence suggests that advances in these years include “an understanding that people’s mental states...are often consistent across situations in the form of personality traits, a greater appreciation of the mind as an active constructor and interpreter of knowledge, and a growing awareness of the presence, influence, and sources of ongoing thoughts – that is, active mental ideation” (Wellman & Lagattuta, 2000, p. 31).

Theoretical accounts of ToM development tend to focus on the crucial period between 3 and 5 years, centring on the debate as to whether young children fail false belief tasks because they lack the conceptual understanding required to respond correctly (“competence”), or because the insufficient development of other cognitive capacities (e.g., inhibitory control, linguistic comprehension) masks the access to or expression of understanding (“performance”). Explication of these models is useful because of their in-depth analysis of what is involved in successful performance on false belief tasks. This is important both for understanding what underlies the ToM impairment in autism, and for analysing the relationship between ToM and EF. An overview of the major relevant accounts of ToM development therefore follows.

Competence accounts vary in their postulated mechanisms of ToM development, but they share in common the idea that ToM matures continuously in a series of successive stages of discovery, each of which is developmentally related to the next (as opposed to an innate, modular ability which comes on-line early).
Gopnik, Wellman and colleagues (Gopnik, 1993; Gopnik & Meltzoff, 1997; Gopnik & Wellman, 1994; Wellman & Gelman, 1998) favour the “theory theory”, which proposes that children’s early conceptions of the mind are theory-like, sharing key features with scientific theories: they are abstract (i.e., framed in a different vocabulary from empirical observations), hold explanatory and predictive power, lead to distinctive interpretations of evidence, and are open to revision based on counterevidence. The theory theory holds that ToM development is a gradual transition from one view of the mind to another, rather than being a simple all-or-none acquisition of “a” theory of mind. In line with the theory theory, Wellman and colleagues (Bartsch & Wellman, 1989, 1995; Gopnik & Wellman, 1994; Wellman, 1990; Wellman & Woolley, 1990) propose a specific developmental sequence in which children’s understanding of the motivational forces behind people’s actions advances from a “simple desire” psychology to a more adult “belief-desire” psychology. In Wellman’s framework, 2-year-olds hold a simplified understanding of desire and perception (where others are attributed internal dispositions toward or against certain actions or objects), but fail to understand that people have internal mental representations¹ of the world such as beliefs; in an intermediate phase, 3-year-olds develop a nonrepresentational understanding of belief while beginning to comprehend representational aspects of desire and perception; and at around age four, children begin to realise that individuals’ beliefs (i.e., their representations of reality rather than reality itself) determine their actions.
According to the theory theorists, 2- and 3-year-olds fail the standard false belief (unexpected transfer) task because as simple desire psychologists, they do not attribute a belief to Maxi, but rather they predict that Maxi will act to fulfil his desire for chocolate and will therefore look where the chocolate actually is.

Perner (1991, 1993, 1995, 2000) has articulated an alternative competence account of ToM development which, like that of Wellman, Gopnik and colleagues, focuses on children’s understanding of mental states as representations. While Perner considers himself a theory theorist, he states that the use of the word “theory” is meant to signal the notion that conceptual understanding unfolds as a result of the growth of interdependent concepts, rather than suggesting that children’s intellectual growth is analogous to scientists making new discoveries (which is explicitly proposed by Gopnik & Wellman, 1994). He argues that young children begin with a nonrepresentational conception of mind, and that it is only when children acquire a general theory of representations (a Representational Theory of Mind; RTM) that they are able to solve false belief tasks. In Perner’s model, an RTM involves comprehending that propositions (e.g., beliefs) are semantically evaluable as being true or false. That is, propositions are “about” a world against which their truth is evaluated (Perner, 2000). In claiming that young children do not understand representations, his contention is that they do not understand that a proposition can be evaluated by someone else as having a different truth value than the one it has in reality (or the one assigned to it by the child him/herself).
Thus, Perner’s (1995) explanation of young children’s failure on the unexpected transfer task is that:

...they cannot distinguish between the state of the world that the belief is about and how the believer conceives of that state of the world, or in other words, children cannot conceive of belief as misrepresenting where the chocolate really is as being in location A. Without this understanding children cannot understand why an agent who wants to find the chocolate in its real world location (B) would act as if the chocolate were in A. (p. 251).

¹ Here, a “representation” may be defined as an entity in the mind which represents a state of affairs in the world, like a “picture-in-the-head” (Leslie & Thaiss, 1992). This is distinct from Leslie’s (1987, 1994a) concept of “metarepresentation”, which is discussed later.

However, understanding propositions as evaluable as true or false is not enough to successfully solve false belief tasks (Perner, 2000). The child must additionally realise that the belief has causal power – it takes precedence (over the world itself) in determining behaviour. Thus, a false belief about the chocolate’s location, rather than the chocolate’s actual location, makes Maxi look in the wrong place. Another distinction between the positions outlined by Perner and Wellman is the specific nature of the successive theories of mind that children are said to discover, with Perner rejecting Wellman’s simple desire psychology in favour of his concept of “prelief”. He argues that young children’s appreciation of pretence (i.e., their ability for pretend play, which is first employed between 18 and 24 months) implies a realisation that people do not always act in a way that satisfies their desires objectively.
However, since the young child cannot differentiate between actions based on a false belief which is held as true and actions based on pretence where what is being pretended is not held as true, s/he understands these states of belief and pretence as the amalgamated protoconcept of prelief, which s/he conceptualises as “behaving as if” (Perner, Baker & Hutton, 1994). Children eventually develop a more adult stage of understanding of “behaving as is”, whereby people behave according to the beliefs they hold as being true (this stage in Perner’s model is not distinguishable in obvious ways from the 4-year-old stage of understanding outlined by Wellman and colleagues).

In contrast to competence accounts such as those of Wellman and Perner, performance accounts hold that young children fail false belief tasks because processing limitations² mask their true ability, as evidenced by demonstrations of earlier competence when the testing procedure is modified – such as by asking the child “Where will Maxi look first for his chocolate?” in the unexpected transfer task (e.g., Chandler, Fritz, & Hala, 1989; Freeman & Lacohée, 1995; Mitchell & Lacohée, 1991; Roth & Leslie, 1998; Siegal & Beattie, 1991). This position has been articulated most thoroughly by Leslie and colleagues, who argue that ToM arises from an attentional mechanism specialised for selectively attending to mental states (the Theory of Mind Mechanism; ToMM) which is innate, domain-specific, operates spontaneously from very early in life without formal instruction, and can be dissociably damaged – in other words, it is modular (German & Leslie, 2000; Leslie, 1987, 1991, 1994a, 1994b; Leslie & Roth, 1993; Leslie & Thaiss, 1992; Roth & Leslie, 1998; Scholl & Leslie, 1999, 2001; Surian & Leslie, 1999).

² These include executive functions such as inhibition and working memory. Accounts of ToM development based around advances in EF are reviewed in Section 2.3.1.
For Leslie, very young children’s apparent appreciation for something as abstract and unobservable as others’ mental states is best explained by an innately specified module. As for Perner, the very early appearance of pretend play is an important factor in Leslie’s model; however, for Leslie it marks an early capacity for metarepresentation, which also underlies the concept of belief and indicates the early presence of a ToMM. Leslie (1987, 1994a), based on Pylyshyn (1978), distinguishes between primary representations, which are direct, literal representations about a state of affairs in the world; and metarepresentations³, which may be described as representations of representations. A metarepresentation describes an agent’s (e.g., mother’s, self’s) mental state, or provides an “agent-centred” description of a situation, which is “decoupled” from the primary representation and processed as if it were a copy or report of the primary representation (Leslie & Roth, 1993). It does this by specifying an “informational relation” (or “propositional attitude”; e.g., DESIRING, PRETENDING) between the agent, an aspect of reality (described by a primary representation) and an imaginary situation (described by the “decoupled” representation). For example, the metarepresentation mother PRETENDS [of] this banana [that] “it is a telephone” allows the child to make sense of his/her mother’s behaviour of talking to a banana by reference to his/her mother’s mental state (i.e., her attitude of pretence towards the banana), without making the real-world inference that “bananas are telephones”. Leslie’s assumption is that there is a small core set of innate informational relations available to the ToMM early on, such as BELIEVING, DESIRING, and PRETENDING.
As these attitudes are all deployed within the same metarepresentational structure, Leslie’s explanation for why young children are able to demonstrate understanding of pretence and desire, but not belief, rests on an additional component of his model termed the “Selection Processor” (SP; Leslie & Thaiss, 1992). The SP is an inhibitory mechanism which allows the child to select the specific relevant information that is required for the belief content inference, while disregarding prepotent competing information (e.g., in the unexpected transfer task, inferring the correct content of Maxi’s belief requires selecting the situation which Maxi was exposed to at the beginning of the scenario from memory and resisting basing the inference on the current situation - a tendency which is prepotent because beliefs are usually true representations of current reality; Leslie, 1994a). According to Leslie, 3- and 4-year-old children do not differ fundamentally in their conception of belief, but 3-year-olds fail the false belief task because the SP is poorly developed. Hence, task manipulations which decrease the load on the SP often result in an improvement in young children’s performance on false belief tasks.

³ Perner (1991) has criticised Leslie’s use of the term metarepresentation as suggesting that the young child has a conscious “theory of representation”. However, Leslie has specified (Leslie & Thaiss, 1992; Leslie & Roth, 1993) that he does not intend it in this sense, but rather intends it to denote a kind of data structure computed by our cognitive system, or an information processing mechanism which helps to create conceptual knowledge. He does not mean to imply that children have a conscious theory that mental states are representations in the head (as Perner does to some degree when he proposes the Representational Theory of Mind).
A volley of criticisms directed at both theoretical foundations and methodological approaches continues to shoot back and forth between competence and performance theorists⁴ (see, for example, German & Leslie, 2000; Perner, 2000; Roth & Leslie, 1998; Wellman et al., 2001). Theory theorists have accused modularity accounts of being “antidevelopmental” (Gopnik & Wellman, 1994; for a defence see Scholl & Leslie, 1999), while modularity theorists argue that theory theories are purely descriptive (lacking a specification of the cognitive architecture and mechanisms underlying theory development), and require that the young child develop explicit theories about impossibly abstract concepts (Roth & Leslie, 1998; but see Perner, 2000). Competence theorists claim that studies purporting to show improved performance of 3-year-olds on simplified false belief tasks have not been consistently replicated and are open to alternative interpretations (Perner, 2000), while performance theorists maintain that 3-year-olds’ failure on standard false belief tasks is a false negative (Leslie, 1994a). While it is not possible to do full justice to these arguments here, placing ToM in at least a broad theoretical context is helpful both in evaluating models of the ToM-EF relationship and in conceptualising the impairment of ToM in autism – the latter of which we turn to now.

2.1.3 ToM in autism

Complementing the vast literature addressing the typical development of ToM, an equally large, if not larger, number of studies have investigated ToM in children with autism. This body of work began with a seminal paper by Baron-Cohen, Leslie and Frith (1985), in which children with autism were tested on a variation of Wimmer and Perner’s (1983) unexpected transfer task.
Baron-Cohen et al.’s “Sally-Anne” scenario, which has become the most frequently used version of the unexpected transfer test in the autism literature, involves two doll protagonists, Sally and Anne. Sally places a marble into her basket, then leaves the scene. Anne takes the marble and hides it in her box. When Sally returns, the child is asked “Where will Sally look for her marble?” (the Belief Question). Two control questions probe knowledge of the current location of the marble (the Reality Question) and the marble’s initial location (the Memory Question). Baron-Cohen et al. found that while all autistic and control children were able to correctly answer the control questions, only 20% of children with autism passed the Belief Question, compared with 85% of typically developing children and 86% of children with Down’s Syndrome (suggesting that the poor performance in children with autism was not attributable to intellectual disability). They interpreted this result as evidence for a metarepresentational deficit specific to autism (based on Leslie’s (1987) theory of ToM), which had the potential to explain autistic symptoms such as social impairment and lack of pretend play. Impaired performance of individuals with autism on false belief tasks has since been replicated in numerous studies (although failures to replicate have also occurred, as discussed later).

⁴ For other major accounts of ToM development which have not been reviewed here (e.g., simulation theory, counterfactuality), the reader is referred to Mitchell and Riggs (2000).
These studies have included a range of task variations such as using real people instead of puppets and a “think” question rather than a “look” question (i.e., “Where does Sally think the marble is?”), as well as using alternative false belief paradigms such as the deceptive box (“Smarties”) test and the unexpected identity (appearance-reality) task (Baron-Cohen, 1989a; Charman & Baron-Cohen, 1992; Eisenmajer & Prior, 1991; Leekam & Perner, 1991; Leslie & Frith, 1988; Leslie & Thaiss, 1992; Perner, Frith, Leslie & Leekam, 1989; Ozonoff et al., 1991; Reed & Peterson, 1990; Surian & Leslie, 1999). Individuals with autism have also demonstrated significantly poorer performance than controls on various other tasks tapping mentalising ability⁵, such as sequencing of mentalistic picture stories (Baron-Cohen, Leslie & Frith, 1986); tests of the mental-physical distinction (Baron-Cohen, 1989a; Ozonoff et al., 1991); describing the mental functions of the brain (Baron-Cohen, 1989a; Ozonoff et al., 1991); recognition, comprehension and expression of mental state terms (Baron-Cohen et al., 1994; Tager-Flusberg, 1992; Ziatas, Durkin & Pratt, 1998); inferring the mentalistic significance of the eyes (Baron-Cohen, Campbell, Karmiloff-Smith, Grant, & Walker, 1995; Baron-Cohen et al., 1999a; Baron-Cohen, Jolliffe, Mortimore, & Robertson, 1997; Baron-Cohen, Wheelwright, Hill, Raste, & Plumb, 2001a); attribution of mental states to animated shapes (Castelli, Frith, Happé & Frith, 2002); conceptual perspective-taking (Dawson & Fernald, 1987); tests of deception (Baron-Cohen, 1992; Russell, Mauthner, Sharpe, & Tidswell, 1991; Sodian & Frith, 1992); understanding that “seeing-leads-to-knowing” (Baron-Cohen & Goodhart, 1994; Leslie & Frith, 1988); understanding that beliefs cause emotions (Baron-Cohen, 1991a); and understanding of intentions (Phillips, Baron-Cohen & Rutter, 1998).

⁵ The term “mentalising ability” is intended as a synonym for ToM (i.e., the ability to make inferences about mental states).

On the basis of this kind of evidence, some researchers have proposed that the whole range of autistic symptomatology may be explained by a single, primary, cognitive deficit in ToM (e.g., Baron-Cohen, 1988, 1991c; Frith et al., 1991; Leslie, 1987, 1991). Furthermore, the same authors have argued that the apparent domain specificity of the ToM impairment in autism is an existence proof that ToM is a modular capacity. For example, Leslie (1987, 1991; Leslie & Thaiss, 1992) has argued that the modular Theory of Mind Mechanism (ToMM), which automatically leads us to interpret behaviour in terms of an agent’s mental states, is specifically impaired in autistic individuals and can explain their social and communicative impairments and lack of pretend play. Baron-Cohen (1994, 1995, 1998) outlined an alternative view whereby ToMM does not come fully prepackaged as an innate module, but rather is preceded by several lower-level modular mechanisms which extract relevant social information and provide critical inputs to the development of ToM. These mechanisms include an Eye Direction Detector (EDD) which alerts the infant to the eye region and thereby provides opportunities to learn the mentalistic significance of eye gaze; an Intentionality Detector (ID) that directs attention to animate actions, enabling the infant to learn about goal-directedness; and a Shared Attention Mechanism (SAM), which uses inputs from the other two mechanisms to allow the infant to work out if s/he and another person are jointly attending to the same thing.
In this model, ToMM is conceptualised either as a more mature development of SAM or as being triggered by SAM.

Tests of the validity of the ToM hypothesis of autism (the view that autism may be explained by a primary deficit in a ToM module) have focused on the central criteria required to uphold the position (outlined in Chapter 1, Section 1.2): that i) a ToM impairment is universal among individuals with autism; ii) a ToM impairment is unique to individuals with autism; iii) a ToM impairment can explain the earliest signs of autism in infants (causal precedence); and iv) a ToM impairment can account for the entire range of symptoms displayed by individuals with autism (explanatory value). In addition, the modular ToM hypothesis must meet criterion v), that failure on ToM tasks is best explained by a domain-specific ToM impairment, and cannot be accounted for in terms of other cognitive constructs. The evidence for each of these claims is reviewed below.

i) Universality. From the first study of ToM in autism (Baron-Cohen et al., 1985), it was evident that a proportion of autistic individuals were able to pass false belief tasks. The percentage of autistic individuals found to pass standard false belief (unexpected transfer or “Sally-Anne”) tasks in subsequent studies has varied from 15% (Reed & Peterson, 1990) to 55% (Prior, Dahlstrom, & Squires, 1990), with 90% of participants with autism passing in one study (Dahlgren & Trillingsgaard, 1996). Although in most cases the proportion of passers with autism is significantly smaller than the proportion of successful control participants (usually matched on verbal mental age), the finding that any child with autism passes false belief tasks poses a challenge to the ToM hypothesis of autism (although random responding could result in a correct response).
Baron-Cohen (1989b) responded to this challenge with a study demonstrating that individuals with autism who pass standard first-order false belief tasks are still unable to make more complex second-order false belief attributions (i.e., of the form “Mary thinks that John thinks the icecream van is in the park”; Perner & Wimmer, 1985). He proposed that autism is characterised by a specific developmental delay in ToM, such that older and more able participants with autism are able to pass first-order false belief tasks which are usually mastered by the age of four, but still fail on more difficult tasks which are usually only passed by the age of six or seven. However, Baron-Cohen’s (1989b) finding has not been replicated in a number of subsequent studies, which have found that a subset of participants with high-functioning autism or with Asperger syndrome also pass second-order false belief tasks (Bauminger & Kasari, 1999; Bowler, 1992; Dahlgren & Trillingsgaard, 1996; Leekam & Prior, 1994; Ozonoff et al., 1991; Sparrevohn & Howie, 1995). Ozonoff et al. (1991) found that EF deficits were more universal than ToM impairment among high-functioning autistic individuals (see Section 2.2.3 for further discussion of this finding). Furthermore, Tager-Flusberg and Sullivan (1994b) showed that both autistic and control first-order task passers were able to pass a shorter and less complex second-order task than the task used in previous studies, suggesting that their failure on traditional second-order tasks was more likely to be due to the high information processing load than a lack of conceptual understanding.

Nevertheless, it has been argued that ToM is not measurable only by performance on false belief tasks (e.g., Tager-Flusberg, 2001). Studies using higher-level tests of mentalising ability have found that first- and second-order false belief task passers still demonstrate evidence of impairment in ToM.
Happé (1994a) found that individuals with autism who passed both first- and second-order false belief tasks were significantly poorer than mentally handicapped and normal children and adults at providing context-appropriate mental state explanations for nonliteral utterances made by story characters, which she argues is a more “advanced”, naturalistic test of ToM. First- and second-order passers performed more poorly than controls on a test requiring inference of complex mental states from expression of the eyes (Baron-Cohen et al., 1997), and high functioning adults with autism who passed a first-order false belief task performed significantly worse than controls on tests measuring attribution of mental states to voices and eyes (Kleinman, Marciano & Ault, 2001). Frith, Happé and Siddons (1994) also found that first-order passers still showed impairments in everyday social behaviours which require mentalising. Studies examining the characteristics of autistic false belief passers have tended to find that a high verbal mental age or verbal IQ is a necessary but not sufficient condition for passing false belief tasks (Charman & Baron-Cohen, 1992; Eisenmajer & Prior, 1991; Leekam & Perner, 1991; Prior et al., 1990; Sparrevohn & Howie, 1995). In a review of the literature, Happé (1995) found that children with autism require a verbal mental age more than twice as high as control participants in order to pass false belief tasks. Other studies have found that chronological age is a significant factor, either in addition to or instead of verbal mental age (Baron-Cohen, 1992; Prior et al., 1990), while still others have found no relationship between age and ability variables and false belief task performance (Baron-Cohen et al., 1985; Perner et al., 1989). 
The general finding that passers tend to be of higher verbal ability is consistent with the idea, put forward by a number of authors, that individuals with autism who pass false belief tasks do so not by the usual use of ToM, but by using an alternative compensatory route to success (Eisenmajer & Prior, 1991; Frith et al., 1991; Happé, 1995; Holroyd & Baron-Cohen, 1993; Ozonoff et al., 1991). For example, Frith et al. (1991) suggested that able autistic individuals may have learned or extracted explicit rules about certain social situations, such as “When something in the world changes, people who just happen not to have seen the change occur behave (for some reason) as if they do not know about these changes” (p. 436). A study by Happé et al. (1996) provides some support for this idea, finding that adults with Asperger syndrome showed activation of different areas of the prefrontal cortex from controls when listening to mentalistic stories. However, more direct evidence confirming that false belief task passers are using compensatory strategies to deduce their solution is yet to be obtained.

ii) Uniqueness. Proponents of the ToM hypothesis argue that a ToM impairment is unique to autism, citing evidence that control groups of children with either Down’s syndrome (e.g., Baron-Cohen et al., 1985), other kinds of mental retardation (e.g., Charman & Baron-Cohen, 1992), or specific language impairment (Leslie & Frith, 1988) do not show impaired performance on false belief tasks in comparison with children with autism. However, these findings have been challenged in a number of other studies which have either failed to replicate significantly poorer performance of children with autism on various ToM tasks compared with controls (Carpenter, Pennington & Rogers, 2001; Charman & Lynggaard, 1998; Dahlgren & Trillingsgaard, 1996; Oswald & Ollendick, 1989; Prior et al., 1990; Tager-Flusberg & Sullivan, 1994a), or have found ToM impairments in other clinical populations.
It has become apparent that mentally retarded, non-autistic individuals perform more poorly on false belief tasks than would be expected given their chronological and mental age (Benson, Abbeduto, Short, Bibler-Nuccio, & Maas, 1993; Yirmiya, Erel, Shaked, & Solomonica-Levi, 1998; Yirmiya & Shulman, 1996; Yirmiya, Solomonica-Levi, Shulman, & Pilowsky, 1996; Zelazo, Burack, Benedetto, & Frye, 1996a). Yirmiya et al.’s (1998) meta-analysis comparing the ToM abilities of individuals with autism, mental retardation (MR), and typically developing individuals showed that although autistic individuals were the most severely impaired on ToM tasks, individuals with MR also performed significantly more poorly than typically developing individuals. This result led them to conclude that it may be the severity of ToM impairment rather than the impairment itself that is unique to autism. They also found that the aetiology of the MR was an important factor, with individuals with Down’s syndrome performing better than other individuals with MR of unknown aetiologies. In addition to MR, impairments in ToM have been found in deaf children (de Villiers, 2000; Peterson, 2002; Peterson & Siegal, 1995), blind children (Brown, Hobson, Lee, & Stevenson, 1997; Minter, Hobson, & Bishop, 1998), and individuals with schizophrenia (Corcoran, Mercer, & Frith, 1995; Mazza, De Risio, Surian, Roncone, & Casacchia, 2001; Pilowsky, Yirmiya, Arbelle, & Mozes, 2000), bipolar affective disorder (Kerr, Dunbar, & Bentall, 2003), borderline personality disorder (Fonagy et al., 1995), non-verbal learning disorder (Buitelaar, Swaab, van der Wees, Wildschut, & van der Gaag, 1996), Parkinson’s disease (Mengelberg & Siegert, 2003; Saltzman, Strauss, Hunter, & Archibald, 2000), and frontotemporal dementia (Gregory et al., 2002; Lough & Hodges, 2002).
Contrary to the findings of Leslie and Frith (1988), other studies have found ToM deficits in children with specific language impairment and other communicative disabilities (e.g., Dahlgren, Dahlgren Sandberg, & Hjelmquist, 2003). These challenges to the uniqueness criterion of the ToM hypothesis have been countered by claims that these non-autistic clinical groups do not show as severe an impairment on ToM tasks as individuals with autism, and that they fail ToM tasks for different reasons than individuals with autism (i.e., their failure is not due to a genuine metarepresentational deficit). For example, individuals with MR may fail because of poor general cognitive and linguistic skills, deaf and blind children may fail because they lack the necessary perceptual input, such as access to language or facial information, and individuals with borderline personality disorder may fail because parental neglect and abuse prevented the normal development of ToM (Baron-Cohen, 2000; Corcoran, 2000; Tager-Flusberg, 2001). However, these claims are yet to be confirmed empirically. In addition, as discussed further in the domain specificity section, it must be demonstrated that children with autism do not also fail ToM tasks because of domain-general cognitive or linguistic difficulties.

iii) Causal precedence. Much of the evidence for the ToM hypothesis has focused on performance on false belief tasks, on which successful performance normally develops at around the age of four and is interpreted as evidence of a metarepresentational capacity (Leslie, 1987) or representational understanding of mind (Perner, 1991). However, in most cases autism is apparent at a much younger age, with deficits in social responsiveness and reciprocity, symbolic play, gaze behaviour, joint attention, and imitation often noticed during infancy or when the child is a toddler (e.g., Dawson & Adams, 1984; Klin, Volkmar & Sparrow, 1992; Mundy & Sigman, 1989; Volkmar et al., 1987). Klin et al.
(1992) pointed out that, as the crux of the ToM hypothesis of autism is the inability to represent others’ mental states, the resulting prediction would be that social impairment in autism should only become apparent at the age at which metarepresentational skills appear in typically developing children. It is unclear exactly when this is, with Leslie’s (1987) original thesis being that pretend play may be the earliest manifestation at around 18 months, and later authors suggesting that earlier behaviours such as protodeclarative pointing (11-12 months) and joint attention (8-12 months) may be the earliest signs (Baron-Cohen, 1989d, 1991b), although these latter abilities are proposed as “precursors” to ToM rather than signs of an early ToM itself⁶. Regardless, Klin et al. (1992) found that six types of social behaviour from the Vineland Adaptive Behavior Scales which emerged prior to the age of eight months successfully discriminated autistic children from controls. The nonrepresentational nature of these behaviours, such as “shows anticipation of being picked up by a caregiver” and “reaches for familiar person”, was taken to indicate an early pre-mentalising social impairment in autism. These kinds of findings of early, apparently non-mentalistic social impairment in autism have been interpreted as evidence in favour of a more primary affective, emotional, or intersubjective impairment in autism (e.g., Hobson, 1993; Klin & Volkmar, 1993; Mundy, Sigman, & Kasari, 1993). However, the very early recognition of autism in Klin et al.’s (1992) participants is not typical, with most other studies finding that it is not possible to reliably detect autism until at least the age of 18 months (e.g., Johnson et al., 1992).

⁶ Leslie and Happé (1989) have, however, argued that joint attention may also indicate the emergence of an ability to represent mental states, as these behaviours convey the intention to communicate.
In addition, some of Klin et al.’s autistic participants did show typical social behaviours, raising the possibility of different subgroups within the autism spectrum. The question of whether the ToM hypothesis can meet the criterion of causal precedence remains a matter of debate (see Charman, 2000).

iv) Explanatory value. The strongest form of the ToM hypothesis asserts that impairment in the ToM module can explain the entire range of symptoms displayed by individuals with autism (Frith et al., 1991), although original accounts of the ToM hypothesis focussed mainly on the social and communicative impairments characteristic of autism. For example, Baron-Cohen (1988) proposed that the ToM hypothesis would predict impairments in social skills requiring an ability to represent mental states, as well as pragmatic language skills, as conversing requires that the speaker be aware of the listener’s mental state. While the relationship between ToM and real-life social skills appears to make intuitive sense, it has not been directly investigated in many studies. Dawson and Fernald (1987) reported a significant correlation between autistic children’s conceptual perspective-taking ability and their teachers’ ratings of social skills. Frith et al. (1994) found that individuals with autism who passed false belief tasks were more likely to show evidence of “mind-reading” in their everyday social behaviour and had better communicative abilities, while those who failed false belief tasks showed few social behaviours requiring understanding of mental states. In their sample of young French preschoolers with autism or PDD-NOS, Hughes, Soares-Boucaud, Hochmann, and Frith (1997) found significant differences between ToM “passers” and “failers” in ratings of everyday social behaviours requiring mentalising abilities, but only when the teacher rather than the parent was the informant. However, neither Prior et al.
(1990) nor Sparrevohn and Howie (1995) found a significant correlation between false belief performance and social skills, as rated by parents and teachers respectively.

The literature examining the relationship between ToM and language abilities in autism is much larger, with most studies confirming Baron-Cohen’s (1988) prediction of a relationship between ToM and pragmatic language skills (e.g., Capps, Kehres, & Sigman, 1998; Tager-Flusberg & Sullivan, 1995). However, it has also become clear that individuals with autism show non-pragmatic language impairments (e.g., in lexical and grammatical knowledge) which are not likely to be the result of a ToM deficit (Tager-Flusberg, 1999b), but which do correlate with false belief task performance (e.g., Happé, 1995; Sparrevohn & Howie, 1995). This raises the question of the direction of the causal relationship between language and ToM, the answer to which is “likely to be complex” (Tager-Flusberg, 2000). While longitudinal studies have shown that joint attention behaviours (arguably “precursors” to ToM) in toddlers with autism predicted language gains several years later (Sigman & Ruskin, 1999), suggesting that ToM ability is necessary for adequate language development, the reverse has also been demonstrated - that structural language skills play a key role in ToM development (de Villiers & de Villiers, 1999). Regardless, it is clear that there is a close relationship between ToM and language in autism.

The same cannot be said, however, for the relationship between ToM and the much-neglected third feature of the autistic triad, the repetitive behaviours and restricted interests which are part of the DSM-IV criteria for autism. While the ToM hypothesis is able to account for the lack of pretend play displayed by autistic children, it is less obvious how it might explain other aspects of the third feature of the triad, such as obsessional interests or repetitive arm-flapping or toe-walking.
Baron-Cohen (1989c) and Carruthers (1996) both attempted to explain repetitive behaviours in autism by proposing that they develop as a strategy to cope with and gain control over the unpredictable and frightening social world that surrounds the child who is unable to understand others’ mental states. This account predicts that the frequency of repetitive activities should be higher in social settings, especially those which lack a predictable structure. However, most studies have reported the converse finding, that rates of stereotyped behaviour are lowest during periods of social interaction and highest during periods where no interpersonal demands are made (Clark & Rutter, 1981; Dadds, Schwartz, Adams, & Rose, 1988; Donnellan, Anderson, & Mesaros, 1984). In the only study reported in the literature so far to directly investigate the relationship between ToM and repetitive behaviours in autism, Turner (1996, 1997) found no relationship between false belief task performance and the incidence or severity of a large range of repetitive behaviours.

Similarly, the ToM hypothesis faces difficulty explaining so-called “non-triad” features of autism, which appear frequently but are not part of the diagnostic criteria (Frith & Happé, 1994; Tager-Flusberg, 2001). These include savant abilities, exceptional visuospatial and visuoperceptual skills, over-selective attention, and heightened sensory sensitivities. These aspects of autism do not bear an obvious relation to ToM ability, and may be better explained by the local processing style that appears to be characteristic of autistic individuals (Frith & Happé, 1994; Happé, 1997, 1999; Plaisted, 2000, 2001). The inability thus far of the ToM hypothesis to adequately meet the criterion of explanatory value could arguably be considered one of the most substantial problems to have faced it.

v) Domain specificity.
The criterion of domain specificity results from the claim that ToM reflects an innate module that develops separately from other cognitive capacities, and is independently impaired in autism (Leslie, 1987, 1991; Baron-Cohen, 1991c). This assertion has been defended by citing evidence that individuals with autism are able to pass tasks which have equivalent structure and demands to false belief (or other ToM) tasks, but do not have mentalistic content. This approach of comparing autistic assets and deficits on tasks which require mentalising and those which do not has been dubbed the “fine cuts” technique by Frith and Happé (Frith & Happé, 1994; Happé & Frith, 1995). For example, it has been found that while individuals with autism fail false belief tasks, they are able to pass tests involving false photographs, drawings, and models[7] (Charman & Baron-Cohen, 1992, 1995; Leekam & Perner, 1991; Leslie & Thaiss, 1992). Similarly, autistic individuals demonstrate understanding of behavioural but not mentalistic picture sequences (Baron-Cohen et al., 1986), understand “see” but not “know” (Perner et al., 1989), and engage in physical sabotage but not deception (Sodian & Frith, 1992). Additionally, Baron-Cohen (1991c) found that participants with autism were not impaired in domains of social cognition which do not require a ToM.
[7] The false photograph paradigm, for example, runs as follows: a horse puppet takes a photograph of a cat puppet. The cat then moves from the chair to the bed. The child is asked, “In the photograph, where is the cat sitting?”, as well as two control questions probing knowledge of where the cat was when the horse took the photograph and where the cat is now. It is argued that this task is identical in structure to the false belief task, but requires reasoning about outdated physical representations instead of mental representations.
However, the claim of domain specificity has come under increasing criticism, with recent evidence suggesting that impaired performance on false belief tasks may still be explained, and even better accounted for, by deficits in more domain general processes such as EF (see Section 2.3) or language impairment (Bruner & Feldman, 1993; Tager-Flusberg, 2000). Zelazo and colleagues (Zelazo et al., 1996a; Zelazo, Burack, Boseovski, Jacques, & Frye, 2001) have argued that failure to explicitly test and meet the assumption that two tasks (such as the false belief and false photograph tasks) are of the same underlying complexity is problematic, and weakens any arguments for domain specificity and modularity (this argument, and further criticisms of the false photograph task, are discussed in Section 2.3). They have found that children with autism are impaired on another control task without mental content, which is matched for underlying complexity to the false belief task (e.g., Zelazo et al., 1996a). Frye (2000) also provides a number of cogent a priori arguments for why ToM should not be considered a domain specific function. For example, given that ToM refers to understanding one’s own beliefs as well as others’ (with research confirming self-other equivalence on the Smarties and appearance-reality tasks), Frye questions where the domain boundaries in our own beliefs would be, as our beliefs can be about nonmentalistic things such as physics or biology. He also provides a critique of Baron-Cohen’s (1994) Intentionality Detector module, pointing out that assigning intention on the basis of the direction of movements will tend to over-ascribe intentionality to every change in direction we happen to take, and under-ascribe intentionality to acts which do not involve movement, such as not preventing something from happening.
While proponents of the ToM hypothesis have constructed some fairly plausible defences against several of the attacks on its claims of universality, uniqueness, causal precedence, explanatory value, and domain specificity, converging counterargument and evidence have resulted in a general retreat from the strong version of the hypothesis that autism may be explained by a single primary cognitive deficit in a ToM module. While some authors still largely adhere to the original strong version of the ToM hypothesis (e.g., Surian & Leslie, 1999), many of its original proponents now advocate a weaker version in which ToM is conceptualised as one of multiple cognitive impairments in autism (Baron-Cohen & Swettenham, 1997; Frith & Happé, 1994; Happé & Frith, 1996), and/or is not necessarily considered to be a unitary module, but rather a more multidimensional ability which emerges gradually during development (Tager-Flusberg, 2001).
2.2 Executive function (EF)
2.2.1 Defining and measuring EF
Because of its complex and theoretical nature, defining and operationalising “executive function” has proven to be a persistent problem. To some extent, the chosen definition of EF is dependent upon the author’s favoured model of its underlying structure. However, EF is generally understood to be an umbrella term covering a number of related but distinct high-level cognitive capacities which help guide and control purposeful behaviour towards attainment of a goal (e.g., Lezak, 1993; Luria, 1966; Stuss & Benson, 1986; Welsh & Pennington, 1988). These capacities include planning, set-shifting (also known as attentional switching or cognitive flexibility), strategy formation, inhibition, working memory, generativity[8], decision-making, and self-monitoring.
In his overview of issues in EF assessment, Rabbitt (1997) proposed that: “executive control is necessary to deal with novel tasks that require us to formulate a goal, to plan, and to choose between alternative sequences of behaviour to reach this goal, to compare these plans in respect of their relative probabilities of success and their relative efficiency in attaining the chosen goal, to initiate the plan selected and to carry it through, amending it as necessary, until it is successful or until impending failure is recognised.” (p. 3) Compounding the difficulty with the precise definition of EF is the regular tendency to use the term “frontal” (or more precisely, “prefrontal”) as a synonym for “executive”, thereby confusing neuropsychological and neuroanatomical concepts. This confusion has arisen because the cognitive construct of EF was originally posed in response to observations of patients with frontal lobe damage, whose disorganised and disinhibited behaviours were hypothesised to have their origins in executive dysfunction. This led to a situation whereby some authors have considered any operation performed by the frontal lobes (or any behavioural symptom of frontal lobe damage) to be an EF, including constructs or functions such as emotion regulation and affective responsiveness, social behaviour and personality, insight, humour appreciation, and self-awareness. In the view of Zelazo and Müller (2002), EF includes both “cool”, cognitive aspects and “hot”, affective aspects. However, it is important to make it clear that in this thesis, EF will be considered a purely cognitive (“cool”) construct, as defined (albeit broadly) above[9], independent from any neuroanatomical basis.
[8] The term ‘fluency’ is often used for this concept; however, ‘generativity’ is preferred within this thesis.
While there is strong support for the notion that EFs are at least partially subserved by frontal regions of the brain, mounting evidence indicates that the relationship is certainly not well defined, and many EF measures lack both sensitivity and specificity to frontal lesions (Reitan & Wolfson, 1994; Stuss & Alexander, 2000; Tranel, Anderson & Benton, 1994). Understandably, the measurement of EF in both adults and children has been just as problematic as its definition. The difficulty with EF measurement was predicted by Fodor (1983), who proposed the existence of domain-general, non-modular “central processes” which would be “bad candidates for scientific study” (p.127). Unlike ToM, where the false belief paradigm has (arguably) become a “gold standard” for its measurement, a similar gold standard for assessing EF has proven elusive – and perhaps unfeasible. This is not only because of EF’s complexity, but also because EF is a theoretical rather than an operational term (Burgess, 1997). To borrow Burgess’ example, one can clearly call a patient dyscalculic if s/he shows impaired performance on calculation tasks (or prosopagnosic if s/he shows impaired performance on face recognition tasks), but there is no equivalent way of determining whether or not an individual may be diagnosed as dysexecutive: there is no prototypical screening measure. This has meant that unlike any other cognitive domain, the validity of an EF test is not typically evaluated on psychological grounds, but rather in terms of whether or not patients with frontal lesions show impaired performance on it. However, the loose correspondence between the psychological and the anatomical makes this inference problematic. For example, one of the most widely used tests of EF is the Wisconsin Card Sorting Test (WCST; Grant & Berg, 1948), in which participants must work out rules for sorting cards by certain categories, and then adapt their responses according to feedback when the rules unexpectedly change.
Patients with frontal lobe lesions have previously been found to achieve fewer categories and make more perseverative errors than patients with posterior lesions (Drewe, 1974; Milner, 1963). However, more recent evidence has shown that non-frontal or diffuse brain damage can produce similar deficits (e.g., Anderson, Damasio, Jones, & Tranel, 1991; Anderson, Bigler, & Blatter, 1995), and that adequate WCST performance does not exclude frontal pathology (e.g., Eslinger & Damasio, 1985). There has, however, been some attempt to delineate purely cognitive criteria for a test being a test of EF. For example, Phillips (1997) proposed that any test hypothesised to measure EF should have the following characteristics: i) the test should be novel, in order to tap goal identification and strategic planning (as well-practiced tasks can be performed using previously formulated strategies); ii) the test should be effortful in terms of task planning and execution, requiring inhibitory control and monitoring; and iii) the test may involve working memory, in order to coordinate concurring processing requirements. Similarly, Walsh (1978) proposed that EF tasks require novelty, complexity, and the need to integrate information. However, these criteria clearly apply only to multifactorial EF tasks – other more specific tests of particular EF components may not involve all of these features (the notion of EF components is discussed further in the next section).
[9] Clearly, affective factors influence and interact with cognitive processes; however, it is possible to distinguish the two conceptually, methodologically and neuroanatomically. Furthermore, in examining the relationship between ToM and EF, considering affective and/or social factors to be part of the EF domain unhelpfully clouds the issue - for example, Zelazo and Müller (2002) actually deem false belief tasks to be tests of EF.
Besides the difficulty in ascertaining the construct validity of EF tests, there are several other reasons why measuring EF is challenging. Firstly, the psychometrics of EF tests are notoriously poor. By its very nature, EF is required in novel situations; yet because tests can only be novel once, the test-retest reliability of EF tasks is consequently low (Rabbitt, 1997). Secondly, most EF tests lack purity – that is, they tap multiple underlying processes, making it difficult to discern specific reasons for failure. For example, the WCST has been commonly described as a test of abstraction and flexibility, yet it also requires selective attention to relevant dimensions of the stimuli, generation of a sorting rule, working memory to hold the sorting principle in mind, and inhibition of the prepotent response to sort the cards according to the rule just used, as well as non-EF processes such as the use of verbal feedback provided by the examiner, and appreciation of the category of number (Ozonoff, 1995a; Pennington & Ozonoff, 1996). Furthermore, attempts to control task demands in order to isolate the relevant abilities may not always be successful, as some EF tasks specifically require the simultaneous co-ordination of a variety of different processes (Kimberg & Farah, 1993). A third difficulty with EF measurement is that of low “process-behaviour correspondence” (Burgess, 1997). In contrast to most other cognitive domains which are only manifest in circumscribed situations (e.g., calculation abilities are manifest when one is required to perform a calculation, or face recognition abilities when presented with a face), EFs manifest themselves across a range of different situations. As a consequence, there is an imprecise correspondence between behaviour and the underlying process: a specific EF impairment can result in a variety of behaviours, and a specific behaviour may be caused by a variety of EF (or other cognitive) impairments.
Furthermore, the same behavioural sequence is likely to require fewer EFs over time, even within one short period, as it becomes more practiced. A final problem with assessment of EF is that testing situations are for the most part structured and guided by the examiner, removing much of the load on EF for the examinee. This results in poor ecological validity of EF tests (Cripe, 1996). Do these issues with EF assessment apply equally to children and adults? Until recently, the large majority of EF research has focused on adult populations, and measuring EF in childhood has only lately become a more popular topic of interest[10] as it becomes clear that EF develops much earlier than previously thought (see Section 2.2.2). Hughes and Graham (2002) argue that while EF measurement in children has its own set of problems, conversely there are actually some difficulties with adult EF assessment which are not so problematic in childhood. For example, children may perceive a new task as novel for longer, possibly leading to greater stability in underlying processes and overall performance (and therefore improved test reliability and validity). In addition, as EF tasks need to be simplified in order to be developmentally appropriate, the problem of task impurity is likely to be reduced. However, there are also difficulties associated with assessing EF in children, the most obvious of which, according to Hughes and Graham (2002), is children’s limited language skills.
This leads to a number of problems: i) complex task instructions tax verbal comprehension, which may influence task performance for non-EF reasons; ii) because fluent literacy is not an automatic skill until late in development, many adult EF tasks which depend on written language being over-learned are not appropriate for children (e.g., the Stroop test, in which reading a colour-word such as “red” is assumed to be a prepotent response, making it difficult to instead say the different colour that the word is printed in); and iii) language itself may play a role in EF, by both guiding behaviour through internal self-talk and by enabling the use of verbal working memory to rehearse strategies. Clearly, the development of appropriate assessment tools for EF in children is an important focus for ongoing research. While reviewing the literature on the definition and measurement of EF appears to paint a rather negative picture, it should be emphasised that EF has still managed to retain its utility, relevance and validity as a measurable construct. Tranel et al. (1994) argue that despite the difficulties in studying EF, the term provides a useful heuristic or shorthand for denoting a relatively well-agreed upon set of capacities with several unique characteristics: i) they are the highest level of human cognition; ii) they are difficult to operationalise and therefore hard to measure quantitatively; iii) they are closely intertwined with personality and consciousness; and iv) they have intimate connections with the prefrontal cortex.
[10] For lists of commonly used EF tests in children, see Anderson (1998) or Zelazo and Müller (2002).
In addition, recent research has focused on improving the measurement of EF by using more fine-grained tests with several performance measures and multiple control tasks in order to isolate specific components of EF which may be impaired (Delis, Squire, Bihrle, & Massman, 1992; Godefroy, Cabaret, Petit-Chenal, Pruvo, & Rousseaux, 1999; Ozonoff, Strayer, McMahon & Filloux, 1994) as well as attempting to make the task paradigms more ecologically valid (Manly et al., 2001; Wilson, Evans, Emslie, Alderman, & Burgess, 1998) and child-friendly (Espy, Kaufmann, Glisky, & McDiarmid, 2001; Gerstadt, Hong, & Diamond, 1994; Hughes, 1998a). It is nevertheless important to acknowledge the difficulties with the definition and measurement of EF, as it will become evident that these concerns were influential both in selecting appropriate EF tasks for the current research, and in interpreting the results gleaned from those tasks.
2.2.2 Models of EF and its development
A burgeoning literature has produced a sizeable number of alternative theoretical frameworks for conceptualising EF, and, concurrently, the functions of the prefrontal cortex (see Eslinger, 1996; Grafman, 1994; Stuss & Knight, 2002). As is the case for EF measurement, models of EF have mostly been based on adults, with theories of EF development borrowing heavily from adult concepts. Several adult-based models have utilised the classic distinction between automatic and controlled actions from traditional cognitive psychology (Atkinson & Shiffrin, 1968; Schneider & Shiffrin, 1977), where unlike automatic actions, controlled actions involve conscious, effortful processing and are required in novel, non-routine situations.
Based on a similar routine/non-routine dichotomy, an influential model of EF by Norman and Shallice (Norman & Shallice, 1980, 1986; Shallice, 1988) included two mechanisms for regulating behaviour: the Contention Scheduler, which operates in routine or overlearned situations via automatic priming of stored knowledge (analogous to scripts or schemas), which are cued either by environmental stimuli or conceptual thought; and the Supervisory Attentional System (SAS), which is activated in non-routine (novel, complex, difficult, and/or conflicting) situations and in which conscious internal knowledge states can override the contention scheduling mechanism and set the priority for action by creating new action schemata. This model held an intuitive appeal and accounted for data on attention and action failures in patients with prefrontal lesions, who were proposed to have intact Contention Schedulers but an impaired SAS (Shallice & Burgess, 1991). Shallice (2002; Shallice & Burgess, 1996) recently elaborated upon his model, outlining a number of different components to the SAS including schema selection (which can occur via three different methods), schema implementation, and schema checking and monitoring. While Shallice (1984, 2002) contends that central control processes consist of multiple components, others have argued for a more unitary control structure or mechanism. Duncan (Duncan, Burgess, & Emslie, 1995; Duncan, Emslie, Williams, Johnson, & Freer, 1996) to some extent represented this position when he claimed that EF is largely synonymous with Spearman’s g, or fluid intelligence. Others have proposed that the range of EF failures may be attributed to a single process, such as inhibition (e.g., Dempster, 1992, 1993) or working memory (e.g., Case, 1985; Goldman-Rakic, 1995).
The idea of a single EF system or mechanism has become increasingly unpopular, however, as evidence and opinion have converged upon the notion that EF consists of multiple separable components (Baddeley, 1996, 2002; Godefroy et al., 1999; Miyake, Friedman, Emerson, Witzki, & Howerter, 2000; Pennington, 1997; Stuss & Alexander, 2000). This conceptualisation accounts better for data showing weak correlations between various EF tasks (e.g., Boone, Ponton, Gorsuch, Gonzalez, & Miller, 1998; Hughes, Russell, & Robbins, 1994; Miyake et al., 2000) and differential impairment on various EF tasks in patients with lesions in different parts of the prefrontal cortex (see Stuss & Alexander, 2000). In addition, it allows us to distinguish between the various clinical groups in which executive dysfunction is found, by examining qualitative differences in profiles of performance on EF components (reviewed in the next section). However, there has been little agreement on the appropriate taxonomy for the components of EF. Lezak (1995) proposed four EF components: i) volition, ii) planning, iii) purposive action, and iv) effective performance. In their problem-solving framework of EF, Zelazo, Carter, Reznick and Frye (1997) also outlined four components: i) problem representation, ii) planning, iii) execution (rule use) and iv) evaluation (error detection/correction). While these two frameworks present four sequential stages to the problem-solving process, other conceptualisations focus more on concurrent or non-time-dependent executive processes. For example, Anderson (1998; Anderson, Levin, & Jacobs, 2002) proposes three EF components: i) attentional control (including selective and sustained attention and response inhibition), ii) goal setting (incorporating initiation, planning, and problem solving), and iii) cognitive flexibility (including working memory, attentional shifting and self-monitoring).
One model which has been particularly influential in the developmental literature is Roberts and Pennington’s (1996) interactive framework, in which performance on EF tasks is held to be a product of two separate but interdependent processes: working memory (to hold the task demands or rules in mind) and inhibitory control (to guide behaviour according to those rules). These two components compete for limited executive resources. Support for this model comes from studies showing that on EF tasks, prepotent response errors increase (indicating poorer inhibitory capacity) as the working memory demands increase (Roberts, Hager & Heron, 1994). A number of factor analytic studies have unfortunately produced a range of different results, without clearly indicating any particular model as superior to others (see Royall et al., 2002, for a review). For example, Burgess, Alderman, Evans, Emslie, and Wilson (1998) found three factors which they termed Inhibition, Intentionality and Executive Memory, whereas Collette, van der Linden and Salmon (1999) found two factors, Inhibition and Working Memory, while Boone et al. (1998) found only one Cognitive Flexibility factor (although in the latter study the EF tasks were only modestly correlated and the authors concluded that the tests tapped somewhat different abilities). In several of these studies, variables from the same EF task often load on different factors, and in addition, the same task may load on different factors in different studies depending upon which tests are included in the analysis. These inconsistencies are hardly surprising given the “impurity” of EF tasks and their questionable psychometric properties. As argued by Hughes and Graham (2002), factor analysis studies using children may be more fruitful, as more simple, “pure” tests may still tax EF in children and test performances may be more reliable.
Support for the fractionation of EF in children has indeed been provided by several studies which have generally revealed three or four distinct EF factors (Espy, Kaufmann, McDiarmid, & Glisky, 1999; Hughes, 1998a; Levin et al., 1991; Luciana & Nelson, 1998; Pennington, 1997; Welsh, Pennington & Groisser, 1991). Although named differently by different authors, these factors have consistently included cognitive flexibility or set-shifting, inhibition, and working memory, with the addition or substitution of a planning component in some studies. For example, Hughes (1998a) identified Attentional Flexibility, Inhibitory Control and Working Memory factors, and Pennington (1997) similarly found Set Shifting or Cognitive Flexibility, Motor Inhibition, and Verbal Working Memory factors; while Welsh et al. (1991) named their factors Planning, Hypothesis Testing & Impulse Control, and Fluid & Speeded Response. However, inconsistencies also appear in this developmental research, where the same task may be clustered with different tasks or be part of different factors across studies, although it is difficult to tell how much of this variability is attributable to different performance indices being used in the various studies. The stages of EF development have only relatively recently become the subject of systematic research, as evidence accumulates in opposition to the early influential notion that the prefrontal cortex was not functional at all until adolescence and did not reach maturity until around the age of 24 (Golden, 1981). Behavioural and electroencephalogram (EEG) data as well as case studies of children with early frontal lesions all now refute this view, indicating prefrontal activity even in infancy.
For example, Diamond and Goldman-Rakic (1989; Diamond, 1985) found that by 12 months of age, human infants achieved errorless performance on classic delayed response and A-not-B tasks, performance on which they argue requires working memory and inhibition, and is sensitive to frontal lesions in monkeys. Bell and Fox (1992) demonstrated changes in frontal EEG recordings during the first year of life which correlated with improved performance on the A-not-B task. A number of case studies (Anderson, Bechara, Damasio, Tranel, & Damasio, 1999; Eslinger, Biddle, & Grattan, 1997; Marlowe, 1992; Price, Daffner, Stowe, & Marsel Mesulam, 1990) have also demonstrated that very early prefrontal lesions result in immediately noticeable consequences as well as EF deficits and impaired social and moral behaviour later in life. Nevertheless, it is clear that although it is certainly not “silent” in infancy, both the physiological and functional development of the prefrontal cortex follow a particularly protracted developmental course. Investigations of synaptic density, dendritic growth, myelination, interhemispheric connectivity, metabolic activity, and electrical (EEG) activity all show that the prefrontal cortex continues to develop through middle childhood and adolescence (Diamond, 2002; Huttenlocher & Dabholkar, 1997; Schwartz, 1997; Thatcher, 1997). Cognitive studies of the development of EF have focused on mapping developmental trajectories for the various EF components. An early study by Passler, Isaac, and Hynd (1985) found that EF development was a multistage process, with a spurt of development between the ages of 6 and 8 and mastery evident by the age of 12. Similarly, Chelune and Baer (1986) found that WCST performance improved between 6 and 10 years of age, with adult performance achieved by 12 years.
More recent studies incorporating a larger range of EF measures have extended the age range and more thoroughly articulated the multidimensional nature of EF development. A study by Levin et al. (1991) supported previous findings that tests of concept formation, set-shifting and inhibition appear to be mastered by the age of 12; however, they also found additional gains in their adolescent 13-15 year-olds on measures of generativity and planning. Welsh et al. (1991) found evidence for three distinct developmental stages, the first beginning at around 6 years, a second commencing at around the age of 10, and a third during adolescence. Consistent with Levin et al. (1991), they found that some components of EF (e.g., the ability to resist distraction, impulse control or inhibition) matured earlier than others (e.g., generativity, planning skills). An investigation of EF development in late childhood and adolescence by Anderson, Anderson, Northam, Jacobs & Catroppa (2001) found that while the developmental trajectory for EFs in this period was generally flatter than during early and middle childhood, differential developmental trends were observed within the different EF domains, with attentional control and planning showing the greatest improvements during adolescence, while cognitive flexibility was already matured by the age of 12. At the other end of the age spectrum, studies by Zelazo and colleagues (Zelazo & Reznick, 1991; Zelazo, Frye, & Rapus, 1996b) have demonstrated developments in rule use (the third stage in their problem-solving framework of EF) between the ages of 2 and 5. Several studies have also found significant improvements in inhibitory control between the ages of 3 and 6 years (Diamond & Taylor, 1996; Gerstadt et al., 1994; Kochanska, Murray, & Coy, 1997).
Thus, descriptive studies mapping the development of EF have shown fairly consistently that i) the first emergence of EF occurs early in life, probably around the end of the first year; ii) EF development appears to follow a multistage process, with important changes occurring between the ages of 2-5 and 6-10, with adult performance levels reached by the age of 12 in several domains, and performance in other domains continuing to develop through adolescence; and iii) the various components of EF follow different developmental trajectories, with cognitive flexibility and inhibition tending to develop first and planning and generativity maturing later (see Anderson, 2002, for a slightly different mapping of EF development). Attempts to characterise the development of EF within a theoretical framework have tended to emphasise either one or two central constructs which account for EF development as a whole. One view is that age-related changes in EF may be explained by the construct of inhibition, such that children become increasingly able to resist interference and keep task-irrelevant information out of working memory (Bjorklund & Harnishfeger, 1990; Dempster, 1992, 1993; Harnishfeger & Bjorklund, 1993). As for adult models of the structure of EF, this account is limited by its unidimensionality, defaulting to the explanation that children find some tasks more difficult than others simply because they require more inhibition, and being unable to explain developments in EF tasks with minimal inhibitory requirements (Zelazo et al., 1997; Zelazo & Müller, 2002). A more popular approach has been to argue that EF changes result from both working memory and inhibition, either as potentially separable components (Diamond, 2002; Diamond & Taylor, 1996; Gerstadt et al., 1994) or interacting processes (Roberts et al., 1994; Roberts & Pennington, 1996).
Results of a recent well-designed study by Beveridge, Jarrold, and Pettit (2002) favoured the view of inhibition and working memory as independent and additive rather than interacting components of EF. While they found that increasing the working memory load of an inhibition task did have a detrimental effect on performance (consistent with Roberts et al., 1994), by using tests with multiple levels of both inhibitory and working memory requirements they found that interactions between the two processes were non-significant in both 6- and 8-year-olds. A recent alternative theory of EF development is Zelazo and Frye’s Cognitive Complexity and Control (CCC) theory (e.g., Frye, Zelazo & Palfai, 1995; Zelazo, 2000; Zelazo & Frye, 1998), which, according to Frye (2000), is most relevant to the EFs of planning and deliberative action. This account focuses on development in the preschool years, proposing that within this period there are increases in the complexity of children’s rule systems (plans formulated in potentially silent self-directed speech, e.g., “If I see a mailbox, then I need to mail this letter”). Complexity is measured by the number of levels of embedding in these rule systems. Embedded rules establish a hierarchy in which rules are arranged beneath setting conditions (which select or restrict the application of a rule), and have the form “if s1, then if a1, then c1” in which s is a setting condition, a is an antecedent, and c is a consequent.
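As an illustrative aside, the embedded rule form “if s1, then if a1, then c1” can be sketched in a few lines of Python. The card-sorting game, the attribute names, and the box labels below are hypothetical and chosen only for illustration; this is a minimal sketch of the rule structure under those assumptions, not an implementation of any task used in the studies cited.

```python
# Hypothetical rule system: every name and value here is an illustrative assumption.
# Outer keys are setting conditions (s); inner keys are antecedents (a);
# inner values are consequents (c).
RULES = {
    "colour": {"red": "left box", "blue": "right box"},
    "shape": {"car": "left box", "flower": "right box"},
}


def sort_card(setting, card):
    """Apply 'if s, then if a, then c' to a single card."""
    rule_pair = RULES[setting]      # the setting condition selects a pair of if-then rules
    antecedent = card[setting]      # the card's value on the dimension picked out by s
    return rule_pair[antecedent]    # the consequent: where the card should be placed


card = {"colour": "red", "shape": "flower"}
print(sort_card("colour", card))
print(sort_card("shape", card))
```

In this sketch the setting condition determines which pair of “if-then” rules applies, so the same card is placed differently under the two setting conditions; a rule system with only one level of embedding would correspond to using a single inner dictionary on its own.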
Zelazo and Frye (Frye et al., 1995; Zelazo & Frye, 1998; Zelazo & Reznick, 1991) have shown that 3-year-olds readily integrate two “if-then” rules (e.g., in the Dimensional Change Card Sorting (DCCS) Task, modelled on the WCST, they are able to comprehend “If the test card is red then place it here; if blue then there”) but have difficulty representing a higher-order “if-if-then” rule that allows them to switch flexibly between incompatible pairs of rules (e.g., “If sorting by colour, then if red then here, if blue then there. If sorting by shape, then if car then here, if flower then there”). The CCC account is reminiscent of Halford’s proposal that cognitive development is characterised by the developing ability to represent increasingly complex relations between items in parallel (Halford, 1993; Halford, Wilson, & Phillips, 1998), but differs from Halford in the central importance placed on embedded or hierarchical rule structures (Frye & Zelazo, 1998). While both the inhibition-working memory accounts and the CCC theory of EF development are able to account reasonably well for data within their specified task and age domains, it is doubtful whether either of them could account for the range of findings from descriptive studies mapping developmental trajectories of EF. Neither theory explicitly accounts for the multi-stage process of development from infancy to adulthood or the differential rate of development for the various components of EF. To some extent, these limitations stem from the widely acknowledged problems with the definition and measurement of EF, which make it difficult to agree on which component processes underlie each EF task and which aspects are central to EF development. Nevertheless, this important cognitive domain remains critical to the explanation of a wide range of clinical disorders - including autism.
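The embedded rule structure described by CCC theory can be illustrated with a short sketch. The code below is only a hypothetical rendering of the DCCS example quoted above, not an implementation taken from Zelazo and Frye: the names `RULES`, `sort_card`, and `sort_without_setting` are assumptions introduced for illustration. The setting condition (s) selects a pair of rules, the card's value on that dimension is the antecedent (a), and the sorting location is the consequent (c).

```python
# Hypothetical sketch of the "if s, then if a, then c" structure in the DCCS task.
# All names here are illustrative assumptions, not part of CCC theory itself.

# Outer keys are setting conditions (s); inner keys are antecedents (a);
# inner values are consequents (c).
RULES = {
    "colour": {"red": "here", "blue": "there"},
    "shape": {"car": "here", "flower": "there"},
}


def sort_card(game, card):
    """Apply the higher-order rule: the game being played selects which
    pair of if-then rules is consulted for this card."""
    antecedent = card[game]
    return RULES[game][antecedent]


def sort_without_setting(card, first_learned="colour"):
    """A rule system with no setting-condition level: it always applies the
    first-learned pair of rules, whatever game is currently being played."""
    return RULES[first_learned][card[first_learned]]


# A test card on which the two pairs of rules conflict.
card = {"colour": "red", "shape": "flower"}

print(sort_card("colour", card))   # here
print(sort_card("shape", card))    # there
print(sort_without_setting(card))  # here (fails to switch to the shape game)
```

On this reading, the difficulty attributed to 3-year-olds corresponds to behaving like `sort_without_setting`: each pair of rules is represented, but not the level that chooses between them.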
2.2.3 EF in autism

Research on executive dysfunction in autism has gathered momentum in a more gradual manner than the ToM literature, beginning with early case reports of autistic individuals documenting what would now be called EF deficits (Scheerer, Rothmann, & Goldstein, 1945; Steel, Gorman, & Flexman, 1984). Using tests of spontaneous colour and tone sequence production, Frith (1972) found what might be interpreted as a generativity impairment in children with autism, with the autistic sample producing more rigid, restricted, and less unique patterns. In 1978, Damasio and Maurer published an influential paper noting behavioural similarities between individuals with autism and patients with frontal lobe damage, such as ritualistic and compulsive behaviours and concreteness in thought and language. They proposed a neurological model of autism involving the frontal lobes and parts of the temporal lobes, basal ganglia and thalamus. Following this, Rumsey and colleagues (Rumsey, 1985; Rumsey & Hamburger, 1988) tested a group of high-functioning male adults with autism on executive and nonexecutive neuropsychological tasks. They found that the autistic men performed significantly more poorly than controls on the WCST as well as measures of cognitive flexibility and problem solving, but showed intact or only mildly impaired performance in other cognitive domains. This finding was followed up in autistic adolescents by Prior and Hoffman (1990), who found impaired performance on the WCST and a maze test; and in individuals with Asperger syndrome by Szatmari, Tuff, Finlayson, and Bartolucci (1990), who found impaired performance on the WCST. The current era of EF research in autism was launched by a study by Ozonoff et al. (1991), which compared the primacy of ToM and EF impairments in a group of high-functioning children with autism. Contrary to their expectation, Ozonoff et al.
found that in their autism group, EF deficits (as measured by the WCST and the Tower of Hanoi, a measure of planning) were more universal than ToM deficits and were better predictors of autism group membership. Such findings have since been consolidated in a number of studies showing impairment of individuals with autism compared with age and IQ-matched controls on tasks tapping a range of EF components, including cognitive flexibility or attentional shifting (Ciesielski & Harris, 1997; Courchesne et al., 1994; Goldstein, Johnson, & Minshew, 2001; Hughes & Russell, 1993; Hughes et al., 1994; Minshew et al., 1992; Ozonoff & Jensen, 1999; Ozonoff & McEvoy, 1994; Ozonoff et al., 1994), planning (Hughes, 1996a; Hughes et al., 1994; Ozonoff & Jensen, 1999; Ozonoff & McEvoy, 1994), and generativity (Boucher, 1988; Craig & Baron-Cohen, 1999; Lewis & Boucher, 1991; Turner, 1999; Williams, Moss, Bradshaw, & Rinehart, 2002). In their review of studies on EF in autism, Pennington and Ozonoff (1996) calculated the average effect size of group differences on EF tasks to be 0.98 (a large effect according to Cohen, 1988), and as high as 2.07 on the Tower of Hanoi. These EF deficits do not appear to be attributable to impairments in more basic attentional processes, such as sustained or selective attention or basic attentional capacity (Bryson, Landry, & Wainwright, 1997; Garcia-Villamisar & Della Sala, 2002; Garretson, Fein, & Waterhouse, 1990; Goldstein et al., 2001; Minshew et al., 1992). These findings resulted in the hypothesis that EF deficits may be primary in autism (e.g., Hughes & Russell, 1993; Ozonoff et al., 1991; Russell, 1997a). Furthermore, prefrontal dysfunction has been posited to be the underlying neuroanatomical basis for EF impairment in autism (Ozonoff, 1995a; Ozonoff et al., 1991).
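For readers less familiar with the effect sizes cited above, the following is a minimal sketch of how a standardised group difference might be computed. It assumes the effect size in question is Cohen's d with a pooled standard deviation, which the review cited above may or may not have used in exactly this form. The function name `cohens_d` and the scores are hypothetical and invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(control, clinical):
    """Standardised mean difference using the pooled sample variance.
    Assumes higher scores indicate better performance, so a positive
    value means the control group outperformed the clinical group."""
    n1, n2 = len(control), len(clinical)
    pooled_var = (
        (n1 - 1) * variance(control) + (n2 - 1) * variance(clinical)
    ) / (n1 + n2 - 2)
    return (mean(control) - mean(clinical)) / sqrt(pooled_var)


# Hypothetical task scores (not data from any study cited here).
control_scores = [12, 14, 15, 16, 18]
clinical_scores = [10, 12, 13, 14, 16]

d = cohens_d(control_scores, clinical_scores)
# Cohen's (1988) conventional benchmarks: 0.2 small, 0.5 medium, 0.8 large.
print(round(d, 2), "large" if d >= 0.8 else "not large")
```

By these conventions, an average effect of 0.98 means the group means on EF tasks differed by roughly one pooled standard deviation.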
In a test of the prefrontal hypothesis, Bennetto, Pennington, and Rogers (1996) examined the pattern of performance displayed by individuals with autism on various memory tasks, and found that it was consistent with that typically displayed by frontal lobe patients. Minshew, Luna, and Sweeney (1999) found that the pattern of performance of autistic individuals on oculomotor tasks suggested a disturbance in prefrontal circuitry. Consistent with the notion that autistic symptomatology may have its basis in frontal dysfunction, impairments in social interaction, spontaneous speech and pragmatic communication, and the production of novel, goal-directed behaviours, are also displayed by patients with frontal lobe damage, including children who have sustained early damage to the prefrontal cortex (e.g., Alexander, 2002; Ames, Cummings, Wirshing, Quinn, & Mahler, 1999; Anderson et al., 2002; Eslinger et al., 1997; Stuss & Benson, 1984, 1986; Tranel, 2002). Neuropathological and neuroimaging studies have also provided some evidence of prefrontal abnormalities in autism, although so far no gross abnormalities have been consistently identified. Casanova, Buxhoeveden, Switala, and Roy (2002) discovered minicolumnar abnormalities in the frontal and temporal lobes of children with autism, and Piven et al. (1990a) found evidence of abnormal neural migration in the frontal lobes of three autistic individuals, although this was only one-fifth of their sample. Piven et al. (1995) also found that the frontal lobes were small in comparison with other cortical areas in subjects with autism. Zilbovicius et al. (1995) demonstrated evidence that maturation of the frontal cortex (as measured by regional cerebral blood flow, rCBF) was delayed in autistic individuals. Decreased rCBF in frontal areas has also been found in a number of other studies (George, Costa, Kouris, Ring, & Ell, 1992; Ohnishi et al., 2000; Sherman, Nass, & Shapiro, 1984).
Functional neuroimaging studies have shown reduced dorsolateral prefrontal activation during spatial working memory tasks in autism (Luna et al., 2002) as well as differences in the pattern of activation in the prefrontal cortex and other brain regions during ToM tasks (Castelli et al., 2002; Happé et al., 1996). However, prefrontal changes are only one of many brain abnormalities which have been documented in autism (see Bauman, 1999; Deb & Thompson, 1998; Koenig, Tsatsanis, & Volkmar, 2001), with other areas of significance including the cerebellum (see Courchesne, 1997), corpus callosum (Piven, Bailey, Ranson, & Arndt, 1997a), and limbic or medial temporal structures (Bachevalier, 1994; Bauman & Kemper, 1994). Most neurobiological theories of autism in fact do not give prominence to the prefrontal cortex (e.g., Akshoomoff, Pierce, & Courchesne, 2002; Waterhouse, Fein, & Modahl, 1996). Concluding that prefrontal abnormalities are the most significant in causing the symptoms of autism would therefore be premature, particularly given the inconsistency of results across neurobiological studies of autism in general (Ozonoff, 2001). In addition, a major difficulty with the prefrontal hypothesis is that children who sustain early lesions to the prefrontal cortex do not actually develop autism, but more often display a syndrome resembling psychopathy or conduct disorder (Anderson et al., 1999; Eslinger, Grattan, Damasio, & Damasio, 1992; Eslinger et al., 1997). While the behavioural impairments displayed by children with frontal lesions may be broadly or categorically similar to those displayed by children with autism, there are obvious qualitative differences and it would be difficult to mistake one for the other in a clinical setting. However, it may be that prefrontal dysfunction is a necessary but not sufficient criterion for the development of autism (Ozonoff, 1995a), or that the timing of the insult is a crucial variable in behavioural outcome.
Nevertheless, it is not clear that prefrontal dysfunction is the neurobiological basis for EF impairment in autism¹¹.

¹¹ As Pennington et al. (1997) point out, EF impairment can also result from diffuse structural or metabolic differences in brain development or diffuse brain lesions in adult patients (as discussed in Section 2.2.1), indicating either that disrupting the connectivity of the whole brain may mimic the effects of a focal frontal lesion, or that the neuroanatomical basis of EF tasks is not specific to the prefrontal cortex.

The central concern of this section, however, is the hypothesis that EF impairment, regardless of its neuroanatomical underpinnings, may be the primary cognitive impairment in autism (Hughes & Russell, 1993; Ozonoff et al., 1991; Ozonoff, 1995a; Pennington et al., 1997; Russell, 1997a). The associated claim of this hypothesis is that a primary impairment in EF may also explain the ToM deficit observed in individuals with autism (a claim which is examined in Section 2.3.2). As for ToM, the EF hypothesis of autism has undergone a number of tests of whether or not it meets the criteria for primacy, as reviewed below. Unlike ToM, the EF hypothesis is not required to prove its domain specificity, as EF is not claimed to be modular.

i) Universality. The evaluation of whether or not EF deficits are universal in autism has not received nearly as much attention as the equivalent question in the ToM literature. This is probably because of the lack of a “gold standard” of EF performance on which failure can be unequivocally evaluated – unlike false belief tasks, performance on EF tasks is usually not a matter of pass or fail. Many studies of EF in autism interpret the presence of a group difference as evidence of an EF deficit in autism, but do not look further at what proportion of the autism group showed such a deficit. Those researchers who have examined the universality of deficits have tended to choose an arbitrary criterion for what comprises a “fail” or what defines a “deficit”. Ozonoff et al. (1991) used the proportion of participants scoring below the mean of the control group as their index of the universality of deficits, and found that 96% of their autism group showed an EF deficit using this criterion. This finding was (and still is) cited as evidence that EF deficits were almost universal among individuals with autism. However, defining any score below the mean as a deficit is a very lenient criterion – in most rating systems for impairment severity, a score needs to be at least 1 standard deviation (SD) below the mean to be considered as even mildly impaired (Heaton, Grant, & Matthews, 1991; Lezak, 1995). Other studies have been more equivocal than Ozonoff et al.’s (1991) study in their findings on the universality of EF impairments in autism, usually finding lower proportions of individuals with autism demonstrating difficulties. Liss et al. (2001) found that 57% of their group of high-functioning autistic adolescents scored within 1 SD of the mean of the control group on the WCST, and 29% performed better than the control mean. Teunisse, Cools, van Spaendonck, Aerts, and Berger (2001) found that only 46% of their high-functioning adolescents with autism showed poor cognitive shifting, defined using the lenient criterion of any positive z score on the sum of two variables measuring the number of trials required for successful performance. Hughes et al. (1994) reported that 67% of their autism group failed both the Tower of London (ToL) and the Intradimensional, Extradimensional (IDED) set-shifting task (using an arbitrary criterion of failure), with 92% failing the ToL and 75% the IDED task.
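The sensitivity of universality estimates to the choice of criterion can be made concrete with a small sketch. The scores below are hypothetical and are not drawn from any of the studies reviewed; the function name `proportion_impaired` and its `sd_cutoff` parameter are assumptions introduced for illustration, and higher scores are assumed to indicate better performance.

```python
from statistics import mean, stdev


def proportion_impaired(clinical_scores, control_scores, sd_cutoff=0.0):
    """Proportion of the clinical group scoring below a threshold set at
    `sd_cutoff` control-group SDs below the control-group mean.
    sd_cutoff=0.0 is the lenient 'below the control mean' criterion;
    sd_cutoff=1.0 is the stricter '1 SD below the mean' criterion."""
    threshold = mean(control_scores) - sd_cutoff * stdev(control_scores)
    impaired = [score for score in clinical_scores if score < threshold]
    return len(impaired) / len(clinical_scores)


# Hypothetical scores, invented for illustration only.
control = [8, 9, 10, 11, 12]
clinical = [6, 7, 8, 9, 9, 9, 9, 9, 11, 12]

lenient = proportion_impaired(clinical, control, sd_cutoff=0.0)
strict = proportion_impaired(clinical, control, sd_cutoff=1.0)
print(lenient, strict)
```

With these invented scores, the same clinical sample yields a much higher rate of "deficit" under the lenient criterion than under the 1 SD criterion, which is why proportions reported under different criteria are hard to compare directly.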
Ozonoff and McEvoy (1994) report that 41% of their autistic sample performed within the normal range on the WCST (but none on the Tower of Hanoi); and Ozonoff and Jensen (1999) found that up to 36% of their sample scored above the control mean on at least one EF task, with only half the sample performing below the control mean on all three tasks used. So, while universality has been assessed using a variety of different methods making it difficult to compare results across studies, it is apparent that EF deficits are not universal among individuals with autism. Those who have examined the characteristics of individuals who perform in the normal range on EF tasks have usually found, as for ToM, that they are older (Ozonoff & Jensen, 1999) and/or have higher verbal IQ (Liss et al., 2001; Ozonoff & McEvoy, 1994). However, unlike ToM, it would be difficult to argue that these older and higher-functioning individuals have developed some kind of compensatory strategy to aid their EF performance, as the very nature of EF tasks is that they are novel and would not have been encountered before, making strategies difficult to develop in advance (although it is conceivable that general strategies may have developed or been learned to offset recognised limitations in certain cognitive abilities, for example by using visual imagery to compensate for a verbal working memory deficit). An alternative argument might be that these EF “passers” could show difficulties on tasks tapping other components of EF which may not have been measured in the studies reviewed (e.g., generativity). Hughes (2001) has suggested that low- and high-functioning individuals with autism may show distinct types of EF impairment, as indicated by the different types of repetitive behaviour displayed by these two groups (Turner, 1997), and therefore that group heterogeneity may prevent universal EF characteristics from being discovered.
In addition, it may be that different aspects of EF are impaired at different points in the development of children with autism in comparison with age-matched controls, as EF components show different developmental trajectories. These possibilities await empirical investigation.

ii) Uniqueness. A major challenge to the EF hypothesis of autism (which has been acknowledged and discussed by all its proponents) is that EF impairments are displayed in a number of other disorders, including ADHD (e.g., Grodzinsky & Diamond, 1992; Oosterlaan, Logan, & Sergeant, 1998; Pennington, Groissier, & Welsh, 1993; Shallice et al., 2002), schizophrenia (e.g., Elliott, McKenna, Robbins, & Sahakian, 1995; Pantelis et al., 1997; see Hoff & Kremen, 2003), Tourette’s syndrome (Baron-Cohen & Robertson, 1995; Channon, Flynn, & Robertson, 1992), obsessive-compulsive disorder (Christensen, Kim, Dyksen, & Hoover, 1992; Cox, Fedio, & Rapoport, 1989; Head, Bolton, & Hymas, 1989; Veale, Sahakian, Owen, & Marks, 1996), and early-treated phenylketonuria (Diamond, Prevor, Callender, & Druin, 1997; Smith, Klim, & Hanley, 2000; Welsh, Pennington, Ozonoff, Rouse, & McCabe, 1990), not to mention neurological disorders affecting frontal lobe functions such as frontotemporal dementia (e.g., Razani, Boone, Miller, Lee, & Sherman, 2001; see Grossman, 2002), Parkinson’s disease (e.g., Owen et al., 1993), and traumatic brain injury (e.g., Anderson et al., 2002; Levine et al., 1998). The question of how these symptomatically different disorders could all share the same cognitive basis has been dubbed the “discriminant validity problem” (Pennington & Ozonoff, 1996).
Proponents of the EF hypothesis of autism have attempted to circumvent this difficulty in meeting the uniqueness criterion by proposing that different disorders may be characterised by different severity and age of onset of EF impairment, and different profiles of impairment on the various components of EF (Ozonoff, 1997b; Ozonoff et al., 1994; Pennington & Ozonoff, 1996). In particular, Ozonoff (1997b; Ozonoff & Jensen, 1999) has suggested that autism is characterised by deficits in cognitive flexibility and planning but spared inhibitory capacity. Studies by Ozonoff et al. (1994) and Ozonoff and Jensen (1999) demonstrated evidence of differentiation among the EF profiles of children with autism, ADHD and Tourette’s syndrome, with autistic children showing poor performance on tests of cognitive flexibility and planning but not inhibition, the ADHD sample characterised by a specific impairment in inhibitory control, and individuals with Tourette’s showing little evidence of any EF impairment. The notion of intact inhibition in autism has received support in other studies using a negative priming paradigm (Brian, Tipper, Weaver, & Bryson, 2003; Ozonoff & Strayer, 1997). A few studies have suggested that working memory may also be spared in autism (Ozonoff & Strayer, 2001; Russell, Jarrold, & Henry, 1996). However, while these studies appear to paint a relatively clean picture of spared and impaired EFs in autism and other disorders, other findings have not been so consistent. Some have argued for and/or found evidence of impairments in inhibition (Hughes, 1996b; Rinehart, Bradshaw, Tonge, Brereton, & Bellgrove, 2002; Williams et al., 2002) and in working memory (Bennetto et al., 1996; Pennington et al., 1997) in individuals with autism, although it has been argued that these impairments emerge only if the task involves both inhibitory and working memory requirements (Russell, 1997b).
Others have found set-shifting to be intact in high-functioning individuals with autism and/or Asperger syndrome (Ozonoff et al., 2000; Rinehart, Bradshaw, Moss, Brereton, & Tonge, 2001; Turner, 1997). The role of generativity impairments (e.g., Turner, 1999) has also been under-emphasised. In her review of EF in autism, Hughes (2001) was able to say only that EF deficits in autism were “high-level and non-spatial” (p. 258), a characterisation which clearly lacks the desirable specificity. Research on EF in other disorders is characterised by similar inconsistencies. For example, in their review of EF in ADHD, Sergeant, Geurts, and Oosterlaan (2002) concluded that the pattern of EF deficits in ADHD was not consistent across studies and did not appear to be specific to ADHD. A number of factors may have contributed to the lack of consistency in identifying unique EF profiles across studies of EF in autism and other disorders. Firstly, the usual concerns about the measurement precision of EF tasks will inevitably affect the reliability and interpretation of results (particularly given that most studies have used multifactorial tasks such as the WCST). Secondly, different research groups have focused on different aspects of EF depending upon their theoretical framework (Hughes, 2001), with very few studies including tasks measuring the full range of EF components. Thirdly, the task modality and/or mode of response (i.e., verbal or nonverbal) has varied across studies, which may be important as spatial ability is superior to verbal ability in autism (Happé & Frith, 1996), and this superior spatial ability may boost performance on non-verbal EF tasks as compared with verbal tasks (Hughes, 2001).
Fourthly, it has been suggested that some of the inconsistencies across studies may be related to whether or not the tasks were computerised, and therefore whether performance may have been affected by the need for social interaction; and also, whether or not feedback about task performance is given verbally by the examiner or is provided automatically by the task (Ozonoff, 1995b, 2001). Finally, as discussed earlier, variability in the age and level of functioning (i.e., intellectual ability) of the sample may also influence the pattern of results¹². Clearly, studies using a wide range of well-defined EF tasks, using both verbal and non-verbal response modes, and which may be broken down into component processes, are needed to determine whether autism is associated with a unique EF profile.

¹² Besides the possibility that low- and high-functioning children with autism may actually exhibit different profiles of EF impairment, Russell (1997b) also points out that different findings in these two groups may be due to the different comparison groups used. If the control group consists of developmentally delayed or mentally retarded children who also display EF impairments (in comparison to typically developing children), then low-functioning children with autism may not be impaired compared with their IQ-matched control group. However, high-functioning children with autism may display an impairment in comparison with their more typically developing controls.

iii) Causal precedence. This criterion of primacy has also posed difficulties for the EF hypothesis. The first study to test EF in young children with autism (McEvoy, Rogers, & Pennington, 1993) found that autistic preschoolers (mean age = 5.1 years) made significantly more perseverative errors on a spatial reversal task tapping inhibition and set-shifting (but not A-not-B, Delayed Response or Delayed Alternation tasks, which showed ceiling and floor effects). Similarly, Dawson, Meltzoff, Osterling, and Rinaldi (1998) found that young children with autism (mean age = 5.4 years) performed more poorly than controls on a version of the A-not-B task. However, no other studies have managed to replicate these findings. Using a younger sample than McEvoy et al. (1993), Wehner and Rogers (1994, cited in Pennington et al., 1997) found no difference between their autism and control groups on the same spatial reversal task. To examine whether this may be because perseverative behaviour increases as children with autism grow older (contrary to typical development), Griffith, Pennington, Wehner, and Rogers (1999) conducted a longitudinal study investigating the development of EF over the course of a year. Besides finding no difference in performance between young children with autism (mean age = 4.3) and developmentally delayed controls on a range of age-appropriate EF tasks (measuring inhibition, set-shifting, spatial and object working memory, and action monitoring) upon initial testing, they found that performance on the spatial reversal task at the second testing period did not change significantly over time, suggesting that young children with autism do not exhibit a deficit on this task at either 4 or 5 years of age. More recent studies by Dawson et al. (2002a), Stahl and Pry (2002), and Rutherford and Rogers (2003) have also failed to find EF deficits in young children with autism using the spatial reversal task and other measures of inhibition, working memory, and set-shifting, although Rutherford and Rogers (2003) reported marginal differences on their measure of generativity.
The apparently intact EF abilities of young children with autism suggest that executive dysfunction cannot account for the earliest symptoms of autism. However, two possible explanations for these null findings have been proposed (Griffith et al., 1999; Pennington et al., 1997). Firstly, most of the above studies have used children with developmental delays as controls, and have found that these children perform more poorly on the EF tasks than typically developing children. Hence, it may be that young children with autism are impaired on the EF tasks, but this impairment is not specific to autism. A second explanation, which is more favourable to the EF hypothesis of autism, is that EF impairments have been missed because the studies have not incorporated the full range of EF components or task modalities in their test batteries. Because of the age of the children, tasks have been primarily non-verbal (which may favour children with autism); and in addition, measures of planning and generativity have not been included, probably because of the lack of age-appropriate measures of these abilities. Rutherford and Rogers’ (2003) finding of marginal differences in generativity in young children with autism indicates that this explanation may hold promise. In addition, Russell and colleagues (Biro & Russell, 2001; Russell, 1997b; Russell, Jarrold, & Hood, 1999) have proposed that children with autism are only challenged by EF tasks if they contain arbitrary rules (which require the online rehearsal of novel strategies), which most EF tests used with very young children (such as the A-not-B task) do not contain. As usual, however, further studies are needed to investigate all of these possibilities.

iv) Explanatory value. Unlike the ToM hypothesis, the ability of the EF hypothesis to account for the range of autistic symptomatology is perhaps its greatest strength.
While executive dysfunction was initially proposed largely to explain the repetitive behaviours and restricted interests characteristic of autism, it has also fairly consistently demonstrated links with social and communicative impairments. Two studies by Berger and colleagues (Berger, Aerts, van Spaendonck, Cools & Teunisse, 2003; Berger et al., 1993) found that set-shifting performance was a significant predictor of social understanding and social competence in high-functioning adolescents and young adults with autism, although the same group was not able to replicate that result in a third study (Teunisse et al., 2001). Gilotty, Kenworthy, Sirian, Black, and Wagner (2002) found significant correlations in their autistic sample between parental reports of everyday executive abilities (using the Behavior Rating Inventory of Executive Function) and social and communication skills as measured by the Vineland Adaptive Behavior Scales, such that impaired EF was associated with poorer adaptive skills. McEvoy et al. (1993) also found a significant correlation between EF and early social and communication skills in young children with autism. Liss et al. (2001) found that relationships between EF and adaptive functioning were no longer significant when VIQ was partialled out; however, their study was also inconsistent with other studies in finding that autism versus control group differences on EF tasks disappeared when VIQ was accounted for. A number of studies have examined the relationship between EF deficits and joint attention, as one of the earliest signs of social impairment in young children with autism¹³. It has been proposed that difficulties in shifting attention, rather than mentalising impairment, may underlie problems with joint attention in young autistic children (Burack, 1994; Courchesne et al., 1994).
This proposal received support in McEvoy et al.’s (1993) study, which found that the frequency of joint attention behaviours was significantly correlated with cognitive flexibility. However, those studies which failed to find EF impairments in young children with autism have generally also not found any correlation between EF and joint attention. Stahl and Pry (2002) and Rutherford and Rogers (2003) both found no relationship between EF measures and joint attention in their young autistic sample. Dawson and colleagues (Dawson et al., 1998, 2002a) found that performance on tasks purportedly tapping ventromedial prefrontal and medial temporal function (e.g., the delayed nonmatching to sample task) correlated with joint attention, but not performance on tasks tapping dorsolateral prefrontal function (i.e., more classic EF tasks such as the A-not-B and spatial reversal tasks). These findings of no relationship between EF and early signs of social impairment are likely to relate to the null findings of group differences in EF at that young age. If young children with autism do not show impairment on EF tasks, one can hardly argue that EF deficits underlie their abnormal joint attention behaviours. Indeed, Swettenham et al. (1998) found that infants with autism (mean age = 20 months) showed more difficulty shifting attention between people than between objects, suggesting that their impairment may lie in social orientation rather than attentional shifting. Nevertheless, the extent to which EF and joint attention may be related in autism remains a matter of debate (Hughes, 2001).

¹³ While joint attention is clearly a social behaviour, it is also often interpreted as a precursor to or marker of ToM (Baron-Cohen, 1995). The relationship between joint attention and EF is therefore also relevant to the upcoming section on the relationship between ToM and EF in autism. Similarly, pretend play is thought to reflect metarepresentative capacity and so its relationship with EF is also of relevance to that section.

Several studies have also examined the link between EF and the absence of spontaneous pretend play in autism. As for joint attention, these investigations have been spurred by suggestions that a lack of pretend play may have its basis in EF deficits, specifically an impairment in generativity (Jarrold, Boucher, & Smith, 1994a, 1996; Lewis & Boucher, 1995) or the inability to disengage attention from salient external stimuli to access internal, hypothetical play schemas (Harris, 1993; Harris & Leevers, 2000) rather than an inability to mentalise. Observations that children with autism have intact comprehension of pretend acts (Jarrold, Smith, Boucher, & Harris, 1994b) and that they are able to produce structured, elicited, or instructed pretence (Lewis & Boucher, 1988) are consistent with this view. In a series of studies, Jarrold et al. (1996) showed that children with autism have the capacity to engage in the mechanics of pretence, but that they produced significantly less pretence than controls in spontaneous and weakly structured conditions, suggesting that their difficulty lay in the production or generation of pretence. Similarly, Lewis and Boucher (1995) found that the generation of original actions in the play of autistic children was more consistent with a generativity hypothesis than a metarepresentational deficit. In a more recent study, Rutherford and Rogers (2003) found that the performance of children with autism on a generativity task was a significant predictor of pretend play. Surprisingly, although the usual behavioural consequences of EF or prefrontal impairment correspond closely with the repetitive behaviours displayed by individuals with autism, only one published study has directly examined the association between EF and repetitive behaviours in autism.
Turner (1997) found significant correlations in children with autism between measures of inhibition, set-shifting, and generativity and the incidence and severity of various aspects of repetitive behaviour (e.g., repetitive movements, circumscribed interests) as measured by a parental interview. Furthermore, specific EF components appeared to underlie different types of repetitive behaviour; for example, repetitive movements were associated with performance on a test of inhibition, whereas sameness behaviour was correlated with measures of generativity. The EF hypothesis has therefore demonstrated good explanatory value in terms of its ability to account for both social and communicative impairments and repetitive behaviours in autism, with the exception of early signs of social impairment such as joint attention. Like the ToM hypothesis, it is less able to account for non-triad features of autism such as savant abilities and heightened visuospatial and visuoperceptual skills, a fact which has been somewhat overlooked in the EF literature. The causal direction of correlations between EF and behavioural symptoms is another issue to consider, as it may be that executive dysfunction is a consequence of fewer social interactions or engagement in high rates of repetitive, restricted activities, rather than vice versa. However, evidence does not appear to support this possibility. In a number of studies, increasing the structure of the environment has been found to result in less stereotypic and more social behaviours (Clark & Rutter, 1981; Dadds et al., 1988), indicating that reducing EF demands facilitates social interaction and reduces repetitive behaviour, which would not be expected if EF impairment was caused by the behavioural symptoms. Also, the results of longitudinal studies have shown that EF performance predicts later social understanding (Berger et al., 1993, 2003). So how does the EF hypothesis fare?
Like ToM, while it has defended some of its weaknesses fairly successfully, it does not convincingly meet all of the criteria for a single primary cognitive deficit of autism. Although it holds good explanatory value for most of the symptoms of autism (with the exception of some early symptomatology and non-triad features), the evidence collected so far suggests it lacks causal precedence and that EF deficits are not universal among individuals with autism. In addition, the variability of findings among studies of EF in autism is problematic. Methodological issues of the measurement precision of EF tasks, the different developmental trajectories of the various EF components, the difficulty in designing age-appropriate EF tasks for young children which tap the range of EF abilities, the variability among studies in the age and level of functioning of the sample, and variations in the modality, arbitrariness of the rules, and mode of presentation, response, and feedback of the tasks used, have all clouded the definition of the universality, specific profile, and developmental course of EF deficits in autism. While it is fairly clear that autism is characterised by significant EF deficits, these methodological issues need to be addressed in order to determine how primary those deficits may be to autism.

2.3 The ToM-EF relationship

2.3.1 Models of the ToM-EF relationship

On the surface, there is no particular reason to propose a link between the constructs of ToM and EF: why should the ability to attribute mental states to oneself and others relate to cognitive capacities which aid the control of action?
In fact, an accumulating number of recent studies have consistently demonstrated an empirical relationship in typical development; for example, there are strong correlations between various types of ToM and EF tasks which remain significant when age and IQ variables are partialled out (Carlson & Moses, 2001; Carlson, Moses, & Breton, 2002; Davis & Pratt, 1995; Frye et al., 1995; Gordon & Olson, 1998; Hala, Hug, & Henderson, 2003; Hughes, 1998a, 1998b; Lang & Perner, 2002; Russell et al., 1991; see also Perner & Lang, 1999, who report a large average effect size of 1.08 across the studies conducted up until then). Marked improvements in ToM and in EF (particularly inhibitory control) both occur around the same age, in the preschool period between 3 and 5 years of age (e.g., Gerstadt et al., 1994; Kochanska et al., 1997; Wellman et al., 2001; Zelazo et al., 1996b). The co-occurrence of ToM and EF deficits not only in autism but also in schizophrenia (e.g., Corcoran et al., 1995; Elliott et al., 1995), frontal lobe pathologies (Bach et al., 1998; Channon & Crawford, 2000; Gregory et al., 2002; Rowe, Bullock, Polkey, & Morris, 2001; Saltzman et al., 2000), and possibly Fragile X syndrome (Garner, Callias, & Turk, 1999) is also suggestive of a meaningful relationship. A range of explanations for this observed relationship have been proposed by various authors, including links based on i) the EF requirements of ToM tasks (“expression” accounts), ii) a third common conceptual requirement, iii) functional dependence during development (“emergence” accounts), and iv) shared neuroanatomical bases. These classes of explanation each have different implications for the nature of the relationship between ToM and EF impairments in autism (these implications are reviewed in Section 2.3.2 and are important in the interpretation of the results of analyses of the ToM-EF relationship in Study One). As such, a review of each follows. 
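The notion of a correlation that remains after a variable such as age is "partialled out" can be illustrated with a minimal sketch. It uses the standard first-order partial correlation formula; the scores below are hypothetical values invented for illustration and are not data from any of the studies cited above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y with the linear effect of z removed."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    # Hypothetical scores for seven children (illustration only).
    age_months = [36, 40, 44, 48, 52, 56, 60]
    tom_score = [1, 2, 2, 4, 3, 5, 6]
    ef_score = [2, 2, 4, 5, 4, 7, 7]
    print("raw r:", round(pearson(tom_score, ef_score), 3))
    print("r with age partialled out:",
          round(partial_corr(tom_score, ef_score, age_months), 3))
```

If a raw ToM-EF correlation were produced entirely by both abilities improving with age, the partial correlation would fall to around zero; the studies cited above report that it does not.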
2.3.1.1 Expression accounts [14]

This type of account holds that the relationships observed between performances on ToM and EF tasks are (at least partly) due to the executive requirements of ToM tasks, and therefore that failure on ToM tasks may be caused by impaired or underdeveloped EF rather than (or in addition to) poor mentalising ability. In other words, EF might affect the expression of ToM capacity. Proponents of this account have emphasised either inhibitory control (Carlson & Moses, 2001; Carlson, Moses, & Hix, 1998; Hala & Russell, 2001; Leslie & Polizzi, 1998; Roth & Leslie, 1998; Russell et al., 1991), working memory (Davis & Pratt, 1995; Gordon & Olson, 1998; Keenan, 1998), or a combination of both inhibition and working memory (Carlson et al., 2002; Hala et al., 2003) as the crucial EF factors affecting ToM performance. These ideas have been tested both by manipulating the EF requirements of various ToM tasks and by examining correlations or predictive relationships between the relevant EF components and ToM variables.

The idea that ToM tasks require inhibitory control has been advanced by several authors, although there has been some disagreement regarding exactly what it is that needs to be inhibited. Across a series of studies, Russell and colleagues (Hala & Russell, 2001; Russell et al., 1991; Russell, Jarrold, & Potel, 1994) have argued that knowledge of current physical reality is more salient than knowledge of mental reality, and that tests of both deception and false belief require children to suppress or inhibit responding on the basis of their physical knowledge in favour of their less salient mental knowledge [15]. For example, it is proposed that in the standard false belief (unexpected transfer) task, the child is required to disengage from (and inhibit the prepotent response to report) his/her knowledge about where the object is currently located, and instead refer to an empty location.
This hypothesis has been tested mainly by using a measure of strategic deception called the “windows task”. Russell et al. (1991) found that despite prior training (using two opaque boxes) on the rules of the task whereby they had to point to an empty location to prevent their opponent from winning a chocolate, 3-year-olds typically pointed to the true location of the chocolate on test trials, where they were able to see the chocolate but the opponent could not. Furthermore, the majority of the 3-year-olds persisted in revealing its true location across a series of 20 trials, suggesting that a failure of inhibition or inability to disengage from a salient stimulus was underlying their difficulty, rather than a conceptual deficit with deception (and thus ToM). This interpretation of the results was supported in two further studies. Russell et al. (1994) found that removing the opponent (and therefore the requirement to deceive) did not affect 3-year-olds’ performance on the windows task.

[14] Perner and Lang (1999, 2000) label this class of explanation slightly differently, as “Executive component in ToM tests”. As the notion of common task requirements refers to the idea that common underlying processes are required in both tasks, the term “expression account” is preferred here, as this more accurately encompasses the notion that EFs may influence the expression of ToM capacity in everyday life (due to the executive requirements of perceiving and inferring others’ mental states) as well as on structured tasks.

[15] It should be noted that while Russell (1996, 1997b) argues that most ToM tasks confound EF and mentalising demands, his view of the ToM-EF relationship is not actually that it may be explained entirely because of common performance requirements. His main theory is reviewed in this section under the heading of “emergence accounts”.
Conversely, Hala and Russell (2001) found that the performance of 3-year-olds improved when the inhibitory demands of the task were reduced, such as by removing the requirement to directly point to the chocolate and instead using a mechanical pointer to indicate the appropriate response (as pointing correctly to true locations is likely to be a well-practiced, reinforced and therefore prepotent response). Using a different approach, Moore et al. (1995) found that when their own desires were particularly strong or salient, 3-year-olds performed as poorly on a conflicting desire task as on a false belief task. This suggests that even though desire is purportedly easier for young children to understand because of its non-representational nature, 3-year-olds have difficulty judging others’ desires when EF demands are high. Together, these results suggest that the failure of young children on ToM tasks may be at least partially attributable to inadequate inhibitory control rather than (or in addition to) a poorly developed ToM.

Carlson, Moses and colleagues (Carlson & Moses, 2001; Carlson et al., 1998) have outlined a similar account of the role of inhibition in ToM performance. Like Russell, while they allow for genuine development in the understanding of mental concepts, they argue that the inhibitory requirements of ToM tasks affect the expression of ToM ability in 3-year-olds [16]. Using a deception paradigm similar to that of Hala and Russell (2001), Carlson et al. (1998) found that 3-year-olds showed improved performance under conditions requiring low inhibitory control (i.e., when pictorial cues or arrows were used to mislead the opponent rather than pointing), and that they were equally successful in using arrows to point whether the opponent was present or not.
In a correlative study, Carlson and Moses (2001) found that the link with inhibition extended beyond deception: performance on a battery of ToM tasks was significantly correlated with a number of indices of inhibitory control, and these correlations remained robust after the effects of age, gender, number of siblings, verbal ability, and a number of other cognitive abilities were removed. Other studies have also indicated that, as with deception, young children’s performance on the standard false belief task improves when inhibitory demands are reduced by using a response mode that is not influenced by a prepotent response history or by reducing the salience of the desired object’s current location. Examples include tracing out the path a naive character would take in searching for their desired object (Freeman, Lewis, & Doherty, 1991), giving an explanation for a protagonist’s wrong search in an empty location (Bartsch & Wellman, 1989), and indicating which of two twin boys, one searching in the actual location and one in the empty location, had been absent during the transfer of the object (Robinson & Mitchell, 1995).

[16] Carlson and Moses (2001) also state that their results are equally compatible with an expression account (i.e., that inhibitory dysfunction impedes the expression of ToM ability) as with Russell’s (1996, 1997b) emergence account, to be reviewed later as mentioned in the previous footnote.

In line with their ToMM-SP model (described in Section 2.1.2), a more unequivocal expression account of the relationship between inhibitory processes and ToM has been proposed by Leslie and colleagues (Leslie & Polizzi, 1998; Roth & Leslie, 1998).
Besides differing from the above two accounts on the level of ToM ability attributed to 3-year-olds (Leslie argues that the ToM module is fully active by this age but that its abilities are usually masked by the processing requirements of ToM tasks, whereas Russell and Carlson and colleagues favour the view that some conceptual ToM development does take place between the ages of 3 and 4), Leslie and his colleagues also have a different view of what creates the salience-related difficulty for 3-year-olds on the standard false belief task. They argue that because beliefs are typically true, there is a default (or prepotent) assumption that beliefs are true, and therefore the attribution of a non-default (false) belief requires inhibition of this default assumption (Leslie, 1994a; Leslie & Polizzi, 1998). Thus, they identify the competition as being between two belief contents (one of which represents physical reality) rather than between physical and mental realities (notably, Leslie and colleagues do not offer an analysis of the inhibitory requirements of other ToM tasks such as tests of deception). In support of their hypothesis, Leslie and colleagues conducted a series of cleverly designed experiments which supported previous findings that reducing the inhibitory demands of false belief tasks improves the performance of 3-year-olds (Roth & Leslie, 1998), but also showed that increasing the inhibitory requirements of the false belief task had a significant detrimental effect on the performance of 4-year-olds (Leslie & Polizzi, 1998).

Inhibitory control is not the only executive process that has been implicated in ToM task performance. Olson (1989) argued that developments in children’s capacity for holding complex representations in mind may support or underlie their understanding of false belief.
Similarly, Halford (1993) proposed that working memory capacity may limit young children’s success in situations which require the simultaneous integration of two representations of a situation (i.e., reality and belief). In a test of this hypothesis, Davis and Pratt (1995) found that backward digit span performance significantly predicted scores on the unexpected contents and appearance-reality tasks over and above age and verbal ability (accounting for around 6% of the variance), but forward digit span did not. They interpreted this as suggesting that development in the central executive, but not articulatory loop, component of Baddeley and Hitch’s (1974) working memory model was a small but significant determinant of false belief task performance. Using an additional false belief task (the unexpected transfer test) and a more age-appropriate working memory measure involving dual-task performance, Keenan, Olson, and Marini (1998) also found that after controlling for age, working memory capacity was a significant predictor of false belief performance (accounting for 7.4% of the variance). The influence of working memory capacity on the expression of ToM ability is supported in other studies showing that reducing the usual memory demands of ToM tasks has a facilitative effect on performance (e.g., Chandler & Hala, 1994; Freeman & Lacohée, 1995; Mitchell & Lacohée, 1991; although see Hala et al., 2003; Robinson, Riggs, & Samuels, 1996). While these studies permit the fairly acceptable conclusion that working memory plays some limiting role in the expression of mentalistic concepts that may already exist, Gordon and Olson (1998) considered the more contentious possibility that increasing computational resources may actually allow the formation of those concepts.
They argued that the key capacity required for false belief understanding is the ability to hold in mind and then update a previously created representation when a new representation is created by the current perceptual situation. They used two working memory tasks, both of which required children to perform concurrent mental activities, but only one of which required them to hold the product of such activity in mind such that it could be updated on the basis of some new perceptual information. While both their tasks showed strong correlations with false belief performance after controlling for age (accounting for up to 40% of the variance), the more complex working memory task contributed a significant amount of variance to false belief performance over and above the other more simple working memory task. Gordon and Olson concluded that while primitive concepts such as self, true, and real may be available earlier, “their coordination into a higher-order structure depends upon increased computational resources” and thus that “conceptual content and conceptual complexity combine not only in the performance on theory of mind tasks but also for the formation of the understanding itself” (1998, p. 81) [17].

Two studies found that the relationship between ToM and working memory no longer held after age and verbal ability were controlled for (Hughes, 1998a; Jenkins & Astington, 1996). Hala et al. (2003) proposed that this discrepancy may be attributable to the lack of significant inhibitory demands in the working memory tasks used in these two studies (i.e., they were simple tests of maintenance of information in working memory over time and did not require dual-task performance or the simultaneous activation of two concurrent activities).
This interpretation is supported by Davis and Pratt’s (1995) finding that forward digit span was not a significant predictor of false belief performance, in contrast to backward digit span (which arguably involves not only rehearsing the sequence of numbers but also inhibiting the tendency to report them in the order heard).

[17] Again, this view is therefore probably best conceived of as a combination of expression and emergence accounts of the ToM-EF relationship.

The idea that EF tasks involving both working memory and inhibitory components may show the strongest relationship with ToM had been raised earlier by Carlson and colleagues (Carlson & Moses, 2001; Carlson et al., 2002). They argued that false belief tasks require both working memory and inhibition in that the child must hold in mind two representations simultaneously as well as make a response based on the representation which directly conflicts with his/her own salient perspective. Although this group earlier favoured the view that the ToM-EF relationship was based purely on the inhibitory requirements of ToM tasks, their shift in view was prompted by two separate studies in which they found that of two types of inhibition task, the type which involved a heavier working memory load (whereby two conflicting alternatives needed to be held in mind) was the more powerful predictor of ToM performance, and added extra variance over and above the low working memory load inhibition task (Carlson & Moses, 2001; Carlson et al., 2002). In addition, Carlson et al. (2002) found that a working memory task with no inhibitory requirement did not predict false belief performance independently of age and both verbal and non-verbal intelligence. Perner, Lang, and Kloo (2002b) also failed to find a significant relationship between ToM and inhibition when a simple go-nogo inhibition task with low working memory load was used. Consistent with these results, a recent study by Hala et al.
(2003) found that “pure” measures of inhibition and working memory did not predict false belief performance individually, but tasks combining inhibitory and working memory requirements were strongly predictive of false belief performance after age and verbal ability were controlled.

Only a few studies have examined the contribution of EF components other than inhibition and working memory to ToM performance, with mixed results. Hughes (1998a) found a significant correlation between tests of attentional flexibility (or set-shifting) and deception after age and verbal and non-verbal intelligence were partialled out. However, flexibility did not correlate with false belief performance and the correlations with deception were not as robust as those with performance on inhibition tasks. Harris (1993) argued that in both ToM tasks and tests of planning, children must envisage a hypothetical state of affairs and respond or make a prediction in accordance with that hypothetical situation, which typically contradicts the response which is suggested by the true or current state of affairs. According to Harris, children will exhibit difficulty on both types of task if they are unable to shift or disengage from their current context to a hypothetical and conflicting context. However, he does not present any evidence specifically examining this hypothesis. Using a planning test developed for use with young children, Bischof-Köhler (1998, cited in Perner & Lang, 2000) found a relationship between planning ability and false belief performance, but the direction of the relationship was such that false belief understanding appeared to be necessary for planning success. However, the effect of age or verbal ability on this relationship was not reported. Moses and Carlson (2000, cited in Carlson et al., 2002) did not find a significant relationship between planning ability and ToM after age and verbal ability were partialled out.
It therefore remains unclear whether EF components other than inhibition and working memory are correlated with ToM, and if so, what might underlie these relationships. Overall, though, the evidence appears to be consistent with the idea that the EF requirements of ToM tasks or abilities are a significant factor influencing the developmental relationship between ToM and EF.

However, a number of criticisms of expression accounts have been advanced by Perner and colleagues (Perner, 1995, 2000; Perner & Lang, 2000; Perner et al., 2002b), whose central contention is that ToM-EF correlations are not solely attributable to task requirements (and, relatedly, that the preschool development in ToM is not solely attributable to developments in EF) [18]. While they more readily accept the evidence suggesting that tests of deceptive pointing include a significant executive component, Perner and colleagues have questioned the methodology of several of the studies purporting to demonstrate earlier competence on false belief tasks when the EF demands are reduced. For example, Perner (1995; Perner et al., 2002b) argued that in Bartsch and Wellman’s (1989) study, equal numbers of children passed the explanation version of the false belief task (which does not include an obvious inhibitory requirement) as the standard prediction task; and that it was only after receiving an overly helpful prompt that they displayed additional correct answers on the explanation version. Other studies using alternative explanation paradigms have found that explanation tasks are equally as difficult as prediction tasks (Hughes, 1998a; Moses & Flavell, 1990; Perner et al., 2002b; Wimmer & Mayringer, 1998), although Russell, Hill, and Franco (2001) have pointed out that the mean age in these studies was around 4 years (compared with 3 years in Bartsch & Wellman’s study), which may have masked differences by boosting scores on the prediction task.
Perner (1995) also argued that Robinson and Mitchell’s (1995) finding of significantly improved performance on their identical twin explanation paradigm compared with a standard prediction version may be explained by a difference in the baseline performance expected for children with no understanding, and that there is no difference in difficulty once the data are adjusted for correct guesses – a post-hoc analysis which was then confirmed by the pattern of results obtained by Perner et al. (2002b).

[18] Perner’s own theory of the ToM-EF relationship is reviewed in Section 2.3.1.3, under the heading of “Emergence accounts”.

While these findings suggest that modified false belief (“explanation”) tasks which purportedly remove the EF component may be just as difficult as standard false belief tasks (see also Robinson & Beck, 2000), even more pertinent for Perner are studies showing that performance on explanation versions correlates just as strongly with EF scores as performance on standard prediction versions. Hughes (1998a) found that performance on her explanation version correlated as strongly as performance on a standard false belief task with scores on tests of inhibitory control, and Perner et al. (2002b) found that performance on their explanation version correlated as strongly as performance on a prediction version with scores on the dimensional change card sorting task (which arguably requires set-shifting and inhibition). These results suggest that the ToM-EF relationship is not solely due to the EF requirements of ToM tasks. However, this conclusion rests on the assumption that explanation versions of false belief tasks do not incorporate any EF requirements. Russell et al. (2001) argue that tasks requiring a linguistic explanation for why an empty location was visited still require children to set aside or inhibit their knowledge of the actual location.
It could also be argued that other explanation versions still require the child to hold in mind two conflicting perspectives, thereby taxing working memory. For example, in the identical twin versions used by Robinson and Mitchell (1995) and Perner et al. (2002b), the child is still required to hold in mind the sequence of events that has occurred and consider the different experiences of both twins simultaneously in order to work out why one twin looks in the wrong location. In addition, in defence of the expression accounts, it should be noted that the majority of authors who argue that EF abilities constrain performance on ToM tasks do not hold that the ToM-EF relationship is solely due to performance-based factors, that there is no deeper conceptual link, or that the development in mentalistic understanding in the preschool period is attributable only to increasing EF capacity without any additional conceptual development (Leslie and colleagues are an obvious exception). Certainly, none of the authors have made the claim (sometimes attributed to them) that mentalising ability does not exist and that ToM tasks are simply EF tasks. On the basis of the evidence as a whole, it seems reasonable to accept that successful performance on some ToM tasks requires a certain level of capacity in EF (particularly inhibition and working memory) and that executive difficulties may impact upon ToM performance, and therefore that the correlations observed between ToM and EF are partially due to performance-based commonalities.

2.3.1.2 Common conceptual requirements of ToM and EF

Rather than positing that the ToM-EF relationship arises from the EF requirements for successful ToM performance, this account contends that both ToM and EF share a third common underlying conceptual requirement.
The main account falling in this category is the CCC theory (see Section 2.2.2; Frye, Zelazo, & Burack, 1998; Frye et al., 1995), which proposes that false belief and EF tasks both require the use of embedded conditionals, or if-if-then rules (an example of a task structure involving embedded conditionals, the Dimensional Change Card Sorting (DCCS) task, is described in Section 2.2.2). Frye’s (2000) analysis of the logical structure of the standard false belief task (where the child must predict where Maxi will look for his chocolate) in terms of if-if-then embedded conditionals runs as follows:

IF me (s1), IF looking for chocolate (a1), THEN here (c1).
IF Maxi (s2), IF looking for chocolate (a1), THEN there (c2).

Frye (1999) proposed that while ToM and EF are distinct and neither underlies the other, they are related in that they depend on different applications of the same set of reasoning rules, and the development in this reasoning ability underlies the development of ToM and EF at the same age. The same embedded rules “guide the inferences necessary for theory of mind and allow the formulation of action that results in improved executive control” (Frye, 1999, pp. 121-122). Frye et al. (1995) tested the CCC account by comparing preschoolers’ performance on three false belief tasks and two reasoning tasks with an embedded conditional structure: the DCCS task and a physical causality task where a marble was rolled down a covered ramp either to a hole directly below its entry point or across to the opposite side, depending on the setting condition, and children had to predict where the marble would be found. Frye et al. found similar age-related improvements (between 3 and 5 years of age) across both types of task. In a further study, Frye, Zelazo, Brooks, and Samuels (1996) showed that 3-year-olds were able to successfully perform a simplified version of the physical causality task with a simple if-then structure. Frye et al.
(1995) also found that scores on the reasoning and ToM tasks were significantly correlated with age partialled out. Furthermore, ToM performance only correlated with performance on reasoning tasks with an embedded rule structure, and not with performance on tasks with simple if-then structures.

A number of criticisms of CCC theory and its explanation of the ToM-EF relationship have been advanced. Carlson and colleagues (Carlson et al., 1998; Carlson & Moses, 2001) argue that Frye et al.’s (1995) data are also consistent with an inhibitory control interpretation, as, for example, the DCCS task requires children to inhibit their previous way of responding and shift to a new response. Zelazo and Frye (1998) refute the inhibition interpretation by pointing to data showing that 3-year-olds are able to effectively inhibit a previous way of responding when the task conforms to a simple if-then structure (Marcovitch, Zelazo, Boseovski, & Cohen, 1997, cited in Zelazo & Frye, 1998) and that on a task with an embedded rule structure, 3-year-olds still performed poorly when evaluating the sorting of a puppet – that is, when they were not themselves required to inhibit a previous response (Jacques, Zelazo, Kirkham, & Semcesen, 1999). However, conversely, Carlson et al. (1998) highlight the fact that the deceptive pointing and arrow tasks used in their study had identical rule structures, but 3-year-olds’ performance was significantly better on the arrow task (which had a lower inhibitory requirement). Perner and Lang (2002) also found that 3-year-olds were able to perform well on variations of the DCCS task which had an embedded rule structure, but which did not include an extradimensional shift (i.e., the rule reversed rather than changed dimensions from colour to shape) and did not involve a visual clash between target and test cards (i.e., had reduced inhibitory requirements).
Moreover, Perner (2000) calls attention to the fact that go-nogo tasks (which require a simple pair of rules) are as difficult for 3-year-olds as other inhibition and conditional reasoning tasks. Perhaps even more pertinent is Carlson and Moses’ (2001) finding that one of their inhibition measures which had a simple if-then structure was one of the strongest predictors of ToM performance. Similarly, Sabbagh, Moses, and Shiverick (2001, cited in Carlson & Moses, 2001) found that false belief performance was strongly correlated with inhibition, but performance on the false photograph task (which has an identical rule structure) was not. These data suggest that similarities in the rule structure of ToM and EF tasks cannot account entirely for the ToM-EF relationship.

Perner, Stummer, and Lang (1999) also present a more a priori argument against the CCC account’s analysis of the standard false belief task. They point out that in the DCCS and physical causality tasks, the conditional structures describe rules which the child must know in order to solve the task. However, in the case of the false belief task, the conditional rules (e.g., “If Maxi, if looking for chocolate, then here”) are not part of the task’s instructions and cannot be those explicitly used by the child in solving the task, as such a rule would only be possible if the child was repeatedly exposed to Maxi going to the empty location. Perner and colleagues argue that this highlights the arbitrary nature of the rules chosen to describe the false belief task. They offer the following analysis:

IF I am looking for the chocolate (a1), THEN here (c1).
IF Maxi is looking for the chocolate (a2), THEN there (c2).

This plausible alternative reduces the task to one with a pair of simple if-then rules, which 3-year-olds should be capable of performing successfully.
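The difference between the two analyses of the false belief task can be made concrete with a small sketch. The dictionary encoding below is purely an illustrative device and is not a formalism used by either Frye or Perner; the function names are hypothetical. Frye's (2000) embedded analysis is written as a two-level lookup (setting condition, then antecedent), and Perner and colleagues' alternative as a one-level lookup.

```python
# Embedded (if-if-then) analysis: setting condition -> antecedent -> consequent.
EMBEDDED_RULES = {
    "me": {"looking for chocolate": "here"},
    "Maxi": {"looking for chocolate": "there"},
}

# Alternative analysis: a pair of simple if-then rules, antecedent -> consequent.
SIMPLE_RULES = {
    "I am looking for the chocolate": "here",
    "Maxi is looking for the chocolate": "there",
}


def embedded_lookup(setting, antecedent):
    """Apply an if-if-then rule: select by setting, then by antecedent."""
    return EMBEDDED_RULES[setting][antecedent]


def simple_lookup(antecedent):
    """Apply a simple if-then rule: select by antecedent only."""
    return SIMPLE_RULES[antecedent]


def rule_depth(rules):
    """Count the levels of conditions that precede a consequent."""
    depth = 0
    while isinstance(rules, dict):
        depth += 1
        rules = next(iter(rules.values()))
    return depth


if __name__ == "__main__":
    print(embedded_lookup("Maxi", "looking for chocolate"), rule_depth(EMBEDDED_RULES))
    print(simple_lookup("Maxi is looking for the chocolate"), rule_depth(SIMPLE_RULES))
```

Both encodings yield the same predictions for the task, which is the crux of the objection: the same behaviour can be described with either one or two levels of conditions, so the choice of description does not in itself establish the complexity of the child's reasoning.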
Zelazo, Jacques, Burack, and Frye (2002) attempted to refute these criticisms by arguing that their claim is not that people must learn the rules, but that they must formulate them in an impromptu manner in order to solve the task. They argue that their analysis of the rule structure of the false belief task is not logically necessary, but is an empirical claim which is confirmed by the correlations observed between ToM and rule-based reasoning tasks. However, this would mean that any task showing correlations with the rule-based reasoning tasks could then be considered to have an embedded conditional structure, which is surely a circular and ad hoc argument. Hence, difficulties with the logical defence of the conditional structure of false belief tasks, as well as an inability to account for data regarding both the abilities of young children and correlations between ToM and EF tasks with simple rule structures, present a significant challenge to the CCC account of the ToM-EF relationship.

An alternative “common conceptual requirements” account has been presented by Halford and colleagues (Halford, 1993; Halford et al., 1998), although this has not been as thoroughly investigated or discussed as the CCC account. Halford’s theory states that processing capacity (or working memory) is limited by the complexity of the relations (i.e., the number of related dimensions or sources of variation) that may be processed in parallel, and that as processing capacity develops, children should be able to represent concepts of increasingly higher relational complexity. He proposes that young children’s difficulty on standard false belief and appearance-reality tasks may be explained by their inability to represent “ternary” relations, or problems with three related dimensions.
His analysis of the standard false belief task is that it requires representing the relation between an object and two different representations of its location: one based on knowledge of its actual location and the other on a false belief of its location. He expresses this situation as the ternary relation: Find-object (<known-event>, <actual-location>, <believed-location>), instances of which are: Find-object (<saw-moved>, <object-in-location-A>, <believe-object-in-location-A>) and Find-object (<not-seen-moved>, <object-in-location-A>, <believe-object-in-location-B>). Halford argues that young children are able to understand any of the component binary relations (e.g., Find-object (<not-seen-moved>, <object-in-location-A>)), but that they cannot integrate two object-percept relations into a single ternary relation. This also explains their poor performance on other kinds of task, including EF tests such as the Tower of London, which require the same or a higher degree of relational complexity (the Tower of London is described in the next chapter, Section 3.4.1). Halford’s view is therefore similar to the CCC account, but differs in that it emphasises the number of relations between pieces of information rather than the presence of a hierarchical or embedded conditional structure. Because of this, Halford’s proposal may be more resistant to some of the criticisms levelled against the CCC account on the basis of the purported embedded rule structure of the false belief task. However, his account has yet to be directly tested, although evidence of a relationship between working memory and ToM (Davis & Pratt, 1995; Keenan et al., 1998) is consistent with it. In addition, the relational complexity of the EF tasks (for example, of inhibitory control) which are mastered at the same time as false belief tasks remains to be determined. 
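Halford’s notation for the false belief task can be illustrated with a minimal sketch. The class and function names below are hypothetical and are not part of Halford’s theory; the sketch simply assumes that relational complexity can be read off as the number of arguments a relation takes, and uses the two instances of the ternary relation given above.

```python
from typing import NamedTuple


class FindObject(NamedTuple):
    """Ternary relation with the three dimensions named in the text."""
    known_event: str
    actual_location: str
    believed_location: str


def relational_arity(relation: tuple) -> int:
    """Assumed stand-in for relational complexity: the number of related dimensions."""
    return len(relation)


# The two instances of the ternary relation given in the text.
saw_moved = FindObject(
    "saw-moved", "object-in-location-A", "believe-object-in-location-A"
)
not_seen_moved = FindObject(
    "not-seen-moved", "object-in-location-A", "believe-object-in-location-B"
)

# A component binary relation: the known event and actual location only.
binary_component = not_seen_moved[:2]

print(relational_arity(not_seen_moved))    # 3 (ternary)
print(relational_arity(binary_component))  # 2 (binary)
```

On this reading, the two instances differ only in the known event and the believed location, and it is the need to hold all three dimensions together that is held to exceed young children’s processing capacity.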
2.3.1.3 Emergence accounts [19]
In this category fall two main theories of the ToM-EF relationship, one claiming that EFs are a prerequisite for the development of ToM, and the other claiming that ToM is necessary for EF to develop.
i) EF is required for ToM. This position is represented mainly by Russell (1996, 1997b), whose argument is essentially that a sense of “agency” underlies self-awareness, which in turn underlies the development of ToM. According to Russell, agency has four main features: i) action-monitoring (the process through which changes in experience are perceived to have been caused by the self and not the world), ii) instigation (the ability of agents to determine their own perceptual sequences), iii) non-observational knowledge of actions (the phenomenon whereby if an agent is in control of his/her actions, s/he does not have to consciously observe them to know what they are), and iv) privileged knowledge of goals (whereby in acting in a goal-directed fashion we know incorrigibly what the goal is, whereas a third person does not). Russell considers EF to be equivalent to action-monitoring and instigation (or at least that these are the fundamental aspects of EF), and it is in this sense that he views EF as underlying the development of ToM [20]. He asserts that these features allow a sense of ownership – the perception of experiences as one’s own, and not determined by the world – and therefore a self-awareness which he calls “pre-theoretical” (i.e., bodily-based and immediate, requiring no comprehension of psychological concepts). This pre-theoretical self-awareness is a necessary condition for the development of ToM, a form of self-awareness which does depend on mental concepts.
[19] Perner and Lang (1999, 2000) label these “functional dependence” accounts. The term “emergence” accounts is preferred here as this more directly refers to the developmental aspect of this class of explanation – that is, the notion that one ability depends on the other to develop or “emerge”.
[20] Russell (1997b) recognises that EFs include other components such as inhibition, cognitive flexibility and working memory, and goes on to say how monitoring and instigation relate to these components. For example, he says that instigation (defined as the capacity to take actions not driven by habit or the external world) is analogous to the concept of generativity, but also requires inhibition; and flexibility requires adequate monitoring of the outcome of an incorrect response and instigation of a new strategy.
Empirical tests of Russell’s (1997b) theory have mostly been conducted on children with autism (who are purported to have inadequate action-monitoring or instigation), and these are reviewed in Section 2.3.2. However, a study by Hughes (1998b) is also relevant. She found that preschoolers’ early EF performance, particularly on a test of goal-directed action and inhibition, predicted ToM scores one year later; but that early ToM scores did not predict later EF performance. Although this study did not directly measure monitoring and instigation, it provides general support for the notion that EF is required for ToM rather than vice versa. Perner and Lang (1999, 2000) have critiqued Russell’s theory on conceptual grounds, claiming that while it can explain how early action-monitoring may fundamentally enable the early and later development of ToM, it does not adequately explain why developments in false belief and inhibition in particular should occur at the same age (around 4 years, later than the development of action-monitoring) or why ToM and EF should be correlated at this age. 
Russell does not specifically address this issue in his writings, tending to focus on expression or performance-based explanations for the ToM-EF relationship during the preschool period, without relating later inhibition to earlier action-monitoring and instigation. Also posing a challenge for Russell’s theory is the existence of disorders where EF is impaired but ToM is intact. If EF is a prerequisite for ToM development, one would not expect children with impaired EF to show typical ToM capacity. Three studies have now shown that children with ADHD or at risk of ADHD have impaired EF (particularly inhibitory control) but intact performance on ToM tasks (Charman, Carroll, & Sturge, 2001; Hughes, Dunn, & White, 1998; Perner, Kain, & Barchfeld, 2002a). A study by Tager-Flusberg, Sullivan, and Boshart (1997) demonstrated a similar dissociation in children with Prader-Willi syndrome and Williams syndrome, who showed impaired EF and intact ToM with no correlation between EF and ToM performance. In addition, six children failed both EF tasks but passed both ToM tasks, while no children passed both EF tasks if they failed both ToM tasks [21], inconsistent with the notion that intact EF is a prerequisite for ToM. Baron-Cohen and Robertson (1995) reported a case of a child with Tourette’s syndrome who passed all ToM tasks but failed two of three tests of inhibition, although this study comes with the usual caveats of a single-case design.
[21] This additional data was not contained in the original publication but was reported by Perner and Lang (2000).
Although Russell has not directly responded to these challenges to his theory, he has implied that he subscribes to a multi-componential view of EF (Russell, 1997b) and therefore may argue that individuals showing a ToM-EF dissociation have intact action-monitoring and instigation but impairments in other aspects of EF. 
However, this would contradict his assertion that monitoring and instigation are the fundamental basis for EF, underlying its other components. Another important issue in evaluating evidence of dissociations, highlighted by Perner and Lang (2000), is that the criterion for failure of EF tasks is arbitrary in many cases, and so evaluating the relative pass/fail rates of ToM and EF tasks suffers from the absence of absolute standards of performance. Although Perner’s view takes the opposite form, he concludes that while Russell’s theory is in need of greater specification regarding the aspects of ToM and EF it incorporates, there is no firm evidence against it (Perner, 2000; Perner & Lang, 1999, 2000).
ii) ToM is required for EF. This account was first alluded to by Wimmer (1989, cited in Perner, 1991), Frith (1992) in reference to schizophrenia, and Carruthers (1996), and was then developed further by Perner (1998; Perner & Lang, 1999, 2000). The essence of this position is that the metarepresentational capacity which (purportedly) underlies ToM is necessary for volitional control over action. Wimmer’s initial idea (as described by Perner) was that a better understanding of our own mind and mental concepts allows better control over our mental processes and behaviour. Carruthers (1996) developed this notion by positing that normal human reasoning routinely involves second-order evaluation of first-order thoughts, beliefs and desires (e.g., how strong is my desire to do x as opposed to y?), a kind of reflexive, introspective access to our recent conscious mental events. He argues that the operation of a ToM module underlies this meta-access to our own beliefs and thoughts, and in turn, that this meta-access is a necessary condition for the evaluation of recent problem-solving strategies such as is required on many EF tasks. 
Perner (1998) elaborated upon this idea by specifically delineating the metarepresentational requirements of the contention scheduling and SAS aspects of Norman and Shallice’s (1980, 1986) model of EF (reviewed in Section 2.2.2). He argues that contention scheduling does not require “meta-intentional” understanding (i.e., a declarative, conscious representation of one’s goal), because at this level action schemas control each other automatically by mutual inhibition and activation of competing behavioural sequences (such as in trial-and-error learning). On the other hand, intentional actions such as following verbal instructions or planning a future action sequence (which require control by the SAS) demand a declarative representation of a goal as desired or intended (i.e., “something the examiner wants me to do” or “something I want to do”), so that the correct novel action schema can be boosted [22]. These representations are called meta-intentional because they involve representing the intended action sequence as intended. In certain situations, though, boosting the desired action sequence is not sufficient – the inhibition of competing action schemas is also required. In these cases, Perner argues, one needs to understand why the particular competing action schema in question needs to be inhibited, and to do so one needs to consciously conceptualise the action sequence as a tendency one has – that is, metarepresent the schema as a representational vehicle (a representation which is not specified by its content, such as a procedural action sequence). Thus, it is only in situations where a competing action schema needs to be inhibited that we require metarepresentational (not just meta-intentional) capacity. 
The developmental relationship between tests of inhibitory control and false belief understanding therefore occurs because they both require metarepresentational capacity - both require the ability to represent representational vehicles (either action sequences or “pictures-in-the-head”) which have causal efficacy (i.e., make people act in a certain way). In a sense, then, Perner’s account is one of a common conceptual requirement of ToM and EF (i.e., they both rely on metarepresentational capacity) rather than ToM itself being a prerequisite for EF (although Perner himself places his account under the “functional dependence” heading, equating metarepresentational ability with ToM and then saying that EF tasks are applied ToM tasks) [23]. The main piece of direct evidence for Perner’s theory comes from a study by Lang and Perner (2002) which examined the relationship between early EF (as measured by the DCCS and Luria’s hand game, which requires inhibitory control), false belief and a knee-jerk reflex task. This latter task required the child to identify whether or not they intended to move their leg after a reflex movement was elicited by the examiner. Lang and Perner argued that like the false belief and EF tasks, this requires an understanding of mental states as representations which are causally responsible for actions (as the child needs to differentiate between intentional and accidental movements).
[22] Perner (1998) characterises this distinction as being that contention scheduling occurs at the level of the representational vehicle, and the SAS exerts control at the level of representational content (see Perner, 1995 for further explanation).
[23] It also resembles a “common conceptual requirements” account in that Perner implies that ToM and EF both depend on metarepresentational capacity throughout the lifespan, and not just during development. However, it was classified as an emergence or functional dependence account here both because that is how it is classified by Perner himself, and because the hypothesis does emphasise that ToM (or metarepresentational capacity) is necessary for EF to develop.
Consistent with their predictions, they found that the three types of task were significantly correlated with age and verbal ability partialled out, implying that all three abilities depend upon a common developmental factor. Perner and Lang (2000) also report further relevant results from what was presumably a preliminary version of the study, which showed that the knee-jerk reflex task still explained a significant amount of variance in false belief performance beyond the EF tasks, suggesting that the relationship between false belief and knee-jerk reflex understanding could not be explained by any executive component in the knee-jerk task. Perner’s account of the ToM-EF relationship has, like all the preceding accounts, been subjected to a range of critiques. Russell (1997b; Russell et al., 2001) has argued that the assumption that any behaviour with a second-order character (i.e., where the subject is required to represent to itself what it is doing and what needs to be done) necessarily involves ToM is an unjustified over-stretching of the ToM concept. He asserts further that action schemas or tendencies are not representations in any useful sense, or at least, do not necessarily require metarepresentational understanding. 
Russell has also criticised Perner’s interpretation of his result with the knee-jerk task, arguing that the task could be considered to require inhibition of an answer based on perceived outcome; and that the reason why it explains variance in the false belief task beyond that explained by the EF tasks is that the response to be inhibited is verbal, rather than a motor act as in Luria’s hand game (Russell et al., 2001; Russell, Hala, & Hill, 2003). It could also be argued that the correct rejection of the reflex movement as intentional requires action-monitoring (i.e., the ability to perceive the difference between changes in experience caused by the self and the world), and therefore that the results are also consistent with Russell’s agency theory. A number of empirically based criticisms of Perner’s theory have been advanced by Hughes (1998b) and Carlson and colleagues (Carlson & Moses, 2001; Carlson et al., 2002). Firstly, Hughes’ (1998b) finding that early EF predicted later ToM but not vice versa is inconsistent with the notion that ToM is a prerequisite for EF. Perner and Lang (1999, 2000) attempt to reinterpret this finding in their favour by arguing that EF tasks assess the understanding of mental states as causally efficacious as much as ToM tasks, and that Hughes’ data can be explained by assuming that this metarepresentational understanding occurs in reference to one’s own actions first, and in reference to others’ actions later. However, this is not consistent with findings that on the unexpected contents (Smarties) and unexpected identity (appearance-reality) tasks, correct reporting of one’s own previous belief develops at the same time as, or even after, the correct prediction of others’ beliefs (e.g., Gopnik & Astington, 1988). 
Also, this interpretation would mean that the findings of impaired EF and intact ToM in certain disorders, the evidence used by Perner against Russell’s theory, would pose an equally difficult problem for Perner: is it plausible that children could be impaired in a developmentally precedent ability in comparison with intact performance on an ability which develops later? Secondly, Hughes (1998b) points out that significant improvements in inhibition and goal-directed behaviour occur during infancy (see Section 2.2.2), long before children acquire Perner’s representational theory of mind. Perner et al. (1999) provide a more solid defence against this problem by distinguishing between “automatic inhibition” (when a more highly activated schema naturally inhibits less activated competitors, or relatively automatic suppression of motor or cognitive responses), which they argue is tapped by the A-not-B task and other EF tasks used with infants, and “executive inhibition” (when no alternative schema is automatically activated and a response must be actively inhibited, or when there is deliberate suppression of a response to achieve an internally represented goal), which is what is tapped by EF tasks mastered around the age of 4, and which requires metarepresentational understanding. A third empirical problem for Perner was discovered by Carlson and colleagues (Carlson & Moses, 2001; Carlson et al., 2002), who found that their two types of inhibition task showed differential relationships with ToM, but both required executive inhibition and therefore according to Perner should have been equally related to ToM. Perner et al. (2002b) themselves also found that performance on a go-nogo task which required executive inhibition was not significantly correlated with false belief prediction or card sorting performance, contrary to their predictions. 
Fourthly, as highlighted by Hughes (1998b), young children may correctly verbalise their understanding of the rules of a task but nevertheless demonstrate perseveration of the incorrect response (Zelazo et al., 1996b), indicating that meta-intentional ability is not sufficient for strategic performance on an EF task. The existence of dissociations between ToM and EF whereby ToM is impaired but EF is spared poses another difficulty for Perner, although, consistent with Perner’s account, these cases appear to be rarer than cases of the reverse dissociation. A number of studies of individuals with brain injuries have demonstrated a ToM-EF dissociation such that ToM is impaired and EF intact (reviewed in the next section). In addition, deaf children displaying intact EF performance still demonstrate a ToM impairment (Remmel, 2003). However, it could be argued that the abnormal development of ToM in deaf children has its basis in a different process to other conditions and is therefore a poor example of this dissociation, as it appears to be impoverished language development which underlies the delay in ToM rather than a metarepresentational deficit (de Villiers, 2000). Overall, then, while Perner’s theory of the ToM-EF relationship has an interesting and well-developed conceptual basis, it has not as yet been able to adequately refute conceptually grounded critiques or account for all the available data. Perner and Lang (1999) suggest that both emergence accounts may be correct, such that ToM and EF are interdependent: “an understanding of mental states as causally efficacious is required for executive inhibition, and executive inhibition is a main exercise ground for a theory of mind at this stage of development” (p. 342). 
However, while both of the emergence accounts are strengthened by evidence suggesting a deep link between ToM and EF during conceptual development, they are equally weakened by evidence that ToM and EF may be dissociably impaired (this is discussed further later).
2.3.1.4 Common neuroanatomical bases for ToM and EF
This category of explanation holds that correlations between ToM and EF may be coincidental or accidental, occurring because both abilities depend upon the same or proximal brain regions (Bach et al., 1998; Ozonoff, 1995a; Ozonoff et al., 1991; Pennington et al., 1997). On this account, the concurrent developments in ToM and EF which occur around the age of 4 are due to late maturation of these common brain structures. It can also explain the frequent co-occurrence of ToM and EF impairments, on the basis that proximal neuroanatomical structures will often be damaged together. While the notion of common underlying brain regions is not inconsistent with any of the other theories of the ToM-EF relationship, the idea that this is the only link between the constructs is a possibility unique to this hypothesis. This account therefore allows for dissociations between ToM and EF, although of course only if the brain regions in question are not absolutely identical. So what are the brain regions in question? Areas within the prefrontal cortex are obvious suspects. In Sections 2.2.1 and 2.2.2, we saw that while EF tasks are not necessarily sensitive or specific to the functioning of the prefrontal cortex, the view that EFs rely upon the prefrontal cortex is fairly well established (e.g., Owen, Downes, Sahakian, Polkey, & Robbins, 1990; see Stuss & Knight, 2002). 
Neuroimaging studies and investigations of brain-damaged and psychiatric patients have converged on the notion that the dorsolateral prefrontal cortex in particular is important for working memory, problem-solving, attentional flexibility and planning (Burgess, 2000; Cabeza & Nyberg, 2000; Collette & van der Linden, 2002; Dagher, Owen, Boecker, & Brooks, 1999; Fuster, 2000; Goldman-Rakic & Leung, 2002; Grattan, Bloomer, Archambault, & Eslinger, 1994; Kane & Engle, 2002; Mega & Cummings, 1994; Weinberger, 2002). An increasing number of recent neuroimaging studies of the brain regions involved in ToM (which are generally adult studies comparing activation during advanced ToM tasks with structurally similar tasks containing no mentalistic content) have implicated a network of structures including the medial prefrontal cortex, anterior cingulate cortex, superior aspects of the temporal lobes, and the temporal poles (Baron-Cohen et al., 1994, 1999a; Brunet, Sarfati, Hardy-Baylé, & Decety, 2000; Castelli et al., 2002; Fletcher et al., 1995; Gallagher et al., 2000; Goel, Grafman, Sadato, & Hallett, 1995; for reviews, see Abu-Akel, 2003; Frith & Frith, 2000; Gallagher & Frith, 2003; Kain & Perner, 2003). In their review of neuroimaging studies of ToM, Gallagher and Frith (2003) argue that the anterior paracingulate cortex (which is part of the medial frontal cortex, and lies just anterior to the anterior cingulate cortex proper) is the crucial region dedicated specifically to processing mental states, and the temporal regions which are commonly activated in ToM tasks have more secondary functions such as the interpretation of biological motion (which may be necessary to ascribe intentionality to others) and episodic memory (which may be required to imagine ourselves in the situation of another person). 
The orbitofrontal cortex and amygdala have also been proposed to have a role in social cognition (Baron-Cohen & Ring, 1994; Brothers, 1996); however, activation in these areas is not seen in the majority of neuroimaging studies of ToM. Gallagher and Frith (2003) conclude that while these areas may form part of the social brain in general (with the amygdala involved in the automatic response to socially salient stimuli as well as playing a key role in emotion, and the orbitofrontal cortex in the processing of affective, particularly aversive, social stimuli), they are unlikely to be directly involved in ToM. However, neuroimaging studies are of limited utility in investigating the role of the orbitofrontal cortex in ToM as it is difficult to obtain reliable activation maps for this region (Gregory et al., 2002). The importance of the prefrontal cortex in ToM has also emerged in studies of neurological patients, which have demonstrated significant ToM impairments in patients with prefrontal damage (Bach et al., 1998; Channon & Crawford, 2000; Gregory et al., 2002; Happé, Malhi, & Checkley, 2001; Lough, Gregory, & Hodges, 2001; Lough & Hodges, 2002; Rowe et al., 2001; Stone, Baron-Cohen, & Knight, 1998; Stuss, Gallup, & Alexander, 2001). However, the regions of the prefrontal cortex implicated in these studies have been somewhat more ambiguous than those indicated in neuroimaging studies of normal adults [24]. Rowe et al. (2001) found no effect of the site of lesion within the prefrontal cortex on ToM performance. Some studies have muddied the issue either by defining damage as being in the “orbitomedial” region, using the terms ventromedial and orbitofrontal interchangeably, or focussing on the area of overlap between the orbitofrontal and ventromedial regions (Gregory et al., 2002; Lough et al., 2001; Lough & Hodges, 2002; Stuss et al., 2001). Studies by Happé et al. (2001) and Stone et al. 
(1998) found that ToM is impaired following orbitofrontal lesions, contrary to the findings of neuroimaging studies. Cicerone and Tanenbaum (1997) also describe a patient with traumatic orbitofrontal injury who performed poorly on ToM-like tasks requiring interpretation of social situations. Drawing on additional evidence that patients with orbitofrontal damage commonly show marked changes in social behaviour, Stone (2000) concludes that the orbitofrontal cortex is the most crucial region for ToM. However, Bach, Happé, Fleminger, and Powell (2000) report a case of an adult male with orbitofrontal damage who showed intact performance on ToM tasks even though he displayed a disturbance in social behaviour. Eslinger (1998) also reviews evidence showing that patients with orbitofrontal lesions have impaired emotional or affective empathic processing, but intact performance on cognitive aspects of empathic processing [25]. Importantly, though, studies of neurological patients rarely implicate the dorsolateral prefrontal cortex in ToM (although see Price et al., 1990). Regardless of whether ToM relies more heavily on orbitofrontal or medial frontal regions, more relevant are studies addressing the notion that ToM and EF are related because they rely on proximal brain regions. Consistent with this hypothesis, a number of studies have reported co-existing ToM and EF impairments in patients with prefrontal damage not limited to specific dorsolateral, ventromedial or orbitofrontal regions (Bach et al., 1998; Channon & Crawford, 2000; Gregory et al., 2002; Rowe et al., 2001; Saltzman et al., 2000). Gregory et al. and Rowe et al. both found that while their frontal patients scored poorly on both ToM and EF tasks, these deficits were not significantly correlated, consistent with the idea that the two types of task may rely on different aspects of the prefrontal cortex.
[24] The laterality of ToM representation in the brain is also unclear, although it appears that patients with right hemisphere damage show ToM impairments more consistently than patients with left hemisphere damage (see Kain & Perner, 2003).
[25] Interestingly, consistent with the notion of a distinction between cognitive and affective aspects of empathy, Blair et al. (1996) found that psychopaths show intact performance on ToM tasks but do not show typical affective responses (as measured by physiological arousal) to images of individuals in distress.
This specialisation within the prefrontal cortex has also been supported by studies demonstrating dissociations between ToM and EF in patients with damage to specific prefrontal regions. Case and group studies have demonstrated specific impairments in ToM in the face of intact EF in patients with frontotemporal dementia (Lough et al., 2001; Lough & Hodges, 2002). In addition, Fine, Lumsden, and Blair (2001) reported a case of a patient with congenital amygdala damage who demonstrated impaired ToM but intact EF. The reverse dissociation, of intact ToM with impaired EF, was also reported by Bach et al. (2000). A double dissociation of sorts was demonstrated by Stone et al. (1998), who found that while patients with orbitofrontal damage failed ToM tasks regardless of their working memory load, dorsolateral prefrontal patients displayed impaired ToM performance only under conditions where the working memory load was high (moreover, under these conditions they made errors on control questions as well as belief questions). This account of the ToM-EF relationship has therefore been fairly resistant to criticism, as it is able to explain both the relationships and the dissociations between ToM and EF. 
Of course, in addition to the lack of consistency in defining what constitutes an impairment, the problem of equating ToM and EF tasks for difficulty should be noted as a caveat in interpreting the dissociations observed in neurological patients (and any other clinical samples or individuals), although most studies are careful to note the absence of any floor or ceiling effects. Aside from this, Perner and Lang (2000) have argued that the theory lacks strong predictive value, as any kind of task association or dissociation is compatible with it. Perner (2000) adds that while it accounts for a general developmental relationship between ToM and EF based on common timing of the maturation of prefrontal structures, it does not specifically predict that false belief tasks and “executive inhibition” should be mastered at the same time. He also points out that environmental factors such as number of older siblings influence the age at which false belief tasks are mastered (Perner, Ruffman, & Leekam, 1994; Ruffman, Perner, Naito, Parkin, & Clements, 1998). However, a number of other extraneous factors also influence ToM development (e.g., language, visual perception), but this does not preclude the notion that ToM has a specific neuroanatomical basis which is proximal to the structures involved in EF.
So, what can we conclude overall about the relative strength of the various models of the ToM-EF relationship? As we have seen, no account has eluded criticism. Leaving aside the problem of defining “impairment” and equating ToM and EF tasks for difficulty, one recurrent theme is that the explanation must account for the observed correlations between ToM and EF and the frequent co-occurrence of deficits in the two areas, as well as allowing for dissociable impairments in either direction. The “emergence” accounts in particular are weakened by evidence of dissociations, as they both imply a fundamental conceptual dependence between the two constructs during development. 
Similarly, dissociations are problematic for the “common conceptual requirements” accounts particularly if the tasks on which dissociations are demonstrated are both purported to rely on the same third underlying mechanism. While the “common neuroanatomical bases” explanation accounts best for these data, it lacks specificity in its predictions regarding the components of EF and types of ToM task which should be related. Also, it does not specifically account for data showing that performance on false belief and deception tasks improves when the EF requirements are reduced. The “expression” accounts do not strictly predict ToM-EF dissociations as one would expect that performance on ToM tasks would be affected if EF is impaired (although it would be acceptable for ToM to be impaired while EF is intact), but if the account is limited to specific aspects of EF such as inhibition or working memory, then dissociations with other EF components would be allowable. Although this account has been criticised on the basis of evidence suggesting that the ToM-EF link extends beyond the level of the EF requirements of ToM tasks, those findings do not exclude the possibility that there may be both performance-based and deeper conceptual or functional (and neuroanatomical) links. It is also possible that ToM and EF may be dependent on each other for their initial development but then become separable processes when matured (linked only by the EF requirements of ToM tasks and/or their common neuroanatomical substrates) – in which case emergence accounts would not be inconsistent with the existence of dissociations in adults or children over the age of five (in whom ToM and EF had previously developed normally). Perner et al. (2002a) do not appear to concur with this possibility, implying that functional dependence should extend across the life span [26]. 
However, Karmiloff-Smith (1992; Karmiloff-Smith, Scerif & Ansari, 2003; Thomas & Karmiloff-Smith, 2002; see also Bishop, 1997) has argued persuasively that processes which are dissociable in adulthood cannot be assumed to be so during development and that “a difference in performance....at any point in development does not permit the inference of a stable double dissociation at a later or earlier time” (Karmiloff-Smith et al., 2003, p.162). This argument carries the inference that double dissociations between ToM and EF observed during middle childhood and adulthood do not necessarily mean that the two abilities were not interrelated during earlier stages of development27. It is possible and not implausible that performance-based, conceptual, functional, and neuroanatomical factors interact and combine to produce the observed relationships between ToM and EF during different stages of development and in different disorders. This remains a speculative proposition, however – the nature or existence of the relationship between ToM and EF beyond the preschool period has been largely overlooked by all of the main accounts, which have focused in particular on the relationship between false belief and certain aspects of EF (inhibition, working memory, conditional reasoning, monitoring) between the ages of 3 and 5, without generating or testing specific predictions about the ToM-EF relationship in later childhood, adolescence and adulthood (or addressing the role of components of EF such as generativity and flexibility, which are still developing during late childhood and adolescence). Only three studies provide separate data on correlations between ToM and EF for non-clinical controls older than 5 years, with inconsistent results. Perner et al. (2002a) found a number of significant correlations between a second-order false belief task and a range of EF measures in typically developing 4.5- to 6.5-year-olds, while Charman et al. (2001) did not find any significant correlations between advanced ToM stories and measures of inhibition and planning in their 8- to 10-year-old typically developing controls. The only available adult data are from Channon and Crawford (2000), who report significant correlations for their healthy adults (with a mean age of 43 years) between advanced ToM stories and two measures of generativity, but not other EF measures of flexibility, inhibition and planning. These data suggest that the nature of the ToM-EF relationship may not be the same for older children and adults as for young children, and therefore that ToM-EF dissociations in these age groups may not be easily interpreted in terms of theories based on the preschool period. Notably, almost all reported ToM-EF dissociations have occurred in individuals or samples older than 5 years. The only exception is Hughes et al.’s (1998) study on “hard-to-manage” preschoolers, which found that these children showed largely intact performance on ToM tasks in comparison with impaired performance on EF tasks, but nevertheless that ToM and EF were correlated in this group. Evidently, further studies on older age groups are necessary to delineate the nature of the ToM-EF relationship beyond the early years and the meaning and implications of ToM-EF dissociations for the various accounts of the ToM-EF relationship. This is particularly important for the interpretation of studies on the ToM-EF relationship in autism, which have been conducted largely on older children.

26 If Perner’s account is viewed as a “common conceptual requirements” account, this claim becomes somewhat more defensible. However, he suggests that this is the case for both his and Russell’s accounts.

27 Another example of this concept is that visuospatial skills are functionally dependent upon basic vision for their typical development (e.g., Vecchi, 1998), but in adult disorders it is possible for visuospatial processing to be impaired without disruption to vision itself, such as in cases of spatial neglect (see Heilman, Watson, & Valenstein, 1993).

2.3.2 The ToM-EF relationship in autism

As a developmental disorder characterised by both ToM and EF impairments, autism provides an interesting test case for the various accounts of the ToM-EF relationship, each of which generates different predictions about the relationship between ToM and EF deficits in autism. The nature of the relationship is highly relevant to the evaluation of hypotheses of autism which propose a primary deficit in either ToM or EF. A single primary cognitive deficit account of autism needs to demonstrate that one impairment subsumes or explains the other; and conversely, a multiple cognitive deficits account would be consistent with evidence suggesting that ToM and EF are (at least partially) independent deficits in autism. Surprisingly, only a few studies have directly measured correlations between ToM and EF in autism, with many authors relying on the mere existence of both ToM and EF impairments in autism, other indirect evidence, or the theories and evidence generated from the study of typically developing children to argue for their position. Those who claim that ToM and EF are related deficits in autism rely heavily on Ozonoff et al.’s (1991) finding that ToM and EF were correlated in their sample of high-functioning individuals with autism. However, this was based on single ToM and EF composite scores, therefore obscuring the specific nature of the relationship; and furthermore, age was not partialled out of the correlation, leaving open the possibility that the correlation may have been mediated by age. The evidence regarding the nature of the ToM-EF relationship in autism will be reviewed by examining the predictions of each of the accounts reviewed in the previous section for the ToM-EF relationship for autism, and how well these predictions fit with the available data.
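The concern about an unpartialled age effect can be made concrete with the standard first-order partial correlation, which removes the linear contribution of a third variable z from the correlation between x and y. The short Python sketch below is illustrative only: the function name and the three correlation values are hypothetical and are not taken from Ozonoff et al. (1991) or from any other study reviewed here.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation between x and y, controlling for z.

    Uses the standard formula:
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical zero-order correlations (assumed purely for illustration).
r_tom_ef = 0.50   # ToM composite with EF composite
r_tom_age = 0.60  # ToM composite with age
r_ef_age = 0.70   # EF composite with age

print(round(partial_correlation(r_tom_ef, r_tom_age, r_ef_age), 2))
```

With these assumed values, a moderate zero-order correlation between ToM and EF falls to roughly .14 once age is held constant, which is the kind of outcome that would be consistent with a correlation largely carried by age.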
i) Expression accounts. The idea that the failure of children with autism on ToM tasks may be at least partially due to difficulties with their EF requirements has been most overtly advocated by Russell and colleagues (Hughes & Russell, 1993; Russell, 1997b; Russell, Saltmarsh, & Hill, 1999). In favour of this, Hughes and Russell (1993) found that participants with autism continued to fail a test of strategic deception (the windows task) when there was no opponent present. Russell et al. (1999) found that children with autism demonstrated significantly poorer performance than controls on the conflicting desire task used by Moore et al. (1995), suggesting that their difficulty with the false belief task is not restricted to a lack of understanding of the representational nature of belief. Charman and Lynggaard (1998) also found that the performance of children with autism on the Smarties task was enhanced by the provision of a photographic cue which (arguably) reduced the working memory and inhibitory demands of the task, although Bowler and Briskman (2000) were not able to replicate this effect using the standard Sally-Anne false belief task. Although this evidence indicates only that children with autism show impairments on tasks with both ToM and EF requirements as well as on tasks with only EF requirements, some proponents of the EF hypothesis of autism have nevertheless suggested that children with autism may fail ToM tasks because of their EF requirements (e.g., Ozonoff, 1997a; Russell et al., 1999). One problem with this account, which has been overlooked by all of its critics, is that children with autism do not tend to demonstrate impairments on tests of inhibitory control or working memory (see Section 2.2.3), the main EF components implicated in expression accounts of the ToM-EF relationship.
Although Russell and his colleagues do not directly address this, they implicitly sidestep it by interpreting their findings on the strategic deception and conflicting desire tasks in terms of a difficulty with mental disengagement rather than emphasising inhibition, an interpretation which is a little more consistent with the attentional shifting difficulties more consistently displayed by individuals with autism. Also, no studies have directly examined the performance of children with autism on tests combining inhibitory and working memory requirements (in comparison with tests tapping one or the other), which, as we have seen, appears to be more relevant to false belief performance. The most common argument advanced by critics of the view that EF impairments may explain the poor performance of children with autism on ToM tasks, however, is that they are able to pass the “false photograph” task (described in Section 2.1.3). The claim is that the false photograph task has an identical task structure to the false belief task, and therefore that their failure on false belief tasks cannot be due to their EF requirements (Baron-Cohen & Swettenham, 1997; Leslie & Roth, 1993; Leslie & Thaiss, 1992). However, a number of criticisms of the false photograph task have been put forth in return. Pennington et al. (1997) argued that the “false” photograph is not actually false: it does not misrepresent current reality because the nature of photographs is that they do not portray current reality, and therefore the adequate performance of children with autism could be explained by their intact understanding of cameras. Pennington et al. and Hughes et al. (1994) both also maintain that the camera and photograph are perceptually salient, available and enduring to participants in a way that inferred beliefs are not.
Similarly, Russell (1997b) claims that the inhibitory demands made by the false photograph task are far weaker than those made by the false belief task, as the participant is required only to inhibit their current perception of a three-dimensional representation (i.e., a toy) in order to refer to what is known about a two-dimensional representation (i.e., a photograph of a different toy). This claim was tested by Russell et al. (1999) by using a modified version of the false photograph task in which the initial photograph was taken of a blank wall, designed to increase the relative salience of the current representation (a three-dimensional doll) in comparison to the old one (where nothing was present). They found that while children with autism were able to pass the standard false photograph task, they demonstrated impaired performance on the modified version, indicating that when the inhibitory demands of the task matched those required by the false belief task more equally, children with autism could not sustain intact performance. Russell (1997b; Russell et al., 1999) has nevertheless made it clear that he is not of the view that the ToM deficits displayed by individuals with autism are entirely due to EF impairment, or that if the EF demands of ToM tasks were removed, then autistic individuals would show normal performance. Although Leslie and colleagues present an expression account of the ToM-EF relationship in typical development, they of course do not subscribe to this view of ToM failures in autism. While they propose that 3-year-olds fail false belief tasks because of an impaired Selection Processor (SP), they argue that children with autism have an intact SP but instead fail false belief tasks because of impaired metarepresentational capacity or ToMM (Leslie & Thaiss, 1992; Leslie & Roth, 1993). 
In support of their view, Leslie and colleagues have reported evidence that children with autism do not benefit from helpful task modifications as 3-year-olds do, and that while 3-year-olds will attribute beliefs to others even if they are incorrect, children with autism will not attribute any beliefs at all (Leslie & Roth, 1993; Roth & Leslie, 1998; Surian & Leslie, 1999). Leslie and colleagues acknowledge the presence of EF impairments in autism, but in their model, these are independent from the SP and from ToMM. They subscribe to a view of EF as a fractionated system whereby children with autism are impaired in some EF components but not those involved in the SP (Leslie & Roth, 1993). If the SP is considered to be an inhibitory mechanism, this in fact fits quite well with the literature suggesting that inhibition is intact in autism.

ii) Common conceptual requirements of ToM and EF. In discussing the applications of their CCC theory for autism, Zelazo, Frye and colleagues have strongly advocated the role of domain general processes (such as rule-based reasoning) in the cognitive aetiology of autism and argued against the conception that autism is characterised by a domain specific impairment in a theory of mind module (Frye et al., 1998; Zelazo et al., 1996a, 2001). Zelazo et al. (2002) tested the hypothesis that individuals with autism may fail ToM tasks because of domain-general difficulties in rule use by examining correlations between performances on two false belief tasks, the physical causality task, and the DCCS task. They found that the correlation between ToM and rule use tasks was not significant for severely impaired individuals with autism (due to floor effects on most tasks), but was significant for their mildly impaired group. They interpreted this result as indicative of the lack of domain-specificity of ToM impairments in autism, and furthermore, argued that ToM deficits in autism may be accounted for by rule-based reasoning impairment.
Besides the study’s small sample size (with only 10 mildly impaired participants) and the lack of a control group (necessary to ensure that any difficulties displayed are connected to autism; Colvert, Custance, & Swettenham, 2002), an important limitation of Zelazo et al.’s (2002) study is that they did not partial out age or IQ variables in their correlations. A study reported by Colvert et al. (2002) which addressed these limitations nevertheless replicated the result with 20 high-functioning children with autism, finding significant correlations between false belief and DCCS performance with age, verbal and non-verbal ability partialled out. However, as Colvert et al. point out, further research is needed to investigate what other factors (e.g., inhibition, salience of the switch of setting conditions) might account for these correlations, particularly in light of the criticisms of CCC theory outlined in the previous section’s review. In addition, as for dissociations observed in other disorders, the presence of ToM-EF dissociations in autism (reviewed below) challenges the notion that the two abilities depend upon a common conceptual ability.

Halford has not discussed or tested the implications of his relational complexity account (Halford, 1993; Halford et al., 1998) for autism. His proposal implies that individuals with autism may demonstrate limited relational complexity, a prediction awaiting empirical confirmation.

iii) Emergence accounts. In his account of why a sense of internal agency is a necessary prerequisite for the development of ToM, Russell (1997b) specifically posited that autism may be a disorder characterised by impaired action monitoring and instigation, and therefore that these deficits may underlie the abnormal development of ToM28. Previous studies showing deficits in imitation (Smith & Bryson, 1994) and motor planning (Hughes, 1996a) in autism are consistent with this hypothesis.
Russell’s first direct investigations of action monitoring in autism were promising, although not compelling (he has not studied instigation deficits, saying that this has been adequately covered by others under the guise of “generativity”). Russell and Jarrold (1998) found that on a task involving the launching of missiles towards targets, children with autism failed to correct errors based on both external and internal feedback. This was interpreted as indicating an impairment in constructing visual schemata of motor acts, which are necessary for action monitoring (although the authors acknowledged that their data could also be consistent with a deficit in flexibility). Russell and Jarrold (1999) tested higher-level self-monitoring by using a task requiring children with autism to recall whether they or another person had performed a certain action. Consistent with their predictions, they found that children with autism demonstrated impaired performance on this task, suggesting that they were failing to monitor their actions as their own. However, they also demonstrated some subtle difficulties on memory-based control tasks. More recent studies have not been so favourable towards Russell’s theory. Using a range of tasks including monitoring of basic actions, reporting an intention when the outcome was unintended but desired, and reporting on intended actions when the action achieved was unexpected, Russell and Hill (2001) did not find any strong evidence of monitoring impairments in children with autism. Similarly, Hill and Russell (2002) did not find evidence for a self-monitoring impairment in autism using a test of memory for actions which involved a self/other source attribution (i.e., a judgement of who performed the act), inconsistent with Russell and Jarrold’s (1999) results. These failures to meet the predictions of Russell’s (1997b) theory have led him to reconsider his conceptualisation of the core EF deficit in autism. Consistent with Ozonoff (1997b), Russell and Hill (2001) proposed that set-shifting or flexibility may instead be the core impairment in autism, and that this deficit may have a “homologous” rather than a “causal” or functional relationship with ToM. They propose that if one assumes that cognition is a form of set-shifting between domains, then children who are mentally inflexible would find it challenging to reflect on mental acts (their own and other people’s).

28 Russell (1997b; Russell & Hill, 2001) also argues that these deficits can account for the range of other EF deficits displayed by individuals with autism. He proposes that action-monitoring and instigation underlie the development of verbal self-regulation (or “inner speech”), which in turn is necessary to hold in mind arbitrary rules. Therefore, individuals with autism are impaired on EF tasks which have arbitrary rules.

Other proponents of the EF hypothesis of autism have presented alternative emergence accounts of the ToM-EF relationship in autism, although these have not been as extensively developed as Russell’s either conceptually or empirically. Hughes and Russell (1993) suggested that a child with an impairment in dealing with novelty and making decisions due to a damaged SAS (Norman & Shallice, 1980, 1986) would fail to develop successful social relations, with the developmental outcome being an impaired ToM. Pennington et al. (1997) proposed that autism is characterised by a severe deficit in working memory, which results in an early disruption in the planning and execution of complex behaviour. Because this occurs early in development, it affects the acquisition and use of concepts that require the integration of information across time and contexts. Concepts with these requirements include a recognition of one’s own and others’ intentions and their correspondence or conflict, which is involved in imitation as well as later ToM abilities.
Hence, an early impairment in working memory would result in the development of a mentalising impairment. An obvious challenge for this account is the absence of convincing evidence for a working memory deficit in autism (in the absence of any inhibitory requirements). Ozonoff and McEvoy (1994) suggested that an early and persistent impairment in the ability to disengage from the external environment and guide behaviour by internal mental models (see Harris, 1993) would have significant consequences for the ability to appreciate others’ perspectives (which requires disengagement from one’s own prepotent thoughts). A problem for all of these accounts, however, is the lack of convincing evidence of early EF deficits in autism (see Section 2.2.3), which speaks against the notion of EF impairment as being causally primary. The existence of ToM-EF dissociations in individuals with autism, whereby EF is impaired but ToM intact, presents additional difficulties for these accounts (which in one way or another all propose that a primary EF impairment underlies the abnormal development of ToM), although only one study has reported data relevant to this dissociation. Ozonoff et al. (1991) found that in their sample of high-functioning individuals with autism, EF deficits were almost universal but ToM deficits only occurred in a subset of the sample, with the implication being that some individuals failed EF tasks while passing ToM tasks. This finding is inconsistent with the view that EF is a necessary prerequisite for ToM and that early EF deficits result in later ToM impairment. Ozonoff et al. conclude that while EF deficits are primary in autism, they are unlikely to be causally related to ToM (they adopt a “common neuroanatomical bases” position, as reviewed below). However, it is possible that differences in the level of difficulty of the two sets of tasks may account for the pattern of results (Perner & Lang, 2000).
The reverse emergence account proposed by Perner and colleagues has not been directly examined with autistic individuals, although they have implied that a ToM (or metarepresentational) impairment may underlie EF deficits in autism just as for 3-year-olds (Perner, 1998; Perner & Lang, 2000). The most obvious difficulty with the application of this account to autism is that children with autism have shown intact performance on tests which may be regarded as measuring Perner’s “executive inhibition” (e.g., Ozonoff & Strayer, 1997), which his theory would not predict. ToM-EF dissociations in autism whereby ToM is impaired but EF intact also contradict his notion that metarepresentational ability is a prerequisite for EF development. Baron-Cohen and Robertson (1995) reported a case of a child with autism who failed several ToM tasks but performed successfully on EF tasks, and Baron-Cohen, Wheelwright, Stone, and Rutherford (1999b) report the same dissociation in three high-functioning adults with autism. Of course, the small number of individuals for whom this dissociation has been noted limits the generalisability of these findings. Perner’s theory of the ToM-EF relationship in autism requires a direct and systematic investigation before it may be adequately evaluated, although the available evidence is not overly favourable towards it.

iv) Common neuroanatomical bases for ToM and EF. The possibility that ToM and EF impairments may co-occur in autism because of their proximal neuroanatomical substrates was first proposed by Ozonoff et al. (1991; see also Bishop, 1993). However, Ozonoff (1997a; Ozonoff & McEvoy, 1994) subsequently put forward an opinion that there may be both performance-based and conceptual links as well. Baron-Cohen and Swettenham (1997), on the other hand, clearly expressed the view that ToM and EF are best conceptualised as independent deficits in autism, which probably co-occur because of their shared frontal origins.
While prefrontal abnormalities have been found in individuals with autism (as discussed in Section 2.2.3), more convincing evidence for this class of explanation would involve demonstrating both dorsolateral and medial or orbitofrontal impairment, as the purported substrates for EF and ToM respectively. Two studies provide indirect evidence for dorsolateral abnormalities in autism. Luna et al. (2002) found significantly reduced activation in the dorsolateral prefrontal cortex in individuals with autism during the performance of a spatial working memory task. Goldberg et al. (2002) also interpreted the presence of impairments on an eye movement anti-saccade task as suggestive of dorsolateral prefrontal dysfunction in autism. The idea that autism may involve medial frontal or orbitofrontal dysfunction has been suggested by a number of authors (e.g., Bachevalier & Loveland, 2003; Damasio & Maurer, 1978; Mundy, 2003) on the basis of the region’s apparent role in social behaviour. In a series of studies, Dawson and colleagues (Dawson et al., 1998, 2002a; Dawson, Osterling, Rinaldi, Carver, & McPartland, 2001) obtained indirect evidence of ventromedial prefrontal dysfunction in early autism by demonstrating impairments on tasks previously found to be linked with ventromedial functioning. Two studies have also found reduced activation in medial frontal areas during the performance of ToM tasks in individuals with autism (Castelli et al., 2002; Happé et al., 1996). While these studies do not provide unequivocal evidence of dorsolateral and medial/orbitofrontal abnormalities in autism, they are at least consistent with the possibility that ToM and EF impairments may co-occur in autism because of damage to their proximal neural substrates. Overall, then, we can conclude only that the nature of the relationship between ToM and EF in autism remains unclear.
The only studies to have conducted direct correlations between ToM and EF in autism have either failed to partial out the effects of age and IQ variables (Ozonoff et al., 1991; Zelazo et al., 2002), not examined specific relationships with components of EF (Ozonoff et al., 1991), or only included one type of EF task (Colvert et al., 2002). Studies addressing expression accounts of the ToM-EF relationship have not yet systematically varied the EF requirements of ToM tasks, and the emergence accounts again struggle with the presence of ToM-EF dissociations in autism as well as being inconsistent with some of the available data. The accounts which propose that ToM and EF are related in autism, while intuitively appealing, therefore remain open to further investigation. Interestingly, these accounts mostly originate from proponents of the EF hypothesis of autism, who argue that EF deficits may explain ToM impairments in autism (either because of performance-based or functional/developmental links). Notably, Baron-Cohen and Leslie, the most prominent proponents of the ToM hypothesis of autism, both adhere to the view that ToM and EF are independent deficits in autism. The independence and relative primacy of ToM and EF in autism are clearly important matters awaiting further empirical work. These matters are addressed in the current research.
CHAPTER 3
Selection and Description of Measures

3.1 Diagnostic measures
3.1.1 Autism Screening Questionnaire
3.1.2 Autism Diagnostic Interview – Revised
3.2 IQ measures
3.3 ToM measures
3.3.1 Simple false belief task
3.3.2 First-order false belief task
3.3.3 Second-order false belief task
3.3.4 Dewey Stories
3.4 EF measures
3.4.1 Tower of London
3.4.2 Intra-dimensional, Extra-dimensional Set-shifting task
3.4.3 Response Inhibition and Load task
3.4.4 Opposite Worlds
3.4.5 Relational Complexity
3.4.6 Pattern Meanings
3.4.7 Uses of Objects
3.4.8 Stamps task
3.5 Behavioural measures
3.5.1 Measures of repetitive behaviour
3.5.1.1 Repetitive Behaviours Questionnaire
3.5.1.2 Repetitive Behaviours Interview
3.5.2 Measures of social behaviour and communication
3.5.2.1 Social Behaviour Questionnaire
3.5.2.2 Social and communication ADI-R domains

Both of the studies contained in this thesis involve the use of a large range of cognitive, behavioural, and diagnostic measures. This chapter is devoted to a comprehensive discussion of each set of measures, including a rationale for the selection of each measure, a brief overview of its psychometric properties where possible, and a detailed description of what it entails. This precedes the chapters describing the two studies partly because of its length (due to the number of measures involved), and partly because the measures used are common to both studies. The reader is encouraged to refer back to this chapter when the procedure and results of the studies in Chapters 4 and 6 are being discussed.

3.1 Diagnostic measures

3.1.1 Autism Screening Questionnaire1 (ASQ; Berument, Rutter, Lord, Pickles & Bailey, 1999)

The ASQ was developed as a screening instrument for autism and other PDDs, based on current diagnostic criteria and for use with all age groups. It is a 40-item questionnaire completed by the individual’s primary caregiver.
The questions are based on the Autism Diagnostic Interview-Revised (ADI-R; Lord, Rutter, & Le Couteur, 1994 – see the following section) but have been modified to be easily understandable in a questionnaire format. It includes questions on reciprocal social interaction, language and communication, and repetitive and stereotyped behaviours. There are two versions, one for individuals under the age of six and the other for those aged six and over. A score of 1 is assigned for the presence of abnormal behaviour and a score of 0 for its absence. The total score therefore ranges from 0 to 39 (an item on current language level is not included in the score). Berument et al. found that a score of 15 or more on the ASQ was the optimal cutoff point for differentiating ASDs from other diagnoses. The ASQ shows good diagnostic validity and the correlation between the ASQ total score and the ADI algorithm score is high (Berument et al., 1999). In the current research, the ASQ was used mainly as a screening instrument to determine whether or not individuals in the control group (or siblings of individuals with ASDs and control siblings in the second study) displayed symptoms of autism. If a participant in one of these groups met the cutoff criterion for an ASD on the ASQ, the ADI-R was then administered. The ASQ cutoff point was lowered to a more conservative score of 10 (rather than 15) in this research, to ensure that both i) any controls scoring highly on the ASQ did not meet ADI-R criteria for an ASD, and ii) any individuals with mild ASD symptomatology met the criterion and were administered the ADI-R.

1 The ASQ has now been published as the Social Communication Questionnaire; however, as the version of the questionnaire used in this research was obtained from the authors prior to publication, the ASQ was deemed to be a more appropriate descriptor.
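The ASQ screening rule used in this research can be summarised in a short Python sketch. The function and constant names are hypothetical, and the sketch assumes that the unscored item on current language level has already been removed from the responses; it is an illustration of the rule as described in this section, not the scoring procedure published by Berument et al. (1999).

```python
from typing import Sequence

PUBLISHED_CUTOFF = 15  # optimal cutoff reported by Berument et al. (1999)
RESEARCH_CUTOFF = 10   # lowered cutoff adopted in the current research
N_SCORED_ITEMS = 39    # 40 items minus the unscored language-level item


def asq_total(scored_items: Sequence[int]) -> int:
    """Sum the 39 scored items (1 = abnormal behaviour present, 0 = absent)."""
    if len(scored_items) != N_SCORED_ITEMS:
        raise ValueError(f"expected {N_SCORED_ITEMS} scored items")
    if any(value not in (0, 1) for value in scored_items):
        raise ValueError("each item must be scored 0 or 1")
    return sum(scored_items)


def needs_adi_r(total: int, cutoff: int = RESEARCH_CUTOFF) -> bool:
    """Return True if the ASQ total meets the cutoff, so the ADI-R follows."""
    return total >= cutoff


# Hypothetical control participant with 12 items endorsed.
responses = [1] * 12 + [0] * 27
total = asq_total(responses)
print(total, needs_adi_r(total), needs_adi_r(total, PUBLISHED_CUTOFF))
```

In this hypothetical case the participant would be followed up with the ADI-R under the lowered cutoff of 10, but would not have been flagged under the published cutoff of 15.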
3.1.2 Autism Diagnostic Interview – Revised (ADI-R; Lord et al., 1994)

The ADI-R is a modified version of the ADI (Le Couteur et al., 1989), which is a standardised, semistructured interview for caregivers of individuals for whom autism is a possible diagnosis. The diagnostic algorithm is based on ICD-10 (WHO, 1992) criteria for autism, but it can also provide a DSM-IV (APA, 1994) diagnosis as the two diagnostic systems are very similar. It has demonstrated good reliability and validity (Lord et al., 1994). The duration of the interview for a practiced administrator is approximately 1.5–2 hours. Special training is required for administrators and approval for use of the instrument is given after completion of a test interview. The ADI-R consists of five sections: opening questions, communication (both early and current), social development and play (early and current), repetitive and restricted behaviours (early and current), and general behaviour problems. Each item is scored either 0 (behaviour not present), 1 (behaviour probably present but criteria not fully met), or 2 (behaviour definitely present), and occasionally a score of 3 is used to indicate extreme severity. An algorithm cutoff score determines whether an individual meets diagnostic criteria within each of the three domains of abnormality (i.e., communication, social interaction, and repetitive/restricted behaviours). In order to meet diagnostic criteria for autism, the individual must meet criteria in each of these three domains, as well as exhibiting some abnormality in at least one area by 36 months of age. The application and utility of the ADI-R in diagnosing ASDs other than autism and in differentiating ASD subtypes have not yet been investigated.
However, as individuals with clinical diagnoses of Asperger syndrome and PDD-NOS were included in the current research, a more lenient ADI-R diagnostic criterion was introduced such that any individual exceeding the cutoff point in at least one of the three domains was considered to have met criteria for an ASD. Section 4.3.2 in Chapter 4 describes the results of comparisons between the “full criteria” and “partial criteria” groups in Study One.

3.2 IQ measures

Two Verbal and two Performance subtests from the Wechsler scales (WPPSI-R, WISC-III or WAIS-III, depending on the participant’s age) were used to estimate Verbal IQ (VIQ) and Performance IQ (PIQ) respectively. VIQ and PIQ scores were estimated by pro-rating sums of scaled scores based on the two subtests for each scale. Verbal subtests were Vocabulary (providing definitions of words) and Similarities (identifying the way in which two things are alike). Performance subtests were Picture Completion (identifying the missing part of a picture) and Object Assembly (assembling pieces of a puzzle to form a whole object). These subtests were chosen because they are representative tests of verbal and non-verbal ability2, as well as being the least similar to other measures in the test battery.

3.3 ToM measures

Three different false belief tasks, varying in level of difficulty, were selected as the main ToM measures for both studies. Emphasis was placed on measures of false belief as these have been the main focus of studies of the ToM-EF relationship. The tasks chosen were all in common usage in the literature (unexpected contents/identity, standard first-order false belief, and second-order false belief; these are all described in detail below). A more advanced social cognition measure (Dewey Stories) was also included because of expected ceiling effects on false belief tasks in older control participants.
The three false belief tasks were administered in a hierarchy of difficulty, with different starting points for participants of different ages. This was done mainly to conserve time within the extensive test battery. The “simple” false belief task (including unexpected contents and unexpected identity items, as described below) was considered the easiest of the three tasks, with first-order false belief (or unexpected transfer) next in the hierarchy and second-order false belief the most difficult. The more challenging nature of second-order false belief tasks was demonstrated by Perner and Wimmer (1985) in typically developing children and Baron-Cohen (1989b) in children with autism. The level of difficulty of “simple” and first-order false belief tasks has generally been found to be more equal (Wellman et al., 2001), but findings are consistent with the assumption that an individual who passes the first-order false belief task is likely to have passed the simple false belief task.

Footnote 2: In the WISC-III (which was used with the most participants), the Vocabulary subtest loads .81 on the VIQ factor and .79 on the Verbal Comprehension (VC) index; and the Similarities subtest loads .75 on the VIQ factor and .72 on the VC factor. The Object Assembly subtest loads .66 on the PIQ factor and .69 on the Perceptual Organisation (PO) index; and the Picture Completion subtest loads .50 on the PIQ factor and .53 on the PO factor. Although the Block Design subtest has the highest loading on the PIQ and PO factors, it was not chosen because it is considered to be a measure of central coherence and therefore would have complicated the interpretation of PIQ scores.
The hierarchy of task administration operated such that participants were administered either the simple false belief task (for children between 4 and 6 years of age) or the first-order false belief task (for 7- to 16-year-olds) first, and then only proceeded to the more difficult task(s) if the initial task was passed (see footnote 3). Pass or failure was measured by performance on belief questions only, whereby a score of 2/3 or more for the simple false belief task or 3/6 or more for the first- and second-order tasks was considered a pass. If the initial (or subsequent) task was failed, failure was also assumed on the more difficult task(s). If the 7- to 16-year-olds passed the first-order false belief task, they were assumed to have also passed the simple false belief task; however, if they failed the first-order false belief task, the simple false belief task was administered.

Only a few studies have investigated the reliability and validity of false belief tasks, with somewhat equivocal results. An initial study by Mayes, Klin, Tercyak, Cicchetti, and Cohen (1996) found poor test-retest reliability for standard first-order false belief tasks; however, Hughes et al. (2000) found fair to moderate reliability across a wider range of false belief tasks, and high reliability when aggregate scores were used. Charman and Campbell (1997) found that a range of ToM tasks demonstrated moderate reliability in individuals with learning disorders. In a sample of children with autism, Grant, Grayson, and Boucher (2001) found good convergent validity of several false belief tasks as well as high consistency across task versions.

3.3.1 Simple false belief task (Flavell et al., 1983; Perner et al., 1987)

This task included three items, one of which was an unexpected contents item and two of which were unexpected identity items.
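The pass criteria and administration hierarchy set out above for the three false belief tasks can be summarised in a minimal Python sketch. The function and callback names are hypothetical, the belief score is treated as a single count per task, and starting any participant older than 6 on the first-order task is an assumption, since the text specifies starting points only for 4- to 6-year-olds and 7- to 16-year-olds.

```python
from typing import Callable, Dict

# Minimum belief-question scores counted as a pass
# (2/3 for the simple task; 3/6 for the first- and second-order tasks).
PASS_MARK = {"simple": 2, "first_order": 3, "second_order": 3}


def run_hierarchy(age: int, belief_score: Callable[[str], int]) -> Dict[str, Dict[str, bool]]:
    """Sketch of the administration hierarchy for the three false belief tasks.

    belief_score(task) is a hypothetical callback that administers `task`
    and returns the number of belief questions answered correctly.
    """
    results: Dict[str, Dict[str, bool]] = {}

    def administer(task: str) -> bool:
        passed = belief_score(task) >= PASS_MARK[task]
        results[task] = {"administered": True, "passed": passed}
        return passed

    def assume(task: str, passed: bool) -> None:
        results[task] = {"administered": False, "passed": passed}

    # Assumption: anyone older than 6 starts with the first-order task.
    if age <= 6:
        sequence = ["simple", "first_order", "second_order"]
    else:
        sequence = ["first_order", "second_order"]

    failed = False
    for task in sequence:
        if failed:
            assume(task, False)      # failure is assumed on harder tasks
        elif not administer(task):
            failed = True

    if "simple" not in results:
        if results["first_order"]["passed"]:
            assume("simple", True)   # a first-order pass implies a simple pass
        else:
            administer("simple")     # given only after a first-order failure
    return results
```

For example, a 10-year-old who fails the first-order task is recorded as failing the second-order task without it being administered, and is then given the simple task.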
These two types of item resemble each other closely. Because a higher number of trials per task is preferable in general, they were grouped together and considered as different items of a single task. This has been done in previous studies (e.g., Gopnik & Astington, 1988), and its validity was confirmed by Wellman et al.’s (2001) meta-analysis, which showed no difference in the level of performance on the two item types. It was also confirmed in this research, where strong correlations were found across the three task items in the control participants of both studies (on the belief questions, r ranged from .52 to .7, all ps < .01).

Footnote 3: Of note, most studies which have examined the effect of task order on false belief performance have not found any significant order effects (e.g., Gordon & Olson, 1998; Hala et al., 2003).

In the first task item (the unexpected contents item), the child is shown a box of Smarties and asked, “What do you think is inside this box?” After s/he responds, the box is opened and the child is shown that the box actually contains a pencil. The pencil is then replaced in the box, and the child is asked, “What is really in the box?” (the Reality question). S/he is then asked, “When you first saw the box, all closed up like this, what did you think was inside it, Smarties or a pencil?” (the Own Belief question). Finally, the participant is asked, “If I show X (parent/sibling) the box all closed up just as I showed you, and I ask X what he/she thinks is in the box, what do you think X will say, Smarties or a pencil?” (the Others’ Belief question; see footnote 4).
The other two items (the unexpected identity items) involve the same questions, except the stimulus for the second trial is a sponge which is spray-painted to look like a rock (after being asked what s/he thinks it is, the child is then allowed to squeeze it, then asked what it really is, and so on); and the stimulus for the third item is a black pen which contains red ink (after being asked what colour s/he thinks the pen is, the experimenter writes with it to show that it is red, then the child is asked what colour it is really, and so on). The child is given a score of 1 or 0 for each of the Reality, Own Belief and Others’ Belief questions, and the scores for the three trials are summed for each question type.

Footnote 4: The order of control and belief questions has been found to have no effect on participants’ responses (Eisenmajer & Prior, 1991; Leslie & Frith, 1988).

3.3.2 First-order false belief task (Wimmer & Perner, 1983; Baron-Cohen et al., 1985)

For the current studies, rather than using puppets, a video was filmed in which six independent scenes are depicted (see footnote 5). The task is introduced to participants by saying “Now we are going to watch some short videos. Each video tells a story. After we finish watching each video, I will ask you some questions about what happened in the story”.

Footnote 5: It should be noted that Wellman et al.’s (2001) meta-analysis demonstrated that the medium in which false belief tasks are presented (e.g., video, puppets, real people) had no effect on performance.

In four of the scenes, an object is placed in Location 1 by Actor 1, who then leaves the room. Actor 2 moves the object to Location 2, and Actor 1 then re-enters the room. The participant is then asked i) where Actor 1 will look for the object (the Belief question), ii) where the object actually is (the Reality question), and iii) where Actor 1 placed the object at the beginning (the Memory question).
In another one of the scenes, Actor 1 places Object 1 in a covered bag and leaves the room, and then Actor 2 replaces Object 1 with Object 2. The participant is asked i) what Actor 1 thinks is in the bag, ii) what is actually in the bag, and iii) what Actor 1 placed in the bag in the beginning. In the remaining scene, Actor 1 draws Picture 1 on a board and leaves the room, then Actor 2 rubs out Picture 1 and draws Picture 2. The participant is asked i) what picture Actor 1 thinks is on the board, ii) what picture is actually on the board, and iii) what picture Actor 1 drew in the beginning. Each scene therefore follows the same basic structure. In each case, both of the locations, objects or pictures are visible on the screen when the participants are being asked the questions. Participants are given a score of 1 or 0 for each of the Belief, Reality, and Memory questions, and scores for each question type are summed over the six scenes. In pilot testing, it was found that many participants gained a score of 1 on all items and found the task easy. A discontinue criterion was therefore introduced whereby if participants gained full marks for the Belief questions for three consecutive scenes, they were not shown the remaining scenes and gained automatic credit for these. If this discontinue criterion was met but participants had scored 0 on any of the Reality or Memory questions, their overall score in these question categories was calculated on a pro-rata basis – that is, a percentage correct was calculated for those items administered, and then multiplied by six. 3.3.3 Second-order false belief task (Perner & Wimmer, 1985; Baron-Cohen, 1989b) This task was also presented in video format, with each of the six scenes followed by Belief, Reality, and Memory questions. In four of the scenes, Actor 1 places an object in Location 1, and leaves the room, but spies on Actor 2 without Actor 2 knowing. 
Actor 2 moves the object to Location 2, and Actor 1 then re-enters the room. The participant is then asked i) where Actor 2 thinks that Actor 1 will look for the object (the Belief question), ii) where the object actually is (the Reality question), and iii) where Actor 1 placed the object at the beginning (the Memory question). In another one of the scenes, Actor 2 draws a picture on a sheet of paper while Actor 1 watches, then Actor 1 leaves the room. While Actor 1 secretly watches, Actor 2 decides to draw a different picture instead. The participant is asked i) what Actor 2 thinks Actor 1 thinks the picture is, ii) what the picture actually is, and iii) what Actor 2 drew in the beginning. In the remaining scene, Actor 1 offers Actor 2 an orange and a banana, which are both placed in a lunchbox. Actor 2 takes the orange, and then Actor 1 leaves the room. While Actor 1 secretly watches, Actor 2 replaces the orange and takes the banana instead, and eats it. Actor 1 then re-enters. The participant is asked i) what Actor 2 thinks that Actor 1 thinks she ate, ii) what she actually ate, and iii) what she took in the beginning. Again, in each case, both locations, pictures or fruits are visible on the screen when the participants are being asked the questions. Participants are given a score of 1 or 0 for each of the Belief, Reality, and Memory questions, and scores for each question type are summed across the six scenes. The same discontinue criterion was used for this task as for the first-order false belief task.

3.3.4 Dewey Stories (Dewey, 1991)

Dewey (1991) states that she composed this task in 1974 as an informal measure of knowledge of social norms and human relations. It was chosen as a higher-level, more advanced measure of social cognition than the false belief tasks.
While Dewey (1991) reports qualitative data on the unusual comments made by individuals with autism in response to the stories, she does not report any quantitative scoring method or any results from typically developing or other control samples. No other published study has used the measure, and there have been no published investigations of its reliability or validity. Its inclusion in this research can therefore be considered somewhat experimental. While several of the story items appear to tap mentalistic understanding, its validity as a measure of ToM remains to be investigated (see Section 4.3.2.2 of Chapter 4), as it could be argued that the task may also be successfully performed simply by drawing on knowledge of normative or common social behaviours.

The stimuli for the task are 7 stories, each one paragraph in length, which describe a sequence of events containing certain social scenarios (Figure 3 contains an example). The stories used in this study were taken directly from Dewey (1991); however, one story was shortened and another was substantially modified to be more appropriate for an Australian sample. Two or more sections from each story are underlined, and a pair of empty brackets follows each underlined part. Participants are asked to rate each underlined behaviour according to how they think most people would judge that behaviour if they witnessed it. They are asked to use the following scale:

Fairly normal behaviour in that situation [A]
Behaviour that is a little unusual in that situation [B]
Rather strange behaviour in that situation [C]
Very eccentric or shocking behaviour in that situation [D]

Although there are no set right or wrong answers for each rating, as a way of judging the typicality of participants’ responses, each response was compared with norms derived from 30 undergraduate psychology students.
Responses of the normative sample showed an equal split between the frequency of “B” and “C” responses on several items, and so it was decided to place B and C in the same category of response. The scoring system worked such that responses closer to the dominant normative response were assigned a lower score. For items where A was the dominant response (n = 8), participants who chose A scored 0, B/C scored 1, and D scored 2; and for items where B/C was the dominant response (n = 9), A scored 1, B/C scored 0, and D scored 1. There were no items where D was the dominant response. Scores were summed across items to produce a total score, on which lower scores represented a higher social awareness.

Emily, age nineteen, overslept on the morning of her aeroplane trip. When she woke up, there was just enough time for her to dress and get to the airport, so she skipped her breakfast. [ ] At noon, the steward came around with lunch, but Emily was so hungry by then that one portion did not satisfy her. She watched a little girl across the aisle toy with her food, complaining “I can’t eat it.” Apparently, the father didn’t want any more, because he told the child to just leave it. Emily leant across the aisle and said, “If your little girl doesn’t want her tray, can you pass it over for me?” [ ]

Figure 3. An example of a Dewey Story.

3.4 EF measures

Because one of the central aims of this research was to conduct a thorough investigation of the EF profile characteristic of ASDs, as well as to examine the relationship of various EF components to ToM (these aims are discussed in the next chapter), a strong emphasis in the test battery was placed on measures of EF. Given the difficulties following from the task impurity of most widely used EF tests (as discussed in the previous chapter, Section 2.2.1), specific assessment of component processes using tasks with high construct validity was given priority in task selection.
The component process approach to EF assessment has been strongly advocated by several authors (e.g., Hill, 2004; Ozonoff, 1995a, 1997a, 1997b, 2001). The tasks chosen are relatively simple and/or include control conditions allowing precise delineation of the underlying EF process(es) involved, although for some EF domains (e.g., planning), this was not as easily achievable due to both the nature of the component and the availability of “pure” tasks. A wide range of EF components was assessed, including planning (measured by the Tower of London), set-shifting or cognitive flexibility (the Intra-dimensional, Extra-dimensional set-shifting task), inhibition and its interaction with working memory (Response Inhibition and Load task and Opposite Worlds), relational reasoning (Relational Complexity), and generativity (Pattern Meanings, Uses of Objects and the Stamps task). It was desirable to test each EF domain using both verbal and non-verbal response modes, which was possible for the inhibition and generativity domains. The child-friendliness of the tasks was another factor considered in EF task selection. It was important for the tasks to be appropriate for a fairly large age range, so tasks with a low floor and high ceiling were regarded as preferable.

3.4.1 Tower of London (Culbertson & Zillmer, 1998b; Shallice, 1982)

The Tower of London (ToL) was first designed as a cognitive measure by Shallice (1982), who found that it was performed poorly by patients with frontal lobe lesions. The ToL’s sensitivity to frontal dysfunction has been supported in a number of subsequent studies using both adult clinical samples (Carlin et al., 2000; Owen et al., 1990) and head-injured children (Levin et al., 1994, 1996). Shallice proposed that the ToL specifically measures planning ability, which may be defined as the ability to generate, select, organise, integrate, and monitor behaviours needed to achieve a future goal (Culbertson & Zillmer, 1998a; Lezak, 1995).
The validity of the ToL as a measure of planning was supported by Shallice’s (1982) finding that ToL performance did not covary with measures of visuospatial ability or working memory, although subsequent studies have found that working memory and inhibition may also be involved in task performance (e.g., Welsh, Satterlee-Cartmell, & Stine, 1999).

A wide range of administration and scoring procedures for the ToL have been used in different studies. Three groups of researchers have published proposed standardised versions of the ToL for use with paediatric populations (Anderson, Anderson, & Lajoie, 1996; Culbertson & Zillmer, 1998b; Krikorian, Bartok, & Gay, 1994). Both Anderson et al. and Krikorian et al.’s versions require the readministration of failed items, whereas Culbertson and Zillmer’s version only requires the child to do each problem once, and uses the number of extra moves made as its main dependent measure. Culbertson and Zillmer (1998b) argue that the readministration of failed items significantly increases the amount of on-task time, which is a liability when assessing younger children and clinical populations with limited attentional capacities. It can also provoke frustration and distress, leading to decreased motivation and co-operation. Their version has demonstrated adequate test-retest reliability as well as good criterion-related, diagnostic and construct validity (Culbertson & Zillmer, 1998a, 1998b). For these reasons, administration and scoring procedures for the version of the ToL used in this research were based on those outlined by Culbertson and Zillmer (1998b).
The major differences were that the floor was lowered by including 1- and 2-move items (rather than beginning with 3-move items); there were four items at each level of difficulty instead of three; the instructions were slightly modified to encourage participants to plan moves in advance; and scores were adjusted (see below) to account for participants who completed problems in fewer moves than the minimum number because they broke the task rules.

Participants are presented with a tower structure consisting of three wooden posts of descending heights mounted on a wooden base. Three coloured discs (red, black and white) are placed on the posts in a standard starting position (see Figure 4). The participant is then required to rearrange the three coloured discs on the posts so that the new configuration corresponds to the pattern presented on a 21cm x 15cm stimulus card. The participants are informed that this must be accomplished in the minimum number of moves, which is told to them verbally as well as being written at the top of the stimulus card. In addition, they are told that they must adhere to the following four rules: i) they can only use one hand to move the discs, ii) they can only move one disc at a time, iii) discs cannot be placed on the board or table - only on the posts, and iv) they cannot put more discs on a post than it will hold. Examples of breaking a rule are demonstrated, each time with the experimenter saying “You can’t do this”.

Figure 4. The starting configuration for the Tower of London stimuli.

Participants are given one 1-move and two 2-move practice examples, during which if a rule is broken or extra moves made, the rules are reiterated and the correct solution demonstrated. The following instructions are then given: “Now I am going to set up more disc patterns and see if you can make them on your board in as few moves as possible. You may find that some of the patterns are difficult, but do the best you can. Each pattern can be solved.
You should look carefully at the pattern and the board and plan the best move to start with. Take your time planning, as each move you make counts towards the total. If you think you can’t finish it in the correct number of moves, then just keep going and try and do it in the fewest number of moves you can.”

Items range in difficulty from 1 move to 7 moves, with four items at each level of difficulty. Participants aged between 4 and 13 begin with 1-move items, and participants aged 14 and over begin with 3-move items (and are given automatic credit for 1- and 2-move items if they complete at least two 3-move items in the minimum number of moves). If the first item at a new level of difficulty is failed (either by breaking the rules or using too many moves), the correct solution is demonstrated. There is a time limit of 2 minutes on each item, after which the item is discontinued. Testing is discontinued if participants fail all items at a particular level of difficulty. Remaining items are assumed to have been failed and are assigned the maximum total number of moves (i.e., 20).

Five scores are computed for each test item. These are listed in Table 1. The sum of extra move scores and adjusted extra move scores is calculated for each block of items (i.e., each level of difficulty) as well as overall. The total number of problems completed in the minimum number of moves is also computed (i.e., the total number of problems with an adjusted extra move score of 0). Although Culbertson and Zillmer (1998b) also calculated initiation and solution times, these were not used in this research as not all participants were administered all items (due to the different starting points for different ages and the discontinue rule), making mean times (either overall or block by block) difficult to calculate and analyse in a meaningful way.

Table 1. The five scores computed for each item on the Tower of London

1. Total number of rule violations: Included moving 2 discs off the posts at the same time, placing more discs on a post than it would hold, and placing discs on the board or table
2. Extra moves: Total number of moves – minimum moves*
3. Adjusted extra moves: Adjusted moves – minimum moves, where adjusted moves = total moves + (2 x no. of rule violations)
4. Extra move score: Designed to avoid excessive inflation of the “extra moves” index by an extreme number of total moves, this is calculated as follows:
   • If extra moves = 0, extra move score = 0
   • If extra moves = 1-5, extra move score = 1
   • If extra moves ≥ 6, extra move score = 2
5. Adjusted extra move score: Identical to the extra move score except using adjusted extra moves (3) instead of extra moves (2)

*If the total number of moves exceeds 20, it is reduced to 20 to avoid inflation of the “extra moves” index by excessive moves on a limited number of items. For example, if a participant executes 24 moves on a 7-move problem, then the score would be calculated as follows: 20 - 7 = 13. In addition, the total number of moves is assigned a value of 20 for any item not solved within 2 minutes.

3.4.2 Intra-dimensional, Extra-dimensional (IDED) Set-shifting task (Owen et al., 1993)

The original version of this task, which forms part of the CANTAB (Cambridge Neuropsychological Test Automated Battery), was designed as a WCST-like computerised measure of attentional set-shifting. In comparison to the WCST, the IDED set-shifting task is simpler to allow participants of a wider age and ability range to participate, and involves a series of stages containing a number of internal control conditions to aid the elucidation of the mechanisms involved in successful or unsuccessful task performance. It has demonstrated fair test-retest reliability (Lowe & Rabbitt, 1998).
Using this task, it was found that patients with Parkinson’s disease and with frontal lobe damage, but not patients with temporal lobe damage, demonstrated an inability to shift attention between two perceptual dimensions at the “extra-dimensional shift” stage (Downes et al., 1989; Owen, Roberts, Polkey, Sahakian, & Robbins, 1991). However, observations that patients with frontal lobe dysfunction and Parkinson’s disease may fail the task for different reasons led Owen et al. (1993) to develop a modified version of the CANTAB procedure, which was designed to distinguish whether impairments in attentional set-shifting ability are caused by an inability to release attention from a relevant stimulus dimension (Perseveration), or an inability to re-engage attention to a previously irrelevant dimension (Learned Irrelevance). This version (described further below) therefore allows even further breakdown of the processes involved in set-shifting performance, making it attractive for inclusion in the current research. Using this modified IDED task, Owen et al. (1993) found that the difficulty with extra-dimensional shifting demonstrated by patients with frontal lesions was caused by perseveration to the previously relevant dimension, whereas patients with Parkinson’s disease tended to show learned irrelevance. When Turner (1997) used the task with children with autism, she found that low-functioning participants demonstrated significantly more errors at the extra-dimensional shift stage of the Perseveration condition, but not the Learned Irrelevance condition. The modified version of the IDED set-shifting task includes two task conditions, one intended to assess perseveration and the other to assess learned irrelevance. 
In both conditions, each trial consists of two patterns appearing on a computer touchscreen (their positions randomly alternating between four rectangular boxes to the top, bottom, left and right of centre of the screen), and the participant is required to choose which one is “correct” according to an unspecified rule, with feedback provided by the computer. Participants are given the following instructions: “This is a game where you have to work out the rule for choosing the right answer. On the screen you are going to see two patterns. The patterns will appear in any two of four boxes. One of the patterns is right and the other is wrong, and you must tell the computer the one you think is right. You do this by touching the pattern on the screen. There is a rule which you can follow to make sure you make the right choice every time. The computer will not tell you the rule – you will have to work it out for yourself. To begin with, there is nothing on the screen to tell you which of the patterns is correct so when you choose your first answer you will just have to guess. However, the computer will give you a message after each try to tell you whether you are right or wrong. If you are right, it will come up with the word “CORRECT”, written in green, and if you are wrong, it will say “INCORRECT”, written in red. The computer will be keeping track of how well you are doing. When the computer can tell that you know the rule, the computer will then change the rule, but it will not tell you that the rule has changed. You will have to work out the new rule. That won’t happen very often. Do you have any questions before you start?”

Each condition comprises 8 stages presented in the same fixed order: a simple discrimination (SD) and reversal (SDR), then a compound discrimination (CD) and reversal (CDR), then an intra-dimensional shift (IDS) and reversal (IDR), and finally an extra-dimensional shift (EDS) and reversal (EDR).
Participants can only proceed to the next stage after reaching the criterion of 6 consecutive correct responses.

In the Perseveration condition (see Figure 5), the task begins with subjects being required to learn which of two geometrical shapes is correct (SD condition). The subject is then required to reverse the learnt rule and respond to the previously incorrect stimulus in the target stimulus dimension (SDR). The next stage introduces an additional stimulus dimension, white lines, which are paired with the shapes. At this stage the same shape remains correct, with the nature of the lines being irrelevant (CD). Once that is learnt, the subject is again required to reverse the learnt rule and respond to the other shape (CDR). The next stage is the IDS stage, where the subject is presented with new exemplars for both of the stimulus dimensions (shapes and lines). Although the exemplars are different to the two previous stages, the relevant stimulus dimension (shape) remains the same. In the EDS stage, the previously irrelevant stimulus dimension (lines) is replaced by a new stimulus dimension which becomes relevant (solidity), and the previously relevant dimension (shape) becomes irrelevant. Thus, participants must shift their attention from a previously relevant to a new stimulus dimension, and failure reflects perseveration to the previously relevant dimension.

The Learned Irrelevance condition (see Figure 6) proceeds as for the Perseveration condition for the first 6 stages, except that colour is the relevant dimension and number the irrelevant dimension. However, in the EDS stage, the relevant dimension (colour) is replaced by a previously irrelevant dimension (number), and a new dimension (size) becomes the irrelevant dimension. Hence, participants must shift their attention to a previously irrelevant stimulus dimension, and failure reflects learned irrelevance associated with the previously irrelevant dimension.
Stage   Relevant dimension   Irrelevant dimension
SD      Shape                -
SDR     Shape                -
CD      Shape                Lines
CDR     Shape                Lines
IDS     Shape                Lines
IDR     Shape                Lines
EDS     Solidity             Shape
EDR     Solidity             Shape

Figure 5. Stimuli for the Perseveration condition of the IDED set-shifting task (see text for explanation of abbreviations). The correct choice is always displayed on the left. [Stimulus images not shown.]

Stage   Relevant dimension   Irrelevant dimension
SD      Colour               -
SDR     Colour               -
CD      Colour               Number
CDR     Colour               Number
IDS     Colour               Number
IDR     Colour               Number
EDS     Number               Size
EDR     Number               Size

Figure 6. Stimuli for the Learned Irrelevance condition of the IDED set-shifting task (see text for explanation of abbreviations). The correct choice is always displayed on the left. [Stimulus images not shown.]

Failure to achieve the criterion of 6 consecutive correct responses within 50 trials at any one stage results in discontinuation of the test. There is a 1000ms interval between successive trials. Each condition lasts approximately 10 minutes and the two conditions are separated by at least 30 minutes of unrelated tests.

Unlike Owen et al.’s (1993) procedure, in the current study the dimensions used in each condition (i.e., shape, lines, solidity, colour, number, and size) were consistent across participants (as shown in Figures 5 and 6), rather than being counterbalanced across the two conditions. The only other difference between the current and Owen et al.’s version was that the Perseveration condition was presented first for all participants (rather than the order of conditions being counterbalanced across participants). So that participants could be effectively compared across conditions, the main index of performance was the number of “errors to criterion” within each stage of the task (this was also the main index of performance used by Owen et al., 1993).
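The errors-to-criterion index can be illustrated with a minimal Python sketch. It assumes a hypothetical per-trial record of correct and incorrect responses for a single stage (the output format of the task software is not described here), and applies the criterion of 6 consecutive correct responses and the 50-trial limit stated above.

```python
from typing import List, Tuple

CRITERION_RUN = 6   # consecutive correct responses needed to pass a stage
MAX_TRIALS = 50     # trials allowed per stage before the test is discontinued


def errors_to_criterion(responses: List[bool]) -> Tuple[int, bool]:
    """Count the errors made in one stage up to the point the criterion is met.

    `responses` is a hypothetical per-trial record (True = correct).
    Returns (errors, reached_criterion); reached_criterion is False when the
    stage would lead to discontinuation of the test.
    """
    errors = 0
    run = 0
    for correct in responses[:MAX_TRIALS]:
        if correct:
            run += 1
            if run == CRITERION_RUN:
                return errors, True
        else:
            errors += 1
            run = 0          # an error resets the run of correct responses
    return errors, False
```

For instance, a participant who makes two errors before producing six correct responses in a row has an errors-to-criterion score of 2 for that stage.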
If the test was discontinued because the criterion of 6 consecutive correct responses was not met within 50 trials, a value of 25 (the value expected with random responding) was assigned for the errors to criterion score for subsequent stages of the task which were not administered.

3.4.3 Response Inhibition and Load (RIL) task

The basic idea for this non-verbal computerised test of inhibition, which was created by the author, was taken from Drewe (1975) but with substantial modifications and additions. Drewe’s study included two types of task, one involving the requirement to press a button in response to one type of stimulus but not another (otherwise known as a “Go-Nogo” task) and the other requiring the participant to inhibit the prepotent response to match stimuli of the same colour by pressing a button which was opposite in colour to the stimulus. This latter task type had a control condition which required participants to press a button which was the same colour as the stimulus. The inclusion of a control condition was desirable for the current research, as subtraction of scores on this condition from scores on the inhibition condition allows more precise identification of the level of performance on the inhibition task condition without the confounding effects of non-inhibitory processes such as speed of processing and motor coordination. This “non-matching to target” paradigm was therefore adopted for this research but modified in order to improve aspects of the methodology. In Drewe’s task, the stimulus and two response buttons (one red and one blue) were always present and visible (with the stimulus button lighting up in either red or blue), whereas in the current task a touch screen was used so that the stimulus was presented only briefly before the response buttons were presented.
This also allowed the two coloured response buttons to change sides randomly from trial to trial, ensuring that the participant was responding on the basis of the colour of the response button rather than its spatial location. The inhibition task condition was also modified so that the colours of the stimulus and response buttons were different to the control condition (which preceded it). This was to avoid confounding inhibition with set-shifting (specifically, reversal), as the use of exactly the same stimuli in control and inhibition conditions means that the inhibition condition then requires the participant to reverse the stimulus-response contingencies and therefore directly “shift set” (Ozonoff et al., 1994). An important addition to the task was an extra condition which involved an increased working memory load, thereby allowing evaluation of the interaction between inhibitory capacity and working memory. This condition was included in order to examine two hypotheses: i) that children with autism (and/or their siblings) may be impaired only on tasks which combine inhibitory and working memory demands, and ii) that false belief measures show correlations with performance on tasks that combine inhibitory and working memory demands, but not with each capacity individually. Performance on the working memory load condition was compared with that on the inhibition condition, to assess the specific effect of the working memory load, and also with performance on the control condition, to assess the combined effect of inhibition and working memory requirements. The three task conditions are described below. Condition 1 (Control condition): In this condition, either a pink or green stimulus circle (approximately 5cm in diameter) appears at the top of a computer touch screen for 250ms, and then two smaller response circles (approximately 3.5cm in diameter), one pink and one green, appear simultaneously at the bottom left and right corners of the screen. 
Participants are instructed to touch the response circle which is the same colour as the stimulus circle. Participants have 4s to respond before the response circles disappear and the next trial begins. An equal number of pink and green stimulus circles are presented, and the order of presentation is random. As already mentioned, the response circles change sides randomly (i.e., the pink circle can appear on either the right or left), to ensure that participants are responding to the colour of the circle rather than simply its position on the screen. Participants are required to use one hand only to respond. Performance indices are the percentage of errors (i.e., responding to the wrong-coloured circle), and the median RT for correct trials.

Condition 2 (Inhibition condition): This condition is identical to Condition 1 except that the colours of the stimulus and response circles are purple and yellow, and the participant is required to touch the response circle which is the opposite colour to the stimulus circle. Hence, if the stimulus circle is purple, participants must touch the yellow response circle, and vice versa. As for Condition 1, performance indices are the percentage of errors and the median RT for correct trials.

Condition 3 (Working Memory Load condition): In this condition, instead of the stimulus being a circle, it is either a square, triangle or cross. As in Condition 2, participants must touch the response circle which is opposite in colour to the stimulus shape (the colours in this condition are orange and grey). However, at random intervals between trials, the three shapes are displayed on the screen and the participant must touch the shape which was presented in the most recent trial. This occurs for 25% of the trials. The participant must therefore recall the shape of the stimulus as well as inhibiting the prepotent tendency to respond to the same colour.
Performance indices for the responses to the colour of the stimulus are identical to those in Conditions 1 and 2. For the questions about the shape of the stimulus, performance is measured by the percentage of errors.

In each condition, participants perform 7 practice trials, during which any errors are pointed out verbally and corrected. Following the practice trials, there is a pause during which the participant may ask any further questions. The 60 critical trials then proceed, during which every third error is pointed out and the participant is reminded of the task rules. The inter-trial interval is 1000ms in all conditions.

3.4.4 Opposite Worlds (Manly, Robertson, Anderson, and Nimmo-Smith, 1998)

This task was selected as an additional measure of inhibition in which, unlike the RIL task, a verbal response is required. Opposite Worlds is a subtest of the Test of Everyday Attention for Children (TEA-Ch; Manly et al., 1998), and is similar in design to Gerstadt et al.’s (1994) Stroop-like day-night task, but instead involves reading the number “1” as “2” and vice versa. It demonstrates good test-retest reliability (Manly et al., 2001). It forms part of the “attentional control/switching” factor in the TEA-Ch, but the naming of this factor reflects the executive nature of the factor rather than accurately describing the requirements of the tasks that load on it. Manly et al. (1998, 2001) consider Opposite Worlds to be a test of verbal inhibition, pointing out that as participants are required to switch from the Opposite to the Same World (control) condition as well as vice versa, performance on the Opposite World condition may be attributed to the requirement to inhibit a prepotent verbal response rather than the demands of task switching. The task has displayed good construct and convergent validity, correlating significantly with other measures of inhibition (Manly et al., 1998).
Opposite Worlds is administered only to children who are able to read the numbers 1 and 2. The stimuli are yellow squares linked together in an undulating pattern on a black background, with each square containing either a 1 or a 2. The task is introduced by saying: “In this test there are two sorts of world. There is the Same World, where everything is as you would say it here, and the Opposite World, where you have to say the opposite of what you would say here”. An example page with two Same World examples at the top and two Opposite World examples at the bottom is shown to the participant. The experimenter points to the beginning of the first Same World example and says: “Here I would say ‘Start, one, one, two, two, one, Stop’”. The child is encouraged to complete the same item and then the other Same World example. While the child reads the numbers, the experimenter points to each square in turn, and does not move on to the next square until the child has said the correct number. After successful completion of the Same World examples, the experimenter then points to the first Opposite World example and says: “We’re now going to the Opposite World where we have to say the opposite. Here, when we see a one we have to say ‘two’, and when we see a two we have to say ‘one’. This is how to do it: ‘Start, one, one, two, one, two, Stop’”. Both examples are then completed by the child.

The participant then completes the four test trials in the order: Same World, Opposite World, Opposite World, Same World. S/he is reminded of the instructions at the beginning of each trial. The time taken to complete each trial is recorded from the time the child says “Start” to the time s/he says “Stop”. The number of errors for each trial is also recorded, with an error being defined as any occasion upon which the child says “1” when required to say “2”, or vice versa.
A total time score (summing the time taken across the two trials) and total error score (summing the errors made across the two trials) are calculated for each of the Same and Opposite World conditions.

3.4.5 Relational Complexity (Waltz et al., 1999)

This task was included as a measure of relational reasoning, which refers to the ability to “form and manipulate mental representations of relations between objects and events” (Waltz et al., 1999, p. 119). While relational reasoning is not often included in lists of EF components, it was assessed mainly to test Halford’s notion that limited capacity to integrate multiple relations (i.e., relational complexity) may underlie failure on false belief tasks (Halford, 1993; Halford et al., 1998; see Section 2.3.1.2 in the previous chapter). Halford et al. (1998) argue that working memory capacity may be best defined in terms of the complexity of the relations that can be processed in parallel, and therefore the Relational Complexity task may also be considered a test of working memory, a domain which is more often considered an aspect of EF. Waltz et al. (1999) found significant impairments in patients with prefrontal cortical damage on their Relational Complexity task, and proposed that failures on various types of EF task could be accounted for by a deficit in relational integration.

The task is similar in format to the Raven Standard Progressive Matrices Test, in which the missing part of a pattern must be chosen from six alternatives. The current version is based on Waltz et al.’s (1999) adaptation but has more levels of difficulty, more pictures within each item, and more alternative answers. In this version, each problem consists of a 3 x 3 matrix of square-shaped simple geometric pictures, with the bottom right-hand corner picture missing. Participants are asked to select the missing picture from eight alternatives (see Figure 7).
Problems vary in the number of relational changes (e.g., in shape, size, rotation), occurring over horizontal and/or vertical dimensions of the matrix, which must be attended to while selecting the missing picture. Nonrelational (Level 0 complexity) items consist of identical pictures, with participants simply having to choose the matching picture from the eight alternatives. The most difficult relational problems are of Level 4 complexity, requiring the simultaneous processing of 4 relational changes (see Figure 8). In order to raise the ceiling of the task, some more difficult items were also included where the relevant stimulus dimensions do not necessarily vary in a consistent way across the vertical or horizontal dimensions of the matrix (see Figure 9). An additional item at the end of the task consists of a matrix with four missing pictures, and participants are required to move four cut-out pictures into their correct places. There are 4 problems at each level of difficulty.

Participants are instructed to take their time and point to the correct answer when they are sure of it. They are given a maximum of two minutes for each problem, and three minutes to solve the final problem with the four missing pictures. A score of 1 or 0 is given for each trial. The task is discontinued if the participant fails all four problems at a particular level of difficulty, with remaining items assumed to have been failed. The sum of correct responses is calculated for each level of difficulty and overall.

Figure 7. Example of a Relational Complexity item with 1 relational change.

Figure 8. Example of a Relational Complexity item with 4 relational changes.

Figure 9. Example of a more difficult Relational Complexity item without consistent relational changes.
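The scoring and discontinuation rules for the Relational Complexity task can be illustrated with a short sketch. This is not the scoring procedure as actually implemented in this research; the function name, the level labels, and the data layout (an ordered list of levels, each with its 1/0 item scores) are assumptions made purely for illustration, and the single final item with four missing pictures is left out.

```python
from typing import Dict, Sequence, Tuple


def score_relational_complexity(
    levels: Sequence[Tuple[str, Sequence[int]]],
) -> Tuple[Dict[str, int], int]:
    """Sum 1/0 item scores per level, applying the discontinuation rule.

    `levels` is ordered from easiest to hardest. Each entry holds a
    hypothetical level label and the item scores recorded at that level
    (an empty sequence if the level was never administered).
    """
    per_level: Dict[str, int] = {}
    discontinued = False
    for label, item_scores in levels:
        if discontinued:
            # Items after discontinuation are assumed to have been failed.
            per_level[label] = 0
            continue
        correct = sum(item_scores)
        per_level[label] = correct
        # Failing every problem at a level ends the task.
        if len(item_scores) > 0 and correct == 0:
            discontinued = True
    return per_level, sum(per_level.values())


# Illustrative data only: four problems per level, as in the task description.
example = [
    ("level_0", [1, 1, 1, 1]),
    ("level_1", [1, 1, 0, 1]),
    ("level_2", [0, 0, 0, 0]),  # all four failed, so the task stops here
    ("level_3", []),            # not administered
]
per_level, total = score_relational_complexity(example)
print(per_level, total)
```

Under these assumptions the participant in the example receives 4, 3, 0 and 0 correct at the four levels shown, for an overall score of 7.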
3.4.6 Pattern Meanings (Wallach & Kogan, 1965; Turner, 1999)

Tests of generativity measure the ability to produce multiple novel responses spontaneously following a single cue or instruction. In this research, the generativity domain was tested using three different tasks because this aspect of EF has been under-researched in autism, despite studies demonstrating its potential ability to explain several symptoms of autism (e.g., Jarrold et al., 1996; Turner, 1997). There are three basic types of generativity task: word fluency (requiring the participant to generate words beginning with a certain letter or belonging to a certain category), design fluency (where participants must produce abstract designs or patterns), and ideational fluency (requiring generation of uses for objects or interpretations of abstract line drawings). Word fluency was not tested in this research, mainly because it relies heavily on vocabulary, making it difficult to disentangle reasons for poor performance (particularly in autism, where verbal ability is typically impaired). A special emphasis was placed on ideational fluency, with two tasks of this capacity included, as Turner (1999) found particularly poor performance on ideational fluency tasks in both low- and high-functioning individuals with autism.

Pattern Meanings is one of the measures of ideational fluency and requires verbal generativity. The stimuli are five meaningless line drawings, taken from Wallach and Kogan (1965) and also used by Turner (1999), which were printed on individual 14.3cm x 9.2cm laminated cards (see Figure 10 for an example). An additional drawing was used for a practice item. Administration procedures were identical to those used by Turner (1999), except that participants were given 90s instead of 150s to generate responses for each item.
This shorter interval was introduced following pilot testing, in which it was found that participants tended to produce only a very small number of responses in the last minute, and would often become restless and impatient or inattentive.

Before presentation of the practice stimulus, participants are told that the task is one in which they will be shown some different patterns and asked to think of all the things the pattern looks like, or what it could be. Participants are then shown the practice stimulus card (see Figure 11) and asked “What could this be?” Any appropriate response is reinforced and the participant is encouraged to think of other things the pattern looks like. The experimenter also makes the following suggestions (if they have not already been provided by the participant): “a hedgehog”, “someone with spiky hair”, “sparks from a fire cracker”, and “a brush”. Participants are told that they are allowed to turn the cards around and view them from any orientation. They are then given the test stimuli one at a time, and for each one are asked “What could this be?” Stimuli are presented in a random order.

Figure 10. One of the five test stimuli for the Pattern Meanings task.

Figure 11. The practice stimulus for the Pattern Meanings task.

Scoring procedures were similar to those used by Turner (1999), but an extra “uninterpretable response” category was added. This category was introduced because it was found during scoring that a number of responses could not be classified in any of the other categories. Each response was therefore classified as belonging to one of the following five scoring categories, and the number of responses in each category was summed across the five test items:

1. Correct response: A response which represents a plausible interpretation of the pattern.
2. Incorrect response: A response that represents an inappropriate or implausible interpretation of the pattern (e.g., for the pattern displayed in Figure 10: “this could be a shoe”).
3. Repetition: A response which is a repetition of a previous response (for the current stimulus or a previous stimulus).
4. Redundant response: A response that varies from a previous response only in terms of one minor element or feature of the response (e.g., “two hills”, “two mountains”, “two sand-hills”, etc.).
5. Uninterpretable response: A nonsensical response, which cannot be interpreted as fitting into any of the above categories (e.g., “up and down”).

As some unusual responses were difficult to classify, the scoring of Pattern Meanings (and the Uses of Objects task described below) was more subjective than that of other tasks in the protocol. Across all types of fluency tasks used in her study, Turner (1999) reported 85% inter-rater agreement and kappa values in excess of .70, indicating satisfactory inter-rater reliability. Because the version of Pattern Meanings used in this study employed slightly different scoring criteria from Turner, inter-rater reliability of this version was calculated using a subset of data from 22 participants (sampled randomly from the ASD and control groups in Study One and the ASD sibling and control sibling groups in Study Two). There was 93.3% agreement between the two raters and Cohen’s kappa was .81, indicating excellent inter-rater reliability for this version.

3.4.7 Uses of Objects (Wallach & Kogan, 1965; Turner, 1999)

Uses of Objects also measures ideational fluency and requires verbal responses. In this task, the stimuli are six different objects.
Three objects are “conventional” items with well-established functions (a pencil, a brick, and a mug), and three are “nonconventional” items with no clear or established function (a piece of plain navy blue material measuring 14 x 51 cm, a 50 cm length of dowelling, and a 90 cm long piece of clothing elastic). As with Pattern Meanings, administration procedures matched those used by Turner (1999), but again with the shorter 90s interval in which to provide responses.

The task is introduced as one in which participants will be asked to think of all the ways in which some different objects could be useful. Participants are then asked “For example, how could we use a newspaper? Tell me something useful we could do with it”. Any appropriate suggestions made by the participant are praised and further responses encouraged. The examples “you could use it to start a fire”, “you could roll it up and swat flies with it”, and “you could use it to wrap a present” are provided by the experimenter if not already produced by the participant. Participants are then asked to think of as many uses as they can for the six different objects, one at a time. For each of the conventional items, the experimenter gives two examples, one representing the object’s established function (e.g., “you could use a mug to drink from”), and one that is more imaginative (e.g., “you could use a mug as a vase for flowers”). For the nonconventional items, the experimenter gives just one imaginative example (e.g., “you could use a piece of material to wrap up pencils if you wanted to carry them”). After the examples are provided, the participants are asked to say all the other ways in which the object could be useful. The objects are presented in the same order for each participant (pencil, dowel, brick, material, mug, elastic).

Scoring procedures were again similar to those used by Turner (1999) but, as with Pattern Meanings, an extra “Uninterpretable responses” category was added.
In addition, a “Non-Useful responses” category was introduced as it was found during scoring that many responses were plausible things that could be done with the object, but that did not serve any useful purpose (e.g., “you could sharpen a pencil”). Hence, each response was categorised into one of the following six scoring categories:

1. Correct response: A response which represents a plausible use for the object.
2. Incorrect response: A response that represents an inappropriate or implausible use for the object (e.g., for the brick: “eat it”).
3. Repetition: A response which is a repetition of either one of their own previous responses (for the current object or previous objects) or one of the examples.
4. Redundant response: A response that varies from a previous response only in terms of one minor element or feature of the response (e.g., for the brick: “to build a garage”, “a shed”, “a factory”, etc.).
5. Uninterpretable response: A nonsensical response, which cannot be interpreted as fitting into any of the above categories (e.g., for the piece of fabric: “blow down”).
6. Non-useful response: A response which describes something plausible that could be done to or with the object, but which does not include a useful purpose for the object (e.g., for the piece of elastic: “stretch it”).

The number of responses in each category was summed separately for conventional and nonconventional items, as well as overall. Inter-rater reliability for Uses of Objects was calculated using a subset of data from 23 participants (again sampled randomly from the ASD and control groups in Study One and the ASD sibling and control sibling groups in Study Two). There was 86.8% agreement between the two raters and Cohen’s kappa was .76, indicating good inter-rater reliability.

3.4.8 Stamps task (Frith, 1972)

This task was based on one used by Frith (1972) as a measure of the spontaneous self-generation of underlying rules in patterns.
It was considered a test of design fluency in this research despite the fact that Frith did not label her task in this way, as it is a nonverbal task requiring participants to produce multiple novel responses. The task differs from standard design fluency measures in that it involves producing patterns from a set of materials rather than drawing abstract designs. While it has been used far less frequently than other design fluency tasks, its scoring system allows analysis of a number of different processes underlying task performance, making it amenable to a component process approach. In addition, Frith (1972) demonstrated interesting results with the task in children with autism, who tended to rigidly adhere to the same underlying pattern rules, used a restricted range of available materials, and did not generate original patterns.

The task procedure was based on Frith (1972), with some minor procedural and scoring modifications. Participants are provided with four stamps of different shapes and colours, and a piece of paper with a line of 16 boxes on it. They are asked to make whatever pattern they like with the stamps, putting one stamp in each box. There are eight trials, four using only two of the stamps and four using all four stamps. If the child does not use all the stamps available during the first eight boxes of a trial (i.e., uses only one stamp on the two-stamp trials or fewer than four stamps on the four-stamp trials), at that point s/he is reminded that there are more stamps available. The two-stamp and four-stamp trials are presented alternately. The trials are divided into two blocks, separated by at least half an hour, with each block consisting of four trials (two trials with two stamps, and two trials with four stamps).

Four types of scores are calculated for each trial:

1. Complexity. Rules are defined as consistently recurring sub-units of a fixed number of elements (e.g., the pattern red/green/red/green etc. has an underlying alternation rule as two elements are repeated over and over again; the pattern red/red/red/red etc. has an underlying rule to repeat a single element; and the pattern red/green/black/blue/red/green/black/blue etc. has the underlying rule to repeat a group of four elements). Ratings of complexity are based on the number of elements contained in a sub-unit, using the following scale:
   i) repetitions of single elements are given the lowest rating of 1
   ii) repetitions of two single elements (i.e., alternations) are given a rating of 2
   iii) repetitions of three or four single elements are given a rating of 3
   iv) on two-stamp trials, if two stamps are used in the sequence, but the pattern consists of more than just an alternation (e.g., red/red/green/red/green/green), a rating of 3 is given
   v) on four-stamp trials, if three or four stamps are used but the pattern consists of more than just cycling through the three or four elements (e.g., red/red/green/black/black/blue/red/green/green/black/blue/blue), a rating of 4 is given.
   In a case where a single rule cannot account for the whole sequence of 16 items, but only for part of it, the rule must account for at least one half of the sequence in order to receive its score. If this criterion is not met, the pattern is considered unidentifiable and given a rating of 1.
2. Rule adherence. All sequences which can be completely accounted for by a single rule (i.e., a repeated sub-unit or a mirror-reversed pattern) are given a score of 1. All sequences which are irregular in any way, including those with predominant or unidentified rules, are given a score of 0.
3. Restriction. In the four trials where four stamps are used to build a pattern, a score of 1 is given (for each trial) if the child uses fewer than the four stamps available. A score of 1 is also given if the child uses only one stamp on two-stamp trials.
4. Originality. Any sequence that occurs only once in all of the trials is considered “original” and given a score of 1. This score is only given if the original sequence follows an identifiable pattern. If the “original” sequence is random, it scores 0.

Scores are summed across the eight trials to produce overall complexity, rule adherence, restriction and originality scores for each participant.

3.5 Behavioural measures

Autistic symptomatology includes impairments in social interaction and communication, and repetitive behaviours. Social and communication behaviours were considered together in the current research, for reasons described below in Section 3.5.2.2. Thorough measurement of repetitive behaviours was emphasised, as discussed in the introduction to Study One (see Section 4.1.1 in Chapter 4).

3.5.1 Measures of repetitive behaviour

3.5.1.1 Repetitive Behaviours Questionnaire (RBQ)

The RBQ was developed by the author as a screening measure to be completed prior to administration of the Repetitive Behaviours Interview (RBI; see Section 3.5.1.2). The RBQ covers the same repetitive behaviours as the RBI, but the questions are answered in a yes/no questionnaire format. The caregiver of the individual completed the questionnaire. S/he was asked to tick “yes” if his/her child had ever shown the behaviour in question, regardless of its frequency and of whether it occurred recently or in the past. Any questions that were ticked “yes” were then asked again verbally, with follow-up questions, in the RBI, but questions which were ticked “no” were not repeated within the RBI. The purpose of this structure of administration was mainly to conserve time, given the time-consuming nature of the test protocol and other interviews.

3.5.1.2 Repetitive Behaviours Interview (RBI; Turner, 1996)

The RBI was developed by Michelle Turner as part of her PhD thesis.
Neither the full RBI nor a thorough description of it has been published, so it is described in detail here, and the current version is contained in full in Appendix A. It was designed to measure the presence and severity of a large range of repetitive behaviours, including those typically displayed by individuals with autism as well as those characteristic of other clinical groups. Turner’s version of the RBI consists of 59 questions covering 10 categories of repetitive behaviour: stereotyped manipulation of objects, object attachments, stereotyped movements, tic-like behaviours, self-injurious behaviour, obsessive-compulsive behaviours, insistence on sameness of environment, rigid adherence to routines and rituals, repetitive use of language, and circumscribed interests. Each interview question asks whether or not the caregiver’s child displays a particular type of behaviour, and includes specific examples of behaviours covered by the question, so that caregivers are clear about the type of behaviour being targeted and forgetting is minimised. The interview assesses whether or not the target behaviour is displayed currently (once a week or more over the last three months), as well as whether it had ever been displayed previously. These Recent and Lifetime behaviours are rated separately. In the current research, only the Recent behaviours ratings were used, because relationships between current cognitive and behavioural functioning were the central concern.

Scoring procedures. For the classes of behaviour which occur in discrete episodes (i.e., stereotyped manipulation of objects, stereotyped movements, tic-like behaviours, self-injurious behaviour, and repetitive use of language), information on the frequency of the behaviour is coded using an 8-point scale.
The codes, which refer to how often each episode of the behaviour occurs, range from (0) “never” to (7) “almost constantly”, with intermediate codes referring to the number of episodes occurring per week and per day (ranging from 1-2 times per week to more than 30 times per day). Information on the duration of each episode is also included because individuals may show a particular repetitive behaviour infrequently, but engage in it for a long period of time; frequency data alone could therefore be misleading. Duration information is coded using a 5-point scale ranging from (0) “less than one minute” to (5) “30 minutes or longer”. Caregivers are not given a list of the frequency and duration codes, in order to prevent response bias. However, if any response is unclear, the caregiver is asked for the number of times per day the behaviour is shown, or asked to choose between two of the duration codes. In Turner’s version of the RBI, the circumstances which commonly lead to each of the discrete-episode type behaviours are also coded in one of eight categories, including “at no specific time or situation”, “when anxious or tense”, and so forth.

For other “steady-state” behaviours which do not occur in discrete episodes and cannot be coded in terms of frequency and duration (i.e., object attachments, obsessive-compulsive behaviours, insistence on sameness of environment, and rigid adherence to routines and rituals), the severity of the behaviour is coded using a simpler 3-point scale. In general, a code of (0) is used to indicate the absence of the target behaviour (or at least, lack of abnormal levels of the target behaviour), (1) denotes mild inflexibility or mild-moderate behavioural severity, and (2) indicates marked inflexibility or extreme severity. Codes of (1) and (2) are specifically operationalised for each question.
Because the nature of some of the behaviours is such that they are shown to some degree in a normal population (e.g., having regular routines, favourite items and so on), severity is often gauged by the impact of the behaviour on the rest of the family. Each of the sections on “steady-state” behaviours is followed by a series of questions about how the child would react if s/he was prevented from indulging in each behaviour that has been rated. The two questions on circumscribed interests are structured slightly differently, being rated in terms of the usual or unusual nature of the interest, the degree of obsessionality with the interest, the typical or atypical manifestation of the interest, and the degree to which it prevents the individual from pursuing other interests.

During interview administration, care is taken to ensure that the same behaviour is not coded twice, under different questions. In cases where the same behaviour arises twice or seems to fit under two different categories, the behaviour is coded according to the most notable feature of the behaviour. For example, if a caregiver reports that their child continuously kicks around a ball while walking around the house, this behaviour would be coded under the object manipulation question, rather than the repetitive pacing question. Similarly, a behaviour is not always coded under the question which elicits its description by the informant, if it fits more appropriately under another question. If a child shows two distinct behaviours which both fall under one question, both are recorded and scored.

Differences in the current version. The version of the RBI used in the current research differed from Turner’s in several ways. Firstly, only the questions which were ticked “yes” on the RBQ were asked within the RBI. Secondly, the questions regarding the circumstances which commonly lead to the display of the discrete-episode type behaviours were not included.
This was because it was found during initial testing that parents found these questions quite difficult to answer clearly, and also because it was felt that data gleaned from these questions were not essential for the current study. Thirdly, the questions about how the child would react if s/he was prevented from indulging in each of the “steady-state” behaviours were not asked either, for similar reasons. Finally, the section on compulsive behaviours was expanded from two to five questions, covering a larger range of behaviours. As a result of the latter three modifications, the current version of the RBI includes 52 rather than 59 questions. None of the questions from the original RBI about the presence and severity of the repetitive behaviours themselves were removed or changed in the current version.

Summary variables used in statistical analyses. The main measures derived from the current version of the RBI were the presence of behaviour and severity summary scores for each behavioural category. The presence of behaviour summary score was calculated by assigning a score of 1 for each question that received a frequency rating above 0 (or severity rating above 0 for the “steady-state” behaviours), and then calculating a sum of scores for all questions in the category. The severity summary scores were slightly more complex. For the discrete-episode behaviours, Turner simply used the frequency codes and did not use the duration codes in her analyses. In the current research, it was decided that a severity score which included both frequency and duration information would be a more accurate reflection of the time spent on each behaviour. For each possible combination of frequency and duration codes, the maximum number of minutes per week spent doing the behaviour was calculated. For example, for a behaviour coded (2) “3-6 times per week” for frequency and (3) “4-9 minutes” for duration, the maximum number of minutes per week would be 54 (6 x 9).
Because there was a very large range in the number of minutes per week possible (0 to 10080), each combination of codes was then ranked in severity, such that the lowest number of minutes per week was given a score of 0 and the highest was given a score of 32. Thus, each of the possible combinations of frequency and duration codes corresponded with a score between 0 and 32 inclusive. Each discrete-episode behaviour rated on the RBI was therefore given a severity score of between 0 and 32, and the severity summary score for each behavioural category consisted of the sum of the severity scores for each behaviour in that category. For the “steady-state” behaviours, the severity scores for each behaviour were simply the same as the 0, 1 or 2 rating assigned during interview, with the severity summary score being the sum of these scores across the behaviours in each category. The severity summary scores for all behavioural categories were converted to t scores (with a mean of 50 and standard deviation of 10) using the grand mean and standard deviation across the autism and control groups, thereby enabling comparisons across different categories while controlling for the fact that the number of items and range of scores is variable across categories⁶. To reduce the number of statistical comparisons required in analyses examining the relationship between cognitive functioning and repetitive behaviours, Turner further collapsed the severity summary scores for each behavioural category into four composite variables: Repetitive Movements, Sameness Behaviour, Repetitive Language, and Circumscribed Interests. The same composite variables were used in this research, with the addition of a Compulsive Behaviours variable (due to the addition of items in this category in the current version).
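The scoring steps described above (maximum minutes per week, ranking of code combinations, and conversion to t scores) can be illustrated with a minimal Python sketch. It is not the scoring program used in this research. The function names are hypothetical, the use of dense ranking (tied combinations sharing a score) is an assumption, and so is the use of the sample standard deviation. Only the upper bounds of 6 episodes per week and 9 minutes per episode are taken from the worked example in the text.

```python
from statistics import mean, stdev

MINUTES_PER_WEEK = 7 * 24 * 60  # 10080, the ceiling quoted in the text


def max_minutes_per_week(max_episodes_per_week, max_minutes_per_episode):
    """Upper bound on weekly time spent on a behaviour, capped at one week."""
    return min(max_episodes_per_week * max_minutes_per_episode, MINUTES_PER_WEEK)


def dense_rank(values):
    """Rank each value among the distinct values (lowest = 0).

    ASSUMPTION: tied values share a rank; the text does not say how ties
    between code combinations were handled.
    """
    order = {v: i for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]


def t_scores(raw_scores):
    """Convert raw scores to t scores (mean 50, SD 10) using the grand mean
    and SD of all scores supplied (i.e., both groups pooled).

    ASSUMPTION: the sample (n - 1) standard deviation is used.
    """
    m, s = mean(raw_scores), stdev(raw_scores)
    return [50 + 10 * (x - m) / s for x in raw_scores]


# Worked example from the text: "3-6 times per week" and "4-9 minutes".
example_minutes = max_minutes_per_week(6, 9)
```

In this sketch, `example_minutes` reproduces the 54 minutes per week given in the text. Applying `dense_rank` to the full set of frequency-by-duration combinations would yield the 0 to 32 severity scores only if the actual code boundaries, which are not reproduced here, were supplied.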
The Repetitive Movements composite score was the sum of severity summary scores for the stereotyped manipulation of objects, stereotyped movements, tic-like behaviours, and self-injurious behaviours categories. The Sameness Behaviour composite score included the severity summary scores for insistence on sameness of environment, rigid adherence to routines and rituals, and object attachments. The Compulsive Behaviours, Repetitive Language, and Circumscribed Interests composite scores were simply the severity summary scores for those categories.

⁶ From this point onwards, the term “severity summary score” will mean the t score.

Reliability and validity. In her PhD, Turner reports test-retest and inter-rater reliability data for her version of the RBI. In terms of test-retest reliability, she reports an average of 96% agreement across two administrations with regard to the simple presence or absence of each behaviour. The agreement was 83% for the frequency and duration codes for the discrete-episode behaviours, and 92% for the severity codes for the “steady-state” behaviours. Inter-rater reliability was very good, at a mean of 99.5% agreement for the frequency and duration codes, with a corresponding mean Kappa value of .99. For the severity codes, there was a mean agreement of 91%, with a Kappa value of 0.87. Turner (1996) did not explicitly examine the validity of the RBI in her thesis. However, in the current studies, a high correlation between the Repetitive/Restricted Behaviours domain of the ADI-R and an overall sum of severity summary scores across categories on the RBI, which was conducted across all groups in Studies 1 and 2, r = .73, p < .001, suggested good construct validity. The underlying factor structure of the RBI was also examined in Study 1, the results of which are reported in Section 4.3.5.1 of Chapter 4.
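For readers unfamiliar with the agreement statistics quoted above, the following Python sketch shows how percentage agreement and a kappa coefficient can be computed for two raters. It assumes the unweighted Cohen's kappa; the source does not state which variant Turner used. The function names and the ratings are invented for illustration only.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Percentage of items on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per code.
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Invented severity codes (0, 1 or 2) from two hypothetical raters.
ratings_a = [0, 1, 2, 2, 1, 0, 2, 1]
ratings_b = [0, 1, 2, 1, 1, 0, 2, 1]
agreement = percent_agreement(ratings_a, ratings_b)
kappa = cohens_kappa(ratings_a, ratings_b)
```

With these invented ratings, the raters agree on 7 of 8 items, and kappa is lower than the raw agreement because some matches would be expected by chance.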
3.5.2 Measures of social behaviour and communication

3.5.2.1 Social Behaviour Questionnaire (SBQ; Skuse et al., 1997)

The SBQ is a 12-item questionnaire completed by the individual’s caregiver, which was originally devised for use with a sample of individuals with Turner’s syndrome (Skuse et al., 1997). It includes 12 statements primarily relating to the child’s everyday social awareness and behavioural appropriateness; for example, “not aware of other people’s feelings”, “does not pick up on body language”, and “does not understand how to behave when out, e.g., in shops or other people’s houses”. These statements are rated as 0 (not at all true), 1 (quite or sometimes true), or 2 (very or often true). Scores therefore range from 0 to 24. The questionnaire demonstrates good internal consistency, test-retest reliability, and validity (Skuse et al., 1997).

3.5.2.2 Social and communication ADI-R domains

As the SBQ is a brief, limited measure of social functioning, questions in the social domain of the ADI-R which related to current functioning were selected and scores summed to form an additional measure of social behaviours. Similarly, scores on questions in the communication domain relating to current functioning were also summed as a measure of communicative ability. Only questions relating to current functioning were used (rather than all the questions usually used to calculate the traditional algorithm for social behaviours and communication) because relationships with current cognitive capacity were of central interest, as well as for the sake of comparability with the RBI, from which measures of current behaviour only were taken. These two ADI-R summary scores of current social behaviours and communication correlated quite highly, r = .77, p < .001.
A factor analysis conducted with the two ADI-R summary scores and the SBQ score also demonstrated that all three measures loaded on the same factor (the results of this factor analysis are described more fully in Section 4.3.5.2 of the following chapter). It was therefore decided to create a composite score of all three measures of social/communicative ability (i.e., the SBQ and the current social and communication scores from the ADI-R). This was achieved by conducting a factor analysis deriving factor scores for each participant using a regression equation.

CHAPTER 4
Study One: Profile, Primacy, and Independence of Theory of Mind and Executive Function Impairments in Autism Spectrum Disorders

4.1 Introduction
4.1.1 Aims
4.1.2 Hypotheses
4.2 Method
4.2.1 Participants
4.2.2 Procedure
4.3 Results
4.3.1 Data screening
4.3.2 Group comparisons on ToM and EF tasks
4.3.2.1 False belief tasks
4.3.2.2 Dewey Stories
4.3.2.3 Tower of London
4.3.2.4 IDED set-shifting task
4.3.2.5 Response Inhibition and Load task
4.3.2.6 Opposite Worlds task
4.3.2.7 Relational Complexity
4.3.2.8 Pattern Meanings
4.3.2.9 Uses of Objects
4.3.2.10 Stamps task
4.3.2.11 Summary and effect sizes of group comparisons
4.3.3 Universality of ToM and EF deficits
4.3.4 Ability of ToM and EF variables to predict group membership
4.3.5 Behavioural measures: Group comparisons and derivation of indices used in correlational analyses
4.3.5.1 Repetitive Behaviours Interview
4.3.5.2 Social and communicative functioning
4.3.6 Correlations between ToM/EF and behavioural measures
4.3.7 Relationship between ToM and EF
4.3.7.1 Correlations between ToM and EF
4.3.7.2 Dissociations between ToM and EF
4.4 Discussion
4.4.1 Profile of ToM and EF deficits
4.4.2 Primacy of ToM and EF deficits
4.4.3 Independence of ToM and EF deficits
4.4.4 Towards a “multiple primary deficits” model of ToM and EF in ASDs

4.1 Introduction

4.1.1 Aims

Chapter 2’s literature review revealed that individuals with autism consistently
display both ToM and EF deficits, but that the primacy and independence of these two impairments remain a matter of current debate. The first of the two studies contained in this thesis was principally aimed at elucidating the profile, primacy, and independence of ToM and EF deficits in children with ASDs, with the broader aim of clarifying the structure of the cognitive level of explanation in a causal model of autism. Thus, the three central aims of Study One were to determine i) the specific profile of ToM and EF deficits which characterises ASDs; ii) whether impairments in ToM and/or EF can adequately meet the criteria for a primary cognitive deficit in ASDs, and which appears to be the most primary; and iii) whether or not ToM and EF impairments are related in ASDs, and if so, what the nature of that relationship might be (i.e., which theory of the ToM-EF relationship is best supported by the data). The remainder of this section describes how these aims were addressed in the current study. i) Aim 1: Determining the profile of ToM and EF impairments. The specific profile of ToM and EF impairments in ASDs was examined by comparing the performance of individuals with ASDs with control participants matched on age and non-verbal ability on a range of ToM and EF tasks. In particular, emphasis was given to the precise measurement of a range of EF components. As described in Section 2.2.3 of Chapter 2, previous studies of EF in autism have been weakened by the use of tasks which are often impure and/or require non-verbal responses only (which may advantage individuals with ASDs), and which do not cover the full range of EF components. 
This study sought to address those weaknesses, not only in order to provide an accurate map of the cognitive profile typical of ASDs, but also to help determine whether that profile may be unique to autism (as, for example, the presence of inhibition deficits would be inconsistent with the unique EF profile proposed by Ozonoff and colleagues; see Section 2.2.3) and how each component may relate to ToM ability (discussed further below). As described in Chapter 3, planning, set-shifting, inhibition, working memory, relational reasoning, and generativity components were all measured. Both verbal and non-verbal tests were used where possible. A task involving both inhibitory and working memory demands was included, following suggestions that only tasks combining both components are i) impaired in autism and ii) related to ToM. A test of relational reasoning was included in order to examine Halford’s (1993) notion that the capacity to integrate multiple relations may be a key ability underlying false belief understanding (this represents a “common conceptual requirements” account of the ToM-EF relationship, as described below in hypothesis 3). Generativity was also assessed in detail, in response to indications that generativity deficits may hold strong explanatory value in terms of the symptoms of autism (e.g., Jarrold et al., 1996; Turner, 1997).

ii) Aim 2: Determining the primacy of ToM and EF impairments. As we have seen, common criteria used to judge the primacy of a cognitive deficit to a disorder are its i) universality in individuals with the disorder, ii) uniqueness to the disorder, iii) causal precedence or ability to account for the earliest symptoms of the disorder, and iv) explanatory value or ability to account for the whole range of symptoms displayed by individuals with the disorder.
In this study, all of these criteria were tested in some way for both the ToM and EF hypotheses of autism except for the third criterion of causal precedence, as children below the age of 5 were not included in the sample. The main reason for this was that it was important to test the range of EF components, using both verbal and non-verbal response modes if possible, which is difficult for a young sample both because tests in some EF domains (e.g., generativity) are not yet available for this age group and because the limited verbal abilities of young children constrain the tests appropriate for use¹. The criterion of universality was addressed in this study by calculating the proportion of participants with ASDs displaying an impairment on the variable in question, with “impairment” defined as a score worse than one standard deviation from the control mean – a stricter cutoff for impairment than the lenient criterion of any score below the control mean, which was used by Ozonoff et al. (1991). The uniqueness criterion was tested indirectly, by analysing which ToM and EF variables best predicted membership of the ASD group (this methodology was also used by Ozonoff et al., 1991, to assess uniqueness). As individuals from other clinical groups were not assessed, with the exception of a few children in the control group with mild intellectual handicaps, this test of uniqueness should be considered as addressing whether the ToM and EF deficits displayed were unique to individuals with ASDs compared with individuals of equivalent age and non-verbal ability (rather than being unique to ASDs compared with all other clinical conditions).

¹ Some 4-year-old autistic and control children were tested as part of the WAFSASD (see Method – Section 4.2) and it was found that many were unable to adequately comprehend or perform several of the EF tasks.
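The universality calculation described above can be sketched in Python as follows. This is an illustration only: the function names and the `higher_is_better` parameter are hypothetical, the data are invented, and the sample standard deviation of the control group is assumed.

```python
from statistics import mean, stdev


def is_impaired(score, control_scores, higher_is_better=True):
    """True if the score is worse than 1 SD from the control mean.

    ASSUMPTION: the control group's sample (n - 1) SD defines the cutoff.
    """
    control_mean = mean(control_scores)
    control_sd = stdev(control_scores)
    if higher_is_better:
        return score < control_mean - control_sd
    return score > control_mean + control_sd


def universality(asd_scores, control_scores, higher_is_better=True):
    """Proportion of ASD participants classed as impaired on one variable."""
    flags = [is_impaired(s, control_scores, higher_is_better) for s in asd_scores]
    return sum(flags) / len(flags)


# Invented scores: control mean = 10, SD = 2, so the cutoff is a score below 8.
proportion_impaired = universality([5, 7, 8, 9, 11], [8, 10, 12])
```

Under the more lenient criterion attributed above to Ozonoff et al. (1991), any score below the control mean would count as impaired, which would classify 4 of the 5 invented ASD scores as impaired rather than 2.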
Explanatory value was measured by calculating correlations between ToM/EF variables and behavioural measures of autistic symptomatology (i.e., social/communicative functioning and repetitive behaviours). A particular emphasis was placed on a thorough examination of each cognitive impairment’s relationship with repetitive behaviours and restricted interests, in comparison with a briefer assessment of social and communicative functioning. Although this emphasis was not strictly necessary for the exploration of explanatory value, it was considered important because this third aspect of the autistic triad has been one of the main grounds for discriminating the ToM and EF hypotheses (that is, both ToM and EF capabilities show relationships with and appear able to explain social and communicative impairment, but the ToM hypothesis does not account well for repetitive behaviours). In addition, these non-social aspects of autistic symptomatology have been largely neglected in previous research, with only one published study directly addressing the relationship between ToM/EF and repetitive behaviours (Turner, 1997). iii) Aim 3: Determining the independence of ToM and EF impairments. The nature of the relationship between ToM and EF in children with ASDs was investigated by comparing the pattern of correlations between ToM and EF variables in the ASD participants with similar correlations in the control group. Thus, the presence of significant correlations between ToM and EF in individuals with ASDs would be suggestive of an underlying relationship, and the pattern of correlations would show which EF components may be important for ToM performance or development. The incidence and direction of dissociations between ToM and EF deficits in the ASD group were also examined, by calculating the proportion of ASD participants with impaired EF who displayed intact ToM, and vice versa (with impairment defined in the same way as for the universality calculations). 
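The dissociation analysis described above amounts to two conditional proportions, which the following Python sketch makes explicit. The function name, dictionary keys, and impairment flags are hypothetical; the flags are assumed to have been derived using the same one standard deviation cutoff as the universality calculations.

```python
def dissociation_rates(tom_impaired, ef_impaired):
    """Proportions of dissociations in each direction.

    Both arguments are parallel lists of booleans, one entry per ASD
    participant (True = impaired on that domain).
    """
    pairs = list(zip(tom_impaired, ef_impaired))
    n_ef = sum(ef for _, ef in pairs)
    n_tom = sum(tom for tom, _ in pairs)
    ef_without_tom = sum(ef and not tom for tom, ef in pairs)
    tom_without_ef = sum(tom and not ef for tom, ef in pairs)
    return {
        # Among participants with impaired EF, proportion with intact ToM.
        "intact_tom_given_impaired_ef": ef_without_tom / n_ef if n_ef else None,
        # Among participants with impaired ToM, proportion with intact EF.
        "intact_ef_given_impaired_tom": tom_without_ef / n_tom if n_tom else None,
    }


# Invented flags for five hypothetical participants.
rates = dissociation_rates(
    tom_impaired=[True, True, False, False, True],
    ef_impaired=[True, False, True, True, False],
)
```

A proportion near zero in one direction only would be the pattern expected if one ability were a prerequisite for the other, whereas substantial proportions in both directions would indicate a double dissociation.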
This allowed assessment of whether one ability appeared to be a prerequisite for the other (or whether one impairment ever occurred without the other), which is relevant for the question of primacy as well as helping to discriminate between the different theories of the ToM-EF relationship – in particular, the two emergence accounts (see Section 2.3 in Chapter 2).

4.1.2 Hypotheses

Predictions for the profile of deficits. It was expected that both ToM and EF deficits would be found in our sample of individuals with ASDs, with poorer performance expected on higher-level ToM measures. In terms of the specific profile of EF deficits, based on the outcomes of previous research it was predicted that ASD participants would show impairments in planning, set-shifting, and generativity, but not inhibition or working memory. However, consistent with Russell’s (1997b) proposal, it was hypothesised that ASD participants may show impairments on the task combining inhibition and working memory requirements. It was expected that in domains where both verbal and non-verbal measures were used, individuals with ASDs would be more likely to show impairments on verbal tasks. No specific predictions were made with regard to performance on the relational reasoning task, which has not been used previously with individuals with autism; however, given previous findings of intact working memory in ASDs, it was thought possible that this domain may also be intact (as it tests Halford’s (1993) notion of working memory).

Predictions for the primacy and independence of deficits. In considering the possible outcomes of analyses of the primacy and independence of ToM and EF deficits in individuals with ASDs, a number of different hypotheses are conceivable, all of which hold different implications for theories of the primary cognitive deficit(s) of autism as well as theories of the ToM-EF relationship. These hypotheses include the following:

1.
There is only a single, primary deficit in ASDs, with no secondary impairments. This hypothesis would be supported if only ToM or only EF impairments are displayed by the ASD group. This possibility is not likely given fairly consistent evidence that both ToM and EF impairments are present in children with ASDs.

2. ToM and EF impairments are related in ASDs such that one deficit is primary and either causes or explains the other, which is secondary. This possibility could be consistent with expression, emergence, and common neuroanatomical bases accounts of the ToM-EF relationship. For example:

i) If EF deficits are primary and cause a ToM deficit because of performance-based factors (i.e., the expression account), this would be revealed in a pattern of results showing EF deficits as more primary, significant correlations between ToM and certain EF components (most likely inhibition and working memory), and no or few dissociations such that those EF components are impaired but ToM is intact².

² Dissociations in the other direction would also be unlikely as EF should not be intact in individuals with ASDs if EF deficits are primary.

ii) If a ToM impairment is primary and causes a secondary EF deficit because of functional dependence during development (i.e., Perner’s emergence account), this would be reflected in a pattern of results demonstrating a ToM deficit meeting criteria for primacy, significant correlations between ToM and EF, and no dissociations in ASD participants such that ToM is impaired but EF is intact (dissociations in the other direction would also be inconsistent with Perner’s theory, as discussed in Section 2.3.1.3, and unlikely as per footnote 2).
iii) If an EF (or ToM) deficit is primary and a secondary ToM (or EF) deficit is a consequence of its neuroanatomical proximity, then one would expect to see evidence of the primacy of EF but not ToM (or ToM but not EF), and correlations between ToM and EF, but dissociations would be acceptable such that EF (or ToM - i.e., the primary domain) is impaired and performance in the other domain is intact.

3. ToM and EF impairments are related, but neither is primary; there is a third deficit which is primary and causes both deficits. This result would not be supportive of either the ToM hypothesis or the EF hypothesis of autism. It would be consistent with a “common conceptual requirements” account of the ToM-EF relationship. This hypothesis would be reflected by results showing neither ToM nor EF deficits adequately meeting the criteria for primacy (as while they would be caused by the primary deficit, there would not be as direct a relationship with symptoms), significant correlations between ToM and EF variables, and no or few dissociations in either direction (at least on tasks with the common conceptual basis).

4. ToM and EF impairments are independent in ASDs, but only one is primary. In this hypothesis, the most likely explanation for the co-occurrence of the non-primary impairment would be its neuroanatomical proximity to the primary impairment, but unlike version iii) of hypothesis 2, the two deficits are not correlated. This lack of correlation despite neuroanatomical proximity would suggest something unusual about the ToM-EF relationship in ASDs as compared with typically developing children. Results would show primacy of one of the deficits but not the other, and no correlations between ToM and EF deficits. Dissociations would be allowable such that performance in the primary domain is impaired but the second impairment does not always occur.

5. ToM and EF impairments are independent in ASDs and both are equally primary.
Like hypothesis 4, the co-occurrence of impairments would be most likely explained by their neuroanatomical proximity, but unlike hypothesis 4, both are primary. Results would be expected to demonstrate that both ToM and EF deficits meet criteria for primacy, but there would be few significant or strong correlations (as although ToM and EF deficits would have to co-occur in the large majority of ASD participants, they may not necessarily co-vary in severity). Dissociations would not be expected to occur if all criteria for primacy were met by both impairments, as both primary deficits would have to be impaired in each individual with an ASD. This is a somewhat unlikely outcome as it is improbable that two independent deficits would both show complete explanatory value for the full range of symptoms.

6. ToM and EF impairments are independent in ASDs, and neither meets all criteria for primacy. This represents a more classic “multiple primary deficits” model of ASDs, in which the deficits both hold causal importance but neither is universal nor able to account for the full range of symptoms. Again, the co-occurrence of independent deficits is most likely to be caused by common neurobiological substrates. This hypothesis is consistent with at least three different scenarios regarding cognitive deficits in ASDs, for example:

i) There may be different subgroups of individuals with different primary deficits (these subgroups may be classified according to level of intellectual functioning or symptom severity, for example – see Section 1.2 in Chapter 1). In this case, neither ToM nor EF would be universal among the whole sample, but both may hold good explanatory value within the relevant subgroup (correlations with symptoms across the whole sample may not be strong, however).
Dissociations in both directions would be expected, such that one subgroup could show intact ToM but impaired EF, and another subgroup could show the opposite pattern (there may also be a third group where both abilities are impaired).

ii) If autism is considered to be a multidimensional spectrum, for example if a ToM deficit was the basis of one aspect of symptomatology (e.g., social impairment) and EF deficits were the basis for another aspect (e.g., repetitive behaviours), then neither deficit would be likely to be universal (if the sample was heterogeneous and not all individuals showed all symptoms), and each deficit would only hold explanatory value for the relevant symptom domain. ToM-EF dissociations in either direction may occur in individuals who do not display all aspects of symptomatology.

iii) There may be a third (or more) cognitive deficit(s), which may be more primary than or at least equally primary as ToM and EF deficits. This may actually also be the case for either of the above two scenarios (i.e., there could be 3 subgroups characterised by different primary deficits, or 3 independent cognitive deficits underlying the three aspects of symptomatology). This third deficit may be related to either ToM or EF deficits, but would not explain them both as in hypothesis 3.

Hence, hypotheses 1-4 all represent different versions of a single primary cognitive deficit model of autism, whereas hypotheses 5 and 6 both represent multiple primary deficits models. In the only previous study to directly address the primacy and independence of ToM and EF deficits in autism in a similar manner to this study, Ozonoff et al. (1991) found most support for version iii) of hypothesis 2. They found that ToM and EF were correlated in autism, but that EF deficits were more primary, as judged by their universality and uniqueness to autism.
Although dissociations were not explicitly examined, they reported that a subset of their ASD sample showed impaired EF but intact ToM. They interpreted this pattern of results as suggesting a neuroanatomical link between ToM and EF deficits in autism, such that they were correlated but the relationship was not causal at a cognitive level. However, as described in Chapter 2, Ozonoff et al.’s (1991) study was weakened by i) its use of impure EF tasks and the employment of an EF composite score which obscured the specific nature of both EF deficits and the ToM-EF relationship; ii) its lenient definition of impairment; and iii) its failure to partial out age from the ToM-EF correlations. In addition, Ozonoff et al. did not examine the presence of ToM-EF dissociations in both directions, the outcome of which is an important discriminator between the six hypotheses outlined above. Because of these weaknesses, and because other research relevant to the primacy and independence of ToM and EF in autism has been equivocal, no strong predictions about which one of the above hypotheses was likely to be supported were made prior to conducting this study. Because of weak plausibility, a low likelihood was placed on hypotheses 1 and 5, and based on previous studies it was also suspected that neither ToM nor EF deficits may fully meet all criteria for primacy. The current study may nevertheless be considered an exploratory but extensive investigation of the primacy and independence of ToM and EF impairments in ASDs. It builds upon Ozonoff et al.’s (1991) original study and other relevant research by i) utilising a range of EF tasks designed to tap separate components of EF, the results of which were analysed separately throughout; ii) adopting a stricter criterion of impairment; and iii) partialling out age, VIQ and PIQ from all significant correlations.
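To illustrate what partialling out a covariate involves, the following Python sketch computes a first-order partial correlation between two variables after controlling for a third (e.g., age). It is a simplified illustration: the study partialled out age, VIQ and PIQ together, which requires a higher-order partial correlation (for instance, by correlating regression residuals). The function names and data are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y with a single covariate z partialled out."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented data in which two task scores are related only through age.
age = [1, 2, 3, 4]
tom_score = [2, 1, 2, 5]
ef_score = [2, -1, 6, 3]
raw_r = pearson_r(tom_score, ef_score)
age_partialled_r = partial_r(tom_score, ef_score, age)
```

In this constructed example the raw correlation is positive but falls to zero once age is partialled out, which is the kind of spurious ToM-EF association that the correction is intended to guard against.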
It also explicitly examines the presence of double dissociations between ToM and EF, and includes investigations of the explanatory value of ToM and EF deficits as an additional measure of primacy (which hold particular importance as a way of discriminating between the 3 scenarios presented in hypothesis 6).

4.2 Method³

4.2.1 Participants

Autism Spectrum Disorders (ASD) Group. There were 48 participants with ASDs ranging in age from 5 to 18 years. Participants in this group were mainly recruited through Western Australian autism centres (specialising in assessment and/or therapy with individuals with ASDs) and support groups, including the Autism Association of Western Australia, Intervention Services for Autism and Developmental Delay, the WA Disability Services Commission, and the Asperger Syndrome Support Group. Participants of a previous study on the genetics of autism conducted through the Centre for Clinical Research in Neuropsychiatry were also invited to participate in the current study. The study was advertised using brochures, features in newsletters, and presentations to professionals and parents. Parents expressed interest by returning a slip via mail to the research team giving consent to be contacted about the study.

³ Both of the studies in this thesis formed part of the Western Australia Family Study of Autism Spectrum Disorders (WAFSASD), a large-scale project funded by a National Health and Medical Research Council grant awarded to chief investigators Joachim Hallmayer, Murray Maybery, and Dorothy Bishop. The rationale and methodology for this thesis were nevertheless developed largely independently from the broader aims of the WAFSASD. The current author selected and developed all of the cognitive measures used in the thesis (as well as the RBI), was principally responsible for administration of these tasks to participants, and chose and conducted all statistical analyses reported.
However, the diagnostic instruments were selected in collaboration with other WAFSASD investigators, and similarly were administered by research assistants for the WAFSASD. In addition, recruitment of families was conducted in collaboration with WAFSASD research assistants. Some of the probands with autism who participated in the WAFSASD were too low-functioning to complete all of the cognitive tasks used in this study, and were not included.

All participants had received a clinical diagnosis of autism (n = 28), Asperger syndrome (n = 13) or PDDNOS (n = 7) from a health professional (e.g., paediatrician, psychiatrist, psychologist). The presence of autistic symptomatology in at least one domain was then verified using the Autism Diagnostic Interview – Revised (ADI-R). Two participants (one with a clinical diagnosis of autism and one with Asperger syndrome) were excluded as they did not exceed cutoff scores in any of the three ADI-R domains (i.e., social interaction, communication, restricted/repetitive behaviour). Of the remaining 46 participants in the ASD group, 34 met criteria in all three domains of the ADI-R, 10 met criteria in two domains, and 2 met criteria in one domain only. Other exclusion criteria were the presence of genetic abnormalities or neurological dysfunction (e.g., head injury, encephalitis, neurofibromatosis, cerebral palsy), with the exception of epilepsy⁴. There were four participants in the ASD group with comorbid diagnoses, as reported by their parents (2 with dyspraxia, 1 with epilepsy, and 1 with dyspraxia and epilepsy).

Control Group. Forty-nine control children ranging in age from 5 to 17 years were recruited to participate in the study. Of these, 46 were typically developing children and 3 had mild intellectual disabilities. The control group was selected to match the ASD group on age and PIQ (reasons for not matching on VIQ are described below).
Recruitment of this group was mainly achieved through Western Australian schools, again through brochures and newsletters mailed to parents. Because of difficulty recruiting sufficient numbers of boys with low PIQ, in some schools all boys whose parents gave consent were tested on IQ measures, and then the parents of those boys with PIQs in the range of 60-95 were contacted and asked if they would like to participate in the larger study. The children with mild intellectual disabilities were recruited through the WA Disability Services Commission (as controls for children in the ASD group with PIQs between 60 and 70). Exclusion criteria were a known or suspected ASD, as well as genetic and neurological abnormalities. Mothers of control participants completed the Autism Screening Questionnaire (ASQ) in order to screen for symptoms of autism in the control group. If participants scored above the cutoff point of 10 on the ASQ, the ADI-R was administered. One participant, who had a mild intellectual disability, met criteria for autism on the ADI-R and was excluded from further analysis, leaving 48 participants in the control group. Two participants in the control group had received clinical diagnoses of ADHD (as reported by their parents).

⁴ Although epilepsy is a neurological illness, because it is a common comorbid condition of autism, it was felt that exclusion of participants with epilepsy may result in a sample which was non-representative of autism.

Demographic characteristics of each group are presented in Table 2. The ASD and control groups were matched on chronological age, t(92) = 1.74, p = .09, and PIQ, t(92) = .92, p > .1. Children in the ASD group had significantly lower VIQs than the control group, t(92) = 3.7, p < .001. Because children with autism typically show a significant discrepancy between VIQ and PIQ, matching groups on Full-Scale IQ or on both VIQ and PIQ was not considered appropriate or possible.
VIQ was therefore included as an additional independent variable in group comparisons, as described in Section 4.3.2. All participants had a PIQ of 60 or above, and a VIQ of 50 or above. The proportion of girls was slightly higher in the control group than in the ASD group, and chi-square analysis revealed that the difference approached significance, χ2 (1, N = 94) = 3.65, p = .06. However, this was not considered to be a problem as analyses conducted to compare the performance of boys and girls in the control group on all cognitive tasks revealed no significant differences. Gender was not introduced as an additional independent variable (IV) in analyses because the number of girls in the ASD group was considered to be too small.

Table 2. Demographic characteristics of the samples

                          ASD group (n = 46)       Control group (n = 48)
Age: Mean (SD, range)     10.73 (3.96, 5-18)       9.49 (2.94, 5-17)
Male: Female              40: 6                    34: 14
PIQ: Mean (SD, range)     96.07 (18.23, 63-138)    99.42 (16.99, 64-137)
VIQ: Mean (SD, range)     91.76 (21.77, 52-150)    106.58 (16.85, 64-138)

The ASD and control groups were also matched in terms of their families’ socioeconomic status. This was assessed using education data from both mothers and fathers, which were coded using the following system: 1 = up to year 10 (or equivalent) of high school; 2 = up to year 12 (or equivalent) of high school; 3 = diploma, trade certificate, apprenticeship, or other traineeship; and 4 = university degree. A chi-square analysis comparing the education levels of ASD and control parents (the analysis included both mothers and fathers) revealed that there was no difference in the education level of the parents of ASD and control children, χ2 (3, N = 150) = 5.71, p > .1. The difference remained non-significant when only the highest code from each family was included in the analysis, χ2 (3, N = 89) = 3.35, p > .1.
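As a reading aid for the power figure reported for this sample, the sketch below approximates the power of a two-group comparison with n = 46 and n = 48 to detect d = .5 at alpha = .05. It is a minimal illustration under stated assumptions, not the calculation used in the thesis: it substitutes a normal approximation for the noncentral t distribution, the function name `approx_power` is hypothetical, and the text does not say whether a one- or two-tailed test was assumed. Under this approximation the one-tailed value comes out close to .78, while the two-tailed value is lower.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n1, n2, alpha=0.05, two_tailed=True):
    """Approximate power of an independent-samples comparison.

    Assumption: a normal approximation replaces the noncentral t
    distribution, so values are slightly optimistic for small samples.
    """
    z = NormalDist()
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = d * sqrt(n1 * n2 / (n1 + n2))
    if two_tailed:
        z_crit = z.inv_cdf(1 - alpha / 2)
        # Probability of rejecting in either tail.
        return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)
    z_crit = z.inv_cdf(1 - alpha)
    return z.cdf(noncentrality - z_crit)


# Group sizes and effect size taken from the text; tails are an assumption.
power_one_tailed = approx_power(0.5, 46, 48, two_tailed=False)
power_two_tailed = approx_power(0.5, 46, 48, two_tailed=True)
print(round(power_one_tailed, 2), round(power_two_tailed, 2))
```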
With an n of 46 in the ASD group and 48 in the control group, the power of the study to detect medium-sized effects (i.e., d = .5) at an alpha level of .05 reached an acceptable level at .78.

4.2.2 Procedure

All questionnaires, parental interviews, and cognitive tasks are described in detail in Chapter 3. Initial screening questions regarding medical history (to assess whether participants met criteria for participation) were asked of the participant’s mother via telephone. Informed consent was obtained from the mother of each participant, on behalf of both herself and her child (direct consent was also obtained from participants over 12 years of age, with the exception of children whose level of understanding of the research was judged to be insufficient to give informed consent). Questionnaires were generally sent to participants’ mothers prior to the first testing session. Tests and parental interviews were usually administered at the participants’ homes, or in testing rooms at the Centre for Clinical Research in Neuropsychiatry. The ADI-R took approximately 2 hours to administer, and the RBI an additional 5-30 minutes, depending on the number of questions asked. The test battery took approximately 2.5 hours in total to administer (see footnote 5). The order of test administration was fixed, except the order of Wechsler subtests differed according to whether the WPPSI-R, WISC-III or WAIS-III was administered (the order of subtest administration specified by each test was retained). Testing was often divided into two sessions, in order to prevent fatigue and distractibility. For practical reasons, when testing was conducted across more than one session, the break was not always at the same point within the battery. Some tests were administered only to participants within a certain age range. The order of testing (not including other tests administered for WAFSASD) and the age range for each test is displayed in Table 3.
Footnote 5: This includes other tests not reported within this thesis but which were conducted as part of the WAFSASD. The IQ, ToM and EF tests took between 1.5 and 2 hours in total.

Table 3. Order of test battery and age range for each test

Test                                                      Age range
1.  WPPSI-R:   i) Object Assembly                         5-6
               ii) Vocabulary
               iii) Picture Completion
               iv) Similarities
    WISC-III:  i) Picture Completion                      7-16
               ii) Similarities
               iii) Vocabulary
               iv) Object Assembly
    WAIS-III:  i) Picture Completion                      17+
               ii) Vocabulary
               iii) Similarities
               iv) Object Assembly
2.  Stamps task – first 4 trials                          5-16
3.  Tower of London                                       All ages
4.  Dewey Stories                                         10+
5.  Simple false belief task                              5-16*
6.  First-order false belief task                         5-16*
7.  Second-order false belief task                        5-16*
8.  IDED set-shifting: Perseveration condition            7+
9.  Response Inhibition and Load task                     7+
10. IDED set-shifting: Learned Irrelevance condition      7+
11. Uses of Objects                                       All ages
12. Relational Complexity                                 All ages
13. Stamps task – second 4 trials                         5-16
14. Pattern Meanings                                      All ages
15. Opposite Worlds                                       7-16

*Not all children within this age range were necessarily administered these tasks. The structure of administration of the false belief tasks is described in Section 3.3 of Chapter 3.

4.3 Results

This section includes subsections covering i) data screening; ii) group comparisons on ToM and EF tasks; iii) analyses addressing the universality of ToM and EF deficits in the ASD group; iv) logistic regression analyses examining the ability of ToM and EF task performance to predict ASD/control group membership; v) group comparisons on and derivation of indices for the behavioural measures; vi) correlations and multiple regressions examining the relationship between ToM/EF and behavioural measures; and vii) analyses examining the relationship between ToM and EF. SPSS (Statistical Package for the Social Sciences) Version 10.0.5 was used for all analyses.

4.3.1 Data screening

Data from all measures were screened for normality and outliers.
For variables with distributions that did not depart substantially from normality, outliers falling more than 3 standard deviations (SD) from the mean of the group (i.e., ASD or control) were trimmed to 3 SD from the mean. Several variables demonstrated highly skewed distributions. Square root, logarithm and inverse transformations were attempted for these variables. If transformation was successful, the transformed variable was used for all analyses (including correlations and regressions). For some variables where a large proportion of participants all gained the same score, transformations were ineffective. For these variables, scores were dichotomised. Again, the dichotomised variable was used in all analyses. Relevant specific details regarding outliers, transformations and dichotomising of scores are included within the results section for each measure.

4.3.2 Group comparisons on ToM and EF tasks

For all group comparisons, the performance of the ASD group as a whole was compared with the control group. For tests administered only to participants within a certain age range (see Table 3), t-tests were conducted to check that the participants available from the ASD and control groups for those tests were still matched on age and PIQ. Results showed that the groups were matched for all tests, with the exception of Dewey Stories. The way in which this was handled is described in Section 4.3.2.2.

To address concerns that the range of symptom severity within the ASD group may affect results (i.e., autism versus other PDD subgroups may display different patterns of results), comparisons were also conducted between participants in the ASD group who exceeded cutoff scores in all three domains of the ADI-R (i.e., met “full criteria”; n = 34) and those who exceeded cutoff scores in only one or two domains (i.e., met “partial criteria”; n = 12). The two subgroups were matched on age and PIQ.
Almost all comparisons on cognitive tasks revealed no significant subgroup differences (see footnote 6). The only task on which the two subgroups showed different patterns of performance was Pattern Meanings, and these results are reported in Section 4.3.2.8. For all other tasks, as there were no significant differences, it was thought appropriate to consider the “full criteria” and “partial criteria” subgroups together as one sample for group comparisons.

Footnote 6: It should be noted that some caution should be exercised in interpreting these non-significant subgroup differences because the power to detect them may not be adequate. However, the lack of subgroup differences is still likely to mean that including the PDD subgroup in the ASD sample did not have any significant effect on the overall result besides increasing the sample size and therefore power of the main analyses.

As described in Section 4.2.1, four participants in the ASD group and two in the control group had a clinical diagnosis other than an ASD (e.g., ADHD, epilepsy, dyspraxia). To check that the presence of a non-ASD diagnosis was not strongly influencing results, group comparisons were conducted excluding these participants from the sample. All significant group differences remained significant, so participants with non-ASD diagnoses were included in all analyses reported.

A consistent approach to group comparisons on each task was followed, involving the following steps:

1. T-tests (or chi-square analyses for dichotomous variables) comparing the performance of the ASD and control groups were conducted for all task variables.

2. Scatterplots between task variables and age, VIQ, and PIQ were examined for any non-linear relationships. No significantly curvilinear relationships were detected. The relationship between age and some task variables was slightly curvilinear, but not to an extent that warranted the use of special analyses.

3. Pearson product-moment correlations (see footnote 7) were conducted between the task variables and age, VIQ, and PIQ.

   Footnote 7: Although some variables were dichotomous, Cohen and Cohen (1983) state that the formula for the point biserial correlation coefficient is computationally equivalent to the formula for the product-moment correlation coefficient. They assert that the difference in formula is of no significance when computer programs are used for data analysis, because whatever formula the program uses will work when variables are scored 0-1. For ease of reporting, r is used throughout the results section to denote both a Pearson product-moment and a point biserial correlation coefficient (which are computed identically anyway).

4. For task variables which were significantly correlated with age and/or PIQ, an analysis of covariance (ANCOVA) was conducted, mainly to assess whether any non-significant group differences became significant when extraneous variance attributable to age and/or PIQ was removed. Miller and Chapman (2001) recommend the use of ANCOVA in this way as a “noise reduction technique”.

5. If task variables were correlated with VIQ, ANCOVA was not considered to be an appropriate technique to examine the effect of VIQ on group comparisons, as the groups were not matched on VIQ. In their review of the use of ANCOVA with nonrandomly assigned groups, Miller and Chapman (2001) argue that ANCOVA cannot be used to “control for” group differences on a covariate, which they state is a highly consistent view in the technical literature.
Essentially, this is because when the covariate and the independent variable (in this case, group) are not independent, the regression adjustment of the independent variable (IV) may remove part of the effect of group or produce a spurious effect of group (see Miller & Chapman, 2001, for further explanation). As an alternative to ANCOVA in this situation, Maxwell and Delaney (1990) suggest the “blocking” of participants on the covariate, and then introducing the “blocked” variable as an additional IV in analyses. This strategy was adopted in the current study. Participants (in the ASD and control groups combined) were divided into three equal groups according to their VIQ score, and then VIQ level was used as an IV in a 2-way ANOVA (with group as the other IV) and the task variable as the dependent variable (DV). In this way, the influence of VIQ on group comparisons was assessed by examining whether i) the main effect of group remained significant when VIQ was controlled for by introducing it as an additional IV, or ii) if any non-significant group differences became significant when the effect of VIQ was separated out. In addition, any group x VIQ interactions would be of interest in examining the possibility that group differences were found for some VIQ levels but not others (i.e., interactions would indicate heterogeneous regression slopes).

6. If dichotomous variables correlated with age and/or IQ variables, the effect of age/IQ was assessed by conducting logistic regression analyses with the dichotomous variable as the outcome variable, and group and age/IQ as the predictor variables. This allowed assessment of the independent contribution of group to the outcome variable minus the variance attributable to age/IQ.

4.3.2.1 False belief tasks

Examination of distributions of scores from all false belief tasks revealed that a large proportion of participants (particularly from the control group) gained perfect scores for both belief and control questions.
All variables were therefore recoded as dichotomous such that a perfect score was coded as 1 and any other score as 0. This fairly strict scoring criterion was considered appropriate for the age and level of ability of both the ASD and control groups, as a more lenient scoring system would have produced ceiling effects. It should be noted, however, that a score of 0 is better interpreted as indicating an “unstable” false belief performance rather than a true failure on the task.

Five participants (four in the ASD group, one in the control group) were not administered the First-order and Second-order false belief tasks due to equipment malfunction. These participants had all passed the Simple false belief task and were therefore assigned the mean value of other participants in their group who had passed the Simple false belief task (which was in turn coded dichotomously according to the criteria described above).

As the false belief tasks were administered to a restricted age range, the overall sample size for all false belief tasks was 89 (n = 43 in the ASD group, n = 46 in the control group). However, the ns for the memory and reality questions, as well as the own belief questions in the Simple false belief task, were limited to those who actually did the task – as these questions were not assumed to be passed or failed according to performance on other false belief tasks, as was the case for the belief questions (see Section 3.3 for a description of the structure of administration of the false belief tasks). Percentages of participants gaining perfect scores (i.e., “perfect scorers”) in each group for each false belief task (on the belief questions only) are presented in Table 4.

i) Simple false belief task. A chi-square analysis revealed that there was no statistically significant difference between the ASD and control groups on the reality questions, χ2 (1, N = 36) = .82, p > .1.
However, on the belief questions referring to the participant’s own previous belief, significantly fewer children in the ASD group were perfect scorers, χ2 (1, N = 36) = 4.42, p < .05, indicating that they were more likely to incorrectly state their own previous beliefs (see footnote 8). On the belief questions referring to others’ beliefs, significantly fewer children in the ASD group were perfect scorers, χ2 (1, N = 89) = 7.25, p < .01, indicating that they were less likely to make accurate predictions about the beliefs of others. Performance on the others’ belief questions was significantly correlated with both age, r = .23, p < .05, and VIQ, r = .37, p < .001. A logistic regression with age, VIQ and group as the predictors showed that according to the Wald criterion, the independent contribution of group to variance in others’ belief questions performance became only marginally significant with age and VIQ partialled out, z = 3.61, p = .06.

Footnote 8: While there were significant group differences on this variable, it was not included in subsequent analyses (e.g., correlations) because the sample size for the variable was considered to be too small.

ii) First-order false belief task. There was no significant difference between the ASD and control groups on the reality questions, χ2 (1, N = 76) = 1.74, p > .1, or memory questions, χ2 (1, N = 76) = .001, p > .1. On the belief questions, a significantly lower proportion of children in the ASD group gained perfect scores, χ2 (1, N = 89) = 11.34, p < .01, indicating that they were less likely to make accurate predictions about others’ false beliefs. Performance on belief questions was significantly correlated with both age, r = .26, p < .05, and VIQ, r = .44, p < .001. In a logistic regression with age, VIQ and group as the predictors, the independent contribution of group remained significant, z = 7.60, p < .01.

iii) Second-order false belief task.
As for the other false belief tasks, there was no significant difference between the ASD and control groups on the reality questions, χ2 (1, N = 68) = 1.34, p > .1, or memory questions, χ2 (1, N = 68) = .29, p > .1. There were significantly fewer perfect scorers in the ASD group on the belief questions, χ2 (1, N = 89) = 4.93, p < .05. Scores on belief questions were significantly correlated with age, r = .28, p < .01, and VIQ, r = .44, p < .001. In a logistic regression with age, VIQ and group as the predictors, the independent contribution of group was no longer significant, z = 2.16, p > .1.

iv) Overall false belief performance indices. An aggregate score was also calculated across the three false belief tasks, for use in other analyses. The sum of correct responses on belief questions (including only others’ belief questions for the simple false belief task) was dichotomised in the same way as for each individual task, such that a perfect score (15/15) was coded as 1 and any other score as 0. Chi-square analysis revealed that there were significantly fewer perfect scorers overall in the ASD group than in the control group, χ2 (1, N = 89) = 8.1, p < .01. The aggregate score was significantly correlated with age, r = .29, p < .01, and VIQ, r = .42, p < .001. Group remained a significant predictor of the aggregate score when age and VIQ were partialled out in a logistic regression, z = 5.96, p < .05.

As the dichotomous scoring system used for the aggregate score was a fairly strict one, a more lenient scoring criterion was also used for an alternative aggregate score. In the alternative system, any participant scoring 13 or more out of 15 (i.e., making either 0, 1, or 2 incorrect responses) was given a score of 1 (“high scorers”), and participants with lower scores were assigned a 0 (“low scorers”). A chi-square analysis showed that significantly fewer ASD participants were high scorers than control participants, χ2 (1, N = 89) = 6.25, p < .05.
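The two scoring rules for the aggregate score can be stated compactly. The sketch below is illustrative only: the cutoffs (15 and 13 out of 15) come from the text, while the function and constant names and the example totals are hypothetical and are not data from the study.

```python
STRICT_CUTOFF = 15   # perfect score on all 15 belief questions
LENIENT_CUTOFF = 13  # at most two incorrect responses


def dichotomise(total_correct, cutoff):
    """Code a total score as 1 if it reaches the cutoff, otherwise 0."""
    return 1 if total_correct >= cutoff else 0


# Hypothetical aggregate totals for four participants (not study data).
totals = [15, 14, 13, 12]
strict_codes = [dichotomise(t, STRICT_CUTOFF) for t in totals]
lenient_codes = [dichotomise(t, LENIENT_CUTOFF) for t in totals]
print(strict_codes, lenient_codes)
```

The lenient rule reclassifies participants who made one or two errors as “high scorers”, which is why the group percentages for the alternative aggregate are higher than for the strict aggregate.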
The alternative aggregate score correlated significantly with age, r = .26, p < .05, PIQ, r = .24, p < .05, and VIQ, r = .50, p < .001. When a logistic regression was conducted with these age and IQ variables and group as predictors, the effect of group became only marginally significant, z = 2.81, p = .09.

Table 4. False belief task results: Percentage of participants in each group with perfect scores [or high scores in the case of the alternative aggregate score] on belief questions, and significance of group comparisons

                               ASD group    Control group    p     p with age/IQ control
Simple false belief:
  Own belief                   55.0         87.5             *
  Others’ belief               72.1         93.5             **    -
First-order false belief       48.8         82.6             **    **
Second-order false belief      51.2         73.9             *     -
Aggregate score                39.5         69.6             **    *
Alternative aggregate          [55.8]       [80.4]           *     -

* p < .05; ** p < .01; *** p < .001; - p > .05.

4.3.2.2 Dewey Stories

With the Dewey Stories task administered to only a small subset of the sample in the older age range, the ASD (n = 17) and control (n = 18) participants who completed the task were not matched on age and PIQ. However, the total score on Dewey Stories was not significantly correlated with either age, r = -.19, p > .1, or PIQ, r = -.30, p = .08, and so the non-matching of groups on these variables was not considered to be important.

The total score variable did not require transformation. A t-test comparing the total scores of the ASD group (M = 7.94, SD = 3.91) and control group (M = 5.89, SD = 2.95) revealed a marginally significant group difference in the expected direction, t(33) = 1.76, p = .09. The total score was significantly correlated with VIQ, r = -.48, p < .01. Because of the small sample size for this task, VIQ was split into two levels rather than three. An ANOVA with group and VIQ level as the IVs showed that group differences in the total score did not remain significant when assessed independently of VIQ, F(1,31) = .29, p > .1.
The group x VIQ level interaction was not significant, F(1,31) = .81, p > .1.

Because the Dewey Stories task was of uncertain validity as a measure of ToM, correlations were conducted between the total score and the false belief variables. Raw correlations were significant for all false belief variables except the simple false belief task; however, when age, PIQ and VIQ were partialled out, there were no longer any significant correlations between false belief variables and the Dewey Stories total score. It therefore appears that the validity of the Dewey Stories task as a measure of mentalising ability is questionable, and it may be better considered as a measure of social awareness and understanding of acceptable social behaviours. Throughout subsequent analyses, the Dewey Stories task is considered a measure of “social cognition”.

4.3.2.3 Tower of London (ToL)

The main performance indices of the ToL were the overall sum of adjusted extra move scores (from here on referred to as the total adjusted extra move score) and the total number of problems completed in the minimum number of moves. These two scores were highly correlated, r = -.96, p < .001, hence only the total adjusted extra move score was used in analyses. The number of rule violations per block administered was also analysed. Because many participants committed no or very few rule violations, this variable was highly skewed and was recoded as a dichotomous variable, with participants making 0-1 violations per block being given a score of 0 (“low rule violators”) and participants making any higher number of violations scored as 1 (“high rule violators”).

Two participants had missing data on the ToL, one from the ASD group and one from the control group, and were not included in analyses (resulting in n = 45 for the ASD group and n = 47 for the control group).
A t-test comparing the total adjusted extra move scores of the ASD group (M = 26.31, SD = 7.78) and control group (M = 22.53, SD = 7.35) revealed that the ASD group made a higher number of extra moves than the control group, t(90) = 2.40, p < .05. A chi-square analysis also showed that significantly more participants in the ASD group (44.4%) than the control group (23.4%) were high rule violators, χ2 (1, N = 92) = 4.56, p < .05.

Both ToL indices were significantly correlated with age (r = -.41, p < .001, for the total adjusted extra move score; r = -.37, p < .001, for rule violations), and VIQ (r = -.39, p < .001, for the total adjusted extra move score; r = -.26, p < .05, for rule violations). An ANCOVA conducted on the total adjusted extra move score, with group and VIQ level as the IVs and age as the covariate, revealed that the group difference remained significant when age and VIQ were controlled, F(1,85) = 4.98, p < .05. The group x VIQ level interaction was not significant, F(2,85) = .48, p > .1. Group also remained a significant predictor of rule violation status (low/high) when age and VIQ were assessed independently in a logistic regression, z = 4.82, p < .05.

4.3.2.4 IDED Set-shifting task

All set-shifting variables were highly skewed, with a large number of participants making no errors or only one error to criterion in each stage. As a result, all variables were recoded such that any error score of 0 or 1 was coded as 0 and any higher number of errors was coded as 1. Because the reversal stages were not crucial to the current study, only the first reversal stage (SDR) in each condition (i.e., Perseveration and Learned Irrelevance) was included in analyses. Only the extra-dimensional shift (EDS) stages were included in subsequent analyses (i.e., correlations etc.) as they were the central variables of interest. The overall N for the task (which had a restricted age range) was 72 (n = 36 in both the ASD and control groups).
Due to computer malfunction, data for the Perseveration condition from one participant in the ASD group were invalid and not included in analyses. The percentage of participants in each group making only 0 or 1 errors (i.e., “low error scorers”) for each stage in each task condition is displayed in Table 5. There were no significant group differences on any variable. However, there was a marginally significant trend for a smaller proportion of participants from the ASD group to be low error scorers on the EDS stage of the Learned Irrelevance condition, χ2 (1, N = 72) = 3.77, p = .052, suggesting that children with ASDs may have found it more difficult to shift their attention to a previously irrelevant stimulus dimension.

No variables were significantly correlated with age or PIQ, and only the IDS stage of the Learned Irrelevance condition was significantly correlated with VIQ, r = -.34, p < .01. The effect of group remained non-significant when a logistic regression on this variable was performed with VIQ and group as predictors.

Table 5. IDED Set-shifting task results: Percentage of low error scorers in each group for each stage of each task condition, and significance of group comparisons

                                 ASD group    Control group    p     p with age/IQ control
Perseveration condition
  SD stage                       60.0         66.7             -
  SDR stage                      52.9         47.1             -
  CD stage                       45.3         54.7             -
  IDS stage                      44.9         55.1             -
  EDS stage                      60.0         72.2             -
Learned Irrelevance condition
  SD stage                       63.9         66.7             -
  SDR stage                      77.8         77.8             -
  CD stage                       63.9         61.1             -
  IDS stage                      77.8         77.8             -
  EDS stage                      13.9         33.3             -     -

* p < .05; ** p < .01; *** p < .001; - p > .05.

4.3.2.5 Response Inhibition and Load (RIL) task

For all RIL task conditions, error variables (i.e., the percentage of errors made) were highly skewed, with many participants making only 0-2% errors.
These variables (with the exception of the percentage of errors made in choosing the most recently displayed shape in Condition 3) were recoded such that 0-2% errors was coded as 0 (a “low error score”), and any higher percentage of errors was coded as 1 (a “high error score”). However, as this precluded the use of a repeated measures ANOVA to compare increments in performance across Conditions 1-3, the main error variables used in analyses for these conditions were:

i) the inhibition error difference score – the difference between the scores for Condition 2 (the inhibition condition) and Condition 1 (the control condition);
ii) the load error difference score – the difference between the scores for Condition 3 (the working memory load condition) and Condition 2; and
iii) the inhibition + load error difference score – the difference between the scores for Conditions 3 and 1.

These difference scores were normally distributed. Four outliers were trimmed: one control participant’s inhibition and inhibition + load error difference scores, another control participant’s load error difference score, and one ASD participant’s inhibition + load error difference score. The distribution of the percentage of errors made in choosing the most recently displayed shape in Condition 3 (or the shape error score, a measure of working memory ability under conditions requiring inhibitory control) was also skewed, but a square root transformation was effective for this variable.

Although the median RT variables for all conditions demonstrated roughly normal distributions, for the sake of consistency, an inhibition RT difference score, load RT difference score, and inhibition + load RT difference score were also calculated (representing the same comparisons between conditions as for the error data). One outlier on the inhibition + load RT difference score from a participant in the ASD group was trimmed. The overall N for the task was 71 (n = 36 in the ASD group, n = 35 in the control group).
Due to computer malfunction, one participant in the ASD group had incomplete data in Condition 3, and so error and RT data from that condition as well as the difference scores involving Condition 3 were not included in analyses. Table 6 displays the mean and SD of each group (and the significance of group comparisons) for the error and RT difference scores, and the shape error score.

There were no significant group differences on t-tests of the error difference scores, although there was a trend for the ASD group to show a larger inhibition + load error difference score, t(68) = 1.72, p = .09. A t-test comparing the shape error scores for Condition 3 revealed that the ASD group made significantly more shape errors, t(68) = 2.03, p < .05, indicating that in an inhibition task with a working memory load, individuals with ASDs were less able to respond accurately on a measure of their working memory ability.

Examination of the error data for each condition separately revealed that there was a significantly lower proportion of low error scorers in the ASD group (27.8%) than the control group (51.4%) in Condition 2, χ2 (1, N = 71) = 4.16, p < .05. However, the inhibition error difference score did not show a significant group difference, probably because there was also a trend for fewer children in the ASD group to be low error scorers in Condition 1, the control condition (47.2% vs. 68.6% in the control group), χ2 (1, N = 71) = 3.31, p = .07. The proportion of low error scorers in the ASD group was also marginally lower for Condition 3 (22.9% vs. 42.9%), χ2 (1, N = 70) = 3.17, p = .08.
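The chi-square values for the separate conditions can be approximately reconstructed from the reported percentages. The sketch below is a minimal check under stated assumptions, not the original SPSS analysis: the cell counts are inferred from the percentages and group sizes (for example, 27.8% of 36 is taken to be 10 participants), the function name `pearson_chi_square_2x2` is hypothetical, and no continuity correction is applied. The text does not say whether a correction was used; the uncorrected statistic happens to match the reported values closely.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Rows: ASD group, control group. Columns: low error scorers, high error scorers.
# Counts are inferred from the reported percentages (an assumption).
condition_2 = pearson_chi_square_2x2(10, 26, 18, 17)  # 10/36 vs 18/35
condition_3 = pearson_chi_square_2x2(8, 27, 15, 20)   # 8/35 vs 15/35
print(round(condition_2, 2), round(condition_3, 2))
```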
Thus, the overall pattern of results for the error scores in Conditions 1-3 suggested that the ASD group tended to make more errors on all tasks, but their performance accuracy was not proportionally worse in task conditions with inhibitory and working memory demands (at least on the inhibitory aspect of the task – note that the ASD group performed significantly worse on the shape error score, an index of their working memory performance when the task contained both inhibitory and working memory demands).

There were no significant differences between groups on any RT difference scores, and no trends were evident. There were no significant RT differences between the groups when Conditions 1-3 were analysed separately. In subsequent analyses, only the error and RT difference scores and the shape error score were used, and separate error and RT data for Conditions 1-3 were not included.

Table 6. RIL task results: Mean (and SD) of each group, and significance of group comparisons, for error and RT difference scores and the shape error score

                               ASD group          Control group      p     p with age/IQ control
Error difference scores:
  Inhibition                   3.43 (6.37)        1.24 (6.20)        -
  Load                         2.14 (7.91)        0.66 (4.66)        -
  Inhibition + load            5.23 (8.41)        1.95 (7.49)        -
RT difference scores:
  Inhibition                   194.53 (166.88)    191.95 (198.21)    -     -
  Load                         116.25 (200.42)    174.53 (193.95)    -     -
  Inhibition + load            314.48 (232.88)    366.57 (214)       -     -
Working memory measure:
  Shape error score            25.52 (21.48)      15.43 (13.89)      *     **

* p < .05; ** p < .01; *** p < .001; - p > .05.
Note: The means and SDs shown for the shape error score are for the raw data, prior to transformation.
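The difference scores summarised in Table 6 are simple contrasts between task conditions. The sketch below illustrates how they are defined for a single participant; the function name and the example values are hypothetical and are not data from the study. The same contrasts apply to percentage error scores and to median RTs.

```python
def ril_difference_scores(control, inhibition, load):
    """Contrasts between RIL Conditions 1 (control), 2 (inhibition) and 3 (load)."""
    return {
        "inhibition": inhibition - control,        # Condition 2 minus Condition 1
        "load": load - inhibition,                 # Condition 3 minus Condition 2
        "inhibition_plus_load": load - control,    # Condition 3 minus Condition 1
    }


# Hypothetical percentage-error scores for one participant (not study data).
scores = ril_difference_scores(control=2.0, inhibition=6.0, load=9.0)
print(scores)
```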
It should be noted that for both the ASD and control groups, there was a significant increase in both the number of errors made and the time taken to respond in the inhibition condition (and load condition) compared with the control condition, indicating that these conditions were more difficult than the control condition and therefore that the instruction to respond to the opposite colour to the stimulus did require inhibitory control.

Age was significantly correlated with the shape error score (r = -.32, p < .01), the inhibition RT difference score (r = -.30, p < .05), and the inhibition + load RT difference score (r = -.33, p < .01). VIQ was correlated with the load RT difference score (r = -.24, p < .05). ANCOVAs on the inhibition and the inhibition + load RT difference scores with age as the covariate did not change the non-significant effect of group for these variables: F(1,68) = .28, p > .1, for inhibition and F(1,67) = .22, p > .1, for inhibition + load. The group difference in the shape error score remained significant when an ANCOVA was conducted with age as a covariate, F(1,67) = 7.84, p < .01. The group difference on the load RT difference score also remained non-significant when a two-way ANOVA with group and VIQ as the IVs was conducted, F(1,64) = .22, p > .1. The interaction between group and VIQ was not significant, F(2,64) = .07, p > .1.

4.3.2.6 Opposite Worlds task

Opposite Worlds task variables used in group comparisons were the Same World error score, Opposite World error score, Same World time score and Opposite World time score (each score equating to the sum of two trials). There was one outlier in the ASD group on the Opposite World time score, which was trimmed to 3 SD from the mean.
In subsequent analyses (to be reported in sections to follow), the error and time difference scores between the Opposite and Same World conditions were the main variables used as these were thought to be appropriate summary scores (representing the performance decrement when inhibitory demands are introduced) for use in correlations and other analyses. Means and SDs for all variables are displayed in Table 7. The N for the task was 65 (n = 29 for the ASD group, n = 36 for the control group).

For the error scores, a two-way repeated measures ANOVA was conducted with group (ASD, control) as the between-subjects factor and condition (Same World, Opposite World) as the within-subjects factor. There was a significant main effect of condition, F(1, 63) = 22.96, p < .001, but the main effect of group was not significant, F(1, 63) = 1.62, p > .1. The interaction approached significance, F(1, 63) = 3.01, p = .09, suggesting there was a trend for the ASD group to make comparatively more errors in the Opposite World condition. Follow-up simple effects analyses showed that there was no significant difference between the groups in the number of errors made in the Same World condition, t(63) = .32, p > .1, but there was a marginally significant difference in the Opposite World error scores, t(63) = 1.71, p = .09.

A two-way repeated measures ANOVA with group as the between-subjects factor and condition as the within-subjects factor was also conducted on the time scores. There was a significant main effect of condition, F(1, 63) = 107.77, p < .001, and a significant effect of group, F(1, 63) = 5.2, p < .05. The interaction was also significant, F(1, 63) = 7.36, p < .01, indicating that participants in the ASD group took comparatively longer to complete the Opposite World condition (in other words, they showed a larger performance decrement from the Same World to the Opposite World condition compared with the control group).
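The link between the group × condition interaction and the difference scores can be checked numerically. In a design with two groups and two repeated conditions, the interaction F(1, df) equals the square of an independent-samples t computed on the Opposite minus Same World difference scores. The sketch below applies this to the difference-score means and SDs reported in Table 7; the helper name `pooled_t` and the equal-variance assumption are illustrative choices, and small discrepancies from the reported F values reflect rounding in the tabled statistics.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Difference scores from Table 7 (ASD n = 29, control n = 36).
t_time, df = pooled_t(11.12, 9.02, 29, 6.53, 4.2, 36)
t_err, _ = pooled_t(1.48, 2.2, 29, 0.69, 1.45, 36)

# Squared t on the difference scores should reproduce the interaction F.
f_time = t_time ** 2   # compare with the reported F(1, 63) = 7.36
f_err = t_err ** 2     # compare with the reported F(1, 63) = 3.01
```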
Follow-up analyses confirmed that there was no significant difference between the ASD and control group on the Same World time scores, t(63) = 1.36, p > .1, but the ASD group took significantly longer in the Opposite World condition, t(63) = 2.66, p < .05.

Table 7. Opposite Worlds results: Mean (and SD) of each group for error/time scores in each condition and difference scores, and significance of group comparisons

Measure                        ASD group        Control group    p     p with age/IQ control
Error variables:
  Same World error score       1.21 (1.57)      1.08 (1.56)      -
  Opposite World error score   2.69 (2.54)      1.78 (1.74)      -     -
  Error difference score       1.48 (2.2)       0.69 (1.45)      -     -
Time variables:
  Same World time score        27.27 (6.42)     25.0 (6.89)      -     -
  Opposite World time score    38.42 (12.55)    31.53 (8.26)     *     *
  Time difference score        11.12 (9.02)     6.53 (4.2)       **    **

* p < .05; ** p < .01; *** p < .001; - p > .05.
Note: The difference scores relate to the interaction term on repeated measures ANOVAs.

Neither VIQ nor PIQ correlated with any task variables, but age was significantly correlated with the Same World time score, r = -.42, p < .001, the Opposite World time score, r = -.43, p < .001, and the Opposite World error score, r = -.28, p < .05. When age was introduced as a covariate in two-way repeated measures ANCOVAs, there was no change in any of the results.

4.3.2.7 Relational Complexity

In this task, the main variable used for analyses was simply the total score (i.e., total number correct), summed across all trials. There was one outlier in the ASD group for this variable, which was trimmed. A t-test comparing the total score of the ASD group (M = 9.66, SD = 3.9) and control group (M = 9.71, SD = 4.2) was not significant, t(92) = .05, p > .1. The total score correlated with both age, r = .58, p < .001, and VIQ, r = .28, p < .01. An ANCOVA conducted on the total score with group and VIQ level as IVs and age as a covariate did not influence the non-significant effect of group, F(1, 87) = .003, p > .1.
There was no significant interaction between group and VIQ level, F(2, 87) = .18, p > .1.

4.3.2.8 Pattern Meanings

All error variables (i.e., redundant, repetitive, incorrect, and uninterpretable responses) showed significant positive skew. For redundant responses, a square root transformation was effective. Repetitions were recoded such that 0 or 1 repetition(s) was coded as 0 and 2 or more repetitions were coded as 1. Due to the very small number of incorrect and uninterpretable responses, these two variables were summed to form a combined incorrect/uninterpretable responses variable, which was recoded such that 0 errors remained at 0, and 1 or more errors was coded as 1. A “sum of errors” variable was created, where the number of error responses was summed across all categories. This variable was also skewed, and was transformed using a logarithm equation. The other major variable for the Pattern Meanings task was the number of correct responses, which was normally distributed.

There were no statistically significant group differences in the mean number of correct responses produced, t(92) = 1.38, p > .1, or the sum of errors, t(92) = .14, p > .1. Similarly, individual analyses of error variables did not reveal any significant group differences. There was no significant difference in the mean number of redundant responses, t(92) = .46, p > .1. The proportion of low error scorers in each group was not significantly different for repetitions, χ² (1, N = 94) = .001, p > .1, or incorrect/uninterpretable responses, χ² (1, N = 94) = .68, p > .1.

However, as mentioned previously, this task was the only one on which the “full criteria” (i.e., autism) and “partial criteria” (i.e., other PDD) subgroups showed significant differences. The partial criteria subgroup made significantly more errors than the full criteria subgroup on the sum of errors variable, t(44) = 2.62, p < .05.
When specific error types were analysed, it was found that the partial criteria subgroup made significantly more redundant responses, t(44) = 2.49, p < .05, and a higher proportion of the partial criteria subgroup made a high number of repetitions, χ2 (1, N = 46) = 4.31, p < .05. There was also a trend for the partial criteria subgroup to make more correct responses, t(44) = 1.83, p = .07. This pattern of results therefore suggests that the partial criteria subgroup generated more responses overall, whether correct or not. Each of the two subgroups was then compared to the control group. It was found that the full criteria subgroup demonstrated significantly fewer correct responses than controls, t(80) = 2.06, p < .05, and the partial criteria subgroup produced significantly more error responses overall than controls, t(58) = 2.44, p < .05 (in terms of specific error types, the partial criteria subgroup produced significantly more redundant responses than controls but there were no significant differences for other error types). Because the two subgroups displayed different patterns of performance, Table 8 displays means, SDs, the percentage of low scorers for dichotomous variables, and significance of group comparisons for all variables separately for the two subgroups9. Across the whole sample, the sum of errors was significantly correlated with age, r = -.46, p < .001, as were all individual error variables: redundant responses, r = .32, p < .01, repetitions, r = -.25, p < .05, and incorrect/uninterpretable responses, r = .26, p < .05. VIQ was correlated with the number of correct responses, r = .20, p < .05, and the sum of errors, r = -.20, p < .05, but of the individual error variables, only repetitions were significantly correlated with VIQ, r = -.22, p < .05. All group differences remained non-significant after controlling for these variables when the whole ASD sample was analysed as one group. 
When the separate analyses for the full and partial criteria subgroups were conducted with the relevant age and IQ variables controlled, the difference between the full criteria subgroup and controls on correct responses became non-significant, F(1, 76) = 2.31, p > .1, and the difference between the partial criteria subgroup and controls on the sum of errors became only marginally significant, F(1, 53) = 3.02, p = .09, although the difference between these two groups on redundant responses remained significant, F(1, 57) = 7.17, p < .05. There were no significant interactions between group and VIQ level in any analyses.

9 The two subgroups were not analysed separately in subsequent analyses involving correlations with behavioural and ToM variables, as it was of interest to see whether Pattern Meanings performance correlated with symptom severity (or ToM performance) across the whole sample.

Table 8. Pattern Meanings results: Mean (and SD) of each subgroup [or the percentage of low error scorers for dichotomous variables], and significance of group comparisons

                                 ASD group
Measure                          Full subgroup    Partial subgroup    Control group    p     p with age/IQ control
Correct responses                21.32 (7.87)     26.50 (9.83)        25.1 (8.42)      *¹    -
Sum of errors                    7.21 (9.12)      15.75 (10.81)       7.4 (7.51)       *²    -
Individual error types:
  - Redundant                    4.06 (4.94)      8.92 (7.51)         4.23 (4.34)      *²    *²
  - Repetition                   [67.6]           [33.3]              [58.3]           -     -
  - Incorrect/uninterpretable    [73.5]           [58.3]              [77.1]           -     -

* p < .05; ** p < .01; *** p < .001; - p > .05.
Note: The means and SDs shown for the sum of errors and redundant responses are for the raw data, prior to transformation.
1 Difference was between full criteria subgroup and controls
2 Difference was between partial criteria subgroup and controls

4.3.2.9 Uses of Objects

As for the Pattern Meanings task, all error variables (including the additional non-useful responses variable) were positively skewed.
For redundant responses and repetitions, log transformations were effective. A square root transformation improved the distribution of non-useful responses. Again, incorrect and uninterpretable responses were summed to form a combined variable, which was recoded as dichotomous in the same way as for the Pattern Meanings task. A “sum of errors” variable was also created, which was normally distributed for this task. One outlier on this variable from the control group was trimmed. The total number of correct responses, as well as the number of correct responses for conventional and non-conventional items separately, all had approximately normal distributions. Means, SDs, the percentage of low scorers for dichotomous variables, and significance of group comparisons for all variables are displayed in Table 9.

Table 9. Uses of Objects results: Mean (and SD) of each group [or the percentage of low error scorers for dichotomous variables], and significance of group comparisons

Measure                          ASD group        Control group    p      p with age/IQ control
Correct responses:
  - Total                        19.07 (8.99)     26.42 (9.5)      ***    **
  - Conventional items           7.04 (3.86)      10.25 (4.35)
  - Non-conventional items       12.02 (5.77)     16.17 (6.02)
Sum of errors                    18.41 (12.84)    17.52 (10.09)    -      -
Individual error types:
  - Redundant                    6.02 (5.79)      5.42 (3.76)      -      -
  - Repetition                   4.28 (3.99)      5.13 (4.55)      -      -
  - Non-useful                   6.61 (6.05)      6.38 (5.53)      -      -
  - Incorrect/uninterpretable    [63.0]           [64.6]           -      -

* p < .05; ** p < .01; *** p < .001; - p > .05.
Note: The means and SDs shown for redundant responses, repetitions, and non-useful responses are for the raw data, prior to transformation.

To examine whether or not the ASD group produced proportionally fewer correct responses on the conventional versus non-conventional items, a two-way repeated measures ANOVA was performed with group as the between-subjects factor and condition (conventional, non-conventional) as the within-subjects factor.
There was a significant main effect of group, F(1, 92) = 14.82, p < .001, and condition, F(1, 92) = 155.97, p < .001, but the interaction was not significant, F(1, 92) = 1.16, p > .1, indicating that the ASD group produced fewer correct responses than the control group for both conventional and non-conventional items (with the conventional items being more difficult for both groups), but were not proportionally worse on conventional items. Because of this, the separate totals for conventional and non-conventional items were not used in further analyses.

There was no significant difference between groups on the sum of errors, t(92) = .37, p > .1, and individual analyses of error variables did not reveal any significant group differences. There was no significant difference in the mean number of redundant responses, t(92) = .64, p > .1, repetitions, t(92) = 1.22, p > .1, or non-useful responses, t(92) = .08, p > .1. The proportion of low error scorers in each group was not significantly different for incorrect/uninterpretable responses, χ² (1, N = 94) = .02, p > .1.

Age was significantly correlated with the number of correct responses, r = .31, p < .01, the sum of errors, r = -.34, p < .01, and all individual error variables (except repetitions): redundant responses, r = -.27, p < .05, non-useful responses, r = -.26, p < .05, and incorrect/uninterpretable responses, r = -.38, p < .001. VIQ was correlated with the number of correct responses, r = .45, p < .001, repetitions, r = -.27, p < .05, and incorrect/uninterpretable responses, r = -.25, p < .05. The group difference in the number of correct responses remained significant in an ANCOVA with group and VIQ level as the IVs and age as a covariate, F(1, 87) = 12.66, p < .01. Group remained a non-significant effect on the sum of errors when age was introduced as a covariate in an ANCOVA, F(1, 91) = 1.05, p > .1.
Similarly, group differences in all individual error variables remained non-significant when age and/or VIQ was partialled out using either ANCOVA or logistic regression. There were no significant interactions between group and VIQ level in any analyses.

4.3.2.10 Stamps task

Both the rule adherence and restriction scores demonstrated highly skewed distributions and were recoded as dichotomous variables. For rule adherence, a score between 0 and 6 inclusive was coded as 0 and a score of 7 or 8 was coded as 1. For restriction, a score of 0 was left as 0 and a score between 1 and 8 inclusive was coded as 1. The complexity and originality scores showed approximately normal distributions. Means and SDs for the latter two variables and the proportion of low scorers for the former two variables, along with the significance of group comparisons for all scores, are presented in Table 10. The N for the task was 87 (n = 41 for the ASD group, n = 46 for the control group).

T-tests revealed significant group differences on the complexity score, t(85) = 2.73, p < .01, indicating that the ASD group produced less complex patterns than the control group, and on the originality score, t(85) = 2.81, p < .01, indicating that the ASD group produced fewer original patterns than the control group. Chi-square analysis showed that there was a lower percentage of low scorers in the ASD group on the restriction score, χ² (1, N = 87) = 5.76, p < .05, indicating that a larger proportion of the ASD group tended to use fewer stamps than were available. For the rule adherence score, there was a marginally significant trend for a smaller proportion of the ASD group to produce patterns adhering to one rule, χ² (1, N = 86) = 3.50, p = .06, which was contrary to expectation.

The originality score was significantly correlated with both age, r = .34, p < .01, and VIQ, r = .40, p < .001. The restriction score correlated with VIQ, r = -.36, p < .01.
In a two-way ANCOVA with group and VIQ level as the IVs and age as a covariate, the group difference in the originality score remained significant, F(1, 80) = 4.55, p < .05. The interaction between group and VIQ level was not significant, F(2, 80) = .96, p > .1. When a logistic regression was performed on the restriction score, group was no longer a significant predictor when it was assessed independently of VIQ, z = 1.47, p > .1.

Table 10. Stamps task results: Mean (and SD) of each group [or the percentage of low scorers for dichotomous variables], and significance of group comparisons

Measure                 ASD group       Control group    p     p with age/IQ control
Complexity score        18.63 (3.02)    20.39 (2.98)     **
Originality score       3.17 (2.51)     4.78 (2.8)       **    *
Restriction score       [82.9]          [97.8]           *     -
Rule adherence score    [26.8]          [11.1]           -

* p < .05; ** p < .01; *** p < .001; - p > .05.

4.3.2.11 Summary and effect sizes of group comparisons

Table 11 presents a summary of the results of group comparisons on the main variables from each cognitive task. Overall, participants in the ASD group performed significantly more poorly than controls on tasks measuring false belief understanding, planning, verbal inhibition, working memory (under conditions where inhibition was required), and both verbal and non-verbal generativity (with different patterns of results for the two subgroups of ASD participants meeting “full criteria” and “partial criteria” on the Pattern Meanings task); but not awareness of social norms, set-shifting, nonverbal inhibition or relational reasoning (although marginally significant differences were obtained on certain measures of social awareness, set-shifting and non-verbal inhibition). Age and VIQ influenced some of these results, reducing the significance of group comparisons for two false belief variables, two verbal generativity variables and one non-verbal generativity variable.

Table 11. Summary and effect sizes of significant group differences

Measure                                          Significant group    Significant difference    Effect size:
                                                 difference?          with age/IQ control?      r (and d)
ToM:
  Simple false belief: Own belief                ✓                    ✓                         .35 (.75)
  Simple false belief: Other’s belief            ✓                    -                         .28 (.58)
  First-order false belief                       ✓                    ✓                         .36 (.77)
  Second-order false belief                      ✓                    -                         .24 (.50)
  False belief aggregate                         ✓                    ✓                         .30 (.63)
  False belief alternative aggregate             ✓                    -                         .26 (.54)
Social Cognition:
  Dewey Stories                                  -                    -                         .28 (.58)
Planning:
  ToL: Adjusted extra move score                 ✓                    ✓                         .24 (.50)
  ToL: Rule violations                           ✓                    ✓                         .21 (.43)
Set-shifting:
  IDED Perseveration condition: EDS stage errors        -             -
  IDED Learned Irrelevance cond.: EDS stage errors      -             -                         .23 (.47)
Inhibition:
  RIL task error difference scores:                                                             .23 (.47)
    Inhibition                                   -                    -
    Load                                         -                    -
    Inhibition + load                            -                    -
  RIL task RT difference scores:
    Inhibition                                   -                    -
    Load                                         -                    -
    Inhibition + load                            -                    -
  Opposite Worlds: Error difference score        -                    -                         .21 (.43)
  Opposite Worlds: Time difference score         ✓                    ✓                         .31 (.65)
Working memory:
  RIL task shape error score                     ✓                    ✓                         .27 (.56)
Relational reasoning:
  Relational Complexity score                    -                    -
Generativity:
  Pattern Meanings: Correct responses            ✓¹                   -                         .23 (.47)
  Pattern Meanings: Sum of errors                ✓²                   -                         .34 (.73)
  Uses of Objects: Correct responses             ✓                    ✓                         .37 (.80)
  Uses of Objects: Sum of errors                 -                    -
  Stamps task: Complexity score                  ✓                    ✓                         .28 (.58)
  Stamps task: Originality score                 ✓                    ✓                         .29 (.61)
  Stamps task: Restriction score                 ✓                    -                         .26 (.54)
  Stamps task: Rule adherence score              -                    -                         .20 (.41)

✓ significant to at least p < .05 level; - p > .05.
1 Difference was between full criteria subgroup and controls only
2 Difference was between partial criteria subgroup and controls only

It should be noted that while Bonferroni corrections were not performed, the fact that group differences followed a consistent pattern and were all in the expected direction (such that ASD participants performed more poorly than controls) signifies that the results are likely to be valid.
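The effect-size column of Table 11 reports each r alongside a corresponding d. As a hedged illustration, the tabled pairs are consistent (to within rounding of r) with the commonly used conversion d = 2r / √(1 − r²), and the size labels with Cohen's benchmarks of .2, .5 and .8. The exact equation is not printed in the text, so the formula, the function names and the use of the benchmarks as inclusive cut-offs are assumptions here.

```python
import math

def r_to_d(r):
    """Convert an effect-size correlation r to Cohen's d (assumed formula)."""
    return 2 * r / math.sqrt(1 - r ** 2)

def cohen_label(d):
    """Classify d using Cohen's benchmarks as inclusive lower bounds (assumption)."""
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    return "small"

# A few (r, d) pairs as printed in Table 11.
table_pairs = [(.35, .75), (.28, .58), (.36, .77), (.31, .65), (.37, .80), (.20, .41)]
recomputed = [(r, round(r_to_d(r), 2)) for r, _ in table_pairs]
```

Applying `cohen_label` to the printed d values gives "large" only for .80 (Uses of Objects correct responses), "medium" for values such as .58, and "small" for values such as .43.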
Table 11 also lists the effect sizes obtained for all significant and marginally significant group differences, as a measure of the strength of each effect. The “effect size correlation”, or r (Rosenthal, 1991), was used as the primary measure of effect size. The effect size correlation simply measures the size of the correlation between the independent and dependent variable (a phi correlation was calculated for dichotomous variables, which is equivalent to Pearson’s r and point biserial correlations for continuous variables). However, all values of r were also converted to d (as shown in Table 11) using an equation supplied by Rosenthal (1991), and the size of each effect was evaluated using Cohen’s (1988) system for classifying small, medium and large effects. The largest effect size, and the only one to classify as a large effect, was for Uses of Objects correct responses - a measure of verbal generativity. Most other effect sizes fell in the medium range, including the Dewey Stories total score, on which there was only a marginally significant group difference but for which there was a small sample size. All other variables for which only marginally significant group differences were found displayed small effect sizes, and the ToL rule violations also showed only a small effect size. 4.3.3 Universality of ToM and EF deficits Ozonoff et al. (1991) assessed universality of ToM and EF deficits in their study by calculating the proportion of individuals in their autism group who scored below the mean of the control group. As discussed in Section 2.2.3, this is a lenient criterion for defining a deficit. In this study, it was decided to adopt the stricter criterion of a score more extreme (in the direction of poorer performance) than 1 SD from the mean of the control group (i.e., in the extreme 16% of control scores for a normal distribution) as the definition of “impairment”. 
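The 1 SD criterion for "impairment" can be sketched in a few lines of Python. The function name `proportion_impaired`, the `higher_is_worse` flag, the use of the sample SD of the control group, and the example scores are all hypothetical; they illustrate the criterion rather than reproduce the analysis actually run.

```python
from statistics import mean, stdev

def proportion_impaired(asd_scores, control_scores, higher_is_worse=False):
    """Proportion of ASD scores more than 1 control SD poorer than the control mean."""
    m, sd = mean(control_scores), stdev(control_scores)
    if higher_is_worse:
        # e.g. error or time scores: poorer performance means a larger value
        count = sum(score > m + sd for score in asd_scores)
    else:
        # e.g. number correct: poorer performance means a smaller value
        count = sum(score < m - sd for score in asd_scores)
    return count / len(asd_scores)

# Hypothetical data, not taken from the study.
control = [8, 10, 12, 10, 10]      # mean 10, sample SD about 1.41
asd_correct = [6, 8, 9, 11]        # two scores fall below 10 - 1.41
asd_errors = [12, 9, 15, 10]       # two scores fall above 10 + 1.41

p_correct = proportion_impaired(asd_correct, control)
p_errors = proportion_impaired(asd_errors, control, higher_is_worse=True)
```

A deficit would count as universal under this criterion only if the returned proportion were at or near 1.0.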
The universality of a deficit on continuous variables was therefore assessed by calculating the proportion of participants in the ASD group scoring more poorly than 1 SD from the mean. This was done only for variables where significant group differences were found (including variables for which the group difference did not remain significant when age and IQ variables were partialled out, but not including variables on which only marginally significant group differences were found).

For variables coded dichotomously, the “more poorly than 1 SD from the mean” strategy was obviously not feasible, but it was necessary for the calculation of universality to be comparable to that for continuous variables. To address this, the percentage of control participants gaining a score of 0 (or 1 if a higher score was poorer) was calculated, and if it was approximately 16%, the percentage of ASD participants gaining that score was considered a comparable measure of the universality of a deficit on that variable (as a score at the 16th percentile corresponds to a score at 1 SD below the mean for a normal distribution). For the false belief variables, the alternative aggregate score (see Section 4.3.2.1) was considered the best measure to use in assessing universality¹⁰, as 19.6% of control participants gained a score of 0. Universality was also calculated for the first-order false belief task, on which 17.4% of control participants scored 0. The percentages for the two other dichotomous variables where significant group differences were found were not quite as ideal. For ToL rule violations, 23.4% of control participants gained a high error score of 1. This variable was therefore recoded using a more lenient criterion such that a score between 0 and 1.5 rule violations per block scored 0, which resulted in a more appropriate 17% of control participants scoring 1¹¹.
For the Stamps task restriction score, only 2.2% of control participants gained a high restriction score of 1, so this variable was not included in the universality calculations. The percentages of ASD participants demonstrating a deficit on the ToM and EF variables where significant group differences were found are displayed in Table 12. It is evident that neither ToM nor EF deficits are universal within the ASD sample12. The percentages of ASD participants showing deficits also appear to be fairly comparable across ToM and EF variables, although there was some variability among the EF variables. Within the EF tasks, deficits in verbal inhibition and verbal generativity were the most prevalent. 10 It is worth noting that although an aggregate score was used for the false belief variables, an aggregate or composite score was not calculated across the EF tasks (as was done by Ozonoff et al., 1991) for these universality calculations or for subsequent analyses because it was not thought to be valid or meaningful, particularly in light of the fact that one of the aims of the study was to examine the specific profile of EF deficits in ASDs and the relationship of each EF component with behavioural symptomatology and with ToM. In support of this, although there were some intercorrelations between EF domains, for the most part EF task variables were not significantly correlated with each other and appeared to be measuring different constructs (these correlations are presented in Appendix B and discussed further in Section 4.4.1). The fact that group differences were found on some EF tasks but not others solidifies this view. In addition, within EF domains, verbal and non-verbal measures often did not correlate with each other (i.e., for the different tests of inhibition and generativity). EF variables were therefore considered separately throughout analyses. 11 Group differences were still significant for this recoded variable, χ2 (1, N = 93) = 4.70, p < .05. 
12 Even when Ozonoff et al.’s (1991) more lenient criterion for defining a deficit was used, ToM and EF “deficits” still could not be considered universal, with proportions ranging from 60.0 to 82.6%.

Table 12. Universality of ToM and EF deficits in the ASD group

Measure                                           % of ASD group displaying a deficit
ToM:
  False belief alternative aggregate score        44.2
  First-order false belief                        51.2
Planning:
  ToL: Adjusted extra move score                  28.9
  ToL: Rule violations                            37.0
Inhibition:
  Opposite Worlds time difference score           48.3
Working memory:
  RIL task shape error score                      37.1
Generativity:
  Pattern Meanings: Correct responses             28.3
  Pattern Meanings: Sum of errors                 26.7
  Uses of Objects correct responses               41.3
  Stamps task: Complexity score                   19.5
  Stamps task: Originality score                  29.3

4.3.4 Ability of ToM and EF variables to predict group membership

In order to investigate the “uniqueness” of ToM and EF impairments to autism (as compared with matched controls), a logistic regression analysis was conducted to examine which cognitive task variables were best able to discriminate the ASD group from the control group. A direct logistic regression was performed with group as the outcome variable, and VIQ and all ToM and EF variables on which there were significant group differences as the predictors. Logistic regression was chosen as the method of analysis rather than discriminant function analysis because logistic regression is more suitable when there is a mixture of dichotomous and continuous predictor variables (Tabachnick & Fidell, 1996). Direct logistic regression evaluates the independent contribution made by each predictor over and above that of the other predictors (i.e., each predictor is assessed as if it entered the equation last).

Because not all participants completed every task (mainly due to age limits on certain tasks, as well as missing data), only those participants with data for all the predictor variables were included in the logistic regression.
There were 27 participants in the ASD group and 32 in the control group who met these criteria, and these limited groups were matched on age (M = 11.26, SD = 3.18 for the ASD group; M = 10.13, SD = 2.27 for the control group), t(57) = 1.58, p > .1, and PIQ (M = 94.52, SD = 15.78 for the ASD group; M = 99.78, SD = 18.68 for the control group), t(57) = 1.16, p > .1. A test of the full model with all 12 predictors against a constant-only model was statistically reliable, χ2 (12, N = 59) = 31.03, p < .01, indicating that the predictors, as a set, reliably distinguished children with ASDs from controls. 84.4% of the control group and 77.8% of the ASD group were classified correctly by the model. Table 13 presents regression coefficients, Wald statistics, odds ratios, and 95% confidence intervals for odds ratios for each of the 12 predictors. According to the Wald criterion, the only reliable predictors of group membership were the Opposite Worlds time difference score (a verbal measure of inhibition) and the number of correct responses on the Uses of Objects task (a measure of verbal generativity). Performance on first-order false belief questions approached significance as a predictor (p = .08). Two possible limitations with this initial analysis were that i) correlations between variables derived from the same task (or set of tasks) may have affected the ability of individual variables from those tasks to emerge as a significant predictor, and ii) the ratio of cases to predictors was lower than it should be in the ideal regression. In order to address these limitations, another logistic regression was conducted where only one variable from each task was included (VIQ, first-order false belief, ToL adjusted extra move score, RIL shape error score, Opposite Worlds time difference score, Uses of Objects correct responses, and Stamps task originality score). The ratio of cases to predictors was therefore substantially higher in this alternative analysis. 
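The quantities reported for each predictor in Table 13 are linked by standard identities: the odds ratio is exp(B), and, if the Wald statistic is taken to be (B / SE)², the standard error and an approximate 95% confidence interval can be recovered from B and the Wald value alone. The sketch below assumes that reading of the Wald column; the function name `odds_ratio_summary` is hypothetical, and because the inputs are the rounded values printed for the two reliable predictors, results agree with the table only to about two decimal places.

```python
import math

def odds_ratio_summary(b, wald, z_crit=1.96):
    """Return (odds ratio, CI lower, CI upper) from a coefficient and its Wald value.

    Assumption: wald = (b / se) ** 2, i.e. a 1-df Wald chi-square.
    """
    se = abs(b) / math.sqrt(wald)
    return (math.exp(b),
            math.exp(b - z_crit * se),
            math.exp(b + z_crit * se))

# Rounded B and Wald values as printed for the two reliable predictors.
ow_or, ow_lo, ow_hi = odds_ratio_summary(0.16, 3.98)     # Opposite Worlds time difference
uo_or, uo_lo, uo_hi = odds_ratio_summary(-0.10, 4.03)    # Uses of Objects correct responses
```

An odds ratio above 1 (Opposite Worlds) and one below 1 (Uses of Objects) correspond to the positive and negative signs of the respective B coefficients.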
Variables were chosen on the basis of the effect size of group comparisons and their representativeness of task performance. Results were almost the same as the initial regression, with the only difference being that the level of significance of the Uses of Objects correct responses variable dropped from p = .04 to .07. The first-order false belief task variable remained only marginally significant as a predictor¹³ (p = .08). The initial logistic regression was therefore interpreted as a valid indicator of the ability of each task variable to predict group membership¹⁴.

Table 13. Logistic regression analysis of group membership as a function of VIQ, ToM and EF variables

                                                  Wald test                   95% C.I. for odds ratio
Variables                           B             (z-ratio)     Odds ratio    Lower       Upper
VIQ                                 -.04          2.46          .96           .91         1.01
False belief tasks:
  Simple                            .68           .15           1.98          .06         59.92
  1st-order                         -2.05         2.96          .13           .01         1.33
  2nd-order                         .87           .53           2.38          .23         24.56
ToL:
  Adj. extra move score             .03           .18           1.03          .90         1.19
  Rule violations                   -.07          .01           .93           .15         5.93
RIL task:
  Shape error score                 -.02          .01           .98           .70         1.37
Opposite Worlds:
  Time difference score             .16           3.98*         1.17          1.0         1.38
Uses of Objects:
  Correct responses                 -.10          4.03*         .91           .82         1.0
Stamps task:
  Complexity score                  -.19          1.36          .83           .60         1.14
  Originality score                 .08           .25           1.08          .80         1.47
  Restriction score                 -2.26         1.26          .10           .0          5.43

*p < .05; ** p < .01; *** p < .001.

13 When the false belief alternative aggregate score was included instead (as this was the variable used in the universality calculations), the results also remained the same with the exception that the false belief aggregate was a non-significant, rather than a marginally significant, predictor of group membership.

14 As it was possible that VIQ and false belief variables may have affected each other’s contribution due to their significant correlation, another logistic regression (with the initial set of task variables) was conducted without including VIQ. First-order false belief performance was found to be a significant predictor in this analysis, z = 4.12, p < .05. However, rather than suggesting that false belief performance was actually a meaningful predictor of group membership, this pattern of results (i.e., the change in significance of false belief as a predictor when VIQ was included) indicates that false belief understanding did not add significant additional variance to the regression beyond that contributed by VIQ.

4.3.5 Behavioural measures: Group comparisons and derivation of indices used in correlational analyses

4.3.5.1 Repetitive Behaviours Interview (RBI)

Group comparisons. Severity summary scores were the main RBI variables used in analyses. Distributions of the severity summary scores were frequently skewed for the ASD group, and highly skewed for the control group. However, all transformations were ineffective. Non-parametric statistics were used for group comparisons of the severity of different types of repetitive behaviours. As expected, Mann-Whitney U tests revealed that children in the ASD group exhibited significantly more severe repetitive behaviours in all categories of the RBI (all ps < .001, except for self-injurious behaviours, where p < .01)¹⁵. Medians and ranges of the severity summary scores (expressed as t scores) for the ASD and control groups are presented in Table 14.

Table 14.
Median (and range) of RBI severity summary scores for the ASD and control groups

                                             Median (range) of severity summary scores
RBI category                                 ASD group        Control group
Stereotyped manipulation of objects          54 (45-119)      45 (45-75)
Stereotyped movements                        58 (46-110)      46 (46-63)
Tic-like behaviours                          49 (47-130)      47 (47-57)
Self-injurious behaviours                    48 (48-172)      48 (48-67)
Compulsive behaviours                        60 (46-99)       46 (constant)
Object attachments                           53 (46-108)      46 (46-60)
Insistence on sameness of environment        60 (45-98)       45 (45-69)
Rigid adherence to routines and rituals      61 (47-119)      47 (47-54)
Repetitive use of language                   61 (46-109)      46 (46-62)
Circumscribed interests                      64 (45-91)       45 (45-73)

15 Non-parametric group comparisons were also conducted for the “presence of behaviour” summary scores, and the outcomes were identical.

Derivation of indices used in correlational analyses. Consistent with Turner’s (1996, 1997) study, severity summary scores from the RBI were summed across categories to form composite severity summary scores (i.e., Repetitive Movements, Sameness Behaviour, Compulsive Behaviours, Repetitive Language, and Circumscribed Interests composites; see Section 3.5.1.2 in Chapter 3), which were used in correlational analyses with cognitive measures¹⁶. These composite scores generally demonstrated normal distributions in the ASD group. One outlier (in the ASD group) on the Repetitive Movements composite score was trimmed. For variables with skewed distributions, scatterplots were examined for evidence of curvilinearity and multivariate outliers, and no major problems were identified.

In order to examine the factor structure of the RBI and the statistical validity of Turner’s (1996, 1997) categories and composite scores (which were based on classes of repetitive behaviour derived from the literature), principal components analysis with varimax rotation was conducted on the severity summary scores from each RBI category (including the data from both the ASD and control groups).
Evaluation of two- and three-factor solutions indicated that a two-factor model appeared to be more meaningful. The two factors explained 57.0% of the total variance in the RBI, with 39.7% accounted for by a High-level Repetitive Behaviours factor (eigenvalue 3.97), and 17.3% by a Low-level Repetitive Behaviours factor (eigenvalue 1.73). Factor loadings are displayed in Table 15. Table 15. Factor loadings of RBI severity summary scores Factor 1: High–level RBI category Stereotyped manipulation of objects Factor 2: Low-level Repetitive Behaviours Repetitive Behaviours .449 .707 Stereotyped movements .793 Tic-like behaviours .779 Self-injurious behaviours .540 Compulsive behaviours .752 Object Attachments .668 Insistence on sameness of environment .865 Rigid adherence to routines and rituals .835 Repetitive use of language .721 Circumscribed interests .419 Note: Factor loadings lower than .4 are not shown 16 Correlations were also conducted using the “presence of behaviour” summary scores (which were also summed to form composite scores). These showed an almost identical pattern of correlations with cognitive measures, as well as being highly correlated with the severity summary scores. 163 As the factors derived differed from the composite scores used by Turner (1996, 1997), factor scores for each participant were calculated using a regression equation, and these factor scores were also used in correlational analyses with cognitive measures. 4.3.5.2 Social and communicative functioning Group comparisons. Group comparisons were conducted for each of the three measures of social and communicative functioning separately. On the Social Behaviour Questionnaire (SBQ), one outlier in the control group was trimmed. A t-test revealed that, as expected, participants in the ASD group (M = 16.22, SD = 5.79) scored significantly higher on the SBQ, indicating more abnormal social behaviours than the control group (M = 5.06, SD = 4.75), t(88) = 10.0, p < .001. 
Unsurprisingly, there were also significant group differences indicating a higher number of abnormal current behaviours in the ASD group in the social domain of the ADI-R, (ASD group: M = 15.02, SD = 7.92; Control group: M = 2.33, SD = 3.21), t(47) = 2.74, p < .01, and in the communication domain, (ASD group: M = 17.0, SD = 5.54; Control group: M = 4.0, SD = 4.36), t(47) = 3.97, p < .001. Derivation of indices used in correlational analyses. As mentioned in Section 3.5.2.2 of the previous chapter, a principal components analysis was conducted with scores from the SBQ and scores on current behaviours only from the Social and Communication domains of the ADI-R, which showed that all three measures loaded on one factor (smallest factor loading = .80) which explained 75.29% of the variance in the sample (eigenvalue 2.26). Factor scores for each participant were calculated using a regression equation, on which higher scores indicated more abnormal social/communicative functioning. This social/communication score was used in all correlational analyses. 164 4.3.6 Correlations between ToM/EF and behavioural measures The explanatory value of ToM and EF impairments was examined by correlating cognitive task performances with behavioural indices17. As the incidence of repetitive behaviours and abnormal social behaviours was very low in the control group, correlations between cognitive and behavioural measures were conducted for the ASD group only. If raw correlations were statistically significant, partial correlations (controlling for age, PIQ and VIQ) were also conducted. Table 16 displays raw correlations and relevant partial correlations between cognitive measures and behavioural factors (i.e., the two RBI factor scores and the social/communication factor score). 
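The partial correlations referred to above (a cognitive score and a behavioural factor, with age, PIQ and VIQ controlled) can be illustrated with a minimal sketch. It is not the software used in this study: the function names and the toy data are hypothetical, and the recursive formula shown is one standard way of computing a partial correlation, assumed here for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, covariates):
    """Correlation between x and y with every covariate partialled out.

    One covariate is removed at a time; each step uses correlations that
    already control for the remaining covariates.
    """
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy data (not from this study): both scores depend on "age"
# plus unrelated noise, so their raw correlation is carried entirely by age.
age = [1, 1, -1, -1, 1, 1, -1, -1]
task = [2, 0, 0, -2, 2, 0, 0, -2]
behav = [2, 2, 0, 0, 0, 0, -2, -2]

raw = pearson(task, behav)
partial = partial_corr(task, behav, [age])
print(f"raw r = {raw:.2f}, partial r (age controlled) = {abs(partial):.2f}")
```

In this constructed example the raw correlation is moderate while the partial correlation is essentially zero, which is the pattern described in the text when a raw correlation does not survive the partialling out of age and IQ.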
High-level repetitive behaviours correlated only with the Uses of Objects correct responses (in an unexpected direction, such that a higher number of correct responses was associated with more severe high-level repetitive behaviours), but this correlation was not significant when age and IQ variables were partialled out. Low-level repetitive behaviours showed significant raw and partial correlations with the Opposite Worlds time difference score, a verbal measure of inhibition (in the expected direction, such that poorer inhibitory ability was correlated with increased severity of low-level repetitive behaviours). The social/communication factor showed a significant raw correlation in the expected direction with the Stamps task complexity score, which remained significant when age and IQ variables were controlled.

These results demonstrate that the behavioural symptoms of ASDs showed different patterns of correlation with cognitive measures. In general, however, there were few significant correlations (with high-level repetitive behaviours in particular being poorly explained by the available data). Of note, the false belief aggregate score did not correlate significantly with any behavioural factors¹⁸. Dewey Stories, a higher-level measure of social cognition, also showed no significant correlations with any behavioural factors, including social/communicative functioning. The only EF variables to correlate significantly with behavioural factors were select measures of verbal inhibition and non-verbal generativity.

17 As for the correlations conducted between age, PIQ, VIQ, and cognitive task variables, the same computational formula was used for correlations between continuous variables (i.e., Pearson product-moment correlation coefficients) and correlations between continuous and dichotomous variables (i.e., point biserial correlation coefficients), as recommended in Cohen and Cohen (1983).
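The equivalence relied on in footnote 17 can be checked numerically. The sketch below is illustrative only: the function names and the data are hypothetical, and the point biserial formula shown is the usual textbook form rather than anything restated in this chapter. It shows that coding a dichotomous variable as 0/1 and applying the Pearson formula gives the same value as the dedicated point biserial formula.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def point_biserial(scores, groups):
    """Textbook point biserial formula for a grouping variable coded 0/1."""
    n = len(scores)
    ones = [s for s, g in zip(scores, groups) if g == 1]
    zeros = [s for s, g in zip(scores, groups) if g == 0]
    p = len(ones) / n            # proportion coded 1
    q = 1 - p                    # proportion coded 0
    mean_all = sum(scores) / n
    sd_all = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)  # population SD
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd_all * sqrt(p * q)

# Hypothetical data (not from this study): pass/fail on a task and a continuous score.
passed = [0, 0, 0, 1, 1, 1, 1, 0]
score = [85, 90, 78, 102, 110, 95, 99, 88]

r_pearson = pearson(score, passed)
r_pb = point_biserial(score, passed)
print(f"Pearson r = {r_pearson:.4f}, point biserial r = {r_pb:.4f}")
```

The two printed values agree to rounding error, which is why a single computational formula suffices for both kinds of variable.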
18 Correlations were also conducted for all false belief tasks individually as well as for the alternative aggregate score, but no significant correlations emerged. 165 Table 16. Raw and partial correlations between cognitive measures and behavioural factors within the ASD group Factor score Cognitive task High-level Low-level Social/ Rep. Behaviours Rep. Behaviours Communication ToM (n = 43): False belief aggregate .17 .08 .24 -.18 -.08 .09 Adj.extra move score -.02 -.04 -.04 Rule violations -.06 .01 -.02 -.33 -.14 .20 .0 Social Cognition (n = 17): Dewey Stories total Planning: ToL (n = 46): Set-shifting: IDED Perseveration condition (n = 35): EDS stage errors .10 IDED Learned Irrelevance condition (n = 36): EDS stage errors .0 Inhibition: RIL task error difference scores (n = 35 except inhibition score, n = 36): Inhibition .03 .12 .22 Load .04 -.21 -.08 Inhibition + load .05 -.09 .12 RIL task RT difference scores (n = 35 except inhibition score, n = 36): Inhibition -.03 .14 -.07 Load -.12 -.23 -.15 Inhibition + load -.11 -.10 -.19 Error diff. score .06 -.04 -.03 Time diff. score -.07 .38* Opposite Worlds (n = 29): .47* .12 166 Table 16 continued Factor score Cognitive task High-level Low-level Social/ Rep. Behaviours Rep. Behaviours Communication Working memory: RIL shape error score .10 .22 .20 .18 -.09 -.10 Correct responses .06 .03 -.06 Sum of errors -.07 .05 .0 -.05 .0 Relational Reasoning: Relational Complexity (n = 46): Total score Generativity: Pattern Meanings (n = 46): Uses of Objects (n = 46): Correct responses .33* Sum of errors -.03 -.12 .16 Complexity score .10 -.07 -.38* Originality score .28 .09 .13 Restriction score .0 -.15 -.03 -.03 -.16 -.29 .21 Stamps task (n = 41): Rule adherence score -.52** * p < .05; ** p < .01; *** p < .001. Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each task show the sample size for correlations with the behavioural factors. 
Correlations between cognitive task variables and RBI composite scores (equivalent to those used by Turner, 1996, 1997) were also of interest, both in terms of examining patterns of correlations with more specific types of behaviour and determining whether these results replicate those reported by Turner. These are presented in Table 17. Repetitive Movements demonstrated significant raw and partial correlations in the expected direction with the Opposite Worlds time difference score, a verbal measure of inhibitory capacity (consistent with the correlation between this variable and Low-level Repetitive Behaviours). Sameness Behaviour, Compulsive Behaviours and Repetitive Language did not correlate significantly with any cognitive task variables. Circumscribed Interests demonstrated significant raw correlations with three variables (false belief aggregate, Uses of Objects correct responses, and Stamps task restriction score), all in the opposite direction to that expected; however, none of these correlations remained significant when age and IQ were partialled out.

Overall, each RBI composite score demonstrated a unique pattern of correlations with cognitive task variables, although again there were few significant correlations, with only the Repetitive Movements composite showing a significant partial correlation with a cognitive variable. When age and IQ were controlled, ToM and social cognition variables did not correlate with any RBI composite scores. Only one EF measure, of verbal inhibition, correlated significantly with an RBI composite (Repetitive Movements, as described above).

Table 17.
Raw and partial correlations between cognitive measures and RBI composite scores within the ASD group RBI composite score Repetitive Sameness Compulsive Repetitive Movements Behaviour Behaviours Language Cognitive task ToM (n = 43): False belief aggregate .14 .03 .08 .11 Social Cognition (n = 17): Dewey Stories total -.09 -.14 -.14 -.14 Planning: ToL (n = 46): Adj.extra move score -.03 -.04 -.04 .0 Rule violations -.05 -.01 -.19 .24 Set-shifting: IDED Perseveration condition (n = 35): EDS stage errors -.31 -.02 .06 .07 IDED Learned Irrelevance condition (n = 36): EDS stage errors .23 .0 .08 .03 Inhibition: RIL task error difference scores (n = 35 except inhibition score, n = 36): Inhibition .09 .16 .13 -.01 Load -.22 .01 -.10 .13 Inhibition + load -.11 .10 .01 .10 RIL task RT difference scores (n = 35 except inhibition score, n = 36): Inhibition .10 .0 -.03 .16 Load -.28 -.12 -.14 -.07 Inhibition + load -.18 -.07 -.14 .07 Circumscribed Interests .32* .12 -.05 -.03 -.04 -.11 -.05 -.20 -.16 -.29 .14 -.13 -.01 169 Table 17 continued Repetitive Movements Sameness Behaviour RBI Composite Score Compulsive Behaviours Repetitive Language Circumscribed Interests Cognitive task Inhibition continued: Opposite Worlds (n = 29): Error diff. score -.01 -.01 .04 .05 -.06 Time diff. score .37* .08 .0 .09 .10 .48* Working memory (n = 35): RIL shape error score -.18 .15 .12 .31 .07 Relational Reasoning: Relational Complexity (n = 46): Total score -.06 .08 .27 .02 .23 Generativity: Pattern Meanings (n = 46): Correct responses .07 .0 -.08 .07 .15 Sum of errors .01 -.06 -.09 .13 .05 Uses of Objects (n = 46): Correct responses .06 .18 .27 .08 .35* .17 Sum of errors -.10 -.05 -.11 .11 -.24 Stamps task (n = 41): Complexity score -.03 -.02 .15 -.13 .15 Originality score .12 .13 .23 .24 .31 Restriction score -.16 -.10 .10 .21 -.31* -.15 Rule adherence score -.16 -.05 -.10 -.08 -.09 *p < .05; ** p < .01; *** p < .001. Note: Partial correlations controlled for age, VIQ and PIQ. 
All tests were two-tailed. Ns listed for each task show the sample size for correlations with the RBI severity summary scores.

4.3.7 Relationship between ToM and EF

The relationship between ToM and EF in the ASD and control groups was investigated by examining both correlations and dissociations between the two domains. The Dewey Stories total score was omitted from these analyses, for two main reasons: firstly because it does not appear to measure ToM and therefore any relationships with EF would be difficult to interpret within the theoretical frameworks that exist regarding the ToM-EF relationship; and secondly because only older participants completed the task and so the sample size for correlations was small¹⁹.

19 Correlations were conducted out of interest, but did not reveal much of importance. There were only a small number of significant correlations with EF variables in the control and ASD groups, and a few of these were in the opposite direction to that expected.

4.3.7.1 Correlations between ToM and EF

Correlations between task variables were calculated separately for the ASD and control groups. Again, partial correlations (controlling for the effects of age, VIQ and PIQ) were conducted if raw correlations were significant.

Table 18 presents raw and relevant partial correlations between ToM and EF task variables within the control group. Correlations are displayed separately for the various false belief variables rather than the overall aggregate score because the pattern of correlations was different for the three tasks. In this group, simple false belief task performance correlated with the ToL adjusted extra move score, and the Pattern Meanings and Uses of Objects sum of errors (with all correlations in the expected direction, such that poor false belief performance correlated with poor EF task performance); however, when age, VIQ and PIQ were controlled, only the correlation with the Pattern Meanings sum of errors remained significant. First-order false belief task performance correlated with the ToL adjusted extra move score, Relational Complexity total score, Uses of Objects correct responses and Stamps task originality score (all in the expected direction), but only the correlations with ToL and Uses of Objects variables were significant when age and IQ were partialled out. Second-order false belief task performance correlated with the ToL adjusted extra move score and rule violations, the RIL task load error and RT difference scores and inhibition RT difference score, the Relational Complexity total score, the Uses of Objects correct responses, and Stamps task originality score (all in the expected direction except the RIL task load RT difference score); but with age and IQ controlled, only the correlations with ToL rule violations, RIL task load error and RT difference scores, and the Stamps task originality score remained significant.

Overall, in the control group, ToM variables demonstrated relationships with measures of planning, non-verbal inhibition (under working memory load conditions), relational reasoning, and both verbal and non-verbal generativity, but several of the correlations were mediated by age and IQ effects (there were no significant partial correlations with relational reasoning ability). All correlations were in the expected direction (such that poorer performance on EF tasks correlated with poorer false belief task performance), with the exception of the RIL task load RT difference score.
A possible explanation for this is that participants who performed well on false belief tasks made fewer errors on the working memory load condition, but at the expense of speed (i.e., they demonstrated a cautious speed/accuracy tradeoff). Another noticeable aspect of the pattern of correlations was that there tended to be more correlations with the second-order than the simple and first-order false belief tasks, which is likely to be partly due to the fact that only a small proportion of participants in the control group failed to obtain a perfect score on the lower-order tasks. Table 19 displays raw and partial correlations between ToM and EF tasks within the ASD group. In children with ASDs, simple false belief task performance correlated with the ToL adjusted extra move score, Uses of Objects correct responses, and the Stamps task restriction score (with all correlations in the expected direction); however when age and IQ variables were controlled, only the correlation with the Stamps restriction score remained significant. First-order false belief task performance correlated with the ToL adjusted extra move score, Uses of Objects correct responses and Stamps task originality score (all in the expected direction), but none of the correlations were significant when age and IQ were partialled out. Second-order false belief task performance correlated with the ToL adjusted extra move score and rule violations and the Uses of Objects correct responses (all in the expected direction); none of these correlations were significant with age and IQ controlled. Overall, the ASD group showed noticeably fewer significant correlations between ToM and EF variables than the control group, with only one correlation remaining significant with age and IQ controlled (between simple false belief performance and a non-verbal measure of generativity). 172 Table 18. 
Raw and partial correlations between ToM and EF measures within the control group EF task False belief task _______________________________________________ Simple 1st-order 2nd-order ToL (n = 46): Adj. extra move score Rule violations -.30* -.26 -.06 IDED Set-shifting task condition (n = 34): Perseveration EDS stage errors -.27 Learned Irrelevance EDS stage errors -.13 -.40** -.35* -.01 -.45** -.27 -.48** -.34* -.25 -.28 -.23 -.13 RIL task (n = 33): Error difference scores: Inhibition Load Inhibition + load RT difference scores: Inhibition Load Inhibition + load Shape error score .27 -.16 .12 .07 -.33 -.15 -.02 -.45** -.54** -.30 -.05 .26 .19 -.20 -.05 .24 .18 -.19 -.44* -.35 .42* .52** -.02 -.31 Opposite Worlds (n = 35): Error difference score Time difference score .08 .02 -.20 -.03 .03 -.09 Relational Complexity (n = 46): Total score .26 .30* Pattern Meanings (n = 46): Correct responses Sum of errors -.02 -.37* -.32* .28 -.10 .15 -.26 Uses of Objects (n = 46): Correct responses Sum of errors .11 -.34* -.27 .44** .30* -.19 .51*** .28 -.28 Stamps task (n = 45): Complexity score Originality score Restriction score Rule adherence score .01 .23 .04 -.10 .28 .40** .25 .07 .02 .29 .56*** .40** .09 -.06 .07 .48** .14 * p < .05; ** p < .01; *** p < .001. Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each EF task show the sample size for the correlations with the ToM tasks. 173 Table 19. Raw and partial correlations between ToM and EF measures within the ASD group EF task False belief task _________________________________________________ Simple 1st-order 2nd-order ToL (n = 43): Adj. 
extra move score Rule violations -.33* -.07 -.25 IDED Set-shifting task condition: Perseveration (n = 32): EDS stage errors -.15 Learned Irrelevance (n = 33): EDS stage errors .15 -.53***-.24 -.16 -.35* -.04 -.30* -.10 .08 -.23 -.22 -.19 RIL task (n = 32 except inhibition difference scores, n = 33): Error difference scores: Inhibition -.28 -.13 Load -.18 -.03 Inhibition + load -.32 -.09 RT difference scores: Inhibition .05 .13 Load -.17 .07 Inhibition + load -.16 .13 Shape error score -.09 -.05 Opposite Worlds (n = 29): Error difference score Time difference score -.08 -.17 -.17 .0 .07 .04 .05 -.24 -.21 .01 .06 .08 -.12 Relational Complexity (n = 43): Total score .10 .23 .27 Pattern Meanings (n = 43): Correct responses Sum of errors .19 .02 .01 -.10 .26 .09 Uses of Objects (n = 43): Correct responses Sum of errors .32* .13 Stamps task (n = 41): Complexity score Originality score Restriction score Rule adherence score .23 .11 -.56***- .46** .15 .11 .31* .02 .08 .39* -.16 .01 .03 .10 .48** .31 .12 -.01 .30 -.21 -.15 * p < .05; ** p < .01; *** p < .001. Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each EF task show the sample size for the correlations with the ToM tasks.

Table 20 presents a summary of the significant partial correlations between ToM and EF domains in the control and ASD groups, clearly demonstrating the different pattern of correlations in the two groups.

Table 20. Summary of significant partial correlations between ToM and EF variables in the control and ASD groups

                                  ToM
EF domain                         Control group     ASD group
Planning                          ✓✓
Set-shifting
Inhibition – Non-verbal           ✓✓*
Inhibition – Verbal
Working Memory
Relational Reasoning
Generativity – Verbal             ✓✓
Generativity – Non-verbal         ✓                 ✓

* Correlations marked with an asterisk were in the opposite direction to that expected.
Note: Each tick represents one significant correlation between a false belief and an EF variable in that domain.
4.3.7.2 Dissociations between ToM and EF While the correlative evidence presented in the previous section was suggestive of a relative independence between ToM and EF in the ASD group compared with controls, it was also of interest to examine the incidence and direction of dissociations between ToM and EF deficits within the ASD group. This was achieved by defining a deficit on any given task in the same way as for the universality calculations in Section 4.3.3. The proportion of ASD participants with a ToM deficit who displayed unimpaired performance on EF tasks was calculated, and conversely, the proportion of participants with a given EF deficit who displayed unimpaired ToM performance was also calculated. For ease and simplicity of interpretation, the false belief alternative aggregate score was used as the measure of ToM performance (as for the universality calculations) rather than analysing all the false belief variables separately. However, all EF variables on which significant group differences were found were analysed separately. The results of these calculations are displayed in Table 21. 175 Table 21. 
The incidence of ToM-EF dissociations in the ASD group

                                            % of ToM-impaired ASD participants
EF measure                                  with unimpaired EF                      N
ToL:
  Adjusted extra moves score                47.4                                    19
  Rule violations                           36.8                                    19
Opposite Worlds time difference score       44.4                                     9
RIL shape error score                       54.5                                    11
Uses of Objects correct responses           47.4                                    19
Stamps task:
  Complexity score                          83.3                                    18
  Originality score                         50.0                                    18

                                            % of EF-impaired ASD participants
EF measure                                  with unimpaired ToM                     N
ToL:
  Adjusted extra moves score                23.1                                    13
  Rule violations                           40.0                                    20
Opposite Worlds time difference score       64.3                                    14
RIL shape error score                       58.3                                    12
Uses of Objects correct responses           44.4                                    18
Stamps task:
  Complexity score                          62.5                                     8
  Originality score                         25.0                                    12

These data clearly demonstrate that dissociations between ToM and EF occurred relatively frequently (usually in around 50% of the participants showing impairments) and in both directions, such that the presence of a ToM impairment did not necessarily result in an EF impairment and vice versa. These results are consistent with the correlative data in indicating an independence between ToM and EF deficits in the ASD group.

4.4 Discussion

This section includes four subsections. The first three of these examine the profile, primacy and independence of ToM and EF deficits in ASDs in this study, comparing the current findings to those of previous studies and considering alternative interpretations of the results. In the final section, an attempt is made to interpret the outcomes in terms of the six alternative hypotheses regarding primacy and independence outlined in the introduction.

4.4.1 Profile of ToM and EF deficits

As predicted, both ToM and EF deficits were found in this sample of individuals with ASDs. However, a unique profile of spared and impaired abilities emerged, which included both expected and unexpected features.

Profile of ToM deficits.
In the ToM domain, a higher proportion of ASD participants than controls displayed unstable performance on all false belief tasks and on aggregate scores, although partialling out age and VIQ reduced the significance of group comparisons on the simple and second-order tasks as well as on the alternative aggregate score (which involved a more lenient scoring criterion). These results suggest that false belief understanding was significantly impaired in the ASD group, but that on two of the tasks the impairment was partially attributable to the poorer verbal skills of the ASD participants20. This lack of robustness of ToM deficits on false belief tasks was underscored by the relatively high percentage of ASD participants who demonstrated errorless performance on the tasks, which ranged from 48.8% on the standard first-order task to 72.1% on the simple (unexpected contents and unexpected identity) tasks, with 39.5% displaying perfect performance across all tasks. According to the alternative aggregate score which more reliably indicates poor performance, 55.8% of ASD participants were high scorers on false belief tasks. The highest percentage of first-order false belief task passers found in previous studies was 90% (Dahlgren & Trillingsgaard, 1996), with the next highest being 55% (Prior et al., 1990). Although the high 72.1% on the simple false belief task is likely to be an overestimation due to the fact that perfect performance on it was assumed if the first-order task was passed (for 7-16 year-olds who began with the first-order task), it is nevertheless clear that the sample of ASD participants in this study demonstrated better false belief task performance than the majority of samples from previous studies. 
The finding that false belief performance was significantly correlated with both age and VIQ suggests that the relatively old mean age and high level of verbal ability of the sample probably explain this good false belief performance, consistent with previous studies demonstrating that individuals with autism passing false belief tasks tend to be older and have higher verbal ability (e.g., Eisenmajer & Prior, 1991; Prior et al., 1990; Sparrevohn & Howie, 1995).

20 Notably, the effect of group on performance on the simple and second-order false belief tasks remained significant when age only was included in the regression, but did not remain significant when VIQ only was included, suggesting that VIQ had a greater impact on the significance of group differences than did age.

The age and level of ability of the control sample meant that a high percentage of controls also demonstrated flawless false belief performance. However, it is of note that the use of a fairly strict scoring criterion prevented extreme ceiling effects, with 30.4% of controls demonstrating unstable false belief performance on the aggregate score. Even using the more lenient alternative aggregate, 19.6% of controls emerged as low scorers, the majority of whom were between 6 and 10 years of age. This suggests that beyond the age of 5, either ToM is still developing or other cognitive factors may be influencing false belief performance (this latter possibility is discussed further below in Section 4.4.3). Despite the fact that ceiling effects did not pose a significant problem on the false belief tasks, the relatively high proportion of perfect performances in both the ASD and control groups indicates that the assessment of ToM in this study would have been significantly strengthened by the inclusion of a more advanced ToM task such as Happé’s (1994a) “Strange Stories” task or Baron-Cohen et al.’s (1997, 2001a) “Eyes Task”.
The use of the Dewey Stories task represented an attempt to tap into higher-level social cognitive skills; however, its lack of correlation with false belief variables (after partialling out age, PIQ, and VIQ) indicates that it is questionable as a measure of mentalising ability and can probably be successfully performed by drawing on more declarative knowledge of social norms. In light of this, it is noteworthy that ASD participants did not show a significant impairment on the task, with marginally significant differences reducing to non-significance when VIQ was accounted for (although the medium effect size of the group difference suggested that the marginal significance may have been due to the small size of the sample who completed the task). This suggests that high-functioning individuals with ASDs often have intact knowledge of what is considered “normal” or appropriate, but that this knowledge does not aid or interact with either their mentalising skills or their own social skills (Dewey Stories performance did not correlate significantly with social/communicative functioning). Interestingly, a similar pattern of results has been demonstrated previously with patients with damage to the ventromedial prefrontal cortex (e.g., Saver & Damasio, 1991).

Profile of EF deficits. ASD participants also demonstrated an interesting pattern of strength and weakness on the various EF components tested. Consistent with previous research, individuals with ASDs displayed robust planning impairments on the ToL, both in terms of the number of extra moves made and the frequency with which the rules of the task were violated. However, the small to medium effect size was somewhat lower than expected, with Pennington and Ozonoff (1996) reporting an
This discrepancy probably cannot be attributed to the age or level of functioning (i.e., IQ level) of the sample because previous studies using Tower tasks have also used older, high-functioning participants. One difference between the ToL administration procedure used in this study compared with other studies using the ToL and ToH is that during the initial task instructions participants were actively encouraged to plan the movements of the discs in advance. This may have positively influenced performance on the task and reduced the size of the difference between the ASD and control groups. Nevertheless, the fact that planning impairments persisted in the ASD participants despite this extra cueing provides evidence of the severity of their deficit in this domain. Furthermore, as the ToL and ToH have been found to hold slightly different cognitive demands (e.g., Welsh et al., 1999), a comparison of effect sizes across the two tasks should be viewed with caution (unfortunately, the only other study to use the ToL rather than the ToH - Hughes et al.’s (1994) study - did not report standard deviations and therefore the effect size from that study could not be directly compared). Following from this, it should also be noted that Welsh et al. (1999) found that ToL performance tapped working memory and inhibition as well as planning ability, and therefore the poor ToL performance demonstrated by the ASD group may not necessarily reflect a planning impairment. However, the lack of group differences on separate and more direct measures of working memory and non-verbal inhibition (as discussed below) make this unlikely, supporting the interpretation of the ToL result as indicating a planning deficit in the ASD group. 
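The reason an effect size cannot be recovered from a study that omits standard deviations can be made concrete with a short sketch. It assumes one common definition, Cohen's d with a pooled standard deviation; the formula underlying the effect sizes quoted above is not restated in this section, so this choice is an assumption. The function name and the summary statistics are hypothetical and are not taken from any of the studies cited.

```python
from math import sqrt

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardised mean difference using the pooled standard deviation.

    Both group standard deviations are required: without them the raw
    difference between means cannot be expressed in standard deviation units.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_var)

# Hypothetical summary statistics (illustration only, not study data):
# two groups of 20 differing by 2 points, each with SD = 2.
d = cohens_d(mean1=10.0, sd1=2.0, n1=20, mean2=8.0, sd2=2.0, n2=20)
print(f"d = {d:.2f}")
```

The same 2-point difference in means would yield a much smaller d if the groups were more variable, which is why effect sizes from studies reporting means alone cannot be compared directly.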
The absence of significant impairments in attentional shifting abilities on the IDED set-shifting task was an unexpected result, given fairly consistent evidence of set-shifting difficulties in previous studies (e.g., Ciesielski & Harris, 1997; Hughes & Russell, 1993; Hughes et al., 1994; Ozonoff et al., 1994). A difficulty with mental flexibility holds an intuitive appeal in explaining autistic symptoms such as perseveration and rigid adherence to routines and rituals, and Ozonoff (1997b) has suggested that a shifting impairment may in fact be the key feature of the EF profile which characterises autism. However, there have been at least two other studies which have also failed to find set-shifting deficits in autism. Ozonoff et al. (2000) found no significant difference between their high-functioning autistic participants and controls on the original IDED set-shifting task from the CANTAB battery. They attributed this result to the fact that their task was computerised, thereby facilitating the performance of their autistic participants. However, the fact that their participants with Asperger syndrome did show impaired performance on the task, along with Hughes et al.’s (1994) previous finding of a deficit on the same computerised task in individuals with autism, speaks against this explanation. Turner (1997) also found that her high-functioning participants with autism displayed intact performance on both conditions of the modified IDED set-shifting task used in the current study, although her low-functioning participants demonstrated impairments on the EDS stage of the Perseveration condition. There was also evidence of a marginally significant difference in set-shifting performance in the current sample, but contrary to Turner’s results this occurred in the Learned Irrelevance condition. These two negative results using the same task (i.e., Turner’s and the current study) necessitate some decomposition of the requirements of the task.
Although the design of the modified IDED task allows more specific analysis of the component processes involved in the task than the original version, it appears to do this at the expense of the impact and obviousness of the shift. As Turner pointed out, in the modified IDED task, the change in stimulus dimension that occurs in the EDS stage of both conditions (i.e., the introduction of the new relevant stimulus dimension of solidity in the Perseveration condition and the new irrelevant dimension of size in the Learned Irrelevance condition) signals very clearly that the task has changed. This means that it is fairly easy for the participant to deduce the rules of responding for that condition without relating them to the previous condition or becoming easily “stuck” in their previous mode of responding. This in turn suggests that either the validity of the task as a measure of set-shifting is questionable or that the nature of the shift required is too easy for high-functioning individuals with ASDs. The fact that Owen et al. (1993) found different impairments on the task in patients with frontal lesions and Parkinson’s disease indicates that the problems on the task will emerge if the shifting deficit is severe enough. Hence, the lack of convincing evidence of impairments on the task in high-functioning autism may indicate that a set-shifting or cognitive flexibility deficit may not be as central to autism as previously thought. Most of the initial studies on which this notion was based used the WCST as their measure of cognitive flexibility, on which impaired performance may be caused by a range of different factors. The variability in findings on more pure set-shifting measures such as the IDED tasks calls into question the importance of the role of set-shifting in the EF profile characteristic of autism. Results obtained in the inhibition domain were also contrary to predictions and added to previous research in an interesting way. 
Most earlier studies have not found impairments in inhibition in individuals with autism (Brian et al., 2003; Ozonoff et al., 1994; Ozonoff & Jensen, 1999; Ozonoff & Strayer, 1997), and those that have found apparent inhibition deficits have used tasks on which performance could be influenced by other EF capacities such as cognitive flexibility, working memory or generativity (Hughes, 1996b; Rinehart et al., 2002; Williams et al., 2002). Notably, all of the studies finding intact inhibition in autism have used non-verbal tasks except one study in which the Stroop task was used (Ozonoff & Jensen, 1999). In the present study, previous findings of unimpaired non-verbal inhibitory control in autism were replicated using the newly developed RIL task, on which neither accuracy nor RT measures revealed inhibitory difficulties in the ASD group. However, significant and robust verbal inhibition impairments were found on the Opposite Worlds test, particularly on RT measures (a trend was also evident on error measures). This result stands in contrast to that obtained by Ozonoff and Jensen (1999) with their autistic sample of similar size and age range using the Stroop, which involves very similar verbal inhibitory requirements. Closer inspection of Ozonoff and Jensen’s data reveals, though, that the autism group in their study performed at a very similar level to their ADHD group (autism group mean = 27.7 versus 27.4 for the ADHD group, on an unspecified scale), the latter of which did differ significantly from the control group (mean = 32.0). The lack of a significant difference from controls in the case of the autism group was likely to have been due to their larger standard deviation (11.4 versus 7.0 for the ADHD group).
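The effect of the larger standard deviation can be illustrated using only the figures quoted above. The following is a minimal sketch, not the analysis reported by Ozonoff and Jensen (1999): it expresses each clinical group's distance from the control mean in units of that group's own standard deviation, because sample sizes and the control group's standard deviation are not given here. The function name `standardised_gap` is introduced purely for illustration.

```python
def standardised_gap(control_mean, group_mean, group_sd):
    """Distance of a group mean from the control mean, in units of the group's own SD."""
    return (control_mean - group_mean) / group_sd


# Means and standard deviations as quoted in the text (scale unspecified).
CONTROL_MEAN = 32.0
autism_gap = standardised_gap(CONTROL_MEAN, 27.7, 11.4)
adhd_gap = standardised_gap(CONTROL_MEAN, 27.4, 7.0)

# Near-identical raw gaps, but the autism group's gap is much smaller
# once it is scaled by that group's larger spread.
print(round(autism_gap, 2), round(adhd_gap, 2))
```

On these figures the raw gaps are almost the same (4.3 versus 4.6 points), yet the scaled gap for the autism group is roughly half that of the ADHD group, which is in keeping with the suggestion that greater variability, rather than better performance, accounts for the non-significant result.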
However, it is also notable that while the ADHD group was matched with controls on all age and IQ variables, the autism group was not matched to the control group on VIQ, PIQ, or Full-Scale IQ (FSIQ), and this was handled by covarying FSIQ in all group comparisons. As discussed in Section 4.3.2, ANCOVA is not considered an appropriate statistical technique for accounting for group differences in cases such as this, as it may also remove part of the effect of group. Ozonoff and Jensen’s result may therefore represent a false negative. It will be interesting to monitor the outcome of further studies on verbal inhibition, particularly in regard to how inhibition performance in autism may be distinguished from that displayed in ADHD. The interaction between inhibition and working memory was another topic of interest for this study, with Russell (1997b) proposing that impairments in these domains only emerge in autism if the task at hand requires both abilities simultaneously. Although inhibition deficits were revealed on a verbal task with minimal working memory requirements, results from the non-verbal RIL task were largely consistent with this proposal. While ASD participants were able to successfully perform the RIL task condition involving only non-verbal inhibitory requirements (and as discussed further below also showed intact performance on the Relational Complexity task, which arguably requires working memory but not inhibition), on the condition involving both inhibition and working memory requirements, the ASD participants made significantly more errors on a measure of working memory capacity. There was also a trend for the ASD group to make more errors on a measure of inhibitory capacity for this condition as compared with the control condition.
This suggests that in situations where both (non-verbal) inhibition and working memory are required, individuals with ASDs are unable to maintain an adequate level of performance in either domain, but particularly in working memory (although it may be the case that the working memory component of the task was more vulnerable in this case because that task requirement was added after the inhibitory component and was therefore more novel; or, alternatively, because it was tested less frequently). No group differences were identified on the Relational Complexity task, suggesting that the capacity to integrate multiple relations in parallel (Halford, 1993; Halford et al., 1998; Waltz et al., 1999) is not impaired in children with ASDs. This result further indicates that failure on false belief tasks in children with ASDs is unlikely to have its basis in a working memory or relational reasoning deficit. This was confirmed by the lack of significant correlations between false belief and Relational Complexity performance in either the ASD or control group. However, although Waltz et al. (1999) found that frontal lobe patients were significantly impaired on their version of the Relational Complexity task, the validity of the task as a measure of relational reasoning is yet to be determined. It could be argued that the task does not tap working memory or integrative capacity as strongly as it first appears. As the stimuli and all possible response choices are always present and visible to the participant, it is possible that the participant can check the accuracy of each response choice against the requirements of each relational change one by one, rather than having to hold in mind all the relevant relational changes simultaneously. All that would then be required is for the participant to notice all the relational changes which are occurring and accurately check whether each response choice fits the sequence of each change correctly. 
These requirements are quite obviously different to the relational integration arguably required by false belief tasks. So, while results from the Relational Complexity task did not hold much promise in suggesting a relational integration difficulty in ASDs, the use of different kinds of relational complexity task (such as the transitive inference task also used by Waltz et al., 1999) could be an interesting avenue for further research.

Results from the generativity tasks were more promising. On the verbal Uses of Objects task, the group difference on the number of correct responses variable met the criterion for a large effect. This is consistent with previous studies demonstrating generativity impairments in autism using other tasks (Boucher, 1988; Craig & Baron-Cohen, 1999; Lewis & Boucher, 1991), and in particular replicates Turner’s (1999) study, which found that both low- and high-functioning children with autism generated fewer responses than controls on the Uses of Objects task21. However, unlike Turner, the ASD sample in this study did not produce a higher number of error responses. Although the scoring systems used in the two studies were slightly different, even on categories common to both studies such as redundant responses, there were discrepant outcomes. Another difference between the studies was that Turner allowed 150s for her participants to produce responses, whereas only 90s was allowed in the current study. It is possible that during the extra 60s given in Turner’s study, a pressure to respond had accumulated over a longer time and so the children with autism produced inappropriate responses when they were unable to generate correct ones; whereas the children in this study felt less of a demand to produce a response. Regardless, it appears that the individuals with ASDs in both studies demonstrated difficulty spontaneously generating correct verbal responses on this task.
21 Turner (1999) actually calculated the total number of responses overall rather than the number of correct responses. Although it was not reported, the ASD group in the current study also produced significantly fewer responses overall than the control group, replicating Turner.

In contrast, results from the Pattern Meanings task revealed no significant group differences on any variable overall. This was somewhat surprising as it was also thought to be a test of verbal generativity and ASD participants were found to produce fewer responses and make more errors on the task in Turner’s (1999) study. However, more detailed analyses involving the two subgroups of ASD participants meeting “full criteria” and “partial criteria” on the ADI-R showed that the full criteria subgroup generated fewer correct responses than controls (although this effect disappeared when age and VIQ were controlled), and the partial criteria subgroup generated more error responses than controls (although this effect became marginally significant when age and VIQ were controlled). This discrepancy between the full and partial criteria subgroups appeared to be due to a tendency for the partial criteria subgroup to produce more responses overall than the fully autistic subgroup, such that the partial criteria subgroup produced as many correct responses as the control group but were also more likely to produce error responses. This suggests that the less severe subgroup (in terms of the range and number of symptoms present) reacted to problems generating responses by producing errors, whereas the more severe subgroup reacted by not producing responses at all. The failure to replicate Turner’s (1999) findings of significant differences in the overall sample on the Pattern Meanings task (and the lack of robustness of subgroup differences) requires further comment.
The shorter time period allowed in the current study may also account for the lack of robust or significant differences, but this does not seem likely to be the sole cause given the strong generativity deficit displayed on the Uses of Objects task in the same time period. It could be argued that Pattern Meanings is not as good a task at discriminating those with poor generativity, as a larger range of responses are acceptable than for the Uses of Objects task. Scoring was fairly lenient for the task as it was often necessary to accept responses which the pattern possibly could be, even if they were a little far-fetched. This could explain the lack of an overall difference in the number of error responses made, but even given the lenient scoring, one would expect a reduced number of correct and total responses if the ASD participants experienced difficulty producing ideas. It may be that a combination of these two explanations can account for the lack of significant overall group differences in this study, in that the majority of children with ASDs were able to produce adequate responses for a 90s period because they found the task easier than the Uses of Objects task and the scoring was more lenient, but if the task had been continued for another minute, they may have started producing fewer and more inappropriate responses. Consistent with this interpretation, the rate of producing responses was similar for the control participants across the two studies (approximately 1 every 3.6s in the current study and 1 every 3.3s for the high-functioning controls in Turner’s study), but the ASD participants in the current study produced responses at a faster rate (1 every 3.97s) than the high-functioning ASD participants in Turner’s study (approximately 1 every 4.7s). 
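The rate comparison above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes that each quoted interval is an average that held constant across the whole response window (90s in the current study, 150s in Turner's), and the function and variable names are introduced here purely for illustration.

```python
def implied_responses(window_s, seconds_per_response):
    """Number of responses implied by a constant response interval over a time window."""
    return window_s / seconds_per_response


# (window in seconds, approximate seconds per response), as quoted in the text.
current_control = (90, 3.6)
current_asd = (90, 3.97)
turner_control = (150, 3.3)
turner_asd = (150, 4.7)

implied = {
    "current control": implied_responses(*current_control),
    "current ASD": implied_responses(*current_asd),
    "Turner control": implied_responses(*turner_control),
    "Turner ASD": implied_responses(*turner_asd),
}

# Relative slowing of the ASD group: ratio of ASD to control response intervals.
slowing_current = current_asd[1] / current_control[1]
slowing_turner = turner_asd[1] / turner_control[1]

for label, count in implied.items():
    print(f"{label}: about {count:.1f} responses")
print(f"ASD/control interval ratio: {slowing_current:.2f} (current), {slowing_turner:.2f} (Turner)")
```

Under these assumptions the ASD participants in the current study were about 10% slower than their controls, compared with roughly 42% for Turner's high-functioning groups, which is the pattern the interpretation above relies on.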
Results on the Pattern Meanings task in this study should not, therefore, be interpreted as evidence against a verbal generativity deficit (although any such deficit on this task was clearly more subtle than on the Uses of Objects task). Non-verbal generativity impairments in ASD participants also emerged in this study, with performances on the Stamps task revealing that individuals with ASDs produced less complex and fewer original patterns and were more restricted in their use of the stamps available. There was also a trend for children with ASDs to show less adherence to one rule for each pattern. Results on the originality and restriction scores were consistent with Frith (1972); however, contrary to this study, Frith found no difference in the complexity of patterns produced by her sample of children with autism, and she also found that her sample showed a very high degree of rule adherence. In the current study, the lack of rule adherence was likely to have been attributable to a certain proportion of the ASD participants who produced random patterns with no underlying rule. These participants may also have been the cause of the lower mean complexity score of the ASD group, as random or unidentifiable patterns were assigned the lowest complexity score of 1. The main difference between the two studies was the level of functioning of the samples, with 14 out of 20 of Frith’s participants having an estimated PIQ below 60. It is possible that higher-functioning individuals with ASDs may have opted to produce random patterns when unable to produce original rules, whereas lower-functioning participants may have simply produced the same pattern repeatedly. This hypothesis cannot be directly tested in the current sample because all participants had PIQs above 60. Nevertheless, it is evident that the generativity impairment which characterises autism extends across both verbal and non-verbal domains as well as across all levels of functioning.
Concluding comments on the profile of impairments. In summary, then, the ASD group in this study demonstrated a characteristic profile of strength and weakness in the cognitive domains tested, with impairments on measures of ToM, planning, verbal inhibition, tasks combining inhibition and working memory, and both verbal and non-verbal generativity, but intact performance on tests of awareness of social norms, set-shifting, non-verbal inhibition and relational reasoning. Consistent with predictions, the largest effects were on verbal tasks22 (measures of false belief, verbal inhibition and verbal generativity), consolidating the importance of including tasks involving both verbal and non-verbal responses where possible. Certain aspects of the profile of impairments found in this study were inconsistent with initial predictions based on previous studies, such as the absence of set-shifting deficits, and the presence of impairments in verbal inhibition. These findings suggest that the EF profile characteristic of autism as proposed by Ozonoff and colleagues (e.g., Ozonoff, 1997; Ozonoff & Jensen, 1999) may require modification, and its discriminant validity (i.e., its uniqueness to autism as compared with other clinical conditions) merits further investigation.

22 It should be noted that there were no significant correlations between PIQ and any ToM or EF measures. This indicates that the measures of PIQ on which the control and ASD samples were matched measured different abilities to those measured by the non-verbal EF tasks, and therefore that the relative lack of group differences on non-verbal EF tasks compared with verbal tasks cannot be accounted for by the matching of the groups on PIQ.

It should be pointed out, however, that the neat profile described above of course assumes reasonable construct validity of each task variable.
This assumption deserves some critical analysis, particularly given the well-documented difficulties with EF measurement discussed in Chapter 2 (Section 2.2). It is possible that both i) certain variables are not actually measuring what they are purported to and ii) there is overlap between the EF domains measured and/or the tasks used in different domains. The ideal way to address this uncertainty would be to conduct a factor analysis of all the EF variables in the battery, however the high number of variables in relation to the number of participants prevented a valid factor analysis from being performed on this sample. Interpretation of each variable therefore relied mainly on previous literature as well as informed qualitative analysis of the requirements of each task. The choice of relatively pure EF tasks and/or tasks which included control conditions, allowing decomposition of the processes involved in task performance, facilitated the ease and clarity with which variables could be interpreted. Examination of raw and partial correlations (partialling out age, VIQ and PIQ) between EF variables in the control group was also informative (these are presented in Appendix B), in general demonstrating weak and relatively few significant correlations between EF domains as well as several strong intra-domain correlations, thereby validating the notion that the tasks measure mostly independent constructs. This was the case even for variables which could conceivably belong in a different category to other more central variables from the same task, such as rule violations on the ToL or the error variables on the verbal generativity tasks (both of which could reflect inhibition or working memory), with these variables usually correlating more strongly with other variables from the same task than those from other tasks. 
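As an illustration of what partialling out a covariate does to a correlation, the following minimal sketch applies the standard recursive partial-correlation formula in plain Python. It is not the analysis software used in this study, and the three short score lists are hypothetical values constructed so that the two task scores are related only through the covariate.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation between x and y with each covariate partialled out (recursive formula)."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: both task scores depend on the covariate (e.g., age)
# plus components that are unrelated to the covariate and to each other.
covariate = [-3, -1, 1, 3]
task_a = [-2, -2, 0, 4]
task_b = [-2, -4, 4, 2]

raw_r = pearson(task_a, task_b)
partial_r = partial_corr(task_a, task_b, [covariate])
print(round(raw_r, 3), round(abs(partial_r), 3))
```

In this constructed example a raw correlation of about .65 falls to zero once the shared covariate is removed. Passing several covariates in the list (for instance age, VIQ and PIQ) applies the same formula recursively.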
It appears, therefore, that there is no strong evidence to suggest that the underlying abilities assumed to be measured by each of the EF variables are invalid.

4.4.2 Primacy of ToM and EF deficits

Having identified the ToM and EF profile which characterises this sample of individuals with ASDs, the next question concerns the primacy of each of these deficits. In this study, primacy was measured by calculating the universality, uniqueness (this criterion was measured indirectly), and explanatory value of each variable on which significant group differences were found (or all variables in the case of explanatory value). Results showed that while ToM and EF deficits showed similar prevalence within the ASD group, measures of ToM did not successfully discriminate between the ASD and control groups or show any significant relationships with behavioural measures, yet several EF indices emerged as significant predictors of autism group membership and two EF variables correlated significantly with measures of symptomatology. Overall, it would appear that EF deficits are relatively more primary23 than a ToM deficit in ASDs. However, before making any strong conclusions, results derived from each index of primacy require a more detailed discussion.

Universality. The first matter of note is that neither ToM nor EF deficits, as defined by a score worse than one standard deviation from the mean of the control group (or a close approximation in the case of dichotomous variables), were universal among this sample of high-functioning individuals with ASDs. Within the ASD group, 44.2% displayed a ToM deficit and the prevalence of EF deficits ranged from 19.5% to 48.3% (with impairments in verbal inhibition and verbal generativity being the most prevalent of the EF components). All deficits remained non-universal even using the more lenient criterion of any score below the mean of the control group, contrary to the results obtained by Ozonoff et al.
(1991) which showed that deficits defined in this way on their EF composite were almost universal (96%) amongst their autism group whereas ToM deficits were not (52% on a first-order composite and 87% on a second-order composite). However, the current results are consistent with most other studies which report prevalence data on ToM and/or EF impairment in autism, the majority of which have not found either ToM (see Happé, 1995) or EF deficits (e.g., Liss et al., 2001; Hughes et al., 1994; Ozonoff & Jensen, 1999) to be universal. It should also be noted that it is unlikely that EF deficits would have been universal in the Ozonoff et al. (1991) study if a stricter definition of a deficit had been used. In any case, unless the ToM and EF tasks used were too easy for a proportion of the participants, these results suggest that neither a ToM nor EF deficit is the single primary deficit in autism, but rather (as outlined in hypothesis 6 in the introduction), that either i) different ToM and EF profiles are found in different subgroups of individuals with autism, rather than both deficits co-occurring in all individuals; ii) ToM and EF deficits underlie different aspects of symptomatology, and therefore may be present in differing degrees of severity according to the individual’s position on the multidimensional autism spectrum; or iii) an unidentified third deficit may be more primary or at least equally primary. A fourth possibility is also conceivable, which is that different developmental stages of autism are characterised by different cognitive profiles.

23 The notion of “relative primacy” refers to the relative ability of each impairment to meet the criteria for a primary deficit. Although the term “primary” is usually used in the context of a single primary deficit, in a multiple primary deficits model it is also possible for one deficit to hold superior causal importance (e.g., explanatory value) over another, and therefore have superior relative primacy.
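The two definitions of a deficit used for the universality criterion (a score worse than one standard deviation from the control mean, and the more lenient criterion of any score below the control mean) can be sketched as follows. This is a minimal illustration that assumes higher scores indicate better performance. The scores are hypothetical, and the function name `deficit_prevalence` is introduced here rather than taken from the study.

```python
from statistics import mean, stdev


def deficit_prevalence(asd_scores, control_scores, sd_units=1.0):
    """Proportion of ASD scores falling more than sd_units control SDs below the control mean.

    Assumes higher scores mean better performance; sd_units=0 gives the
    lenient criterion of any score below the control mean.
    """
    cutoff = mean(control_scores) - sd_units * stdev(control_scores)
    n_deficit = sum(score < cutoff for score in asd_scores)
    return n_deficit / len(asd_scores)


# Hypothetical scores, for illustration only.
control = [10, 12, 14, 16, 18]
asd = [9, 10, 11, 15, 20]

strict = deficit_prevalence(asd, control, sd_units=1.0)
lenient = deficit_prevalence(asd, control, sd_units=0.0)
print(strict, lenient)
```

A deficit would count as universal only if the returned proportion were 1.0. As the example shows, the lenient criterion can only raise the prevalence estimate, which is why non-universality under that criterion is the stronger finding.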
These four possibilities will be re-visited later in this section and discussed further in Section 4.4.4.

Uniqueness. Results on the uniqueness criterion more clearly discriminated between the ToM and EF accounts, with verbal measures of inhibition and generativity being the strongest predictors of autism group membership (deficits on these two variables were also the most universal among the ASD group and had the largest effect sizes of all the EF variables), while first-order false belief performance was only a marginally significant predictor24. While these results do not allow any strong inferences regarding the uniqueness of these deficits to autism as opposed to other clinical groups, they do suggest that deficits in verbal inhibition and verbal generativity are particularly central to ASDs. This is an interesting result given that mental flexibility and planning deficits were previously thought to be the most significant in autism. It also adds to the previous study by Ozonoff et al. (1991), which showed that EF performance was the best predictor of autism group membership, but did not analyse the key EF components involved.

Explanatory value. In terms of explanatory value, correlations between cognitive and behavioural measures revealed that ToM variables did not correlate significantly with any behavioural domain, whereas two EF measures showed significant relationships with various aspects of autistic symptomatology. The lack of explanatory value of the ToM tasks, particularly the non-significant relationship between ToM and social/communicative functioning, was a somewhat surprising result, although not without precedent (Prior et al., 1990; Sparrevohn & Howie, 1995; the lack of relationship with repetitive behaviours is also consistent with Turner, 1997).
If a ToM deficit is the primary basis for social/communicative impairments in autism then one would expect that those who performed poorly on the ToM tasks would have been those who showed more severe social impairment. Yet this was not the case: although the correlation was not significant, its direction actually suggested the opposite trend, such that those with better performance on the false belief aggregate tended to show more abnormal social/communicative functioning25. The reason for this is unclear, but in any case it constitutes evidence against the notion that a ToM deficit underlies the social/communicative impairments which characterise autism. Although an inability to appreciate others’ mental states is an intuitively appealing explanation for abnormal social behaviours, a one-to-one relationship between an emergent behaviour and underlying cognitive deficit cannot be assumed; abnormal social behaviours are not necessarily caused by an impairment in a social or ToM module (Bowler, 2001). The existence of a significant correlation in the appropriate direction between social/communicative functioning and an EF measure casts further doubt on the idea that ToM deficits underlie the social/communicative symptoms of autism while EF deficits underlie repetitive behaviours and restricted interests.

24 When the false belief alternative aggregate score was used, it was a non-significant predictor of group membership.

25 This unexpected trend still existed when the false belief variables were analysed separately and when the Social Behaviour Questionnaire and the Social and Communication domains of the ADI-R were analysed separately rather than together as one factor score. This suggests that it was not simply a spurious individual result, which was a possibility given that only one correlation between ToM and social/communicative functioning was conducted (in comparison with the wider range of EF tasks and measures of repetitive behaviour).
EF measures demonstrated better explanatory value than ToM variables, but there were still only two EF variables showing significant correlations with behaviour: a measure of verbal inhibition, which correlated with low-level repetitive behaviours (when the RBI data was broken down further using Turner’s (1996, 1997) categories, the verbal inhibition measure correlated with repetitive movements); and a non-verbal generativity measure, which correlated with social/communicative functioning. This latter result consolidates previous findings of a predictive relationship between EF impairment and abnormal social behaviours in autism, although previous studies identified different EF components, most commonly set-shifting, as holding explanatory value (Berger et al., 2003; Gilotty et al., 2002; McEvoy et al., 1993). This may be because the previous studies incorporated different measures of set-shifting to the one used here and did not include any tests of non-verbal generativity. The significant correlation between verbal inhibition and repetitive movements was consistent with Turner’s (1997) findings, which showed that performance on a test of “recurrent perseveration”, on which inhibitory control was required, also correlated significantly with repetitive movements in her sample of children with autism. This relationship between inhibitory control and repetitive movements makes intuitive sense; it is easy to imagine how inhibitory impairment could lead to difficulties “stopping” a particular movement sequence26. However, Turner’s (1997) study demonstrated several other significant correlations between EF and RBI measures which were not replicated in this study. These included a significant relationship between set-shifting performance on the modified IDED task and repetitive use of language and circumscribed interests; and significant associations between performance on generativity measures (including the verbal generativity tasks used in this study) and sameness behaviours and circumscribed interests. The lack of any significant partial correlations between verbal generativity variables and behavioural measures in this study was also surprising given the apparent centrality of that domain in the analyses addressing the universality and uniqueness criteria. These discrepancies with Turner’s study are somewhat difficult to explain. While Turner did not partial out age or ability variables from her correlations (choosing instead to divide her sample into low- and high-functioning subgroups), the results of raw correlations in this study were not consistent with Turner’s findings either: the Uses of Objects correct responses variable actually correlated significantly with circumscribed interests in the opposite direction to that predicted, and there were no other significant raw correlations consistent with Turner’s results. This failure to replicate is reflective of a general paucity of significant correlations between cognitive and behavioural measures in the ASD group in this study27. Neither ToM nor EF variables could account for the full range and extent of autistic symptomatology measured, or even one complete symptom domain. Some behavioural categories, such as high-level repetitive behaviours and several subcategories of the RBI falling under that heading, did not correlate with any cognitive task variables at all; and conversely, some cognitive variables on which deficits were significant and relatively prevalent did not show any relationships with symptomatology. What might explain this?

26 Of course, the usual caveat about correlation and causation applies here. Arguments against the opposite direction of causation (i.e., that the behavioural symptoms may cause the EF deficits) are presented in Section 2.2.3 of Chapter 2.
One possibility is that the behavioural measures used were not sufficiently accurate, sensitive or wide-ranging, but this would not seem to be the most likely reason given the well-documented diagnostic validity of the ADI and the wide range and depth of domains covered by the RBI. The fairly heterogeneous nature of the sample (i.e., the inclusion of participants meeting ADI-R criteria in only one or two domains) is not a plausible explanation for these results either, as variation in the range of behaviours displayed is more likely to increase, rather than decrease, the probability of finding correlations. One potentially influential factor is the behavioural therapy received by most children with ASDs. It may be the case that relationships between underlying cognitive deficits and behavioural expressions have been distorted because therapeutic intervention shapes the nature and severity of the behaviours which would otherwise occur if no intervention took place (without affecting cognitive functioning as strongly). Parental discipline would have a similar effect, particularly in the case of repetitive behaviours; indeed, during the administration of the RBI when questions were asked regarding how long their child usually indulged in particular repetitive behaviours, parents would often answer “Until I tell him/her to stop”. This highlights the importance of considering the interaction between environmental and genetically based cognitive influences on behavioural expression. Just as there is no one-to-one mapping between genes and cognition (Karmiloff-Smith, Scerif, & Thomas, 2002), it is also unlikely for direct or simple relationships to exist between cognition and behaviour.
[27] Pellicano, Maybery, Durkin, & Maley (submitted) also recently found a similar lack of significant correlations between ToM and EF measures and autistic symptomatology (as measured by the ADI-R) in a substantial sample of children with autism.
It is probable that the relationship between cognitive functions and behavioural outcomes is dynamic and changes continuously throughout development. Hence, correlations between current cognitive status and current behaviours may not reveal the cascade of processes which has shaped the nature and severity of those behaviours, and they are likely to be weak and unreliable, resulting in failures to replicate such as that which occurred with this and Turner’s (1997) study. Correlations between cognitive and behavioural factors may also be weakened by the use of parental report as the method of behavioural measurement, rather than direct observation (this is discussed further in Chapter 7). Explanatory value would probably be best measured using longitudinal designs, examining correlative and predictive relationships between early cognitive deficits and both early and later behaviours, using both observational and parental report methods of behavioural measurement. Notwithstanding these concerns, the findings on explanatory value are consistent with the results on the universality criterion in indicating the unlikelihood of a single primary deficit model (or a model in which both deficits meet all criteria for primacy), and could suggest that either i) different subgroups within the autism spectrum are characterised by different cognitive and behavioural profiles, with this variability obscuring and diluting clear relationships in the overall sample; or ii) a third cognitive domain which was not measured is at least equally primary and can account for the behaviours which did not correlate with any of the cognitive variables included in this study. 
The “multidimensional spectrum” possibility, as described in hypothesis 6 (ii), was not supported by these results, as this model (or at least the version described) would predict strong correlations between particular cognitive deficits and the behavioural domains they were purported to underlie, regardless of the variability of the sample.
Concluding comments on the primacy of ToM and EF deficits. So far, then, it appears that while EF deficits do not adequately or consistently meet all the criteria for primacy in ASDs, they fare slightly better than the ToM hypothesis. In evaluating the relative primacy of ToM and EF deficits, however, the comparative level of difficulty of the ToM and EF tasks is perhaps one of the most important issues to be addressed, as it is possible that these results simply reflect the fact that the ToM tasks were easier than the EF tasks. The older, high-functioning nature of the overall sample was necessary to achieve the aim of specifically assessing the full range of EF components; however, the consequence of this was that a larger than usual percentage of both ASD and control participants displayed perfect performance on both first- and second-order false belief tasks, thereby reducing the universality of ToM deficits. Although performance was not quite at ceiling in the control group, the high level of performance overall may have reduced the potential size of the group difference (this was also pointed out by Perner and Lang (2000) in reference to the Ozonoff et al. (1991) study). This would have the consequence of decreasing the ability of the ToM measures to discriminate the ASD group from the control group, therefore affecting their performance on the uniqueness criterion.
The lack of significant correlations between ToM tasks and behavioural measures could also have been a by-product of task difficulty, because it may have been the case that ToM task passers still showed social and/or other behavioural impairments - in other words, the behavioural measures may have been more sensitive than the ToM measures, thereby reducing the strength of the relationship. What may be said in defence of the validity of results derived from the ToM measures in this study? Firstly, the inclusion of the second-order false belief task, which has previously been failed by 10-18 year-old individuals with autism (Baron-Cohen, 1989b), extended the range of difficulty in the ToM task domain. Secondly, it is noteworthy that ToM and EF deficits were of roughly equal prevalence (using the ToM scoring criterion which more clearly indicates low scorers), with a tendency for the ToM deficit to be slightly more prevalent than most EF deficits in the current sample. Given this, its lack of uniqueness and explanatory value cannot be readily discounted as an artefact of the unequal difficulty of the tasks. Similarly, the significant proportion of individuals who showed impaired performance on ToM tasks but unimpaired performance on EF tasks (discussed below) indicates that for some individuals, the false belief tasks were more difficult than the EF tasks (at least when evaluated with reference to control group performance). Thirdly, performance on all of the false belief tasks was far from the ceiling in the ASD group, allowing enough variability in the sample for correlations with behavioural measures to emerge. The fact that false belief performance showed medium level correlations with VIQ confirms that it was not at ceiling and also suggests that it was assessed with some reliability.
Finally, these findings are consistent with previous research, with non-universality typical of all ToM studies in autism (see Section 2.1.3 in Chapter 2), the discrepancy between the uniqueness or discriminability of ToM and EF consistent with Ozonoff et al.’s (1991) results, and the lack of explanatory value of ToM replicating studies by Turner (1997) on repetitive behaviours and by Prior et al. (1990) and Sparrevohn and Howie (1995) on social behaviours. For these reasons, the lack of universality, uniqueness and explanatory value of the ToM deficit in this sample cannot be convincingly rejected as an uninteresting consequence of the level of difficulty of the false belief tasks. One additional alternative interpretation of the results indicating superior primacy of EF deficits on the uniqueness criterion also requires acknowledgement. When commenting on Ozonoff et al.’s (1991) results, Perner (1998; Perner & Lang, 2000) argued that the finding that an EF deficit discriminates better between ASD and control groups than a ToM deficit does not necessarily indicate that the EF deficit is more primary. He argues that a partial impairment in ToM (which he equates with metarepresentational capacity) should actually result in a more severe impairment in EF, as the SAS (Supervisory Attentional System) depends on metarepresentational capacity and so any metarepresentational impairment will be magnified during EF task performance. However, three findings are inconsistent with this explanation: i) the roughly equal effect sizes of ToM and EF deficits in the ASD group, which suggest that the deficits are equally severe; ii) the lack of explanatory value of the ToM tasks (relative severity of impairment is irrelevant to that index of primacy, and if ToM was primary it should show relationships with symptoms of autism); and iii) the presence of a significant proportion of cases showing impaired ToM but intact EF (discussed below), which this account does not allow for. 
Therefore, the evidence suggesting better discriminative ability of EF deficits in this ASD sample appears to be a valid indicator of superior primacy.
4.4.3 Independence of ToM and EF deficits
Given that EF deficits appear to be more primary than a ToM deficit in ASDs, is it possible, then, that they can explain or subsume the ToM deficit which also characterises these individuals? The relative absence of significant correlations and the frequency of dissociations between ToM and EF performances in the ASD group suggest that this is not in fact the case, and instead indicate fairly persuasively that the two deficits are largely independent in individuals with ASDs. The fact that the dissociations occurred in both directions importantly demonstrated that mastery of one domain did not appear to be a prerequisite for the other. Instead, they suggest that the two deficits co-occur in ASDs probably because of their proximal neuroanatomical substrates. These results stand in contrast to the handful of studies which have found significant correlations between ToM and EF in autism (Colvert et al., 2002; Ozonoff et al., 1991; Zelazo et al., 2002), but consolidate previous reports of ToM-EF dissociations in autistic individuals (Baron-Cohen & Robertson, 1995; Baron-Cohen et al., 1999b; Ozonoff et al., 1991). As described in Section 2.3.2 of Chapter 2, the three studies which have found an association between ToM and EF in autism have either failed to partial out the effects of age and/or IQ or used only one type of EF task[28]. The importance of partialling out the effects of age and ability was verified in this study, as almost all of the several significant raw correlations in the ASD group between false belief tasks and measures of planning and verbal generativity became non-significant when these factors were accounted for.
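The effect of partialling out age and ability can be illustrated with a minimal sketch. This is not the analysis code used in this study: the function names `pearson` and `partial_corr` are hypothetical, and the `age`, `viq`, `tom`, and `ef` values are simulated purely for illustration. The sketch applies the standard recursive formula for a partial correlation, and shows how two scores that share no variance other than their common dependence on age and VIQ can produce a sizeable raw correlation that largely disappears once those covariates are controlled.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation between x and y with every covariate partialled out.

    Uses the recursive formula: remove the last covariate z from the
    correlations that already control for the remaining covariates.
    """
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated data (an assumption for illustration, not data from this study):
# both scores depend on age and VIQ but are otherwise unrelated.
random.seed(0)
n = 2000
age = [random.uniform(5, 18) for _ in range(n)]
viq = [random.gauss(100, 15) for _ in range(n)]
tom = [0.5 * a + 0.05 * v + random.gauss(0, 1) for a, v in zip(age, viq)]
ef = [0.5 * a + 0.05 * v + random.gauss(0, 1) for a, v in zip(age, viq)]

raw_r = pearson(tom, ef)
partial_r = partial_corr(tom, ef, [age, viq])
print(f"raw r = {raw_r:.2f}, partial r (age, VIQ removed) = {partial_r:.2f}")
```

In this simulated case the raw correlation is driven entirely by the shared covariates, which is the pattern that would be expected if a raw ToM-EF correlation reflected general developmental level rather than a specific relationship between the two domains.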
The current results are therefore likely to represent a more reliable indication of the nature of the specific relationship between ToM and EF in ASDs.
[28] In the one study (Colvert et al., 2003) which did partial out age and ability variables, only one EF task was included, the DCCS task. As this task is multifactorial and may be failed for a number of different reasons (see Perner & Lang, 2002), no equivalent task was included in the current battery. It is interesting to note that when ToM-EF correlations have been found in autism, the EF measures have been impure and/or consisted of composite scores.
A number of alternative interpretations of these results are conceivable, however. In their meta-analysis of the studies on the ToM-EF relationship conducted up to that point, Perner and Lang (1999) found significant non-homogeneity among the size of the correlations and proposed that the length of the testing session may be an important confounding factor. They found a significant negative correlation between the estimated testing duration per session and the size of the ToM-EF correlation reported, leading them to suggest that longer testing sessions could result in fatigue which would affect performance and decrease the strength of the correlation. It is possible, then, that the relatively long testing sessions in this study (approximately 2.5 hours in total, including all tests in the WAFSASD battery; this was usually divided into two sessions) may have influenced the strength of the correlations. Similarly, the fact that the order of test administration was the same for all participants could potentially have introduced extra fatigue-related variance to performance on the tasks completed towards the end of each session. The questionable reliability of both ToM and EF tasks (as discussed in Chapter 2) could also leave the correlations vulnerable to extraneous variance.
However, while these factors may have introduced a degree of extra variance to the data, it is unlikely that they could have differentially affected the ASD and control groups in such a way as to fully account for the striking difference in the number of significant correlations observed in the two groups. Also, it is not the case that the tasks at the end of the battery showed the weakest correlations, in either the ASD or control group (e.g., the Opposite Worlds test, which was the last test to be administered, showed strong correlations with repetitive movements in the ASD group; and the generativity tasks, which were also administered towards the end of the battery, demonstrated significant correlations with false belief variables in the control group). In Section 2.3.1 of Chapter 2, it was argued that the close relationship between ToM and EF which has been consistently demonstrated in typically developing 3 – 5 year-olds may not necessarily hold for older age groups. It is therefore also possible that the lack of significant correlations in the ASD group may be a consequence of their age, and that a relationship would be observed in a younger sample. However, the presence of a range of significant partial correlations between ToM and EF measures in the age-matched control group in this study suggests that age was not the most important factor causing the outcome in the ASD group. Nevertheless, the pattern of correlations demonstrated in the control group revealed a number of differences to those typically observed in younger children, suggesting that the nature of the ToM-EF relationship may change throughout development. Firstly, the significant correlations occurred mostly (although not always) with the second-order false belief task, which could reflect both the larger proportion of non-perfect scorers on this task as well as the higher EF demands of the task (consistent with Tager-Flusberg & Sullivan, 1994b). 
Secondly, and more importantly, some EF components such as inhibition and working memory did not show the usual relationship with false belief performance (the RIL task Load error difference score, which reflects performance on a task combining inhibitory and working memory requirements, did correlate with the second-order false belief task, but other indices of that task such as the shape error score did not correlate with false belief scores); whereas other variables which have not commonly been associated with ToM performance, such as planning and generativity, did show significant correlations with variables from both the first- and second-order tasks.
In order to further explore changes in the ToM-EF relationship with age, the control group was divided into two age subgroups (5-8 year-olds and 9-18 year-olds) and raw and relevant partial correlations were conducted separately for the two subgroups (these are presented in Appendix C). Although the sample size for some of the correlations was quite small in the younger age subgroup because some tasks were administered only to participants over the age of 6, the results showed clearly that there were a larger number of significant ToM-EF correlations in the younger subgroup, and that the pattern of correlations was different to that observed in the older subgroup[29]. While the larger number of significant correlations in the younger subgroup may simply be due to the increased failure rate on false belief tasks in that subgroup (although note that several controls over the age of 8 did not demonstrate perfect performance), the fact that correlations with different EF variables were revealed in the older subgroup suggests that there are also qualitative differences in the ToM-EF relationship at different developmental stages.
These findings are consistent with the few studies which have been conducted previously on the ToM-EF relationship in children over the age of 5, which have also demonstrated a smaller number and different pattern of correlations compared to studies of younger children (Charman et al., 2002; Perner et al., 2002a). The mechanisms underlying these developmental changes remain open to speculation. One possibility is that a functional dependence between ToM and certain aspects of EF such as inhibition exists as both of these abilities are developing (as outlined in the emergence accounts), but once a certain level of development takes place the ToM-EF relationship revolves more around performance-based factors (as proposed by the expression accounts). However, the dissociability of impairments in the ASD group is inconsistent with both emergence accounts (this is discussed further later). Another possibility is that performance-based (or expression) factors explain the relationship at all ages, but the EF components which influence ToM performance change with age, as it may be that different skills are required for children of different ages to solve ToM tasks (e.g., inhibitory control may be more important at a young age as one’s own perspectives may be more salient)[30]. Although the typical development of ToM and EF is not the major focus of this thesis, these ideas and results certainly merit further exploration in future studies using a wider range of ToM tasks and including participants with a wider range of ages.
[29] Separate correlations for the same two age subgroups were also conducted within the ASD group. Both subgroups showed no significant ToM-EF correlations after age and IQ variables were partialled out, with the exception of the correlation between the Stamps task restriction score and the simple false belief task, which was only significant in the older subgroup. This suggests that ToM and EF deficits were independent in all age ranges included in this sample of ASD participants.
[30] The “common conceptual bases” account tested in this study, involving relational complexity, did not receive support in the current results – individuals with ASDs showed a ToM impairment but no relational reasoning deficit, and there were no significant correlations between the Relational Complexity task and false belief variables in either group. Nonetheless, this type of account would be unlikely to explain developmental changes in the ToM-EF relationship as the common conceptual basis occurs because of common task structures, regardless of age.
Returning to the ToM-EF relationship in ASDs, given that the lack of correlations between ToM and EF found in the ASD group as compared with the control group cannot be easily dismissed as a result of the length of the testing session or age of the sample (as the two groups were matched on these factors), the question remains as to why ToM and EF should be correlated in typically developing children but largely uncorrelated in children with ASDs, where deficits in both co-exist. Something akin to the ToMM-SP model proposed by Leslie and colleagues (Leslie & Thaiss, 1992; Leslie & Roth, 1993) could potentially account for this pattern of results. In this model, typically developing children may fail false belief tasks because of their processing requirements (based on the ToM-EF correlations in the control group in this study, the SP would include planning and generativity as well as inhibitory/working memory processes), but children with autism fail because they lack a ToMM. Consistent with the results from this study and in accordance with their predictions, Roth and Leslie (1998) found that performances on a false belief task and a non-mentalistic control task with similar processing requirements were significantly correlated in typically developing children, but not autistic children.
Similarly, the lack of correlations between ToM and EF in the ASD group in this study could reflect the notion that EF factors did not add any extra variance to their ToM performance – the false belief tasks were failed because of ToM-specific factors and not because of poor EF. This interpretation of the current results is consistent with the notion that ToM may be a domain-specific capacity, although the ToMM-SP model in its original form cannot adequately account for other results obtained in this study as it holds that a ToM deficit is primary to autism. This interpretation would also suggest that the correlations observed in the control group were caused by some individuals performing poorly on the false belief tasks because of weaknesses in aspects of EF and not because of a ToM impairment. This explanation would therefore favour an expression account of the ToM-EF relationship in typical development. However, while this explanation can account for the lack of ToM-EF correlations in the ASD participants who failed ToM tasks, it does not explain the lack of a relationship in ASD participants who showed EF impairments but intact ToM. The expression account of the ToM-EF relationship would predict that those ASD participants showing impairments in the EF components which were correlated with ToM performance in the control group (e.g., planning, generativity) should sometimes fail ToM tasks because of poor EF, thereby resulting in correlations between ToM and EF performance. These correlations would only occur in the half of the ASD sample showing impaired EF and may therefore have been too weak to emerge as significant.
Another possibility, though, is that those ASD participants who scored well on false belief tasks did so via non-conventional routes to success, such as by using the compensatory strategies described in Section 2.1.3 of Chapter 2. If so, then those EF capacities which are normally required for successful ToM performance may not have been needed, as the task-solving strategy may have been previously learned or taught and therefore not dependent on online problem-solving skills. This speculation requires empirical confirmation, however. One method of testing it would be to examine correlations between EF measures and higher-level, more advanced and ecologically valid measures of ToM, for which compensatory strategies may be less likely to have developed. It is not the case, however, that there was a complete absence of correlations between ToM and EF in the ASD group. One non-verbal generativity variable (the Stamps task restriction score) showed a significant correlation with simple false belief task performance such that poorer generativity (higher restriction) was associated with unstable performance on the simple false belief task. While generativity has not previously been an EF component of particular focus in the literature on the ToM-EF relationship, its potential role in the false belief performance of children with autism has been highlighted previously by Peterson and Bowler (2000). Peterson and Riggs (1999) had earlier argued that tests of false belief and subtractive reasoning (assessed by asking a question such as “If the marble had not been moved, where would it be now?”) require similar counterfactual reasoning capabilities, in that they both involve processing a negative counterfactual question of the form “If not-F, then Q” (where F is a known fact and Q is a question).
However, Peterson and Bowler’s (2000) results, which showed that subtractive reasoning ability appeared to be necessary but not sufficient for accurate false belief performance in children with autism, led them to suggest that the false belief task required a crucial additional factor: that of generativity. They argued that in subtractive reasoning tasks, children are given the supposition “not-F” as part of the problem, but in false belief tasks it must be generated. A generativity impairment could therefore explain why children with autism found false belief tasks more difficult than subtractive reasoning tasks in their study. They proposed that both subtractive reasoning and generativity were additional requirements for successful false belief performance, beyond basic mentalistic understanding. Although no subtractive reasoning tasks were included in this study, this kind of model fits quite well with the current data, which suggested that ToM performance was largely independent of EF-related factors in the ASD group, but that generativity played some role in false belief performance. It is not clear, however, why the restriction score did not show significant correlations with performance on the more difficult first- and second-order false belief tasks (although these correlations were in the predicted direction). Nevertheless, this result requires a slight modification to the two ToMM-SP-like and compensatory strategy accounts proposed above, indicating that it may be the case that some individuals with ASDs showed unstable performance on simple false belief tasks because of a generativity impairment.
4.4.4 Towards a “multiple primary deficits” model of ToM and EF in ASDs
The next challenge is to attempt to unite this rather complex set of results on the profile, primacy, and independence of ToM and EF deficits into a coherent theoretical framework.
In the introduction to this chapter, six hypotheses regarding the primacy and independence of ToM and EF in ASDs and their implications for theories of autism and models of the ToM-EF relationship were considered. The first hypothesis was that there is only a single, primary deficit in ASDs, with no secondary impairments. As both ToM and EF deficits were present in this sample of individuals with ASDs, this hypothesis was not supported. Hypotheses 2 and 3 represented different scenarios in which ToM and EF deficits were related in ASDs, either because one caused the other or because both were caused by a third, more primary deficit. Neither of these hypotheses was supported in this study, with the evidence suggesting that ToM and EF deficits are largely independent in ASDs, with their co-occurrence most likely explained by the proximity of their neuroanatomical substrates. Hypotheses 4, 5, and 6 all proposed that ToM and EF deficits were independent, but differed in terms of the primacy of those deficits. Notwithstanding the other explanations for the results considered in previous sections of this discussion, the non-universality and incomplete explanatory value of both ToM and EF deficits in this study indicate that neither ToM nor EF deficits meet all the criteria for primacy. This rules out hypothesis 4 (that either ToM or EF is the single primary cognitive impairment of ASDs) and also excludes hypothesis 5 (that both ToM and EF deficits are primary). This leaves hypothesis 6: that ToM and EF impairments are independent in ASDs, and neither meets all criteria for primacy. Of the six hypotheses, this found the most support in the results from this study.
Three different versions of this “multiple deficits” model were presented in the introduction: i) different ToM and EF profiles are found in different subgroups of individuals with autism, rather than both deficits co-occurring in all individuals (in this model, explanatory value across all ASD individuals may be low because the presence of different subgroups may obscure relationships in the overall sample); ii) ToM and EF deficits underlie different aspects of symptomatology, and therefore may be present in differing degrees of severity according to the individual’s position on a multidimensional autism spectrum; or iii) there may be an unidentified third deficit which is at least equally primary (and may underlie symptoms which were unrelated to ToM and EF). A fourth version was also proposed in Section 4.4.2 of this discussion: that iv) different stages of development of individuals with ASDs may be associated with different primary cognitive deficits. Each of these four possibilities will now be considered in turn.
i) Subgroups. Subgroups of individuals with ASDs can be classified or defined in several different ways, such as by severity of symptoms, the domains in which symptoms are present, or level of intellectual impairment (e.g., Beglinger & Smith, 2001; Prior et al., 1998). The only subgroups which were explicitly examined in this study were the “full criteria” and “partial criteria” subgroups, defined according to the number of domains in which a higher-than-threshold level of symptomatology was present (as assessed by the ADI-R). Although these two subgroups showed different patterns of performance on the Pattern Meanings task, which suggested that their verbal generativity difficulties may be expressed in slightly different ways, there were no other differences between the two subgroups on any other EF or ToM measures.
This indicates that the number of domains in which symptoms are present does not relate systematically to the profile of ToM and EF deficits displayed. However, this subgrouping method did not distinguish which symptom domains were present within the “partial criteria” subgroup, which may have obscured more fine-grained differences. While other subgroup divisions were not specifically analysed, the lack of significant or strong correlations between ToM and EF variables and any measures of symptom severity also suggests that subgroups based on overall symptom severity (rather than presence of symptoms in particular domains) are also unlikely to be associated with consistent profiles of ToM and EF deficits. It is a stronger possibility that subgroups based on level of functioning (as measured by IQ) may be characterised by different ToM and EF profiles, as several group differences on both ToM and EF measures were mediated by VIQ. Previous research has also suggested that level of functioning (which has been measured by adaptive skills as well as IQ) has shown the most promise in discriminating subgroups and predicting outcome (see Beglinger & Smith, 2001; Fein et al., 1999; Stevens et al., 2000). When comparisons were conducted between “low VIQ” and “high VIQ” subgroups within the ASD group[31], it was found that the low VIQ subgroup performed significantly more poorly on ToM measures (including both aggregate scores), and on one EF measure (the ToL). However, there were no other EF task differences such that the high VIQ subgroup performed more poorly than the low VIQ subgroup, which suggests that this subgroup division did not map directly onto “ToM-impaired, EF-intact” and “EF-impaired, ToM-intact” subgroups, instead indicating that the low VIQ group was more impaired in ToM and equally impaired in most domains of EF relative to the high VIQ group. However, this assumes that there are only two subgroups based on ToM-EF performance and VIQ.
This is unlikely, as there is at least a third subgroup showing both ToM and EF deficits (as the incidence of ToM-EF dissociations was not 100%). Furthermore, there may be more subgroup divisions which vary according to the specific EF profile displayed. A better way of determining how many subgroups based on ToM and EF performance there are and how they relate to other measures such as IQ or symptomatology would be to employ cluster analysis, where the characteristics of ToM-EF clusters could be examined to determine how the subgroups should be defined behaviourally. This would require a large sample which varied considerably on IQ and symptomatology. Conclusions about the validity of the subgroup notion therefore await further investigations, although subgroups based on symptom domains or symptom severity were not strongly supported by the current data.
[31] A PIQ-based division did not reveal any significant subgroup differences in ToM or EF performance.
ii) An autism spectrum with multiple dimensions. Rather than proposing several discrete subgroups, this model conceives of autistic symptomatology as varying on a more continuous spectrum. In a single primary deficit model, this spectrum could be unidimensional, but in a multiple deficits model, there would need to be more than one dimension, with ToM and EF deficits each underlying a different dimension. In the version of this model presented in the introduction, these dimensions corresponded to symptom domains, such that, for example, a ToM deficit was the basis for social impairment and EF deficits accounted for repetitive behaviours. Thus, each ASD individual’s profile of ToM and EF deficits would determine the nature and severity of their symptomatology (so, a more severe ToM deficit would be associated with more severe social impairment).
In this model, the apparent presence of subgroups of “ToM-impaired, EF-intact” and “EF-impaired, ToM-intact” individuals would be an artefact of the arbitrary cutoff for “impairment” within a continuous distribution of scores. There were no bimodal distributions of continuous variables in this study (although ToM performance was highly skewed), suggesting the dimensional variation notion is appropriate at least for EF performances. However, the particular version of the model alluded to above was not supported in this study, with ToM performance showing no significant correlations with behavioural measures and various EF deficits correlating significantly with both social/communicative functioning and repetitive behaviours. As discussed earlier, environmental and developmental factors may have contributed to these results; however, as it stands, the data are not consistent with this model. Alternative versions of this model are nevertheless possible. For example, rather than the dimensions corresponding to symptom domains, there may be one dimension for number and severity of symptoms and one for level of functioning (as proposed by Szatmari et al., 2002). Perhaps EF deficits could then be associated with the former dimension (as they showed greater explanatory value) and a ToM deficit could be associated with level of functioning (as ToM performance covaried more strongly with VIQ). The weak and incomplete explanatory value of EF deficits is inconsistent with this possibility, but the notion of a multidimensional spectrum appears more suited to the distributions of scores on cognitive tasks (particularly EF tasks) and warrants further investigation.

iii) A third deficit. As neither ToM nor EF deficits were able to account for the full range of symptoms displayed by this sample of individuals with ASDs, it is possible that there is a third (or more) cognitive deficit(s) which could explain those symptoms.
As stated in the introduction, this possibility is compatible with both of the accounts just described, rather than competing with them. The relative primacy and the relationship of this third deficit to ToM and EF would be open for investigation. This study does not allow any inferences about what this deficit might be, but based on current research the most obvious candidate would be weak central coherence, which has been found to characterise individuals with autism in a number of studies (Happé, 1994b, 1996, 1997; Jolliffe & Baron-Cohen, 2000; Shah & Frith, 1983, 1993). Happé (2000) has argued that weak central coherence is independent from ToM and can explain aspects of autism which ToM cannot, although another study found that ToM and central coherence were related (Jarrold, Butler, Cottington, & Jimenez, 2000). The universality, uniqueness, causal precedence, and in particular the explanatory value of weak central coherence and its relationship with ToM and EF will be interesting topics for further research.

iv) Different developmental stages. This fourth variant of a multiple primary deficits model holds that the primacy of ToM and EF deficits in autism may not remain consistent throughout different stages of development. This would mean, of course, that the criterion of “causal precedence” would not necessarily be an appropriate index of primacy. Previous research has generally not found strong EF deficits in young children with autism, while ToM deficits (or at least impairments in the proposed precursors to ToM) and social abnormalities have been more consistently documented (see Sections 2.1.3 and 2.2.3 in Chapter 2). It could be hypothesised that an impairment in ToM holds more explanatory value in the early stages of autism, but that deficits in EF somehow become more primary with age.
Furthermore, deficits in the various components of EF could also change in primacy as they develop; for example, inhibition impairments could be more important early on (as inhibitory control is typically one of the first EF components to develop), with planning and generativity impairments (which typically reach their capacity during adolescence) becoming more central to autism later in development³². It may be the case that the age at which a particular capacity usually develops is the age at which its abnormal development has the most impact on behaviour. The relatively old mean age of the sample could therefore explain why the correlations between ToM and behavioural measures were not significant, and the variability in the age of the sample could account for the relatively small number and size of significant correlations with EF components, as well as the non-universality of both ToM and EF deficits. This is obviously speculative and relies on the findings of previous studies given that the early stages of the development of autism were not studied in this research. This hypothesis would be best assessed using longitudinal studies of the development of ToM and EF and their relationship with behaviour throughout development. Notably, its plausibility is supported by previous findings that in children with Williams syndrome (which is also a genetically based developmental disorder), a change in the nature of cognitive deficits is observed at different developmental stages (Paterson, Brown, Gsodl, Johnson, & Karmiloff-Smith, 1999).

³² When the sample was divided into younger (5-8 year-old) and older (9-18 year-old) subgroups, the results of group comparisons were consistent with this hypothesis: group differences in verbal inhibition were significant for the younger subgroup but not the older subgroup, and planning and generativity impairments were significant for the older subgroup but only marginally significant for the younger subgroup (these analyses are presented in Appendix D).

If this developmental account is accepted, is it then possible that ToM and EF are related processes in younger children with autism (i.e., below the age of five years)? In Section 2.3.1 of Chapter 2, it was argued that the existence of dissociable impairments in two abilities at a certain age cannot be used to infer the independence of the two abilities throughout earlier development. As suggested earlier in regard to typically developing children, it may be the case that aspects of EF depend on ToM for their development, as argued by Perner (or vice versa as suggested by Russell, although this is less likely because of the lack of EF deficits found in young children with autism as well as EF’s later developmental trajectory), but that the two domains become independent after the crucial stage of development has passed. However, if this was the case, it would be unlikely at later ages for deficits in ToM to exist without deficits in EF (this occurred in a significant proportion of this sample for all EF components). While double dissociations could occur if impairment in one domain was acquired after the initial stage of development of the other domain, “ToM-impaired, EF-intact” dissociations could not occur if ToM was impaired from an early age, as this would result in abnormal development of EF (and vice versa if early EF impairments caused a ToM deficit). This suggests that EF deficits in ASDs are not a consequence of an early ToM deficit. Moreover, the presence of double dissociations in this sample provides evidence against both emergence accounts of the ToM-EF relationship (as well as the “common conceptual basis” accounts).
The only situation in which an emergence account may be plausible would be if EF deficits caused ToM to develop abnormally, but this ToM deficit was not apparent at later ages because the use of compensatory strategies “masked” the impairment. Nevertheless, it appears that ToM and EF deficits in ASDs are best explained as occurring independently, most likely linked by their neurobiological substrates, but possibly varying in primacy according to the age at which they usually have the most impact on behaviour.

In sum, then, results from this study suggest that deficits in ToM and certain aspects of EF characterise individuals with ASDs; neither of these deficits meets criteria for a single primary deficit, but EF deficits (in particular, deficits in verbal inhibition and generativity) are relatively more primary; and the deficits appear to be independent. This pattern of results suggests that a “multiple primary cognitive deficits” account best explains ASDs, but it remains to be seen which version of this model is most appropriate (or, perhaps, which combination of these models – this is discussed further in the General Discussion in Chapter 7). Distinguishing between these models relies fairly heavily on determining the relationship of each cognitive deficit with behavioural symptom domains; however, the difficulties with measuring the explanatory value of cognitive deficits in a precise and thorough manner (due to the indirectness and complexity of cognitive-behavioural relationships, as discussed in Section 4.4.2) prevented strong conclusions from being made on this basis. Similarly, while the non-universality of both ToM and EF deficits indicated that neither of them was singularly primary, the inferior primacy of ToM based on its lack of explanatory value remains debatable (although its non-significant ability to discriminate ASD from control individuals supported this inference).
Another method of confirming the primacy of ToM and EF deficits and testing various multiple deficits models is to examine the prevalence of these deficits and their independent occurrence or co-occurrence in relatives of individuals with ASDs – thereby determining their potential as independent subclinical markers of the ASD genotype. That is the focus of Chapters 5 and 6.

CHAPTER 5
Literature Review: The Broad Autism Phenotype

5.1 Autism as a genetic disorder
5.2 The broad phenotype
5.2.1 The behavioural phenotype
5.2.2 The cognitive phenotype
5.2.2.1 General intellectual ability
5.2.2.2 Specific cognitive deficits

As mentioned at the end of Chapter 4, examining the role of ToM and EF deficits as subclinical markers of the ASD genotype is another method of determining their primacy to ASDs. If a cognitive deficit is primary to autism, then its prevalence in individuals who carry the autism genotype (or at least the genotype for the relevant autistic trait), including those with a milder or lesser variant who do not meet criteria for an ASD diagnosis, should be higher than in the normal population (Bailey et al., 1996). An elevated incidence of a particular cognitive weakness in first-degree relatives of individuals with autism therefore provides evidence of the centrality of that deficit to autism. The independent incidence of those weaknesses in certain subgroups of families, and their relationship with behavioural traits, would also have implications for the various multiple deficits models presented in the previous chapter (this is discussed further in the introduction to Study Two in Chapter 6). This chapter contains a literature review of the genetics of autism and the broad autism phenotype, as a background for the rationale and methodology developed in Study Two. As the arguments outlined above depend upon the assumption that autism is a genetic disorder, the first section of the review presents evidence for that assumption.
The second section of the review describes research on the behavioural and cognitive characteristics of the broad autism phenotype, including previous studies of ToM and EF deficits in relatives of individuals with ASDs. Throughout the review, it will hopefully become evident why it is important to study ToM and EF in the broad phenotype and what needs to be addressed in further studies.

5.1 Autism as a genetic disorder

Because autism did not appear to run in families (i.e., it was rare for children with autism to have parents with autism), for several years a genetic basis to the disorder was rejected in favour of environmental causes such as cold, detached child-rearing practices or “refrigerator” parenting (Bettelheim, 1967; Eisenberg & Kanner, 1956). Early reports by Kanner and Asperger themselves of social and communicative difficulties and obsessional characteristics in parents of children with autism and Asperger syndrome were commonly interpreted as causing abnormal development in their children rather than reflecting a genetically based milder phenotype. However, these notions came under increasing doubt as it was realised that it would be rare for autistic individuals to develop close relationships and therefore have children, and as studies failed to find evidence of abnormal parenting styles (see Cantwell, Baker, & Rutter, 1979). Recognition of associations with mental retardation (Lockyer & Rutter, 1969) and epilepsy (Rutter, 1970) provided further evidence of a biological basis. A key study by Folstein and Rutter (1977) helped establish autism as a genetic disorder, finding a significant difference in the concordance rates for autism in monozygotic (MZ; 36%) versus dizygotic (DZ; 0%) same-sex twins. Furthermore, they found that the majority of MZ twins who did not have autism showed some type of cognitive deficit, usually involving language.
These findings have since been replicated in two large-scale studies (Bailey, Le Couteur, Gottesman, & Bolton, 1995; Steffenburg et al., 1989), although Bailey et al. (1995) found higher MZ concordance rates of 60% for autism and 92% for a broader spectrum of cognitive or social abnormalities (the DZ concordance for this broad spectrum was also higher at 10%). Based on their results, Bailey et al. (1995) estimated that the heritability for autism is greater than 90%. An elevated recurrence risk for autism has also been observed in siblings, ranging from 2% (Boutin et al., 1997; Minton, Campbell, Green, Jennings, & Samit, 1982) to 6% (Baird & August, 1985) and averaging at around 2.2% across studies (Szatmari, Jones, Zwaigenbaum, & MacLean, 1998), compared with a population base rate of around 0.1% (Fombonne, 2003). An increased rate of ASDs more broadly in twins and other relatives of individuals with autism has also been reported (Bailey et al., 1995; Bolton et al., 1994; Le Couteur et al., 1996), indicating that the genetic liability is not restricted to a narrowly defined disorder. Family members with ASDs do not always covary in diagnostic subtype or symptom severity, with MacLean et al. (1999) finding no familial aggregation of ASD subtype (i.e., autism, Asperger syndrome, or PDDNOS), and Le Couteur et al. (1996) finding as much variation in symptom severity and intellectual ability within MZ twin pairs as between pairs. These findings, along with the rapid decrease in risk rates from MZ twins to DZ twins and siblings to more distant relatives (the latter being very low; e.g., Delong & Dwyer, 1988), indicate that the genetic mechanisms of autism are not simple or Mendelian in nature and are likely to involve epistatic effects involving interactions among several genes (Pickles et al., 1995; Rutter, 2000; Szatmari, 1999). 
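The steepness of this fall-off in risk can be made concrete with a small piece of arithmetic. The sketch below is illustrative only: it divides each rate quoted above by the quoted population base rate of around 0.1% to give a rough risk ratio. Treating a twin concordance rate as a co-twin risk is a simplifying assumption, and the names `quoted_risks` and `relative_risk` are hypothetical labels introduced here, not part of any cited analysis.

```python
# Illustrative arithmetic only: all figures are the approximate rates quoted in the text.
POPULATION_BASE_RATE = 0.1  # percent (Fombonne, 2003)

# Assumption: twin concordance rates are treated as rough co-twin risks.
quoted_risks = {
    "MZ co-twin (Folstein & Rutter, 1977)": 36.0,
    "MZ co-twin (Bailey et al., 1995)": 60.0,
    "sibling, lowest estimate": 2.0,
    "sibling, highest estimate": 6.0,
    "sibling, average across studies": 2.2,
}


def relative_risk(risk_percent, base_rate_percent=POPULATION_BASE_RATE):
    """Return how many times larger a risk is than the population base rate."""
    return risk_percent / base_rate_percent


risk_ratios = {label: relative_risk(risk) for label, risk in quoted_risks.items()}

for label, ratio in risk_ratios.items():
    print(f"{label}: roughly {ratio:.0f} times the population rate")
```

On these figures the sibling risk is elevated by a factor of a few tens, whereas the MZ figures are elevated by a factor of several hundred, which is the kind of disproportionate drop that a simple single-gene account would not predict.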
Studies involving linkage analysis more directly indicate the presence of multiple susceptibility loci for autism (e.g., Risch et al., 1999; Yonan et al., 2003). The existence of MZ twins discordant for autism and high phenotypic variability within twin pairs also suggests that environmental or other factors play a role, although it remains unclear what these factors may be. Bailey, Palferman, Heavey, and Le Couteur (1998) favour genetic instability (e.g., caused by a somatic mutation), gene-environment interactions, and/or stochastic factors as explanations for variability in phenotypic expression.

5.2 The broad phenotype

Numerous studies have now demonstrated that milder forms or lesser variants of autistic symptomatology, which do not meet criteria for a diagnosis of autism, are frequently exhibited in relatives of individuals with autism (Bolton et al., 1994; Landa et al., 1992; Le Couteur et al., 1996; Pickles et al., 2000; Piven et al., 1990b, 1991, 1994; Piven, Palmer, Jacobi, Childress, & Arndt, 1997b), giving rise to the notion of a spectrum of autistic traits or “broad phenotype” of autism. As described earlier, studying the characteristics of the broad phenotype in non-autistic relatives is a useful method of identifying which traits are primary to autism. Exploration of the broad phenotype has also been helpful in identifying possible genetic mechanisms (e.g., whether traits are contributed by both parents) and in increasing the power to identify genes linked with autism. If the broad phenotype is considered to be a collection of individual traits, each of which could be related to one of the several genes which combine to cause autism, then using the broader phenotype in linkage analysis can boost the power to find genes involved in autism by increasing the number of “affected” individuals available for analysis (Folstein, Bisson, Santangelo, & Piven, 1998; Piven, 1999).
5.2.1 The behavioural phenotype

Most studies attempting to identify or define the broad autism phenotype have focussed on documenting behavioural signs, generally either by conducting family history interviews about the presence of social and communicative difficulties and repetitive behaviours in family members, or by conducting more direct interviews and assessments of personality characteristics and psychiatric disorders. As already noted, family history studies (which have generally used the Family History Interview, a semistructured interview specifically designed to assess the broad autism phenotype) have consistently found social abnormalities, communicative difficulties, and repetitive stereotyped behaviours in a substantial minority of relatives of individuals with autism (Bailey et al., 1995; Bolton et al., 1994; Le Couteur et al., 1996). The use of personality assessment tools such as the Personality Assessment Schedule (PAS; Tyrer, 1988) has revealed that parents of children with autism rate significantly higher than controls on several personality characteristics relating to social interaction such as aloof, untactful, shy, schizoid, oversensitive to criticism and lacking in empathy (Murphy et al., 2000; Narayan, Moyes, & Wolff, 1990; Piven et al., 1991, 1994, 1997c; Wolff, Narayan, & Moyes, 1988). Using Baron-Cohen, Wheelwright, Skinner, Martin, and Clubley’s (2001b) Autism Spectrum Quotient (a self-report questionnaire designed to assess features of the broad autism phenotype), Bishop et al. (in press-a) recently found elevated ratings on the “social skills” and “communication” subscales in parents of children with ASDs¹.
Abnormal pragmatic communication styles have also been detected in some parents using both interviews and direct assessments of narrative discourse (Landa et al., 1992; Wolff et al., 1988), although structural language skills are usually found to be intact (Bishop et al., in press-b; Pilowsky, Yirmiya, Shalev, & Gross-Tsur, 2003). A history of language delay appears to be a more equivocal finding, with most studies reporting language delay in only a small proportion of relatives (for a review, see Bailey et al., 1998). Obsessional traits and repetitive behaviours have been found to be relatively less common than social and communicative impairments in relatives of autistic individuals (Bailey et al., 1995; Bolton et al., 1994), although Piven et al. (1997c) found fairly high rates (almost 50%) of the personality trait “rigid” in the parents of multiplex families in their study. As pointed out by Bailey et al. (1998), the infrequency of behaviours in this category in the broad phenotype may be a consequence of the insensitivity or inappropriateness of the measures used rather than reflecting the secondary or unimportant nature of those symptoms in autism. In support of the importance of repetitive behaviours, Silverman et al. (2002) found that the severity of repetitive behaviours showed a high level of familiality in multiplex families, whereas there was little evidence for familiality in social or verbal communication domains. The risk of psychiatric disorders other than autism in relatives of autistic individuals has also been found to be elevated. In particular, increased rates of major depression have been documented in parents (Bolton, Pickles, Murphy, & Rutter, 1998; Piven et al., 1990b, 1991; Piven & Palmer, 1999; Smalley, McCracken, & Tanguay, 1995). 
In all of these studies, the majority of depressive episodes have been found to occur prior to the birth of the child with autism, suggesting that they cannot be explained by the burden of caring for a disabled child. The incidence of anxiety disorders may also be increased in relatives of autistic probands, but these findings have been less consistent. Piven et al. (1991) found an elevated rate of anxiety disorder in parents, and increased rates of social phobia have also been reported (Piven & Palmer, 1999; Smalley et al., 1995), but other studies have not found evidence of phobic disorders (Bolton et al., 1998; Piven et al., 1991) or anxiety disorders in general (Bolton et al., 1998). However, two studies have found higher rates of obsessive-compulsive disorder in relatives of autistic probands (Bolton et al., 1998; Hollander, King, Delaney, Smith, & Silverman, 2003), with the recent study by Hollander et al. showing that the occurrence of obsessive-compulsive traits or disorder in parents of multiplex families was significantly more likely if the autistic children showed high levels of repetitive behaviours. There is no consistent evidence for higher rates of other psychiatric disorders such as schizophrenia or substance abuse (Bolton et al., 1998; Piven et al., 1991; Smalley et al., 1995).

¹ These parents were part of the WAFSASD, and therefore were the parents of the probands and siblings in the current research.

5.2.2 The cognitive phenotype

While studies of the behavioural features of the broad autism phenotype have been informative, individual behavioural signs suffer from the problem of low diagnostic specificity (Bailey et al., 1998), and are therefore of limited utility as unique indicators of the broad phenotype.
In addition, because behavioural phenotypes are multiply determined and have indirect and complex relationships with underlying genotypes (i.e., the same genotype can give rise to different phenotypes, and the same phenotype can arise from a range of genotypes; Gottesman & Gould, 2003; Karmiloff-Smith et al., 2002), they are not an ideal basis for identifying genetic mechanisms. Researchers have therefore concurrently searched for a more basic subclinical marker of the autism genotype – or “endophenotype” – at the level of cognition. An endophenotype may be described as an “intermediate phenotype” or “vulnerability marker” which is unseen by the unaided eye (e.g., a neurophysiological, biochemical, endocrinological, or cognitive feature – i.e., not at the level of behaviour) and is somewhere between the disorder’s phenotype and the distal genotype (Gottesman & Gould, 2003). Endophenotypes are believed to represent a genetic liability to the disorder in unaffected individuals, and may only be indirectly related to classic symptoms of the disorder (Leboyer et al., 1998; Skuse, 2001). The identification of endophenotypes for complex genetic disorders may help address questions about aetiology and establish markers for diagnosis and classification (Gottesman & Gould, 2003).

The presence of a cognitive endophenotype is suggested when unaffected relatives of individuals with autism show a raised incidence of a cognitive deficit (or strength) that is associated with autism, but to a milder degree than in individuals with autism themselves (Hill & Frith, 2003). Studies of the cognitive phenotype have tended either to focus on the IQ profiles of relatives of autistic individuals or to investigate the presence of specific deficits in ToM, EF, and central coherence.
5.2.2.1 General intellectual ability

Because approximately 70% of individuals with autism are mentally retarded (Fombonne, 2003) and autistic individuals in general tend to have better Performance than Verbal IQ (e.g., Lockyer & Rutter, 1970), several studies have examined the possibility that the broad phenotype may be similarly characterised by an increased incidence of mental retardation and/or a Verbal-Performance IQ discrepancy. Several small early studies involving very low-functioning autistic probands found a higher rate of mental retardation in their relatives than in the general population (August, Stewart, & Tsai, 1981; Baird & August, 1985; Minton et al., 1982). However, larger and more recent studies have not replicated this result, finding that mental retardation occurs only in association with autism and not in isolation, or at least at no greater incidence than for the general population (Bailey et al., 1995; Folstein et al., 1999; Fombonne, Bolton, Prior, Jordan, & Rutter, 1997; Freeman et al., 1989; Piven et al., 1990b; Smalley & Asarnow, 1990; Szatmari et al., 1993). This suggests that the genetic liability for autism is not usually for mental retardation alone (Bailey et al., 1998). The discrepancy between earlier and later studies may be due to the severe retardation of the probands in earlier studies. Consistent with this possibility, August et al. (1981), Baird and August (1985) and Boutin et al. (1997) all reported higher rates of cognitive disabilities (including language delay, learning disabilities, and mental retardation) in relatives of low-functioning probands (but see Piven et al., 1990b, and Szatmari et al., 1993, both of whom found no association between the proband’s IQ and the cognitive and academic functioning of their relatives; Starr et al., 2001 also found comparable familial loading for the broad phenotype in low and high IQ autism families).
Although there does not appear to be an increased incidence of mental retardation, a number of studies have found significantly lower Verbal or Performance IQs than controls and/or significant discrepancies between verbal and non-verbal ability in first-degree relatives of individuals with autism. Consistent with the pattern typically observed in autistic individuals, Minton et al. (1982) found that siblings of autistic children had significantly lower VIQ than PIQ on the WISC-R and WAIS. Similarly, Leboyer, Plumet, Goldblum, Perez-Diaz, and Marchaland (1995) found that siblings of autistic females showed significantly lower verbal abilities than siblings of Down syndrome controls, but there was no difference in visuospatial abilities across the two sibling groups. The lower verbal abilities in the siblings of autistic probands appeared to be due to a proportion of brothers who showed particularly discrepant verbal and visuospatial abilities. However, other studies examining the IQ profile of relatives of autistic probands have either reported no IQ differences from controls at all (Freeman et al., 1989; Ozonoff, Rogers, Farnham, & Pennington, 1993; Szatmari et al., 1993) or have found exactly the opposite pattern of discrepancy. Three large-scale studies have found small but significant VIQ-PIQ discrepancies in parents of individuals with autism whereby VIQ was significantly higher than PIQ (Folstein et al., 1999; Fombonne et al., 1997; Piven & Palmer, 1997). Fombonne et al. (1997) found this pattern in both parents and siblings of autistic probands irrespective of the test version used (WISC-R versus WAIS) and after controlling for SES. Folstein et al. (1999) observed superior VIQ to PIQ only in parents, finding no difference in siblings. While Fombonne et al.
(1997) and Piven and Palmer (1997) both used Down syndrome controls, in the former study VIQ was significantly higher in the autism relatives and there was no difference between the groups in PIQ, whereas in the latter study PIQ was significantly lower in the Down syndrome relatives and there was no difference in VIQ.

There may be a number of reasons for these inconsistencies regarding the presence and direction of VIQ-PIQ discrepancies in relatives of autistic individuals. Firstly, siblings and parents of autistic probands do not appear to demonstrate the same IQ profile, with most sibling studies finding superior PIQ to VIQ or no difference (with the exception of Fombonne et al., 1997), while studies involving parents tend to find the opposite pattern. It has been argued that parents are by definition “selected” for parenthood in that their social and communicative functioning must be sufficient for partnership and children, and they may therefore be less impaired than siblings in domains such as VIQ (e.g., Piven & Palmer, 1997). Secondly, there is some evidence that there may be at least two subgroups of parents (and possibly siblings) showing different IQ profiles. In Folstein et al.’s (1999) study, parents with early language delays demonstrated lower VIQ than parents without language delays and no VIQ-PIQ discrepancy, leading the authors to suggest that there may be two or more patterns of IQ performance in parents of autistic probands. Consistent with this, Freeman et al. (1989) reported that approximately equal numbers of relatives showed VIQ-PIQ discrepancies in both directions (although this could equally reflect random differences).
While subgroups of parents (and siblings) showing better VIQ than PIQ appear to contradict the pattern typically found in individuals with autism, several studies have now shown that children with high-functioning autism and Asperger syndrome often show higher VIQ than PIQ (Goodman, 1989; Klin, Volkmar, Sparrow, Cicchetti, & Rourke, 1995; Szatmari et al., 1990). The possibility that parents in general may be less impaired than siblings is therefore consistent with the finding that parents more often show IQ discrepancies in favour of VIQ (mirroring the pattern observed in higher-functioning probands). There is also the possibility that high-functioning autism is genetically different to low-functioning autism (MacLean et al., 1999; Szatmari et al., 2002), and is associated with different IQ profiles in relatives; direct correlations between proband and relative IQ have generally not been reported, however. Finally, the role of other methodological differences between studies, such as the IQ subtests used, sampling methods, unit of analysis (aggregation of familial data versus inclusion of individual sibling scores), age and gender of the probands and/or relatives, and range of ASD diagnoses included, is yet to be clarified.

5.2.2.2 Specific cognitive deficits

Given the variability in studies of IQ profiles, researchers have increasingly turned their attention to the investigation of specific cognitive deficits as potential endophenotypes for autism. Studies of the specific cognitive phenotype have been driven by concurrent research on primary cognitive deficits in autism, focussing on the three main current cognitive theories: ToM, EF, and weak central coherence. This not only allows more precise delineation of the broad cognitive phenotype, but also represents a method of testing the primacy of deficits in those domains to autism.
To date, only three published studies have examined the mentalising abilities of relatives of individuals with autism, with contrasting results. Ozonoff et al. (1993) found no significant differences between siblings of autistic individuals and learning-disabled controls on three ToM tasks. However, they acknowledged that according to their power analyses, the ToM measures used were not sensitive enough to detect any deficits in non-autistic siblings, and they suggested the use of higher-level tasks. Baron-Cohen and Hammer (1997) employed Baron-Cohen et al.’s (1997) Eyes Task with parents of children with Asperger syndrome. They found that both mothers and fathers in the Asperger group showed subtle but significant impairment on the task compared with control mothers and fathers. Using the same task, a recent study by Dorris, Espie, Knott, and Salt (2004) replicated these findings in siblings of children with Asperger syndrome, who displayed poorer performance on the task compared with control siblings.

EF performance in relatives of autistic probands has been addressed in several studies, most of which have focussed on measures of planning and set-shifting. Significantly poorer performance by parents of individuals with autism compared with control parents (including parents of children with learning disabilities and Down syndrome) on Tower tasks (i.e., the Towers of Hanoi and London and the Stockings of Cambridge test from the CANTAB battery) was found by Hughes, Leboyer and Bouvard (1997) and Piven and Palmer (1997), and the same result in siblings was obtained by Ozonoff et al. (1993) and Hughes, Plumet, and Leboyer (1999). In Hughes et al.’s (1997) study, the difference in planning ability was restricted to fathers only, and in both of the studies by Hughes et al.
(1997, 1999) a planning deficit was restricted to a subset of the relatives of autistic probands, with group differences only emerging clearly when the proportions of participants showing a deficit were compared.

Findings of no group differences on the WCST in parents (Szatmari et al., 1993) or siblings (Ozonoff et al., 1993) were initially suggestive of intact cognitive flexibility in relatives of individuals with autism. However, two subsequent studies using the IDED set-shifting task found that a subset of both parents and siblings of autistic probands demonstrated difficulties with the extra-dimensional shift stage of the task (Hughes et al., 1997, 1999). Hughes et al. (1999) postulated that this discrepancy between the results observed using the WCST and the IDED task may be due to the fact that the IDED task involves a total change of stimuli at each shift and so perseverative responses are limited to high-level dimensional shifting difficulties rather than specific exemplars. However, an argument that the WCST is lower-level than the IDED task is inconsistent with findings on probands themselves, who generally show difficulties on the WCST more often than the IDED task. It may be that subsets of the samples tested with the WCST showed a deficit, as in the Hughes et al. studies, but this was not directly examined.

The two studies by Hughes et al. (1997, 1999) also incorporated working memory measures, with different patterns of results for parents and siblings. Both studies included a spatial working memory task involving a high demand for strategy use and therefore purportedly the “central executive” (Baddeley, 1986), and a simple spatial span task with low executive or strategic requirements which served as a control task. Parents of autistic probands showed intact spatial spans, but fathers made a significantly higher number of errors than normal control fathers on the more strategic working memory task (Hughes et al., 1997).
However, there was no difference between the fathers of autistic probands and the fathers of learning disabled controls, indicating that the deficit was not unique to autism families. By contrast, siblings of autistic probands showed superior spatial spans to siblings of both developmentally delayed and normal controls (as well as better verbal short-term memory for recently presented items), but there were no group differences on the more strategic working memory task (Hughes et al., 1999). Together, these results suggest that a working memory deficit is not as reliable or unique a characteristic of the broad phenotype as problems with planning and set-shifting. Other components of EF such as inhibition and generativity have not been as well studied in relatives of autistic probands. Hughes et al. (1999) included a verbal generativity task (word fluency) in their battery with siblings, and found both a significant group difference overall in the number of words generated and a higher proportion of “low fluency” participants in the autism sibling group. This promising result requires replication and extension to parent samples using both verbal and nonverbal generativity tasks. Similarly, tests of both verbal and non-verbal inhibition are yet to be employed with either siblings or parents. The possibility that weak central coherence may characterise the broad autism phenotype has also received some attention. No evidence of a relative strength in the Block Design subtest from the Wechsler scales, which purportedly indicates weak central coherence (Shah & Frith, 1993; Happé, 1994c), was found by Szatmari et al. (1993) or Fombonne et al. (1997) in parents or siblings, even when the analysis was restricted to relatives with the broad phenotype. 
Using the Embedded Figures Test, arguably a more direct test of central coherence, Baron-Cohen and Hammer (1997) found that both mothers and fathers of children with Asperger syndrome were faster to identify hidden shapes (indicating a tendency for piecemeal, detail-focussed processing). Happé, Briskman, and Frith (2001) included a larger range of both verbal and visuospatial measures of central coherence with both parents and siblings, and found that parents of children with autism, particularly fathers, showed a significant bias towards piecemeal processing across the four tasks used compared with parents of children with dyslexia and with no disorder. There were no significant differences among the sibling groups, however.

Research on specific cognitive deficits has therefore revealed that deficits in ToM, EF, and central coherence may all be characteristic of the broader phenotype, but that results across studies are often inconsistent. In all three domains, significant differences have been found in parents of autistic probands but often not in their siblings, contrary to the notion that parents should be less impaired than siblings because of the selection for parenthood described earlier. Even on measures of planning and set-shifting where significant differences among sibling groups were found, Hughes et al. (1999) noted that the deficits were not as strong in siblings as in parents. While it is not clear that the selection for parenthood should extend beyond social and communicative capabilities to cognitive characteristics, these parent-sibling discrepancies still require explanation. Hughes et al. (1999) proposed that the use of computerised tasks may favour young participants over parents, but this does not account for studies using non-computerised tasks. Happé et al. (2001) suggested that the tasks used may not be sufficiently sensitive in younger subjects.
However, many studies, including theirs, have found differences in parents (who one would expect to be at a higher level than their children) using the same tasks as for siblings². These authors also suggested that genetically determined cognitive weaknesses may only emerge at a certain age or become more pronounced with age. This would be an unusual finding in domains such as ToM and certain aspects of EF, though, which typically develop relatively early in life. These parent-sibling discrepancies therefore remain difficult to explain.

² Low sensitivity may be a result of floor effects in children as well as ceiling effects, but there is no evidence of floor level performance in the sibling studies reported above.

A number of issues appear worthy of further investigation in future broad phenotype studies. Firstly, identification of the profile of EF performance in relatives of individuals with autism on tasks measuring the full range of EF components is yet to be achieved and will augment research on EF deficits in probands. Secondly, comparison of inter-task correlations in relatives of probands with autism versus control relatives could be informative, with Hughes et al. (1997, 1999) finding unusual associations between task performances in both parents and siblings of autistic individuals, which they interpreted as suggesting the use of different strategies in performing the tasks. Thirdly, given that cognitive deficits are often only found in a subset of relatives, it remains to be seen whether this subset displays both ToM and EF deficits and therefore represents a general “cognitive impairment” subgroup, or whether there are different subgroups with different types of cognitive deficit (most studies have only examined one cognitive domain). Fourthly, the relationship between performance on cognitive tasks and the presence of certain behavioural traits is another important issue. Hughes et al.
(1997) found a modest but significant correlation between a composite EF score and interviewers’ pre-test impressions of social abnormalities in parents of autistic probands, and Briskman, Happé, and Frith (2001) found that parents of autistic individuals who reported more preference for nonsocial activities in everyday life tended to show weaker central coherence on testing. These findings show that the cognitive weaknesses observed in the broad autism phenotype may hold relevance in accounting for subtle behavioural traits also displayed by parents and siblings, but the nature and specificity of these cognitive-behavioural relationships remains unclear, and could be important for assessing the validity of the “multidimensional spectrum” version of the multiple primary deficits model of ASDs (see Section 4.4.4 in Chapter 4). Finally, while several studies have examined relationships between the IQ of the proband and the behavioural or cognitive traits of family members, no published studies have directly correlated performances of probands and relatives on the same ToM or EF tasks. This could be a useful method of identifying which aspects of cognitive functioning in autism are the most highly familial (and therefore which may be most strongly coded in the autism genotype). In sum, greater specification of the cognitive and behavioural characteristics of the broad autism phenotype will hopefully aid progress in identifying relationships between genotype, endophenotype, and behavioural phenotype both in autism and its milder variants. These issues are examined in Study Two. 
CHAPTER 6
Study Two: Theory of Mind and Executive Function in Siblings of Individuals with Autism Spectrum Disorders

6.1 Introduction
  6.1.1 Aims
  6.1.2 Hypotheses
6.2 Method
  6.2.1 Participants
  6.2.2 Procedure
6.3 Results
  6.3.1 Sibling group comparisons on ToM and EF tasks
    6.3.1.1 False belief tasks
    6.3.1.2 Tower of London
    6.3.1.3 IDED Set-shifting task
    6.3.1.4 Response Inhibition and Load task
    6.3.1.5 Opposite Worlds task
    6.3.1.6 Pattern Meanings
    6.3.1.7 Uses of Objects
    6.3.1.8 Stamps task
    6.3.1.9 Summary of sibling group comparisons
  6.3.2 Comparisons between ASD siblings and ASD probands
  6.3.3 Ability of cognitive variables to predict sibling group membership
  6.3.4 Proband-sibling relationships within the ASD families
    6.3.4.1 Correlations between proband IQ and siblings’ cognitive performances
    6.3.4.2 Correlations between probands’ and siblings’ cognitive performances
  6.3.5 Prevalence of deficits in ASD siblings
  6.3.6 Correlations between ToM and EF
  6.3.7 Dissociations between ToM and EF
  6.3.8 Results from behavioural measures
6.4 Discussion
  6.4.1 Endophenotype status of ToM and EF impairments
  6.4.2 Differentiating the multiple deficits models

6.1 Introduction

6.1.1 Aims

As reviewed in Chapter 5, weaknesses in both ToM and EF have been identified in parents and/or siblings of autistic probands in previous studies, but findings have been inconsistent and there are several empirical issues yet to be examined. Study Two is an investigation of ToM and EF deficits in siblings of individuals with ASDs. The main aims of this study were i) to identify whether ToM or EF performance meets criteria for an endophenotype or vulnerability marker for the autism genotype, and thereby seek confirmation of the results of Study One regarding the relative primacy of ToM and EF in ASDs; and ii) to collect further information relevant to distinguishing the various multiple deficits models presented in Chapter 4 (Section 4.4.4). These aims were addressed in several ways.
i) Aim 1: Determining endophenotype status. In this study, those ToM and EF tasks on which probands with ASDs showed significantly poorer performance than control probands in Study One were administered to siblings of individuals with ASDs (“ASD siblings”) and control siblings. These tasks therefore included measures of false belief understanding, planning, and both verbal and non-verbal inhibition and generativity (even though non-verbal inhibition was found to be intact in probands using the RIL task, that task was administered because probands showed difficulties in the condition combining working memory and inhibitory requirements). Although only marginal differences were found between proband groups on the IDED set-shifting task, this task was also administered with siblings to enable comparison with Hughes et al.’s (1999) previous findings on siblings using the original IDED task. The inclusion of both verbal and non-verbal inhibition and generativity tasks represents an extension to previous research. Gottesman and Gould (2003, p. 639) outline five criteria for the identification of an endophenotype. The following numbered points list these criteria and describe how they were tested in the current research. 1. “The endophenotype is associated with illness in the population”. This was demonstrated in Study One, which showed that both ToM and EF deficits were associated with having an ASD diagnosis. 2. “The endophenotype is primarily state-independent (manifests in an individual whether or not illness is active)”. This criterion is somewhat irrelevant in the case of ASDs, as the disorder is present throughout the lifetime of the affected individual (unlike, for example, depression or schizophrenia). Of course, one would still expect the endophenotype to manifest in affected individuals. 3. “Within families, endophenotype and illness co-segregate”.
This was indirectly assessed by examining whether there was an increased incidence of ASDs in siblings of ASD probands as compared with the normal population (the control siblings were not used as a comparison group in this situation because having any child with a clinical diagnosis of an ASD was an exclusion criterion for control families). However, comparisons between affected siblings and controls on ToM and EF variables (to assess whether the siblings with ASDs showed a similar profile of deficits as the ASD probands) were not conducted because the size of the affected group was too small for meaningful analyses (see Section 6.2.1), particularly on some tasks which were administered to participants within a restricted age range. 4. “The endophenotype found in affected family members is found in nonaffected family members at a higher rate than in the general population”. This was tested by examining whether there were any group differences in ToM or EF performance between ASD and control siblings which remained significant after siblings with ASD diagnoses were excluded. The ability of any deficits to discriminate ASD siblings from control siblings was also calculated, as any useful endophenotype should be unique to the disorder in question (Skuse, 2001). 5. “The endophenotype is heritable”. This was assessed by calculating correlations between the ToM and EF performances of the ASD siblings in Study Two, and i) the level of intellectual ability and ii) the ToM and EF performances of the ASD probands in Study One, with the assumption that significant correlations would be suggestive of a degree of familiality to the trait. A sixth feature would also be expected, which is: 6. “The severity of the endophenotype in nonaffected family members is milder than in the affected family members” (Slaats-Willemse, Swaab-Barneveld, de Sonneville, van der Meulen, & Buitelaar, 2003). 
Relative severity of any ToM and EF deficits in unaffected siblings as compared with affected probands was assessed by comparing the effect sizes of any significant differences between sibling groups with effect sizes for the proband groups. Hence, this study addressed criteria 4, 5, and 6. The ability of ToM and/or EF deficits to sufficiently meet these three criteria would suggest that they could be endophenotypes for ASDs and therefore that they have some degree of primacy to ASDs. If one deficit is better able to meet these criteria than the other, this would suggest superior relative primacy.

ii) Aim 2: Testing multiple deficits models. If ToM and/or EF deficits were identified in ASD siblings, the pattern of results would have implications for the various versions of the multiple deficits model presented in Section 4.4.4 of Chapter 4. Although the “third primary deficit” and “different developmental stages” versions of the multiple primary deficit model were not tested in this study, it was possible to examine (somewhat indirectly) the plausibility of the other two versions (the “subgroups” and “multidimensional spectrum” models). If there were different subgroups of ASDs displaying different ToM and EF profiles, then assuming these subgroups corresponded with different ASD genotypes, it would be predicted that similar subgroupings would be evident in the broad autism phenotype. This would be demonstrated by results indicating the presence of “ToM-impaired, EF-intact” and “EF-impaired, ToM-intact” siblings as was the case for probands (although impairments would be more subtle), and furthermore it would be expected that siblings demonstrating a particular ToM-EF profile would be the siblings of probands demonstrating that same profile. These possibilities were examined in several ways. Firstly, the prevalence of any deficits identified in group comparisons was calculated, to examine whether they appeared to occur only in a subset of ASD siblings.
Secondly, correlations between ToM and EF in both ASD and control sibling groups were also conducted to investigate whether ASD siblings showed a similar independence between ToM and EF as was the case in probands; or if not, whether they may show unusual patterns of association between the two domains. Thirdly, the incidence of ToM-EF dissociations was examined. Finally, the correlations between proband and sibling scores on ToM and EF tasks would be indicative of possible familial aggregation of ToM-EF profiles, as described earlier.

The version of the multidimensional spectrum model examined in this study was the one in which ToM and EF were purported to underlie different domains of symptomatology, although it is acknowledged that other versions (which may be more plausible based on the results of Study One) are possible. Abnormal social behaviours and repetitive behaviours in ASD siblings were both measured in this study. Although it would be expected that unaffected ASD siblings would not show symptoms of autism even if they showed a cognitive endophenotype, under this version of the spectrum model it would still be predicted that ToM or EF weaknesses would be associated with increased levels of symptomatology in the relevant domain, even if that symptomatology was subclinical. Therefore, this model was examined by first analysing sibling group differences on behavioural measures (to confirm that some subclinical symptomatology was present in ASD siblings), and then conducting correlations between ToM and EF performances and these behavioural measures within the ASD sibling group. However, it was recognised that this methodology may be subject to the same problems as cognitive-behavioural correlations conducted in Study One.

6.1.2 Hypotheses

Predictions for endophenotype status.
Given that neither ToM nor EF deficits met all of the criteria for primacy in Study One, no strong predictions were made with regard to whether or not either domain would adequately meet all criteria for an endophenotype of ASDs, although previous research has suggested that both ToM and EF have potential endophenotype status. More confident predictions could be made in terms of relative primacy, as EF deficits were found to be relatively more primary in Study One. On this basis, it would be expected that i) significant weaknesses in ASD siblings on EF tasks (which are less severe than the deficits displayed by probands) would be more likely than on ToM tasks, and would be better able to predict group membership; and ii) correlations between the EF performance of ASD siblings and probands would be stronger than correlations between ToM performances of ASD siblings and probands. Furthermore, it would be expected that the EF variables which demonstrated the strongest evidence of primacy in ASD probands (i.e., verbal inhibition and verbal generativity) would be the most likely variables on which weaknesses in ASD siblings would emerge. However, given the concerns with interpretation of some of the findings relevant to primacy in Study One, the possibility remained open that a ToM deficit may also meet criteria as an endophenotype to an equal degree as EF deficits.

Predictions for multiple deficits models. Based on the results of Study One and on previous research, it was predicted that if ToM and EF weaknesses were found, they would only be evident in a subset of ASD siblings. Beyond that, however, given that the results of the analyses relevant to the different multiple deficits models were very much dependent upon the results of analyses relevant to determining endophenotype status, no specific predictions were made prior to conducting the study. This aspect of the study may therefore be considered exploratory.
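The relative-severity comparison (criterion 6) depends on placing sibling and proband group differences on a common scale. A minimal sketch of one way to do this is given below, assuming Cohen's d with a pooled standard deviation is the effect size of interest; the thesis does not specify the formula at this point, and every number in the example is hypothetical rather than taken from either study.

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardised mean difference (group a minus group b), pooled SD."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

# Hypothetical error scores (higher = poorer); NOT values from the thesis.
d_siblings = cohens_d(12.0, 4.0, 100, 11.0, 4.0, 60)  # ASD vs control siblings
d_probands = cohens_d(15.0, 4.0, 46, 11.0, 4.0, 48)   # ASD vs control probands

# Criterion 6 expects a milder effect in unaffected siblings than in probands.
milder_in_siblings = abs(d_siblings) < abs(d_probands)
```

With these illustrative inputs the sibling effect is a quarter of a pooled standard deviation and the proband effect a full standard deviation, which is the pattern criterion 6 would predict.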
6.2 Method

6.2.1 Participants

Siblings of ASD Group (“ASD siblings”)¹. There were 108 siblings in this group, ranging in age from 4 to 29 years. These siblings came from 68 families, thus in some cases there was more than one sibling per family. Six siblings had received clinical diagnoses of ASDs: three with diagnoses of autism, one with Asperger syndrome, and two with PDDNOS. Three additional siblings had received diagnoses indicating language impairment. As for the control group in Study One, autistic symptomatology in siblings was assessed using the ASQ, and the ADI-R was administered for anyone scoring above 10. All six siblings with clinical diagnoses of ASDs and two of the three with language impairment met either full or partial criteria for autism on the ADI-R. In addition, two other siblings without clinical diagnoses met partial criteria on the ADI-R. Hence, there were 10 siblings altogether who met criteria for an ASD, which is 9.3% of the ASD sibling group, a 31-fold increase compared to the population prevalence for ASDs, which is around 0.3% (including ASDs besides autism; Fombonne, 2003). Exclusion criteria were the same as for Study One (genetic abnormalities or neurological dysfunction, except for epilepsy). No siblings were excluded for these reasons. There were 10 ASD siblings with other clinical diagnoses (6 with ADHD, 2 with epilepsy, 1 with dyspraxia, and 1 with dyslexia).

¹ This group included siblings of some probands with ASDs who were not included in Study One of this thesis because they were too low-functioning (but who were recruited as participants in the WAFSASD). Given that the ASD siblings were themselves matched with the siblings of the control group on age and PIQ, the inclusion of the siblings of low-functioning children with ASDs was considered to be valid. The relationship between siblings’ performance on cognitive tasks and the level of functioning of the proband was also examined, as reported in Section 6.3.4.1.

Siblings of Control Group (“Control siblings”). Sixty-seven control siblings ranging in age from 4 to 24 years participated in the study. These siblings came from 49 families. No siblings in this group had clinical diagnoses of ASDs or exceeded the cutoff criterion on the ASQ. Three control siblings had other clinical diagnoses (2 with ADHD and 1 with epilepsy).

Demographic characteristics of each group are presented in Table 22. The ASD and control siblings were matched on chronological age, t(173) = .90, p > .1, and PIQ, t(173) = .08, p > .1. All participants had a PIQ and VIQ of 60 or above. ASD siblings had significantly lower VIQs than control siblings, t(173) = 2.28, p < .05. However, when ASD siblings who met full or partial ADI-R criteria were excluded, the difference in VIQs became only marginally significant, t(163) = 1.74, p = .08. There was a higher proportion of girls in the control sibling group (68.7% vs. 42.6% in the ASD sibling group), χ2 (1, N = 175) = 11.27, p < .01. This is likely because, in attempting to select proband samples matched on gender, the male child in the family was often selected as a control proband, resulting in a higher proportion of female siblings in the control sibling group. To ensure sibling group comparisons were not influenced by gender, it was included as an independent variable in all analyses. This also enabled evaluation of any group by gender interactions (e.g., it may be that brothers of ASD probands show greater heritability of autistic-like cognitive traits than sisters).

Table 22.
Demographic characteristics of the sibling samples

                         ASD siblings             ASD siblings, ADI-R       Control siblings
                                                  subgroup excluded
N                        108                      98                        67
Age: Mean (SD, range)    11.33 (5.38, 4-29)       11.62 (5.42, 4-29)        10.61 (4.71, 4-24)
Male: Female             62: 46                   53: 45                    21: 46
PIQ: Mean (SD, range)    107.63 (17.08, 70-149)   107.9 (17.7, 70-149)      107.42 (16.18, 58-146)
VIQ: Mean (SD, range)    102.41 (15.02, 66-141)   103.67 (14.37, 66-141)    107.61 (14.14, 72-138)

With an n of 108 in the ASD sibling sample and 67 in the control sibling sample, the power of the study to detect medium sized effects (i.e., d = .5) at an alpha level of .05 was excellent at .94.

6.2.2 Procedure

The same tests, questionnaires and interviews were used in this study as in Study One², all of which are described in Chapter 3. The procedure for this study was also identical to that used in Study One (see Section 4.2.2 of Chapter 4).

6.3 Results

This section includes analyses addressing i) group comparisons between ASD and control siblings on ToM and EF tasks, both before and after exclusion of siblings who met ADI-R criteria for an ASD; ii) the relative severity of any weaknesses in ASD siblings compared with ASD probands; iii) the ability of task variables to predict ASD/control sibling group membership; iv) relationships between probands’ and siblings’ scores; v) the prevalence of deficits in the ASD sibling group; vi) correlations between ToM and EF variables; vii) dissociations between ToM and EF performances; and viii) results from behavioural measures. Hence, analyses i) to iv) assess the endophenotype status of ToM and EF, and v) to viii) are aimed primarily at assessing the subgroup and multidimensional spectrum models. As for Study One, SPSS Version 10.0.5 was used for all analyses. Data screening was handled in the same way as for Study One (see Section 4.3.1 in Chapter 4).
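The matching statistics in Section 6.2.1 can be recovered from the summary values in Table 22. The sketch below assumes an equal-variance independent-samples t test, which is consistent with the reported degrees of freedom of 173, and uses a normal approximation for power. The thesis does not state how the power figure was obtained, so the power function is an illustrative assumption; the function names are likewise hypothetical.

```python
import math
from statistics import NormalDist

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def approx_power(d, n1, n2, alpha=0.05, two_tailed=True):
    """Normal-approximation power for a two-group comparison of means."""
    z = NormalDist()
    delta = d * math.sqrt(n1 * n2 / (n1 + n2))  # expected test statistic under H1
    crit = z.inv_cdf(1 - alpha / (2 if two_tailed else 1))
    return z.cdf(delta - crit)  # the negligible opposite tail is ignored

# Means, SDs and ns as given in Table 22 (ASD siblings vs control siblings).
t_age, df = pooled_t(11.33, 5.38, 108, 10.61, 4.71, 67)
t_piq, _ = pooled_t(107.63, 17.08, 108, 107.42, 16.18, 67)
t_viq, _ = pooled_t(107.61, 14.14, 67, 102.41, 15.02, 108)  # control minus ASD

power_one_tailed = approx_power(0.5, 108, 67, two_tailed=False)
power_two_tailed = approx_power(0.5, 108, 67, two_tailed=True)
```

Rounded to two decimals, the three t values agree with those reported in the text (.90, .08 and 2.28). Under the normal approximation, power for d = .5 is roughly .94 for a one-tailed test and roughly .90 for a two-tailed test.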
² Although there were no group differences on the Dewey stories and Relational Complexity tasks in the proband study, these tasks were actually also administered to the siblings in this study. Hence, the order and length of task administration was identical across the two studies. Results from these tasks were analysed out of interest and there were no significant differences between the sibling groups.

6.3.1 Sibling group comparisons on ToM and EF tasks

The consistent approach to group comparisons that was used in Study One (as described in Section 4.3.2) was also followed in this study. The ASD and control sibling groups remained matched on age and PIQ for all tests which were administered to participants within a restricted age range. For some tasks (all false belief tasks, the Opposite Worlds task, the RIL task, the IDED set-shifting task, and the Stamps task), the two groups of siblings who completed those tasks, including participants meeting full or partial criteria on the ADI-R, were also matched on VIQ.

As mentioned in Section 6.2.1, gender was included as an independent variable in all sibling group comparisons on continuous variables. In the case of dichotomous variables, gender effects were assessed by conducting separate chi-square analyses for brothers (i.e., ASD brothers versus control brothers) and sisters (ASD sisters versus control sisters). These separate analyses are only reported if there were significant group differences for one gender but not the other (or if group differences were in opposite directions for the two genders); otherwise, only the results of overall chi-square analyses including both genders are reported. For all variables, separate means and standard deviations (or proportions of high/low scorers) for brothers and sisters are only reported if there were significant group by gender interactions on the task, or if displaying separate data for brothers and sisters was meaningful for other reasons.
Similarly, in analyses where age, PIQ and/or VIQ were covaried or controlled, gender was only included as an independent variable if significant interactions involving gender had been found in initial group comparisons.

In all sibling group comparisons where there were significant group differences, analyses were repeated after excluding all ASD siblings who met full or partial criteria on the ADI-R. Results of these repeat analyses are reported in each of the relevant sections. As in Study One, the influence of participants with non-ASD diagnoses (of which there were 10 in the ASD sibling group and 3 in the control sibling group) was checked by repeating all group comparisons after excluding these participants from the sample. This did not affect any of the results (i.e., both non-significant and significant differences remained so, both before and after exclusion of participants meeting full or partial ADI-R criteria), with the exception of the Stamps task complexity score. The change in the result for this variable is reported in Section 6.3.1.8.

6.3.1.1 False belief tasks

As in Study One, a large proportion of participants gained perfect scores for both belief and control questions on all false belief tasks. All variables were recoded as dichotomous such that a perfect score was coded as 1 and any other score as 0. Four participants (two ASD siblings and two control siblings) were not administered the First-order and Second-order false belief tasks due to equipment malfunction. These participants had all passed the Simple false belief task and were therefore assigned the mean value of other participants in their group who had passed the Simple false belief task. The overall sample size for all false belief tasks (which were administered to participants within a restricted age range) was 148 (87 ASD siblings and 61 control siblings).
As for Study One, the ns for the memory and reality questions, as well as the “own belief” questions in the Simple false belief task, were limited to those who actually did the task (as these questions were not assumed to be passed or failed according to performance on other false belief tasks, as was the case for the belief questions). Percentages of participants gaining perfect scores on belief questions (i.e., “perfect scorers”) in each group for each false belief task are presented in Table 23.

i) Simple false belief task. Chi-square analyses revealed that there was no statistically significant difference between the ASD and control siblings on reality questions, χ2 (1, N = 51) = 1.34, p > .1, belief questions referring to the participant’s own previous belief, χ2 (1, N = 51) = .84, p > .1, or belief questions referring to others’ beliefs, χ2 (1, N = 148) = .95, p > .1. There were no significant differences when brothers’ and sisters’ results were analysed separately. Performance on the questions relating to the participant’s own previous belief was significantly correlated with VIQ, r = .34, p < .05. Performance on others’ belief questions was significantly correlated with both age, r = .33, p < .001, and VIQ, r = .35, p < .001. Group remained a non-significant predictor of performance on both own belief questions, z = .1, p > .1, and others’ belief questions, z = .48, p > .1, when VIQ (and age in the case of the latter variable) was controlled using logistic regression.

ii) First-order false belief task. The performance of ASD and control siblings did not differ significantly on reality questions, χ2 (1, N = 113) = .31, p > .1, memory questions, χ2 (1, N = 113) = .01, p > .1, or belief questions, χ2 (1, N = 148) = 1.20, p > .1. Results were the same for brothers and sisters. Performance on belief questions was significantly correlated with both age, r = .51, p < .001, and VIQ, r = .20, p < .05.
In a logistic regression with age, VIQ and group as predictors of performance on belief questions, the independent contribution of group remained non-significant, z = 1.97, p > .1.

iii) Second-order false belief task. Again, there was no significant difference between the ASD and control siblings on reality questions, χ2 (1, N = 127) = .06, p > .1, memory questions, χ2 (1, N = 127) = .09, p > .1, or belief questions, χ2 (1, N = 148) = .11, p > .1, and no difference in the results when brothers and sisters were examined separately. Scores on belief questions were significantly correlated with age, r = .47, p < .001, and VIQ, r = .34, p < .001. Group was not a significant predictor of performance on belief questions in a logistic regression with age, VIQ and group as the predictors, z = .04, p > .1.

iv) Overall false belief performance indices. As for the probands, an aggregate score and a more lenient alternative aggregate score were calculated for siblings. There was no significant group difference in the proportion of perfect scorers on the aggregate score, χ2 (1, N = 148) = .01, p > .1, or in the proportion of high scorers on the alternative aggregate, χ2 (1, N = 148) = .57, p > .1. Results were the same when brothers and sisters were analysed separately. Both aggregate scores were correlated with age (r = .52, p < .001, for the aggregate score and r = .48, p < .001, for the alternative aggregate) and VIQ (r = .32, p < .001, for the aggregate score and r = .34, p < .001, for the alternative aggregate). When logistic regressions were performed with age, VIQ, and group as the predictors, group was not a significant predictor of either the aggregate score, z = .42, p > .1, or the alternative aggregate, z = .14, p > .1.

Table 23.
False belief task results: Percentage of siblings in each group with perfect scores [or high scores for the alternative aggregate] on belief questions, and significance of group comparisons ASD siblings Control siblings p p with age/ IQ control Simple false belief: Own belief 74.2 85.0 - - Others’ belief 90.8 95.1 - - First-order false belief 82.8 75.4 - - Second-order false belief 74.7 77.0 - - Aggregate score 71.3 70.5 - - [80.5] [85.2] - - Alternative aggregate * p < .05; ** p < .01; *** p < .001; - p > .05. 6.3.1.2 Tower of London (ToL) As in Study One, the number of rule violations per block administered was highly skewed and was recoded as a dichotomous variable, with participants making 0-1 violations per block being given a score of 0 (“low rule violators”) and participants making any higher number of violations scored as 1 (“high rule violators”). 231 Two ASD siblings had missing data on the ToL, and were not included in analyses. A two-way ANOVA comparing the total adjusted extra move scores of the ASD siblings and control siblings revealed no significant effect of group or gender, and no significant interaction; largest F(1, 169) = .06, p > .1 (ASD siblings: M = 21.65, SD = 8.72; Control siblings: M = 21.99, SD = 9.60). A chi-square analysis also showed that the proportion of high rule violators in the ASD sibling group (25.5%) did not differ significantly from the proportion in the control sibling group (29.9%), χ2 (1, N = 173) = .40, p > .1. This difference was non-significant for both brothers and sisters. The total adjusted extra moves score correlated significantly with age, r = -.73, p < .001, and rule violations were significantly correlated with both age, r = -.58, p < .001, and VIQ, r = -.17, p < .05. An ANCOVA conducted on the total adjusted extra move score revealed that the group difference remained non-significant when age was covaried, F(1,170) = .50, p > .1. 
Group also remained a non-significant predictor of rule violation status (low/high) when it was assessed independently of age and VIQ in a logistic regression, z = .19, p > .1.

6.3.1.3 IDED Set-shifting task

All set-shifting variables were again highly skewed, and all variables were recoded such that any participant making 0 or 1 errors was assigned a score of 0 (“low error scorers”) and any participant making a higher number of errors was given a score of 1 (“high error scorers”). The overall N for the task (which had a restricted age range) was 129 (81 ASD siblings and 48 control siblings). Due to computer malfunction, data for the Perseveration condition from one ASD sibling were invalid and not included in analyses. The percentage of low error scorers for each stage in each task condition is displayed in Table 24.

There were no significant group differences overall on any variable. When brothers and sisters were analysed separately, a significant difference was observed in the SD stage of the Learned Irrelevance condition such that there was a higher proportion of high error scorers among sisters of ASD probands than among control sisters, χ²(1, N = 70) = 5.08, p < .05. There was no significant difference between the brother groups on this variable, and no discrepancies in the results from brothers and sisters on other variables.

Errors made on the SD stage of the Perseveration condition correlated significantly with both age, r = -.24, p < .01, and PIQ, r = -.17, p < .05. Age was also significantly correlated with errors made in the Learned Irrelevance condition on the SD stage, r = -.21, p < .05, and the IDS stage, r = -.19, p < .05. Group remained a non-significant predictor of performance on these variables when logistic regressions were performed with age and group (and PIQ where relevant) as predictors; the largest z = 1.80, p > .1.
When brothers and sisters were analysed separately for the Learned Irrelevance SD stage variable, group was again a significant predictor of performance on that variable for sisters when age was accounted for, z = 4.14, p < .05. The group difference also remained significant when sisters meeting full or partial ADI-R criteria were excluded, χ²(1, N = 69) = 5.29, p < .05.

Table 24. IDED Set-shifting task results: Percentage of low error scorers in each sibling group for each stage of each task condition, and significance of group comparisons

                                 ASD siblings   Control siblings   p    p with age/IQ control
Perseveration condition:
  SD stage                       76.3           75.0               -
  SDR stage                      57.5           70.8               -
  CD stage                       71.3           75.0               -
  IDS stage                      76.3           83.3               -
  EDS stage                      68.8           75.0               -    -
Learned Irrelevance condition:
  SD stage – brothers only       74.3           85.7               -    -
  SD stage – sisters only        80.0           97.1               *    *
  SDR stage                      77.8           81.3               -
  CD stage                       70.4           83.3               -
  IDS stage                      77.8           87.5               -
  EDS stage                      27.2           25.0               -    -

* p < .05; ** p < .01; *** p < .001; - p > .05.

6.3.1.4 Response Inhibition and Load (RIL) task

For all RIL conditions, error variables (representing the percentage of errors made) were again highly skewed, with many participants making a low percentage of errors. These variables (with the exception of the shape error score) were recoded such that 0-2% errors was coded as 0 (a “low error score”), and any higher percentage of errors was coded as 1 (a “high error score”). However, as for Study One, the main error variables used in analyses were the inhibition error difference score, load error difference score, and inhibition + load error difference score. These difference scores were normally distributed. Five outliers were trimmed: one ASD sibling’s inhibition and load error difference scores, and the inhibition error difference score for two other ASD siblings and one control sibling. The distribution of the shape error score was slightly positively skewed but transformation was not considered necessary.
Median RT variables for all conditions demonstrated roughly normal distributions, but again, an inhibition RT difference score, load RT difference score, and inhibition + load RT difference score were also calculated. Five outliers were trimmed: one control sibling’s inhibition RT difference score, one ASD sibling’s load and inhibition + load RT difference scores, and the inhibition + load RT difference scores of one ASD and one control sibling. The overall N for the task was 126 (79 ASD siblings and 47 control siblings). Table 25 displays the mean and SD of each group (and the significance of group comparisons) for error and RT difference scores, and the shape error score. On the error difference scores, there were no significant main effects of group or gender and no significant group x gender interactions. There were no significant differences in any of the individual conditions when error data were examined separately for each condition (either overall or for brothers or sisters). The shape error score did not differ significantly between ASD and control siblings, F(1, 122) = 1.90, p > .1, and there was no significant effect of gender, F(1, 122) = .40, p > .1, and no significant interaction, F(1, 122) = .01, p > .1. On both the RT difference scores and the separate RT data for each condition, there were no significant main effects of group or gender and no significant interactions. In subsequent analyses, only the error and RT difference scores were used, and separate error and RT data for Conditions 1-3 were not included (nor was gender included as a factor). There were a number of significant correlations between age and IQ variables and both error and RT difference scores from the RIL task. The inhibition + load error difference score correlated significantly with VIQ, r = -.24, p < .01. The shape error score correlated significantly with age, r = -.39, p < .001, PIQ, r = -.30, p < .01, and VIQ, r = -.18, p < .05. 
The inhibition RT difference score was significantly correlated with age, r = -.24, p < .01, and VIQ, r = -.24, p < .01. Finally, the inhibition + load RT difference score correlated significantly with age, r = -.20, p < .05. Group differences remained non-significant (or group was a non-significant predictor) for all of the above variables, with the exception of the shape error score, when age and/or IQ variables were partialled out using ANCOVA or logistic regression. For the shape error score, group differences became significant when age, PIQ, and VIQ were introduced as covariates in an ANCOVA (VIQ was not “blocked” and used as an additional IV because the groups were matched on VIQ for the RIL task), F(1, 121) = 4.40, p < .05. This indicates that when extraneous variance caused by age and IQ factors was removed, ASD siblings were found to make a significantly higher number of errors than control siblings on a measure of working memory on a task where inhibition was required. Importantly, this group difference in the shape error score remained significant when siblings who met full or partial ADI-R criteria were excluded from the sample, F(1, 116) = 4.08, p < .05.

Table 25. RIL task results: Mean (and SD) of each sibling group, and significance of group comparisons, for error and RT difference scores and the shape error score

                               ASD siblings       Control siblings   p    p with age/IQ control
Error difference scores:
  Inhibition                   0.76 (2.80)        1.10 (3.39)        -
  Load                         0.73 (3.67)        0.35 (3.47)        -
  Inhibition + load            1.43 (3.88)        1.49 (3.50)        -    -
RT difference scores:
  Inhibition                   159.35 (147.76)    140.92 (141.82)    -    -
  Load                         145.44 (132.66)    188.73 (137.83)    -
  Inhibition + load            304.52 (203.07)    329.05 (216.90)    -    -
Working memory measure:
  Shape error score            13.42 (13.57)      9.50 (12.02)       -    *

* p < .05; ** p < .01; *** p < .001; - p > .05.

6.3.1.5 Opposite Worlds task

No transformations were required on Opposite Worlds task variables.
Two ASD siblings and two control siblings demonstrated outlying scores (two on the Same World error score, one on the Same World time score, and one on both the Same and Opposite World time scores), which were trimmed. Means and SDs for all variables are displayed in Table 26. The N for the task was 100 (56 ASD siblings and 44 control siblings).

For the error scores, a three-way repeated measures ANOVA was conducted with group and gender as between-subjects factors and condition (Same World, Opposite World) as the within-subjects factor. There was a significant main effect of condition, F(1, 96) = 19.77, p < .001, but there was no significant main effect of group, F(1, 96) = .24, p > .1, or gender, F(1, 96) = 1.80, p > .1. There was a significant interaction between group and condition, F(1, 98) = 4.63, p < .05, but no other significant interactions. Follow-up simple effects analyses showed that there was no significant difference between the groups in the Same World error score, t(98) = 1.31, p > .1, or the Opposite World error score, t(98) = 1.02, p > .1; however, the control siblings demonstrated a significantly larger error difference score than the ASD siblings (as reflected in the interaction). Examination of the pattern of results suggested that this was due to a combination of both the ASD siblings tending to make slightly more Same World errors than the control siblings, and the control siblings making slightly more Opposite World errors than ASD siblings.

Time scores were also analysed using a three-way repeated measures ANOVA with group and gender as the between-subjects factors and condition as the within-subjects factor. There was a significant main effect of condition, F(1, 96) = 182.56, p < .001, but no significant effect of group, F(1, 96) = 1.55, p > .1, or gender, F(1, 96) = .02, p > .1.
The interaction between group and gender was not significant, F(1, 96) = .02, p > .1; however, there was a marginally significant interaction between group and condition, F(1, 96) = 3.93, p = .05, and a significant interaction between gender and condition, F(1, 96) = 9.51, p < .01. The group x gender x condition interaction was not significant, F(1, 96) = .27, p > .1. With regard to the group x condition interaction, follow-up analyses showed that there was a marginally significant difference between the groups in the Same World time score such that ASD siblings took slightly longer than control siblings, t(98) = 1.81, p = .07, but no significant difference in the Opposite World time score, t(98) = .77, p > .1. In terms of the gender x condition interaction, follow-up analyses indicated that there was no significant gender difference in either the Same World time score, t(98) = .64, p > .1, or the Opposite World time score, t(98) = .98, p > .1, but brothers demonstrated a significantly larger time difference score than sisters (as reflected in the interaction). The pattern of results suggested that this was due to a tendency both for sisters to take slightly longer in the Same World condition and for brothers to take slightly longer in the Opposite World condition.

Table 26. Opposite Worlds results: Mean (and SD) and significance of group comparisons for each sibling group for error/time scores in each condition and difference scores, and for each gender for time scores

                                 ASD siblings    Control siblings   p     p with age/IQ control
Error variables:
  Same World error score         1.22 (1.41)     0.87 (1.27)        -     -
  Opposite World error score     1.68 (1.86)     2.05 (1.67)        -     -
  Error difference score         0.42 (1.70)     1.12 (1.56)        *     *
Time variables:
  Same World time score          25.72 (7.04)    23.41 (5.24)       -     -
  Opposite World time score      31.75 (8.95)    30.47 (7.39)       -     -
  Time difference score          6.19 (5.60)     6.94 (4.16)        -     *

                                 Brothers        Sisters            p     p with age/IQ control
Time variables:
  Same World time score          24.27 (6.01)    25.09 (6.73)       -     -
  Opposite World time score      32.05 (9.03)    30.42 (7.57)       -     -
  Time difference score          7.97 (5.27)     5.24 (4.42)        **    **

* p < .05; ** p < .01; *** p < .001; - p > .05.
Note: The difference scores relate to the interaction term on repeated measures ANOVAs.

Age was significantly correlated with all task variables: the Same World error score, r = -.26, p < .01, Opposite World error score, r = -.27, p < .01, Same World time score, r = .50, p < .001, and the Opposite World time score, r = -.56, p < .001. VIQ correlated significantly with the Same World error score, r = -.20, p < .05, and the Opposite World time score, r = -.27, p < .01. Age and VIQ were introduced as covariates (VIQ was covaried because the groups were matched on VIQ for this task) in a two-way group x condition repeated measures ANCOVA on the error scores and three-way ANCOVA on the time scores (including gender as a between-subjects factor, as there were interactions involving gender for the time scores). There was no change in any of the results with the exception that the group x condition interaction on the time scores increased in significance, F(1, 94) = 6.56, p < .05. This interaction remained significant when siblings meeting full or partial criteria on the ADI-R were excluded, F(1, 90) = 6.94, p < .05.
The interaction between group and condition for the error scores also remained significant when siblings meeting ADI-R criteria were excluded, F(1, 94) = 6.69, p < .05 (age and VIQ were not covaried in this analysis as there was no difference in the original result when these variables were accounted for). 6.3.1.6 Pattern Meanings Individual error types were not analysed in this study (this level of detail was not considered essential, particularly given that the ASD and control groups in Study One did not show significant differences in error variables). As for Study One, the sum of errors variable was skewed and transformed using a logarithm equation. The number of correct responses variable was normally distributed. There was no significant difference in the number of correct responses produced by ASD siblings (M = 26.71, SD = 9.55) and control siblings (M = 26.52, SD = 8.13), no significant effect of gender on this variable, and no significant group by gender interaction; largest F(1, 171) = 1.90, p > .1. The sum of errors was not significantly different between ASD siblings (Median = 4, Range = 0-42, prior to transformation) and control siblings (Median = 5, Range = 0-56), F(1, 171) = 2.58, p > .1. There was a trend for brothers to make more error responses than sisters, F(1, 171) = 3.65, p = .06, but the interaction between group and gender was not significant for the sum of errors variable, F(1, 171) = .16, p > .1. The sum of errors was correlated with age, r = -.51, p < .001, and VIQ, r = -.24, p < .01. Group comparison of the sum of errors remained non-significant in an ANCOVA with group and VIQ level as the IVs and age as a covariate, F(1, 168) = 2.23, p > .1, and the interaction between group and VIQ level was not significant, F(2, 168) = .06, p > .1. 6.3.1.7 Uses of Objects As for the Pattern Meanings task, individual error types were not analysed. 
No transformation was necessary for the sum of errors variable, but four outliers were trimmed (for 2 ASD siblings and 2 control siblings). The total number of correct responses, as well as the number of correct responses for conventional and non-conventional items separately, were all normally distributed. Table 27 displays means and SDs for these variables for each sibling group, and the significance of group comparisons. One ASD sibling had missing data and was not included in analyses.

In a three-way repeated measures ANOVA on the number of correct responses with group and gender as the between-subjects factors and condition (conventional, non-conventional) as the within-subjects factor, there was a significant main effect of condition such that more correct responses were generated in the non-conventional condition than in the conventional condition, F(1, 170) = 305.84, p < .001, but no significant main effect of group, F(1, 170) = .67, p > .1, or gender, F(1, 170) = 2.50, p > .1. There were no significant interactions between any variables. Separate totals for conventional and non-conventional items were not used in further analyses.

In a two-way group x gender ANOVA on the sum of errors, there was no main effect of group, F(1, 170) = 1.12, p > .1. Brothers (M = 19.38, SD = 13.56) produced significantly more error responses than sisters (M = 15.09, SD = 12.60), F(1, 170) = 5.21, p < .05, but there was no significant interaction between group and gender, F(1, 170) = .01, p > .1.

The number of correct responses was significantly correlated with both age, r = .51, p < .001, and VIQ, r = .24, p < .01. The sum of errors also correlated significantly with both age, r = -.36, p < .001, and VIQ, r = -.22, p < .01. Group differences in both correct responses and the sum of errors remained non-significant in ANCOVAs where group and VIQ level were the IVs and age was covaried, and there were no significant interactions between group and VIQ level in either analysis.
Table 27. Uses of Objects results: Mean (and SD) of each sibling group, and significance of group comparisons

                             ASD siblings     Control siblings   p    p with age/IQ control
Correct responses:
  Total                      25.57 (10.32)    27.55 (11.46)      -    -
  Conventional items         19.57 (7.80)     21.0 (8.99)        -
  Non-conventional items     22.76 (8.22)     24.34 (9.75)       -
Sum of errors                16.73 (12.69)    17.72 (14.04)      -    -

* p < .05; ** p < .01; *** p < .001; - p > .05.

6.3.1.8 Stamps task

Both the rule adherence and restriction scores demonstrated highly skewed distributions and were recoded as dichotomous variables, in the same way as for Study One. For rule adherence, a score between 0 and 6 inclusive was coded as 0 and a score of 7 or 8 was coded as 1. For restriction, a score of 0 was left as 0 and a score between 1 and 8 inclusive was coded as 1. The complexity and originality scores were approximately normally distributed. Means and SDs for the latter two variables and the proportion of low scorers for the former two variables, along with the significance of group comparisons for all scores, are presented in Table 28.

Table 28. Stamps task results: Mean (and SD) of each sibling group [or the percentage of low scorers for dichotomous variables], and significance of group comparisons

                          ASD siblings    Control siblings   p    p with age/IQ control
Complexity score          18.90 (3.28)    20.18 (3.86)       *    *
Originality score         4.21 (3.04)     4.34 (2.64)        -    -
Restriction score         [94.6]          [88.7]             -
Rule adherence score      [19.8]          [21.3]             -    -

* p < .05; ** p < .01; *** p < .001; - p > .05.

The N for the task was 154 (92 ASD siblings and 62 control siblings). There was a significant group difference on the complexity score, F(1, 150) = 4.68, p < .05, indicating that the ASD siblings produced less complex patterns than control siblings. The effect of gender was not significant for this variable, F(1, 150) = .10, p > .1, and the interaction between group and gender was not significant, F(1, 150) = .02, p > .1.
In a two-way ANOVA on the originality scores, there was no significant effect of group or gender, and no significant interaction, largest F(1, 150) = .67, p > .1. Chi-square analyses revealed that there was no significant group difference in the percentage of low scorers on the restriction score, χ²(1, N = 154) = 1.77, p > .1, or the rule adherence score, χ²(1, N = 154) = .05, p > .1. Results were the same when brothers and sisters were analysed separately.

Originality scores were significantly correlated with both age, r = .46, p < .001, and VIQ, r = .23, p < .01. Age also correlated significantly with complexity scores, r = .44, p < .001, and rule adherence scores, r = .35, p < .001. In a two-way ANCOVA with group and VIQ level as the IVs and age as a covariate, the group difference in the originality score remained non-significant, F(1, 147) = .03, p > .1. The interaction between group and VIQ level was not significant, F(2, 147) = .20, p > .1. When a logistic regression was performed on the rule adherence score, group remained a non-significant predictor when it was assessed independently of age, z = .01, p > .1. In an ANCOVA with age as a covariate, the group difference in the complexity score remained significant, F(1, 151) = 5.78, p < .05. The difference also remained significant when participants meeting full or partial criteria on the ADI-R were excluded from the sample, F(1, 142) = 4.0, p < .05. However, when participants with non-ASD diagnoses were additionally excluded, the group difference in the complexity score became only marginally significant, F(1, 131) = 2.96, p = .09.

6.3.1.9 Summary of sibling group comparisons

In summary, ASD siblings performed significantly more poorly than control siblings on tasks measuring working memory (under conditions where inhibition was required) and non-verbal generativity.
When participants with non-ASD diagnoses were excluded, the group difference on the non-verbal generativity measure became only marginally significant. Sisters of ASD probands also made more errors than sisters of control probands on the simple first stage of the IDED set-shifting task Learned Irrelevance condition. Control siblings showed larger error and time difference scores on the Opposite Worlds test, which appeared to be a fairly spurious result which was equally attributable to ASD siblings performing slightly (but not significantly) more poorly on the Same World condition and control siblings performing slightly (but not significantly) more poorly on the Opposite World condition. There were no significant group differences on measures of ToM, planning, set-shifting (i.e., in the EDS stages), non-verbal inhibition, or verbal generativity.

All of the significant group differences remained significant when ASD siblings meeting full or partial criteria on the ADI-R were excluded, which indicates that criterion 4 for endophenotype status was met (see Section 6.1.1). However, as the poorer performance on the simple first stage of the IDED set-shifting task in sisters of ASD probands did not correspond with a deficit displayed by the ASD probands themselves, this suggests that the weakness displayed by ASD sisters on this task was not indicative of an endophenotype for ASDs (as it violates criterion 1). Therefore, this variable is not included in subsequent analyses examining other criteria for endophenotype status. In addition, because the group differences on the Opposite Worlds error and time difference scores appeared to be spurious outcomes deriving from two non-significant differences in opposite directions and therefore did not represent meaningful strengths or weaknesses in the ASD sibling group (i.e., were not candidates for an endophenotype), these variables were not included in subsequent analyses either.
Hence, the RIL task shape error score and the Stamps task complexity score were the only two candidate endophenotype variables remaining.

6.3.2 Comparisons between ASD siblings and ASD probands

If these two variables are possible endophenotypes for ASDs, then it would be expected that the performance displayed by ASD siblings would be poorer than that of control siblings, but not as poor as that of ASD probands. To compare the severity of deficits across these three groups, it was decided not to directly compare the scores, as the groups were not matched on PIQ or VIQ (and therefore any differences could be attributable to those variables). Instead, the effect sizes of the differences found between ASD and control siblings were calculated and compared with the effect sizes of the differences between the two proband groups in Study One. These two sets of effect sizes are presented in Table 29. It is evident that the effect sizes for the sibling group differences (both small effects) are smaller than those for the proband group differences (both medium effects), consistent with predictions.

Table 29. Effect sizes, r (and d), of significant group differences between sibling groups and between proband groups

Measure                                         ASD versus control siblings   ASD versus control probands
Working memory: RIL task shape error score      .15 (.31)                     .27 (.56)
Generativity: Stamps task complexity score      .18 (.36)                     .28 (.58)

6.3.3 Ability of cognitive variables to predict group membership

In order to examine whether either of the two candidate endophenotype variables was able to discriminate ASD siblings from control siblings, a direct logistic regression was performed with group as the outcome variable, and the RIL task shape error score and Stamps task complexity score as the predictors.
There were 65 ASD siblings and 42 control siblings with data for both predictor variables, and these limited groups were matched on age (M = 11.27, SD = 2.62 for ASD siblings; M = 11.59, SD = 2.74 for control siblings), t(105) = .61, p > .1, PIQ (M = 107.69, SD = 17.74 for ASD siblings; M = 103.83, SD = 15.81 for control siblings), t(105) = 1.15, p > .1, and VIQ (M = 105.65, SD = 14.60 for ASD siblings; M = 107.43, SD = 13.33 for control siblings), t(105) = .64, p > .1. Thus, none of these matching variables were included in the regression.

A test of the full model with both predictors against a constant-only model was statistically reliable, χ²(2, N = 107) = 8.69, p < .05, indicating that the two predictors together reliably distinguished ASD from control siblings. The model correctly classified 90.8% of ASD siblings and 35.7% of control siblings, with an overall success rate of 69.2%. This pattern of results suggests that the model was sensitive to ASD sibling group membership, but not specific. Table 30 presents regression coefficients, Wald statistics, odds ratios, and their 95% confidence intervals for each predictor. According to the Wald criterion, only the Stamps task complexity score was a significant individual predictor of group membership. Of note, when age, PIQ, and VIQ were included in the regression, the RIL task shape error score also became marginally significant as a predictor, z = 2.74, p = .098 (note that the group difference in the shape error score also became significant only after the age and IQ variables were controlled).

Table 30. Results of logistic regression analysis of sibling group membership

                                                                     95% C.I. for odds ratio
Variables                          B       Wald test (z-ratio)   Odds ratio   Lower   Upper
RIL task: Shape error score        -.02    1.85                  .98          .95     1.01
Stamps task: Complexity score      .20     5.16*                 1.22         1.03    1.44

*p < .05; ** p < .01; *** p < .001.
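The figures in Table 30 and the classification rates reported above are linked by simple arithmetic, which the following Python sketch reproduces. It is an illustration only and rests on three assumptions that the text does not state explicitly: that each odds ratio is exp(B), that the Wald statistic (labelled a z-ratio in the table) is referred to a chi-square distribution with one degree of freedom, and that the reported percentages correspond to whole numbers of siblings. The function names are hypothetical.

```python
import math


def odds_ratio(b):
    """Odds ratio implied by a logistic regression coefficient B (assumed exp(B))."""
    return math.exp(b)


def wald_p_value(wald):
    """p-value for a Wald statistic, assuming a chi-square distribution with 1 df."""
    # For 1 df, P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    return math.erfc(math.sqrt(wald / 2.0))


def overall_accuracy(n_a, pct_a, n_b, pct_b):
    """Overall percentage correct from group sizes and per-group percentages correct."""
    correct_a = round(n_a * pct_a / 100.0)  # assumed whole number of cases
    correct_b = round(n_b * pct_b / 100.0)
    return 100.0 * (correct_a + correct_b) / (n_a + n_b)


# Values reported in the text and in Table 30.
or_shape = odds_ratio(-0.02)       # RIL task shape error score
or_complexity = odds_ratio(0.20)   # Stamps task complexity score
accuracy = overall_accuracy(65, 90.8, 42, 35.7)

print(round(or_shape, 2), round(or_complexity, 2), round(accuracy, 1))
```

Under these assumptions the sketch recovers the reported odds ratios (.98 and 1.22) and the overall success rate of 69.2%, and `wald_p_value(2.74)` is approximately .098, matching the marginal result quoted for the shape error score.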
6.3.4 Proband-sibling relationships within the ASD families

To examine correlations between ASD probands’ and ASD siblings’ scores on cognitive measures, data were used from one sibling in each family who was closest in age to the proband. Correlations were conducted for all ToM and EF variables regardless of whether or not they were tasks on which ASD siblings demonstrated weaknesses, as it was possible that sibling performances could covary with proband performances even if the sibling performances were in the normal range.

Because of concerns that age would strongly mediate relationships between probands’ and siblings’ scores, age-scaled scores were calculated for use in all correlations. For each variable, the regression equation: predicted score = slope × age + intercept was calculated using the combined control proband and sibling samples. If the relationship between the variable and age was curvilinear, the log of age was used in this equation instead, if it resulted in a more linear relationship (this was the case for ToL rule violations, IDED set-shifting Learned Irrelevance condition EDS stage errors, the sum of errors on the Pattern Meanings and Uses of Objects tasks, and the Stamps task complexity score). ASD participants (both probands and siblings) then had their scores converted to age-scaled z-scores by subtracting the predicted score from the obtained score and dividing by the standard error of the estimate. Because linear regression equations could not be calculated for dichotomous variables, some of these were not used in the proband-sibling correlations (all IDED set-shifting variables except the EDS stage in the Learned Irrelevance condition, and the Stamps task rule adherence and restriction scores).
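The age-scaling procedure just described can be sketched in Python as follows. This is a minimal illustration using ordinary least squares; the function names and the control ages and scores are invented for the example, and the software actually used for the thesis analyses is not specified in the text.

```python
import math


def fit_age_regression(ages, scores, use_log_age=False):
    """Fit score = slope * age + intercept on control data; return slope, intercept, SEE."""
    x = [math.log(a) for a in ages] if use_log_age else list(ages)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(scores) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, scores))
    see = math.sqrt(ss_res / (n - 2))  # standard error of the estimate
    return slope, intercept, see


def age_scaled_z(score, age, slope, intercept, see, use_log_age=False):
    """(obtained - predicted) / SEE, with the prediction taken from the control regression."""
    x = math.log(age) if use_log_age else age
    predicted = slope * x + intercept
    return (score - predicted) / see


# Hypothetical control data (not from the thesis).
control_ages = [6, 8, 10, 12, 14, 16]
control_scores = [14, 16, 21, 25, 28, 34]

slope, intercept, see = fit_age_regression(control_ages, control_scores)
z = age_scaled_z(score=18, age=10, slope=slope, intercept=intercept, see=see)
print(slope, intercept, see, z)
```

Passing `use_log_age=True` to both functions corresponds to the log-of-age variant used for the curvilinear variables; a negative z then means a score below what the control regression predicts for that age.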
For variables which were not excessively skewed before dichotomisation, the original continuous form of the variable was used in correlations (the false belief aggregate score, ToL rule violations, and the IDED Learned Irrelevance EDS stage errors).

6.3.4.1 Correlations between proband IQ and siblings’ cognitive performances

The relationship between ASD probands’ level of functioning and their siblings’ performances on cognitive tasks was assessed by calculating correlations between probands’ IQ scores and siblings’ scores on cognitive measures. This was particularly important as the probands of the sibling groups were not matched on either VIQ or PIQ (as probands who were part of the WAFSASD but were too low-functioning to participate in Study One had siblings who were included in Study Two). The IQ data from the low-functioning probands who were not included in Study One (but whose siblings were included in Study Two) were included in these correlations. Age-scaled scores were used for the siblings’ scores, but no z-scores were calculated for probands’ IQs (as these were already age-scaled). If raw correlations were significant, partial correlations were also conducted where sibling VIQ and PIQ were controlled, to ensure that the correlations were not mediated by relationships between proband and sibling IQ (the correlations between proband and sibling PIQ and VIQ were both significant; PIQ: r = .25, p < .01; VIQ: r = .30, p < .01).

Table 31. Raw and partial correlations between proband PIQ and VIQ and siblings’ scores on ToM and EF measures

                                                      Proband IQ score (raw)    Proband IQ score (partial)
Sibling score on cognitive task                       PIQ       VIQ             PIQ       VIQ
False belief aggregate score                          .45**     .32*            .25*      .08
ToL:
  Adjusted extra moves score                          .24*      .26*            .08       .07
  Rule violations                                     -.11      -.10
IDED set-shifting Learned Irrelevance
  condition EDS stage errors                          .01       .16
RIL task:
  Inhibition error difference score                   -.13      -.29*                     -.05
  Load error difference score                         -.13      .0
  Inhibition + load error diff. score                 -.20      -.25
  Inhibition RT difference score                      -.15      -.09
  Load RT difference score                            .24       .24
  Inhibition + load RT diff. score                    .06       .10
  Shape error score                                   -.05      -.01
Opposite Worlds:
  Error difference score                              .05       -.01
  Time difference score                               -.19      .03
Pattern Meanings:
  Correct responses                                   -.07      -.12
  Sum of errors                                       -.22      -.02
Uses of Objects:
  Correct responses                                   .03       -.06
  Sum of errors                                       -.20      -.16
Stamps task:
  Complexity score                                    -.11      .05
  Originality score                                   -.08      -.07

*p < .05; ** p < .01; *** p < .001.

Results of these correlations are displayed in Table 31. Siblings’ false belief aggregate score and ToL adjusted extra moves score both correlated significantly with both the PIQ and VIQ of probands, and the RIL task inhibition error difference score correlated significantly with proband VIQ. However, when sibling PIQ and VIQ were controlled, only the correlation between proband PIQ and siblings’ false belief aggregate score remained significant (this correlation also remained significant when ASD siblings meeting full or partial ADI-R criteria were excluded). None of the variables on which significant differences between sibling groups were observed correlated significantly with either proband PIQ or VIQ, suggesting that the non-matching of the autistic probands of Study Two’s participants did not affect the outcome of group comparisons in this study. Overall, these results indicate that the ToM performance of siblings of autistic individuals was related to the level of functioning of probands, but EF variables were not.

6.3.4.2 Correlations between probands’ and siblings’ cognitive performances

Correlations between probands’ and siblings’ cognitive task performances were limited to the sample of siblings whose autistic brother or sister participated in Study One. Only correlations between identical task variables for each family member were examined (rather than all correlations between different tasks).
Surprisingly, there were no significant raw correlations between probands’ and siblings’ scores on any variable (therefore, no partial correlations were conducted). This suggests that ToM and EF performances were not strongly familial.

6.3.5 Prevalence of deficits in ASD siblings

The prevalence of deficits in ASD siblings on the two potential endophenotype variables was calculated in the same way as the universality of deficits in Study One. The proportion of ASD siblings scoring below the 16th percentile of control siblings (or above the 84th percentile in the case of the error variable) was 24.1% on the RIL task shape error score, and 19.6% on the Stamps task complexity score. Therefore, impairments on these variables clearly occurred in only a subset of ASD siblings.

6.3.6 Correlations between ToM and EF

Although no ToM deficit was identified in the ASD sibling group as compared with control siblings, suggesting that the notion of a “ToM-impaired, EF-intact” subgroup was somewhat invalid, correlations between ToM and EF were still of interest to determine whether ASD siblings showed an unusual pattern of association between the two domains (which may suggest that they used different strategies to solve the tasks, even if no performance decrement was observed). Correlations between ToM and EF variables were calculated separately for the ASD and control sibling groups. As in Study One, partial correlations (controlling for the effects of age, VIQ and PIQ) were also conducted if significant raw correlations were observed.

Table 32 presents the raw and relevant partial correlations between ToM and EF task variables within control siblings. As for Study One, correlations are displayed separately for the various false belief variables rather than the overall aggregate score because the pattern of correlations was different for the three tasks.
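As a rough illustration of what a partial correlation of this kind involves (a raw ToM-EF correlation re-examined with variables such as age, VIQ and PIQ controlled), the following Python sketch applies the standard recursive formula for partial correlations. It is not the analysis code used in this research; the function names and the toy data are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, covariates):
    """Correlation between x and y with every covariate partialled out.

    Applies r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    recursively, removing one covariate at a time.
    """
    if not covariates:
        return pearson_r(x, y)
    *rest, z = covariates
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy data: both scores rise with age but are otherwise unrelated.
age = [0, 1, 2, 3]
tom_score = [1, 0, 1, 4]
ef_score = [-1, 4, -1, 4]

raw = pearson_r(tom_score, ef_score)
partial = partial_r(tom_score, ef_score, [age])
```

In this toy example the raw correlation is about .33, but it falls to zero (within rounding error) once age is controlled, which is the sense in which a raw correlation can be "mediated" by a third variable. Passing several covariates, for example `[age, viq, piq]`, controls for all of them at once.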
In the control sibling group, simple false belief task performance correlated with measures of planning and verbal generativity (with all correlations in the expected direction, such that poor false belief performance correlated with poor EF task performance); however, when age, VIQ and PIQ were controlled for, none of these correlations remained significant. First-order false belief task performance correlated with measures of planning, non-verbal inhibition (with working memory load), working memory (with inhibition requirements), verbal inhibition, and both verbal and non-verbal generativity (all in the expected direction with the exception of the RIL task load error difference score). However, when age and IQ variables were partialled out, correlations remained significant only with measures of planning, verbal inhibition, and non-verbal generativity. Second-order false belief task performance correlated with measures of planning and verbal and non-verbal generativity (all in the expected direction); with age, PIQ and VIQ controlled, there were no significant correlations with planning measures. Overall, in the control group, ToM variables demonstrated significant relationships with all EF domains measured except for set-shifting, but several of the correlations were mediated by age and IQ (there were no significant partial correlations with non-verbal inhibition measures). All correlations were in the expected direction, such that poorer performance on EF tasks correlated with poorer false belief task performance. Ceiling effects on the simple false belief task resulted in a paucity of significant correlations with EF variables for that task.

Table 32.
Raw and partial correlations between ToM and EF variables within control siblings False belief task Simple EF task ToL (n = 61): Adj.extra move score Rule violations -.26* -.01 -.01 First-order -.61*** -.49*** -.32* -.22 IDED Set-shifting Perseveration condition (n = 42): EDS stage errors -.17 a IDED Set-shifting Learned Irrelevance condition (n = 42): EDS stage errors -.01 a RIL task (n = 41): Error difference scores: Inhibition Load Inhibition + load RT difference scores: Inhibition Load Inhibition + load Shape error score Opposite Worlds (n = 43): Error diff. score Time diff. score -.55*** -.37** -.21 -.05 .01 -.01 a a a -.13 .32* .18 a a a a .09 .22 .20 -.35* -.21 .15 -.01 .09 -.13 a a -.34* -.38* -.42** -.32* -.13 -.12 -.03 .25* -.30* .28* .02 Pattern Meanings (n = 61): Correct responses .14 Sum of errors -.15 Uses of Objects (n = 61): Correct responses .31* Sum of errors -.02 Stamps task (n = 61): Complexity score Originality score Restriction score Rule adherence score Second-order .23 .06 -.16 -.25 .20 -.34** .17 .31 -.09 -.07 -.15 .36** -.23 .13 .52*** -.19 .34** .45*** .28* -.03 -.16 .33* .07 .41** .24 -.05 -.19 .26* * p < .05; ** p < .01; *** p < .001. Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each task show the sample size for correlations with the ToM tasks. a = No correlation could be calculated as all participants had perfect scores on the false belief task. 248 For ASD siblings, correlations were conducted both before and after excluding individuals meeting ADI-R criteria for an ASD. For the sake of brevity, only correlations after exclusion of this subgroup are reported, as priority was given to determining the pattern characteristic of the broad phenotype without any siblings with ASDs in the sample3. Table 33 displays these raw and partial correlations. 
In this group, simple false belief task performance correlated with two measures of non-verbal generativity (with both correlations in the expected direction), both of which remained significant when age and IQ variables were partialled out. First-order false belief task performance correlated with variables from all EF domains tested except for set-shifting and verbal generativity (all in the expected direction), but only the correlations with one planning measure, non-verbal inhibition (with a working memory load), and one non-verbal generativity measure were significant when age and IQ were controlled. Second-order false belief task performance correlated with measures of planning, non-verbal inhibition, and verbal and non-verbal generativity (all in the expected direction), but only one correlation with a non-verbal inhibition measure remained significant when age and ability variables were partialled out. Overall, the ASD siblings showed a pattern of raw correlations fairly similar to that of the control siblings. However, partial correlations showed a different pattern from controls, with the ASD siblings showing no significant partial correlations between ToM measures and verbal inhibition or verbal generativity variables, but demonstrating significant partial correlations of ToM variables with measures of non-verbal inhibition. Like control siblings, ASD siblings also showed significant partial correlations between ToM variables and measures of planning and non-verbal generativity.

Table 34 presents a summary of the significant partial correlations between ToM and EF domains in the control and ASD sibling groups as well as the control and ASD proband groups from Study One. The most striking aspect of this table is the clear relative absence of significant ToM-EF correlations in ASD probands compared with all other groups.
It also shows that while the pattern of correlations displayed by ASD siblings did not mirror that demonstrated by the ASD probands, it was also qualitatively different to the pattern displayed by control siblings. Nevertheless, it is additionally evident that the control groups from both studies did not show identical patterns of correlation (this is discussed further in Section 6.4.2). 3 When siblings with ASDs were included, the pattern of correlations was similar. 249 Table 33. Raw and partial correlations between ToM and EF variables within ASD siblings False belief task Simple EF task ToL (n = 78): Adj.extra move score Rule violations -.15 -.20 First-order -.29** -.24* -.04 -.23* IDED Set-shifting Perseveration condition (n = 58): EDS stage errors .16 a IDED Set-shifting Learned Irrelevance condition (n = 59): EDS stage errors .01 a RIL task (n = 57): Error difference scores: Inhibition Load Inhibition + load RT difference scores: Inhibition Load Inhibition + load Shape error score Opposite Worlds (n = 45): Error diff. score Time diff. score a a a .0 -.41** -.42** a a a a -.07 .20 .09 -.32* a a -.09 -.29* Second-order -.34** -.13 -.17 -.04 .15 -.42** -.40** -.34** .11 -.20 -.07 -.02 .04 .02 -.19 -.09 -.19 -.25 -.31* Pattern Meanings (n = 79): Correct responses .14 Sum of errors -.09 -.02 -.21 -.01 -.15 Uses of Objects (n = 79): Correct responses .15 Sum of errors -.10 .21 -.19 .30** -.08 .08 .24* .32** -.16 -.08 .09 .16 Stamps task (n = 76): Complexity score Originality score Restriction score Rule adherence score .33** .22 .05 -.35** .24* -.28* .38** .30** -.10 -.28* .30* .15 -.22 * p < .05; ** p < .01; *** p < .001. Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each task show the sample size for correlations with the ToM tasks. a = No correlation could be calculated as all participants had perfect scores on the false belief task. 250 Table 34. 
Summary of partial correlations between ToM and EF variables in the control and ASD probands and siblings

ToM EF domain Planning Control ASD Control ASD siblings siblings probands probands ✓ ✓✓ ✓✓✓ ✓✓* ✓ Set-shifting Inhibition – Non-verbal Inhibition – Verbal ✓✓ Working Memory Generativity – Verbal ✓✓ Generativity – Non-verbal ✓✓ ✓✓ ✓✓✓ ✓ ✓

* Correlations marked with an asterisk were in the opposite direction to that expected. Note: Each tick represents one significant correlation between a false belief and an EF variable in that domain.

6.3.7 Dissociations between ToM and EF

The presence of ToM-EF dissociations in the ASD sibling group was assessed in the same way as for Study One. It should again be noted that because ToM was not impaired in ASD siblings relative to the control siblings, the presence of any “ToM-impaired, EF-intact” dissociations is somewhat misleading in that ToM was not impaired in the group as a whole. However, these calculations were still of interest as it may have been the case that those siblings showing EF deficits were also more likely to be low scorers on ToM tasks. As in Study One, the false belief alternative aggregate score was used as the measure of ToM performance (14.8% of control siblings were low scorers on this variable, indicating that defining ASD siblings with low scores as “impaired” was comparable with the definition of impairment for continuous variables). The two candidate endophenotype EF variables were analysed separately. The results of these calculations are displayed in Table 35, and demonstrate that ToM and EF impairments did not always co-occur in the same ASD siblings. Rather, the EF-impaired siblings were equally or more likely to show intact ToM than impaired ToM.
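The logic of these dissociation calculations can be sketched in Python as follows. This is only an illustration under stated assumptions: impairment is defined here by a nearest-rank percentile cutoff in a control sample (the exact percentile method used in this research is not specified in this section), and all function names and scores are hypothetical.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def flag_impaired(scores, control_scores, pct=16, higher_is_worse=False):
    """Flag scores that fall beyond the control-group cutoff.

    For ordinary scores the cutoff is the pct-th control percentile (lower is
    worse); for error scores it is the (100 - pct)-th percentile (higher is worse).
    """
    if higher_is_worse:
        cutoff = percentile_nearest_rank(control_scores, 100 - pct)
        return [s > cutoff for s in scores]
    cutoff = percentile_nearest_rank(control_scores, pct)
    return [s < cutoff for s in scores]


def dissociation_rates(tom_impaired, ef_impaired):
    """Percentage of each impaired subgroup that is unimpaired in the other domain."""
    pairs = list(zip(tom_impaired, ef_impaired))
    n_tom = sum(1 for t, _ in pairs if t)
    n_ef = sum(1 for _, e in pairs if e)
    tom_only = sum(1 for t, e in pairs if t and not e)
    ef_only = sum(1 for t, e in pairs if e and not t)
    return {
        "tom_impaired_ef_intact_pct": 100 * tom_only / n_tom if n_tom else None,
        "ef_impaired_tom_intact_pct": 100 * ef_only / n_ef if n_ef else None,
    }


# Hypothetical scores (higher = better on both toy measures).
control_tom = list(range(1, 26))
control_ef = list(range(1, 26))
sib_tom = [2, 3, 10, 12, 20]
sib_ef = [1, 15, 2, 3, 18]

tom_flags = flag_impaired(sib_tom, control_tom)
ef_flags = flag_impaired(sib_ef, control_ef)
rates = dissociation_rates(tom_flags, ef_flags)
```

With these toy data, half of the ToM-impaired siblings have intact EF and two thirds of the EF-impaired siblings have intact ToM. The two percentages use different denominators (the number impaired in each domain), which is why they need not agree with each other.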
However, it is also notable that both the ToM-impaired and the EF-impaired siblings were more likely than the sibling group as a whole to demonstrate impairments in the other domain (e.g., 55.6% of ASD siblings showing impaired non-verbal generativity also scored poorly on the false belief aggregate, compared with 19.5% of the ASD sibling group as a whole).

Table 35. The incidence of ToM-EF dissociations in the ASD siblings

EF measure                       % of ToM-impaired ASD siblings with unimpaired EF    N
RIL task shape error score       50.0                                                 4
Stamps task complexity score     52.9                                                 9

EF measure                       % of EF-impaired ASD siblings with unimpaired ToM    N
RIL task shape error score       88.9                                                 18
Stamps task complexity score     44.4                                                 18

6.3.8 Results from behavioural measures

Both the SBQ (which measures current social behaviours) and the RBQ (which measures the lifetime presence of repetitive behaviours) were completed by parents of siblings (both are described in Chapter 3, Section 3.5). The ASQ was not used as a measure of subclinical behavioural traits as it was not considered valid for use in this way, being designed to discriminate individuals with autism from typically developing individuals rather than measure the severity of any autistic-like symptomatology in typically developing individuals. As the RBQ was originally intended only as a screening measure (not enough siblings demonstrated an adequate number of repetitive behaviours for analyses using the RBI to be meaningful), only the overall sum was used rather than composite scores for each behavioural category. Comparisons between ASD and control siblings on the SBQ and RBQ were conducted to assess whether there was an increased incidence of behavioural symptomatology in ASD siblings. These comparisons were conducted both with the overall group and with siblings meeting full or partial ADI-R criteria excluded.
The summary variables from both measures were highly skewed and were not amenable to transformation, so non-parametric tests (Mann-Whitney U) were used for group comparisons. Two ASD siblings and three control siblings had missing data on all three questionnaires and were not included in analyses.

On the SBQ overall sum, there was no significant difference between ASD and control siblings, U = 3176.5, N1 = 106, N2 = 64, p > .1, and the difference remained non-significant when siblings meeting full or partial ADI-R criteria were excluded, U = 2710.0, N1 = 97, N2 = 64, p > .1. On the RBQ, however, there was a trend for parents to report more repetitive behaviours in control siblings than in ASD siblings, U = 2818.0, N1 = 106, N2 = 64, p = .06, and this difference became significant when siblings meeting ADI-R criteria for an ASD were excluded, U = 2320.5, N1 = 97, N2 = 64, p < .01. One explanation for these unexpected findings⁴ could be that parents of a child with autism were more likely to under-report autistic-like symptomatology in their non-autistic children, as their benchmark for comparison (e.g., what might be considered to be repetitive use of language) was set much higher. This seems more likely than ASD siblings actually displaying less behavioural symptomatology than control siblings, given previous research on the broad behavioural phenotype (see Chapter 5, Section 5.2.1). Of note, ASD siblings meeting full or partial ADI-R criteria scored significantly higher than remaining ASD siblings on both the SBQ and RBQ, as would be expected, which suggests that these measures successfully discriminated individuals with ASD diagnoses from those without ASD diagnoses, but were not accurate measures of symptom severity in individuals without ASD diagnoses.
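For readers unfamiliar with the Mann-Whitney U statistic reported above, the following Python sketch shows how it can be computed from ranks. It is a minimal illustration, not the software used for these analyses: the p-value uses a tie-corrected normal approximation without continuity correction (an assumption made here), and the function names and questionnaire sums are hypothetical.

```python
import math
from collections import Counter


def midranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U1, U2, two-sided p) from the tie-corrected normal approximation.

    Assumes the pooled scores are not all identical (otherwise the variance is zero).
    """
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    combined = list(group1) + list(group2)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, u2, p


# Hypothetical questionnaire sums for two small groups.
asd_sibling_sums = [1, 2, 2]
control_sibling_sums = [2, 3]
u1, u2, p = mann_whitney_u(asd_sibling_sums, control_sibling_sums)
```

The two U values always sum to N1 × N2, and under the null hypothesis each is expected to equal half of that product. With N1 = 106 and N2 = 64 the expected value is 106 × 64 / 2 = 3392, so the SBQ result of U = 3176.5 lies close to the null expectation.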
While it was initially intended to use data from these two questionnaire measures to examine correlations between cognitive and behavioural measures within the ASD sibling group, the outcomes of these group comparisons suggest that the behavioural data are not likely to be a valid indicator of behavioural severity, making correlations difficult to interpret. When these correlations were conducted with the overall ASD sibling sample, there were a number of significant correlations between both ToM and EF variables and behavioural measures (most of which remained significant when age, PIQ, and VIQ were partialled out), but when siblings meeting full or partial ADI-R criteria were excluded, many of these correlations became non-significant. Therefore, because of concerns about their interpretation, these correlations are not reported.

4 When results from the ASQ were analysed, a similar pattern emerged: when siblings with ASD diagnoses were excluded, it was found that control siblings scored more highly than ASD siblings in both the communication and interests domains as well as overall.

6.4 Discussion

6.4.1 Endophenotype status of ToM and EF impairments

In the introduction to this chapter, six features of endophenotypes were described, with criteria 4, 5, and 6 being tested in the current study. Did any ToM or EF variables meet these three criteria?

i) Criterion 4. The first criterion tested in this study was that the endophenotype should be found in siblings of probands with ASDs at a higher rate than in the general population (or in this case, a higher rate than siblings of control probands). Group comparisons between siblings of individuals with ASDs and siblings of controls revealed few significant differences. There were no significant group differences on measures of ToM, planning, set-shifting, non-verbal inhibition, or verbal generativity.
However, weaknesses in working memory (within an inhibition task) and non-verbal generativity emerged as the two main candidate endophenotypes. Importantly, group differences on these variables remained significant when siblings with ASDs were excluded. Although the group difference in non-verbal generativity became only marginally significant when individuals with non-ASD diagnoses were excluded, this does not necessarily mean that a non-verbal generativity deficit in the broad phenotype is an unimportant artefact of pathology unrelated to ASDs, but may reflect the possibility that the broad phenotype is itself characterised by higher rates of non-ASD diagnoses and these individuals also display a more abnormal cognitive profile⁵.

Sisters of ASD probands also performed significantly more poorly than control sisters on the simple first stage of the IDED set-shifting task (in the Learned Irrelevance condition). As there were no significant differences observed in later stages of the task which involve shifting set, this difference probably reflects either attentional or motivational differences rather than a deficit in a component of EF. As previously stated, the fact that ASD probands did not display a deficit on this simple first stage variable further indicates that it is not likely to represent a useful endophenotype for ASDs. It is unclear why the difference occurred only in sisters and only in one of the task conditions, but it may have been that ASD sisters were more prone to fatigue (the Learned Irrelevance condition was administered after the Perseveration condition). There were no other significant interactions between group and gender, suggesting that brothers and sisters of ASD probands were equally susceptible to any inherited cognitive weaknesses.

5 Of the ASD sibling group, 9.3% had a non-ASD diagnosis compared with 4.5% of control siblings.
Unexpectedly, control siblings showed significantly larger error and time difference scores on the Opposite Worlds task, which is superficially suggestive of a strength in verbal inhibition in ASD siblings. However, this difference resulted from a non-significant weakness in ASD siblings in the control condition combined with a nonsignificant weakness in control siblings in the inhibition condition. As the control siblings were not actually significantly poorer in the inhibition condition, it would be difficult to argue that the result reflects a strength in inhibition in the ASD siblings, and it appears more likely that it was a rather spurious outcome of two non-meaningful but additive differences. Overall, then, the results relevant to criterion 4 were consistent with the prediction, based on the results of Study One, that EF deficits would be more likely to demonstrate superior relative primacy over a ToM deficit as measured by their presence in siblings of individuals with ASDs. Non-autistic siblings of ASD probands exhibited a broad cognitive phenotype characterised by weaknesses in the non-verbal generation of novel ideas and working memory performance in situations combining working memory and inhibitory requirements, but no impairment in ToM. However, there are a number of caveats and additions to this conclusion. Firstly, the sensitivity of the ToM tasks used in this study may not have been sufficient to detect subtle weaknesses in mentalising abilities (although while ceiling effects were observed on the simple false belief task, there was a significant proportion of both ASD and control siblings who showed unstable performance on the other tasks). This limitation is particularly pertinent given the wide age range of participants in this study, which was larger than in Study 1. The use of more advanced and naturalistic ToM tasks would strengthen future broad phenotype studies (previous findings using higher-level ToM tasks are discussed further below). 
Nevertheless, it is apparent that the broad autism phenotype is not characterised by a significant impairment in basic ToM abilities.

Secondly, the prediction that the EF variables which showed the strongest evidence of primacy in Study One (i.e., verbal inhibition and verbal generativity) would be the most likely to emerge as endophenotypes in siblings was not borne out. ASD siblings did not show impairments in either verbal inhibition or verbal generativity (or in planning, which was also found to be impaired in probands). These negative findings call into question the primacy of those EF domains to ASDs – although alternatively, it is possible either that i) the tasks used in these domains were also lacking in sensitivity, or ii) impairments in these domains as well as in ToM always result in an ASD phenotype, and therefore are not seen in a milder form in unaffected relatives.

Thirdly, only non-verbal generativity performance significantly predicted membership in the ASD sibling group (although the RIL task shape error score also became a marginally significant predictor when age and IQ variables were included in the regression); and while the two variables together successfully predicted membership of the ASD sibling group in 90.8% of cases, they also misclassified 64.3% of control siblings. Hence, their utility as endophenotypes is limited by their poor uniqueness or specificity as markers of genetic vulnerability.

How do the results of sibling group comparisons compare with previous studies? As reviewed in Section 5.2.2.2 of Chapter 5, studies on cognitive deficits in siblings of autistic probands have generally found fewer and smaller differences than studies with parents, and in that sense this study is consistent with other sibling studies⁶. Only one previous study has employed similar false belief tasks with siblings of children with autism (Ozonoff et al., 1993), which also found no evidence of mentalising deficits.
However, the ToM tasks used in both Ozonoff et al.’s and the current study may not have been difficult or high-level enough to detect subtle weaknesses, especially in older siblings. Dorris et al.’s (2004) finding of impaired performance on the higher-level Eyes Task in siblings of children with Asperger syndrome suggests that ToM difficulties may indeed be revealed if more advanced ToM tasks are used, although the “purity” and validity of the Eyes task as a measure of ToM is questionable (Dorris et al., 2004).

EF in siblings of probands with ASDs has been investigated in two previous studies (Hughes et al., 1999; Ozonoff et al., 1993). Neither the interaction between working memory and inhibition nor non-verbal generativity was tested in these two studies, so the positive results in these domains in the current study are new findings. Unlike this study, planning difficulties in ASD relatives were reported in both of the two previous studies, and Hughes et al. (1999) also found evidence of weaknesses in set-shifting and verbal generativity in their ASD sibling sample. The sample size was much larger for the current study than for either of these two previous studies (108 siblings of probands with ASDs compared with 18 in Ozonoff et al. and 31 in Hughes et al.), ruling out power as an explanation for these discrepancies. As was the case in this study, Ozonoff et al. also included siblings of probands with other ASDs besides autism; however, Hughes et al. included only siblings of probands with a full diagnosis of autism, and the mean IQ of the probands in that study was also lower than in this study (even though siblings of the lower-functioning probands in WAFSASD were included, the mean PIQ and VIQ of the probands was still higher in this study than in Hughes et al.’s study). It is possible that the broad cognitive phenotype is expressed more strongly when the proband has a full autism diagnosis and is lower-functioning, therefore explaining the increased incidence of planning, set-shifting and verbal generativity deficits in the siblings in that study. However, contrary to this explanation, there were no significant partial correlations between proband IQ and sibling performance on planning, set-shifting or verbal generativity tasks in this study.

6 Moreover, the parents of probands with ASDs who were tested as part of the WAFSASD did show more EF deficits than the siblings (Wong, Maybery, Bishop, Maley, & Hallmayer, in preparation).

Another possible explanation for the discrepancies in the EF results obtained across sibling studies is that the tasks employed differed in such a way as to favour the siblings in this study. As discussed in Section 4.4.1 of Chapter 4, the administration of the ToL in this study differed slightly from most other studies in that forward planning was actively encouraged, which may have compensated for any weaknesses. While Hughes et al. (1999) used the original version of the IDED set-shifting task which had also been used to demonstrate set-shifting deficits in probands (Hughes et al., 1994), the modified version used in this study did not reveal deficits in probands either in Study One of this research or in high-functioning probands in previous research (Turner, 1997), making it unsurprising that siblings demonstrated intact performance on it in this study. It is also interesting, however, that Ozonoff et al. (1993) did not find any evidence of a deficit in cognitive flexibility in siblings using the WCST.
The lack of any significant differences on the verbal generativity tasks used in this study was a little more surprising, as Turner (1999) found that autistic probands actually showed more striking deficits on ideational fluency tasks (used in this study) than on the word fluency task used by Hughes et al. (1999). However, while only 90 s were allowed to generate responses in this study, Hughes et al. allowed 120 s, which may have been the extra time needed to reveal generativity difficulties in siblings of autistic probands.

Of additional note, Hughes et al. (1999) did not find overall group differences in their sibling groups on either of their continuous measures of planning or set-shifting, but found differences only when particular variables were dichotomised and the proportions of siblings classified as passers or failers were compared. In this study, set-shifting variables were already dichotomised (although using a different criterion from Hughes⁷), but no differences in the proportion of poor performers were found. When the current ToL data were re-analysed using a dichotomous pass/fail performance criterion (such that a “failer” was anyone who completed fewer than 50% of the problems in the minimum number of moves), again no group differences were revealed. It therefore does not appear that existing group differences were merely “hidden” by the methods of analysis used in this study; furthermore, the fact that Hughes et al. were only able to find significant group differences on certain variables after dichotomising their data suggests that the deficits observed lacked robustness and were not highly prevalent.

ii) Criterion 5. The familiality of ToM and EF abilities was assessed by calculating correlations between the cognitive functioning of ASD siblings and the IQ and cognitive functioning of ASD probands.
The only significant relationship to emerge (after the mediating effect of sibling IQ was controlled) was between proband PIQ and siblings’ false belief performance, indicating that siblings were more likely to perform poorly on false belief tasks if the proband with autism was low-functioning (non-verbally). Interestingly, this implies a small degree of familiality of ToM performance in ASD siblings, even though they did not display evidence of a ToM impairment and the correlation between proband and sibling false belief performance was not itself significant. There were no significant relationships between siblings’ EF performances and proband IQ or EF performances. This suggests that siblings’ EF abilities were not strongly familial, even on the tasks on which they displayed significant weaknesses. Hence, the results did not support the prediction that EF performances would be more likely to show evidence of familiality than ToM performance. Impairments in non-verbal generativity and working memory (in an inhibitory context), which were identified as potential endophenotypes on the basis of sibling group comparisons, did not meet the criterion of heritability. Do these results indicate that autism is not a genetic disorder, that there is no broad autism phenotype, or that the cognitive impairments displayed by ASD siblings were random and unrelated to genetic vulnerability? There are several alternative explanations. 
It is possible that i) different genetic factors underpin the variation found in siblings than in probands - for example, many genes are known to cause mental retardation, but these genes do not influence the normal variation of IQ; ii) the measures used lacked sensitivity to the milder deficits displayed by siblings, thereby weakening proband-sibling correlations; or iii) non-shared environmental factors contributed a significant amount of variance, therefore reducing the size of correlations. It is also of note that even under a perfect model of a monogenic disorder, the correlation between siblings would only be 0.5.

7 When Hughes’ criterion was used (i.e., achieving six consecutive correct responses on the EDS stage), there were still no significant group differences, although on this task version there were ceiling effects using this method (i.e., the large majority of siblings achieved criterion in both conditions).

The general lack of significant proband-sibling relationships in this study is consistent with previous studies by Piven et al. (1990) and Szatmari et al. (1993), both of which found no association between proband IQ and the cognitive functioning of first-degree relatives. No previous studies have reported direct correlations between probands’ and siblings’ performances on specific cognitive tasks.

iii) Criterion 6. The notion that any endophenotype displayed in non-affected family members would be less severe than in affected probands was not so much a criterion, but rather an expected feature of endophenotypes. The comparison of effect sizes of significant differences in this study with the proband differences in Study One confirmed this expectation, with smaller effect sizes displayed for both candidate endophenotype variables in this study.
However, it should be noted that the smaller effect sizes could have been caused by a smaller proportion of siblings than probands showing a deficit, rather than the severity of the deficit being milder across the sibling sample. It was not possible to directly compare the performances of the probands and siblings, as they were not matched on PIQ or VIQ8. The proportion of siblings showing a deficit is discussed further below, in Section 6.4.2. Concluding comments on endophenotype status. In summary, weaknesses in working memory (in a context where inhibition was required) and non-verbal generativity were identified in ASD siblings when compared with control siblings, making these two variables candidates for endophenotypes of ASDs (ASD sisters also showed poorer performance on a simple discrimination stage of the IDED set-shifting task, but this was ruled out as a potential endophenotype because probands did not show a deficit on that variable). However, while these two variables also met criterion 6 (i.e., deficits in those domains were less severe in siblings than in probands), they failed to meet criterion 5 (heritability). They also lacked specificity as predictors of ASD sibling group membership (misclassifying a high proportion of control siblings), and they did not show the strongest evidence of primacy in Study One. Therefore, the evidence for their validity and utility as endophenotypes for ASDs was not strong or consistent. 8 It was possible to compare the performances of only those probands and siblings who showed a deficit (as defined by a score worse than 1 SD from the mean) on the RIL task shape error score, as the proband and sibling samples were matched on age, PIQ, and VIQ in this case. Consistent with expectation, the ASD siblings showed significantly better performance than ASD probands on that variable when z-scores were compared, t(34) = 2.76, p < .01. 
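The caveat raised under Criterion 6, that a smaller group effect size may reflect fewer affected individuals rather than a milder deficit, can be made concrete with a small calculation. The Python sketch below is purely illustrative and rests on assumptions that are not part of this research: scores are standardised against an unaffected comparison group (mean 0, SD 1), a proportion of the target group is shifted by a fixed amount, the two groups are equal in size, and the function name is hypothetical.

```python
import math


def mixture_cohens_d(prop_affected, shift_sd):
    """Cohen's d between a reference group (mean 0, SD 1) and a group in which
    a proportion of members is shifted by shift_sd reference SDs.

    Assumes equal group sizes, so the pooled variance is the simple average.
    """
    mean_diff = prop_affected * shift_sd
    # Variance of a two-component mixture with unit within-component variance.
    mixed_var = 1 + prop_affected * (1 - prop_affected) * shift_sd ** 2
    pooled_sd = math.sqrt((1 + mixed_var) / 2)
    return mean_diff / pooled_sd


# Everyone mildly affected versus a quarter of the group severely affected.
d_all_mild = mixture_cohens_d(1.0, 0.5)
d_few_severe = mixture_cohens_d(0.25, 2.0)
```

Under these assumptions, a group in which every member shows a deficit of half a standard deviation yields d = 0.50, while a group in which only a quarter of members show a deficit of two standard deviations yields d of about 0.43. Two very different distributions of impairment can therefore produce similar group-level effect sizes, which is why the proportion of individuals showing a deficit needs to be examined separately.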
Overall, while EF deficits demonstrated greater relative primacy than a ToM deficit, neither ToM nor EF showed convincing evidence of primacy in this study (these results were fairly consistent with those of Study One, but were even less compelling). This outcome occurs in the context of frequent inconsistencies across sibling studies in the autism field and between studies of probands, siblings, and parents, and probably reflects the likelihood that genotype-phenotype relationships in the autism spectrum are complex and indirect, even when the phenotype is at the level of cognition (this is discussed further in Chapter 7). Nevertheless, further studies incorporating higher-level, more sensitive tasks may still prove useful in identifying more subtle weaknesses in family members.
6.4.2 Differentiating the multiple deficits models
The “subgroups” and “multidimensional spectrum” versions of the multiple primary deficits model of ASDs were both indirectly examined in this study. The lack of any ToM impairment in ASD siblings was in itself inconsistent with both of these models, indicating that the notion of a ToM-impaired sibling subgroup or the idea that a ToM impairment was the basis for certain types of subclinical symptomatology both lacked support. The prediction that any cognitive deficits would only occur in a subset of ASD siblings was confirmed, however, with impairments in working memory (in an inhibition context) and non-verbal generativity being demonstrated by 24.1% and 19.6% of siblings respectively. This is consistent with previous studies of cognitive abilities in relatives of individuals with ASDs, and suggests that endophenotypes or markers of genetic vulnerability are only expressed in a certain subgroup of relatives (although it is possible that impairments in other domains not measured here may turn out to be more prevalent among relatives).
The results of analyses examining the presence of ToM-EF dissociations further suggested that there may be more than one of these subgroups, each with a different cognitive profile. While impairment in ToM did not occur with a significantly greater frequency in ASD siblings than in control siblings, those ASD siblings who did display an unstable ToM were also more likely to show EF deficits than the ASD sibling group as a whole (and vice versa, those with a non-verbal generativity deficit were also more likely to show unstable ToM performance). However, there was also a high frequency of ToM-EF dissociations in both directions. This pattern of results suggests that within the subgroup of relatives expressing the endophenotype, there were two further subgroups: an “EF-impaired, ToM-intact” group and a “both ToM and EF impaired” group. However, it is also possible that these do not represent valid subgroups, but instead that ToM and EF performance varied on a more continuous spectrum (with the two spectrums covarying to some degree), and only some siblings fell on the “impaired” side of the arbitrary cutoff for impairment. It would be interesting to test the validity of the various sibling “subgroups” by investigating whether there are systematic differences between their genotypes (or, perhaps, gene-environment interactions; see Bauminger & Yirmiya, 2001).
Correlations between ToM and EF in siblings of individuals with ASDs were also investigated in this study, although given the lack of ToM impairment this set of analyses did not really address the “subgroups”-driven idea that ToM and EF impairments may be independent in ASD siblings (this would not be expected, given that the ASD siblings did not display a ToM deficit and the account outlined in Study One proposes that the relative independence between ToM and EF in probands occurs partly because their ToM deficit is caused by ToM-specific factors).
Instead, the results of ToM-EF correlations were more relevant to the ancillary issue of whether ASD siblings showed unusual patterns of association between the two domains. Results demonstrated that, compared to control siblings, ASD siblings did show a different pattern of associations between ToM and EF performances. Unlike control siblings, ASD siblings showed no significant partial correlations between measures of ToM and verbal inhibition or verbal generativity, but showed three significant partial correlations between ToM and non-verbal inhibition measures. Hughes et al. (1999) found similar evidence of unusual associations between tasks for ASD siblings (although the correlations were between EF tasks in that study). This is consistent with the notion that ASD siblings may use different strategies to solve cognitive tasks than siblings of children without autism, even though their overall level of performance may not be impaired. Nevertheless, it must be noted that control siblings and control probands also showed a different pattern of correlations between ToM and EF. One explanation for this may be that the siblings in this study were older, on average, than the probands in Study One and the age range was larger. If the hypothesis that the ToM-EF relationship changes with development is correct, these differences in correlations across the two control samples would be expected. As the ASD and control sibling groups were matched on age and similar in age range, the difference in the pattern of correlations between these two groups more meaningfully suggests that ASD siblings are characterised by unusual associations between ToM and EF. However, it is also possible that the different patterns of ToM-EF correlations displayed in different control samples could simply reflect the fact that ToM-EF relationships are weak or even spurious in some cases, resulting in variable outcomes in different samples.
The hypothesis that siblings (and probands) use unconventional strategies to solve ToM (or EF) tasks would be better addressed by systematically varying the problem-solving demands of ToM tasks and observing the differential effects of these manipulations in ASD and control samples. It was initially intended to examine the “multidimensional spectrum” idea in this study by calculating correlations between cognitive and behavioural measures within the ASD sibling group. Unfortunately, the behavioural measures of social impairment and repetitive behaviours did not appear to be valid indicators of behavioural severity in siblings without ASD diagnoses, with the higher levels of symptoms reported in control siblings probably reflecting a tendency for parents of children with ASDs to underreport subtle autistic-like symptomatology in non-autistic siblings (e.g., the parent of a child with autism may answer the question “Does your child pace or move around repetitively?” negatively with regard to their non-autistic child as compared with their child with autism, whereas a control parent may answer positively because many children display occasional restlessness). This meant that correlations between cognitive and behavioural measures could not be calculated, as the behavioural measures lacked validity. It was interesting, however, that this under-reporting was not evident on the measure of social behaviour, which may be an indication of an increased incidence of social impairment in ASD siblings (although this is highly speculative). Future investigations of cognitive-behavioural relationships in relatives of individuals with ASDs may benefit from the employment of observational measures of behaviour or other more direct measures which do not rely on parental report (this is discussed further in Chapter 7). 
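The subgroup-versus-spectrum question raised earlier in this section can be made concrete with a short simulation. The sketch below is hypothetical and uses no data from this research: it assumes that ToM and EF scores are standardised, weakly correlated, and continuously (normally) distributed, with impairment defined as a score more than 1 SD below the mean. The function name, correlation value, and sample size are illustrative choices only.

```python
import math
import random
from collections import Counter


def classify_profiles(n: int, correlation: float, cutoff: float = -1.0,
                      seed: int = 0) -> Counter:
    """Count impairment profiles produced by applying a cutoff to continuous scores.

    Scores are drawn from a bivariate standard normal distribution with the
    given correlation, so there are no true subgroups in the simulated data.
    """
    rng = random.Random(seed)
    residual_weight = math.sqrt(1.0 - correlation ** 2)
    counts = Counter()
    for _ in range(n):
        tom = rng.gauss(0.0, 1.0)
        # Build an EF score that correlates with ToM at the requested level.
        ef = correlation * tom + residual_weight * rng.gauss(0.0, 1.0)
        tom_label = "ToM impaired" if tom < cutoff else "ToM intact"
        ef_label = "EF impaired" if ef < cutoff else "EF intact"
        counts[(tom_label, ef_label)] += 1
    return counts


# Hypothetical sample size and correlation, chosen for illustration only.
profiles = classify_profiles(n=10_000, correlation=0.3)
for profile, count in sorted(profiles.items()):
    print(profile, count)
```

All four profiles, including dissociations in both directions, appear even though the simulated scores form a single continuous distribution. The presence of such cells in observed data is therefore compatible with either account, and does not by itself establish distinct subgroups.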
In sum, then, the results from this study were not able to contribute as much as was hoped to the question of which multiple deficits model may be the most appropriate for ASDs. Indeed, the results relevant to endophenotype status did not identify multiple deficits in siblings (i.e., deficits in both ToM and EF domains). It was nevertheless evident that the two EF deficits showing the most promise as endophenotypes only characterised a subgroup of ASD siblings, and that the presence of an additional ToM deficit may represent a more subsidiary subgroup. However, it was not possible to determine whether these represented valid subgroups (as opposed to ends of a spectrum) or whether they were also associated with increased levels of subclinical behavioural symptomatology. The role of ToM and EF in ASDs therefore remains a question with a somewhat nebulous answer.
CHAPTER 7
General Discussion: Constructing an Explanatory Model for ASDs
7.1 Summary of the findings
7.2 Methodological strengths and limitations
7.3 Conclusions on constructing an explanatory model for ASDs
7.4 Future directions
7.1 Summary of the findings
The major findings of this research may be summarised as follows:
1. Individuals with ASDs demonstrated a profile of spared and impaired cognitive abilities which differed in important ways from previous research. They showed impairments in ToM, planning, verbal inhibition, working memory (when inhibitory control was also required), and both verbal and non-verbal generativity, but intact performance on tests of awareness of social norms, set-shifting, non-verbal inhibition and relational reasoning. Deficits on verbal tasks were more common than on non-verbal tasks, and several task performances were mediated by VIQ.
The deficits in verbal inhibition and in working memory in an inhibitory context were new findings which suggested that the previously proposed “typical EF profile” of individuals with ASDs, in which inhibition is spared (e.g., Ozonoff & Jensen, 1999), may need revision.
2. Results did not support a single primary cognitive deficit model of ASDs. Neither ToM nor EF deficits met the criteria of universality or explanatory value. This supports the findings of several previous studies which have demonstrated similar outcomes (as described in Chapter 2).
3. However, EF deficits showed superior relative primacy compared with a ToM deficit, as judged by their superior ability to discriminate individuals with ASDs from controls, and the higher number of significant correlations with aspects of behavioural symptomatology. In particular, deficits in verbal inhibition and verbal generativity appeared to be the most primary.
4. ToM and EF were found to be largely independent deficits in ASDs, as measured by the paucity of significant correlations between the two domains and the dissociability of the impairments in both directions. This was the first study to demonstrate this independence of the two deficits in ASDs. It indicated that although EF deficits were relatively more primary, they could not explain or subsume ToM as a secondary deficit. These findings were also inconsistent with both the “common conceptual bases” and “emergence” accounts of the ToM-EF relationship in typical development, thereby providing the most support for the “common neuroanatomical bases” and/or “expression” accounts.
5. No ToM or EF variables demonstrated strong or consistent potential as endophenotypes for ASDs, although EF deficits showed better potential than a ToM deficit.
Weaknesses in working memory (in a context where inhibition was required) and in non-verbal generativity were identified in ASD siblings when compared with control siblings; however, performance in these domains was not strongly familial and the variables lacked specificity as predictors of ASD sibling versus control group membership.
Hence, in sum, ToM and EF were found to be independently impaired in ASDs, but neither impairment was universal, showed strong relationships with symptoms, or was a useful candidate for an endophenotype for ASDs. EF deficits consistently showed superior primacy in comparison with ToM. The results of both studies indicated that a multiple primary deficit model is more suitable for ASDs than a single primary deficit model, but it was not possible to determine which type of multiple deficits model was the most appropriate. There appeared to be different subgroups of both ASD probands and ASD siblings demonstrating different cognitive profiles, but results were also compatible with the notion of a more continuous spectrum (with the apparent subgroups an artefact of the arbitrary cutoff for the definition of “impairment”). The way in which these subgroups or spectrums should be defined behaviourally was also unclear, although results were more consistent with a classification system based on level of functioning as opposed to symptom domains or symptom severity. The possibility that the primacy of deficits changes with development also remains open, and there may be other equally primary deficits which were not measured in this research.
7.2 Methodological strengths and limitations
As described in Chapters 4 and 6, the current research incorporated several methodological improvements upon previous studies. One of the major strengths was the use of relatively process-pure tests of a range of EF components, several of which included in-built control conditions allowing isolation of the relevant ability.
Tests requiring both verbal and non-verbal responses were used, and most tasks had several levels of difficulty in order to be suitable for individuals of a wide range of ages. Large sample size was also a significant strength of this research; for example, the number of siblings who participated in Study Two was more than three times higher than the number for the largest previous study on ToM or EF in siblings. Statistical approaches were thorough and the effects of potentially confounding variables such as age and IQ were carefully examined and accounted for throughout all analyses.
One of the methodological weaknesses of this research was the limited range of ToM measures employed. While the Dewey Stories task was included as a higher-level social cognition measure, it was of questionable validity as a measure of ToM. The otherwise exclusive choice of false belief tasks was the result of i) the fact that theories of the ToM-EF relationship have been based largely around false belief, ii) the need to constrain the length of the test battery, and iii) the fact that the few high-level, advanced ToM tasks that exist suffer from the problem of a lack of process purity (this is discussed further in Section 7.4). Based on previous research (e.g., Baron-Cohen, 1989b), it was also expected that the failure rate of ASD probands on the false belief tasks used (particularly the second-order tasks) would be higher than was found in this research – with the current result indicating that a ToM deficit is not as severe or prevalent in ASDs as some authors have argued (e.g., Baron-Cohen, 1995; Leslie & Roth, 1993). The high success rates of control probands and both ASD and control siblings on the false belief tasks caused difficulties for the interpretation of task results as indicative of a lack of ToM impairment in ASD siblings or of a mild severity of impairment in ASD probands.
However, as discussed in Section 4.4.2 of Chapter 4, the lack of discriminative ability (or “uniqueness”) and explanatory value of ToM in Study One was not easily dismissible as a consequence of the level of difficulty of false belief tasks, as i) ToM and EF deficits were of roughly equal prevalence in the ASD group, ii) a significant proportion of individuals showed impaired performance on ToM tasks but unimpaired performance on EF tasks, and iii) performance on all of the false belief tasks was far from the ceiling in the ASD group – as confirmed by the significant medium-level correlations between false belief performance and VIQ (which also suggests that the measures have some reliability). In addition, although ToM performance in the control group and in the two sibling groups was high, it was not at ceiling. The use of an additional higher-level, more sensitive ToM task would nevertheless have strengthened this research, particularly for the detection of any subtle weaknesses in mentalising ability in ASD siblings, especially those in middle childhood and older.
Another possible limitation was the use of parental questionnaires and interviews as indices of the presence and severity of behavioural symptomatology. Parental report is subjective and dependent upon the individual parent’s framework for judging abnormality. More direct and objective methods such as systematic behavioural observation techniques may have provided more valid measures of behavioural variation in individuals without ASDs and possibly resulted in stronger relationships with underlying cognitive deficits.
Bishop and Norbury (2002) found that diagnostic measures of autism based on parental interview (the ADI-R) and direct observation (the ADOS-G: Autism Diagnostic Observation Schedule – Generic; Lord et al., 2000) resulted in widely discrepant outcomes for several children, and they noted that it is usually recommended that information from parental report and observational techniques be combined. However, observational methods have their own limitations, such as time-intensiveness and the possibility of poor ecological validity (as behaviour can only be observed for a limited time and in a restricted range of situations, and the presence of an observer may alter the nature of the behaviour displayed). There are also essentially no observational scales available for social/communicative functioning and repetitive behaviours which are appropriate for recording both normal and abnormal variation in these behavioural domains, rather than being directed at diagnosing pathology.
While the characteristics of the ASD proband sample (e.g., age, level of functioning, range of symptom severity) were not considered to be a major limitation as the sample was generally appropriate for the research aims and the tasks used, it should be recognised that the sample characteristics limit the scope of the conclusions. Szatmari and colleagues have suggested that low-functioning autism may arise from different genetic mechanisms from high-functioning autism (e.g., Szatmari, 1999; Szatmari et al., 2002); therefore, the high-functioning nature of the ASD probands in this research may limit the generalisability of the findings. However, the inclusion of low-functioning probands would have required substantial alterations to the design of the study, as many of the tasks were inappropriate for individuals with mental retardation.
Similarly, the relatively old age of the sample limited the conclusions that could be drawn particularly with regard to the possibility of changes in the primacy of and relationship between ToM and EF impairments with development, but again the inclusion of participants below the age of five years would have caused difficulties for task selection. The non-matching of the ASD and control probands on VIQ was not ideal, but this is a somewhat inevitable aspect of autism research given the typical PIQ-VIQ discrepancy displayed by individuals with ASDs (making it difficult to match controls on both PIQ and VIQ), and VIQ was taken into account in all analyses. The analysis of ASD probands with different ASD diagnoses (e.g., autism, Asperger syndrome) together as one sample may also be subject to criticism, but its validity was attested by the finding that significant differences between probands meeting full ADI-R criteria for autism and those meeting partial ADI-R criteria occurred only on one task. Nevertheless, the variability of the sample is likely to have increased standard deviations which may have reduced the likelihood of finding significant or large differences from controls in group comparisons (marginal differences were found on several tasks, and several significant differences were attenuated when age and/or IQ variables were controlled).
Finally, although there were a small number of control probands with mild mental retardation, the control proband group consisted largely of typically developing individuals, matched to ASD probands on age and PIQ. Without having a control group of individuals with other disabilities (e.g., Down’s syndrome), it is not possible to test whether simply having any developmental disability may have resulted in some of the deficits observed.
This distinction is necessary for any deficit to meet the uniqueness criterion for primacy, and it is also relevant for the interpretation of deficits in ASD siblings, who may show adverse effects of living with a sibling with a disability (although it is difficult to see why this would affect certain EF components and not others; for further discussion, see Bauminger & Yirmiya, 2001).
7.3 Conclusions on constructing an explanatory model for ASDs
Taking into account these constraints, what broad conclusions about explanatory models of ASDs can be made on the basis of this research? As already stated, we can fairly confidently reject a conceptualisation of autism as a unitary syndrome with a single primary cognitive deficit. Notwithstanding the possibility that there is another cognitive deficit which was not measured in this research and which could explain both ToM and EF deficits as secondary to it (which is unlikely as ToM and EF were found to be unrelated impairments), the current findings clearly and consistently demonstrate that ASDs cannot be explained by a single primary deficit. These findings consolidate recent research on cognitive impairments in ASDs, which has increasingly moved away from the notion of a single primary deficit. Psychologists studying cognitive deficits in ASDs have to some extent lagged behind other researchers focussing on genetic and neurobiological aetiologies, who have been arguing for some time that “any attempt to demonstrate a single cause for all cases of autism appears to be futile” (Gillberg & Coleman, 1992, p. 283). This lag was driven by the hope that the identification of a single cognitive deficit would provide a diagnostic marker for autism and a unified explanation for the range of unusual behaviours displayed by individuals with ASDs.
However, it could be argued that the failure to find such a cognitive marker was somewhat predictable given the heterogeneity evident at the genetic, neurobiological, and behavioural levels of explanation. This highlights the importance of an integrated approach to ASD research, where findings from all levels of explanation constrain and inform each other (see Bailey et al., 1996; Tager-Flusberg, 1999a).
Nevertheless, recognition of this need for integration is only the first step towards the discovery of which kind of multiple deficits model may best explain ASDs. Several key questions remain. Are ASDs best conceptualised as a group of distinct subtypes or a multidimensional spectrum? How should these subgroups or spectrums be defined and operationalised? Are they associated with different genotypes or neuropathologies? Are there other cognitive impairments besides ToM and EF which may be equally primary in ASDs, and if so what are they? Do the various cognitive impairments change in primacy or causal status with development? It may be the case that a combination of the various multiple deficits models will end up forming the best explanatory paradigm; for example, there may be a multidimensional autism spectrum within which certain clusters often occur (this kind of model has been proposed previously by Beglinger & Smith, 2001), where more than two cognitive deficits are present and these deficits change in primacy throughout development. The problem with this sort of model is that its complexity makes it very difficult to test empirically.
The methodological and conceptual difficulties with determining which integrated causal model can explain ASDs are characteristic of research on complex genetic disorders in general, especially when dealing with disorders of development. The task of beginning with behaviour and tracing the causal chain back through cognition to the level of biology is full of hazards.
The mapping of genotype to phenotype is neither direct nor specific (Karmiloff-Smith et al., 2002), and a neurological abnormality which occurs early in development can trigger a complex chain of both structural and functional changes. This means that diverse pathogenic processes may lead to similar behavioural phenotypes, and conversely, similar pathogenic processes may lead to divergent behavioural symptoms (Courchesne, Townsend, & Chase, 1995). In the words of Gottesman and Gould (2003):
In diseases with classic or Mendelian genetics as their distal causes, genotypes are usually indicative of phenotypes. However, this degree of genetic certainty does not exist for diseases with complex genetics. Genetic probabilism aptly describes the process by which a particular genotype gives rise to phenotype. Epigenetic factors may also be of critical importance for modifying the development of phenotypes, and such modifications may be influenced by genotype or environment or be entirely stochastic in origin. Thus, models of complex genetic disorders predict a ballet choreographed interactively over time among genotype, environment, and epigenetic factors, which gives rise to a particular phenotype (p. 636 – references not included).
The recognition of this multilayered complexity will be necessary to make progress in the construction of an explanatory model for ASDs. While admirable attempts at integrated models have been made (e.g., Courchesne et al., 1995; Dawson et al., 2002b; Waterhouse et al., 1996), we are still far from understanding how to tie together the diverse array of often inconsistent findings across all levels of explanation. This is in no small part because our understanding of the interactions between genes, neurobiology, cognition and behaviour in typical development is crude and fragmentary at best.
7.4 Future directions
An integrative approach to autism research ideally requires both clarity within each level of explanation and consistency and integration between the various levels of explanation. Therefore, further research needs to address remaining questions at the cognitive level of explanation, as well as the integration of cognitive findings with research on behavioural outcomes, neurobiological substrates, and genetic mechanisms.
So which issues at the cognitive level of explanation deserve further attention? Although neither ToM nor EF impairments appear to be a core marker for autism, the study of their nature, primacy and relationship can still inform research on causal models of ASDs as well as the study of ToM, EF, and their relationship in typical development. Firstly, there is a clear need to develop more high-level, ecologically valid measures of ToM. It is evident that ToM develops beyond the ability to understand false belief, and yet there are few tasks available for investigating more advanced ToM development. Those that are available (e.g., the Eyes task, Strange Stories) suffer from a lack of process purity, relying heavily on other abilities such as face perception, emotion recognition, and verbal comprehension. Ecological validity is an important task property as attempts to rigorously control the conditions of the task can end up altering the essence of the phenomenon under study (Volkmar et al., 2004), but the search for ecological validity often comes at the expense of task purity. The challenge is therefore to develop ecologically valid tasks which have in-built control conditions that allow isolation of the target ability. Such tasks would allow more precise examination of the extent of higher-level ToM development in individuals with ASDs and their first-degree relatives.
They would also aid the investigation of the nature of the ToM-EF relationship in children and adults over the age of five, which will be important both for the extension of theory and empirical findings on the ToM-EF relationship at later ages and for the interpretation of findings on the ToM-EF relationship in clinical samples in this older age range. The development of ToM tasks which incorporate the component process approach may also help uncover any compensatory strategies which may be used to aid ToM performance. The use of alternative strategies often becomes a “default” explanation for intact performance on ToM tasks, yet this hypothesis has not been directly tested, relying only on indirect evidence such as neuroimaging data. More direct tests could involve systematic manipulation of the problem-solving requirements of high-level ToM tasks to examine how performance is affected when certain strategies cannot be used. Such multiple-condition tasks may also represent a method of investigating the validity of one of the proposed explanations for the intriguing lack of correlations between ToM and EF in individuals with ASDs (i.e., the possibility that the lack of significant correlations between ToM and EF in individuals with ASDs who show EF impairment is due to the use of alternative strategies for ToM performance; see Section 4.4.3 in Chapter 4).
A number of aspects of EF in ASDs also merit further investigation. The findings of this research suggest that deficits in generativity play a key role in ASDs. The impairment in verbal generativity displayed by ASD probands demonstrated the largest effect size, was one of the most prevalent deficits, and was a significant discriminator between the ASD and control groups, and an impairment in non-verbal generativity was one of the few significant deficits to emerge in ASD siblings.
Further investigation of generativity in individuals with ASDs across a wider age range and using a wider range of tasks therefore appears worthwhile, although it awaits the development of generativity tasks appropriate for young children. It is interesting to note that generativity tasks are generally the most unstructured of EF tasks, as they require the participant to produce novel responses, as opposed to reacting to stimuli presented to them as part of a structured task. It would therefore also be interesting to see whether the apparent severity of impairment displayed on generativity tasks is due to a specific problem with generativity, or whether EF impairments in general are better detected and therefore more severe on unstructured tasks which are more representative of many real-life situations (i.e., have higher ecological validity), regardless of which EF component is involved. This could be addressed by designing more unstructured tests of other EF components.
Impairments in verbal inhibition and on tasks requiring a combination of inhibitory and working memory requirements were also new findings in this research which await replication. If further studies confirm the existence of an inhibitory impairment in ASDs, this raises additional questions about the discriminant validity of the EF profile in ASDs as compared with other disorders such as ADHD. Even if EF impairments are not singularly primary in autism and therefore do not strictly need to meet the uniqueness criterion, the question remains as to why similar EF impairments result in such different behaviours in different disorders. Is it the case that EF impairments must co-occur with certain other cognitive impairments, or emerge at a particular point in development - or both - in order to produce the unique behaviours displayed by individuals with ASDs?
The integration of cognitive and behavioural levels of explanation is the next challenge in constructing an integrated explanatory model of ASDs. To begin with, in order to examine the relationships between cognition and behaviour in a more precise manner, we first require more accurate measures of behaviour. As mentioned previously, parental report and observational techniques each have their own set of problems. A combination of approaches may be necessary to gain a complete picture of the nature and severity of behavioural symptomatology (Bishop & Norbury, 2002). However, it will first be necessary to develop observational scales which are appropriate for capturing both the normal range of variation in behaviour and the extremes of abnormality. Without this, it remains unclear whether ToM and EF impairments do actually underlie the behaviours that they are commonly purported to – or indeed whether they hold any explanatory value at all. If ToM and/or EF do not show strong relationships with behaviour, it remains possible that they are simply pleiotropic effects – that is, they may be related to the genetic mechanisms which cause autism but unrelated to its behavioural phenotype. This does not seem plausible given the results of previous research and our knowledge about the behavioural effects of cognitive deficits such as EF impairment in other disorders (e.g., individuals with frontal lobe damage), but it is a possibility which needs to be ruled out using valid behavioural measures. The need for longitudinal studies which track the development of cognition and behaviour in children with ASDs from an early age is a theme which has recurred throughout this thesis as well as autism research in general. 
These studies will be crucial for i) determining whether ToM or EF impairments have causal precedence, ii) examining relationships between ToM and EF impairments and their proposed precursors such as joint attention (e.g., Leekam & Moore, 2001; Mundy, 2003) and imitation (Rogers, 1999; Rogers & Pennington, 1991), and iii) investigating how early cognitive deficits affect the nature and severity of both early and later behavioural symptomatology.
Until recently, cognitive theories of ASDs have largely ignored the process of development and instead proposed essentially static impairments which supposedly persist throughout the affected individual’s lifetime. However, the importance of considering developmental factors when conducting research on developmental psychopathologies is being increasingly emphasised (e.g., Bishop, 1997; Karmiloff-Smith, 1992; Tager-Flusberg, 1999a; Thomas & Karmiloff-Smith, 2002). For example, Karmiloff-Smith (1997) recommended six changes of approach for research in developmental cognitive neuroscience:
1. The recognition that plasticity is the rule, not simply a specialised response to injury.
2. The identification of constraints on plasticity.
3. A focus on the dynamics of development at multiple levels.
4. The recognition that specialisation within some brain regions is the product of development, not its starting point.
5. A focus not only on the end state but also how the child progressively develops to the end state.
6. The in-depth analysis of the different processes by which seemingly normal surface behaviour can be produced by a brain that has developed differently from the outset.
(p. 514)
Increased recognition of the role of interactive developmental processes in cognitive performance and behavioural outcomes in the field of autism has paralleled this more general shift (e.g., Bowler, 2001; Burack, Charman, Yirmiya, & Zelazo, 2001; Courchesne et al., 1995; Happé, 2001; Steele, Joseph, & Tager-Flusberg, 2003; Tager-Flusberg, 2001). However, while the need for longitudinal studies is often discussed, such studies are rarely conducted. This is partly because autism is still difficult to diagnose early, although recent progress in early diagnosis (see Charman & Baird, 2002) may make longitudinal studies easier to conduct. Targeting newborn siblings of individuals with ASDs, who have an increased likelihood of developing an ASD, is another method of identifying possible participants early for longitudinal monitoring. The ability to conduct studies of cognitive impairment from a very young age has also been hampered by the lack of appropriate tasks for young children, although again such tasks are becoming increasingly available.
How might findings at the cognitive level of explanation inform research on the neurobiological substrates of ASDs? One possible avenue results from the finding that ToM and EF deficits were independent in ASDs, which indicates that their co-occurrence is most likely to be explained by their neuroanatomical proximity. This suggests that the functioning of the prefrontal cortex, both in its ventromedial and dorsolateral aspects, is disrupted in individuals with ASDs. However, there is no clear evidence of structural frontal abnormality in ASDs. Instead, it is possible that cortical networks involving frontal regions may have been disrupted during development, or that neurotransmitters which are particularly active in these regions are deficient¹ (e.g., a dopaminergic deficit may underlie the range of cognitive deficits displayed by individuals with autism, as suggested by Pennington et al., 1997).
It therefore appears that investigations of the development of cortical networks involving the prefrontal cortex and neurotransmitter systems which heavily populate frontal areas would be worthwhile targets for neurobiological studies of ASDs.
¹ However, note that if the severity of ToM and EF deficits depended directly on the extent of neurotransmitter deficiency, then significant correlations between ToM and EF might be expected in the ASD population.
The relationship between cognitive (and behavioural and neurobiological) findings and underlying genetic mechanisms will arguably be one of the most important and fruitful links in the search for the aetiology of autism. This research suggested that there may be different subgroups of individuals with ASDs, possibly defined by their level of functioning, which have different ToM and EF profiles. Similarly, relatives of individuals with ASDs who demonstrated endophenotypes or cognitive vulnerability markers also appeared to show a variety of cognitive profiles. However, in both probands and siblings, it was also possible (and perhaps even more likely) that cognitive performances varied on a more continuous spectrum, with the apparent subgroups a result of classifying individuals scoring below a certain point as “impaired”. One method of distinguishing between these two possibilities would be to conduct a cluster analysis (based on cognitive performances) on a large sample of individuals with ASDs and their relatives, and test firstly whether any meaningful clusters emerged which showed unique behavioural characteristics, and secondly whether any such clusters showed distinct genotypic markers and/or neuropathologies. This kind of research has been conducted previously with promising results (Dawson, Klinger, Panagiotides, Lewy, & Castelloe, 1995; this study used the subgroup classification system proposed by Wing and Gould (1979) based on differences in social behaviour rather than using cluster analysis).
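To make the proposed cluster-analytic step concrete, the following is a minimal sketch only, not an analysis carried out in this research. It assumes k-means as the clustering technique (the proposal above does not commit to any particular algorithm), it uses invented standardised ToM and EF scores, and the names kmeans, within_ss, and scores are hypothetical. It compares how tightly a one-cluster and a two-cluster solution fit the same scores.

```python
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means on a list of equal-length numeric tuples.

    Returns (labels, centroids). Illustrative only: a single start and a
    fixed number of iterations, with no formal convergence check.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k sampled observations
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each participant joins the nearest centroid.
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def within_ss(points, labels, centroids):
    """Total within-cluster sum of squared distances (lower means tighter)."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, centroids[lab]))
        for p, lab in zip(points, labels)
    )


# Hypothetical (ToM, EF) z-scores, invented for illustration; not thesis data.
scores = [
    (-1.9, -0.2), (-2.1, 0.1), (-1.8, 0.0), (-2.0, -0.1),  # low ToM, average EF
    (0.1, -2.0), (-0.1, -1.8), (0.0, -2.2), (0.2, -1.9),   # average ToM, low EF
]

labels, centroids = kmeans(scores, k=2)
one_label, one_centroid = kmeans(scores, k=1)

print("two-cluster labels:", labels)
print("within-cluster SS, k=2:", round(within_ss(scores, labels, centroids), 3))
print("within-cluster SS, k=1:", round(within_ss(scores, one_label, one_centroid), 3))
```

Because the within-cluster sum of squares generally shrinks as clusters are added, a smaller value for two clusters than for one cannot by itself show that discrete subgroups exist. Any clusters found this way would still need to be validated against behavioural, genetic, or neuropathological measures, as outlined above.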
The notion of different subgroups of individuals with ASDs characterised by different genetic mechanisms has been previously proposed to explain the heterogeneity evident at all levels of explanation (e.g., Szatmari, 1999; Tager-Flusberg & Joseph, 2003). However, it remains to be seen how these subgroups should be defined, or indeed, whether the notion of subgroups holds any validity at all. The kind of multi-level approach proposed above seems the most appropriate way of approaching this problem, although cluster analysis is limited by the absence of objective rules for defining the boundaries of each subgroup (Lorr, 1994).
The current research has consolidated and extended previous work by demonstrating that i) neither ToM nor EF impairments meet criteria for a single primary deficit in ASDs, ii) ToM and EF impairments are independent and do not explain each other, and iii) multiple deficits models involving subgroups or spectrums which are probably not based on symptom domains or severity, and where deficits are not considered static and unchanging, are the best place to focus future research efforts. Studies of cognitive mechanisms in ASDs and their relationship with behaviour and biological substrates should move away from attempting to find a specific core cognitive deficit which could “explain autism” and instead focus upon mapping the profile of deficits and examining how these deficits change over time and interact with the other levels of explanation. The challenge will be to develop creative methods and strategies for implementing an integrated, developmental approach which recognises the complexities and dynamics of the genotype-phenotype interactions that underlie autism.
REFERENCES
Abu-Akel, A. (2003). A neurobiological mapping of theory of mind. Brain Research Reviews, 43, 29-40.
Akshoomoff, N., Pierce, K., & Courchesne, E. (2002). The neurobiological basis of autism from a developmental perspective.
Development & Psychopathology, 14(3), 613-634.
Alexander, M. P. (2002). Disorders of language after frontal lobe injury: Evidence for the neural mechanisms of assembling language. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 159-167). London: Oxford University Press.
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders. (4th ed.). Washington, DC: Author.
Ames, D., Cummings, J. L., Wirshing, W. C., Quinn, B., & Mahler, M. (1994). Repetitive and compulsive behavior in frontal lobe degenerations. Journal of Neuropsychiatry & Clinical Neurosciences, 6(2), 100-113.
Anderson, C. V., Bigler, E. D., & Blatter, D. D. (1995). Frontal lobe lesions, diffuse damage, and neuropsychological functioning in traumatic brain-injured patients. Journal of Clinical & Experimental Neuropsychology, 17(6), 900-908.
Anderson, P. (2002). Assessment and development of executive function (EF) during childhood. Child Neuropsychology, 8(2), 71-82.
Anderson, P., Anderson, V., & Lajoie, G. (1996). The Tower of London Test: Validation and standardization for pediatric populations. Clinical Neuropsychologist, 10(1), 54-65.
Anderson, S. W., Bechara, A., Damasio, H., Tranel, D., & Damasio, A. R. (1999). Impairment of social and moral behavior related to early damage in human prefrontal cortex. Nature Neuroscience, 2(11), 1032-1037.
Anderson, S. W., Damasio, H., Jones, R., & Tranel, D. (1991). Wisconsin Card Sorting Test performance as a measure of frontal lobe damage. Journal of Clinical & Experimental Neuropsychology, 13(6), 909-922.
Anderson, V. (1998). Assessing executive functions in children: Biological, psychological, and developmental considerations. Neuropsychological Rehabilitation, 8(3), 319-349.
Anderson, V. A., Anderson, P., Northam, E., Jacobs, R., & Catroppa, C. (2001). Development of executive functions through late childhood and adolescence in an Australian sample.
Developmental Neuropsychology, 20(1), 385-406.
Anderson, V., Levin, H. S., & Jacobs, R. (2002). Executive functions after frontal lobe injury: A developmental perspective. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 504-527). London: Oxford University Press.
Astington, J. W., Harris, P. L., & Olson, D. R. (Eds.). (1988). Developing theories of mind. Cambridge: Cambridge University Press.
Atkinson, R., & Shiffrin, R. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation. New York: Academic Press.
August, G. J., Stewart, M. A., & Tsai, L. (1981). The incidence of cognitive disabilities in the siblings of autistic children. British Journal of Psychiatry, 138, 416-422.
Bach, L., Davies, S., Colvin, C., Wijeratne, C., Happé, F., & Howard, R. (1998). A neuropsychological investigation of theory of mind in an elderly lady with frontal leucotomy. Cognitive Neuropsychiatry, 3(2), 139-159.
Bach, L. J., Happé, F., Fleminger, S., & Powell, J. (2000). Theory of mind: Independence of executive function and the role of the frontal cortex in acquired brain injury. Cognitive Neuropsychiatry, 5(3), 175-192.
Bachevalier, J. (1994). Medial temporal lobe structures and autism: A review of clinical and experimental findings. Neuropsychologia, 32(6), 627-648.
Bachevalier, J., & Loveland, K. A. (2003). Early orbitofrontal-limbic dysfunction and autism. In D. Cicchetti & E. Walker (Eds.), Neurodevelopmental mechanisms in psychopathology (pp. 215-236). New York: Cambridge University Press.
Baddeley, A. D. (1986). Working memory. Oxford: Oxford University Press.
Baddeley, A. (1996). Exploring the central executive. Quarterly Journal of Experimental Psychology A, 49A(1), 5-28.
Baddeley, A. (2002). Fractionating the central executive. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 246-260). London: Oxford University Press.
Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8). New York: Academic Press.
Bailey, A., Le Couteur, A., Gottesman, I., & Bolton, P. (1995). Autism as a strongly genetic disorder: Evidence from a British twin study. Psychological Medicine, 25(1), 63-77.
Bailey, A., Palferman, S., Heavey, L., & Le Couteur, A. (1998). Autism: The phenotype in relatives. Journal of Autism & Developmental Disorders, 28(5), 369-392.
Bailey, A., Phillips, W., & Rutter, M. (1996). Autism: Towards an integration of clinical, genetic, neuropsychological, and neurobiological perspectives. Journal of Child Psychology and Psychiatry, 37(1), 89-126.
Baird, G., Charman, T., Baron-Cohen, S., Cox, A., Swettenham, J., Wheelwright, S., & Drew, A. (2000). A screening instrument for autism at 18 months of age: A 6-year follow-up study. Journal of the American Academy of Child and Adolescent Psychiatry, 39(6), 694-702.
Baird, T. D., & August, G. J. (1985). Familial heterogeneity in infantile autism. Journal of Autism & Developmental Disorders, 15(3), 315-321.
Baron-Cohen, S. (1988). Social and pragmatic deficits in autism: Cognitive or affective? Journal of Autism & Developmental Disorders, 18(3), 379-402.
Baron-Cohen, S. (1989a). Are autistic children "behaviorists"? An examination of their mental-physical and appearance-reality distinctions. Journal of Autism & Developmental Disorders, 19(4), 579-600.
Baron-Cohen, S. (1989b). The autistic child's theory of mind: A case of specific developmental delay. Journal of Child Psychology & Psychiatry & Allied Disciplines, 30(2), 285-297.
Baron-Cohen, S. (1989c). Do autistic children have obsessions and compulsions? British Journal of Clinical Psychology, 28(3), 193-200.
Baron-Cohen, S. (1989d). Perceptual role taking and protodeclarative pointing in autism. British Journal of Developmental Psychology, 7(2), 113-127.
Baron-Cohen, S. (1991a).
Do people with autism understand what causes emotion? Child Development, 62, 385-395.
Baron-Cohen, S. (1991b). Precursors to a theory of mind: Understanding attention in others. In A. Whiten (Ed.), Natural theories of mind: Evolution, development and simulation of everyday mindreading (pp. 233-251). Oxford: Blackwell.
Baron-Cohen, S. (1991c). The theory of mind deficit in autism: How specific is it? British Journal of Developmental Psychology, 9(2), 301-314.
Baron-Cohen, S. (1992). Debate and argument: On modularity and development in autism: A reply to Burack. Journal of Child Psychology & Psychiatry & Allied Disciplines, 33(3), 623-629.
Baron-Cohen, S. (1994). How to build a baby that can read minds: Cognitive mechanisms in mindreading. Cahiers de Psychologie Cognitive, 13(5), 513-552.
Baron-Cohen, S. (1995). Mindblindness: An essay on autism and theory of mind. Cambridge, MA: MIT Press.
Baron-Cohen, S. (1998). Does the study of autism justify minimalist innate modularity? Learning & Individual Differences, 10(3), 179-191.
Baron-Cohen, S. (2000). Theory of mind and autism: A fifteen year review. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 3-20). London: Oxford University Press.
Baron-Cohen, S., Allen, J., & Gillberg, C. (1992). Can autism be detected at 18 months? The needle, the haystack, and the CHAT. British Journal of Psychiatry, 161, 839-843.
Baron-Cohen, S., Campbell, R., Karmiloff-Smith, A., Grant, J., & Walker, J. (1995). Are children with autism blind to the mentalistic significance of the eyes? British Journal of Developmental Psychology, 13, 379-398.
Baron-Cohen, S., & Goodhart, F. (1994). The "seeing-leads-to-knowing" deficit in autism: The Pratt and Bryant probe. British Journal of Developmental Psychology, 12(3), 397-401.
Baron-Cohen, S., & Hammer, J. (1997). Parents of children with Asperger syndrome: What is the cognitive phenotype?
Journal of Cognitive Neuroscience, 9(4), 548-554.
Baron-Cohen, S., Jolliffe, T., Mortimore, C., & Robertson, M. (1997). Another advanced test of theory of mind: Evidence from very high functioning adults with autism or Asperger Syndrome. Journal of Child Psychology & Psychiatry & Allied Disciplines, 38(7), 813-822.
Baron-Cohen, S., Leslie, A. M., & Frith, U. (1985). Does the autistic child have a "theory of mind"? Cognition, 21(1), 37-46.
Baron-Cohen, S., Leslie, A. M., & Frith, U. (1986). Mechanical, behavioural and Intentional understanding of picture stories in autistic children. British Journal of Developmental Psychology, 4(2), 113-125.
Baron-Cohen, S., & Ring, H. (1994). A model of the mindreading system: Neuropsychological and neurobiological perspectives. In C. Lewis & P. Mitchell (Eds.), Children's early understanding of mind: Origins and development (pp. 183-207). Hove, UK: Lawrence Erlbaum Associates.
Baron-Cohen, S., Ring, H., Moriarty, J., Schmitz, B., Costa, D., & Ell, P. (1994). Recognition of mental state terms: Clinical findings in children with autism and a functional neuroimaging study of normal adults. British Journal of Psychiatry, 165(5), 640-649.
Baron-Cohen, S., Ring, H. A., Wheelwright, S., Bullmore, E. T., Brammer, M. J., Simmons, A., & Williams, S. C. R. (1999a). Social intelligence in the normal and autistic brain: An fMRI study. European Journal of Neuroscience, 11, 1891-1898.
Baron-Cohen, S., & Robertson, M. M. (1995). Children with either autism, Gilles de la Tourette syndrome or both: Mapping cognition to specific syndromes. Neurocase: Case Studies in Neuropsychology, Neuropsychiatry, & Behavioural Neurology, 1(2), 101-104.
Baron-Cohen, S., & Swettenham, J. (1997). Theory of mind in autism: Its relationship to executive function and central coherence. In D. J. Cohen & F. R. Volkmar (Eds.), Handbook of autism and pervasive developmental disorders (2nd ed., pp. 880-893). New York: John Wiley & Sons.
Baron-Cohen, S., Wheelwright, S., Hill, J., Raste, Y., & Plumb, I. (2001a). The "Reading the mind in the eyes" Test revised version: A study with normal adults, and adults with Asperger syndrome or high-functioning autism. Journal of Child Psychology & Psychiatry & Allied Disciplines, 42(2), 241-251.
Baron-Cohen, S., Wheelwright, S., Skinner, R., Martin, J., & Clubley, E. (2001b). The Autism-Spectrum Quotient (AQ): Evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. Journal of Autism & Developmental Disorders, 31(1), 5-17.
Baron-Cohen, S., Wheelwright, S., Stone, V., & Rutherford, M. (1999b). A mathematician, a physicist and a computer scientist with Asperger syndrome: Performance on folk psychology and folk physics tests. Neurocase, 5(6), 475-483.
Bartsch, K., & Wellman, H. (1989). Young children's attribution of action to beliefs and desires. Child Development, 60(4), 946-964.
Bartsch, K., & Wellman, H. M. (1995). Children talk about the mind. New York: Oxford University Press.
Bauman, M. L. (1999). Autism: Clinical features and neurobiological observations. In H. Tager-Flusberg (Ed.), Neurodevelopmental disorders: Developmental cognitive neuroscience (pp. 383-399). Cambridge, MA: The MIT Press.
Bauman, M. L., & Kemper, T. L. (1994). Neuroanatomic observations of the brain in autism. In M. L. Bauman & T. L. Kemper (Eds.), The neurobiology of autism (pp. 119-145). Baltimore, MD: Johns Hopkins University Press.
Bauminger, N., & Kasari, C. (1999). Brief report: Theory of mind in high-functioning children with autism. Journal of Autism & Developmental Disorders, 29(1), 81-86.
Bauminger, N., & Yirmiya, N. (2001). The functioning and well-being of siblings of children with autism: Behavioral-genetic and familial contributions. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 61-80). Mahwah, NJ: Lawrence Erlbaum Associates.
Beglinger, L.
J., & Smith, T. H. (2001). A review of subtyping in autism and proposed dimensional classification model. Journal of Autism & Developmental Disorders, 31(4), 411-422.
Bell, M. A., & Fox, N. A. (1992). The relations between frontal brain electrical activity and cognitive development during infancy. Child Development, 63(5), 1142-1163.
Bennetto, L., Pennington, B. F., & Rogers, S. J. (1996). Intact and impaired memory functions in autism. Child Development, 67, 1816-1835.
Benson, G., Abbeduto, L., Short, K., Bibler-Nuccio, J., & Maas, F. (1993). Development of a theory of mind in individuals with mental retardation. American Journal on Mental Retardation, 98(3), 427-433.
Berger, H. J., Aerts, F. H., van Spaendonck, K. P., Cools, A. R., & Teunisse, J.-P. (2003). Central coherence and cognitive shifting in relation to social improvement in high-functioning young adults with autism. Journal of Clinical & Experimental Neuropsychology, 25(4), 502-511.
Berger, H. J., Van Spaendonck, K. P., Horstink, M. W., Buytenhuijs, E. L., Lammers, P. W. J. M., & Cools, A. R. (1993). Cognitive shifting as a predictor of progress in social understanding in high-functioning adolescents with autism: A prospective study. Journal of Autism & Developmental Disorders, 23(2), 341-359.
Bertrand, J., Mars, A., Boyle, C., Bove, F., Yeargin-Allsopp, M., & Decoufle, P. (2001). Prevalence of autism in a United States population: The Brick Township, New Jersey, investigation. Pediatrics, 108(5), 1155-1161.
Berument, S. K., Rutter, M., Lord, C., Pickles, A., & Bailey, A. (1999). Autism screening questionnaire: Diagnostic validity. British Journal of Psychiatry, 175, 444-451.
Bettelheim, B. (1967). The empty fortress: Infantile autism and the birth of the self. New York: Free Press.
Beveridge, M., Jarrold, C., & Pettit, E. (2002). An experimental approach to executive fingerprinting in young children. Infant & Child Development, 11(2), 107-123.
Biro, S., & Russell, J. (2001).
The execution of arbitrary procedures by children with autism. Development & Psychopathology, 13(1), 97-110.
Bishop, D. V. (1993). Annotation: Autism, executive functions and theory of mind: A neuropsychological perspective. Journal of Child Psychology & Psychiatry & Allied Disciplines, 34(3), 279-293.
Bishop, D. V. M. (1997). Cognitive neuropsychology and developmental disorders: Uncomfortable bedfellows. Quarterly Journal of Experimental Psychology A, 50A(4), 899-923.
Bishop, D. V. M. (2000). What's so special about Asperger syndrome? The need for further exploration of the borderlands of autism. In A. Klin & F. R. Volkmar (Eds.), Asperger syndrome (pp. 254-277). New York: Guilford Press.
Bishop, D. V. M., Maybery, M., Maley, A., Wong, D., Hill, W., & Hallmayer, J. (in press-a). Using self-report to identify the broad phenotype in parents of children with autistic spectrum disorders: A study using the Autism-Spectrum Quotient. Journal of Child Psychology & Psychiatry.
Bishop, D. V. M., Maybery, M., Wong, D., Maley, A., Hill, W., & Hallmayer, J. (in press-b). Are phonological processing deficits part of the broad autism phenotype? American Journal of Medical Genetics (Neuropsychiatric Genetics).
Bishop, D. V., & Norbury, C. F. (2002). Exploring the borderlands of autistic disorder and specific language impairment: A study using standardised diagnostic instruments. Journal of Child Psychology & Psychiatry & Allied Disciplines, 43(7), 917-929.
Bjorklund, D. F., & Harnishfeger, K. K. (1990). The resources construct in cognitive development: Diverse sources of evidence and a theory of inefficient inhibition. Developmental Review, 10(1), 48-71.
Blair, J., Sellars, C., Strickland, I., Clark, F., Williams, A., Smith, M., & Jones, L. (1996). Theory of mind in the psychopath. Journal of Forensic Psychiatry, 7(1), 15-25.
Bolton, P., Macdonald, H., Pickles, A., Rios, P., Goode, S., Crowson, M., Bailey, A., & Rutter, M. (1994).
A case-control family history study of autism. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35(5), 877-900.
Bolton, P., Pickles, A., Murphy, M., & Rutter, M. (1998). Autism, affective and other psychiatric disorders: Patterns of familial aggregation. Psychological Medicine, 28(2), 385-395.
Boone, K. B., Ponton, M. O., Gorsuch, R. L., Gonzalez, J. J., & Miller, B. L. (1998). Factor analysis of four measures of prefrontal lobe functioning. Archives of Clinical Neuropsychology, 13(7), 585-595.
Boucher, J. (1988). Word fluency in high-functioning autistic children. Journal of Autism & Developmental Disorders, 18(4), 637-645.
Boucher, J. (1996). What could possibly explain autism? In P. Carruthers & P. K. Smith (Eds.), Theories of theories of mind (pp. 223-241). Cambridge: Cambridge University Press.
Boutin, P., Maziade, M., Merette, C., Mondor, M., Bedard, C., & Thivierge, J. (1997). Family history of cognitive disabilities in first-degree relatives of autistic and mentally retarded children. Journal of Autism & Developmental Disorders, 27(2), 165-176.
Bowler, D. M. (1992). "Theory of mind" in Asperger's syndrome. Journal of Child Psychology & Psychiatry & Allied Disciplines, 33(5), 877-893.
Bowler, D. M. (2001). Autism: Specific cognitive deficit or emergent end point of multiple interacting systems? In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 219-235). Mahwah, NJ: Lawrence Erlbaum Associates.
Bowler, D. M., & Briskman, J. A. (2000). Photographic cues do not always facilitate performance on false belief tasks in children with autism. Journal of Autism & Developmental Disorders, 30(4), 305-316.
Brian, J. A., Tipper, S., Weaver, B., & Bryson, S. (2003). Inhibitory mechanisms in autism spectrum disorders: Typical selective inhibition of location versus facilitated perceptual processing. Journal of Child Psychology & Psychiatry & Allied Disciplines, 44(4), 552-560.
Briskman, J., Happé, F., & Frith, U. (2001). Exploring the cognitive phenotype of autism: Weak "central coherence" in parents and siblings of children with autism: II. Real-life skills and preferences. Journal of Child Psychology & Psychiatry & Allied Disciplines, 42(3), 309-316.
Brothers, L. (1996). Brain mechanisms of social cognition. Journal of Psychopharmacology, 10(1), 2-8.
Brown, R., Hobson, R., Lee, A., & Stevenson, J. (1997). Are there "autistic-like" features in congenitally blind children? Journal of Child Psychology & Psychiatry & Allied Disciplines, 38(6), 693-703.
Bruner, J., & Feldman, C. (1993). Theories of mind and the problem of autism. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from autism. Oxford: Oxford University Press.
Brunet, E., Sarfati, Y., Hardy-Baylé, M. C., & Decety, J. (2000). A PET investigation of the attribution of intentions with a non-verbal task. NeuroImage, 11, 157-166.
Bryson, S. E., Landry, R., & Wainwright, J. A. (1997). A componential view of executive dysfunction in autism: Review of recent evidence. In J. A. Burack & J. T. Enns (Eds.), Attention, development, and psychopathology (pp. 232-255). New York: The Guilford Press.
Buitelaar, J. K., Swaab, H., van der Wees, M., Wildschut, M., & van der Gaag, R. J. (1996). Neuropsychological impairments and deficits in theory of mind and emotion recognition in a non-autistic boy. European Child & Adolescent Psychiatry, 5(1), 44-51.
Burack, J. A. (1994). Selective attention deficits in persons with autism: Preliminary evidence of an inefficient attentional lens. Journal of Abnormal Psychology, 103(3), 535-543.
Burack, J. A., Charman, T., Yirmiya, N., & Zelazo, P. R. (Eds.). (2001). The development of autism: Perspectives from theory and research. Mahwah, NJ: Lawrence Erlbaum Associates.
Burgess, P. W. (1997). Theory and methodology in executive function research. In P.
Rabbitt (Ed.), Methodology of frontal and executive function (pp. 81-116). Hove, UK: Psychology Press.
Burgess, P. W. (2000). Strategy application disorder: The role of the frontal lobes in human multitasking. Psychological Research, 63(3-4), 279-288.
Burgess, P. W., Alderman, N., Evans, J., Emslie, H., & Wilson, B. A. (1998). The ecological validity of tests of executive function. Journal of the International Neuropsychological Society, 4(6), 547-558.
Cabeza, R., & Nyberg, L. (2000). Imaging cognition II: An empirical review of 275 PET and fMRI studies. Journal of Cognitive Neuroscience, 12(1), 1-47.
Cantwell, D. P., Baker, L., & Rutter, M. (1979). Families of autistic and dysphasic children: I. Family life and interaction patterns. Archives of General Psychiatry, 36(6), 682-687.
Capps, L., Kehres, J., & Sigman, M. (1998). Conversational abilities among children with autism and children with developmental delays. Autism, 2(4), 325-344.
Carlin, D., Bonerba, J., Phipps, M., Alexander, G., Shapiro, M., & Grafman, J. (2000). Planning impairments in frontal lobe dementia and frontal lobe lesion patients. Neuropsychologia, 38(5), 655-665.
Carlson, S. M., & Moses, L. J. (2001). Individual differences in inhibitory control and children's theory of mind. Child Development, 72(4), 1032-1053.
Carlson, S. M., Moses, L. J., & Breton, C. (2002). How specific is the relation between executive function and theory of mind? Contributions of inhibitory control and working memory. Infant & Child Development, 11(2), 73-92.
Carlson, S. M., Moses, L. J., & Hix, H. R. (1998). The role of inhibitory processes in young children's difficulties with deception and false belief. Child Development, 69(3), 672-691.
Carpenter, M., Pennington, B. F., & Rogers, S. J. (2001). Understanding of others' intentions in children with autism. Journal of Autism & Developmental Disorders, 31(6), 589-599.
Carruthers, P. (1996). Autism as mind-blindness: An elaboration and partial defence. In P.
Carruthers & P. K. Smith (Eds.), Theories of theories of mind (pp. 257-273). Cambridge: Cambridge University Press.
Carruthers, P., & Smith, P. K. (Eds.). (1996). Theories of theories of mind. Cambridge: Cambridge University Press.
Casanova, M. F., Buxhoeveden, D. P., Switala, A. E., & Roy, E. (2002). Minicolumnar pathology in autism. Neurology, 58(3), 428-432.
Case, R. (1985). Intellectual development: From birth to adulthood. New York: Academic Press.
Castelli, F., Frith, C., Happé, F., & Frith, U. (2002). Autism, Asperger syndrome and brain mechanisms for the attribution of mental states to animated shapes. Brain, 125, 1839-1849.
Chakrabarti, S., & Fombonne, E. (2001). Pervasive developmental disorders in preschool children. JAMA: Journal of the American Medical Association, 285(24), 3093-3099.
Chandler, M. J., Fritz, A. S., & Hala, S. M. (1989). Small scale deceit: Deception as a marker of 2-, 3- and 4-year-olds' early theories of mind. Child Development, 60, 1263-1277.
Chandler, M., & Hala, S. (1994). The role of personal involvement in the assessment of early false belief skills. In C. Lewis & P. Mitchell (Eds.), Children's early understanding of mind: Origins and development (pp. 403-425). Hillsdale, NJ: Lawrence Erlbaum.
Channon, S., & Crawford, S. (2000). The effects of anterior lesions on performance on a story comprehension test: Left anterior impairment on a theory of mind-type task. Neuropsychologia, 38(7), 1006-1017.
Channon, S., Flynn, D., & Robertson, M. M. (1992). Attentional deficits in Gilles de la Tourette syndrome. Neuropsychiatry, Neuropsychology, & Behavioral Neurology, 5(3), 170-177.
Charman, T. (2000). Theory of mind and the early diagnosis of autism. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 422-441). London: Oxford University Press.
Charman, T., & Baird, G. (2002).
Practitioner review: Diagnosis of autism spectrum disorder in 2- and 3-year-old children. Journal of Child Psychology & Psychiatry & Allied Disciplines, 43(3), 289-305.
Charman, T., & Baron-Cohen, S. (1992). Understanding drawings and beliefs: A further test of the metarepresentation theory of autism: A research note. Journal of Child Psychology & Psychiatry & Allied Disciplines, 33(6), 1105-1112.
Charman, T., & Baron-Cohen, S. (1995). Understanding photos, models, and beliefs: A test of the modularity thesis of theory of mind. Cognitive Development, 10(2), 287-298.
Charman, T., & Campbell, A. (1997). Reliability of theory of mind task performance by individuals with a learning disability: A research note. Journal of Child Psychology & Psychiatry & Allied Disciplines, 38(6), 725-730.
Charman, T., Carroll, F., & Sturge, C. (2001). Theory of mind, executive function and social competence in boys with ADHD. Emotional & Behavioural Difficulties, 6(1), 31-49.
Charman, T., & Lynggaard, H. (1998). Does a photographic cue facilitate false belief performance in subjects with autism? Journal of Autism & Developmental Disorders, 28(1), 33-42.
Chelune, G. J., & Baer, R. A. (1986). Developmental norms for the Wisconsin Card Sorting Test. Journal of Clinical & Experimental Neuropsychology, 8(3), 219-228.
Christensen, K. J., Kim, S. W., Dysken, M. W., & Hoover, K. M. (1992). Neuropsychological performance in obsessive-compulsive disorder. Biological Psychiatry, 31(1), 4-18.
Cicerone, K. D., & Tanenbaum, L. N. (1997). Disturbance of social cognition after traumatic orbitofrontal brain injury. Archives of Clinical Neuropsychology, 12(2), 173-188.
Ciesielski, K. T., & Harris, R. J. (1997). Factors related to performance failure on executive tasks in autism. Child Neuropsychology, 3(1), 1-12.
Clark, P., & Rutter, M. (1981). Autistic children's responses to structure and to interpersonal demands. Journal of Autism & Developmental Disorders, 11(2), 201-217.
Cohen, J. (1988).
Statistical power analysis for the behavioural sciences. (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences. (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Collette, F., & van der Linden, M. (2002). Brain imaging of the central executive component of working memory. Neuroscience & Biobehavioral Reviews, 26(2), 105-125.
Collette, F., van der Linden, M., & Salmon, E. (1999). Executive dysfunction in Alzheimer's disease. Cortex, 35(1), 57-72.
Colvert, E., Custance, D., & Swettenham, J. (2002). Rule-based reasoning and theory of mind in autism: A commentary on the work of Zelazo, Jacques, Burack and Frye. Infant & Child Development, 11(2), 197-200.
Corcoran, R. (2000). Theory of mind in other clinical conditions: Is a selective 'theory of mind' deficit exclusive to autism? In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 391-421). London: Oxford University Press.
Corcoran, R., Mercer, G., & Frith, C. D. (1995). Schizophrenia, symptomatology and social influence: Investigating "theory of mind" in people with schizophrenia. Schizophrenia Research, 17(1), 5-13.
Courchesne, E. (1997). Brainstem, cerebellar and limbic neuroanatomical abnormalities in autism. Current Opinion in Neurobiology, 7(2), 269-278.
Courchesne, E., Townsend, J., Akshoomoff, N. A., Saitoh, O., Yeung-Courchesne, R., Lincoln, A. J., James, H. E., Haas, R. H., Schreibman, L., & Lau, L. (1994). Impairment in shifting attention in autistic and cerebellar patients. Behavioral Neuroscience, 108(5), 848-865.
Courchesne, E., Townsend, J., & Chase, C. (1995). Neurodevelopmental principles guide research on developmental psychopathologies. In D. Cicchetti & D. J. Cohen (Eds.), Developmental psychopathology (Vol. 1: Theory and methods, pp. 195-226). Oxford: John Wiley & Sons.
Cox, C.
S., Fedio, P., & Rapoport, J. L. (1989). Neuropsychological testing of obsessive-compulsive adolescents. In J. L. Rapoport (Ed.), Obsessive-compulsive disorder in children and adolescents (pp. 73-85). Washington, DC: American Psychiatric Press.
Craig, J., & Baron-Cohen, S. (1999). Creativity and imagination in autism and Asperger syndrome. Journal of Autism & Developmental Disorders, 29(4), 319-326.
Cripe, L. I. (1996). The ecological validity of executive function testing. In R. J. Sbordone & C. J. Long (Eds.), Ecological validity of neuropsychological testing (pp. 171-202). Delray Beach, FL: GR Press/St Lucie Press.
Culbertson, W. C., & Zillmer, E. A. (1998a). The construct validity of the Tower of LondonDX as a measure of the executive functioning of ADHD children. Assessment, 5(3), 215-226.
Culbertson, W. C., & Zillmer, E. A. (1998b). The Tower of LondonDX: A standardized approach to assessing executive functioning in children. Archives of Clinical Neuropsychology, 13(3), 285-301.
Dadds, M. R., Schwartz, S., Adams, T., & Rose, S. (1988). The effects of social context and verbal skill on the stereotypic and task-involved behaviour of autistic children. Journal of Child Psychology & Psychiatry & Allied Disciplines, 29(5), 669-676.
Dagher, A., Owen, A. M., Boecker, H., & Brooks, D. J. (1999). Mapping the network for planning: A correlational PET activation study with the Tower of London task. Brain, 122(10), 1973-1987.
Dahlgren, S., Dahlgren Sandberg, A., & Hjelmquist, E. (2003). The non-specificity of theory of mind deficits: Evidence from children with communicative disabilities. European Journal of Cognitive Psychology, 15(1), 129-155.
Dahlgren, S. O., & Trillingsgaard, A. (1996). Theory of mind in non-retarded children with autism and Asperger's syndrome: A research note. Journal of Child Psychology & Psychiatry & Allied Disciplines, 37(6), 759-763.
Damasio, A. R., & Maurer, R. G. (1978). A neurological model for childhood autism.
Archives of Neurology, 35, 777-786.
Davis, H. L., & Pratt, C. (1995). The development of children's theory of mind: The working memory explanation. Australian Journal of Psychology, 47(1), 25-31.
Dawson, G. (1991). A psychobiological perspective on the early socio-emotional development of children with autism. In D. Cicchetti & S. L. Toth (Eds.), Rochester symposium on developmental psychopathology (Vol. 3: Models and integrations, pp. 207-234). Rochester, NY: University of Rochester Press.
Dawson, G., & Adams, A. (1984). Imitation and social responsiveness in autistic children. Journal of Abnormal Child Psychology, 12(2), 209-225.
Dawson, G., & Fernald, M. (1987). Perspective-taking ability and its relationship to the social behavior of autistic children. Journal of Autism and Developmental Disorders, 17, 487-498.
Dawson, G., Klinger, L. G., Panagiotides, H., Lewy, A., & Vastelloe, P. (1995). Subgroups of autistic children based on social behavior display distinct patterns of brain activity. Journal of Abnormal Child Psychology, 23(5), 569-583.
Dawson, G., & Lewy, A. (1989). Arousal, attention, and the socioemotional impairments of individuals with autism. In G. Dawson (Ed.), Autism: Nature, diagnosis, and treatment (pp. 49-74). New York: The Guilford Press.
Dawson, G., Meltzoff, A. N., Osterling, J., & Rinaldi, J. (1998). Neuropsychological correlates of early symptoms of autism. Child Development, 69(5), 1276-1285.
Dawson, G., Munson, J., Estes, A., Osterling, J., McPartland, J., Toth, K., Carver, L., & Abbott, R. (2002a). Neurocognitive function and joint attention ability in young children with autism spectrum disorder versus developmental delay. Child Development, 73(2), 345-358.
Dawson, G., Osterling, J., Rinaldi, J., Carver, L., & McPartland, J. (2001). Brief report: Recognition memory and stimulus-reward associations: Indirect support for the role of ventromedial prefrontal dysfunction in autism.
Journal of Autism & Developmental Disorders, 31(3), 337-341.
Dawson, G., Webb, S., Schellenberg, G. D., Dager, S., Friedman, S., Aylward, E., & Richards, T. (2002b). Defining the broader phenotype of autism: Genetic, brain, and behavioral perspectives. Development & Psychopathology, 14(3), 581-611.
de Villiers, J. (2000). Language and theory of mind: What are the developmental relationships? In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 83-123). London: Oxford University Press.
de Villiers, J. G., & de Villiers, P. A. (2000). Linguistic determinism and the understanding of false beliefs. In P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the mind (pp. 191-228). Hove, UK: Psychology Press.
Deb, S., & Thompson, B. (1998). Neuroimaging in autism. British Journal of Psychiatry, 173, 299-302.
Delis, D. C., Squire, L. R., Bihrle, A., & Massman, P. J. (1992). Componential analysis of problem-solving ability: Performance of patients with frontal lobe damage and amnesic patients on a new sorting test. Neuropsychologia, 30(8), 683-697.
DeLong, G., & Dwyer, J. T. (1988). Correlation of family history with specific autistic subgroups: Asperger's syndrome and bipolar affective disease. Journal of Autism & Developmental Disorders, 18(4), 593-600.
Dempster, F. N. (1992). The rise and fall of the inhibitory mechanism: Toward a unified theory of cognitive development and aging. Developmental Review, 12(1), 45-75.
Dempster, F. (1993). Resistance to interference: Developmental changes in a basic processing mechanism. In M. L. Howe & R. Pasnak (Eds.), Emerging themes in cognitive development (Vol. 1: Foundations, pp. 3-27). New York: Springer-Verlag.
Dennett, D. C. (1978). Beliefs about beliefs. The Behavioral and Brain Sciences, 4, 568-570.
Dewey, M. (1991). Living with Asperger's syndrome. In U. Frith (Ed.), Autism and Asperger syndrome (pp. 184-206).
Cambridge: Cambridge University Press.
Diamond, A. (1985). Development of the ability to use recall to guide action, as indicated by infants' performance on AB. Child Development, 56(4), 868-883.
Diamond, A. (2002). Normal development of prefrontal cortex from birth to young adulthood: Cognitive functions, anatomy, and biochemistry. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 466-503). London: Oxford University Press.
Diamond, A., & Goldman-Rakic, P. S. (1989). Comparison of human infants and rhesus monkeys on Piaget's AB task: Evidence for dependence on dorsolateral prefrontal cortex. Experimental Brain Research, 74, 24-40.
Diamond, A., Prevor, M. B., Callender, G., & Druin, D. P. (1997). Prefrontal cortex cognitive deficits in children treated early and continuously for PKU. Monographs of the Society for Research in Child Development, 62(4), 1-205.
Diamond, A., & Taylor, C. (1996). Development of an aspect of executive control: Development of the abilities to remember what I said and to "Do as I say, not as I do". Developmental Psychobiology, 29(4), 315-334.
Donnellan, A. M., Anderson, J. L., & Mesaros, R. A. (1984). An observational study of stereotypic behavior and proximity related to the occurrence of autistic child-family member interactions. Journal of Autism & Developmental Disorders, 14(2), 205-210.
Dorris, L., Espie, C. A. E., Knott, F., & Salt, J. (2004). Mind-reading difficulties in the siblings of people with Asperger's syndrome: Evidence for a genetic influence in the abnormal development of a specific cognitive domain. Journal of Child Psychology & Psychiatry, 45(2), 412-418.
Downes, J. J., Roberts, A. C., Sahakian, B. J., Evenden, J. L., Morris, R. G., & Robbins, T. W. (1989). Impaired extra-dimensional shift performance in medicated and unmedicated Parkinson’s disease: Evidence for a specific attentional dysfunction. Neuropsychologia, 27, 1329-1343.
Drewe, E. (1974).
The effect of type and area of brain lesion on Wisconsin Card Sorting Test performance. Cortex, 10(2), 159-170.
Drewe, E. A. (1975). An experimental investigation of Luria's theory on the effects of frontal lobe lesions in man. Neuropsychologia, 13, 421-429.
Duncan, J., Burgess, P., & Emslie, H. (1995). Fluid intelligence after frontal lobe lesions. Neuropsychologia, 33(3), 261-268.
Duncan, J., Emslie, H., Williams, P., Johnson, R., & Freer, C. (1996). Intelligence and the frontal lobe: The organization of goal-directed behavior. Cognitive Psychology, 30(3), 257-303.
Eisenberg, L., & Kanner, L. (1956). Early infantile autism, 1943-55. American Journal of Orthopsychiatry, 26, 556-566.
Eisenmajer, R., & Prior, M. (1991). Cognitive linguistic correlates of "theory of mind" ability in autistic children. British Journal of Developmental Psychology, 9(2), 351-364.
Elliott, R., McKenna, P., Robbins, T., & Sahakian, B. (1995). Neuropsychological evidence for frontostriatal dysfunction in schizophrenia. Psychological Medicine, 25(3), 619-630.
Eslinger, P. J. (1996). Conceptualizing, describing, and measuring components of executive function: A summary. In G. R. Lyon & N. A. Krasnegor (Eds.), Attention, memory, and executive function (pp. 367-395). Baltimore, MD: Paul H. Brookes.
Eslinger, P. J. (1998). Neurological and neuropsychological bases of empathy. European Neurology, 39(4), 193-199.
Eslinger, P. J., Biddle, K. R., & Grattan, L. M. (1997). Cognitive and social development in children with prefrontal cortex lesions. In N. A. Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic (Eds.), Development of the prefrontal cortex: Evolution, neurobiology, and behavior (pp. 295-335). Baltimore, MD: Paul H. Brookes.
Eslinger, P. J., & Damasio, A. R. (1985). Severe disturbance of higher cognition after bilateral frontal lobe ablation: Patient EVR. Neurology, 35(12), 1731-1741.
Eslinger, P. J., Grattan, L. M., Damasio, H., & Damasio, A. R. (1992).
Developmental consequences of childhood frontal lobe damage. Archives of Neurology, 49, 764-769.
Espy, K. A., Kaufmann, P. M., Glisky, M. L., & McDiarmid, M. (2001). New procedures to assess executive functions in preschool children. Clinical Neuropsychologist, 15(1), 46-58.
Espy, K. A., Kaufmann, P. M., McDiarmid, M. D., & Glisky, M. L. (1999). Executive functioning in preschool children: Performance on A-not-B and other delayed response format tasks. Brain & Cognition, 41(2), 178-199.
Fein, D., Stevens, M., Dunn, M., Waterhouse, L., Allen, D., Rapin, I., & Feinstein, C. (1999). Subtypes of pervasive developmental disorder: Clinical characteristics. Child Neuropsychology, 5(1), 1-23.
Fine, C., Lumsden, J., & Blair, R. (2001). Dissociation between "theory of mind" and executive functions in a patient with early left amygdala damage. Brain, 124(2), 287-298.
Flavell, J. H., Flavell, E. R., & Green, F. L. (1983). Development of the appearance-reality distinction. Cognitive Psychology, 15(1), 95-120.
Fletcher, P. C., Happé, F., Frith, U., Baker, S. C., Dolan, R. J., Frackowiak, R. S. J., & Frith, C. D. (1995). Other minds in the brain: A functional imaging study of "theory of mind" in story comprehension. Cognition, 57, 109-128.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Folstein, S. E., Bisson, E., Santangelo, S. L., & Piven, J. (1998). Finding specific genes that cause autism: A combination of approaches will be needed to maximize power. Journal of Autism & Developmental Disorders, 28(5), 439-445.
Folstein, S., & Rutter, M. (1977). Infantile autism: A genetic study of 21 twin pairs. Journal of Child Psychology & Psychiatry & Allied Disciplines, 18(4), 297-321.
Folstein, S. E., Santangelo, S. L., Gilman, S. E., Piven, J., Landa, R., Lainhart, J., Hein, J., & Wzorek, M. (1999). Predictors of cognitive test patterns in autism families. Journal of Child Psychology & Psychiatry & Allied Disciplines, 40(7), 1117-1128.
Fombonne, E. (2003).
Epidemiological surveys of autism and other pervasive developmental disorders: An update. Journal of Autism & Developmental Disorders, 33(4), 365-382.
Fombonne, E., Bolton, P., Prior, J., Jordan, H., & Rutter, M. (1997). A family study of autism: Cognitive patterns and levels in parents and siblings. Journal of Child Psychology & Psychiatry & Allied Disciplines, 38(6), 667-683.
Fonagy, P., Steele, M., Steele, H., Leigh, T., Kennedy, R., Mattoon, G., & Target, M. (1995). Attachment, the reflective self, and borderline states: The predictive specificity of the Adult Attachment Interview and pathological emotional development. In S. Goldberg, R. Muir, & J. Kerr (Eds.), Attachment theory: Social, developmental, and clinical perspectives (pp. 233-278). New York: Analytic Press.
Freeman, B., Ritvo, E. R., Mason-Brothers, A., Pingree, C., Yokota, A., Jenson, W., McMahon, W., Peterson, B., Mo, A., & Schroth, P. (1989). Psychometric assessment of first-degree relatives of 62 autistic probands in Utah. American Journal of Psychiatry, 146(3), 361-364.
Freeman, N. H., & Lacohée, H. (1995). Making explicit 3-year-olds' implicit competence with their own false beliefs. Cognition, 56(1), 31-60.
Freeman, N. H., Lewis, C., & Doherty, M. (1991). Preschoolers' grasp of a desire for knowledge in false-belief prediction: Practical intelligence and verbal report. British Journal of Developmental Psychology, 9(1), 139-157.
Frith, C. D. (1992). The cognitive neuropsychology of schizophrenia. Hillsdale, NJ: Lawrence Erlbaum.
Frith, C., & Frith, U. (2000). The physiological basis of theory of mind: Functional neuroimaging studies. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 334-356). London: Oxford University Press.
Frith, U. (1972). Cognitive mechanisms in autism: Experiments with color and tone sequence production.
Journal of Autism and Childhood Schizophrenia, 2(2), 160-173.
Frith, U., & Happé, F. (1994). Autism: Beyond "theory of mind". Cognition, 50(1-3), 115-132.
Frith, U., Happé, F., & Siddons, F. (1994). Autism and theory of mind in everyday life. Social Development, 3(2), 108-124.
Frith, U., Morton, J., & Leslie, A. M. (1991). The cognitive basis of a biological disorder: Autism. Trends in Neurosciences, 14, 433-438.
Frye, D. (1999). Development of intention: The relation of executive function to theory of mind. In P. D. Zelazo, J. W. Astington, & D. Olson (Eds.), Developing theories of intention: Social understanding and self-control (pp. 119-132). Mahwah, NJ: Lawrence Erlbaum Associates.
Frye, D. (2000). Theory of mind, domain specificity, and reasoning. In P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the mind (pp. 149-167). Hove, UK: Psychology Press.
Frye, D., & Zelazo, P. D. (1998). Complexity: From formal analysis to final action. Behavioural and Brain Sciences, 21(6), 836-837.
Frye, D., Zelazo, P. D., Brooks, P. J., & Samuels, M. C. (1996). Inference and action in early causal reasoning. Developmental Psychology, 32(1), 120-131.
Frye, D., Zelazo, P. D., & Burack, J. A. (1998). Cognitive complexity and control: I. Theory of mind in typical and atypical development. Current Directions in Psychological Science, 7(4), 116-121.
Frye, D., Zelazo, P. D., & Palfai, T. (1995). Theory of mind and rule-based reasoning. Cognitive Development, 10(4), 483-527.
Fuster, J. M. (2000). Prefrontal neurons in networks of executive memory. Brain Research Bulletin, 52(5), 331-336.
Gallagher, H. L., & Frith, C. D. (2003). Functional imaging of 'theory of mind'. Trends in Cognitive Sciences, 7(2), 77-83.
Gallagher, H., Happé, F., Brunswick, N., Fletcher, P., Frith, U., & Frith, C. (2000). Reading the mind in cartoons and stories: an fMRI study of 'theory of the mind' in verbal and nonverbal tasks. Neuropsychologia, 38(1), 11-21.
Garcia-Villamisar, D., & Della Sala, S.
(2002). Dual-task performance in adults with autism. Cognitive Neuropsychiatry, 7(1), 63-74.
Garner, C., Callias, M., & Turk, J. (1999). Executive function and theory of mind performance of boys with fragile-X syndrome. Journal of Intellectual Disability Research, 43(6), 466-474.
Garretson, H. B., Fein, D., & Waterhouse, L. (1990). Sustained attention in children with autism. Journal of Autism & Developmental Disorders, 20(1), 101-114.
George, M. S., Costa, D. C., Kouris, K., Ring, H. A., & Ell, P. J. (1992). Cerebral blood flow abnormalities in adults with infantile autism. Journal of Nervous & Mental Disease, 180(7), 413-417.
German, T. P., & Leslie, A. M. (2000). Attending to and learning about mental states. In P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the mind (pp. 229-252). Hove, UK: Psychology Press.
Gerstadt, C. L., Hong, Y. J., & Diamond, A. (1994). The relationship between cognition and action: Performance of children 3½-7 years old on a Stroop-like day-night test. Cognition, 53(2), 129-153.
Gillberg, C., & Coleman, M. (1992). The biology of the autistic syndromes. (2nd ed.). London: Mac Keith Press.
Gilotty, L., Kenworthy, L., Sirian, L., Black, D. O., & Wagner, A. E. (2002). Adaptive skills and executive function in autism spectrum disorders. Child Neuropsychology, 8(4), 241-248.
Godefroy, O., Cabaret, M., Petit-Chenal, V., Pruvo, J.-P., & Rousseaux, M. (1999). Control functions of the frontal lobes: Modularity of the central-supervisory system? Cortex, 35(1), 1-20.
Goel, V., Grafman, J., Sadato, N., & Hallett, M. (1995). Modeling other minds. Neuroreport, 6(13), 1741-1746.
Goldberg, M., Lasker, A., Zee, D., Garth, E., Tien, A., & Landa, R. (2002). Deficits in the initiation of eye movements in the absence of a visual target in adolescents with high functioning autism. Neuropsychologia, 40(12), 2039-2049.
Golden, C. J. (1981). The Luria-Nebraska Children's Battery: Theory and formulation. In G. W. Hynd & J. E.
Obrzut (Eds.), Neuropsychological assessment of the school-aged child (pp. 277-302). New York: Grune & Stratton.
Goldman-Rakic, P. S. (1995). Architecture of the prefrontal cortex and the central executive. In J. Grafman, K. J. Holyoak, & F. Boller (Eds.), Structure and functions of the human prefrontal cortex (pp. 71-83). New York: New York Academy of Sciences.
Goldman-Rakic, P. S., & Leung, H.-C. (2002). Functional architecture of the dorsolateral prefrontal cortex in monkeys and humans. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 85-95). London: Oxford University Press.
Goldstein, G., Johnson, C. R., & Minshew, N. J. (2001). Attentional processes in autism. Journal of Autism & Developmental Disorders, 31(4), 433-440.
Goodman, R. (1989). Infantile autism: A syndrome of multiple primary deficits? Journal of Autism & Developmental Disorders, 19(3), 409-424.
Gopnik, A. (1993). How we know our minds: The illusion of first-person knowledge of intentionality. Behavioral & Brain Sciences, 16(1), 1-14, 29-113.
Gopnik, A., & Astington, J. W. (1988). Children's understanding of representational change and its relation to the understanding of false belief and the appearance-reality distinction. Child Development, 59(1), 26-37.
Gopnik, A., & Meltzoff, A. N. (1997). Words, thoughts, and theories. Cambridge, MA: MIT Press.
Gopnik, A., & Wellman, H. M. (1994). The theory theory. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 257-293). Cambridge: Cambridge University Press.
Gordon, A. C. L., & Olson, D. R. (1998). The relation between acquisition of a theory of mind and the capacity to hold in mind. Journal of Experimental Child Psychology, 68(1), 70-83.
Gottesman, I. I., & Gould, T. D. (2003). The endophenotype concept in psychiatry: Etymology and strategic intentions. American Journal of Psychiatry, 160(4), 636-645.
Grafman, J. (1994).
Alternative frameworks for the conceptualization of prefrontal lobe functions. In F. Boller & J. Grafman (Eds.), Handbook of Neuropsychology (Vol. 9, pp. 187-201). Amsterdam: Elsevier Science.
Grant, C. M., Grayson, A., & Boucher, J. (2001). Using tests of false belief with children with autism: How valid and reliable are they? Autism, 5(2), 135-145.
Grant, D. A., & Berg, E. (1948). A behavioral analysis of degree of reinforcement and ease of shifting to new responses in a Weigl-type card-sorting problem. Journal of Experimental Psychology, 38, 404-411.
Grattan, L. M., Bloomer, R. H., Archambault, F. X., & Eslinger, P. J. (1994). Cognitive flexibility and empathy after frontal lobe lesion. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 7(4), 251-259.
Gregory, C., Lough, S., Stone, V., Erzinclioglu, S., Martin, L., Baron-Cohen, S., & Hodges, J. R. (2002). Theory of mind in patients with frontal variant frontotemporal dementia and Alzheimer's disease: Theoretical and practical implications. Brain, 125(4), 752-764.
Griffith, E. M., Pennington, B. F., Wehner, E. A., & Rogers, S. J. (1999). Executive functions in young children with autism. Child Development, 70(4), 817-832.
Grodzinsky, G. M., & Diamond, R. (1992). Frontal lobe functioning in boys with attention-deficit hyperactivity disorder. Developmental Neuropsychology, 8(4), 427-445.
Grossman, M. (2002). Frontotemporal dementia: A review. Journal of the International Neuropsychological Society, 8(4), 566-583.
Hala, S., Hug, S., & Henderson, A. (2003). Executive function and false-belief understanding in preschool children: Two tasks are harder than one. Journal of Cognition & Development, 4(3), 275-298.
Hala, S., & Russell, J. (2001). Executive control within strategic deception: A window on early cognitive development? Journal of Experimental Child Psychology, 80(2), 112-141.
Halford, G. S. (1993). Children's understanding: The development of mental models. Hillsdale, NJ: Lawrence Erlbaum.
Halford, G.
S., Wilson, W. H., & Phillips, S. (1998). Processing capacity defined by relational complexity: Implications for comparative, developmental, and cognitive psychology. Behavioral & Brain Sciences, 21(6), 803-864.
Happé, F. G. E. (1994a). An advanced test of theory of mind: Understanding of story characters' thoughts and feelings by able autistic, mentally handicapped, and normal children and adults. Journal of Autism & Developmental Disorders, 24(2), 129-154.
Happé, F. G. E. (1994b). Annotation: Current psychological theories of autism: The "Theory of Mind" account and rival theories. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35(2), 215-229.
Happé, F. G. E. (1994c). Wechsler IQ profile and theory of mind in autism: A research note. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35(8), 1461-1471.
Happé, F. G. (1995). The role of age and verbal ability in the theory of mind task performance of subjects with autism. Child Development, 66(3), 843-855.
Happé, F. G. E. (1996). Studying weak central coherence at low levels: Children with autism do not succumb to visual illusions. Journal of Child Psychology and Psychiatry, 37(7), 873-877.
Happé, F. G. E. (1997). Central coherence and theory of mind in autism: Reading homographs in context. British Journal of Developmental Psychology, 15, 1-12.
Happé, F. (1999). Understanding assets and deficits in autism: Why success is more interesting than failure. Psychologist, 12(11), 540-546.
Happé, F. (2000). Parts and wholes, meaning and minds: Central coherence and its relation to theory of mind. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 203-221). London: Oxford University Press.
Happé, F. (2001). Social and nonsocial development in autism: Where are the links? In J. A. Burack, T. Charman, N. Yirmiya, & P. R.
Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 237-253). Mahwah, NJ: Lawrence Erlbaum Associates.
Happé, F., Briskman, J., & Frith, U. (2001). Exploring the cognitive phenotype of autism: Weak "central coherence" in parents and siblings of children with autism: I. Experimental tests. Journal of Child Psychology & Psychiatry & Allied Disciplines, 42(3), 299-307.
Happé, F., Ehlers, S., Fletcher, P., Frith, U., Johansson, M., Gillberg, C., Dolan, R., Frackowiak, R., & Frith, C. (1996). 'Theory of mind' in the brain. Evidence from a PET scan study of Asperger syndrome. Neuroreport, 8(1), 197-201.
Happé, F., & Frith, U. (1995). Theory of mind in autism. In E. Schopler & G. B. Mesibov (Eds.), Learning and Cognition in Autism (pp. 177-197). New York: Plenum Press.
Happé, F., & Frith, U. (1996). The neuropsychology of autism. Brain, 119, 1377-1400.
Happé, F., Malhi, G. S., & Checkley, S. (2001). Acquired mind-blindness following frontal lobe surgery? A single case study of impaired 'theory of mind' in a patient treated with stereotactic anterior capsulotomy. Neuropsychologia, 39(1), 83-90.
Harnishfeger, K. K., & Bjorklund, D. F. (1993). The ontogeny of inhibition mechanisms: A renewed approach to cognitive development. In M. L. Howe & R. Pasnak (Eds.), Emerging themes in cognitive development (Vol. 1: Foundations, pp. 28-49). New York: Springer-Verlag.
Harris, P. (1993). Pretending and planning. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 228-246). Oxford: Oxford University Press.
Harris, P. L., & Leevers, H. J. (2000). Pretending, imagery, and self-awareness in autism. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 182-202). London: Oxford University Press.
Head, D., Bolton, D., & Hymas, N. (1989).
Deficit in cognitive shifting ability in patients with obsessive-compulsive disorder. Biological Psychiatry, 25(7), 929-937.
Heaton, R. K., Grant, I., & Matthews, C. G. (1991). Comprehensive norms for an expanded Halstead-Reitan battery: Demographic corrections, research findings, and clinical applications. Odessa, FL: Psychological Assessment Resources.
Heilman, K., Watson, R., & Valenstein, E. (1993). Neglect and related disorders. In K. Heilman & E. Valenstein (Eds.), Clinical Neuropsychology (3rd ed., pp. 279-336). New York: Oxford University Press.
Hermelin, B., & O'Connor, N. (1970). Psychological experiments with autistic children. New York: Pergamon.
Hill, E. L. (2004). Executive dysfunction in autism. Trends in Cognitive Sciences, 8(1), 26-32.
Hill, E. L., & Frith, U. (2003). Understanding autism: Insights from mind and brain. Philosophical Transactions of the Royal Society of London B, 358(1430), 281-289.
Hill, E. L., & Russell, J. (2002). Action memory and self-monitoring in children with autism: Self versus other. Infant & Child Development, 11(2), 159-170.
Hobson, R. P. (1989). Beyond cognition: A theory of autism. In G. Dawson (Ed.), Autism: Nature, diagnosis, and treatment (pp. 22-48). New York: The Guilford Press.
Hobson, R. P. (1993). Understanding persons: The role of affect. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 204-227). Oxford: Oxford University Press.
Hoff, A. L., & Kremen, W. S. (2003). Neuropsychology in schizophrenia: An update. Current Opinion in Psychiatry, 16(2), 149-155.
Hollander, E., King, A., Delaney, K., Smith, C. J., & Silverman, J. M. (2003). Obsessive-compulsive behaviors in parents of multiplex autism families. Psychiatry Research, 117(1), 11-16.
Holroyd, S., & Baron-Cohen, S. (1993). Brief report: How far can people with autism go in developing a theory of mind? Journal of Autism & Developmental Disorders, 23(2), 379-385.
Hughes, C. (1996a).
Brief report: Planning problems in autism at the level of motor control. Journal of Autism & Developmental Disorders, 26(1), 99-107.
Hughes, C. (1996b). Control of action and thought: Normal development and dysfunction in autism: A research note. Journal of Child Psychology & Psychiatry & Allied Disciplines, 37(2), 229-236.
Hughes, C. (1998a). Executive function in preschoolers: Links with theory of mind and verbal ability. British Journal of Developmental Psychology, 16, 233-253.
Hughes, C. (1998b). Finding your marbles: Does preschoolers' strategic behavior predict later understanding of mind? Developmental Psychology, 34(6), 1326-1339.
Hughes, C. (2001). Executive dysfunction in autism: Its nature and implications for the everyday problems experienced by individuals with autism. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 255-275). Mahwah, NJ: Lawrence Erlbaum Associates.
Hughes, C., Adlam, A., Happé, F., Jackson, J., Taylor, A., & Caspi, A. (2000). Good test-retest reliability for standard and advanced false-belief tasks across a wide range of abilities. Journal of Child Psychology & Psychiatry & Allied Disciplines, 41(4), 483-490.
Hughes, C., Dunn, J., & White, A. (1998). Trick or treat? Uneven understanding of mind and emotion and executive dysfunction in "hard-to-manage" preschoolers. Journal of Child Psychology and Psychiatry, 39(7), 981-994.
Hughes, C., & Graham, A. (2002). Measuring executive functions in childhood: Problems and solutions? Child & Adolescent Mental Health, 7(3), 131-142.
Hughes, C., Leboyer, M., & Bouvard, M. (1997). Executive function in parents of children with autism. Psychological Medicine, 27, 209-220.
Hughes, C., Plumet, M.-H., & Leboyer, M. (1999). Towards a cognitive phenotype for autism: Increased prevalence of executive dysfunction and superior spatial span amongst siblings of children with autism.
Journal of Child Psychology & Psychiatry & Allied Disciplines, 40(5), 705-718.
Hughes, C., & Russell, J. (1993). Autistic children's difficulty with mental disengagement from an object: Its implications for theories of autism. Developmental Psychology, 29(3), 498-510.
Hughes, C., Russell, J., & Robbins, T. W. (1994). Evidence for executive dysfunction in autism. Neuropsychologia, 32(4), 477-492.
Hughes, C., Soares-Boucaud, I., Hochmann, J., & Frith, U. (1997). Social behaviour in pervasive developmental disorders: Effects of informant, group and "theory of mind". European Child & Adolescent Psychiatry, 6(4), 191-198.
Hutt, S., & Hutt, C. (1968). Stereotypy, arousal and autism. Human Development, 11(4), 277-286.
Huttenlocher, P. R., & Dabholkar, A. S. (1997). Developmental anatomy of prefrontal cortex. In N. A. Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic (Eds.), Development of the prefrontal cortex: Evolution, neurobiology and behavior (pp. 69-83). Baltimore, MD: Paul H. Brookes.
Jacques, S., Zelazo, P. D., Kirkham, N. Z., & Semcesen, T. K. (1999). Rule selection versus rule execution in preschoolers: An error-detection approach. Developmental Psychology, 35(3), 770-780.
Jarrold, C., Boucher, J., & Smith, P. K. (1994a). Executive function deficits and the pretend play of children with autism: A research note. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35(8), 1473-1482.
Jarrold, C., Boucher, J., & Smith, P. K. (1996). Generativity defects in pretend play in autism. British Journal of Developmental Psychology, 14(3), 275-300.
Jarrold, C., Butler, D. W., Cottington, E. M., & Jimenez, F. (2000). Linking theory of mind and central coherence bias in autism and in the general population. Developmental Psychology, 36(1), 126-138.
Jarrold, C., Smith, P., Boucher, J., & Harris, P. (1994b). Comprehension of pretense in children with autism. Journal of Autism & Developmental Disorders, 24(4), 433-455.
Jenkins, J. M., & Astington, J. W. (1996).
Cognitive factors and family structure associated with theory of mind development in young children. Developmental Psychology, 32(1), 70-78.
Johnson, M. H., Siddons, F., Frith, U., & Morton, J. (1992). Can autism be predicted on the basis of infant screening tests? Developmental Medicine & Child Neurology, 34(4), 316-320.
Jolliffe, T., & Baron-Cohen, S. (2000). Linguistic processing in high-functioning adults with autism or Asperger's syndrome. Is global coherence impaired? Psychological Medicine, 30(5), 1169-1187.
Kain, W., & Perner, J. (2003). Do children with ADHD not need their frontal lobes for theory of mind? A review of brain imaging and neuropsychological studies. In M. Brüne, H. Ribbert, & W. Schiefenhövel (Eds.), The social brain: Evolution and pathology (pp. 197-230). Chichester, UK: John Wiley & Sons.
Kane, M. J., & Engle, R. W. (2002). The role of prefrontal cortex in working-memory capacity, executive attention, and general fluid intelligence: An individual-differences perspective. Psychonomic Bulletin & Review, 9(4), 637-671.
Karmiloff-Smith, A. (1992). Beyond modularity: A developmental perspective on cognitive science. Cambridge, MA: The MIT Press.
Karmiloff-Smith, A. (1997). Crucial differences between developmental cognitive neuroscience and adult neuropsychology. Developmental Neuropsychology, 13(4), 513-524.
Karmiloff-Smith, A., Scerif, G., & Ansari, D. (2003). Double dissociations in developmental disorders? Theoretically misconceived, empirically dubious. Cortex, 39(1), 161-163.
Karmiloff-Smith, A., Scerif, G., & Thomas, M. (2002). Different approaches to relating genotype to phenotype in developmental disorders. Developmental Psychobiology, 40(3), 311-322.
Keenan, T. (1998). Memory span as a predictor of false belief understanding. New Zealand Journal of Psychology, 27(2), 36-43.
Keenan, T., Olson, D. R., & Marini, Z. (1998). Working memory and children's developing understanding of mind. Australian Journal of Psychology, 50(2), 76-82.
Kerr, N., Dunbar, R. I., & Bentall, R. P. (2003). Theory of mind deficits in bipolar affective disorder. Journal of Affective Disorders, 73(3), 253-259.
Kimberg, D. Y., & Farah, M. J. (1993). A unified account of cognitive impairments following frontal lobe damage: The role of working memory in complex, organized behaviour. Journal of Experimental Psychology: General, 122, 411-428.
Kleinman, J., Marciano, P. L., & Ault, R. L. (2001). Advanced theory of mind in high-functioning adults with autism. Journal of Autism and Developmental Disorders, 31(1), 29-36.
Klin, A., & Volkmar, F. (1993). The development of individuals with autism: Implications for the theory of mind hypothesis. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 317-331). Oxford: Oxford University Press.
Klin, A., Volkmar, F. R., & Sparrow, S. S. (1992). Autistic social dysfunction: Some limitations of the theory of mind hypothesis. Journal of Child Psychology & Psychiatry & Allied Disciplines, 33(5), 861-876.
Klin, A., Volkmar, F., Sparrow, S., Cicchetti, D., & Rourke, B. (1995). Validity and neuropsychological characterization of Asperger syndrome: Convergence with nonverbal learning disabilities syndrome. Journal of Child Psychology & Psychiatry & Allied Disciplines, 36(7), 1127-1140.
Kochanska, G., Murray, K., & Coy, K. C. (1997). Inhibitory control as a contributor to conscience in childhood: From toddler to early school age. Child Development, 68(2), 263-277.
Koenig, K., Tsatsanis, K. D., & Volkmar, F. R. (2001). Neurobiology and genetics of autism: A developmental perspective. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 81-101). Mahwah, NJ: Lawrence Erlbaum Associates.
Krikorian, R., Bartok, J., & Gay, N. (1994). Tower of London procedure: A standard method and developmental data.
Journal of Clinical & Experimental Neuropsychology, 16(6), 840-850.
Landa, R., Piven, J., Wzorek, M. M., Gayle, J. O., Chase, G. A., & Folstein, S. E. (1992). Social language use in parents of autistic individuals. Psychological Medicine, 22(1), 245-54.
Lang, B., & Perner, J. (2002). Understanding of intention and false belief and the development of self-control. British Journal of Developmental Psychology, 20(1), 67-76.
Le Couteur, A., Bailey, A., Goode, S., Pickles, A., Robertson, S., Gottesman, I., & Rutter, M. (1996). A broader phenotype of autism: The clinical spectrum in twins. Journal of Child Psychology & Psychiatry & Allied Disciplines, 37(7), 785-801.
Le Couteur, A., Rutter, M., Lord, C., Rios, P., Robertson, S., Holdgrafer, M., & McLennan, J. D. (1989). Autism Diagnostic Interview: A semi-structured interview for parents and caregivers of autistic persons. Journal of Autism & Developmental Disorders, 19(3), 363-387.
Leboyer, M., Bellivier, F., Nosten-Bertrand, M., Jouvent, R., Pauls, D., & Mallet, J. (1998). Psychiatric genetics: Search for phenotypes. Trends in Neurosciences, 21(3), 102-105.
Leboyer, M., Plumet, M.-H., Goldblum, M.-C., Perez-Diaz, F., & Marchaland, C. (1995). Verbal versus visuospatial abilities in relatives of autistic females. Developmental Neuropsychology, 11(1), 139-155.
Leekam, S. R., & Moore, C. (2001). The development of attention and joint attention in children with autism. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 105-129). Mahwah, NJ: Lawrence Erlbaum Associates.
Leekam, S. R., & Perner, J. (1991). Does the autistic child have a metarepresentational deficit? Cognition, 40(3), 203-218.
Leekam, S. R., & Prior, M. (1994). Can autistic children distinguish lies from jokes? A second look at second-order belief attribution. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35(5), 901-915.
Leslie, A. M. (1987).
Pretense and representation: The origins of "theory of mind". Psychological Review, 94(4), 412-426.
Leslie, A. M. (1991). The theory of mind impairment in autism: Evidence for a modular mechanism of development? In A. Whiten (Ed.), Natural theories of mind: Evolution, development and simulation of everyday mindreading (pp. 63-78). Oxford: Blackwell.
Leslie, A. M. (1994a). Pretending and believing: Issues in the theory of ToMM. Cognition, 50(1-3), 211-238.
Leslie, A. M. (1994b). ToMM, ToBy, and Agency: Core architecture and domain specificity. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 119-148). Cambridge: Cambridge University Press.
Leslie, A. M., & Frith, U. (1988). Autistic children's understanding of seeing, knowing and believing. British Journal of Developmental Psychology, 6, 315-324.
Leslie, A. M., & Happé, F. (1989). Autism and ostensive communication: The relevance of metarepresentation. Development & Psychopathology, 1(3), 205-212.
Leslie, A. M., & Polizzi, P. (1998). Inhibitory processing in the false belief task: Two conjectures. Developmental Science, 1(2), 247-253.
Leslie, A., & Roth, D. (1993). What autism teaches us about metarepresentation. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 83-111). Oxford: Oxford University Press.
Leslie, A. M., & Thaiss, L. (1992). Domain specificity in conceptual development: Neuropsychological evidence from autism. Cognition, 43(3), 225-251.
Levin, H. S., Culhane, K. A., Hartmann, J., Evankovich, K., Mattson, A. J., Harward, H., Ringholz, G., Ewing-Cobbs, L., & Fletcher, J. M. (1991). Developmental changes in performance on tests of purported frontal lobe functioning. Developmental Neuropsychology, 7(3), 377-395.
Levin, H. S., Fletcher, J. M., Kufera, J. A., Lilly, M. A., Mendelsohn, D., Bruce, D., & Eisenberg, H. M. (1996).
Dimensions of cognition measured by the Tower of London and other cognitive tasks in head-injured children and adolescents. Developmental Neuropsychology, 12(1), 17-34.
Levin, H. S., Mendelsohn, D. B., Lilly, M. A., Fletcher, J. M., Culhane, K. A., Chapman, S. B., Harward, H., Kusnerik, L., Bruce, D., & Eisenberg, H. M. (1994). Tower of London performance in relation to Magnetic Resonance Imaging following closed head injury in children. Neuropsychology, 8(2), 171-179.
Levine, B., Stuss, D. T., Milberg, W. P., Alexander, M. P., Schwartz, M., & MacDonald, R. (1998). The effects of focal and diffuse brain damage on strategy application: Evidence from focal lesions, traumatic brain injury and normal aging. Journal of the International Neuropsychological Society, 4(3), 247-264.
Lewis, C., & Mitchell, P. (Eds.). (1994). Children's early understanding of mind: Origins and development. Hove, UK: Lawrence Erlbaum.
Lewis, V., & Boucher, J. (1988). Spontaneous, instructed and elicited play in relatively able autistic children. British Journal of Developmental Psychology, 6(4), 325-339.
Lewis, V., & Boucher, J. (1991). Skill, content and generative strategies in autistic children's drawings. British Journal of Developmental Psychology, 9(3), 393-416.
Lewis, V., & Boucher, J. (1995). Generativity in the play of young people with autism. Journal of Autism & Developmental Disorders, 25(2), 105-121.
Lezak, M. D. (1993). Newer contributions to the neuropsychological assessment of executive functions. Journal of Head Trauma Rehabilitation, 8(1), 24-31.
Lezak, M. D. (1995). Neuropsychological assessment. (3rd ed.). New York: Oxford University Press.
Liss, M., Fein, D., Allen, D., Dunn, M., Feinstein, C., Morris, R., Waterhouse, L., & Rapin, I. (2001). Executive functioning in high-functioning children with autism. Journal of Child Psychology & Psychiatry & Allied Disciplines, 42(2), 261-270.
Lockyer, L., & Rutter, M. (1969).
A five- to fifteen-year follow-up study of infantile psychosis: III. Psychological aspects. British Journal of Psychiatry, 115(525), 865-882.
Lockyer, L., & Rutter, M. (1970). A five- to fifteen-year follow-up study of infantile psychosis: IV. Patterns of cognitive ability. British Journal of Social & Clinical Psychology, 9(2), 152-163.
Lord, C., Risi, S., Lambrecht, L., Cook, E. H., Leventhal, B. L., DiLavore, P. C., Pickles, A., & Rutter, M. (2000). The Autism Diagnostic Observation Schedule-Generic: A standard measure of social and communication deficits associated with the spectrum of autism. Journal of Autism & Developmental Disorders, 30(3), 205-223.
Lord, C., Rutter, M., & Le Couteur, A. (1994). Autism Diagnostic Interview--Revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. Journal of Autism & Developmental Disorders, 24(5), 659-685.
Lorr, M. (1994). Cluster analysis: Aims, methods, and problems. In S. Strack & M. Lorr (Eds.), Differentiating normal and abnormal personality (pp. 179-195). New York: Springer Publishing Co.
Lough, S., Gregory, C., & Hodges, J. R. (2001). Dissociation of social cognition and executive function in frontal variant frontotemporal dementia. Neurocase, 7(2,Pt2), 123-130.
Lough, S., & Hodges, J. R. (2002). Measuring and modifying abnormal social cognition in frontal variant frontotemporal dementia. Journal of Psychosomatic Research, 53(2), 639-646.
Lowe, C., & Rabbitt, P. (1998). Test/re-test reliability of the CANTAB and ISPOCD neuropsychological batteries: Theoretical and practical issues. Neuropsychologia, 36(9), 915-923.
Luciana, M., & Nelson, C. A. (1998). The functional emergence of prefrontally-guided working memory systems in four- to eight-year-old children. Neuropsychologia, 36(3), 273-293.
Luna, B., Minshew, N., Garver, K., Lazar, N., Thulborn, K., Eddy, W., & Sweeney, J. (2002).
Neocortical system abnormalities in autism: An fMRI study of spatial working memory. Neurology, 59(6), 834-840.
Luria, A. R. (1966). Higher cortical functions in man. New York: Basic Books.
Macintosh, K. E., & Dissanayake, C. (2004). Annotation: The similarities and differences between autistic disorder and Asperger's disorder: A review of the empirical evidence. Journal of Child Psychology and Psychiatry, 45(3), 421-434.
MacLean, J. E., Szatmari, P., Jones, M. B., Bryson, S. E., Mahoney, W. J., Bartolucci, G., & Tuff, L. (1999). Familial factors influence level of functioning in pervasive developmental disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 38(6), 746-753.
Manly, T., Anderson, V., Nimmo-Smith, I., Turner, A., Watson, P., & Robertson, I. H. (2001). The differential assessment of children's attention: The Test of Everyday Attention for Children (TEA-Ch), normative sample and ADHD performance. Journal of Child Psychology & Psychiatry & Allied Disciplines, 42(8), 1065-1081.
Manly, T., Robertson, I. H., Anderson, V., & Nimmo-Smith, I. (1998). The Test of Everyday Attention for Children (TEA-Ch). Thames Valley Test Company.
Marlowe, W. B. (1992). The impact of a right prefrontal lesion on the developing brain. Brain & Cognition, 20(1), 205-213.
Maxwell, S. E., & Delaney, H. D. (1990). Designing experiments and analysing data: A model comparison perspective. Belmont, CA: Wadsworth.
Mayes, L. C., Klin, A., Tercyak, K. P., Cicchetti, D. V., & Cohen, D. J. (1996). Test-retest reliability for false-belief tasks. Journal of Child Psychology & Psychiatry & Allied Disciplines, 37(3), 313-319.
Mazza, M., De Risio, A., Surian, L., Roncone, R., & Casacchia, M. (2001). Selective impairments of theory of mind in people with schizophrenia. Schizophrenia Research, 47(2-3), 299-308.
McEvoy, R. E., Rogers, S. J., & Pennington, B. F. (1993). Executive function and social communication deficits in young autistic children.
Journal of Child Psychology & Psychiatry & Allied Disciplines, 34(4), 563-578.
Mega, M. S., & Cummings, J. L. (1994). Frontal-subcortical circuits and neuropsychiatric disorders. Journal of Neuropsychiatry & Clinical Neurosciences, 6(4), 358-370.
Mengelberg, A., & Siegert, R. J. (2003). Is theory-of-mind impaired in Parkinson's disease? Cognitive Neuropsychiatry, 8(3), 191-209.
Miller, G. A., & Chapman, J. P. (2001). Misunderstanding analysis of covariance. Journal of Abnormal Psychology, 110, 40-48.
Miller, J. N., & Ozonoff, S. (2000). The external validity of Asperger disorder: Lack of evidence from the domain of neuropsychology. Journal of Abnormal Psychology, 109(2), 227-238.
Milner, B. (1963). Effects of different brain lesions on card sorting. Archives of Neurology, 9, 90-100.
Minshew, N. J., Goldstein, G., Muenz, L. R., & Payton, J. B. (1992). Neuropsychological functioning in nonmentally retarded autistic individuals. Journal of Clinical & Experimental Neuropsychology, 14(5), 749-761.
Minshew, N. J., Johnson, C., & Luna, B. (2001). The cognitive and neural basis of autism: A disorder of complex information processing and dysfunction of neocortical systems. In L. M. Glidden (Ed.), International review of research in mental retardation: Autism (Vol. 23, pp. 111-138). San Diego, CA: Academic Press.
Minshew, N. J., Luna, B., & Sweeney, J. A. (1999). Oculomotor evidence for neocortical systems but not cerebellar dysfunction in autism. Neurology, 52(5), 917-922.
Minter, M., Hobson, R., & Bishop, M. (1998). Congenital visual impairment and 'theory of mind'. British Journal of Developmental Psychology, 16(2), 183-196.
Minton, J., Campbell, M., Green, W. H., Jennings, S., & Samit, C. (1982). Cognitive assessment of siblings of autistic children. Journal of the American Academy of Child Psychiatry, 21(3), 256-261.
Mitchell, P., & Lacohée, H. (1991). Children's early understanding of false belief. Cognition, 39, 107-127.
Mitchell, P., & Riggs, K. J. (Eds.). (2000).
Children's reasoning and the mind. Hove, UK: Psychology Press.
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., & Howerter, A. (2000). The unity and diversity of executive functions and their contributions to complex "frontal lobe" tasks: A latent variable analysis. Cognitive Psychology, 41(1), 49-100.
Moore, C., Jarrold, C., Russell, J., Lumb, A., Sapp, F., & MacCallum, F. (1995). Conflicting desire and the child's theory of mind. Cognitive Development, 10(4), 467-482.
Morton, J., & Frith, U. (1995). Causal modeling: A structural approach to developmental psychopathology. In D. Cicchetti & D. J. Cohen (Eds.), Developmental psychopathology (Vol. 1: Theory and methods, pp. 357-390). Oxford: John Wiley & Sons.
Morton, J., & Frith, U. (2001). Why we need cognition: Cause and developmental disorder. In E. Dupoux (Ed.), Language, brain, and cognitive development: Essays in honor of Jacques Mehler (pp. 263-278). Cambridge, MA: MIT Press.
Moses, L. J., & Flavell, J. H. (1990). Inferring false beliefs from actions and reactions. Child Development, 61(4), 929-945.
Mundy, P. (2003). The neural basis of social impairments in autism: The role of the dorsal medial-frontal cortex and anterior cingulate system. Journal of Child Psychology & Psychiatry & Allied Disciplines, 44(6), 793-809.
Mundy, P., & Neal, A. (2001). Neural plasticity, joint attention, and a transactional social-orienting model of autism. In L. M. Glidden (Ed.), International review of research in mental retardation: Autism (Vol. 23, pp. 139-168). San Diego, CA: Academic Press.
Mundy, P., & Sigman, M. (1989). Specifying the nature of the social impairment in autism. In G. Dawson (Ed.), Autism: Nature, diagnosis, and treatment (pp. 3-21). New York: The Guilford Press.
Mundy, P., Sigman, M., & Kasari, C. (1993). The theory of mind and joint-attention deficits in autism. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 181-203).
Oxford: Oxford University Press.
Murphy, M., Bolton, P. F., Pickles, A., Fombonne, E., Piven, J., & Rutter, M. (2000). Personality traits of the relatives of autistic probands. Psychological Medicine, 30(6), 1411-1424.
Narayan, S., Moyes, B., & Wolff, S. (1990). Family characteristics of autistic children: A further report. Journal of Autism & Developmental Disorders, 20(4), 523-535.
Norman, D., & Shallice, T. (1980). Attention to action: Willed and automatic control of behaviour. Center for Human Information Processing (Technical Report No. 99).
Norman, D. A., & Shallice, T. (1986). Attention to action: Willed and automatic control of behaviour. In R. J. Davidson, G. E. Schwartz, & D. Shapiro (Eds.), Consciousness and self-regulation (Vol. 4). New York: Plenum Press.
Ohnishi, T., Matsuda, H., Hashimoto, T., Kunihiro, T., Nishikawa, M., Uema, T., & Sasaki, M. (2000). Abnormal regional cerebral blood flow in childhood autism. Brain, 123(9), 1838-1844.
Olson, D. R. (1989). Making up your mind. Canadian Psychology, 30, 617-627.
Oosterlaan, J., Logan, G. D., & Sergeant, J. A. (1998). Response inhibition in AD/HD, CD, comorbid AD/HD + CD, anxious, and control children: A meta-analysis of studies with the stop task. Journal of Child Psychology & Psychiatry & Allied Disciplines, 39(3), 411-425.
Ornitz, E. M. (1969). Disorders of perception common to early infantile autism and schizophrenia. Comprehensive Psychiatry, 10(4), 259-274.
Ornitz, E. M. (1988). Autism: A disorder of directed attention. Brain Dysfunction, 1(5-6), 309-322.
Oswald, D. P., & Ollendick, T. H. (1989). Role taking and social competence in autism and mental retardation. Journal of Autism & Developmental Disorders, 19(1), 119-127.
Owen, A. M., Downes, J. J., Sahakian, B. J., Polkey, C. E., & Robbins, T. W. (1990). Planning and spatial working memory following frontal lobe lesions in man. Neuropsychologia, 28(10), 1021-1034.
Owen, A. M., Roberts, A. C., Hodges, J. R., Summers, B. A., Polkey, C.
E., & Robbins, T. W. (1993). Contrasting mechanisms of impaired attentional set-shifting in patients with frontal lobe damage or Parkinson's disease. Brain, 116, 1159-1175.
Owen, A. M., Roberts, A. C., Polkey, C. E., Sahakian, B. J., & Robbins, T. W. (1991). Extra-dimensional versus intra-dimensional set shifting performance following frontal lobe excisions, temporal lobe excisions or amygdalo-hippocampectomy in man. Neuropsychologia, 29(10), 993-1006.
Ozonoff, S. (1995a). Executive functions in autism. In E. Schopler & G. B. Mesibov (Eds.), Learning and cognition in autism (pp. 199-219). New York: Plenum Press.
Ozonoff, S. (1995b). Reliability and validity of the Wisconsin Card Sorting Test in studies of autism. Neuropsychology, 9(4), 491-500.
Ozonoff, S. (1997a). Causal mechanisms of autism: Unifying perspectives from an information-processing framework. In D. J. Cohen & F. R. Volkmar (Eds.), Handbook of autism and pervasive developmental disorders (2nd ed., pp. 868-879). New York: John Wiley & Sons.
Ozonoff, S. (1997b). Components of executive function in autism and other disorders. In J. Russell (Ed.), Autism as an executive disorder (pp. 179-211). Oxford: Oxford University Press.
Ozonoff, S. (2001). Advances in the cognitive neuroscience of autism. In C. A. Nelson & M. Luciana (Eds.), Handbook of developmental cognitive neuroscience (pp. 537-548). Cambridge, MA: MIT Press.
Ozonoff, S., & Jensen, J. (1999). Brief report: Specific executive function profiles in three neurodevelopmental disorders. Journal of Autism and Developmental Disorders, 29(2), 171-177.
Ozonoff, S., & McEvoy, R. E. (1994). A longitudinal study of executive function and theory of mind development in autism. Development & Psychopathology, 6(3), 415-431.
Ozonoff, S., Pennington, B. F., & Rogers, S. J. (1991). Executive function deficits in high-functioning autistic individuals: Relationship to theory of mind. Journal of Child Psychology & Psychiatry & Allied Disciplines, 32(7), 1081-1105.
Ozonoff, S., Rogers, S. J., Farnham, J. M., & Pennington, B. F. (1993). Can standard measures identify subclinical markers of autism? Journal of Autism & Developmental Disorders, 23(3), 429-441.
Ozonoff, S., South, M., & Miller, J. N. (2000). DSM-IV-defined Asperger syndrome: Cognitive, behavioral and early history differentiation from high-functioning autism. Autism, 4(1), 29-46.
Ozonoff, S., & Strayer, D. L. (1997). Inhibitory function in nonretarded children with autism. Journal of Autism and Developmental Disorders, 27(1), 59-77.
Ozonoff, S., & Strayer, D. L. (2001). Further evidence of intact working memory in autism. Journal of Autism and Developmental Disorders, 31(3), 257-263.
Ozonoff, S., Strayer, D. L., McMahon, W. M., & Filloux, F. (1994). Executive function abilities in autism and Tourette syndrome: An information processing approach. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35(6), 1015-1032.
Pantelis, C., Barnes, T. R., Nelson, H. E., Tanner, S., Weatherley, L., Owen, A. M., & Robbins, T. W. (1997). Frontal-striatal cognitive deficits in patients with chronic schizophrenia. Brain, 120(10), 1823-1843.
Passler, M. A., Isaac, W., & Hynd, G. W. (1985). Neuropsychological development of behavior attributed to frontal lobe functioning in children. Developmental Neuropsychology, 1(4), 349-370.
Paterson, S., Brown, J., Gsoedl, M., Johnson, M., & Karmiloff-Smith, A. (1999). Cognitive modularity and genetic disorders. Science, 286(5448), 2355-2358.
Pellicano, E., Maybery, M., Durkin, K., & Maley, A. (2004). Weak central coherence in children with autism: Its relationship to mindreading and executive functioning. Manuscript submitted for publication.
Pennington, B. F. (1997). Dimensions of executive functions in normal and abnormal development. In N. A. Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic (Eds.), Development of the prefrontal cortex: Evolution, neurobiology, and behavior (pp. 265-281). Baltimore, MD: Paul H. Brookes.
Pennington, B. F., Groisser, D., & Welsh, M. C. (1993). Contrasting cognitive deficits in attention deficit hyperactivity disorder versus reading disability. Developmental Psychology, 29(3), 511-523.
Pennington, B. F., & Ozonoff, S. (1991). A neuroscientific perspective on continuity and discontinuity in developmental psychopathology. In D. Cicchetti & S. L. Toth (Eds.), Rochester symposium on developmental psychopathology (Vol. 3: Models and integrations, pp. 117-159). Rochester, NY: University of Rochester Press.
Pennington, B. F., & Ozonoff, S. (1996). Executive functions and developmental psychopathology. Journal of Child Psychology and Psychiatry, 37(1), 51-87.
Pennington, B. F., Rogers, S. J., Bennetto, L., McMahon Griffith, E., Reed, D. T., & Shyu, V. (1997). Validity tests of the executive dysfunction hypothesis of autism. In J. Russell (Ed.), Autism as an executive disorder (pp. 143-178). Oxford: Oxford University Press.
Pennington, B. F., & Welsh, M. (1995). Neuropsychology and developmental psychopathology. In D. Cicchetti & D. J. Cohen (Eds.), Developmental psychopathology (Vol. 1: Theory and methods, pp. 254-290). Oxford: John Wiley & Sons.
Perner, J. (1991). Understanding the representational mind. Cambridge, MA: MIT Press.
Perner, J. (1993). The theory of mind deficit in autism: Rethinking the metarepresentation theory. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 112-137). Oxford: Oxford University Press.
Perner, J. (1995). The many faces of belief: Reflections on Fodor's and the child's theory of mind. Cognition, 57(3), 241-269.
Perner, J. (1998). The meta-intentional nature of executive functions and theory of mind. In P. Carruthers & J. Boucher (Eds.), Language and thought (pp. 270-283). Cambridge: Cambridge University Press.
Perner, J. (2000). About + belief + counterfactual. In P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the mind (pp. 367-401).
Hove, UK: Psychology Press.
Perner, J., Baker, S., & Hutton, D. (1994). Prelief: The conceptual origins of belief and pretence. In C. Lewis & P. Mitchell (Eds.), Children's early understanding of mind: Origins and development. Hove, UK: Lawrence Erlbaum.
Perner, J., Frith, U., Leslie, A. M., & Leekam, S. R. (1989). Exploration of the autistic child's theory of mind: Knowledge, belief, and communication. Child Development, 60(3), 689-700.
Perner, J., Kain, W., & Barchfeld, P. (2002a). Executive control and higher-order theory of mind in children at risk of ADHD. Infant & Child Development, 11(2), 141-158.
Perner, J., & Lang, B. (1999). Development of theory of mind and executive control. Trends in Cognitive Sciences, 3(9), 337-344.
Perner, J., & Lang, B. (2000). Theory of mind and executive function: Is there a developmental relationship? In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 150-181). London: Oxford University Press.
Perner, J., & Lang, B. (2002). What causes 3-year-olds' difficulty on the dimensional change card sorting task? Infant & Child Development, 11(2), 93-105.
Perner, J., Lang, B., & Kloo, D. (2002b). Theory of mind and self-control: More than a common problem of inhibition. Child Development, 73(3), 752-767.
Perner, J., Leekam, S. R., & Wimmer, H. (1987). Three-year-olds' difficulty with false belief: The case for a conceptual deficit. British Journal of Developmental Psychology, 5(2), 125-137.
Perner, J., Ruffman, T., & Leekam, S. R. (1994). Theory of mind is contagious: You catch it from your sibs. Child Development, 65(4), 1228-1238.
Perner, J., Stummer, S., & Lang, B. (1999). Executive functions and theory of mind: Cognitive complexity or functional dependence? In P. D. Zelazo, J. W. Astington, & D. R. Olson (Eds.), Developing theories of intention: Social understanding and self-control (pp. 133-152). Mahwah, NJ: Lawrence Erlbaum.
Perner, J., & Wimmer, H. (1985). "John thinks that Mary thinks that . . .": Attribution of second-order beliefs by 5- to 10-year-old children. Journal of Experimental Child Psychology, 39(3), 437-471.
Peterson, C. C. (2002). Drawing insight from pictures: The development of concepts of false drawing and false belief in children with deafness, normal hearing, and autism. Child Development, 73(5), 1442-1459.
Peterson, C. C., & Siegal, M. (1995). Deafness, conversation and theory of mind. Journal of Child Psychology & Psychiatry & Allied Disciplines, 36(3), 459-474.
Peterson, D. M., & Bowler, D. M. (2000). Counterfactual reasoning and false belief understanding in children with autism, children with severe learning difficulties and children with typical development. Autism, 4(4), 391-405.
Peterson, D. M., & Riggs, K. J. (1999). Adaptive modelling and mindreading. Mind and Language, 14(1), 80-117.
Phillips, L. H. (1997). Do "frontal tests" measure executive function? Issues of assessment and evidence from fluency tests. In P. Rabbitt (Ed.), Methodology of frontal and executive function (pp. 191-213). Hove, UK: Psychology Press.
Phillips, W., Baron-Cohen, S., & Rutter, M. (1998). Understanding intention in normal development and in autism. British Journal of Developmental Psychology, 16(3), 337-348.
Pickles, A., Bolton, P., Macdonald, H., Bailey, A., Le Couteur, A., Sim, C. H., & Rutter, M. (1995). Latent-class analysis of recurrence risks for complex phenotypes with selection and measurement error: A twin and family history study of autism. American Journal of Human Genetics, 57(3), 717-26.
Pickles, A., Starr, E., Kazak, S., Bolton, P., Papanikolaou, K., Bailey, A., Goodman, R., & Rutter, M. (2000). Variable expression of the autism broader phenotype: Findings from extended pedigrees. Journal of Child Psychology & Psychiatry & Allied Disciplines, 41(4), 491-502.
Pilowsky, T., Yirmiya, N., Arbelle, S., & Mozes, T. (2000).
Theory of mind abilities of children with schizophrenia, children with autism, and normally developing children. Schizophrenia Research, 42(2), 145-155.
Pilowsky, T., Yirmiya, N., Shalev, R. S., & Gross-Tsur, V. (2003). Language abilities of siblings of children with autism. Journal of Child Psychology & Psychiatry & Allied Disciplines, 44(6), 914-925.
Piven, J. (1999). Genetic liability for autism: The behavioural expression in relatives. International Review of Psychiatry, 11(4), 299-308.
Piven, J., Arndt, S., Bailey, J., Havercamp, S., Andreasen, N. C., & Palmer, P. (1995). An MRI study of brain size in autism. American Journal of Psychiatry, 152(8), 1145-1149.
Piven, J., Bailey, J., Ranson, B. J., & Arndt, S. (1997a). An MRI study of the corpus callosum in autism. American Journal of Psychiatry, 154, 1051-1056.
Piven, J., Berthier, M. L., Starkstein, S. E., Nehme, E., Pearlson, G., & Folstein, S. (1990a). Magnetic resonance imaging evidence for a defect of cerebral cortical development in autism. American Journal of Psychiatry, 147(6), 734-739.
Piven, J., Chase, G. A., Landa, R., Wzorek, M., Gayle, J., Cloud, D., & Folstein, S. (1991). Psychiatric disorders in the parents of autistic individuals. Journal of the American Academy of Child & Adolescent Psychiatry, 30(3), 471-478.
Piven, J., Gayle, J., Chase, G. A., Fink, B., Landa, R., Wzorek, M. M., & Folstein, S. E. (1990b). A family history study of neuropsychiatric disorders in the adult siblings of autistic individuals. Journal of the American Academy of Child & Adolescent Psychiatry, 29(2), 177-183.
Piven, J., & Palmer, P. (1997). Cognitive deficits in parents from multiple-incidence autism families. Journal of Child Psychology & Psychiatry & Allied Disciplines, 38(8), 1011-1021.
Piven, J., & Palmer, P. (1999). Psychiatric disorder and the broad autism phenotype: Evidence from a family study of multiple-incidence autism families. American Journal of Psychiatry, 156(4), 557-563.
Piven, J., Palmer, P., Jacobi, D., Childress, D., & Arndt, S. (1997b). Broader autism phenotype: Evidence from a family history study of multiple-incidence autism families. American Journal of Psychiatry, 154(2), 185-190.
Piven, J., Palmer, P., Landa, R., Santangelo, S., Jacobi, D., & Childress, D. (1997c). Personality and language characteristics in parents from multiple-incidence autism families. American Journal of Medical Genetics (Neuropsychiatric Genetics), 74(4), 398-411.
Piven, J., Wzorek, M., Landa, R., Lainhart, J., Bolton, P., Chase, G. A., & Folstein, S. (1994). Personality characteristics of the parents of autistic individuals. Psychological Medicine, 24(3), 783-795.
Plaisted, K. C. (2000). Aspects of autism that theory of mind cannot explain. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 222-250). London: Oxford University Press.
Plaisted, K. C. (2001). Reduced generalization in autism: An alternative to weak central coherence. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 149-169). Mahwah, NJ: Lawrence Erlbaum Associates.
Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? The Behavioral and Brain Sciences, 4, 515-526.
Price, B. H., Daffner, K. R., Stowe, R. M., & Mesulam, M. M. (1990). The comportmental learning disabilities of early frontal lobe damage. Brain, 113, 1383-1393.
Prior, M., Dahlstrom, B., & Squires, T.-L. (1990). Autistic children's knowledge of thinking and feeling states in other people. Journal of Child Psychology & Psychiatry & Allied Disciplines, 31(4), 587-601.
Prior, M., Eisenmajer, R., Leekam, S., Wing, L., Gould, J., Ong, B., & Dowe, D. (1998). Are there subgroups within the autistic spectrum? A cluster analysis of a group of children with autistic spectrum disorders.
Journal of Child Psychology & Psychiatry & Allied Disciplines, 39(6), 893-902. Prior, M., & Hoffmann, W. (1990). Brief report: Neuropsychological testing of autistic children through an exploration with frontal lobe tests. Journal of Autism & Developmental Disorders, 20(4), 581-590. Pylyshyn, Z. W. (1978). When is attribution of beliefs justified? The Behavioral and Brain Sciences, 1, 592-593. Rabbitt, P. (Ed.). (1997). Methodology of frontal and executive function. Hove, UK: Psychology Press. Rapin, I. (1997). Classification and causal issues in autism. In D. J. Cohen & F. R. Volkmar (Eds.), Handbook of autism and pervasive developmental disorders (2nd ed., pp. 847-867). New York: John Wiley & Sons. Razani, J., Boone, K., Miller, B. L., Lee, A., & Sherman, D. (2001). Neuropsychological performance of right- and left-frontotemporal dementia compared to Alzheimer's disease. Journal of the International Neuropsychological Society, 7(4), 468-480. Reed, T., & Peterson, C. (1990). A comparative study of autistic subjects' performance at two levels of visual and cognitive perspective taking. Journal of Autism and Developmental Disorders, 20, 555-568. Reitan, R. M., & Wolfson, D. (1994). A selective and critical review of neuropsychological deficits and the frontal lobes. Neuropsychology Review, 4(3), 161-198. Remmel, E. R. (2003). Theory of mind development in signing deaf children. Unpublished PhD thesis, Stanford University. Rinehart, N. J., Bradshaw, J. L., Moss, S. A., Brereton, A. V., & Tonge, B. J. (2001). A deficit in shifting attention present in high-functioning autism but not Asperger's disorder. Autism, 5(1), 67-80. Rinehart, N. J., Bradshaw, J. L., Tonge, B. J., Brereton, A. V., & Bellgrove, M. A. (2002). A neurobehavioral examination of individuals with high-functioning autism and Asperger disorder using a fronto-striatal model of dysfunction. Behavioral & Cognitive Neuroscience Reviews, 1(2), 164-177. 
Risch, N., Spiker, D., Lotspeich, L., Nouri, N., Hinds, D., Hallmayer, J., et al. (1999). A genomic screen of autism: evidence for a multilocus etiology. American Journal of Human Genetics, 65(2), 493-507. Roberts, R. J., Hager, L. D., & Heron, C. (1994). Prefrontal cognitive processes: Working memory and inhibition in the antisaccade task. Journal of Experimental Psychology: General, 123(4), 374-393. Roberts, R. J., & Pennington, B. F. (1996). An interactive framework for examining prefrontal cognitive processes. Developmental Neuropsychology, 12(1), 105-126. Robinson, E. J., & Beck, S. (2000). What is difficult about counterfactual reasoning? In P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the mind (pp. 101-119). Hove, UK: Psychology Press. Robinson, E., & Mitchell, P. (1995). Masking of children's early understanding of the representational mind: Backwards explanation versus prediction. Child Development, 66(4), 1022-1039. Robinson, E., Riggs, K., & Samuels, J. (1996). Children's memory for drawings based on a false belief. Developmental Psychology, 32(6), 1056-1064. Rogers, S. J. (1999). An examination of the imitation deficit in autism. In J. Nadel & G. Butterworth (Eds.), Imitation in infancy: Cambridge studies in cognitive perceptual development (pp. 254-283). New York: Cambridge University Press. Rogers, S. J., & Pennington, B. F. (1991). A theoretical approach to the deficits in infantile autism. Development & Psychopathology, 3(2), 137-162. Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park, CA: Sage. Roth, D., & Leslie, A. M. (1998). Solving belief problems: Toward a task analysis. Cognition, 66(1), 1-31. Rowe, A. D., Bullock, P. R., Polkey, C. E., & Morris, R. G. (2001). 'Theory of mind' impairments and their relationship to executive functioning following frontal lobe excisions. Brain, 124(3), 600-616. Royall, D. R., Lauterbach, E. C., Cummings, J. L., Reeve, A., Rummans, T. A., Kaufer, D.
I., LaFrance, W., & Coffey, C. (2002). Executive control function: A review of its promise and challenges for clinical research: A report from the committee on research of the American Neuropsychiatric Association. Journal of Neuropsychiatry & Clinical Neurosciences, 14(4), 377-405. Ruffman, T., Perner, J., Naito, M., Parkin, L., & Clements, W. A. (1998). Older (but not younger) siblings facilitate false belief understanding. Developmental Psychology, 34(1), 161-174. Rumsey, J. M. (1985). Conceptual problem-solving in highly verbal, nonretarded autistic men. Journal of Autism & Developmental Disorders, 15(1), 23-36. Rumsey, J. M., & Hamburger, S. D. (1988). Neuropsychological findings in high-functioning men with infantile autism, residual state. Journal of Clinical & Experimental Neuropsychology, 10(2), 201-221. Russell, J. (1996). Agency: Its role in mental development. Hove, UK: Lawrence Erlbaum. Russell, J. (Ed.). (1997a). Autism as an executive disorder. Oxford: Oxford University Press. Russell, J. (1997b). How executive disorders can bring about an inadequate 'theory of mind'. In J. Russell (Ed.), Autism as an executive disorder (pp. 215-255). Oxford: Oxford University Press. Russell, J., Hala, S., & Hill, E. (2003). The automated windows task: The performance of preschool children, children with autism, and children with moderate learning difficulties. Cognitive Development, 18(1), 111-137. Russell, J., & Hill, E. L. (2001). Action-monitoring and intention reporting in children with autism. Journal of Child Psychology & Psychiatry & Allied Disciplines, 42(3), 317-328. Russell, J., Hill, E. L., & Franco, F. (2001). The role of belief veracity in understanding intentions-in-action: Preschool children's performance on the transparent intentions task. Cognitive Development, 16(3), 775-792. Russell, J., & Jarrold, C. (1998). Error-correction problems in autism: Evidence for a monitoring impairment? Journal of Autism & Developmental Disorders, 28(3), 177-188.
Russell, J., & Jarrold, C. (1999). Memory for actions in children with autism: Self versus other. Cognitive Neuropsychiatry, 4(4), 303-331. Russell, J., Jarrold, C., & Henry, L. (1996). Working memory in children with autism and with moderate learning difficulties. Journal of Child Psychology and Psychiatry, 37(6), 673-686. Russell, J., Jarrold, C., & Hood, B. (1999). Two intact executive capacities in children with autism: Implications for the core executive dysfunctions in the disorder. Journal of Autism and Developmental Disorders, 29(2), 103-112. Russell, J., Jarrold, C., & Potel, D. (1994). What makes strategic deception difficult for children - the deception or the strategy? British Journal of Developmental Psychology, 12(3), 301-314. Russell, J., Mauthner, N., Sharpe, S., & Tidswell, T. (1991). The "windows task" as a measure of strategic deception in preschoolers and autistic subjects. British Journal of Developmental Psychology, 9(2), 331-349. Russell, J., Saltmarsh, R., & Hill, E. (1999). What do executive factors contribute to the failure on false belief tasks by children with autism? Journal of Child Psychology & Psychiatry & Allied Disciplines, 40(6), 859-868. Rutherford, M., & Rogers, S. J. (2003). Cognitive underpinnings of pretend play in autism. Journal of Autism & Developmental Disorders, 33(3), 289-302. Rutter, M. (1968). Concepts of autism: A review of research. Journal of Child Psychology & Psychiatry & Allied Disciplines, 9(1), 1-25. Rutter, M. (1970). Autistic children: Infancy to adulthood. Seminars in Psychiatry, 2, 435-450. Rutter, M. (1983). Cognitive deficits in the pathogenesis of autism. Journal of Child Psychology & Psychiatry & Allied Disciplines, 24(4), 513-531. Rutter, M. (2000). Genetic studies of autism: From the 1970s into the millennium. Journal of Abnormal Child Psychology, 28(1), 3-14. Saltzman, J., Strauss, E., Hunter, M., & Archibald, S. (2000).
Theory of mind and executive functions in normal human aging and Parkinson's disease. Journal of the International Neuropsychological Society, 6(7), 781-788. Saver, J. L., & Damasio, A. R. (1991). Preserved access and processing of social knowledge in a patient with acquired sociopathy due to ventromedial frontal damage. Neuropsychologia, 29(12), 1241-1249. Scheerer, M., Rothmann, E., & Goldstein, K. (1945). A case of "idiot savant": An experimental study of personality organization. Psychological Monographs, 58, 1-63. Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: I. Detection, search, and attention. Psychological Review, 84(1), 1-66. Scholl, B. J., & Leslie, A. M. (1999). Modularity, development and 'theory of mind'. Mind & Language, 14(1), 131-153. Scholl, B. J., & Leslie, A. M. (2001). Minds, modules, and meta-analysis. Commentary on "Meta-analysis of theory-of-mind development: The truth about false belief." Child Development, 72(3), 696-701. Schwartz, M. L. (1997). Organization and development of callosal connectivity in prefrontal cortex. In N. A. Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic (Eds.), Development of the prefrontal cortex: Evolution, neurobiology and behavior (pp. 49-67). Baltimore, MD: Paul H. Brookes. Sergeant, J. A., Geurts, H., & Oosterlaan, J. (2002). How specific is a deficit of executive functioning for attention-deficit/hyperactivity disorder? Behavioural Brain Research, 130(1-2), 3-28. Shah, A., & Frith, U. (1983). An islet of ability in autistic children: A research note. Journal of Child Psychology and Psychiatry, 24(4), 613-620. Shah, A., & Frith, U. (1993). Why do autistic individuals show superior performance on the block design task? Journal of Child Psychology & Psychiatry & Allied Disciplines, 34(8), 1351-1364. Shallice, T. (1982). Specific impairments in planning. Philosophical Transactions of the Royal Society of London B, 298, 199-209. Shallice, T. (1984).
More functionally isolable subsystems but fewer "modules"? Cognition, 17(3), 243-252. Shallice, T. (1988). From neuropsychology to mental structure. New York: Cambridge University Press. Shallice, T. (2002). Fractionation of the supervisory system. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 261-277). London: Oxford University Press. Shallice, T., & Burgess, P. (1991). Deficits in strategy application after frontal lobe damage in man. Brain, 114, 727-741. Shallice, T., & Burgess, P. W. (1996). Domains of supervisory control and the temporal organisation of behaviour. Philosophical Transactions of the Royal Society of London B, 351, 1405-1412. Shallice, T., Marzocchi, G. M., Coser, S., Del Savio, M., Meuter, R. F., & Rumiati, R. I. (2002). Executive function profile of children with attention deficit hyperactivity disorder. Developmental Neuropsychology, 21(1), 43-71. Sherman, M., Nass, R., & Shapiro, T. (1984). Brief report: Regional cerebral blood flow in autism. Journal of Autism and Developmental Disorders, 14(4), 439-446. Siegal, M., & Beattie, K. (1991). Where to look first for children's understanding of false beliefs. Cognition, 38(1), 1-12. Sigman, M., & Ruskin, E. (1999). Social competence in children with autism, Down syndrome and developmental delays: A longitudinal study. Monographs of the Society for Research in Child Development, 64(Serial No. 256). Silverman, J. M., Smith, C. J., Schmeidler, J., Hollander, E., Lawlor, B. A., Fitzgerald, M., Buxbaum, J. D., Delaney, K., & Galvin, P. (2002). Symptom domains in autism and related conditions: Evidence for familiality. American Journal of Medical Genetics (Neuropsychiatric Genetics), 114(1), 64-73. Skuse, D. (2001). Endophenotypes and child psychiatry. British Journal of Psychiatry, 178, 395-396. Skuse, D., James, R., Bishop, D., Coppin, B., Dalton, P., Aamodt-Leeper, G., Bacarese-Hamilton, M., Creswell, C., McGurk, R., & Jacobs, P. A. (1997).
Evidence from Turner's syndrome of an imprinted X-linked locus affecting cognitive function. Nature, 387(6634), 705-708. Slaats-Willemse, D., Swaab-Barneveld, H., de Sonneville, L., van der Meulen, E., & Buitelaar, J. (2003). Deficient response inhibition as a cognitive endophenotype of ADHD. Journal of the American Academy of Child & Adolescent Psychiatry, 42(10), 1242-1248. Smalley, S. L., & Asarnow, R. F. (1990). Brief report: Cognitive subclinical markers in autism. Journal of Autism & Developmental Disorders, 20(2), 271-278. Smalley, S. L., McCracken, J., & Tanguay, P. (1995). Autism, affective disorders, and social phobia. American Journal of Medical Genetics (Neuropsychiatric Genetics), 60, 19-26. Smith, I. M., & Bryson, S. E. (1994). Imitation and action in autism: A critical review. Psychological Bulletin, 116(2), 259-273. Smith, M. L., Klim, P., & Hanley, W. B. (2000). Executive function in school-aged children with phenylketonuria. Journal of Developmental & Physical Disabilities, 12(4), 317-332. Sodian, B., & Frith, U. (1992). Deception and sabotage in autistic, retarded and normal children. Journal of Child Psychology & Psychiatry & Allied Disciplines, 33(3), 591-605. Sparrevohn, R., & Howie, P. M. (1995). Theory of mind in children with autistic disorder: Evidence of developmental progression and the role of verbal ability. Journal of Child Psychology & Psychiatry & Allied Disciplines, 36(2), 249-263. Stahl, L., & Pry, R. (2002). Joint attention and set-shifting in young children with autism. Autism, 6(4), 383-396. Starr, E., Berument, S. K., Pickles, A., Tomlins, M., Bailey, A., Papanikolaou, K., & Rutter, M. (2001). A family genetic study of autism associated with profound mental retardation. Journal of Autism & Developmental Disorders, 31(1), 89-96. Steel, J., Gorman, R., & Flexman, J. E. (1984). Neuropsychiatric testing in an autistic mathematical idiot-savant: Evidence for nonverbal abstract capacity.
Journal of the American Academy of Child Psychiatry, 23(6), 704-707. Steele, S., Joseph, R. M., & Tager-Flusberg, H. (2003). Brief report: Developmental change in theory of mind abilities in children with autism. Journal of Autism and Developmental Disorders, 33(4), 461-467. Steffenburg, S., Gillberg, C., Hellgren, L., Andersson, L., Gillberg, I., Jakobsson, G., & Bohman, M. (1989). A twin study of autism in Denmark, Finland, Iceland, Norway and Sweden. Journal of Child Psychology & Psychiatry & Allied Disciplines, 30(3), 405-416. Stevens, M. C., Fein, D. A., Dunn, M., Allen, D., Waterhouse, L. H., Feinstein, C., & Rapin, I. (2000). Subgroups of children with autism by cluster analysis: A longitudinal examination. Journal of the American Academy of Child & Adolescent Psychiatry, 39(3), 346-352. Stone, V. (2000). The role of the frontal lobes and the amygdala in theory of mind. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 253-273). London: Oxford University Press. Stone, V. E., Baron-Cohen, S., & Knight, R. T. (1998). Frontal lobe contributions to theory of mind. Journal of Cognitive Neuroscience, 10(5), 640-656. Stuss, D. T., & Alexander, M. P. (2000). Executive functions and the frontal lobes: A conceptual view. Psychological Research, 63(3-4), 289-298. Stuss, D. T., & Benson, D. F. (1984). Neuropsychological studies of the frontal lobes. Psychological Bulletin, 95(1), 3-28. Stuss, D. T., & Benson, D. F. (1986). The frontal lobes. New York: Raven Press. Stuss, D. T., Gallup, G. G., Jr., & Alexander, M. P. (2001). The frontal lobes are necessary for "theory of mind". Brain, 124(2), 279-286. Stuss, D. T., & Knight, R. T. (Eds.). (2002). Principles of frontal lobe function. London: Oxford University Press. Surian, L., & Leslie, A. M. (1999). Competence and performance in false belief understanding: A comparison of autistic and normal 3-yr-old children.
British Journal of Developmental Psychology, 17(Pt 1), 141-155. Swettenham, J., Baron-Cohen, S., Charman, T., Cox, A., Baird, G., Drew, A., Rees, L., & Wheelwright, S. (1998). The frequency and distribution of spontaneous attention shifts between social and nonsocial stimuli in autistic, typically developing, and nonautistic developmentally delayed infants. Journal of Child Psychology & Psychiatry & Allied Disciplines, 39(5), 747-753. Szatmari, P. (1999). Heterogeneity and the genetics of autism. Journal of Psychiatry & Neuroscience, 24(2), 159-165. Szatmari, P., Jones, M. B., Tuff, L., Bartolucci, G., Fisman, S., & Mahoney, W. (1993). Lack of cognitive impairment in first-degree relatives of children with pervasive developmental disorders. Journal of the American Academy of Child & Adolescent Psychiatry, 32(6), 1264-1273. Szatmari, P., Jones, M. B., Zwaigenbaum, L., & MacLean, J. E. (1998). Genetics of autism: Overview and new directions. Journal of Autism & Developmental Disorders, 28(5), 351-368. Szatmari, P., Merette, C., Bryson, S. E., Thivierge, J., Roy, M.-A., Cayer, M., & Maziade, M. (2002). Quantifying dimensions in autism: A factor-analytic study. Journal of the American Academy of Child & Adolescent Psychiatry, 41(4), 467-474. Szatmari, P., Tuff, L., Finlayson, A. J., & Bartolucci, G. (1990). Asperger's Syndrome and autism: Neurocognitive aspects. Journal of the American Academy of Child & Adolescent Psychiatry, 29(1), 130-136. Tabachnick, B. G., & Fidell, L. S. (1996). Using multivariate statistics (3rd ed.). New York: HarperCollins College Publishers. Tager-Flusberg, H. (1992). Autistic children's talk about psychological states: Deficits in the early acquisition of a theory of mind. Child Development, 63, 161-172. Tager-Flusberg, H. (Ed.). (1999a). Neurodevelopmental disorders: Developmental cognitive neuroscience. Cambridge, MA: The MIT Press. Tager-Flusberg, H. (1999b).
A psychological approach to understanding the social and language impairments in autism. International Review of Psychiatry, 11(4), 325-334. Tager-Flusberg, H. (2000). Language and understanding minds: Connections in autism. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 124-149). London: Oxford University Press. Tager-Flusberg, H. (2001). A reexamination of the theory of mind hypothesis of autism. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 173-193). Mahwah, NJ: Lawrence Erlbaum Associates. Tager-Flusberg, H., & Joseph, R. M. (2003). Identifying neurocognitive phenotypes in autism. Philosophical Transactions of the Royal Society of London B, 358(1430), 303-314. Tager-Flusberg, H., & Sullivan, K. (1994a). Predicting and explaining behavior: A comparison of autistic, mentally retarded and normal children. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35(6), 1059-1075. Tager-Flusberg, H., & Sullivan, K. (1994b). A second look at second-order belief attribution in autism. Journal of Autism & Developmental Disorders, 24(5), 577-586. Tager-Flusberg, H., & Sullivan, K. (1995). Attributing mental states to story characters: A comparison of narratives produced by autistic and mentally retarded individuals. Applied Psycholinguistics, 16(3), 241-256. Tager-Flusberg, H., Sullivan, K., & Boshart, J. (1997). Executive functions and performance on false belief tasks. Developmental Neuropsychology, 13(4), 487-493. Teunisse, J.-P., Cools, A. R., van Spaendonck, K. P. M., Aerts, F. H. T. M., & Berger, H. J. C. (2001). Cognitive styles in high-functioning adolescents with autistic disorder. Journal of Autism & Developmental Disorders, 31(1), 55-66. Thatcher, R. W. (1997). Human frontal lobe development: A theory of cyclical cortical reorganization. In N. A.
Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic (Eds.), Development of the prefrontal cortex: Evolution, neurobiology and behavior (pp. 85-113). Baltimore, MD: Paul H. Brookes. Thomas, M., & Karmiloff-Smith, A. (2002). Are developmental disorders like cases of adult brain damage? Implications from connectionist modelling. Behavioral & Brain Sciences, 25(6), 727-787. Tranel, D. (2002). Emotion, decision making, and the ventromedial prefrontal cortex. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 338-352). London: Oxford University Press. Tranel, D., Anderson, S. W., & Benton, A. (1994). Development of the concept of 'executive function' and its relationship to the frontal lobes. In F. Boller & J. Grafman (Eds.), Handbook of Neuropsychology (Vol. 9, pp. 125-148). Amsterdam: Elsevier Science. Turner, M. A. (1996). Repetitive behaviour and cognitive functioning in autism. Unpublished PhD thesis, University of Cambridge. Turner, M. (1997). Towards an executive dysfunction account of repetitive behaviour in autism. In J. Russell (Ed.), Autism as an executive disorder (pp. 57-100). Oxford: Oxford University Press. Turner, M. A. (1999). Generating novel ideas: Fluency performance in high-functioning and learning disabled individuals with autism. Journal of Child Psychology and Psychiatry, 40(2), 189-201. Tyrer, P. (Ed.). (1988). Personality assessment schedule: In personality disorders: Diagnosis, management, and course. London: Butterworth. Veale, D. M., Sahakian, B. J., Owen, A. M., & Marks, I. M. (1996). Specific cognitive deficits in tests sensitive to frontal lobe dysfunction in obsessive-compulsive disorder. Psychological Medicine, 26, 1261-1269. Vecchi, T. (1998). Visuo-spatial imagery in congenitally totally blind people. Memory, 6(1), 91-102. Volkmar, F. R., Lord, C., Bailey, A., Schultz, R. T., & Klin, A. (2004). Autism and pervasive developmental disorders. Journal of Child Psychology and Psychiatry, 45(1), 135-170. Volkmar, F.
R., Sparrow, S. S., Goudreau, D., Cicchetti, D. V., Paul, R., & Cohen, D. J. (1987). Social deficits in autism: An operational approach using the Vineland Adaptive Behavior Scales. Journal of the American Academy of Child & Adolescent Psychiatry, 26(2), 156-161. Wallach, M. A., & Kogan, N. (1965). Modes of thinking in young children. New York: Holt, Rinehart, & Winston. Walsh, K. W. (1978). Neuropsychology: A clinical approach. New York: Churchill Livingstone. Waltz, J. A., Knowlton, B. J., Holyoak, K. J., Boone, K. B., Mishkin, F. S., de Menezes Santos, M., Thomas, C. R., & Miller, B. L. (1999). A system for relational reasoning in human prefrontal cortex. Psychological Science, 10(2), 119-125. Waterhouse, L., Fein, D., & Modahl, C. (1996). Neurofunctional mechanisms in autism. Psychological Review, 103(3), 457-489. Weinberger, D. (2002). Schizophrenia, the prefrontal cortex, and a mechanism of genetic susceptibility. European Psychiatry, 17(Suppl4), 355-362. Wellman, H. M. (1990). The child's theory of mind. Cambridge, MA: MIT Press. Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-analysis of theory-of-mind development: The truth about false belief. Child Development, 72(3), 655-684. Wellman, H. M., & Gelman, S. A. (1998). Knowledge acquisition in foundational domains. In D. Kuhn & R. Siegler (Eds.), Handbook of child psychology: Cognition, perception and language (5th ed., pp. 523-573). New York: Wiley. Wellman, H. M., & Lagattuta, K. H. (2000). Developing understandings of mind. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 21-49). London: Oxford University Press. Wellman, H. M., & Woolley, J. D. (1990). From simple desires to ordinary beliefs: The early development of everyday psychology. Cognition, 35(3), 245-275. Welsh, M. C., & Pennington, B. F. (1988). Assessing frontal lobe functioning in children: Views from developmental psychology.
Developmental Neuropsychology, 4(3), 199-230. Welsh, M. C., Pennington, B. F., & Groisser, D. B. (1991). A normative-developmental study of executive function: A window on prefrontal function in children. Developmental Neuropsychology, 7(2), 131-149. Welsh, M. C., Pennington, B. F., Ozonoff, S., Rouse, B., & McCabe, E. (1990). Neuropsychology of early-treated phenylketonuria: Specific executive function deficits. Child Development, 61(6), 1697-1713. Welsh, M. C., Satterlee-Cartmell, T., & Stine, M. (1999). Towers of Hanoi and London: Contribution of working memory and inhibition to performance. Brain & Cognition, 41(2), 231-242. Whiten, A. (Ed.). (1991). Natural theories of mind: Evolution, development and simulation of everyday mindreading. Oxford: Blackwell. Williams, M. A., Moss, S. A., Bradshaw, J. L., & Rinehart, N. J. (2002). Random number generation in autism. Journal of Autism & Developmental Disorders, 32(1), 43-47. Wilson, B. A., Evans, J. J., Emslie, H., Alderman, N., & Burgess, P. (1998). The development of an ecologically valid test for assessing patients with dysexecutive syndrome. Neuropsychological Rehabilitation, 8(3), 213-228. Wimmer, H., & Mayringer, H. (1998). False belief understanding in young children: Explanations do not develop before predictions. International Journal of Behavioral Development, 22(2), 403-422. Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13(1), 103-128. Wing, L., & Gould, J. (1979). Severe impairments of social interaction and associated abnormalities in children: Epidemiology and classification. Journal of Autism and Developmental Disorders, 9, 11-29. Wolff, S., Narayan, S., & Moyes, B. (1988). Personality characteristics of parents of autistic children: A controlled study. Journal of Child Psychology & Psychiatry & Allied Disciplines, 29(2), 143-53. Wong, D., Maybery, M., Bishop, D. V.
M., Maley, A., & Hallmayer, J. (2004). Profiles of executive function performance in parents and siblings of individuals with autism spectrum disorders. Manuscript in preparation. World Health Organization. (1992). The ICD-10 classification of mental and behavioral disorders: Clinical descriptions and diagnostic guidelines. Geneva, Switzerland: Author. Yirmiya, N., Erel, O., Shaked, M., & Solomonica-Levi, D. (1998). Meta-analyses comparing theory of mind abilities of individuals with autism, individuals with mental retardation, and normally developing individuals. Psychological Bulletin, 124(3), 283-307. Yirmiya, N., & Shulman, C. (1996). Seriation, conservation, and theory of mind abilities in individuals with autism, individuals with mental retardation, and normally developing children. Child Development, 67(5), 2045-2059. Yirmiya, N., Solomonica-Levi, D., Shulman, C., & Pilowsky, T. (1996). Theory of mind abilities in individuals with autism, Down syndrome, and mental retardation of unknown etiology: The role of age and intelligence. Journal of Child Psychology & Psychiatry & Allied Disciplines, 37(8), 1003-1014. Yonan, A. L., Alarcon, M., Cheng, R., Magnusson, P. K., Spence, S. J., Palmer, A. A., Grunn, A., Juo, S. H., Terwilliger, J. D., Liu, J., Cantor, R. M., Geschwind, D. H., & Gilliam, T. C. (2003). A genomewide screen of 345 families for autism-susceptibility loci. American Journal of Human Genetics, 73(4), 886-97. Zelazo, P. D. (2000). Self-reflection and the development of consciously controlled processing. In P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the mind (pp. 169-189). Hove, UK: Psychology Press. Zelazo, P. D., Burack, J. A., Benedetto, E., & Frye, D. (1996a). Theory of Mind and rule use in individuals with Down's Syndrome: A test of the uniqueness and specificity claims. Journal of Child Psychology & Psychiatry & Allied Disciplines, 37(4), 479-484. Zelazo, P. D., Burack, J. A., Boseovski, J. J., Jacques, S., & Frye, D. (2001).
A cognitive complexity and control framework for the study of autism. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and research (pp. 195-217). Mahwah, NJ: Lawrence Erlbaum Associates. Zelazo, P. D., Carter, A., Reznick, J. S., & Frye, D. (1997). Early development of executive function: A problem-solving framework. Review of General Psychology, 1(2), 198-226. Zelazo, P. D., & Frye, D. (1998). Cognitive complexity and control: II. The development of executive function in childhood. Current Directions in Psychological Science, 7(4), 121-126. Zelazo, P. D., Frye, D., & Rapus, T. (1996b). An age-related dissociation between knowing rules and using them. Cognitive Development, 11(1), 37-63. Zelazo, P. D., Jacques, S., Burack, J. A., & Frye, D. (2002). The relation between theory of mind and rule use: Evidence from persons with autism-spectrum disorders. Infant & Child Development, 11(2), 171-195. Zelazo, P. D., & Müller, U. (2002). Executive function in typical and atypical development. In U. Goswami (Ed.), Blackwell handbook of childhood cognitive development (pp. 445-469). Malden, MA: Blackwell Publishers. Zelazo, P. D., & Reznick, J. (1991). Age-related asynchrony of knowledge and action. Child Development, 62(4), 719-735. Ziatas, K., Durkin, K., & Pratt, C. (1998). Belief term development in children with autism, Asperger syndrome, specific language impairment, and normal development: Links to theory of mind development. Journal of Child Psychology & Psychiatry & Allied Disciplines, 39(5), 755-763. Zilbovicius, M., Garreau, B., Samson, Y., Remy, P., Barthelemy, C., Syrota, A., & Lelord, G. (1995). Delayed maturation of the frontal cortex in childhood autism. American Journal of Psychiatry, 152(2), 248-252. 
APPENDIX A

Repetitive Behaviours Interview – Current Version

Instructions: In this interview I will ask you for details about some of the behaviours covered in the repetitive behaviours questionnaire which you would have completed in regard to each of your children. I’ll just be asking you about questions which you answered ‘yes’ to in that questionnaire. I’ll start by asking if [name] currently displays a particular behaviour and by this I mean a behaviour s/he has displayed once a week or more over the last 3 months. If s/he has, I’d like you to try to describe the behaviour, and I’ll also ask how often he/she shows this behaviour. [I’ll then ask you whether he/she has ever shown this behaviour at least once a week for a period of three months or more. If he/she has shown this behaviour in the past, I’d like you to try to describe the behaviour to me and if possible to tell me at what age the behaviour was most frequent.]* Ask me questions at any time if things don’t seem clear. I am interested in all the repetitive behaviours shown by [name], so please tell me anything that you think may be of any interest. All the information you give me will be confidential. Any queries before we start?

* [These instructions are only for parents of participants who are over the age of 12.]

STEREOTYPED MANIPULATION OF OBJECTS

1. Does [name] currently manipulate objects repetitively in any way? For example, does he/she spin, twiddle, bang, tap, twist, flick or wave objects or other materials repetitively? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
a) (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe objects and actions-

2. Does [name] currently operate light switches, taps, the toilet flush etc, repeatedly?
Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
a) (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe actions-

3. Does [name] currently arrange objects in rows or other patterns? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
a) (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
c) DOES [NAME] ALWAYS LINE UP THE SAME OBJECTS IN THE SAME ORDER?
(1) different objects and different order (2) same objects and different order (3) same objects and same order (9) no information (99) not applicable
d) DOES [NAME] SEEM TO NOTICE INSTANTLY IF AN OBJECT IS MISSING OR MOVED?
(1) no (2) frequently (3) always (9) no information (99) not applicable
e) DOES [NAME] OBJECT IF THESE ROWS OR PATTERNS ARE MOVED OR PACKED AWAY?
(1) no (2) frequently (3) always (9) no information (99) not applicable
Describe objects and arrangements-

4. Does [name] currently mouth or suck objects or parts of him/herself repeatedly? For example, does he/she mouth or suck his/her fingers, a favourite object, his/her shirt collar or the like? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
a) (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe objects or body parts-

5. Does [name] currently stare closely at objects or his/her body parts?
For example, does he/she stare at lights, spinning objects, a certain toy, his/her fingers etc.? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe objects or body parts-

6. Does [name] currently obsessively collect or hoard items of any sort? Has s/he ever?
(0) no obsessive, or unusually keen, collecting or hoarding
(1) very keen collector of usual items (e.g. stamps, football cards etc.)
(2) hoards unusual or odd items (e.g. leaflets, jar lids, sticks etc.), irregularly or on occasion, and is reticent to throw away anything that has been collected
(3) hoards unusual or odd items on a very regular basis, which, because of the volume of items hoarded, leads to regular difficulties and conflicts
(9) no information
Details-

STEREOTYPED MOVEMENTS

7. Does [name] currently pace or move around repetitively? For example, does he/she walk to and fro across a room or around the house or garden repetitively? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movement, route and location-

8. Does [name] currently often spin him/herself around and around? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movement-

9. Does [name] currently rock rhythmically backwards and forwards, or side to side, either when sitting or when standing? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe whether sitting or standing-

10. Does [name] currently touch parts of his/her body or clothing repeatedly? For example, does he/she repeatedly rub his/her legs, pull at the buttons on his/her clothing, or touch his/her ear or elbow etc.? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe action and body part or clothing-

11. Does [name] currently make repetitive arm, hand and/or finger movements? For example, does he/she repetitively wave, flick, flap or twiddle his/her hands or fingers? Does he/she repetitively clap or clasp his/her hands? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements and whether this occurs near his/her eyes-

12. Does [name] currently make any repetitive movements with his/her feet or legs? For example, does he/she repetitively tap his/her feet, swing his/her legs or jump etc.? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements-

TIC-LIKE BEHAVIOURS

13. Does [name] currently have any particular words, noises etc. that he/she uses repeatedly? For example, does he/she repeat single words or nonsense words? Or other sounds such as hums, growls, clicking of the tongue, or clearing the throat? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe words, noises etc-

14. Does [name] currently make any repetitive head or neck movements? For example, does he/she nod or shake his/her head repetitively, or show any jerky tic-like movements? Or does he/she show other repetitive movements of the face muscles such as raising eyebrows or moving the muscles around the lips? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements-

15. Does [name] currently make any repetitive eye movements? For example, does he/she blink, roll or move his/her eyes repeatedly? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements-

16. Does [name] currently make any repetitive mouth and/or tongue movements? For example, does he/she grind his/her teeth, smack his/her lips, or make sucking movements repetitively? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements-

SELF-INJURIOUS BEHAVIOUR

17. Does [name] currently bang his/her head? Does he/she do this repeatedly? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe what head is banged against-

18. Does [name] currently ever injure himself/herself?
For example, does he/she bite, scratch, knock or pick at himself/herself? Does he/she do this repeatedly? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable

GENERAL

19. Has [name] always shown one or more of these behaviours, or have there been periods when he/she hasn't shown any repetitive behaviours for 3 months or more?
(1) at times has shown no repetitive behaviours for 3 months or more
(2) has always shown one or more behaviours
(3) has always shown at least one repetitive activity
(9) no information
(99) not applicable- items 1-18 all received a (0) rating
Details of time periods-

COMPULSIVE BEHAVIOURS

20. Cleaning/Washing Compulsions: Does [name] currently wash his/her hands, shower, bathe or groom himself/herself, more than is necessary? Is he/she overly concerned about dirt and contamination, or does he/she take measures to prevent contact with contaminants? Does s/he clean household items or other objects excessively? Has s/he in the past?
(0) no obsessive or compulsive behaviour of this type- washes hands at appropriate times (e.g. at meal times, after using the toilet), but does not consistently wash at inappropriate times. Is not unusually concerned about dirt or contamination.
(1) suspicious or mild obsessive or compulsive behaviour- washes hands 10-14 times a day
(2) clear obsessive or compulsive behaviour- washes hands 15+ times a day, or is preoccupied with worry about dirt and contamination
(9) no information
b) IS THIS WASHING BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e.
is it always carried out in the same order or in the same way)
(1) no (2) frequently (3) always (9) no information (99) not applicable
Describe cleaning/washing behaviour-

21. Checking Compulsions: Does [name] currently often check repeatedly that things are switched off, locked up or put away etc? Does s/he check other things, such as that nothing bad has happened, or that s/he did not make a mistake? Does he/she check these things more often than is necessary? Has s/he in the past?
(0) no obsessive or compulsive behaviour of this type- may check that an item has been switched off etc. once, but is not preoccupied with whether or not items have been checked
(1) suspicious or mild obsessive or compulsive behaviour- checks that one or more items have been turned off etc. on two separate occasions on a daily basis
(2) clear obsessive or compulsive behaviour- checks that one or more items have been switched off etc. on at least three separate occasions on a daily basis, or is preoccupied with items being safely handled in order to avert disaster
(9) no information
b) IS THIS CHECKING BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e. is it always carried out in the same order or in the same way)
(1) no (2) frequently (3) always (9) no information (99) not applicable
Describe checking and items checked-

22. Repeating Rituals: Does [name] currently perform any rituals where s/he has to keep repeating a certain action? For example, does s/he reread or rewrite excessively, or repeat routine activities such as going in and out of a door or getting up and down from a chair?
(0) no obsessive or compulsive behaviour of this type
(1) suspicious or mild obsessive or compulsive behaviour- performs repeating routine 3-10 times a day
(2) clear obsessive or compulsive behaviour- performs routine 10+ times a day
(9) no information
b) IS THIS REPEATING BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e.
is it always carried out in the same order or in the same way)
(1) no (2) frequently (3) always (9) no information (99) not applicable
Describe repeating ritual-

23. Counting Compulsions: Does [name] currently count objects repeatedly? Does s/he perform any rituals which involve counting? Has s/he in the past?
(0) no obsessive or compulsive behaviour of this type- may count money or other objects but not excessively or inappropriately
(1) suspicious or mild obsessive or compulsive behaviour- counts objects inappropriately less than 5 times a day
(2) clear obsessive or compulsive behaviour- counts objects more than 5 times per day
(9) no information
b) IS THIS COUNTING BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e. is it always carried out in the same order or in the same way)
(1) no (2) frequently (3) always (9) no information (99) not applicable
Describe counting behaviour-

24. Does [name] currently engage in any other compulsive behaviours? For example, does s/he write lists excessively? Does s/he repeatedly touch, tap or rub certain things? Any other superstitious behaviours? Has s/he in the past?
(0) no obsessive or compulsive behaviour of this type
(1) suspicious or mild obsessive or compulsive behaviour
(2) clear obsessive or compulsive behaviour
(9) no information
b) IS THIS COMPULSIVE BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e. is it always carried out in the same order or in the same way)
(1) no (2) frequently (3) always (9) no information (99) not applicable
Describe compulsive behaviour-

25. How much time do you think s/he spends on these compulsive behaviours per day?
(1) 0-1 hrs/day (2) 1-3 hrs/day (3) 3-8 hrs/day (4) >8 hrs/day (9) no information (99) not applicable

OBJECT ATTACHMENTS

26. Is [name] currently attached to any particular objects? For example, does he/she carry a teddy, a blanket or a stick etc. around with him/her? Does he/she want to sleep with this item? Does he/she become distressed if it is lost or forgotten?
Has s/he in the past? [In order to be considered an object attachment the individual must insist on sleeping with the item, or must carry it with him/her at specific times or in specific situations (e.g. whenever out of the house). The individual should also be concerned or distressed if the item is mislaid.]
(0) no attachments to objects
(1) attachments to objects which are commonly used as comforters (e.g. teddies, blankets etc.)
(2) attachments to unusual objects or junk materials (e.g. sticks, tins etc.). Rate here even if unusual attachments coexist with more usual object attachments
(9) no information
b) [If score on the previous item is (1) or (2), then also complete the following item]
(1) insists that the object must be in bed every night, but only when the individual is at home
(2) insists that the object must be in bed every night whether the individual is at home or away
(3) insists that the object must be with the individual at times other than when tired or sleeping
(9) no information
Describe objects-

G. INSISTENCE ON SAMENESS OF ENVIRONMENT

27. Does [name] currently insist on things about the house staying the same? For example, does he/she insist on furniture staying in the same place, or curtains being open or closed etc.?
(0) no fixed insistence on furniture, ornaments etc. remaining in the same places simply because he/she doesn't like things to be moved
(1) any relatively inflexible example which does not impact on other family members daily, as it primarily concerns items that belong to, or are used by, the individual only, or if this is not the case, he/she is able to tolerate alterations when others are present
(2) any pervasive example which is very rigid and impacts on the other members of the family on a daily basis (e.g. having to have lounge furniture organised in a particular way, or insisting that everybody's bedroom door must be closed etc, at all times)
(9) no information
Describe items and location-

28.
Does [name] currently insist on other items being put out, kept or stored in the same way? For example, does he/she like ornaments, toys or cassette tapes kept in the same places or positions? Has s/he in the past?
(0) no fixed insistence that items must be stored in the same places or the same way
(1) any example which does not interfere with other family members on a daily basis, as it primarily concerns the individual's own personal possessions, although it may be very inflexible (e.g. the arrangement of personal toiletries- he/she will not tolerate others moving them, even when cleaning the bathroom)
(2) any pervasive example which is very rigid and impacts on the other members of the family daily (e.g. insisting that a family video collection must always be stored in precisely the same way)
(9) no information
Describe items and location-

29. Is there anything else that [name] currently likes to remain just so? Has s/he in the past?
(0) no
(1) yes, any relatively inflexible example which is consistently observed by the individual, but has only a limited impact on the family (i.e. does not impact on the remainder of the family on a daily basis)
(2) yes, any pervasive example which is highly rigid and impacts on the other family members on a daily basis
(9) no information
Describe-

30. Does [name] currently play the same music, game or video, or read the same book repeatedly? Has s/he in the past?
(0) does not have any music, games, videos or books that s/he uses more than normal
(1) plays the same music, game or video or reads the same book (excepting continuing on with a novel) at least once a day
(2) plays the same music, game or video or reads the same book (excepting continuing on with a novel) at least three times a day and prevention or interruption of this activity causes a marked negative reaction
(9) no information
Describe the book, game or music-

31. Does [name] currently insist on using the same objects or items in any other situation?
For example, does he/she insist on using the same chair, plate, bed linen or door? Has s/he in the past? [Do not rate insistence on using the same mug or cup]
(0) no fixed insistence on always using precisely the same items (excepting a mug or cup) in any situation- will generally use any item that he/she is given or the first item that is available
(1) any example which is unusually restricted or fixed, but can generally be modified if it is important to do so (e.g. if the item is in the dishwasher, if someone else is using it)
(2) any pervasive example which is very rigid and leads to regular confrontations with others, or requires extra effort on the part of the individual or others (e.g. insisting on using a certain plate etc. even if it is dirty or someone else is using it), on a regular or daily basis
(9) no information
Describe item and situation-

32. Does [name] currently insist on wearing the same clothes or refuse to wear new clothes? Has s/he in the past?
(0) no insistence on wearing the same items of clothes- wears a range of different items and is keen to have new clothes
(1) insists on wearing the same item of clothing (e.g. jumper, trousers) in most situations, including frequently when it is inappropriate. Or refuses, or shows marked reticence, to wear new clothes. Will wear alternative clothing for at least certain, or special, occasions if prompted.
(2) insists on wearing the same (or substantially the same) outfit most or all of the time so that it is difficult for this outfit to be washed, and any deviation from this usual outfit causes an extreme negative reaction
(9) no information
Describe clothing-

33. Does [name] currently insist that certain items of clothing must always be worn, or worn in the same situation or in the same way? For example, does he/she insist on always wearing a vest, or wearing a hat to the shops, or always buttoning a shirt to the collar? Has s/he in the past?
(0) no unusually fixed ways of wearing clothes- will modify clothing and the way in which it is worn etc. as appropriate (e.g. will take off coat if hot, or if wet or dirty etc.)
(1) consistently dresses in the same fixed manner, or wears the same clothes in the same situations, in a manner that is odd or unusual, but can modify this behaviour if it is necessary or important to do so (e.g. generally wears tops done up and with the hood up, but will undo this if it is hot etc.)
(2) has very fixed ways of wearing clothes, or always wears the same clothes in the same situations, and this is adhered to strictly even when it is very odd and impractical (e.g. always wears hat to the shops, always wears a coat outside irrespective of the weather)
(9) no information
Describe clothing and situation-

34. Does [name] currently insist on eating the same foods, or a very small range of foods, at every meal? Has s/he in the past?
(0) eats a range of foods, although there may be a limited number of foods that he/she doesn't like to eat
(1) eats a limited range of foods and it is regularly the case that he/she will eat a different meal to the rest of the family- will not try new foods
(2) eats fewer than five separate food types
(9) no information
Describe foods-

35. How does [name] respond if you introduce him/her to a new activity or place? Would he/she have any objection to trying something new and different? Would he/she be anxious? [Rate usual, or most common, reaction]
(0) participates/will visit without hesitation
(1) will be persuaded, but shows some reticence because the activity/place is new or different
(2) refuses to take part in anything new or different
(3) shows a high degree of stereotyped behaviour when trying something new
(9) no information
Describe reaction-

RIGID ADHERENCE TO ROUTINES AND RITUALS

36. Are there any aspects of routine that [name] currently insists must remain the same?
For example, does he/she insist on always bathing before breakfast, on going to the shops every afternoon, or on watching a video after every meal? Has s/he in the past?
(0) has no rigid routine- preferred routines can be modified if it is necessary or appropriate to do so
(1) has a set routine which is inflexible and consistently impacts on other family members because he/she is unable to take "shortcuts" in his/her routine (e.g. the individual is unable to finish early in the bathroom if someone needs it, or take their walk on another day if a family outing is planned etc.)
(2) has a very fixed or inflexible routine which involves not just the self but also other family members and so has a substantial impact on the family (e.g. expects everyone to go swimming on a Saturday morning and is upset if this routine is violated)
(9) no information
Describe routine-

37. Does [name] currently make rituals out of everyday activities such as eating, dressing, getting in the car, walking up stairs etc.? Are these activities always carried out in exactly the same way? Has s/he in the past?
(0) has no regular rituals or set ways of doing things- preferred ways of doing things can be modified if it is appropriate to do so (e.g. may always put socks on first, but if no clean socks are available will put on other items of clothing first)
(1) has set rituals, which are inflexible and impact on other family members to some degree because the individual is unable to modify these rituals when it is important to do so. These rituals concern the individual only and are not excessively time-consuming. They do not incorporate unnecessary or redundant steps and actions.
(2) has very elaborate and inflexible rituals which may, or may not, involve others, but take considerable time (i.e. take significantly more time than the same activity would take in non-ritualised fashion) and cannot be abbreviated.
These rituals affect all family members because of the large amounts of time taken up with these rituals on a daily basis (e.g. having to check that everybody's seat belt is fastened and that the glove box contains certain items before setting out on any car journey, no matter how short).
(9) no information
Describe activity and precise ritual-

38. Does [name] currently have any rituals that are linked to particular occasions or places? For example, does he/she have specific rituals for the supermarket, the Doctor's surgery or a relative's house? Has s/he in the past?
(0) has no fixed rituals for particular places or occasions- preferred ways of doing things can be modified if it is appropriate to do so (e.g. if in a hurry, if the weather is not appropriate etc.)
(1) has certain fixed activities or rituals that he/she insists on at particular occasions or particular places. These rituals concern the individual only and have minimal impact on the remainder of the family (e.g. always rides on the swings in the same fixed order, or always orders the same food in a cafe)
(2) has one or more very fixed and inflexible rituals which have a severe impact on the family as they are highly intrusive or involve other family members (e.g. must always enter certain shops in a certain order when shopping)
(9) no information
Describe ritual and occasion or place-

39. Does [name] currently insist on moving or travelling by the same route? For example, does he/she insist on taking the same route when moving about the house, going for a walk, or travelling in the car? Has s/he in the past?
(0) has no set route for moving or travelling- preferred ways of doing things can be modified if it is appropriate to do so
(1) has a set route that he/she will always take to one or more specific locations if on his/her own or if given the choice. Finds it very difficult to accept deviations from this, but will accept an alternative if there is a good reason for doing so.
(2) will take only one route to at least one specific destination and will not tolerate any deviation from this, no matter what the need or justification for the change is.
(9) no information
Describe mode of travelling and journey-

40. Is there anything else that [name] currently likes to be done in a certain way, or at a certain time? Has s/he in the past?
(0) no
(1) yes, any relatively inflexible example which is consistently observed by the individual, but has only a limited impact on the family (i.e. does not impact on the remainder of the family on a daily basis)
(2) yes, any pervasive example which is highly rigid and impacts on the other family members on a daily basis
(9) no information
Describe-

41. Does [name] currently incorporate any unnecessary, or unusual, behaviours as part of any rituals or routines? For example, does he/she tap the plate after every mouthful when eating, or touch specific objects when walking through a room? Has s/he in the past?
(0) no unnecessary, idiosyncratic behaviours incorporated in routines
(1) yes, any relatively inflexible example which is consistently observed by the individual, but has a limited impact on the family- he/she can refrain from the behaviour when asked to do so for at least 10 minutes
(2) yes, any pervasive and unusual example, which is very rigid and is observed by the individual at all times. He/she is unable (or unwilling) to suppress this behaviour.
(9) no information
Describe ritual or routine and unnecessary activity-

REPETITIVE USE OF LANGUAGE

42. Does [name] currently mimic others or repeat speech? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
c) DOES [NAME] (A) REPEAT WHAT IS SAID IMMEDIATELY AFTER IT IS SAID OR, (B) REPEAT WHAT HAS BEEN SAID SOME TIME AFTER IT HAS BEEN SAID?
(1) A (2) B (3) combination of A and B (9) no information (99) not applicable
Describe the type of speech repeated-

Items 43-45 inclusive specifically address spontaneous language and exclude echolalia, or language that is copied from other sources. If [name] does not have at least good phrase speech, skip items 43-45 (and score (99), not applicable).

43. Does [name] currently say the same things, or sing the same songs, repeatedly? For example, does [name] recite the same thing over and over, or have stock phrases that he/she often uses? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
c) DOES [NAME] (A) SAY THE SAME THING OVER AND OVER AGAIN AT ONE POINT IN TIME OR, (B) SAY THE SAME THING AT DIFFERENT TIMES?
(1) A (2) B (3) combination of A and B (9) no information (99) not applicable
Describe sentences or songs-

44. Does [name] currently ask the same questions repeatedly? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS? b) HOW LONG DOES IT LAST?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
c) DOES [NAME] (A) SAY THE SAME THING OVER AND OVER AGAIN AT ONE POINT IN TIME OR, (B) SAY THE SAME THING AT DIFFERENT TIMES?
(1) A (2) B (3) combination of A and B (9) no information (99) not applicable
d) DOES HE/SHE DEMAND THAT OTHERS ALWAYS GIVE THE SAME ANSWERS?
(1) no (2) frequently (3) always (9) no information (99) not applicable
Describe questions-

45. Does [name] currently talk about the same topic over and over again? Has s/he in the past? [Rate only repeated attempts to raise the same topic in conversation. These attempts may incorporate some echoed speech, but must also include spontaneous speech and attempts to talk around the topic.]
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST?
(1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe topic and whether it’s based on fantasy or reality-

CIRCUMSCRIBED INTERESTS

46. Does [name] have any unusual preoccupations? Does he/she regularly talk about and seek out a particular type of object? Has s/he ever?
(0) no preoccupations, or preoccupation with objects that are common in their age group and not to the exclusion of other interests or activities
(1) preoccupation with items common in their age group but to such a degree that it significantly limits involvement in other interests or activities
(2) preoccupation with unusual items
(9) no information
How long has [name] been preoccupied with [interest]?
Please describe the preoccupation-

47a. Does [name] have any particular interests?
Is there anything unusual about this interest? Would you describe this interest as particularly keen or obsessional? Does he/she pursue this interest to the exclusion of other interests and hobbies? What other interests and hobbies does [name] have? Has s/he ever had any unusual or obsessional interests?
(0) usual topic of hobby or interest (e.g. computers or football teams)- casual to keen interest
(1) usual topic of hobby or interest (e.g. computers or football teams)- abnormally keen or obsessional interest OR mildly unusual topic of hobby or interest (e.g. road maps or record covers)- casual to keen interest
(2) mildly unusual topic of hobby or interest (e.g. road maps or record covers)- abnormally keen or obsessional interest
(3) highly unusual topic of hobby or interest (e.g. DIY tools or street lamps)- abnormally keen or obsessional interest
(9) no information
How long has [name] had this particular interest(s)?
Please describe the interest(s)-

47b. Summary Rating
(0) has a varied pattern of interests, which are pursued meaningfully
(1) one or more abnormally keen or highly circumscribed interests, but also more usual interests which are pursued meaningfully
(2) has only obsessional interests which are either pursued to an abnormally keen extent, or are highly circumscribed in nature
(3) has no particular interests or hobbies that he/she will pursue spontaneously (DO NOT RATE WATCHING TELEVISION)
(9) no information

47c.
How is this interest or hobby manifested?
(0) usual manifestation of interest- collecting, sorting, reading, playing/using relevant materials
(1) mildly unusual or idiosyncratic manifestation of interest- odd or unusual activity
(2) highly unusual or idiosyncratic manifestation of interest- highly stereotyped or ritualised activity
(9) not applicable- item received a (0) rating
Please describe how it is manifest-

GENERAL ITEMS

Skip items 48-52 inclusive (and score (99), not applicable), if all interview items have received a (0) rating.

48a. Does [name] ever make any attempt to cover up, hide or change any of the behaviours you have described? For example, does he/she leave the room to engage in repetitive activities, or does he/she suppress them if he/she knows that other people are watching? Has s/he in the past?
(0) never
(1) occasionally- but not at specific or predictable times
(2) most often- but not at specific or predictable times
(3) at all times
(4) only, or mainly, when calm and relaxed
(5) only, or mainly, at school
(6) only, or mainly, with new people or in social situations (excluding solely school)
(7) only, or mainly, when likely to be reprimanded
(8) at other times
(9) no information
(99) not applicable- all interview items received a (0) rating
Describe the way in which the behaviour has been covered up-

48b. Which behaviours? [Rate the category that the behaviour that s/he attempts to cover belongs to.]
(1) repetitive movements – (a) stereotypies (b) repetitive use of objects (c) tic like movements (d) self injurious behaviours
(2) object attachments
(3) insistence on sameness of environment
(4) insistence on sameness of activity or item
(5) adherence to routine and rituals
(6) repetitive use of language
(7) circumscribed interests
(8) compulsive behaviours
(9) no information
(99) not applicable- all interview items received a (0) rating
Briefly describe the behaviour-

49.
Have you, or anyone else, ever made any attempt to reduce any of the behaviours shown by [name] that we have talked about?
(1) no
(2) yes, at different times
(3) yes, continually and consistently
(9) no information
(99) not applicable - all interview items received a (0) rating
50. What was the earliest repetitive activity that you remember [name] showing? How old was he/she when this began? [Rate the category that this activity belongs to and the age at which it began.]
(1) repetitive movements -
    (a) stereotypies
    (b) repetitive use of objects
    (c) tic like movements
    (d) self injurious behaviours
(2) object attachments
(3) insistence on sameness of environment
(4) insistence on sameness of activity or item
(5) adherence to routine and rituals
(6) repetitive use of language
(7) circumscribed interests
(8) compulsive behaviours
(9) no information
(99) not applicable- all interview items received a (0) rating

[The following two items apply only to repetitive activities which have been evident during the last three months.]
51a. Of the repetitive behaviours and rituals and special interests that we have discussed, which one would you say is the most marked or the most noticeable? [Rate the category that this activity belongs to.]
(1) repetitive movements -
    (a) stereotypies
    (b) repetitive use of objects
    (c) tic like movements
    (d) self injurious behaviours
(2) object attachments
(3) insistence on sameness of environment
(4) insistence on sameness of activity or item
(5) adherence to routine and rituals
(6) repetitive use of language
(7) circumscribed interests
(8) compulsive behaviours
(9) no information
(99) not applicable- all interview items received a (0) rating
b. Which would come second?
c. Which would you think comes third?
52a. Of all of the repetitive behaviours and rituals and special interests etc. that we have talked about, which one would you say causes the greatest problem in day-to-day life?
(1) repetitive movements -
    (a) stereotypies
    (b) repetitive use of objects
    (c) tic like movements
    (d) self injurious behaviours
(2) object attachments
(3) insistence on sameness of environment
(4) insistence on sameness of activity or item
(5) adherence to routine and rituals
(6) repetitive use of language
(7) circumscribed interests
(8) compulsive behaviours
(9) no information
(99) not applicable- all interview items received a (0) rating
b. Which would you think comes second?
c. Which would you think comes third?

APPENDIX B
Correlations between EF task variables in the control group (Study One)

Table B1. Raw correlations between EF variables in the control group (N.B.: Intra-domain correlations are depicted in bold)
   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1
2 .30*
3 .05 .22
4 .11 .17 .04
5 -.07 -.20 -.42* -.26
6 .11 -.06 .18 .10 -.03
7 .02 -.19 -.22 -.15 .79** .59**
8 .01 .05 .06 .04 -.06 -.07 -.09
9 -.04 .10 .24 .16 .15 -.11 .06 -.07
10 .14 .14 .28 -.02 -.08 -.22 -.19 .05 .45**
11 -.40** -.16 -.25 .19 .04 -.18 -.09 -.33 -.14 -.34*
12 -.11 -.09 -.17 -.16 .16 .12 .20 -.49** -.14 -.19 .29*
13 .35* .15 -.02 -.28 -.02 .09 .04 .04 -.05 .11 -.51** .18
14 -.44** -.32* .20 -.01 -.24 .12 -.12 -.31 -.24 -.14 .47** .45** -.25
15 .29* .17 -.24 -.32 .26 -.04 .19 -.11 .02 .27 -.36* .33* .48** -.45**
16 -.25 -.14 -.42* .12 .08 .11 .13 .07 -.03 -.13 .08 -.02 -.13 .18 -.12
17 -.23 -.40** -.40* -.08 .14 -.01 .10 -.23 -.05 -.22 .30* .24 -.21 .50** -.28 .51**
18 .05 .27 .27 .13 -.73** -.34 -.78** .15 .03 .34* -.02 -.13 -.01 .04 -.07 -.12 -.10
19 .11 .15 .13 -.34 .42* -.38* .02 -.11 .13 .07 .02 -.11 -.09 -.13 -.10 a -.53** -.28
1 = ToL adjusted extra moves score; 2 = ToL rule violations; 3 = IDED set-shifting task Perseveration Condition EDS stage errors; 4 = IDED set-shifting task Learned Irrelevance Condition EDS stage errors; 5 = RIL task inhibition error difference score; 6 = RIL task load error difference score; 7 = RIL task inhibition + load error difference score; 8 =
RIL task shape error score; 9 = Opposite Worlds error difference score; 10 = Opposite Worlds time difference score; 11 = Relational Complexity total score; 12 = Pattern Meanings correct responses; 13 = Pattern Meanings sum of errors; 14 = Uses of Objects correct responses; 15 = Uses of Objects sum of errors; 16 = Stamps task complexity score; 17 = Stamps task originality score; 18 = Stamps task restriction score; 19 = Stamps task rule adherence score.
* p < .05; ** p < .01. All tests were two-tailed.
a = Correlation could not be computed because one of the variables was constant.
Note: The RIL task RT difference scores are not included in this table for the sake of brevity.

Table B2. Partial correlations between EF variables in the control group (N.B.: Intra-domain correlations are depicted in bold)
   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1
2 .14
3 .03 .18
4 .15 .21 .07
5 -.11 -.23 -.50** -.22
6 .11 -.07 .19 .09 .0
7 -.02 -.22 -.26 -.10 .76** .65**
8 -.28 -.15 .02 .03 .03 -.10 -.04
9 .01 .10 .22 .17 .21 -.10 .10 -.09
10 .04 .03 .24 .06 -.27 -.22 -.35 -.01 .50**
11 -.01 .21 -.26 .24 .08 -.24 -.09 -.04 -.23 -.23
12 -.02 .01 -.14 -.17 .16 .12 .20 -.44* -.12 -.14 .21
13 .05 -.07 -.08 -.31 -.07 .09 .0 -.24 -.03 -.04 -.18 .36*
14 -.20 -.11 .33 -.03 -.37* .18 -.17 -.07 -.31 .0 .06 .41** .10
15 .07 .0 -.32 -.34 .32 -.06 .21 -.36* .03 .20 -.07 .49** .31* -.26
16 -.19 -.10 -.43* .10 .18 .11 .21 .12 -.06 -.05 -.05 -.04 -.03 .12 -.05
17 .0 .0 -.38* -.11 .16 -.01 .12 -.03 -.04 -.11 -.09 .14 .05 .31* -.11 .53**
18 .02 .02 .26 .11 -.74** -.36 -.79** .07 .0 .41* .11 -.08 -.07 .16 -.13 -.15 -.01
19 .09 .09 .11 -.31 .37* -.39* -.06 -.12 .15 -.01 .11 -.11 -.16 -.13 -.15 a -.51** -.30
1 = ToL adjusted extra moves score; 2 = ToL rule violations; 3 = IDED set-shifting task Perseveration Condition EDS stage errors; 4 = IDED set-shifting task Learned Irrelevance Condition EDS stage errors; 5 = RIL task inhibition error difference score; 6 = RIL task load error difference score; 7
= RIL task inhibition + load error difference score; 8 = RIL task shape error score; 9 = Opposite Worlds error difference score; 10 = Opposite Worlds time difference score; 11 = Relational Complexity total score; 12 = Pattern Meanings correct responses; 13 = Pattern Meanings sum of errors; 14 = Uses of Objects correct responses; 15 = Uses of Objects sum of errors; 16 = Stamps task complexity score; 17 = Stamps task originality score; 18 = Stamps task restriction score; 19 = Stamps task rule adherence score.
* p < .05; ** p < .01. All tests were two-tailed.
a = Correlation could not be computed because one of the variables was constant.
Note: The RIL task RT difference scores are not included in this table for the sake of brevity.

APPENDIX C
Separate ToM-EF correlations for young and old age subgroups within the control sample (Study One)

Table C1. Raw and partial correlations between ToM and EF variables within “young” control participants (aged 5-8 years)
False belief task 2nd-order Simple 1st-order
EF task
ToL (n = 25):
  Adj. extra move score -.34 -.46* -.48* -.39
  Rule violations -.11 -.02 -.60** -.55**
IDED Set-shifting task condition (n = 13):
  Perseveration EDS stage errors -.68* -.90*** -.64* -.84** a
  Learned Irrelevance EDS stage errors -.19 -.08 a
RIL task (n = 12):
  Error difference scores:
    Inhibition .06 .02 a
    Load -.64* -.80* -.51 a
    Inhibition + load -.69* -.66 -.57 a
  RT difference scores:
    Inhibition .18 -.45 a
    Load .21 .47 a
    Inhibition + load .46 .10 a
  Shape error score -.31 -.12 a
Opposite Worlds (n = 14):
  Error diff. score -.24 -.11 a
  Time diff.
score -.11 -.14 a
Relational Complexity (n = 25):
  Total score .14 .38 .40* .13
Pattern Meanings (n = 25):
  Correct responses -.19 .39 .17
  Sum of errors -.56** -.58** -.13 -.14
Uses of Objects (n = 25):
  Correct responses .08 .46* .46* .24 .25
  Sum of errors -.48* -.39 -.13 -.21
Stamps task (n = 25):
  Complexity score -.04 .36 .35
  Originality score .05 .41* .48* .25 .36
  Restriction score a a a
  Rule adherence score .13 -.01 .17
* p < .05; ** p < .01; *** p < .001.
Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed.
a = No correlation could be calculated as one of the variables was constant.

Table C2. Raw and partial correlations between ToM and EF variables within “old” control participants (aged 9-18 years)
False belief task 2nd-order Simple 1st-order
EF task
ToL (n = 21):
  Adj. extra move score -.20 -.15 -.20
  Rule violations .09 .13 .09
IDED Set-shifting task condition (n = 21):
  Perseveration EDS stage errors -.28 -.08 -.28
  Learned Irrelevance EDS stage errors -.18 -.26 -.18
RIL task (n = 21):
  Error difference scores:
    Inhibition .29 .07 .29
    Load -.32 -.05 -.32
    Inhibition + load .10 .03 .10
  RT difference scores:
    Inhibition -.16 -.26 -.16
    Load .44* .30 .44* .54* .54*
    Inhibition + load .21 .01 .21
  Shape error score -.31 -.20 -.31
Opposite Worlds (n = 21):
  Error diff. score .18 -.17 .18
  Time diff. score .02 .02 .02
Relational Complexity (n = 21):
  Total score .43 .08 .43
Pattern Meanings (n = 21):
  Correct responses .20 .14 .20
  Sum of errors -.08 .12 -.08
Uses of Objects (n = 21):
  Correct responses .10 .27 .10
  Sum of errors -.02 -.17 -.02
Stamps task (n = 20):
  Complexity score .07 .10 .07
  Originality score .59** .43 .26 .59** .43
  Restriction score .05 .08 .05
  Rule adherence score .06 .08 .06
* p < .05; ** p < .01; *** p < .001.
Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed.

APPENDIX D
Separate group comparisons for young and old age subgroups on EF tasks (Study One)

Table D1.
Group comparisons for “young” (5-8 years) and “old” (9-18 years) participants on inhibition, planning, and generativity tasks

                                      N                Mean (SD)
Age subgroup / measure                ASD  Control     ASD             Control         t      p
Young participants
  Inhibition:
    Opposite Worlds:
      Error difference score          10   14          2.60 (2.59)     0.43 (1.87)     2.39   .03*
      Time difference score           10   14          15.67 (11.99)   7.01 (4.55)     2.48   .02*
  Planning:
    ToL:
      Adjusted extra moves score      20   25          29.80 (8.17)    25.36 (7.19)    1.94   .06
  Generativity:
    Uses of Objects:
      Correct responses               20   25          16.75 (8.28)    22.00 (9.51)    1.95   .06
    Stamps task:
      Complexity score                20   25          18.25 (3.48)    20.12 (3.15)    1.89   .07
      Originality score               20   25          2.50 (2.16)     3.96 (2.99)     1.83   .07
Old participants
  Inhibition:
    Opposite Worlds:
      Error difference score          19   22          0.89 (1.76)     0.86 (1.13)     .07    .95
      Time difference score           19   22          8.73 (6.09)     6.22 (4.03)     1.57   .12
  Planning:
    ToL:
      Adjusted extra moves score      25   22          23.52 (6.33)    19.32 (6.24)    2.29   .03*
  Generativity:
    Uses of Objects:
      Correct responses               26   23          20.85 (9.26)    31.22 (6.93)    4.39   .00***
    Stamps task:
      Complexity score                21   21          19.00 (2.55)    20.71 (2.80)    2.08   .04*
      Originality score               21   21          3.81 (2.69)     5.76 (2.23)     2.56   .01*

Note: Only continuous variables on which significant overall group differences were found are included. This table is intended to demonstrate that the EF components which are impaired in individuals with ASDs (in comparison with age-matched controls) change with development.
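
The group comparisons in Table D1 can be re-derived from the reported Ns, means and SDs alone. The following Python sketch recomputes an independent-samples t statistic from summary statistics. It assumes a pooled (equal-variance) t-test, which the table itself does not state, and the function name `pooled_t` is purely illustrative and not part of the original analysis.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-samples t statistic from summary statistics.

    Assumes equal population variances (pooled-variance t-test);
    this is an assumption, not something stated in Table D1.
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their df.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two group means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Young subgroup, Opposite Worlds error difference score (ASD vs. control).
t, df = pooled_t(10, 2.60, 2.59, 14, 0.43, 1.87)
print(f"t({df}) = {t:.2f}")  # t(22) = 2.39
```

Under that assumption, the first two rows for the young subgroup give values close to the reported 2.39 and 2.48. The table appears to report the magnitude of t irrespective of the direction of the group difference, so a negative result from this function corresponds to the unsigned value in the table.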