American Educational Research Journal
June 2015, Vol. 52, No. 3, pp. 582–617
DOI: 10.3102/0002831215573531
© 2015 AERA. http://aerj.aera.net

Assessing the Cognitive Demands of a Century of Reading Curricula: An Analysis of Reading Text and Comprehension Tasks From 1910 to 2000

Robert J. Stevens
Xiaofei Lu
David P. Baker
Pennsylvania State University

Melissa N. Ray
Northern Illinois University

Sarah A. Eckert
Agnes Irwin School

David A. Gamson
Pennsylvania State University

ROBERT J. STEVENS, PhD, is a professor and chair of educational psychology in the Department of Educational Psychology, Counseling, and Special Education at the Pennsylvania State University, 141 CEDAR Building, University Park, PA 16802; e-mail: [email protected]. He does research in reading comprehension, effective literacy instruction, and teacher effectiveness, with particular interest in the schooling of at-risk students.

XIAOFEI LU, PhD, is Gil Watz Early Career Professor in Language and Linguistics and associate professor of applied linguistics at the Pennsylvania State University. His research interests are primarily in corpus linguistics, computational linguistics, and intelligent computer-assisted language learning.

DAVID P. BAKER is professor of education and sociology in the Departments of Education Policy Studies and Sociology at the Pennsylvania State University. He does research on the impact of education on social institutions, including health and demography, labor markets, cognitive abilities in populations, and political mobilization.

MELISSA N. RAY, PhD, is a research scientist at the Center for the Interdisciplinary Study of Language and Literacy and Department of Psychology at Northern Illinois University. Her research is focused on individual differences in reading comprehension, the relationship between text structure and comprehension, and the role of literacy in college readiness.

SARAH A. ECKERT, PhD, teaches history at the Agnes Irwin School in Bryn Mawr, Pennsylvania.

DAVID A. GAMSON, PhD, is an assistant professor of education in the Department of Educational Policy Studies at the Pennsylvania State University. His research focuses on the history of American education.

This research investigated the cognitive demands of reading curricula from 1910 to 2000. We considered both the nature of the text used and the comprehension tasks asked of students in determining the cognitive demands of the curricula. Contrary to the common assumption of a trend of simplification of the texts and comprehension tasks in third- and sixth-grade curricula, the results indicate that curricular complexity declined early in the century and leveled off over the middle decades but has notably increased since the 1970s, particularly for the third-grade curricula.

KEYWORDS: reading curricula, historical analyses, cognitive complexity

Learning to read and comprehend information presented in text is an important skill that is the foundation for students' future academic activities. Most of the instruction on reading comprehension occurs in elementary school, and reading instruction during the 20th century, and thus far in the 21st century, has been dominated by published reading curricula, known as basal reading series (Chall & Squires, 1991; Smith, 1965; Venezky, 1987). The goal of this project was to investigate the historical changes in the cognitive demands made on students by the reading curriculum materials.

Theoretical developments in learning resulting from the cognitive revolution of the 1970s and the expansion of research on reading have led to a better understanding of the complexity of reading comprehension. This conceptualization traces its theoretical roots to Bartlett's (1932) ideas of construction of meaning as an interaction of learner characteristics, learning activity, learning materials, and criterion tasks. Since that time, theory and basic research have helped develop our understanding of reading comprehension as a relatively complex process of meaning construction. Reading comprehension is currently conceived of as the interaction of the text, the reader (e.g., prior knowledge), the criterion task, and the broader sociocultural context (e.g., instructional context, sociocultural group) (Kintsch & Kintsch, 2005; Pearson & Fielding, 1991; Pressley, 2000; RAND Reading Study Group, 2002; Rapp, van den Broek, McMaster, Kendeou, & Espin, 2007; Spiro, 1980). Perhaps Kintsch and Kintsch (2005) conceptualized it most clearly as a "delicate interaction" of processes of taking meaning from text and integrating it with one's prior knowledge under the influence of a "multitude of contextual constraints" (p. 71).

Each of these four elements (learner, activities, materials, and criterion tasks) is multifaceted, adding to the complexity of the variables involved in reading comprehension. For example, considerations of the reader include variables like the reader's prior knowledge, cognitive abilities, skills or strategies, motivation, and attitudes (e.g., RAND Reading Study Group,
2002; Rapp et al., 2007; Spiro, 1980). All of these variables are important and play a role in a reader's construction of meaning. This project investigated the influence over time of two of these four components of reading comprehension, reading texts and reading comprehension tasks, in the context of reading instruction.

An investigation of reading curricula over the past century provides documentation of the evolution of reading materials commonly used in American schools. Reading curriculum materials are, and have been, a driving influence in reading instruction. As such, these materials influence the development of children's cognitive abilities. For example, in the mid-1960s nearly 95% of all teachers used basal reading series in their classrooms (Chall & Squires, 1991). While teachers may currently use a wider variety of reading materials (e.g., children's literature) in classroom instruction, the influence of commercial reading series remains strong. The National Assessment of Educational Progress (NAEP) reported in 1992 that 36% of fourth-grade teachers said they relied exclusively on basal readers, and 49% said they used basal readers supplemented with trade books for their reading instruction (Wade & Moje, 2000). More recent analyses of classroom reading instruction found a similar reliance on basal reading series (Brenner & Hiebert, 2010). These responses suggest basal reading curricula play a central role in reading instruction for a great majority of fourth-grade teachers.
Earlier historical research (e.g., Chall, 1967; Chall & Squires, 1991; Smith, 1965; Venezky, 1991) has documented changes in the goals and implementation of reading instruction as well as the evolution of reading curricula in response to the changing goals of reading instruction in schools. For example, early in the 20th century, reading instruction focused on elocution, and much of reading instruction involved reading orally (Smith, 1965; Venezky, 1987). Reading research by Huey (1908), Thorndike (1917), and others led to an important reconceptualization of reading as the process of translating printed words into meaning. As this conceptualization became more prevalent, it resulted in important changes in how teachers taught reading, what students read, and what activities students did in response to what they had read. The earlier historical accounts document similar theory- and research-based influences. Changes in the curriculum materials included increasing the amount of authentic text and informational text students read and asking comprehension questions requiring deep rather than surface-level comprehension of those texts.

However, most of the historical research in reading has been descriptive (e.g., Smith, 1965; Venezky, 1987). Only Chall, Conrad, and Harris (1977) and Chall and Squires (1991) report quantitative research studying changes in vocabulary difficulty and readability in basal reading series over segments of the 20th century. This study will expand on and attempt to quantify relationships described in the previous research by looking at text difficulty and the demands of comprehension tasks to get a more complete understanding of the systematic changes in reading curricula. This research will also contextualize these curricular changes in the theory of cognition (e.g., cognitive complexity and cognitive load of reading materials and comprehension tasks) as a way to understand the cognitive complexity of the materials and the changing cognitive demands curricular changes make on students.

Are Reading Texts Getting Simpler?

Some researchers have argued that "the materials teachers use for reading instruction today are considerably different from those that were used for reading instruction for nearly the entire twentieth century" (Martinez & McGee, 2000, p. 154). Some recent research suggests that there has been a decline in the difficulty of textbooks used in school that has resulted in a decline in verbal SAT scores (e.g., Hayes, Wolfer, & Wolfe, 1996) and a lack of preparation for later work and academic endeavors (Williamson, 2006). Hayes et al. (1996) conducted a comprehensive analysis of 800 textbooks (from a variety of content areas) published from 1919 to 1991 and used in elementary through high school. They grouped curricula into three large historical periods: 1919–1945, 1946–1962, and 1963–1991. Hayes et al. did not look at the readability of the text because "psycholinguists have many strong reservations about these measures" (p. 494), suggesting that readability estimates lacked accuracy, particularly when comparing text across time. Instead, they looked at the lexical difficulty (i.e., the level of difficulty of the vocabulary) of the text as their sole measure of text difficulty.
The overall conclusion was that "a major simplification of schoolbooks occurred after World War II and that, beyond third grade, current levels have never been lower in American history" (p. 505). Specifically, Hayes et al. found that (a) the most difficult basal readers were generally published prior to 1918, (b) significant simplification of lexical difficulty occurred after World War I, (c) after World War II readers again became simpler, and (d) today's sixth-, seventh-, and eighth-grade reading texts are simpler than fifth-grade basal readers prior to World War II. The authors decry this simplification as a leading cause of declining SAT verbal scores.

Other authors (e.g., Adams, 2010–2011) have also commented on the declining vocabulary difficulty in school texts. Using the findings of Hayes et al. (1996), Adams (2010–2011) links the vocabulary simplification to results of the National Assessment of Educational Progress. Noting that Hayes et al. found small increases in primary grade reading curricula vocabulary since 1963, Adams suggests that the vocabulary difficulty increase relates to increases in NAEP reading performance for 9- and 13-year-olds since 1971. However, she notes there is a simplification of vocabulary in curricula used in fourth grade since World War II. Further, Adams suggests that this vocabulary simplification may be the cause of flat NAEP reading performance for 17-year-old students.

Recent Analyses of Text in Reading Curricula

Previously published research (Gamson, Lu, & Eckert, 2013) from this project has shown that the mean readability level and mean lexical difficulty dropped through the early part of the century to a low point in the 1950s through 1970s. However, particularly for third-grade text, there have been increases in average readability and lexical difficulty over the past three decades. Third-grade lexical difficulty scores (LEX and word frequency band) for texts published in the 2000s are higher than at any other time in the century. Reading texts for sixth grade do not show a similar upturn in the latter half of the century, although the measures of lexical difficulty have increased in recent decades. Similarly, this research indicates that readability estimates have increased in recent decades for third-grade texts while remaining fairly stable for the past 50 years for sixth-grade texts. A goal of this article is to expand on these text analyses for third- and sixth-grade reading materials by examining additional measures and dimensions of their linguistic complexity. We also will consider the cognitive demands made of students by the comprehension tasks in the curricula.

Research on Reading Comprehension Tasks

As we discussed earlier, there is more to the complexity of reading than the text. What students are asked to do in comprehending and using the information in the text also affects the cognitive demands of reading instruction. There is a rich research literature that has investigated the content and use of reading curricula (e.g., Beck, McKeown, McCaslin, & Burkes, 1979; Brenner & Hiebert, 2010; Durkin, 1981; Smith, 1965). Our goal is to add to this literature with a historical investigation describing the changes in reading curricula over the 20th century. We will categorize reading comprehension tasks to track changes in the types of comprehension skills emphasized (e.g., literal vs.
inferential comprehension).

Project Goals

The goal of this project was to analyze and describe the changes in the complexity of reading curricula text and the comprehension tasks asked of students in those curricula. We looked at the curricula for each decade, rather than longer time periods used in previous research (e.g., Hayes et al., 1996), so our analyses would be more sensitive to changes that occur in reading materials that tend to have new copyrights, and hence content revisions, approximately every 5 years.

Method

Selection and Acquisition of the Reading Curricula

Our goal was to locate a sample of third- and sixth-grade basal reading textbooks from 1900 to 2008 that represented the textbook market. In order to collect the textbooks for this study, we began by casting a wide net in order to identify as many third- and sixth-grade reading books from 1900 to 2008 as possible. During the initial phase of collection we identified major publishers (e.g., Ginn & Co., Houghton Mifflin, MacMillan, and Scott Foresman) and attempted to follow the textbook series issued by each publisher throughout the century. After textbook publication was tracked for major publishers, we began a more purposeful search, using state adoption records from Texas and California as well as from a wide variety of professional and academic publications. Finally, we used Bowker's (1948–2010) Books in Print as a source to identify additional reading curricula and help fill in any gaps in our textbook sample.

Textbook Sample for Text Difficulty Analysis

The sample used for the text difficulty analysis included texts (i.e., individual reading units) from 187 third-grade and 71 sixth-grade reading textbooks published between 1905 and 2004. We selected third grade because it is a critical year in developing students' comprehension skills through reading instruction and is prior to the documented "fourth grade slump" (Chall, Jacobs, & Baldwin, 1990), when increased demands in text complexity and comprehension tasks overwhelm some students. As such, third-grade reading instruction is critical to preparing students for the increased emphasis on reading in content areas typical in middle and high school. We selected sixth-grade reading curricula as text to represent the changing emphasis of the role of reading in the curricula in middle school. During middle school, students encounter a wider variety of types of text with increasing complexity and increased demands on their comprehension. There is an important shift in reading with increased emphasis on students' learning the content presented in text, particularly in content areas like social studies and science. Investigating reading curricula used in sixth grade may provide evidence of the degree to which curricula provide instruction related to demands found in content area reading.

The textbooks were scanned and digitized using OmniPage, an optical character recognition program. Each text was manually checked for accuracy and labeled with relevant meta-information, such as publisher, print year, grade level, start page, end page, and genre. Table 1 summarizes the details of the data set for third- and sixth-grade texts. To ensure the reliability of the text difficulty measures applied, only texts with at least 100 words were included.
Altogether, there were 5,259 texts with a total of over 5 million words in the third-grade data set and 2,782 texts and over 4 million words in the sixth-grade data set. Textbooks were grouped into 10 time periods, each of which covers one decade, beginning 5 years prior to the beginning of a decade and ending 4 years after the turn of the decade.

Table 1
Details of the Third- and Sixth-Grade Data Sets in Text Analyses

Grade 3
Decade  Years      Books  Texts    Words      Mean Length    SD
1910    1905–1914    10     489      346,401       708.39    795.00
1920    1915–1924     7     298      216,670       727.08    673.34
1930    1925–1934    16     638      471,188       738.54    631.13
1940    1935–1944    10     308      322,083     1,045.72    615.41
1950    1945–1954    12     384      400,946     1,044.13    638.58
1960    1955–1964    13     362      368,677     1,018.44    683.29
1970    1965–1974    35     857      982,683     1,146.65    735.49
1980    1975–1984    33     803      831,931     1,036.03    711.29
1990    1985–1994    25     566      618,952     1,093.55    727.96
2000    1995–2004    26     554      489,526       883.62    685.59
All     1905–2004   187   5,259    5,049,057       960.08    716.38

Grade 6
Decade  Years      Books  Texts    Words      Mean Length    SD
1910    1905–1914     3     133      158,503     1,191.75  1,086.15
1920    1915–1924     5     252      444,822     1,765.17  2,755.04
1930    1925–1934     6     357      427,087     1,196.32  1,294.04
1940    1935–1944     6     272      486,467     1,788.48  1,491.61
1950    1945–1954     4     160      306,149     1,913.43  1,404.17
1960    1955–1964     8     301      537,425     1,785.47    981.39
1970    1965–1974     8     262      549,843     2,098.64  2,002.21
1980    1975–1984    14     456      798,805     1,751.77  2,340.00
1990    1985–1994    13     351      816,438     2,326.03  2,454.29
2000    1995–2004     4     238      413,919     1,739.16  1,664.93
All     1905–2004    71   2,782    4,939,458     1,775.51  1,940.12

Textbook Sample for Analysis of Comprehension Tasks

From the aforementioned sample of texts, we selected a subsample to use in the analysis of comprehension tasks. For each decade we selected three widely used basal reading series at both third- and sixth-grade levels. These series spanned the decades from the 1920s to the 2000s. While we were able to locate student readers for the decades prior to 1920 to use in the linguistic analysis of the text students read, we were unable to locate a sufficient number of teacher's manuals, which at that time contained the comprehension questions for use in the comprehension task analyses. Hence, the comprehension task analyses included 54 textbooks in third grade (because there are two levels, 3.1 and 3.2, in third grade) and 27 textbooks in sixth grade.

Sampling Stories for Analysis of Comprehension Tasks

For each of the curricula and grades selected, we sampled eight stories to use in the comprehension task analyses. In order to systematically sample all of the curricula in a similar fashion, we selected the first four stories, and starting in the middle of the book we selected four more stories. Only selections longer than one page were used in these analyses. We found that some curricula periodically had one-page selections like "About the Author." Beyond the one-page selections, curricula tended to have longer, multiple-page narratives or informational selections. Our rationale for using selections that were longer than one page was that teachers were likely to allocate much more instructional time to longer selections, and as a result, those selections would have more instructional impact.
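The decade grouping and the 100-word minimum screen described in this section are straightforward to reproduce. The following is a minimal sketch under our own naming, not the authors' code; the function names and example years are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of the decade binning and the
# 100-word minimum length screen described above.

def decade_bin(print_year: int) -> int:
    """Map a publication year to its decade label, e.g., 1905-1914 -> 1910."""
    return ((print_year + 5) // 10) * 10

def keep_text(text: str, min_words: int = 100) -> bool:
    """Retain a reading selection only if it has at least min_words words."""
    return len(text.split()) >= min_words

print(decade_bin(1907), decade_bin(1924), decade_bin(1995))  # 1910 1920 2000
```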
Text Difficulty Analysis

Lexical Sophistication

Lexical sophistication measures the degree of sophistication of the words in a text. In readability formulas, two approaches have been used to define sophisticated words, one based on word length or syllable structure, the other based on word frequency or some notion of word familiarity. Admittedly, what constitutes familiar words may vary among readers of even the same age. However, previous studies have reported experimental evidence that word frequency has a significant effect on reading comprehension (Marks, Doctorow, & Wittrock, 1974; McGregor, 1989).

Gamson et al. (2013) reported results of lexical sophistication using two measures, namely, LEX (which was originally proposed by Hayes et al., 1996) and word frequency band scores. In this article, we examine lexical sophistication using two additional frequency-based measures. We consider a word as sophisticated if it is not among the 2,000 most frequent words in the American National Corpus (ANC) (Reppen & Ide, 2004), a massive corpus of American English texts of all genres and transcribed speech data with 22 million words in its current release. Specifically, we consider the following two lexical sophistication ratios: sophisticated word ratio (i.e., the proportion of words in the text that are sophisticated) and sophisticated word type ratio (i.e., the proportion of unique words in the text that are sophisticated). These ratios were computed for each text using the lexical complexity analyzer developed by Lu (2012).

Lexical Diversity

Lexical diversity measures the range of vocabulary used in a text. Malvern, Richards, Chipere, and Durán (2004) provided a thorough discussion of the many measures of lexical diversity that exist. A widely used measure of lexical diversity is type-token ratio (TTR), that is, the ratio of the number of word types or unique words to the total number of words in a text (Templin, 1957). This measure has been shown to decrease as the length of the text increases (Richards, 1987), and several transformations of the ratio have been proposed to minimize the sample size effect. We consider both the original TTR and one of its transformations here, namely, Root TTR (RTTR; Guiraud, 1960). RTTR adjusts the denominator in the original TTR to the square root of the number of words. These measures were again computed for each text in the data set using the lexical complexity analyzer developed by Lu (2012).

Syntactic Complexity

Another aspect of text difficulty considered here is syntactic complexity. As discussed previously, most readability formulas measure syntactic complexity using mean length of sentence. While this measure is intuitively useful, it is insensitive to structural differences among sentences. Many other syntactic complexity measures proposed for child language acquisition research rank syntactic constructions based on patterns of syntactic development or the frequency of occurrence of these constructions in natural speech, such as Developmental Level (D-Level; Covington, He, Brown, Naçi, & Brown, 2006; Rosenberg & Abbeduto, 1987), Developmental Sentence Scoring (DSS; Lee, 1974), and Index of Productive Syntax (IPSyn; Scarborough, 1990). There is also a family of metrics that quantify the processing demand of different types of sentence structures, such as Yngve depth (Yngve, 1960).
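Before turning to the syntactic measure adopted below, the four lexical indices just described can be illustrated with a minimal sketch. This is not the lexical complexity analyzer of Lu (2012) that was actually used; the placeholder frequency list and function names are our own assumptions.

```python
# Illustrative sketch of SWR, SWTR, TTR, and RTTR as defined above.
# TOP_2000_ANC stands in for the 2,000 most frequent words in the American
# National Corpus; it is a placeholder, not the actual list.
import math
import re

TOP_2000_ANC = {"the", "a", "and", "of", "to", "in", "was", "he", "she", "it"}

def lexical_measures(text: str, frequent=TOP_2000_ANC) -> dict:
    tokens = re.findall(r"[a-z']+", text.lower())  # crude tokenization
    types = set(tokens)
    n, t = len(tokens), len(types)
    sophisticated_tokens = [w for w in tokens if w not in frequent]
    sophisticated_types = {w for w in types if w not in frequent}
    return {
        "SWR": len(sophisticated_tokens) / n,   # sophisticated word ratio
        "SWTR": len(sophisticated_types) / t,   # sophisticated word type ratio
        "TTR": t / n,                           # type-token ratio
        "RTTR": t / math.sqrt(n),               # root TTR (Guiraud, 1960)
    }
```

Because TTR shrinks as texts grow longer while RTTR is largely insensitive to length, the two measures can diverge when text length varies across decades, which is why both are reported in the Results section.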
We adopt the D-Level scale for measuring syntactic complexity here, for two reasons. First, as a measure based on child language acquisition research, it has excellent validity (Covington et al., 2006). Second, it has been shown by psycholinguistic experiments to be a more adequate index of sentence comprehension and recall than many other metrics (Cheung & Kemper, 1992). Originally proposed by Rosenberg and Abbeduto (1987) and subsequently revised by Covington et al. (2006), this scale classifies a sentence into one of eight increasingly complex developmental levels based on its structure, as summarized in Table 2. The D-Level scores for the sentences in each text were computed using the Developmental Level Analyzer (Lu, 2009). We reported two indices based on the D-Level analysis here, namely, the proportion of complex sentences (i.e., sentences in Levels 1 to 7 defined in Table 2) in each text and the average D-Level score for each text.

Table 2
Covington, He, Brown, Naçi, and Brown (2006) Revised Developmental Level Scale

Level  Description                                                       Example
0      Simple sentences, including questions                             John cried.
       Sentences with auxiliaries and semiauxiliaries                    This has solved it.
       Simple elliptical (incomplete) sentences                          John did.
1      Infinitive or -ing complement with same subject as main clause    Try to smile.
2      Conjoined noun phrases in subject position                        John and Mary left.
       Sentences conjoined with a coordinating conjunction               I came early but Peter arrived late.
       Conjoined verbal, adjectival, or adverbial constructions          He sang and jumped.
3      Relative or appositional clause modifying object of main verb     John scolded the boy who stole his bicycle.
       Nominalization in object position                                 I understand his rejection of the offer.
       Finite clause as object of main verb                              John knew that Mary was angry.
       Subject extraposition                                             It was hard for John to tell Mary.
4      Raising                                                           John seems to Mary to be happy.
       Nonfinite complement with its own understood subject              I saw him walking away.
       Comparative with object of comparison                             John is older than Mary.
5      Sentences joined by a subordinating conjunction                   They won't play if it rains.
       Nonfinite clauses in adjunct position                             Having tried both, I prefer the second one.
6      Relative or appositional clause modifying subject of main verb    The man who cleans the room left early.
       Embedded clause serving as subject of main verb                   For John to have left Mary was surprising.
       Nominalization serving as subject of main verb                    His rejection of the offer was unexpected.
7      More than one structure from Levels 1–6                           John decided to leave Mary when he heard that she was seeing Mark.

Cognitive Demands of Comprehension Tasks

Comprehension tasks (i.e., questions) may create greater or lesser cognitive demands on the reader. The questions may require readers to process varying amounts of textual information; they may also ask readers to process the information in either simpler, more literal ways or more complex ways, such as making inferences based on the text and their own prior knowledge. These two constructs, the amount of text and the type of processing of text, have an impact on the cognitive complexity of the reading instruction provided.
Working Memory Load

One way to conceptualize the complexity of comprehension tasks is to consider the amount of text a reader needs to process in order to respond to the comprehension task (Perfetti, 1985). As described previously, reading comprehension involves the process of integrating text information with one's prior knowledge to construct meaning. Comprehension tasks put demands on a reader's working memory as they require the reader to retain information from the text in order to generate an answer. These tasks may require the retention of only a sentence or as little as one idea unit, or the retention of a paragraph or more containing many idea units. An idea unit can be understood as the smallest number of words for expressing an idea or thought (Johnson, 1970; Meyer, 1975; Taylor, 1980). For example, literal comprehension questions typically require information from one sentence or a part of a sentence, thus making fewer demands on the reader's working memory capacity. Other tasks may ask the reader to summarize text, requiring an integration of information from one paragraph or many paragraphs. The sheer number of ideas the reader must hold in working memory relates directly to the cognitive load of the task (Carlson, Chandler, & Sweller, 2003; Mayer, 2002; Perfetti, 1985; van den Broek, McMaster, Kendeou, & Espin, 2007). When the reader can process one idea at a time, the interactivity is low, and the resulting cognitive demand is low. When the reader must process many ideas at the same time, such as when asked to summarize a paragraph, the interactivity is higher, and the resulting cognitive load of the task is also higher.

To assess the amount of text needed for each comprehension task, we counted the number of content words in the portion of text that was related to the answer to the comprehension question. A content word provides the information in the text, and a count of them is a good proxy for the number of idea units in a portion of text. For example, in the sentence "The black bear climbed the tree and grabbed the food," the words black, bear, climbed, tree, grabbed, and food are the content words, for a count of 6. In an analysis of a subsample of third-grade texts selected across decades, we found that a count of the content words in text correlated highly with the number of idea units in the same text (r = .92). While similar to idea units, we preferred the use of content words in this study because they were simpler to count in reading texts and easier to train coders to count reliably.

Previous research on cognitive load has also suggested that extraneous information adds to one's cognitive load just as additional necessary information adds to cognitive load (e.g., Carlson et al., 2003; van den Broek et al., 2007). For example: "Washington rode into battle hoping to lead his troops to victory while charging forward on his white horse." The question "What color horse did Washington ride?" requires only "Washington rode . . . white horse," or a count of four content words. However, the words "into battle hoping to lead his troops to victory while charging forward on his" intervene and add to the cognitive demand of the task by requiring that the student comprehend that text and recognize that it is not relevant for this particular comprehension task.
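To make the counting concrete, the sketch below applies a crude content word count to the example sentence given earlier. The study's counts were made by trained human coders; the small function-word stoplist here is our own simplifying assumption, not the authors' coding rules.

```python
# Rough illustration only: the study used trained human coders, not a script.
# The stoplist is a simplifying assumption standing in for their coding rules.
FUNCTION_WORDS = {"the", "a", "an", "and", "or", "but", "to", "of", "in", "on",
                  "his", "her", "their", "it", "that", "was", "is", "for", "with"}

def content_word_count(span: str) -> int:
    words = [w.strip('.,?!"\'').lower() for w in span.split()]
    return sum(1 for w in words if w and w not in FUNCTION_WORDS)

print(content_word_count("The black bear climbed the tree and grabbed the food"))  # 6
```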
As a result of these considerations, to assess the working memory load aspect of the cognitive demand of comprehension tasks, we measured (a) the information needed to answer the question and (b) the extraneous information in the portion of text readers were asked to process for the comprehension task.

Processing Complexity

The complexity of the processing demands of a task also adds to the cognitive load required to respond to that task (Carlson et al., 2003; Mayer, 2002; van den Broek et al., 2007). The more the reader is required to make connections between different pieces of information, the greater the processing demands on working memory. This includes connections between information from the text with prior knowledge as well as from different segments of text. Similarly, when a reader must search long-term memory and identify and use specific prior knowledge in combination with textual information to solve the comprehension task, it increases the processing demands on working memory and entails a greater cognitive load.

One way to think of the type of processing of text would be literal comprehension versus inferential comprehension. Literal comprehension simply requires students to recall the information presented in the text, whereas inferential comprehension requires students to integrate what they have read with relevant prior knowledge to understand the information in ways not explicitly stated in the text. Typically, inferential comprehension makes greater cognitive demands on the reader (Perfetti, 1985; van den Broek et al., 2007).

The analyses of comprehension tasks were similar to previous research on the types of comprehension questions asked in reading curricula (e.g., Armbruster, Stevens, & Rosenshine, 1977). Following the pattern of previous research, we categorized questions into the following types: explicit detail, paraphrased detail, inference, assessing prior knowledge, character trait, opinion, supporting opinion, prediction, author's purpose, compare and contrast, cause and effect, following directions, figurative language, fact and opinion, interpreting illustrations, mood and setting, main idea, sequence of events, summarization, understanding genre, and words in context.

Level of Ideas Assessed

Comprehension questions also vary in whether they ask students about high-level or low-level ideas presented in the text. Low-level ideas are those that require understanding of a sentence or clause in the text. Higher-level ideas require the understanding of ideas presented across multiple sentences (Haenggi & Perfetti, 1994; Magliano & Mills, 2003). Higher-level ideas are more thematic and promote deeper processing of the text information (Goldman & Rakestraw, 2000; Kintsch & van Dijk, 1978). Questions that ask about lower-level ideas typically promote more superficial processing of the text. For example, a question that asks students to summarize what happened in a portion of text requires the reader to process ideas from many sentences and addresses a higher-level or more thematic idea from the text. Questions that ask about explicit details in the text are likely to require processing of a sentence or clause and to be about lower-level ideas. Raters coded the questions as asking for either higher-level or lower-level ideas from the text.
Interrater Reliability

Four raters coded the constructs pertaining to the cognitive demands of comprehension tasks described previously. Interrater agreement was determined by using 10 samples of reading materials randomly selected from the curricula used in these analyses. Each sample included the text and two questions about the text. Raters' counts of (a) the information needed to answer a question and (b) the amount of extraneous information were correlated to determine the interrater reliability. For interrater agreement, we calculated the correlation between each pair of raters across the 20 ratings (2 questions each for 10 passages). The eight correlations ranged from r = .76 to .89, with an average of .81, on measures of information needed. Similarly, for counts of extraneous information, the correlations ranged from r = .78 to .87, with an average of .84.

For the coding of comprehension question category and level of idea asked, we computed the percentage agreement between raters based on their ratings of the same questions, as described previously. The interrater agreement for categorizing the type of comprehension question ranged from 72% to 93%, with a mean of 81% agreement. On the coding of the level of ideas (high-level ideas vs. low-level ideas), the interrater agreement ranged from 88% to 95% agreement, with a mean of 92%.

Results

Text Difficulty

Lexical Sophistication

Figure 1 summarizes the mean sophisticated word ratio (SWR) and sophisticated word type ratio (SWTR) of third-grade texts. One-way ANOVAs revealed significant between-decade differences in both mean SWR, F(9, 5249) = 30.592, p = .000, and SWTR, F(9, 5249) = 361.023, p = .000. The significance level for all ANOVAs used a Bonferroni adjustment of .05/6 = .008 for the six ANOVAs conducted within each grade-level sample. For SWR, a falling trend was observed from the 1910s through the 1940s, with the 1940s showing a significantly lower ratio than all other decades. A rising trend surfaced from the 1950s onwards (with the exception of the 1980s), with the 2000s showing a significantly higher ratio than all other decades. Note that the ratios for the 1920s–1930s period and the 1950s–1980s period were all comparable. For SWTR, a falling trend was also observed from the 1910s through the 1940s. This was followed by a rising trend from the 1950s onwards. Following two significant increases in the 1970s and the 1990s, the 2000s achieved a significantly higher ratio than all other decades.

Figure 1. Mean sophisticated word ratio and sophisticated word type ratio of third-grade reading texts. Y error bars indicate the 95% confidence interval for the mean.

Figure 2 summarizes the mean SWR and SWTR of sixth-grade texts. Significant between-decade differences were observed for both mean SWR, F(9, 2772) = 14.572, p = .000, and SWTR, F(9, 2772) = 14.110, p = .000, as compared to an adjusted p = .008. For SWR, the 1910s and 1920s had the highest ratios, with the 1920s showing a significantly higher ratio than all other decades. The 1930s through the 2000s had largely comparable ratios, with the exception of the 1980s, which had a significantly lower ratio than most other decades in this period. The ratio of the 2000s was significantly lower than that of only the 1920s. For SWTR, the 1920s showed a significantly higher ratio than all other decades. The 1980s showed a significantly lower ratio than most other decades, with the exception of the 1950s and 2000s. No significant differences were found among the other decades.

Figure 2. Mean sophisticated word ratio and sophisticated word type ratio of sixth-grade reading texts. Y error bars indicate the 95% confidence interval for the mean.
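The between-decade comparisons reported throughout this section follow the same pattern: a one-way ANOVA across decade groups judged against the Bonferroni-adjusted alpha of .008. The sketch below illustrates that procedure with SciPy; the values are invented for illustration and are not the study's measurements.

```python
# Sketch of a between-decade test of the kind reported above: a one-way ANOVA
# across decade groups judged against a Bonferroni-adjusted alpha
# (.05 / 6 = .008 for the six ANOVAs run within each grade-level sample).
# The values below are invented for illustration; they are not the study's data.
from scipy import stats

swr_by_decade = {
    1910: [0.21, 0.19, 0.23, 0.20],
    1950: [0.17, 0.16, 0.18, 0.17],
    2000: [0.26, 0.27, 0.25, 0.28],
}

f_stat, p_value = stats.f_oneway(*swr_by_decade.values())
alpha = 0.05 / 6  # Bonferroni adjustment for six ANOVAs per grade-level sample
print(f"F = {f_stat:.3f}, p = {p_value:.4f}, significant at adjusted alpha: {p_value < alpha}")
```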
Lexical Diversity

Figure 3 summarizes the mean TTR and RTTR of third-grade texts. One-way ANOVAs showed significant between-decade differences in both mean TTR, F(9, 5249) = 96.991, p = .000, and mean RTTR, F(9, 5249) = 42.362, p = .000, as compared to an adjusted p = .008. For TTR, the 1910s had a significantly higher ratio than all other decades, and the 1920s and 1930s both had significantly higher ratios than the 1940s–1990s. The ratio remained low and stable between the 1940s and 1970s and then increased steadily from the 1980s onward. The more reliable RTTR measure, however, showed a very different pattern of change. RTTR stayed low and stable from the 1910s through the 1960s, with no significant changes in this period. The 1970s and 1980s saw significant increases, and the 1990s and 2000s both achieved significantly higher ratios than all earlier decades.

Figure 3. Mean type-token ratio and root type-token ratio of third-grade reading texts. Y error bars indicate the 95% confidence interval for the mean.

There is some variability in the trends indicated in third grade. This may be an artifact of the measurement used and the variability of the text in the curricula. The TTR has been shown to be subject to a sample size effect, namely, it tends to decrease as sample size increases, whereas RTTR has been shown to significantly reduce this effect, namely, it remains more stable across texts of different lengths. The sample size effect for TTR may have given rise to a different pattern for TTR than that for RTTR, as there is variation in the length of the texts within each decade.

Figure 4 summarizes the mean TTR and RTTR of sixth-grade reading texts. Significant between-decade differences were found for both mean TTR, F(9, 2772) = 28.180, p = .000, and RTTR, F(9, 2772) = 6.000, p = .000, as compared to an adjusted p = .008. For TTR, the 1910s had a significantly higher ratio than the 1940s–1990s, while the 1920s and 1930s both had a significantly higher RTTR than the 1940s–2000s. Following a significant decline in the 1940s, TTR stayed low and stable between the 1950s and 1990s. However, the 2000s saw a significant increase, reaching a level that is significantly lower than only the 1920s and 1930s. RTTR again showed a different pattern of change. In general, much less variation was observed for RTTR than for TTR across the decades. On average, the later decades (1960–2000) had higher ratios than the early decades (1900–1950). Notably, the 1930s had a significantly lower RTTR than either the 1940s or the decades from 1960 to 2000.
The only other significant difference was found between the 1990s and the 1920s, the decades with the highest and second lowest RTTR, respectively.

Figure 4. Mean type-token ratio and root type-token ratio of sixth-grade reading texts. Y error bars indicate the 95% confidence interval for the mean.

Syntactic Complexity

Figure 5 summarizes the mean proportion of complex sentences (PCS) and mean developmental level (D-Level) of third-grade texts. One-way ANOVAs showed significant between-decade differences for both mean PCS, F(9, 5249) = 59.628, p = .000, and D-Level, F(9, 5249) = 71.049, p = .000, as compared to an adjusted p = .008. For PCS, the 1910s had a significantly higher ratio than all later decades, the 1920s and 1930s both had significantly higher ratios than the 1940s–2000s, and the 1940s had a significantly higher ratio than the 1980s and 1990s. PCS stayed low and stable in the 1950s–2000s, with no significant changes within this period. For D-Level, the 1910s–1930s had significantly higher scores than the 1940s–2000s, with no significant differences found within the 1940s–2000s period.

Figure 5. Mean proportion of complex sentences and developmental level of third-grade texts. Y error bars indicate the 95% confidence interval for the mean.

Figure 6 summarizes the mean PCS and D-Level of sixth-grade texts. Significant between-decade differences were observed for both mean PCS, F(9, 2772) = 35.526, p = .000, and D-Level, F(9, 2772) = 49.987, p = .000, as compared to an adjusted p = .008. For PCS, the 1910s had a significantly higher ratio than the 1950s–2000s, the 1920s and 1930s both had significantly higher ratios than the 1940s–2000s, and the 1940s had a significantly higher ratio than the 1960s, 1980s, and 2000s. PCS stayed low and stable between the 1950s and 2000s, with no significant changes within this period. D-Level was again significantly higher in the 1910s–1930s than in the 1940s–2000s. The 1940s also had a significantly higher D-Level than most later decades (except the 1950s and 1970s). Within the 1950s–2000s period, the 1980s showed a significantly lower D-Level than the 1970s and the 1990s.

Figure 6. Mean proportion of complex sentences and developmental level of sixth-grade reading texts. Y error bars indicate the 95% confidence interval for the mean.
Comprehension Tasks

Number of Comprehension Questions Asked

In considering the number of comprehension tasks, or questions students are asked, it is evident that there has been a fairly continuous increase in the number of questions throughout the century (see Table 3). In the 1920s, third-grade curricula asked an average of 16 questions per story. That steadily increased up to the 2000 editions, which asked an average of 48 questions per story. Similarly, sixth-grade curricula in 1920 asked an average of 14 questions per story, and by the 2000s, that increased to 36 questions per story. These increases do not necessarily mean that teachers asked their students all of these questions, but they do show a change in emphasis by the textbook publishers that may have an influence on teachers' behavior.

Table 3
Number of Questions per Story: Mean and Standard Deviation, by Decade

                                  1920  1930  1940  1950  1960  1970  1980  1990  2000
Third grade  Mean                 16.2   7.1  23.0  27.8  31.0  31.3  27.8  31.8  48.5
             Standard deviation    4.4   1.1  21.4   7.0   6.8   4.9  11.8  11.2  12.0
Sixth grade  Mean                 14.0   9.8  29.4  16.3  20.0  26.3  37.3  31.8  36.2
             Standard deviation    1.2   6.5  26.6   5.7   8.4  19.7  35.3  13.7  11.3

Information Needed to Answer Questions

Analyses of the average amount of information needed to answer a comprehension question (as measured by the number of content words) showed an initial dip from 1920 to 1930 curricula, then essentially no change in the 1940s and 1950s for both third- and sixth-grade materials (see Table 4). In the 1970s, third-grade curricula began asking questions that required the processing of more information (content words), and there was a gradual increase in those demands through the 2000 curricula. In the 1930s, 1940s, and 1950s, on average, students were required to process only around 25 content words to answer a typical comprehension question. By the 1990 and 2000 editions, on average, students needed to process nearly twice as much text per question, averaging 47.1 content words per question in 1990 and 49.7 in 2000.

Table 4
Information Needed to Answer a Question and Extraneous Information in Text: Means, by Decade

                                     1920  1930  1940  1950  1960  1970  1980  1990  2000
Third grade  Information needed      35.6  26.7  24.2  26.1  28.3  33.1  39.2  47.1  49.7
             Extraneous information   3.5   9.4   5.4   7.0   8.6   9.3   7.4  10.1   7.8
Sixth grade  Information needed      39.7  32.4  36.4  36.3  46.9  47.5  34.3  64.2  68.9
             Extraneous information   8.8   7.0   9.7   6.9   4.3   4.3   7.7   6.1   5.3

A similar trend is observed in the sixth-grade curricula, with the gradual increase in the cognitive demands per question beginning in the 1960 curricula. From a low of 32.4 content words in 1930 to a high of 68.9 content words in 2000 curricula, the cognitive demands of questions in the sixth-grade curricula, based on the number of content words processed, have doubled.

Extraneous Information in Text

It is thought that extraneous information embedded in text that students must process to respond to a comprehension task would put additional cognitive demands on them as they attempt to answer the question. Our analyses indicated no strong pattern in the amount of extraneous information involved in students' processing of text to respond to comprehension tasks (see Table 4).
The number of extraneous content words averaged from 3 to 10 per comprehension question across the years and curricula we surveyed.

Types of Comprehension Questions

Different types of comprehension questions put different cognitive demands on readers. In Table 5, we present the relative emphasis (measured by percentage of the total number of questions) given to the most frequent types of comprehension questions. In Table 6, we translate that percentage into the average number of questions of each type per story.

Table 5
Relative Emphasis on Types of Comprehension Questions (Averaging 5% of Total or More): Mean Percentage of All Questions Asked, by Decade

Third grade               1920  1930  1940  1950  1960  1970  1980  1990  2000
Explicit detail           33.5  31.8  37.8  36.1  23.4  25.9  19.3  13.0  10.3
Paraphrased detail         8.8  14.1  26.3   8.1  11.4  11.5  16.8  13.1  13.7
Inferential               10.1  14.7   4.2   7.9  11.9   6.9  13.0   8.4  11.5
Assess prior knowledge     3.6  12.9   4.9   7.8   7.1   6.1   2.4   4.6   5.4
Character trait            1.5   2.9   1.8   8.8   5.9   5.5   6.1  10.2   6.2
Opinion                   14.2   3.5   1.8   9.9   9.5  11.3   6.9  10.7  12.1

Sixth grade               1920  1930  1940  1950  1960  1970  1980  1990  2000
Explicit detail           45.8  39.6  24.1  30.3  14.8  22.1  30.1   7.8   8.0
Paraphrased detail         9.3   7.9  12.4  18.2  10.1  13.8  20.2  16.3  11.0
Inferential                5.6   5.5   5.9   5.5  13.1   9.7  10.8  10.1   9.2
Assess prior knowledge     8.6   4.1  11.1   3.7  10.1   5.1   2.9   2.4   2.2
Character trait            1.9   5.4   2.4   4.9   4.5   5.9   5.8   6.5   5.3
Opinion                    7.9  14.5   6.2  15.9  10.9  12.8   7.8  13.5   6.3

Table 6
Types of Comprehension Questions: Average Number per Story, by Decade

Third grade               1920  1930  1940  1950  1960  1970  1980  1990  2000
Questions per story       16.2   7.1  23.0  27.8  31.0  31.3  27.8  31.8  48.5
Explicit detail            5.4   2.3   8.7  10.0   7.3   8.1   9.4   4.1   5.0
Paraphrased detail         1.4  14.1   6.0   2.3   3.5   3.6   4.7   4.1   6.6
Inferential                0.6   1.0   1.0   2.2   3.7   2.2   3.6   2.7   5.6
Assess prior knowledge     0.2  12.9   1.1   2.2   2.2   1.9   0.7   1.5   2.6
Character trait            1.5   0.9   0.4   2.4   1.8   1.7   1.7   3.2   3.0
Opinion                    2.3   0.2   0.4   2.8   2.9   3.5   1.9   3.4   5.8

Sixth grade               1920  1930  1940  1950  1960  1970  1980  1990  2000
Questions per story       14.0   9.8  29.4  16.3  20.0  26.3  37.3  31.8  36.2
Explicit detail            6.4   3.9   7.1   4.9   3.0   5.8  11.2   2.5   2.9
Paraphrased detail         1.3   0.7   3.6   3.0   2.0   3.6   7.5   5.2   4.0
Inferential                0.7   0.5   1.7   0.9   2.6   2.6   4.0   3.2   3.3
Assess prior knowledge     1.2   0.4   3.3   0.6   2.0   1.3   1.1   0.8   0.8
Character trait            0.3   0.5   0.6   0.8   0.9   1.6   2.2   2.1   1.9
Opinion                    1.1   1.4   1.8   2.6   2.2   3.4   2.9   4.3   1.1

Explicit Detail Questions

By inspection, it is evident that in third grade, explicit detail questions are frequently asked throughout all the curricula analyzed and across all decades. Early curricula, from the 1920s through 1950s, emphasized asking explicit detail questions, averaging around one-third of all questions across those decades (see Table 5). In the latter half of the century, there is a gradual decline in the emphasis on explicit detail questions, down to an average of only 10% of all questions in the 2000 curricula. In the 1920s, that emphasis on explicit details amounted to about 5 questions (of an average 16) per story. By 2000, while the number of questions per story has increased dramatically, still about 5 questions per story (out of an average of 48) were about explicitly stated details.

The pattern is similar and in some ways more prevalent in sixth-grade curricula. In the 1920 editions, 48.5% of all questions were explicit details, by far the most prevalent type of comprehension question in curricula. The emphasis diminished in more recent curricula, with 1990 and 2000 editions asking for explicitly stated details in less than 10% of all comprehension questions. In 1920, that emphasis on explicit details meant that an average of 6 out of 14 questions about the story were about explicitly stated details. By the 2000 curricula, sixth-grade curricula asked an average of only 2.9 questions about explicit details.

Other Types of Comprehension Questions

Reading curricula ask a variety of kinds of comprehension questions other than explicit details (as noted in Tables 5 and 6). However, there are no readily apparent trends in the relative emphasis on them over time. Inferential questions do seem to be more frequently used in sixth-grade reading curricula since about the 1960s. Prior to the 1960s, reading curricula averaged one inferential question per story (around 5% of all questions).
Since 1960, inferential questions have been asked about 10% of the time in sixth grade, ranging from an average of 2.6 to 4 questions per story.

Percentage of Questions About Higher-Level Ideas in Text

In both third- and sixth-grade curricula, there is an increase in the percentage of questions about higher-level (more thematic) ideas in text (see Table 7). In third grade, one-third or fewer of the comprehension questions asked students about higher-level ideas. In more recent curricula, 1990 and 2000 editions, nearly 50% of the questions asked were about higher-level ideas in the text. Sixth-grade curricula provided evidence of an even stronger pattern of increased emphasis on questions about higher-level ideas in text. From a low of only 11.1% of questions in the 1930s, questions about higher-level ideas increased to 49% in each of the 1990s and 2000s curricula.

Table 7
Percentage of High-Level Questions, by Decade

              1920  1930  1940  1950  1960  1970  1980  1990  2000
Third grade   34.3  35.4  29.2  27.9  41.1  28.6  46.5  52.3  47.3
Sixth grade   14.6  11.1  26.2  18.5  20.6  32.5  34.4  49.7  49.1

Conclusions

In looking for patterns in the data it seems most appropriate to consider the data by grade as there are different and clearer patterns in the third-grade curricula than for the sixth-grade curricula.

Cognitive Demands of Third-Grade Curricula

Linguistic Analyses of the Text

The following patterns of change were observed for the three dimensions of text complexity, as summarized in Table 8. Lexical sophistication and diversity both exhibited a different pattern of change than that of syntactic complexity. For lexical sophistication, while a falling trend was observed between the 1910s and the 1940s, a rising trend surfaced in the 1950s and continued through the 2000s, with the 2000s reaching a significantly higher level than all other decades. Lexical diversity as measured using RTTR (a more reliable measure than TTR) remained fairly stable from the 1910s through the 1960s and then increased from the 1970s onwards, with the 1990s and 2000s reaching markedly higher levels than all earlier decades. Syntactic complexity was higher in the 1910s–1940s than the 1950s–2000s; there was a falling trend from the 1910s through the 1940s, but no significant changes occurred in the 1950s–2000s period.
Analyses of Comprehension Tasks

The first interesting finding about reading comprehension tasks asked of third-grade students is the consistent increase in the quantity of questions asked per story across the century (Table 3). The mean number of comprehension questions increased from a low of 7 per story in 1930 to a high of 48.5 questions per story in the 2000 editions. While this suggests an increased emphasis on developing students' comprehension of the story, one could question the upper limit of the efficacy of asking comprehension questions in terms of facilitating versus interfering with students' construction of meaning while reading. In the more recent editions the teacher's manual has multiple questions per page embedded in the story. While asking some questions during reading may facilitate students' understanding of what is read, it is unknown whether frequent questions during reading may interfere with the cohesion of the ideas in the story and hence interfere with the reader's construction of meaning.

Table 8
Summary of Findings: Cognitive Demands of Third-Grade Reading Texts and Tasks

Measure of Complexity                        Early 20th Century   Mid 20th Century          Late 20th Century
Text:
  Lexical sophistication                     Decrease 1910–1940   Some increase 1950–1980   Marked increase 1990–2000
  Lexical diversity                          Stable 1910–1960     Stable                    Increasing 1970–2000
  Syntactic complexity                       Decrease 1910–1940   Stable 1950–2000          Stable
Tasks:
  Information needed                         Decrease 1920–1940   Fairly stable             Increasing 1970–2000
  Percentage explicit detail questions       Stable 1920–1950     Stable                    Decreasing 1970–2000
  Percentage questions of high-level ideas   Stable               Stable                    Increasing 1980–1990

In terms of the types of comprehension tasks asked of readers, there seems to be a fairly consistent pattern of increasing cognitive demands in the comprehension tasks during the second half of the century, as summarized in Table 8. After a slight drop from 1920 to 1930, the cognitive demand of questions, as measured by the mean number of content words needed to answer each question (Table 4), stayed fairly stable through 1960, followed by gradual increases through 2000 to a mean of 49 content words that needed to be processed to answer each question. This suggests that the comprehension questions asked of students in recent editions of reading curricula were more demanding cognitively, requiring students to process more textual information to construct an answer to the question.

Similar increased complexity can be seen in the types of questions asked of students, most notably in the marked decline in explicit detail questions asked in curricula after 1950 (Tables 5 and 6). During the first half of the century, nearly one-third of comprehension questions in reading curricula were explicit detail questions that required surface-level processing of the text. The emphasis on literal comprehension questions declined to about one-quarter of all questions in the 1960s and 1970s and continued to decline to the point where curricula from 2000 asked explicit detail questions at a rate of around 10% of all questions. This pattern of increasing complexity of questions is supported by the increasing percentage of higher-level questions (Table 7) in the more recent curricula.
In the 1920s and 1930s, approximately one-third of all questions were about higher-level ideas; this then dipped to approximately one-quarter through the 1950s, indicating an emphasis on fairly literal, detail-oriented comprehension. In the past three decades, however, comprehension questions in reading curricula have been fairly balanced between questions about higher-level ideas and questions about lower-level ideas. This suggests an important shift away from conceptualizing comprehension primarily as extracting ideas from text and toward valuing readers' interpretation and integration of text.

In general, these data on the nature of the text and tasks in third-grade reading curricula suggest a pattern of marked decrease in cognitive demands early in the century (1920–1930) and fairly stable cognitive demands at midcentury (1930 through 1950 or 1960). Starting in about 1970, the third-grade curricula seem to increase gradually in both the complexity of the text and the cognitive demands of the comprehension tasks.

Cognitive Demands of Sixth-Grade Curricula

Linguistic Analyses of the Text

The following patterns of change were observed for the three dimensions of text complexity, as summarized in Table 9. For sixth grade, lexical sophistication was significantly higher in the 1910s–1920s but remained largely stable in the 1930s–2000s (with a low point in the 1980s). Lexical diversity as measured using RTTR was generally higher in the 1950s–2000s than in the 1910s–1940s, with the highest point in the 1990s and the lowest point in the 1930s. This suggests that the cognitive demands of sixth-grade reading texts have been fairly stable since about 1950. Syntactic complexity was significantly higher in the 1910s–1940s than in the 1950s–2000s; there was a falling trend from the 1910s through the 1940s, but it remained largely stable within the 1950s–2000s period.

Analyses of Comprehension Tasks

The analysis of comprehension tasks shows some similarities between the sixth-grade curricula and the trends previously reported for the third-grade curricula. First, the quantity of comprehension questions per story followed the same gradual increase across the decades as found in the third-grade materials. By 2000, reading curricula asked over 36 questions per story on average, an increase to about 2.5 times the number of questions asked in the 1920s. The cognitive demand of questions, as measured by the amount of information students need to process to answer a question, was fairly consistent from 1920 to 1950 (about 35 content words) but then increased in the 1960s and 1970s and again in the 1990s and 2000s. By the latter part of the century, comprehension questions required students to process around 65 content words of information on average, nearly double the cognitive demand of the curricular tasks in the early half of the century.
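To make the information-needed measure concrete, the sketch below counts content words in the span of story text a reader must process to answer a question. The function-word list, the content_word_count helper, and the example span are simplifying assumptions introduced only for illustration; they are not the coding scheme used in this study, which is described in the method section.

```python
# A minimal sketch of an "information needed" count: the number of content
# words in the span of story text required to answer a comprehension question.
# The function-word list below is an illustrative simplification.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "if", "of", "to", "in", "on", "at",
    "by", "for", "with", "is", "are", "was", "were", "be", "been", "it", "he",
    "she", "they", "them", "his", "her", "their", "that", "this", "as", "not",
}

def content_word_count(text_span: str) -> int:
    """Count the words in the span that are not on the function-word list."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text_span.split()]
    return sum(1 for t in tokens if t and t not in FUNCTION_WORDS)

# Hypothetical example: the span of story text needed to answer one question.
span = "The old lighthouse keeper rowed across the bay before the storm broke."
print(content_word_count(span))  # prints the number of content words in the span
```

Averaging such counts over all the questions for a story, and then over the sampled stories in a decade, yields decade-level means of the kind reported here (about 35 content words early in the century versus roughly 65 by its end for the sixth-grade curricula).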
Table 9
Summary of Findings: Cognitive Demands of Sixth-Grade Reading Texts and Tasks

Measure of Complexity                       Early 20th Century    Mid 20th Century           Late 20th Century
Text:
  Lexical sophistication                    Decrease 1910–1930    Fairly stable 1930–2000    Fairly stable
  Lexical diversity                         Decrease 1910–1930    Increase 1930–1970         Fairly stable 1990–2000
  Syntactic complexity                      Decrease 1910–1940    Stable 1950–2000           Stable
Tasks:
  Information needed                        Stable 1920–1930      Fairly stable 1940–1960    Increase 1970–2000
  Percentage explicit detail questions      Decrease 1920–1940    Varies 1950–1980           Marked decrease 1980–2000
  Percentage questions of high-level ideas  Decrease 1920–1930    Fairly stable 1940–1960    Increase 1970–2000

The declining emphasis on explicit detail questions observed in the third-grade curricula, described previously, is also evident in the analyses of the comprehension questions in the sixth-grade curricula. In the 1920s, 45% of all questions asked were explicit detail questions, whereas by the 1990 and 2000 curricula that proportion had diminished to only 8% of the questions asked. The move away from literal, lower-level processing of the reading toward higher-level ideas is also seen in the increasing percentage of questions asked about higher-level ideas in the stories. In the earlier half of the century, only about 25% of the questions were about higher-level ideas in the stories, whereas in the 2000 curricula nearly half of all questions were. This trend away from literal comprehension questions provides evidence that the tasks students are asked to complete have become more cognitively demanding, requiring students to process more information from the text to answer the questions or complete the tasks. Questions that ask students to interpret information may also increase the demand on readers to use their prior knowledge and integrate it with text information. Again, this gives evidence of a change from conceptualizing comprehension as extracting information from text to the more cognitively demanding acts of interpreting and summarizing textual information.

Similar to the pattern found for the third-grade curricula, the comprehension tasks in the sixth-grade curricula for the latter half of the century show a gradual increase in the cognitive demands put on students. In general, there is a decline in comprehension tasks about surface-level information (explicit details) in the text and a greater emphasis on questions about more thematic, higher-level ideas, again requiring students to process more information from the text and increasing the cognitive demands of the comprehension tasks.

General Conclusions
These data offer a detailed look at the changes in the cognitive demands of commonly used reading curricula. In general, there seems to be an initial and fairly dramatic decline in the cognitive demands and difficulty of the curricula from 1920 to 1930. This may be due to the change in emphasis from reading aloud and elocution to an emphasis on reading and comprehending stories, as documented by N. B. Smith (1965). In part, this change seems related to trends in reading research, with William Gray (1924), E. B. Huey (1908), and Edward Thorndike (1917) advocating more instructional emphasis on developing students' comprehension and on silent reading, rather than on oral reading and elocution. The drop in the readability of the texts may also reflect the then newly developed readability measures (Lively & Pressey, 1923; Thorndike, 1921), which may have been used to "standardize" the complexity of the stories used in reading curricula.

From about 1930 through 1960 or 1970, the cognitive demands of reading curricula changed little. It is difficult to explain this lack of change. In the period from 1970 through 2000, we observed a fairly consistent increase in the difficulty of the reading texts and comprehension tasks, particularly at third grade. This may have been the result of criticism of schools and classroom instruction beginning in the 1950s. For example, Why Johnny Can't Read (Flesch, 1955) criticized typical reading instruction and the growth of remedial instruction in American schools. The focus on reading instruction was heightened by the publication of A Nation at Risk (National Commission on Excellence in Education, 1983) and Becoming a Nation of Readers (Anderson, Hiebert, Scott, & Wilkinson, 1985). These reports recommended that students be challenged with more rigorous and meaningful curricula. This criticism may have led to the increases in the demands of reading curricula that we observe beginning in about 1970.

Similarly, there was a marked increase in reading comprehension research from the 1970s through the 2000s (Pearson & Fielding, 1991; Spiro, 1980). This research investigated how students construct meaning while reading and the role instruction plays in developing comprehension strategies. It may have influenced the shift from explicit, literal comprehension tasks to more inferential and integrative comprehension tasks that make greater cognitive demands on readers. It is also likely that the research was influential in the move to more authentic text of greater relevance to readers.

The texts for the sixth-grade reading curricula, however, have been fairly stable over the second half of the twentieth century, at a simpler level than was observed during the 1910–1930 period. Comprehension tasks have to some degree become more cognitively demanding in recent decades, along the lines of what was observed in the third-grade curricula. There has been a move away from questions about explicitly stated ideas toward more interpretation and integration of ideas. However, the change is not as profound as that observed in the third-grade curricula.

Another notable difference over time is the type of reading material used in the curricula. While the reading texts may not be as demanding as those in 1910 or 1920, as evidenced by syntactic complexity, the texts may be more comprehensible and relevant to readers. Current reading texts use children's literature from authors like Beverly Cleary in the Scott Foresman curricula (Allington, Cortez, Cunningham, Sebesta, & Tierney, 1989) or Patricia Compton in the Macmillan McGraw Hill curricula (Flood et al., 2003). The addition of children's literature reflects the move toward using more authentic text begun in the 1980s (Chall & Squires, 1991).
Children's literature is likely more relevant and comprehensible for young readers than texts like the Norse mythology selections found in the 1917 third-grade Ginn reader (Young & Field, 1917).

These findings also offer a more precise look at reading text and comprehension task complexity that may inform the future work of curriculum developers. The text analyses offer a method for analyzing text that is more sophisticated than simply looking at vocabulary complexity or readability measures. Further, this study provides information about how questions about the text can increase or decrease the overall difficulty of the reading curricula. Finally, these results raise the issue of how many questions are too many. Inserted questions are an attempt to increase readers' comprehension; however, there may be an upper limit beyond which questions begin to detract from the cohesion of readers' understanding of what they have read.

These results may offer useful ways for teachers and curriculum adoption committees to think about reading curricula. Both the text and the types of questions or tasks asked of students have an impact on the cognitive demands of a curriculum, and reading curricula from different publishers vary in these areas. Comparing the reading material and comprehension tasks across publishers may help in assessing the cognitive demands when selecting curricula.

Limitations

This study is limited in that it investigated the cognitive demands of reading curricula at only the third- and sixth-grade levels. The decision was made to look more deeply at a narrower range of curricula rather than less deeply and more broadly across multiple grades. Our intent was to focus on third grade because more intense and focused reading comprehension instruction tends to begin in third grade, whereas earlier instruction focuses more on decoding. We also selected sixth grade because of the transition in early middle school toward a greater focus on reading to learn in the content areas. While this limits the conclusions to some degree, we believe it allowed us to analyze a more varied sample of curricula at each grade level and to do finer-level analyses of the cognitive demands of the comprehension tasks asked of students.

Another limitation of the analyses of comprehension tasks is that we sampled only a subset of stories from a subset of the reading curricula. We sampled reading curricula that were more commonly used in a given decade in order to examine the potentially most influential curricula. We also selected some curricula because various editions were used across all the decades of the study. However, as in any sampling method, there is the potential to miss data in the process. This study is also limited in that it cannot analyze all the reading materials used in classrooms, as there is great variability nationwide in (a) whether teachers add children's literature to their reading instruction and (b) what those literature selections are. While the majority of teachers use basal series in reading instruction (Wade & Moje, 2000), basals are not the only source of text used.

Future Research
Future research on the changes in the cognitive demands of school curricula is needed to build on this study and the one conducted by Baker et al. (2010) on mathematics curricula. While these studies provide in-depth analyses of elementary curricula, there is a similar need to investigate the cognitive demands of secondary curricula. Our study suggests that the relative increase in cognitive demands in reading is greater in third grade than in sixth grade. If there are declining cognitive demands in school curricula, as some authors' data have suggested (e.g., Hayes et al., 1996; Williamson, 2006), then more in-depth investigations of secondary curricula are warranted.

Research should also consider the degree to which inserted questions aid or hinder readers' comprehension. This research found a steady increase in the number of questions asked per story, culminating in recent basal readers averaging 48 questions per story in the third-grade editions. While teachers may not use all the questions provided, research is needed to determine how many questions are helpful in increasing comprehension and when questions hinder or distract from a reader's comprehension.

Notes

This project was funded in part by a grant from the Spencer Foundation, Chicago, IL. The authors would like to thank the reviewers of the manuscript for their insightful comments. The first author would also like to thank Charles Perfetti for his assistance during the planning phase of the project.

References

Adams, M. J. (2010–2011). Advancing our students' language and literacy: The challenge of complex texts. American Educator, 34, 3–11, 53.
Allington, R., Cortez, J., Cunningham, P., Sebesta, S., & Tierney, R. (1989). Scott Foresman collections reading program. Glenview, IL: Scott, Foresman, and Company.
Anderson, R. C., Hiebert, E. H., Scott, J. A., & Wilkinson, I. (1985). Becoming a nation of readers: The report of the Commission on Reading. Washington, DC: National Institute of Education, U.S. Department of Education.
Armbruster, B. B., Stevens, R. J., & Rosenshine, B. V. (1977). Analyzing content coverage and emphasis: A study of three curricula and two tests. Paper presented at the annual meeting of the American Educational Research Association, New York.
Baker, D., Knipe, H., Collins, J., Leon, J., Cummings, E., Blair, C., & Gamson, D. (2010). One hundred years of elementary school mathematics in the United States: A content analysis and cognitive assessment of textbooks from 1900 to 2000. Journal for Research in Mathematics Education, 41, 383–424.
Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. London: Cambridge University Press.
Beck, I., McKeown, M., McCaslin, E., & Burkes, A. (1979). Instructional dimensions that may affect reading comprehension: Examples from two commercial reading programs (Technical report). Pittsburgh, PA: Learning Research and Development Center, University of Pittsburgh.
Bowker, R. R. (1948–2010). Books in print. New York, NY: R. R. Bowker Company.
Brenner, D., & Hiebert, E. H. (2010). If I follow the teachers' editions, isn't that enough? Analyzing reading volume in six core reading programs. Elementary School Journal, 110, 347–363.
Carlson, R., Chandler, P., & Sweller, J. (2003). Learning and understanding science instructional materials. Journal of Educational Psychology, 95, 629–640.
Chall, J. S. (1967). Learning to read: The great debate. New York, NY: McGraw-Hill.
Chall, J. S., Conrad, S., & Harris, S. (1977). An analysis of textbooks in relation to declining SAT scores. Princeton, NJ: College Entrance Examination Board.
Chall, J. S., Jacobs, V. A., & Baldwin, L. E. (1990). The reading crisis: Why poor children fall behind. Cambridge, MA: Harvard University Press.
Chall, J. S., & Squires, J. (1991). The publishing industry and textbooks. In R. Barr, M. Kamil, P. Mosenthal, & P. D. Pearson (Eds.), Handbook of reading research (Vol. 2, pp. 120–146). New York, NY: Macmillan.
Cheung, H., & Kemper, S. (1992). Competing complexity metrics and adults' production of complex sentences. Applied Psycholinguistics, 13, 53–76.
Covington, M. A., He, C., Brown, C., Naçi, L., & Brown, J. (2006). How complex is that sentence? A proposed revision of the Rosenberg and Abbeduto D-Level scale (CASPR Research Report 2006-01). Athens, GA: Artificial Intelligence Center, University of Georgia.
Durkin, D. (1981). Reading comprehension instruction in five basal reader series. Reading Research Quarterly, 16, 515–544.
Flesch, R. (1955). Why Johnny can't read. New York, NY: Harper.
Flood, J., Hasbrouck, J., Hoffman, J., Lapp, D., Lubcker, D., Medearis, A., . . . Wood, K. (2003). Macmillan McGraw Hill reading. New York, NY: Macmillan McGraw Hill.
Gamson, D., Lu, X., & Eckert, S. (2013). Challenging the research base for the Common Core State Standards: A historical reanalysis of text complexity. Educational Researcher, 42, 381–391.
Goldman, S., & Rakestraw, J. (2000). Structural aspects of constructing meaning from text. In M. Kamil, P. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. 3, pp. 311–335). Mahwah, NJ: Lawrence Erlbaum Associates.
Gray, W. S. (1924). The importance of intelligent silent reading. Elementary School Journal, 24, 348–356.
Guiraud, P. (1960). Problèmes et méthodes de la statistique linguistique [Problems and methods of statistical linguistics]. Dordrecht, The Netherlands: D. Reidel.
Haenggi, D., & Perfetti, C. A. (1994). Processing components of college-level reading comprehension. Discourse Processes, 17, 83–104.
Hayes, D. P., Wolfer, L. T., & Wolfe, M. F. (1996). Schoolbook simplification and its relation to the decline in SAT-verbal scores. American Educational Research Journal, 33, 489–508.
Huey, E. B. (1908). The psychology and pedagogy of reading. Cambridge, MA: MIT Press.
Johnson, R. E. (1970). Recall of prose as a function of the structural importance of the linguistic units. Journal of Verbal Learning and Verbal Behavior, 9, 12–20.
Kintsch, W., & Kintsch, E. (2005). Comprehension. In S. G. Paris & S. A. Stahl (Eds.), Children's reading comprehension and assessment (pp. 71–92). Mahwah, NJ: Lawrence Erlbaum Associates.
Kintsch, W., & van Dijk, T. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363–394.
Lee, L. (1974). Developmental sentence analysis. Evanston, IL: Northwestern University Press.
Lively, B. A., & Pressey, S. L. (1923). A method for measuring the vocabulary burden of text books. Educational Administration and Supervision, 9, 389–398.
Lu, X. (2009). Automatic measurement of syntactic complexity in child language acquisition. International Journal of Corpus Linguistics, 14, 3–28.
Lu, X. (2012). The relationship of lexical richness to the quality of ESL learners' oral narratives. The Modern Language Journal, 96, 190–208.
Magliano, J. P., & Millis, K. K. (2003). Assessing reading skill with a think-aloud procedure and latent semantic analysis. Cognition and Instruction, 21, 251–283.
Malvern, D., Richards, B., Chipere, N., & Durán, P. (2004). Lexical diversity and language development: Quantification and assessment. Houndmills, UK: Palgrave Macmillan.
Marks, C. B., Doctorow, M. J., & Wittrock, M. C. (1974). Word frequency and reading comprehension. Journal of Educational Research, 67, 259–262.
Martinez, M. G., & McGee, L. M. (2000). Children's literature and reading instruction: Past, present, and future. Reading Research Quarterly, 35, 154–169.
Mayer, R. (2002). Cognitive theory and the design of multimedia instruction: An example of the two-way street between cognition and instruction. New Directions in Teaching and Learning, 89, 55–71.
McGregor, A. K. (1989). The effect of word frequency and social class on children's comprehension. Reading, 23, 105–115.
Meyer, B. J. F. (1975). The organization of prose and its effect on memory. Amsterdam: North Holland Publishing Company.
National Commission on Excellence in Education. (1983). A nation at risk: The imperative for educational reform. Washington, DC: U.S. Department of Education.
Pearson, P. D., & Fielding, L. (1991). Comprehension instruction. In R. Barr, M. Kamil, P. Mosenthal, & P. D. Pearson (Eds.), Handbook of reading research (Vol. 2, pp. 815–860). New York, NY: Longman.
Perfetti, C. A. (1985). Reading ability. New York, NY: Oxford University Press.
Pressley, M. (2000). What should comprehension instruction be the instruction of? In M. Kamil, P. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. 3, pp. 545–561). Mahwah, NJ: Lawrence Erlbaum Associates.
RAND Reading Study Group. (2002). Reading for understanding: Toward a research and development program in reading comprehension. Santa Monica, CA: RAND.
Rapp, D. N., van den Broek, P., McMaster, K. L., Kendeou, P., & Espin, C. A. (2007). Higher-order comprehension processes in struggling readers: A perspective for research and intervention. Scientific Studies of Reading, 11, 289–312.
Reppen, R., & Ide, N. (2004). The American National Corpus: Overall goals and the first release. Journal of English Linguistics, 32, 105–113.
Richards, B. (1987). Type/token ratios: What do they really tell us? Journal of Child Language, 14, 201–209.
Rosenberg, S., & Abbeduto, L. (1987). Indicators of linguistic competence in the peer group conversational behavior of mildly retarded adults. Applied Psycholinguistics, 8, 19–32.
Scarborough, H. S. (1990). Index of productive syntax. Applied Psycholinguistics, 11, 1–22.
Smith, N. B. (1965). American reading instruction. Newark, DE: International Reading Association.
Spiro, R. J. (1980). Constructive processes in prose comprehension and recall. In R. J. Spiro, B. C. Bruce, & W. F. Brewer (Eds.), Theoretical issues in reading comprehension (pp. 245–278). Hillsdale, NJ: Lawrence Erlbaum Associates.
Taylor, B. M. (1980). Children's memory for expository text after reading. Reading Research Quarterly, 15, 399–411.
Templin, M. (1957). Certain language skills in children: Their development and interrelationships. Minneapolis, MN: The University of Minnesota Press.
Thorndike, E. L. (1917). Reading as reasoning: A study of mistakes in paragraph reading. Journal of Educational Psychology, 8, 323–332.
Thorndike, E. L. (1921). Teacher's word book. New York, NY: Teachers College, Columbia University.
van den Broek, P., McMaster, K. L., Kendeou, P., & Espin, C. A. (2007). Higher order comprehension processes in struggling readers: A perspective for research and intervention. Scientific Studies of Reading, 11, 289–312.
Venezky, R. L. (1987). A history of the American reading textbook. The Elementary School Journal, 87, 246–265.
Wade, S. E., & Moje, E. B. (2000). The role of text in classroom learning. In M. L. Kamil, P. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. 3, pp. 609–628). Mahwah, NJ: Lawrence Erlbaum Associates.
Williamson, G. L. (2006). Student readiness for postsecondary endeavors. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA.
Yngve, V. (1960). A model and a hypothesis for language structure. Proceedings of the American Philosophical Society, 104, 444–466.
Young, E. F., & Field, A. T. (1917). Young and Field literary readers. Boston, MA: Ginn and Company.

Manuscript received January 30, 2014
Final revision received September 12, 2014
Accepted December 7, 2014