Alignment of State Assessments and Higher Education Expectations: Definition and Utilization of an Alignment Index

By Dr. Gil Fonthal

THIS RESEARCH PAPER DEFINES A QUANTITATIVE INDEX TO MEASURE THE GAP BETWEEN SECONDARY EDUCATION AND HIGHER EDUCATION IN THE UNITED STATES.

Results showed, on the one hand, that there were substantial differences in the index of alignment between all state tests and university expectations in English language. On the other hand, there were small differences in the index of alignment for mathematics tests. However, in both subject matters, most state tests obtained total indices of alignment slightly below 0.50, which is considered moderate to low.

"Missouri also reached the highest total index of alignment (ITE = 0.57), above the mean (M = 0.37) of all states. This value located Missouri in the middle range of alignment for English."

Research done at the University of California, Irvine. 2004.

INTEDCO, P.O. Box 8081, Laguna Hills, CA 92654, USA. (800) 880-1091 / (949) 589-2360. www.intedco.org

Alignment of State Assessments and Higher Education Expectations: Definition and Utilization of an Alignment Index

By Dr. Gil Fonthal, University of California, Irvine and Los Angeles (UCI/UCLA)

Gil Fonthal, 2004

ABSTRACT

This study developed a quantitative methodology of alignment between standards and assessments. This methodology adopted three dimensions to define the criteria of alignment: range, depth, and balance of content match. These three criteria, which involve a revised and expanded version of the main categories and criteria currently used in salient alignment methodologies, are the minimum and sufficient concepts necessary to characterize the process of alignment between standards and assessments in the content focus category.
This methodology also includes the construction of an index of alignment as a mathematical function that is a linear combination of the three alignment criteria adopted. A software program was developed to prototype possible indices (formulas) of alignment. Additionally, small computational programs were developed to manipulate the different alignment criteria and the index itself. Several graphical representations were created to show results in a visual manner. As a result, an index was adopted that complied with the required internal mathematical structure, was consistent with logical inferences about its behavior, and performed well with actual data. This alignment index was utilized to measure the alignment between a set of higher education expectations (standards suggested by a consortium of higher education institutions in the US) and selected state assessments across the country. Twenty-seven state tests were selected and analyzed to determine their alignment with higher education expectations. The alignment analysis was carried out in mathematics and English language for each state test. In general terms, the raters (from four to six) found that the university expectations in English language are more cognitively demanding than the state tests. The raters also found that the state tests in mathematics are almost as cognitively demanding as the higher education expectations in this subject matter. Results showed, on the one hand, that there were substantial differences in the index of alignment between all state tests and university expectations in English language. On the other hand, there were small differences in the index of alignment for mathematics tests. However, in both subject matters, most state tests obtained total indices of alignment slightly below 0.50, which is considered moderate to low. The levels of the total index of alignment correlated with NAEP scores of the states studied. These results provided validity evidence for the methodology.
CHAPTER 1
INTRODUCTION

Statement of the Problem

Standards-based systemic school reform in the United States considers the alignment of the public schooling system a desired goal that would ensure an efficient and effective educational system. Alignment and continuity among standards, assessments, curriculum, professional development, and pedagogy are thus considered a condition for a healthy school system, and ultimately a condition for student success. During the last few decades, standards-based school reform in the United States has made a major impact on the educational system. At the heart of systemic reform is the concept of alignment (National Council on Education Standards and Testing (NCEST), 1992; Porter, 2002; Smith & O'Day, 1991). According to reformers, the alignment of state standards, state assessments, curriculum, teachers' professional development, and classroom instruction will ensure high quality in student outcomes (Fuhrman, 1999). The definition and introduction of standards at all levels and the increased use of standardized tests have become ubiquitous in systemic school reform. Across the nation, a coalition of educational leaders and policy makers advocates standards and high-stakes testing as the main educational policy to improve the accountability of the public schooling system. As such, standardized assessment has become the principal means of quality control in education and an instrument of reform of the educational agenda. Yet, high-stakes tests remain the center of national debate and intense political rhetoric due to their important differential consequences for students, teachers, administrators, and schools as a whole. Despite the consensus about the necessity of an aligned schooling system, there is no widely accepted definition or common terminology of alignment, nor a proven quantitative methodology to perform or measure such alignment across contexts in the instructional system (Porter, 2002).
The alignment of the educational system is a complex issue. Alignment among specific components of the instructional system (standards, assessment, curriculum, professional development, and pedagogy) has been addressed by different scholars, from different points of view, since the inception of standards-based reform (Cohen, 1987; NCEST, 1992; Smith & O'Day, 1991). Comprehensive information on research, policies, and resources in regard to standards and alignment may be found through the National Center for Research on Evaluation, Standards, and Student Testing (CRESST); the Tools for Auditing National Standards-based Education of the National Education Association (NEA); Project 2061 of the American Association for the Advancement of Science (AAAS); the Council of Chief State School Officers (CCSSO); and the Curriculum Alignment Project (CAP) in Indiana. Several studies from these projects are discussed here. Alignment procedures among curricula, assessments, and standards have been widely used in school districts across the nation (Buckendahl, Plake, Impara, & Irwin, 2000; Rothman, Slattery, Vranek, & Resnick, 2002). Several methodologies to ensure and measure alignment among components of the instructional system have been developed (Webb, 1997, 1999; Porter & Smithson, 2001a, 2001b). A detailed analysis of these methodologies is included in this study. There is a parallel between alignment methodology and content-related evidence of validity in educational measurement. For example, in the Standards for Educational and Psychological Testing (American Educational Research Association, 1999), evidence of validity based on test content is defined as follows: Evidence based on test content can include logical or empirical analyses of the adequacy with which the test content represents the content domain and of the relevance of the content domain to the proposed interpretation of test scores.
Evidence based on content can also come from expert judgment of the relationship between parts of the test and the construct (p. 11). The term "content domain" above is equivalent to state content standards in terms of current alignment terminology. As a consequence, the two spheres of validity and alignment intersect each other, which emphasizes their dual importance in standards-based school reform. Moreover, the methodology developed in this study may provide evidence of validity for the use of well-aligned assessments for certain purposes of educational measurement, such as graduation, college placement, and college admission decisions. During the last decade, as part of the standards-based movement, there has been debate about a perceived mismatch between state tests and state standards (Resnick & Resnick, 1992). A related issue is that institutions of higher education are raising concerns about the performance of freshmen entering the system (Powell, 1996; Kirst, 1998). Additionally, the different standardized tests used as part of college admission criteria have been under criticism for a variety of reasons (Kirst, 1998). Among these criticisms is the lack of alignment between college admission assessment instruments and secondary school tests and standards (Le, Hamilton, & Robyn, 2000). For example, the Consortium for Policy Research in Education (CPRE, 2000) has found that in the southern states, 75 different college placement tests are used, without any consideration of secondary state standards. Further, educational leaders believe that the expectations of institutions of higher education are divorced from what K-12 leaders expect from their senior graduates (Bishop, 1996; Kirst & Venezia, 2001; La Marca, Redfield, Winter, Bailey, & Despriet, 2000). As such, the challenge of alignment permeates the K-16 educational system (The Bridge Project, 2000).
Only recently have some alignment methodologies and documentation appeared in the area of content alignment (Impara, 2001; La Marca et al., 2000; Porter & Smithson, 2001a; Webb, 1997). Most of these alignment methodologies are qualitative assertions based on a cell-by-cell comparison criterion, in which a threshold has been arbitrarily predefined. This kind of alignment is usually referred to as content alignment because its objective is to match the content covered between particular components of the instructional system. The terminology and criteria of these current alignment methodologies, as utilized by different authors, are diverse and sometimes confusing. Consequently, a comprehensive language to describe alignment and a sound quantitative methodology to measure alignment across the different contexts of the educational system are needed (Porter, 2002). As Herman, Webb, & Zuniga (2003) pointed out: "[W]e also need better methodologies for judging alignment, methodologies that recognize the meaning and complexity of the concept of alignment and that can support better reform goals" (p. 2).

Purpose of the Study

The main purpose of the study is to develop a quantitative methodology to measure alignment between standards and assessments. The existing alignment methodology includes qualitative and quantitative concepts (Porter & Smithson, 2001a; Webb, 1997), which address different criteria used to measure alignment between standards and assessments. As such, in the present study, the researcher proposes an index of alignment that takes into consideration an extension of the qualitative and quantitative criteria defined by different scholars in alignment research. The alignment index is a mathematical formula expressed as a function of the main alignment criteria defined in the conceptual framework of this study.
Such an index could be expressed as follows:

I = f(alignment criteria)

This index is intended to serve in different educational contexts such as subject matter, grade level, and across states. A quantitative metric, in the form of an alignment index, will facilitate measurement and comparisons among components of the instructional system, specifically between standards (content domain) and assessments. In addition, a quantitative index can serve as a prototype and be manipulated using modern computational tools. Several computational tools were developed to prototype the index and to manipulate quantitatively the three alignment criteria defined. The index of alignment defined in this study was also utilized to match the content between a set of higher education expectations (standards) and selected state assessments. The higher education expectations, known as the Key Knowledge and Skills for University Success (KSUS), were defined by a consortium of United States universities under the auspices of the Association of American Universities (AAU) (Conley, 2002).

Objectives of the Study

There are three main objectives of the study. The first objective is to adopt a common and concise alignment language, drawn from the extensive current terminology. This language serves to analyze and describe the alignment criteria and to systematize the process of measuring alignment in a concise way. This minimum and concise language is needed to reconcile and resolve differences in the alignment terminology used by different authors. This minimum terminology is established in the framework and in the definition of terms sections of this study. The second objective is to quantify the alignment methodology by proposing a metric. This objective includes the development of a meaningful mathematical formula (alignment index) that defines and measures the levels (in proportions or percentages) of content match between standards and assessments.
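As an illustration of what such a formula can look like, the sketch below combines three content-focus criteria (range, depth, and balance, each expressed as a proportion) into a single index through a linear combination. The equal weights and the function itself are illustrative assumptions for exposition, not the index adopted in this study.

```python
# Illustrative sketch of an alignment index I = f(alignment criteria).
# The equal weights below are an assumption made for this example,
# not the weights adopted in the study's conceptual framework.

def alignment_index(rok, dok, bok, weights=(1/3, 1/3, 1/3)):
    """Combine three alignment criteria (each in [0, 1]) linearly.

    rok: range of knowledge -- proportion of standards addressed.
    dok: depth of knowledge -- proportion of items at comparable
         cognitive demand.
    bok: balance of knowledge -- evenness of emphasis across standards.
    """
    for value in (rok, dok, bok):
        if not 0.0 <= value <= 1.0:
            raise ValueError("each criterion must lie in [0, 1]")
    w_rok, w_dok, w_bok = weights
    return w_rok * rok + w_dok * dok + w_bok * bok

# Example: a test covering 60% of the standards, matching depth on
# 50% of items, with a balance value of 0.40.
print(round(alignment_index(0.6, 0.5, 0.4), 2))  # 0.5
```

A linear combination keeps the index in the 0-to-1 range whenever the criteria themselves are proportions and the weights sum to one, which matches the interpretation of 1 as perfect alignment.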
The definition of this index intends to bring together and extend the two most widely used methodologies in current alignment research, namely Porter's (2002) and Webb's (1997, 1999) approaches. Besides the definition of the index of alignment, a series of computational tools (software) have been created in order to prototype and visualize the index, and to manipulate the different alignment criteria defined in the conceptual framework of this study. The third objective is for the alignment index to be used to measure the alignment between a set of higher education expectations (standards/content domain) that a consortium of United States universities has put together (Conley, 2002) and current state assessments across the country. These higher education expectations are equivalent to state standards in their form and content. This third objective intends to evaluate the properties and performance of the alignment index proposed, as well as to provide evidence of its validity. Alignment results were also used to find relationships with accountability factors of high-stakes testing, such as the National Assessment of Educational Progress (NAEP).

Significance of the Study

An advanced quantitative methodology of alignment is needed in current standards-based school reform (Porter, 2002; Herman, 2003). Such an alignment methodology should improve our understanding of the complexity of alignment and should provide better tools for judging content alignment in the educational system. This research effort is expected to contribute to the fulfillment of this need. The study of alignment is important because educational leaders and policy makers need to know about the alignment status of the educational system. Having such knowledge will enable them to make instructional decisions that improve the quality of education.
In addition, knowing the alignment status allows decision makers to take corrective action to appropriately align the educational system where poor alignment exists. More importantly, strengthening the alignment between higher education expectations and K-12 curriculum and assessment may serve to improve the opportunities for all students to enter and succeed in higher education. Several software programs have been developed and are intended to expand the tools available to quantify the alignment between standards and assessments. Also, novel visual representations are used as graphical aids to describe, analyze, and measure alignment. Finally, this study is expected to advance the research in standards-based school reform and, in particular, the research on educational alignment.

Research Questions

The five fundamental research questions addressed by this study are:

Research Question 1. What is a minimum terminology necessary and sufficient to analyze and describe the criteria in the process of measuring content alignment?

Research Question 2. What are the mathematical form and characteristics of a quantitative metric (alignment index) to measure alignment between expectations and assessments across instructional contexts?

Research Question 3. To what extent are state assessments and higher education expectations (KSUS) aligned according to the index defined?

Research Question 4. What alignment comparisons can be made between KSUS and state assessments across states and across subject matters using this index?

Research Question 5. What relationships can be established between the levels of alignment and accountability features of state testing?

Definition of Terms

Alignment: The match and continuity among the main components of the educational system, such as standards, assessments, curriculum, professional development, and pedagogy. For the purposes of this study, the term alignment will be used in the context of standards and assessments.
Alignment Index: Quantitative formula intended to measure (in proportions or percentages) the alignment between standards and assessments. It ranges from 0 to 1, where 1 indicates perfect alignment. Three different indices are defined in this study: a Partial Index with respect to standards, I(S), or with respect to assessments, I(A), for each subject topic; an Overall Index (I) as a combination of the two partial indices for each subject topic; and a Total Index (IT) as the final index for each state test, including all topics for a subject matter. Detailed descriptions of these indices are in the conceptual framework of this study in Chapter 3.

Range of Knowledge (ROK): Alignment criterion. Range or span of content coverage among components of the instructional system. Specifically applied in this study to standards and assessments. ROK is usually expressed as the proportion of standards (or assessments) addressed by the tests (or standards). When ROK is measured from assessments with respect to standards, it is expressed as ROKS. Conversely, when ROK is measured from standards with respect to assessments, it is expressed as ROKA. Detailed definitions of these terms are in Chapter 3 under the framework of the study.

Depth of Knowledge (DOK): Alignment criterion. Scale to indicate the levels of cognitive complexity (cognitive demand) of any component of the instructional system. Specifically applied in this study to standards and assessments. The values of DOK are expressed as below (DOKB), equal (DOKE), or above (DOKA) when comparing standards with respect to assessments. For example, a DOKE value for a test means the proportion of items that have the same DOK value as the respective KSUS. More information about DOK is in Chapter 3.

Balance of Knowledge (BOK): Alignment criterion. The degree of importance or emphasis the standards have on the test.
BOK is measured as the distribution of questions (or standards) addressed by each standard (or assessment) throughout the test. When BOK is measured from assessments with respect to standards, it is expressed as BOKS. Conversely, when BOK is measured from standards with respect to assessments, it is expressed as BOKA. A detailed discussion of BOK can be found in Chapter 3.

Skewness: Quantitative (vector) concept that indicates the contribution of each alignment criterion to the index. Its modulus denotes the magnitude of the skewness, and its direction indicates toward which criterion the skewness of the index is oriented.

Standards for Success (S4S): Project fostered by the Association of American Universities (AAU) aimed at bridging the gap between higher education expectations and K-12 graduation performance. The S4S project, through a series of workshops, collected the data that are used in this study to test the index of alignment constructed. The S4S project involved more than 400 educators and administrators from more than 20 higher education institutions.

Key Knowledge and Skills for University Success (KSUS): Higher education expectations in several subject matters (language, mathematics, science, social science, second language, and humanities/arts) defined by the S4S project. KSUS are equivalent in content and form to K-12 standards. Appendices C and D list the KSUS for language and mathematics, which are the subject matters analyzed in this research.

CHAPTER 2
REVIEW OF THE LITERATURE

Definition of Alignment

Alignment, in the context of systemic standards-based school reform, refers to the match, continuity, and synchronization among the main components of the instructional system: content standards, assessment, curriculum, professional development, and classroom practice.
Alignment and School Reform

The hypothesis of an aligned schooling system, as a necessary condition for a healthy and effective educational system, and ultimately as a guarantee of student achievement, is built into the essence of current systemic school reform (NCEST, 1992; Smith & O'Day, 1991). As Porter (2002) stated, "When a system is aligned, all the messages from the policy environment are consistent with each other, content standards drive the system, and assessment, materials, and professional development are tightly aligned to the content standards" (p. 11).

Alignment as a Necessary Condition

Educational reformers believe that the present state of alignment is weak among all components of the instructional system and agree about the importance of alignment (Baker & Linn, 2000; Feuer, Holland, Green, Bertenthal, & Hemphill, 1999; Rothman et al., 2002). The importance of alignment is also expressed in terms of educational responsibility. States should design assessments aligned to their academic standards to make justifiable accountability decisions (La Marca, 2001; La Marca et al., 2000; Lashway, 1999). Impara (2001), after reviewing the different methodologies of alignment, suggested the need to conduct appropriate large-scale alignment studies to inform instruction. In regard to alignment, one important study was an in-depth analysis of the reasons eighth grade science students in Minnesota were second only to students in Singapore in the Third International Mathematics and Science Study (TIMSS) (National Education Goals Panel, 2000). The researchers found that the two main reasons for such world-class performance were the alignment (by design) of standards, assessment, and curriculum, as well as statewide continuity of educational programs in Minnesota. Alignment has been identified as an important element in educational accountability systems. For example, Title I of the Elementary and Secondary Education Act (ESEA) [P.L.
103-382] requires that the participant states adopt challenging content standards as well as high standards for student performance. Title I mandates that states' content standards "must be aligned" with the assessment system adopted by each state. By the same token, the "No Child Left Behind" Act (ESEA, Act of 2001) ruled that, by the 2005-06 school year, states should start administering annual assessments in reading and mathematics for grades 3-8, which "must be aligned" with the state academic standards. Other subject matters must follow in subsequent years. Despite the importance of alignment recognized by these government mandates, none indicates how to perform such alignment or what constitutes sufficient or appropriate alignment between tests and standards. Alignment is also an important element in educational measurement. Content-related evidence of test validity is based on the match between the content of the test items and the content of the knowledge domain (expectations/standards). Knowledge of these standards is what the test is supposed to measure. Moreover, alignment studies and content-related evidence of validity share commonalities in methodology, as will be discussed later. Alignment measurements range from simple traditional matches, in which, through a checklist, subjective perceptions of the rater, or word counts, test items are compared to standards at school and district levels (Lewis, 1997; Nelson, 2002), to sophisticated large-scale studies, some lasting several years, comparing state standards and state tests across the United States (Impara et al., 2000; Porter & Smithson, 2001a; Webb, 1997; Rothman et al., 2002). Another concern is the alignment problem between secondary and post-secondary education (P-16) (Tafel & Eberhart, 1999). Kirst (1998) recognized the importance of this alignment and identified four critical points, which "threaten to potentially undermine the preparation of American secondary students for college education" (p.
2). These points include: (a) lack of authentic measures for student assessment regarding college preparation; (b) misalignment between secondary student preparation and college admission and placement standards; (c) placement of an unacceptably high number of students in remedial classes; and (d) low retention and completion rates of students in many public universities. The lack of alignment between K-12 and post-secondary education (P-16) has motivated various higher education institutions across the nation to review their admission policies and launch efforts aimed at connecting high school standards with university success. A number of those efforts are directed at establishing a relationship between university admissions and state K-12 standards and assessments (CPRE, 2000; Tafel & Eberhart, 1999). For example, a consortium of universities across the nation launched the Standards for Success (S4S) project, supported by the Association of American Universities (AAU). Under S4S, a series of national conversations have taken place involving nearly 400 higher education faculty members and administrators from more than 20 higher education institutions (Conley, 2002). S4S has described what higher education institutions expect of entering freshmen in terms of knowledge and skills. The S4S project put together a document, Key Knowledge and Skills for University Success (KSUS), which includes six categories encompassing six academic disciplines—English, math, science, social science, second language, and humanities/arts (Conley, 2002). KSUS has been distributed to school districts across the United States, and it is expected that some school districts will take KSUS into consideration when defining their own content standards. Importantly, a comparison between KSUS and state tests is equivalent to a comparison between state standards and state tests.
Additionally, S4S has carried out a series of workshops in which experienced educators have rated both KSUS and state assessments with respect to predefined depth of knowledge criteria and content concurrence. Data from the S4S project are available to scholars and researchers who are interested in developing a defensible methodology to match college expectations and current K-12 assessment. This research study has utilized data from the S4S project with the purpose of finding evidence of validity for the alignment index constructed.

Determining Alignment between Standards and Assessment

Three basic approaches have been used to determine alignment between standards and assessment (La Marca et al., 2000; Webb, 1997, 1999). These approaches include: (a) the development of the assessment following sequentially after the development of the standards; (b) post facto judgment, which is expert review of the standards and the assessment; and (c) document analysis, which is a systematic analysis (coding) of standards and assessment using a common pre-defined metric. These three strategies are also used to secure content-related evidence when they refer to the validity of assessment-based interpretations in educational measurement.

Dimensions of Alignment

Two overarching dimensions of alignment have been identified in regard to item-level comparison of tests to standards: content match and depth match. La Marca et al. (2000) wrote a guide to assist states and school districts in aligning their assessment systems to their standards. He identified relevant aspects of alignment to be considered, such as content match, depth match, emphasis, performance match, accessibility, and reporting. He also discussed alignment in the context of other important educational components, such as accountability, teachers' involvement, professional development, policy development, textbook adoption, and K-16 connections.
La Marca (2001) further refined the two most important dimensions of content alignment (content match and depth of match). For content match, according to La Marca, the concern is how well the test matches subject area content identified through state academic standards. Of relevance are broad content coverage, which concerns whether test content addresses broad academic standards and whether there is categorical congruence; range of coverage, which concerns whether test items address the specific objectives related to each standard; and balance of coverage, which concerns whether test items reflect the major emphases and priorities of the academic standards. For depth match, the concern is how well the test items match the knowledge and skills specified in the state standards in terms of cognitive complexity. A test that emphasized simple recall, for example, would not be well aligned with a standard calling for students to be able to demonstrate a skill. In a study of the alignment of norm-referenced achievement tests with Nebraska's content standards, Impara et al. (2000) used teachers' judgments to measure the level of alignment using declarative perception of alignment, with the rating criteria of high level, moderate level, low level, and no alignment. These authors contend that using this definition and procedure of alignment resulted in a much higher likelihood that test items would match the content of the standards. This lowest common denominator is not acceptable in a genuine study of alignment, however. Impara's definition of alignment can be considered a narrow characterization of alignment in comparison to the work of authors such as Webb (1997, 1999), Porter (2002), and Rothman et al. (2002). The current study will concentrate only on the three main dimensions of alignment that characterize the content focus category, which are breadth of match, depth of match, and balance of match.
Methods of Alignment

Several methodologies, generally as a combination of the three basic approaches presented above, have been proposed to determine the degree of alignment between standards and assessment in states and school districts. These approaches are generally referred to as content alignment methodologies because their aim is to match the content covered between standards and assessment. This type of content alignment usually employs rating scales (percentages) based on breadth of content coverage and depth of content coverage. Every component of the system (e.g., standards and assessments) receives a rating on both scales (breadth and depth), and then a comparison is established according to the level (percentage) of concurrence of content covered. Buckendahl et al. (2000), for example, used panels of experienced teachers to rate alignment between district standards and commercial tests. These authors found that teachers' and test publishers' perspectives do not coincide when commercial test publishers claim that their tests are aligned with current state standards. Rothman et al. (2002) also trained a panel of experts to determine alignment between assessment and standards, using protocols analogous to the scoring of performance assessments or portfolios. In determining the degree of alignment, these authors designed a protocol based on four dimensions to rate the level of alignment between assessment and standards. The four dimensions are content centrality, performance centrality, challenge, and balance and range. The criterion of content centrality defines the match between the content of each item and the content of the related standard. Sometimes this criterion is also used to determine the importance of a topic in a test. Performance centrality focuses on the degree of match between the types of performance exhibited by each test item and the types of performance required by the corresponding standard.
The criterion of challenge compares the level of challenge between the assessment items and the related standard. Two factors are considered in evaluating this match: (a) source of challenge, in which reviewers rate items according to the intrinsic difficulty of the questions or to the difficulty with respect to students' background knowledge, and (b) level of challenge, in which reviewers rate test items according to the level of difficulty presented by the standards. The level of challenge in both instruments (tests and standards) should be comparable. Fulfilling the criterion of balance and range ensures that test items cover the full range of standards with an appropriate balance of emphasis across the standards. When referring to a defined content domain of skills, knowledge, and affect, this protocol corresponds to some degree with the terminology of test validity in educational measurement.

Webb's Alignment Methodology

Webb (1997) proposed the use of experts who perform a systematic review of the standards and the corresponding tests. Webb's approach to measuring alignment depends upon five categories, each of which informs one aspect of the alignment methodology. These five categories are content focus, articulation across grades and ages, equity and fairness, pedagogical implications, and system applicability. Content focus is related to the content knowledge of the standards and assessment. It is subdivided into six criteria: categorical concurrence, depth of knowledge consistency, range of knowledge correspondence, structure of knowledge comparability, balance of representation, and dispositional consonance. Articulation across grades and ages is related to students learning at different developmental stages and their understanding of content and processes growing over time. This category is based on cognitive soundness determined by research and understanding, as well as cumulative growth in content knowledge during students' schooling.
Equity and fairness involve alignment of standards and assessment, and give students the opportunity for higher levels of learning. Pedagogical implications include the idea that alignment of standards and assessment should affect teaching practice. Proper alignment will help teachers develop appropriate pedagogy. Alignment is judged through engagement of students, effective classroom practices, and use of technology, materials, and tools. System applicability means that alignment of standards and assessment should help teachers to create educational systems that are realistic, reliable, applicable, and attainable. Currently, only the first category, content focus, has been developed extensively in Webb's (1997, 1999) work and by authors such as Porter (2000, 2002) and La Marca (2001). In subsequent work, Webb (2001, 2002) refined his methodology, concentrating on the content focus category. He applied this methodology to the alignment of science, mathematics, and language with their respective standards in selected states. In Webb's model of alignment, the content focus category includes four criteria: categorical concurrence, depth of knowledge consistency, range of knowledge correspondence, and balance of representation. The categorical concurrence (CC) criterion refers to the content categories of the standards and assessment. This criterion is met if both documents address the same content categories. According to Webb (1997, 1999), a minimum of six items addressing the same category is required for alignment. This study challenges this criterion for several reasons, which are discussed in detail below in the conceptual framework of this study. Depth of knowledge consistency (DOK) is an indication of the complexity of knowledge. Webb (1997, 1999) defined four levels of DOK, which will be presented later. This criterion is met if the standards and assessments are comparable in their cognitive exigency.
According to Webb, the two instruments (standards and assessment) are comparable if the DOK of the assessment items is equal to or above the depth of knowledge of the standards. Webb contends that a minimum benchmark of 50% (the sum of equal and above rates) is the criterion for alignment. The researcher will argue that perfect alignment should be considered when test and standard have equal DOK. It is not fair for learners to face higher levels of exigency in a test if the standard (and classroom instruction) did not prepare them for such demand. Range of knowledge correspondence (ROK) refers to the breadth of knowledge between the standards and assessment. In other words, it is the proportion of standards addressed by the items. Alignment is acceptable if a comparable span of knowledge is achieved between standards and assessment. According to Webb (1997, 1999), an agreement of at least 50% is needed for proper alignment. The researcher will argue that, to have proper alignment, the proportion of items addressed by standards (bidirectional) also should be considered. In other words, it is a measurement of how the test (items) covers the standards. Balance of representation (BR) means that the degree of importance and emphasis of content, instruction, and tasks should be comparable in both instruments (tests and standards). The number of questions that address a given standard should be equally distributed across the test. Webb (1997, 1999) defined a partial index of alignment to measure this criterion, using a formula similar to that proposed by Porter and Smithson (2001a). This formula is central to this study and will be discussed later. According to Webb (1997, 1999), an index value of at least .70 is needed for proper alignment using this criterion. Webb's BR is also unidirectional in its measurement of the emphasis of items in relation to the standards.
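Webb's minimum benchmarks described above (at least six items per content category, at least 50% of items at or above the standards' DOK, at least 50% of standards addressed, and a balance index of at least .70) can be sketched as a simple acceptability check. This is a hypothetical illustration, not Webb's own software; the function name and the example inputs are invented for the sketch.

```python
# Hypothetical sketch of Webb's (1997, 1999) acceptability benchmarks.
# All names and example inputs are illustrative assumptions.

def webb_acceptable(hits_per_category, dok_equal_or_above, rok, br_index):
    """Apply Webb's minimum benchmarks for the four content-focus criteria.

    hits_per_category:  items addressing each content category (CC: >= 6 each)
    dok_equal_or_above: fraction of items at or above the standards' DOK (>= 0.50)
    rok:                fraction of standards addressed by items (>= 0.50)
    br_index:           balance-of-representation index (>= 0.70)
    """
    return {
        "categorical_concurrence": all(n >= 6 for n in hits_per_category),
        "depth_of_knowledge": dok_equal_or_above >= 0.50,
        "range_of_knowledge": rok >= 0.50,
        "balance_of_representation": br_index >= 0.70,
    }

# Invented example: this test would fail only the range-of-knowledge benchmark.
result = webb_acceptable([7, 9, 6], dok_equal_or_above=0.55, rok=0.48, br_index=0.72)
```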
The researcher will argue that emphasis of standards in terms of items (bidirectional) also should be taken into consideration. This modified version of Webb's balance of representation (BR) is called balance of knowledge (BOK) in this study. Although Webb's (2001, 2002) work was in the alignment of expectations and assessment in mathematics, language, and science, he provided a methodology, especially in the content focus category, which can be applied in other content areas. Webb (1997, 1999) proposed two additional criteria within the content focus category, which are structure of knowledge comparability and dispositional consonance. Neither Webb nor any other scholar has further developed these two criteria. The current study will address only the four criteria of categorical concurrence, depth of knowledge, range of knowledge, and balance of representation.

Standards and Tests Cognitive Demand

For the content focus category, Webb (1999, 2002) provided four different levels to judge depth of knowledge (cognitive demand) for both standards and assessments. Level 1 is recall of a fact, information, or procedure. Level 2 is skill/concept, which involves the use of information, conceptual knowledge, and procedures, and uses two or more steps. Level 3 is strategic thinking, which involves the use of reasoning, a plan or sequence of steps, and more than one possible answer. Level 4 is extended thinking, which is the use of investigation, multiple conditions or alternatives, non-routine manipulations, and application of knowledge. Marzano (2001), based on Bloom's taxonomy of educational goals (Bloom, Engelhart, Furst, Hill, & Krathwohl, 1956), suggested a new taxonomy of educational objectives.
Marzano's model changes Bloom's ordered hierarchy of difficulty and suggests, instead, an ordered hierarchy in terms of mental processes and levels of consciousness, in which some mental processes exercise control over the operation of other processes in a hierarchical manner. Marzano's model presents three mental systems: the cognitive system, the metacognitive system, and the self-system. Marzano's model articulates six levels of mental processing. First is the cognitive system, which includes retrieval processes (level 1), comprehension processes (level 2), analysis processes (level 3), and knowledge utilization processes (level 4). Next are the metacognitive system processes (level 5) and then the self-system processes (level 6). Each level of consciousness takes control of the preceding level, ranging from automatic (subconscious) processes at level 1 up to fully conscious processes at level 6. Further, each level subsumes the previous one. In other words, lower levels are contained in subsequent higher levels. Levels of consciousness, rather than levels of complexity, are the key factors in Marzano's taxonomy. A comparison between Bloom's and Marzano's models is depicted in Table 1.

Table 1
Comparison Between Bloom's and Marzano's Models

Bloom's Model    Marzano's Model
Knowledge        Retrieval Processes
Comprehension    Comprehension Processes
Application      Analysis Processes
Analysis         Knowledge Utilization Processes
Synthesis        Metacognitive Processes
Evaluation       Self-System Processes

In recent work, Porter (2002) also defined five descriptors of categories (levels) of cognitive demand in the area of mathematics. It is worth noting that Porter uses the terms "cognitive demand" and "expectations for students" interchangeably. The term "expectations" refers mainly to standards for most scholars.
Porter's five levels of cognitive demand are: (1) memorize facts, definitions, and formulas; (2) perform procedures and solve routine problems; (3) communicate understanding of concepts; (4) solve non-routine problems and make connections; and (5) conjecture, generalize, and prove. As can be observed, Porter's (2002) categories of cognitive demand resemble Bloom's and Marzano's taxonomies. The S4S project used the first five levels of Marzano's model to rank the depth of knowledge of both instruments (state tests and KSUS standards). Cognitive scientists have hypothesized different levels of knowledge since Bloom's work almost 50 years ago. For example, Anderson (1983) defined three levels (declarative, procedural, and strategic) as the three main categories that other scholars have subdivided in an effort to understand the processes of higher order thinking. Because different authors use different terminology that applies to alignment, it is useful to recap the terms. When referring to the different levels used to rate knowledge, for example, Webb uses depth of knowledge; Marzano prefers levels of consciousness and mental processing; Porter talks about cognitive demand or expectations for students; Rothman uses levels of challenge; and La Marca prefers cognitive complexity. Overall, there is a lack of common terminology, as well as disagreement among theories of higher order cognition. These factors have perhaps hindered the development of a firm quantitative method (index) to evaluate alignment across educational contexts (Porter, 2002). The present study will adopt Marzano's cognitive demand taxonomy because the data provided by S4S used his constructs. However, the index of alignment defined in the present study is independent of the number of levels because DOK values are used to compare standards and assessment regardless of the scale used.

Porter's Alignment Methodology

Porter defined an overall index of alignment based on data collected using teacher surveys.
These surveys were designed to measure alignment between assessments and classroom instruction. In Porter's work, the data were obtained by a two-fold procedure. First, he surveyed schoolteachers about the level of content coverage for several subject matters taught in the classroom. He also included questions about student expectations. He used the term cognitive demand, which is equivalent to Webb's DOK, to rate content covered in classroom instruction as well as in assessments. Content of instruction was then measured at the intersection between topics covered and students' cognitive demand. A table of topics covered versus cognitive demand was built. Second, the same procedure was applied to the tests. Each item was rated with the same cognitive demand scale (DOK), and another table of items versus cognitive demand was also built. These two tables were converted to context matrices of proportions (percentages) so that comparisons could be made. A context matrix describing standards (X) and another describing assessment (Y) were defined in Porter's model. The match between these two matrices, computed cell by cell, is a measurement of content alignment between standards and assessments. Porter's index of alignment, which ranges from 0 to 1, is represented by the following formula:

I = 1 - (Σ|X - Y|)/2    (1.1)

where X and Y denote cell proportions in the standards matrix and assessment matrix, respectively. An index value of 1 indicates perfect alignment. Figure 1 presents Porter and Smithson's (2001b) alignment analysis.

Figure 1. Alignment analysis. From Porter & Smithson (2001a).

To understand the methodology of alignment proposed by Porter (2002), an example is helpful. Data are usually represented in a table of standards against test items, such as the following (Table 2).

Table 2
Porter's (2002) Methodology of Alignment

             C.D.   Item 1   Item 2   Item 3   Item 4
C.D.                N1       N2       N3       N4
Standard 1   N1     √
Standard 2   N2              √
Standard 3   N3                       √
Standard 4   N4                                √

Note.
Item #: the first row includes the test items of the assessment. Standard #: the first column includes the standards. C.D. (cognitive demand) ranges from N1 to N4 (integers); these values are assigned to both instruments (standards and assessment) by the rater. √: categorical concurrence assigned by raters (hits).

Table 3 presents an example with imaginary data. In this example, C.D. values range from 1 to 4 and the total number of hits (√) is 5.

Table 3
Porter's (2002) Methodology of Alignment Using Imaginary Data

             C.D.   Item 1   Item 2   Item 3   Item 4
C.D.                2        2        4        3
Standard 1   1      √        √
Standard 2   1
Standard 3   3                        √        √
Standard 4   2      √

Tables 4 and 5 below show the data converted to two matrices of proportions, one for the standards and another for the assessment.

Table 4
Porter's (2002) Methodology of Alignment Using Constructed Matrix X with Respect to Standards' Cognitive Demand

              Cognitive Demand
              1      2      3      4
Standard 1    2/5    0      0      0
Standard 2    0      0      0      0
Standard 3    0      0      2/5    0
Standard 4    0      1/5    0      0

Table 5
Porter's (2002) Methodology of Alignment Using Constructed Matrix Y with Respect to Items' Cognitive Demand

              Cognitive Demand
              1      2      3      4
Standard 1    0      2/5    0      0
Standard 2    0      0      0      0
Standard 3    0      0      1/5    1/5
Standard 4    0      1/5    0      0

The numbers in each cell represent the proportions of hits for each standard or assessment in relation to the total number of hits. The index of alignment between standards and assessment is then calculated by comparing, cell by cell, the two matrices using equation 1.1:

(Σ|X - Y|)/2 = (|2/5 - 0| + |0 - 2/5| + |2/5 - 1/5| + |0 - 1/5| + |1/5 - 1/5|)/2 = 3/5 = 0.6

Thus, I = 0.4, which is considered a weak index of alignment (I < 0.5) (Porter, 2002). In the conceptual framework of this study there is a detailed definition of the proposed alignment index. A prototype of the index is presented in Appendix A, in which the alignment index is defined in terms of DOK, ROK, and BOK.
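The worked example above can be reproduced in a few lines. The sketch below is an illustration, not Porter's own software: it encodes matrices X and Y from Tables 4 and 5 and applies equation 1.1.

```python
# Matrices of cell proportions from Tables 4 and 5 (rows: standards,
# columns: cognitive demand levels 1-4).
X = [
    [2/5, 0,   0,   0],
    [0,   0,   0,   0],
    [0,   0,   2/5, 0],
    [0,   1/5, 0,   0],
]
Y = [
    [0, 2/5, 0,   0],
    [0, 0,   0,   0],
    [0, 0,   1/5, 1/5],
    [0, 1/5, 0,   0],
]

def porter_index(X, Y):
    """Porter's alignment index, equation 1.1: I = 1 - (sum of |X - Y|) / 2."""
    total = sum(abs(x - y) for row_x, row_y in zip(X, Y)
                for x, y in zip(row_x, row_y))
    return 1 - total / 2

print(round(porter_index(X, Y), 2))  # 0.4, a weak index of alignment (I < 0.5)
```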
A comparison including Webb's (1997) and Porter's (2002) models is also included in the prototype, as well as graphical representations of the alignment components. It is worth mentioning that Porter's index and Webb's DOK-equal calculations are completely equivalent. In addition, the alignment index defined in this study was used to measure the alignment between KSUS and selected state tests, helping to bridge the alignment gap between K-12 and higher education institutions. Figure 2 represents all possible paths of alignment between K-12 (state) and higher education (P-16), and within districts, states, and P-16. Although the alignment index defined in this study could be used across different contexts (paths in Figure 2), the properties and behavior of the index were tested using S4S data and making comparisons between higher education standards and state assessments. This procedure is represented by the bold arrow in Figure 2.

Figure 2. Bridging the gap between K-12 and P-16 (modified from Porter, 2002).

An index of alignment taking into consideration Webb's (1997, 1999) and Porter's (2002) approaches would serve to estimate alignment in different directions (vertically and horizontally) across educational contexts (see Figures 1 and 2).

The Standards for Success Project (S4S)

A consortium of universities across the nation, supported by the Association of American Universities (AAU), has joined efforts in describing what its members expect of entering freshmen in terms of knowledge and skills. Under the project Standards for Success (S4S) (Conley, 2002), a series of national conversations have taken place involving nearly 400 higher education faculty members and administrators from more than 20 higher education institutions.
The S4S project put together a document named Key Knowledge and Skills for University Success (KSUS), which includes six categories encompassing six academic disciplines: English, math, science, social science, second language, and humanities/arts (Conley, 2002). The KSUS in language and mathematics are listed in Appendices C and D, respectively. The KSUS are a set of comprehensive statements of what higher education institutions expect from well-prepared senior graduates. The KSUS standards used in this research are grouped into several areas according to the different topics into which each subject matter is divided, as shown in Tables 6 and 7. The percentages are used later in Chapter 4 as weights in the definition of the total index of alignment.

Table 6
KSUS – English Language Topics

Topic                       No. of Objectives   Percentages (Weights)
Reading and Comprehension   24                  0.37
Writing                     25                  0.39
Research Skills             10                  0.15
Critical Thinking            6                  0.09

Table 7
KSUS – Mathematics Topics

Topic            No. of Objectives   Percentages (Weights)
Computation      11                  0.15
Algebra          22                  0.29
Trigonometry      4                  0.05
Geometry         13                  0.17
Math Reasoning   25                  0.34

The KSUS have been distributed to school districts across the US, and it is expected that some school districts will take the KSUS into consideration when defining their own content standards. As a consequence, the KSUS are appropriate for use in this study because they are equivalent (in form and content) to state standards. In other words, a comparison between the KSUS and state tests is equivalent to comparing state standards to state tests. Twenty-seven state-level assessment exams with similar purposes were chosen for this study. All of them are administered in grades 10 or 11, or are designated as end-of-course tests. Also, S4S has carried out a series of workshops in which experienced educators rated both the KSUS and state assessments with respect to predefined depth-of-knowledge criteria and content concurrence.
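The topic weights in Tables 6 and 7 are each topic's share of the total number of objectives (for example, 24/65 ≈ 0.37 for Reading and Comprehension; the tabled values are rounded to two decimals, so they may differ from the exact ratios by about ±0.01). A minimal sketch, with the dictionary below as an assumed input format:

```python
# Objective counts per topic, taken from Table 6 (English language).
english_objectives = {
    "Reading and Comprehension": 24,
    "Writing": 25,
    "Research Skills": 10,
    "Critical Thinking": 6,
}

def topic_weights(objectives):
    """Each topic's weight is its proportion of all objectives."""
    total = sum(objectives.values())
    return {topic: n / total for topic, n in objectives.items()}

weights = topic_weights(english_objectives)
# e.g. Reading and Comprehension: 24/65, which rounds to 0.37
```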
Data from the S4S project are available to scholars and researchers who are interested in developing a defensible methodology to match college expectations (KSUS) and current K-12 assessment (Conley, 2002). The S4S project involved training a group of experienced higher education and K-12 teachers in a rating procedure based fundamentally on Webb's (1997, 2000) alignment methodology and Marzano's (2001) taxonomy of educational objectives. The raters needed to be familiar with the KSUS and the state's assessment. Both expectations and tests were rated with respect to depth of knowledge (DOK) using Marzano's levels of mental processing, ranging from 1 to 5. S4S raters performed three basic tasks in order to compile the essential data used during the execution of this study:

a. Rate the depth of knowledge level of each KSUS. This task provides values of DOK for the standards.
b. Rate the depth of knowledge level of each assessment item. This task provides values of DOK for the test items.
c. Determine the categorical concurrence between KSUS and corresponding test items. This task provides values (hits) of categorical concurrence between the two instruments (tests and standards).

Values of ROK and BOK (BR) have been obtained by applying the respective alignment criteria to the data collected above. This research project received S4S raw data generated by different raters (from 4 to 6) for each subject matter (English and mathematics) and each of the 27 selected state tests. Data from the S4S project served to explore the properties and behavior of the alignment index defined in this study.

CHAPTER 3
METHODOLOGY

Study Objectives

This research study focuses on the development of a quantitative methodology to analyze and measure the alignment between standards and assessment.
This alignment methodology includes the construction of an index as a mathematical entity that describes and measures the alignment between expectations and tests in a quantitative manner. Such a metric is needed to advance research in this area (Porter & Smithson, 2001a; Porter, 2002). In building the index, the researcher has taken into consideration the main qualitative and quantitative criteria already established in the alignment research literature (Webb, 1997, 2001, 2002; La Marca, 2001; Impara, 2001; Porter, 2002; Rothman et al., 2002). This study reviewed, extended, and advanced such criteria by building a cohesive, concise, and comprehensive methodology that informs alignment of standards and tests from different perspectives. Several computational tools were developed to prototype, create, and manipulate the index and the criteria from which it is built. Additionally, the researcher adopted a minimum terminology to define the criteria and to describe the process of measuring alignment. Such terminology is defined in Chapter 1 under the section Definition of Terms and expanded below in the conceptual framework of the study. Novel graphical representations were developed to describe and measure alignment. In particular, trilinear plots and centroid graphs were used to provide visual interpretations of the alignment criteria and the indices constructed. Such visual representations served to improve our understanding of the alignment process by analyzing visual patterns. As a result of these graphical analyses, new data and new relationships could be revealed. This flexibility allows researchers to analyze data from novel perspectives, and sometimes such graphical representations let scholars visualize patterns or connections not shown in tables and flat graphs.
Research Questions

This study is devoted to filling a need in educational alignment research: determining a quantitative methodology to measure alignment between expectations (standards) and assessment. A systematic and common terminology was adopted in terms of three alignment criteria that help to systematize and simplify the alignment process. A minimum and concise language was needed in order to resolve differences among the variant terminology and conceptualizations used by scholars when defining the dimensions and criteria of alignment research. In addition, this research study developed a mathematical index (alignment index) as a function of the alignment criteria adopted. This index served to measure the match between expectations (standards) and assessment. Also, this index was utilized across different educational contexts such as subject matter and states. Data available from the S4S project (Conley, 2002) were used to explore the properties and performance of the proposed alignment index. The five fundamental research questions addressed by this study are:

Research Question 1. What is a minimum terminology necessary and sufficient to analyze and describe the criteria in the process of measuring content alignment?

Research Question 2. What are the mathematical form and characteristics of a quantitative metric (alignment index) to measure alignment between expectations and assessment across instructional contexts?

Research Question 3. To what extent are state assessments and higher education expectations (KSUS) aligned according to the index defined?

Research Question 4. What alignment comparisons can be made between KSUS and state assessment across states and across subject matter using this index?

Research Question 5. What relationships could be established between the levels of alignment and accountability features of state testing?
Conceptual Framework of the Study

Based on the alignment research literature, the conceptual framework of this study includes the concepts of content focus, content coverage, and content centrality. All three concepts are considered equivalent and will be embedded in a single category, the content focus category. In other words, for the purpose of the study, centrality is considered constant through the standards and the tests. Content focus includes, in turn, three alignment criteria: range of knowledge (ROK), depth of knowledge (DOK), and balance of knowledge (BOK). The outline below represents the conceptual framework, including in parentheses the terminology subsumed into each concept.

Content Focus (content coverage, content centrality)
    ROK (range of coverage, span of coverage, range of knowledge)
    DOK (depth of coverage, cognitive demand, depth of knowledge)
    BOK (balance of coverage, balance of representation, balance of knowledge)

These three terms are the minimum criteria chosen in this study for alignment between standards and assessments. They belong to the three dimensions of alignment (range, depth, and balance) that characterize the content focus category. The alignment index constructed in this study involves the quantification of the three criteria defined in the framework. This index is a formula expressed as a function of ROK, DOK, and BOK. Additionally, ROK and BOK will be extended to include bidirectional alignment, not only from assessment with respect to standards (as Webb proposed) but also from standards with respect to assessments. Alignment of tests with respect to standards has been termed with the suffix "S." Its meaning can be interpreted as the relevance of standards to items. It is calculated as the proportion of topics (objectives) of the standards addressed by the test.
Likewise, alignment of the standards with respect to assessments has been termed with the suffix "A." Its meaning can be interpreted as the relevance of the items to the standards. It is calculated as the proportion of items that match content found in the standards. For example, ROKS and BOKA represent range of knowledge with respect to standards and balance of knowledge with respect to assessment, respectively (see Appendix A). Webb's balance of representation (BR) will be replaced in this study by the extended concept balance of knowledge (BOK). The criterion BOK is bidirectional, including balance of knowledge with respect to standards (BOKS) and with respect to assessment (BOKA). BOKS can be interpreted as a measurement of the distribution of questions addressed by each standard. Conversely, BOKA can be interpreted as a measurement of the distribution of objectives addressed by each item. Following the same convention, DOK is represented by DOKA, DOKE, and DOKB, indicating depth of knowledge above, equal to, and below the standards (or assessments), respectively. This bidirectional conceptualization of the alignment criteria is instrumental in the definition of the indices of alignment below. Webb's (1997) categorical concurrence (CC) is not considered in this study because this criterion has been challenged for several reasons. First, it is the only criterion that uses a different metric in relation to the other criteria. Webb's categorical concurrence is measured as an absolute value rather than as a proportion or percentage; consequently, it is inappropriate to include it as a component of the overall index. Second, the number six chosen by Webb as the minimum condition for alignment in the categorical concurrence category is arbitrary and does not take into consideration the number of standards included in the match. It is possible to have a short test that addresses a small number of standards.
Third, Webb's categorical concurrence seems to represent a validity criterion rather than an alignment criterion by itself, since it is based on previous work (Subkoviak, 1988) that is related only to the reliability of mastery tests. Although there is a relationship between test validity and alignment, as noted before, categorical concurrence is not strictly a criterion of alignment as the other three are. Consequently, categorical concurrence is an attribute of the test that is related only to its validity and thus contradicts Webb's (2001) own assertion that "alignment is a quality of the relationship between expectations and assessments and not an attribute of any one of these two system components" (p. 2). The terminology defined and used above constitutes a revised, modified, and simplified terminology taken from the existing alignment research. Consequently, range of knowledge (ROK), depth of knowledge (DOK), balance of knowledge (BOK), and all of their sub-criteria defined above constitute the minimum and sufficient terminology to describe all of the criteria in the process of measuring content alignment between standards and assessments. Additionally, this study adopted the following terminology to describe the components of the standards: standards are divided into topics, and topics are composed of objectives. In this way, the terms defined in this conceptual framework address the first research question of the study.

Prototyping the Index of Alignment

In order to define and represent graphically the index of alignment, centroid plots were used. Centroid plots are graphs constructed using three concurrent axes at 120 degrees from one another. Each axis in the graph represents one of the three alignment criteria. In turn, the criteria values determine the vertices of a triangle. A software program was developed in order to prototype three possible formulas that would represent the index of alignment (Figure 3).
The first formula (Index Area) was represented by the area of the triangle whose vertices are determined by the three alignment criteria, ROK, DOK, and BOK. This approach using the area of a triangle was utilized by Conley & Brown (in press). The second formula (Index Vector) is a vector constructed as the sum of three different vectors, each representing one of the alignment criteria. The third possible index (Index Mean) was the simple average of ROK, DOK, and BOK. Comparing the behavior of the three formulas, it was found that the first equation, represented by the area, has a tendency to underestimate the value of the index, which tended to zero when two criteria approached zero. This singularity is not appropriate because the index should show a value different from zero when at least one of the criteria is different from zero. The second equation, represented by a vector, tended to overestimate the index. Consequently, the third formula, which corresponds to the average of the three criteria, performed smoothly and was chosen as the best representation of the alignment index.

Figure 3. Prototyping the index of alignment using a centroid plot. Three equations are explored.

Taking the mean value of the three criteria to represent the index of alignment is plausible (Kane, 1992) given the evidence provided by the software prototype. Plausibility in this case is based on the best interpretation that can be made for the definition of the index. The mean value of the three alignment criteria is the best possible interpretation because it is supported by the evidence provided by the software, by its performance, by its simplicity, and by its statistical meaning. None of these characteristics is present in the other two formulas.
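Under the stated construction (three axes 120 degrees apart, each criterion plotted as a distance along one axis), two of the candidate formulas can be sketched, together with the magnitude of the vector sum of the criteria (the quantity called skewness below). This is an illustrative reconstruction under those geometric assumptions, not the study's actual prototype software; the vector-based index formula is omitted because its exact normalization is not reproduced here.

```python
import math

# a, b, c stand for the three criteria (e.g. ROK, DOK, BOK), each in [0, 1],
# plotted on three concurrent axes 120 degrees apart.

def index_area(a, b, c):
    """Area of the triangle with vertices at distances a, b, c along the
    three axes; by the shoelace formula it equals (sqrt(3)/4)(ab + bc + ca),
    so it collapses to 0 whenever two of the criteria are 0."""
    return (math.sqrt(3) / 4) * (a * b + b * c + c * a)

def index_mean(a, b, c):
    """The simple average, the formula adopted in the study."""
    return (a + b + c) / 3

def skewness(a, b, c):
    """Magnitude of the vector sum of the three criteria in the plot plane;
    it is 0 exactly when a = b = c (an equilateral triangle)."""
    x = a - b / 2 - c / 2              # axis directions at 0, 120, 240 degrees
    y = (math.sqrt(3) / 2) * (b - c)
    return math.hypot(x, y)

index_mean(0.9, 0.0, 0.0)   # 0.3: the mean stays nonzero
index_area(0.9, 0.0, 0.0)   # 0.0: the singularity noted above
skewness(0.5, 0.5, 0.5)     # 0.0: all criteria contribute equally
```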
In addition to defining a mathematical formula to represent the index of alignment, the researcher proposed a concept that indicates how much each criterion contributes to the index. This concept is called skewness, and it has a vectorial character with magnitude (module) and direction. Skewness is the vectorial sum of the three alignment criteria in the plane of the graph. Its module is a scalar measure of how differently the criteria contribute to the index, and its direction indicates the criterion toward which the index leans; in other words, skewness points toward the criterion that contributes the most to the index. Skewness equal to zero means that each criterion contributes equally to the index, which in graphical terms happens when the three criteria determine an equilateral triangle.

Definition of the Indices of Alignment

The definition of the index of alignment includes the representation of the index as a mathematical formula that is a function of the three alignment criteria (ROK, DOK, and BOK). A test is considered aligned to its respective content standard from the multiple perspectives given by the different indices defined below. The levels of alignment are chosen for reasons described in Chapter 5. For now, it is worth mentioning that, in relation to DOK, the indices include values of DOK that are at least equal to (DOKE) or above (DOKA) the standard. In other words, DOK is represented in the indices as DOKE or as DOKE + DOKA; the latter is the condition pursued by Webb (1997, 2001). Four different alignment indices are defined in this study, as follows:

1. I(S): This partial index represents the alignment taking into consideration the criteria with respect to standards. It measures the relevance of the standards to the items. It is defined as:

I(S) = (ROKS + DOKE + BOKS)/3    (3.1)

2.
I(A): This partial index represents the alignment taking into consideration the criteria with respect to the assessment. It measures the relevance of the items to the standards. It is defined as:

I(A) = (ROKA + DOKE + BOKA)/3    (3.2)

3. I: This is the overall index of alignment, which combines the two partial indices above. It is defined as:

I = (ROK + DOKE + BOK)/3    (3.3)

where ROK = (ROKS + ROKA)/2 and BOK = (BOKS + BOKA)/2. It can be shown that I = [I(S) + I(A)]/2.

4. I(+): This overall index takes into consideration the combined depth of knowledge equal and above (DOKE + DOKA). It is defined as:

I(+) = [ROK + (DOKE + DOKA) + BOK]/3    (3.4)

A linear combination of the alignment criteria was chosen for the reasons given above in the section Prototyping the Index of Alignment. Appendix A includes an operative prototype model for the alignment index. The four indices defined above are calculated for every topic in each subject matter. Each state test in English language, for example, obtained 16 different indices according to the number of topics (Reading & Comprehension, Writing, Research Skills, and Critical Thinking) into which the subject matter is divided: 4 indices multiplied by 4 topics. The total index of alignment (IT) of a test is calculated from the overall index (I) defined in item 3 above. Because every topic in the KSUS has a specific weight (number of objectives), the total index is defined as a weighted average of the index (I) for each topic.
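The four indices, the identity I = [I(S) + I(A)]/2, and the topic-weighted total can be sketched directly from equations 3.1–3.4 (the function names and the illustrative criterion values below are my own, not data from the study):

```python
def index_s(roks, doke, boks):
    """I(S): relevance of the standards to the items (Eq. 3.1)."""
    return (roks + doke + boks) / 3

def index_a(roka, doke, boka):
    """I(A): relevance of the items to the standards (Eq. 3.2)."""
    return (roka + doke + boka) / 3

def index_overall(roks, roka, doke, boks, boka):
    """I: overall index, with ROK and BOK averaged over both sides (Eq. 3.3)."""
    rok, bok = (roks + roka) / 2, (boks + boka) / 2
    return (rok + doke + bok) / 3

def index_plus(roks, roka, doke, doka, boks, boka):
    """I(+): credits depth of knowledge at or above the standard (Eq. 3.4)."""
    rok, bok = (roks + roka) / 2, (boks + boka) / 2
    return (rok + (doke + doka) + bok) / 3

def total_index(topic_indices, weights):
    """Weighted average of per-topic overall indices (the KSUS topic weights)."""
    return sum(weights[t] * topic_indices[t] for t in weights)

# Illustrative (made-up) criterion values for one topic:
i_s = index_s(0.6, 0.5, 0.4)
i_a = index_a(0.8, 0.5, 0.6)
i = index_overall(0.6, 0.8, 0.5, 0.4, 0.6)
assert abs(i - (i_s + i_a) / 2) < 1e-12  # I = [I(S) + I(A)] / 2
```

Expanding Eq. 3.3 term by term shows why the identity holds: averaging ROK and BOK over the standards side and the assessment side, then dividing by three, is the same as averaging the two partial indices.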
In the case of English language, the total index for a test is defined as follows:

ITE = 0.37 [I(Reading & Comprehension)] + 0.39 [I(Writing)] + 0.15 [I(Research Skills)] + 0.09 [I(Critical Thinking)]    (3.5)

By a similar procedure, the total index for mathematics tests is defined as follows:

ITM = 0.15 [I(Computation)] + 0.29 [I(Algebra)] + 0.05 [I(Trigonometry)] + 0.17 [I(Geometry)] + 0.34 [I(Math Reasoning)]    (3.6)

The weights are taken from Tables 6 and 7 (Chapter 2) for English and mathematics, respectively. In both cases, the total indices of alignment range from 0 to 1. This weighted approach was also used by Conley & Brown (in press). The indices defined above, along with the concept of skewness, give a suitable indication of the content coverage alignment category from different perspectives and provide a comprehensive approach to the alignment process. This section, which concerns the definition and construction of a quantitative index of alignment, addresses the second research question of this study.

Data Collection

To test the index of alignment constructed in this study, data collected by the Standards for Success (S4S) project at Stanford and Oregon universities were used to determine the alignment between higher education expectations (KSUS) and selected state tests. S4S is the project that provided the definition of KSUS. The researcher received S4S raw data generated by a number of raters ranging from four to six. These data consisted of Excel workbooks including rating values of the KSUS and state assessments on a DOK scale ranging from 1 to 5. Additionally, the raters determined categorical concurrence between KSUS objectives and test items. The categorical concurrence spreadsheet was a matrix of hits at the cross points of the standards’ objectives (topics) and test items. A hit was defined as a mark in the cell where an objective and a test item intersect.
A mark indicates that both the standard objective and the test item addressed the same content category. Appendix A contains a detailed description of the model, including the definitions, calculations, and operations of all variables. School accountability data from different states were also gathered and contrasted with the alignment results, including accountability features of state testing such as NAEP. Twenty-seven state tests in English language and mathematics were processed in this study.

Data Analysis

Data analysis is included in the third objective of this study. Data acquired from the S4S project were analyzed and used to put the index to work. Values of ROK, DOK, and BOK were obtained from the raters’ responses using the model defined in Appendix A. These values served as the alignment criteria and, consequently, were used to compute the quantitative index of alignment. Two subject matters were chosen: English language and mathematics. Four different indices of alignment [I(S), I(A), I, and I(+)] were calculated by topic within each subject matter for all states. Chapter 2, under the section Standards for Success Project, lists the subject matter topics for English language and mathematics. Reliability analysis of the raters’ data was performed using generalizability theory; G-coefficients were calculated for all states with regard to the DOK values assigned by the raters.

Reliability of Data

Generalizability (G-Study) analysis was performed to determine the reliability of the raters’ data, using generalizability theory (G-Theory). Generalizability theory is a statistical theory for evaluating the dependability of behavioral and social science measurements (Brennan, 2001; Cronbach, Gleser, Nanda, & Rajaratnam, 1972).
G-Theory can be applied in this study because it provides a framework to analyze the reliability of the observations provided by the raters during the execution of the S4S project. The S4S project trained a group of experienced higher education and K-12 teachers in a rating procedure based on Webb’s (1997, 2000) alignment methodology and Marzano’s (2001) taxonomy of educational objectives. The raters needed to be familiar with the KSUS and the state assessments. Both expectations and tests were rated with respect to DOK using Marzano’s levels of mental processing, which range from 1 to 5. The raters performed three basic tasks to compile the essential data used in this study: (a) rate the DOK level of each KSUS objective, which provides DOK values for the standards; (b) rate the DOK level of each assessment item, which provides DOK values for the test items; and (c) determine the categorical concurrence between KSUS objectives and corresponding test items, which provides values (hits) of categorical concurrence between the two instruments (tests and standards) in a matrix of hits. Values of ROK and BR (BR was converted to BOK) were obtained by applying the respective alignment criteria to the data collected from the raters. Moreover, validity of the constructed alignment index was based on its internal mathematical structure, logical inferences about its function, and its performance when used with the KSUS and state tests. Additionally, alignment results were correlated with accountability features of state testing, such as NAEP reports. Finally, it was recommended that accumulated evidence of validity be obtained through subsequent use of the index.

CHAPTER 4 RESULTS

Generalizability of KSUS

A generalizability study (G-Study) was performed in order to determine the reliability of the data provided by the S4S project.
Six trained subjects rated the KSUS in language and mathematics according to Marzano’s five levels of depth of knowledge (DOK). Figure 4 depicts the generalizability coefficients for KSUS-Language and KSUS-Math versus the number of raters. For six raters, the G-coefficient is 0.93 for KSUS-Language and 0.89 for KSUS-Math; both values indicate acceptable levels of consistency among the raters. According to Figure 4, as few as four raters would be enough to reach acceptable levels of reliability (G > 0.80) in both subject matters.

Figure 4. Generalizability coefficients for KSUS in English and mathematics.

Generalizability of State Tests

Generalizability analysis was performed for all 27 state tests. Figure 5 depicts the reliability coefficients for Colorado-English as an example among all states. The state of Colorado (the first alphabetically) was chosen with the sole purpose of illustrating the entire methodology of the study. As Figure 5 shows, generalizability coefficients above 0.80 were obtained with three or more raters for English language in the state of Colorado.

Figure 5. Generalizability coefficient for Colorado-Language.

For mathematics, generalizability values around 0.80 were reached with five raters for the state of Colorado, as seen in Figure 6.

Figure 6. Generalizability coefficient for Colorado-Mathematics.

In terms of raters’ generalizability for test items, twenty-seven state assessments were analyzed in English language and mathematics.
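A minimal sketch of a G-coefficient computation for the crossed items-by-raters design used here (my own implementation of the standard one-facet formulas; the variable names are assumptions, and the actual S4S analysis may well have used dedicated G-theory software):

```python
def g_coefficient(scores):
    """Relative G-coefficient for a crossed items x raters design.

    scores[i][r] is rater r's DOK rating of item i. Variance components
    are estimated from the two-way ANOVA mean squares (no replication).
    """
    n_i, n_r = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_i * n_r)
    item_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(scores[i][r] for i in range(n_i)) / n_i
                   for r in range(n_r)]
    ss_items = n_r * sum((m - grand) ** 2 for m in item_means)
    ss_raters = n_i * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_items = ss_items / (n_i - 1)
    ms_res = (ss_total - ss_items - ss_raters) / ((n_i - 1) * (n_r - 1))
    var_items = max((ms_items - ms_res) / n_r, 0.0)
    # Relative error variance for a mean taken over n_r raters.
    return var_items / (var_items + ms_res / n_r)

# Hypothetical data: three raters who largely agree on five items.
ratings = [[1, 1, 2], [3, 3, 3], [2, 2, 2], [4, 4, 5], [5, 5, 4]]
g = g_coefficient(ratings)
assert g > 0.90  # close agreement yields G near 1
```

The pattern in Figure 4, where G rises with the number of raters, follows from the n_r in the error term: averaging over more raters shrinks the relative error variance.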
In most cases, the generalizability (dependability) results were consistent and reliable (see Figure 7). Only four states (KY-Reading, G = 0.78; MS-English, G = 0.74; PA-Writing, G = 0.69; and TX-Reading, G = 0.75), among a total of 27 in the English language area, fell below the value of G (0.80) that is considered desirable. However, KY-Reading (G = 0.78) and TX-Reading (G = 0.75) could be considered within the range of acceptable values; only PA-Writing (G = 0.69) could be considered below acceptable scores. The lowest dependability coefficients for DOK are due mainly to the restriction of range across items determined by the raters. This behavior has also been found in previous alignment work, such as that performed by Herman et al. (2003).

Figure 7. Raters’ reliability for all state tests in English-Language.

To examine the restriction of range for PA-Writing, it is necessary to look at the variation of ratings per rater. This is done by calculating the mean DOK for each rater across all items and then the mean and standard deviation among all raters. The variation of means for PA-Writing (SD = 0.29, M = 3.72) was quite small in contrast, for example, to CT-Writing (SD = 1.23, M = 2.20). The small variation of means for PA-Writing is an indication of restriction of range across items: raters chose DOK values between 3 and 4 only, as the mean value of 3.72 suggests. Another factor contributing to this low G value for PA-Writing was the small number of items (3) in this test (Table 8).
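The variation-of-means check just described can be sketched as follows (the helper name and the ratings are my own illustrative inventions, not the actual S4S data):

```python
import statistics

def rater_mean_stats(scores):
    """scores[r] holds one rater's DOK ratings across all items.
    Returns (M, SD): the mean and standard deviation of the
    per-rater mean DOK values."""
    rater_means = [statistics.mean(ratings) for ratings in scores]
    return statistics.mean(rater_means), statistics.stdev(rater_means)

# A restricted range of ratings (all 3s and 4s, as reported for
# PA-Writing) produces a small SD of rater means compared to ratings
# spread across the full 1-5 scale.
restricted = [[4, 4, 3], [3, 4, 4], [4, 3, 4], [4, 4, 4]]
spread = [[1, 3, 5], [2, 4, 4], [1, 2, 3], [3, 5, 5]]
m_r, sd_r = rater_mean_stats(restricted)
m_s, sd_s = rater_mean_stats(spread)
assert sd_r < sd_s  # the small SD flags the restricted range
```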
A small number of items in a test does not provide enough room for variation of DOK scores if the raters do not find substantial differences among the items. Similar circumstances appear in the case of the mathematics tests (Figure 8). Two state assessments (MN-MathBasic, G = 0.68; and VA-Algebra I, G = 0.63) showed the lowest dependability coefficients among all states. In both cases the range was also small in comparison to the other tests. The variations of means for MN-MathBasic (SD = 0.46, M = 1.66) and for VA-Algebra I (SD = 0.49, M = 1.70) were small in comparison, for example, to NH-Math (G = 0.94, SD = 0.90, M = 2.17). In these two cases, raters chose low DOK values around 1 and 2, as the mean values around 1.70 suggest.

Figure 8. Raters’ reliability for all state tests in mathematics.

Standards and Tests Cognitive Demand

Raters analyzed the KSUS and the selected state tests in terms of DOK. As noted, the raters used Marzano’s cognitive demand scale ranging from 1 to 5, where higher numbers indicate greater depth of knowledge. The DOK values represented in the figures below were obtained as the average among the raters across all items.

Depth of Knowledge (DOK) of KSUS and State Tests – Language

A comparison of DOK values for KSUS-Language and all state assessments is shown in Figure 9.
Figure 9. Comparison of KSUS-Language and state test cognitive demand. The bold line across the graph represents the mean DOK for all states (2.39).

As shown in Figure 9, the mean value of DOK for all states (2.39) is below the average value of DOK for KSUS-Language (2.86). However, some state tests (MI-Writing = 3.61, MO-English = 3.38, and PA-Writing = 3.72) showed much higher values of DOK than the average for all state tests and for the KSUS. According to the data, the raters determined that those three state tests are more cognitively demanding than the rest. In the opposite direction, two states (IL-ACT = 1.80 and VA-Writing = 1.77) obtained the lowest cognitive demand values. It is worth noting that the highest values of DOK (MI-Writing = 3.61 and PA-Writing = 3.72) correspond to the shortest tests, each with three items. Table 8 presents the DOK values of the state tests, G-coefficients, and the number of test items.

Table 8 Comparison of DOK mean values, G-Coefficient, and number of test items for English language.
State Test      DOK    SD^a    G-Coefficient   Number of Items
KSUS English    2.86   0.23    0.93            65
CO English      2.18   0.18    0.90            21
CT Reading      2.81   0.25    0.88            16
CT Writing      2.20   0.23    0.94            8
IL ACT          1.80   0.11    0.91            75
KY Reading      2.46   0.16    0.78            30
MA English      2.35   0.18    0.91            42
ME English      2.23   0.20    0.86            28
MI Reading      2.21   0.35    0.85            29
MI Writing      3.61   0.71^b  0.83            3
MO English      3.38   0.39    0.88            21
MS English      2.11   0.38    0.74            89
NH English      1.93   0.22    0.89            39
NJ English      2.67   0.20    0.88            26
NY English 1    2.09   0.26    0.90            18
NY English 2    2.79   0.19    0.87            12
OR EngFormA     2.08   0.22    0.83            77
OR EngFormB     2.18   0.17    0.80            77
OR EngFormC     2.20   0.19    0.82            77
PA Reading      1.92   0.30    0.84            21
PA Writing      3.72   0.33    0.69            3
TX Reading      2.45   0.17    0.75            42
TX Writing      2.42   0.25    0.90            41
UT Reading      2.22   0.11    0.88            37
UT Writing      2.32   0.20    0.89            31
VA Reading      2.28   0.32    0.79            34
VA Writing      1.77   0.33    0.87            31
WA English      2.32   0.19    0.91            46
WY English      2.34   0.16    0.91            25

a Standard deviation among raters’ DOK mean values.
b See discussion below to explain this high value.

Examination of Table 8 shows that most state tests obtained low values of standard deviation for DOK, which can be interpreted as evidence of reliability among the raters’ observations. However, there is one case with an unusually high standard deviation (MI-Writing, SD = 0.71). The reason for this higher value in comparison to the other tests is a restriction of range (SD = 0.38), exacerbated by the low number of items (3); with fewer items, the likelihood of observing no variation across items increases. Even so, this test obtained an acceptable reliability coefficient (G = 0.83), in contrast, for example, to PA-Writing, in which the restriction of range was greater (SD = 0.29) and which did not reach an acceptable reliability coefficient (G = 0.69).
Inspection of the scatter plot (Figure 10) suggests a modest negative relationship between the values of DOK and the number of items (r = -0.46, p < .05): tests with fewer items tend to receive higher cognitive demand ratings. Short tests (in number of items) tend to be more cognitively demanding because each item needs to cover more standards and requires more elaboration from the student. A test item that requires a long answer, as in the case of an essay question, probably demands more elaboration because it may address more objectives and, consequently, may be considered more cognitively demanding.

Figure 10. Scatter plot of DOK versus number of items for all states in language (r = -0.46, p < .05).

The KSUS in language and 27 state tests in English language were rated according to their cognitive demand (DOK), and results were presented using trilinear plots. Trilinear plots (Wainer, 1997, p. 111) are useful when there are three values summing to 1 (or 100%), which is the case for DOK: the proportions of DOK above (DOKA), equal (DOKE), and below (DOKB) sum to 1 for every test. Here, such values are the DOK of state tests with respect to the DOK of the KSUS. The trilinear plot proved useful in comparing the alignment between the KSUS and state tests in relation to DOK. A trilinear plot, as shown in Figure 11, is constructed using three axes inscribed in an equilateral triangle. Each axis represents one of the three variables (DOKA, DOKE, and DOKB) and runs from the middle of a side to the opposite vertex of the triangle; the length of each axis is 1. Three distinct sections can be identified as Above, Equal, or Below by eliminating part of the axes (see Figure 12 and subsequent figures).

Figure 11. Trilinear plot of DOK showing the Utah-Reading test as an example.
The three regions (Above, Equal, and Below) hold the state tests that are above, equal to, or below the KSUS in terms of DOK. As an example, Figure 11 depicts the location of the state of Utah (Reading test) in a trilinear plot. The values of DOK (DOKB = 0.36, DOKE = 0.48, and DOKA = 0.15) are taken from the axes, and the state is located at the intersection of the three lines perpendicular to the respective axes. This graphical representation indicates the “position” of the state assessment in terms of DOK with respect to KSUS-Language; in the particular case of the UT-Reading test, its position shows clearly where this test stands relative to KSUS-Language. Additionally, all states can be compared according to their relative positions on the graph.

Figure 12. Trilinear plot showing DOK comparisons for all states in Reading and Comprehension.

The alignment results in terms of DOK for Reading and Comprehension are shown in Figure 12. In general, all states are clustered around intermediate values of DOK with respect to KSUS-Language. However, MO-English (0.11, 0.42, 0.47) and MS-English (0.47, 0.39, 0.14) are located in opposite positions, indicating that MO reached the highest comparative value of DOK and MS the lowest. In other words, MO is in the Above-Equal region and MS is around the Below-Equal region of alignment with respect to KSUS-Language. Those state tests with at least 50% of the items in the equal or above range (DOKE + DOKA ≥ 0.50) are considered properly aligned according to Webb (1997). In the trilinear graph, the states located above the first dotted line (0.50) satisfy this condition of adequate alignment. As shown in Figure 12, all state tests in Reading and Comprehension reached acceptable levels of alignment for DOK according to this condition.
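The mapping from (DOKB, DOKE, DOKA) proportions to a point in the trilinear plot can be sketched as a barycentric combination of the triangle’s corners (the corner layout and the function name are my own assumptions; the dissertation’s plotting software is not public):

```python
import math

# Assumed corner placement for the Below, Equal, and Above vertices
# of a unit-side equilateral triangle.
CORNERS = [(0.0, 0.0),               # Below
           (1.0, 0.0),               # Equal
           (0.5, math.sqrt(3) / 2)]  # Above

def trilinear_point(dokb, doke, doka):
    """Map DOK proportions to a 2D point by barycentric weighting."""
    total = dokb + doke + doka  # renormalize; rounding can leave sums near 1
    weights = (dokb / total, doke / total, doka / total)
    x = sum(w * cx for w, (cx, _) in zip(weights, CORNERS))
    y = sum(w * cy for w, (_, cy) in zip(weights, CORNERS))
    return x, y

# Utah-Reading from Figure 11: DOKB = 0.36, DOKE = 0.48, DOKA = 0.15.
x, y = trilinear_point(0.36, 0.48, 0.15)
```

Every test necessarily lands inside the triangle, because the three proportions are nonnegative and renormalized to sum to 1; states can then be compared by their relative positions, as in Figure 12.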
This study proposes a complementary condition to determine alignment, extending and reconciling the work of Webb (1997) and Rothman et al. (2002). This condition is based upon three levels of alignment defined as Low, Middle, and High. States below the dotted line located at 0.50 in the trilinear plot are considered poorly aligned (Figure 12); the Middle region of alignment lies between 0.50 and 0.66; and High alignment is reached when a state is located above the dotted line marked 0.66. The trilinear plot is useful in determining these three regions (Low, Middle, and High) because the dotted lines pass through geometrically notable points of the graph, and the definition of the three regions also follows from the symmetry of the trilinear plot: the intersection of the three axes defines a point at 0.66, and the intersection of the axes with the sides of the triangle defines another point at 0.50. The combination of the six regions (Above, Equal, Below, High, Middle, and Low) provides finer granularity in the determination of alignment in terms of DOK. According to this convention, all state tests are in the Middle and High levels of alignment for Reading and Comprehension. These three levels of alignment apply to the three alignment criteria (ROK, DOK, and BOK) and will be revisited later when the definition of the indices of alignment is discussed. The distribution of states in the three regions of alignment for Reading and Comprehension is presented in Table 9.

Table 9 Distribution of states in terms of DOK for Reading and Comprehension.
DOK Low     DOK Middle       DOK High
(none)      CO-English       CT-Reading
            IL-ACT           KY-Reading
            MI-Reading       MA-English
            ME-English       MO-English
            MS-English       NJ-English
            NH-English       NY-English 2
            NY-English 1     TX-Reading
            OR-English A     VA-Reading
            OR-English B
            OR-English C
            PA-Reading
            UT-Reading
            WA-English
            WY-English

Another graphical representation useful in comparing DOK among states is the stack graph. Although stack graphs provide no information beyond the trilinear plots, they are simple and can be created with popular tools such as MS Excel; they are included here as an alternative to the trilinear plots, which were created using customized software not yet available to the public. Figure 13 compares DOK among states for Reading and Comprehension using a stack graph. As can be seen in the graph, MO-English stands out from the other states with the lowest value for DOKB (gray area) and, consequently, the highest value for DOKA (white area). Notice also that MO-English is close to the middle path between DOKA and DOKE. In a stack graph, the region above a horizontal line at the 0.50 DOK value indicates proper alignment according to Webb’s condition: states whose gray area (DOKB) lies under the 0.50 line are considered properly aligned to the KSUS because the complement (DOKE + DOKA) is above 0.50. Two horizontal lines, at the 0.50 and 0.66 DOK values, can likewise define the three regions (Low, Middle, and High). If the gray area (DOKB) for all state tests is below the 0.50 line, there are no tests in the Low region, as is the case for Reading and Comprehension.
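Assuming the Low/Middle/High cut lines operate on the proportion DOKE + DOKA, as the dotted lines at 0.50 and 0.66 suggest, the classification can be sketched as follows (the function name is mine):

```python
def alignment_region(dokb, doke, doka):
    """Classify a test by its DOK proportions against the cut lines:
    below 0.50 is Low (fails Webb's at-least-half condition),
    0.50 to 0.66 is Middle, and above 0.66 is High."""
    at_or_above = doke + doka
    if at_or_above < 0.50:
        return "Low"
    if at_or_above <= 0.66:
        return "Middle"
    return "High"

# Reading & Comprehension examples from Figure 12:
print(alignment_region(0.11, 0.42, 0.47))  # MO-English -> High
print(alignment_region(0.47, 0.39, 0.14))  # MS-English -> Middle
```

These two cases match Table 9, where MO-English appears in the High column and MS-English in the Middle column.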
Figure 13. DOK of state tests with respect to KSUS-Language, featuring Reading & Comprehension.

It is important to notice that the vertical axis in the stack graph is measured in DOK proportions, with a maximum value of 1. The best way to interpret the stack graph is to observe the behavior of the gray area (DOKB) for each state test: as the gray area moves upward in the stack graph, the state test moves downward in the trilinear plot along the DOKB axis. In sum, lower DOKB means higher absolute values of DOK with respect to the KSUS. The alignment results in terms of DOK for Writing are shown in Figure 14. In the topic of Writing, the majority of state tests are above the DOK value of KSUS-Language. Notable exceptions are the states of Texas (TX-Writing; 0.71, 0.15, 0.15), Illinois (IL-ACT; 0.61, 0.32, 0.07), and Virginia (VA-Writing; 0.57, 0.24, 0.29), which are located in the Low region of alignment. According to the raters, the English test of Missouri (MO-English; 0.01, 0.14, 0.86) obtained the highest cognitive demand value in comparison to the DOK of KSUS-Language and in relation to the other states.

Figure 14. Trilinear plot showing DOK comparisons for all states in Writing.

Table 10 Distribution of states in terms of DOK for Writing.
DOK Low      DOK Middle      DOK High
IL-ACT       ME-English      CO-English
TX-Writing   MS-English      CT-Writing
VA-Writing   NH-English      MA-English
             NY-English 2    MI-Writing
                             MO-English
                             OR-English B
                             NJ-English
                             NY-English 1
                             PA-Writing
                             WA-English
                             WY-English

Figure 15. DOK of state tests with respect to KSUS-Language, featuring Writing.

Figure 15 compares DOK among states for Writing using a stack graph. The stack graph for Writing also shows the contrast between Missouri and Texas illustrated in the trilinear plot: MO-English possesses the largest white area (DOKA = 0.86), while TX-Writing possesses the largest gray area (DOKB = 0.71). The three states located in the Low region (IL-ACT, TX-Writing, and VA-Writing) are the only three tests in the stack graph whose gray area crosses the 0.50 line (Figure 15). The alignment results in terms of DOK for Research Skills are shown in Figure 16. The Research Skills topic shows greater dispersion of state tests than the former topics.

Figure 16. Trilinear plot showing DOK comparisons for all states in Research Skills.

Table 11 Distribution of states in terms of DOK for Research Skills.

DOK Low      DOK Middle      DOK High
VA-Reading   MI-Reading      CT-Reading
VA-Writing   MS-English      CT-Writing
             NH-English      KY-Reading
             UT-Reading      MA-English
             WA-English      ME-English
                             MI-Writing
                             MO-English
                             NJ-English
                             NY-English 1
                             NY-English 2
                             PA-Reading
                             TX-Reading
                             TX-Writing
                             WY-English

Figure 17 compares DOK values among all states for Research Skills using a stack graph. In this graph, the gray area (DOKB) of MI-Reading, UT-Reading, and WA-English is just touching the 0.50 dotted line.
This is an indication that they are on the border between the Low and Middle regions.

Figure 17. DOK of state tests with respect to KSUS-Language, featuring Research Skills.

The alignment results in terms of DOK for Critical Thinking are shown in Figure 18. For Critical Thinking, a cluster of states is located in the Above area, indicating that those states obtained higher DOK ratings than the KSUS.

Figure 18. Trilinear plot showing DOK comparisons for all states in Critical Thinking.

Table 12 Distribution of states in terms of DOK for Critical Thinking.

DOK Low      DOK Middle      DOK High
MS-English   MO-English      CO-English
TX-Reading   NJ-English      CT-Reading
TX-Writing   UT-Reading      CT-Writing
             WY-English      KY-Reading
                             MA-English
                             ME-English
                             MI-Reading
                             MI-Writing
                             NH-English
                             NY-English 1
                             NY-English 2
                             OR-English B
                             PA-Reading
                             VA-Reading
                             VA-Writing
                             WA-English

Figure 19 compares DOK among states for Critical Thinking using a stack graph.

Figure 19. DOK of state tests with respect to KSUS-Language, featuring Critical Thinking.

Depth of Knowledge (DOK) of KSUS and State Tests – Mathematics

A comparison of DOK values for KSUS-Math and all state assessments is shown in Figure 20.
As shown in Figure 20, the mean value of DOK for all states (2.26) is below the average value (from six raters) of DOK for KSUS-Mathematics (2.32). However, some state tests (KY-Math = 2.69, MO-Math = 3.02, and WA-Math = 2.69) showed higher values of DOK than the average for all tests and for KSUS-Math. According to the data, raters determined that those three state tests are more cognitively demanding than the rest. On the opposite side, three states (MN-MathBasic = 1.66, UT-Math = 1.70, and VA-Algebra I = 1.70) obtained the lowest cognitive demand values.

Figure 20. Comparison of KSUS-Math and state test cognitive demand. The bold line across the graph represents the mean DOK for all states (2.26).

Table 13 compares all states in terms of DOK, G-coefficient, and number of test items for mathematics. MO-Math (SD = 0.87) obtained a higher standard deviation than the other states because one of the raters assigned a DOK value of 1 to all items of the test, in contrast to the other raters, who assigned values ranging from 3 to 5. Eliminating this rater, the standard deviation dropped to 0.23; the G-coefficient remained the same, and the average DOK rose to 3.37.

Table 13 Comparisons of DOK mean values, G-Coefficient, and number of test items for mathematics.
State Test        DOK    SD(a)    G-Coefficient   Number of Items
KSUS Math         2.32   0.16     0.90            75
CO Math           2.35   0.36     0.81            13
CT Math           2.58   0.21     0.86            19
IL ACT            1.99   0.32     0.81            59
KY Math           2.69   0.37     0.88            30
MA Math           2.31   0.49     0.87            42
ME Math           2.51   0.34     0.89            29
MN MathBasic      1.66   0.47     0.68            68
MN MathComp       2.55   0.21     0.89            30
MI Math           2.36   0.37     0.85            37
MO Math           3.02   0.87(b)  0.78            24
MS Algebra 1      2.28   0.33     0.80            65
NH Math           2.17   0.39     0.94            28
NJ Math           2.50   0.33     0.89            36
NY Math A         2.41   0.40     0.87            35
NY Math B         2.34   0.42     0.89            34
OR Math A         2.00   0.41     0.75            65
OR Math B         2.08   0.34     0.83            65
OR Math C         2.04   0.31     0.79            65
PA Math           2.17   0.48     0.89            20
TX Math           2.13   0.46     0.82            48
UT Math           1.70   0.42     0.74            70
VA Algebra I      1.70   0.46     0.63            50
VA Algebra II     1.96   0.38     0.83            50
WA Math           2.69   0.32     0.88            47
WY Math           2.36   0.38     0.88            19

(a) Standard deviation among raters' DOK mean values.
(b) See discussion in text to explain this high value.

The scatter graph in Figure 21 shows an appreciable correlation (r = -0.67, p < .05) between the values of DOK among raters and the number of items of the tests. As in the case of English, tests with a lower number of items also tended to receive higher scores of cognitive demand in mathematics. The explanation given for English about the higher DOK of short tests is also valid here.

Figure 21. Scatter plot of DOK versus number of items for all states in mathematics (r = -0.67, p < .05).

The alignment results in terms of DOK for Computation are shown in Figure 22. In general, all states are clustered in the High (Above) region. However, CO-Math (0.51, 0.11, 0.38) is located in the Low region. According to the raters, the CO-Math test obtained a cognitive demand score in Computation slightly below the one obtained by KSUS-Math, and much lower than the scores of the other states. All other state tests were in the High region.
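The correlation reported in Figure 21 can be approximated directly from the state-test rows of Table 13. The sketch below computes a Pearson coefficient between mean DOK and number of items for the 25 state tests (KSUS excluded); since it is not certain exactly which tests entered the figure, only the sign of the relationship is relied on here.

```python
# Pearson correlation between mean DOK and number of test items,
# using the 25 state-test rows of Table 13 (KSUS-Math excluded).
from statistics import mean

dok = [2.35, 2.58, 1.99, 2.69, 2.31, 2.51, 1.66, 2.55, 2.36, 3.02,
       2.28, 2.17, 2.50, 2.41, 2.34, 2.00, 2.08, 2.04, 2.17, 2.13,
       1.70, 1.70, 1.96, 2.69, 2.36]
items = [13, 19, 59, 30, 42, 29, 68, 30, 37, 24,
         65, 28, 36, 35, 34, 65, 65, 65, 20, 48,
         70, 50, 50, 47, 19]

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

r = pearson(dok, items)
print(round(r, 2))  # negative: shorter tests tend to receive higher DOK
```

The negative sign matches the observation in the text that tests with fewer items tended to receive higher cognitive demand scores.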
Figure 22. Trilinear plot showing DOK comparisons for all states in Computation.

Table 14
Distribution of states in terms of DOK for Computation.

DOK Low: CO-Math
DOK Middle: (none)
DOK High: CT-Math, IL-ACT, KY-Math, MA-Math, ME-Math, MN-MathBasic, MN-MathComp, MI-Math, MO-Math, MS-Algebra 1, NH-Math, NJ-Math, NY-Math A, NY-Math B, OR-Math A, OR-Math B, OR-Math C, PA-Math, TX-Math, UT-Math, VA-Algebra I, VA-Algebra II, WY-Math

In Figure 23 there is a comparison of DOK among states for Computation using a stack graph. The larger white area indicates that most of the tests obtained higher values of DOK than those obtained by KSUS-Math in the topic of Computation.

Figure 23. DOK of state tests with respect to KSUS-Math featuring Computation.

The alignment results in terms of DOK for Algebra are shown in Figure 24.

Figure 24. Trilinear plot showing DOK comparisons for all states in Algebra.

In Algebra (Figure 24) the majority of state tests are located in the High region, meaning that those states have average DOK values higher than the DOK values of KSUS-Math. One notable case is the state of Missouri (Math: 0.17, 0.09, 0.74). According to the raters, MO-Math obtained the highest cognitive demand in comparison to the DOK of KSUS-Math and in relation to the other states in the topic of Algebra.

Table 15
Distribution of states in terms of DOK for Algebra.
DOK Low: (none)
DOK Middle: OR-Math B, OR-Math C, TX-Math, UT-Math, VA-Algebra I, VA-Algebra II
DOK High: CO-Math, CT-Math, IL-ACT, KY-Math, MA-Math, ME-Math, MN-MathBasic, MN-MathComp, MI-Math, MO-Math, MS-Algebra 1, NH-Math, NJ-Math, NY-Math A, NY-Math B, OR-Math A, PA-Math, WA-Math, WY-Math

Figure 25 is a comparison of DOK among states for Algebra using a stack graph. In this graph, the state of Missouri possesses the largest white strip, indicating the highest value of DOK above the KSUS-Math, in agreement with the data shown in Figure 24.

Figure 25. DOK of state tests with respect to KSUS-Math featuring Algebra.

The stack graph for Algebra (Figure 25) also shows the contrast between Minnesota (MathBasic) and Virginia (Algebra I). The alignment results in terms of DOK for Geometry are shown in Figure 26.

Figure 26. Trilinear plot showing DOK comparisons for all states in Geometry.

In the case of Geometry, all states are located in the High (Above-Equal) area. Notably, NH-Math (0.05, 0.11, 0.84) obtained the highest comparative value of DOK, while UT-Math (0.31, 0.51, 0.18) obtained the lowest. However, all states showed high levels of alignment for the Geometry topic. As shown in Figure 26 and in Table 16, all states were located in the High region.

Table 16
Distribution of states in terms of DOK for Geometry.
DOK Low: (none)
DOK Middle: (none)
DOK High: CO-Math, CT-Math, IL-ACT, KY-Math, MA-Math, ME-Math, MN-MathBasic, MN-MathComp, MI-Math, MO-Math, MS-Algebra 1, NH-Math, NJ-Math, NY-Math A, NY-Math B, OR-Math A, OR-Math B, OR-Math C, PA-Math, TX-Math, UT-Math, VA-Algebra I, VA-Algebra II, WA-Math, WY-Math

Figure 27 is a comparison of DOK among states for Geometry using a stack graph. The larger white area, and consequently the smaller gray area, indicates that most states obtained higher scores of DOK relative to those obtained by KSUS-Math. Although PA-Math obtained the lowest score (the smallest white area and, consequently, the largest gray area) in comparison to the other states, it is still located in the High region, as shown in Figure 26. In other words, its value of DOKE + DOKA = 0.69 is above the 0.66 line.

Figure 27. DOK of state tests with respect to KSUS-Math featuring Geometry.

The alignment results in terms of DOK for Math Reasoning are shown in Figure 28. Most of the states are located in the Middle area for Math Reasoning. However, MO-Math (0.28, 0.34, 0.38) obtained the highest comparative rate of DOK among all states, while IL-ACT (0.66, 0.29, 0.05) obtained the lowest DOK score.

Figure 28. Trilinear plot showing DOK comparisons for all states in Math Reasoning.

Table 17
Distribution of states in terms of DOK for Math Reasoning.
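The trilinear-region assignment implied by these descriptions (the 0.66 line for High, and the gray DOKB area crossing the 0.50 line for Low) can be sketched as follows. The treatment of exact boundary values is an assumption here, since the text does not spell it out, and the PA-Math split of its 0.69 total is hypothetical.

```python
# Classify a test's DOK proportions (DOKB: below, DOKE: equal, DOKA: above
# the KSUS rating) into the Low / Middle / High trilinear regions.
# Boundaries follow the text: High when DOKE + DOKA exceeds 0.66, Low when
# DOKB reaches 0.50 (the gray area crossing the 0.50 line); Middle otherwise.
# The handling of exact boundary values is an assumption.
def dok_region(dokb, doke, doka):
    assert abs(dokb + doke + doka - 1.0) < 0.02, "proportions should sum to ~1"
    if doke + doka > 0.66:
        return "High"
    if dokb >= 0.50:
        return "Low"
    return "Middle"

# PA-Math in Geometry: DOKE + DOKA = 0.69; the (0.38, 0.31) split is hypothetical.
print(dok_region(0.31, 0.38, 0.31))  # High
# CO-Math in Computation (0.51, 0.11, 0.38), as quoted in the text.
print(dok_region(0.51, 0.11, 0.38))  # Low
```

These boundaries reproduce the classifications quoted in the text, e.g. MO-Math (0.28, 0.34, 0.38) lands in High and IL-ACT (0.66, 0.29, 0.05) in Low for Math Reasoning, matching Table 17.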
DOK Low: IL-ACT, MN-MathBasic, OR-Math C, VA-Algebra I, VA-Algebra II
DOK Middle: CO-Math, CT-Math, MA-Math, ME-Math, MI-Math, MS-Algebra 1, NH-Math, NJ-Math, NY-Math A, NY-Math B, OR-Math A, OR-Math B, PA-Math, TX-Math, UT-Math
DOK High: KY-Math, MN-MathComp, MO-Math, WA-Math, WY-Math

Figure 29 is a comparison of DOK among states for Math Reasoning using a stack graph. The larger gray area in the graphic indicates that most of the tests obtained lower values of DOK than those obtained by KSUS-Math in Math Reasoning.

Figure 29. DOK of state tests with respect to KSUS-Math featuring Math Reasoning.

The state tests with the gray area (DOKB) crossing the 0.50 dotted line (IL-ACT, MN-MathBasic, OR-Math C, VA-Algebra I, and VA-Algebra II) belong to the Low region of alignment, as can also be observed in Figure 28.

Construction of the Alignment Index

As noted, a software program was developed in order to prototype three potential indices of alignment (Area, Vector, and Mean). Based on the performance of these three candidates, the index represented by the mean of the three alignment criteria (ROK, DOK, and BOK) was chosen as the most adequate characterization of the overall index of alignment by topic between standards and assessments. This overall index is constructed as I = (ROK + DOKE + BOK)/3, as shown by equation 3.3 in Chapter 3. A new version of the software was created, using only this formula, in order to examine all possible values of the alignment criteria and to calculate the index across the full spectrum of values that each criterion could take.
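Equation 3.3 can be sketched directly; a minimal version, checked against the Colorado-English Reading and Comprehension values shown in Figure 30 (ROK = 0.56, DOKE = 0.35, BOK = 0.74), is:

```python
# Overall index of alignment by topic (equation 3.3): the simple mean of
# the three alignment criteria, each expressed as a proportion in [0, 1].
def alignment_index(rok, doke, bok):
    for v in (rok, doke, bok):
        assert 0.0 <= v <= 1.0, "criteria are proportions in [0, 1]"
    return (rok + doke + bok) / 3.0

# Colorado-English, Reading and Comprehension (values from Figure 30):
# ROK = 0.56, DOKE = 0.35, BOK = 0.74 give the overall index I = 0.55.
i = alignment_index(0.56, 0.35, 0.74)
print(round(i, 2))  # 0.55
```

The simple mean keeps each criterion's contribution bounded and symmetric, which is why, as discussed in the conclusions, it behaves well even when one criterion approaches zero.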
Appendix B includes a snapshot of the software. Figure 30 is a snapshot of the software running with data from the state of Colorado for Reading and Comprehension. The index value (I = 0.55) is the overall index of the state of Colorado-English in the topic of Reading and Comprehension.

Figure 30. Index of alignment. State of Colorado-English (Reading and Comprehension).

Color coding has been used to enhance the graphic. Values for the alignment criteria DOK (blue), ROK (red), and BOK (green) are entered using the slider bars. The value of the overall index (I) for Reading and Comprehension is 0.55, and the skewness is 0.34 for this particular case. The black line on the centroid plot represents the skewness, and it shows the tendency towards BOK. Notice that BOK (0.74) is the alignment criterion that contributes the most to the index, so the skewness is oriented towards it.

Definition of the Levels of Alignment

The three regions of alignment defined earlier in this chapter for the alignment criteria can also be applied to the index of alignment, given the linearity of the equation that represents the index as a linear function of the three criteria. In other words, what applies to each component (each criterion) also applies, as a result of the linearity of the formula, to the index itself, which can likewise be broken up into three regions. Consequently, index values below 0.50 are considered low, values between 0.50 and 0.66 are considered middle, and values above 0.66 are considered high.

Alignment between KSUS and State Assessments

As noted previously, four different alignment indices were defined in Chapter 3, under the definition of the indices of alignment section. Each partial index informs about a different aspect of the alignment. Below are graphical representations (centroid plots) of such indices for the state of Colorado in English language and mathematics.
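The level bands just defined can be expressed as a small classifier. The text leaves the assignment of the exact boundary values (0.50 and 0.66) open, so treating them as belonging to the middle level is an assumption here.

```python
# Map an index of alignment to the three levels defined in the text:
# low (I < 0.50), middle (0.50 to 0.66), high (I > 0.66).
# Assigning the exact boundary values to "middle" is an assumption.
def alignment_level(i):
    assert 0.0 <= i <= 1.0
    if i < 0.50:
        return "low"
    if i <= 0.66:
        return "middle"
    return "high"

print(alignment_level(0.55))  # middle: e.g., CO-English Reading and Comprehension
print(alignment_level(0.37))  # low: e.g., the all-states English average
```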
Colorado has been chosen as an example to illustrate the entire methodology of alignment, which has been replicated for the rest of the states and used to obtain the summary results. In the next section there is a summary comparison by topics among all states studied (see Figure 37 and subsequent figures).

Colorado English Language. An Example

English Language: Reading & Comprehension and Writing

The centroid plots in Figure 31 show the following values. Reading and Comprehension: I(S) = 0.48 (Skw = 0.31), I(A) = 0.62 (Skw = 0.41), I = 0.55 (Skw = 0.34), and I(+) = 0.65 (Skw = 0.16). Writing: I(S) = 0.54 (Skw = 0.38), I(A) = 0.45 (Skw = 0.23), I = 0.50 (Skw = 0.31), and I(+) = 0.63 (Skw = 0.18).

Figure 31. Alignment indices for Colorado-English with respect to KSUS-Language featuring Reading & Comprehension and Writing.

In Figure 31 are represented the centroid plots of the four indices of alignment for Colorado-English in the topics of Reading & Comprehension and Writing. The lowest index is the partial index with respect to assessment [I(A) = 0.45] for Writing. The highest index is the overall index [I(+) = 0.65] for Reading and Comprehension. In terms of skewness, the lowest value (Skw = 0.16) is for the overall index I(+) for Reading and Comprehension.
The highest skewness (Skw = 0.41) is for the partial index with respect to assessment I(A) for Reading and Comprehension. The direction and module (length = 0.41) of the arrow representing skewness for Reading and Comprehension indicates that ROKA and BOKA contribute more than DOKE to the index. Also, the arrow is tilted towards BOKA as an indication that this criterion contributes the most to the index.

Colorado English Language: Research Skills and Critical Thinking

The centroid plots in Figure 32 show the following values. Research Skills: I(S) = 0, I(A) = 0, I = 0, and I(+) = 0 (Skw = n/a in all cases). Critical Thinking: I(S) = 0.47 (Skw = 0.69), I(A) = 0.22 (Skw = 0.28), I = 0.34 (Skw = 0.48), and I(+) = 0.58 (Skw = 0.53).

Figure 32. Alignment indices for Colorado-English with respect to KSUS-Language featuring Research Skills and Critical Thinking.

The graphs in Figure 32 show the centroid plots of the four indices of alignment for Colorado-English in the topics of Research Skills and Critical Thinking. As can be noted, the raters did not find Research Skills in any of the items of this test. A comparison between the four different indices [I(S), I(A), I, and I(+)] across the four topics of English language is depicted in Figure 33.
Figure 33. Indices of alignment by topics for Colorado-English language.

According to the interpretation of the different indices of alignment given in the definition of the indices (Chapter 3), I(S) measures the index of alignment with respect to standards, and it is a measurement of the relevance of the standards to the items. As shown in Figure 33, Writing obtained the highest index for relevance of items. Its value [I(S)W > 0.50] means that there is an acceptable proportion of standards (objectives) addressed by the test in the topic of Writing. Reading & Comprehension and Critical Thinking obtained lower values, while Research Skills was totally absent from this test. For I(A), which measures the relevance of the items to the standards, Reading & Comprehension showed the highest value [I(A)R&C = 0.62], indicating an acceptable match of items with content found in the standards (objectives). However, Critical Thinking [I(A)CT = 0.22] did not appear to be covered sufficiently in this test with respect to the coverage of KSUS. The overall index (I) of alignment for CO-English follows the tendency of the partial indices, since it is the average of the two; it is worth recalling that I = [I(S) + I(A)]/2. As a result, for CO-English, only the topic of Writing reached an acceptable level of alignment with respect to KSUS-Language. Taking into consideration the combined DOK (DOKE + DOKA) criterion, all three topics present in the test reached acceptable values for this index of alignment, I(+).

Colorado Mathematics.
An Example

Mathematics: Computation and Algebra

The centroid plots in Figure 34 show the following values. Computation: I(S) = 0.44 (Skw = 0.66), I(A) = 0.31 (Skw = 0.34), I = 0.37 (Skw = 0.49), and I(+) = 0.50 (Skw = 0.31). Algebra: I(S) = 0.52 (Skw = 0.59), I(A) = 0.39 (Skw = 0.19), I = 0.46 (Skw = 0.31), and I(+) = 0.55 (Skw = 0.47).

Figure 34. Alignment indices for Colorado-Math with respect to KSUS-Math featuring Computation and Algebra.

In Figure 34 are represented the plots of the four indices of alignment for the Colorado mathematics test in the topics of Computation and Algebra. The lowest index is the partial index with respect to assessment [I(A) = 0.31] for Computation. The highest index is the overall index [I(+) = 0.55] for Algebra. In terms of skewness, the lowest value (Skw = 0.19) is for the partial index with respect to assessment I(A) for Algebra. The highest skewness (Skw = 0.66) is for the partial index with respect to standards I(S) for Computation.
Colorado Mathematics: Geometry and Math Reasoning

The centroid plots in Figure 35 show the following values. Geometry: I(S) = 0.57 (Skw = 0.60), I(A) = 0.42 (Skw = 0.25), I = 0.49 (Skw = 0.41), and I(+) = 0.64 (Skw = 0.63). Math Reasoning: I(S) = 0.47 (Skw = 0.47), I(A) = 0.63 (Skw = 0.35), I = 0.55 (Skw = 0.36), and I(+) = 0.60 (Skw = 0.30).

Figure 35. Alignment indices for Colorado-Math with respect to KSUS-Math featuring Geometry and Math Reasoning.

In Figure 35 are represented the centroid plots of the four indices of alignment for the Colorado mathematics test in the topics of Geometry and Math Reasoning. The lowest index is the partial index with respect to assessment [I(A) = 0.42] for Geometry. The highest index is the overall index [I(+) = 0.64] for Geometry. In terms of skewness, the lowest value (Skw = 0.25) is for the partial index with respect to assessment I(A) for Geometry. The highest skewness (Skw = 0.63) is for the overall index I(+) for Geometry. A comparison between the four different indices [I(S), I(A), I, and I(+)] across the four topics of mathematics is depicted in Figure 36. There is no value for Trigonometry because the CO-Math test did not include this topic.

Figure 36.
Indices of alignment by topics for Colorado-Math.

Again, according to the interpretation of the different indices of alignment given in the definition of the indices (Chapter 3), I(S) measures the index of alignment with respect to standards, and it is a measurement of the relevance of the standards to the items. As shown in Figure 36, Geometry obtained the highest index for relevance of items. Its value [I(S)G = 0.57] means that there is an acceptable proportion of standards (objectives) addressed by the test in the topic of Geometry. Algebra also obtained an acceptable index value [I(S)A = 0.53], while Computation and Math Reasoning obtained lower values, below the 0.50 score. For I(A), which measures the relevance of the items to the standards, Math Reasoning showed the highest value [I(A)MR = 0.63], indicating an acceptable match of items with content found in the standards (objectives). However, Computation [I(A)C = 0.31], Algebra [I(A)A = 0.39], and Geometry [I(A)G = 0.42] did not reach acceptable levels of alignment. As mentioned previously, the overall index (I) of alignment for CO-Math follows the tendency of the partial indices, since it is the average of the two. As a result, for CO-Math, only the topic of Math Reasoning reached an acceptable level of alignment with respect to KSUS-Math. Taking into consideration the combined DOK (DOKE + DOKA) criterion, all four topics reached acceptable values for this index of alignment, I(+).

Indices of Alignment for All States – Language

The graphics below depict the overall indices of alignment (I) by topics for all state assessments in English language. These graphs are built taking the values of the overall index (I), as in Figure 33, for all state tests.
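The relation I = [I(S) + I(A)]/2 recalled above can be checked against the Colorado mathematics values read from Figures 34 and 35:

```python
# Overall index per topic as the mean of the two partial indices,
# I = (I(S) + I(A)) / 2, checked against the CO-Math values from
# Figures 34 and 35.
def overall_index(i_s, i_a):
    return (i_s + i_a) / 2.0

co_math = {  # topic: (I(S), I(A)); the reported I is noted alongside
    "Computation":    (0.44, 0.31),  # reported I = 0.37
    "Algebra":        (0.52, 0.39),  # reported I = 0.46
    "Geometry":       (0.57, 0.42),  # reported I = 0.49
    "Math Reasoning": (0.47, 0.63),  # reported I = 0.55
}
for topic, (i_s, i_a) in co_math.items():
    print(topic, round(overall_index(i_s, i_a), 2))
```

The recomputed values agree with the reported ones up to rounding of the published partial indices.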
Figure 37. Indices of alignment for all states featuring Reading and Comprehension.

In Reading and Comprehension, all states obtained indices of alignment between 0.50 and 0.65, which are considered moderate. The average value of this overall index of alignment (I) for all states was 0.59 in Reading and Comprehension.

Figure 38. Indices of alignment for all states featuring Writing.

In Writing, OR-English B obtained the lowest index value (I = 0.37), while PA-Writing obtained the highest value (I = 0.64). The average value of the index of alignment for all states was 0.49 in Writing. States without an index value indicate that the raters did not find matches for the topic in the test.

Figure 39. Indices of alignment for all states featuring Research Skills.
In Research Skills, five states (KY-Reading, ME-English, MI-Reading, TX-Writing, and WA-English) obtained indices of alignment lower than 0.20, which is considered low. Two states (MO-English and TX-Reading) obtained the highest values, around 0.50, which are considered moderate. The average value of the index of alignment for all states in the topic of Research Skills was 0.28.

Figure 40. Indices of alignment for all states featuring Critical Thinking.

In Critical Thinking, PA-Writing obtained the highest value of the index of alignment (0.68), while TX-Reading obtained the lowest value (0.21). The average value of the index of alignment for all states in Critical Thinking was 0.38.

Total Index of Alignment for English Language

The total index of alignment for each subject matter is calculated using the overall indices (I) and the weights of each topic. The total index of alignment for English language (ITE), as defined by equation 3.5 in Chapter 3, is represented in Figure 41 for all states.
For illustration purposes, the total index of alignment between the CO-English test and the KSUS-Language can be calculated as follows. From previous results (Figure 33), the following overall indices were found:

Reading and Comprehension: I = 0.55
Writing: I = 0.50
Research Skills: I = 0.00
Critical Thinking: I = 0.34

The total index of alignment for CO-English is calculated using the weights of the topics (Table 6 and equation 3.5) as follows:

ITE = 0.37*0.55 + 0.39*0.50 + 0.15*0.00 + 0.09*0.34 = 0.42

Another index can be obtained using the index I(+), which corresponds to Webb's condition (DOK = DOKE + DOKA):

ITE+ = 0.37*0.65 + 0.39*0.63 + 0.15*0.00 + 0.09*0.58 = 0.53

This last index (ITE+) is expected to be always equal to or greater than the former one (ITE). The general definition of the total index of alignment for any subject matter is then expressed as

IT = ∑ wiIi / ∑ wi

where wi is the number of objectives per topic and Ii is the overall index per topic. According to the convention for the three levels of alignment defined in this study, most states obtained levels below 0.50, which are considered slightly low. As Figure 41 shows, there are two extreme cases: MO-English = 0.57 with the highest score and OR-EnglishA = 0.23 with the lowest score. The average value of the total index of alignment for all states in English language was 0.37 (SD = 0.10). Only two states (MO-English = 0.57 and NJ-English = 0.52) reached middle levels of alignment (0.50 < ITE < 0.66).

Figure 41.
Total indices of alignment in English language for all states (M = 0.37, SD = 0.10).

Indices of Alignment for All States – Mathematics

Below is a detailed analysis for all state tests in mathematics, following the same analysis done for English language in the preceding section. The graphics below depict the overall index of alignment by topics for all state assessments in mathematics.

Figure 42. Indices of alignment for all states featuring Computation.

The highest index value for the Computation topic was obtained by the CT-Math test (0.61), while MN-MathComp obtained the lowest value (0.32). Most state tests are slightly below 0.50. The average index of alignment for all states in Computation was 0.49.

Figure 43. Indices of alignment for all states featuring Algebra.

For the Algebra topic, the VA-Algebra II test obtained the highest index of alignment (0.65), while MN-MathBasic obtained the lowest index (0.32). The average index of alignment for all states in Algebra was 0.48.

Figure 44.
Indices of alignment for all states featuring Geometry.

In Geometry, TX-Math obtained the highest index of alignment (0.55), while MO-Math obtained the lowest value (0.28). The average index of alignment for all states in Geometry was 0.43.

Figure 45. Indices of alignment for all states featuring Math Reasoning.

KY-Math obtained the highest index (0.59) in Math Reasoning, while VA-Algebra II obtained the lowest index (0.43). The average index of alignment for all states in Math Reasoning was 0.51, which could be considered in the middle range.

It is worth mentioning that Trigonometry was not considered in this study because only five states showed such content. Additionally, the low weight of the Trigonometry topic (0.05) made its contribution to the index practically negligible in relation to the other topics.

Total Index of Alignment for Mathematics

The total index of alignment for mathematics for all states (ITM), as defined by equation 3.6 in Chapter 3, is represented in Figure 46. This total index is calculated using procedures similar to those used for English language above.

Figure 46. Total indices of alignment in mathematics for all states (M = 0.46, SD = 0.03).
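Equations 3.5 and 3.6 share the same weighted-mean form, so a single sketch covers both subject matters. The CO-English worked example from the English section is used as a check; the quoted weights already sum to 1.

```python
# Total index of alignment for a subject matter (equations 3.5 / 3.6):
# IT = sum(w_i * I_i) / sum(w_i), where w_i is the number-of-objectives
# weight of topic i and I_i its overall index of alignment.
def total_index(weights, indices):
    assert len(weights) == len(indices) and sum(weights) > 0
    return sum(w * i for w, i in zip(weights, indices)) / sum(weights)

# CO-English check: weights (0.37, 0.39, 0.15, 0.09) for Reading &
# Comprehension, Writing, Research Skills, Critical Thinking, with
# overall indices (0.55, 0.50, 0.00, 0.34) from Figure 33.
ite = total_index([0.37, 0.39, 0.15, 0.09], [0.55, 0.50, 0.00, 0.34])
print(round(ite, 2))  # ~0.43 from these rounded inputs; the text reports
                      # 0.42, presumably computed from unrounded topic indices
```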
According to the convention adopted in this study, most states obtained total indices of alignment for mathematics slightly below 0.50, which places them in the Low-Middle range. The lowest total index was obtained by MN-MathBasic (0.37), and the highest index was 0.51, obtained by three tests: MS-Algebra 1, NY-Math B, and TX-Math. The average of the total index of alignment in mathematics for all states was 0.46 (SD = 0.03).

The average of the total index of alignment among all states in English (M = 0.37) was lower than the average of the index in mathematics (M = 0.46). In both subject matters, these average scores were in the Low range of alignment with relation to their respective higher education expectations (KSUS). However, as shown previously, a few individual states did stand out from this general pattern.

Alignment Index and Test Accountability

The total index of alignment (IT) proposed in this study constitutes a quantitative measure of the match between the content addressed by the state tests and the college expectations defined in the KSUS for any subject matter. Three levels of alignment were established: low (IT < 0.50), middle (0.50 < IT < 0.66), and high (IT > 0.66). In order to establish a relationship between levels of alignment and accountability features of state testing, the results of two tests were analyzed in the mathematics subject matter: the National Assessment of Educational Progress (NAEP) and the Scholastic Aptitude Test (SAT) from the College Board. State-level NAEP has assessed representative samples of 4th and 8th graders in public and non-public schools in the United States since 1994. NAEP developers claim that the test is a reliable instrument to monitor achievement over time across the nation. Accordingly, state-level NAEP scores in mathematics were chosen for 1996 and 2000, the years available for 8th grade.
Although the correlation between levels of alignment and direct NAEP scores did not reach statistical significance, the correlation between the three levels of alignment and NAEP gain from 1996 to 2000 showed significant agreement (r = 0.46, p < 0.05). According to NAEP developers, this test is mostly designed to measure student achievement in the context of instructional experience by tracking changes in students' performance over time. In this sense, NAEP gains were more appropriate than direct NAEP scores to correlate with the total index. In the NAEP data, all states showed gains during this period except the state of Utah. One possible conclusion that can be ventured from this correlation is that those state tests with higher levels of alignment to the college standards (KSUS) also respond to high state standards, and consequently higher NAEP gains are obtained. Although the raters used state tests from the year 2002, it is reasonable to conclude that acceptable alignment between state tests and high state standards has been a tendency that persisted over several years. However, it could be argued that the ability to take tests does not necessarily indicate significant learning, and that "studying for the test" could be playing a role here. Nevertheless, if state tests respond to high state standards, those negative factors could be considered benign compared to the situation of state tests not responding to high standards.

The other high-stakes assessment compared was the SAT. Average annual results are available at the state level for this test. Although the direct correlation between the levels of alignment and SAT gain during the same period did not reach statistical significance, the correlation between NAEP gain and SAT gain for the same period was acceptable (r = 0.52, p < 0.05). This result establishes a connection between NAEP gain and higher education, since the SAT is taken by college-oriented students.
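The correlations above are ordinary Pearson product-moment coefficients. As a minimal sketch, the computation can be reproduced as follows; the five state values shown are invented purely for illustration and are not the study's data.

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical illustration only: alignment levels (coded 1 = Low,
# 2 = Middle, 3 = High) and NAEP gains for five invented states.
levels = [1, 1, 2, 2, 3]
naep_gains = [2.0, 3.5, 4.0, 5.5, 7.0]
r = pearson_r(levels, naep_gains)
```

With real data, significance of r would still have to be tested against the sample size, as the study does with its p < 0.05 thresholds.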
CHAPTER 5

CONCLUSIONS AND IMPLICATIONS

Conclusions

The main purpose of this study was the creation of a quantitative methodology to define and measure the alignment between standards (expectations) and assessment in the content focus category. The methodology developed in this study included the adoption of a minimal and concise language appropriate for defining the criteria and measuring the match between student expectations (standards) and state tests. Three dimensions were chosen to define the criteria for alignment: range (as ROK), depth (as DOK), and balance (as BOK). These three dimensions of alignment were conceptualized as bidirectional; that is, they match not only the standards addressed by items but also the items corresponding to standards. A bidirectional approach identifies both the standards that are targeted by items and the standards not represented on the test, as well as the items without a correlate in the standards. These three dimensions represented the minimum needed to characterize alignment in the content focus category.

The study methodology incorporated the construction of an alignment index as a mathematical formula, which informs quantitatively about the level of alignment between standards and assessment. The alignment index was constructed based upon the minimum alignment criteria chosen (ROK, DOK, and BOK). The index of alignment between standards and assessments was expressed as a mathematical function of the three alignment criteria. Among the three different formulas explored to define the alignment index, the simple mean of the three alignment criteria proved to be the most appropriate. The linear combination of the three criteria performed smoothly when tested throughout all possible values of each criterion, using the software developed for that purpose. The other two formulas behaved poorly, particularly when any criterion approached zero.
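The adopted formula, the simple mean of the three criteria, can be sketched directly. The nonlinear alternative shown for contrast is a geometric mean, offered only as an illustration of the kind of product-based formula that collapses when any criterion approaches zero; it is not a reproduction of the exact rejected formulas.

```python
def alignment_index(rok, dok, bok):
    # Adopted index: the simple mean (equal-weight linear combination)
    # of the three alignment criteria.
    return (rok + dok + bok) / 3

def nonlinear_index(rok, dok, bok):
    # Illustrative nonlinear alternative (geometric mean), NOT the
    # study's rejected formula. Product-based forms behave poorly:
    # if any one criterion is zero, the index collapses to zero
    # regardless of the other two values.
    return (rok * dok * bok) ** (1 / 3)
```

With (ROK, DOK, BOK) = (0.9, 0.9, 0.0), the simple mean still reports a moderate value near 0.6, while the product-based form returns 0, erasing the information carried by the other two criteria.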
This irregular behavior was due to the nonlinear components of those formulas, which included the product of the criteria and a square root.

The process of measuring alignment was represented graphically by centroid plots (Figure 30). The values of the three alignment criteria (ROK, DOK, and BOK) were represented on each axis of the centroid plot. These values defined the three vertices of a triangle. The shape and size of the triangle gave an indication of the index value and also informed about the way each criterion contributed to the index. Two different types of graphics were used in this study: centroid plots and trilinear plots. Although similar, these two graphs served different purposes. Centroid plots were used to analyze and measure the indices of alignment, while trilinear plots were used to analyze and compare DOK values only (Figure 11).

In addition to the graphical representation of the index, the concept of skewness was introduced as a complementary aspect of the alignment procedure. Skewness (as a vector) informed graphically about the weight each criterion contributed to the index. Skewness was calculated as the vector sum of the three criteria, since each axis can be treated as a vector. This concept proved valuable because it gave an indication of which criterion the index was oriented toward. In other words, skewness was a vector pointing toward the criterion that contributed the most to the index. Skewness equal to zero meant that each criterion contributed (weighed) equally to the index, which happened when the figure was an equilateral triangle.

This study began with a generalizability analysis of the data provided by the raters. The generalizability coefficient for the six subjects who rated the KSUS reached satisfactory levels of reliability. The G-coefficient for KSUS-Language was 0.93, and for KSUS-Math it was 0.89.
In both cases the raters reached consensus on the DOK ranking, and the G values were within acceptable levels (G > 0.80). These results could be attributed to the training of the subjects, who spent considerable time in training workshops prior to the actual rating activity. Training the raters was probably the most critical aspect of this alignment methodology, which is based on expert judgment.

A detailed analysis of DOK was performed for the KSUS and for each state test in the two subject matters, English and mathematics. The DOK (the mean value among the six raters) for the KSUS-Language was 2.86 on Marzano's scale, which ranges from 1 to 5. Most state tests in English obtained DOK values below the one obtained by KSUS-Language (2.86), with three notable exceptions: MI-Writing (DOK = 3.61, SD = 0.71, G = 0.83, N = 3), MO-English (DOK = 3.38, SD = 0.39, G = 0.88, N = 21), and PA-Writing (DOK = 3.72, SD = 0.33, G = 0.69, N = 3), as shown in Table 8. The raters determined that these three state tests were the most cognitively demanding. A comparison of the DOK scores with the number of test items produced an appreciable correlation between the two (r = -0.46, p < .05), which can be observed in Figure 10. It was determined that items in shorter tests tended to be more cognitively demanding because they concentrated more content. This tendency was even more pronounced in the case of mathematics (r = 0.67, p < .05) (Figure 21).

The DOK for the KSUS-Math was 2.32 on the Marzano scale. Most state tests in mathematics obtained DOK values comparable to the one obtained by KSUS-Math. However, MO-Math reached the highest DOK value (3.02) among all states (see Table 13). This DOK value was even higher (3.37) after dropping one of the raters, who assigned extremely low values to the items in comparison with the other five raters. In sum, the state of Missouri obtained the highest cognitive demand score from the raters in the two subject matters, English and mathematics.
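Returning to the centroid-plot representation, the skewness vector described above (the vector sum of the three criteria, each lying along its own axis) can be sketched as follows. The specific axis angles used here, three axes 120 degrees apart, are an assumption about the plot's geometry rather than values given in the text.

```python
import math

def skewness_vector(rok, dok, bok, axis_angles_deg=(90, 210, 330)):
    """Skewness as the vector sum of the three alignment criteria, each
    treated as a vector along one axis of the centroid plot. The axis
    angles (three axes 120 degrees apart) are an assumption about the
    plot's geometry, not a value given in the study."""
    x = y = 0.0
    for value, angle in zip((rok, dok, bok), axis_angles_deg):
        x += value * math.cos(math.radians(angle))
        y += value * math.sin(math.radians(angle))
    return x, y
```

When the three criteria are equal (an equilateral triangle on the plot), the contributions cancel and the skewness is the zero vector; otherwise the resulting vector points toward the axis of the dominant criterion.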
An alignment index was necessary in order to measure the match between standards and assessment in a quantitative manner. An index can improve the possibility of obtaining precise and accurate values when judging the relationships among components of the instructional system, and in turn it enables researchers to make better comparisons among the components of the system.

The alignment index defined here was used to measure the alignment between higher education expectations (KSUS) and state assessments. This alignment procedure was an opportunity to explore the properties of the proposed index. As a result, alignment comparisons among states were performed in terms of the match between the KSUS and current state tests. Additionally, this study compared the match of the KSUS and state assessments across subject matters and across states.

Four partial indices of alignment were defined [I(S), I(A), I, and I(+)] according to the different conceptualizations of the alignment criteria. Each partial index informed about a particular aspect of the alignment for each topic within a given subject matter. The partial indices by topic, I(S) and I(A), were a consequence of the bidirectional definition of all alignment criteria. The overall index by topic, I, was a combination of these two partial indices. The total index of alignment (IT) for each state test was then constructed as a combination of the overall indices and the weight each topic contributed to the whole standard. This was a novel approach in alignment research, through which a deeper and more comprehensive understanding was obtained about the measurement of match between standards and assessments.

Trilinear plots were used to represent graphically the DOK comparisons between the KSUS and state tests. Three regions (Above, Equal, and Below) were used to locate tests that were above, equal to, or below the DOK value of the KSUS.
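The construction of the total index just described, combining the overall index for each topic with the weight that topic contributes to the whole standard, can be sketched as a weighted average. Equation 3.6 is not reproduced in this chapter, so the weighted-average form below is an assumption based on its description.

```python
def total_index(topic_indices, topic_weights):
    """Total index of alignment (IT) for a state test: the overall index
    I for each topic, combined with the weight that topic contributes to
    the whole standard. The weighted-average form is an assumption based
    on the description of Equation 3.6, which is not reproduced here."""
    total_weight = sum(topic_weights)
    return sum(i * w for i, w in zip(topic_indices, topic_weights)) / total_weight
```

For instance, a test with two topics whose overall indices are 0.6 and 0.4, weighted 0.75 and 0.25 in the standard, would obtain IT = 0.55 under this form.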
Trilinear plots were also useful for making comparisons among states depending on their relative position in the graph.

In relation to alignment, the researcher suggested three levels of alignment (Low, Middle, and High), which were applicable to the different criteria and to the indices. The trilinear plot was useful in the definition of these three levels because such levels were traced by lines crossing notable points of the graph (see Figure 12 and subsequent figures). The three levels were defined depending on the values of the alignment criteria and the values of the indices, as follows: the Low range was assigned when the criteria or indices reached values below 0.50; the Middle range when values were above 0.50 but below 0.66; and the High range of alignment was assigned when values were above 0.66. This approach of using ranges was desirable due to the nature of this alignment methodology, where a single cut-off value was not appropriate. Ranges of alignment were considered to be less arbitrary than a single cut score, since some degree of vagueness was embedded in the description of the standards and in the designation of the DOK levels.

Comparative results in terms of cognitive demand showed that the state of Missouri, for example, obtained the third highest DOK score (DOK = 3.38) among all states in English language, after PA-Writing (DOK = 3.72) and MI-Writing (DOK = 3.61). Alignment results in terms of DOK with respect to KSUS-Language showed that MO-English reached the highest value in comparison with the other states in Reading and Comprehension (DOKE = 0.42, DOKA = 0.47) (Figure 12) and in Writing (DOKE = 0.14, DOKA = 0.86) (Figure 14). In the topic Research Skills, MO-English also reached one of the highest values (DOKE = 0.84, DOKA = 0.16) (Figure 16). In the topic Critical Thinking, MO-English reached intermediate values (DOKE = 0.47, DOKA = 0.08) (Figure 18).
Missouri also reached the highest total index of alignment (ITE = 0.57), above the mean (M = 0.37) of all states. This value located Missouri in the Middle range of alignment for English. However, in mathematics MO-Math obtained a middling position (ITM = 0.41), below the mean (M = 0.46) of all states. This low index value with respect to KSUS-Math, and in relation to the other states, located Missouri in the Low range of alignment for mathematics. This detailed analysis can be applied to all states in all subject matters, and also across grades if such data are available.

Levels of alignment and accountability aspects of state testing were analyzed. The correlation between NAEP gain and the levels of the alignment index could be interpreted as support for the validity of the alignment methodology; however, a word of caution is needed. The test results data were not ideal. NAEP is administered every four years; hence the closest available data were from 1996 and 2000, while the raters in this study selected state tests from 2002. As noted above, the acceptable correlation between high levels of alignment of state tests with the KSUS and student achievement (NAEP gain) can be interpreted as a tendency of well-aligned states to show higher student performance.

The quantitative index of alignment constructed in this study was manipulated effectively in the digital domain using computational tools. Besides the construction of the index of alignment, a series of computational tools and graphical representations of the alignment criteria and the index itself were developed. The trilinear plot was exceptionally useful for representing graphically the DOK of the tests, and of great value for making comparisons between the tests and the KSUS, as well as among the tests themselves. The centroid plot was useful for visualizing the index of alignment, allowing a finer granularity in the alignment methodology.
Such visual representations helped to improve the understanding of the alignment process while providing valuable tools to disclose visual patterns, new relationships, and connections not seen in tables and flat graphs. As predicted in the framework of this study, this quantitative methodology of alignment provided the capability to measure alignment from a variety of perspectives, and the ability to make comparisons across topics of a subject matter, across grades, and across states. This was a comprehensive methodology that collectively told the alignment story from its multiple perspectives and dimensions, a methodology anchored in the process of managing the expert judgment imparted by the raters.

Implications of the Findings

The alignment methodology developed in this study improves our understanding of the complexity of alignment and provides better tools for judging educational alignment, in particular content alignment between standards and assessments. The computational and graphical tools developed are powerful means of describing and measuring alignment from different perspectives. The detailed analysis of state tests in terms of their cognitive demand (DOK) is in itself a powerful tool for school districts. The creation and utilization of an alignment index is expected to advance research in systemic school reform, and specifically to advance alignment research.

Alignment analysis is also important as a tool for decision making in schools, districts, and states. Educational leaders, after knowing the alignment status of the schooling system, can take corrective action to appropriately align their educational system should poor alignment exist. Also, when alignment is not possible or desirable, decision makers can take appropriate actions based upon the knowledge of alignment, misalignment, or the other results this methodology can provide.
The index of alignment developed in this study may be used to provide evidence of validity for those tests that are well aligned to their respective content standards. The tools developed in this study could be utilized by school districts in order to comply with the No Child Left Behind (NCLB) legislation. Alignment between standards and tests in K-12 is a fundamental component of the NCLB mandate, and the US Department of Education is closely monitoring this directive.

Limitations of the Study

The index defined in this research does not exhaust the totality of the alignment between standards and assessment, although it does give a suitable indication of the content (Content Focus) alignment category as it is addressed in current research. High levels of alignment were not expected in this study because the definition of the KSUS did not take into consideration any state test, and the KSUS were not designed for that purpose. However, reasonable levels of alignment were found between the KSUS and state tests in the two subject matters analyzed. It is important to emphasize that the alignment results shown in this study do not constitute a judgment about the quality of the state tests, since this study was intended as an exercise to illustrate the use of the methodology developed.

Very little information about the state tests was collected. More information could have helped to clarify the differences in the DOK scores and in the values of the index of alignment among the state assessments. Although the levels of DOK adopted by the raters (from 1 to 5) gave more room for selection, in comparison with other studies where only three levels were used, some raters reported difficulty in using the scale. Nevertheless, the generalizability results showed acceptable levels of agreement among the raters.

The data available from the NAEP and SAT tests were not the best data for making comparisons with the levels of alignment of the index.
NAEP is not given every year, so these limitations could undermine the interpretation of the correlation results. However, the results shown here, although provisional, could be interpreted as desirable trends relating the levels of alignment between state tests and college expectations to student performance. State tests with higher levels of alignment are expected to be more oriented toward those expectations that higher education institutions require from graduating seniors. The correlation between levels of alignment, NAEP scores, and SAT scores established a relationship among levels of alignment, student performance in K-12, and success in college.

In order to reach this conclusion, some prior considerations were needed. First, the dimensions of alignment (range, depth, and balance) should be sufficient to give a suitable indication of content match. Second, the definition of an alignment index as a linear function of the alignment criteria should be plausible enough to genuinely represent a measurement of alignment. Third, the raters' data should be reliable. Fourth, the KSUS should be considered high standards. Fifth, higher indices of alignment should indicate that state tests also comply with high state standards. And sixth, state tests aligned to high standards should impact student performance positively. All of the assumptions above are taken for granted for reasons already discussed in this study. Evidence of validity for this novel methodology cannot be acquired in a single experiment, so additional utilization is necessary to accumulate evidence of validity for this alignment approach.

Recommendations for Further Research

This is a work in progress. The methodology developed in this study could be used to measure alignment in other subject matters and in other contexts.
For example, the index of alignment constructed here could be used as a variable to be compared with student achievement on the respective state tests and standards, since a well-aligned school system should make an impact on student outcomes. Moreover, the tools developed in this study could also be used in other contexts, such as the alignment between the content of instruction and standards. Comparisons with other accountability features of state testing, such as ACT scores, K-12 graduation rates, and college admission, would also be valuable to perform. A closer look at the Missouri test should be taken in order to find reasons for its high DOK and its high alignment scores in the English subject matter, since this could be the product of a coincidental factor unless a connection between Missouri's state standards and the KSUS can be established.

The graphical tools developed proved to be useful in the presentation of results from different perspectives. Alignment research could be advanced by enhancing the computational tools introduced, which are aimed at discovering special patterns and relationships among variables that eventually lead to the formulation of hypotheses that can be tested in subsequent, more formal analysis. Further work is recommended in order to extend and improve the methodology, as well as to accumulate evidence of validity for this alignment approach through subsequent utilization.

REFERENCES

Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.

American Educational Research Association. (1999). Standards for educational and psychological testing. Washington, DC: Author.

Baker, E. L., & Linn, R. L. (2000, Winter). Alignment: Policy goals, policy strategies, and policy outcomes. The CRESST Line. Stanford, CA: National Center for Research on Evaluation, Standards, and Student Testing.

Bishop, J. (1998). The effect of curriculum-based external exit exam systems on student achievement.
Journal of Economic Education, 29(2), 171-183.

Blank, R. K., Kim, J. J., & Smithson, J. (2000). Survey results of urban schools classroom practice in mathematics and science: 1999 report (Monograph No. 2). Norwood, MA: Systemic Research, Inc.

Blank, R. K., Porter, A., & Smithson, J. (2001). New tools for analyzing teaching, curriculum and standards in mathematics & science. Washington, DC: Council of Chief State School Officers.

Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (Eds.). (1956). Taxonomy of educational objectives: The classification of educational goals. Handbook I: Cognitive domain. New York: David McKay.

Brennan, R. L. (2001). Generalizability theory. New York: Springer-Verlag.

Buckendahl, C. W., Plake, B. S., Impara, J. C., & Irwin, P. M. (2000). Alignment of standardized achievement tests to state content standards: A comparison of publishers' and teachers' perspectives. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.

Center for Research on Evaluation, Standards, and Student Testing (CRESST). (n.d.). Los Angeles, CA: University of California at Los Angeles. Retrieved November 20, 2002, from http://www.cse.ucla.edu/index.htm.

Cohen, S. A. (1987). Instructional alignment: Searching for a magic bullet. Educational Researcher, 16(8), 16-20.

Conley, D. (2002). Standards for success: Annual report. Eugene, OR: University of Oregon, Center for Educational Policy Research.

Conley, D., & Brown, R. (in press). State high school assessments and standards for college success: Do they connect? Education Administration Quarterly.

Council of Chief State School Officers (CCSSO). (n.d.). Retrieved November 29, 2002, from http://www.ccsso.org/.

Consortium for Policy Research in Education. (2000). Bridging the K-12/postsecondary divide with a coherent K-16 system (Policy Briefs). Philadelphia, PA: University of Pennsylvania.

Cronbach, L. J., Gleser, G.
C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements. New York: Wiley.

Curriculum Alignment Project (CAP). (n.d.). Indiana Department of Education. Retrieved October 10, 2002, from http://www.niesc.k12.in.us/esc7sdev/cap1.htm and from http://hammond.k12.in.us/curralign.htm.

Feuer, M. J., Holland, P. W., Green, B. F., Bertenthal, M. W., & Hemphill, F. C. (1999). Uncommon measures: Equivalence and linkage among educational tests. Washington, DC: National Academy Press.

Fuhrman, S. H. (1999, January). The new accountability (Policy Briefs). Philadelphia, PA: University of Pennsylvania, Consortium for Policy Research in Education.

Herman, J. L., Webb, N., & Zuniga, S. (2003). Alignment and college admissions: The match of expectations, assessment, and educators' perspectives (CSE Technical Report 593). Los Angeles, CA: University of California at Los Angeles, Center for the Study of Evaluation.

Impara, J. C. (2001). Alignment: One element of an assessment's instructional utility. Paper presented at the annual meeting of the National Council on Measurement in Education, Seattle, WA.

Impara, J. C., Plake, B. S., & Buckendahl, C. W. (2000). The comparability of norm-referenced achievement tests as they align to Nebraska's language arts content standards. Paper presented at the Large Scale Assessment Conference, Snowbird, UT.

Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112(3), 527-535.

Kirst, M. (1998). Improving and aligning K-16 standards, admission, and freshman placement policies. Stanford, CA: Stanford University, National Center for Postsecondary Improvement.

Kirst, M., & Venezia, A. (2001). Bridging the great divide between secondary schools and postsecondary education. Phi Delta Kappan, 83(1), 92-97.

La Marca, P. M. (2001). Alignment of standards and assessment as an accountability criterion. Practical Assessment, Research & Evaluation, 7(21).

La Marca, P.
M., Redfield, D., Winter, P., Bailey, A., & Despriet, L. (2000). State standards and state assessment systems: A guide to alignment. Washington, DC: Council of Chief State School Officers.

Lashway, L. (1999). Holding schools accountable for achievement. (ERIC Document Reproduction Service No. 434 381)

Le, V., Hamilton, L., & Robyn, A. (2000). Alignment among secondary and postsecondary assessment in California [Online]. Crucial issues in California education, Chapter 9. Retrieved September 29, 2002, from http://pace.berkeley.edu/pace_crucial_issues.html.

Lewis, A. C. (1997). Figuring it out: Standards-based reforms in urban middle grades. Retrieved October 13, 2002, from http://www.middleweb.com/figuring.html.

Marzano, R. J. (2001). Designing a new taxonomy of educational objectives. Thousand Oaks, CA: Corwin Press.

National Council on Education Standards and Testing. (1992). Raising standards for American education: A report to Congress, the Secretary of Education, the National Education Goals Panel, and the American people. Washington, DC: Author.

National Education Goals Panel. (2000). Minnesota & TIMSS: Exploring high achievement in eighth grade science. Washington, DC: Author.

National Education Association (NEA). (2002). Alignment of curriculum and tests to standards. Retrieved August 21, 2002, from http://www.nea.org/accountability/alignment.html.

Nelson, G. D. (2002). Benchmarks and standards as tools for science education reform. American Association for the Advancement of Science. Retrieved October 19, 2002, from http://www.project2061.org/newsinfo/research/nelson/nelson1.html.

Porter, A. C., & Smithson, J. L. (2001a). Are content standards being implemented in the classroom? A methodology and some tentative answers. In S. H. Fuhrman (Ed.), From the capitol to the classroom: Standards-based reform in the states, Part II (pp. 60-80). National Society for the Study of Education. Chicago, IL: University of Chicago Press.

Porter, A. C., & Smithson, J.
L. (2001b). Defining, developing, and using curriculum indicators (Rep. No. RR-048). Philadelphia, PA: University of Pennsylvania, Consortium for Policy Research in Education.

Porter, A. C. (2002). Measuring the content of instruction: Uses of research and practice. Educational Researcher, 31(7), 3-14.

Powell, A. G. (1996). Motivating students to learn: An American dilemma. In S. Fuhrman & J. O'Day (Eds.), Rewards and reform: Creating educational incentives that work. San Francisco: Jossey-Bass.

Project 2061. (n.d.). American Association for the Advancement of Science. Retrieved February 3, 2003, from http://www.project2061.org/.

Rothman, R., Slattery, J. B., Vranek, J. L., & Resnick, L. B. (2002). Benchmarking and alignment of standards and testing (CSE Technical Report 566). Los Angeles, CA: University of California at Los Angeles, Center for the Study of Evaluation.

Resnick, L. B., & Resnick, D. P. (1992). Assessing the thinking curriculum: New tools for educational reform. In B. R. Gifford & M. C. O'Connor (Eds.), Changing assessment: Alternative views of aptitude, achievement, and instruction (pp. 37-75). Boston, MA: Kluwer Academic.

Smith, M. S., & O'Day, J. A. (1991). Systemic school reform. In S. H. Fuhrman & B. Malen (Eds.), The politics of curriculum and testing: The 1990 yearbook of the Politics of Education Association (pp. 233-267). New York, NY: Falmer Press.

Subkoviak, M. J. (1988). A practitioner's guide to computation and interpretation of reliability indices for mastery tests. Journal of Educational Measurement, 25(1), 47-55.

Tafel, J., & Eberhart, N. (1999). Statewide school-college (K-16) partnerships to improve students' performance. State Higher Education Executive Officers. Retrieved February 8, 2003, from http://www.sheeo.org/publicat/pub-k16.htm.

The Bridge Project. (2000). Strengthening K-16 transition policies. Stanford, CA: Stanford University. Retrieved November 20, 2002, from http://www.stanford.edu/group/bridgproject.
Wainer, H. (1997). Visual revelations. Mahwah, NJ: Lawrence Erlbaum Associates.

Webb, N. L. (1997). Criteria for alignment of expectations and assessment in mathematics and science education (Research Monograph No. 8). Washington, DC: Council of Chief State School Officers.

Webb, N. L. (1999). Alignment of science and mathematics standards and assessments in four states (Research Monograph No. 18). Madison, WI: National Institute for Science Education.

Webb, N. L. (2001). Alignment analysis of State F language arts standards and assessments, Grades 5, 8, and 11. Washington, DC: Council of Chief State School Officers.

Webb, N. L. (2002). An analysis of the alignment between mathematics standards and assessment for three states. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.

APPENDICES

Appendix A. Prototype Model (Excel) for the Alignment Index

Appendix B. Prototype Model (Software) for the Alignment Index

Appendix C. ENGLISH KEY KNOWLEDGE AND SKILLS

I. Reading Comprehension

IA. The student will use reading skills and strategies to understand literature and information texts.
IA.1. Use reading skills and strategies to understand a variety of informational texts: instructions for software, job descriptions, college applications, historical documents, government publications, newspapers, textbooks.
IA.2. Use monitoring and self-correction methods and know when to read aloud.
IA.3. Engage critically with the text: annotating, questioning, agreeing or disagreeing, summarizing, critiquing, formulating own responses.
IA.4. Understand narrative terminology: author versus narrator, historical versus implied author, historical versus present-day reader.
IA.5. Use reading skills and strategies to understand a variety of types of literature: epic piece (Iliad) or lyric poem, narrative novels, and philosophical pieces.
IA.6.
Understand plots and character development in literature, including characters' motives, causes for actions, and the credibility of events.
IA.7. Understand vocabulary and content: subject-area terminology, connotative and denotative meanings, idiomatic meanings.
IA.8. Understand basic beliefs, perspectives, and philosophical assumptions underlying an author's work: point of view, attitude, or values conveyed by specific use of language.
IA.9. Use a variety of strategies to understand the origins and meanings of new words: analyzing word roots and affixes, recognizing cognates, using context clues, determining word derivations.
IA.10. Make supported inferences and draw conclusions based on textual features: evidence in text, format, language use, expository structures, arguments used.

IB. The student will be able to discuss with understanding the defining characteristics of literature and techniques of a variety of forms and genres.
IB.1. Know the salient characteristics of major types and genres of literature: novels, short stories, horror stories, science fiction, biographies, autobiographies, poems, plays, etc.
IB.2. Distinguish the formal constraints of different types of texts: Shakespearean sonnets versus free verse.
IB.3. Understand literary devices used to influence the reader and evoke emotions: imagery, characterization, choice of narrator, use of sound, formal and informal language.
IB.4. Be able to discuss with understanding the effects of an author's style and literary devices on the overall quality of literary works: allusions, symbols, irony, voice, flashbacks, foreshadowing, time and sequence, mood.
IB.5. Know archetypes, such as universal destruction, journeys and tests, and banishment, that appear across a variety of types of literature: American literature, world literature, myths, propaganda, religious texts.
IB.6.
Be able to discuss with understanding themes such as initiation, love and duty, heroism, death and rebirth, that appear across a variety of literary works and genres.
IB.7. Evaluate literature based on ambiguities, subtleties, and contradictions in a text, and on aesthetic qualities of style, such as diction or mood.

IC. The student will be familiar with a range of world literature.
IC.1. Have some familiarity with major literary periods of English and American literature and their characteristic forms, subjects, and authors.
IC.2. Have some familiarity with authors from literary traditions beyond the English-speaking world.
IC.3. Have some familiarity with major works of literature produced by American and British authors.

ID. The student will be able to discuss with understanding the relationships between literature and its historical and social contexts.
ID.1. Know major historical events that may be encountered in literature.
ID.2. Demonstrate familiarity with the concept that historical, social, and economic contexts influence form, style, and point of view, and that social influences affect an author’s descriptions of character, plot, and setting.
ID.3. Demonstrate familiarity with the concept of the relativity of all historical perspectives, including their own.
ID.4. Be able to discuss with understanding the relationships between literature and politics: the political assumptions underlying an author’s work, the impact of literature on political movements and events.

II. Writing

IIA. The student will know how to use basic grammar conventions to write clearly.
IIA.1. Identify parts of speech correctly and consistently: nouns, pronouns, verbs, adverbs, conjunctions, prepositions, adjectives, interjections.
IIA.2. Use subject-verb agreement and consistent verb tense.
IIA.3. Use pronoun agreement and different types of clauses and phrases appropriately: adverb clauses, adjective clauses, adverb phrases.
IIB.
The student will know conventions of punctuation and capitalization.
IIB.1. Use commas with nonrestrictive clauses and contrasting expressions.
IIB.2. Use ellipses, colons, hyphens, semicolons, apostrophes, and quotation marks correctly.

IIC. The student will know conventions of spelling.
IIC.1. Use a dictionary and other resources to spell new, unfamiliar, or difficult words.
IIC.2. Differentiate between commonly confused terms: “its” and “it’s”, “affect” and “effect.”
IIC.3. Know how to use the spellchecker function in word processing software and know the limitations of relying upon a spellchecker.

IID. The student will use writing conventions to write clearly and coherently.
IID.1. Know and use several prewriting strategies: develop a focus, determine the purpose, plan a sequence of ideas, use structured overviews, create outlines.
IID.2. Use paragraph structure in writing: construct coherent paragraphs, arrange paragraphs in logical order.
IID.3. Use a variety of sentence structures appropriately in writing: compound, complex, compound-complex, parallel, repetitive, analogous.
IID.4. Present ideas so as to achieve overall coherence and logical flow in writing; use appropriate techniques to maximize cohesion (transitions, repetition).
IID.5. Use writing conventions and documentation formats: style sheet methods such as MLA and APA; bibliography of sources.
IID.6. Demonstrate development of a unique style and voice in writing, in a controlled fashion, where appropriate.
IID.7. Use words correctly; use words that mean what the writer intends to say; use a varied vocabulary.

IIE. The student will use writing to communicate ideas, concepts, emotions, and descriptions to the reader.
IIE.1. Know the difference between a topic and a thesis.
IIE.2. Articulate a position through a thesis statement and advance it using evidence, examples, and counterarguments relevant to the audience or issue at hand.
IIE.3.
Use a variety of methods to develop arguments: use comparison-contrast reasoning; develop and sustain logical arguments (inductive-deductive); alternate appropriately between the general and the specific (make connections between public knowledge and personal observation and experience).
IIE.4. Write to persuade the reader: anticipate and address counterarguments, use rhetorical devices, develop an accurate and expressive style of communication (move beyond mechanics, add flair and elegance to writing).
IIE.5. Use strategies to adapt writing for different audiences and purposes: include appropriate content; use appropriate language, style, tone, and structure; consider the audience’s background.
IIE.6. Distinguish between formal and informal styles: formal papers, personal reflections, informal letters, memos.
IIE.7. Use appropriate strategies to write expository essays: include supporting evidence; use information from primary and secondary sources; use charts, graphs, tables, and illustrations where appropriate; anticipate and address the reader’s biases and expectations; use technical terms and notations. Use appropriate strategies and formats to write personal and business correspondence: appropriate organizational patterns, formal language and tone.
IIE.8. Use strategies to write fictional, autobiographical, and biographical narratives: develop point of view and literary elements, present events in logical sequence, convey a unifying theme or tone, use concrete and sensory language, pace action.

IIF. The student will use, in priority fashion, a variety of strategies to revise and edit written work to achieve maximum improvement in the time available.
IIF.1. Review ideas and structure in substantive ways; improve depth of information and logic of organization; rethink the appropriateness of the writing in light of genre, purpose, and audience.
IIF.2. Use feedback from others to revise one’s own written work.

III. RESEARCH SKILLS

IIIA. The student will understand and use research methodologies.
IIIA.1.
Formulate research questions, refine topics, develop a plan for research, and organize what is known about the topic.
IIIA.2. Use research to support and develop one’s own opinion, as opposed to simply restating existing information or opinions.
IIIA.3. Identify through research the major concerns and debates in a given community or field of inquiry and address these in one’s writing.
IIIA.4. Identify claims in one’s writing that require outside support or verification.

IIIB. The student will know how to find a variety of sources and use them properly.
IIIB.1. Collect information to narrow and develop a topic and support a thesis.
IIIB.2. Understand the difference between primary and secondary sources.
IIIB.3. Use a variety of primary and secondary sources, print or electronic: books, magazines, newspapers, journals, periodicals, the Internet.
IIIB.4. Critically evaluate sources: discern the quality of the materials, qualify the strength of the evidence and arguments, determine credibility, identify the bias and perspective of the author, use prior knowledge to judge; particularly as applied to Internet sources.
IIIB.5. Use sources to write research papers: integrate information from sources, logically introduce and incorporate quotations, synthesize information in a logical sequence, identify different perspectives, identify complexities and discrepancies in information, offer support for conclusions.
IIIB.6. Understand the concept of plagiarism and how (or why) to avoid it: paraphrasing, summarizing, quoting; particularly as applied to Internet sources.

IV. CRITICAL THINKING SKILLS

IVA. The student will demonstrate connective intelligence.
IVA.1. Be able to discuss with understanding how personal experiences and values affect reading comprehension and interpretation.
IVA.2.
Show ability to make connections between the component parts of a text one is reading or writing and the larger theoretical structures: presupposition, audience, purpose, writer’s credibility or ethos, types of evidence or material being used, and style (including correctness).

IVB. The student will demonstrate the ability to think independently.
IVB.1. Be comfortable formulating and expressing one’s own ideas.
IVB.2. Support one’s argument with logic and evidence that is relevant to one’s audience and that explicates one’s position as fully as possible.
IVB.3. Fully understand the scope of one’s argument and the claims underlying it.
IVB.4. Reflect on and assess the strengths and weaknesses of one’s ideas and their expression.

Appendix D
MATH KEY KNOWLEDGE AND SKILLS

I. COMPUTATION

IA. The student will know basic mathematics operations.
IA.1. Use arithmetic operations with fractions (e.g., add and subtract by finding a common denominator, multiply and divide, reduce).
IA.2. Use exponents and scientific notation.
IA.3. Use radicals correctly.
IA.4. Understand relative magnitude.
IA.5. Calculate using absolute value.
IA.6. Know terminology for integers, rational numbers, irrational numbers, and complex numbers.
IA.7. Use the correct order of arithmetic operations, particularly demonstrating facility with the Distributive Law.

IB. The student will know and carefully record symbolic manipulations.
IB.1. Use mathematical symbols and language appropriately (e.g., equal signs, parentheses, superscripts, subscripts).

IC. The student will know and demonstrate fluency with mathematical notation and computation.
IC.1. Perform symbolic addition, subtraction, multiplication, and division.
IC.2. Perform appropriate basic operations on sets (e.g., union, intersection, element of, subsets, complement).
IC.3. Be comfortable with alternative symbolic expressions.

II. ALGEBRA

IIA. The student will know and apply basic algebraic concepts.
IIA.1.
Use the distributive property to multiply polynomials.
IIA.2. Divide polynomials (e.g., long division).
IIA.3. Factor polynomials (e.g., difference of squares, perfect square trinomials, difference of two cubes, and trinomials like x^2 + 3x + 2).
IIA.4. Add, subtract, multiply, divide, and simplify rational expressions, including finding common denominators.
IIA.5. Understand properties and basic theorems of roots and exponents (e.g., (x^2)(x^3) = x^5 and (√x)^3 = x^(3/2)).
IIA.6. Understand properties and basic theorems of logarithms (to bases 2, 10, and e).
IIA.7. Know how to compose and decompose functions and find inverses of basic functions.

IIB. The student will use various techniques to solve basic equations and inequalities.
IIB.1. Solve linear equations and absolute value equations.
IIB.2. Solve linear inequalities and absolute value inequalities.
IIB.3. Solve systems of linear equations and inequalities using algebraic and graphical methods (e.g., substitution, elimination, addition, graphing).
IIB.4. Solve quadratic equations using various methods and recognize real solutions:
IIB.4a. Factoring
IIB.4b. Completing the square
IIB.4c. The quadratic formula

IIC. The student will distinguish between expression, formula, equation, and function.
IIC.1. Distinguish between expression, formula, equation, and function, and recognize when simplifying, solving, substituting in, or evaluating is appropriate (e.g., expand the expression (x + 3)(x + 1); substitute a = 3 and b = 4 into the formula a^2 + b^2 = c^2; solve the equation 0 = (x + 3)(x + 1); evaluate the function f(x) = (x + 3)(x + 1)).
IIC.2. Understand the concept of a function beyond it being a type of algebraic expression.
IIC.3. Know how to use polynomials and exponential functions in applications.
IIC.4. Use a variety of models (e.g., written statement, algebraic formula, table of input-output values, graph) to represent functions, patterns, and relationships.
IIC.5.
Understand terminology and notation used to define functions (e.g., domain, range).
IIC.6. Understand the general properties and characteristics of basic types of functions (e.g., polynomial, rational, exponential, logarithmic, trigonometric).

IID. The student will understand the relationship between equations and graphs.
IID.1. Understand basic forms of the equation of a straight line and how to graph the line without a calculator.
IID.2. Understand the basic shape of a quadratic function and the relationships between the roots of the quadratic and the zeroes of the function.
IID.3. Know the basic shapes of the graphs of exponential and logarithmic functions, including exponential decay.

IIE. The student will know how to use algebra both procedurally and conceptually.
IIE.1. Recognize which type of model (i.e., linear, quadratic, exponential) best fits the context of a basic equation.

IIF. The student will demonstrate the ability to work algebraically with formulas and symbols.
IIF.1. Know formal notation (e.g., sigma notation, factorial representation) and series of geometric and arithmetic progressions.

III. TRIGONOMETRY

IIIA. The student will know and understand basic trigonometric principles.
IIIA.1. Know the definitions of sine, cosine, and tangent using right triangle geometry and similarity relations.
IIIA.2. Understand the relationship between a trigonometric function in standard form and its corresponding graph (e.g., domain, range, amplitude, period, phase shift, vertical shift).
IIIA.3. Know and use identities for the sum and difference of angles (e.g., sin(x ± y), cos(x ± y), tan(x ± y)).
IIIA.4. Understand periodicity and recognize graphs of periodic functions, especially the trigonometric functions.

IV. GEOMETRY

IVA. The student will understand and use basic plane and solid geometry.
IVA.1. Know properties of similarity, congruence, and parallel lines cut by a transversal.
IVA.2. Know how to figure the area and perimeter of basic figures.
IVA.3.
Understand the ideas behind simple geometric proofs and be able to develop and write simple geometric proofs, such as the Pythagorean theorem, the fact that there are 180 degrees in a triangle, and the fact that the area of a triangle is half the base times the height.
IVA.4. Use geometric constructions to complete simple proofs and to solve problems.
IVA.5. Use similar triangles to find unknown angle measurements and lengths of sides.
IVA.6. Visualize solids and surfaces in 3-dimensional space.
IVA.7. Know basic formulae for the volume and surface area of three-dimensional objects.

IVB. The student will know analytic (i.e., coordinate) geometry.
IVB.1. Know geometric properties of lines (e.g., slope, midpoint of a line segment) and the formula for the distance between two points.
IVB.2. Use the Pythagorean Theorem and its converse and the properties of special right triangles (e.g., the 30°-60°-90° triangle) to solve mathematical and real-world problems (e.g., ladders, shadows, poles).
IVB.3. Recognize geometric translations algebraically.

IVC. The student will understand basic relationships between geometry and algebra.
IVC.1. Understand that objects and relations in geometry correspond to objects and relations in algebra (e.g., a line in geometry corresponds to the set of ordered pairs satisfying an equation ax + by = c).
IVC.2. Know the algebra and geometry of circles, and (for those who intend to study calculus) of parabolas and ellipses.
IVC.3. Use trigonometry as an example of the algebraic/geometric relationship, including the Law of Sines and the Law of Cosines.

V. MATHEMATICAL REASONING

VA. The student will use mathematical reasoning to solve problems.
VA.1. Use inductive and deductive reasoning in basic arguments.
VA.2. Use geometric and visual reasoning.
VA.3. Use multiple representations (e.g., analytic, numerical, geometric) to solve problems.
VA.4. Learn to solve multi-step problems.
VA.5. Use a variety of strategies to revise solution processes.
VA.6.
Experience both proof and counterexample in problem solutions.
VA.7. Be familiar with the process of abstracting mathematical models from word problems, geometric problems, and applications, and of interpreting solutions in the context of these source problems.

VB. The student will be able to work with mathematical notation to solve problems and to communicate solutions.
VB.1. Translate simple statements into equations (e.g., "Bill is twice as old as John" can be expressed by the equation b = 2j).
VB.2. Understand the role of written symbols in representing mathematical ideas and the precise use of special symbols of mathematics.

VC. The student will know a select list of mathematical facts and know how to build upon these.

VD. The student will know how to estimate.
VD.1. Be familiar with decimal approximations of fractions.
VD.2. Know when an estimate or approximation is sufficient in place of an exact answer in a problem situation.
VD.3. Recognize the accuracy of an estimation.
VD.4. Know how to make and use estimates.

VE. The student will understand the appropriate uses of calculators and their limitations.
VE.1. Recognize misinformation that can arise from calculator use.
VE.2. Perform experiments on the calculator.
VE.3. Plot useful graphs.

VF. The student will be able to generalize and to go from the specific to the abstract and back again.
VF.1. Determine the mathematical concept from the context of an external problem, solve the problem, and interpret the mathematical solution in the context of the problem.
VF.2. Know how to use specific instances of general facts and how to look for general results that extend particular ones.

VG. The student will demonstrate active participation in the process of learning mathematics.
VG.1. Be willing to experiment with problems that have multiple solution methods.
VG.2.
Demonstrate an understanding of the mathematical ideas behind the steps of a solution, as well as the solution itself.
VG.3. Show an understanding of how to modify patterns and solution strategies to obtain different solutions.
VG.4. Recognize when a proposed solution does not work, analyze why, and use the analysis to seek a valid solution.

VH. The student will recognize the broad range of applications of mathematical reasoning.
VH.1. Know some mathematical applications used in other fields (e.g., carbon dating, exponential growth, predator/prey models, periodic motion and the interactions of waves, amortization tables).
VH.2. Know some of the roles mathematics has historically played and continues to play.
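As a purely illustrative sketch (not part of the original standards list), the three solution methods named in criterion IIB.4 can each be applied to the trinomial x^2 + 3x + 2 cited as an example in IIA.3:

```latex
% Illustration only: solving x^2 + 3x + 2 = 0 by the three methods listed in IIB.4
\begin{align*}
\text{Factoring:} \quad & (x+1)(x+2) = 0 \;\Rightarrow\; x = -1 \text{ or } x = -2 \\
\text{Completing the square:} \quad & \left(x + \tfrac{3}{2}\right)^2 - \tfrac{1}{4} = 0 \;\Rightarrow\; x = -\tfrac{3}{2} \pm \tfrac{1}{2} \\
\text{Quadratic formula:} \quad & x = \frac{-3 \pm \sqrt{3^2 - 4(1)(2)}}{2(1)} = \frac{-3 \pm 1}{2}
\end{align*}
```

All three methods agree on the real solutions x = -1 and x = -2; recognizing that the methods are interchangeable, and choosing among them, is the kind of fluency the criterion asks for.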