Alignment of State Assessments and Higher Education Expectations:
Definition and Utilization of an Alignment Index
By Dr. Gil Fonthal
This research paper defines a quantitative index to measure the gap between secondary education and higher education in the United States.
“Missouri also reached the highest total index of alignment (ITE = 0.57), above the mean (M = 0.37) of all states. This value located Missouri in the middle range of alignment for English.”
Research done at the University of California,
Irvine. 2004.
INTEDCO
P. O. Box 8081
Laguna Hills, CA 92654. USA
(800) 880-1091 (949) 589-2360
www.intedco.org
Alignment of State Assessments and Higher Education Expectations:
Definition and Utilization of an Alignment Index
By Dr. Gil Fonthal
University of California, Irvine and Los Angeles (UCI/UCLA)
© Gil Fonthal, 2004
ABSTRACT
This study developed a quantitative methodology of alignment between standards
and assessments. Such methodology adopted three dimensions to define the
criteria of alignment: range, depth, and balance of content match. These three
criteria, which involve a revised and expanded version of the main categories and
criteria currently used in salient alignment methodologies, are the minimal set of concepts sufficient to characterize the process of alignment between
standards and assessments in the content focus category. This methodology also
includes the construction of an index of alignment as a mathematical function that
is a linear combination of the three alignment criteria adopted. A software
program was developed to prototype possible indices (formulas) of alignment.
Additionally, small computational programs were developed to manipulate the
different alignment criteria and the index itself. Several graphical representations
were created to show results in a visual manner. As a result, an index was adopted that satisfied the required internal mathematical structure, the logical inferences about its function, and performance tests using actual data. This alignment index
was utilized to measure the alignment between a set of higher education
expectations (standards suggested by a consortium of higher education institutions
in the US) and selected state assessments across the country.
Twenty-seven state tests were selected and analyzed to determine their alignment
with higher education expectations. The alignment was carried out in mathematics
and English language subject matters for each state test. In general terms, the
raters (from 4 to 6) found that the university expectations in English language are
more cognitively demanding than the state tests. The raters also found that the
state tests in mathematics are almost as cognitively demanding as the higher
education expectations in this subject matter.
Results showed, on the one hand, that there were substantial differences in the
index of alignment between all state tests and university expectations in English
language. On the other hand, there were small differences in the index of
alignment for mathematics tests. However, in both subject matters, most state
tests obtained total indices of alignment slightly below 0.50, which is considered
moderate to low. The levels of the total index of alignment correlated with NAEP
scores of the states studied. These results provided validity evidence for the
methodology.
CHAPTER 1
INTRODUCTION
Statement of the Problem
Standards-based systemic school reform in the United States considers the
alignment of the public schooling system as a desired goal that would ensure an efficient
and effective educational system. Alignment and continuity among standards,
assessments, curriculum, professional development, and pedagogy is, then, considered a
condition for a healthy school system, and ultimately a condition for student success.
During the last few decades, standards-based school reform in the United States
has made a major impact on the educational system. At the heart of systemic reform is
the concept of alignment (National Council on Education Standards and Testing
(NCEST), 1992; Porter, 2002; Smith & O’Day, 1991). According to reformers, the
alignment of state standards, state assessments, curriculum, teachers’ professional
development, and classroom instruction will ensure high quality in student outcomes
(Fuhrman, 1999).
The definition and introduction of standards at all levels and the increased use of
standardized tests have become ubiquitous in systemic school reform. Across the nation,
a coalition of educational leaders and policy makers advocate standards and high-stakes
testing as the main educational policy to improve the accountability of the public
schooling system. As such, standardized assessment has become the significant means of
quality control of education and an instrument of reform of the educational agenda. Yet,
high-stakes tests remain the center of national debate and intense political rhetoric due to
the important differential consequences for students, teachers, administrators, and schools
as a whole. Despite the consensus about the necessity of an aligned schooling system,
there is neither a widely accepted definition or common terminology of alignment, nor a
proven quantitative methodology to perform or measure such alignment across contexts
in the instructional system (Porter, 2002).
The alignment of the educational system is a complex issue. Alignment among
specific components of the instructional system (standards, assessment, curriculum,
professional development, and pedagogy) has been addressed by different scholars, from
different points of view, since the inception of standards-based reform (Cohen, 1987;
NCEST, 1992; Smith & O’Day, 1991). Comprehensive information on research, policies,
and resources in regard to standards and alignment may be found through National
Center for Research on Evaluation Standards and Student Testing (CRESST); Tools for
Auditing National Standards-based Education of the National Education Association
(NEA); Project 2061 of the American Association for the Advancement of Science
(AAAS); Council of Chief State School Officers (CCSSO); and Curriculum Alignment
Project (CAP) in Indiana. Several studies from these projects are discussed here.
Alignment procedures among curricula, assessments, and standards have been
widely used in school districts across the nation (Buckendahl, Plake, Impara, & Irwin,
2000; Rothman, Slattery, Vranek, & Resnick, 2002). Several methodologies to ensure
and measure alignment among components of the instructional systems have been
developed (Webb, 1997, 1999; Porter & Smithson, 2001a, 2001b). Detailed analysis of
these methodologies will be included in this study.
There is a parallel between the alignment methodology and content-related
evidence of validity in educational measurement. For example, in the Standards for
Educational and Psychological Testing (American Educational Research Association,
1999), evidence of validity based on test content is defined as follows:
Evidence based on test content can include logical or empirical analyses of the
adequacy with which the test content represents the content domain and of the
relevance of the content domain to the proposed interpretation of test scores.
Evidence based on content can also come from expert judgment of the
relationship between parts of the test and the construct (p. 11).
The term “content domain” above is equivalent to state content standards in terms
of current alignment terminology. As a consequence, the two spheres of validity and
alignment intersect each other, which emphasizes their dual importance in standards-based school reform. Moreover, the methodology developed in this study may provide evidence of validity for the use of well-aligned assessments for certain purposes of
educational measurement such as graduation, college placement, and college admission
decisions.
During the last decade, as part of the standards-based movement, there has been
debate about a perceived mismatch between state tests and state standards (Resnick &
Resnick, 1992). A related issue is that institutions of higher education are raising concern
about the performance of freshmen entering the system (Powell, 1996; Kirst, 1998).
Additionally, the different standardized tests used as part of college admission criteria
have been under criticism for a variety of reasons (Kirst, 1998). Among these criticisms
are the lack of alignment between college admission assessment instruments and
secondary school tests and standards (Le, Hamilton, & Robyn, 2000). For example, the
Consortium for Policy Research in Education (CPRE, 2000) has found that in the
southern states, 75 different college placement tests are used, without any consideration
of secondary state standards. Further, educational leaders believe that the expectations of
institutions of higher education are divorced from what K-12 leaders expect from their
senior graduates (Bishop, 1996; Kirst & Venezia, 2001; La Marca, Redfield, Winter,
Bailey, & Despriet, 2000). As such, the challenge of alignment permeates the K-16
educational system (The Bridge Project, 2000).
Only recently have some alignment methodologies and documentation appeared
in the area of content alignment (Impara, 2001; La Marca et al., 2000; Porter & Smithson,
2001a; Webb, 1997). Most of these alignment methodologies are qualitative assertions
based on a cell-by-cell comparison criterion, in which a threshold has been arbitrarily
predefined. This kind of alignment is usually referred to as content alignment because its
objective is to match content covered between particular components of the instructional
system. The current alignment methodologies utilized by different authors, especially their terminology and criteria, are diverse and sometimes confusing. Consequently, a
comprehensive language to describe alignment and a sound quantitative methodology to
measure alignment across the different contexts of the educational system is needed
(Porter, 2002). As Herman, Webb, & Zuniga (2003) pointed out: “[W]e also need better
methodologies for judging alignment, methodologies that recognize the meaning and
complexity of the concept of alignment and that can support better reform goals” (p. 2).
Purpose of the Study
The main purpose of the study is to develop a quantitative methodology to
measure alignment between standards and assessments. The existing alignment
methodology includes qualitative and quantitative concepts (Porter & Smithson, 2001a;
Webb, 1997), which address different criteria used to measure alignment between
standards and assessment. As such, in the present study, the researcher proposes an index
of alignment that takes into consideration an extension of the qualitative and quantitative
criteria defined by different scholars in alignment research. The alignment index is a
mathematical formula as a function of the main alignment criteria defined in the
conceptual framework of this study. Such an index could be expressed as follows:
I = f(alignment criteria)
This index is intended to serve in different educational contexts such as subject
matter, grade level, and across states. A quantitative metric, in the form of an alignment
index, will facilitate measurement and comparisons among components of the
instructional system, specifically between standards (content domain) and assessments. In
addition, a quantitative index can be prototyped and manipulated using modern computational tools. Several computational tools were developed to prototype
the index and to manipulate quantitatively the three alignment criteria defined. The index
of alignment defined in this study was also utilized to match the content between a set of
higher education expectations (standards) and selected state assessments. The higher
education expectations, known as the Key Knowledge and Skills for University Success
(KSUS), were defined by a consortium of United States universities, under the auspices
of the American Association of Universities (AAU) (Conley, 2002).
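As a concrete illustration, the index as a linear combination of the three criteria might be sketched as follows. The equal weights and the criterion values are hypothetical; the actual functional form and weights are derived in Chapter 3.

```python
# Hypothetical sketch of I = f(alignment criteria) as a linear combination.
# The weights here are illustrative only; the actual formula is in Chapter 3.

def alignment_index(rok: float, dok: float, bok: float,
                    weights: tuple = (1/3, 1/3, 1/3)) -> float:
    """I = w1*ROK + w2*DOK + w3*BOK, with each criterion in [0, 1]."""
    w1, w2, w3 = weights
    assert abs(w1 + w2 + w3 - 1.0) < 1e-9, "weights should sum to 1"
    return w1 * rok + w2 * dok + w3 * bok

# A test covering 60% of the standards, matching depth on 50% of its items,
# and scoring 0.40 on balance obtains I = 0.50 with equal weights.
print(round(alignment_index(0.60, 0.50, 0.40), 2))
```

Because the weights sum to 1 and each criterion lies in [0, 1], the index itself stays in [0, 1], consistent with the range given in the definition of terms.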
Objectives of the Study
There are three main objectives of the study. The first objective is to adopt a
common and concise alignment language, taken from the extensive current terminology.
This language serves to analyze and describe the alignment criteria and to systematize the
process of measuring alignment concisely. This minimal language is needed to reconcile and resolve differences in the alignment terminology used by different authors; it is established in the framework and in the definition-of-terms sections of this study.
The second objective is to quantify the alignment methodology by proposing a metric. This objective includes the development of a meaningful mathematical formula
(alignment index) that defines and measures the levels (in proportions or percentages) of
content match between standards and assessments. The definition of this index intends to
bring together and extend the two methodologies most currently used in alignment research, which are Porter's (2002) and Webb's (1997, 1999) approaches. Besides the
definition of the index of alignment, a series of computational tools (software) have been
created in order to prototype and visualize the index, and to manipulate the different
alignment criteria defined in the conceptual framework of this study.
The third objective is for the alignment index to be used to measure the alignment
between a set of higher education expectations (standards/content domain) that a
consortium of United States universities has put together (Conley, 2002) and current state
assessments across the country. These higher education expectations are equivalent to
state standards in their form and content. This third objective intends to evaluate the
properties and performance of the alignment index proposed, as well as to provide
evidence of its validity. Alignment results were also used to find relationships with
accountability factors of high-stakes testing, such as the National Assessment of
Educational Progress (NAEP).
Significance of the Study
An advanced quantitative methodology of alignment is needed in current
standards-based school reform (Porter, 2002; Herman, 2003). Such alignment
methodology should improve our understanding of the complexity of alignment and should provide better tools for judging content alignment in the educational system. This
research effort is expected to contribute to the fulfillment of this need.
The study of alignment is important because educational leaders and policy
makers need to know about the alignment status of the educational system. Having such
knowledge will enable them to make instructional decisions that improve the quality of
education. In addition, knowing the alignment status allows decision makers to take
corrective action to appropriately align the educational system where poor alignment
exists. More importantly, strengthening the alignment between higher education
expectations and K-12 curriculum and assessment may serve to improve the opportunities
for all students to enter and succeed in higher education. Several software programs have
been developed and are intended to expand the tools available to quantify the alignment
between standards and assessments. Also, novel visual representations are used as
graphical aids to describe, analyze, and measure alignment. Finally, this study is expected
to advance the research in standards-based school reform and, in particular, the research
on educational alignment.
Research Questions
The five fundamental research questions addressed by this study are:
Research Question 1. What is a minimum terminology necessary and sufficient to
analyze and describe the criteria in the process of measuring content alignment?
Research Question 2. What are the mathematical form and characteristics of a quantitative metric (alignment index) to measure alignment between expectations and
assessment across instructional contexts?
Research Question 3. To what extent are state assessments and higher education
expectations (KSUS) aligned according to the index defined?
Research Question 4. What alignment comparisons can be made between KSUS
and state assessment across states and across subject matter using this index?
Research Question 5. What relationships could be established between the levels of alignment and accountability features of state testing?
Definition of Terms
Alignment: The match and continuity among the main components of the
educational systems such as standards, assessments, curriculum, professional
development, and pedagogy. For the purposes of this study, the term alignment will be
used in the context of standards and assessments.
Alignment Index: Quantitative formula intended to measure (in proportions or
percentages) the alignment between standards and assessments. It ranges from 0 to 1,
where 1 indicates perfect alignment. Three different indices are defined in this study: the Partial Index, with respect to standards, I(S), or with respect to assessments, I(A), for each subject topic; the Overall Index (I), a combination of the two partial indices for each subject topic; and the Total Index (IT), the final index for each state test including all topics in a subject matter. Detailed descriptions for these indices are in the conceptual framework
of this study in Chapter 3.
Range of Knowledge (ROK): Alignment criterion. Range or span of content
coverage among components of the instructional system. Specifically applied in this
study to standards and assessments. ROK is usually expressed as the proportion of
standards (or assessments) addressed by the tests (or standards). When ROK is measured
from assessments with respect to standards, it is expressed as ROKS. Conversely, when
ROK is measured from standards with respect to assessments, it is expressed as ROKA.
Detailed definitions of these terms are in Chapter 3 under the framework of the study.
Depth of Knowledge (DOK): Alignment criterion. Scale to indicate the levels of
cognitive complexity (cognitive demand) of any component of the instructional system.
Specifically applied in this study to standards and assessments. The values of DOK are
expressed as below (DOKB), equal (DOKE), or above (DOKA) when comparing
standards with respect to assessments. For example, the DOKE value for a test is the proportion of items that have the same DOK value as the respective KSUS. More
information about DOK is in Chapter 3.
Balance of Knowledge (BOK): Alignment criterion. The degree of importance or
emphasis the standards have on the test. BOK is measured as the distribution of questions
(or standards) addressed by each standard (or assessment) throughout the test. When
BOK is measured from assessments with respect to standards, it is expressed as BOKS.
Conversely, when BOK is measured from standards with respect to assessments, it is
expressed as BOKA. A detailed discussion about BOK can be found in Chapter 3.
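To make the three criteria concrete, the following sketch computes ROKS, DOKE, and a balance score from a small, invented item coding. The data and the particular balance formula (a simple evenness measure) are hypothetical; the precise definitions used in this study are those given in Chapter 3.

```python
from collections import Counter

# Hypothetical coding data: each test item is mapped to the standard it
# addresses and rated for depth of knowledge; standards carry their own DOK.
standard_dok = {"S1": 2, "S2": 3, "S3": 1, "S4": 2}            # four standards
item_codes = [("S1", 2), ("S1", 1), ("S2", 3), ("S2", 2), ("S3", 1), ("S3", 1)]

# ROKS: proportion of standards addressed by the test (3 of 4 here).
addressed = {s for s, _ in item_codes}
rok_s = len(addressed) / len(standard_dok)

# DOKE: proportion of items whose DOK equals that of the respective standard.
dok_e = sum(1 for s, d in item_codes if d == standard_dok[s]) / len(item_codes)

# BOKS (illustrative formula): 1 minus half the total deviation of the item
# distribution from a uniform spread over the addressed standards.
counts = Counter(s for s, _ in item_codes)
uniform = len(item_codes) / len(addressed)
bok_s = 1 - sum(abs(c - uniform) for c in counts.values()) / (2 * len(item_codes))

print(rok_s, round(dok_e, 2), bok_s)
```

Here the items are spread evenly over the addressed standards, so the balance score reaches its maximum of 1, while ROKS is 0.75 and DOKE is 4/6.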
Skewness: Quantitative (vector) concept that indicates the contribution of each
alignment criterion to the index. Its modulus denotes the magnitude of the skewness, and
its direction indicates toward which criterion the skewness of the index is oriented.
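One way to realize the skewness vector numerically is sketched below. The interpretation, deviations of each criterion from their common mean with the largest signed deviation giving the direction, is an assumption of this sketch rather than the formal definition from Chapter 3.

```python
import math

# Hypothetical sketch: skewness as the vector of deviations of each alignment
# criterion from their common mean. Its modulus gives the magnitude; the
# criterion with the largest positive deviation gives the direction.
def skewness(criteria):
    mean = sum(criteria.values()) / len(criteria)
    deviations = {name: value - mean for name, value in criteria.items()}
    magnitude = math.sqrt(sum(d * d for d in deviations.values()))
    direction = max(deviations, key=deviations.get)
    return magnitude, direction

mag, toward = skewness({"ROK": 0.75, "DOK": 0.40, "BOK": 0.50})
print(round(mag, 3), toward)
```

With these illustrative values the index leans toward the range criterion: ROK sits 0.20 above the mean of 0.55, so the skewness points toward ROK.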
Standards for Success (S4S): Project fostered by the American Association of
Universities (AAU) aimed at bridging the gap between higher education expectations and K-12 graduation performance. The S4S project, through a series of workshops, collected the data used in this study to test the index of alignment constructed. The
S4S project involved more than 400 educators and administrators from more than 20
higher education institutions.
Key Knowledge and Skills for University Success (KSUS): Higher education
expectations in several subject matters (language, mathematics, science, social science,
secondary language, and humanities/arts) defined by the S4S project. KSUS are
equivalent in content and form to K-12 standards. Appendices C and D list the KSUS for language and mathematics, which are the subject matters analyzed in this research.
CHAPTER 2
REVIEW OF THE LITERATURE
Definition of Alignment
Alignment, in the context of the systemic standards-based school reform, refers to
the match, continuity, and synchronization among the main components of the
instructional system: content standards, assessment, curriculum, professional
development, and classroom practice.
Alignment and School Reform
The hypothesis of an aligned schooling system, as a necessary condition for a
healthy and effective educational system, and ultimately as a guarantee of student
achievement, is built into the essence of the current systemic school reform (NCEST,
1992; Smith & O’Day, 1991). As Porter (2002) stated, “When a system is aligned, all the
messages from the policy environment are consistent with each other, content standards
drive the system, and assessment, materials, and professional development are tightly
aligned to the content standards” (p. 11).
Alignment as a Necessary Condition
Educational reformers believe that the present state of alignment is weak among
all components of the instructional system and agree about the importance of alignment
(Baker & Linn, 2000; Feuer, Holland, Green, Bertenthal, & Hemphill, 1999; Rothman et
al., 2002). The importance of alignment is also expressed in terms of educational
responsibility. States should design assessments aligned to their academic standards to
make justifiable accountability decisions (La Marca, 2001; La Marca et al., 2000;
Lashway, 1999). Impara (2001), after reviewing the different methodologies of
alignment, suggested the need to conduct appropriate large-scale alignment studies to
inform instruction.
In regard to alignment, one important study was an in-depth analysis of the
reasons eighth grade science students in Minnesota were second only to students in
Singapore in the Third International Mathematics and Science Study (TIMSS) (National
Education Goals Panel, 2000). The researchers found that the two main reasons for such
world-class performance were the alignment (by design) of standards, assessment, and
curriculum, as well as statewide continuity of educational programs in Minnesota.
Alignment has been identified as an important element in educational
accountability systems. For example, Title I of the Elementary and Secondary Education
Act (ESEA) [P.L. 103-382] requires that the participant states adopt challenging content
standards as well as high standards for student performance. Title I mandates that states’
content standards “must be aligned” with the assessment system adopted by each state.
By the same token, the “No Child Left Behind” Act (ESEA, Act of 2001) ruled that, by
the 2005-06 school year, states should start administering annual assessments in reading
and mathematics for grades 3-8, which “must be aligned” with the state academic
standards. Other subject matter must follow in subsequent years. Despite the importance
of alignment recognized by these government mandates, none indicates how to perform
such alignment or what constitutes sufficient or appropriate alignment between tests and
standards.
Alignment also is an important element in educational measurement. Content-related evidence of test validity is based on the match between the content of the test
items and the content of the knowledge domain (expectations/standards). Knowledge of
these standards is what the test is supposed to measure. Moreover, alignment studies and content-related evidence of validity share commonalities in methodology, as will be
discussed later.
Alignment measurements range from simple traditional matches, in which test items are compared to standards at the school and district levels through checklists, raters' subjective perceptions, or word counts (Lewis, 1997; Nelson, 2002), to sophisticated large-scale studies, some lasting several years, that compare state standards and state tests across the United States (Impara et al., 2000; Porter & Smithson, 2001a; Webb, 1997; Rothman et al., 2002).
Another concern is the alignment problem between secondary and post-secondary
education (P-16) (Tafel & Eberhart, 1999). Kirst (1998) recognized the importance of this
alignment and identified four critical points, which “threaten to potentially undermine the
preparation of American secondary students for college education” (p. 2). These points
include: (a) lack of authentic measures for student assessment regarding college
preparation; (b) misalignment between secondary student preparation and college
admission and placement standards; (c) placement of an unacceptably high number of
students in remedial classes; and (d) low retention and completion rates of students in
many public universities.
The lack of alignment between K-12 and post-secondary education (P-16) has
motivated various higher education institutions across the nation to review their
admission policies and launch efforts aimed at connecting high school standards with
university success. A number of those efforts are directed at establishing a relationship
between university admissions and state K-12 standards and assessments (CPRE, 2000;
Tafel & Eberhart, 1999). For example, a consortium of universities across the nation
launched the Standards for Success (S4S) project, supported by the American Association
of Universities (AAU). Under S4S, a series of national conversations have taken place
involving nearly 400 higher education faculty members and administrators from more
than 20 higher education institutions (Conley, 2002). S4S has described what higher
education institutions expect of entering freshmen in terms of knowledge and skills. The
S4S project put together a document, Key Knowledge and Skills for University Success
(KSUS), which includes six categories encompassing six academic disciplines—English,
math, science, social science, second language, and humanities/arts (Conley, 2002).
KSUS has been distributed to school districts across the United States, and it is
expected that some school districts will take KSUS into consideration when defining their
own content standards. Importantly, a comparison between KSUS and state tests is
equivalent to a comparison between state standards and state tests. Additionally, S4S has
carried out a series of workshops in which experienced educators have rated both KSUS
and state assessments, with respect to predefined depth of knowledge criteria and content
concurrence. Data from the S4S project is available to scholars and researchers who are
interested in developing a defensible methodology to match college expectations and
current K-12 assessment. This research study has utilized data from the S4S project with
the purpose of finding evidence of validity for the alignment index constructed.
Determining Alignment between Standards and Assessment
Three basic approaches have been used to determine alignment between standards
and assessment (La Marca et al., 2000; Webb, 1997, 1999). These approaches include:
(a) the development of the assessment followed sequentially after the development of the
standards; (b) post facto judgment, which is expert review of the standards and the
assessment; and (c) document analysis, which is a systematic analysis (coding) of
standards and assessment using a common pre-defined metric. These three strategies are
also used to secure content-related evidence when they refer to validity of assessment-based interpretations in educational measurement.
Dimensions of Alignment
Two overarching dimensions of alignment have been identified in regard to test
item-level comparison to standards: content match and depth match. La Marca et al.
(2000) wrote a guide to assist states and school districts in aligning their assessment
systems to their standards. The authors identified relevant aspects of alignment to be considered, such as content match, depth match, emphasis, performance match, accessibility, and reporting. They also discussed alignment in the context of other important educational
components, such as accountability, teachers’ involvement, professional development,
policy development, textbook adoption, and K-16 connections.
La Marca (2001) further refined the two most important dimensions of content
alignment (content match and depth of match). For content match, according to La
Marca, the concern is how well the test matches subject area content identified through
state academic standards. Of relevance are broad content coverage, which concerns
whether test content addresses broad academic standards and whether there is categorical
congruence; range of coverage, which concerns whether test items address the specific
objectives related to each standard; and balance of coverage, which concerns whether test
items reflect the major emphasis and priorities of the academic standards. For depth
match, the concern is how well the test items match the knowledge and skills specified in
the state standards in terms of cognitive complexity. A test that emphasized simple recall,
for example, would not be well aligned with a standard calling for students to be able to
demonstrate a skill.
In a study of alignment of norm-referenced achievement tests with Nebraska’s
content standards, Impara et al. (2000) used teachers' judgments to measure the level of
alignment using declarative perception of alignment, with the rating criteria of high level,
moderate level, low level, and no alignment. These authors contend that using this
definition and procedure of alignment resulted in a much higher likelihood that test items
would match the content of the standards. This lowest common denominator is not
acceptable in a genuine study of alignment, however. Impara’s definition of alignment
can be considered a narrow characterization of alignment in comparison to the work of
authors such as Webb (1997, 1999), Porter (2002), and Rothman et al. (2002).
The current study will concentrate only on the three main dimensions of
alignment that characterize the content focus category, which are breadth of match, depth
of match, and balance of match.
Methods of Alignment
Several methodologies, generally as a combination of the three basic approaches
presented above, have been proposed to determine the degree of alignment between
standards and assessment in states and school districts. These approaches are generally
referred to as content alignment methodologies because their aim is to match content
covered between standards and assessment. This type of content alignment usually
employs rating scales (percentages) based on breadth of content coverage and depth of
content coverage. Every component of the system (e.g., standards and assessments) receives a rating on both scales (breadth and depth), and then a comparison is established according to the level (percentage) of concurrence of content covered.
Buckendahl et al. (2000), for example, used panels of experienced teachers to rate
alignment between district standards and commercial tests. These authors found that
teachers’ and test publishers’ perspectives do not coincide when commercial test
publishers claim that their tests are aligned with current state standards. Rothman et al.
(2002) also trained a panel of experts to determine alignment between assessment and
standards, using protocols analogous to the scoring of performance assessment or
portfolios. In determining the degree of alignment, these authors designed a protocol
based on four dimensions to rate the level of alignment between assessment and
standards. The four dimensions are content centrality, performance centrality, challenge,
and balance and range.
The criterion of content centrality defines the match between the content of each
item and the content of the related standard. Sometimes this criterion is also used to
determine the importance of a topic in a test. Performance centrality focuses on the
degree of match between the types of performance exhibited by each test item and the
types of performance required by the corresponding standard. The criterion of challenge
compares the level of challenge between the assessment items and the related standard.
Two factors are considered in evaluating this match: (a) source of challenge, in which
reviewers rate items according to the intrinsic difficulty of the questions or to the
difficulty with respect to students’ background knowledge, and (b) level of challenge, in
which reviewers rate test items according to level of difficulty presented by the standards.
The level of challenge in both instruments (tests and standards) should be comparable.
Fulfilling the criterion on balance and range assures that test items cover the full range of
standards with an appropriate balance of emphasis across the standards. When referring
to a defined content domain of skills, knowledge, and affect, this protocol corresponds to
some degree with the terminology of test validity in educational measurement.
Webb’s Alignment Methodology
Webb (1997) proposed the use of experts who perform systematic review of the
standards and the corresponding tests. Webb’s approach to measure alignment depends
upon five categories, each of which informs one aspect of the alignment methodology.
These five categories are content focus, articulation across grades and ages, equity and
fairness, pedagogical implications, and system applicability.
Content focus is related to the content knowledge of the standards and
assessment. It is subdivided into six criteria: categorical concurrence, depth of knowledge
consistency, range of knowledge correspondence, structure of knowledge comparability,
balance of representation, and dispositional consonance. Articulation across grades and
ages is related to students learning at different developmental stages and their
understanding of content and processes growing over time. This category is based on
cognitive soundness determined by research and understanding, as well as cumulative
growth in content knowledge during students’ schooling. Equity and fairness involve
alignment of standards and assessment, and give students the opportunity for higher
levels of learning. Pedagogical implications include the idea that alignment of standards
and assessment should affect teaching practice. Proper alignment will help teachers
develop appropriate pedagogy. Alignment is judged through engagement of students,
effective classroom practices, and use of technology, materials, and tools. System
applicability means that alignment of standards and assessment should help teachers to
create educational systems that are realistic, reliable, applicable, and attainable.
Currently, only the first category, content focus, has been developed extensively
in Webb’s (1997, 1999) work and by authors such as Porter (2000, 2002) and La Marca
(2001). In subsequent work, Webb (2001, 2002) refined his methodology, concentrating
on the content focus category. He applied this methodology to the alignment of science,
mathematics, and language with their respective standards in selected states. In Webb’s
model of alignment, the content focus category includes the four criteria: categorical
concurrence, depth of knowledge consistency, range of knowledge correspondence, and
balance of representation.
The categorical concurrence (CC) criterion refers to the content categories of the
standards and assessment. This criterion is met if both documents address the same
content categories. According to Webb (1997, 1999), a minimum of six items
addressing the same category is required for alignment. This study challenges this
criterion for several reasons, which are discussed in detail below in the conceptual
framework of this study.
Depth of knowledge consistency (DOK) is an indication of the complexity of
knowledge. Webb (1997, 1999) defined four levels of DOK, which will be presented
later. This criterion is met if the standards and assessments are comparable on their
cognitive exigency. According to Webb, the two instruments (standards and assessment)
are comparable if the DOK of the assessment items is equal to or above the depth of
knowledge of the standards. Webb contends that a minimum benchmark of 50% (the sum
of equal and above rates) is the criterion for alignment. The researcher will argue that
perfect alignment occurs when test and standard have equal DOK. It is unfair to
learners to face higher levels of exigency on a test when the standard (and classroom
instruction) did not prepare them for such demand.
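Webb's 50% benchmark reduces to a simple proportion check. A minimal sketch, assuming each hit has already been matched by raters; the ratings below are hypothetical:

```python
def dok_consistency(pairs):
    """Proportion of matched (item DOK, standard DOK) pairs in which the
    item's depth of knowledge is equal to or above the standard's."""
    if not pairs:
        return 0.0
    return sum(i >= s for i, s in pairs) / len(pairs)

# Hypothetical ratings for four hits: (item DOK, standard DOK)
ratings = [(2, 2), (3, 2), (1, 3), (2, 2)]
score = dok_consistency(ratings)     # 3 of 4 items at or above -> 0.75
meets_webb = score >= 0.5            # Webb's 50% benchmark is met
```

Under the stricter reading argued above, only the equal-DOK pairs would count toward perfect alignment.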
Range of knowledge correspondence (ROK) refers to the breadth of knowledge
between the standards and assessment. In other words, it is the proportion of standards
addressed by the items. Alignment is acceptable if a comparable span of knowledge is
achieved between standards and assessment. According to Webb (1997, 1999) an
agreement of at least 50% is needed for proper alignment. The researcher will argue that,
to have proper alignment, the proportion of items addressed by standards (bidirectional)
also should be considered. In other words, it is a measurement of how the test (items)
covers the standards.
Balance of representation (BR) means that the degree of importance and emphasis
of content, instruction, and tasks should be comparable in both instruments (tests and
standards). The number of questions that address a given standard should be equally
distributed across the test. Webb (1997, 1999) defined a partial index of alignment to
measure this criterion, using a formula similar to that proposed by Porter and Smithson
(2001a). This formula is central to this study and will be discussed later. According to
Webb (1997, 1999) an index value of at least .70 is needed for proper alignment using
this criterion. Webb’s BR is also unidirectional, measuring the emphasis of items in
relation to the standards. The researcher will argue that the emphasis of standards in
terms of items (bidirectional) should also be taken into consideration. This modified version of
Webb’s balance of representation (BR) is called balance of knowledge (BOK) in this
study.
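Webb's balance formula itself is discussed later in this study; as a hedged sketch, one common formulation compares each standard's share of hits against a uniform share:

```python
def balance_of_representation(hits_per_standard):
    """One common formulation of a Webb-style balance index:
    1 - (sum over k of |1/O - h_k/H|) / 2, where O is the number of
    standards receiving at least one hit and H is the total hit count.
    The index equals 1 when hits are spread evenly across those standards."""
    hit = [h for h in hits_per_standard if h > 0]
    if not hit:
        return 0.0
    O, H = len(hit), sum(hit)
    return 1 - sum(abs(1 / O - h / H) for h in hit) / 2

even = balance_of_representation([2, 2, 2, 2])   # evenly spread -> 1.0
skew = balance_of_representation([5, 1, 1, 1])   # concentrated -> 0.625
```

With this formulation, Webb's .70 threshold corresponds to a moderately even spread of items across the standards hit.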
Although Webb’s (2001, 2002) work was in the alignment of expectations and
assessment in mathematics, language, and science, he provided a methodology, especially
in the content focus category, which can be applied in other content areas.
Webb (1997, 1999) proposed two additional criteria within the content focus
category, which are structure of knowledge comparability and dispositional consonance.
Neither Webb nor any other scholar has further developed these two criteria. The current
study will address only the four criteria of categorical concurrence, depth of knowledge,
range of knowledge, and balance of representation.
Standards and Tests Cognitive Demand
For the content focus category, Webb (1999, 2002) provided four different levels
to judge depth of knowledge (cognitive demand) for both standards and assessments.
Level 1 is recall of a fact, information, or procedure. Level 2 is skill/concept that involves
the use of information, conceptual knowledge, and procedures, and uses two or more
steps. Level 3 is strategic thinking, which involves the use of reasoning, a plan or
sequence of steps, and more than one possible answer. Level 4 is extended thinking,
which is the use of investigation, multiple conditions or alternatives, non-routine
manipulations, and application of knowledge.
Marzano (2001), building on Bloom’s taxonomy of educational goals (Bloom,
Engelhart, Furst, Hill, & Krathwohl, 1956), suggested a new taxonomy of educational
objectives. Marzano’s model changes Bloom’s ordered hierarchy of difficulty and
suggests, instead, an ordered hierarchy in terms of mental processes and levels of
consciousness, in which some mental processes exercise control over the operation of
other processes in a hierarchical manner. Marzano’s model presents three mental
systems: cognitive system, metacognitive system, and self-system.
Marzano’s model articulates six levels of mental processing. First is the cognitive
system which includes retrieval processes (level 1), comprehension processes (level 2),
analysis processes (level 3), and knowledge utilization processes (level 4). Next are the
metacognitive system processes (level 5) and then the self-system processes (level 6).
Each level of consciousness takes control of the preceding level, ranging from automatic
(subconscious) processes at level 1, up to full-conscious processes at level 6. Further,
each level subsumes the previous one. In other words, lower levels are contained in
subsequent higher levels. Levels of consciousness, rather than levels of complexity, are
the key factors in Marzano’s taxonomy. A comparison between Bloom and Marzano’s
models is depicted in Table 1.
Table 1
Comparison between Bloom’s and Marzano’s Models

Bloom’s Model        Marzano’s Model
Knowledge            Retrieval Processes
Comprehension        Comprehension Processes
Application          Analysis Processes
Analysis             Knowledge Utilization Processes
Synthesis            Metacognitive Processes
Evaluation           Self-System Processes
In recent work, Porter (2002) also defined five descriptors of categories (levels) of
cognitive demand in the area of mathematics. It is worthwhile to notice that Porter uses
the terms “cognitive demand” and “expectations for students” interchangeably. The term
“expectations” refers mainly to standards for most scholars. Porter’s five levels of
cognitive demand are: (1) memorize facts, definitions, and formulas; (2) perform
procedures and solve routine problems; (3) communicate understanding of concepts; (4)
solve non-routine problems and make connections; and (5) conjecture, generalize, and
prove.
As can be observed, Porter’s (2002) categories of cognitive demand resemble
Bloom and Marzano’s taxonomies. The S4S project used the first five levels of
Marzano’s model to rank the depth of knowledge of both instruments—state tests and
KSUS standards.
Cognitive scientists have hypothesized different levels of knowledge since
Bloom’s work almost 50 years ago. For example, Anderson (1983) defined three levels,
declarative, procedural, and strategic, as the three main categories that other scholars
have subdivided in an effort to understand the processes of higher order thinking.
Because different authors use different terminology that applies to alignment, it is
useful to recap the terms. When referring to the different levels to rate knowledge, for
example, Webb uses depth of knowledge; Marzano prefers levels of consciousness and
mental processing; Porter talks about cognitive demand or expectations for students;
Rothman uses levels of challenge; and La Marca prefers cognitive complexity. Overall,
there is a lack of common terminology, as well as disagreement among theories of
higher-order cognition. These factors have perhaps hindered the development of a firm
quantitative method (index) to evaluate alignment across educational contexts (Porter,
2002). The present study will adopt Marzano’s cognitive demand taxonomy because the
data provided by S4S used his constructs. However, the index of alignment defined in the
present study is transparent to the number of levels because DOK values are used to
compare standards and assessment regardless of the scale used.
Porter’s Alignment Methodology
Porter defined an overall index of alignment based on data collected using teacher
surveys. These surveys were designed to measure alignment between assessments and
classroom instruction. In Porter’s work, the data were obtained by a two-fold procedure.
First, he surveyed schoolteachers about level of content coverage for several subject
matters taught in the classroom. He also included questions about student expectations.
He used the term cognitive demand, which is equivalent to Webb’s DOK, to rate content
covered in classroom instruction as well as in assessments. Content of instruction was
then measured at the intersection between topics covered and students’ cognitive demand.
A table of topics covered versus cognitive demand was built. Second, the same procedure
was applied to the tests. Each item was rated with the same cognitive demand scale
(DOK), and another table of items versus cognitive demand was also built. These two
tables were converted to context matrices of proportions (percentages) to be able to make
comparisons.
A context matrix describing standards (X) and another describing assessment (Y)
were defined in Porter’s model. The match between these two matrices, as the sum of
cell-by-cell intersections, is a measurement of content alignment between standards and
assessments. Porter’s index of alignment, which ranges from 0 to 1, is represented by the
following formula:
I = 1 - (Σ|X - Y|)/2                                        (1.1)
where X and Y denote cell proportions in the standards matrix and assessment matrix
respectively. An index value of 1 indicates perfect alignment. Figure 1 presents Porter
and Smithson’s (2001b) alignment analysis.
Figure 1. Alignment analysis. From Porter & Smithson (2001a).
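Equation 1.1 translates directly into code. A minimal sketch, assuming two proportion matrices of matching shape whose cells each sum to 1:

```python
def porter_index(X, Y):
    """Porter's alignment index I = 1 - (sum of |x - y|) / 2, computed
    cell by cell over two proportion matrices (standards and assessment)."""
    diff = sum(abs(x - y) for row_x, row_y in zip(X, Y)
               for x, y in zip(row_x, row_y))
    return 1 - diff / 2

# Identical matrices align perfectly; disjoint ones not at all.
perfect = porter_index([[0.5, 0.5]], [[0.5, 0.5]])   # 1.0
none = porter_index([[1.0, 0.0]], [[0.0, 1.0]])      # 0.0
```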
To understand the methodology of alignment proposed by Porter (2002), an
example is helpful. Data are usually represented in a table of standards against test items,
such as the following (Table 2).
Table 2
Porter’s (2002) Methodology of Alignment

              C.D.   Item 1   Item 2   Item 3   Item 4
C.D.                   N1       N2       N3       N4
Standard 1     N1      √
Standard 2     N2               √
Standard 3     N3                        √
Standard 4     N4                                 √

Note. Item #: The first row lists the test items of the assessment. Standard #: The first
column lists the standards. C.D. (cognitive demand) ranges from N1 to N4 (integers);
these values are assigned to both instruments (standards and assessment) by the rater.
√: Categorical concurrence assigned by raters (hits).
In Table 3 there is an example with imaginary data.
Table 3
Porter’s (2002) Methodology of Alignment Using Imaginary Data

              C.D.   Item 1   Item 2   Item 3   Item 4
C.D.                   2        4        3        2
Standard 1     1       √                          √
Standard 2     1
Standard 3     3                √        √
Standard 4     2                                  √
Tables 4 and 5 below show the data converted to two matrices of proportions, one
for the standards and another for the assessment. In this example, C.D. values range from
1 to 4 and the number of total hits (√) is 5.
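The conversion from rated hits to the proportion matrices can be sketched as follows. Only each hit's standard and the two C.D. ratings matter; the hit list below mirrors the data in Table 3:

```python
# Each hit is recorded as (standard index, standard C.D., item C.D.);
# the data mirror Table 3, where the total number of hits is 5.
hits = [(0, 1, 2), (0, 1, 2),  # Standard 1 (C.D. 1), hit by two C.D.-2 items
        (2, 3, 4), (2, 3, 3),  # Standard 3 (C.D. 3), hit by C.D.-4 and C.D.-3 items
        (3, 2, 2)]             # Standard 4 (C.D. 2), hit by one C.D.-2 item

n_std, n_cd = 4, 4
X = [[0.0] * n_cd for _ in range(n_std)]  # proportions at the standards' C.D.
Y = [[0.0] * n_cd for _ in range(n_std)]  # proportions at the items' C.D.
for std, std_cd, item_cd in hits:
    X[std][std_cd - 1] += 1 / len(hits)
    Y[std][item_cd - 1] += 1 / len(hits)
# X[0][0] is now 2/5 and Y[0][1] is 2/5, matching Tables 4 and 5
```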
Table 4
Porter’s (2002) Methodology of Alignment Using Constructed Matrix X with Respect to
Standards’ Cognitive Demand

                     Cognitive Demand
               1       2       3       4
Standard 1    2/5      0       0       0
Standard 2     0       0       0       0
Standard 3     0       0      2/5      0
Standard 4     0      1/5      0       0
Table 5
Porter’s (2002) Methodology of Alignment Using Constructed Matrix Y with Respect to
Items’ Cognitive Demand

                     Cognitive Demand
               1       2       3       4
Standard 1     0      2/5      0       0
Standard 2     0       0       0       0
Standard 3     0       0      1/5     1/5
Standard 4     0      1/5      0       0
The numbers in each cell represent the proportions of hits for each standard or
assessment relative to the total number of hits. The index of alignment between
standards and assessment is then calculated by comparing, cell-by-cell, the two matrices
using equation 1.1:
(Σ|X-Y|)/2 = (|2/5 – 0| + |0 – 2/5| + |2/5 – 1/5| + |0 – 1/5| + |1/5 – 1/5|)/2 = 3/5 = 0.6
Thus, I = 0.4, which is considered a weak index of alignment (I < 0.5) (Porter, 2002).
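The worked computation above can be verified directly:

```python
# X and Y as in Tables 4 and 5 (cell values are proportions of the 5 hits)
X = [[2/5, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 2/5, 0],
     [0, 1/5, 0, 0]]
Y = [[0, 2/5, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 1/5, 1/5],
     [0, 1/5, 0, 0]]

diff = sum(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))
I = 1 - diff / 2   # 0.4, a weak index of alignment (I < 0.5)
```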
In the conceptual framework of this study there is a detailed definition of the
alignment index proposed. A prototype of the index is presented in Appendix A, in which
the alignment index is defined in terms of DOK, ROK, and BOK. A comparison
including Webb (1997) and Porter’s (2002) models is also included in the prototype, as
well as graphical representations of the alignment components. It is worth mentioning
that Porter’s index and Webb’s DOKEqual calculations are completely equivalent.
In addition, the alignment index defined in this study was used to measure the
alignment between KSUS and selected state tests, helping to bridge the alignment gap
between K-12 and higher education institutions. Figure 2 represents all possible paths of
alignment between K-12 (State) and higher education (P-16), and within districts, states,
and P-16. Although the alignment index defined in this study could be used across
different contexts (paths in Figure 2), the properties and behavior of the index were tested
using S4S data and making comparisons between higher education standards and state
assessments. This procedure is represented by the bold arrow in Figure 2.
Figure 2. Bridging the gap between K-12 and P-16 (modified from Porter, 2002).
An index of alignment, taking into consideration Webb’s (1997, 1999) and
Porter’s (2002) approaches would serve to estimate alignment in different
directions—vertically and horizontally—across educational contexts (see Figure 1 and
Figure 2).
The Standards for Success Project (S4S)
A consortium of universities across the nation, supported by the Association of
American Universities (AAU), has joined efforts in describing what they expect of
entering freshmen in terms of knowledge and skills. Under the project Standards for
Success (S4S) (Conley, 2002), a series of national conversations have taken place
involving nearly 400 higher education faculty members and administrators from more
than 20 higher education institutions. The S4S project put together a document named
Key Knowledge and Skills for University Success (KSUS), which includes six categories
encompassing six academic disciplines—English, math, science, social science, second
language, and humanities/arts (Conley, 2002). The KSUS in language and mathematics
are listed in Appendices C and D respectively. The KSUS are a set of comprehensive
statements of what higher education institutions expect from well-prepared senior
graduates.
The KSUS standards used in this research are grouped into several areas
according to the different topics into which each subject matter is divided, as shown in
tables 6 and 7. The percentages are used later in Chapter 4 as weights in the definition of
the total index of alignment.
Table 6
KSUS – English Language Topics

Topics                        No. of Objectives   Percentages (Weights)
Reading and Comprehension            24                  0.37
Writing                              25                  0.39
Research Skills                      10                  0.15
Critical Thinking                     6                  0.09

Table 7
KSUS – Mathematics Topics

Topics                        No. of Objectives   Percentages (Weights)
Computation                          11                  0.15
Algebra                              22                  0.29
Trigonometry                          4                  0.05
Geometry                             13                  0.17
Math Reasoning                       25                  0.34
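These weights feed the total index defined in Chapter 4. As a hypothetical sketch only (the study's exact combination is given there), a weighted average of per-topic indices would look like this; the per-topic index values below are invented:

```python
def total_index(topic_indices, topic_weights):
    """Hypothetical sketch of a total index as a weighted average of
    per-topic alignment indices; the study's exact formula is in Chapter 4."""
    return sum(topic_weights[t] * i for t, i in topic_indices.items())

# Weights from Table 6; the per-topic index values are invented.
weights = {"Reading and Comprehension": 0.37, "Writing": 0.39,
           "Research Skills": 0.15, "Critical Thinking": 0.09}
indices = {"Reading and Comprehension": 0.5, "Writing": 0.4,
           "Research Skills": 0.3, "Critical Thinking": 0.6}
ite = total_index(indices, weights)   # 0.44 with these invented inputs
```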
The KSUS have been distributed to school districts across the US, and it is
expected that some school districts take the KSUS into consideration when defining their
own content standards. As a consequence, the KSUS are appropriate for use in this
study because they are equivalent (in form and content) to state standards. In other words,
a comparison between the KSUS and state tests is equivalent to comparing state
standards to state tests.
Twenty-seven state-level assessment exams with similar purposes were chosen
for this study. All of them are administered in grade 10 or 11 or are designated as
end-of-course tests. In addition, S4S has carried out a series of workshops in which
experienced educators rated both the KSUS and state assessments with respect to predefined depth
of knowledge criteria and content concurrence. Data from the S4S project is available to
scholars and researchers who are interested in developing a defensible methodology to
match college expectations (KSUS) and current K-12 assessment (Conley, 2002).
The S4S project involved training of a group of experienced higher education and
K-12 teachers in a rating procedure based fundamentally on Webb’s (1997, 2000)
alignment methodology and Marzano’s (2001) taxonomy of educational objectives. The
raters needed to be familiar with the KSUS and the state’s assessment. Both expectations
and tests were rated with respect to Depth of Knowledge (DOK) levels using
Marzano’s levels of mental processing ranging from 1 to 5. S4S raters performed three
basic tasks in order to compile the essential data that were used during the execution of
this study:
a. Rate the Depth of Knowledge level of each KSUS. This task provides values of
DOK for the standards.
b. Rate the Depth of Knowledge level of each assessment item. This task provides
values of DOK for the test items.
c. Determine the categorical concurrence between KSUS and corresponding test
items. This task provides values (hits) of categorical concurrence between the two
instruments (tests and standards).
Values of ROK and BOK (BR) were obtained by applying the respective
alignment criteria to the data collected above. This research project received S4S raw
data generated by different raters (from 4 to 6) for each subject matter (English and
mathematics) and selected state tests (27). Data from the S4S project served to explore
the properties and behavior of the alignment index defined in this study.
CHAPTER 3
METHODOLOGY
Study Objectives
This research study focuses on the development of a quantitative methodology to
analyze and measure the alignment between standards and assessment. This alignment
methodology includes the construction of an index as a mathematical entity that describes
and measures the alignment between expectations and tests in a quantitative manner.
Such a metric is needed to advance the research in this area (Porter & Smithson, 2001a;
Porter, 2002). In building the index, the researcher has taken into consideration the main
qualitative and quantitative criteria already established in the alignment research
literature (Webb, 1997, 2001, 2002; La Marca, 2001; Impara, 2001; Porter, 2002;
Rothman et al., 2002). This study reviewed, extended, and advanced such criteria by
building a cohesive, concise, and comprehensive methodology that informs alignment of
standards and tests from different perspectives.
Several computational tools were developed to prototype, create, and manipulate
the index and the criteria from which it is built. Additionally, the researcher adopted a
minimum terminology to define the criteria and to describe the process of measuring
alignment. Such terminology is defined in Chapter 1 under the section definition of terms
and expanded below in the conceptual framework of the study. Novel graphical
representations were developed to describe and measure alignment. In particular, trilinear
plots and centroid graphs were used to provide visual interpretations of the alignment
criteria and the indices constructed. Such visual representations served to improve our
understanding of the alignment process by analyzing visual patterns. As a result of these
graphical analyses, new data and new relationships could be revealed. This flexibility
allows researchers to analyze data from novel perspectives, and sometimes this graphical
representation lets scholars visualize hidden patterns or connections not shown in
tables and flat graphs.
Research Questions
This study is devoted to filling a need in educational alignment research, which
consists of determining a quantitative methodology to measure alignment between
expectations (standards) and assessment. A systematic and common terminology was
adopted in terms of the three alignment criteria that help to systematize and simplify the
alignment process. A minimum and concise language was needed in order to resolve
differences among the variant terminology and conceptualizations used by scholars when
defining the dimensions and criteria of alignment research. In addition, this research
study developed a mathematical index (alignment index) as a function of the alignment
criteria adopted. This index served to measure the match between expectations
(standards) and assessment. Also, this index was utilized across different educational
contexts such as subject matter and states. Data available from the S4S project (Conley,
2002) were used to explore the properties and performance of the proposed alignment
index.
The five fundamental research questions addressed by this study are:
Research Question 1. What is a minimum terminology necessary and sufficient to
analyze and describe the criteria in the process of measuring content alignment?
Research Question 2. What is the mathematical form and the characteristics of a
quantitative metric (alignment index) to measure alignment between expectations and
assessment across instructional contexts?
Research Question 3. To what extent are state assessments and higher education
expectations (KSUS) aligned according to the index defined?
Research Question 4. What alignment comparisons can be made between KSUS
and state assessment across states and across subject matter using this index?
Research Question 5. What relationships could be established among the levels of
alignment and accountability features of state testing?
Conceptual Framework of the Study
Based on the alignment research literature, the conceptual framework of this study
includes the concepts of content focus, content coverage, and content centrality. All three
concepts are considered equivalent and will be embedded in a single category, which is
the content focus category. In other words, for the purpose of the study, centrality is
considered constant through the standards and the tests. Content focus includes, in turn,
three alignment criteria: range of knowledge (ROK), depth of knowledge (DOK), and
balance of knowledge (BOK). The outline below represents the conceptual framework,
including in parentheses the terminology subsumed into each concept.
Content Focus (content coverage, content centrality)
ROK (range of coverage, span of coverage, range of knowledge)
DOK (depth of coverage, cognitive demand, depth of knowledge)
BOK (balance of coverage, balance of representation, balance of knowledge)
These three terms are the minimum criteria chosen in this study for alignment
between standards and assessments. They belong to the three dimensions of alignment
(range, depth, and balance) that characterize the content focus category.
The alignment index constructed in this study involves the quantification of the
three different criteria defined in the framework. This index is a formula as a function of
ROK, DOK, and BOK. Additionally, ROK and BOK will be extended to include
bidirectional alignment, not only from assessment with respect to standards (as Webb
proposed) but also from standards with respect to assessments. Alignment of tests with
respect to standards has been termed with the suffix “S.” Its meaning can be interpreted
as the relevance of standards to items. It is calculated as the proportion of topics
(objectives) of the standards addressed by the test. Likewise, alignment of the standards
with respect to assessments has been termed with the suffix “A.” Its meaning can be
interpreted as the relevance of the items to the standards. It is calculated as the proportion
of items that match content found in the standards. For example, ROKS and BOKA
represent range of knowledge with respect to standards and balance of knowledge with
respect to assessment respectively (See Appendix A). Webb’s balance of representation
(BR) will be replaced in this study by the extended concept balance of knowledge
(BOK). The criterion BOK is bidirectional, which includes balance of knowledge with
respect to standards (BOKS) and with respect to assessment (BOKA). BOKS can be
interpreted as a measurement of the distribution of questions addressed by each standard.
Conversely, BOKA can be interpreted as a measurement of the distribution of objectives
addressed by each item. Following the same convention, DOK is represented by DOKA,
DOKE, and DOKB, indicating depth of knowledge above, equal to, and below the
standards (or assessments) respectively. This bidirectional conceptualization of the
alignment criteria is instrumental in the definition of the indices of alignment below.
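The bidirectional criteria can be illustrated with a small sketch. The concurrence grid and DOK pairs below are hypothetical, and the computations are one plausible reading of the definitions above:

```python
def rok_both(hit_matrix):
    """ROKS: proportion of standards addressed by at least one item.
    ROKA: proportion of items that address at least one standard.
    hit_matrix[s][i] is truthy when standard s and item i concur."""
    n_std, n_item = len(hit_matrix), len(hit_matrix[0])
    roks = sum(any(row) for row in hit_matrix) / n_std
    roka = sum(any(hit_matrix[s][i] for s in range(n_std))
               for i in range(n_item)) / n_item
    return roks, roka

def dok_split(pairs):
    """Proportions of hits whose item DOK is above (DOKA), equal to
    (DOKE), or below (DOKB) the matched standard's DOK."""
    n = len(pairs)
    doka = sum(i > s for i, s in pairs) / n
    doke = sum(i == s for i, s in pairs) / n
    return doka, doke, 1 - doka - doke

# Hypothetical data: a 4-standard x 4-item concurrence grid and the
# (item DOK, standard DOK) rating of each hit.
grid = [[1, 0, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [1, 0, 0, 0]]
roks, roka = rok_both(grid)          # (0.75, 1.0)
doka, doke, dokb = dok_split([(2, 2), (4, 3), (3, 3), (2, 2), (2, 2)])
```

Here every item addresses some standard (ROKA = 1.0), but one standard goes unaddressed (ROKS = 0.75), showing why the two directions carry different information.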
Webb’s (1997) categorical concurrence (CC) is not considered in this study
because this criterion has been challenged for several reasons. First, this is the only
criterion that uses a different metric in relation to the other criteria. Webb’s categorical
concurrence is measured as an absolute value rather than as proportions or percentages;
consequently, it is inappropriate to be included as a component of the overall index.
Second, the number six chosen by Webb, as the minimum condition for alignment in the
categorical concurrence criterion, is arbitrary and does not take into consideration the
number of standards included in the match. It is possible to have a short test with a small
number of standards addressed. Third, Webb’s categorical concurrence seems to
represent a validity criterion rather than an alignment criterion by itself, since it is based
on previous work (Subkoviak, 1988), which is related only to the reliability of mastery
tests. Although there is a relationship between test validity and alignment, as noted
before, categorical concurrence is not strictly a criterion of alignment as are the other
three. Consequently, categorical concurrence is an attribute of the test that is related only
to its validity and thus contradicts Webb’s (2001) own assertion that “alignment is a
quality of the relationship between expectations and
assessments and not an attribute of any one of these two system components” (p. 2).
The terminology defined and used above constitutes a revised, modified, and
simplified terminology taken from the existing alignment research. Consequently, range
of knowledge (ROK), depth of knowledge (DOK), balance of knowledge (BOK), and all
of their sub-criteria defined above constitute the minimum and sufficient terminology to
describe the entire criteria in the process of measuring content alignment between
standards and assessments. Additionally, this study adopted the following terminology to
describe the components of the standards. Standards are divided into topics, and the
topics are composed of objectives. In this way, the terms defined in this conceptual
framework address the first research question of the study.
Prototyping the Index of Alignment
In order to define and represent graphically the index of alignment, centroid plots
were used. Centroid plots are graphs constructed using three concurrent axes at 120
degrees each. Each axis in the graph represents one of the three alignment criteria. In
turn, the criteria values determine the vertices of a triangle. A software program was
developed in order to prototype three possible formulas that would represent the index of
alignment (Figure 3). The first formula (Index Area) was represented by the area of a
triangle, in which its sides were the three different alignment criteria, ROK, DOK, and
BOK. This approach using the area of a triangle was utilized by Conley and Brown (in
press). The second formula (Index Vector) is a vector constructed as the sum of three
different vectors, each vector representing each of the alignment criteria. The third
possible index (Index Mean) was the simple average among ROK, DOK, and BOK.
Comparing the behavior of the three formulas, it was found that the first equation
represented by the area has a tendency to underestimate the value of the index, which
tended to zero when two criteria approached zero. This singularity is not appropriate
because the index should show a value different from zero when at least one of the
criteria is different from zero. The second equation, represented by a vector, tended to
overestimate the index. Consequently, the third formula, which corresponded to the
average of the three criteria, performed smoothly and was chosen as the best
representation of the alignment index.
Figure 3. Prototyping the index of alignment using a centroid plot. Three equations are
explored.
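The behavior of the three candidate formulas can be sketched directly. This is a minimal illustration, normalized so that three criteria of 1 give an index of 1; the exact geometric construction of the area and vector variants is an assumption reconstructed from the centroid-plot description, not the prototype software itself.

```python
import math

def index_area(rok, dok, bok):
    # Area of the triangle whose vertices sit on three concurrent axes at
    # 120 degrees, at distances rok, dok, bok from the origin (assumed
    # geometry); normalized so that rok = dok = bok = 1 yields 1.
    raw = (math.sqrt(3) / 4) * (rok * dok + dok * bok + bok * rok)
    return raw / ((math.sqrt(3) / 4) * 3)

def index_vector(rok, dok, bok):
    # One plausible reading of the vector formula: the magnitude of the
    # resultant of three component vectors, normalized to 1.
    return math.sqrt(rok**2 + dok**2 + bok**2) / math.sqrt(3)

def index_mean(rok, dok, bok):
    # The simple average -- the formula ultimately adopted in this study.
    return (rok + dok + bok) / 3

# The singularity noted in the text: with two criteria near zero, the
# area index collapses toward zero while the mean does not.
print(round(index_area(0.6, 0.01, 0.01), 3))    # ~0.004
print(round(index_mean(0.6, 0.01, 0.01), 3))    # ~0.207
print(round(index_vector(0.6, 0.01, 0.01), 3))  # ~0.347, largest of the three
```

The example reproduces the qualitative behavior reported above: the area formula underestimates near the axes, the vector reading runs higher than the mean, and the mean stays nonzero whenever any criterion is nonzero.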
Taking the mean value of the three criteria to represent the index of alignment
seems to be plausible (Kane, 1992) given the available evidence provided by the software
prototype. Plausibility in this case is based on the best interpretation that can be made for
the definition of the index. The mean value of the three alignment criteria, used to define
the index, is the best interpretation possible since it is based on the evidence provided by
the software, by its performance, by its simplicity, and by its statistical meaning. None of
these characteristics is present in the other two formulas.
In addition to the mathematical formula representing the index of
alignment, the researcher proposed a concept that indicates how much each
criterion contributes to the index. This concept is called skewness; it has
a vectorial character, with magnitude (module) and direction. Skewness is the vectorial
sum of the three alignment criteria in the plane of the graphic. Its module is a
scalar measure of how unevenly the criteria contribute to the index. The
direction of the skewness informs about the tendency towards which criterion the index is
oriented. In other words, skewness is oriented toward the criterion that contributes the
most to the index. Skewness equal to zero means that each criterion contributes equally to
the index. In graphical terms, skewness equal to zero happens when the three criteria
determine an equilateral triangle.
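A minimal sketch of the skewness computation, placing the three criteria on concurrent axes at 0, 120, and 240 degrees (the axis orientation is an assumption):

```python
import math

def skewness(rok, dok, bok):
    # Vector sum of the three criteria on concurrent axes at 120 degrees.
    angles = (0.0, 2 * math.pi / 3, 4 * math.pi / 3)
    x = sum(v * math.cos(a) for v, a in zip((rok, dok, bok), angles))
    y = sum(v * math.sin(a) for v, a in zip((rok, dok, bok), angles))
    module = math.hypot(x, y)
    # Direction (in degrees) is meaningful only when the module is nonzero.
    direction = math.degrees(math.atan2(y, x)) if module > 1e-9 else None
    return module, direction

# Equal contributions (an equilateral triangle): skewness is zero.
print(skewness(0.5, 0.5, 0.5))   # module ~0, no direction
# ROK dominates: the vector points along the ROK axis (0 degrees here).
print(skewness(0.9, 0.3, 0.3))   # module 0.6, direction ~0 degrees
```

As in the text, equal criteria cancel out to a zero vector, and an uneven profile produces a vector pointing toward the dominant criterion.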
Definition of the Indices of Alignment
The definition of the index of alignment includes the representation of the index
as a mathematical formula, which is a function of the three alignment criteria (ROK,
DOK, and BOK). A test is considered aligned to its respective content standard from
multiple perspectives given by the different indices defined below. The levels of
alignment are chosen for several reasons described in Chapter 5. For now, it is
worth mentioning that in relation to DOK, the index will include values of DOK that are
at least equal (DOKE) or above (DOKA). In other words, DOK is represented in the
indices as DOKE or DOKE + DOKA. The latter is a condition pursued by Webb (1997,
2001).
Four different alignment indices are defined in this study, as follows:
1. I(S): This partial index represents the alignment taking into consideration
the criteria with respect to standards. It measures the relevance of the
standards to the items. It is defined as:
I(S) = (ROKS + DOKE + BOKS)/3    (3.1)
2. I(A): This partial index represents the alignment taking into consideration
the criteria with respect to assessment. It measures the relevance of the
items to the standards. It is defined as:
I(A) = (ROKA + DOKE + BOKA)/3    (3.2)
3. I: This index is the overall index of alignment, which combines the two
partial indices mentioned above. It is defined as:
I = (ROK + DOKE + BOK)/3    (3.3)
where ROK = (ROKS + ROKA)/2 and BOK = (BOKS + BOKA)/2.
It can be shown that I = [ I(S) + I(A) ]/2.
4. I(+): This overall index takes into consideration the combined depth of
knowledge equal and above (DOKE + DOKA). It is defined as:
I(+) = [ROK + (DOKE + DOKA) + BOK]/3    (3.4)
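The four indices, and the identity I = [I(S) + I(A)]/2, can be sketched directly from equations 3.1 through 3.4; the criterion values used below are hypothetical.

```python
def alignment_indices(roks, roka, doke, doka, boks, boka):
    """Equations 3.1-3.4; all inputs are criterion values in [0, 1]."""
    i_s = (roks + doke + boks) / 3          # I(S), eq. 3.1
    i_a = (roka + doke + boka) / 3          # I(A), eq. 3.2
    rok = (roks + roka) / 2
    bok = (boks + boka) / 2
    i = (rok + doke + bok) / 3              # I, eq. 3.3
    i_plus = (rok + doke + doka + bok) / 3  # I(+), eq. 3.4
    return {"I(S)": i_s, "I(A)": i_a, "I": i, "I(+)": i_plus}

# Hypothetical criterion values for one topic:
idx = alignment_indices(roks=0.6, roka=0.5, doke=0.4, doka=0.2,
                        boks=0.7, boka=0.6)
# The identity holds: I equals the average of the two partial indices.
assert abs(idx["I"] - (idx["I(S)"] + idx["I(A)"]) / 2) < 1e-12
```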
A linear combination of the alignment criteria has been chosen for the reasons
given above in the section Prototyping the Index of Alignment. Appendix A
includes an operative prototype model for the alignment index. The four indices defined
above are calculated for every topic in each subject matter. Each state test in English
language, for example, obtained 16 different indices according to the number of topics
(Reading & Comprehension, Writing, Research Skills, and Critical Thinking) into which
the subject matter is divided. Sixteen indices in English language are the result of 4
indices multiplied by 4 topics.
The total index of alignment (IT) of a test is calculated taking into consideration
the overall index (I) defined in item 3 above. Because every topic in
the KSUS has a specific weight (number of objectives), the total index is defined as a
weighted average of each index (I) belonging to each topic. In the case of English
language, the total index for a test is defined as follows:
ITE = 0.37 [I(Reading & Comprehension)] + 0.39 [I(Writing)] + 0.15 [I(Research
Skills)] + 0.09 [I(Critical Thinking)]    (3.5)
By a similar procedure, the total index for mathematics tests is defined as follows:
ITM = 0.15 [I(Computation)] + 0.29 [I(Algebra)] + 0.05 [I(Trigonometry)] + 0.17
[I(Geometry)] + 0.34 [I(Math Reasoning)]    (3.6)
The weights are taken from tables 6 and 7 (Chapter 2) for English and
mathematics respectively. In both cases, the total indices of alignment range from 0 to 1.
This weighted approach was also used by Conley & Brown (in press).
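A sketch of the total-index calculation in equations 3.5 and 3.6, using the English weights given above; the per-topic overall indices below are hypothetical.

```python
def total_index(index_by_topic, weight_by_topic):
    # Weighted average of the overall index I across topics (eqs. 3.5/3.6).
    return sum(weight_by_topic[t] * index_by_topic[t] for t in weight_by_topic)

# KSUS topic weights for English language (eq. 3.5):
english_weights = {
    "Reading & Comprehension": 0.37,
    "Writing": 0.39,
    "Research Skills": 0.15,
    "Critical Thinking": 0.09,
}

# Hypothetical overall indices I for each English topic:
english_indices = {
    "Reading & Comprehension": 0.52,
    "Writing": 0.48,
    "Research Skills": 0.40,
    "Critical Thinking": 0.35,
}

ite = total_index(english_indices, english_weights)
print(round(ite, 3))  # 0.471
```

Because the weights sum to 1 and each I lies in [0, 1], the total index also ranges from 0 to 1.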
The different indices defined above, along with the concept of skewness, give a
suitable indication of the content coverage alignment category from different perspectives
and provide a comprehensive approach to the alignment process. This section, which is
related to the definition and construction of a quantitative index of alignment, addresses
the second research question of this study.
Data Collection
With the purpose of testing the index of alignment constructed in this study, data
collected by the Standards for Success (S4S) project at Stanford and Oregon universities
were used to determine the alignment between higher education expectations (KSUS) and
selected state tests. S4S is the project that provided the definition of KSUS. The
researcher has received S4S raw data generated by a number of raters ranging from four
to six. Such data consisted of Excel workbooks including rating values of the KSUS and
state assessments according to a scale of DOK ranging from 1 to 5. Additionally, the
raters determined categorical concurrence between KSUS objectives and test items. The
categorical concurrence spreadsheet was a matrix of hits at the cross point of the
standards’ objectives (topics) and test items. A hit was defined as a mark in the cell
where objectives and test items intersect. A mark indicates that both (standard objectives
and test items) addressed the same content category. In Appendix A there is a detailed
description of the model including the definitions, calculations, and operations of all
variables.
School accountability data from different states were also gathered and
contrasted with alignment results. Results of the alignments were compared and
contrasted with accountability features of state testing such as NAEP. Twenty-seven state
tests in areas such as English language and mathematics were processed in this study.
Data Analysis
Data analysis is included in the third objective of this study. Data acquired from
the S4S project were analyzed and used to put the index to work. Values of ROK, DOK,
and BOK were obtained from raters’ responses using the model defined in Appendix A.
These values were used as the alignment criteria, and consequently they were used to
define the quantitative index of alignment. Two subject matters were chosen, English
language and mathematics. Four different indices of alignment [ I(S), I(A), I, and I(+) ]
were calculated by topics within each subject matter for all states. In Chapter 2, under the
section Standards for Success Project, there is a list of subject matter topics for English
language and mathematics. Reliability analysis of the raters’ data was performed using
generalizability theory. G-coefficients were calculated for all states in regard to DOK
values assigned by the raters.
Reliability of Data
Generalizability (G-Study) analysis was performed to determine reliability of the
raters’ data, using generalizability theory (G-Theory). Generalizability theory is a
statistical theory for evaluating the dependability of behavioral and social science
measurements (Brennan, 2001; Cronbach, Gleser, Nanda, & Rajaratnam, 1972). G-Theory can be applied in this study because it provides a framework to analyze the
reliability of the observations provided by the raters during the execution of the S4S
project. The S4S project consisted of the training of a group of experienced higher
education and K-12 teachers in a rating procedure based on Webb’s (1997, 2000)
alignment methodology and Marzano’s (2001) taxonomy of educational objectives. The
raters needed to be familiar with KSUS and the state’s assessment. Both expectations and
tests were rated with respect to DOK levels using Marzano's levels of mental
processing ranging from 1 to 5. The raters performed three basic tasks to compile the
essential data used in this study: (a) rate the DOK level of each KSUS objective, which
provides values of DOK for the standards; (b) rate the DOK level of each assessment
item, which provides values of DOK for the test items; and (c) determine the categorical
concurrence between KSUS and corresponding test items. This procedure provides
values (hits) of categorical concurrence between the two instruments (tests and standards)
in a matrix of hits. Values of ROK and BR (BR was converted to BOK) were obtained
applying the respective alignment criteria to the data collected from the raters. Moreover,
validity of the constructed alignment index was based on its internal mathematical
structure, logical inferences about its function, and its performance when used with
KSUS and state tests. Additionally, alignment results were correlated with accountability
features of state testing, such as NAEP reports. Finally, it was recommended that
accumulated evidence of validity be obtained through subsequent use of the index.
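The reliability analysis of the next chapter rests on a one-facet (items x raters) G-study. The following is a generic sketch of that computation for a fully crossed design and relative decisions, not the study's exact estimation procedure; the sample ratings are hypothetical.

```python
def g_coefficient(ratings, n_raters=None):
    # ratings[i][j]: DOK rating of item i by rater j (fully crossed design).
    n_i, n_r = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n_i * n_r)
    item_means = [sum(row) / n_r for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n_i for j in range(n_r)]
    ss_items = n_r * sum((m - grand) ** 2 for m in item_means)
    ss_raters = n_i * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_items = ss_items / (n_i - 1)
    ms_resid = (ss_total - ss_items - ss_raters) / ((n_i - 1) * (n_r - 1))
    var_items = max((ms_items - ms_resid) / n_r, 0.0)  # item variance component
    n = n_raters if n_raters is not None else n_r
    # Relative G-coefficient for the mean over n raters.
    return var_items / (var_items + ms_resid / n)

# Hypothetical DOK ratings: 4 items, 3 raters.
ratings = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 5, 5]]
print(round(g_coefficient(ratings), 2))  # 0.96
```

Increasing the number of raters shrinks the error term, which is why the G-coefficient curves in Figures 4 through 8 rise with the number of raters.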
CHAPTER 4
RESULTS
Generalizability of KSUS
A generalizability study (G-Study) was performed in order to determine the
reliability of the data provided by the S4S project. Six trained subjects rated the KSUS in
language and mathematics according to Marzano's five levels of depth of knowledge
(DOK). Figure 4 depicts the generalizability coefficients for KSUS-Language and for
KSUS-Math versus the number of raters. For six raters, the G-coefficient for KSUS-Language is 0.93 and for KSUS-Math is 0.89. In both cases, these two values indicate
acceptable levels of consistency among the raters. According to Figure 4, a number of
raters as low as 4 would be enough to reach acceptable levels of reliability (G > 0.80) in
both subject matters.
Figure 4. Generalizability coefficients for KSUS in English and mathematics.
Generalizability of State Tests
Generalizability analysis was performed for all 27 state tests. Figure 5 depicts the
reliability coefficients for the state of Colorado-English as an example among all states.
The state of Colorado (the first alphabetically) has been chosen solely to illustrate
the entire methodology of the study. As Figure 5 shows, generalizability
coefficients above 0.80 were obtained with three or more raters for English language in
the state of Colorado.
Figure 5. Generalizability coefficient for Colorado-Language.
For Mathematics, generalizability values around 0.80 were reached with five
raters for the state of Colorado, as seen in Figure 6.
Figure 6. Generalizability coefficient for Colorado-Mathematics.
In terms of raters’ generalizability for test items, twenty-seven state assessments
were analyzed in English language and mathematics. In most cases, the generalizability
(dependability) results were consistent and reliable (see Figure 7). Only four states (KY-Reading, G = 0.78; MS-English, G = 0.74; PA-Writing, G = 0.69; and TX-Reading, G =
0.75), among a total of 27 in the English language area, were below the value of G (0.80)
that is considered desirable. However, KY-Reading (G = 0.78) and TX-Reading (G =
0.75) could be considered within the range of acceptable values. Only PA-Writing (G =
0.69) could be considered below acceptable scores. The lowest coefficients of
dependability for DOK are due mainly to the restriction of range across items determined
by the raters. This behavior has also been found in previous work of alignment such as
the one performed by Herman et al. (2003).
Figure 7. Raters’ reliability for all state tests in English-Language.
In order to determine the restriction of range for PA-Writing, it is necessary to
calculate the item variance per rater. This is done by calculating the mean DOK for each
rater across all items and, then, calculating the average among all raters. The variation of
mean for PA-Writing (SD = 0.29, M = 3.72) was quite small in contrast, for example, to
CT-Writing (SD = 1.23, M = 2.20). The small variation of mean for PA-Writing is an
indication of restriction of range across items. Raters chose values of DOK in the range
of 3 to 4 only, as indicated by the mean value of 3.72. Another factor contributing
to this low G value for PA-Writing was the number of items (3) for this test (Table 8). A
small number of items in a test does not provide enough room for variation of DOK scores,
if the raters do not find substantial differences among the items.
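The restriction-of-range check described above can be sketched as follows. The function name and sample ratings are illustrative, and taking the SD across per-item means is an assumed reading of the text's procedure.

```python
import statistics

def dok_mean_and_spread(ratings):
    # ratings[i][j]: DOK rating of item i by rater j.
    # M: the raters' mean DOK across items, averaged over raters.
    # SD: spread of the per-item means; a small SD signals a restricted range.
    rater_means = [statistics.mean(col) for col in zip(*ratings)]
    item_means = [statistics.mean(row) for row in ratings]
    return statistics.mean(rater_means), statistics.stdev(item_means)

# Hypothetical ratings for a short, range-restricted test (values near 3-4):
restricted = [[4, 4, 3], [4, 3, 4], [4, 4, 4]]
m, sd = dok_mean_and_spread(restricted)
print(round(m, 2), round(sd, 2))  # 3.78 0.19
```

A high mean with a small SD, as in this toy example, mirrors the PA-Writing pattern: raters clustered near the top of the scale, leaving little variance for the G-study to capture.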
Similar circumstances appear in the case of mathematics tests (Figure 8). Two
state assessments (MN-MathBasic, G = 0.68; and VA-Algebra I, G = 0.63) showed the
lowest values of the dependability coefficient among all states. In both cases the range
was also small in comparison to the other tests. The variations of mean for MN-MathBasic (SD = 0.46, M = 1.66) and for VA-Algebra I (SD = 0.49, M = 1.70) were
small in comparison, for example, to NH-Math (G = 0.94, SD = 0.90, M = 2.17). In these
two cases, raters chose small values of DOK around 1 and 2, as can be noticed by the
mean values around 1.70.
Figure 8. Raters’ reliability for all state tests in mathematics.
Standards and Tests Cognitive Demand
Raters analyzed the KSUS and the selected state tests in terms of DOK. As noted,
the raters used Marzano’s cognitive demand scale ranging from 1 to 5, where higher
numbers indicated higher depth of knowledge. The values of DOK represented in the
graphics below were obtained as the average among the raters across all items.
Depth of knowledge (DOK) of KSUS and State Tests - Language
A comparison of DOK values for KSUS-Language and for all state assessments is
shown in Figure 9.
Figure 9. Comparison of KSUS-Language and state test cognitive demand. The bold line
across the graphic represents the mean DOK for all states (2.39).
As shown in Figure 9, the mean value of DOK for all states (2.39) is below the
average value of DOK for KSUS-Language (2.86). However, some state tests (MI-Writing = 3.61, MO-English = 3.38, and PA-Writing = 3.72) showed much higher values
of DOK than the average for all state tests and for the KSUS. According to the data, the
raters determined that those three state tests are more cognitively demanding than the
rest. In the opposite direction, there are two states (IL-ACT = 1.80 and VA-Writing =
1.77) with the lowest cognitive demand values. It is worth noting that the highest values
of DOK (MI-Writing = 3.61 and PA-Writing = 3.72) correspond to the shortest tests, both
tests with three items each. Table 8 presents the DOK values of state tests, G coefficients,
and the number of test items.
Table 8
Comparison of DOK mean values, G-Coefficient, and number of test items for English
language.

State   Test        DOK    SD(a)    G-Coefficient   Number of Items
KSUS    English     2.86   0.23     0.93            65
CO      English     2.18   0.18     0.90            21
CT      Reading     2.81   0.25     0.88            16
CT      Writing     2.20   0.23     0.94            8
IL      ACT         1.80   0.11     0.91            75
KY      Reading     2.46   0.16     0.78            30
MA      English     2.35   0.18     0.91            42
ME      English     2.23   0.20     0.86            28
MI      Reading     2.21   0.35     0.85            29
MI      Writing     3.61   0.71(b)  0.83            3
MO      English     3.38   0.39     0.88            21
MS      English     2.11   0.38     0.74            89
NH      English     1.93   0.22     0.89            39
NJ      English     2.67   0.20     0.88            26
NY      English 1   2.09   0.26     0.90            18
NY      English 2   2.79   0.19     0.87            12
OR      EngFormA    2.08   0.22     0.83            77
OR      EngFormB    2.18   0.17     0.80            77
OR      EngFormC    2.20   0.19     0.82            77
PA      Reading     1.92   0.30     0.84            21
PA      Writing     3.72   0.33     0.69            3
TX      Reading     2.45   0.17     0.75            42
TX      Writing     2.42   0.25     0.90            41
UT      Reading     2.22   0.11     0.88            37
UT      Writing     2.32   0.20     0.89            31
VA      Reading     2.28   0.32     0.79            34
VA      Writing     1.77   0.33     0.87            31
WA      English     2.32   0.19     0.91            46
WY      English     2.34   0.16     0.91            25

(a) Standard deviation among raters' DOK mean values.
(b) See discussion below to explain this high value.
Examination of Table 8 shows that most state tests obtained low values of
standard deviation for DOK, which can be read as a sign of consistency among the
raters' observations. However, one case shows an unusually high standard
deviation (MI-Writing, SD = 0.71). This higher value, compared to the other tests,
is due to a restriction of range (SD = 0.38), which was
exacerbated by the low number of items (3). With fewer items, the likelihood of observing
no variation across items increases. Even so, this test obtained an acceptable
coefficient of reliability (G = 0.83), in contrast, for example, to PA-Writing, in which the
restriction of range was even stronger (SD = 0.29) and the test did not reach an acceptable
coefficient of reliability (G = 0.69).
Inspection of the scatter graph (Figure 10) suggests a moderate negative
relationship between the values of DOK and the number of
items (r = -0.46, p < .05): tests with fewer items tend to receive higher cognitive-demand
ratings. Short tests (in number of items) tend to be more cognitively demanding because
each item must cover more standards and requires more elaboration from the student.
A test item that calls for a long answer, as in the case of an essay question,
probably requires more elaboration because it may address more objectives, and
consequently it may be considered more cognitively demanding.
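The correlation behind Figure 10 is an ordinary Pearson r between the DOK means and the item counts in Table 8. A sketch over a subset of Table 8's rows (the reported r = -0.46 comes from the full set of 27 tests):

```python
import math

def pearson_r(xs, ys):
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# (DOK, number of items) pairs from Table 8: MI-Writing, PA-Writing,
# IL-ACT, MS-English, OR-EngFormA, MO-English.
dok   = [3.61, 3.72, 1.80, 2.11, 2.08, 3.38]
items = [3, 3, 75, 89, 77, 21]
print(round(pearson_r(dok, items), 2))  # negative: shorter tests, higher DOK
```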
Figure 10. Scatter plot of DOK versus number of items for all states in language (r = -0.46, p < .05).
KSUS in language and 27 state tests in English language were rated according to
their cognitive demand (DOK). Results were presented using trilinear plots. Trilinear
plots (Wainer, 1997, p. 111) are useful when there are three values that sum to 1 (or 100%),
which is the case for DOK. Values of DOK above (DOKA), equal (DOKE), and below
(DOKB) sum to 1 for every test. In this case, such values are the DOK of state tests with
respect to the DOK of KSUS. A trilinear plot has been proven to be useful in comparing
the alignment between KSUS and state tests in relation to DOK. A trilinear plot, as
shown in Figure 11, is constructed using three axes inscribed in an equilateral triangle.
Each axis represents the three variables (DOKA, DOKE, and DOKB), which run from
the middle of a side up to the opposite vertex of the triangle. The length of each axis is 1.
Three distinct sections can be identified as Above, Equal, or Below by eliminating part
of the axes (see Figure 12 and subsequent figures).
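A sketch of how a test's (DOKB, DOKE, DOKA) proportions map to a point in the trilinear plot, using standard barycentric coordinates; the vertex placement is an illustrative assumption, not the layout of the customized software.

```python
import math

# Triangle corners for the Below, Equal, and Above regions (assumed layout).
VERTEX = {"B": (0.0, 0.0), "E": (1.0, 0.0), "A": (0.5, math.sqrt(3) / 2)}

def trilinear_point(dokb, doke, doka):
    # Barycentric combination; proportions should sum to ~1
    # (a small tolerance allows for rounding in reported values).
    assert abs(dokb + doke + doka - 1.0) < 0.02
    x = dokb * VERTEX["B"][0] + doke * VERTEX["E"][0] + doka * VERTEX["A"][0]
    y = dokb * VERTEX["B"][1] + doke * VERTEX["E"][1] + doka * VERTEX["A"][1]
    return x, y

# Utah-Reading, with the rounded proportions quoted in the text:
print(trilinear_point(0.36, 0.48, 0.15))
```

A test with all items Equal would land exactly on the E corner; equal thirds would land at the triangle's center.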
Figure 11. Trilinear plot of DOK showing the Utah-Reading test as an example.
The three regions (Above, Equal, and Below) are places for state tests that are
above, equal, or below in terms of DOK with respect to KSUS. As an example, Figure 11
depicts the location of the state of Utah (Reading test) in a trilinear plot. The values of
DOK (DOKB = 0.36, DOKE = 0.48, and DOKA = 0.15) are taken from the axes, and the
state is located at the intersection of the three perpendicular lines to the respective axes.
This graphical representation gives an indication of the “position” of the state assessment
in terms of DOK with respect to the KSUS-Language. The UT-Reading test illustrates
how a test's location in the plot conveys its standing in DOK relative to
KSUS-Language. Additionally, all states can be compared
according to their positions relative to each other on the graph.
Figure 12. Trilinear plot showing DOK comparisons for all states in Reading and
Comprehension.
The alignment results in terms of DOK for Reading and Comprehension are
shown in Figure 12. In general, all states are clustered around intermediate values of
DOK with respect to KSUS-Language. However, MO-English (0.11, 0.42, 0.47) and MS-English (0.47, 0.39, 0.14) are located in opposite positions, indicating that MO reached
the highest comparative value of DOK, and MS reached the lowest value. In other words,
MO is in the Above-Equal region and MS is around the Below-Equal region of alignment
with respect to KSUS-Language.
Those state tests that possess at least 50% of the items in the equal or above range
(DOKE + DOKA ≥ 0.50) are considered properly aligned according to Webb (1997). In
the trilinear graph, those states located above the first dotted line (0.50) will satisfy such
condition of adequate alignment. As shown in Figure 12, all state tests in Reading and
Comprehension reached acceptable levels of alignment for DOK according to this
condition.
This study proposes a complementary condition to determine alignment, extending
and reconciling the works of Webb (1997) and Rothman et al. (2002). This condition
based upon three levels of alignment defined as Low, Middle, and High. Those states
below the dotted line located at 0.50 in the trilinear plot are considered poorly aligned
(Figure 12). The middle region of alignment is located between 0.50 and 0.66 in the
trilinear graph. High alignment will be reached if the states are located above the dotted
line marked as 0.66. The trilinear plot is useful in determining those three regions of
alignment (Low, Middle, and High) since the dotted lines are traced through
geometrically notable points of the graphic. The definition of these three regions in the
graph also follows symmetric considerations of the trilinear plot. The intersection of the
three axes defines an interesting point at 0.66, and the intersection of the axes with the
sides of the triangle defines another point at 0.50.
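Reading the two dotted lines as thresholds on DOKE + DOKA (an interpretation consistent with the stack-graph discussion later in this chapter), the three-band classification can be sketched as:

```python
def dok_alignment_level(dokb, doke, doka):
    # Proposed bands as thresholds on DOKE + DOKA:
    # Low < 0.50, Middle in [0.50, 0.66), High >= 0.66.
    score = doke + doka
    if score < 0.50:
        return "Low"
    return "Middle" if score < 0.66 else "High"

# Values from Figure 12: MO-English lands High, MS-English lands Middle.
print(dok_alignment_level(0.11, 0.42, 0.47))  # High
print(dok_alignment_level(0.47, 0.39, 0.14))  # Middle
```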
Combination of the six regions (Above, Equal, Below, High, Middle, and Low)
provides finer granularity to the determination of alignment in terms of DOK. According
to this convention, all state tests are in the Middle and High levels of alignment for
Reading and Comprehension. These three levels of alignment are applied to the three
criteria of alignment (ROK, DOK, and BOK) and will be revisited later when the
definition of the indices of alignment is discussed.
The distribution of states in the three regions of alignment is presented in Table 9
for Reading and Comprehension.
Table 9
Distribution of states in terms of DOK for Reading and Comprehension.

DOK Low: (none)
DOK Middle: CO-English, IL-ACT, MI-Reading, ME-English, MS-English, NH-English,
NY-English 1, OR-English A, OR-English B, OR-English C, PA-Reading, UT-Reading,
WA-English, WY-English
DOK High: CT-Reading, KY-Reading, MA-English, MO-English, NJ-English,
NY-English 2, TX-Reading, VA-Reading
Another graphical representation useful in comparing DOK among states is the stack
graph. Although stack graphs provide no information beyond what the trilinear
plots contain, they are simple and can be created with popular tools such as MS Excel.
Stack graphs are included here as an alternative to the trilinear plots, which were
created using customized software not yet available to the public.
Below is a comparison of DOK among states for Reading and Comprehension
using a stack graph (Figure 13). As can be seen in the graphic, the status of MO-English
stands out from the other states indicating the lowest value for DOKB (gray area), and
consequently the highest value for DOKA (white area). Notice also that MO-English is
close to the middle path between DOKA and DOKE. A horizontal line at the 0.50 DOK
value marks Webb's condition for proper alignment. States
with the gray area (DOKB) under the 0.50 line are considered properly
aligned to the KSUS because the complement (DOKE + DOKA) is above 0.50. In a stack
graph, two horizontal lines, one at 0.50 and the other at 0.66 DOK values, can also define
the three regions (Low, Middle, and High). If the gray area (DOKB) for all state tests is
below the 0.50 line, it signifies that there are no tests in the Low region, as is the case for
Reading and Comprehension.
Figure 13. DOK of state tests with respect to KSUS-Language featuring Reading &
Comprehension.
It is important to notice that the vertical axis in the stack graph is measured in
DOK proportions, where the maximum value should be 1. Also, the best way to interpret
this stack graph is to observe the behavior of the gray area (DOKB) for each state test. As
the gray area moves upwards in the stack graph, it also indicates that the state test is
moving downwards in the trilinear plot along the DOKB axis. In sum, lower DOKB
means higher absolute values of DOK with respect to KSUS.
The alignment results in terms of DOK for Writing are shown in Figure 14. In the
topic of Writing, the majority of state tests are above the DOK value with respect to
KSUS-Language. Notable exceptions are the states of Texas (TX-Writing; 0.71, 0.15,
0.15), Illinois (IL-ACT; 0.61, 0.32, 0.07), and Virginia (VA-Writing; 0.57, 0.24, 0.29),
which are located in the Low region of alignment. According to the raters, the English
test of Missouri (MO-English; 0.01, 0.14, 0.86) obtained the highest cognitive demand
value in comparison to the DOK of KSUS-Language and in relation to the other states.
Figure 14. Trilinear plot showing DOK comparisons for all states in Writing.
Table 10
Distribution of states in terms of DOK for Writing.

DOK Low: IL-ACT, VA-Writing, TX-Writing
DOK Middle: ME-English, MS-English, NH-English, NY-English 2
DOK High: CO-English, CT-Writing, MA-English, MI-Writing, MO-English,
OR-English B, NJ-English, NY-English 1, PA-Writing, WA-English, WY-English
Figure 15. DOK of state tests with respect to KSUS-Language featuring Writing.
Figure 15 is a comparison of DOK among states for Writing using a stack graph.
The stack graph for Writing also shows the contrast between Missouri and Texas that is
illustrated in the trilinear plot. MO-English possesses the largest white area (DOKA =
0.86), while TX-Writing possesses the largest gray area (DOKB = 0.71). The three states
(IL-ACT, TX-Writing, and VA-Writing) located in the Low region are represented in the
stack graph by the only three tests in which the gray area is crossing the 0.50 line (Figure
15).
The alignment results in terms of DOK for Research Skills are shown in Figure
16. The Research Skills topic shows higher dispersion of state tests in comparison to the
former topics.
Figure 16. Trilinear plot showing DOK comparisons for all states in Research Skills.
Table 11
Distribution of states in terms of DOK for Research Skills.

DOK Low: VA-Reading, VA-Writing
DOK Middle: MI-Reading, MS-English, NH-English, UT-Reading, WA-English
DOK High: CT-Reading, CT-Writing, KY-Reading, MA-English, ME-English,
MI-Writing, MO-English, NJ-English, NY-English 1, NY-English 2, PA-Reading,
TX-Reading, TX-Writing, WY-English
Figure 17 is a comparison of DOK values among all states for Research Skills
using a stack graph. In this graph the states MI-Reading, UT-Reading, and WA-English
have gray areas (DOKB) that just touch the 0.50 dotted line. This is an indication that
they are on the border between the Low and Middle regions.
Figure 17. DOK of state tests with respect to KSUS-Language featuring Research Skills.
The alignment results in terms of DOK for Critical Thinking are shown in Figure
18. For Critical Thinking, a cluster of states is located in the Above area, indicating that
those states obtained higher DOK ratings than KSUS.
Figure 18. Trilinear plot showing DOK comparisons for all states in Critical Thinking.
Table 12
Distribution of states in terms of DOK for Critical Thinking.

DOK Low: MS-English, TX-Reading, TX-Writing
DOK Middle: MO-English, NJ-English, UT-Reading, WY-English
DOK High: CO-English, CT-Reading, CT-Writing, KY-Reading, MA-English,
ME-English, MI-Reading, MI-Writing, NH-English, NY-English 1, NY-English 2,
OR-English B, PA-Reading, VA-Reading, VA-Writing, WA-English
Figure 19 is a comparison of DOK among states for Critical Thinking using a
stack graph.
Figure 19. DOK of state tests with respect to KSUS-Language featuring Critical
Thinking.
Depth of Knowledge (DOK) of KSUS and State Tests – Mathematics
A comparison of DOK values for KSUS-Math and all state assessments is shown
in Figure 20. As shown in Figure 20, the mean value of DOK for all states (2.26) is below
the average (from six raters) value of DOK for KSUS-Mathematics (2.32). However,
some state tests (KY-Math = 2.69, MO-Math = 3.02, and WA-Math = 2.69) showed
higher values of DOK than the average for all tests and for KSUS-Math. According to the
data, raters determined that those three state tests are more cognitively demanding than
the rest. On the opposite side, there are three states (MN-MathBasic = 1.66, UT-Math =
1.70, and VA-Algebra I = 1.70) with the lowest cognitive demand values.
Figure 20. Comparison of KSUS-Math and state tests cognitive demand. The bold line
across the graphic represents the mean DOK for all states (2.26).
In Table 13 there is a comparison of all states in terms of DOK, G coefficient, and
number of test items for mathematics. The MO-Math test (SD = 0.87) obtained a higher
standard deviation than the other states because one of the raters assigned a DOK value
of 1 to all items of the test, in contrast to the other raters, who assigned values ranging
from 3 to 5. Eliminating this rater, the standard deviation dropped to 0.23; however, the
G coefficient remained equal, and the average DOK jumped to 3.37.
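The effect of the outlying rater on MO-Math can be illustrated with hypothetical rater means (the six values below are invented for illustration; the study's actual rater scores are not reproduced here):

```python
from statistics import mean, pstdev

# Hypothetical per-rater mean DOK values for one test; the last rater is an
# outlier who, as described for MO-Math, scored every item at DOK = 1.
# These six numbers are invented for illustration only.
rater_means = [3.4, 3.3, 3.5, 3.2, 3.45, 1.0]

full_mean, full_sd = mean(rater_means), pstdev(rater_means)

trimmed = rater_means[:-1]  # drop the outlying rater
trim_mean, trim_sd = mean(trimmed), pstdev(trimmed)

# Dropping the outlier raises the mean and sharply lowers the between-rater
# SD, mirroring the reported jump to 3.37 and the drop from 0.87 to 0.23.
```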
Table 13
Comparisons of DOK mean values, G-Coefficient, and number of test items for
mathematics.

State  Test        DOK   SDa    G-Coefficient  Number of Items
KSUS   Math        2.32  0.16   0.90           75
CO     Math        2.35  0.36   0.81           13
CT     Math        2.58  0.21   0.86           19
IL     ACT         1.99  0.32   0.81           59
KY     Math        2.69  0.37   0.88           30
MA     Math        2.31  0.49   0.87           42
ME     Math        2.51  0.34   0.89           29
MN     MathBasic   1.66  0.47   0.68           68
MN     MathComp    2.55  0.21   0.89           30
MI     Math        2.36  0.37   0.85           37
MO     Math        3.02  0.87b  0.78           24
MS     Algebra 1   2.28  0.33   0.80           65
NH     Math        2.17  0.39   0.94           28
NJ     Math        2.50  0.33   0.89           36
NY     Math A      2.41  0.40   0.87           35
NY     Math B      2.34  0.42   0.89           34
OR     Math A      2.00  0.41   0.75           65
OR     Math B      2.08  0.34   0.83           65
OR     Math C      2.04  0.31   0.79           65
PA     Math        2.17  0.48   0.89           20
TX     Math        2.13  0.46   0.82           48
UT     Math        1.70  0.42   0.74           70
VA     Algebra I   1.70  0.46   0.63           50
VA     Algebra II  1.96  0.38   0.83           50
WA     Math        2.69  0.32   0.88           47
WY     Math        2.36  0.38   0.88           19

a Standard deviation among raters' DOK mean values.
b See discussion in text to explain this high value.
The scatter graph in Figure 21 shows an appreciable negative correlation (r = -0.67, p <
.05) between the raters' DOK values and the number of items on the tests. As in the case
of English, tests with fewer items also tended to receive higher scores of cognitive
demand in mathematics. The same explanation for the higher DOK of short tests that was
given for English is also valid here.
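The reported correlation can be recomputed directly from the DOK and item-count columns of Table 13 (the 25 state tests, with KSUS excluded); a sketch using the standard Pearson formula:

```python
import math

# (mean DOK, number of items) for the 25 state tests listed in Table 13
tests = [
    (2.35, 13), (2.58, 19), (1.99, 59), (2.69, 30), (2.31, 42),
    (2.51, 29), (1.66, 68), (2.55, 30), (2.36, 37), (3.02, 24),
    (2.28, 65), (2.17, 28), (2.50, 36), (2.41, 35), (2.34, 34),
    (2.00, 65), (2.08, 65), (2.04, 65), (2.17, 20), (2.13, 48),
    (1.70, 70), (1.70, 50), (1.96, 50), (2.69, 47), (2.36, 19),
]

def pearson_r(pairs):
    """Pearson product-moment correlation coefficient."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy)

r = pearson_r(tests)  # negative: shorter tests tend to score higher DOK
```

Recomputing gives r ≈ -0.67, consistent with the reported value.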
Figure 21. Scatter plot of DOK versus number of items for all states in mathematics (r = -0.67, p < .05).
The alignment results in terms of DOK for Computation are shown in Figure 22.
In general, all states are clustered in the High (Above) region. However, CO-Math (0.51,
0.11, 0.38) is located in the Low region. According to the raters, the CO-Math test
obtained a cognitive demand score in Computation slightly below the one obtained by
the KSUS-Math, and much lower than the scores of the other states. All other state
tests were in the High region.
Figure 22. Trilinear plot showing DOK comparisons for all states in Computation.
Table 14
Distribution of states in terms of DOK for Computation.

DOK Low: CO-Math
DOK Middle: (none)
DOK High: CT-Math, IL-ACT, KY-Math, MA-Math, ME-Math, MN-MathBasic, MN-MathComp, MI-Math, MO-Math, MS-Algebra 1, NH-Math, NJ-Math, NY-Math A, NY-Math B, OR-Math A, OR-Math B, OR-Math C, PA-Math, TX-Math, UT-Math, VA-Algebra I, VA-Algebra II, WY-Math
In Figure 23 there is a comparison of DOK among states for Computation using a
stack graph. The larger white area indicates that most of the tests obtained higher values
of DOK than those obtained by KSUS-Math in the topic of Computation.
Figure 23. DOK of state tests with respect to KSUS-Math featuring Computation.
The alignment results in terms of DOK for Algebra are shown in Figure 24.
Figure 24. Trilinear plot showing DOK comparisons for all states in Algebra.
In Algebra (Figure 24) the majority of state tests are located in the High region,
meaning that those states have average DOK values higher than the DOK values of
KSUS-Math. One notable case is the state of Missouri (Math: 0.17, 0.09, 0.74).
According to the raters, MO-Math obtained the highest cognitive demand in comparison
to the DOK of KSUS-Math, and in relation to the other states in the topic of Algebra.
Table 15
Distribution of states in terms of DOK for Algebra.

DOK Low: OR-Math B, TX-Math, VA-Algebra I
DOK Middle: OR-Math C, UT-Math, VA-Algebra II
DOK High: CO-Math, CT-Math, IL-ACT, KY-Math, MA-Math, ME-Math, MN-MathBasic, MN-MathComp, MI-Math, MO-Math, MS-Algebra 1, NH-Math, NJ-Math, NY-Math A, NY-Math B, OR-Math A, PA-Math, WA-Math, WY-Math
Figure 25 is a comparison of DOK among states for Algebra using a stack graph.
In this graph, the state of Missouri possesses the largest white strip indicating the highest
value of DOK above the KSUS-Math, in agreement with the data shown in Figure 24.
Figure 25. DOK of state tests with respect to KSUS-Math featuring Algebra.
The stack graph for Algebra (Figure 25) also shows the contrast between
Minnesota (Math Basic) and Virginia (Algebra I).
The alignment results in terms of DOK for Geometry are shown in Figure 26.
Figure 26. Trilinear plot showing DOK comparisons for all states in Geometry.
In the case of Geometry, all states are located in the High (Above-Equal) area.
Notably, NH-Math (0.05, 0.11, 0.84) obtained the highest comparative value of DOK,
while UT-Math (0.31, 0.51, 0.18) obtained the lowest comparative value of DOK.
However, all states showed high levels of alignment for the Geometry topic. As shown in
Figure 26 and in Table 16, all states were located in the High region.
Table 16
Distribution of states in terms of DOK for Geometry.

DOK Low: (none)
DOK Middle: (none)
DOK High: CO-Math, CT-Math, IL-ACT, KY-Math, MA-Math, ME-Math, MN-MathBasic, MN-MathComp, MI-Math, MO-Math, MS-Algebra 1, NH-Math, NJ-Math, NY-Math A, NY-Math B, OR-Math A, OR-Math B, OR-Math C, PA-Math, TX-Math, UT-Math, VA-Algebra I, VA-Algebra II, WA-Math, WY-Math
Figure 27 is a comparison of DOK among states for Geometry using a stack
graph. The larger white area, and consequently the smaller gray area, indicates that most
states obtained higher scores of DOK in relation to those obtained by KSUS-Math.
Although PA-Math obtained the lowest score (the smallest white area and consequently
the largest gray area) in comparison to the other states, it is still located in the High
region as shown in Figure 26. In other words, its value of DOKE + DOKA = 0.69 is
above the 0.66 line.
Figure 27. DOK of state tests with respect to KSUS-Math featuring Geometry.
The alignment results in terms of DOK for Math Reasoning are shown in Figure
28. Most of the states are located in the Middle area for Math Reasoning. However,
MO-Math (0.28, 0.34, 0.38) obtained the highest comparative rate of DOK among all states,
while IL-ACT (0.66, 0.29, 0.05) obtained the lowest DOK score.
Figure 28. Trilinear plot showing DOK comparisons for all states in Math Reasoning.
Table 17
Distribution of states in terms of DOK for Math Reasoning.

DOK Low: IL-ACT, MN-MathBasic, OR-Math C, VA-Algebra I, VA-Algebra II
DOK Middle: CO-Math, CT-Math, MA-Math, ME-Math, MI-Math, MS-Algebra 1, NH-Math, NJ-Math, NY-Math A, NY-Math B, OR-Math A, OR-Math B, PA-Math, TX-Math, UT-Math
DOK High: KY-Math, MN-MathComp, MO-Math, WA-Math, WY-Math
Figure 29 is a comparison of DOK among states for Math Reasoning using a stack
graph. The larger gray area in the graphic indicates that most of the tests obtained lower
values of DOK than those obtained by the KSUS-Math in Math Reasoning.
Figure 29. DOK of state tests with respect to KSUS-Math featuring Math Reasoning.
The state tests with the gray area (DOKB) crossing the 0.50 dotted line (IL-ACT,
MN-MathBasic, OR-Math C, VA-Algebra I, and VA-Algebra II) belong to the Low
region of alignment, as can be observed also in Figure 28.
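The region boundaries used in these trilinear plots can be summarized from the text: a test falls in the Low region when DOKB exceeds the 0.50 line, in the High region when DOKE + DOKA exceeds the 0.66 line, and in the Middle region otherwise. A minimal sketch in Python (the exact handling of values on the boundaries is an assumption):

```python
def dok_region(dokb: float, doke: float, doka: float) -> str:
    """Classify a test's DOK proportions (Below, Equal, Above KSUS) into a region.

    Boundary conventions (strict vs. inclusive) are assumptions inferred
    from the 0.50 and 0.66 lines described in the text.
    """
    if dokb > 0.50:
        return "Low"
    if doke + doka > 0.66:
        return "High"
    return "Middle"

# Examples quoted in this chapter:
dok_region(0.51, 0.11, 0.38)  # CO-Math, Computation
dok_region(0.66, 0.29, 0.05)  # IL-ACT, Math Reasoning
dok_region(0.28, 0.34, 0.38)  # MO-Math, Math Reasoning
dok_region(0.05, 0.11, 0.84)  # NH-Math, Geometry
```

The four quoted triples land in the Low, Low, High, and High regions respectively, matching Tables 14, 16, and 17.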
Construction of the Alignment Index
As noted, a software program was developed to prototype three potential indices
of alignment (Area, Vector, and Mean). Based on the performance of these three
candidates, the index defined as the mean of the three alignment criteria (ROK, DOK,
and BOK) was chosen as the most adequate to characterize the overall index of
alignment by topic between standards and assessments. This overall index is constructed
as I = (ROK + DOKE + BOK)/3, as shown by equation 3.3 in Chapter 3. A new version of
the software was created, using only this formula, in order to examine all possible
values of the alignment criteria and to calculate the index across the full spectrum of
values that each criterion could take. Appendix B includes a snapshot of the software.
Figure 30 is a snapshot of the software running using data from the state of
Colorado for Reading and Comprehension. The index value (I = 0.55) is the overall index
of the state of Colorado-English in the topic of Reading and Comprehension.
Figure 30. Index of alignment. State of Colorado-English (Reading and Comprehension).
Color code has been used to enhance the graphic. Values for the alignment criteria
DOK (blue), ROK (red), and BOK (green) are entered using the slider bars. The value of
the overall index (I) for Reading and Comprehension is 0.55, and the skewness is 0.34 for
this particular case. The black line on the centroid plot represents the skewness, and it is
showing the tendency towards BOK. Notice that BOK (0.74) is the alignment criterion
that contributes the most to the index, so the skewness is oriented towards it.
Definition of the Levels of Alignment
The three regions of alignment defined earlier in this chapter for the alignment
criteria can also be applied to the index of alignment, because the index is a linear
function of those criteria. In other words, since the formula is linear, the classification
applied to each component (each criterion) carries over to the index itself.
Consequently, index values below 0.50 are considered low, values between 0.50 and
0.66 are considered middle, and values above 0.66 are considered high.
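The formula and cutoffs just described can be sketched in a few lines of Python (the function names are illustrative, not taken from the study's software); the Colorado-English Reading and Comprehension values from Figure 30 (ROK = 0.56, DOKE = 0.35, BOK = 0.74) reproduce the reported I = 0.55:

```python
def overall_index(rok: float, doke: float, bok: float) -> float:
    """Overall index of alignment per topic: I = (ROK + DOKE + BOK) / 3 (equation 3.3)."""
    return (rok + doke + bok) / 3

def alignment_level(index: float) -> str:
    """Three levels of alignment: low (< 0.50), middle (0.50-0.66), high (> 0.66)."""
    if index < 0.50:
        return "low"
    if index <= 0.66:
        return "middle"
    return "high"

# Colorado-English, Reading and Comprehension (Figure 30):
i = overall_index(rok=0.56, doke=0.35, bok=0.74)  # 0.55, a "middle" value
```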
Alignment between KSUS and State Assessments
As noted previously, four different alignment indices were defined in Chapter 3
under the definition of the indices of alignment section. Each partial index captures a
different aspect of the alignment. Below are graphical representations (centroid plots) of
such indices for the state of Colorado in English language and mathematics. Colorado has
been chosen as an example to illustrate the entire methodology of alignment, which has
been replicated for the rest of the states and used to obtain the summary results. In the
next section there is a summary comparison by topics among all states studied (see
Figure 37 and subsequent).
Colorado English Language. An Example
English Language: Reading & Comprehension and Writing

Partial index with respect to standards: Reading and Comprehension, I(S) = 0.48 (Skw = 0.31); Writing, I(S) = 0.54 (Skw = 0.38).
Partial index with respect to assessment: Reading and Comprehension, I(A) = 0.62 (Skw = 0.41); Writing, I(A) = 0.45 (Skw = 0.23).
Overall index of alignment: Reading and Comprehension, I = 0.55 (Skw = 0.34); Writing, I = 0.50 (Skw = 0.31).
Overall index with combined DOK (DOKE + DOKA): Reading and Comprehension, I(+) = 0.65 (Skw = 0.16); Writing, I(+) = 0.63 (Skw = 0.18).
Figure 31. Alignment indices for Colorado-English with respect to KSUS-Language
featuring Reading & Comprehension and Writing.
In Figure 31 are represented the centroid plots of the four indices of alignment for
Colorado-English in the topics of Reading & Comprehension and Writing. The lowest
index is the partial index with respect to assessment [I(A) = 0.45] for Writing. The
highest index is the overall index [I(+) = 0.65] for Reading and Comprehension. In terms
of skewness, the lowest value (Skw = 0.16) is for the overall index I(+) for Reading and
Comprehension. The highest skewness (Skw = 0.41) is for the partial index with respect
to assessments I(A) for Reading and Comprehension. The direction and module (length =
0.41) of the arrow representing skewness for Reading and Comprehension indicates that
81
ROKA and BOKA contribute more than DOKE to the index. Also, the arrow is tilted
towards BOKA as an indication that this criterion contributes the most to the index.
Colorado
English Language: Research Skills and Critical Thinking

Partial index with respect to standards: Research Skills, I(S) = 0 (Skw = n/a); Critical Thinking, I(S) = 0.47 (Skw = 0.69).
Partial index with respect to assessment: Research Skills, I(A) = 0 (Skw = n/a); Critical Thinking, I(A) = 0.22 (Skw = 0.28).
Overall index of alignment: Research Skills, I = 0 (Skw = n/a); Critical Thinking, I = 0.34 (Skw = 0.48).
Overall index with combined DOK (DOKE + DOKA): Research Skills, I(+) = 0 (Skw = n/a); Critical Thinking, I(+) = 0.58 (Skw = 0.53).
Figure 32. Alignment indices for Colorado-English with respect to KSUS-Language
featuring Research Skills and Critical Thinking.
The graphs in Figure 32 show the centroid plots of the four indices of alignment
for Colorado-English in the topics of Research Skills and Critical Thinking. As can be
noted, the raters did not find Research Skills in any of the items of this test.
A comparison between the four different indices [ I(S), I(A), I, and I(+) ] and the
four topics of English language is depicted in Figure 33.
Figure 33. Indices of alignment by topics for Colorado-English language.
According to the interpretation of the different indices of alignment given in the
definition of the indices (Chapter 3), I(S) measures the index of alignment with respect to
standards, and it is a measurement of the relevance of the standards to the items. As
shown in Figure 33, Writing obtained the highest index for relevance of items. Its value
[I(S)W > 0.50] means that there is an acceptable proportion of standards (objectives)
addressed by the test in the topic of Writing. Reading & Comprehension and Critical
Thinking obtained lower values, while Research Skills was totally absent in this test.
For I(A), which measures the relevance of the items to the standards, Reading &
Comprehension showed the highest value [I(A)R&C = 0.62] indicating an acceptable match
of items with content found in the standards (objectives). However, Critical Thinking
[I(A)CT = 0.22] did not appear to be covered sufficiently in this test with respect to the
coverage of KSUS. The overall index (I) of alignment for CO-English follows the
tendency of the partial indices since it is the average of the two. It is worth recalling that I
= [I(S) + I(A)]/2. As a result, for CO-English, only the topic of Writing reached an
acceptable level of alignment with respect to the KSUS-Language. Taking into
consideration the combined DOK (DOKE + DOKA) criterion, all three topics present in
the test reached acceptable values for this index of alignment I(+).
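The relation I = [I(S) + I(A)]/2 can be checked against the Colorado-English values above (a sketch; small discrepancies with the reported overall indices reflect the rounding of the published two-decimal partials):

```python
# Check I = [I(S) + I(A)] / 2 against the Colorado-English partial indices.
partials = {
    # topic: (I(S), I(A))
    "Reading & Comprehension": (0.48, 0.62),
    "Writing": (0.54, 0.45),
    "Research Skills": (0.00, 0.00),
    "Critical Thinking": (0.47, 0.22),
}

overall = {topic: (i_s + i_a) / 2 for topic, (i_s, i_a) in partials.items()}
# Reading & Comprehension ≈ 0.55 and Critical Thinking ≈ 0.34, matching the
# reported overall indices; Writing gives 0.495 versus the reported 0.50,
# a rounding artifact of the two-decimal inputs.
```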
Colorado Mathematics. An Example
Mathematics: Computation and Algebra

Partial index with respect to standards: Computation, I(S) = 0.44 (Skw = 0.66); Algebra, I(S) = 0.52 (Skw = 0.59).
Partial index with respect to assessment: Computation, I(A) = 0.31 (Skw = 0.34); Algebra, I(A) = 0.39 (Skw = 0.19).
Overall index of alignment: Computation, I = 0.37 (Skw = 0.49); Algebra, I = 0.46 (Skw = 0.31).
Overall index with combined DOK (DOKE + DOKA): Computation, I(+) = 0.50 (Skw = 0.31); Algebra, I(+) = 0.55 (Skw = 0.47).
Figure 34. Alignment indices for Colorado-Math with respect to KSUS-Math featuring
Computation and Algebra.
In Figure 34 are represented the plots of the four indices of alignment for the
Colorado mathematics test in the topics of Computation and Algebra. The lowest index is
the partial index with respect to assessment [I(A) = 0.31] for Computation. The highest
index is the overall index [I(+) = 0.55] for Algebra. In terms of skewness, the lowest
value (Skw = 0.19) is for the partial index with respect to assessment I(A) for Algebra.
The highest skewness (Skw = 0.66) is for the partial index with respect to standards I(S)
for Computation.
Colorado
Mathematics: Geometry and Math Reasoning

Partial index with respect to standards: Geometry, I(S) = 0.57 (Skw = 0.60); Math Reasoning, I(S) = 0.47 (Skw = 0.47).
Partial index with respect to assessment: Geometry, I(A) = 0.42 (Skw = 0.25); Math Reasoning, I(A) = 0.63 (Skw = 0.35).
Overall index of alignment: Geometry, I = 0.49 (Skw = 0.41); Math Reasoning, I = 0.55 (Skw = 0.36).
Overall index with combined DOK (DOKE + DOKA): Geometry, I(+) = 0.64 (Skw = 0.63); Math Reasoning, I(+) = 0.60 (Skw = 0.30).
Figure 35. Alignment indices for Colorado-Math with respect to KSUS-Math featuring
Geometry and Math Reasoning.
In Figure 35 are represented the centroid plots of the four indices of alignment for
the Colorado mathematics test in the topics of Geometry and Math Reasoning. The
lowest index is the partial index with respect to assessment [I(A) = 0.42] for Geometry.
The highest index is the overall index [I(+) = 0.64] for Geometry. In terms of skewness,
the lowest value (Skw = 0.25) is for the partial index with respect to assessment I(A) for
Geometry. The highest skewness (Skw = 0.63) is for the overall index I(+) for Geometry.
A comparison between the four different indices [ I(S), I(A), I, and I(+) ] and the
four topics of mathematics is depicted in Figure 36. There is no value for Trigonometry
because the CO-Math test did not include this topic.
Figure 36. Indices of alignment by topics for Colorado-Math.
Again, according to the interpretation of the different indices of alignment given
in the definition of the indices (Chapter 3), I(S) measures the index of alignment with
respect to standards, and it is a measurement of the relevance of the standards to the
items. As shown in Figure 36, Geometry obtained the highest index for relevance of
items. Its value [I(S)G = 0.57] means that there is an acceptable proportion of standards
(objectives) addressed by the test in the topic of Geometry. Algebra also obtained an
acceptable index value [I(S)A = 0.53], while Computation and Math Reasoning obtained
lower values below the 0.50 score.
For I(A), which measures the relevance of the items to the standards, Math
Reasoning showed the highest value [I(A)MR = 0.63], indicating an acceptable match of
items with content found in the standards (objectives). However, Computation [I(A)C =
0.31], Algebra [I(A)A = 0.39], and Geometry [I(A)G = 0.42] did not reach acceptable
levels of alignment. As mentioned previously, the overall index (I) of alignment for
CO-Math follows the tendency of the partial indices since it is the average of the two. As a
result, for CO-Math, only the topic of Math Reasoning reached an acceptable level of
alignment with respect to the KSUS-Math. Taking into consideration the combined DOK
(DOKE + DOKA) criterion, all four topics reached acceptable values for this index of
alignment I(+).
Indices of Alignment for All States – Language
The graphics below depict the overall indices of alignment (I) by topics for all
state assessments in English language. These graphs are built taking the values of the
overall index (I) from Figure 33 for all state tests.
Figure 37. Indices of alignment for all states featuring Reading and Comprehension.
In Reading and Comprehension, all states obtained indices of alignment between
0.50 and 0.65, which are considered moderate. The average value of this overall index of
alignment (I) for all states was 0.59 in Reading and Comprehension.
Figure 38. Indices of alignment for all states featuring Writing.
In Writing, OR-English B obtained the lowest index value (I = 0.37), while
PA-Writing obtained the highest value (I = 0.64). The average value of the index of
alignment for all states was 0.49 in Writing. For states without an index value, the
raters did not find matches for the topic in the test.
Figure 39. Indices of alignment for all states featuring Research Skills.
In Research Skills, five states (KY-Reading, ME-English, MI-Reading,
TX-Writing, and WA-English) obtained indices of alignment lower than 0.20, which is
considered low. Two states (MO-English and TX-Reading) obtained the highest values,
around 0.50, which are considered moderate. The average value of the index of alignment
for all states in the topic of Research Skills was 0.28.
Figure 40. Indices of alignment for all states featuring Critical Thinking.
In Critical Thinking, PA-Writing obtained the highest value of the index of
alignment (0.68), while TX-Reading obtained the lowest value (0.21). The average value
of the index of alignment for all states in Critical Thinking was 0.38.
Total Index of Alignment for English Language
The total index of alignment for each subject matter is calculated using the overall
indices (I) and the weights of each topic. The total index of alignment for English
language (ITE), as defined by equation 3.5 in Chapter 3 above, is represented in Figure 41
for all states.
For illustration purposes, the total index of alignment between the CO-English
test and the KSUS-Language can be calculated as follows:
From previous results, the following values were found (Figure 33):
Reading and Comprehension: I = 0.55
Writing: I = 0.50
Research Skills: I = 0.00
Critical Thinking: I = 0.34
The total index of alignment for CO-English is calculated using the weights of the
topics (Table 6 and equation 3.5) as follows:
ITE = 0.37*0.55 + 0.39*0.50 + 0.15*0.00 + 0.09*0.34 = 0.42
Another index can be obtained using the index I(+), which corresponds to
Webb's condition (DOK = DOKE + DOKA):
ITE+ = 0.37*0.65 + 0.39*0.63 + 0.15*0.00 + 0.09*0.58 = 0.53
This last index (ITE+) is expected to always be equal to or greater than the former
one (ITE).
The general definition of the total index of alignment for any subject matter is
then expressed as IT = ∑wiIi / ∑wi, where wi is the number of objectives per topic and
Ii is the overall index per topic.
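As a sketch (names are illustrative), the weighted average above can be written directly in Python; recomputing from the rounded two-decimal inputs gives approximately 0.43 and 0.54, within one rounding step of the reported 0.42 and 0.53:

```python
def total_index(weights, indices):
    """Total index of alignment: IT = sum(w_i * I_i) / sum(w_i)."""
    return sum(w * i for w, i in zip(weights, indices)) / sum(weights)

# CO-English topic weights (Table 6) and indices, in the order
# Reading & Comprehension, Writing, Research Skills, Critical Thinking.
weights = [0.37, 0.39, 0.15, 0.09]
i_overall = [0.55, 0.50, 0.00, 0.34]   # I per topic
i_plus = [0.65, 0.63, 0.00, 0.58]      # I(+) per topic (Webb's DOK condition)

ite = total_index(weights, i_overall)    # ≈ 0.43 (reported as 0.42)
ite_plus = total_index(weights, i_plus)  # ≈ 0.54 (reported as 0.53)
```

As expected, ite_plus is never smaller than ite, since I(+) ≥ I for every topic.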
According to the convention for the three levels of alignment defined in this
study, most states obtained levels below 0.50, which are considered low. As
Figure 41 shows, there are two extreme cases: MO-English = 0.57 with the highest score
and OR-EnglishA = 0.23 with the lowest score. The average value of the total index of
alignment for all states in English language was 0.37 (SD = 0.10). Only two states
(MO-English = 0.57 and NJ-English = 0.52) reached middle levels of alignment (0.50 < ITE <
0.66).
Figure 41. Total indices of alignment in English language for all states (M = 0.37, SD =
0.10).
Indices of Alignment for All States – Mathematics
Below there is a detailed analysis for all state tests in mathematics. This is the
same analysis done for English language in the preceding section.
The graphics below depict the overall index of alignment by topics for all state
assessments in mathematics.
Figure 42. Indices of alignment for all states featuring Computation.
The highest index value for the Computation topic was obtained by the CT-Math
test (0.61), while MN-MathComp obtained the lowest value (0.32). Most state tests are
slightly below 0.50. The average index of alignment for all states in Computation was
0.49.
Figure 43. Indices of alignment for all states featuring Algebra.
For the Algebra topic, the VA-Algebra II test obtained the highest index of
alignment (0.65), while MN-MathBasic obtained the lowest index (0.32). The average
index of alignment for all states in Algebra was 0.48.
Figure 44. Indices of alignment for all states featuring Geometry.
In Geometry, TX-Math obtained the highest index of alignment (0.55), while
MO-Math obtained the lowest value (0.28). The average index of alignment for all states
in Geometry was 0.43.
Figure 45. Indices of alignment for all states featuring Math Reasoning.
KY-Math obtained the highest index (0.59) in Math Reasoning, while
VA-Algebra II obtained the lowest index (0.43). The average index of alignment for all states
in Math Reasoning was 0.51, which could be considered in the middle range.
It is worth mentioning that Trigonometry was not considered in this study because
only five state tests included such content. Additionally, the low weight of the
Trigonometry topic (0.05) made its contribution to the index practically negligible in
relation to the other topics.
Total Index of Alignment for Mathematics
The total index of alignment for mathematics for all states (ITM), as defined by
equation 3.6 in Chapter 3, is represented in Figure 46. This total index of alignment for
mathematics is calculated using procedures similar to the procedures used for English
language above.
Figure 46. Total indices of alignment in mathematics for all states (M = 0.46, SD = 0.03).
According to the convention adopted in this study, most states obtained total
indices of alignment for mathematics slightly below 0.50, which are considered in the
Low-Middle range. The lowest total index was obtained by MN-MathBasic (0.37), and
the highest index was 0.51, which was obtained by three states, MS-Algebra 1, NY-Math
B, and TX-Math. The average of the total index of alignment in mathematics for all states
was 0.46 (SD = 0.03).
The average of the total index of alignment among all states in English (M = 0.37)
was lower than the average of the index in mathematics (M = 0.46). In both subject
matters, these average scores were in the Low range of alignment with relation to their
respective higher education expectations (KSUS). However, as shown previously, a few
individual states did stand out from the rule.
Alignment Index and Test Accountability
The total index of alignment (IT) proposed in this study constitutes a quantitative
entity aimed to measure the match between the content addressed by the state tests and
the college expectations for any subject matter defined in the KSUS. Three levels of
alignment were established: low (IT < 0.50), middle (0.50 ≤ IT < 0.66), and high (IT ≥ 0.66).
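The three-level convention can be captured in a small helper function. This is an illustrative sketch, not code from the study; in particular, assigning the boundary values 0.50 and 0.66 to the higher level is an assumption, since the text leaves the endpoints open:

```python
def alignment_level(total_index):
    """Map a total index of alignment (IT) to the study's three levels.

    Boundary values are assigned to the higher level by assumption;
    the original text does not specify where exactly 0.50 and 0.66 fall.
    """
    if total_index < 0.50:
        return "Low"
    if total_index < 0.66:
        return "Middle"
    return "High"

# Values reported in the study:
# MO-English total index 0.57 -> Middle; MN-MathBasic 0.37 -> Low
```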
In order to establish a relationship between levels of alignment and accountability features of state testing, results from two mathematics tests were analyzed: the National Assessment of Educational Progress (NAEP) and the Scholastic Aptitude Test (SAT) from the College Board. State-level NAEP has assessed representative samples of 4th and 8th graders in public and non-public schools in the United States since 1994.
NAEP developers claim that the test is a reliable instrument for monitoring achievement over time across the nation. Accordingly, state-level NAEP mathematics scores for 1996 and 2000, the years available for 8th grade, were chosen.
Although the correlation between levels of alignment and direct NAEP scores did not reach statistical significance, the correlation between the three levels of alignment and NAEP gain from 1996 to 2000 was significant (r = 0.46, p < 0.05).
According to NAEP developers, this test is mostly designed to measure student
achievement in the context of instructional experience by tracking changes in students’
performance over time. In this sense, NAEP gains were more appropriate to correlate
with the total index than NAEP direct scores. In relation to NAEP data, all states showed
gains during this period, except the state of Utah. One possible conclusion that can be
ventured about this correlation is that those state tests with higher levels of alignment to
the college standards (KSUS) also respond to high state standards, and consequently,
higher NAEP gains are obtained. Although the raters used state tests from the year 2002, it is reasonable to conclude that acceptable alignment between state tests and high state standards has been a tendency that persisted over several years. However, it could be argued that
ability to take tests does not necessarily indicate significant learning and that “study for
the test” could be playing a role here. Nevertheless, if state tests respond to high state
standards, those negative factors could be considered benign when compared to the
situation of state tests not responding to high standards.
The other high-stakes assessment compared was the SAT. Average annual results for this test are available at the state level. Although the direct correlation between the levels of alignment and SAT gain over the same period did not reach statistical significance, the correlation between NAEP gain and SAT gain for the same period was acceptable (r = 0.52, p < 0.05). This result establishes a connection between NAEP gain
and higher education, since the SAT test is taken by college-oriented students.
CHAPTER 5
CONCLUSIONS AND IMPLICATIONS
Conclusions
The main purpose of this study was the creation of a quantitative methodology to
define and measure the alignment between standards (expectations) and assessment in the
content focus category. The methodology developed in this study included the adoption
of a minimum and concise language appropriate to define the criteria and to measure
match between student expectations (standards) and state tests. Three dimensions were
chosen to define the criteria for alignment: range (as ROK), depth (as DOK) and balance
(as BOK). These three dimensions of alignment were conceptualized as bidirectional, that
is, matching not only of standards addressed by items, but also matching items
corresponding to standards. A bidirectional approach identifies both the standards not represented on the test and the items without a correlate in the standards. These three dimensions represented the minimum
needed to characterize alignment in the content focus category. The study methodology
incorporated the construction of an alignment index as a mathematical formula, which
informs quantitatively about the level of alignment between standards and assessment.
The alignment index was constructed based upon the minimum alignment criteria chosen
(ROK, DOK, and BOK).
The index of alignment between standards and assessments was expressed as a
mathematical function of the three alignment criteria. Among the three different formulas
explored to define the alignment index, the simple mean of the three alignment criteria
proved to be the most appropriate. The linear combination of the three criteria performed
smoothly when tested throughout all possible values of each criterion, using the software
developed for that purpose. The other two formulas behaved poorly, particularly when any criterion approached zero. These irregularities were due to the nonlinear components of those formulas, which included a product of the criteria and a square root.
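The contrast can be illustrated with a minimal sketch. The simple mean is the formula the study adopted; the nonlinear alternative below (a square root of the product of the criteria) is only a stand-in for the rejected formulas, whose exact form appears in Chapter 3 and is not reproduced here:

```python
import math

def index_mean(rok, dok, bok):
    # Adopted index: simple mean of the three alignment criteria
    return (rok + dok + bok) / 3.0

def index_nonlinear(rok, dok, bok):
    # Hypothetical nonlinear form involving a product and a square root;
    # a stand-in for the rejected Chapter 3 formulas, not a reproduction
    return math.sqrt(rok * dok * bok)

# When any criterion approaches zero, the nonlinear form collapses to zero
# while the simple mean degrades smoothly:
print(index_mean(0.0, 0.9, 0.9))       # ≈ 0.6
print(index_nonlinear(0.0, 0.9, 0.9))  # 0.0
```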
The process of measuring alignment was represented in a graphical manner by
centroid plots (Figure 30). The values of the three alignment criteria (ROK, DOK, and
BOK) were represented on each axis of the centroid plot. These values defined the three
vertices of a triangle. The shape and size of the triangle gave an indication of the index
value and also informed about the way each criterion contributed to the index. Two
different types of graphics were used in this study: the centroid plots and the trilinear
plots. Although similar, these two graphs served different purposes. Centroid plots were
used to analyze and measure the indices of alignment, while trilinear plots were used to
analyze and compare DOK values only (Figure 11).
Additional to the graphical representation of the index, the concept of skewness
was introduced as a complementary aspect of the alignment procedure. Skewness (as a
vector) informed graphically about the weight each criterion contributed to the index.
Skewness was calculated as the vectorial sum of the three criteria, since each axis could
be assimilated to a vector. This concept proved valuable because it gave an indication
toward which criterion the index was oriented. In other words, skewness was a vector
pointing toward the criteria that contributed the most to the index. Skewness equal to zero
meant that each criterion contributed (weighted) equally to the index, and it happened
when the shape of the figure was an equilateral triangle.
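Under the assumption that the three criterion axes are unit vectors spaced 120 degrees apart, as in a centroid plot, the skewness vector can be computed as a vectorial sum. The particular axis orientations below are an assumption for illustration, not taken from the study:

```python
import math

def skewness_vector(rok, dok, bok):
    """Vectorial sum of the three criteria on axes 120 degrees apart.

    Axis angles (90, 210, 330 degrees) are an assumed orientation;
    equal criteria (an equilateral triangle) yield the zero vector.
    """
    angles = (90.0, 210.0, 330.0)
    values = (rok, dok, bok)
    x = sum(v * math.cos(math.radians(a)) for v, a in zip(values, angles))
    y = sum(v * math.sin(math.radians(a)) for v, a in zip(values, angles))
    return x, y

# Equal contributions cancel: skewness_vector(0.4, 0.4, 0.4) ≈ (0, 0),
# while an imbalanced profile points toward the dominant criterion.
```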
This study began with a generalizability analysis of the data provided by the
raters. The generalizability coefficient for the six subjects that rated the KSUS reached
satisfactory levels of reliability. The G-Coefficient for KSUS-Language was 0.93, and for
KSUS-Math was 0.89. In both cases the raters reached consensus on the DOK ranking,
and the G values were within the acceptable levels (G > 0.80). These results could be
attributed to the training of the subjects, who spent considerable time in training
workshops prior to the actual rating activity. Training the raters was probably the most critical aspect of this alignment methodology, which is based on expert judgment.
Detailed analysis of DOK was performed for the KSUS and for each state test in
the two subject matters, English and mathematics. The DOK (the mean value among the
six raters) for the KSUS-Language was 2.86 in Marzano’s scale ranging from 1 to 5.
Most state tests in English obtained DOK values below the one obtained by KSUS-Language (2.86), with three notable exceptions: MI-Writing (DOK = 3.61, SD = 0.71, G = 0.83, N = 3), MO-English (DOK = 3.38, SD = 0.39, G = 0.88, N = 21), and PA-Writing (DOK = 3.72, SD = 0.33, G = 0.69, N = 3), as shown in Table 8. The raters determined
that these three state tests were the most cognitively demanding. A comparison of the
DOK scores with the number of test items produced an appreciable correlation between
the two (r = -0.46, p < .05), which can be observed in Figure 10. It was determined that
items in short tests tended to be more cognitively demanding because they concentrated more content. This tendency was even more pronounced in the case of mathematics (r = -0.67, p < .05) (Figure 21).
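Correlations such as these between DOK and test length follow the standard Pearson product-moment formula; a minimal implementation (not the study's own software) is:

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# A perfectly inverse relationship (more items, lower DOK) yields r ≈ -1:
print(pearson_r([10, 20, 30], [3.5, 3.0, 2.5]))  # ≈ -1.0
```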
The DOK for the KSUS-Math was 2.32 in the Marzano scale. Most state tests in
mathematics obtained DOK values comparable to the one obtained by KSUS-Math.
However, MO-Math reached the highest value of DOK (3.02) among all states (see Table
13). This DOK value was even higher (3.37) after dropping one of the raters who
assigned extremely low values to the items in comparison to the other five raters. In sum,
the state of Missouri obtained the highest cognitive demand score from the raters in the
two subject matters, English and mathematics.
An alignment index was necessary in order to measure match between standards
and assessment in a quantitative manner. An index improves the possibility of obtaining precise and accurate values when judging the relationships among components of the instructional system and, in turn, enables researchers to make better comparisons
among the components of the system. The alignment index defined here was used to
measure the alignment between higher education expectations (KSUS) and state
assessment. This procedure of alignment was an opportunity to explore the properties of
the proposed index. As a result of this alignment procedure, alignment comparisons
among states were performed in terms of match between KSUS and current state tests.
Additionally, this study compared the match of KSUS and state assessments across
subject matter and across states.
Four partial indices of alignment were defined [I(S), I(A), I, and I(+)] according
to the different conceptualizations of the alignment criteria. Every partial index informed
about a particular aspect of the alignment for each topic within a given subject matter.
Partial indices by topic, I(S) and I(A), were a consequence of the bidirectional definition
of all alignment criteria. The overall index by topic, I, was a combination of these two
partial indices. The total index of alignment (IT) for each state test was, then, constructed
as a combination of the overall indices and the weight each topic contributed to the whole
standard. This was a novel approach in alignment research, through which a deeper and more comprehensive understanding of the measurement of match between standards and assessments was obtained.
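Under the assumption that the overall index per topic is a simple mean of its two directional parts, and that the total index is a weighted sum over topics, the construction can be sketched as follows (the actual combination rules are defined in Chapter 3; the simple mean here is an assumption):

```python
def overall_index(i_s, i_a):
    # Combine the two directional partial indices, I(S) and I(A), for one
    # topic. A simple mean is assumed here; Chapter 3 gives the actual rule.
    return (i_s + i_a) / 2.0

def total_index(topic_indices, topic_weights):
    # IT: weighted sum of per-topic overall indices; weights sum to 1.
    assert abs(sum(topic_weights) - 1.0) < 1e-9
    return sum(i * w for i, w in zip(topic_indices, topic_weights))

# Two hypothetical topics weighted 0.6 and 0.4:
it = total_index([overall_index(0.5, 0.4), overall_index(0.6, 0.2)],
                 [0.6, 0.4])
print(round(it, 3))  # 0.43
```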
Trilinear plots were used in order to represent graphically DOK comparisons
between KSUS and state tests. Three regions (Above, Equal, and Below) were used as
places for tests that were above, equal, or below the DOK value of the KSUS. Trilinear
plots were also useful to make comparisons among states depending on their relative
position in the graph.
In relation to alignment, the researcher suggested three levels of alignment (Low,
Middle, and High), which were applicable to the different criteria and to the indices. The
trilinear plot was useful in defining these three levels because the levels were traced by lines crossing notable points of the graph (see Figure 12 and subsequent figures). The three levels were defined by the values of the alignment criteria and of the indices, as follows: the Low range was assigned when the criteria or indices fell below 0.50; the Middle range when values were at or above 0.50 but below 0.66; and the High range when values reached 0.66 or above. Using ranges was desirable given the nature of this alignment methodology, where a single cutoff value was not appropriate. Ranges of alignment were considered less arbitrary than a single cut score, since some degree of vagueness was embedded in the description of standards and in the designation of the DOK levels.
Comparative results in terms of cognitive demand showed that the state of
Missouri, for example, obtained the third highest DOK score (DOK = 3.38) among all
states in English language after PA-Writing (DOK = 3.72) and MI-Writing (DOK =
3.61). Alignment results in terms of DOK with respect to KSUS-Language showed that
MO-English reached the highest value in comparison to the other states in Reading and
Comprehension (DOKE = 0.42, DOKA = 0.47) (Figure 12) and in Writing (DOKE =
0.14, DOKA = 0.86) (Figure 14). In the topic Research Skills, MO-English also reached
one of the highest values (DOKE = 0.84, DOKA = 0.16) (Figure 16). In the topic of
Critical Thinking, MO-English reached intermediate values (DOKE = 0.47, DOKA =
0.08) (Figure 18).
Missouri also reached the highest total index of alignment (ITE = 0.57) above the
mean (M = 0.37) of all states. This value located Missouri in the Middle range of
alignment for English. In mathematics, however, MO-Math obtained a lower index (ITM = 0.41), below the mean (M = 0.46) of all states. This low index value, with respect to KSUS-Math and in relation to the other states, located Missouri in the Low range of
alignment for mathematics. This detailed analysis can be applied to all states in all
subject matters, and also across grades if such data are available.
Levels of alignment and accountability aspects of state testing were analyzed. The
correlation between NAEP gain and levels of the alignment index could be interpreted as
support for validity evidence of the alignment methodology; however, a word of caution
is needed. Test results data were not ideal. NAEP is administered every four years; hence
the closest data were years 1996 and 2000. Raters in this study selected state tests of
2002. As noted above, the acceptable correlation between high levels of alignment of
state tests with KSUS and student achievement (NAEP gain) can be interpreted as a
tendency of well-aligned states to show higher student performance.
The quantitative index of alignment constructed in this study was manipulated
effectively in the digital domain using computational tools. Besides the construction of
the index of alignment, a series of computational tools and graphical representations of
the alignment criteria and the index itself were developed. The trilinear plot was
exceptionally useful to represent graphically the DOK of the tests and also of great value
to make comparisons between the tests and the KSUS, as well as to make comparisons
among the tests themselves. The centroid plot was useful for visualizing the index of
alignment allowing a finer granularity of the alignment methodology. Such visual
representations helped to improve the understanding of the alignment process while
providing valuable tools to disclose visual patterns, new relationships, and connections
not seen in tables and flat graphs.
As predicted in the framework of this study, this quantitative methodology of
alignment provided the capability to measure alignment from a variety of perspectives,
and the ability to make comparisons across topics of subject matter, across grades, and
across states. This was a comprehensive methodology that collectively told the alignment
story from its multiple perspectives and dimensions, a methodology anchored in the
process of managing expert judgment imparted by the raters.
Implications of the Findings
The alignment methodology developed in this study improves our understanding
of the alignment complexity and provides better tools for judging educational alignment,
in particular content alignment between standards and assessments. The computational
and graphical tools developed are powerful means of describing and measuring alignment
from different perspectives. The detailed analysis of state tests in terms of their cognitive
demand (DOK) is in itself a powerful tool for school districts. The creation and
utilization of an alignment index is expected to advance the research in systemic school
reform, and specifically to advance alignment research. Alignment analysis is also
important as a tool for decision making in schools, districts, and states. Educational
leaders, after knowing the alignment status of the schooling system, can take corrective
action to appropriately align their educational system should poor alignment exist. Also,
when alignment is not possible or desirable, decision makers can take the appropriate
actions based upon the knowledge of alignment, misalignment, or the other results this
methodology could provide.
The index of alignment developed in this study may be used to provide evidence
of validity for those tests that are well aligned to their respective content standards. The
tools developed in this study could be utilized by school districts in order to comply with
the No Child Left Behind (NCLB) legislation. Alignment between standards and tests in
K-12 is a fundamental component of the NCLB mandate, and the US Department of
Education is closely monitoring this directive.
Limitations of the Study
The index defined in this research does not exhaust the totality of the alignment
between standards and assessment, although it does give a suitable indication of the
content (Content Focus) alignment category as it is addressed in current research. High
levels of alignment were not expected in this study because the definition of the KSUS
did not take into consideration any state test, and the KSUS were not designed for that
purpose. However, reasonable levels of alignment were found among the KSUS and state
tests in the two subject matters analyzed. It is important to emphasize that the alignment
results shown in this study do not constitute judgment about the quality of the state tests
since this study was intended as an exercise to illustrate the use of the methodology
developed.
Very little information about the state tests was collected. More information could
have helped to clarify the differences in the DOK scores and in the value of the index of
alignment among the state assessments. Although the DOK levels adopted by the raters (from 1 to 5) gave more room for selection than in other studies, where only three levels were used, some raters reported difficulty in using them. Nevertheless,
the generalizability results showed acceptable levels of agreement among raters.
The data available from NAEP and SAT tests were not the best data for making
comparisons with the levels of alignment of the index. NAEP is not given every year, so
these limitations could undermine the interpretation of the correlation results. However,
the results shown here, although provisional, could be interpreted as desirable trends
among the levels of alignment between state tests and college expectations and student
performance.
State tests with higher levels of alignment are presumed to be more oriented toward the expectations that higher education institutions hold for graduating high school seniors. The correlation among levels of alignment, NAEP, and SAT scores
established a relationship among levels of alignment, student performance in K-12, and
success in college. In order to reach this conclusion, some previous considerations were
needed. First, the dimensions of alignment (range, depth, and balance) should be
sufficient to give a suitable indication of content match. Second, the definition of an
alignment index as a linear function of the alignment criteria should be plausible enough
to genuinely represent a measurement of alignment. Third, the raters’ data should be
reliable. Fourth, the KSUS should be considered high standards. Fifth, higher indices of
alignment should indicate that state tests also comply with higher state standards. And
sixth, state tests aligned to higher standards should impact student performance
positively. All assumptions above are taken for granted for reasons already discussed in
this study.
Evidence of validity in this novel methodology cannot be acquired in a single
experiment, so additional utilization is necessary to accumulate evidence of validity for
this alignment approach.
Recommendations for Further Research
This is a work in progress. The methodology developed in this study could be
used to measure alignment in other subject matters and in other contexts. For example,
the index of alignment constructed here could be used as a variable to be compared with student achievement on the respective state tests and standards, since a well-aligned school system should have an impact on student outcomes. Moreover, the tools
developed in this study could also be used in other contexts such as the alignment
between content of instruction and standards.
Comparisons with other accountability features of state testing, such as ACT scores, K-12 graduation rates, and college admission rates, would also be valuable to perform. A closer look at the Missouri test should be taken in order to find reasons for its high DOK and high alignment scores in the English subject matter, since these could be the product of a coincidental factor unless a connection between Missouri’s state standards and the KSUS can be established.
The graphical tools developed proved to be useful in the presentation of results
from different perspectives. Alignment research could be advanced by enhancing the
computational tools introduced, which are aimed at discovering special patterns and
relationships among variables that eventually lead to the formulation of hypotheses that
can be tested in subsequent, more formal analysis. Further work is recommended in order
to extend and improve the methodology, as well as to accumulate evidence of validity for
this alignment approach through subsequent utilization.
REFERENCES
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard
University Press.
American Educational Research Association. (1999). Standards for educational and
psychological testing. Washington, DC: Author.
Baker, E. L., & Linn, R. L. (2000, Winter). Alignment: Policy goals, policy strategies,
and policy outcomes. The CRESST line. Stanford, CA: National Center for
Research on Evaluation, Standards, and Student Testing.
Bishop, J. (1998). The effect of curriculum-based external exit exam systems on student
achievement. Journal of Economic Education, 29(2), 171-183.
Blank, R. K., Kim, J. J., & Smithson, J. (2000). Survey results of urban schools
classroom practice in mathematics and science: 1999 Report (Monograph No. 2).
Norwood, MA: Systemic Research, Inc.
Blank, R. K., Porter, A., & Smithson, J. (2001). New tools for analyzing teaching,
curriculum and standards in mathematics & science. Washington, DC: Council of
Chief State Schools Officers.
Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (Eds.).
(1956). Taxonomy of educational objectives: The classification of educational
goals. Handbook I: Cognitive domain. New York: David McKay.
Brennan, R. L. (2001). Generalizability theory. New York: Springer-Verlag.
Buckendahl, C. W., Plake, B. S., Impara, J. C., & Irwin, P. M. (2000). Alignment of
standardized achievement tests to state content standards: A comparison of
publishers’ and teachers’ perspectives. Paper presented at the annual meeting of
the National Council on Measurement in Education, New Orleans, LA.
Center for Research on Evaluation, Standards, and Student Testing (CRESST). (n.d.).
Los Angeles, CA: University of California at Los Angeles. Retrieved November
20, 2002, from http://www.cse.ucla.edu/index.htm.
Cohen, S. A. (1987). Instructional alignment: Searching for a magic bullet. Educational
Researcher, 16(8), 16-20.
Conley, D. (2002). Standards for success: Annual report. Eugene, OR: University of
Oregon, Center for Educational Policy Research.
Conley, D., & Brown, R. (in press). State high school assessments and standards for
college success: Do they connect? Education Administration Quarterly.
Council of Chief State School Officers (CCSSO). (n.d.). Retrieved November 29, 2002,
from http://www.ccsso.org/.
Consortium for Policy Research in Education (2000). Bridging the K-12/postsecondary
divide with a coherent K-16 system (Policy Briefs). University of Pennsylvania.
Philadelphia, PA: Author.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of
behavioral measurements. New York: Wiley.
Curriculum Alignment Project (CAP). (n.d.). Indiana Department of Education. Retrieved
October 10, 2002, from http://www.niesc.k12.in.us/esc7sdev/cap1.htm and from
http://hammond.k12.in.us/curralign.htm.
Feuer, M. J., Holland, P. W., Green, B. F., Bertenthal, M. W., & Hemphill, F. C. (1999).
Uncommon measures: Equivalence and linkage among educational tests.
Washington, DC: National Academy Press.
Fuhrman, S. H. (1999, January). The new accountability. (Policy Briefs). Philadelphia,
PA: University of Pennsylvania, Consortium for Policy Research in Education.
Herman, J. L., Webb, N., & Zuniga, S. (2003). Alignment and college admissions: The
match of expectations, assessment, and educators perspectives. (CSE Technical
Report 593). Los Angeles, CA: University of California at Los Angeles. Center
for the Study of Evaluation.
Impara, J. C. (2001). Alignment: One element of an assessment’s instructional utility.
Paper presented at the annual meeting of the National Council of Measurement in
Education, Seattle, WA.
Impara, J. C., Plake, B. S., & Buckendahl, C. W. (2000). The comparability of norm-referenced achievement tests as they align to Nebraska’s language arts content
standards. Paper presented at the Large Scale Assessment Conference, Snowbird,
UT.
Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin,
112(3), 527-535.
Kirst, M. (1998). Improving and aligning K-16 standards, admission, and freshman
placement policies. Stanford, CA: Stanford University. National Center for
Postsecondary Improvement.
Kirst, M. & Venezia, A. (2001). Bridging the great divide between secondary schools and
postsecondary education. Phi Delta Kappa, 83(1), 92-97.
La Marca, P. M. (2001). Alignment of standards and assessment as an accountability
criterion. Practical Assessment, Research & Evaluation, 7(21).
La Marca, P. M., Redfield, D., Winter, P., Bailey, A., & Despriet, L. (2000). State
standards and state assessment systems: A guide to alignment. Washington, DC:
Council of Chief State School Officers.
Lashway, L. (1999). Holding schools accountable for achievement. (ERIC Document
Reproduction Service No. 434 381)
Le, V., Hamilton, L., & Robyn, A. (2000). Alignment among secondary and
postsecondary assessment in California [On-line], Crucial Issues in California
Education. Chapter 9. Retrieved September 29, 2002, from
http://pace.berkeley.edu/pace_crucial_issues.html.
Lewis, A. C. (1997). Figuring it out: Standards-based reforms in urban middle grades.
Retrieved October 13, 2002, from http://www.middleweb.com/figuring.html.
Marzano, R. J. (2001). Designing a new taxonomy of educational objectives. Thousand
Oaks, CA: Corwin Press Inc.
National Council on Education Standards and Testing (1992). Raising standards for
American education: A report to Congress. The Secretary of Education, the
National Education Goals Panel, and the American people. Washington, DC:
Author.
National Education Goals Panel (2000). Minnesota & TIMSS: Exploring high achievement
in eighth grade science. Washington, DC: Author.
National Education Association (NEA). (2002). Alignment of curriculum and tests to
standards. Retrieved August 21, 2002, from
http://www.nea.org/accountability/alignment.html.
Nelson, G. D. (2002). Benchmarks and standards as tools for science education reform.
American Association for the Advancement of Science. Retrieved October 19,
2002, from http://www.project2061.org/newsinfo/research/nelson/nelson1.html.
Porter, A. C., & Smithson, J. L. (2001a). Are content standards being implemented in the
classroom? A methodology and some tentative answers. In S. H. Fuhrman (Eds.),
From the capitol to the classroom: Standards-based reform in the states, Part II
(pp. 60-80). National Society for the Study of Education. Chicago, IL: University
of Chicago Press.
Porter, A. C., & Smithson, J. L. (2001b). Defining, developing, and using curriculum
indicators (Rep. No. RR-048). Philadelphia, PA: University of Pennsylvania,
Consortium for Policy Research in Education.
Porter, A. C. (2002). Measuring the content of instruction: Uses of research and practice.
Educational Researcher, 31(7), 3-14.
Powell, A. G. (1996). Motivating students to learn: An American dilemma. In S.
Fuhrman & J. O’Day (Eds.). Rewards and reform: Creating educational
incentives that work. San Francisco: Jossey-Bass.
Project 2061. (n.d.). American Association for the Advancement of Science. Retrieved
February 3, 2003, from http://www.project2061.org/.
Rothman, R., Slattery, J. B., Vranek, J. L., & Resnick, L. B. (2002). Benchmarking and
alignment of standards and testing (CSE Technical Report 566). Los Angeles,
CA: University of California at Los Angeles, Center for the Study of Evaluation.
Resnick, L. B., & Resnick, D. P. (1992). Assessing the thinking curriculum: New tools
for educational reform. In B. R. Gifford & M. C. O’Connor (Eds.), Changing
assessments: Alternative views of aptitude, achievement, and instruction (pp. 37-75). Boston, MA: Kluwer Academic.
Smith, M. S., & O’Day, J. A. (1991). Systemic school reform. In S. H. Fuhrman & B.
Malen (Eds.), The politics of curriculum and testing: The 1990 yearbook of the
politics of education association (pp. 233-267). New York, NY: Falmer Press.
Subkoviak, M. J. (1988). A practitioner’s guide to computation and interpretation of
reliability indices for mastery tests. Journal of Educational Measurement, 25(1),
47-55.
Tafel, J., & Eberhart, N. (1999). Statewide school-college (K-16) partnerships to improve
students’ performance. State Higher Education Executive Officers. Retrieved
February 8, 2003, from http://www.sheeo.org/publicat/pub-k16.htm.
The Bridge Project (2000). Strengthening K-16 transition policies. Stanford, CA:
Stanford University: Author. Retrieved November 20, 2002, from
http://www.stanford.edu/group/bridgproject.
Wainer, H. (1997). Visual revelations. Mahwah, NJ: Lawrence Erlbaum Associates,
Publishers.
Webb, N. L. (1997). Criteria for alignment of expectations and assessment in
mathematics and science education (Research Monograph No. 8). Washington,
DC: Council of Chief State School Officers.
Webb, N. L. (1999). Alignment of science and mathematics standards and assessments in
four states (Research Monograph No. 18). Madison, WI: National Institute for
Science Education.
Webb, N. L. (2001). Alignment analysis of State F Language Arts standards and
assessments Grades 5, 8, and 11. Washington, DC: Council of Chief State School
Officers.
Webb, N. L. (2002). An analysis of the alignment between mathematics standards and
assessment for three states. Paper presented at the annual meeting of the
American Educational Research Association, New Orleans, LA.
APPENDICES
Appendix A
Prototype Model (Excel) for the Alignment Index
Appendix B
Prototype Model (Software) for the Alignment Index
Appendix C
ENGLISH KEY KNOWLEDGE AND SKILLS
I. Reading Comprehension
IA. The student will use reading skills and strategies to understand
literature and informational texts.
IA.1. Use reading skills and strategies to understand a variety of informational
texts: instructions for software, job descriptions, college applications, historical
documents, government publications, newspapers, textbooks.
IA.2. Use monitoring and self-correction methods and know when to read aloud.
IA.3. Engage critically with the text: annotating, questioning, agreeing or
disagreeing, summarizing, critiquing, formulating own responses.
IA.4. Understand narrative terminology: author versus narrator, historical versus
implied author, historical versus present-day reader.
IA.5. Use reading skills and strategies to understand a variety of types of
literature: epic piece (Iliad) or lyric poem, narrative novels, and philosophical
pieces.
IA.6. Understand plots and character development in literature, including
characters’ motives, causes for actions, and the credibility of events.
IA.7. Understand vocabulary and content: subject-area terminology, connotative
and denotative meanings, idiomatic meanings.
IA.8. Understand basic beliefs, perspectives, and philosophical assumptions
underlying an author’s work: point of view, attitude, or values conveyed by
specific use of language.
IA.9. Use a variety of strategies to understand the origins and meanings of new
words: analyzing word roots and affixes, recognizing cognates, using context
clues, determining word derivations.
IA.10. Make supported inferences and draw conclusions based on textual
features: evidence in text, format, language use, expository structures,
arguments used.
IB. The student will be able to discuss with understanding the defining
characteristics of literature and techniques of a variety of forms and genres.
IB.1. Know the salient characteristics of major types and genres of literature:
novels, short stories, horror stories, science fiction, biographies,
autobiographies, poems, plays, etc.
IB.2. Distinguish the formal constraints of different types of texts:
Shakespearean sonnets versus free verse.
IB.3. Understand literary devices used to influence the reader and evoke
emotions: imagery, characterization, choice of narrator, use of sound, formal
and informal language.
IB.4. Be able to discuss with understanding the effects of author’s style and
literary devices on the overall quality of literary works: allusions, symbols,
irony, voice, flashbacks, foreshadowing, time and sequence, mood.
IB.5. Know archetypes, such as universal destruction, journeys and tests,
banishment, that appear across a variety of types of literature: American
literature, world literature, myths, propaganda, religious texts.
IB.6. Be able to discuss with understanding themes such as initiation, love and
duty, heroism, death and rebirth, that appear across a variety of literary works
and genres.
IB.7. Evaluate literature based on ambiguities, subtleties, contradictions in a
text; based on aesthetic qualities of style, such as diction or mood.
IC. The student will be familiar with a range of world literature.
IC.1. Have some familiarity with major literary periods of English and
American literature and their characteristic forms, subjects, and authors.
IC.2. Have some familiarity with authors from literary traditions beyond the
English-speaking world.
IC.3. Have some familiarity with major works of literature produced by
American and British authors.
ID. The student will be able to discuss with understanding the relationships
between literature and its historical and social contexts.
ID.1. Know major historical events that may be encountered in literature.
ID.2. Demonstrate familiarity with the concept that historical, social, and
economic contexts influence form, style, and point of view; and that social
influences affect author’s descriptions of character, plot, and setting.
ID.3. Demonstrate familiarity with the concept of the relativity of all historical
perspectives, including their own.
ID.4. Be able to discuss with understanding the relationships between literature
and politics: the political assumptions underlying an author’s work, the impact
of literature on political movements and events.
II. Writing
IIA. The student will know how to use basic grammar conventions to write
clearly
IIA.1. Identify parts of speech correctly and consistently: nouns, pronouns,
verbs, adverbs, conjunctions, prepositions, adjectives, interjections.
IIA.2. Use subject-verb agreement and consistent verb tense.
IIA.3. Use pronoun agreement, different types of clauses and phrases
appropriately: adverb clauses, adjective clauses, adverb phrases.
IIB. The student will know conventions of punctuation and capitalization
IIB.1. Use commas with nonrestrictive clauses and contrasting expressions.
IIB.2. Use ellipses, colons, hyphens, semi-colons, apostrophes and quotation
marks correctly.
IIC. The student will know conventions of spelling
IIC.1. Use a dictionary and other resources to spell new, unfamiliar, or difficult
words.
IIC.2. Differentiate between commonly confused terms: “its” and “it’s”, “affect”
and “effect.”
IIC.3. Know how to use the spellchecker function in word processing software
and know the limitations of relying upon a spellchecker.
IID. The student will use writing conventions to write clearly and
coherently
IID.1. Know and use several prewriting strategies: develop a focus, determine
the purpose, plan a sequence of ideas, use structured overviews, create outlines
IID.2. Use paragraph structure in writing: construct coherent paragraphs,
arrange paragraphs in logical order
IID.3. Use a variety of sentence structures appropriately in writing: compound,
complex, compound-complex, parallel, repetitive, analogous
IID.4. Present ideas so as to achieve overall coherence and logical flow in
writing; use appropriate techniques to maximize cohesion (transitions,
repetition)
IID.5. Use writing conventions and documentation formats: style sheet methods
such as MLA, APA; bibliography of sources
IID.6. Demonstrate development of a unique style and voice in writing in a
controlled fashion where appropriate
IID.7. Use words correctly. Use words that mean what the writer intends to say.
Use a varied vocabulary
IIE. The student will use writing to communicate ideas, concepts, emotions,
descriptions to the reader
IIE.1. Know the difference between a topic and a thesis
IIE.2. Articulate a position through a thesis statement and advance it using
evidence, example, counterargument that is relevant to the audience or issue at
hand
IIE.3. Use a variety of methods to develop arguments: use comparison-contrast
reasoning; develop and sustain logical arguments (inductive-deductive);
alternate appropriately between the general and the specific (make connections
between public knowledge and personal observation and experience)
IIE.4. Write to persuade the reader: anticipate and address counter arguments,
use rhetorical devices, develop accurate and expressive style of communication
(move beyond mechanics, add flair and elegance to writing)
IIE.5. Use strategies to adapt writing for different audiences and purposes:
include appropriate content; use appropriate language, style, tone, and structure;
consider audience’s background
IIE.6. Distinguish between formal and informal styles: formal paper, personal
reflections, informal letters, memos
IIE.7. Use appropriate strategies to write expository essays: include supporting
evidence, use information from primary and secondary sources, use charts,
graphs, tables and illustrations where appropriate, anticipate and address
reader’s biases and expectations, use technical terms and notations. Use
appropriate strategies and formats to write personal and business
correspondence: appropriate organizational patterns, formal language and tone
IIE.8. Use strategies to write fictional, autobiographical, and biographical
narratives: develop point of view and literary elements, present events in logical
sequence, convey a unifying theme or tone, use concrete and sensory language,
pace action
IIF. The student will use in priority fashion a variety of strategies to revise
and edit written work to achieve maximum improvement in time available
IIF.1. Review ideas and structure in substantive ways, improve depth of
information, logic of organization, rethink appropriateness of writing in light of
genre, purpose, and audience
IIF.2. Use feedback from others to revise own written work
III. RESEARCH SKILLS
IIIA. The student will understand and use research methodologies
IIIA.1. Formulate research questions, refine topics, develop a plan for research,
and organize what is known about the topic
IIIA.2. Use research to support and develop one’s own opinion, as opposed to
simply restating existing information or opinions
IIIA.3. Identify through research the major concerns and debates in a given
community or field of inquiry and address these in one’s writing
IIIA.4. Identify claims in one’s writing that require outside support or
verification
IIIB. The student will know how to find a variety of sources and use them
properly
IIIB.1. Collect information to narrow and develop a topic and support a thesis
IIIB.2. Understand the difference between primary and secondary sources
IIIB.3. Use a variety of primary and secondary sources, print or electronic:
books, magazines, newspapers, journals, periodicals, Internet
IIIB.4. Critically evaluate sources: discern the quality of the materials, qualify
the strength of the evidence and arguments, determine credibility, identify bias
and perspective of author, use prior knowledge to judge; particularly as applied
to Internet sources
IIIB.5. Use sources to write research papers: integrate information from sources,
logically introduce and incorporate quotations, synthesize information in a
logical sequence, identify different perspectives, identify complexities and
discrepancies in information, offer support for conclusions
IIIB.6. Understand the concept of plagiarism and how (or why) to avoid it:
paraphrasing, summarizing, quoting; particularly as applied to Internet sources
IV. CRITICAL THINKING SKILLS
IVA. The student will demonstrate connective intelligence
IVA.1. Be able to discuss with understanding how personal experiences and
values affect reading comprehension and interpretation
IVA.2. Show ability to make connections between the component parts of a text
one is reading or writing and the larger theoretical structures: presupposition,
audience, purpose, writer’s credibility or ethos, types of evidence or material
being used, and style (including correctness)
IVB. The student will demonstrate ability to think independently
IVB.1. Students should be comfortable formulating and expressing their own
ideas
IVB.2. Support one’s argument with logic and evidence that is relevant to one’s
audience and which explicates one’s position as fully as possible
IVB.3. Fully understand the scope of one’s argument and the claims underlying
it
IVB.4. Reflect on and assess the strengths and weaknesses of one’s ideas and
their expression
Appendix D
MATH KEY KNOWLEDGE AND SKILLS
I. COMPUTATION
IA. The student will know basic mathematics operations
IA.1. Use arithmetic operations with fractions (e.g., add and subtract by finding a
common denominator, multiply and divide, reduce)
IA.2. Use exponents and scientific notation
IA.3. Use radicals correctly
IA.4. Understand relative magnitude
IA.5. Calculate using absolute value
IA.6. Know terminology for real numbers, integers, rational numbers,
irrational numbers, and complex numbers
IA.7. Use the correct order of arithmetic operations, particularly demonstrating
facility with the Distributive Law
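The fraction arithmetic in IA.1 can be sketched with Python’s exact Fraction type; the particular fractions below are made up for illustration and are not part of the standards document:

```python
from fractions import Fraction

# Adding fractions via a common denominator: 1/2 + 1/3 = 3/6 + 2/6 = 5/6.
# Fraction keeps every result exact and automatically reduced.
a, b = Fraction(1, 2), Fraction(1, 3)
print(a + b)           # 5/6
print(a * b)           # 1/6
print(Fraction(4, 8))  # 1/2 (reduced)
```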
IB. The student will know and carefully record symbolic manipulations
IB.1. Use mathematical symbols and language appropriately (e.g., equal signs,
parentheses, superscripts, subscripts)
IC. The student will know and demonstrate fluency with mathematical
notation and computation
IC.1. Perform symbolic addition, subtraction, multiplication and division
IC.2. Perform appropriate basic operations on sets (e.g., union, intersection,
elements of, subsets, complement)
IC.3. Be comfortable with alternative symbolic expression
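The set operations named in IC.2 map directly onto Python’s built-in set type; a minimal sketch with made-up sets:

```python
# Union, intersection, relative complement, subset, and membership tests
A = {1, 2, 3, 4}
B = {3, 4, 5}

print(A | B)   # union: {1, 2, 3, 4, 5}
print(A & B)   # intersection: {3, 4}
print(A - B)   # elements of A not in B: {1, 2}
print(B <= A)  # subset test: False (5 is not in A)
print(3 in A)  # membership: True
```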
II. ALGEBRA
IIA. The student will know and apply basic algebraic concepts
IIA.1. Use the distributive property to multiply polynomials
IIA.2. Divide polynomials (e.g., long division)
IIA.3. Factor polynomials (e.g., difference of squares, perfect square trinomials,
difference of two cubes, and trinomials like x^2 + 3x + 2)
IIA.4. Add, subtract, multiply, divide, and simplify rational expressions including
finding common denominators
IIA.5. Understand properties and basic theorems of roots and exponents (e.g.,
(x^2)(x^3) = x^5 and (√x)^3 = x^(3/2))
IIA.6. Understand properties and basic theorems of logarithms (to bases 2, 10, and
e)
IIA.7. Know how to compose and decompose functions and find inverses of basic
functions
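IIA.6 names logarithms to bases 2, 10, and e. A small numeric sketch of one such base and of the change-of-base relation log_b(x) = ln(x) / ln(b), with an arbitrary test value:

```python
import math

x = 8.0
print(math.log2(x))  # 3.0, since 2^3 = 8
# change of base: log10(x) agrees with ln(x) / ln(10)
lhs = math.log10(x)
rhs = math.log(x) / math.log(10.0)
print(abs(lhs - rhs) < 1e-12)  # True
```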
IIB. The student will use various techniques to solve basic equations and
inequalities
IIB.1. Solve linear equations and absolute value equations
IIB.2. Solve linear inequalities and absolute value inequalities
IIB.3. Solve systems of linear equations and inequalities using algebraic and
graphical methods (e.g., substitution, elimination, addition, graphing)
IIB.4. Solve quadratic equations using various methods and recognize real
solutions
IIB4a. Factoring
IIB4b. Completing the square
IIB4c. The quadratic formula
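The three methods in IIB.4 give the same real solutions. A sketch (the function name is hypothetical) using the quadratic formula, checked against the factorization x^2 + 3x + 2 = (x + 1)(x + 2):

```python
import math

def solve_quadratic(a, b, c):
    """Real solutions of a*x^2 + b*x + c = 0 via the quadratic formula."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # no real solutions
    root = math.sqrt(disc)
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

print(solve_quadratic(1, 3, 2))  # [-2.0, -1.0]
print(solve_quadratic(1, 0, 1))  # [] -- x^2 + 1 = 0 has no real roots
```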
IIC. The student will distinguish between expression, formula, equation, and
function
IIC.1. Distinguish between expression, formula, equation, and function and
recognize when simplifying, solving, substituting in, or evaluating is appropriate
(e.g., expand the expression (x + 3)(x + 1), substitute a = 3 and b = 4 into the
formula a^2 + b^2 = c^2, solve the equation 0 = (x + 3)(x + 1), and evaluate the
function f(x) = (x + 3)(x + 1)).
IIC.2. Understand the concept of a function beyond it being a type of algebraic
expression
IIC.3. Know how to use polynomials and exponential functions in applications
IIC.4. Use a variety of models (e.g., written statement, algebraic formula, table of
input-output values, graph) to represent functions, patterns, and relationships
IIC.5. Understand terminology and notation used to define functions (e.g., domain,
range)
IIC.6. Understand the general properties and characteristics of basic types of
functions (e.g., polynomial, rational, exponential, logarithmic, trigonometric)
IID. The student will understand the relationship between equations and
graphs
IID.1. Understand basic forms of the equation of a straight line and how to graph
the line without a calculator
IID.2. Understand the basic shape of a quadratic function and the
relationship between the roots of the quadratic and the zeroes of the function
IID.3. Know the basic shape of the graph of an exponential function and log
functions, including exponential decay
IIE. The student will know how to use algebra both procedurally and
conceptually
IIE.1. Recognize which type of model (i.e., linear, quadratic, exponential) best fits
the context of a basic equation
IIF. The student will demonstrate ability to algebraically work with formulas
and symbols
IIF.1. Know formal notation (e.g., sigma notation, factorial representation) and
series of geometric and arithmetic progressions
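The sigma notation and geometric progressions in IIF.1 can be checked against the closed form sum_{k=0}^{n-1} r^k = (1 - r^n)/(1 - r); a quick sketch with illustrative values of r and n:

```python
# Finite geometric series: 1 + r + r^2 + ... + r^(n-1)
r, n = 2, 10
by_sum = sum(r**k for k in range(n))  # term by term, as sigma notation reads
by_formula = (1 - r**n) // (1 - r)    # closed form, valid for r != 1
print(by_sum, by_formula)  # 1023 1023
```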
III. TRIGONOMETRY
IIIA. The student will know and understand basic trigonometric principles
IIIA.1. Know the definitions of sine, cosine, and tangent using right triangle
geometry and similarity relations
IIIA.2. Understand the relationship between a trigonometric function in standard
form and its corresponding graph (e.g., domain, range, amplitude, period, phase
shift, vertical shift)
IIIA.3. Know and use identities for sum and difference of angles (e.g., sin (x ± y),
cos (x ± y), tan (x ± y))
IIIA.4. Understand periodicity and recognize graphs of periodic functions,
especially the trigonometric functions
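The angle-sum identities in IIIA.3 can be verified numerically; a sketch checking sin(x + y) = sin(x)cos(y) + cos(x)sin(y) at two arbitrary angles:

```python
import math

x, y = 0.7, 1.2  # arbitrary angles in radians
lhs = math.sin(x + y)
rhs = math.sin(x) * math.cos(y) + math.cos(x) * math.sin(y)
print(abs(lhs - rhs) < 1e-12)  # True
```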
IV. GEOMETRY
IVA. The student will understand and use basic plane and solid geometry
IVA.1. Know properties of similarity, congruence and parallel lines cut by a
transversal
IVA.2. Know how to figure area and perimeter of basic figures
IVA.3. The student will understand the ideas behind simple geometric proofs and
be able to develop and write simple geometric proofs, such as the Pythagorean
theorem, the fact that there are 180 degrees in a triangle, and the fact that the area
of a triangle is half the base times the height
IVA.4. Use geometric constructions to complete simple proofs, and to solve
problems
IVA.5. Use similar triangles to find unknown angle measurements and lengths of
sides
IVA.6. Visualize solids and surfaces in 3-dimensional space
IVA.7. Know basic formulae for volume and surface area for three-dimensional
objects
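A few of the basic area, perimeter, and volume formulas behind IVA.2 and IVA.7, with made-up dimensions:

```python
import math

w, h, r = 4.0, 5.0, 3.0
print(w * h)                         # rectangle area: 20.0
print(2 * (w + h))                   # rectangle perimeter: 18.0
print(0.5 * w * h)                   # triangle area (half base times height): 10.0
print(math.pi * r**2)                # circle area
print((4.0 / 3.0) * math.pi * r**3)  # sphere volume
```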
IVB. The student will know analytic (i.e., coordinate) geometry
IVB.1. Know geometric properties of lines (e.g., slope, and midpoint of a line
segment) and the formula for the distance between two points
IVB.2. Use the Pythagorean Theorem and its converse and properties of special
right triangles (e.g., 30°-60°-90° triangle) to solve mathematical and real-world
problems (e.g., ladders, shadows, poles)
IVB.3. Recognize geometric translations algebraically
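The ladder problems named in IVB.2 reduce to the Pythagorean theorem; a sketch with hypothetical numbers:

```python
import math

# A 5 m ladder leans against a wall with its foot 3 m from the base;
# the height reached satisfies height^2 + 3^2 = 5^2 (a 3-4-5 right triangle).
ladder, foot = 5.0, 3.0
height = math.sqrt(ladder**2 - foot**2)
print(height)  # 4.0
```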
IVC. The student will understand basic relationships between geometry and
algebra
IVC.1. Understand that objects and relations in geometry correspond to objects and
relations in algebra (e.g., a line in geometry corresponds to a set of ordered pairs
satisfying an equation ax + by = c).
IVC.2. Know the algebra and geometry of circles, and (for those who intend to
study calculus) parabolas and ellipses
IVC.3. Use trigonometry for examples of algebraic/geometric relationships,
including the Law of Sines/Cosines
V. MATHEMATICAL REASONING
VA. The student will use mathematical reasoning to solve problems
VA.1. Use inductive and deductive reasoning in basic arguments
VA.2. Use geometric and visual reasoning
VA.3. Use multiple representations (e.g., analytic, numerical, geometric) to solve
problems
VA.4. Learn to solve multi-step problems
VA.5. Use a variety of strategies to revise solution processes
VA.6. Experience both proof and counter example in problem solutions
VA.7. The student will be familiar with the process of abstracting mathematical
models from word problems, geometric problems, and applications and
interpreting solutions in the context of these source problems.
VB. The student will be able to work with mathematical notation to solve
problems and to communicate solutions
VB.1. Translate simple statements into equations (e.g., "Bill is twice as old as
John" can be expressed by the equation b=2j)
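VB.1’s example translates directly into code. With a hypothetical extra fact that the two ages sum to 18, substitution solves the resulting system b = 2j, b + j = 18:

```python
# "Bill is twice as old as John": b = 2j.  Adding b + j = 18 gives 3j = 18.
total = 18
j = total / 3  # John's age
b = 2 * j      # Bill's age
print(j, b)  # 6.0 12.0
```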
VB.2. Understand the role of written symbols in representing mathematical ideas
and the precise use of special symbols of mathematics
VC. The student will know a select list of mathematical facts and know how to
build upon these
VC.1. The student will know a select list of mathematical facts and know how to
build upon these.
VD. The student will know how to estimate
VD.1. Be familiar with decimal approximation of fractions
VD.2. Know when an estimate or approximation is sufficient in problem situations
in place of exact answers
VD.3. Recognize the accuracy of an estimation
VD.4. Know how to make and use estimates
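VD.1 and VD.3 can be illustrated together: a decimal approximation of a fraction and a bound on its error (the fraction and precision are chosen for illustration):

```python
# 1/3 = 0.333..., so the four-place approximation 0.3333 is off by
# less than half a unit in the last place, i.e., less than 0.00005.
approx = round(1 / 3, 4)
print(approx)                      # 0.3333
print(abs(1 / 3 - approx) < 5e-5)  # True
```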
VE. The student will understand the appropriate uses of calculations and
their limitations
VE.1. Recognize misinformation that can arise from calculator use
VE.2. Perform experiments on the calculator
VE.3. Plot useful graphs
VF. The student will be able to generalize and to go from specific to abstract
and back again
VF.1. Determine the mathematical concept from the context of an external
problem, solve the problem, and interpret the math solution in the context of the
problem
VF.2. Student will know how to use specific instances of general facts and how to
look for general results that extend particular ones
VG. The student will demonstrate active participation in the process of
learning mathematics
VG.1. Be willing to experiment with problems that have multiple solution methods
VG.2. Demonstrate an understanding of the mathematical ideas behind the steps of
a solution as well as the solution
VG.3. Show an understanding of how to modify patterns and solution strategies to
obtain different solutions
VG.4. Recognize when a proposed solution does not work, analyze why, and use
the analysis to seek a valid solution
VH. The student will recognize the broad range of applications of
mathematical reasoning
VH.1. Know some mathematical applications used in other fields (e.g., carbon
dating, exponential growth, predator/prey models, periodic motion and the
interactions of waves, amortization tables)
VH.2. Know some of the roles mathematics has historically played and continues
to play