Marion_Demonstration Authority_062316_TILSA

The Innovative Assessment and Accountability
Demonstration Authority under ESSA
Scott Marion, Center for Assessment
CCSSO’s TILSA SCASS
Philadelphia, PA
June 23, 2016
Advance Organizer
• Overview of the Innovative Assessment and
Accountability Demonstration Authority under the
Every Student Succeeds Act
• Introduction to New Hampshire’s Performance
Assessment of Competency Education (PACE)
Scott Marion_Demonstration Authority (TILSA)_June 23, 2016
Innovative Assessment and Accountability
• Allows for a pilot for up to seven (7) states to use
competency-based or other innovative assessment
approaches for use in making accountability determinations
• Initial demonstration period of three (3) years, with a two (2)
year extension based on a satisfactory report from the Director
of the Institute of Education Sciences (IES), plus a potential two (2)
year waiver
• Rigorous assessment, participation, and reporting
requirements and subject to a peer review process
• May be used with a subset of districts based on strict
“guardrails,” with a plan to move statewide by the end of the
extension
Innovative Assessment System (Sec. 1204)
An Innovative Assessment System means a system of
assessments that may include:
1) competency-based assessments, instructionally embedded
assessments, interim assessments, cumulative year-end
assessments, or performance-based assessments that
combine into an annual summative determination for a
student, which may be administered through computer
adaptive assessments;
2) assessments that validate when students are ready to
demonstrate mastery or proficiency and allow for
differentiated student support based on individual learning
needs.
Assessment requirements & flexibility
• Assessments are not Required to be the Same Statewide
– States can pilot the assessment system with a subset of
districts before scaling the system statewide by the end of the
Demonstration Authority
• Assessments may Consist Entirely of Performance Tasks
– States can design an assessment or system of assessments
that consists of all performance tasks, portfolios, or extended
learning tasks [they can now!]
• Assessments may be Administered When Students Are
Ready
– States can assess students when they are ready to
demonstrate mastery of standards and competencies as
applicable
Four Major Guardrails
Assessment Quality
• System composed of high-quality assessments that
support the calculation of valid, reliable, and
comparable annual determinations and
provide useful information to relevant stakeholders
Comparability
• Produce yearly, student-level annual
determinations that are comparable across LEAs
Scale Statewide
• Must have a logical plan to scale up the innovative
assessment system statewide
Demographic Similarity
• Make progress toward achieving high-quality and
consistent implementation across demographically
diverse LEAs
Timeline-1
Rules
• ED has filed an intent to promulgate rules with OMB
• Draft rules likely early fall (Sept-Oct)
Application
• Application based on final rules
• Likely released early winter
Awards
• All indications are that awards cannot happen until
the next administration
Timeline-2
Initial
• Up to seven states may be awarded an initial 3-year demonstration authority
• Up to four states may be part of a consortium
IES
• Progress reviewed by “Director of IES” after 3 years
• Additional 2 years based on successful IES review
Expansion
• Secretary of Ed may extend Authority to additional states after 3 years
• Initial states may request an additional 2-year extension
Transition
• At the end of the authority, the Secretary, based on peer review, will
determine if the state can fully transition to the pilot system
Does a state need a demonstration authority?
Used by permission from Jenny Poon, CCSSO.
1. Will districts play a role in determining which assessments
count toward accountability?
– No: Model: One statewide system (Permissible Without Pilot)
– Yes: go to question 2
2. Will districts be allowed to use locally-designed assessments
for accountability?
– Yes: Model: State-Validated Locally-Designed Systems** (Requires Pilot)
– No, only nationally-recognized assessments: go to question 3
3. Will district-selected, nationally-recognized assessments be allowed in
years other than high school?*
– No: Model: Districts Select from State-Approved High School
Assessments* (Permissible Without Pilot)
– Yes: Model: Districts Select from State-Approved
Assessments (Requires Pilot)
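The decision flow on this slide can be read as a small function. The sketch below is illustrative only: the function name, parameter names, and the `Outcome` type are assumptions introduced here, while the model labels and the pilot/no-pilot results are taken from the slide.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    """Result of walking the slide's decision flow (hypothetical type)."""
    model: str
    requires_pilot: bool


def needs_demonstration_authority(
    districts_choose_assessments: bool,
    locally_designed_allowed: bool = False,
    beyond_high_school: bool = False,
) -> Outcome:
    """Answer the slide's three yes/no questions in order."""
    # Question 1: do districts play a role in choosing accountability assessments?
    if not districts_choose_assessments:
        return Outcome("One statewide system", False)
    # Question 2: may districts use locally-designed assessments?
    if locally_designed_allowed:
        return Outcome("State-validated locally-designed systems", True)
    # Question 3: are district-selected, nationally-recognized assessments
    # allowed in years other than high school?
    if beyond_high_school:
        return Outcome("Districts select from state-approved assessments", True)
    return Outcome(
        "Districts select from state-approved high school assessments", False
    )


if __name__ == "__main__":
    # A state letting districts use locally-designed assessments needs the pilot.
    print(needs_demonstration_authority(True, locally_designed_allowed=True))
```

Only the two branches that return `requires_pilot=True` correspond to the "Requires Pilot" boxes on the slide.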
It’s the System!
State Vision
• Learning Reform
• PD Reform
• Assessment Reform
• Accountability Reform
State and District Roles
Innovative pilot initiatives will
require more work for local
districts than simply maintaining
the current system, but it is the
“right work”:
• Commit to the shared educational
vision;
• Build capacity among the staff;
• Structure budgets to provide funding;
• Create time for collaborative and
individual planning;
• Implement record keeping and student
management systems;
• Embrace the notion that increased
flexibility brings with it increased
responsibility.
Theory of Action
• A theory of action should make the links explicit
among the various aspects of the system
• The theory of action should reveal testable
hypotheses that can be verified with evidence
through the implementation of the pilot
• This accumulation of evidence would support the ongoing validation of the assessment and accountability
system.
Example of a theory of action
• State & pilot districts collaborate to design innovative
learning & assessment system
• Schools are structured to create opportunities for adult &
student personalized learning
• Collaborative, focused & sustained professional development
• Balanced assessment yields useful information
• Educator practices & student engagement improve
• Results are used to improve instruction
• The pilot successfully expands
• Student learning improves
Cross-cutting themes for all pilots…
Equity
Transparency
Continuous Improvement
Help is on the way!
The Center for Assessment and KnowledgeWorks, with support from
the Nellie Mae Foundation, are producing the following policy and
technical briefs, to be released from next week through the end of
the summer:
1. Creating a State Vision to Support the Design and
Implementation of An Innovative Assessment and Accountability
System
2. Ensuring and Evaluating Assessment Quality
3. Addressing Accountability Issues including Comparability
4. Professional Learning and Developing Assessment Literacy
5. Constructing a Research and Evaluation Plan
6. Building Capacity and Resources
7. Scaling Statewide and Ensuring Long Term Sustainability
What is PACE?
• The US Department of Education (USED) granted the New
Hampshire Department of Education (NH DOE) a waiver from
No Child Left Behind (NCLB) to implement the Performance
Assessment of Competency Education (PACE) as a pilot
assessment and accountability system for a limited
number of school districts.
• Led by the NH DOE in close partnership with the
district leads and the Center for Assessment
NH’s Request to USED for a Pilot (NCLB)
• NH argued that the focus on yearly external
assessment-driven accountability can choke off
richer reform conversations
• Therefore, NH DOE requested that USED waive
certain provisions around state testing, but NOT
reporting!
Initial PACE Expectations
• State-model competencies aligned with college and career
outcomes provide the main learning targets
• Instructional system to support student learning of competencies
– Includes strategies to personalize learning
• Locally designed assessment system to measure student
achievement and growth related to competencies
• High quality performance assessments occupy a visible place in
the local assessment system
• Smarter Balanced assessment administered at least once in
elementary, middle and high school
• The use of at least one common (to all PACE districts)
performance assessment in grades/subjects not assessed by
Smarter Balanced (17)
Phasing in Districts
• NH DOE recognizes the challenges in such a major shift in
orientation.
• The State wants to move slowly to ensure that districts
are truly ready (e.g., capacity) to take on such initiatives.
– We are clear that this requires more rather than less work from
districts, but we argue it is better work!
• NH DOE asked for a “pilot” in 2014-2015 (i.e., “proof of
concept”) of a very small number of Tier 1 districts (4
small districts).
• The State expanded to four more districts in 2015-2016
(also considered Tier 1) and will then move to a somewhat
larger pilot in 2016-2017, assuming there are districts
meeting the entrance requirements and we get approval!
Scott Marion_ESSA_NH's PACE_May 2, 2016
Organization to PACE Scaling – Second Year
• Tier 3 Planning Districts: developing CBE (competencies,
instruction, assessment, grading)
• Tier 2 Preparing Districts: preparing for implementation; PD in
performance assessment development and implementation
• Tier 1 Implementing Districts: PD in calibration/scoring practices
• DoE/NH Learning Initiative
• PACE Management Team
• In-state Partners: Center for Assessment, New Hampshire Learning
Initiative, Reaching Higher NH
• Institutional Supports: State Board, Governor’s Office, NH Legislature
• National Partners: Foundations, Center for Collaborative Education,
Center for Innovation in Education/Stanford
But Why Change?
• We need a more intense focus on maximizing student
learning, engagement, and outcomes
• NCLB focused admirably on equity, but excellence needs to
be incentivized as well
• We need to create space for innovative approaches for
moving from good to great while studying the
implementation and results
• Provides an opportunity for deep engagement of our local
educators and leaders
• Allows NH to serve as a model for other states
– Many other Innovative Lab Network (a group supported by the
Council of Chief State School Officers) states are anxious to follow
NH’s lead
Why PACE?
• Research on organizational change/reform and
human learning supports the notion that real
change/learning must be internally motivated
• “Drive (motivation) is fueled by a combination of
autonomy, mastery and purpose.” (Daniel Pink)
• Yet, current accountability systems, whether
motivated by ESEA waivers or state designed, are all
essentially externally oriented
• PACE provides an opportunity to shift to a more
internal orientation
Key Goals and Design Principles of PACE
• Focuses on college and/or career outcomes and
promotes deeper learning for all students
• A clear commitment towards improving the achievement
of educationally-disadvantaged students
• A clearly-described internal accountability process
supported by the local boards of education
• Commitment of resources (local and state) necessary to
ensure the plan’s success
• Leadership and educator capacity to design, implement,
support and sustain the system
What Do Students Experience?
Competency System
• Time is variable but
expectations are not
• Focus on maximizing
engagement and deep
learning
• Students participate in rich
assessments to measure
deep learning
• Teachers facilitate and
support student learning
Traditional System
• Expectation that students
move at same pace
• Focus on “average”
• Assessments are
constrained by time and
agnostic to what students
have actually learned
• Limited student
engagement and agency
• Focus on comparability and
standardization
Or more eloquently…
The business of schools is to invent tasks,
activities, and assignments that the students
find engaging and that bring them into
profound interactions with content and
processes they will need to master to be judged
well educated.
Schlechty, P.C. (2001) Shaking up the schoolhouse. San Francisco: Jossey-Bass
What is PACE? – Water Tower Proposal!
Geometry PACE Common Task
• The Problem: Your town’s population is predicted to
increase over the next 3 years. As one of the town
planners, you are asked to address this issue in terms of
the town’s water supply. In order to meet the future
needs of the town, you need to make a proposal to add a
water tower somewhere on town property that will be
capable of holding 45,000 ± 2,000 cubic feet of water.
The town is looking for a water tower to contain the
most amount of water while using the least amount of
construction material.
• Student Task: Your job is to prepare a proposal that can
be submitted to the town planning committee. Using
your calculations of surface area and volume for the two
designs, describe and analyze the characteristics that
lead you to a final recommendation.
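The kind of calculation the task asks for can be sketched in a few lines. The slide does not say which two designs students compare, so the closed cylinder below is an assumption, as are the function names; the only value taken from the task is the 45,000 cubic foot target. For a fixed volume V, the surface area A(r) = 2πr² + 2V/r is smallest where r³ = V/(2π), which makes the height equal to the diameter.

```python
import math


def cylinder_height(volume: float, radius: float) -> float:
    """Height of a cylinder with the given volume and radius."""
    return volume / (math.pi * radius ** 2)


def cylinder_surface_area(volume: float, radius: float) -> float:
    """Surface area of a closed cylinder (top, bottom, side) holding `volume`."""
    height = cylinder_height(volume, radius)
    return 2 * math.pi * radius ** 2 + 2 * math.pi * radius * height


def best_radius(volume: float) -> float:
    """Radius that minimizes surface area: solve dA/dr = 4*pi*r - 2V/r^2 = 0."""
    return (volume / (2 * math.pi)) ** (1 / 3)


if __name__ == "__main__":
    target_volume = 45_000  # cubic feet, from the task statement
    r = best_radius(target_volume)
    h = cylinder_height(target_volume, r)
    area = cylinder_surface_area(target_volume, r)
    print(f"radius={r:.1f} ft, height={h:.1f} ft, surface area={area:.0f} sq ft")
```

A student comparing two designs would repeat the surface-area calculation for the second shape at the same volume and argue from the difference.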
Solar Cooker
MS Science
Task:
• Essential Question: How is energy
transferred between places and
converted between types?
• You are working for a company
that wants to find affordable and
environmentally-friendly ways to
reduce the need for wood and
charcoal when cooking.
• You have been tasked to create a
device that uses renewable
energy.
• You and a group will research,
design, build, and test a solar
cooker, applying everything you
have learned about energy this
past quarter.
• Your final goal is to change the
temperature of a cup of water.
Standards:
• NGSS 4-PS3-2: Make observations to
provide evidence that energy can be
transferred from place to place by
sound, light, heat, and electric
currents and NGSS 4-PS3-4: Apply
scientific ideas to design, test, and
refine a device that converts energy
from one form to another.
• NGSS 4-ESS3-1: Obtain and combine
information to describe that energy
and fuels are derived from natural
resources and their uses affect the
environment. Standard calls for
examples of renewable energy
sources such as sunlight.
• NGSS 4-PS3-4: The standard gives a
passive solar heater that converts
light into heat as an example.
PACE Reality
• While the districts are certainly engaged in a
competency-based approach to organizing
instruction and assessment, trying to meet the
requirements of USED makes this more of a
hybrid approach
• This tension plays itself out in many of the
things we will discuss today
Reciprocal Accountability in New Hampshire
The creation of the PACE accountability option reflects NH
DOE’s belief that school accountability works best if the
responsibility for design and implementation is shared by
districts and the state, rather than top-down mandates.
Known as “reciprocal accountability,” districts and schools
are responsible for determining and reporting on local
accountability measures, while the state is responsible for
support and oversight in helping districts establish strong
accountability systems (Marion & Leather, 2015, p. 9)
– For those old enough, “school delivery” or “OTL standards” were part of the
original conceptions of standards-based reform (e.g., Smith & O’Day, 1991)
Four Main Issues/Concerns for USED
• Assessment Quality
• Alignment
• Comparability
• Comparable Annual Determinations
It’s the System!
• Our focus is on the assessment system and not on
each individual assessment
• This is not to say that we don’t care about the
quality, alignment, comparability, and reliability of
individual tasks, but other than the common tasks,
we don’t have to care as much as some think
• For example, we know the reliability of a 10-point
test is relatively low, but if we had ten, 10-point tests
all tied to the same domain, we would not be that
worried…
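One standard way to make this point concrete is the Spearman-Brown prophecy formula, which predicts the reliability of a composite of k parallel tests from the reliability of a single test. The slide does not name this formula, and the single-test reliability of 0.50 used below is an assumed illustrative value, not PACE data. The formula also assumes the tests are parallel measures of the same domain, which real local assessments only approximate.

```python
def spearman_brown(single_reliability: float, k: float) -> float:
    """Predicted reliability when a test is lengthened by a factor of k.

    Uses rho_k = k * rho / (1 + (k - 1) * rho) and assumes parallel tests.
    """
    if not 0.0 <= single_reliability <= 1.0:
        raise ValueError("single_reliability must be between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return k * single_reliability / (1 + (k - 1) * single_reliability)


if __name__ == "__main__":
    single = 0.50  # assumed reliability of one 10-point test (illustrative)
    combined = spearman_brown(single, 10)
    print(f"one test: {single:.2f}, ten tests combined: {combined:.2f}")
    # prints "one test: 0.50, ten tests combined: 0.91"
```

Under these assumptions, ten modestly reliable assessments tied to the same domain support a far more dependable determination than any one of them alone.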
What do the data look like?
• Each district’s assessment system is composed of local
assessments tied to specific competencies (e.g., 6-12
competencies/course)
– The summative (e.g., end of unit) assessments are the only data that
count towards competency determinations
– These are often performance assessments, but not always
• Local (to each district) district-level performance
assessments
• A PACE common performance assessment in all
grades/subjects where Smarter Balanced is not
administered
• Smarter Balanced in a limited number of grades/subjects
Combining Multiple Measures
• Smarter Balanced in select grades
• PACE Common Performance Task
• Local performance assessments for Competency 1, Competency 2,
Competency 3, and Competency 4, feeding District-Level
Competency Scores
• Result: PACE Comparable Annual Determinations
Ensuring quality
• In a reciprocal system, the state needs to ensure
quality and comparability
• Our initial focus on quality has been on the PACE
Common Tasks
• Our theory of action states that the common
task development and review processes will
positively influence the quality of local tasks
• We have concerns about the quality of local
tasks, but we have larger concerns about
squashing local agency
Key Assessment Issues
• Clarity about the meaning of “common”
• Task design to elicit evidence of complex
thinking
• Understanding the differences in utility
between generic and task-specific rubrics
• Clarifying the limits of scaffolding for
summative and instructional tasks
Hierarchy of “common”
“Exact”
• Same exact task, rubrics, learning targets, curricular setting,
and admin conditions and timing
“Common”
• Same task “frame” (but allowance for choice), learning
targets, rubrics, curricular setting, admin conditions, but
different timing
“Common”
• Same task “frame,” learning targets, rubrics, admin conditions,
but different curricular settings and timing
Uncommon
• Same learning targets, rubrics, admin conditions, but different
task stimuli, curricular settings and timing
Evidence Centered Design
Student Model
Evidence Model
Task Model
A continuum of performance assessments of deeper
learning (Linda Darling-Hammond)
We’re in the middle of the continuum, but moving in the right direction!
Rubrics
• Districts have a preference for generic rubrics
that apply to a wide variety of tasks
• Most measurement professionals prefer task-specific
rubrics to improve validity and reliability
• We are trying to bridge the gap by working with
district leaders to create anchor papers for
each score point
Scaffolding
• Scaffolding is a critical instructional action to further
students’ development
• Scaffolding in an assessment is a potential threat to
comparability
• We can create general rules around scaffolding (e.g.,
scoreable product must be independent), but they won’t
work for the range of tasks
– e.g., a 1-hour on-demand PBA to a 3-week project
• Therefore, each task must have specific, common (exact!)
rules for scaffolding
– Of course, the more scaffolding—even if common—the greater the
risk of non-comparability with independent assessments
Alignment
• We care that all students are provided an
opportunity to learn (OTL) the content and skills
expected at each grade level
• Having an assessment system aligned to the
appropriate standards is one way to check for this
OTL
Alignment and Generalizability
• There is a belief that having all students take the same
assessment (items) at the same time is the only way to
provide evidence regarding this OTL
– That belief is both right and wrong, but mostly wrong!
• We don’t care, per se, that a student got a particular set of
items right or wrong, but we do care that these scores may
tell us what the student knows relative to the underlying
knowledge and skills
– This is the heart of a validity argument
• Most state tests do a poor job of embodying these
underlying knowledge and skills…but they are standardized!
Alignment
• We need to get closer to the learning, curriculum, and
instruction to have stronger evidence of alignment
• Capturing assessment information from throughout the
year will allow us to support stronger claims of depth,
breadth, and range (key alignment dimensions)
• The following is an excerpt from just one district’s
assessment map from one grade documenting the
alignment of the various assessments to just one of the
major competencies
Grade 3 Math, Competency #1
C1~Operations & Algebraic Thinking: Students will demonstrate the ability to compute accurately, make reasonable estimates,
understand meanings of operations and use algebraic notation to represent and analyze patterns and relationships.
Performance indicators, with RCC codes and CCSS standards:
• Students will be able to represent and solve problems
involving multiplication and division.
– C1PI1a: OA.1, OA.2
– C1PI1b: OA.3
– C1PI1c: OA.4, OA.6, OA.7
• Students will understand properties of multiplication
and the relationship between multiplication and division.
– C1PI2: OA.5
• Students will be able to solve problems involving the four
operations, and identify and explain patterns in arithmetic.
– C1PI3a: OA.8
– C1PI3b: OA.9
– C1PI3c
[Assessment map grid: columns U1 through U9 plus Beginning, Mid, and End;
cells hold assessment codes such as W, Ch, C, SW, O, and WC.]
Collecting and analyzing such maps, as well as examining a sample
of assessments that comprise the maps, will allow us to document
the alignment of the assessment system and the required standards
and competencies
Tight-Loose
• As Grant Wiggins said, we can have “standards
without standardization”
• Too tight -- we choke off innovation, local agency,
and personalization
• Too loose -- it is difficult to support any claims of
comparability or technical quality that we might
want
• We think the data illustrate that we are on track
to achieving the right balance of tight-loose
Ongoing Implementation Challenges
• Assessment literacy and capacity
• Personalizing instruction
• Clarity of expectations and communications
• Time and money
Yes, this is hard!!
• As we tell other states, this is not for the faint of heart!
For more information:
Center for Assessment
www.nciea.org
Scott Marion
[email protected]