Managing the test construction process - NATO

Ülle Türk
Estonian National Defence College
STANAG 6001 Testing Conference
SHAPE, Mons, Belgium
September 2013
Processes in testing
• The test development process
• The test assembling process
• The test delivery process
• Marking, grading and reporting results
• Monitoring and review
The test development process
Decision to provide test → Planning → Design → Try-out → Informing stakeholders → Final test specifications
The test assembling process
Test specifications → Producing materials → Quality control → Constructing tests → Live test materials
The test delivery process
Live test materials → Arranging venues → Registering test takers → Preparing materials → Administering the test → Returning the materials → Returned materials
Marking, grading and reporting results
Returned test materials → Marking → Grading → Reporting of results
Monitoring and review
• The purpose is to establish whether the test is working properly and, if not, what needs to be changed.
• Monitoring is part of the routine process of producing and using a test.
• Periodic reviews are done occasionally, outside of the regular operation of a test.
The test assembling process
Test specifications → Producing materials → Quality control → Constructing tests → Live test materials
Timeline
• Testing session 1 (February-March)
• Setting grade boundaries
• Analysing the task and item performance
• Producing new tasks
• Initial review
• Improving tasks
• Second review
• Testing session 2 (June-July)
• Improving tasks
• Third review
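The slides do not say which statistics are used when analysing task and item performance. As a minimal sketch, assuming dichotomously scored items (1 = correct, 0 = incorrect) and classical item analysis, two commonly reported indices are the facility value (proportion correct) and a discrimination index (here the correlation between the item score and the score on the rest of the test). The function names and the sample data are hypothetical and purely illustrative.

```python
from statistics import mean, pstdev


def facility(item_scores):
    """Proportion of test takers who answered the item correctly."""
    return mean(item_scores)


def discrimination(item_scores, rest_scores):
    """Pearson correlation between the 0/1 item score and the rest-of-test score.

    Returns None when either variable has no variance (for example, when
    every test taker answered the item correctly).
    """
    sx, sy = pstdev(item_scores), pstdev(rest_scores)
    if sx == 0 or sy == 0:
        return None
    mx, my = mean(item_scores), mean(rest_scores)
    cov = mean([(x - mx) * (y - my) for x, y in zip(item_scores, rest_scores)])
    return cov / (sx * sy)


def analyse(responses):
    """responses: one row per test taker, each row a list of 0/1 item scores."""
    totals = [sum(row) for row in responses]
    report = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        # Exclude the item itself from the criterion score.
        rest = [total - score for total, score in zip(totals, item)]
        report.append({
            "item": i + 1,
            "facility": facility(item),
            "discrimination": discrimination(item, rest),
        })
    return report


if __name__ == "__main__":
    # Hypothetical responses: 4 test takers, 3 items.
    sample = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
    for row in analyse(sample):
        print(row)
```

Items with a very high or very low facility value, or with a low or negative discrimination, would be natural candidates for the "Improving tasks" and review steps; the thresholds to apply are a local decision and are not given in the slides.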
Item review questions
• Is the task clear in each item?
• Is it likely that the person attempting an item will know what is expected?
• Are the items expressed in the simplest possible language?
• Is each item a fair item for assessment at the level tested?
• Is the wording appropriate to the level tested?
• Are there unintended clues to the correct answer?
• Is the format reasonably consistent so that test-takers know what is required from item to item?
• Is there a single clearly correct (or best) answer for each item?
• Is the type of item appropriate to the information required?
• Are there statements in the items which are likely to offend?
• Is there content which reflects bias on gender, racial, or other grounds?
• Are there enough representative items to provide an adequate sample of the behaviours to be assessed?
Timeline (continued)
• Compiling the pre-test
• Producing pre-test materials
• Pre-testing
• Testing session 3 (November-December)
• Analysing pre-test results
• Compiling the live tests
• Producing the live test materials
• Administering the new test
Sources
• ALTE. 2011. Manual for Language Test Development and Examining for Use with the CEFR. Strasbourg: Language Policy Division, Council of Europe. www.coe.int/lang
• Izard, John. Overview of Test Construction. Quantitative Research Methods in Educational Planning, Module 6. UNESCO International Institute for Educational Planning. http://www.unesco.org/iiep