Constructing and Validating Parallel Forms of Performance

Task-based Assessment of
Group-constructed Oral Interaction
in EFL Classrooms
Presentation for
The 3rd Biennial Conference on Task-based Language Teaching
by
Angie H.C. Liu, Ph.D. (*[email protected])
Department of Applied Linguistics and Language Studies
Chung Yuan Christian University
September 13-16, 2009
Introduction
* Group-based oral discussions activate language learners’ (LLs’)
interaction skills:
-- distributing and competing for opportunities to speak,
-- holding the floor,
-- accommodating other speakers,
-- making on-line adjustments.
Introduction
* The format and design of group-based oral
assessment place all test takers in equal
power positions in relation to each other
–> this brings out the kind of interaction that is not
possible via traditional one-to-one oral
interviews
Introduction
* Test discourse is jointly created by multiple
test takers; potentially influential parameters include
-- characteristics of individual test takers,
-- group dynamics and composition,
-- assessment tasks,
-- scoring criteria
Introduction
* Reality: teachers often avoid group oral
assessment because of concerns related to the scoring of
such a complex and dynamic construct
* issues of validity, reliability and fairness
* the traditional psychometric framework focuses on
individual performance
Introduction
* The current study investigates the
impact of different scoring approaches
(i.e., group scoring vs. individual scoring)
on group-based oral interaction in
an assessment context
Introduction
* This is accomplished by analyzing EFL
test takers’ interaction profiles elicited
from collaborative assessment tasks under
different scoring approaches.
Research Question
Specifically, the following question is
examined:
“To what extent does the scoring approach affect
EFL test takers’ oral interaction profiles elicited via
collaborative tasks?”
Study Design
* Participants
-- 48 EFL college freshmen from two intact classes
-- English proficiency level: intermediate
Study Design
* Procedures
1. Each participant was randomly assigned to a test
group of four.
2. Each test group was randomly assigned to one of
the two oral assessment tasks.
3. Each group was allotted 10 minutes of pre-task planning time.
Study Design
4. Prior to the planning time, participants were
informed of the scoring method (i.e., group- or
individual-based) that would be used to evaluate
their oral performance.
-- a counterbalanced test administration
procedure was implemented to avoid a potential
interaction between assessment tasks and
scoring methods (see Table 1).
Study Design
Table 1: Test administration plan

Class 1                        Class 2
Task 1: Individual Scoring     Task 2: Individual Scoring
Task 2: Group Scoring          Task 1: Group Scoring
Instrument and Material
*Attributes of the Oral Assessment Tasks:
-- require test takers to discuss their reactions to
hypothetical crises (e.g., ship rescue)
-- non-convergent tasks
-- all members of the test group received exactly
the same information
Instrument and Material
* Task 2: as a group, discuss what you will do in the
following scenario:
“ A fierce storm hit two passenger cruises, which started to
sink. Both ships carry the same number of passengers. You
are the leader of the coast guards, who came to the rescue.
Unfortunately, your crew and you can only rescue one ship
at a time. The second ship to be rescued would have for
sure sunk to the bottom of the ocean by the time you get to
them.”
Instrument and Material
* Rating scale:
-- a ten-point scale was used for evaluating individual
and group oral performance
-- rating criteria include fluency, content,
communication and interaction
-- holistic scores were produced
Data Analysis
• The interaction profiles of EFL test takers were
analyzed in terms of
-- discussion quantity
-- turn-taking patterns
-- repair type
Results and Discussion
• Discussion Quantity (see Table 2)

Table 2: Discussion quantity
Analysis Criteria                  Individual Scoring    Group Scoring
Total word count                   872                   885
Unduplicated content word count    342 (39.2%)           322 (36.4%)
Results and Discussion
(see Table 2)
1. ELL participants generated a similar
amount of discussion regardless of the scoring
method used: 872 words under individual scoring (IS)
vs. 885 words under group scoring (GS).
Results and Discussion
2. A slightly higher proportion of unduplicated topic-related
content words was identified when
individual scoring was used (39.2% under IS vs.
36.4% under GS).
Results and Discussion
3. When GS was used, members of the same test
group tended to repeat or respond to the same
ideas, using more discourse markers such as “I
think”, “mm-hm”, “er”, and “yeah”.
4. When IS was used, individual test members tended
to initiate new ideas.
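
The percentages in Table 2 follow directly from the two word counts. As a quick arithmetic check, the sketch below recomputes them; the function name `content_word_share` is illustrative only and is not part of the study's materials.

```python
def content_word_share(unduplicated_content_words: int, total_words: int) -> float:
    """Unduplicated content words as a percentage of the total word count."""
    return 100 * unduplicated_content_words / total_words


# Counts as reported in Table 2 (IS = individual scoring, GS = group scoring).
table_2 = {
    "IS": {"total": 872, "unduplicated": 342},
    "GS": {"total": 885, "unduplicated": 322},
}

shares = {
    approach: round(content_word_share(c["unduplicated"], c["total"]), 1)
    for approach, c in table_2.items()
}

for approach, share in shares.items():
    print(f"{approach}: {share}%")
```

Both values match the percentages reported on the slide (39.2% and 36.4%).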
Results and Discussion
Turn Taking
In this study,
* a turn refers to “an actual occurrence of holding
the floor”
–> turns were counted when transfer of speakers
occurred at transition-relevance points
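
The counting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transcript is hypothetical, it is assumed to be already segmented at transition-relevance points, and consecutive segments by the same speaker are assumed to form a single turn. The study's actual coding procedure is not described in more detail than the definition above.

```python
from itertools import groupby

# Hypothetical transcript (not study data): (speaker, utterance) pairs,
# assumed to be segmented at transition-relevance points.
transcript = [
    ("A", "I think we should save the closer ship."),
    ("A", "It is faster."),
    ("B", "Yeah."),
    ("C", "But what about the children?"),
    ("A", "Mm-hm, good point."),
]


def collapse_turns(segments):
    """Return one speaker label per turn, merging consecutive same-speaker segments."""
    return [speaker for speaker, _ in groupby(segments, key=lambda seg: seg[0])]


turns = collapse_turns(transcript)
turns_per_speaker = {s: turns.count(s) for s in sorted(set(turns))}

print(len(turns), turns_per_speaker)
```

With this toy transcript, speaker A's first two segments count as one turn, so four turns are counted in total.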
Results and Discussion
* Turn allocation was examined in terms of
1) the number of turns claimed by each speaker
2) the selection of next speaker
Results and Discussion
Table 3.1
Analysis Criteria                               IS      GS
Number of turns exchanged                       116     135
Length of turns (*mean word count per turn)     7.5     6.6
Turn-allocation variance                        2.9     3.2
Results and Discussion
1. More turns were exchanged among test members
when GS was used.
2. Turns were shorter, in mean word count, when GS was
used.
Results and Discussion
3. As shown by the turn-allocation variance, floor
dominance by a few test takers was more apparent
when IS was used.
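
The mean turn lengths in Table 3.1 can be reproduced by dividing the total word counts from Table 2 by the number of turns exchanged. The sketch below does that, and then shows one way a variance of turns per speaker could be computed. The slides do not state how the turn-allocation variance was calculated, so the population variance and the per-speaker counts used here are assumptions for illustration, not study data.

```python
from statistics import pvariance

# Totals reported in Table 2 (words) and Table 3.1 (turns).
total_words = {"IS": 872, "GS": 885}
turns_exchanged = {"IS": 116, "GS": 135}

mean_turn_length = {
    approach: round(total_words[approach] / turns_exchanged[approach], 1)
    for approach in total_words
}
print(mean_turn_length)

# Hypothetical turn counts for one group of four speakers (NOT study data).
example_turns_per_speaker = [12, 9, 8, 3]

# Assumption: population variance of the number of turns claimed by each speaker.
example_variance = pvariance(example_turns_per_speaker)
print(example_variance)
```

The recomputed means agree with the reported 7.5 (IS) and 6.6 (GS) words per turn.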
Results and Discussion
* Turn-allocation pattern –
Results showed that the overall pattern of speaker
selection was quite similar between the two
scoring methods.
Results and Discussion
Table 3.2 Turn Allocation Pattern
Analysis Criteria                      IS            GS
Turn-allocation pattern
  selected by current speaker          4 (3.8%)      3 (2.6%)
  self-selected speaker                60 (56.6%)    62 (53.9%)
  no one takes up the turn so the
  current speaker continues            42 (39.6%)    50 (43.5%)
Results and Discussion
* Specifically,
1. When the floor was perceived to be up for grabs,
EFL test takers tended to self-select
as the next speaker.
Results and Discussion
2. The second most frequently adopted option for
speaker selection was for the current speaker to
continue when no one took up the turn.
Results and Discussion
3. EFL test takers rarely named the
next speaker for a turn in the assessment context.
-- inconsistent with the speaker-selection pattern
revealed in conversations carried out by EFL learners in
non-testing contexts.
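
The percentages in Table 3.2 are each category's share of the coded turn allocations within one scoring condition. The short sketch below recomputes them from the reported counts; the dictionary keys and the helper name `allocation_shares` are illustrative labels, not terms from the study.

```python
# Counts as reported in Table 3.2.
table_3_2 = {
    "IS": {
        "selected by current speaker": 4,
        "self-selected": 60,
        "current speaker continues": 42,
    },
    "GS": {
        "selected by current speaker": 3,
        "self-selected": 62,
        "current speaker continues": 50,
    },
}


def allocation_shares(counts):
    """Percentage of each allocation type within one scoring condition."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


shares_by_approach = {a: allocation_shares(c) for a, c in table_3_2.items()}

for approach, result in shares_by_approach.items():
    print(approach, sum(table_3_2[approach].values()), result)
```

The recomputed shares match the table, with 106 coded allocations under IS and 115 under GS.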
Results and Discussion
Repair
* A repair is an attempt to address a problem in
communication.
It serves as a vital mechanism for maintaining
reciprocal interaction between speakers.
Results and Discussion
Table 4
Analysis Criteria                  IS    GS
Total count of repair              8     4
Type of repair
  self-initiated self-repair       6     3
  other-initiated self-repair      1     1
  self-initiated other-repair      1     0
  other-initiated other-repair     0     0
Results and Discussion
1. Few instances of repair were identified in
this study.
2. Among them, self-initiated self-completed repairs
occurred most frequently, and more
frequently when IS was used.
Results and Discussion
3. As the goal of task-based interaction is to
accomplish the task at hand, repairs focus on
establishing mutual understanding between speakers.
-- this study showed that incorrect linguistic forms
and interlanguage forms were frequently ignored
by test takers unless they led to a complete
communication breakdown.
Results and Discussion
4. The correction of linguistic errors, if made at all,
was mostly done via self-initiated self-completed
repairs.
-- consistent with Seedhouse’s claim (2004) that
‘a learner in learner-learner interaction never
attempts to correct another learner’s linguistics
forms in task-oriented contexts.’
Conclusions
Key Findings:
1. The two scoring approaches resulted in a similar quantity of group
discussion (*total number of words).
2. Turn-taking was more equally distributed among group
members when group scoring was used.
3. More content information and more instances of self-initiated
self-completed repairs were identified when individual scoring was
used.
Conclusions
• To summarize,
it appears that the choice of scoring approach
exerts a real influence on the assessment outcome of
task-based group oral interaction in EFL classrooms.
Conclusions
• When the performance is viewed as a joint product,
EFL test takers show greater collaborative effort and
focus more on meaning-based exchanges.
Conclusions
* Pedagogical Implications
1. If the goal is to develop the interactional competence of
language learners, it is recommended that instructors
assess them on a group basis, ignoring differential
contributions from individuals.
Conclusions
2. It seems that task-based group oral testing is more
effective in evaluating EFL learners’ development in the
meaning-and-fluency aspect than in the form-and-accuracy aspect.
Reference
Seedhouse, P. (2004). The Interactional Architecture of the Language Classroom: A
Conversation Analysis Perspective. Malden, MA: Blackwell Publishing.