Evaluation of interactive software - L'UTES

The Pépite project
Automatic Multi-criteria Assessment of Open-Ended Questions: a case study in school algebra
Élisabeth Delozanne, Paris Universitas, UPMC
D. Prévit, B. Grugeon, F. Chenevotot
ITS’2008
Cognitive modeling authoring tool
 Problem
 Multi-step reasoning, multiple equivalent reasonings
 Our approach
1. An expert teacher (or a researcher) defines diagnosis exercises
2. A cognitive engineer implements templates that generalize these particular diagnosis exercises
3. A teacher clones these diagnosis exercises by filling in template forms
4. A domain-specific application
• generates the clone and a set of plausible correct and incorrect anticipated solutions
• matches the student’s reasoning with the anticipated solutions
Outline
 An introductory example
 Pépite: a specific diagnosis tool
 PépiGen: a system to clone Pépite
 Author’s and Student’s points of view
 Automatic Diagnosis
 How does it work?
 Pépinière *
• Formal processing of expression trees
 Conclusion
* in French: tree nursery
Blandine
 Validity: Incorrect (V3)
 Use of letters: Incorrect (L3)
 Translation: Step-by-step with incorrect chains (T4)
 Algebraic expression writing: Incorrect use of parentheses with memory of meaning (EA31)
 Justification: By algebra using incorrect rules (J3)
Aliou
 Validity: Incorrect (V3)
 Use of letters: No (L5)
 Translation: Step-by-step (T2)
 Algebraic writing: No (EA?)
 Justification: By example (J2)
Definitions
 Diagnosis exercise
 An exercise (statement and user interface)

 an analysis grid to assess every plausible solution
anticipated by experts
 Clone
 A similar exercise
• has the same kind of statement and user interface
• gives the same kind of information on students’
competence
 an analysis grid
• to assess every plausible solution
• automatically generated by the system
PépiGen
 A system to clone the Pépite diagnosis tool
 An author (a teacher)
 Chooses an exercise to be cloned
 Enters the statement of the clone
 PépiGen generates
 The student’s interface
 Each plausible solution (correct or incorrect) and its
assessment on several dimensions
The Author’s interface
The Student’s interface
The Automatic Diagnosis
Outline
 An introductory example
 Pépite : a specific diagnosis tool
 PépiGen : a system to clone Pépite
 How does it work?
 Pépinière
 Expanding the tree of plausible steps of correct
and incorrect algebraic transformations
 Walking through the tree to anticipate
different solutions and their assessment
 Diagnosing the student’s reasoning
 Conclusion
Plausible steps
[Figure: tree of plausible steps expanded from (x+6)*3-3x by applying correct and incorrect rules. Nodes shown: x*3+6*3-3x, x+6*3-3x, 3x+18-3x, 21x-3x, -2x+18, 18, 18x. Assessments attached to the leaves: V1,EA1; V3,EA42; V3,EA31; V3,EA31EA42; V3,EA32.]
Correct rules
 R1: (A+B)C → AC+BC
 R3: AB+AC → A(B+C)
Incorrect rules
 R2: (A+B)C → A+BC
 R4: AB+C → B(A+C)
 R5: A+B*C → (A+B)*C
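As a quick sanity check on the rule patterns above, the following Python sketch evaluates both sides of each rule on a few sample values of A, B and C. It is only an illustration and not part of Pépinière: the dictionary layout, the function name `preserves_value` and the sample values are assumptions, and agreement on a handful of samples suggests an identity without proving it.

```python
# Each rule maps a left-hand pattern to a right-hand pattern over A, B, C.
# The patterns are taken from the slide; the data layout is an assumption.
RULES = {
    "R1": (lambda a, b, c: (a + b) * c, lambda a, b, c: a * c + b * c),
    "R2": (lambda a, b, c: (a + b) * c, lambda a, b, c: a + b * c),
    "R3": (lambda a, b, c: a * b + a * c, lambda a, b, c: a * (b + c)),
    "R4": (lambda a, b, c: a * b + c, lambda a, b, c: b * (a + c)),
    "R5": (lambda a, b, c: a + b * c, lambda a, b, c: (a + b) * c),
}

# Arbitrary sample values, chosen to avoid trivial cases such as 0 or 1.
SAMPLES = [(2, 3, 5), (7, 11, 13), (-4, 6, 9)]


def preserves_value(rule_name):
    """Return True if both sides of the rule agree on every sample."""
    lhs, rhs = RULES[rule_name]
    return all(lhs(*sample) == rhs(*sample) for sample in SAMPLES)


for name in sorted(RULES):
    verdict = "value-preserving" if preserves_value(name) else "changes the value"
    print(name, verdict)
```

On these samples R1 and R3 keep the value of the expression, while R2, R4 and R5 change it, which matches the correct/incorrect split on the slide.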
Analysis grid generation
 PépiGen
1. sends the algebraic expression to Pépinière, which returns a tree of plausible steps
• Validity and Algebraic Expression Writing
2. completes the set of plausible solutions with
• non-optimal algebraic solutions
3. completes the assessment of each solution on the 5 dimensions
• V, EA, L, T, J
4. saves each algebraic solution and its assessment
• XML file: solution analysis grid
 Note:
 arithmetic reasonings are analyzed by the diagnosis system
Analysis grid (extract)
<UnexpectedCorrectSolutions> (…)
<Comment>Algebraic proof; the student interprets the statement as an equation</Comment>
<Solution>
<Assessment>V2,EA1,L1,T1,J1</Assessment>
<Expression>(x+6)*3-3*x = 18</Expression>
<Expression>x*3+6*3-3*x = 18</Expression>
<Rule>C,3</Rule>
<Expression>x*3+18-3*x = 18</Expression>
<Expression>18 = 18</Expression>
</Solution>
</UnexpectedCorrectSolutions>
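To illustrate how such a grid could be consumed, here is a minimal Python sketch that reads the extract with the standard `xml.etree.ElementTree` module. The element names come from the extract; the well-formed copy of the XML, the function name `read_solutions` and the returned dictionary layout are assumptions made for this example, not the format actually used by the diagnosis system.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the slide extract (elided parts left out).
GRID_EXTRACT = """
<UnexpectedCorrectSolutions>
  <Comment>Algebraic proof; the student interprets the statement as an equation</Comment>
  <Solution>
    <Assessment>V2,EA1,L1,T1,J1</Assessment>
    <Expression>(x+6)*3-3*x = 18</Expression>
    <Expression>x*3+6*3-3*x = 18</Expression>
    <Rule>C,3</Rule>
    <Expression>x*3+18-3*x = 18</Expression>
    <Expression>18 = 18</Expression>
  </Solution>
</UnexpectedCorrectSolutions>
"""


def read_solutions(xml_text):
    """Return one dict per <Solution>: assessment codes, expressions, rules."""
    root = ET.fromstring(xml_text)
    solutions = []
    for solution in root.iter("Solution"):
        codes = solution.findtext("Assessment", default="").split(",")
        solutions.append({
            "assessment": codes,
            "expressions": [e.text for e in solution.findall("Expression")],
            "rules": [r.text for r in solution.findall("Rule")],
        })
    return solutions


for entry in read_solutions(GRID_EXTRACT):
    print(entry["assessment"], len(entry["expressions"]), "expressions")
```

Splitting the assessment string on commas gives one code per dimension (V, EA, L, T, J), which is convenient when comparing assessments dimension by dimension.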
Automatic diagnosis
[Figure: the diagnosis system reads two XML inputs, the student’s reasoning and the analysis grid. It asks the expression tree processor Pépinière whether two expression trees are equivalent (True/False), then saves the student’s reasoning together with its assessment as XML.]
Diagnosis algorithm
 Numerical or algebraic approach?
 Loop on each expression of the student’s reasoning
 Build the expression tree (ST)
 Loop on each plausible solution in the analysis grid
• Build the expression tree (PT)
• If numerical approach
- substitute the numerical value in PT
• If ST is equivalent to PT
- keep PT, the rule and the comment, and stop
 At the end
 walk through PT to set up the final assessment
 save the final assessment, the comment and the applied rules
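The matching loop above can be sketched in Python as follows. This is a hedged illustration, not the Pépite implementation: Pépinière's formal equivalence test is replaced here by a simple canonical form built with Python's `ast` module (sums and products flattened and sorted), the numerical approach is left out, and the names `canon`, `flatten`, `tree`, `diagnose`, the `GRID` entries and the rule that the last matched step decides the final assessment are all assumptions.

```python
import ast


def flatten(node, op_type):
    """Collect the operands of a chain of the same + or * operator."""
    if isinstance(node, ast.BinOp) and isinstance(node.op, op_type):
        return flatten(node.left, op_type) + flatten(node.right, op_type)
    return [canon(node)]


def canon(node):
    """Canonical form: trees equal up to commutativity/associativity match."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return ("neg", canon(node.operand))
    if isinstance(node, ast.BinOp):
        if isinstance(node.op, (ast.Add, ast.Mult)):
            symbol = "+" if isinstance(node.op, ast.Add) else "*"
            operands = sorted(flatten(node, type(node.op)), key=repr)
            return (symbol, tuple(operands))
        if isinstance(node.op, ast.Sub):
            return ("-", canon(node.left), canon(node.right))
    raise ValueError("unsupported expression")


def tree(text):
    """Build the canonical expression tree of a string such as '3*x+18'."""
    return canon(ast.parse(text, mode="eval").body)


def diagnose(student_steps, grid):
    """Match each student expression (ST) against the plausible steps (PT)."""
    matches = []
    for step in student_steps:
        st = tree(step)
        found = None
        for solution in grid:
            if any(st == tree(pt) for pt in solution["steps"]):
                found = solution  # keep the first matching solution and stop
                break
        matches.append(found)
    matched = [m for m in matches if m is not None]
    # Assumption: the solution matched by the last recognized step decides.
    final = matched[-1]["assessment"] if matched else None
    return matches, final


# Illustrative grid: two branches read from the tree of plausible steps.
GRID = [
    {"assessment": "V1,EA1",
     "steps": ["(x+6)*3-3*x", "x*3+6*3-3*x", "3*x+18-3*x", "18"]},
    {"assessment": "V3,EA42",
     "steps": ["(x+6)*3-3*x", "x*3+6*3-3*x", "3*x+18-3*x", "21*x-3*x", "18*x"]},
]

steps = ["3*(x+6)-3*x", "3*x+18-3*x", "x*21-3*x", "18*x"]
matches, final = diagnose(steps, GRID)
print(final)
```

In this example the first two student steps are found in the correct branch, while the last two only appear in the branch that ends in 18x, so the sketch returns the assessment of that branch.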
Results and tests
 Ongoing work
 A demonstration prototype implements the cloning of a complex exercise
 Authoring clones
 Solving
 Diagnosing
 Preliminary Tests
 assessment of a corpus of 141 students’ solutions
• Multi-step reasoning
• Multiple equivalent reasonings
 3 teachers tested it in the lab
Discussion
 Diagnosis
 compared with model tracing
≈ Tree of plausible steps: correct and incorrect rules
≠ Emphasis : whole reasoning/step-by-step
≠ Several types of student reasoning derived from a single solution branch
≠ Multidimensional assessment
 Authoring
 Filling template forms
- Limited to specified exercises
+ Automatic multidimensional diagnosis validated by experts
+ No programming, no modeling for teachers
Automatic Multi-criteria Assessment
 Our proposal
 Teachers clone a diagnosis tool previously
designed by experts
 The cloning process relies on
• A preliminary educational study in the
domain
• An implementation of templates of
diagnosis exercises
• A specific application to analyze reasonings that are not pre-formatted
 Demo: Friday afternoon
 http://pepite.univ-lemans.fr