
Speech and Language Processing for
Educational Applications
Professor Diane Litman
Computer Science Department &
Intelligent Systems Program &
Learning Research and Development Center
A few words about me…

• Currently
  – Professor in CS and ISP (director)
  – Senior Scientist at LRDC
  – ITSPOKE research group: 2 PhD students (your name here?), 3 CS undergrads, 1 postdoc, 1 programmer
  – AI research (speech and NLP, tutoring and education, applied learning, affective computing)
• Previously
  – Member of Technical Staff, AT&T Labs Research, NJ
  – Assistant Professor, CS, Columbia University, NY
More generally...
NLP and the Learning Sciences

[Concept map, built up across several slides:]
• Learning Language (reading, writing, speaking)
  – Tutors
  – Scoring
  – Readability
• Using Language (to teach everything else)
  – Conversational Tutors / Peers (Michael Lipschultz, Joanna Drummond, Heather Friedberg)
  – NLP for Peer Review (Wenting Xiong)
  – CSCL
  – Discourse Coding
  – Questioning & Answering
  – Lecture Retrieval
• Processing Language
Current Research Grants
• An Affect-Adaptive Spoken Dialogue System that Responds Based on User Model and Multiple Affective States
  – Detect and adapt to student disengagement
  – Vary tutor responses based on a user model (expertise, gender) to increase learning and satisfaction
• Improving Learning from Peer Review with NLP and ITS Techniques
  – Detect important feedback features (e.g., is a solution given? is the review helpful?)
  – Enhance reviewer, author, and instructor interfaces
• Improving a Natural-Language Tutoring System That Engages Students in Deep Reasoning Dialogues About Physics
  – Use of tutor specialization/abstraction
  – Research “in vivo” (in a high school!)
Prior Dissertations Supervised
• Machine Learning for Dialogue
  – Hua Ai, User Simulation for Spoken Dialog System Development
  – Min Chi, Do Micro-Level Tutorial Decisions Matter: Applying Reinforcement Learning to Induce Pedagogical Tutorial Tactics
• Discourse Theory for User Interfaces
  – Mihai Rotaru, Applications of Discourse Structure for Spoken Dialogue Systems
• Cognitive Science for Intelligent Tutoring
  – Arthur Ward, Reflection and Learning Robustness in a Natural Language Conceptual Physics Tutoring System
Today: Spoken Tutorial Dialogue
• Motivation
• The ITSPOKE Tutorial Dialogue System & Corpora
• Detecting and Adapting to Student Uncertainty
  – Uncertainty Detection
  – System Adaptation
  – Impact on Student Meta(Cognition)
    » Wizarded and fully-automated experiments
• Summing Up
What is Tutoring?
• “A one-on-one dialogue between a teacher and a
student for the purpose of helping the student
learn something.”
[Evens and Michael 2006]
• Human Tutoring Excerpt
[Thanks to Natalie Person and Lindsay Sears,
Rhodes College]
Intelligent Tutoring Systems
• Students who receive one-on-one instruction perform as well as the top two percent of students who receive traditional classroom instruction [Bloom 1984]
• Unfortunately, providing every student with a personal human tutor is infeasible
  – Develop computer tutors instead
Tutorial Dialogue Systems
• Why is one-on-one tutoring so effective?
  “...there is something about discourse and natural language (as opposed to sophisticated pedagogical strategies) that explains the effectiveness of unaccomplished human [tutors].” [Graesser, Person et al. 2001]
• Currently only humans use full-fledged natural language dialogue
Spoken Tutorial Dialogue Systems
• Most human tutoring involves face-to-face spoken interaction, while most computer dialogue tutors are text-based
• Can the effectiveness of dialogue tutorial systems be further increased by using spoken interactions?
Potential Benefits of Spoken Dialogue: I
• Dialogue provides a learning environment that promotes student activity (e.g., self-explanation)
  – Tutor: The right side pumps blood to the lungs, and the left side pumps blood to the other parts of the body. Could you explain how that works?
  – Student (self-explains): So the septum is a divider so that the blood doesn't get mixed up. So the right side is to the lungs, and the left side is to the body. So the septum is like a wall...
• Self-explanation occurs more in speech [Hausmann and Chi 2002], and correlates with learning [Chi et al. 1994]
Potential Benefits of Spoken Dialogue: II
• Speech contains prosodic information, providing new sources of information about the student for teacher adaptation [Fox 1993; Tsukahara and Ward 2001; Pon-Barry et al. 2005]
• A correct but uncertain student turn:
  – ITSPOKE: How does his velocity compare to that of his keys?
  – STUDENT: his velocity is constant
Potential Benefits of Spoken Dialogue: III
• Spoken conversational environments may foster social relationships that may enhance learning
  – AutoTutor [Graesser et al. 2003]
Potential Benefits of Spoken Dialogue: IV
• Some applications inherently involve spoken
dialogue
– Conversational Skills [Seneff, Johnson]
– Reading Tutors [Mostow, Cole]
• Others require hands-free interaction
– e.g., NASA training
Outline
• Motivation
• The ITSPOKE System and Corpora
• Detecting and Adapting to Student Uncertainty
  – Uncertainty Detection
  – System Adaptation
  – Experimental Evaluation
• Summing Up
• Back-end is Why2-Atlas system [VanLehn, Jordan, Rose et al. 2002]
• Sphinx2 speech recognition and Cepstral text-to-speech
Three Types of Tutoring Corpora
• Human Tutoring
  – 14 students / 128 dialogues (physics problems)
• Computer Tutoring
  – 72 students / 360 dialogues
• Wizard Tutoring
  – 81 students / 405 dialogues
  – human performs speech recognition and semantic analysis
  – computer performs dialogue management
Experimental Procedure
• College students without physics background
  – Read a short background document
  – Took a multiple-choice Pretest
  – Worked 5 problems (dialogues) with ITSPOKE
  – Took an isomorphic Posttest
• Goal was to optimize Learning Gain
  – e.g., Posttest – Pretest
Outline
• Motivation
• The ITSPOKE System and Corpora
• Detecting and Adapting to Student Uncertainty
  – Uncertainty Detection
  – System Adaptation
  – Experimental Evaluation
• Summing Up
Monitoring Student State (motivation)
Tutor26: Let me ask my question in a more general way. What happens to the velocity of a body when there is no force acting on it?
Student27: dammit (ASR: it is)
Tutor28: Could you please repeat that?
Student29: same (ASR: i same)
Tutor30: Right. And what happens to the HORIZONTAL velocity of a body when there is no HORIZONTAL force acting on it?
Student31: zero (ASR: the zero)
Tutor32: Well... OK, let's consider how forces in the vertical direction affect the motion of a body in a horizontal direction in a different scenario <…omitted…>
Student33: oh gravity you already said this (ASR: beats gravity you're exert less)
Tutor34: Fine. Are there any other forces acting on the apple as it falls?
Student35: no why are you doing this again (ASR: no y and to it yes)
Tutor36: Correct. The only force acting on the apple as it falls to the earth is the earth's gravitational force. In which direction does gravitational force act?
Student37: downward you computer (ASR: downward you computer)
Why Uncertainty?
• Most frequent student state in our dialogue corpora [Litman and Forbes-Riley 2004]
• Focus of other learning sciences, speech and language processing, and psycholinguistic studies [Craig et al. 2004; Liscombe et al. 2005; Pon-Barry et al. 2006; Dijkstra et al. 2006]
• Can be annotated reliably: .73 Kappa [Forbes-Riley et al. 2008]
Corpus-Based Detection Methodology
• Learn detection models from training corpora
  – Use spoken language processing to automatically extract features from user turns
  – Use extracted features (e.g., prosodic, lexical) to predict uncertainty annotations (sketched below)
• Evaluate learned models on testing corpora
  – Significant reduction of error compared to baselines [Litman and Forbes-Riley 2006; Litman et al. 2007]
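The detection step is a standard supervised learning loop. The sketch below is a minimal, hypothetical rendering (not ITSPOKE's actual code, features, or values): invented per-turn prosodic/lexical features feed a logistic regression classifier that predicts the human uncertainty annotations, evaluated by cross-validation.

```python
# Minimal sketch of corpus-based uncertainty detection.
# NOT the ITSPOKE implementation: features and values are invented placeholders
# for the kinds of prosodic/lexical cues extracted from each student turn.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical per-turn features:
# [mean_pitch_hz, pitch_range_hz, rms_energy, turn_duration_sec, has_hedge_word]
X = np.array([
    [210.0, 80.0, 0.42, 3.1, 1],
    [180.0, 35.0, 0.55, 1.2, 0],
    [225.0, 95.0, 0.40, 4.0, 1],
    [175.0, 30.0, 0.60, 1.0, 0],
])
y = np.array([1, 0, 1, 0])  # 1 = turn annotated as uncertain

# Train and evaluate; with a real corpus, compare against a majority-class baseline.
clf = LogisticRegression()
scores = cross_val_score(clf, X, y, cv=2)
print("mean cross-validation accuracy:", scores.mean())
```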
Outline
• Motivation
• The ITSPOKE System and Corpora
• Detecting and Adapting to Student Uncertainty
  – Uncertainty Detection
  – System Adaptation
  – Experimental Evaluation
• Summing Up
System Adaptation: How to Respond?
• Theory-based [VanLehn et al. 2003; Craig et al. 2004]
• Corpus-based [Forbes-Riley and Litman 2005, 2007, 2008, 2010]
Theory-Based Adaptation: Uncertainty as Learning Opportunity
• Uncertainty represents one type of learning impasse, and is also associated with cognitive disequilibrium
  – An impasse motivates a student to take an active role in constructing a better understanding of the principle. [VanLehn et al. 2003]
  – A state of failed expectations causing deliberation aimed at restoring equilibrium. [Craig et al. 2004]
• Hypothesis: The system should adapt to uncertainty in the same way it responds to other impasses (e.g., incorrectness)
Corpus-Based Adaptation: How Do Human Tutors Respond?
• An empirical method for designing dialogue systems adaptive to student state
  – Extract “dialogue bigrams” from annotated human tutoring corpora
  – Apply a χ2 analysis to identify dependent bigrams (sketched below)
  – Generalizable to any domain with corpora labeled for user state and system response
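The bigram dependency analysis amounts to a chi-squared test of independence between the student state in one turn and the tutor act in the next. The sketch below uses invented counts (the real numbers come from the annotated human tutoring corpus) and standard SciPy rather than the authors' original tooling.

```python
# Sketch of the chi-squared test over "dialogue bigrams":
# rows = student state in turn t, columns = tutor response type in turn t+1.
# Counts are invented for illustration only.
import numpy as np
from scipy.stats import chi2_contingency

student_states = ["uncertain", "certain", "neutral"]
tutor_responses = ["bottom_out", "restatement", "expansion"]

bigram_counts = np.array([
    [30, 12,  5],   # after uncertain student turns
    [ 8, 25, 14],   # after certain student turns
    [10, 15, 16],   # after neutral student turns
])

chi2, p, dof, expected = chi2_contingency(bigram_counts)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
# A small p indicates the tutor's response type depends on the student's state,
# which is the kind of dependency that motivates the adaptive strategies.
```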
Example Human Tutoring Excerpt
S: So the- when you throw it up the acceleration will stay the same? [Uncertain]
T: Acceleration uh will always be the same because there is- that is being caused by force of gravity which is not changing. [Restatement, Expansion]
S: mm-k. [Neutral]
T: Acceleration is– it is in- what is the direction uh of this acceleration- acceleration due to gravity? [Short Answer Question]
S: It’s- the direction- it’s downward. [Certain]
T: Yes, it’s vertically down. [Positive Feedback, Restatement]
Findings
• Statistically significant dependencies exist between students’ state of certainty and the responses of an expert human tutor
  – After uncertain, tutor Bottoms Out and avoids expansions
  – After certain, tutor Restates
  – After any emotion, tutor increases Feedback
• Dependencies suggest adaptive strategies for implementation in our computer tutor [Forbes-Riley and Litman 2010]
Outline
• Motivation
• The ITSPOKE System and Corpora
• Detecting and Adapting to Student Uncertainty
  – Uncertainty Detection
  – System Adaptation
  – Experimental Evaluation
• Summing Up
Adaptation to Student Uncertainty in ITSPOKE
• Most systems respond only to (in)correctness
• Recall that the literature suggests uncertain as well as incorrect student answers signal learning impasses
• Experimentally manipulate tutor responses to student uncertainty, over and above correctness, and investigate the impact on learning
  – Platform: adaptive version(s) of ITSPOKE
Normal (non-adaptive) ITSPOKE
• System-Initiative Dialogue Format:
  – Tutor Question – Student Answer – Tutor Response
• Tutor Response Types:
  – to Corrects (C): positive feedback (e.g., “Fine”)
  – to Incorrects (I): negative feedback (e.g., “Well…”) and
    » Bottom Out: correct answer with reasoning (easier)
    » Subdialogue: questions walk through reasoning (harder)
Adaptive ITSPOKE(s)
• Our prior work: rank correctness (C, I) + uncertainty (U, nonU) states in terms of impasse severity

  State:     I+nonU   I+U    C+U    C+nonU
  Severity:  most     less   least  none

• Adaptation Hypothesis:
  – ITSPOKE already resolves I impasses (I+nonU, I+U), but it ignores one type of U impasse (C+U)
  – Performance improvement if ITSPOKE provides additional content to resolve all impasses
Two Uncertainty Adaptations
• Simple Adaptation
  – Same response for all 3 impasses
  – Feedback on only (in)correctness
• Complex Adaptation
  – Different responses for the 3 impasses
    » Based on human responses [Forbes-Riley and Litman 2010]
  – Feedback on both uncertainty and (in)correctness
    » Based on empathetic computer tutor literature [Wang et al. 2005; Hall et al. 2004; Burleson et al. 2004]
(Both adaptations are illustrated in the examples that follow and sketched in code after them.)
Simple Adaptation Example: C+U
TUTOR1: By the same reasoning that we used for the car, what’s the
overall net force on the truck equal to?
STUDENT1: The force of the car hitting it?? [C+U]
TUTOR2: Fine. [FEEDBACK] We can derive the net force on the
truck by summing the individual forces on it, just like we
did for the car. First, what horizontal force is exerted on
the truck during the collision? [SUBDIALOGUE]
• Same TUTOR2 subdialogue if student was I+U or I+nonU
Complex Adaptation Example: C+U
TUTOR2: That’s exactly right, but you seem unsure, so let’s sum up.
[FEEDBACK] The net force on the truck is equal to the impact
force on it. We can prove this just like we did for the car. First, we
know that gravity and the normal force on the truck must cancel
each other, otherwise the truck would not be at rest vertically.
Second, we know that the impact force is the only horizontal force
exerted on the truck. [LONG BOTTOM OUT]
Complex Adaptation Example: I+U
TUTOR1: By the same reasoning that we used for the car, what’s the
overall net force on the truck equal to?
STUDENT1: Zero?? [I+U]
TUTOR2: That’s not correct, but don’t worry. You seem to be aware
of your mistake. Let’s resolve it. [FEEDBACK] The net force on
the truck is equal to the impact force on it. Let’s walk through this
answer step by step. [SHORT BOTTOM OUT] We can derive the
net force on the truck by summing the individual forces on it, just
like we did for the car. First, what horizontal force is exerted on the
truck during the collision? [SUBDIALOGUE]
Complex Adaptation Example: I+nonU
TUTOR1: By the same reasoning that we used for the car, what’s the
overall net force on the truck equal to?
STUDENT1: The force of gravity. [I+nonU]
TUTOR2: I’m sorry, but I see an error in your answer. Let’s fix it.
[FEEDBACK] We can derive the net force on the truck by
summing the individual forces on it, just like we did for the car.
First, what horizontal force is exerted on the truck during the
collision? [SUBDIALOGUE]
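Abstracting away from the dialogue wording, the examples above amount to two response policies keyed on (correctness, uncertainty). The sketch below is a simplified, hypothetical rendering of that mapping, not ITSPOKE's actual dialogue manager; the move labels follow the slide terminology.

```python
# Simplified sketch of the two uncertainty adaptations (hypothetical, not ITSPOKE code).
# Each function maps a student answer state to a list of tutor response moves.

def simple_adaptation(correct: bool, uncertain: bool) -> list[str]:
    """Same remediation for every impasse (I+nonU, I+U, C+U); feedback on correctness only."""
    moves = ["positive feedback" if correct else "negative feedback"]
    if (not correct) or uncertain:          # any impasse triggers the remediation subdialogue
        moves.append("SUBDIALOGUE")
    return moves

def complex_adaptation(correct: bool, uncertain: bool) -> list[str]:
    """Different remediation per impasse; feedback acknowledges uncertainty as well."""
    if correct and not uncertain:           # C+nonU: no impasse
        return ["positive feedback"]
    if correct and uncertain:               # C+U
        return ["feedback noting uncertainty", "LONG BOTTOM OUT"]
    if uncertain:                           # I+U
        return ["feedback noting uncertainty", "SHORT BOTTOM OUT", "SUBDIALOGUE"]
    return ["negative feedback", "SUBDIALOGUE"]   # I+nonU

print(simple_adaptation(correct=True, uncertain=True))    # C+U under Simple Adaptation
print(complex_adaptation(correct=True, uncertain=True))   # C+U under Complex Adaptation
```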
Experiment 1: ITSPOKE-WOZ
• Wizard of Oz version of ITSPOKE
  – Human recognizes speech, annotates correctness and uncertainty
  – Provides upper-bound language performance
• 4 Conditions
  – Simple Adaptation: used same response for all impasses
  – Complex Adaptation: used different responses for each impasse
  – Normal Control: used original system (no adaptation)
  – Random Control: gave Simple Adaptation to a random 20% of correct answers (to control for additional tutoring)
• Prediction: Complex Adaptation > Simple Adaptation > Random Control > Normal Control (for increasing learning)
• Procedure: reading, pretest, 5 problems, survey, posttest
Results I: Learning

Metric: Learning Gain (Posttest – Pretest)

Condition            N    Mean   Pairwise Difference     p
Normal Control       21   .183   < Simple Adaptation     .03
Random Control       20   .269   -                       -
Simple Adaptation    20   .307   -                       -
Complex Adaptation   20   .213   -                       -

F(3, 77) = 3.275, p = 0.02 (ANOVA sketched below)

• Simple Adaptation yields more student learning than Normal Control (original ITSPOKE) [Forbes-Riley and Litman 2010]
• Similar results for learning efficiency [Forbes-Riley and Litman 2009]
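The omnibus test behind the reported F(3, 77) is a one-way ANOVA over per-student learning gains in the four conditions. A minimal sketch with SciPy, using invented per-student gains (the real study had 20-21 students per condition):

```python
# Sketch of the overall condition comparison: one-way ANOVA on learning gains.
# The per-student gains below are invented placeholders, not the study's data.
from scipy.stats import f_oneway

normal_control     = [0.15, 0.20, 0.18, 0.16, 0.22]
random_control     = [0.25, 0.30, 0.24, 0.28, 0.27]
simple_adaptation  = [0.32, 0.29, 0.35, 0.30, 0.28]
complex_adaptation = [0.20, 0.22, 0.25, 0.19, 0.21]

F, p = f_oneway(normal_control, random_control, simple_adaptation, complex_adaptation)
print(f"F = {F:.3f}, p = {p:.3f}")   # the study reports F(3, 77) = 3.275, p = 0.02
```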
Discussion
• Predictions versus results:
  – Predicted: Complex Adaptation > Simple Adaptation > Random Control > Normal Control
• Why didn’t Complex Adaptation outperform Simple Adaptation?
  – Complex Adaptation’s human-based content responses were based on frequency, not effectiveness
  – Better data mining methods (e.g., reinforcement learning) needed
Additional Evaluations - Metacognition
• Do metacognitive performance measures differ across experimental conditions?
  – Monitoring Accuracy [Nietfield et al. 2006]
Monitoring Accuracy

                Correct   Incorrect
NonUncertain    CnonU     InonU
Uncertain       CU        IU

• The wizard's annotations for each student are first represented in this array, where each cell represents a mutually exclusive option
• Motivated by Feeling of (Another’s) Knowing [Smith and Clark 1993; Brennan and Williams 1995], which is closely related to uncertainty [Dijkstra et al. 2006]
• The array is then used to compute monitoring accuracy
Monitoring Accuracy

Hamann Coefficient = ((CnonU + IU) − (InonU + CU)) / ((CnonU + IU) + (InonU + CU))

• Ranges from -1 (no monitoring accuracy) to 1 (perfect monitoring accuracy); computation sketched below
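Computing the coefficient from the four cell counts is straightforward; the sketch below uses hypothetical counts in place of a student's actual wizard annotations.

```python
# Monitoring accuracy (Hamann coefficient) from the 2x2 counts defined above.
# The example counts are hypothetical.
def hamann(c_nonu: int, i_u: int, i_nonu: int, c_u: int) -> float:
    """((CnonU + IU) - (InonU + CU)) / ((CnonU + IU) + (InonU + CU))."""
    matched = c_nonu + i_u       # turns where certainty matched correctness
    mismatched = i_nonu + c_u    # turns where certainty mismatched correctness
    return (matched - mismatched) / (matched + mismatched)

print(hamann(c_nonu=40, i_u=10, i_nonu=8, c_u=12))   # ~0.43 for these counts
```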
Additional Results I

Metacognitive Measure: Monitoring Accuracy
  Complex Adaptation (n=20):  .58
  Simple Adaptation (n=20):   .62
  Random Control (n=20):      .62
  Normal Control (n=21):      .52

• Simple (and Random) increased monitoring accuracy, compared to Normal (p < .06 in paired contrasts) [Litman and Forbes-Riley 2009]
Additional Results II

Metacognitive Measure (n=81)     R      p
Average Impasse Severity        -.56    .00
Monitoring Accuracy              .42    .00

• Monitoring Accuracy (where higher is better) is positively correlated with learning (analysis sketched below) [Litman and Forbes-Riley 2009]
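The reported R values are per-student correlations between each metacognitive measure and learning gain. The sketch below assumes a simple Pearson correlation with invented data; the original analyses may instead use partial correlations controlling for pretest.

```python
# Sketch of the correlation analysis: Pearson's r between per-student
# monitoring accuracy and learning gain. Values are invented for illustration.
from scipy.stats import pearsonr

monitoring_accuracy = [0.62, 0.40, 0.71, 0.55, 0.30, 0.80, 0.45, 0.66]
learning_gain       = [0.31, 0.18, 0.35, 0.25, 0.12, 0.40, 0.22, 0.28]

r, p = pearsonr(monitoring_accuracy, learning_gain)
print(f"R = {r:.2f}, p = {p:.3f}")   # a positive R mirrors the reported .42 correlation
```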
Experiment 2: ITSPOKE-AUTO
• Sphinx2 speech recognizer
  – Word Error Rate of 25%
• TuTalk semantic analyzer
  – Correctness Accuracy of 84.7%
• Weka uncertainty model
  – Logistic regression (includes lexical, prosodic, dialogue features)
  – Uncertainty Accuracy of 76.8%
(Per-turn pipeline sketched below.)
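Conceptually, ITSPOKE-AUTO replaces the wizard with three automated components chained per student turn. The sketch below is purely illustrative: the function bodies are placeholders standing in for Sphinx2, TuTalk, and the Weka model, not their actual APIs.

```python
# Illustrative per-turn processing pipeline for the fully automated system.
# Each function is a placeholder standing in for a real component.
from dataclasses import dataclass

@dataclass
class TurnAnalysis:
    transcript: str    # speech recognition (reported WER ~25%)
    correct: bool      # semantic analysis (reported accuracy ~84.7%)
    uncertain: bool    # uncertainty model (reported accuracy ~76.8%)

def recognize_speech(audio: bytes) -> str:
    return "his velocity is constant"          # placeholder for Sphinx2 output

def judge_correctness(transcript: str) -> bool:
    return "constant" in transcript            # placeholder for TuTalk matching

def detect_uncertainty(audio: bytes, transcript: str) -> bool:
    return len(transcript.split()) <= 3        # placeholder for the logistic regression model

def process_turn(audio: bytes) -> TurnAnalysis:
    transcript = recognize_speech(audio)
    return TurnAnalysis(transcript,
                        judge_correctness(transcript),
                        detect_uncertainty(audio, transcript))

print(process_turn(b"<audio frames>"))
```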
Preliminary Results: ITSPOKE-AUTO

Metacognitive Measure     WOZ (R, p)     AUTO (R, p)
Monitoring Accuracy       .42, .00       .35, .00

• Monitoring Accuracy remains correlated with learning under noisy conditions
• More modest Local and Global learning differences across experimental conditions
Current and Future Work
• Reduce noise in the fully automated system
• Incorporation of student disengagement and user modeling
• Crowdsourcing (for acquiring training data)
• Remediate metacognition, not just domain content
Outline
• Motivation
• The ITSPOKE System and Corpora
• Detecting and Adapting to Student Uncertainty
  – Uncertainty Detection
  – System Adaptation
  – Experimental Evaluation
• Summing Up
Summing Up
• Spoken dialogue contributes to the success of human tutors
• By modifying presently available technology, successful tutorial dialogue systems can also be built
• Adapting to uncertainty can further improve performance
• Similar opportunities and challenges in many educational applications
Resources
• Recommended classes
  – Introduction to Natural Language Processing
  – Foundations of Artificial Intelligence
  – Machine Learning
  – Knowledge Representation
  – Seminar classes
• Other resources
  – ITSPOKE Group Meetings
  – NLP @ Pitt
  – Intelligent Systems Program (ISP) Forum
  – Pittsburgh Science of Learning Center (PSLC)
Thank You!
• Questions?
• Further Information
  – http://www.cs.pitt.edu/~litman/itspoke.html