“The Sentence Fairy”:
NLP techniques in support of essay writing
by German-speaking elementary schoolers
Karin Harbusch, Gergana Itsova, Ulrich Koch, and Christine Kühner
University of Koblenz-Landau,
Computer Science Dept.
Universitätsstr. 1, 56070 Koblenz
Germany
Contact: [email protected]
Abstract
In the following we describe a virtual writing conference based on NLP
techniques. State-of-the-art computer support for writing tasks is restricted to
multiple-choice questions or quizzes. To our knowledge, no software tool
exists that deploys generation technology to evaluate the grammatical quality
of student output. We base feedback on the output of a natural language
generation system that provides all paraphrases. We apply parsing technology
only in the teacher mode, where it helps teachers to encode new stories in a
simple manner. In the exercise generation mode, the abstract representation
of the story is used to compose exercises in which the pupils improve the story,
which is given as simple main clauses or system-composed larger phrasal snippets.
We describe here a first prototype with a rudimentary teacher mode and a fully
automatic exercise generation mode, together with a first small usability study.
1 Motivation
German elementary schoolers learn essay and story writing in a rather holistic
manner. Typically, they are presented with a series of about 3 to 5 pictures and/or words
and then have to produce an interesting short story of about 10 sentences. In a writing
conference (Graves, 1983), the whole class evaluates one such essay and discusses
stylistic reshaping techniques to improve the text in a hands-on manner. The
“Sentence Fairy” (Satzfee) is a ‘virtual writing conference’ aiming to improve essay-writing skills in German elementary schoolers through little exercises, with well-targeted syntactic feedback that is produced automatically. The system does not aim
at stimulating the creativity needed to write interesting stories — this aspect is left to
the teacher (cf. teacher mode, where new stories are encoded in the system). The
software uses existing natural language processing components, viz. a
syntactic/semantic parser, and a paraphrase generator based on the Performance
Grammar (PG) formalism (Kempen & Harbusch, 2002). Syntactic structures are
enriched with semantic features in the spirit of Minimal Recursion Semantics (MRS;
Copestake et al., 2005).
An important design constraint was that the children should not need to type. In a
corpus study with 1,000 transliterated short stories by third/fourth grade schoolers (cf.
Thonke et al., in print), one finds out-of-vocabulary rates of 30% due to spelling
problems. Consequently, parsing accuracy amounted to a mere 60% (cf. Fränkel,
forthcoming). Hence, all interaction with the system proceeds through mouse
manipulation (drag & drop). Moreover, the system would not be accepted in German
classrooms if it required a high level of keyboard typing skills.
The Sentence Fairy consists of three main components: a ‘teacher mode’, an
‘exercise generation mode’, and a ‘learner mode’. The first component lets teachers
create new short stories in interaction with a syntactic/semantic parser, without the
need to specify much linguistic detail. From parsed story representations, in the
exercise generation mode, three types of writing exercises are automatically built up
for the learner mode: story reconstruction, sentence combining, and word ordering.
We describe here a first prototype with a rudimentary teacher mode and a fully
automatic exercise generation mode, together with a first small usability study.
The paper is organized as follows. In the next section, we motivate the choice of
the linguistic formalism and present the parser and the generator underpinning the
Sentence Fairy. In Section 3, we motivate the system’s basic look based on software-ergonomic and e-learning rules. Moreover, we discuss our choice of the basic
feedback strategy applied in the system. In Section 4, we outline the three different
operating modes of the prototype. Section 4.1 presents the design of the teacher
mode, which is not yet implemented. Section 4.2 delineates the fully operational
exercise generation mode, and Section 4.3 presents preliminary
results from a small usability study of the learner mode. In Section 5, we give an
overview of comparable approaches. In the final section we discuss future work. One
topic is defining new exercises inspired by exploring the corpus. Moreover, we
illustrate how the system can be tailored to other languages and L2-learning.
2 NLP components underpinning the Sentence Fairy
In this section, we motivate the choice for the linguistic formalism and present the
parser and the generator applied in the Sentence Fairy system.
The general linguistic formalism is Performance Grammar (PG) (see, e.g., Kempen
& Harbusch, 2002). This formalism is well suited to expressing fine-grained word-ordering rules in Dutch and German (Kempen & Harbusch, 2003). Moreover, the word-order rules can be tailored to different languages (Harbusch & Kempen, 2002), which
will be helpful for a Dutch and English system or L2-learning (cf. final Section).
Performance Grammar is a psycholinguistically motivated syntax formalism defined in
declarative terms. PG aims not only at describing and explaining intuitive judgments
and other data concerning the well-formedness of sentences of a language, but also
at contributing to accounts of syntactic processing phenomena observable in language
comprehension and language production (cf. Kempen & Harbusch, 2003 and 2005).
In order to meet these demands, PG generates syntactic structures in a two-stage
process. In the first and most important ‘hierarchical’ stage, unordered hierarchical
structures (‘mobiles’) are assembled out of lexical building blocks. The key operation at
work here is typed feature unification, which also delimits the positional options of the
syntactic constituents in terms of so-called topological features. The second, much
simpler stage takes care of arranging the branches of the mobile from left to right by
‘reading out’ one positional option of every constituent. Syntactic structures are
enriched with semantic features in the spirit of Minimal Recursion Semantics (MRS;
Copestake et al., 2005; cf. a snippet of the story representation in Section 4.1).
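To make the two-stage idea concrete, the following simplified Python sketch enumerates the read-out step (an illustration only, not the actual PG/PGW implementation; the numeric slot inventory and all identifiers are our own shorthand): every constituent of the unordered mobile carries a set of admissible topological slots, and linearization amounts to choosing one consistent slot per constituent.

from itertools import product

# Hypothetical 'mobile' for a German main clause; the integers model a crude
# topology (1 = forefield, 2 = verb-second position, 3 = midfield).
mobile = [
    ("Tim", {1, 3}),               # subject NP: forefield or midfield
    ("geht", {2}),                 # finite verb: verb-second in a main clause
    ("durch den Wald", {1, 3}),    # PP: forefield or midfield
]

def linearize(mobile):
    """Enumerate all admissible left-to-right orders (one slot read out per constituent)."""
    words, slot_sets = zip(*mobile)
    for assignment in product(*slot_sets):
        if len(set(assignment)) == len(assignment):   # every slot occupied at most once
            yield " ".join(w for _, w in sorted(zip(assignment, words)))

for sentence in linearize(mobile):
    print(sentence)   # 'Tim geht durch den Wald' and 'durch den Wald geht Tim'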
The parser is implemented in Python (cf. http://www.python.org/) as an expansion
of the feature chart parser in NLTK (Bird et al., forthcoming). It follows the two-stage
construction paradigm in the definition of PG (cf. Harbusch & Kempen, 2000). First, all
dominance structures are generated by an expanded Earley parser (scanning without
checking word order). In the second step, all word order arrays are calculated in an
efficient manner. Currently, we are extending the parser to cover semantic constructions.
This is a basic prerequisite for a fully operational teacher mode.
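For illustration, the following sketch shows feature-based chart parsing with a recent NLTK release (the API differs from the NLTK version used in our prototype); the toy grammar is ours and does not reflect the PG-specific Earley extension, which builds dominance structures without checking word order.

import nltk

fcfg = nltk.grammar.FeatureGrammar.fromstring("""
% start S
S -> NP[NUM=?n] VP[NUM=?n]
VP[NUM=?n] -> V[NUM=?n] PP
PP -> P NP[NUM=?n]
NP[NUM=?n] -> PropN[NUM=?n] | Det[NUM=?n] N[NUM=?n]
PropN[NUM=sg] -> 'Tim'
V[NUM=sg] -> 'geht'
P -> 'durch'
Det[NUM=sg] -> 'den'
N[NUM=sg] -> 'Wald'
""")

parser = nltk.parse.FeatureChartParser(fcfg)
for tree in parser.parse("Tim geht durch den Wald".split()):
    print(tree)   # one feature-annotated constituent tree for the example clause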
The natural language generation component (called Performance Grammar
Workbench (PGW)) can produce all paraphrases licensed by German (or Dutch) word
order rules (Harbusch et al., 2006). It has an interactive input device, but it can also
read a file with a dominance tree and (possibly underspecified) feature descriptions of
a clause (cf. example in Section 4.1). The appropriate syntactic shaping based on the
abstract input specification is fully under the control of PGW. The system calculates all
licensed sentences – in particular all word order variations. Moreover, the system runs
a set of malrules that describe typical errors users make (cf. Section 5). For instance,
such a rule allows verb-second word order in German subordinate clauses. A list of the ill-formed
clauses and the respective malrules is constructed separately. So the Sentence Fairy
can issue an accurate error message according to each malrule (cf. feedback type
(7)).
The list of correct paraphrases and the list of malrule-caused variants are used
to associate exact feedback with the students’ choices in the individual sentence
construction exercises (cf. Section 4.2).
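A minimal sketch of this lookup (function and variable names are illustrative, not the actual PGW interface): the pupil's drag & drop result is matched first against the correct paraphrases and then against the malrule-generated variants, each of which is tagged with an explanatory message.

def normalize(s):
    return " ".join(s.lower().replace(",", " ,").split())

def give_feedback(student_sentence, correct_paraphrases, malrule_variants):
    """Return (ok, message) for a sentence assembled by the pupil."""
    s = normalize(student_sentence)
    if s in {normalize(p) for p in correct_paraphrases}:
        return True, "Well done, that is a correct sentence!"            # compliment
    for variant, explanation in malrule_variants.items():
        if s == normalize(variant):
            return False, explanation                                    # type (7) feedback
    return False, "Try again!"                                           # type (3) fallback

correct = ["Tim geht durch den Wald, denn er möchte einen Spaziergang machen."]
malrules = {   # ill-formed variant produced by the verb-second-in-subordinate-clause malrule
    "Tim geht durch den Wald, weil er möchte einen Spaziergang machen.":
        "You used main clause word order with a subordinating conjunction.",
}
print(give_feedback("Tim geht durch den Wald, weil er möchte einen Spaziergang machen.",
                    correct, malrules))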
3 General design concepts of the Sentence Fairy system
In this section, we outline the software-ergonomic concepts realized in our e-learning
tool. Moreover, we discuss the basic feedback strategy underpinning the system.
The Sentence Fairy system is designed for elementary schoolers aged 8 to 10
years. We try to meet the auditory and visual expectations and needs of such users. The
selected basic colors follow software-ergonomic rules (Baumgart et al., 1996). We
have chosen yellow/orange/green as “young” and basically reassuring colors. All
further color choices were made to produce a good contrast with the chosen background (cf.
the basic colors in Figure 1).
The overall screen layout is always the same, so that the students’ expectations
after an initial learning phase are satisfied and working with the system becomes
easier in the long run. In Figure 1, the meta-concept of our screen layout is illustrated
by blue balloons. All concepts are distributed over the page according to their
procedural order and follow a natural reading direction for German, i.e. top down, left
to right. Accordingly, the task to be performed by the pupil resides in the top left panel
(headline of the screen) followed by the exercise itself – distributed over virtually the
whole screen. The system’s dialog acts (feedback and audio support) occupy the
corners in the right panel. This layout conveys that the Sentence Fairy and the
mushroom (see below) step in only occasionally to help.
The Fairy permanently resides in the upper right corner. We personify the tutor
because studies have observed that a visible tutor improves e-learning success
(Paechter & Schweitzer, 2006). She provides feedback when the pupil presses the
orange Fertig ‘Done’ button in the lower right panel at the end of
every exercise.
As for audio support, a loudspeaker represented by a mushroom assists the
Sentence Fairy in the lower right corner. It helps pupils with reading problems, as
sentences of the story on the page can be dragged to the mushroom to be read aloud
to the pupil. Those sentences snap back automatically to their original position to be
used in the exercise. Moreover, the pupils can also listen to exercises and feedback by
clicking the audio-buttons provided with those items. They are spoken with a female
voice to give the impression that the Sentence Fairy is speaking. The required MP3 files
are part of the system, whereas the sentences of the story are supposed to be
synthesized by a text-to-speech system or provided by the teacher.
The center of the screen is devoted to the individual drag & drop tasks fulfilling our
design constraint that the children should not need to type (cf. Section 1). In general,
the lower right panel holds the pieces to be moved, and the target of the
drag & drop operation is always located in the upper and/or left panel of the screen.
As a consequence of this general design decision, the completed result of an exercise
fills the upper/left panel.
The feedback is associated with the Sentence Fairy in the upper right panel. It
consists of two parts: (1) a general binary indication of “correct” by a green checkmark
or “false” by a red circled F (cf. Figure 1) and (2) a compliment or an encouragement
with an explanatory text on how to do better.
The general question of what to present as feedback is a delicate one here. Mason &
Bruning (2001) define the following strategies as a summary of many earlier
approaches:
1. No feedback: only a final result statistic is provided to the user at the end of the
whole questionnaire.
2. Knowledge of response: simple yes/no feedback for every input (item verification).
3. Answer until correct: the same as (2), but in case of an error the user has to try
until the answer is correct.
4. Knowledge of correct response: the correct result is provided together with the
user’s answer.
5. Topic contingent: in case of a false answer, the feedback points at passages or
other learning material where the correct information is located.
6. Response contingent: the feedback explains why the incorrect answer was wrong
and why the correct answer is correct.
7. Bug related: the feedback relies on “bug libraries” or rule sets to identify and
correct a variety of common student errors.
8. Attribute isolation: the feedback provides item verification and highlights the
central attributes of the target concept, focusing learners on key components of the
concept to improve general understanding of the phenomenon.
There is no single best strategy. Moreover, we have to respect the limitations of
our users (e.g. use an adequate vocabulary and simple lines of argumentation, and
keep the pupils’ motivation high). So, in the first prototype, we follow strategy (3),
supplemented with simple type (4) texts and type (7) feedback (based on malrule
numbers), phrased as a not too complicated answer by the Sentence Fairy. However,
tailoring the feedback to the expectations of the pupils is a concrete matter of future
work and, in particular, of further usability tests (cf. Section 4.3).
For an explanation of the exercise presented in Figure 1 see Section 4.2.
Figure 1: General look of any exercise.
4 The operating modes of the Sentence Fairy system
In this Section, we outline the three operating modes of the Sentence Fairy system. In
the teacher mode (cf. Section 4.1), an abstract story representation is constructed in a
dialog with the teacher. Based on the resulting abstract story representation, the
Sentence Fairy currently sets up three different exercise types fully automatically in the
exercise generation mode (cf. Section 4.2). In the learner mode, the pupils run the
virtual writing conference. In this paper we discuss first results of a usability study (cf.
Section 4.3).
4.1 Teacher mode to encode new stories
The goal of the teacher mode is to create new stories with minimal effort. In the long
run, we want to simplify the dialog so that even the pupils can enter their stories
themselves, following a deliberate “knowledge-telling” strategy. Currently, this mode is not implemented in
the online prototype of the Sentence Fairy. Instead, we have directly manipulated the
internal knowledge bases of the system.
As teachers cannot be expected to encode the syntactic/semantic representation
directly, our system extracts those representations by parsing plain text
(cf. Section 2) and by asking for some elaboration and clarification in a dialog mode.
Missing lexical items and their morpho-syntactic and semantic features can be added
in a multiple-choice fashion. The parsed structures are verified, expanded or revised in
order to build correct and complete internal story representations. For instance, the
teacher may have to insert coreference tags between Tim and the boy. Moreover,
features like the gender of Tim have to be provided in order to enable feedback on
pronominalizations in the learner mode (cf. exercise (2)).
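This elaboration step could look roughly as follows (a sketch only; the dialog design and the data layout in the prototype differ, and all identifiers here are illustrative): missing features and coreference links are collected through simple multiple-choice questions.

def elaborate(parsed_story, ask):
    """ask(question, options) presents one multiple-choice question and returns the answer."""
    # 1. Fill in missing morpho-syntactic features (needed, e.g., for pronominalization feedback).
    for prop in parsed_story["propositions"]:
        for np in prop["nps"]:
            if "gender" not in np:                       # e.g. the proper noun 'Tim'
                np["gender"] = ask(f"Gender of '{np['lemma']}'?",
                                   ["masculine", "feminine", "neuter"])
    # 2. Ask for coreference links between referring expressions (e.g. 'Tim' / 'der Junge').
    lemmas = sorted({np["lemma"] for prop in parsed_story["propositions"] for np in prop["nps"]})
    for i, a in enumerate(lemmas):
        for b in lemmas[i + 1:]:
            if ask(f"Do '{a}' and '{b}' refer to the same entity?", ["yes", "no"]) == "yes":
                parsed_story.setdefault("coreference", []).append((a, b))
    return parsed_story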
Thus, basically, an ordered series of pictures and the corresponding text in terms of
simple main clauses have to be entered by the teacher. Moreover, (s)he answers the
questions asked by the system and, finally, (s)he determines a limited set of Rhetorical
Structure Theory (RST; Mann and Thompson, 1988) relations between pairs of sentences. In
the current version of the Sentence Fairy system, we focus on temporal and causal
relations and their lexical and syntactic realizations.
In Figure 2, a little snippet of the story representation is outlined. The left panel
illustrates the syntactic/semantic representation of the simple main clause Tim geht
durch den Wald ‘Tim walks through the forest’. In the right panel, two RST relations for
this sentence (cf. prop2) are delineated: prop1 (‘Tim wants to take a walk’; as
satellite (S)) and prop2 (as nucleus (N)) stand in a causal relationship, and they happen
simultaneously (cf. TIME_simult). RST relations represent discourse functions. In the
database, a nucleus (the representation of one main clause; cf. prop2) and a satellite
(cf. prop1) are given. Syntactic realizations for the discourse markers are encoded in
the Sentence Fairy system.
Left panel (syntactic/semantic representation):
prop1: ...
prop2: (walk ((sem (agent: Tim))
              (syn (subject (cat: ProperN; gender: masculine; ...)))
              (sem (loc: ...))))
prop3: ...

Right panel (RST relations):
...
CAUSE(N: prop2, S: prop1)
TIME_simult(N: prop2, S: prop1)
...
Figure 2: Snippet of the abstract story representation.
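As plain data, the snippet in Figure 2 could be rendered roughly as follows (illustrative only; the actual knowledge base format of the prototype may differ):

story = {
    "propositions": {
        "prop1": {"pred": "want_to_take_a_walk", "sem": {"agent": "Tim"}},
        "prop2": {
            "pred": "walk",
            "sem": {"agent": "Tim", "loc": "forest"},
            "syn": {"subject": {"cat": "ProperN", "gender": "masculine"}},
        },
    },
    "rst": [   # discourse relations between a nucleus (N) and a satellite (S)
        {"rel": "CAUSE", "N": "prop2", "S": "prop1"},
        {"rel": "TIME_simult", "N": "prop2", "S": "prop1"},
    ],
}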
4.2 Exercise generation mode to set up the learner mode exercises with
feedback
The first prototype of the Sentence Fairy system comprises three exercise types: (1)
Story reconstruction, (2) sentence combining and (3) word ordering. The first exercise
realizes the first step in a writing conference, i.e. reading the text aloud and clarifying the
story; the latter two belong to the text revision process, i.e. the stylistic reshaping of all
sentences in the story. By design, spell checking is not necessary in our drag & drop-based
system. The two final steps in a writing conference are not yet covered in our first
prototype, namely the final editing, which is supposed to be done by the teacher, and
a pretty print. Thus, in the following, we concentrate on the design of exercises for
stylistic reshaping in the writing conference process.
In the exercise generation mode, the Sentence Fairy system fully automatically sets
up the learner mode, where the pupils are invited to select sentences or words, and to
move them to target positions in a drag & drop manner. Well-targeted feedback is
automatically generated on the basis of the internal story representation. For each of
the exercise types, we outline how the exercise is constructed from the story
representation, what the student is presented with, and how the feedback is
generated.
Exercise (1) - Story reconstruction. From the temporally ordered list of main
clauses in the story representation, the system randomly extracts a subset, replacing
each member of this subset with an empty box. The resulting incomplete story is
presented on the left side of the screen as a cloze test; the extracted sentences are
shown on the right side of the screen and can be dragged into the empty boxes. The
automatic feedback is binary here (OK vs. encouraging to do it again, i.e. feedback
type (3)) and is determined by matching the pupil’s choice with the temporal order in
the system’s database. See Figure 1 for a screen shot where the pupil has only
partially solved the task. Some sentences still reside at their initial drag position, and some
empty boxes are not yet filled. The Sentence Fairy gives negative feedback but no
concrete hints, as the task is supposed to be easy; we decided to keep the pupil
motivated to go on without further explanation.
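A simplified sketch of how such a cloze exercise and its binary feedback can be derived from the ordered list of main clauses (all names are illustrative; the prototype additionally handles the screen layout and audio support):

import random

def build_cloze(ordered_sentences, n_gaps=3):
    """Blank out n_gaps randomly chosen sentences; return cloze, draggable items and gap positions."""
    gaps = sorted(random.sample(range(len(ordered_sentences)), n_gaps))
    cloze = [None if i in gaps else s for i, s in enumerate(ordered_sentences)]
    draggable = [ordered_sentences[i] for i in gaps]
    random.shuffle(draggable)              # presented on the right-hand side of the screen
    return cloze, draggable, gaps

def check_cloze(filled, ordered_sentences, gaps):
    """filled maps a gap position to the sentence the pupil dropped there (binary, type (3) feedback)."""
    ok = all(filled.get(i) == ordered_sentences[i] for i in gaps)
    return ok, "Well done!" if ok else "Not quite right yet - try again!"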
Exercise (2) - Sentence combining (Mellon, 1969; Daiker et al., 1985). All
syntactic instantiations of an arbitrarily selected RST relation together with its nucleus
and satellite are passed on to the paraphrase generator PGW (cf. Section 2). The
latter returns all grammatically correct compound sentences. For instance, the relation
“CAUSE” can be realized by the coordinating conjunction denn ‘for’ or by the
subordinating conjunctions da and weil ‘because’. These conjunctions are presented
on the screen as selectable items in one choice box. All syntactic realizations of RST
relations are stored in the Sentence Fairy system.
Another box shows the syntactic realizations of nucleus and satellite as main or
subordinate clauses. (Word order in German main clauses is “verb-second”; in
subordinate clauses it is “verb-final”; moreover, notice that we have only chosen a
subset of word order variants in order not to overtax the pupils.) In order to practise
building a compound sentence, the student selects a conjunction and two clauses with
appropriate word orders and moves them into the corresponding choice boxes. Based
on the list of automatically generated paraphrases, the system computes its feedback:
“OK” if the student response matches one of the system’s paraphrases, or some help
otherwise. In the latter case, a feedback text is presented indicating which syntactic
constraint(s) imposed by the chosen conjunction was/were violated, e.g. “You used
main clause word order with a subordinating conjunction.” This is feedback of type (7),
i.e. a malrule is explained. The system is also able to evaluate the application of
pronominalization rules, e.g. to replace one of two coreferential NPs with a personal
pronoun.
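The core of the constraint check can be pictured as follows (a hand-written stand-in for illustration; the prototype instead matches the pupil's combination against the paraphrase and malrule lists delivered by PGW):

CONJUNCTIONS = {
    "denn": "coordinating",     # the following clause keeps main clause (verb-second) order
    "weil": "subordinating",    # the following clause must be verb-final
    "da":   "subordinating",
}

def check_combination(conjunction, second_clause_order):
    """second_clause_order is 'verb-second' or 'verb-final', as chosen by the pupil."""
    kind = CONJUNCTIONS[conjunction]
    if kind == "subordinating" and second_clause_order == "verb-second":
        return False, "You used main clause word order with a subordinating conjunction."
    if kind == "coordinating" and second_clause_order == "verb-final":
        return False, "After 'denn' the clause keeps main clause word order."
    return True, "Well done, that is a correct compound sentence!"

print(check_combination("weil", "verb-second"))   # -> (False, explanation of the malrule)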
In Figure 3, a series of snapshots illustrates the generated exercise. Figure 3 (a)
outlines the initial screen generated by the system from the random selection of an
RST relation and a related nucleus and satellite in the story representation. One
sentence is preselected to be the first element in a combined sentence (cf. Tim geht
durch den Wald ‘Tim walks through the forest’). However, the pupil can deselect this
sentence and choose the subordinate variant from the lower right panel, where the
subordinate word order for this sentence is provided as well. The other sentence (cf.
Tim möchte einen Spaziergang machen ‘Tim wants to take a walk’) is shown in main
and subordinate clause word orders (provided by the paraphrase generator PGW) on
the pile in the lower right panel. In the “bathtub”, all conjunctions licensed by the
RST relations are offered and can be dragged to one of the orange oval boxes.
In Figure 3 (b), a correct combined sentence has been produced using denn ‘for,
because’ (cf. the green checkmark and the positive feedback). However, the sentence
is not yet perfect, as using the word Tim twice should be avoided. The system
suggests a pronominalization (cf. Figure 3 (c); varying the referent (e.g. Tim vs. the
little boy) is not yet dealt with). The system redisplays the built sentence with a landing
site for a pronoun. Pronoun forms of different genders and cases are automatically
calculated and become selectable items in the “bathtub” now. In the snapshot, a
pronoun of the wrong gender (feminine instead of masculine, sie ‘she’ vs. er ‘he’) has
been selected. Our system can automatically identify this error by looking up the
syntactic information of Tim (cf. the negative feedback (red circled F) encouraging the
pupil to try once more). We could have given more linguistic details here but we do not
want to overtax the children with those details. In an L2-environment, the details would
probably help.
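The pronominalization check boils down to a feature lookup of the kind sketched below (the pronoun table and function names are illustrative; the prototype consults its internal story representation):

PRONOUNS = {                     # (gender, case) -> German personal pronoun
    ("masculine", "nominative"): "er",
    ("feminine", "nominative"): "sie",
    ("neuter", "nominative"): "es",
}

def check_pronoun(chosen, referent_gender, case="nominative"):
    """Compare the pupil's pronoun with the form required by the coreferential NP."""
    expected = PRONOUNS[(referent_gender, case)]
    if chosen == expected:
        return True, "Well done!"
    return False, "That pronoun does not fit here - try another one!"

# The situation in Figure 3 (c): 'sie' was chosen although 'Tim' is masculine.
print(check_pronoun("sie", "masculine"))   # -> (False, encouragement to try again)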
Figure 3 (a): Exercise (2) – initial screen of sentence combining task.
Exercise (3) - Word ordering. This exercise is motivated by further corpus studies
with the transliterated student essays. We found a disproportionate number of subject-
verb-object clauses. The exercise therefore highlights word order variation in German.
The system randomly selects an RST relation together with its related nucleus and
satellite from the story representation, generates a compound sentence, and presents it
on the screen after having replaced the major phrases of one of the original clauses
with empty boxes. In Figure 4, a relative clause has to be ordered by the pupil. The
system has automatically predetermined Tim, der _ _ _ _, bleibt interessiert stehen
‘Tim, who _ _ _ _, stops interestedly’. The pupil has
to find an appropriate word order for the relative clause (given the words: ‘somebody’,
‘sing’, ‘suddenly’, ‘hears’). The phrases are presented on the screen as selectable
items, and the student is invited to assemble a correct sentence — possibly one with a
constituent order that differs from the original order — by dragging the phrases into the
empty boxes. The Sentence Fairy can evaluate grammatical correctness by matching
the resulting sentence against the set of paraphrases computed by PGW. Accordingly,
an appropriate feedback is selected from a fixed list of error messages or
compliments. Figure 4 (b) illustrates our generator's capability to verify or falsify subtle
word order variants in German (singen hört ‘sing hears’ vs. hört singen ‘hears sing’
(‘hears singing’); cf. the negative feedback (red circled F) by the Fairy).
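Evaluating the assembled word order then reduces to a membership test in the paraphrase set computed by PGW, roughly as sketched below (the strings are illustrative and the real paraphrase set is larger):

def check_word_order(dragged_phrases, paraphrases):
    """Accept the pupil's order iff it occurs among the PGW-generated paraphrases."""
    candidate = " ".join(dragged_phrases)
    if candidate in paraphrases:
        return True, "Well done, that word order is correct!"
    return False, "This word order is not possible in German - try again!"

paraphrases = {"der plötzlich jemanden singen hört"}   # licensed order: 'singen hört'
print(check_word_order(["der", "plötzlich", "jemanden", "hört", "singen"], paraphrases))
# -> (False, ...): 'hört singen' is the subtle error shown in Figure 4 (b)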
Figure 3 (b): Exercise (2) – sentence combining successfully performed.
Figure 3 (c): Exercise (2) – sentence combining with pronominalization where a pronoun of the wrong
gender has been selected, which causes a negative feedback.
As outlined in the final Section, several other exercise types for stylistic
improvements of an essay can be produced based on the abstract story
representation. The next exercise to be implemented will deal with (in)direct speech,
which is a stylistic device for making a story more interesting. However, in the corpus
material one finds many errors in the use of this device – although the pupils are native
speakers of German (cf. Section 6).
Figure 4 (a): Exercise (3) – initial screen of word order variation.
Figure 4 (b): Exercise (3) – Word order variation failed due to a subtle error in German word order.
4.3 Learner mode in a first usability study
In the learner mode, the pupils interact with the system. We expect them to work
alone or in twos or threes without time limits. There should be somebody in the
background who can answer questions or help if needed.
We ran a preliminary usability study with the first prototype of the Sentence Fairy
system in order to identify general problems with the basic layout as defined in Section
3 and particularly to find elementary weaknesses of our drag & drop dialogs in the
individual exercises.
We are aware of the fact that this study is not statistically valid as we had only 6
children and their teacher solve the exercises and fill out a questionnaire. Moreover,
we ran the same experiment with 6 adults. For the next prototype, a statistically sound
study is planned. Nevertheless, the experiment gave positive feedback concerning our
overall layout and verified that all participants found the dialog with our virtual tutor
natural. They intuitively understood the drag & drop idea of the exercises. The children
accepted the Sentence Fairy as a virtual teacher and reacted positively to her
feedback.
We tested the system both in a classroom environment and at a pupil’s home. We
let them work alone and in a mixed group where one member had performed the
exercises before. We observed the children while they did the exercises (we are aware
that this may have influenced the experiment; however, bringing pupils to an observation
lab at our university might also have influenced it, and we preferred a field study
environment). At the end of the session we had them fill out a short questionnaire. We
asked whether the children found the individual exercises easy or difficult, and whether
they found the feedback understandable and helpful. The overall judgment was very
positive. But in such a
small group anonymity is not guaranteed and so the observations during the sessions
are more helpful for us.
Obviously, the pupils had difficulties reading longer texts: it took them quite some
time and effort. However, they hardly used the audio device after a first try to explore
the feature. This might be a “group effect”, as they do not want the other children to
notice that they are not fluent readers. The tendency was to skip the texts –
particularly the task description and the feedback – and instead ask the person in the
background for an oral explanation. This seems to be the usual strategy in the
classroom. We are considering activating the audio output for the task and the
feedback automatically.
Another observation concerned a misunderstanding in exercise (1): one pupil hit the
Fertig ‘Done’ button after placing only the first sentence. Obviously, the pupils prefer
immediate feedback in short dialogs with only little information per turn, even if this
means many turns. We will try to accommodate this in the next prototype.
Furthermore, we identified an error imposed by our design: because we present two
landing sites for conjunctions, the pupils feel inclined to fill both positions. When the
Sentence Fairy explains that there is only one conjunction per clause, they
immediately agree. Consequently, we will remove the second box as soon as one is filled.
We come back to the list of next steps to improve the Sentence Fairy system in
Section 6.
5 State of the art in essay writing systems
State-of-the-art computer support for writing tasks is restricted to multiple-choice
questions or quizzes. On the internet, one can find a wide variety of systems (cf., e.g.,
http://grammar.ccc.commnet.edu/grammar/index2.htm) which allow students to
exercise by typing their solution into a prepared window. This input is then compared
to the correct answer stored in the system. However, the number of systems that deploy
NLP components to analyze the user’s input or to generate the exercises is much
smaller.
Concerning NLP techniques, it is much more obvious to apply a natural language
parser to the students’ output than a generator. Virtually the entire literature on NLP
applications to the syntactic aspects of first- and second-language teaching is based
on syntactic parsing technology (see, e.g., Heift and Schulze, 2007). However, all
systems struggle with incorrect input. Thus, they all have to make sure that the parsing
quality does not deteriorate too much. For instance, Fortmann & Forst (2004) propose
malrules to cover typical errors.
To our knowledge, no software tool exists that deploys generation technology in a
“generate-and-test” manner to evaluate the grammatical quality of student output. A
main reason is probably the fact that virtually all natural language generation systems
work in a best-first manner, i.e. they produce only one output sentence but not all
paraphrases. As it is not so easy to change the control structure of such a system, the
choice of generators is very limited.
Zamorano Mansilla's (2004) project is the only one that applies a sentence
generator (KPML; Bateman, 1997) to the recognition and diagnosis of writing errors
(“fill-in-the-blank” exercises, not sentence combining). Zock and Quint (2004) convert
an electronic dictionary into a drill tutor or exercise generator for Japanese. They
deploy a goal-driven, template-based sentence generator. Thus, the paraphrasing
options for the user are limited.
Loosely related to the topic is the field of automatic generation of narratives (cf.
STORYBOOK (Callaway, 2000), a narrative prose generation system retelling variants
of the same story, or Narrator (Theune et al., 2006)). However, these systems have no
user-dialog interface that lets the student build sentences.
6 Conclusions
Summing up, we developed a virtual writing conference based on generation
technology. We base feedback on the output of a natural language generation system
that provides all paraphrases. We apply parsing technology only in the teacher mode,
where it helps teachers to encode new stories in a simple manner.
Presently, we are making the Sentence Fairy “student-proof” in preparation for a
system evaluation with third/fourth-grade elementary schoolers. In the near future, we
will develop exercises concerning the conversion between direct and indirect speech.
We deemed such exercises highly desirable when we analyzed a corpus of
transliterated handwritten essays by 4th-graders.
In the longer run, we hope to extend the Sentence Fairy with automatically generated
grammar instruction tailored to systematic errors, in an integrated manner: grammar
teaching exercises would be triggered by specific errors (e.g., a concrete malrule
number) in the sentence combining exercises. PGW can run a visualization mode for
the sentence construction process (cf. the COMPASS system, which does sentence
combining for purely syntactically encoded trees; Harbusch et al., 2007). This mode
could be activated in a Sentence Fairy session. We would like
to evaluate the learning success of such a combined mode in order to corroborate the
claim by Mellon (1969), whose empirical studies showed that writing instruction as well
as grammar teaching yield better results when trained in an integrated manner than
when trained in isolation.
Another direction in which we could expand the virtual writing conference idea is
L2-learning for learners of varying age. The kernel system could be more or less the
same. However, an appropriate surface system with elaborate motivation and grading
devices would have to be added to our first prototype.
References
John A. Bateman (1997). Enabling technology for multilingual natural language
generation: The KPML development environment. Journal of Natural Language
Engineering, 3:5—55.
Günter Baumgart, Angela Müller, and Gerhard Zeugner (1996). Farbgestaltung. Berlin:
Cornelsen Verlag.
Steven Bird, Ewan Klein, and Edward Loper (forthcoming). Natural Language
Processing in Python, see http://nltk.sourceforge.net/index.php/Book.
Charles Brendan Callaway. 2000. Narrative Prose Generation. Ph.D. thesis, North
Carolina State University, Raleigh, NC.
Ann Copestake, Dan Flickinger, Ivan Sag, and Carl Pollard (2005). Minimal Recursion
Semantics: An introduction, Journal of Research on Language and Computation,
3(2-3):281-332.
Donald A. Daiker, Andrew Kerek, and Max Morenberg (Eds.). (1985). Sentence
Combining: A rhetorical perspective. Carbondale: Southern Illinois University Press.
Christian Fortmann and Martin Forst (2004). An LFG Grammar Checker for CALL. In:
Rudolfo Delmonte, Philippe Delcloque & Sara Tonelli (Eds.). Procs.
InSTIL/ICALL2004 Symposium on NLP and speech technologies in advanced
language learning systems (Venice, Italy). Padova: Unipress.
Caroline Fränkel (forthcoming). Evaluation computerlinguistischer Methoden für die
Analyse von Schulaufsätzen. Diplomarbeit, Universität Koblenz-Landau, Campus
Koblenz, Germany.
Donald H. Graves (1983). Writing: Teachers & Children at Work. Portsmouth, NH:
Heinemann.
Karin Harbusch and Gerard Kempen (2000). Complexity of Linear Ordering in
Performance Grammar, TAG and HPSG. Procs. of the 5th International Workshop
on Tree Adjoining Grammars and Related Formalisms, Paris/France.
Karin Harbusch and Gerard Kempen (2002). A quantitative model of word order and
movement in English, Dutch and German complement constructions, Procs. of the
19th International Conference on Computational Linguistics (COLING 2002), Taipei,
Taiwan.
Karin Harbusch, Gerard Kempen, Camiel van Breugel, and Ulrich Koch (2006). A
generation-oriented workbench for Performance Grammar: Capturing linear order
variability in German and Dutch. Procs. of the Fourth International Natural
Language Generation Conference, Sydney, Australia.
Karin Harbusch, Camiel van Breugel, Ulrich Koch, and Gerard Kempen (2007).
Interactive sentence combining and paraphrasing in support of integrated writing
and grammar instruction: A new application area for natural language sentence
generators. Procs. of the 11th European Workshop on Natural Language
Generation (ENLG 2007), Dagstuhl, Germany.
Trude Heift & Mat Schulze (Eds.) (2003). Error diagnosis and error correction in CALL.
CALICO Journal, 20. (Special issue).
Trude Heift & Mathias Schulze (2007). Errors and Intelligence in Computer–Assisted
Language Learning: Parsers and Pedagogues. Routledge, London, GB.
Gerard Kempen and Karin Harbusch (2002). Performance Grammar: A declarative
definition. In: Anton Nijholt, Mariët Theune & Hendri Hondorp (Eds.), Computational
Linguistics in the Netherlands 2001. Pages 148-162. Amsterdam: Rodopi.
Gerard Kempen and Karin Harbusch (2003). Dutch and German verb constructions in
Performance Grammar. In: Pieter A.M. Seuren and Gerard Kempen (Eds.), Verb
Constructions in German and Dutch, Current Issues in Linguistic Theory 242. Pages
185-221. Amsterdam: John Benjamins.
Gerard Kempen and Karin Harbusch (2003). Word Order Scrambling as a
Consequence of Incremental Sentence Production. In: Holden Härtl and Heike
Tappe (Eds.). Mediating between Concepts and Grammar. Pages 141-164. Berlin:
Mouton De Gruyter.
Gerard Kempen and Karin Harbusch (2005). The relationship between grammaticality
ratings and corpus frequencies: A case study into word order variability in the
midfield of German clauses. In: Stephan Kepser and Marga Reis (Eds.), Linguistic
Evidence - Empirical, Theoretical, and Computational Perspectives. Pages 329-349.
Berlin: Mouton De Gruyter.
William C. Mann and Sandra A. Thompson (1988). Rhetorical Structure Theory:
Toward a functional theory of text organization. Text, 8(3):243-281.
B. Jean Mason and Roger Bruning (2001). Providing feedback in computer-based
instruction: What the research tells us. Retrieved April 5, 2004
(http://dwb.unl.edu/Edit/MB/MasonBruning.html)
John C. Mellon (1969). Transformational sentence-combining: A method for enhancing
the development of syntactic fluency in English composition. Urbana, IL: National
Council of Teachers of English.
Manuella Paechter and Karin Schweitzer (2006). Learning and motivation with virtual
tutors. Does it matter if the tutor is visible on the Net? In: Maja Pivec (Ed.), Affective
and emotional aspects in Human-Computer Interaction. Pages 155-164.
Amsterdam: IOS Press.
Mariët Theune, Nanda Slabbers, and Feikje Hielkema (2006). The Narrator: NLG for
digital storytelling. Procs. of the 11th European Workshop on Natural Language
Generation (ENLG 2007), Dagstuhl, Germany.
Franziska Thonke, Jana Groß Ophoff, Ingmar Hosenfeld, and Kevin Isaac (in print).
Kriteriengestützte Erfassung von Schreibleistungen im Projekt VERA. In: Procs. of
the 15th Europäischer Lesekongress der deutschen Gesellschaft für Lesen und
Schreiben (DGLS).
Juan Rafael Zamorano Mansilla (2004). Text generators, error analysis and feedback.
In: Rudolfo Delmonte, Philippe Delcloque & Sara Tonelli (Eds.). Procs.
InSTIL/ICALL2004 Symposium on NLP and speech technologies in advanced
language learning systems (Venice, Italy). Padova: Unipress.
Michael Zock and Julien Quint (2004). Converting an Electronic Dictionary into a Drill
Tutor. In: Rudolfo Delmonte, Philippe Delcloque & Sara Tonelli (Eds.). Procs.
InSTIL/ICALL2004 Symposium on NLP and speech technologies in advanced
language learning systems (Venice, Italy). Padova: Unipress.