
• ML: Classical methods from AI
– Decision-Tree Induction
– Exemplar-Based Learning
– Rule Induction
– Transformation-Based Error-Driven Learning
Transformation-Based Error-Driven Learning
(Brill 92,93,95)
• The learning algorithm is a mistake-driven
greedy procedure that iteratively acquires a set
of transformation rules
• Firstly, unannotated text is passed through an
initial-state annotator
• Then, at each step the algorithm adds the
transformation rule that best repairs the current
errors
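A minimal sketch of this greedy, error-driven loop in Python; the corpus representation, the rule encoding (a rule is simply a function that rewrites a tag sequence), and the stopping threshold are simplifying assumptions of the sketch, not Brill's actual implementation:

```python
def tbedl_train(tokens, gold_tags, initial_annotator, candidate_rules, min_gain=1):
    """Greedy, error-driven acquisition of an ordered list of transformation rules.

    tokens            -- words of the training text
    gold_tags         -- the "truth" annotation for those words
    initial_annotator -- function mapping tokens -> initial tags
    candidate_rules   -- list of candidate rules; each is a function (tokens, tags) -> new tags
    """
    current = list(initial_annotator(tokens))     # initial-state annotation
    learned = []                                  # rules kept in order of acquisition
    while True:
        errors = sum(c != g for c, g in zip(current, gold_tags))
        best_rule, best_gain = None, 0
        for rule in candidate_rules:              # score every candidate by net error reduction
            proposed = rule(tokens, current)
            gain = errors - sum(p != g for p, g in zip(proposed, gold_tags))
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None or best_gain < min_gain:
            break                                 # no rule repairs enough errors: stop
        current = best_rule(tokens, current)      # commit the best rule and continue
        learned.append(best_rule)
    return learned
```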
Transformation-Based Error-Driven Learning
(Brill 92,93,95)
• Concrete rules are acquired by instantiation of a
predefined set of template rules:
conjunction_of_conditions → transformation
• When annotating a new text, all the
transformation rules are applied in order of
generation
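A minimal sketch of what an instantiated template rule and its ordered application could look like; the Rule encoding, the feature names, and the context window are assumptions of this sketch, not Brill's implementation:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    """An instantiated template: change from_tag to to_tag when every condition holds."""
    from_tag: str
    to_tag: str
    conditions: list   # conjunction of (feature, value) pairs, e.g. [("prev_tag", "TO")]

    def applies_at(self, tokens, tags, i):
        if tags[i] != self.from_tag:
            return False
        context = {
            "prev_tag":  tags[i - 1] if i > 0 else "<S>",
            "next_tag":  tags[i + 1] if i + 1 < len(tags) else "</S>",
            "word":      tokens[i],
            "prev_word": tokens[i - 1] if i > 0 else "<S>",
            "word+2":    tokens[i + 2] if i + 2 < len(tokens) else "</S>",
        }
        return all(context.get(feat) == val for feat, val in self.conditions)


def annotate(tokens, initial_annotator, rules):
    """Tag new text: initial-state annotation, then each rule in order of generation."""
    tags = list(initial_annotator(tokens))
    for rule in rules:                  # the rules form an ordered sequence, not a set
        for i in range(len(tags)):      # left to right, in place, so earlier changes feed later ones
            if rule.applies_at(tokens, tags, i):
                tags[i] = rule.to_tag
    return tags
```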
Transformation-Based Error-Driven Learning
(Brill 92,93,95)
[Figure: TBEDL training loop. Unannotated text is tagged by the initial-state annotator; the learner compares the resulting annotated text with the "truth" and outputs transformation rules.]
TB(ED)L Applied to POS Tagging
(Brill 92,93,94,95)
• Initial_State_Annotator = Most_Frequent_Label
• Three types of templates:
– Non-lexicalized conditions
– Lexicalized patterns
– Morphological conditions for dealing with unknown
words
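A sketch of the most-frequent-label initial-state annotator; the unknown-word default used here (NNP for capitalised words, NN otherwise) is an assumption of the sketch:

```python
from collections import Counter, defaultdict

def build_most_frequent_tagger(tagged_corpus):
    """tagged_corpus: iterable of (word, tag) pairs taken from the training data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    lexicon = {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

    def initial_annotator(tokens):
        # Known words get their most frequent training tag; unknown words get a crude
        # default (proper noun if capitalised, common noun otherwise -- an assumption here).
        return [lexicon.get(w, "NNP" if w[:1].isupper() else "NN") for w in tokens]

    return initial_annotator
```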
TB(ED)L Applied to POS Tagging
(Brill 92,93,94,95)
• Non-lexicalized conditions:
[Figure/table omitted: first implementation (non-lexicalized rule templates)]
TB(ED)L Applied to POS Tagging
(Brill 92,93,94,95)
• Non-lexicalized conditions: best rules acquired
TB(ED)L Applied to POS Tagging
(Brill 92,93,94,95)
• Lexicalized patterns:
TB(ED)L Applied to POS Tagging
(Brill 92,93,94,95)
• Lexicalized patterns:
– as/IN tall/JJ as/IN
– We do n’t eat / We did n’t usually drink
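As an illustration only (these are not Brill's reported rules), such a lexicalized pattern can be encoded with the Rule format sketched earlier by conditioning on neighbouring word forms rather than on tags:

```python
# Hypothetical lexicalized rule in the Rule format sketched above:
# retag the first "as" of an "as ... as" comparison from IN to RB.
as_as_rule = Rule(from_tag="IN", to_tag="RB",
                  conditions=[("word", "as"), ("word+2", "as")])

# The "n't" examples would need a disjunctive condition ("one of the two preceding
# words is n't") to cover the intervening adverb in "did n't usually drink",
# which the simple conjunction format above cannot express.
```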
TB(ED)L Applied to POS Tagging
(Brill 92,93,94,95)
• Morphological conditions for dealing with
unknown words:
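A sketch of how a morphological, suffix-based condition for unknown words could be written in the function-based rule style used earlier; the concrete suffix and tags are illustrative assumptions:

```python
def suffix_rule(from_tag, to_tag, suffix):
    """Change from_tag to to_tag for words ending with the given suffix."""
    def rule(tokens, tags):
        return [to_tag if t == from_tag and w.endswith(suffix) else t
                for w, t in zip(tokens, tags)]
    return rule

# e.g. an unknown word initially tagged NN but ending in "-ing" is retagged VBG
ing_rule = suffix_rule("NN", "VBG", "ing")
```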
TB(ED)L Applied to POS Tagging
(Brill 92,93,94,95)
• Unknown words: best rules acquired
TB(ED)L Applied to POS Tagging
(Brill 92,93,94,95)
• Tested on 600K words of the annotated Wall Street Journal corpus
– Number of transformation rules: < 500
– Accuracy:
• 97.0% - 97.2% (with no unknown words)
• The accuracy of an HMM trigram tagger is matched using only 86 transformation rules
• 96.6% when unknown words are included (82.2% on unknown words)
TB(ED)L and NLP
• POS Tagging (Brill 92, 94a, 95; Roche & Schabes 95; Aone & Hausman 96)
• PP-attachment disambiguation (Brill & Resnik, 1994)
• Grammar induction and Parsing (Brill, 1993)
• Context-sensitive Spelling Correction (Mangu & Brill, 1996)
• Word Sense Disambiguation (Dini et al., 1998)
• Dialogue Act Tagging (Samuel et al., 1998a, 1998b)
• Semantic Role Labeling (Higgins, 2004; Williams et al., 2004; CoNLL-2004)
TB(ED)L: Main Drawback
• Computational cost
– Memory & time (especially during training)
• Some proposals
– Ramshaw & Marcus (1994)
– LazyTBL (Samuel, 1998)
– μ-TBL (Lager, 1999)
– ICA (Hepple, 2000)
– FastTBL (Ngai & Florian, 2001)
Extensions: LazyTBEDL
(Samuel 98)
• Uses Brill’s TB(ED)L algorithm
• Applies a Monte Carlo strategy that randomly samples from the space of rules, rather than exhaustively scoring all possible rules (see the sketch below)
• The memory and time costs of the TB(ED)L algorithm
are drastically reduced without compromising accuracy
on unseen data
• Application to Dialogue Act Tagging
– Accuracy: 75.5%, competitive with state-of-the-art systems
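A sketch of how such Monte Carlo sampling could replace the exhaustive scoring step of the earlier training sketch; the sample size and the uniform sampling scheme are assumptions, not Samuel's exact procedure:

```python
import random

def sample_best_rule(tokens, current, gold_tags, candidate_rules, sample_size=200):
    """Score only a random sample of candidate rules instead of the whole rule space."""
    pool = list(candidate_rules)
    sample = random.sample(pool, min(sample_size, len(pool)))
    errors = sum(c != g for c, g in zip(current, gold_tags))
    best_rule, best_gain = None, 0
    for rule in sample:
        proposed = rule(tokens, current)
        gain = errors - sum(p != g for p, g in zip(proposed, gold_tags))
        if gain > best_gain:
            best_rule, best_gain = rule, gain
    return best_rule, best_gain
```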
Extensions: FastTBEDL
(Ngai & Florian 01)
• Software available at: http://nlp.cs.jhu.edu/rflorian/fntbl
TB(ED)L: Summary
• Advantages
– General, simple and understandable modeling
– Provides a very compact set of interpretable
transformation rules
– High accuracy in many NLP applications
• Drawbacks
– Computational cost: high memory and time requirements, though efficient variants such as FastTBL have been proposed
– Sequential application of rules
TB(ED)L: Summary
• Others
– A transformation list is a processor and not a
classifier
– A comparison between decision trees and transformation lists can be found in (Brill, 1995)