Tree-based and Forest-based Translation

Yang Liu
Institute of Computing Technology
Chinese Academy of Sciences

Liang Huang
Information Sciences Institute
University of Southern California

ACL 2010 Tutorial, Uppsala, Sweden
July 11, 2010
Outline
Part 1: Tree-based Translation
  • Overview and Motivation
  • Tree-to-String Model and Decoding
  • Tree-to-String Rule Extraction
  • Language Model-Integrated Decoding: Cube Pruning
Part 2: Forest-based Translation
  • Packed Forest
  • Forest-based Decoding
  • Forest-based Rule Extraction
Part 3: Extensions
  • Tree-to-Tree Translation
  • Tree Sequence-based Translation
  • Joint Parsing and Translation
Part 4: Conclusion
Natural Languages are Different
Я люблю тебя
I love you
我爱你
Je t'aime
당신을 사랑합니다
Eu te amo
‫אני אוהב אותך‬
‫من شما را دوست دارم‬
Ich liebe dich
Te quiero
Miluji tě
Tôi yêu bạn
Ti amo
ผมรักคุณ
わたしは、あなたを愛しています
Ik hou van je
Jag älskar dig
By Google Translate
Translation is Hard!
connocting poopie
HELP ONESELF
TERMINATING MACHINE
Machine Translation
布什 与 沙龙 举行 了 会谈
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
Word-based MT
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
(Brown et al., 1993)
Phrase-based MT
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
(Koehn et al., 2003; Och and Ney, 2004)
Hierarchical Phrase-based MT
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
X3->(X1 yu shalong X2, X1 X2 with Sharon)
(Chiang, 2005; Chiang, 2007)
Syntax-based MT
bushi yu shalong juxing le huitan
(S (NP (NNP Bush)) (VP (VBD held) (NP (DT a) (NN talk)) (PP (IN with) (NP (NNP Sharon)))))
(Yamada and Knight, 2001; Galley et al., 2006; Shen et al., 2008)
Motivation
Human Translation
  • Understand the source sentence
  • Generate the target sentence
Compiling
  • Parse input program into a syntax tree
  • Generate code in machine language
Syntax-Directed Translation for Compiling
Input: y:=3*x+z
Parsing: (:= (id y) (+ (* (const 3) (id x)) (id z)))
(Irons, 1961; Lewis and Stearns, 1968; Aho and Ullman, 1972)
Motivation
Human Translation
  • Understand the source sentence
  • Generate the target sentence
Compiling
  • Parse input program into a syntax tree
  • Generate code in machine language
Machine Translation
  • Parse the source sentence into a tree
  • Recursively transfer the tree into the target language
Syntax-Directed Translation for MT
Input: bushi yu shalong juxing le huitan
Parsing: (IP (NPB bushi) (VP (PP (P yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan))))
(Huang et al., 2006)
Tree-to-String Translation
Recursive rewrite by pattern-matching (Liu et al., 2006; Huang et al., 2006)

Step 1. Rule:   IP(X1:NPB X2:VP) -> X1 X2
        Before: (IP (NPB bushi) (VP (PP (P yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan))))
Step 2. Rule:   NPB(bushi) -> Bush
        Before: (NPB bushi) (VP (PP (P yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan)))
Step 3. Rule:   VP(X1:PP X2:VPB) -> X2 X1
        Before: Bush (VP (PP (P yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan)))
Step 4. Rule:   VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
        Before: Bush (VPB (VS juxing) (AS le) (NPB huitan)) (PP (P yu) (NPB shalong))
Step 5. Rule:   NPB(huitan) -> talk
        Before: Bush held a (NPB huitan) (PP (P yu) (NPB shalong))
Step 6. Rule:   PP(P(yu) X1:NPB) -> with X1
        Before: Bush held a talk (PP (P yu) (NPB shalong))
Step 7. Rule:   NPB(shalong) -> Sharon
        Before: Bush held a talk with (NPB shalong)
Result: Bush held a talk with Sharon

Syntax-directed translation (e.g., Irons, 1961)
Tree transducer (e.g., Knight and Graehl, 2005)
Tree-to-string translation
Synchronous grammar (e.g., Eisner, 2003)
…
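The recursive rewrite above can be sketched in a few lines of Python. This is only an illustration, not the authors' decoder: the tuple encoding of trees, the `"X1:NPB"` string convention for variables, and the function names `match` and `translate` are assumptions made for this example. The rules are the seven shown on the slides.

```python
# Trees and rule patterns are nested tuples: (label, child, child, ...).
# A leaf word is a plain string; a variable is a string such as "X1:NPB".
# This encoding is an assumption made for the example.
RULES = [
    (("IP", "X1:NPB", "X2:VP"), "X1 X2"),
    (("NPB", "bushi"), "Bush"),
    (("VP", "X1:PP", "X2:VPB"), "X2 X1"),
    (("VPB", ("VS", "juxing"), ("AS", "le"), "X1:NPB"), "held a X1"),
    (("NPB", "huitan"), "talk"),
    (("PP", ("P", "yu"), "X1:NPB"), "with X1"),
    (("NPB", "shalong"), "Sharon"),
]

TREE = ("IP",
        ("NPB", "bushi"),
        ("VP",
         ("PP", ("P", "yu"), ("NPB", "shalong")),
         ("VPB", ("VS", "juxing"), ("AS", "le"), ("NPB", "huitan"))))


def match(pattern, node, bindings):
    """Match a rule pattern against a tree node, recording variable bindings."""
    if isinstance(pattern, str):
        if ":" in pattern:                       # variable, e.g. "X1:NPB"
            var, label = pattern.split(":")
            if isinstance(node, tuple) and node[0] == label:
                bindings[var] = node
                return True
            return False
        return pattern == node                   # terminal word
    if not isinstance(node, tuple) or len(pattern) != len(node):
        return False
    if pattern[0] != node[0]:                    # root labels must agree
        return False
    return all(match(p, n, bindings) for p, n in zip(pattern[1:], node[1:]))


def translate(node):
    """Rewrite the subtree at `node` with the first matching rule, recursively."""
    for lhs, rhs in RULES:
        bindings = {}
        if match(lhs, node, bindings):
            return " ".join(translate(bindings[tok]) if tok in bindings else tok
                            for tok in rhs.split())
    raise ValueError(f"no rule matches {node!r}")


print(translate(TREE))
```

Taking the first matching rule keeps the sketch short; a real decoder would score competing rules rather than commit to one.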
Expressive Power
phrasal translation
non-constituent phrase
VPB
PP
P
NPB
VS
AS
yu
shalong
juxing
le
with
Sharon
word omission
held
X1:NPB
LCP
LC
X1:IP
hou
P
dang
a X1
when X1
lexicalized re-ordering
NP
IP
CLP
X1:NP
DNP
VP
X2:IP
ben
X1
PP
multilevel re-ordering
QP
X1:CD
non-contiguous phrase
X3:VPB
X1:NP
X2:NP
DEG
de
X1
X3
X2
X2
of
X1
(Knight and Graehl, 2005)
Tree-to-String Rule Extraction (Galley et al., 2004)
Sentence pair: bushi yu shalong juxing le huitan / Bush held a talk with Sharon

1. Compute target spans
   node             target span
   IP               “Bush … Sharon”
   NPB (bushi)      “Bush”
   VP               “held … Sharon”
   PP               “with Sharon”
   P (yu)           “with”
   NPB (shalong)    “Sharon”
   VPB              “held a talk”
   VS (juxing)      “held”
   AS (le)          “held”
   NPB (huitan)     “talk”

2. Find admissible nodes

3. Extract minimal rules
   NPB(bushi) -> Bush
   VP(X1:PP X2:VPB) -> X2 X1
   VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
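A minimal sketch of the target-span and admissibility steps, assuming the usual frontier-node test from Galley et al. (2004): a node is admissible when no source word outside the node aligns into its target span. The alignment set and node names below are read off the target spans on the slides and are assumptions of this example.

```python
SRC = "bushi yu shalong juxing le huitan".split()
TGT = "Bush held a talk with Sharon".split()

# (source index, target index) links; "a" is left unaligned.
# Assumed from the target spans shown on the slides.
ALIGN = {(0, 0), (1, 4), (2, 5), (3, 1), (4, 1), (5, 3)}

# Source spans [start, end) of the tree nodes.
NODES = {
    "IP": (0, 6), "NPB-bushi": (0, 1), "VP": (1, 6), "PP": (1, 3),
    "P": (1, 2), "NPB-shalong": (2, 3), "VPB": (3, 6),
    "VS": (3, 4), "AS": (4, 5), "NPB-huitan": (5, 6),
}


def target_span(span):
    """Smallest target interval covering everything aligned to the source span."""
    start, end = span
    tgt = [t for s, t in ALIGN if start <= s < end]
    return (min(tgt), max(tgt)) if tgt else None


def is_admissible(span):
    """True if no source word outside the span aligns inside its target span."""
    ts = target_span(span)
    if ts is None:
        return False
    lo, hi = ts
    start, end = span
    return all(not (lo <= t <= hi)
               for s, t in ALIGN if not (start <= s < end))


for name, span in NODES.items():
    lo, hi = target_span(span)
    print(f"{name:12s} {' '.join(TGT[lo:hi + 1]):30s} admissible={is_admissible(span)}")
```

Under this alignment VS and AS both map to "held", so neither is admissible on its own, which is consistent with the minimal VPB rule keeping VS(juxing) and AS(le) inside its left-hand side.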
Tree-to-String Rule Extraction
Get composed rules (tree substitution)
    VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
  + NPB(huitan) -> talk
  = VPB(VS(juxing) AS(le) NPB(huitan)) -> held a talk
(Galley et al., 2006)
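Composition by tree substitution can be sketched as follows. The tuple encoding of rule left-hand sides and the helper names `substitute` and `compose` are assumptions for illustration, not part of the cited work.

```python
# A rule is (lhs_tree, rhs_string); trees are nested tuples (label, children...).
# Variables are strings such as "X1:NPB" (an encoding assumed for this example).
OUTER = (("VPB", ("VS", "juxing"), ("AS", "le"), "X1:NPB"), "held a X1")
INNER = (("NPB", "huitan"), "talk")


def substitute(tree, var, subtree):
    """Replace the variable leaf `var` in `tree` by `subtree`."""
    if isinstance(tree, str):
        return subtree if tree == var else tree
    return tuple(substitute(child, var, subtree) for child in tree)


def compose(outer, inner, var):
    """Plug `inner` into the variable `var` of `outer` on both sides."""
    (outer_lhs, outer_rhs), (inner_lhs, inner_rhs) = outer, inner
    name, label = var.split(":")
    if inner_lhs[0] != label:
        raise ValueError("root label of inner rule must match the variable label")
    new_rhs = " ".join(inner_rhs if tok == name else tok
                       for tok in outer_rhs.split())
    return substitute(outer_lhs, var, inner_lhs), new_rhs


print(compose(OUTER, INNER, "X1:NPB"))
```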
Bottom-up Decoding (Liu et al., 2006; Huang et al., 2006)
Source tree: (IP (NPB bushi) (VP (PP (P yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan))))

step  node            rule                                         translation
1     NPB (bushi)     NPB(bushi) -> Bush                           Bush
2     NPB (shalong)   NPB(shalong) -> Sharon                       Sharon
3     NPB (huitan)    NPB(huitan) -> talk                          talk
4     PP              PP(P(yu) X1:NPB) -> with X1                  with Sharon
5     VPB             VPB(VS(juxing) AS(le) X1:NPB) -> held a X1   held a talk
6     VP              VP(X1:PP X2:VPB) -> X2 X1                    held a talk with Sharon
7     IP              IP(X1:NPB X2:VP) -> X1 X2                    Bush held a talk with Sharon
Beam Search
IP
VP
PP
NPB
P
NPB
bushi
yu shalong
VPB
VS
AS
NPB
juxing
le
huitan
(Liu et al., 2006; Huang et al., 2006)
Exhaustive Search
Rule: VP(X1:PP X2:VPB) -> X2 X1
PP1,3:  with Sharon | and Sharon | Sharon with | Sharon and
VPB3,6: held a talk | held talks | hold a talk | hold talks
VP1,6:  held a talk with Sharon | held a talk and Sharon | held talks with Sharon | held talks and Sharon | …
Update Bigram LM Probability
with Sharon: p1 = p("with") * p("Sharon" | "with")
held a talk: p2 = p("held") * p("a" | "held") * p("talk" | "a")
Only boundary words are used to update LM probability!
with Sharon held a talk: p1 * p2 * p("held" | "Sharon") / p("held")
held a talk with Sharon: p1 * p2 * p("with" | "talk") / p("with")
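The boundary-word update can be checked numerically. In the sketch below all probabilities are made-up values chosen only for illustration, and the function names are assumptions; the point is that the combined score equals the bigram score of the full string.

```python
from math import isclose

# Hypothetical unigram and bigram probabilities (not from any real model).
UNI = {"with": 0.02, "held": 0.01}
BI = {
    ("with", "Sharon"): 0.1,
    ("held", "a"): 0.3,
    ("a", "talk"): 0.05,
    ("Sharon", "held"): 0.2,
    ("talk", "with"): 0.25,
}


def phrase_prob(words):
    """Bigram probability of a phrase scored in isolation."""
    p = UNI[words[0]]
    for prev, cur in zip(words, words[1:]):
        p *= BI[(prev, cur)]
    return p


def combine(left, right):
    """Score left+right using only the two boundary words."""
    return (phrase_prob(left) * phrase_prob(right)
            * BI[(left[-1], right[0])] / UNI[right[0]])


pp = ["with", "Sharon"]
vpb = ["held", "a", "talk"]
print(combine(vpb, pp), phrase_prob(vpb + pp))   # held a talk with Sharon
print(combine(pp, vpb), phrase_prob(pp + vpb))   # with Sharon held a talk
```

Dividing by the unigram of the right phrase's first word removes the term that was used when that phrase was scored alone, and the boundary bigram replaces it.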
Exhaustive Search with a Bigram Language Model
Rule: VP(X1:PP X2:VPB) -> X2 X1
PP1,3:  with * Sharon | and * Sharon | Sharon * with | Sharon * and
VPB3,6: held * talk | held * talks | hold * talk | hold * talks
VP1,6:  held * Sharon | hold * Sharon | held * with | hold * with | …
Monotonicity
VP1,6 = PP1,3 + VPB3,6 (monotonic)

VPB3,6 \ PP1,3            1.0     3.0     4.0     6.5
held * talk      1.0      2.0     4.0     5.0     7.5
held * talks     1.1      2.1     4.1     5.1     7.6
hold * talk      2.0      3.0     5.0     6.0     8.5
hold * talks     3.5      4.5     6.5     7.5    10.0
(Huang and Chiang, 2005, 2007; Chiang, 2007)
Non-Monotonicity
LM introduces non-monotonicity
Combination cost, e.g. log(p(with|talk)) - log(p(with))

VPB3,6 \ PP1,3            1.0          3.0          4.0          6.5
held * talk      1.0      2.0 + 0.5    4.0 + 2.0    5.0 + 4.0    7.5 + 4.0
held * talks     1.1      2.1 + 0.3    4.1 + 1.5    5.1 + 3.5    7.6 + 3.0
hold * talk      2.0      3.0 + 0.5    5.0 + 2.0    6.0 + 4.0    8.5 + 4.0
hold * talks     3.5      4.5 + 0.3    6.5 + 1.5    7.5 + 3.5    10 + 3.5
(Huang and Chiang, 2005, 2007; Chiang, 2007)
Cube Pruning
Grid for VP1,6 with LM costs included (rows: VPB3,6, columns: PP1,3)

VPB3,6 \ PP1,3            1.0     3.0     4.0     6.5
held * talk      1.0      2.5     6.0     9.0    11.5
held * talks     1.1      2.4     5.6     8.6    10.6
hold * talk      2.0      3.5     7.0    10.0    12.5
hold * talks     3.5      4.8     8.0    11.0    13.5

step   queue               4-best
0      (empty)             (empty)
1      2.5                 (empty)
2      2.4 6.0             2.5
3      3.5 5.6 6.0         2.4 2.5
4      4.8 5.6 6.0 7.0     2.4 2.5 3.5
5      5.6 6.0 7.0         2.4 2.5 3.5 4.8
(Huang and Chiang, 2005, 2007; Chiang, 2007)
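The queue-based enumeration on the Cube Pruning slides can be sketched with a heap. The grid values are the LM-integrated costs from the slides; the function name `cube_prune` and the grid-as-list encoding are assumptions of this example.

```python
import heapq

# Rows: VPB3,6 candidates; columns: PP1,3 candidates (costs from the slides).
GRID = [
    [2.5, 6.0, 9.0, 11.5],    # held * talk
    [2.4, 5.6, 8.6, 10.6],    # held * talks
    [3.5, 7.0, 10.0, 12.5],   # hold * talk
    [4.8, 8.0, 11.0, 13.5],   # hold * talks
]


def cube_prune(grid, k):
    """Pop k cells from the grid, starting at the top-left corner and
    pushing the lower and right neighbours of each popped cell."""
    heap = [(grid[0][0], 0, 0)]
    seen = {(0, 0)}
    popped = []
    while heap and len(popped) < k:
        cost, i, j = heapq.heappop(heap)
        popped.append(cost)
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(grid) and nj < len(grid[0]) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (grid[ni][nj], ni, nj))
    return popped


order = cube_prune(GRID, 4)
print(order)           # pop order
print(sorted(order))   # the 4-best list after sorting
```

Because the LM makes the grid non-monotonic, items can come off the queue out of order (2.5 before 2.4 here), so the popped list is sorted at the end; in general the result is approximate rather than guaranteed k-best.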
Cube Pruning within Rule Group
Group rules that have the same LHS:
   VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
   VPB(VS(juxing) AS(le) X1:NPB) -> held X1
   VPB(VS(juxing) AS(le) X1:NPB) -> hold a X1

rule \ NPB5,6           1.0     2.0     2.5
held a X1      1.0      2.1     5.0     3.7
held X1        1.4      3.2     4.0     5.0
hold a X1      2.0      3.1     6.0     4.7
(Huang and Chiang, 2005, 2007; Chiang, 2007)
Cube Pruning within Node
VP1,6
PP1,3
VPB3,6
NPB2,3
NPB5,6
NPB2,3
VPB3,6
process all rules simultaneously!
significant savings of computation
(Huang and Chiang, 2005, 2007; Chiang, 2007)
Syntactic Ambiguity
It is important to choose a correct tree for producing a good translation!
Tree 1 (yu = with): (IP (NPB bushi) (VP (PP (P yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan))))
   “Bush held a talk with Sharon”
Tree 2 (yu = and): (IP (NP (NPB bushi) (CC yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan)))
   “Bush and Sharon held a talk”
Parsing Mistake Propagation
source string -> [parse] -> tree -> [translate] -> target string
Parsing mistakes potentially introduce translation mistakes!
(Quirk and Corston-Oliver, 2006)
1-best Trees => n-best Trees?
IP
IP
VP
PP
NPB
P
VPB
NPB
VS
AS NPB
bushi yu shalong juxing le huitan
NP
NPB CC NPB
VPB
VS
AS NPB
bushi yu shalong juxing le huitan
Very few variations among the n-best trees!
Packed Forest
Nodes: IP0,6; NP0,3; VP1,6; PP1,3; NPB0,1; CC1,2; P1,2; NPB2,3; VPB3,6; VS3,4; AS4,5; NPB5,6
Words: bushi yu shalong juxing le huitan
(Billot and Lang, 1989; Klein and Manning, 2001; Huang and Chiang, 2005)
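A packed forest can be stored as a hypergraph in which each node lists its incoming hyperedges. In the sketch below the hyperedges are read off the two parse trees on the Syntactic Ambiguity slide; the dictionary encoding and the function name `count_trees` are assumptions for illustration.

```python
from math import prod

# node -> list of hyperedges, each hyperedge being a tuple of tail nodes.
# Nodes without an entry (NPB0,1, CC1,2, ...) are treated as leaves.
FOREST = {
    "IP0,6": [("NPB0,1", "VP1,6"), ("NP0,3", "VPB3,6")],
    "VP1,6": [("PP1,3", "VPB3,6")],
    "NP0,3": [("NPB0,1", "CC1,2", "NPB2,3")],
    "PP1,3": [("P1,2", "NPB2,3")],
    "VPB3,6": [("VS3,4", "AS4,5", "NPB5,6")],
}


def count_trees(node):
    """Number of distinct trees rooted at `node` that the forest encodes."""
    edges = FOREST.get(node)
    if not edges:
        return 1
    return sum(prod(count_trees(tail) for tail in tails) for tails in edges)


print(count_trees("IP0,6"))
```

Both readings share NPB0,1, NPB2,3 and the whole VPB3,6 subtree, which is what makes the packed representation compact as the number of encoded trees grows.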
Pattern Matching on Forest
Rule: IP(NP(X1:NPB CC(yu) X2:NPB) X3:VPB) -> X1 X3 with X2
Forest nodes: IP0,6; NP0,3; VP1,6; PP1,3; NPB0,1; CC1,2; P1,2; NPB2,3; VPB3,6; VS3,4; AS4,5; NPB5,6
Words: bushi yu shalong juxing le huitan
(Mi et al., 2008)
Translation Forest (Mi et al., 2008)
Rules matched on the forest:
   NPB(bushi) -> Bush
   NPB(shalong) -> Sharon
   NPB(huitan) -> talk
   VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
   IP(NP(X1:NPB CC(yu) X2:NPB) X3:VPB) -> X1 X3 with X2
Translations at forest nodes:
   NPB0,1   “Bush”
   NPB2,3   “Sharon”
   NPB5,6   “talk”
   VPB3,6   “held a talk”
   IP0,6    “Bush held a talk with Sharon”
N-best Trees Vs. Forest
(Mi et al., 2008)
Forest as Virtual ∞-best list

How often is the ith-best tree picked by the decoder?
(Mi et al., 2008)
Forest-based Rule Extraction (Mi and Huang, 2008)
Sentence pair: bushi yu shalong juxing le huitan / Bush held a talk with Sharon

1. Compute target spans
   node      target span
   IP0,6     “Bush … Sharon”
   NP0,3     “Bush □ with Sharon”
   VP1,6     “held … Sharon”
   PP1,3     “with Sharon”
   NPB0,1    “Bush”
   CC1,2     “with”
   P1,2      “with”
   NPB2,3    “Sharon”
   VPB3,6    “held … talk”
   VS3,4     “held”
   AS4,5     “held”
   NPB5,6    “talk”

2. Compute admissible nodes

3. Extract minimal rules
   NPB(bushi) -> Bush
   VP(X1:PP X2:VPB) -> X2 X1
   VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
   IP(X1:NPB X2:VP) -> X1 X2
   IP(NP(X1:NPB X2:CC X3:NPB) X4:VPB) -> X1 X4 X2 X3
Rule Probabilities and Rule Count
How often does a rule occur in training examples?

$$P(r \mid \mathrm{lhs}(r)) = \frac{c(r)}{\sum_{r' : \mathrm{lhs}(r') = \mathrm{lhs}(r)} c(r')}$$

$$P(r \mid \mathrm{rhs}(r)) = \frac{c(r)}{\sum_{r' : \mathrm{rhs}(r') = \mathrm{rhs}(r)} c(r')}$$

$$P(r \mid \mathrm{root}(\mathrm{lhs}(r))) = \frac{c(r)}{\sum_{r' : \mathrm{root}(\mathrm{lhs}(r')) = \mathrm{root}(\mathrm{lhs}(r))} c(r')}$$
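All three probabilities are relative frequencies that differ only in the grouping key. A minimal sketch follows; the rule counts are invented for illustration, and the string encoding of rules and the function name `relative_frequency` are assumptions.

```python
from collections import defaultdict

# Hypothetical counts c(r) for rules written as (lhs, rhs) strings.
COUNTS = {
    ("VPB(VS(juxing) AS(le) X1:NPB)", "held a X1"): 3.0,
    ("VPB(VS(juxing) AS(le) X1:NPB)", "held X1"): 1.0,
    ("NPB(huitan)", "talk"): 2.0,
    ("NPB(bushi)", "Bush"): 2.0,
}


def relative_frequency(counts, key):
    """P(r | key(r)) = c(r) / sum of c(r') over rules r' with the same key."""
    totals = defaultdict(float)
    for rule, c in counts.items():
        totals[key(rule)] += c
    return {rule: c / totals[key(rule)] for rule, c in counts.items()}


p_given_lhs = relative_frequency(COUNTS, key=lambda r: r[0])
p_given_rhs = relative_frequency(COUNTS, key=lambda r: r[1])
p_given_root = relative_frequency(COUNTS, key=lambda r: r[0].split("(")[0])

for rule, p in p_given_lhs.items():
    print(rule, p)
```

Counts are floats on purpose, so the same function also works with fractional counts.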
Fractional Count
Q: What's the count of this rule on this training example?
Rule: VP(X1:PP X2:VPB) -> X2 X1
[Figure: packed forest for “bushi yu shalong juxing le huitan” annotated with target spans, aligned to “Bush held a talk with Sharon”]
(Mi and Huang, 2008)
Fractional Count
For the hyperedge e = VP1,6 -> PP1,3 VPB3,6:

$$\alpha\beta(e) = \alpha(\mathrm{VP}_{1,6}) \cdot p(e) \cdot \beta(\mathrm{PP}_{1,3}) \cdot \beta(\mathrm{VPB}_{3,6})$$

$$c(r) = \frac{\alpha\beta(\mathrm{lhs}(r))}{\alpha\beta(\mathrm{TOP})}$$

[Figure: packed forest for “bushi yu shalong juxing le huitan” annotated with target spans, aligned to “Bush held a talk with Sharon”]
(Mi and Huang, 2008)
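A fractional count of this kind can be computed with inside and outside probabilities over the forest. The sketch below assumes that α denotes the outside and β the inside probability of a node, that α(TOP) = 1 so the normalizer equals β(TOP), and that the rule's left-hand side is a single hyperedge. The hyperedge probabilities 0.6 and 0.4 are invented for illustration.

```python
from math import prod, isclose

TOP = "IP0,6"

# node -> list of (tail nodes, hyperedge probability); values are hypothetical.
FOREST = {
    "IP0,6": [(("NPB0,1", "VP1,6"), 0.6), (("NP0,3", "VPB3,6"), 0.4)],
    "VP1,6": [(("PP1,3", "VPB3,6"), 1.0)],
    "NP0,3": [(("NPB0,1", "CC1,2", "NPB2,3"), 1.0)],
    "PP1,3": [(("P1,2", "NPB2,3"), 1.0)],
    "VPB3,6": [(("VS3,4", "AS4,5", "NPB5,6"), 1.0)],
}


def inside(v):
    """beta(v): total probability of all subtrees rooted at v."""
    edges = FOREST.get(v)
    if not edges:
        return 1.0
    return sum(p * prod(inside(u) for u in tails) for tails, p in edges)


def outside(v):
    """alpha(v): total probability of everything outside v."""
    if v == TOP:
        return 1.0
    total = 0.0
    for head, edges in FOREST.items():
        for tails, p in edges:
            if v in tails:
                siblings = prod(inside(u) for u in tails if u != v)
                total += outside(head) * p * siblings
    return total


def fractional_count(head, tails, p):
    """alpha(head) * p(e) * product of beta(tails), normalized by beta(TOP)."""
    return outside(head) * p * prod(inside(u) for u in tails) / inside(TOP)


print(fractional_count("VP1,6", ("PP1,3", "VPB3,6"), 1.0))
```

With these made-up probabilities the rule VP(X1:PP X2:VPB) -> X2 X1 gets a count of 0.6 on this example instead of the 1 it would get from a 1-best tree that contains it.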
Results
                      decoding
rule extraction       1-best tree    forest
1-best tree           0.2560         0.2674
forest                0.2679         0.2816
(Mi and Huang, 2008)
Tree-to-Tree Translation (Eisner 2003, Zhang, 2007)
Recursive rewrite by pattern-matching
Source tree: (IP (NPB bushi) (VP (PP (P yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan))))

Step 1. IP(X1:NPB X2:VP) -> S(X1:NP X2:VP)
Step 2. NPB(bushi) -> NP(NNP(Bush))
Step 3. VP(X1:PP VPB(VS(juxing) AS(le) X2:NPB)) -> VP(VBD(held) NP(DT(a) X2:NN) X1:PP)
Step 4. PP(P(yu) NPB(shalong)) -> PP(IN(with) NP(NNP(Sharon)))
Step 5. NPB(huitan) -> NN(talk)

Target tree: (S (NP (NNP Bush)) (VP (VBD held) (NP (DT a) (NN talk)) (PP (IN with) (NP (NNP Sharon)))))
Tree-to-Tree Rule Extraction (Zhang, 2007, Liu et al., 2009a)
Source tree: (IP (NPB bushi) (VP (PP (P yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan))))
Target tree: (S (NP (NNP Bush)) (VP (VBD held) (NP (DT a) (NN talk)) (PP (IN with) (NP (NNP Sharon)))))

1. Find admissible node pairs

2. Extract minimal rules
   NPB(bushi) -> NP(NNP(Bush))
   VP(X1:PP VPB(VS(juxing) AS(le) X2:NPB)) -> VP(VBD(held) NP(DT(a) X2:NN) X1:PP)
   IP(X1:NPB X2:VP) -> S(X1:NP X2:VP)

3. Get composed rules
     VP(X1:PP VPB(VS(juxing) AS(le) X2:NPB)) -> VP(VBD(held) NP(DT(a) X2:NN) X1:PP)
   + NPB(huitan) -> NN(talk)
   = VP(X1:PP VPB(VS(juxing) AS(le) NPB(huitan))) -> VP(VBD(held) NP(DT(a) NN(talk)) X1:PP)
Challenges
Tree-to-tree translation is over-constrained
  • Poorest rule coverage
  • Suffers from parsing mistake propagation on both sides
Recent advances
  • Use tree sequence (Zhang et al., 2008)
  • Use packed forest (Liu et al., 2009a)
  • Fuzzy extraction and decoding (Chiang, 2010)
Non-Constituent Phrase Pairs
IP
NP
VPB
NPB CC NPB
VS
AS
NPB
bushi yu shalong juxing le huitan
Bush
held
a
talk with Sharon
(Marcu et al., 2006)
Non-Constituent Phrase Pairs
bushi
yu shalong
Bush
held
a
NNP
VBD
DT
NP
juxing
le
huitan
talk
with
Sharon
NN
IN
NNP
NP
NP
PP
VP
S
(Marcu et al., 2006)
Non-Constituent Phrase Pairs
IP
VP
PP
NPB
P
VPB
NPB
VS
AS NPB
bushi yu shalong juxing le huitan
Bush
held
a
NNP VBD DT
NP
talk
with
Sharon
NN
IN
NNP
NP
NP
PP
VP
S
(Marcu et al., 2006)
Rule Coverage
Phrase pairs checked against each model (s2s, t2s, s2t, t2t):
   (shalong, Sharon)
   (huitan, talk)
   (yu shalong, with Sharon)
   (juxing le, held)
   (juxing … huitan, held … talk)
   (yu … huitan, held … Sharon)
   (bushi … huitan, Bush … Sharon)
   (yu, with)
   (bushi, Bush)
Coverage: s2s 100%, t2s 89%, s2t 89%, t2t 78%
[Figure: aligned source and target parse trees for “bushi yu shalong juxing le huitan” / “Bush held a talk with Sharon”, with a √/× mark per phrase pair and model]
Rule Coverage
model               human    automatic
string-to-string    100%     100%
tree-to-string      78%      75%
string-to-tree      76%      72%
tree-to-tree        68%      60%
Results from (Chiang, 2010)
Solutions
Source tree: (IP (NP (NPB bushi) (CC yu) (NPB shalong)) (VPB (VS juxing) (AS le) (NPB huitan)))
Target: Bush held a talk with Sharon

• Extend to larger rules (Galley et al., 2006)
     VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
• Add pseudo nodes (Marcu et al., 2006)
     VPB(X1:*VPB_*NPB X2:NPB) -> X1 a X2
     *VPB_*NPB(VS(juxing) AS(le)) -> held
• Use tree sequences (Liu et al., 2007; Zhang et al., 2008)
     VPB(X1:VS X1:AS X2:NPB) -> X1 a X2
     VS(juxing) AS(le) -> held
Tree-Sequence + Forest
system            input     rule                       BLEU
Moses             string    string-to-string           25.7
tree-to-string    tree      tree-to-string             26.1
                  tree      tree-sequence-to-string    27.0
                  forest    tree-to-string             27.7
                  forest    tree-sequence-to-string    28.8
Results from (Zhang et al., 2009)
Other Solutions
  • Re-structure syntax-trees (Wang et al., 2007)
  • Offer more trees (Mi and Huang, 2008)
  • Re-align syntax trees and strings (May and Knight, 2007)
  • Well-formed dependency structures (Shen et al., 2008)
  • Gibbs sampling (Cohn and Blunsom, 2009)
  • Joint decoding (Liu et al., 2009b)
Separate Parsing and Translation
source string -> [parse] -> tree/forest -> [translate] -> target string
• Separate grammar for parsing and translation
• Decoding is fast!
Joint Parsing and Translation
source
string
tree/forest
parse + translate
string
target
• Its search space is larger than tree/forest
• It is a translator as well as a parser
• Parsing interacts with translation
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
NPB
bushi
Bush
NPB
bushi
yu shalong
juxing
le
huitan
Bush
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: P(yu) -> with
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, with
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: NPB(shalong) -> Sharon
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, with, Sharon
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: NPB(huitan) -> talk
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, with, Sharon, talk
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: PP(X1:P X2:NPB) -> X1 X2
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, with, Sharon, talk
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: PP(X1:P X2:NPB) -> X1 X2
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, with Sharon, talk
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, with Sharon, talk
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: VPB(VS(juxing) AS(le) X1:NPB) -> held a X1
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, with Sharon, held a talk
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: VP(X1:PP X2:VPB) -> X2 X1
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, with Sharon, held a talk
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: VP(X1:PP X2:VPB) -> X2 X1
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, held a talk with Sharon
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: IP(X1:NPB X2:VP) -> X1 X2
Source: bushi yu shalong juxing le huitan
Target fragments: Bush, held a talk with Sharon
(Liu and Liu, 2010)
Tree-to-String Translation as Parsing
Rule: IP(X1:NPB X2:VP) -> X1 X2
Source: bushi yu shalong juxing le huitan
Target: Bush held a talk with Sharon
(Liu and Liu, 2010)
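The walkthrough above can be sketched as a small bottom-up chart parser in Python. This is only an illustration under stated assumptions: each rule's source side is flattened to its frontier (source words and labeled variables), which discards the internal VS/AS nodes of the VPB rule; the rule encoding and the names `RULES`, `match`, and `translate` are hypothetical; and the sketch keeps the first translation found per chart cell, with no model scores, language model, or pruning.

```python
# Each rule: (root label, source frontier, target template).
# Frontier symbols such as "X1:NPB" are variables; all others are source words.
RULES = [
    ("NPB", ["bushi"], ["Bush"]),
    ("P", ["yu"], ["with"]),
    ("NPB", ["shalong"], ["Sharon"]),
    ("NPB", ["huitan"], ["talk"]),
    ("PP", ["X1:P", "X2:NPB"], ["X1", "X2"]),
    ("VPB", ["juxing", "le", "X1:NPB"], ["held", "a", "X1"]),
    ("VP", ["X1:PP", "X2:VPB"], ["X2", "X1"]),
    ("IP", ["X1:NPB", "X2:VP"], ["X1", "X2"]),
]


def match(frontier, k, pos, end, words, chart, binding):
    """Yield variable bindings under which frontier[k:] covers words[pos:end]."""
    if k == len(frontier):
        if pos == end:
            yield dict(binding)
        return
    sym = frontier[k]
    if ":" in sym:                                   # variable, e.g. "X1:NPB"
        var, label = sym.split(":")
        for mid in range(pos + 1, end + 1):
            if (label, pos, mid) in chart:
                binding[var] = chart[(label, pos, mid)]
                yield from match(frontier, k + 1, mid, end, words, chart, binding)
                del binding[var]
    elif pos < end and words[pos] == sym:            # source word
        yield from match(frontier, k + 1, pos + 1, end, words, chart, binding)


def translate(words):
    """Fill a chart mapping (label, i, j) to a target-side token list."""
    chart = {}
    n = len(words)
    for length in range(1, n + 1):                   # shorter spans first
        for i in range(n - length + 1):
            j = i + length
            for root, frontier, target in RULES:
                for binding in match(frontier, 0, i, j, words, chart, {}):
                    out = []
                    for tok in target:               # substitute variables
                        out.extend(binding.get(tok, [tok]))
                    chart.setdefault((root, i, j), out)
    return chart


words = "bushi yu shalong juxing le huitan".split()
chart = translate(words)
print(" ".join(chart[("IP", 0, len(words))]))        # Bush held a talk with Sharon
```

Each chart cell pairs a source-side constituent label with its translation, which is the sense in which the decoder acts as both a parser and a translator.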
Translation Evaluation
algorithm  input   parsing model  rules  BLEU  time
matching   tree    none           1.2M   29.8   0.56
matching   forest  PCFG           1.9M   31.6   9.49
parsing    string  none           7.7M   32.0  51.41
parsing    string  PCFG           7.7M   32.4  55.52
parsing    string  Lex            7.7M   32.6  89.35
parsing    string  PCFG+Lex       7.7M   32.7  91.72
(Liu and Liu, 2010)
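To make the accuracy/speed trade-off on this slide concrete, the following Python sketch computes, for each configuration, the BLEU gain and the slowdown relative to the tree-based matching baseline. The BLEU and time values are copied from the slide; the row labels and the helper name `relative` are illustrative assumptions.

```python
# (configuration label, BLEU, time), copied from the slide.
ROWS = [
    ("matching / tree / none", 29.8, 0.56),
    ("matching / forest / PCFG", 31.6, 9.49),
    ("parsing / string / none", 32.0, 51.41),
    ("parsing / string / PCFG", 32.4, 55.52),
    ("parsing / string / Lex", 32.6, 89.35),
    ("parsing / string / PCFG+Lex", 32.7, 91.72),
]


def relative(rows):
    """Return (label, BLEU gain, time ratio) against the first row."""
    _, base_bleu, base_time = rows[0]
    return [
        (name, round(bleu - base_bleu, 1), round(time / base_time, 1))
        for name, bleu, time in rows[1:]
    ]


for name, gain, slowdown in relative(ROWS):
    print(f"{name}: +{gain} BLEU, {slowdown}x time")
```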
Parsing Evaluation
parsing model  F1    time
none           62.7  23.9
PCFG           65.4  24.7
Lex            79.8  48.8
PCFG+Lex       80.6  50.4
(Liu and Liu, 2010)
Results on Tree-to-Tree
task     extraction        rules  features  BLEU
Chinese  string-to-string  440M   1K        23.7
Chinese  tree-to-tree      50M    5K        23.9
Arabic   string-to-string  790M   1K        48.9
Arabic   tree-to-tree      38M    5K        47.5
Results from (Chiang, 2010)
Conclusion
• Statistical machine translation
  - Word-based
  - Phrase-based
  - Syntax-based
    - String-to-String
    - String-to-Tree
    - Tree-to-String
    - Tree-to-Tree
[Figure labels: flat, hierarchical]
Conclusion
• Tree-based translation
  - Pros: simplicity, faster decoding, expressive grammar, no need for binarization
  - Cons: commits to the 1-best tree
• Forest-based translation
  - Compromise between tree-based and string-based, combining the advantages of both
  - Fast decoding, but does not commit to 1-best trees
  - Significant improvement in translation performance over tree-based
VP(VV(谢谢) PN(大家)) -> Thank you
Bibliography
Alfred V. Aho and Jeffrey D. Ullman. 1972. The Theory of Parsing, Translation, and Compiling, volume I: Parsing. Prentice Hall, Englewood Cliffs, New Jersey.
Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2): 263-311.
Sylvie Billot and Bernard Lang. 1989. The structure of shared forests in ambiguous parsing. In Proceedings of ACL 1989.
David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. In Proceedings of ACL 2005.
David Chiang. 2007. Hierarchical phrase-based translation. Computational Linguistics, 33(2): 201-228.
David Chiang. 2010. Learning to translate with source and target syntax. In Proceedings of ACL 2010.
Trevor Cohn and Phil Blunsom. 2009. A Bayesian model for syntax-directed tree to string grammar induction. In Proceedings of EMNLP 2009.
Michel Galley, Mark Hopkins, Kevin Knight, and Daniel Marcu. 2004. What's in a translation rule? In Proceedings of HLT-NAACL 2004.
Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang, and Ignacio Thayer. 2006. Scalable inference and training of context-rich syntactic translation models. In Proceedings of COLING-ACL 2006.
Liang Huang and David Chiang. 2005. Better k-best parsing. In Proceedings of IWPT 2005.
Liang Huang and David Chiang. 2007. Forest rescoring: Faster decoding with integrated language models. In Proceedings of ACL 2007.
Liang Huang, Kevin Knight, and Aravind Joshi. 2006. Statistical syntax-directed translation with extended domain of locality. In Proceedings of AMTA 2006.
Liang Huang. 2008. Forest reranking: Discriminative parsing with non-local features. In Proceedings of ACL-HLT 2008.
E. T. Irons. 1961. A syntax-directed compiler for ALGOL 60. Comm. ACM, 4(1): 51-55.
Kevin Knight and Jonathan Graehl. 2005. An overview of probabilistic tree transducers for natural language processing. In Proceedings of CICLing 2005.
Dan Klein and Christopher D. Manning. 2001. Parsing and hypergraphs. In Proceedings of IWPT 2001.
Philipp Koehn, Franz Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of HLT-NAACL 2003.
P. M. Lewis and R. E. Stearns. 1968. Syntax-directed transduction. Journal of the ACM, 15(3): 465-488.
Yang Liu, Qun Liu, and Shouxun Lin. 2006. Tree-to-string alignment template for statistical machine translation. In Proceedings of COLING-ACL 2006.
Yang Liu, Yun Huang, Qun Liu, and Shouxun Lin. 2007. Forest-to-string statistical translation rules. In Proceedings of ACL 2007.
Yang Liu, Yajuan Lü, and Qun Liu. 2009a. Improving tree-to-tree translation with packed forests. In Proceedings of ACL-IJCNLP 2009.
Yang Liu, Haitao Mi, Yang Feng, and Qun Liu. 2009b. Joint decoding with multiple translation models. In Proceedings of ACL-IJCNLP 2009.
Yang Liu and Qun Liu. 2010. Joint parsing and translation. Submitted to COLING 2010.
Daniel Marcu, Wei Wang, Abdessamad Echihabi, and Kevin Knight. 2006. SPMT: Statistical machine translation with syntactified target language phrases. In Proceedings of EMNLP 2006.
Jonathan May and Kevin Knight. 2007. Syntax re-alignment models for machine translation. In Proceedings of EMNLP 2007.
Haitao Mi and Liang Huang. 2008. Forest-based translation rule extraction. In Proceedings of EMNLP 2008.
Haitao Mi, Liang Huang, and Qun Liu. 2008. Forest-based translation. In Proceedings of ACL-HLT 2008.
Franz Och and Hermann Ney. 2004. The alignment template approach to statistical machine translation. Computational Linguistics, 30(4): 417-449.
Libin Shen, Jinxi Xu, and Ralph Weischedel. 2008. A new string-to-dependency machine translation algorithm with a target dependency language model. In Proceedings of ACL-HLT 2008.
Ashish Venugopal, Andreas Zollmann, Noah Smith, and Stephan Vogel. 2008. Wider pipelines: n-best alignments and parses in MT training. In Proceedings of AMTA 2008.
Wei Wang, Kevin Knight, and Daniel Marcu. 2007. Binarizing syntax trees to improve syntax-based machine translation accuracy. In Proceedings of EMNLP 2007.
Kenji Yamada and Kevin Knight. 2001. A syntax-based statistical machine translation model. In Proceedings of ACL 2001.
Hui Zhang, Min Zhang, Haizhou Li, Aiti Aw, and Chew Lin Tan. 2009. Forest-based tree sequence to string translation model. In Proceedings of ACL-IJCNLP 2009.
Min Zhang, Hongfei Jiang, Aiti Aw, Jun Sun, Sheng Li, and Chew Lin Tan. 2007. A tree-to-tree alignment-based model for statistical machine translation. In Proceedings of MT Summit 2007.
Min Zhang, Hongfei Jiang, Aiti Aw, Haizhou Li, Chew Lin Tan, and Sheng Li. 2008. A tree sequence alignment-based tree-to-tree translation model. In Proceedings of ACL-HLT 2008.