SESSION
7
N a t u r a l Language:
Speech
A PARSER FOR A SPEECH UNDERSTANDING SYSTEM
by
W i l l i a m H. Paxton
and
Ann E. R o b i n s o n
A r t i f i c i a l I n t e l l i g e n c e Center
S t a n f o r d Research I n s t i t u t e
Menlo P a r k , C a l i f o r n i a
94025
Abstract
T h i s p a p e r d e s c r i b e s a p a r s i n g system s p e c i f i c a l l y
d e s i g n e d f o r spoken r a t h e r t h a n w r i t t e n i n p u t .
The
parser is part of a p r o j e c t in progress at Stanford
Research Institute to d e v e l o p a computer s y s t e m f o r
understanding speech.
The a p p r o a c h d e s c r i b e d uses as
much h e u r i s t i c k n o w l e d g e a s p o s s i b l e i n o r d e r t o m i n i mize t h e demands o n a c o u s t i c a n a l y s i s .
P a r s i n g Speech
P a r s i n g c o n t i n u o u s spoken E n g l i s h i s a s i g n i f i c a n t l y d i f f e r e n t problem than p a r s i n g w r i t t e n E n g l i s h ,
T h e r e a r e a t l e a s t two m a j o r d i f f e r e n c e s between spoken
and t e x t u a l i n p u t .
F i r s t , spoken i n p u t i s n o t a s e a s i l y decoded a s
text input.
There a r e d i f f i c u l t i e s a s s o c i a t e d w i t h
e s t a b l i s h i n g t h e b o u n d a r i e s o f t h e p a r t i c u l a r words i n
the a c o u s t i c stream, w i t h accounting f o r the v a r i a t i o n s
o f a word b y d i f f e r e n t s p e a k e r s , f o r d i f f e r e n t p r o n u n c i a t i o n s o f a word b y t h e same s p e a k e r , and f o r d i f f e r e n t p r o n u n c i a t i o n s o f t h e same word i n d i f f e r e n t c o n texts.
R e c o g n i t i o n o f spoken i n p u t words i s much more
complex and much l e a s r e l i a b l e t h a n r e c o g n i t i o n o f
w r i t t e n input words.
Second, speech provides information by prosodic
f e a t u r e s — i n t o n a t i o n , s t r e s s , pause, Juncture, 1 , 2 For
example, i n t o n a t i o n contours, the v a r i a t i o n s in p i t c h
and rhythm in an utterance, provide clues to the l o c a t i o n of phrase boundaries and to s y n t a c t i c r e l a t i o n ships of the words.
(Work is now being done, p a r t i c u l a r l y at the UN1VAC Corporation 3 and at the U n i v e r s i t y
of M i c h i g a n , * ' 6 on how to e x t r a c t such i n f o r m a t i o n from
the acoustic s i g n a l and how to Incorporate it i n t o a
grammar f o r the computer.)
A Strategy f o r Parsing Speech
In one approach to parsing t e x t , the parser reads
the input words, looks them up in the l e x i c o n , and uses
t h e i r s y n t a c t i c features f o r guidance.
This approach
ie much less d e s i r a b l e f o r parsing speech because of
the problem of r e l i a b l y separating and i d e n t i f y i n g the
input words using current acoustic techniques.
Because of t h i s l i m i t a t i o n of acoustic r e c o g n i t i o n ,
we have chosen to r e s t r i c t the d e t a i l e d acoustic p r o cessing to t e s t i n g words t h a t have been hypothesised by
the parser on the basis of a wide range of s y n t a c t i c ,
semantic, and prosodic knowledge. We expect the s h i f t
to v e r i f i c a t i o n of parser-proposed words to lead to more
r e l i a b l e acoustic d e c i s i o n s since the v e r i f i c a t i o n
algorithms can be tailormade f o r the i n d i v i d u a l words
and can take context i n t o c o n s i d e r a t i o n . Moreover, by
hypothesizing words in order of t h e i r l i k e l i h o o d in a
p a r t i c u l a r context, we should be able to reduce the
average number of i n c o r r e c t words proposed and, hence,
the p o t e n t i a l f o r e r r o r s i n acoustic r e c o g n i t i o n .
An early version of a system p a r t i a l l y using t h i s
approach has been put together by making m o d i f i c a t i o n s
to Terry Winograd's program. 6,7 Our new system is not
d i r e c t l y based on Winograd's program, but it reflects
his i n f l u e n c e .
It also has been influenced by the work
of B i l l Woods and Ron Kaplan on parsing s y s t e m s . 8 ' 9 ' 1 0
While keeping the approach mentioned above, the
new system differs from the old one in many ways.
Sign i f i c a n t l y , i t uses a " b e s t - f i r s t " parsing s t r a t e g y i n
place of the more conventional " d e p t h - f i r s t " s t r a t e g y .
Both s t r a t e g i e s are designed to deal w i t h "choice
p o i n t s " in the grammar—places where there are several
a l t e r n a t i v e s f o r continuing the parse but not enough
i n f o r m a t i o n a v a i l a b l e to decide among them. With the
d e p t h - f i r s t strategy, choice p o i n t s are handled by using
" b a c k t r a c k i n g . " (A s i n g l e path from the choice point is
pursued u n t i l an inconsistency w i t h the input is found.
At that time, the path is permanently abandoned. The
parser is backed up to the most r e c e n t l y encountered
choice point to t r y the next a l t e r n a t i v e . )
There are several reasons why t h i s d e p t h - f i r s t
method is u n a t t r a c t i v e f o r parsing speech. F i r s t , the
u n c e r t a i n t y o f acoustic r e c o g n i t i o n makes i t d i f f i c u l t
to decide conclusively whether a path should be abandoned. With speech, it makes more sense to deal w i t h
the l i k e l i h o o d of a path than w i t h a c a t e g o r i c a l acceptance or r e j e c t i o n . A second o b j e c t i o n is t h a t the
d e p t h - f i r s t strategy makes it impossible to hypothesize
words in order of t h e i r l i k e l i h o o d .
Before the other
a l t e r n a t i v e s at a choice point can be considered, a l l
p o s s i b i l i t i e s f o r s a t i s f y i n g the f i r s t a l t e r n a t i v e must
be e x p l o r e d . This forces the t e s t i n g of all words corresponding t o the f i r s t a l t e r n a t i v e regardless o f t h e i r
l i k e l i h o o d s . A f i n a l o b j e c t i o n t o the d e p t h - f i r s t
s t r a t e g y is t h a t it explores many more f a l s e paths than
a s t r a t e g y using h e u r i s t i c knowledge to guide the parse.
Extra f a l s e paths are p a r t i c u l a r l y bad f o r speech
because of the high cost of the acoustic t e s t s needed
to determine t h a t a path is a dead end and the danger
of f o l l o w i n g f a l s e paths f a r t h e r than necessary due to
u n c e r t a i n acoustic r e c o g n i t i o n .
Research i n A r t i f i c i a l
I n t e l l i g e n c e has shown t h a t s u b s t a n t i a l increases in
e f f i c i e n c y can r e s u l t from the a p p l i c a t i o n of h e u r i s t i c
knowledge to guide searches.
Equivalent gains should
be p o s s i b l e if knowledge is used to guide the search f o r
successful parse.
*See f o r instance N l l s s o n ' s book and the work referenced
there.11
These c o n s i d e r a t i o n s a l l s u p p o r t a b e s t - f i r s t
parsing strategy.
I n t h i s a p p r o a c h , each new p a t h
r e s u l t i n g from a choice p o i n t is assigned a p r i o r i t y
according to i t s estimated l i k e l i h o o d of leading to a
correct parse.
The p a t h s a r e t h e n added t o t h e s e t o f
a l l p a t h s t h a t have been g e n e r a t e d b u t n o t y e t e x t e n d e d
during this parse.
The system f o l l o w s t h e h i g h e s t p r i o r i t y p a t h f r o m t h e comprehensive s e t u n t i l i t s p r i o r i t y
drops or it reaches a choice p o i n t .
At that time the
cycle repeats.
S i n c e the new p a t h chosen need n o t be
one o f the s u c c e s s o r s o f t h e p r e v i o u s p a t h , t h e p a r s e r
w i l l not n e c e s s a r i l y continue along a s i n g l e path u n t i l
i t r e a c h e s a dead e n d .
I n s t e a d , i t w i l l suspend a p a t h
when t h e r e i s a n a l t e r n a t i v e a v a i l a b l e w i t h a h i g h e r
estimated l i k e l i h o o d .
I t w i l l resume t h e o r i g i n a l a t
a l a t e r t i m e i f i t becomes most l i k e l y a g a i n .
The b e s t - f i r s t method a v o i d s t h e o b j e c t i o n s made
t o the d e p t h - f i r s t s t r a t e g y .
F i r s t , i t i s not necessary to decide c o n c l u s i v e l y t h a t a path is inapprop r i a t e , t o f o r c e a "yes o r n o . "
The g r a d a t i o n s o f
c o n f i d e n c e i n a c o u s t i c t e s t s can b e r e f l e c t e d i n a
range of p r i o r i t i e s .
Second, words can b e h y p o t h e sized in order of t h e i r l i k e l i h o o d since t h e r e are no
c o n s t r a i n t s imposed b y the p a r s i n g s t r a t e g y o n the
o r d e r in which paths are e x p l o r e d .
And f i n a l l y , the
number o f f a l s e p a t h s e x p l o r e d can p o t e n t i a l l y b e
reduced by u s i n g h e u r i s t i c knowledge to guide t h e
parse.
C o n t r o l s t r u c t u r e s a l s o r e p l a c e GOTO's w i t h f u n c t i o n s t h a t g i v e a c l e a r and e x p l i c i t f o r m t o t h e s t a n d a r d p a r s i n g o p e r a t i o n s such a s l i s t i n g a l t e r n a t i v e s
and i d e n t i f y i n g o p t i o n a l e l e m e n t s .
We agree w i t h
D i j k s t r a and o t h e r s who f i n d GOTO's p a r t i c u l a r l y h a r m f u l t o p r o g r a m c l a r i t y . 1 2 - 1 4 Many o t h e r p a r s i n g s y s tems have t r e a t e d the p r o g r a m as a c o l l e c t i o n of l a b e l e d
blocks o f i n s t r u c t i o n s w i t h control t r a n s f e r r e d a r b i t r a r i l y by e x p l i c i t , reference to l a b e l s .
T h i s amounts
t o the u n c o n s t r a i n e d use o f GOTO's w h e t h e r t h e p a r t i c u l a r s y n t a c t i c form is a m u l t i d i r e c t i o n branch statement,
a s t a t e t r a n s i t i o n , o r a p r o d u c t i o n language f o r m a l i s m .
The e l i m i n a t i o n o f GOTO's i s p r o b a b l y t h e most i m p o r t a n t
way i n w h i c h t h e c o n t r o l s t r u c t u r e s reduce the c o m p l e x i t y o f the p a r s e r .
F i n a l l y , t h e r e a r e advantages i n d e v e l o p i n g t h e
c o n t r o l s t r u c t u r e s a s a n e x t e n s i o n t o t h e LISP l a n g u a g e .
These i n c l u d e c o m p l e t e freedom i n t h e use o f p r o c e d u r e s
f o r s t r u c t u r i n g the system, a v a i l a b i l i t y o f the c o n t r o l
and d a t a m a n i p u l a t i o n f a c i l i t i e s o f LISP, a n d c o m p a t i b i l i t y w i t h s t a n d a r d d e b u g g i n g and p r o g r a m development
aids .
B e f o r e d e s c r i b i n g t h e c o n t r o l s t r u c t u r e s t h a t have
been added t o LISP, i t i s u s e f u l t o r e v i e w t h e o v e r a l l
p a r s i n g s t r a t e g y and i n t r o d u c e some new t e r m i n o l o g y .
A s t h e system a t t e m p t s t o u n d e r s t a n d a n i n p u t , i t uses
the grammar to g e n e r a t e a sequence of p a t h s .
Corres p o n d i n g t o each p a t h i s a " p r o c e s s . "
The l i k e l i h o o d
o f a p a t h i s d e t e r m i n e d b y p r i o r i t y f u n c t i o n s and i s
r e f l e c t e d i n the value o f the p r i o r i t y f o r the p r o c e s s .
P r i o r i t i e s a r e p o s i t i v e numbers whose v a l u e s a r e i r r e l evant e x c e p t t o e s t a b l i s h a n o r d e r o f a l l t h e p r o c e s s e s .
The h i g h e s t p r i o r i t y p r o c e s s i s r u n u n t i l c i t h e r i t s
p r i o r i t y d r o p s a s a r e s u l t o f some t e s t o r i t r e a c h e s
a choice p o i n t .
I n e i t h e r case, c o n t r o l i s t r a n s f e r r e d t o t h e h i g h e s t p r i o r i t y member o f t h e l i s t o f
processes.
Parsing continues u n t i l an acceptable
i n t e r p r e t a t i o n o f t h e i n p u t i s f o u n d o r some r e s o u r c e
bound i s e x c e e d e d .
Of course, t h i s p o t e n t i a l f o r reducing f a l s e paths
can b e r e a l i z e d o n l y i f e x t e n s i v e k n o w l e d g e i s s u c c e s s f u l l y incorporated.
Winograd made a n o t e w o r t h y s t e p i n
t h i s d i r e c t i o n b y u s i n g semantics t o f i l t e r out u n i n t e r p r e t a b l e p h r a s e s as soon as t h e y were p a r s e d .
But
more can be d o n e ; knowledge can be used to g u i d e t h e
p a r s i n g r a t h e r than to act simply as a passive f i l t e r ,
and a w i d e r range of knowledge can be used .
For
i n s t a n c e , in a d d i t i o n to semantics, the parser could be
g u i d e d b y such t h i n g s a s p r o s o d i e s , s t a t i s t i c s r e g a r d i n g v o c a b u l a r y and s y n t a c t i c c o n s t r u c t s , models o f t h e
u s e r and t h e d i a l o g .
Control
the p a r t i c u l a r semantic c o n s i d e r a t i o n s i n c o n j u n c t i o n
with other relevant information s p e c i f i c to a place in
t h e grammar.
Structures
The a t t e m p t to i n c o r p o r a t e more knowledge may
appear f o o l h a r d y i n t h e l i g h t o f t h e g r e a t c o m p l e x i t y
of e x i s t i n g systems.
Winograd h i m s e l f has remarked
t h a t h i s p r o g r a m was n e a r i n g t h e l i m i t s o f comprehens i b i l i t y ; o u r e a r l y system was c e r t a i n l y n o b e t t e r .
T o combat t h i s p r o b l e m , w e have i n t r o d u c e d c o n t r o l
s t r u c t u r e s t o e n c o u r a g e t h e c a r e f u l , s y s t e m a t i c use o f
h e u r i s t i c k n o w l e d g e and
a i d clarity of
the parser.
W i t h t h i s s t r a t e g y i n mind w e can d i s c u s s t h e
actual control functions.
The most b a s i c o f t h e s e i s
ALT, w h i c h i s used t o l i s t a l t e r n a t i v e s .
I t s syntax
i s (ALT a l t 1 . . . a l t n ) , where each a l t . i s o f t h e f o r m
{altname a l t p r i o r i t y e1 . . . e n ) .
Altname is a unique
name f o r t h i s a l t e r n a t i v e .
The use o f t h e name i s
d e s c r i b e d i n t h e s e c t i o n PATHS and MAPS.
Altpriority
is evaluated to determine the p r i o r i t y to be given the
process corresponding t o t h i s a l t e r n a t i v e .
Finally,
e 1 . . . e n s p e c i f y the a c t i o n f o r t h i s a l t e r n a t i v e .
A c t i o n s can i n c l u d e c a l l i n g o t h e r c o n t r o l f u n c t i o n s o r
procedures t h a t i n c l u d e c o n t r o l f u n c t i o n s .
In other
w o r d s , t h e r e i s c o m p l e t e freedom i n d y n a m i c a l l y n e s t i n g
control structures.
The e f f e c t o f t h e ALT i s t o r e p l a c e
one p r o c e s s by s e v e r a l new o n e s .
Each new p r o c e s s
i n d e p e n d e n t l y can c o n t i n u e the c o m p u t a t i o n s t a r t e d b y
the o r i g i n a l process.
When one has f i n i s h e d i t s a c t i o n
e . . . . e , i t can i m m e d i a t e l y g o o n t o t h e s t a t e m e n t
f o l l o w i n g t h e ALT.
The c o n t r o l s t r u c t u r e s d i f f e r e n t i a t e between t h e
p r i o r i t y f u n c t i o n s t h a t embody t h e s p e c i a l knowledge
and t h e o t h e r p a r s e r f u n c t i o n s t h a t embody t h e grammar.
The grammar f u n c t i o n s a l o n e d e f i n e t h e p o s s i b l e p a t h s ,
w h i l e the h e u r i s t i c functions c o n t r o l the order i n
which the paths w i l l be e x p l o r e d .
I n t h i s way t h e
grammar i s n o t o b s c u r e d b y t h e l o g i c needed t o a s s i g n
priorities.
The d i v i s i o n i s m a i n t a i n e d b y p r o v i d i n g
e a c h a l t e r n a t i v e w i t h i t s own p r i o r i t y f u n c t i o n .
This
b o t h f u r t h e r s t h e i n c o r p o r a t i o n o f knowledge and s i m p l i f i e s the development of h e u r i s t i c s .
W e d o n o t have
t o t r y t o b u i l d a n o m n i s c i e n t s e m a n t i c s module c a p a b l e
o f e v a l u a t i n g any c o n f i g u r a t i o n ; o u r system w i l l use
217
w i t h an a d j e c t i v e or noun. The statement f o l l o w i n g
the ALT c o n t r o l s the parsing of o p t i o n a l a d j e c t i v e s
and the head noun. These are not allowed a f t e r a
THINGPRON, are o p t i o n a l a f t e r e i t h e r a DEMONSTRATIVEADJ
or a QUANTIFIER, and are required in a l l other cases.
This is represented in the program by an OPTIONALIF
i n s i d e a COND. The f i n a l statement of NOUNGROUP looks
f o r modifying r e l a t i v e clauses or p r e p o s i t i o n a l phrases.
A second c o n t r o l f u n c t i o n is OPTION, which is
used to i d e n t i f y o p t i o n a l components in the grammar.
I t s form is (OPTION optname o p t p r i o r i t i e s e 1 . . . e n ) ,
where optname is a unique name f o r t h i s o p t i o n ,
o p t p r l o r i t i e s evaluates to a p a i r of p r i o r i t i e s , and
e 1 . . . e n specify the o p t i o n a l a c t i o n . The f i r s t of
the p r i o r i t i e s is assigned to a process t h a t w i l l
execute e 1 ..............e n , and the second to a process t h a t
w i l l simply go d i r e c t l y to the successor of the OPTION.
This grammar w i l l f i n d constructions such as:
SEQUENCE is s i m i l a r to OPTION. On entry to
SEQUENCE the process B p l i t s i n t o two-one which executes e1..............en, and one which does n o t . However,
u n l i k e OPTION, the f i r s t process reenters SEQUENCE
a f t e r completing e n . This leads to computing another
p a i r of p r i o r i t i e s and s p l i t i n g i n t o one process
which executes e1..............en a second time and one which
does n o t . In t h i s manner, the SEQUENCE statement can
be reentered an a r b i t r a r y number of times to parse an
a r b i t r a r i l y long sequence of c o n s t i t u e n t s s a t i s f y i n g
e1..............en. .
1
n
the b i g green t a b l e "
"that p a r t "
"some narrow p i e c e s "
"something"
"everything green"
"he," " i t "
A l l of these except the pronouns can be followed
by a sequence of modifying phrases. Relative clauses
such as "which you dropped" and " t h a t was picked up"
or p r e p o s i t i o n a l phrases such as "on the f l o o r " and
"under the t a b l e " are modifying phrases.
The f o u r t h c o n t r o l f u n c t i o n is OPTIONALIF.
Its
form is (OPTIONALIF c o n d i t i o n name p r i o r i t i e s
e 1 ...................e n ). If the c o n d i t i o n is true then e 1 . . . e
1
en
I
n
are o p t i o n a l ; otherwise they are r e q u i r e d . The name
and p r i o r i t i e s f o r OPTIONALIF are l i k e those f o r
OPTION and are used only if the c o n d i t i o n is t r u e .
Word V e r i f i e r
Another important part of the parsing system is
the word v e r i f i e r . This component testa acoustic data
f o r the presence of words from such terminal categories
as noun, verb, and a d j e c t i v e t h a t are predicted by the
parser according to some path through the grammar. The
words in the terminal category can be thought of as
a l t e r n a t i v e s , e x a c t l y l i k e the higher s y n t a c t i c a l t e r natives in the grammar, that should be considered in
order of t h e i r l i k e l i h o o d .
The same s t r u c t u r e of a l t e r natives and p r i o r i t i e s is here found on the l e v e l of
word v e r i f i c a t i o n . Estimated l i k e l i h o o d s can not only
order the words in the terminal category, but also
defer acoustic processing as long as higher p r i o r i t y
a l t e r n a t i v e s remain to be explored.
The f i n a l f u n c t i o n is PARSE, used to c a l l the
program f o r a grammatical u n i t .
I t s syntax is
(PARSE u n i t a r g 1 . . . . . . . a r g 1 ) , where u n i t is the name of
the f u n c t i o n to be c a l l e d f o r a grammatical u n i t and
arg 1 .......... a r g n are o p t i o n a l arguments f o r that funct i o n . The r e s u l t of PARSE is a parse tree f o r the
u n i t . PARSE also plays an important r o l e in the
implementation as discussed below.
An Example
Having discussed the major elements of the parser,
we w i l l now present a sample grammar f o r a noun phrase
(Figure 1 ) . U n f o r t u n a t e l y , an example small enough to
present here cannot f u l l y demonstrate the value of
t h i s approach in developing a large system. But t h i s
example can I l l u s t r a t e the format of a grammar. L i s t ings of a much more complete grammar are a v a i l a b l e
from the authors.
The word v e r i f i e r must f i r s t e s t a b l i s h a p r i o r i t y
f o r each candidate word and then, according to those
p r i o r i t i e s , schedule acoustic v e r i f i c a t i o n . Corresponding to these two operations there is a p r i o r i t y funct i o n and a v e r i f i c a t i o n f u n c t i o n associated w i t h each
word in the vocabulary. The t e s t s made by the p r i o r i t y
f u n c t i o n are computationally simple, although broad in
scope. The r e s u l t s of p r e l i m i n a r y acoustic processing
are consulted to ensure that the word is at least feasi b l e . The frequency of occurrence of the word and the
l i k e l i h o o d of co-occurrence of the word w i t h recently
used words are considered. F i n a l l y , the semantic
features of the word are compared to the semantic
r e s t r i c t i o n s of the c o n t e x t .
The example grammar is a LISP program c a l l e d
NOUNGROUP. The f i r s t statement in it is an ALT.
In
t h i s example, the f i r s t a l t e r n a t i v e i s
(ARTICLE (ARTPRESENT)
(WDTYPE ARTICLE))
Altname Is "ARTICLE," a l t p r i o r i t y is computed by
the f u n c t i o n ARTPRESENT, and the a c t i o n f o r t h i s
a l t e r n a t i v e is the c a l l to WDTYPE. The f u n c t i o n WDTYPE
finds in the input a word of the category s p e c i f i e d by
i t s argument.
I n t h i s example, a l t 1 looks f o r a n i n i t i a l a r t i c l e
In the noun group, A l t 2 through a l t 5 look f o r a
demonstrative a d j e c t i v e , q u a n t i f i e r , pronoun and t h i n g
pronoun r e s p e c t i v e l y . The l a s t a l t e r n a t i v e , a l t 6
(labeled NULL) allows the noun group to s t a r t d i r e c t l y
In contrast to the quick t e s t s made by the p r i o r i t y f u n c t i o n , the v e r i f i c a t i o n f u n c t i o n may perform
extensive acoustic analyses to determine how w e l l the
proposed word matches the next p o r t i o n of i n p u t . The
d e t a i l s of t h i s processing are beyond the scope of t h i s
paper but have been sketched elsewhere. 7 Since the
v e r i f i c a t i o n f u n c t i o n depends only on acoustic informat i o n , i t s r e s u l t s are saved so t h a t , if another process
looks f o r the same word in the same place, the p r i o r
r e s u l t s can be used. Of course, the r e s u l t s of the
p r i o r i t y f u n c t i o n s cannot be reused In t h i s manner
since they are extremely dependent on c o n t e x t .
218
Figure 1
A path describes the flow of c o n t r o l through the
grammar leading to the current s t a t e of a process.
For example, at ALT statements the name of the a l t e r n a t i v e taken is added to the path, and when a word is
recognized in the input It also is added. The actual
s t r u c t u r e of the path is a l i s t of names and words w i t h
the most recent h i s t o r y of the process at the f r o n t of
the l i s t . A map is the part of the path that was
followed by a process when it successfully parsed some
grammatical u n i t . The f u n c t i o n PARSE automatically
records the map before it returns, and it checks f o r a
map when it is c a l l e d .
PATHS and MAPS
In the word v e r i f i e r , we have seen how one p r o cess can use information gained by another; the
r e s u l t s of a v e r i f i c a t i o n f u n c t i o n can be used by a
subsequent process looking f o r the game word. The
next step is to share the s y n t a c t i c s t r u c t u r e s that
are found. But the processes may have had d i f f e r e n t
contextual c o n s t r a i n t s and may need to make d i f f e r e n t
t e s t s , problems f o r d i r e c t s h a r i n g . We have added
PATHS and MAPS to the parsing system to f a c i l i t a t e
sharing and to allow f o r a d d i t i o n a l t e s t s .
This Is c l o s e l y r e l a t e d to the w e l l formed subs t r i n g f a c i l i t y described by Woods.8
219
The c o n t r o l f u n c t i o n s (ALT, OPTION, OPTIONALIF,
and SEQUENCE) c o n s u l t t h e map and r a i s e t h e p r i o r i t y
o f t h e named a l t e r n a t i v e .
The map t h u s h e l p s t o g u i d e
the p r o c e s s w h i c h i s p a r s i n g a c o n s t i t u e n t b y c o n t r i b u t i n g data on a l t e r n a t i v e s successful in p r i o r
attempts.
N e v e r t h e l e s s , i t i s s t i l l p o s s i b l e t h a t t h e map
w i l l f a i l due t o d i f f e r e n c e s i n t h e c o n t e x t .
When a
p r o c e s s i s f o r c e d o f f t h e p a t h s p e c i f i e d b y a map, i t
s i m p l y r e v e r t s t o t h e s t a n d a r d mode o f p a r s i n g .
This
e n s u r e s t h a t t h e use o f maps w i l l n o t r e s u l t i n t h e
acceptance of a p r e v i o u s l y parsed c o n s t i t u e n t t h a t
f a l l s t o meet t h e c u r r e n t c o n s t r a i n t s .
The map m e r e l y
s e r v e s a s h e u r i s t i c k n o w l e d g e t o g u i d e t h e r e p a r s e and
can b e o v e r r i d d e n b y o t h e r c o n s i d e r a t i o n s .
F i n a l l y , the map must take i n t o account the fact
that there may be more than one successful parse of,
say, a noun group s t a r t i n g at a p a r t i c u l a r l o c a t i o n
in the i n p u t . In i t s most general form a map is a
tree s t r u c t u r e w i t h each path through the t r e e c o r r e sponding to a successful parse.
If a branch point is
reached w h i l e f o l l o w i n g a map, the several a l t e r n a t i v e s
w i l l b e given raised p r i o r i t i e s .
b r a n c h p o i n t a r e s h a r e d b y more t h a n one p r o c e s s .
Only
t h e v a r i a b l e s i n t e r m i n a l s e c t i o n s can b e m o d i f i e d b y
the process without i n t e r f e r i n g w i t h other processes;
t h e v a r i a b l e s i n n o n t e r m i n a l s e c t i o n s must n o t b e
changed.
T h i s p r e s e n t s a p r o b l e m when t h e p r o c e s s
wishes t o r e t u r n from the f i r s t (nearest the r o o t )
f u n c t i o n i n the t e r m i n a l s e c t i o n t o the l a s t f u n c t i o n
in the l a s t nonterminal s e c t i o n .
The p r o c e s s cannot
use t h a t s e c t i o n because i t i s s h a r e d .
The s o l u t i o n i s
t o make a copy o f i t t h a t w i l l have t h e same r e l a t i o n
to o t h e r s e c t i o n s of the p a t h as the shared s e c t i o n but
w i l l b e f o r p r i v a t e use o f t h e p r o c e s s .
To ensure that
t h i s c o p y i n g i s done p r o p e r l y , each s t a c k s e c t i o n b e g i n s
w i t h a c a l l o f PARSE, and i n s t e a d o f r e t u r n i n g i n t h e
n o r m a l manner, PARSE makes a copy of the n e x t s e c t i o n
o f t h e t r e e and t r a n s f e r s c o n t r o l t o i t .
This provides
a new t e r m i n a l s e c t i o n f o r t h e p r o c e s s and r e t u r n s c o n t r o l t o t h e f u n c t i o n t h a t c a l l e d PARSE o r i g i n a l l y .
F i g u r e 3 shows t h e e f f e c t o n t h e s t a c k s t r u c t u r e o f
s u c h a r e t u r n by PARSE .
It is worth noting a second use f o r the path as a
debugging t o o l . The parsing s t r a t e g y causes c o n t r o l
to move among a large number of a c t i v e processes, and
the computation leading to the current s t a t e of a process w i l l have been i n t e r l a c e d w i t h the a c t i v i t y of
other processes. Standard t r a c i n g techniques are less
useful f o r debugging, and the path is a valuable aid
in determining how a process has reached i t s current
state.
Implementation
Copying is also necessary when new processes are
c r e a t e d . Figure 4 shows a p o r t i o n of the stack before
and a f t e r an ALT statement.* When a process is k i l l e d
i t s terminal stack section i s d e l e t e d .
The parser w i l l be implemented in BBN-LISP
using the m u l t i p l e environment c o n t r o l s t r u c t u r e
f a c i l i t y described by Bobrow and Wegbreit, 1 6 Among
other t h i n g s , t h i s f a c i l i t y generalizes the standard
l i n e a r stack to a tree s t r u c t u r e . Figure 2 i n t r o duces some terminology f o r discussing such a t r e e .
In ( a ) , p, running in Section A, is about to reach an
ALT statement. In ( b ) , p has been replaced by processes
P 1 through pn, each w i t h a copy of A, corresponding to
the n a l t e r n a t i v e s of the ALT.
Figure 2
The r o o t c o r r e s p o n d s t o t h e base o f a n o r d i n a r y
s t a c k , and e a c h t i p behaves l i k e t h e t o p o f a n o r d i n ary s t a c k .
In the parser there is a one-to-one correspondence between t h e t i p s o f t h e s t a c k s t r u c t u r e
and p r o c e s s e s .
When a p r o c e s s c a l l s a f u n c t i o n , t h e
v a r i a b l e s f o r t h e f u n c t i o n a r e added t o t h e t i p f o r
t h a t p r o c e s s , and when t h e f u n c t i o n r e t u r n s , t h e v a r i a b l e s are removed.
The p a t h f r o m t h e r o o t t o t h e
t i p contains a l l the v a r i a b l e s f o r the process.
At
branch p o i n t s the paths f o r s e v e r a l processes j o i n .
Thus t h e v a r i a b l e s i n t h e p a t h f r o m t h e r o o t t o t h e
This is not q u i t e t r u e .
In the actual system the copy
A 1 is not made u n t i l p 1 becomes the highest p r i o r i t y
process. Thus a l l the new processes which have not yet
been a c t i v a t e d share the same t e r m i n a l s e c t i o n .
220
There a r e two main r e s t r i c t i o n s o n t h e p a r s e r
caused b y t h i s i m p l e m e n t a t i o n .
F i r s t , data s t r u c t u r e s
w h i c h are not i n t e n d e d t o b e g l o b a l t o a l l processes
c a n n o t b e changed e x c e p t a f t e r c o p y i n g .
This is to
a v o i d i n t e r f e r e n c e between p r o c e s s e s w h i c h may be
s h a r i n g t h e same s t r u c t u r e s .
For e x a m p l e , i f p r o c e s s
p in F i g u r e 4a, has a v a r i a b l e X'whose v a l u e is a l i s t ,
t h e n p r o c e s s e s p 1 t h r o u g h p n i n F i g u r e 4 b w i l l a l l have
v a r i a b l e s p o i n t i n g t o t h e same l i s t ( i n LISP t e r m i n o l o g y , t h e X's w i l l b e EQ).
T o change t h e v a l u e o f X ,
p r o c e s s p i must s t o r e a new p o i n t e r in X r a t h e r t h a n
modify the o r i g i n a l s t r u c t u r e p o i n t e d to by X.
While
t h i s r e s t r i c t i o n has a f f e c t e d t h e d e s i g n o f t h e p a r s e r ,
i t has n o t been a s e r i o u s p r o b l e m .
r o u t i n e s f o r use i n word v e r i f i c a t i o n , p r i o r i t y f u n c t i o n s f o r g r a m m a t i c a l a l t e r n a t i v e s , and s e m a n t i c m o d e l i n g o f t h e domain o f d i s c o u r s e .
Acknowledgements
W e w o u l d l i k e t o thank the p e o p l e i n t h e p r o j e c t
f o r h e l p i n g u s t o see some o f t h e p r o b l e m s i n v o l v e d i n
speech u n d e r s t a n d i n g and t o d e v e l o p t h e a p p r o a c h t o
p a r s i n g speech d e s c r i b e d i n t h i s p a p e r .
Project leader
Don Walker has been p a r t i c u l a r l y h e l p f u l i n t h i s
respect.
W e have a l s o b e n e f i t e d f r o m d i s c u s s i o n s w i t h
o t h e r p a r t i c i p a n t s i n t h e Advanced P r o j e c t s Agency
speech u n d e r s t a n d i n g p r o g r a m . 1 7
The r e s e a r c h r e p o r t e d
h e r e i n was s u p p o r t e d at SRI by the Advanced Research
P r o j e c t s Agency o f t h e Department o f D e f e n s e , m o n i t o r e d
by t h e U.S. Army Research O f f i c e - D u r h a m u n d e r C o n t r a c t
DAHC04 72 C 0 0 0 9 .
The second main r e s t r i c t i o n o n t h e p a r s e r , t h a t
o n l y v a r i a b l e s w i t h i n t h e t e r m i n a l s e c t i o n o f the
s t a c k can b e changed, d i d p r e s e n t a p r o b l e m .
Certain
v a r i a b l e s such a s the c u r r e n t p o s i t i o n i n the input
a r e p r i v a t e t o each p r o c e s s b u t g l o b a l w i t h i n the
process.
The s o l u t i o n i s t o i d e n t i f y such v a r i a b l e s
to t h e system as " p r o c e s s g l o b a l s . " On e n t r y to PARSE,
t h e p r o c e s s g l o b a l s a r e rebound t o t h e i r o l d v a l u e s .
Thus t h e y a r e w i t h i n t h e t e r m i n a l s e c t i o n and can b e
g i v e n new v a l u e s .
B e f o r e PARSE e x i t s , i t p r o p a g a t e s
t h e p r o c e s s g l o b a l s back u p t h e s t a c k .
F o r example,
i n g o i n g f r o m F i g u r e 3 a t o F i g u r e 3b, t h e p r o c e s s
g l o b a l s f o r p are propagated from s e c t i o n B to s e c t i o n
A,.
B y g e n e r a l i z i n g t h i s scheme, v a r i a b l e s can appear
to be bound at any l e v e l . Thus we can e l i m i n a t e t h e
second r e s t r i c t i o n o n t h e p a r s e r b y i d e n t i f y i n g c e r t a i n
s p e c i a l v a r i a b l e s to the system.
References
1.
2.
3.
4.
The i m p l e m e n t a t i o n has t h e p r i m e advantage of
a v o i d i n g a s p e c i a l i n t e r p r e t e r f o r the p a r s i n g language,
S i n c e t h e " l a n g u a g e " c o n s i s t s o f a d d i t i o n a l LISP f u n c tions,
t h e e n t i r e p a r s e r can b e c o m p i l e d w i t h t h e
s t a n d a r d LISP c o m p i l e r and debugged w i t h the s t a n d a r d
LISP d e b u g g e r .
T h i s means f a s t e r e x e c u t i o n and e a s i e r
debugging.
O f c o u r s e , t h e b i g g e s t improvement i n
e f f i c i e n c y r e l i e s o n knowledge t o g u i d e t h e p a r s e r ,
b u t c o m p i l a t i o n w i l l g i v e a s u b s t a n t i a l speed u p i n
addition.
Conclusion
The system d e s c r i b e d i n t h i s p a p e r i s p a r t o f an.
o n g o i n g r e s e a r c h p r o j e c t i n speech u n d e r s t a n d i n g and
is intended to provide a s o l i d basis f o r continuing
work.
T h e r e a r e many o t h e r a s p e c t s t o speech u n d e r s t a n d i n g besides t h e development of a p a r s e r system.
I t has been t o o u r b e n e f i t t o b e p a r t o f a speech p r o j e c t a t SRI t h a t i n c l u d e s p e r s o n n e l i n t e r e s t e d i n a
v a r i e t y of tasks from acoustic s i g n a l processing
t h r o u g h s e m a n t i c r e p r e s e n t a t i o n t o o v e r a l l system
organization,
A s i z e a b l e grammar f o r E n g l i s h has been w r i t t e n
using t h i s parsing system.
Although it is d i f f i c u l t
t o c h a r a c t e r i z e t h e scope o f a grammar, i t appears t o
be as e x t e n s i v e as o t h e r s we are f a m i l i a r w i t h ( i n
p a r t i c u l a r t h o s e o f W i n o g r a d 6 and Woods, e t a l . 9 ) .
Our c u r r e n t e f f o r t s c e n t e r o n a c o u s t i c p r o c e s s i n g
5.
6.
7.
w . A . Woods, "An E x p e r i m e n t a l P a r s i n g System f o r
T r a n s i t i o n Network Grammars," i n N a t u r a l Language
P r o c e s s i n g , R . R u s t i n e d . , A l g o r i t h m i c s P r e s s , New
York, New Y o r k , p p . 111-154 ( 1 9 7 3 ) .
9.
W. A. Woods, R. M. K a p l a n , B. Nash-Webber, "The
Lunar S c i e n c e s N a t u r a l Languages I n f o r m a t i o n
System:
F i n a l R e p o r t , " R e p o r t No. 2378, B o l t
Beranek and Newman, I n c . , Cambridge, M a s s a c h u s e t t s
(1972).
10.
R. M. K a p l a n , "A G e n e r a l S y n t a c t i c P r o c e s s o r , " in
N a t u r a l Language P r o c e s s i n g , R . R u s t i n e d . ,
A l g o r i t h m i c P r e s s , New Y o r k , New Y o r k , p p . 2 9 3 241 (1973) .
N . J , N l l s s o n , P r o b l e m s o l v i n g Methods i n A r t i f i c i a l
I n t e l l i g e n c e , M c G r a w - H i l l , New Y o r k , New York ( 1 9 7 1 )
12.
13.
14.
221
J . J . Robinson, " p r e d i c t i n g P h o n o l o g i c a l P h r a s e s , "
M a n u s c r i p t f r o m t h e U n i v e r s i t y o f M i c h i g a n , Ann
Arbor, Michigan (1973).
T . W i n o g r a d , U n d e r s t a n d i n g N a t u r a l Language,
Academic P r e s s , New York, New York ( 1 9 7 2 ) .
D . E . W a l k e r , "Speech U n d e r s t a n d i n g R e s e a r c h , "
A n n u a l T e c h n i c a l R e p o r t , C o n t r a c t DAHCO4-72-C-O09,
SRI P r o j e c t 1526, A r t i f i c i a l I n t e l l i g e n c e C e n t e r ,
S t a n f o r d Research I n s t i t u t e , Menlo P a r k , C a l i f o r n i a
(February 1973).
8.
11.
These f u n c t i o n s o f c o u r s e depend o n t h e m u l t i p l e
e n v i r o n m e n t c o n t r o l f a c i l i t y i n BBN-LISP.
1 . L e h i s t e , S u p r a s e g m e n t a l s , M I T Press, Cambridge,
Massachusetts ( 1 9 7 0 ) .
P . L i e b e r m a n , I n t o n a t i o n , P e r c e p t i o n , and Language,
MIT P r e s s , Cambridge, M a s s a c h u s e t t s ( 1 9 6 7 ) .
W. A. L e a , H. F. Medress, T. E. S k i n n e r , " P r o s o d i c
A i d s to Speech R e c o g n i t i o n , " T e c h n i c a l Reports PX
7940 and PX 10232, UNIVAC C o r p o r a t i o n , S t . P a u l ,
M i n n e s o t a ( O c t o b e r 1972, A p r i l , 1 9 7 3 ) .
M. H. O ' M a l l e y , "The Use of P r o s o d i c U n i t s in S y n t a c t i c D e c o d i n g , " presented at the Session on
L i n g u i s t i c U n i t s at the 85th Meeting of the Acoust i c a l S o c i e t y o f A m e r i c a , B o s t o n , 1 3 A p r i l 1973
( U n i v e r s i t y o f M i c h i g a n , Ann A r b o r , M i c h i g a n
(1973)).
E. W. D i j k s t r a , "GOTO s t a t e m e n t C o n s i d e r e d H a r m f u l , "
l e t t e r t o e d i t o r , CACM, V o l . 1 1 , No, 3 , ( 1 9 6 8 ) .
W. A. W o l f , "Programming W i t h o u t t h e GOTO," Proc.
o f I F I P Congress ( 1 9 7 1 ) .
0. J. D a h l , E. W. D i j k s t r a , C. A. R. Hoare,
S t r u c t u r e d Programming, Academic P r e s s , New Y o r k ,
New York (1972) .
15.
16.
17.
W. Teitelman, et a l . , BBN-LISP, Bolt Beranek
and Newman, I n c . , Cambridge, Massachusetts
(1971).
D. G. Bobrow, B. Wegbrelt, "A Model and Stack
Implementation of M u l t i p l e Environments," Report
No. 2334, Bolt Beranek and Newman, I n c . , Cambridge,
Massachusetts (1972).
A. Newell, et a l . , Speech Understanding Systems,
North-Holland Publishing Company, Amsterdam, The
Netherlands (1973) .
222
© Copyright 2026 Paperzz