A multi-level organization for problem solving using

Carnegie Mellon University
Research Showcase @ CMU
Computer Science Department
School of Computer Science
1975
A multi-level organization for problem solving
using many, diverse, cooperating sources of
knowledge
Lee D. Erman
Carnegie Mellon University
Victor Lesser
Follow this and additional works at: http://repository.cmu.edu/compsci
This Technical Report is brought to you for free and open access by the School of Computer Science at Research Showcase @ CMU. It has been
accepted for inclusion in Computer Science Department by an authorized administrator of Research Showcase @ CMU. For more information, please
contact [email protected].
NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS:
The copyright law of the United States (title 17, U.S. Code) governs the making
of photocopies or other reproductions of copyrighted material. Any copying of this
document without permission of its author may be prohibited by law.
A MULTI-LEVEL ORGANIZATION FOR PROBLEM
USING MANY, DIVERSE, COOPERATING
SOURCES OF KNOWLEDGE
Lee D. Erman
and
Victor R. Lesser
March, 1975
A MULTI-LEVEL ORGANIZATION FOR PROBLEM SOLVING
USING MANY, DIVERSE, COOPERATING SOURCES OF KNOWLEDGE
Lee D. Erman
and
Victor R. Lesser
Computer Science Department
1
Carnegie-Mellon University
Pittsburgh, Pa. 15213
March, 1975
ABSTRACT
An
organization
problems.
is
presented
The hypothesize-and-test
for
implementing
solutions
to
knowledge-based
paradigm is used as the basis for c o o p e r a t i o n
AI
among
m a n y d i v e r s e and independent knowledge sources (KS's). The KS's are assumed i n d i v i d u a l l y
t o b e e r r o r f u l and incomplete.
A u n i f o r m and integrated multi-level structure, the blackboard, holds the c u r r e n t s t a t e
of the system.
K n o w l e d g e sources cooperate by creating, accessing, and modifying e l e m e n t s
in t h e b l a c k b o a r d .
T h e activation of a KS is data-driven, based on the o c c u r r e n c e of p a t t e r n s
in t h e b l a c k b o a r d w h i c h match templates specified by the knowledge s o u r c e .
E a c h l e v e l in the blackboard specifies a different representation of the p r o b l e m s p a c e ;
the
sequence
of
approximately
levels forms a loose hierarchy in which the elements
be
described
as
abstractions
of
elements
at
the
next
at e a c h l e v e l
lower
can
level.
This
d e c o m p o s t i o n c a n be thought of as an a priori framework of a plan for s o l v i n g the p r o b l e m ;
e a c h l e v e l is a g e n e r i c stage in the plan.
T h e e l e m e n t s at each level in the blackboard are hypotheses about some a s p e c t
that level.
set
T h e internal structure of an hypothesis consists of a fixed set of a t t r i b u t e s ; t h i s
is t h e same for
attributes
are
hypotheses
selected
to
at all levels of representation in the b l a c k b o a r d .
serve
as
mechanisms
for
implementing
the
may
relationships,
deductions
create
which
made
by
networks
are
the
explicit
KS's
of
structural
in
the
about
relationships
blackboard, serve
the
hypotheses;
among
to
they
also
Knowledge
hypotheses.
represent
allow
These
data-directed
h y p o t h e s i z e - a n d - t e s t paradigm and for efficient goal-directed scheduling of KS's.
sources
of
These
inferences
and
competing
and
o v e r l a p p i n g partial solutions to be handled in an integrated manner.
T h e H e a r s a y l l speech-understanding system is an implementation of this o r g a n i z a t i o n ;
it is u s e d h e r e as an example for descriptive purposes.
T h i s r e s e a r c h w a s s u p p o r t e d in part by the Defense A d v a n c e d R e s e a r c h P r o j e c t s A g e n c y
u n d e r c o n t r a c t no. F 4 4 6 2 0 - 7 3 - C - 0 0 7 4 'and monitored by the Air F o r c e Office" of S c i e n t i f i c
Research.
2
This paper was accepted for presentation at the 4th International Conference on Artificial Intelligence, September 1975.
INTRODUCTION
T h i s p a p e r d e s c r i b e s an organization for knowledge-based artificial i n t e l l i g e n c e
programs.
(AI)
A l t h o u g h this organization has been derived while developing s e v e r a l g e n e r a t i o n s
o f s p e e c h u n d e r s t a n d i n g systems, w e feel that it has general application to other domains of
l a r g e A I p r o b l e m s (e.g., v i s i o n ,
1
robotics, chess, natural language understanding, and p r o t o c o l
analysis).
Our e f f o r t s f o l l o w from the early work of Reddy (1966) and Reddy and V i c e n s ( V i c e n s ,
1 9 6 9 ) , t h r o u g h the H e a r s a y l system (Reddy, et al., 1973a, 1973b; Erman, 1974), w h i c h
the
first
demonstrable
connected-speech understanding system, up t h r o u g h the
was
currently
d e v e l o p i n g H e a r s a y l l system (Erman, et al., 1973; Lesser, et al., 1974; Fennell, 1975).
These
e f f o r t s have increasingly f o c u s e d on the overall system organization for s o l v i n g the p r o b l e m ;
this
has
resulted
in
the
design
and
construction
of
a
sophisticated
e n v i r o n m e n t w i t h i n w h i c h problem-solving strategies are d e v e l o p e d .
and
structured
Others w o r k i n g in t h i s
a r e a also c o n s i d e r this aspect important.^ The Hearsayll system will b e u s e d h e r e as t h e
p r i m a r y e x a m p l e for d e s c r i b i n g the organization.
THE PROBLEM
T h e c l a s s of A I problem that is addressed in this paper is c h a r a c t e r i z e d b y h a v i n g a
l a r g e p r o b l e m s p a c e and the requirement of a large amount of knowledge f o r its s o l u t i o n .
The
l a r g e amount of explicit knowledge differentiates these problems f r o m o t h e r A I
(e.g., t h e o r e m - p r o v i n g )
in which v e r y
general
"weak" methods
are a p p l i e d
using
areas
meager
a m o u n t s of b u i l t - i n knowledge (Newell, 1969). Further, the knowledge n e e d e d c o v e r s a w i d e
and diverse
set of areas (some examples in the speech understanding p r o b l e m a r e
a n a l y s i s , a c o u s t i c - p h o n e t i c s , phonology, syntax, semantics, and pragmatics).
We
signal
call
each
s u c h a r e a a k n o w l e d g e - s o u r c e (KS) and also define a KS to be an agent w h i c h e m b o d i e s t h e
k n o w l e d g e of its area and which can take actions based on that knowledge.^
T h e s o u r c e s of knowledge are often incomplete and approximate.
may
1
be
traced
to
three
sources: First, the theory
on which
the
This e r r o r f u l n a t u r e
KS
is
based
may
be
R e d d y ( 1 9 7 3 ) is a comparison of the speech and vision problem domains.
N e w e l l , et al., (1971) contains an excellent in-depth study of the s p e e c h
problem.
The current
state-of-the-art
is represented in the p a p e r s of
S y m p o s i u m o n S p e e c h Recognition (Erman, 1974b; Reddy, 1975).
(1973,
understanding
the
1974
In particular,
IEEE
Barnett
1975), and Rovner, et al., (1974) also describe highly s t r u c t u r e d systems; B a k e r
( 1 9 7 4 ) has a highly s t r u c t u r e d system based on a simple Markov model.
^ F o r the p u r p o s e s of this discussion, a KS can be considered static; i.e., w h e t h e r a KS l e a r n s
f r o m e x p e r i e n c e is an issue that is orthogonal to this organization.
1
incomplete or incorrect.
problem,
are
often
For example, modern phonological theories, as applied to the s p e e c h
vague
and incomplete.
Second, the implementation
of
a KS
may
be
i n c o m p l e t e or incorrect; this may be caused by an incorrect translation of the t h e o r y t o t h e
p r o g r a m o r b y an intentionally heuristic implementation of the t h e o r y .
Finally, t h e k n o w l e d g e
s o u r c e may b e o p e r a t i n g on incorrect or incomplete data supplied to it b y o t h e r K S ' s .
1
A s o n e k n o w l e d g e source makes errors and creates ambiguities, o t h e r KS's must
brought
soon
to bear to c o r r e c t and clarify those actions.
as
possible
after
the
introduction
of
be
This KS c o o p e r a t i o n s h o u l d o c c u r
an e r r o r
or
ambiguity
in o r d e r
to
as
limit
its
ramifications.
A mechanism for providing this high degree of cooperation is the h y p o t h e s i z e - a n d - t e s t
paradigm.
In this paradigm, solution-finding is v i e w e d as an iterative p r o c e s s .
the iteration involves
some
Each s t e p in
a) the creation of an hypothesis, which is an "educated g u e s s " a b o u t
a s p e c t of the problem, and
b) a test of the plausibility of the h y p o t h e s i s .
t h e s e s t e p s use a p r i o r i knowledge about the problem, as well as the p r e v i o u s l y
hypotheses.
This
iterative
guess-building
terminates
when
a
consistent
Both of
generated
hypothesis
is
g e n e r a t e d w h i c h satisfies the requirements of an overall solution.
A s a s t r a t e g y for developing such systems, one needs the ability to add and r e p l a c e
s o u r c e s of k n o w l e d g e and to explore different control strategies.
Thus, s u c h c h a n g e s must
b e r e l a t i v e l y e a s y to accomplish; there must also be ways to evaluate the p e r f o r m a n c e of t h e
s y s t e m in g e n e r a l and the roles of the various knowledge sources and c o n t r o l s t r a t e g i e s in
particular.
T h i s ability to experiment conveniently with the system is crucial if the amount of
k n o w l e d g e is large and many people are needed to introduce and validate it.
One means of
h e l p i n g to p r o v i d e these flexibilities is to require that KS's be independent.
B e c a u s e the problems are large and require many computation s t e p s f o r their s o l u t i o n ,
the
system
must
be
efficient
"production"
application
development
versions
s y s t e m is d e v e l o p e d .
in
its
computation.
system; however,
it
must
This
also
must
be
be
certainly
reasonably
because of the experimental w a y that a complex,
true
efficient
for
in
a
the
knowledge-based
That is, many iterative runs over a significant amount of test d a t a must
b e made t o d e v e l o p and evaluate the knowledge sources and control strategies.
1
T h i s may also i ^ u d e externally supplied data <e.g., the digitized
i s
t h
e
^
t i c
.^^ - J
e
f
m
a
J
h i
n
input to the speech-understanding system); the transducers of t h e s e d a t a c a n
c o n s i d e r e d to b e KS's w h i c h also introduce error.
2
a c 0
^
be
MODEL FOR COOPERATION OF KNOWLEDGE SOURCES
T h e r e q u i r e m e n t that knowledge sources be independent implies that the
( a n d v e r y e x i s t e n c e ) of e a c h must not be necessary or crucial to the o t h e r s .
functioning
On t h e o t h e r
h a n d , t h e KS's are r e q u i r e d to cooperate in the iterative guess-building, using and c o r r e c t i n g
one
another's
guesses; this
implies that there must be interaction among the
processes.
T h e s e t w o o p p o s i n g requirements have led to a design in which each KS i n t e r f a c e s t o t h e
o t h e r s e x t e r n a l l y in a uniform way that is identical across KS's and in w h i c h no
s o u r c e k n o w s w h a t or how many other KS's exist.
knowledge
The interface is implemented as a d y n a m i c
g l o b a l data s t r u c t u r e , called the blackboard. The primary units in the b l a c k b o a r d are g u e s s e s
about
particular
throughout
aspects
of
the
problem; these
the blackboard, are called hypotheses.
units, which
have
a
uniform
structure
At any time, the b l a c k b o a r d
holds
c u r r e n t s t a t e of the system; it contains all the guesses about the problem that exist.
the
Subsets
o f h y p o t h e s e s r e p r e s e n t partial solutions to the entire problem; these may c o m p e t e w i t h t h e
p a r t i a l s o l u t i o n s r e p r e s e n t e d by other (perhaps overlapping) subsets.
E a c h k n o w l e d g e s o u r c e may access any information in the blackboard.
i n f o r m a t i o n to the blackboard by creating (or deleting) hypotheses, b y
hypotheses,
and
by
hypotheses.
The
generation
exclusive
establishing
and
or
modifying
modification
of
explicit
globally
means of communication among the diverse KS's.
structural
accessible
Each may a d d
modifying
existing
relationships
among
hypotheses
is
the
This mechanism of c o o p e r a t i o n ,
w h i c h is an implementation of the hypothesize-and-test paradigm, allows a KS to c o n t r i b u t e
knowledge
w i t h o u t being aware of which other KS's will use the information or w h i c h
s u p p l i e d the information that it used.
independent
and s e p a r a b l e .
It is in this way that knowledge s o u r c e s
are
KS
made
The structural relationships (which are mentioned a b o v e
and
w h i c h w i l l be d e s c r i b e d below) form a network of the hypotheses and are u s e d to r e p r e s e n t
t h e d e d u c t i o n s and inferences which caused a KS to generate one h y p o t h e s i s f r o m o t h e r s .
T h e explicit
r e t e n t i o n in the blackboard of these dependency relationships is u s e d to h o l d ,
a m o n g o t h e r things, competing hypotheses.
Because these are held in an i n t e g r a t e d manner,
s e l e c t i v e b a c k t r a c k i n g for e r r o r r e c o v e r y and other search strategies can be i m p l e m e n t e d in
an e f f i c i e n t and non-redundant way.
Decomposition of Knowledge
T h e d e c o m p o s i t i o n of the overall task into various knowledge s o u r c e s is r e g a r d e d as
b e i n g natural; i.e., the units of the decomposition represent those pieces of k n o w l e d g e w h i c h
can
1
be
distinguished
and
recognized as being somehow
naturally
independent*
Such
a
T h e a p p r o a c h taken in knowledge source decomposition is not an attempt to c h a r a c t e r i z e
3
s c h e m e of " i n v e r s e decomposition" (or, composition) seems v e r y natural for many p r o b l e m s o l v i n g tasks, and it fits well into the hypothesize-and-test approach to p r o b l e m - s o l v i n g .
long
as
a sufficient
"covering set" of
knowledge
areas required for
problem
solution
m a i n t a i n e d , o n e c a n f r e e l y add new knowledge sources, or replace or delete o l d o n e s .
knowledge
source
is
self-contained, but
each
is expected
to
cooperate
As
with
is
Each
the
other
k n o w l e d g e s o u r c e s that h a p p e n to be present in the system at that time,
A k n o w l e d g e source is specified in three parts:
a) the conditions under w h i c h it is t o
b e a c t i v a t e d (in terms of the conditions in the blackboard in which it is i n t e r e s t e d ) ,
b) t h e
k i n d s of c h a n g e s it makes to the blackboard, and
3) a procedural statement (program) of t h e
algorithm
A
which
possessing
accomplishes
some
processing
those
changes.
capability
which
is
knowledge
able
to
source
solve
some
is
thus
defined
subproblem,
as
given
a p p r o p r i a t e circumstances for its activation.
Activation of Knowledge Sources
A
knowledge
source
is instantiated
as a knowledge-source
process
whenever
the
b l a c k b o a r d exhibits characteristics which satisfy a "precondition" of the k n o w l e d g e s o u r c e .
A
p r e c o n d i t i o n of a KS is a description of some partial state of the b l a c k b o a r d w h i c h d e f i n e s
w h e n and w h e r e the KS can contribute its knowledge b y modifying the b l a c k b o a r d .
The
KS
c o n t r i b u t e s its k n o w l e d g e through the mechanism of making h y p o t h e s e s and e v a l u a t i n g
and
m o d i f y i n g the c o n t r i b u t i o n s of other knowledge sources (by v e r i f y i n g and rating or r e j e c t i n g
the
hypotheses
made by other knowledge sources).
A KS carries out these
actions
r e s p e c t to a particular context, the context being some subset of the p r e v i o u s l y
with
generated
h y p o t h e s e s in the blackboard. Thus, new hypotheses or modifications to existing h y p o t h e s e s
a r e c o n s t r u c t e d f r o m the (static) knowledge of the KS and the e d u c a t e d g u e s s e s made
at
s o m e p r e v i o u s time b y other knowledge sources.
The
trigger
modifications
further
made
by
any
given knowledge-source
process
are
expected
k n o w l e d g e sources by creating new conditions in the b l a c k b o a r d to
t h o s e k n o w l e d g e s o u r c e s , in turn, respond.
to
which
The structure of a hypothesis is s o d e s i g n e d as
t o a l l o w the p r e c o n d i t i o n s of most KS's to be sensitive to a single, simple c h a n g e in s o m e
h y p o t h e s i s (such as the creation of a new hypothesis of a particular t y p e , a c h a n g e of
r a t i n g , o r the c r e a t i o n of a structural link between particular kinds of h y p o t h e s e s ) .
Through
s o m e h o w the o v e r a l l problem solution process and then apply some sort of t r a f f i c
a n a l y s i s t o its internal workings in order to decompose the total p r o c e s s
interacting
into
k n o w l e d g e sources. Rather, knowledge sources are d e f i n e d b y
s o m e intuitive notion about the various pieces of knowledge w h i c h c o u l d b e
in a u s e f u l w a y to help achieve a solution.
4
a
flow
minimally
starting
with
incorporated
this d a t a - d i r e c t e d interpretation of the hypothesize-and-test paradigm, KS's c a n also e x h i b i t
a h i g h d e g r e e of asynchronous activity and potential parallelism.
Control
1
schemes in which one KS explicitly invokes other
KS's are not
appropriate
b e c a u s e of the requirement that KS's be independent and because the invocation of a KS may
d e p e n d o n a complex set of conditions which is created by the combined actions of
several
KS's.
contain
F u r t h e r , s u c h direct-calling schemes complicate KS's by requiring that t h e y
information
about
the
KS's
that
they
will call.
These
same
arguments
apply
against
a
c e n t r a l i z e d c o n t r o l scheme which is explicitly predefined for a set of KS's.
Decomposition of the Blackboard
T h e b l a c k b o a r d is partitioned into distinct information levels; e a c h level is u s e d t o h o l d
a d i f f e r e n t r e p r e s e n t a t i o n of the problem space.
(Examples of levels in the s p e e c h p r o b l e m
a r e " s y n t a c t i c " , "lexical", "phonetic", and "acoustic"; examples in scene analysis a r e
"picture
p o i n t " , "line segment", "region", and "object".) Associated with each level is a set of p r i m i t i v e
e l e m e n t s a p p r o p r i a t e for representing the problem at that level.
(In the s p e e c h s y s t e m , f o r
e x a m p l e , the elements at the lexical level are the words of the v o c a b u l a r y to b e r e c o g n i z e d ,
while
the
elements
at
the
phonetic
level
are
the
phones
(sounds)
of
English.)
Each
h y p o t h e s i s e x i s t s at a particular level and is labeled as being a particular element of t h e s e t
o f p r i m i t i v e elements at that level.
The
decomposition
of
the
problem
space
into
levels
is
a natural
d e c o m p o s i t i o n into KS's of the knowledge that is to be brought to bear.
parallel
to
the
For many KS's, t h e
K S n e e d s to deal w i t h only one or. a few levels to apply its knowledge; it n e e d not e v e n b e
aware
of
the
existence
of
other
levels.
Thus, each
KS can be
made
as
simple
as
its
k n o w l e d g e allows; its interface to the rest of the system is in units and c o n c e p t s w h i c h a r e
n a t u r a l to it.
A l s o , n e w levels can be added as new sources of k n o w l e d g e
w h i c h n e e d to use them.
for
efficiently
are
designed
Finally, it will be shown that the multi-level r e p r e s e n t a t i o n
s e q u e n c i n g the activity of the KS's in a non-deterministic
manner
allows
and
for
m a k i n g use of multiprocessing.
1
O n e might think of this model for data-directed activation of KS's as a p r o d u c t i o n s y s t e m
( N e w e l l , 1 9 7 3 ) w h i c h is executed asynchronously. The preconditions c o r r e s p o n d t o t h e
l e f t - h a n d s i d e s (conditions) of productions, and the knowledge sources c o r r e s p o n d t o t h e
r i g h t - h a n d s i d e s (actions) of the productions. Conceptually, these l e f t - h a n d s i d e s a r e
e v a l u a t e d continuously.
When a precondition is satisfied, an instantiation of
the
c o r r e s p o n d i n g r i g h t - h a n d side of its production is created; this instantiation is e x e c u t e d at
s o m e a r b i t r a r y subsequent time (perhaps subject to instantiation scheduling c o n s t r a i n t s ) .
It is i n t e r e s t i n g to note that this generalized form of h y p o t h e s i z e - a n d - t e s t leads t o a
s y s t e m o r g a n i z a t i o n w i t h some characteristics also similar to Q A 4 (Rulifson, et al., 1 9 7 3 )
a n d P L A N N E R (Hewitt, 1972). In particular, there are strong similarities in the d a t a - d i r e c t e d
s e q u e n c i n g of p r o c e s s e s .
5
T h e s e q u e n c e of levels forms a loose hierarchical structure in w h i c h the e l e m e n t s
each
l e v e l c a n approximately be described as abstractions of elements at the
level.*
next
at
lower
(For example, an utterance is composed of phrases, w h i c h are made of w o r d s , p u t
together
as s y l l a b l e s , each of which can be described as a sequence of p h o n e s , e a c h
of
w h i c h is c o m p o s e d of acoustic segments, each of which can be d e s c r i b e d b y a s e q u e n c e of
t e n - m i l l i s e c o n d intervals w i t h certain kinds of acoustic characteristics.)
M o s t of the relationships of a hypothesis are with hypotheses at its level o r a d j a c e n t
l e v e l s ; f u r t h e r , t h e s e relationships can usually be derived (by a KS a p p r o p r i a t e to the l e v e l )
without
h a v i n g to delve below the level of abstraction of the hypothesis.
c o n t e x t simplifies the function of knowledge sources.
This locality
of
(Or from the other point of v i e w , t h e
d e c o m p o s i t i o n of k n o w l e d g e into sufficiently simple-acting KS's also simplifies and l o c a l i z e s
r e l a t i o n s h i p s in the blackboard.)^
T h e d e c o m p o s i t i o n of the blackboard into distinct levels of r e p r e s e n t a t i o n c a n also b e
t h o u g h t of as an a p r i o r i framework of a plan for problem-solving.
s t a g e in t h e plan.
Each level is a g e n e r i c
T h e goal at each level is to create and validate h y p o t h e s e s at that l e v e l .
T h e o v e r a l l goal of the system is to create the most plausible n e t w o r k of h y p o t h e s e s
sufficiently
covers
the
levels.
('Plausible'
and
'sufficiently'
s u f f i c i e n t in the judgment of the knowledge sources".)
here
mean
that
"plausible
and
In speech understanding, f o r e x a m p l e ,
t h e g o a l at the phonetic level is a phonetic transcription of the utterance, w h i l e t h e o v e r a l l
goal
is
a n e t w o r k w h i c h connects hypotheses directly d e r i v e d from the acoustic
input
to
context
of
synthesis,
or
h y p o t h e s e s w h i c h d e s c r i b e the semantic content of the utterance.
The
creation
hypotheses
at
a
or
modification
lower
level
(or
of
an
hypothesis
levels) can
be
which
is
considered
based
an
action
on
a
of
a b s t r a c t i o n ; c o n v e r s e l y , manipulations of an hypothesis based o n a higher level c o n t e x t
b e c o n s i d e r e d analysis, or elaboration.
In order to overcome the e r r o r f u l n e s s of the
and
nature, both kinds of
also
make
use of
their
redundant
action are d e s i r a b l e
can
KS's
in
the
system.^
Many
of
the
ideas here fit neatly into Simon's description of
a "nearly
decomposable
h i e r a r c h i c a l s y s t e m " (Simon, 1962).
This
simplification
of
form
and interaction
is
an expected
characteristic
of
a
nearly
d e c o m p o s a b l e hierarchical system (ibid.).
T h e use of the terms 'analysis' and 'synthesis' here are r e v e r s e d from their usual u s e s in
t h e s p e e c h r e c o g n i t i o n domain.
representation
direction.
Traditionally, 'synthesis' means going f r o m a h i g h e r - l e v e l
(e.g., lexical) to the
speech
signal, while
analysis
refers
the
other
In s p e e c h recognition, however, the object is really to s y n t h e s i z e a meaning f o r
t h e u t t e r a n c e f r o m the pieces of data which make up the s p e e c h signal.
6
to
O f t e n , the context
for an analysis or synthesis action is localized to the l e v e l
just
a b o v e o r b e l o w the level at which the action takes place. However, this is not a r e q u i r e m e n t ;
i n f a c t , an action w h i c h skips over several levels can serve strongly to direct the a c t i v i t y of
t h e s y s t e m and t h e r e b y significantly prune the search space.
e q u i v a l e n t to c o n s t r u c t i n g a major step in a plan.
Such a jump o v e r
levels
is
Further, there is no r e q u i r e m e n t that a
j u m p n e c e s s a r i l y b e filled in completely (or even partially) if KS's are confident e n o u g h in t h e
c o n s i s t e n c y of the larger step.
Thus, the KS's can dynamically define the g r a n u l a r i t y in t h e
h y p o t h e s i s n e t w o r k n e c e s s a r y to assure the desired degree of consistency; this g r a n u l a r i t y
m a y v a r y at d i f f e r e n t places in the blackboard, depending on the particular s t r u c t u r e s
that
occur.
A p p e n d i x A contains a description of the blackboard and KS d e c o m p o s i t i o n s
for
the
H e a r s a y l l s p e e c h - u n d e r s t a n d i n g system.
Hypotheses: Structure and Interrelationships
T h e internal s t r u c t u r e of an hypothesis consists of a fixed set of a t t r i b u t e s
(named
f i e l d s ) ; this set is the same for hypotheses at all levels of representation in the b l a c k b o a r d .
These
attributes
are s e l e c t e d to serve as mechanisms for implementing the
hypothesize-and-test
paradigm.*
data-directed
The values of the attributes are d e f i n e d and m o d i f i e d
by
t h e KS's.
A t t r i b u t e s can be g r o u p e d into several classes:
T h e f i r s t class of attributes names the hypothesis: it contains the unique name of t h e
h y p o t h e s i s , the name of its level, and its label from the element set at that l e v e l .
T h e next class of attributes is composed of parameters which rate the
hypothesis.
T h e s e include s e p a r a t e numerical ratings derived from a) a priori information
about
t h e h y p o t h e s i s , b) analysis actions performed on the hypothesis, c) s y n t h e s i s actions,
a n d d) combinations of (a), (b), and (c).
A n o t h e r set of attributes contains information about KS attention to the h y p o t h e s i s .
These
been
include a cumulative measure of the amount of computation that has
expended
on
the
hypothesis
as
well
as
suggestions
for
how
already
much
more
p r o c e s s i n g s h o u l d occur and of what type (e.g., analysis or synthesis).
O n e v e r y important set of attributes describes the structural relationships w i t h o t h e r
h y p o t h e s e s , as d e s c r i b e d below.
F o r e a c h p r o b l e m domain, it is likely that there are other attributes w h i c h are b a s i c
-——
A
In H e a r s a y l l , a KS can s p e c i f y particular attributes of hypotheses at particular l e v e l s
w h i c h it w a n t s to have monitored. Whenever a change is made to one of t h e s e m o n i t o r e d
a t t r i b u t e s , the KS can be activated and notified of the nature of the change. T h e s e c t i o n
b e l o w o n "Data-Directed Activation of Knowledge Sources" contains a more c o m p l e t e
d e s c r i p t i o n of this p r o c e s s .
7
t o t h e p r o b l e m and w h i c h should be provided in the structure of the
these
form
a
problem-specific
i n s t a n c e , time
class of
attributes.
is a fundamental concept, so the
In s p e e c h
hypotheses;
understanding,
H e a r s a y l l system
has
for
a class
of
a t t r i b u t e s f o r d e s c r i b i n g the b e g i n - and end-time and the duration of the e v e n t w h i c h
the hypothesis represents.
fuzzy
(These attributes include ways of explicitly
notions of the times.)
representing
For vision, likely attributes w o u l d include the
location
a n d d i m e n s i o n of the element and trajectory information for moving o b j e c t s .
T h e c a p a b i l i t y f o r a r b i t r a r y KS-specific attributes is also included.
This c a n b e u s e d
b y a KS to hold a r b i t r a r y information about the hypothesis; in this w a y a KS n e e d not
h o l d s t a t e information about the hypothesis across activations of the KS and a l l o w s ,
f o r example, the e a s y implementation of generator functions.
knowledge
of
m o d i f y the
If s e v e r a l KS's s h a r e
the name of one of these attributes, each of them c a n a c c e s s
attribute's value and thus communicate just as if it w e r e
and
a "standard"
a t t r i b u t e ; this can be used as an escape mechanism for explicit KS intercommunication.
A
u n i q u e class of hypothesis attributes, called processing state attributes, c o n t a i n s
succinct
summaries
and classifications of
the values of the other
attributes.
For
e x a m p l e , the values of the rating attributes are summarized and the h y p o t h e s i s
classified
as
(strongly
either
verified
"unrated",
and
"neutral"
unique), or
(noncommittal),
"rejected".
Other
"verified",
processing
is
"guaranteed"
state
attributes
s u m m a r i z e the structural relationships with other hypotheses and c h a r a c t e r i z e , f o r
e x a m p l e , w h e t h e r the hypothesis has been "sufficiently and c o n s i s t e n t l y "
s y n t h e t i c a l l y (i.e., as an abstraction of hypotheses at lower levels).
The
described
processing
s t a t e a t t r i b u t e s are especially useful for efficiently triggering k n o w l e d g e s o u r c e s ; f o r
e x a m p l e , a KS may s p e c i f y in its precondition that it is to be activated w h e n e v e r
hypothesis
for
the
at a particular level becomes "verified".
g o a l - d i r e c t e d scheduling of
knowledge
a
These attributes are also u s e d
sources, as d e s c r i b e d
in t h e
next
section.
G i v e n a s p e c i f i c hypothesis, a KS can examine the value of any of its a t t r i b u t e s .
knowledge
source
also needs the ability to retrieve sets of h y p o t h e s e s
whose
A
attributes
s a t i s f y c o n d i t i o n s in w h i c h the KS is interested. (E.g., a KS in the s p e e c h s y s t e m may w a n t t o
find
all
hypotheses
at
p a r t i c u l a r time range.)
a c c o m p l i s h i n g this.
the
phonetic
level
which
are
vowels
and w h i c h
This partial s p e c i f i c a t i o n p e r m i t s a
a) a set of desired values or
A
to
matching-prototype
provided.)
is
applied
a
set
of
hypotheses;*
b) a d o n ' t - c a r e c o n d i t i o n .
those
match those specified b y the matching-prototype
r e s u l t of t h e s e a r c h .
also
a
T h e s e a r c h condition is specified b y a m a t c h i n g - p r o t o t y p e . w h i c h is a
c o m p o n e n t to b e c h a r a c t e r i z e d b y :
values
within
T h e system provides an associative retrieval s e a r c h mechanism f o r
p a r t i a l s p e c i f i c a t i o n of the components of a hypothesis.
component
occur
More
hypotheses
are
whose
returned
as
the
(Associative retrieval of structural relationships among h y p o t h e s e s
complex retrievals can be accomplished b y combining
the
is
retrieval
p r i m i t i v e s in a p p r o p r i a t e w a y s .
1
T h i s set c a n b e d e r i v e d b y the KS from several sources.
includes
the
hypotheses
attributes
partition
following
at
a
primitive
particular
sources:
level,
c) all hypotheses
(in the
of
the
blackboard), and
d) all hypotheses
at a particular
whose
implementation
blackboard),
level
o v e r l a p a given interval (this provides an extremely efficient,
m o n i t o r e d (for the KS) have changed.
8
The Hearsayll
a) all hypotheses
attributes
whose
b ) all
time
two-dimension
which
are
being
Structural
represented
abstractions
relationships
through
the
use
between
of
nodes
links; links
(hypotheses)
in
the
of
specifying
contextual
A link is an element w h i c h
associates
provide
about the relationships of hypotheses.
a means
blackboard
are
t w o h y p o t h e s e s as an o r d e r e d pair; one of the nodes is termed the u p p e r h y p o t h e s i s , a n d
t h e o t h e r is c a l l e d the l o w e r hypothesis.
The lower hypothesis is said to s u p p o r t t h e u p p e r
h y p o t h e s i s w h i l e the u p p e r hypothesis is called a use of the lower one; in g e n e r a l , t h e l o w e r
h y p o t h e s i s is at the same or a lower level in the blackboard than the u p p e r h y p o t h e s i s .
T h e r e a r e s e v e r a l t y p e s of links,
r e l a t i o n s h i p s . * C o n s i d e r this structure:
with
the
types
describing
various
kinds
of
HI
H2
H3
H4
H I is t h e u p p e r h y p o t h e s i s and H2, H3, and H4 are the lower h y p o t h e s e s of links L I , L 2 , a n d
L3, respectively.
If the links are all of type OR, the interpretation is that H I is e i t h e r a n H 2
o r a n H3 o r an H4. T h i s is one w a y that alternative descriptions are p o s s i b l e .
the
figure
are
necessary
of
type
to support
AND, the interpretation
the existence of
HI.
is that all of
the l o w e r
If t h e links in
hypotheses
(Note that, in general, all of
the
are
supporting
( l o w e r ) links of a h y p o t h e s i s are of the same type; one can thus talk of the " t y p e o f
the
h y p o t h e s i s " , w h i c h is the same as the t y p e of all of its lower links.)
These
two
types
s p e c i f i e s a set/member
of
node
represent different
kinds of
abstractions:
the
OR-node
relationship while the AND-node defines a composition a b s t r a c t i o n .
V a r i a n t s of the A N D - and OR-links are also possible.
For example, a SEQUENCE link is similar
t o t h e A N D - l i n k e x c e p t that an ordering is implied on the set of lower h y p o t h e s e s s u p p o r t i n g
the upper hypothesis.
(For the Hearsayll speech understanding system, this o r d e r i n g u s u a l l y
is i n t e r p r e t e d as indicating a time ordering of the lower hypotheses.)
B e s i d e s s h o w i n g analysis and synthesis relationships b e t w e e n h y p o t h e s e s (e.g., that
o n e h y p o t h e s i s is c o m p o s e d of several other units), a link is a statement about the d e g r e e t o
w h i c h o n e h y p o t h e s i s implies (i.e., "gives evidence for the existence of") another h y p o t h e s i s .
T h e s t r e n g t h of the implication is held as attributes of the link.
The s e n s e of the i m p l i c a t i o n
m a y b e n e g a t i v e ; that is, a link may indicate that one hypothesis is evidence f o r the i n v a l i d i t y
of another.
hypothesis
T h i s statement of implication may be bi-directional; the e x i s t e n c e of t h e
may g i v e c r e d e n c e to the existence of the lower
hypothesis
upper
and vice v e r s a .
.
1
T h e p a r t i c u l a r kinds of relationships described here are some of those that w e r e w e r e
d e s i g n e d f o r the s p e e c h problem. Although they undoubtedly are not the c o m p l e t e s e t f o r
all c o n c e i v a b l e needs, they do represent the kinds of relationships that n e e d to b e a n d a r e
e x p r e s s a b l e in the blackboard.
9
Finally, these
relationships can be constructed in an iterative manner; links c a n b e
added
b e t w e e n e x i s t i n g h y p o t h e s e s b y KS's as they discover new evidence f o r s u p p o r t .
J u s t as an h y p o t h e s i s can have more than one lower link, so it can have s e v e r a l u p p e r
links.
E a c h of t h e s e r e p r e s e n t s a different use of the hypothesis; the uses may b e c o m p e t i n g
or complementary.
T h e ability to have multiple uses and supports of the same h y p o t h e s i s , as
o p p o s e d t o c r e a t i n g duplicates for each competing use and abstraction, s e r v e s t o k e e p t h e
b l a c k b o a r d compact
and t h e r e b y reduces the combinatoric explosion in the s e a r c h s p a c e .
F u r t h e r , s i n c e all the information about the hypothesis is localized, all uses and s u p p o r t s o f
the
hypothesis
automatically
and immediately
share
any
new
information
added
to
the
h y p o t h e s i s b y a n y k n o w l e d g e sources.
A p r o b l e m w i t h this localization can occur if the interactions b e t w e e n h y p o t h e s e s s p a n
more than one level.
1
In this case, a particular support of the hypothesis (at a l o w e r l e v e l )
m a y b e i n c o n s i s t e n t w i t h one (or more) of the uses of the hypothesis (at a higher l e v e l ) b u t
is
consistent
with
other
uses (or
potential
uses) of
the
hypothesis.
In o r d e r
to
avoid
d u p l i c a t i n g t h e h y p o t h e s i s , a mechanism, called a connection matrix, exists in t h e s y s t e m .
A
c o n n e c t i o n matrix is an attribute of a hypothesis; its value specifies w h i c h of t h e a l t e r n a t i v e
s u p p o r t s of the h y p o t h e s i s are applicable ("connected to") which of its uses.
connection
matrix
T h e use of a
allows the results of previous decisions of KS's to b e a c c u m u l a t e d
for
f u t u r e u s e and modification without necessitating contextual duplication of p a r t s of t h e d a t a
base.
T h i s kind of reusage and multiple usage of blackboard s t r u c t u r e s r e d u c e s much of t h e
e x p e n s i v e b a c k t r a c k i n g that characterizes many problem-solving systems.
A p p e n d i x B contains an example of a structure built in the b l a c k b o a r d of t h e H e a r s a y l l
system.
Goal-Directed Scheduling of Knowledge Sources
A s d e s c r i b e d earlier, the overall goal of the system is to c r e a t e the most
network
of
hypotheses
that
sufficiently
spans
the
levels.
At
any
instant
of
plausible
time,
b l a c k b o a r d may contain many incomplete networks, each of w h i c h is plausible as f a r
goes.
of
Some of t h e s e incomplete networks may also share s u b n e t w o r k s .
analysis
and
synthesis
actions
of
knowledge
sources,
incomplete
e x p a n d e d (or c o n t r a c t e d ) and may be joined together (or fragmented).
may
be
many
A g a i n , this fits w e l l into Simon's formulation of hierarchical systems.
10
networks
can
be
A t a n y time, t h e r e
for
the
T h e task of goal-directed scheduling is to d e c i d e t o w h i c h of
t h e s e s i t e s t o allocate computing resources.
1
as it
Through the results
places in the blackboard which satisfy the (precondition) c o n t e x t s
a c t i v a t i o n of particular KS's.
the
S e v e r a l of the attribute classes of a hypothesis can be helpful in making s c h e d u l i n g
decisions.
P a r t i c u l a r l y valuable are the values of the attention attributes, w h i c h , as d e s c r i b e d
e a r l i e r , a r e indicators telling how much computation has been e x p e n d e d o n the
hypotheses
a n d s u g g e s t i o n s b y KS's of how desirable it is to devote further e f f o r t o n the
hypothesis
( a l o n g w i t h the kinds of processing that are desirable).
The processing state a t t r i b u t e s
are
a l s o v a l u a b l e f o r making scheduling decisions.
U s i n g t h e s e kinds of information, a knowledge source might be s c h e d u l e d for e x e c u t i o n
b e c a u s e it p o s s e s s e s the only processing capability available to be applied to an i m p o r t a n t
incompletely
focusing
explored
factors
structural
area of
which
connections
the
highlight
blackboard.
activity
For
example, if the
in a blackboard
region
blackboard
in w h i c h
b e t w e e n two adjoining levels, the scheduler
contains
there
should give
are
a
no
higher
p r i o r i t y to a k n o w l e d g e s o u r c e which will attempt (as indicated in its e x t e r n a l s p e c i f i c a t i o n s )
t o make s u c h a c o n n e c t i o n than to a knowledge source which is likely merely t o p e r f o r m a
minor
refinement
processes
analysis
ready
in
on
the
ratings
in one of
the
levels.
However,
if
there
are
to execute, the scheduling algorithm can perform a t y p e of
which
it
schedules
those
knowledge
sources
which
are
likely
no
such
means-ends
to
produce
b l a c k b o a r d c h a n g e s which, in turn, might trigger the activation of KS's in w h i c h the s y s t e m is
currently interested.
The
implementation of
the goal-directed scheduling strategy
is s e p a r a t e d
from
the
a c t i o n s of individual knowledge sources. That is, the decision of whether a KS can c o n t r i b u t e
in a p a r t i c u l a r context is local to the KS, while the assignment of that KS t o o n e of t h e many
contexts
on which
it can possibly
operate is made more globally.
a) d e c o u p l i n g of f o c u s i n g strategy from knowledge-source activity,
e n v i r o n m e n t (blackboard) from the control flow (KS activation), and
The three
quickly
of
b) d e c o u p l i n g of t h e d a t a
c) the limited c o n t e x t in
w h i c h a KS o p e r a t e s , together permit a quick refocusing of attention of KS's.
refocus
aspects
The ability to
is v e r y important because the errorful nature of the KS activity
leads
to
m a n y i n c o m p l e t e and p o s s i b l y contradictory hypothesis networks; thus, as s o o n as p o s s i b l e
a f t e r a n e t w o r k no longer seems promising, the resources of the system s h o u l d b e e m p l o y e d
elsewhere.*
' ^ynsvll*
"
9
7
5
)
d
e
S
C
n
b
e
m
«™°<*>°"
°< SOaMirectad scheduling in the
11
IMPLEMENTATION OF DATA-DIRECTED ACTIVATION OF KNOWLEDGE SOURCES
A s s o c i a t e d w i t h e v e r y knowledge source is a specification of the b l a c k b o a r d c o n d i t i o n s
required
for
the
activation
of
that
knowledge
source.
This
specification,
p r e c o n d i t i o n , is a decision procedure whose tests are matching-prototypes
relationships
arbitrarily
blackboard)
contexts.
and
complex
resulting
decisions
in the
(based
activation of
on
current
and
desired knowledge
past
a
structural
w h i c h , w h e n applied to the blackboard in an associative manner, d e t e c t
r e g i o n s of the b l a c k b o a r d in which the knowledge source is interested.
contain
called
the
This p r o c e d u r e may
modifications
sources
within
to
the
T h e context c o r r e s p o n d i n g to the discovered blackboard r e g i o n w h i c h
the
chosen
satisfies
s o m e k n o w l e d g e s o u r c e ' s precondition is used as an initial context in w h i c h t o activate that
knowledge source.
the
system's
T h e e f f i c i e n c y of the KS precondition evaluation is an important a s p e c t of
implementation, especially
as the
knowledge
is d e c o m p o s e d
into
more
and
s m a l l e r KS's and e a c h KS activation requires less computation.
The
Hearsayll
evaluation
efficient
blackboard.
system,
by
as
placing
an
example
additional
of
an
functions
T h e s e functions are activated whenever
implementation,
in
the
makes
routines
precondition
which
modify
any KS modifies an a t t r i b u t e
the
in
the
b l a c k b o a r d w h i c h some other KS has asked to be monitored. The e s s e n c e of the m o d i f i c a t i o n
is
preserved
in a data structure, called a change set, which is specific
to the
attribute
c h a n g e d and the KS w h i c h requested the monitoring. A KS specifies in a n o n - p r o c e d u r a l w a y
(either
statically
increase
the
or dynamically) those attributes which it wants to monitor.
e f f i c i e n c y , monitoring can further
be localized to
particular
In o r d e r
levels
or
to
even
individual hypotheses.
C h a n g e sets s e r v e to categorize blackboard modifications (events) and are t h u s u s e f u l
i n p r e c o n d i t i o n evaluation since they limit the areas in the blackboard that n e e d b e e x a m i n e d
in
detail.
As
currently
implemented
in
Hearsayll,
the
precondition
evaluator
of
each
k n o w l e d g e s o u r c e exists as a separate process which monitors changes in the d a t a b a s e (i.e.,
it monitors- additions to those change sets in which the KS is interested).
The
precondition
p r o c e s s is itself d a t a - d i r e c t e d in that it is activated only w h e n sufficient c h a n g e s h a v e b e e n
m a d e in the b l a c k b o a r d (i.e., w h e n an entry is made into one of its change s e t s , as a s i d e effect
of
themselves
a
relevant
have
blackboard
preconditions,
knowledge sources.
modification).
albeit
of
In
effect,
a much simpler
the
form
precondition
than
those
processes
possible
for
For example, a precondition process in the s p e e c h s y s t e m may s p e c i f y
that it s h o u l d b e activated w h e n e v e r changes occur to two adjacent h y p o t h e s e s at t h e w o r d
l e v e l o r w h e n e v e r s u p p o r t is added to the phrasal level.
B y using the (coarse) c l a s s i f i c a t i o n s
a f f o r d e d b y c h a n g e sets, the system avoids most unnecessary executions of the p r e c o n d i t i o n
processes.
being
12
T h e major point is that the scheme of precondition evaluation is e v e n t - d r i v e n ,
b a s e d o n the o c c u r r e n c e of changes in the blackboard-, i.e., it is o n l y
at p o i n t s
of
modification
to
the
become satisfied.
blackboard
that
a precondition
that
was
previously
unsatisfied
may
In particular, precondition evaluators are not involved in a f o r m of
busy
w a i t i n g in w h i c h t h e y are constantly looking for something that is not yet t h e r e .
O n c e i n v o k e d , a precondition procedure uses sequences of associative r e t r i e v a l s
structural
matches
satisfying
the
precondition
sources.
on
portions
preconditions
procedure
may
of
of
the
one
blackboard
or
more
be responsible
for
in an attempt
of
"its"
to
establish
knowledge
instantiating several
a
and
context
sources;
any
(related)
knowledge
Notice that the data-directed nature of precondition evaluation
and
given
knowledge-
s o u r c e a c t i v a t i o n is linked closely to the primitive functions that are able to modify t h e d a t a
b a s e , f o r it is o n l y at points of modification that a precondition that w a s u n s a t i s f i e d b e f o r e
may
become
satisfied.
Hence,
data
base
modification
routines
have
the
responsibility
( a l t h o u g h p e r h a p s indirectly) of activating the precondition evaluation mechanism.
Implementation on Parallel Computers
great
Because
of the independence of KS's and their data-directed activation, t h e r e
deal
potential
of
parallelism in this organization.
Trends
in computer
is
a
architecture
i n d i c a t e that large amounts of computing power will be economically r e a l i z e d in a s y n c h r o n o u s
multiprocessor
networks.
Thus,
the
implementation
of
such
large
AI
programs
on
m u l t i p r o c e s s o r s becomes an attractive goal.
There are, however, a set of issues in s u c h an
implementation;
interference
simultaneously
most
of
these
deal
with
to access the blackboard.
among
KS's
when
they
attempt
Effective solutions to these p r o b l e m s h a v e
been
d e v e l o p e d in the H e a r s a y l l implementation; Lesser, et al., (1974), Lesser (1975), and F e n n e l l
a n d L e s s e r ( 1 9 7 5 ) d e s c r i b e these solutions.
ACKNOWLEDGMENTS
W e w i s h to acknowledge the contributions of the following p e o p l e :
Richard Fennell for
his major r o l e in the development of Hearsayl and II (and, in particular, the m u l t i p r o c e s s i n g
a s p e c t s of the s y s t e m organization) and for helpful suggestions for this p a p e r , Raj R e d d y f o r
m a n y of the basic ideas w h i c h have led to the organization d e s c r i b e d here, A l l e n N e w e l l f o r
his insights into the task of problem-solving, Gregory Gill for his untiring e f f o r t s in s y s t e m
i m p l e m e n t a t i o n , and F r e d e r i c k Hayes-Roth for suggestions for this paper.
13
Appendix A:
r r l
EXAMPLE OF BLACKBOARD AND KS DECOMPOSITION IN HEARSAY I I
A
W
1
F i g u r e 1 s h o w s a schematic of the levels of Hearsayll.
Conceptual
Phrasal
Lexical
Syllabic
Surface-phonemic
Phonetic
Segmental
Parametric
Figure 1. The Levels in Hearsayll.
Parametric
utterance
acoustic
Level
-
The
parametric
level
holds
the
most
basic
representation
that the system has; it is the only direct input to the machine
signal.
Several
different
sets
of
parameters
are
being
used
of
the
about
the
in
Hearsayll
i n t e r c h a n g e a b l y : 1/3-octave filter-band energies measured e v e r y 10 m s e c , L P C - d e r i v e d
v o c a l - t r a c t parameters, and w i d e - b a n d energies and z e r o - c r o s s i n g counts.
Segmental
Level
-
This
level
represents
the
utterance
as labeled
acoustic
segments.
A l t h o u g h the set of labels may be phonetic-like, the level is not intended to b e p h o n e t i c
— t h e s e g m e n t a t i o n and labeling reflect acoustic manifestation and do not, f o r e x a m p l e ,
attempt
to
compensate
for
the
context
of
the
segments
a c o u s t i c a l l y dissimilar segments into (phonetic) units.
or
attempt
to
As with all levels, a n y
p o r t i o n of the utterance may be represented by more than one competing
combine
particular
hypothesis
• (i.e., multiple segmentations and labelings may co-exist).
P h o n e t i c L e v e l - A t this level, the utterance is represented b y a phonetic d e s c r i p t i o n .
This
is a b r o a d p h o n e t i c description in that the size (duration) of the units is o n the o r d e r of
t h e " s i z e " of phonemes; it is a fine phonetic description to the extent that e a c h e l e m e n t
is l a b e l e d w i t h a fairly detailed allophonic classification (e.g., "stressed, n a s a l i z e d [I]").
S u r f a c e - P h o n e m i c L e v e l - This level, named by seemingly contradicting terms, r e p r e s e n t s
t h e u t t e r a n c e b y phoneme-like units, with the addition of modifiers such as s t r e s s
and
b o u n d a r y ( w o r d , morpheme, syllable) markings.
S y l l a b i c L e v e l - T h e unit of representation here is the syllable.
L e x i c a l L e v e l - T h e unit of information at this level is the w o r d .
P h r a s a l L e v e l - Syntactic elements appear at this level.
In fact, since a level may c o n t a i n
a r b i t r a r i l y many " s u b - l e v e l s " of elements using the AND and OR links, traditional kinds of
s y n t a c t i c t r e e s can be directly represented here.
C o n c e p t u a l L e v e l - T h e units at this level are "concepts." As with the phrasal lev-el, it may
b e a p p r o p r i a t e to use the graph structure of the da>ta base to indicate r e l a t i o n s h i p s
a m o n g d i f f e r e n t concepts.
A s examples of knowledge sources, Figure 2 shows the first set i m p l e m e n t e d f o r
Hearsayll.
left.
1
T h e levels are indicated as horizontal lines in the figure and are l a b e l e d at t h e
T h e k n o w l e d g e s o u r c e s are indicated by arcs connecting levels; the s t a r t i n g point(s) o f
A p p e n d i c e s A and B are reprinted from Lesser, et al (1974); they are i n c l u d e d h e r e
convenience.
14
for
an arc indicates the level(s) of maior "innnt" fnr t h
"output" level where the
most of
these par icuar
Ln^ill
knoltZ
.
n
a
i/c
K S
«
-
T
a
d
i
- Levels -
°
'
n
J .L.
a
S
n
d
t
h
e
^
e
n
P
d
I n
g
6
n
indicates
o i n t
e
r
a
l
'
t
h
e
a
c
t
i
o
the
"
Knowledge Sources -
CONCEPTUAL
Semantic W o r d Hypothesizer
PHRASAL
-Syntactic Parser
'-4
LEXICAL
Phoneme Hypothesizer
SYLLABIC
W o r d Candidate Generator
^ ^ " P h o n o l o g i c a l Rule Applier
SURFACEPHONEMIC
|
Phone—Phoneme Synchronizer
PHONETIC
- P h o n e Synthesizer
SEGMENTAL
tb
1—r&
9
I—Segment—Phone
Synchronizer
Parameter—Segment
Synchronizer
Segmenter-Classif ier
PARAMETRIC
Figure 2. A Set of Knowledge Sources for H e a r s a y l l .
T h e S e g m e n t e r - C l a s s i f ier knowledge source uses the description of the s p e e c h signal to
p r o d u c e a l a b e l e d acoustic segmentation. For any portion of the utterance, s e v e r a l
p o s s i b l e alternative segmentations and labels may be produced.
The
Phone
Synthesizer
phonetic level.
uses
labeled
acoustic
segments
to
generate
elements
at
the
This p r o c e d u r e is sometimes a fairly direct renaming of an h y p o t h e s i s at
t h e s e g m e n t a l level, perhaps using the context of adjacent segments.
In o t h e r c a s e s ,
p h o n e s y n t h e s i s r e q u i r e s the combining of several segments (e.g., the g e n e r a t i o n of [t]
from
a segment
of silence followed by a segment of aspiration) or
the i n s e r t i o n
of
p h o n e s not indicated directly by the segmentation (e.g., hypothesizing the e x i s t e n c e of
an [ I ] if a v o w e l seems velarized and there is no [I] in the neighborhood).
T h i s KS is
t r i g g e r e d w h e n e v e r a new hypothesis is created at the segmental level.
15
o
f
The
Word
Candidate
locations
and o t h e r
Generator
uses
phonetic
information
(primarily
just
areas of high phonetic reliability) to generate w o r d
at
stressed
hypotheses.
T h i s is accomplished in a two-stage process, with a stop at the syllabic level, f r o m w h i c h
lexical r e t r i e v a l is more effective.
T h e Semantic W o r d Hypothesizer uses semantic and pragmatic information about t h e t a s k
( n e w s r e t r i e v a l , in this case) to predict words at the lexical level.
T h e S y n t a c t i c W o r d Hypothesizer uses knowledge at the phrasal level to p r e d i c t
n e w w o r d s at the lexical level which are adjacent (left or right) to w o r d s
possible
previously
g e n e r a t e d at the lexical level. This knowledge source is activated at the b e g i n n i n g of an
u t t e r a n c e r e c o g n i t i o n attempt and, subsequently, whenever a new w o r d is c r e a t e d at
t h e lexical l e v e l .
T h e P h o n e m e H y p o t h e s i z e r knowledge source is activated whenever a w o r d h y p o t h e s i s is
c r e a t e d (at the lexical level) which is not yet supported by h y p o t h e s e s at the s u r f a c e phonemic level.
Its action is to create one or more sequences at the s u r f a c e - p h o n e m i c
l e v e l w h i c h r e p r e s e n t alternative pronounciations of the w o r d .
(These
pronounciations
a r e c u r r e n t l y p r e - s p e c i f i e d as entries in a dictionary.)
T h e P h o n o l o g i c a l Rule A p p l i e r rewrites sequences at the surface-phonemic l e v e l .
is u s e d :
T h i s KS
1) to augment the dictionary lookup of the Phoneme H y p o t h e s i z e r , and
2) t o
h a n d l e w o r d b o u n d a r y conditions that can be predicted by rule.
The
Phone-Phoneme
Synchronizer
is triggered whenever
e i t h e r the p h o n e t i c or the surface-phonemic level.
hypothesis with hypotheses
either direction.
The
an hypothesis
is
created
at
This KS attempts to link up the n e w
at the other level.
This linking may b e m a n y - t o - o n e
in
S y n t a c t i c P a r s e r uses a syntactic definition of the input language to d e t e r m i n e if a
c o m p l e t e s e n t e n c e may be assembled from words at the lexical level.
The
primary
duties
of
the
S y n c h r o n i z e r are similar:
Segment-Phone
Synchronizer
and
the
Parameter-Segment
to recover from mistakes made by the (bottom-up) actions o f
t h e P h o n e S y n t h e s i z e r and Segmenter-Classifier,
f r o m the higher to the lower level.
respectively, b y
allowing
feedback
In addition t o the knowledge source modules described above, all of w h i c h
speech
knowledge, several
system
in
a
manner
policy
identical
to
modules exist.
the
speech
These
modules, w h i c h i n t e r f a c e
modules,
execute
policy
embody
to
decisions,
the
e.g.,
p r o p a g a t i o n of ratings and calculation of processing-state attributes.
Appendix B:
EXAMPLE OF A BLACKBOARD FRAGMENT IN HEARSAY II
F i g u r e 3 is an example of a fragment that might occur in Hearsayll's b l a c k b o a r d .
The
l e v e l of an h y p o t h e s i s is indicated by its vertical position; the names of the l e v e l s a r e g i v e n
o n t h e left.
Time location is approximately indicated b y horizontal placement, but d u r a t i o n is
o n l y v e r y r o u g h l y indicated (e.g., the boxes surrounding the t w o h y p o t h e s e s at t h e p h r a s a l
l e v e l s h o u l d b e much wider).
Alternatives are indicated by proximity; f o r example, ' w i l l ' a n d
' w o u l d ' a r e w o r d h y p o t h e s e s covering the same time span.
Likewise, ' q u e s t i o n ' and ' m o d a l -
q u e s t i o n ' , ' y o u l ' and ' y o u 2 \ and ' J ' and 'Y' all represent pairs of alternatives.
T h i s example illustrates several features of the data structure:
The
hypothesis
'youj at the lexical level, has two alternative phonemic
"spellings"
i n d i c a t e d ; the h y p o t h e s e s labeled ' y o u l ' and ' y o u 2 ' are nodes c r e a t e d , also at t h e
l e x i c a l l e v e l , t o hold those alternatives.
arbitrarily.
16
In general, such s u b - l e v e l s may b e c r e a t e d
'quest i on'
PHRASAL
(SEQ)
>\\W
'modal q u e s t i o n '
(SEQ)
- \ - '
'you'
(OPT)
LEXICAL
'youl'
(SEQ)
^
V
10
11
4
you2'
(SEQ)
SURFACEPHONEMIC
F i g u r e 3. A n Example of a Fragment in the Blackboard.
T h e link b e t w e e n *youV and D ' is a special kind of SEQUENCE link (indicated h e r e b y
a d a s h e d line) called a CONTEXT link; a CONTEXT link indicates that t h e l o w e r
h y p o t h e s i s s u p p o r t s the upper one and is contiguous to its brother links, but it is n o t
" p a r t o f " t h e u p p e r hypothesis in the sense that i t is not within the time i n t e r v a l o f
t h e u p p e r h y p o t h e s i s — rather, it supplies a context for its brother(s). In this c a s e ,
o n e may " r e a d " t h e structure as stating ' " y o u l ' is composed of ' J ' f o l l o w e d b y ' A X '
( s c h w a ) i n the context of the preceding <D\" (This reflects the phonological rule that
" w o u l d y o u " is o f t e n s p o k e n as "would-ja.") Thus, a CONTEXT link allows important
c o n t e x t u a l relationships to be represented without violating t h e implicit time
a s s u m p t i o n s about SEQUENCE nodes.
?
W h e r e a s t h e phonemic spelling of the w o r d " y o u " held b y *youV includes a c o n t e x t u a l
c o n s t r a i n t , t h e V°u2' option does not have this constraint. However, *youV a n d
V o u 2 ' a r e s u c h similar hypotheses that there is strong reason for w a n t i n g t o r e t a i n
t h e m as alternative options under you* (as indicated in Figure 3), r a t h e r t h a n
r e p r e s e n t i n g them unconnectedly. A connection matrix is used here t o r e p r e s e n t this
k i n d of relationship; the connection matrix of ' y o u ' (symbolized in Figure 3 b y t h e 2d i m e n s i o n a l b i n a r y matrix in the node) specifies that support *youV is relevant t o u s e
' q u e s t i o n ' (but not to 'modal-question') and that support V>u2' is relevant t o b o t h
uses.
K
T h e n a t u r e of the implications represented by the links provides a uniform b a s i s f o r
p r o p a g a t i n g c h a n g e s made in one part of the data structure to other relevant p a r t s w i t h o u t
n e c e s s a r i l y r e q u i r i n g the intervention of particular knowledge s o u r c e s at e a c h s t e p .
C o n s i d e r i n g t h e example of Figure 3, assume that the validity of the h y p o t h e s i s l a b e l e d ' J ' is
m o d i f i e d b y some KS (presumably operating at the phonetic level) and becomes v e r y l o w .
O n e p o s s i b l e s c e n a r i o f o r rippling this change through the data base is g i v e n h e r e :
F i r s t , t h e estimated validity of *youV
is reduced, because ' J ' is a l o w e r h y p o t h e s i s o f
T h i s , in t u r n , may cause the rating of <you' to be reduced.
17
T h e c o n n e c t i o n matrix at ' y o u ' specifies that ' y o u l * is not relevant to ' m o d a l - q u e s t i o n j
s o t h e latter h y p o t h e s i s is not affected by the change in rating of the f o r m e r .
Notice
that the e x i s t e n c e of the connection matrix allows this decision to be made l o c a l l y in
t h e d a t a s t r u c t u r e , without having to search back d o w n to the ' D ' and 'J?
'Question,' h o w e v e r , is s u p p o r t e d by ' y o u l ' (through the connection matrix at you*)
K
}
so
its r a t i n g is a f f e c t e d .
Further
p r o p a g a t i o n s can continue to occur, perhaps d o w n the other SEQUENCE
links
u n d e r ' q u e s t i o n ' and ' y o u l . '
Notice
that
all
of
these
modifications
are
"speech-knowledge
independent"
and
can
be
a c c o m p l i s h e d uniformly at all levels of the blackboard by a single policy k n o w l e d g e s o u r c e .
T h i s p o l i c y KS d o e s not need to access or trigger any other KS but can d i r e c t l y d e r i v e all t h e
i n f o r m a t i o n it n e e d s from the hypothesis and link fields that are uniformly p r e s e n t and f r o m
t h e implicit semantics of the structures in the blackboard.
REFERENCES
B a k e r , J a m e s (1974), "The DRAGON System — A n Overview;' in Erman (1974b).
B a r n e t t , J . ( 1 9 7 3 ) , "A Vocal Data Management System" IEEE Trans. Audio and Electroacoustics,
A U - 2 1 , 3.
B a r n e t t , J . ( 1 9 7 5 , in press), "Module Linkage and Communication in Large Systems," in Reddy
(1975).
E r m a n , L. D., R. D. Fennell, V. R. Lesser, and D. R. Reddy (1973), "System O r g a n i z a t i o n s
Speech
Understanding:
Architectures
for
Implications
of
Network
and
Multiprocessor
for
Computer
A I " Proc. 3 r d Inter. Joint Conf. on Artificial Intel., Stanford, Ca.,
pp. 194-199.
E r m a n , L. D. (1974), " A n Environment
and System for Machine Understanding of
Connected
Speech," Ph.D. thesis, Comp. Sci. Dept, Stanford Univ.
E r m a n , L. D. (ed.) (1974b), Contributed Papers of the IEEE Symposium on Speech Recognition,
A p r i l 1 5 - 1 9 , 1974, Pittsburgh, Pa., IEEE Cat. No. 7 4 C H 0 8 7 8 - 9 A E . M a n y of t h e s e
p a p e r s have b e e n r e p r i n t e d in IEEE Trans, on Acoustics, Speech, and Signal Processing,
A S S P - 2 3 , no. 1 (Feb., 1975).
F e n n e l l , R. D., and V. R. Lesser (1975), "Parallelism in A.I. Problem-Solving: A C a s e Study of
H e a r s a y l l ; T e c h . Report, Comp. Sci. Dept., Carnegie-Mellon Univ., P i t t s b u r g h , P a .
1
F e n n e l l , R. D. ( 1 9 7 5 b , in.preparation), Ph.D. thesis, Comp. Sci. Dept., C a r n e g i e - M e l l o n Univ.
H a y e s - R o t h , F., V. R. Lesser, and D. W. Kosy (1975, in preparation), "A Design f o r
Attentional
C o n t r o l in a Distributed-Logic Speech Understanding System" T e c h . R e p o r t , C o m p . S c i .
Dept., C a r n e g i e - M e l l o n Univ., Pittsburgh, Pa.
H e w i t t , C.
(1972), "Description
and Theoretical
Analysis
(Using
Schemata) of
Planner:
L a n g u a g e for P r o v i n g Theorems and Manipulating Models in a R o b o t " AI M e m o
A
No.
2 5 1 , MIT Project MAC.
L e s s e r , V. R., R. D. Fennell, L D. Errr\an
& D. R. Reddy (1974), "Organization of the H E A R S A Y II
}
S p e e c h Understanding System;
1
in Erman (1974b).
Also a p p e a r e d in IEEE Trans, on
Acoustics, Speech, and Signal Processing, A S S P - 2 3 , no. 1, pp. 1 1 - 2 3 (Feb., 1975).
L e s s e r , V. R. ( 1 9 7 5 , in
press), "Parallel
Processing
S u r v e y of Design Problems" in Reddy (1975).
18
in Speech
Understanding
Systems:
A
N e w e l l , A. (1969), "Heuristic Programming: Ill-Structered Problems;
in J . S. A r o n o f s k y (ed.),
1
Progress in Operations Research, 3, Wiley, 3 6 3 - 4 1 5 .
N e w e l l , A., J . Barnett, J . Forgie, C. Green, D. Klatt, J. C. R. Licklider, J . M u n s o n , R. R e d d y , a n d
W. W o o d s (1971), Speech Understanding Systems: Final Report of a Study Group, p u b .
b y N o r t h - H o l l a n d (1973).
N e w e l l , A. ( 1 9 7 3 ) , " P r o d u c t i o n Systems:
Models of Control Structures,
11
in W . C . C h a s e
(ed.)
Visual Information Processing, Academic Press, pp. 4 6 3 - 5 2 6 .
R e d d y , D. R. (1966), " A n A p p r o a c h to Computer Speech Recognition by Direct A n a l y s i s of t h e
S p e e c h Wave|' (Ph.D. thesis) AI Memo No. 43, Comp. Sci. Dept., S t a n f o r d Univ.,
Stanford, Ca.
R e d d y , D. R., L D. Erman, and R. B. Neely (1973a), "A Model and a S y s t e m f o r M a c h i n e
R e c o g n i t i o n of Speech; IEEE Trans. Audio and Electroacoustics, A U - 2 1 , 3, p p . 2 2 9 - 2 3 8 .
1
R e d d y , D. R.,
L D. Erman,
R. D. Fennell,
and
R. B. Neely
(1973b),
"The
HEARSAY
Speech
U n d e r s t a n d i n g System: A n Example of the Recognition P r o c e s s " Proc. 3rd Inter. Joint
Conf. on Artificial Intel., Stanford, Ca., pp. 185-193.
R e d d y , D. R. (1973), "Eyes and Ears for Computers; Tech. Report, Comp. Sci. Dept., C a r n e g i e 1
M e l l o n University, Pittsburgh, Pa.
Keynote speech p r e s e n t e d at Conf. o n
Cognitive
P r o c e s s e s and Artifical Intelligence, Hamburg, April, 1973.
R e d d y , R.,
and
A. Newell
Understanding
(1974),
System"
in
"Knowledge
and
L. W. Gregg
(ed.)
its
Representation
Knowledge
and
in
Cognition,
a
Speech
Lawrence
E r l b a u m Assoc., Washington, D. C , chap. 10, (in press).
R e d d y , D. R.
(ed.)
(1975,
in
press),
Invited
Papers
of
the
IEEE
Symposium
on
Speech
a
Speech
Recognition, A p r i l 15-19, 1974, Pittsburgh, Pa., Academic Press.
R o v n e r , P.,
N a s h - W e b b e r , B.,
and
Woods, W.
(1974),
U n d e r s t a n d i n g S y s t e m " in Erman (1974b).
"Control
Concepts
in
Also appeared in IEEE Trans, on Acoustics,
Speech, and Signal Processing, A S S P - 2 3 , no. 1, pp. 136-140 (Feb., 1975).
R u l i f s o n , J . F., et al., (1973), "QA4:
A Procedural Calculus for Intuitive Reasoning;' T e c h n i c a l
Note 7 3 , AI Center, Stanford Res. Inst., Menlo Park, Ca.
S i m o n , H. A., (1962), "The Architecture of Complexity", Proc. Amer. Philosophical Society, 1 0 6 ,
pp. 467-482.
V i c e n s , P. ( 1 9 6 9 ) , " A s p e c t s of Speech Recognition," Report C S - 1 2 7 , (Ph.D. Thesis), C o m p . S c i .
Dept., S t a n f o r d Univ.
19