Carnegie Mellon University Research Showcase @ CMU Computer Science Department School of Computer Science 1975 A multi-level organization for problem solving using many, diverse, cooperating sources of knowledge Lee D. Erman Carnegie Mellon University Victor Lesser Follow this and additional works at: http://repository.cmu.edu/compsci This Technical Report is brought to you for free and open access by the School of Computer Science at Research Showcase @ CMU. It has been accepted for inclusion in Computer Science Department by an authorized administrator of Research Showcase @ CMU. For more information, please contact [email protected]. NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or other reproductions of copyrighted material. Any copying of this document without permission of its author may be prohibited by law. A MULTI-LEVEL ORGANIZATION FOR PROBLEM USING MANY, DIVERSE, COOPERATING SOURCES OF KNOWLEDGE Lee D. Erman and Victor R. Lesser March, 1975 A MULTI-LEVEL ORGANIZATION FOR PROBLEM SOLVING USING MANY, DIVERSE, COOPERATING SOURCES OF KNOWLEDGE Lee D. Erman and Victor R. Lesser Computer Science Department 1 Carnegie-Mellon University Pittsburgh, Pa. 15213 March, 1975 ABSTRACT An organization problems. is presented The hypothesize-and-test for implementing solutions to knowledge-based paradigm is used as the basis for c o o p e r a t i o n AI among m a n y d i v e r s e and independent knowledge sources (KS's). The KS's are assumed i n d i v i d u a l l y t o b e e r r o r f u l and incomplete. A u n i f o r m and integrated multi-level structure, the blackboard, holds the c u r r e n t s t a t e of the system. K n o w l e d g e sources cooperate by creating, accessing, and modifying e l e m e n t s in t h e b l a c k b o a r d . T h e activation of a KS is data-driven, based on the o c c u r r e n c e of p a t t e r n s in t h e b l a c k b o a r d w h i c h match templates specified by the knowledge s o u r c e . E a c h l e v e l in the blackboard specifies a different representation of the p r o b l e m s p a c e ; the sequence of approximately levels forms a loose hierarchy in which the elements be described as abstractions of elements at the next at e a c h l e v e l lower can level. This d e c o m p o s t i o n c a n be thought of as an a priori framework of a plan for s o l v i n g the p r o b l e m ; e a c h l e v e l is a g e n e r i c stage in the plan. T h e e l e m e n t s at each level in the blackboard are hypotheses about some a s p e c t that level. set T h e internal structure of an hypothesis consists of a fixed set of a t t r i b u t e s ; t h i s is t h e same for attributes are hypotheses selected to at all levels of representation in the b l a c k b o a r d . serve as mechanisms for implementing the may relationships, deductions create which made by networks are the explicit KS's of structural in the about relationships blackboard, serve the hypotheses; among to they also Knowledge hypotheses. represent allow These data-directed h y p o t h e s i z e - a n d - t e s t paradigm and for efficient goal-directed scheduling of KS's. sources of These inferences and competing and o v e r l a p p i n g partial solutions to be handled in an integrated manner. T h e H e a r s a y l l speech-understanding system is an implementation of this o r g a n i z a t i o n ; it is u s e d h e r e as an example for descriptive purposes. T h i s r e s e a r c h w a s s u p p o r t e d in part by the Defense A d v a n c e d R e s e a r c h P r o j e c t s A g e n c y u n d e r c o n t r a c t no. F 4 4 6 2 0 - 7 3 - C - 0 0 7 4 'and monitored by the Air F o r c e Office" of S c i e n t i f i c Research. 2 This paper was accepted for presentation at the 4th International Conference on Artificial Intelligence, September 1975. INTRODUCTION T h i s p a p e r d e s c r i b e s an organization for knowledge-based artificial i n t e l l i g e n c e programs. (AI) A l t h o u g h this organization has been derived while developing s e v e r a l g e n e r a t i o n s o f s p e e c h u n d e r s t a n d i n g systems, w e feel that it has general application to other domains of l a r g e A I p r o b l e m s (e.g., v i s i o n , 1 robotics, chess, natural language understanding, and p r o t o c o l analysis). Our e f f o r t s f o l l o w from the early work of Reddy (1966) and Reddy and V i c e n s ( V i c e n s , 1 9 6 9 ) , t h r o u g h the H e a r s a y l system (Reddy, et al., 1973a, 1973b; Erman, 1974), w h i c h the first demonstrable connected-speech understanding system, up t h r o u g h the was currently d e v e l o p i n g H e a r s a y l l system (Erman, et al., 1973; Lesser, et al., 1974; Fennell, 1975). These e f f o r t s have increasingly f o c u s e d on the overall system organization for s o l v i n g the p r o b l e m ; this has resulted in the design and construction of a sophisticated e n v i r o n m e n t w i t h i n w h i c h problem-solving strategies are d e v e l o p e d . and structured Others w o r k i n g in t h i s a r e a also c o n s i d e r this aspect important.^ The Hearsayll system will b e u s e d h e r e as t h e p r i m a r y e x a m p l e for d e s c r i b i n g the organization. THE PROBLEM T h e c l a s s of A I problem that is addressed in this paper is c h a r a c t e r i z e d b y h a v i n g a l a r g e p r o b l e m s p a c e and the requirement of a large amount of knowledge f o r its s o l u t i o n . The l a r g e amount of explicit knowledge differentiates these problems f r o m o t h e r A I (e.g., t h e o r e m - p r o v i n g ) in which v e r y general "weak" methods are a p p l i e d using areas meager a m o u n t s of b u i l t - i n knowledge (Newell, 1969). Further, the knowledge n e e d e d c o v e r s a w i d e and diverse set of areas (some examples in the speech understanding p r o b l e m a r e a n a l y s i s , a c o u s t i c - p h o n e t i c s , phonology, syntax, semantics, and pragmatics). We signal call each s u c h a r e a a k n o w l e d g e - s o u r c e (KS) and also define a KS to be an agent w h i c h e m b o d i e s t h e k n o w l e d g e of its area and which can take actions based on that knowledge.^ T h e s o u r c e s of knowledge are often incomplete and approximate. may 1 be traced to three sources: First, the theory on which the This e r r o r f u l n a t u r e KS is based may be R e d d y ( 1 9 7 3 ) is a comparison of the speech and vision problem domains. N e w e l l , et al., (1971) contains an excellent in-depth study of the s p e e c h problem. The current state-of-the-art is represented in the p a p e r s of S y m p o s i u m o n S p e e c h Recognition (Erman, 1974b; Reddy, 1975). (1973, understanding the 1974 In particular, IEEE Barnett 1975), and Rovner, et al., (1974) also describe highly s t r u c t u r e d systems; B a k e r ( 1 9 7 4 ) has a highly s t r u c t u r e d system based on a simple Markov model. ^ F o r the p u r p o s e s of this discussion, a KS can be considered static; i.e., w h e t h e r a KS l e a r n s f r o m e x p e r i e n c e is an issue that is orthogonal to this organization. 1 incomplete or incorrect. problem, are often For example, modern phonological theories, as applied to the s p e e c h vague and incomplete. Second, the implementation of a KS may be i n c o m p l e t e or incorrect; this may be caused by an incorrect translation of the t h e o r y t o t h e p r o g r a m o r b y an intentionally heuristic implementation of the t h e o r y . Finally, t h e k n o w l e d g e s o u r c e may b e o p e r a t i n g on incorrect or incomplete data supplied to it b y o t h e r K S ' s . 1 A s o n e k n o w l e d g e source makes errors and creates ambiguities, o t h e r KS's must brought soon to bear to c o r r e c t and clarify those actions. as possible after the introduction of be This KS c o o p e r a t i o n s h o u l d o c c u r an e r r o r or ambiguity in o r d e r to as limit its ramifications. A mechanism for providing this high degree of cooperation is the h y p o t h e s i z e - a n d - t e s t paradigm. In this paradigm, solution-finding is v i e w e d as an iterative p r o c e s s . the iteration involves some Each s t e p in a) the creation of an hypothesis, which is an "educated g u e s s " a b o u t a s p e c t of the problem, and b) a test of the plausibility of the h y p o t h e s i s . t h e s e s t e p s use a p r i o r i knowledge about the problem, as well as the p r e v i o u s l y hypotheses. This iterative guess-building terminates when a consistent Both of generated hypothesis is g e n e r a t e d w h i c h satisfies the requirements of an overall solution. A s a s t r a t e g y for developing such systems, one needs the ability to add and r e p l a c e s o u r c e s of k n o w l e d g e and to explore different control strategies. Thus, s u c h c h a n g e s must b e r e l a t i v e l y e a s y to accomplish; there must also be ways to evaluate the p e r f o r m a n c e of t h e s y s t e m in g e n e r a l and the roles of the various knowledge sources and c o n t r o l s t r a t e g i e s in particular. T h i s ability to experiment conveniently with the system is crucial if the amount of k n o w l e d g e is large and many people are needed to introduce and validate it. One means of h e l p i n g to p r o v i d e these flexibilities is to require that KS's be independent. B e c a u s e the problems are large and require many computation s t e p s f o r their s o l u t i o n , the system must be efficient "production" application development versions s y s t e m is d e v e l o p e d . in its computation. system; however, it must This also must be be certainly reasonably because of the experimental w a y that a complex, true efficient for in a the knowledge-based That is, many iterative runs over a significant amount of test d a t a must b e made t o d e v e l o p and evaluate the knowledge sources and control strategies. 1 T h i s may also i ^ u d e externally supplied data <e.g., the digitized i s t h e ^ t i c .^^ - J e f m a J h i n input to the speech-understanding system); the transducers of t h e s e d a t a c a n c o n s i d e r e d to b e KS's w h i c h also introduce error. 2 a c 0 ^ be MODEL FOR COOPERATION OF KNOWLEDGE SOURCES T h e r e q u i r e m e n t that knowledge sources be independent implies that the ( a n d v e r y e x i s t e n c e ) of e a c h must not be necessary or crucial to the o t h e r s . functioning On t h e o t h e r h a n d , t h e KS's are r e q u i r e d to cooperate in the iterative guess-building, using and c o r r e c t i n g one another's guesses; this implies that there must be interaction among the processes. T h e s e t w o o p p o s i n g requirements have led to a design in which each KS i n t e r f a c e s t o t h e o t h e r s e x t e r n a l l y in a uniform way that is identical across KS's and in w h i c h no s o u r c e k n o w s w h a t or how many other KS's exist. knowledge The interface is implemented as a d y n a m i c g l o b a l data s t r u c t u r e , called the blackboard. The primary units in the b l a c k b o a r d are g u e s s e s about particular throughout aspects of the problem; these the blackboard, are called hypotheses. units, which have a uniform structure At any time, the b l a c k b o a r d holds c u r r e n t s t a t e of the system; it contains all the guesses about the problem that exist. the Subsets o f h y p o t h e s e s r e p r e s e n t partial solutions to the entire problem; these may c o m p e t e w i t h t h e p a r t i a l s o l u t i o n s r e p r e s e n t e d by other (perhaps overlapping) subsets. E a c h k n o w l e d g e s o u r c e may access any information in the blackboard. i n f o r m a t i o n to the blackboard by creating (or deleting) hypotheses, b y hypotheses, and by hypotheses. The generation exclusive establishing and or modifying modification of explicit globally means of communication among the diverse KS's. structural accessible Each may a d d modifying existing relationships among hypotheses is the This mechanism of c o o p e r a t i o n , w h i c h is an implementation of the hypothesize-and-test paradigm, allows a KS to c o n t r i b u t e knowledge w i t h o u t being aware of which other KS's will use the information or w h i c h s u p p l i e d the information that it used. independent and s e p a r a b l e . It is in this way that knowledge s o u r c e s are KS made The structural relationships (which are mentioned a b o v e and w h i c h w i l l be d e s c r i b e d below) form a network of the hypotheses and are u s e d to r e p r e s e n t t h e d e d u c t i o n s and inferences which caused a KS to generate one h y p o t h e s i s f r o m o t h e r s . T h e explicit r e t e n t i o n in the blackboard of these dependency relationships is u s e d to h o l d , a m o n g o t h e r things, competing hypotheses. Because these are held in an i n t e g r a t e d manner, s e l e c t i v e b a c k t r a c k i n g for e r r o r r e c o v e r y and other search strategies can be i m p l e m e n t e d in an e f f i c i e n t and non-redundant way. Decomposition of Knowledge T h e d e c o m p o s i t i o n of the overall task into various knowledge s o u r c e s is r e g a r d e d as b e i n g natural; i.e., the units of the decomposition represent those pieces of k n o w l e d g e w h i c h can 1 be distinguished and recognized as being somehow naturally independent* Such a T h e a p p r o a c h taken in knowledge source decomposition is not an attempt to c h a r a c t e r i z e 3 s c h e m e of " i n v e r s e decomposition" (or, composition) seems v e r y natural for many p r o b l e m s o l v i n g tasks, and it fits well into the hypothesize-and-test approach to p r o b l e m - s o l v i n g . long as a sufficient "covering set" of knowledge areas required for problem solution m a i n t a i n e d , o n e c a n f r e e l y add new knowledge sources, or replace or delete o l d o n e s . knowledge source is self-contained, but each is expected to cooperate As with is Each the other k n o w l e d g e s o u r c e s that h a p p e n to be present in the system at that time, A k n o w l e d g e source is specified in three parts: a) the conditions under w h i c h it is t o b e a c t i v a t e d (in terms of the conditions in the blackboard in which it is i n t e r e s t e d ) , b) t h e k i n d s of c h a n g e s it makes to the blackboard, and 3) a procedural statement (program) of t h e algorithm A which possessing accomplishes some processing those changes. capability which is knowledge able to source solve some is thus defined subproblem, as given a p p r o p r i a t e circumstances for its activation. Activation of Knowledge Sources A knowledge source is instantiated as a knowledge-source process whenever the b l a c k b o a r d exhibits characteristics which satisfy a "precondition" of the k n o w l e d g e s o u r c e . A p r e c o n d i t i o n of a KS is a description of some partial state of the b l a c k b o a r d w h i c h d e f i n e s w h e n and w h e r e the KS can contribute its knowledge b y modifying the b l a c k b o a r d . The KS c o n t r i b u t e s its k n o w l e d g e through the mechanism of making h y p o t h e s e s and e v a l u a t i n g and m o d i f y i n g the c o n t r i b u t i o n s of other knowledge sources (by v e r i f y i n g and rating or r e j e c t i n g the hypotheses made by other knowledge sources). A KS carries out these actions r e s p e c t to a particular context, the context being some subset of the p r e v i o u s l y with generated h y p o t h e s e s in the blackboard. Thus, new hypotheses or modifications to existing h y p o t h e s e s a r e c o n s t r u c t e d f r o m the (static) knowledge of the KS and the e d u c a t e d g u e s s e s made at s o m e p r e v i o u s time b y other knowledge sources. The trigger modifications further made by any given knowledge-source process are expected k n o w l e d g e sources by creating new conditions in the b l a c k b o a r d to t h o s e k n o w l e d g e s o u r c e s , in turn, respond. to which The structure of a hypothesis is s o d e s i g n e d as t o a l l o w the p r e c o n d i t i o n s of most KS's to be sensitive to a single, simple c h a n g e in s o m e h y p o t h e s i s (such as the creation of a new hypothesis of a particular t y p e , a c h a n g e of r a t i n g , o r the c r e a t i o n of a structural link between particular kinds of h y p o t h e s e s ) . Through s o m e h o w the o v e r a l l problem solution process and then apply some sort of t r a f f i c a n a l y s i s t o its internal workings in order to decompose the total p r o c e s s interacting into k n o w l e d g e sources. Rather, knowledge sources are d e f i n e d b y s o m e intuitive notion about the various pieces of knowledge w h i c h c o u l d b e in a u s e f u l w a y to help achieve a solution. 4 a flow minimally starting with incorporated this d a t a - d i r e c t e d interpretation of the hypothesize-and-test paradigm, KS's c a n also e x h i b i t a h i g h d e g r e e of asynchronous activity and potential parallelism. Control 1 schemes in which one KS explicitly invokes other KS's are not appropriate b e c a u s e of the requirement that KS's be independent and because the invocation of a KS may d e p e n d o n a complex set of conditions which is created by the combined actions of several KS's. contain F u r t h e r , s u c h direct-calling schemes complicate KS's by requiring that t h e y information about the KS's that they will call. These same arguments apply against a c e n t r a l i z e d c o n t r o l scheme which is explicitly predefined for a set of KS's. Decomposition of the Blackboard T h e b l a c k b o a r d is partitioned into distinct information levels; e a c h level is u s e d t o h o l d a d i f f e r e n t r e p r e s e n t a t i o n of the problem space. (Examples of levels in the s p e e c h p r o b l e m a r e " s y n t a c t i c " , "lexical", "phonetic", and "acoustic"; examples in scene analysis a r e "picture p o i n t " , "line segment", "region", and "object".) Associated with each level is a set of p r i m i t i v e e l e m e n t s a p p r o p r i a t e for representing the problem at that level. (In the s p e e c h s y s t e m , f o r e x a m p l e , the elements at the lexical level are the words of the v o c a b u l a r y to b e r e c o g n i z e d , while the elements at the phonetic level are the phones (sounds) of English.) Each h y p o t h e s i s e x i s t s at a particular level and is labeled as being a particular element of t h e s e t o f p r i m i t i v e elements at that level. The decomposition of the problem space into levels is a natural d e c o m p o s i t i o n into KS's of the knowledge that is to be brought to bear. parallel to the For many KS's, t h e K S n e e d s to deal w i t h only one or. a few levels to apply its knowledge; it n e e d not e v e n b e aware of the existence of other levels. Thus, each KS can be made as simple as its k n o w l e d g e allows; its interface to the rest of the system is in units and c o n c e p t s w h i c h a r e n a t u r a l to it. A l s o , n e w levels can be added as new sources of k n o w l e d g e w h i c h n e e d to use them. for efficiently are designed Finally, it will be shown that the multi-level r e p r e s e n t a t i o n s e q u e n c i n g the activity of the KS's in a non-deterministic manner allows and for m a k i n g use of multiprocessing. 1 O n e might think of this model for data-directed activation of KS's as a p r o d u c t i o n s y s t e m ( N e w e l l , 1 9 7 3 ) w h i c h is executed asynchronously. The preconditions c o r r e s p o n d t o t h e l e f t - h a n d s i d e s (conditions) of productions, and the knowledge sources c o r r e s p o n d t o t h e r i g h t - h a n d s i d e s (actions) of the productions. Conceptually, these l e f t - h a n d s i d e s a r e e v a l u a t e d continuously. When a precondition is satisfied, an instantiation of the c o r r e s p o n d i n g r i g h t - h a n d side of its production is created; this instantiation is e x e c u t e d at s o m e a r b i t r a r y subsequent time (perhaps subject to instantiation scheduling c o n s t r a i n t s ) . It is i n t e r e s t i n g to note that this generalized form of h y p o t h e s i z e - a n d - t e s t leads t o a s y s t e m o r g a n i z a t i o n w i t h some characteristics also similar to Q A 4 (Rulifson, et al., 1 9 7 3 ) a n d P L A N N E R (Hewitt, 1972). In particular, there are strong similarities in the d a t a - d i r e c t e d s e q u e n c i n g of p r o c e s s e s . 5 T h e s e q u e n c e of levels forms a loose hierarchical structure in w h i c h the e l e m e n t s each l e v e l c a n approximately be described as abstractions of elements at the level.* next at lower (For example, an utterance is composed of phrases, w h i c h are made of w o r d s , p u t together as s y l l a b l e s , each of which can be described as a sequence of p h o n e s , e a c h of w h i c h is c o m p o s e d of acoustic segments, each of which can be d e s c r i b e d b y a s e q u e n c e of t e n - m i l l i s e c o n d intervals w i t h certain kinds of acoustic characteristics.) M o s t of the relationships of a hypothesis are with hypotheses at its level o r a d j a c e n t l e v e l s ; f u r t h e r , t h e s e relationships can usually be derived (by a KS a p p r o p r i a t e to the l e v e l ) without h a v i n g to delve below the level of abstraction of the hypothesis. c o n t e x t simplifies the function of knowledge sources. This locality of (Or from the other point of v i e w , t h e d e c o m p o s i t i o n of k n o w l e d g e into sufficiently simple-acting KS's also simplifies and l o c a l i z e s r e l a t i o n s h i p s in the blackboard.)^ T h e d e c o m p o s i t i o n of the blackboard into distinct levels of r e p r e s e n t a t i o n c a n also b e t h o u g h t of as an a p r i o r i framework of a plan for problem-solving. s t a g e in t h e plan. Each level is a g e n e r i c T h e goal at each level is to create and validate h y p o t h e s e s at that l e v e l . T h e o v e r a l l goal of the system is to create the most plausible n e t w o r k of h y p o t h e s e s sufficiently covers the levels. ('Plausible' and 'sufficiently' s u f f i c i e n t in the judgment of the knowledge sources".) here mean that "plausible and In speech understanding, f o r e x a m p l e , t h e g o a l at the phonetic level is a phonetic transcription of the utterance, w h i l e t h e o v e r a l l goal is a n e t w o r k w h i c h connects hypotheses directly d e r i v e d from the acoustic input to context of synthesis, or h y p o t h e s e s w h i c h d e s c r i b e the semantic content of the utterance. The creation hypotheses at a or modification lower level (or of an hypothesis levels) can be which is considered based an action on a of a b s t r a c t i o n ; c o n v e r s e l y , manipulations of an hypothesis based o n a higher level c o n t e x t b e c o n s i d e r e d analysis, or elaboration. In order to overcome the e r r o r f u l n e s s of the and nature, both kinds of also make use of their redundant action are d e s i r a b l e can KS's in the system.^ Many of the ideas here fit neatly into Simon's description of a "nearly decomposable h i e r a r c h i c a l s y s t e m " (Simon, 1962). This simplification of form and interaction is an expected characteristic of a nearly d e c o m p o s a b l e hierarchical system (ibid.). T h e use of the terms 'analysis' and 'synthesis' here are r e v e r s e d from their usual u s e s in t h e s p e e c h r e c o g n i t i o n domain. representation direction. Traditionally, 'synthesis' means going f r o m a h i g h e r - l e v e l (e.g., lexical) to the speech signal, while analysis refers the other In s p e e c h recognition, however, the object is really to s y n t h e s i z e a meaning f o r t h e u t t e r a n c e f r o m the pieces of data which make up the s p e e c h signal. 6 to O f t e n , the context for an analysis or synthesis action is localized to the l e v e l just a b o v e o r b e l o w the level at which the action takes place. However, this is not a r e q u i r e m e n t ; i n f a c t , an action w h i c h skips over several levels can serve strongly to direct the a c t i v i t y of t h e s y s t e m and t h e r e b y significantly prune the search space. e q u i v a l e n t to c o n s t r u c t i n g a major step in a plan. Such a jump o v e r levels is Further, there is no r e q u i r e m e n t that a j u m p n e c e s s a r i l y b e filled in completely (or even partially) if KS's are confident e n o u g h in t h e c o n s i s t e n c y of the larger step. Thus, the KS's can dynamically define the g r a n u l a r i t y in t h e h y p o t h e s i s n e t w o r k n e c e s s a r y to assure the desired degree of consistency; this g r a n u l a r i t y m a y v a r y at d i f f e r e n t places in the blackboard, depending on the particular s t r u c t u r e s that occur. A p p e n d i x A contains a description of the blackboard and KS d e c o m p o s i t i o n s for the H e a r s a y l l s p e e c h - u n d e r s t a n d i n g system. Hypotheses: Structure and Interrelationships T h e internal s t r u c t u r e of an hypothesis consists of a fixed set of a t t r i b u t e s (named f i e l d s ) ; this set is the same for hypotheses at all levels of representation in the b l a c k b o a r d . These attributes are s e l e c t e d to serve as mechanisms for implementing the hypothesize-and-test paradigm.* data-directed The values of the attributes are d e f i n e d and m o d i f i e d by t h e KS's. A t t r i b u t e s can be g r o u p e d into several classes: T h e f i r s t class of attributes names the hypothesis: it contains the unique name of t h e h y p o t h e s i s , the name of its level, and its label from the element set at that l e v e l . T h e next class of attributes is composed of parameters which rate the hypothesis. T h e s e include s e p a r a t e numerical ratings derived from a) a priori information about t h e h y p o t h e s i s , b) analysis actions performed on the hypothesis, c) s y n t h e s i s actions, a n d d) combinations of (a), (b), and (c). A n o t h e r set of attributes contains information about KS attention to the h y p o t h e s i s . These been include a cumulative measure of the amount of computation that has expended on the hypothesis as well as suggestions for how already much more p r o c e s s i n g s h o u l d occur and of what type (e.g., analysis or synthesis). O n e v e r y important set of attributes describes the structural relationships w i t h o t h e r h y p o t h e s e s , as d e s c r i b e d below. F o r e a c h p r o b l e m domain, it is likely that there are other attributes w h i c h are b a s i c -—— A In H e a r s a y l l , a KS can s p e c i f y particular attributes of hypotheses at particular l e v e l s w h i c h it w a n t s to have monitored. Whenever a change is made to one of t h e s e m o n i t o r e d a t t r i b u t e s , the KS can be activated and notified of the nature of the change. T h e s e c t i o n b e l o w o n "Data-Directed Activation of Knowledge Sources" contains a more c o m p l e t e d e s c r i p t i o n of this p r o c e s s . 7 t o t h e p r o b l e m and w h i c h should be provided in the structure of the these form a problem-specific i n s t a n c e , time class of attributes. is a fundamental concept, so the In s p e e c h hypotheses; understanding, H e a r s a y l l system has for a class of a t t r i b u t e s f o r d e s c r i b i n g the b e g i n - and end-time and the duration of the e v e n t w h i c h the hypothesis represents. fuzzy (These attributes include ways of explicitly notions of the times.) representing For vision, likely attributes w o u l d include the location a n d d i m e n s i o n of the element and trajectory information for moving o b j e c t s . T h e c a p a b i l i t y f o r a r b i t r a r y KS-specific attributes is also included. This c a n b e u s e d b y a KS to hold a r b i t r a r y information about the hypothesis; in this w a y a KS n e e d not h o l d s t a t e information about the hypothesis across activations of the KS and a l l o w s , f o r example, the e a s y implementation of generator functions. knowledge of m o d i f y the If s e v e r a l KS's s h a r e the name of one of these attributes, each of them c a n a c c e s s attribute's value and thus communicate just as if it w e r e and a "standard" a t t r i b u t e ; this can be used as an escape mechanism for explicit KS intercommunication. A u n i q u e class of hypothesis attributes, called processing state attributes, c o n t a i n s succinct summaries and classifications of the values of the other attributes. For e x a m p l e , the values of the rating attributes are summarized and the h y p o t h e s i s classified as (strongly either verified "unrated", and "neutral" unique), or (noncommittal), "rejected". Other "verified", processing is "guaranteed" state attributes s u m m a r i z e the structural relationships with other hypotheses and c h a r a c t e r i z e , f o r e x a m p l e , w h e t h e r the hypothesis has been "sufficiently and c o n s i s t e n t l y " s y n t h e t i c a l l y (i.e., as an abstraction of hypotheses at lower levels). The described processing s t a t e a t t r i b u t e s are especially useful for efficiently triggering k n o w l e d g e s o u r c e s ; f o r e x a m p l e , a KS may s p e c i f y in its precondition that it is to be activated w h e n e v e r hypothesis for the at a particular level becomes "verified". g o a l - d i r e c t e d scheduling of knowledge a These attributes are also u s e d sources, as d e s c r i b e d in t h e next section. G i v e n a s p e c i f i c hypothesis, a KS can examine the value of any of its a t t r i b u t e s . knowledge source also needs the ability to retrieve sets of h y p o t h e s e s whose A attributes s a t i s f y c o n d i t i o n s in w h i c h the KS is interested. (E.g., a KS in the s p e e c h s y s t e m may w a n t t o find all hypotheses at p a r t i c u l a r time range.) a c c o m p l i s h i n g this. the phonetic level which are vowels and w h i c h This partial s p e c i f i c a t i o n p e r m i t s a a) a set of desired values or A to matching-prototype provided.) is applied a set of hypotheses;* b) a d o n ' t - c a r e c o n d i t i o n . those match those specified b y the matching-prototype r e s u l t of t h e s e a r c h . also a T h e s e a r c h condition is specified b y a m a t c h i n g - p r o t o t y p e . w h i c h is a c o m p o n e n t to b e c h a r a c t e r i z e d b y : values within T h e system provides an associative retrieval s e a r c h mechanism f o r p a r t i a l s p e c i f i c a t i o n of the components of a hypothesis. component occur More hypotheses are whose returned as the (Associative retrieval of structural relationships among h y p o t h e s e s complex retrievals can be accomplished b y combining the is retrieval p r i m i t i v e s in a p p r o p r i a t e w a y s . 1 T h i s set c a n b e d e r i v e d b y the KS from several sources. includes the hypotheses attributes partition following at a primitive particular sources: level, c) all hypotheses (in the of the blackboard), and d) all hypotheses at a particular whose implementation blackboard), level o v e r l a p a given interval (this provides an extremely efficient, m o n i t o r e d (for the KS) have changed. 8 The Hearsayll a) all hypotheses attributes whose b ) all time two-dimension which are being Structural represented abstractions relationships through the use between of nodes links; links (hypotheses) in the of specifying contextual A link is an element w h i c h associates provide about the relationships of hypotheses. a means blackboard are t w o h y p o t h e s e s as an o r d e r e d pair; one of the nodes is termed the u p p e r h y p o t h e s i s , a n d t h e o t h e r is c a l l e d the l o w e r hypothesis. The lower hypothesis is said to s u p p o r t t h e u p p e r h y p o t h e s i s w h i l e the u p p e r hypothesis is called a use of the lower one; in g e n e r a l , t h e l o w e r h y p o t h e s i s is at the same or a lower level in the blackboard than the u p p e r h y p o t h e s i s . T h e r e a r e s e v e r a l t y p e s of links, r e l a t i o n s h i p s . * C o n s i d e r this structure: with the types describing various kinds of HI H2 H3 H4 H I is t h e u p p e r h y p o t h e s i s and H2, H3, and H4 are the lower h y p o t h e s e s of links L I , L 2 , a n d L3, respectively. If the links are all of type OR, the interpretation is that H I is e i t h e r a n H 2 o r a n H3 o r an H4. T h i s is one w a y that alternative descriptions are p o s s i b l e . the figure are necessary of type to support AND, the interpretation the existence of HI. is that all of the l o w e r If t h e links in hypotheses (Note that, in general, all of the are supporting ( l o w e r ) links of a h y p o t h e s i s are of the same type; one can thus talk of the " t y p e o f the h y p o t h e s i s " , w h i c h is the same as the t y p e of all of its lower links.) These two types s p e c i f i e s a set/member of node represent different kinds of abstractions: the OR-node relationship while the AND-node defines a composition a b s t r a c t i o n . V a r i a n t s of the A N D - and OR-links are also possible. For example, a SEQUENCE link is similar t o t h e A N D - l i n k e x c e p t that an ordering is implied on the set of lower h y p o t h e s e s s u p p o r t i n g the upper hypothesis. (For the Hearsayll speech understanding system, this o r d e r i n g u s u a l l y is i n t e r p r e t e d as indicating a time ordering of the lower hypotheses.) B e s i d e s s h o w i n g analysis and synthesis relationships b e t w e e n h y p o t h e s e s (e.g., that o n e h y p o t h e s i s is c o m p o s e d of several other units), a link is a statement about the d e g r e e t o w h i c h o n e h y p o t h e s i s implies (i.e., "gives evidence for the existence of") another h y p o t h e s i s . T h e s t r e n g t h of the implication is held as attributes of the link. The s e n s e of the i m p l i c a t i o n m a y b e n e g a t i v e ; that is, a link may indicate that one hypothesis is evidence f o r the i n v a l i d i t y of another. hypothesis T h i s statement of implication may be bi-directional; the e x i s t e n c e of t h e may g i v e c r e d e n c e to the existence of the lower hypothesis upper and vice v e r s a . . 1 T h e p a r t i c u l a r kinds of relationships described here are some of those that w e r e w e r e d e s i g n e d f o r the s p e e c h problem. Although they undoubtedly are not the c o m p l e t e s e t f o r all c o n c e i v a b l e needs, they do represent the kinds of relationships that n e e d to b e a n d a r e e x p r e s s a b l e in the blackboard. 9 Finally, these relationships can be constructed in an iterative manner; links c a n b e added b e t w e e n e x i s t i n g h y p o t h e s e s b y KS's as they discover new evidence f o r s u p p o r t . J u s t as an h y p o t h e s i s can have more than one lower link, so it can have s e v e r a l u p p e r links. E a c h of t h e s e r e p r e s e n t s a different use of the hypothesis; the uses may b e c o m p e t i n g or complementary. T h e ability to have multiple uses and supports of the same h y p o t h e s i s , as o p p o s e d t o c r e a t i n g duplicates for each competing use and abstraction, s e r v e s t o k e e p t h e b l a c k b o a r d compact and t h e r e b y reduces the combinatoric explosion in the s e a r c h s p a c e . F u r t h e r , s i n c e all the information about the hypothesis is localized, all uses and s u p p o r t s o f the hypothesis automatically and immediately share any new information added to the h y p o t h e s i s b y a n y k n o w l e d g e sources. A p r o b l e m w i t h this localization can occur if the interactions b e t w e e n h y p o t h e s e s s p a n more than one level. 1 In this case, a particular support of the hypothesis (at a l o w e r l e v e l ) m a y b e i n c o n s i s t e n t w i t h one (or more) of the uses of the hypothesis (at a higher l e v e l ) b u t is consistent with other uses (or potential uses) of the hypothesis. In o r d e r to avoid d u p l i c a t i n g t h e h y p o t h e s i s , a mechanism, called a connection matrix, exists in t h e s y s t e m . A c o n n e c t i o n matrix is an attribute of a hypothesis; its value specifies w h i c h of t h e a l t e r n a t i v e s u p p o r t s of the h y p o t h e s i s are applicable ("connected to") which of its uses. connection matrix T h e use of a allows the results of previous decisions of KS's to b e a c c u m u l a t e d for f u t u r e u s e and modification without necessitating contextual duplication of p a r t s of t h e d a t a base. T h i s kind of reusage and multiple usage of blackboard s t r u c t u r e s r e d u c e s much of t h e e x p e n s i v e b a c k t r a c k i n g that characterizes many problem-solving systems. A p p e n d i x B contains an example of a structure built in the b l a c k b o a r d of t h e H e a r s a y l l system. Goal-Directed Scheduling of Knowledge Sources A s d e s c r i b e d earlier, the overall goal of the system is to c r e a t e the most network of hypotheses that sufficiently spans the levels. At any instant of plausible time, b l a c k b o a r d may contain many incomplete networks, each of w h i c h is plausible as f a r goes. of Some of t h e s e incomplete networks may also share s u b n e t w o r k s . analysis and synthesis actions of knowledge sources, incomplete e x p a n d e d (or c o n t r a c t e d ) and may be joined together (or fragmented). may be many A g a i n , this fits w e l l into Simon's formulation of hierarchical systems. 10 networks can be A t a n y time, t h e r e for the T h e task of goal-directed scheduling is to d e c i d e t o w h i c h of t h e s e s i t e s t o allocate computing resources. 1 as it Through the results places in the blackboard which satisfy the (precondition) c o n t e x t s a c t i v a t i o n of particular KS's. the S e v e r a l of the attribute classes of a hypothesis can be helpful in making s c h e d u l i n g decisions. P a r t i c u l a r l y valuable are the values of the attention attributes, w h i c h , as d e s c r i b e d e a r l i e r , a r e indicators telling how much computation has been e x p e n d e d o n the hypotheses a n d s u g g e s t i o n s b y KS's of how desirable it is to devote further e f f o r t o n the hypothesis ( a l o n g w i t h the kinds of processing that are desirable). The processing state a t t r i b u t e s are a l s o v a l u a b l e f o r making scheduling decisions. U s i n g t h e s e kinds of information, a knowledge source might be s c h e d u l e d for e x e c u t i o n b e c a u s e it p o s s e s s e s the only processing capability available to be applied to an i m p o r t a n t incompletely focusing explored factors structural area of which connections the highlight blackboard. activity For example, if the in a blackboard region blackboard in w h i c h b e t w e e n two adjoining levels, the scheduler contains there should give are a no higher p r i o r i t y to a k n o w l e d g e s o u r c e which will attempt (as indicated in its e x t e r n a l s p e c i f i c a t i o n s ) t o make s u c h a c o n n e c t i o n than to a knowledge source which is likely merely t o p e r f o r m a minor refinement processes analysis ready in on the ratings in one of the levels. However, if there are to execute, the scheduling algorithm can perform a t y p e of which it schedules those knowledge sources which are likely no such means-ends to produce b l a c k b o a r d c h a n g e s which, in turn, might trigger the activation of KS's in w h i c h the s y s t e m is currently interested. The implementation of the goal-directed scheduling strategy is s e p a r a t e d from the a c t i o n s of individual knowledge sources. That is, the decision of whether a KS can c o n t r i b u t e in a p a r t i c u l a r context is local to the KS, while the assignment of that KS t o o n e of t h e many contexts on which it can possibly operate is made more globally. a) d e c o u p l i n g of f o c u s i n g strategy from knowledge-source activity, e n v i r o n m e n t (blackboard) from the control flow (KS activation), and The three quickly of b) d e c o u p l i n g of t h e d a t a c) the limited c o n t e x t in w h i c h a KS o p e r a t e s , together permit a quick refocusing of attention of KS's. refocus aspects The ability to is v e r y important because the errorful nature of the KS activity leads to m a n y i n c o m p l e t e and p o s s i b l y contradictory hypothesis networks; thus, as s o o n as p o s s i b l e a f t e r a n e t w o r k no longer seems promising, the resources of the system s h o u l d b e e m p l o y e d elsewhere.* ' ^ynsvll* " 9 7 5 ) d e S C n b e m «™°<*>°" °< SOaMirectad scheduling in the 11 IMPLEMENTATION OF DATA-DIRECTED ACTIVATION OF KNOWLEDGE SOURCES A s s o c i a t e d w i t h e v e r y knowledge source is a specification of the b l a c k b o a r d c o n d i t i o n s required for the activation of that knowledge source. This specification, p r e c o n d i t i o n , is a decision procedure whose tests are matching-prototypes relationships arbitrarily blackboard) contexts. and complex resulting decisions in the (based activation of on current and desired knowledge past a structural w h i c h , w h e n applied to the blackboard in an associative manner, d e t e c t r e g i o n s of the b l a c k b o a r d in which the knowledge source is interested. contain called the This p r o c e d u r e may modifications sources within to the T h e context c o r r e s p o n d i n g to the discovered blackboard r e g i o n w h i c h the chosen satisfies s o m e k n o w l e d g e s o u r c e ' s precondition is used as an initial context in w h i c h t o activate that knowledge source. the system's T h e e f f i c i e n c y of the KS precondition evaluation is an important a s p e c t of implementation, especially as the knowledge is d e c o m p o s e d into more and s m a l l e r KS's and e a c h KS activation requires less computation. The Hearsayll evaluation efficient blackboard. system, by as placing an example additional of an functions T h e s e functions are activated whenever implementation, in the makes routines precondition which modify any KS modifies an a t t r i b u t e the in the b l a c k b o a r d w h i c h some other KS has asked to be monitored. The e s s e n c e of the m o d i f i c a t i o n is preserved in a data structure, called a change set, which is specific to the attribute c h a n g e d and the KS w h i c h requested the monitoring. A KS specifies in a n o n - p r o c e d u r a l w a y (either statically increase the or dynamically) those attributes which it wants to monitor. e f f i c i e n c y , monitoring can further be localized to particular In o r d e r levels or to even individual hypotheses. C h a n g e sets s e r v e to categorize blackboard modifications (events) and are t h u s u s e f u l i n p r e c o n d i t i o n evaluation since they limit the areas in the blackboard that n e e d b e e x a m i n e d in detail. As currently implemented in Hearsayll, the precondition evaluator of each k n o w l e d g e s o u r c e exists as a separate process which monitors changes in the d a t a b a s e (i.e., it monitors- additions to those change sets in which the KS is interested). The precondition p r o c e s s is itself d a t a - d i r e c t e d in that it is activated only w h e n sufficient c h a n g e s h a v e b e e n m a d e in the b l a c k b o a r d (i.e., w h e n an entry is made into one of its change s e t s , as a s i d e effect of themselves a relevant have blackboard preconditions, knowledge sources. modification). albeit of In effect, a much simpler the form precondition than those processes possible for For example, a precondition process in the s p e e c h s y s t e m may s p e c i f y that it s h o u l d b e activated w h e n e v e r changes occur to two adjacent h y p o t h e s e s at t h e w o r d l e v e l o r w h e n e v e r s u p p o r t is added to the phrasal level. B y using the (coarse) c l a s s i f i c a t i o n s a f f o r d e d b y c h a n g e sets, the system avoids most unnecessary executions of the p r e c o n d i t i o n processes. being 12 T h e major point is that the scheme of precondition evaluation is e v e n t - d r i v e n , b a s e d o n the o c c u r r e n c e of changes in the blackboard-, i.e., it is o n l y at p o i n t s of modification to the become satisfied. blackboard that a precondition that was previously unsatisfied may In particular, precondition evaluators are not involved in a f o r m of busy w a i t i n g in w h i c h t h e y are constantly looking for something that is not yet t h e r e . O n c e i n v o k e d , a precondition procedure uses sequences of associative r e t r i e v a l s structural matches satisfying the precondition sources. on portions preconditions procedure may of of the one blackboard or more be responsible for in an attempt of "its" to establish knowledge instantiating several a and context sources; any (related) knowledge Notice that the data-directed nature of precondition evaluation and given knowledge- s o u r c e a c t i v a t i o n is linked closely to the primitive functions that are able to modify t h e d a t a b a s e , f o r it is o n l y at points of modification that a precondition that w a s u n s a t i s f i e d b e f o r e may become satisfied. Hence, data base modification routines have the responsibility ( a l t h o u g h p e r h a p s indirectly) of activating the precondition evaluation mechanism. Implementation on Parallel Computers great Because of the independence of KS's and their data-directed activation, t h e r e deal potential of parallelism in this organization. Trends in computer is a architecture i n d i c a t e that large amounts of computing power will be economically r e a l i z e d in a s y n c h r o n o u s multiprocessor networks. Thus, the implementation of such large AI programs on m u l t i p r o c e s s o r s becomes an attractive goal. There are, however, a set of issues in s u c h an implementation; interference simultaneously most of these deal with to access the blackboard. among KS's when they attempt Effective solutions to these p r o b l e m s h a v e been d e v e l o p e d in the H e a r s a y l l implementation; Lesser, et al., (1974), Lesser (1975), and F e n n e l l a n d L e s s e r ( 1 9 7 5 ) d e s c r i b e these solutions. ACKNOWLEDGMENTS W e w i s h to acknowledge the contributions of the following p e o p l e : Richard Fennell for his major r o l e in the development of Hearsayl and II (and, in particular, the m u l t i p r o c e s s i n g a s p e c t s of the s y s t e m organization) and for helpful suggestions for this p a p e r , Raj R e d d y f o r m a n y of the basic ideas w h i c h have led to the organization d e s c r i b e d here, A l l e n N e w e l l f o r his insights into the task of problem-solving, Gregory Gill for his untiring e f f o r t s in s y s t e m i m p l e m e n t a t i o n , and F r e d e r i c k Hayes-Roth for suggestions for this paper. 13 Appendix A: r r l EXAMPLE OF BLACKBOARD AND KS DECOMPOSITION IN HEARSAY I I A W 1 F i g u r e 1 s h o w s a schematic of the levels of Hearsayll. Conceptual Phrasal Lexical Syllabic Surface-phonemic Phonetic Segmental Parametric Figure 1. The Levels in Hearsayll. Parametric utterance acoustic Level - The parametric level holds the most basic representation that the system has; it is the only direct input to the machine signal. Several different sets of parameters are being used of the about the in Hearsayll i n t e r c h a n g e a b l y : 1/3-octave filter-band energies measured e v e r y 10 m s e c , L P C - d e r i v e d v o c a l - t r a c t parameters, and w i d e - b a n d energies and z e r o - c r o s s i n g counts. Segmental Level - This level represents the utterance as labeled acoustic segments. A l t h o u g h the set of labels may be phonetic-like, the level is not intended to b e p h o n e t i c — t h e s e g m e n t a t i o n and labeling reflect acoustic manifestation and do not, f o r e x a m p l e , attempt to compensate for the context of the segments a c o u s t i c a l l y dissimilar segments into (phonetic) units. or attempt to As with all levels, a n y p o r t i o n of the utterance may be represented by more than one competing combine particular hypothesis • (i.e., multiple segmentations and labelings may co-exist). P h o n e t i c L e v e l - A t this level, the utterance is represented b y a phonetic d e s c r i p t i o n . This is a b r o a d p h o n e t i c description in that the size (duration) of the units is o n the o r d e r of t h e " s i z e " of phonemes; it is a fine phonetic description to the extent that e a c h e l e m e n t is l a b e l e d w i t h a fairly detailed allophonic classification (e.g., "stressed, n a s a l i z e d [I]"). S u r f a c e - P h o n e m i c L e v e l - This level, named by seemingly contradicting terms, r e p r e s e n t s t h e u t t e r a n c e b y phoneme-like units, with the addition of modifiers such as s t r e s s and b o u n d a r y ( w o r d , morpheme, syllable) markings. S y l l a b i c L e v e l - T h e unit of representation here is the syllable. L e x i c a l L e v e l - T h e unit of information at this level is the w o r d . P h r a s a l L e v e l - Syntactic elements appear at this level. In fact, since a level may c o n t a i n a r b i t r a r i l y many " s u b - l e v e l s " of elements using the AND and OR links, traditional kinds of s y n t a c t i c t r e e s can be directly represented here. C o n c e p t u a l L e v e l - T h e units at this level are "concepts." As with the phrasal lev-el, it may b e a p p r o p r i a t e to use the graph structure of the da>ta base to indicate r e l a t i o n s h i p s a m o n g d i f f e r e n t concepts. A s examples of knowledge sources, Figure 2 shows the first set i m p l e m e n t e d f o r Hearsayll. left. 1 T h e levels are indicated as horizontal lines in the figure and are l a b e l e d at t h e T h e k n o w l e d g e s o u r c e s are indicated by arcs connecting levels; the s t a r t i n g point(s) o f A p p e n d i c e s A and B are reprinted from Lesser, et al (1974); they are i n c l u d e d h e r e convenience. 14 for an arc indicates the level(s) of maior "innnt" fnr t h "output" level where the most of these par icuar Ln^ill knoltZ . n a i/c K S « - T a d i - Levels - ° ' n J .L. a S n d t h e ^ e n P d I n g 6 n indicates o i n t e r a l ' t h e a c t i o the " Knowledge Sources - CONCEPTUAL Semantic W o r d Hypothesizer PHRASAL -Syntactic Parser '-4 LEXICAL Phoneme Hypothesizer SYLLABIC W o r d Candidate Generator ^ ^ " P h o n o l o g i c a l Rule Applier SURFACEPHONEMIC | Phone—Phoneme Synchronizer PHONETIC - P h o n e Synthesizer SEGMENTAL tb 1—r& 9 I—Segment—Phone Synchronizer Parameter—Segment Synchronizer Segmenter-Classif ier PARAMETRIC Figure 2. A Set of Knowledge Sources for H e a r s a y l l . T h e S e g m e n t e r - C l a s s i f ier knowledge source uses the description of the s p e e c h signal to p r o d u c e a l a b e l e d acoustic segmentation. For any portion of the utterance, s e v e r a l p o s s i b l e alternative segmentations and labels may be produced. The Phone Synthesizer phonetic level. uses labeled acoustic segments to generate elements at the This p r o c e d u r e is sometimes a fairly direct renaming of an h y p o t h e s i s at t h e s e g m e n t a l level, perhaps using the context of adjacent segments. In o t h e r c a s e s , p h o n e s y n t h e s i s r e q u i r e s the combining of several segments (e.g., the g e n e r a t i o n of [t] from a segment of silence followed by a segment of aspiration) or the i n s e r t i o n of p h o n e s not indicated directly by the segmentation (e.g., hypothesizing the e x i s t e n c e of an [ I ] if a v o w e l seems velarized and there is no [I] in the neighborhood). T h i s KS is t r i g g e r e d w h e n e v e r a new hypothesis is created at the segmental level. 15 o f The Word Candidate locations and o t h e r Generator uses phonetic information (primarily just areas of high phonetic reliability) to generate w o r d at stressed hypotheses. T h i s is accomplished in a two-stage process, with a stop at the syllabic level, f r o m w h i c h lexical r e t r i e v a l is more effective. T h e Semantic W o r d Hypothesizer uses semantic and pragmatic information about t h e t a s k ( n e w s r e t r i e v a l , in this case) to predict words at the lexical level. T h e S y n t a c t i c W o r d Hypothesizer uses knowledge at the phrasal level to p r e d i c t n e w w o r d s at the lexical level which are adjacent (left or right) to w o r d s possible previously g e n e r a t e d at the lexical level. This knowledge source is activated at the b e g i n n i n g of an u t t e r a n c e r e c o g n i t i o n attempt and, subsequently, whenever a new w o r d is c r e a t e d at t h e lexical l e v e l . T h e P h o n e m e H y p o t h e s i z e r knowledge source is activated whenever a w o r d h y p o t h e s i s is c r e a t e d (at the lexical level) which is not yet supported by h y p o t h e s e s at the s u r f a c e phonemic level. Its action is to create one or more sequences at the s u r f a c e - p h o n e m i c l e v e l w h i c h r e p r e s e n t alternative pronounciations of the w o r d . (These pronounciations a r e c u r r e n t l y p r e - s p e c i f i e d as entries in a dictionary.) T h e P h o n o l o g i c a l Rule A p p l i e r rewrites sequences at the surface-phonemic l e v e l . is u s e d : T h i s KS 1) to augment the dictionary lookup of the Phoneme H y p o t h e s i z e r , and 2) t o h a n d l e w o r d b o u n d a r y conditions that can be predicted by rule. The Phone-Phoneme Synchronizer is triggered whenever e i t h e r the p h o n e t i c or the surface-phonemic level. hypothesis with hypotheses either direction. The an hypothesis is created at This KS attempts to link up the n e w at the other level. This linking may b e m a n y - t o - o n e in S y n t a c t i c P a r s e r uses a syntactic definition of the input language to d e t e r m i n e if a c o m p l e t e s e n t e n c e may be assembled from words at the lexical level. The primary duties of the S y n c h r o n i z e r are similar: Segment-Phone Synchronizer and the Parameter-Segment to recover from mistakes made by the (bottom-up) actions o f t h e P h o n e S y n t h e s i z e r and Segmenter-Classifier, f r o m the higher to the lower level. respectively, b y allowing feedback In addition t o the knowledge source modules described above, all of w h i c h speech knowledge, several system in a manner policy identical to modules exist. the speech These modules, w h i c h i n t e r f a c e modules, execute policy embody to decisions, the e.g., p r o p a g a t i o n of ratings and calculation of processing-state attributes. Appendix B: EXAMPLE OF A BLACKBOARD FRAGMENT IN HEARSAY II F i g u r e 3 is an example of a fragment that might occur in Hearsayll's b l a c k b o a r d . The l e v e l of an h y p o t h e s i s is indicated by its vertical position; the names of the l e v e l s a r e g i v e n o n t h e left. Time location is approximately indicated b y horizontal placement, but d u r a t i o n is o n l y v e r y r o u g h l y indicated (e.g., the boxes surrounding the t w o h y p o t h e s e s at t h e p h r a s a l l e v e l s h o u l d b e much wider). Alternatives are indicated by proximity; f o r example, ' w i l l ' a n d ' w o u l d ' a r e w o r d h y p o t h e s e s covering the same time span. Likewise, ' q u e s t i o n ' and ' m o d a l - q u e s t i o n ' , ' y o u l ' and ' y o u 2 \ and ' J ' and 'Y' all represent pairs of alternatives. T h i s example illustrates several features of the data structure: The hypothesis 'youj at the lexical level, has two alternative phonemic "spellings" i n d i c a t e d ; the h y p o t h e s e s labeled ' y o u l ' and ' y o u 2 ' are nodes c r e a t e d , also at t h e l e x i c a l l e v e l , t o hold those alternatives. arbitrarily. 16 In general, such s u b - l e v e l s may b e c r e a t e d 'quest i on' PHRASAL (SEQ) >\\W 'modal q u e s t i o n ' (SEQ) - \ - ' 'you' (OPT) LEXICAL 'youl' (SEQ) ^ V 10 11 4 you2' (SEQ) SURFACEPHONEMIC F i g u r e 3. A n Example of a Fragment in the Blackboard. T h e link b e t w e e n *youV and D ' is a special kind of SEQUENCE link (indicated h e r e b y a d a s h e d line) called a CONTEXT link; a CONTEXT link indicates that t h e l o w e r h y p o t h e s i s s u p p o r t s the upper one and is contiguous to its brother links, but it is n o t " p a r t o f " t h e u p p e r hypothesis in the sense that i t is not within the time i n t e r v a l o f t h e u p p e r h y p o t h e s i s — rather, it supplies a context for its brother(s). In this c a s e , o n e may " r e a d " t h e structure as stating ' " y o u l ' is composed of ' J ' f o l l o w e d b y ' A X ' ( s c h w a ) i n the context of the preceding <D\" (This reflects the phonological rule that " w o u l d y o u " is o f t e n s p o k e n as "would-ja.") Thus, a CONTEXT link allows important c o n t e x t u a l relationships to be represented without violating t h e implicit time a s s u m p t i o n s about SEQUENCE nodes. ? W h e r e a s t h e phonemic spelling of the w o r d " y o u " held b y *youV includes a c o n t e x t u a l c o n s t r a i n t , t h e V°u2' option does not have this constraint. However, *youV a n d V o u 2 ' a r e s u c h similar hypotheses that there is strong reason for w a n t i n g t o r e t a i n t h e m as alternative options under you* (as indicated in Figure 3), r a t h e r t h a n r e p r e s e n t i n g them unconnectedly. A connection matrix is used here t o r e p r e s e n t this k i n d of relationship; the connection matrix of ' y o u ' (symbolized in Figure 3 b y t h e 2d i m e n s i o n a l b i n a r y matrix in the node) specifies that support *youV is relevant t o u s e ' q u e s t i o n ' (but not to 'modal-question') and that support V>u2' is relevant t o b o t h uses. K T h e n a t u r e of the implications represented by the links provides a uniform b a s i s f o r p r o p a g a t i n g c h a n g e s made in one part of the data structure to other relevant p a r t s w i t h o u t n e c e s s a r i l y r e q u i r i n g the intervention of particular knowledge s o u r c e s at e a c h s t e p . C o n s i d e r i n g t h e example of Figure 3, assume that the validity of the h y p o t h e s i s l a b e l e d ' J ' is m o d i f i e d b y some KS (presumably operating at the phonetic level) and becomes v e r y l o w . O n e p o s s i b l e s c e n a r i o f o r rippling this change through the data base is g i v e n h e r e : F i r s t , t h e estimated validity of *youV is reduced, because ' J ' is a l o w e r h y p o t h e s i s o f T h i s , in t u r n , may cause the rating of <you' to be reduced. 17 T h e c o n n e c t i o n matrix at ' y o u ' specifies that ' y o u l * is not relevant to ' m o d a l - q u e s t i o n j s o t h e latter h y p o t h e s i s is not affected by the change in rating of the f o r m e r . Notice that the e x i s t e n c e of the connection matrix allows this decision to be made l o c a l l y in t h e d a t a s t r u c t u r e , without having to search back d o w n to the ' D ' and 'J? 'Question,' h o w e v e r , is s u p p o r t e d by ' y o u l ' (through the connection matrix at you*) K } so its r a t i n g is a f f e c t e d . Further p r o p a g a t i o n s can continue to occur, perhaps d o w n the other SEQUENCE links u n d e r ' q u e s t i o n ' and ' y o u l . ' Notice that all of these modifications are "speech-knowledge independent" and can be a c c o m p l i s h e d uniformly at all levels of the blackboard by a single policy k n o w l e d g e s o u r c e . T h i s p o l i c y KS d o e s not need to access or trigger any other KS but can d i r e c t l y d e r i v e all t h e i n f o r m a t i o n it n e e d s from the hypothesis and link fields that are uniformly p r e s e n t and f r o m t h e implicit semantics of the structures in the blackboard. REFERENCES B a k e r , J a m e s (1974), "The DRAGON System — A n Overview;' in Erman (1974b). B a r n e t t , J . ( 1 9 7 3 ) , "A Vocal Data Management System" IEEE Trans. Audio and Electroacoustics, A U - 2 1 , 3. B a r n e t t , J . ( 1 9 7 5 , in press), "Module Linkage and Communication in Large Systems," in Reddy (1975). E r m a n , L. D., R. D. Fennell, V. R. Lesser, and D. R. Reddy (1973), "System O r g a n i z a t i o n s Speech Understanding: Architectures for Implications of Network and Multiprocessor for Computer A I " Proc. 3 r d Inter. Joint Conf. on Artificial Intel., Stanford, Ca., pp. 194-199. E r m a n , L. D. (1974), " A n Environment and System for Machine Understanding of Connected Speech," Ph.D. thesis, Comp. Sci. Dept, Stanford Univ. E r m a n , L. D. (ed.) (1974b), Contributed Papers of the IEEE Symposium on Speech Recognition, A p r i l 1 5 - 1 9 , 1974, Pittsburgh, Pa., IEEE Cat. No. 7 4 C H 0 8 7 8 - 9 A E . M a n y of t h e s e p a p e r s have b e e n r e p r i n t e d in IEEE Trans, on Acoustics, Speech, and Signal Processing, A S S P - 2 3 , no. 1 (Feb., 1975). F e n n e l l , R. D., and V. R. Lesser (1975), "Parallelism in A.I. Problem-Solving: A C a s e Study of H e a r s a y l l ; T e c h . Report, Comp. Sci. Dept., Carnegie-Mellon Univ., P i t t s b u r g h , P a . 1 F e n n e l l , R. D. ( 1 9 7 5 b , in.preparation), Ph.D. thesis, Comp. Sci. Dept., C a r n e g i e - M e l l o n Univ. H a y e s - R o t h , F., V. R. Lesser, and D. W. Kosy (1975, in preparation), "A Design f o r Attentional C o n t r o l in a Distributed-Logic Speech Understanding System" T e c h . R e p o r t , C o m p . S c i . Dept., C a r n e g i e - M e l l o n Univ., Pittsburgh, Pa. H e w i t t , C. (1972), "Description and Theoretical Analysis (Using Schemata) of Planner: L a n g u a g e for P r o v i n g Theorems and Manipulating Models in a R o b o t " AI M e m o A No. 2 5 1 , MIT Project MAC. L e s s e r , V. R., R. D. Fennell, L D. Errr\an & D. R. Reddy (1974), "Organization of the H E A R S A Y II } S p e e c h Understanding System; 1 in Erman (1974b). Also a p p e a r e d in IEEE Trans, on Acoustics, Speech, and Signal Processing, A S S P - 2 3 , no. 1, pp. 1 1 - 2 3 (Feb., 1975). L e s s e r , V. R. ( 1 9 7 5 , in press), "Parallel Processing S u r v e y of Design Problems" in Reddy (1975). 18 in Speech Understanding Systems: A N e w e l l , A. (1969), "Heuristic Programming: Ill-Structered Problems; in J . S. A r o n o f s k y (ed.), 1 Progress in Operations Research, 3, Wiley, 3 6 3 - 4 1 5 . N e w e l l , A., J . Barnett, J . Forgie, C. Green, D. Klatt, J. C. R. Licklider, J . M u n s o n , R. R e d d y , a n d W. W o o d s (1971), Speech Understanding Systems: Final Report of a Study Group, p u b . b y N o r t h - H o l l a n d (1973). N e w e l l , A. ( 1 9 7 3 ) , " P r o d u c t i o n Systems: Models of Control Structures, 11 in W . C . C h a s e (ed.) Visual Information Processing, Academic Press, pp. 4 6 3 - 5 2 6 . R e d d y , D. R. (1966), " A n A p p r o a c h to Computer Speech Recognition by Direct A n a l y s i s of t h e S p e e c h Wave|' (Ph.D. thesis) AI Memo No. 43, Comp. Sci. Dept., S t a n f o r d Univ., Stanford, Ca. R e d d y , D. R., L D. Erman, and R. B. Neely (1973a), "A Model and a S y s t e m f o r M a c h i n e R e c o g n i t i o n of Speech; IEEE Trans. Audio and Electroacoustics, A U - 2 1 , 3, p p . 2 2 9 - 2 3 8 . 1 R e d d y , D. R., L D. Erman, R. D. Fennell, and R. B. Neely (1973b), "The HEARSAY Speech U n d e r s t a n d i n g System: A n Example of the Recognition P r o c e s s " Proc. 3rd Inter. Joint Conf. on Artificial Intel., Stanford, Ca., pp. 185-193. R e d d y , D. R. (1973), "Eyes and Ears for Computers; Tech. Report, Comp. Sci. Dept., C a r n e g i e 1 M e l l o n University, Pittsburgh, Pa. Keynote speech p r e s e n t e d at Conf. o n Cognitive P r o c e s s e s and Artifical Intelligence, Hamburg, April, 1973. R e d d y , R., and A. Newell Understanding (1974), System" in "Knowledge and L. W. Gregg (ed.) its Representation Knowledge and in Cognition, a Speech Lawrence E r l b a u m Assoc., Washington, D. C , chap. 10, (in press). R e d d y , D. R. (ed.) (1975, in press), Invited Papers of the IEEE Symposium on Speech a Speech Recognition, A p r i l 15-19, 1974, Pittsburgh, Pa., Academic Press. R o v n e r , P., N a s h - W e b b e r , B., and Woods, W. (1974), U n d e r s t a n d i n g S y s t e m " in Erman (1974b). "Control Concepts in Also appeared in IEEE Trans, on Acoustics, Speech, and Signal Processing, A S S P - 2 3 , no. 1, pp. 136-140 (Feb., 1975). R u l i f s o n , J . F., et al., (1973), "QA4: A Procedural Calculus for Intuitive Reasoning;' T e c h n i c a l Note 7 3 , AI Center, Stanford Res. Inst., Menlo Park, Ca. S i m o n , H. A., (1962), "The Architecture of Complexity", Proc. Amer. Philosophical Society, 1 0 6 , pp. 467-482. V i c e n s , P. ( 1 9 6 9 ) , " A s p e c t s of Speech Recognition," Report C S - 1 2 7 , (Ph.D. Thesis), C o m p . S c i . Dept., S t a n f o r d Univ. 19
© Copyright 2026 Paperzz