NETWORK REGIONS: Alternatives to the Winner-Take-All Structure

Hon Wai Chun
Honeywell Bull
300 Concord Road, MA30-895A
Billerica, MA 01821-4186
U.S.A.

Lawrence A. Bookman
Computer Science Department
Brandeis University
Waltham, MA 02254
U.S.A.

Niki Afshartous
Honeywell Bull
300 Concord Road, MA30-895A
Billerica, MA 01821-4186
U.S.A.
ABSTRACT
Winner-take-all (WTA) structures are currently used in massively parallel (connectionist) networks to represent competitive behavior among sets of alternative hypotheses. However, this form of competition may be too rigid and may not be appropriate for certain applications. For example, applications that involve noisy and erroneous inputs might mislead WTA structures into selecting a wrong outcome. In addition, for networks that continuously process input data, the outcome must change dynamically with changing inputs; WTA structures might "lock-in" on a previous outcome. This paper offers an alternative competition model for these applications. The model is based upon a meta-network representation scheme called network regions, which are analogous to net spaces in partitioned semantic networks. Network regions can be used in many ways to clarify the representational structure in massively parallel networks. This paper focuses on how they are used to provide a flexible and adaptive competition model. Regions can be considered representational units that capture the conceptual abstraction of a collection of nodes (or hypotheses). Through this higher-level abstraction, regions can better influence the collective behavior of nodes within the region. Several AI applications were used to test and evaluate this model.
I. INTRODUCTION
Winner-take-all (WTA) structures [1] represent competitive behavior among sets of alternative hypotheses in massively parallel networks [2, 3, 4]. These networks consist of large numbers of simple processing elements that give rise to emergent collective properties. The behavior of such networks has been shown to closely match human cognition in many tasks, such as natural language understanding and parsing, learning, speech perception and recognition, speech generation, physical skill modeling, vision, and others.
In a WTA structure, whenever there is any activation, the structure forces only the node (which may represent a hypothesis) with the highest output level to remain activated, while the other nodes die out. This mechanism allows one to define "decision points" at which nodes with higher output suppress nodes with lower output. There are two parameters that influence this competitive behavior — the inhibition link weights and the nodes' decay factor. The link weights define the degree of competition. Lower link weights represent a milder, slower form of competition; higher link weights represent stronger, quicker competition. The decay factor, on the other hand, controls how well a network retains the decision made by the WTA structure and allows the network to reset itself in the absence of inputs. In this paper, these two parameters are assumed to be uniform.
Figure 1. WTA structure that represents competing weak fricatives.
In the example, the link weights for the inhibition and activation links are arbitrarily set to -0.6 and 0.33 respectively, and the decay factors are set to 0.3. Figure 1 shows the final state after the inputs -voiced, +labial, and +fricative were activated and the network was relaxed. As the figure shows, the WTA structure enabled /f/ to compete with and suppress the other candidates.
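For concreteness, the relaxation just described can be sketched as follows. This is a minimal illustration, not the paper's AINET-2 implementation: the additive update rule, the clamping to [0, 1], and the way each node's feature inputs are summed into a single value are all assumptions made for this sketch.

# Minimal WTA relaxation sketch. Assumptions (not from the paper): a
# simple additive update with decay and clamping to [0, 1]; each entry
# of `inputs` is the summed feature activation reaching that node.
W_INHIBIT = -0.6   # mutual inhibition link weight (the example's value)
W_ACTIVATE = 0.33  # activation link weight (the example's value)
DECAY = 0.3        # decay factor (the example's value)

def relax_wta(inputs, cycles=45):
    """Relax a set of mutually inhibiting hypothesis nodes."""
    outputs = [0.0] * len(inputs)
    for _ in range(cycles):
        new_outputs = []
        for j, out_j in enumerate(outputs):
            excite = W_ACTIVATE * inputs[j]                   # own evidence
            inhibit = W_INHIBIT * (sum(outputs) - out_j)      # rivals' output
            level = (1.0 - DECAY) * out_j + excite + inhibit  # decay + net input
            new_outputs.append(min(1.0, max(0.0, level)))     # clamp to [0, 1]
        outputs = new_outputs
    return outputs

# /f/ matches all three input features; here the rivals each match one.
print(relax_wta([3.0, 1.0, 1.0, 1.0]))  # first node saturates; the rest die out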
II. ISSUES
Although WTA structures have been used successfully in many AI applications, this model of competition
might not always be appropriate. For example, applications that involve noisy and erroneous inputs might
mislead WTA structures into forming a wrong decision.
In addition, for networks which continuously process
input data, the network must be able to dynamically
change decisions based upon changing inputs. WTA
structures might "lock-in" on a particular decision. The
following outlines the key problems when WTA structures are used in these applications.
Overly rigid competition — Since the mutual inhibition link weights are fixed, the degree of competition is
also fixed. If the input data is not at a consistent activation level or if new hypotheses are encoded into the
structure, the competition may be too weak or too
strong. Competition that is too weak may result in multiple activations or decisions, some of which may be conflicting. Excessive competition may result in a structure
that is very sensitive to the output levels of the nodes.
This may mislead a WTA structure into prematurely
forming an incorrect decision.
Decisions are not flexible — Once a WTA structure
"locks-in" on one interpretation, it is difficult to shift to
another interpretation as the input changes. In other
words, newly activated nodes will have little chance of
competing with the current "winner". Flexible decision
making is essential for dynamic systems where the network continuously processes different inputs. The decay
factor does alleviate the "lock-in" effect by allowing the
winning node to gradually decay, but may not be in time
to accommodate changes in the input.
No likelihood information — The output level of a
node may sometimes be considered a "likelihood" or
"certainty" measure [5]. WTA structures, by definition,
allow only one "winner", thus the output levels of other
nodes are suppressed and this "likelihood" information is
lost. In certain applications, it is more desirable to have
several outcomes with varying output levels to reflect the
likelihood of these outcomes.
In general, WTA structures are useful in applications where only one outcome is desired and where the input data is not noisy and can be clearly categorized. In these well-defined applications, some of the above-mentioned problems can be solved by using building blocks or combination rules in constructing the WTA structure [1]. However, for applications involving real-world data with large variance, a more flexible and adaptive competition is needed. This paper presents an alternative model of competition for these applications.

III. NETWORK REGIONS
A network region is a meta-network structure that represents the conceptual abstraction of a collection of nodes or hypotheses. Intuitively, it represents a "chunk" of the network over which well-behaved properties can be defined. Graphically, a region is displayed as a solid-lined rectangular box around the collection of nodes it represents.
The following figure shows the previous WTA structure enclosed in a network region, labeled WF, which represents the category of weak fricatives.
Figure 2. A region representing the category of weak fricatives.
Conceptually, there are similarities between network regions and partitioned semantic networks [6]. In
semantic networks, nodes and arcs are partitioned into
net spaces, while in our massively parallel networks,
nodes and arcs are partitioned into network regions. The
net spaces are used mainly to delimit the scopes of quantified variables. In this paper, network regions are used
to delimit sets of competing hypotheses. In addition to being conceptual units, network regions are computational entities, just as nodes and links are. Network
regions can also activate or inhibit other nodes or
regions, similar to spaces that can be linked to other
parts of a semantic network.
Regions are in effect higher-order nodes with internal parameters that have node counterparts (i.e., potential level, output level, and inputs), where:

Sr - the sum of all the nodes' output within the region.
Pr - the average potential of activated nodes in the region.
Vr - the average output of activated nodes in the region.
Ir - the average input level of activated nodes in the region.
These parameters act as indicators of the collective state of the nodes within a region and are updated during each relaxation cycle. This provides an abstract view of a set of nodes that can be used either to influence the network system outside the region, or to better control the collective behavior of nodes within the region.
In terms of influencing the network system outside
a given region, regions can be used to encode partition of
concerns [7]. In the above example, even though the network may still be determining the correct weak fricative,
the state of the WF region (or more precisely, the value
of Vr) can immediately provide information to other network structures which may only need to know that a
weak fricative is present and not necessarily which particular weak fricative. This allows phonological rules,
that apply to all weak fricatives, to trigger whenever
there is any activity in the WF region.
Regions also simplify the task of knowledge encoding by permitting knowledge to be encoded at appropriate levels of abstraction. In speech recognition, phonological rules are used at various levels of abstraction.
Through the use of regions, rules can be encoded into
connectionist networks at the same level they are
expressed. An example rule is: if a final voiced fricative
is actually voiced, then the following sound is voiced.
This may be encoded with a region that represents the
set of all voiced fricatives. If this region is active and is
followed by a silence, then the system should expect a
+voiced sound to follow. Without the regions representation, this knowledge has to be encoded for each voiced
fricative separately.
With regions, knowledge is expressed at a more appropriate level of abstraction, which allows networks to be more modular and comprehensible.
The interaction of regions with other network structures is discussed in (Bookman and Chun, 1987) [8]. The current paper focuses on the use of network regions to influence the collective competitive behavior of nodes within a region.

A. N-REGION

The N-REGION (normalizing region) provides activation stability for competing hypotheses. In addition, this mechanism allows "likelihood" information to be maintained and permits competition that is more tolerant to variations in input levels.

In certain applications, it is useful to have more than one outcome from a set of alternative hypotheses. In these cases, the output level of a node may be considered the "likelihood" or "certainty" of a particular hypothesis [5]. Since WTA structures allow only one "winner", the output levels of the other nodes are suppressed and the "likelihood" information is lost. The N-REGION, without the mutual inhibition links, maintains this likelihood information within a set of alternative hypotheses. If we consider a hypothesis by itself, the output level would indicate the "likelihood" of this hypothesis as determined by the current low-level inputs. For example, if only half of the inputs to a particular hypothesis are activated, the hypothesis may have an output level of 0.5. This information is only useful if we are considering how well a particular hypothesis matches the current input. However, when hypotheses are in competition, it is more informative to have the output level indicate the likelihood of a particular hypothesis among all the other competing hypotheses.

The N-REGION accomplishes this by a normalizing computation that limits the maximum total output within a region (the value of Sr of the N-REGION) to 1.0. In other words, the total "likelihood" within the N-REGION will not exceed 1.0. The general idea of normalization is of course not new; however, with network regions the normalization is integrated into the relaxation process itself. Figure 3 and Table 1 show the output levels of the four weak fricatives with and without the N-REGION, after the inputs -voiced, -labial, and +fricative were activated for 1 cycle.
The output level of /v/ with the WTA structure indicates that there is a 0.66 likelihood that /v/ is matched, considering only the inputs (only two-thirds of the features for /v/ were activated). This output level is misleading, since /f/, /v/, and /θ/ are all highly activated. This would misinform other network structures that all three were closely matched. However, with the N-REGION, the output level of /v/ is adjusted to 0.25, indicating that the likelihood of /v/ among the four alternatives is only 0.25. This mechanism puts a hypothesis in perspective among all the other competing hypotheses.
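A minimal sketch of such a normalizing step follows, assuming simple proportional rescaling whenever the region's total output exceeds the cap; the exact AINET-2 computation is not reproduced in the text, so this is an illustration, and the raw output values in the usage example are hypothetical.

# N-REGION sketch: cap the region's total output Sr at 1.0 by scaling
# every node's output proportionally, preserving relative "likelihoods".
def n_region_normalize(outputs, cap=1.0):
    total = sum(outputs)           # the region's Sr
    if total <= cap:
        return list(outputs)       # under the cap: leave outputs alone
    scale = cap / total
    return [o * scale for o in outputs]

# Hypothetical raw outputs (not from the paper's table): /f/, /v/, /θ/
# highly activated, the fourth weakly. /v/'s 0.66 rescales to 0.25.
print(n_region_normalize([0.99, 0.66, 0.66, 0.33]))
# -> [0.375, 0.25, 0.25, 0.125]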
This approach is similar to the UB (upper-bound) parameter proposed in (Feldman and Ballard, 1982) [1]. However, instead of inhibiting all the nodes equally, the N-REGION proportionally adjusts the nodes' output levels to maintain the "likelihood" information. (In our example, the activation link weights are uniformly set to 0.33; in an evidential reasoning system, this weight would be E(hypothesis | input). The utility of the N-REGION is the same.)
The virtual lateral inhibition proposed by (Reggia, 1985) [9] also maintains a normalized activation in each layer when only a single input node is activated at a time. Since the effect of this competition model is similar to WTA structures, the network might also "lock-in" on a premature outcome.
Since there are no inhibition links between competing nodes in the N-REGION, it does not encode active competition or contour enhancement [10] (filtering out noise by suppressing nodes with low output levels). However, there is implicit competition through the conservation of activation within the N-REGION. In addition, because of this lack of active competition, N-REGIONs will not "lock-in" on a particular decision. If active competition is required, the N-REGION can be used in conjunction with a WTA structure.
The amount of competitive inhibition received by a node in a WTA structure varies with the total output level in the structure. After the WTA structure is fine-tuned for a particular input activation level, the performance varies when the input level increases or decreases. When N-REGIONs are used on top of WTA structures, the N-REGIONs hold the total output level constant. Hence, a more uniform competition results, since the total amount of competition will not vary with input levels. This also eliminates the problem of too weak a competition, where several hypotheses get fully activated. With N-REGIONs, the activation of these hypotheses will be normalized.
To summarize, the most significant advantages of N-REGIONs are that they maintain "likelihood" information, permit uniform competition, eliminate excessive competition, allow multiple outcomes, and will not "lock-in" on a single "winner".
B. C-REGION

The C-REGION (competitive region) provides an adaptive and flexible competition that permits graceful shifting of decisions as inputs change.

The WTA competition is rigid because the weights on the inhibition links are fixed. When input levels are not consistent, as in the case of noisy data, competition may be either too weak or too strong. For example, in speech recognition, the clarity with which certain phonemes are spoken varies greatly within each word. The C-REGION avoids this problem by having the inhibition link weights be sensitive to the current inputs to the region (i.e., the value of Ir in the region). As the C-REGION relaxes, the inhibition link weights are adjusted to adapt to the current input level.

Competition in the C-REGION is defined as:

$$\text{competition}_j = k_I \, I_r \, w_I \sum_{i \neq j} v_i$$

where kI is a constant indicating the sensitivity to the input. In contrast to the original WTA competition, the "effective inhibition weight" is now (kI Ir wI). The parameter kI is set so that the "effective inhibition weight" for the expected input levels is the same as the original WTA inhibition weight.

The following experiments show how the C-REGION adapts to varying input levels. In the examples, the activation link weights from the inputs are 0.33, and the inhibition link weights in both the WTA structure and the C-REGION are -0.6. The proportionality constant kI is set to 1.5 so that the "effective inhibition weight" of the C-REGION is approximately equal to the inhibition weight (i.e., -0.6) of the WTA structure when the input nodes are fully activated.

In experiment 1, when the input nodes were fully activated, the outcome ("winner") shifted as the inputs changed, for both the WTA structure and the C-REGION. However, when the average input level, Ir, dropped to 0.4, the WTA structure "locked in" on the previous outcome, while the C-REGION shifted to the correct decision. Table 2 shows that the WTA structure locked onto /$/, while the C-REGION correctly shifted to /S/. Figure 4 shows the final state of the network at the end of 45 cycles.

In experiment 2 (see Table 3), the average input level, Ir, for the first 15 cycles was lower than expected (only 0.4). However, both network structures still settled on the correct outcome of /f/. In the previous experiment, the WTA structure failed because it "locked in" on a previous outcome; here, there is no previous outcome to lock onto. However, when the input was changed and the average input level dropped further to 0.26, the WTA structure failed while the C-REGION still settled on the correct outcome.
The main advantage of the C-REGION is that it provides a flexible competition that adapts to the current input level. The decision made by the structure is less dependent on previous decisions (i.e., it will not "lock in" on previous results). This ability to gracefully change decisions is crucial in networks that continuously process different inputs at varying levels. When the input level is higher than the expected value, the competition in both the WTA structure and the C-REGION might be too strong. Consequently, more negative competition activation will be spread. If this activation is higher than the nodes' output level, nodes in both structures may toggle between active and inactive states. The CN-REGION eliminates this problem.
C. CN-REGION

CN-REGIONs, used on top of WTA structures, combine the advantages of both C-REGIONs and N-REGIONs. The N-REGION maintains uniform competition with possibly multiple outcomes, while the C-REGION adapts competition to varying input levels. When average input levels are lower than expected, the C-REGION can still permit the inputs to influence the competition. When average input levels are higher than expected, the N-REGION prevents excessive competition. The net effect of combining these two is that competition is highly flexible and uniform. The possibility of too strong or too weak competition is greatly reduced.

The effects of the CN-REGION are similar to the competition in Grossberg's on-center off-surround network [10]. Grossberg's network uses a quenching threshold to limit which nodes will be activated. In essence, the adaptive competition in the CN-REGION is similar to a self-adjusting quenching threshold that changes with the current input level.
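Combining the two earlier sketches gives a CN-REGION-style step: adaptive inhibition first, then normalization of the region's total output. The composition order, the additive update, and the averaging of Ir over all nodes are assumptions of this sketch, not details given in the text.

# CN-REGION sketch: C-REGION-style adaptive inhibition followed by
# N-REGION normalization. Constants and update rule are illustrative.
K_I, W_I, W_ACTIVATE = 1.5, -0.6, 0.33

def cn_region_step(outputs, inputs):
    avg_input = sum(inputs) / len(inputs)  # Ir (averaged over all nodes here)
    total = sum(outputs)
    raw = []
    for out_j, in_j in zip(outputs, inputs):
        inhibit = K_I * avg_input * W_I * (total - out_j)  # adaptive competition
        raw.append(max(0.0, out_j + W_ACTIVATE * in_j + inhibit))
    cap = 1.0                              # N-REGION: cap the total output Sr
    s = sum(raw)
    return raw if s <= cap else [v * cap / s for v in raw]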
IV. APPLICATIONS

The network region model of competition was tested in three AI applications — high-level vision, isolated word recognition, and microfeature-based natural language understanding. These examples illustrate the effectiveness of WTA structures and network regions in different classes of applications. The input to the labeling problem of high-level vision is well-defined without noise, whereas the input to the word recognition example is highly noisy and ambiguous. The natural language example shows the importance of flexible decision making. All examples were implemented using the AINET-2 system [11] (a massively parallel network simulator and development environment), which runs on Symbolics Lisp machines.

A. HIGH-LEVEL VISION

This section investigates the line drawing labeling problem [12] of high-level vision. In our line drawing labeling system, each junction in an object is represented by a WTA structure with nodes that represent the possible labels for that junction. The label of a junction is rigidly constrained; it must agree with the labels of adjacent junctions. These constraints are represented as activation links between adjacent junctions. This method of constraint propagation by spreading activation mimics the effects of the Waltz filtering algorithm.

An example object and its correct labeling is shown in Figure 5. The three types of junctions that occur in this object are the L's, forks, and arrows (see Figure 6).
All nodes were initially active, indicating that all labels were equally possible. As the network relaxed, nodes that did not satisfy the defined constraints died out. Eventually, the network converged to the correct labeling, as shown by the darkened nodes in Figure 7.
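One way such constraints might be wired is sketched below: compatible labels at adjacent junctions excite one another, while each junction's own labels compete in a WTA structure. The junction names, label names, and compatibility table are illustrative placeholders, not Waltz's actual label catalog.

# Junction-labeling sketch: each (junction, label) pair is a hypothesis
# node inside that junction's WTA structure; compatible label pairs at
# adjacent junctions are joined by excitatory links. All names and the
# compatibility table below are invented for illustration.
W_ACTIVATE = 0.33

compatible = {  # excitatory links across a shared edge
    (("J1", "arrow"), ("J2", "fork")),
    (("J2", "fork"), ("J3", "L")),
}

def constraint_support(node, outputs):
    """Excitation a label hypothesis receives from compatible neighbors."""
    support = 0.0
    for a, b in compatible:
        if a == node:
            support += W_ACTIVATE * outputs[b]
        elif b == node:
            support += W_ACTIVATE * outputs[a]
    return support

# Usage: outputs maps (junction, label) -> current output level.
outputs = {("J1", "arrow"): 1.0, ("J2", "fork"): 0.5, ("J3", "L"): 0.0}
print(constraint_support(("J2", "fork"), outputs))  # 0.33*1.0 + 0.33*0.0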
In our experiments, networks performed correctly
as long as there was a balance between the inhibition and
activation link weights. Both excessive inhibition and
weak activation caused the network to converge
incorrectly. Weak inhibition and excessive activation
resulted in multiple labels being selected at some of the
junctions.
This labeling system was also tested with N-REGIONs at each junction, in addition to the WTA
structures. This resulted in a slightly greater range of
possible inhibition and activation link weights in which
the network would still converge to the correct labeling.
In addition, the network tended to degrade gracefully
when link weights varied beyond this range. Without
N-REGIONs there was a sharp threshold after which the
network performed improperly. This improvement in
network behavior can be attributed to the uniform competition provided by the N-REGION. Similar results were observed for the CN-REGION.
Figure 7. Network with WTA structures to perform labeling.
WTA structures are effective for the line drawing
labeling problem and other domains where constraints
are rigidly defined. The problem of "locking i n " on a
decision is not relevant here since evidence for a particular hypothesis (label) cannot vary with time. However,
as we shall see in the following sections, this problem
becomes more significant in applications where evidence
for a hypothesis can vary with time.
B. ISOLATED WORD RECOGNITION

An isolated word recognition system, called SECO [13, 14], was used as an application with highly noisy and inconsistent inputs. This system recognizes spoken letter-names and digits. It partially evolved from a massively parallel model of word perception called COHORT [15]. The structure of SECO is similar to COHORT with the addition of a temporal constraint layer.

The word recognition network consists of five layers — a phonetic feature layer (e.g., vowels, stops, etc.) that is activated by an LPC-based front-end processor, a phonetic segment layer (e.g., /f/, /v/, etc.), a phonetic segment token layer, a temporal constraint layer, and a word layer, which is the output of this system.

Word recognition involves evidence gathering (in the form of phonetic segments matched) over a period of time. An observation from the experiments is that a WTA structure might converge to a decision based on the initial data and "lock-in" on an incorrect word (very often the initial part of an utterance is noisy). In this case, the initial noise may gradually dominate the competition. In addition, during the recognition process, the correct phonetic segment may not always be the most highly activated segment. This would easily mislead a WTA structure, whereas a CN-REGION would be more capable of accommodating this type of noisy and erroneous data. The adaptive competition capability of the CN-REGION permits later inputs to positively influence the network. Figure 10 illustrates how the WTA structure misinterpreted the utterance "six" as "seven" based on initial data. Figure 11 is the result from using a CN-REGION on the same utterance.
When a word is spoken, the duration of the speech
varies with the speaker and the current context. This
variation in the duration of input data causes problems
for WTA structures. If an utterance is longer than expected, word-level nodes will be highly activated before the utterance is complete and will prematurely settle on a "winner." The normalizing effect of the CN-REGION prevents this "premature" recognition by adjusting the
output levels to represent the current "likelihood" of a
word at any point during the recognition process.
When competition is too weak, the WTA structure may fully activate more than one word node. The outcomes may also be activated before the utterance is completed. With the CN-REGION, normalization reduces the
nodes' output to better indicate their "likelihoods." Figure 12 shows an example where the input utterance is
"three." This particular utterance is somewhat noisy
and the network confuses this with "four" while slightly
favoring "three". The plot shows the WTA structure
activates both "three" and "four." This would misinform other network structures that both words were
recognized. In the case of CN-REGION (Figure 13),
"three" and "four" are only partially activated.
The top layer is a "local" connectionist model, the bottom layer, a distributed layer of "microfeatures." The
microfeatures are used as a basis for defining nodes (i.e.
concepts, hypotheses) in the top layer, at least partially,
and to associate the node with others that share its
microfeatures. Each node in the top layer, is connected
via bi-directional links to only those microfeatures that
describe it. Conceptually related nodes have common
microfeatures.
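The sense-priming behavior discussed below can be sketched as overlap between a context's primed microfeatures and each sense's microfeature set. The feature names and the single-weight flow are invented stand-ins for the network's actual microfeature set and link structure.

# Microfeature sketch: each top-layer sense node links to the features
# that describe it; priming a context activates features, which pass
# activation up to the senses sharing them. Feature names are invented.
SENSE_FEATURES = {
    "bucks/dollar": {"money", "exchange", "betting"},
    "bucks/deer":   {"animal", "outdoors", "hunting"},
}

def prime_senses(primed_features, weight=0.33):
    """Bottom-up activation each word sense receives from primed features."""
    return {sense: weight * len(features & primed_features)
            for sense, features in SENSE_FEATURES.items()}

# Priming a GAMBLING/WEEKEND-like context favors the dollar sense without
# (under a CN-REGION) wiping out the deer sense entirely.
print(prime_senses({"money", "betting", "outdoors"}))
# -> {'bucks/dollar': 0.66, 'bucks/deer': 0.33}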
The following is one of the sentences used to contrast the competition found in WTA structures with
CN-REGIONs.
"John went driving with five bucks in his truck."
The network that models this sentence is shown in Figure 14 (only the top layer is shown). The rectangular
nodes in the center of the figure represent the input
words. The structure above this is the syntactic parse
tree for the sentence. The elliptical nodes below the
input words represent the competing word senses. In
addition, semantic constraints associated with the inputs are encoded but not displayed. There is also a microfeature memory system that represents the currently
active and inactive microfeatures (these are not shown in
the figure).
Figure 14. Network used to process the example sentence.
The example sentence is ambiguous, for it can mean either that he went driving with five dollars in his truck, or that he went driving with five deer in his truck. In WTA-1 (Figure 14), two problems occur. Initially priming the WEEKEND and GAMBLING context causes the system to settle on the dollar interpretation of bucks, while priming it with HUNTING causes it to settle on the deer interpretation of bucks. In both cases the competition is too strong; there really should not be a total winner here. Instead there should be some overlap, as the features shared by both senses of the sentence are shared by both priming contexts. Using CN-REGIONs, this is exactly what happens. Priming WEEKEND and GAMBLING causes the dollar sense to be higher than the deer sense, while priming HUNTING causes the deer sense to be higher than the dollar sense, yet neither totally wins out, thus reflecting the ambiguity in the data.
In WTA-2, the correct sense of truck is not chosen. Here, the input driving initially activates the transport sense of truck, but later inputs activate the vehicle sense of the word; however, they arrive too late to enter the competition. This was not the case with the CN-REGION, which correctly shifted to the vehicle sense of the word.
Results from our experiments indicate several problems that WTA structures pose for natural language processing. First, the semantic knowledge of the network
and its microfeatures varies with one's experience. There
is no one correct set. Thus, slight variations in the
network's microfeature set will cause problems for WTA
networks which are sensitive to transiently higher input.
This can lead to premature stabilization of activation.
On the other hand, CN-REGIONs allow smooth competition among the microfeatures, the syntactic structures, the semantic constraints, and the input words,
thus enabling subtle differences in meaning to exist.
Secondly, the processing of a sentence is sequential over
time. As a result, one's understanding of the sentence
and the sentences that follow are gradually refined as one
processes the inputs. A WTA structure may prematurely "lock in" on a word sense and ignore later information, whereas the CN-REGION permits a dynamic flexibility in shifting from one interpretation to another. In addition, a WTA structure does not provide a good representation for ambiguous meanings; it always chooses the one that is transiently higher. The CN-REGION allows multiple outcomes, each associated with a "likelihood" measure.
V. RESEARCH DIRECTIONS
The research documented here represents the first
step in utilizing network regions to provide higher-level
representation in massively parallel networks. As regions
can provide a means of creating hierarchies, as well as
shared structures, it may be possible to use current learning techniques to create networks that can generalize
from existing concepts to form new regions that
represent these generalized concepts. This is a topic for
future research. Currently we are investigating the
effects of interaction among regions [8], that is, the computational mechanisms needed to represent the interaction of higher-order concepts.
ACKNOWLEDGEMENTS
We would like to thank Dave Waltz and the
members of the Brandeis AI Group for their comments.
We would also like to thank the members of the
Knowledge Engineering Center at Honeywell Bull.
REFERENCES
[1] Feldman, J.A. and D.H. Ballard, "Connectionist Models and Their Properties," Cognitive Science, 6, 1982, pp. 205-254.
[2] Hinton, G.E. and J.A. Anderson (eds.), Parallel Models of Associative Memory, Hillsdale: Lawrence Erlbaum Associates, 1981.
[3] Rumelhart, D.E. and J.L. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1, MIT Press, 1986.
[4] McClelland, J.L. and D.E. Rumelhart, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 2, MIT Press, 1986.
[5] Shastri, L. and J.A. Feldman, "Semantic Networks and Neural Nets," Computer Science Department, The University of Rochester, TR131, June 1984.
[6] Hendrix, G.G., "Expanding the Utility of Semantic Networks Through Partitioning," In Proceedings of IJCAI-75, 1975, pp. 115-121.
[7] Brachman, R.J. and H.J. Levesque, "On the Partitioning of Concerns in Knowledge Representation," In Proceedings of AAAI-82, 1982.
[8] Bookman, L.A. and H.W. Chun, "A Model of Higher-Order Concept Interaction Using Network Regions," Computer Science Department, Brandeis University, Technical Report CS-87-130, Massachusetts, 1987.
[9] Reggia, J.A., "Virtual Lateral Inhibition in Parallel Activation Models of Associative Memory," In Proceedings of IJCAI-85, 1985, pp. 244-248.
[10] Grossberg, S., "Contour Enhancement, Short Term Memory, and Constancies in Reverberating Neural Networks," Studies in Applied Mathematics, Vol. LII, No. 3, September 1973, pp. 213-257.
[11] Chun, H.W., "AINET-2 User's Manual," Computer Science Department, Brandeis University, Technical Report CS-86-126, Massachusetts, 1986.
[12] Waltz, D.L., "Understanding Line Drawings of Scenes with Shadows," in The Psychology of Computer Vision, P. Winston (ed.), McGraw-Hill, 1975.
[13] Chun, H.W., "A Representation for Temporal Sequence and Duration in Massively Parallel Networks: Exploiting Link Interactions," In Proceedings of AAAI-86, August 1986.
[14] Wong, M.K. and H.W. Chun, "Toward a Massively Parallel System for Word Recognition," 1986 IEEE International Conference on Acoustics, Speech, and Signal Processing, Tokyo, Japan, April 1986.
[15] Elman, J.L. and J.L. McClelland, "Speech Perception as a Cognitive Process: The Interactive Activation Model," In N. Lass (ed.), Speech and Language: Vol. X, Orlando, Florida: Academic Press, 1984.
[16] Bookman, L.A., "A Microfeature Based Scheme for Modelling Semantics," In Proceedings of IJCAI-87, 1987.
[17] Cottrell, G.W., "A Connectionist Approach to Word Sense Disambiguation," PhD thesis, TR 154, Computer Science Department, The University of Rochester, 1985.
[18] Waltz, D.L. and J.B. Pollack, "Massively Parallel Parsing: A Strongly Interactive Model of Natural Language Interpretation," Cognitive Science, Volume 9, Number 1, January-March 1985.