The Binding Pool Model of Visual Working Memory

The Pennsylvania State University
The Graduate School
The Binding Pool Model of Visual Working Memory
A Thesis in
Psychology
by
Garrett S. P. Swan
© 2013 Garrett S. P. Swan
Submitted in Partial Fulfillment
of the Requirements
for the Degree of
Master of Science
December 2013
The thesis of Garrett S. P. Swan was reviewed and approved* by the following:
Bradley Wyble
Assistant Professor of Psychology
Thesis Advisor
Frank G. Hillary
Associate Professor of Psychology
Rick Gilmore
Associate Professor of Psychology
Melvin M. Mark
Professor of Psychology
Head of Department of Psychology
*Signatures are on file in the Graduate School
Abstract: Visual working memory (VWM) is a complex process of encoding, storing, and
retrieving visual information. To better understand this process, researchers have designed
models to describe the mechanisms of VWM. The two prevalent models assume that information
is either stored in discrete slots or as a function of a limited pool of resources. However, these
models and others fail to provide an explicit mechanism explaining how items can be encoded
and retrieved individually, yet are subject to interactions with other items in VWM. The model
that I am proposing, called the Binding Pool model, formulates a mechanism for indexing a
resource pool. The Binding Pool model utilizes a Type/Token architecture to differentiate items
that are stored in the connecting region called the binding pool. This particular architecture
provides a clear framework for exploring VWM tasks, such as the change detection and
continuous report paradigms, and can be used to predict behavior in a variety of task
manipulations. The thesis will specifically cover three modifications to the Binding Pool model
and a set of predictions.
Word Count: 10,527
Figures: 14
Table: 1
List of Figures
Figure 1. Change detection paradigm
Figure 2. Continuous report paradigm
Figure 3. The Binding Pool model structure
Figure 4. Encoding stage
Figure 5. Retrieval stages
Figure 6. Retrieved population vector
Figure 7. Set size interference
Figure 8. Simulation of continuous report paradigm
Figure 9. Thresholds for change detection paradigm
Figure 10. Simulation of change detection paradigm
Figure 11. Simulation of ‘magnet’ effect
Figure 12. Simulation of retrieval confidence
Figure 13. Simulation of inter-item similarity
Figure 14. Simulation of complex ensembles
List of Tables
Table 1. List of parameters
Table of contents
List of Figures
List of Tables
Chapter 1. History of visual working memory
    Change detection paradigm
    Continuous report paradigm
Chapter 2. Modelling visual working memory
    Descriptive models
    Computational models
Chapter 3. Binding Pool model
    Components of model
    Encoding stage
    Retrieval stages
    Behavioral simulations
Chapter 4. Predictions
    Inter-item similarity
    Variations in the ‘magnet’ effect by ensembles
    High confident retrieval swaps
Chapter 5. Model GUI
Bibliography
Chapter 1. History of visual working memory
Memory is one of the oldest and most intriguing topics in psychology. How can information
from one moment be preserved and used to guide action in another? It is an important question
because success in a variety of tasks depends on accurate and precise retrieval of relevant
information. While typing this manuscript, a variety of memory systems enable me to type
without looking at the keys, to know the syntax and semantics of English, and to plan which
words I will be typing next. Memory, and specifically working memory, is highly correlated with
intelligence, problem solving, learning, and the ability to solve complex puzzles (Daneman &
Merikle, 1996; Conway, Kane, & Engle, 2003). Each of these tasks requires that information can
be extracted from some memory store, used as needed, and then removed from the active focus
of attention to allow for other information to be attended. The goal of this thesis is to understand
a type of memory that can rapidly store and retrieve visual information for goal directed behavior
called visual working memory (VWM). This will be accomplished by constructing a neural simulation of working memory and then using that model to simulate and predict various behavioral data. The model will be constrained by various findings in the VWM field and serve as an existence proof for a mechanistic account of a resource-based VWM model. Before delving
into the mechanisms of the model, it is important to understand the work that has come before it.
Working memory is typically modelled as the active state between the presentation of information and its storage in long-term memory, and is specifically used for storing and retrieving information (Atkinson & Shiffrin, 1968; Baddeley & Hitch, 1974). Unlike
other types of memory, working memory is concerned with active representations that must be
manipulated or used to perform some task. Some models separate working memory from long-term memory (cf. Cowan, 2001). If working memory is a separate system, what are some of its
properties and the mechanisms behind these properties?
To answer these questions, researchers have designed models that are motivated by behavioral
phenomena and biological constraints to simulate how a system may encode, store, and retrieve
specific information. Both behavioral and biological data provide constraints on the model, and
the more constraints that can be applied, the more likely it is that the model reflects the
properties of the natural system. Most VWM models adopted characteristics from early verbal
working memory tasks that primarily focused on discrete representations of categorical stimuli.
Verbal working memory accounts as a precursor to visual working memory.
Much of the current debate concerning the mechanisms of working memory can be attributed to
task designs from early memory span tasks involving the maintenance of lists. The lists
themselves were composed of nonsense words, actual words, numbers, letters, or even images
(Standing, 1973). After being presented with such a list, participants were asked to free recall as
many items as possible (Tulving, 1962), report a single item from a pair (Postman, 1972), or
report whether an item was or was not in the study list. The appeal of memory span tasks is in
the ease of quantification. If a participant is able to report 5 of the 7 items correctly, the
implication is that their memory stored 5 items. This research emphasized the limited capacity
nature of working memory and played an important role in shaping the direction of the field by
providing a focus on categorically distinct stimuli. Furthermore, these assumptions played an important role in the development of a visual recognition task called change detection, which rests on similar premises.
Change detection as a measure of visual working memory
One of the most prevalent task designs in VWM studies is a recognition task called the change
detection task. In this task, participants are instructed to remember a set of stimuli. The to-be-remembered stimuli disappear for a duration of about 1 second, and then a second array of test stimuli appears (Figure 1). Participants make a binary response as to whether or not the test set of stimuli changed from the study set of stimuli. Typically, response accuracy is near ceiling until the number of stimuli, or set size, exceeds 3 items in the study array. The intuition behind
the task is that items stored in memory from the study array are compared with the items in the
probe array and only changes to items that were in memory can be detected. These assumptions
are motivated by the idea that items are stored in memory either completely or not at all, an idea
taken from early memory span tasks. The most striking evidence supporting these assumptions is
from Luck and Vogel (1999), who demonstrated that changes between complex objects with as
many as four features each can be detected at the same level of accuracy as objects with only one
feature. Memory capacity in this theoretical framework has been referred to as “slots”, with the
idea being that each slot can hold one item, regardless of its complexity. However, evidence for
this has been controversial and various other task designs have found performance to degrade as
more features are added to an object (Fougnie, Suchow, & Alvarez, 2012; Oberauer &
Eichenberger, 2013).
Change detection task
Figure 1: In a whole-display change detection task, participants see a set of stimuli (in this
example, a red, blue, and yellow patch) and are asked to remember the items. The items
disappear after an encoding duration (typically 100-500ms) and following a retention interval of
several hundred milliseconds, a new display is presented that may or may not contain a changed
item. Participants determine whether or not an item has changed. In this example, the red patch
changed to green, so a correct response would be to respond ‘Change’.
More recently, methodological changes to the change detection task have been used to
demonstrate that representations of items may be stored at varying degrees of strength, and that
perhaps slots are not the unit of memory storage. Alvarez and Cavanagh (2004) empirically
tested how item complexity influences performance in a change detection task by presenting
complex items (objects with multiple features), novel items (Kanji characters), familiar items
(Snodgrass line drawings), and simple items (objects with a single feature) to the participants.
Participants had the most difficulty with the complex objects, then novel, followed by familiar,
and lastly the simple items. The differences in memory performance for the different types of
objects lend credence to the idea that the units of memory may not be as simple as an object
representation (Awh, Barton, & Vogel, 2007). This work has been expanded by Oberauer et al.
(2013), who systematically explored how different features affected retrieval accuracy. Similarly
to Alvarez and Cavanagh (2004), Oberauer et al. (2013) found that adding additional features per
object degraded performance in a change detection task, thereby providing more evidence that
memory representations are more complex than discrete item slots. Others have demonstrated
that VWM is not an all-or-none store, but is affected by the relative dissimilarity between items
(Lin & Luck, 2009; Keshvari, van den Berg, & Ma, 2013).
Despite these improvements in the design, the change detection task is still limited because the
output is a binary response. An alternate approach is to use a recall task in which participants are
asked to directly retrieve an analog feature of a stimulus.
Continuous report task as a recall visual working memory task.
Near the turn of the century, Prinzmetal et al. (1998) published a paper on the phenomenology of
continuous stimuli. The goal of this article was to examine how attention modulates the ability to
recreate an analog feature of a remembered stimulus over a continuum of possible responses
using a color wheel. The difference between the recreated color and the actual color is the
retrieval precision of that feature, and over many trials, responses accumulate into a response
distribution with the width of the distribution reflecting the strength of the memory
representation. The results of the study demonstrated modest attentional effects on the shape of
the response distribution and did little to change the field of attention and phenomenology.
However, the design of the response measure did have a lasting effect on the VWM field.
Specifically, by collecting a direct analog response value of a specific stimulus, researchers
could begin to ascertain exactly how the quality of an item’s representation is influenced by the
task conditions.
Wilken and Ma (2004) adapted Prinzmetal’s design to measure the quality of memory
representations in a VWM context. In their task, referred to from here on out as the continuous
report task, participants are briefly presented with an array of stimuli that is similar to the study
array in a change detection task. However, unlike the change detection task, after a retention
interval, a single item is cued and the response is a mouse click along a continuum (in the case of
selecting a color hue or an orientation) or moving a slider with the mouse (in the case of
adjusting spatial frequency for Gabor patches). In both cases, responses are analog, so that the
dependent measure is the retrieval precision of the selected item relative to the presented item
(Figure 2).
Continuous report task
Figure 2: In a continuous report task, participants see a set of stimuli and are asked to remember
the hues. In this task, once the stimuli reappear, participants are given a location cue (the light
gray patch) to recreate that specific hue by selecting a color along the color wheel with a mouse
click.
The principal finding of Wilken and Ma (2004) was that as the number of stimuli in the study
array increases, so does the standard deviation of the response distribution. Importantly, the
relationship between the precision of the distribution and the set size is captured with a power
law. This finding suggests that there is a degradation of representational fidelity when additional
information is inputted into memory. The slot model would have instead predicted that the
response distribution would not have changed until the amount of information exceeds capacity.
Contrary to the slot theory, the precision of the retrieved memory clearly degrades from set size 1 to 2, which indicates that the slot architecture may not be an appropriate memory architecture. The
inadequacy of the slot model is even more pronounced when dissecting specific types of
responses from the response distribution.
Zhang and Luck (2008) suggested that the response distribution is composed of distinct types of
responses and a mixture model can be used to distinguish these responses. Their analysis method
assumes that the response distribution is a combination of two distributions: a normal
distribution over a circular space, a von Mises distribution, centered at zero with a standard
deviation influenced by the conditions of the task and a second component which is a uniform
distribution representing random guesses. This is represented mathematically as:
p(\hat{\theta}) = (1 - \gamma)\,\phi_{\sigma}(\hat{\theta} - \theta) + \gamma\,\frac{1}{2\pi}                [1]

where \hat{\theta} is the reported color (in radians), \theta is the target color, \gamma is the proportion of trials in which the participant guesses, and \phi_{\sigma} represents the von Mises distribution with a mean of zero and standard deviation \sigma.
A mixture model is used to isolate correct reports and guesses from the response distribution.
When using this mixture model, the proportion of guessing is found to increase with set size, but
the precision of a target response seemingly reaches an asymptote as set size exceeds 3 or 4
stimuli. These results are evidence in favor of a limited-capacity system because they imply that no new information is being encoded to degrade precision further.
However Bays et al. (2009) pointed out that participants might sometimes make retrieval errors
by retrieving and recreating a non-target stimulus. This response can be captured in the mixture
model by adding another von Mises distribution centered at the location of the non-target,
p(\hat{\theta}) = (1 - \gamma - \beta)\,\phi_{\sigma}(\hat{\theta} - \theta) + \gamma\,\frac{1}{2\pi} + \beta\,\frac{1}{m}\sum_{i=1}^{m}\phi_{\sigma}(\hat{\theta} - \theta^{*}_{i})                [2]

where \beta represents the proportion of retrieval errors, or non-target selections, m is the number of non-targets, and \theta^{*}_{i} is the i-th non-target color (in radians).
In order to determine if a given response is a retrieval of a target or a non-target, the mixture
model compares the retrieved value to both the target and non-target location and selects the best
fit. Removing retrieval errors should then improve the validity of the precision estimate. The
typical result is that the proportions of retrieval and guessing errors increase as a function of set
size, and the precision estimate does not reach an asymptote. Instead, the precision data can be
captured with a power law. These results again challenged the idea of slots, demonstrating that
precision decreases continuously as set size increases.
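For concreteness, the following Python sketch evaluates the mixture density in Equations [1] and [2]. It is only an illustration of the analysis, not the fitting code used in any of the cited studies; in particular, it parameterizes the von Mises component by a concentration kappa (as scipy does) rather than by the standard deviation sigma, and the function and variable names are my own.

```python
import numpy as np
from scipy.stats import vonmises

def mixture_density(theta_hat, theta, nontargets, gamma, beta, kappa):
    """Response density from the swap mixture model (Equation 2).

    theta_hat  : response value(s) in radians
    theta      : target value in radians
    nontargets : array of non-target values in radians (may be empty)
    gamma      : guess proportion
    beta       : non-target (swap) proportion
    kappa      : von Mises concentration (higher = more precise)
    """
    target_part = (1 - gamma - beta) * vonmises.pdf(theta_hat, kappa, loc=theta)
    guess_part = gamma / (2 * np.pi)
    if len(nontargets) > 0 and beta > 0:
        swap_part = beta * np.mean(
            [vonmises.pdf(theta_hat, kappa, loc=nt) for nt in nontargets], axis=0)
    else:
        swap_part = 0.0
    return target_part + guess_part + swap_part

# Example: likelihood of a response 20 degrees from the target,
# with one non-target 90 degrees away.
resp = np.deg2rad(20.0)
print(mixture_density(resp, 0.0, np.array([np.deg2rad(90.0)]),
                      gamma=0.1, beta=0.1, kappa=8.0))
```

Setting beta to zero and passing no non-targets reduces this density to Equation [1].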
Another measurement that provides additional information about the status of memory traces is
that of confidence. Only a single paper has examined the relationship between retrieval precision
and confidence (Rademaker, Tredway, & Tong, 2012). The results of this study demonstrated
that the majority of guessing and retrieval errors occur when confidence for an item is low.
Importantly, there are low confidence responses to target items in each set size condition. In this
sense, confidence could be used as another tool for dissecting the response distribution because it
helps to isolate the true precision of an encoded item by differentiating further the types of
retrievals. Medium- and high-confidence responses, after being analyzed with the mixture model, should better capture the representation of a successfully stored and retrieved item.
In summary, new change detection and continuous report tasks have challenged the traditional
view that representations in VWM are discrete and independent. Instead, models have begun
using other forms of representations and mechanisms for describing these behaviors. In the next
chapter, I will be discussing some of these computational models and how they have been
applied to VWM.
Chapter 2. Modelling visual working memory
Descriptive models provide an intuitive, theoretical description of how a complex system works
through analogical reasoning. Two such models have come to dominate our understanding of
how VWM works. The first of these models is ‘slot’ based and assumes that representations are
discrete. The second is ‘resource’ based and assumes that representations receive portions of a
limited resource pool.
Slot based models of VWM
‘Slot’ based models of VWM are the most prevalent descriptive models of VWM. The basic
architecture of these models is that representations are maintained independently of other stored
representations in a metaphorical ‘slot’ (Luck & Vogel, 1999). Each slot is able to maintain a
single object with an unspecified amount of features. In other words, if an item is allotted a slot
during encoding, its representation is maintained and can be retrieved regardless of its
complexity. An object can only be encoded into a slot if there is an available slot. Otherwise, that
object is not encoded. Therefore, the probability of encoding an object among a set of n stimuli is p(e) = k/n, with k representing the number of available slots. The simplicity of this
model allows it to be easily applied to data and understood as a construct, but it neglects some
principles of information processing as a consequence of its simplicity.
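A minimal sketch of this encoding rule is shown below; the capacity of k = 3 slots is purely illustrative (chosen to match the set size at which accuracy typically departs from ceiling), and the function name is a placeholder rather than anything from the slot-model literature.

```python
def p_encode(n_items, k_slots=3):
    # Slot model: each item gets a slot with probability p(e) = k/n, capped at 1.
    return min(k_slots / n_items, 1.0)

for n in (1, 2, 4, 6, 8):
    print(n, p_encode(n))
```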
For example, an issue that is often not addressed by the ‘slot’ model is the specific nature of
encoding and retrieval. The model assumes that if a single item is presented, then the encoded
representation of that item is an exact copy of that stimulus that can be retrieved with the same
fidelity as it had originally been presented with. However, it is quite clear from precision tasks
that even encoding and retrieval of a single color is not perfect (Wilken & Ma, 2004; Zhang &
Luck, 2008; Bays, Catalao, & Husain, 2009). At some level, there must be a loss of representational
fidelity.
Another flaw of the slot model is its inability to account for changes in retrieval precision for
increasing set sizes. If memory representations are truly independent, then storing a second color
patch should not influence precision of the first stored color patch. However, precision for an
item falls dramatically when storing a second item. In response, Zhang and Luck (2008) added
an additional component to the model, now called the slots+averaging model, which enables a representation to be held in multiple slots if there are available slots.
items are presented, then one item may be represented in multiple slots if there are more than two
overall slots. This manipulation can account for precision decrementing as a function of set size,
but it does add another layer of complexity. Can objects with multiple features be averaged over
many slots? Does the model always encode each item first before it begins assigning double
representations?
Another implication of the slot model architecture is that precision as a function of set size will
eventually reach an asymptote. This asymptote corresponds to the capacity limit on the amount
of information that can be stored. Specifically, if the number of items presented exceeds the
amount of slots, then the reported precision will not change regardless of how many extra items
are presented after storage capacity has been reached. This particular effect is controversial
because it is not found when including non-targets in the mixture model (Bays et al., 2009) and it
contradicts findings that participants encode Gestalts of color patches (Brady & Tenenbaum,
2013) and irrelevant features from stimuli (Gao, Li, Yin, & Shen, 2010).
Resource based models of VWM
An alternative descriptive architecture of VWM is a resource based model approach. The theory
behind a resource model is not a new one, but it has only recently been applied to working
memory (Bays & Husain, 2008). In these models, there is a finite amount of resources available
to maintain representations and the number of stored representations influences the precision of
retrieval. The allocation of resources is dependent upon the total amount of resources and the
number of stimuli presented. For example, if a single item is recalled with a precision afforded by 4 units of representation, then 4 is the maximum number of units the resource pool can allot; if two items were presented, each would be recalled with a precision afforded by only 2 units. One of the issues with this initial model is that each item is allotted an equal proportion of resources, but there is evidence in the attentional literature suggesting that items are not encoded uniformly (Cowan &
Maunsell, 2010; van den Berg, Shin, Chou, George, & Ma, 2012).
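The equal-split version of this idea can be written in a few lines; the total of 4 units follows the worked example above and is purely illustrative.

```python
def resource_per_item(n_items, total_resource=4.0):
    # Equal-split resource model: each of n items receives total/n units,
    # so precision falls continuously as set size grows.
    return total_resource / n_items

for n in (1, 2, 4, 8):
    print(n, resource_per_item(n))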
Therefore, an addendum to the resource model is that the allotted resource per item is variable
and dependent upon the maximum available precision (van den Berg et al., 2012). This
modification is congruent with research that finds stored representations are not equally
represented. Also, this modification provides a better fit to behavioral data. However, there is a
fundamental issue with storage in the resource model: the architecture is underspecified in describing how representations can be recalled individually. This issue can be resolved by incorporating the Type/Token architecture described in Chapter 3.
One way of adjudicating between distinct theories is to develop computational simulations. Such
models provide a way to formalize the assumptions of each theory in a mathematically explicit
construct. Importantly, the formalization of a model elucidates the boundary conditions of that
theory, and thereby, the limits of the theory to explain behavior. Another benefit of formalizing a
theory through computation is the ability to make predictions. These predictions are typically an
outcome of the specific architecture of the model, and provide an important means for testing the
validity of the model.
There is a long history of computational modeling in the context of modeling memory. The
models discussed below contain elements of working memory, although some are less
appropriate for VWM. Specifically, the majority of these models are relevant to the model
presented in this manuscript because they utilize distributed patterns of activity. In other words,
two representations are not differentiated by discrete nodes, but by a pattern of activity among a
collection of nodes. Despite being similar in their use of distributed networks for storage, the
models discussed below utilize a variety of encoding and retrieval mechanisms.
Computational models of memory
One important model that helped to lay the groundwork for future work is the Search of
Associative Memory (SAM) model of long-term memory that can simulate data from verbal free
recall, paired-associate recall, and recognition tasks (Raaijmakers & Shiffrin, 1981). In SAM,
working memory (STS) is a temporary store with a limited capacity that maintains categorical
information, such as a list of words. Once the limit of items in STS has been reached, a random
item in the list is overwritten. Long-term memory (LTS) contains all previously stored
information and the information maintained in STS in the form of a distributed matrix. In order
to retrieve information, a cue is inputted into STS to probe LTS. Successful retrieval is
dependent upon the strength of the connection between the cue and sampled image of the cued
information in LTS. Although this model is designed specifically for verbal working memory, it
does provide a mechanism that is essentially similar to that required by slot-based models of
VWM.
The theory of distributed associative memory (TODAM) is another computational model of
memory, and like SAM, stores information as a distributed pattern of activity in a memory matrix
(Murdock, 1995). In TODAM, an item is represented as a vector of random values, and an
association between multiple items is stored as a convolution of the item vectors. The distributed
memory matrix accumulates activity as the model stores more information. In this model,
working memory is treated as the operation of interacting with the memory matrix. In the case of
simulating retrieval of a word from a pair of words, the model uses a division operation to reduce
the memory matrix to activity that corresponds to the other item in the pair. However, the
accumulation of activity from noise and other stored items interferes with the quality of the
retrieved memory. A criterion is then used to determine if the retrieved item has been correctly
retrieved. The notion of interference from other items and determining retrieval thresholds are
critical components of the Binding Pool model as well.
These models, along with others, have played an important role in laying the groundwork for
understanding the storage of individual pieces of information as being distributed across multiple
storage units. More recently, other models have used such distributed representations in
constructs specifically aimed at understanding working memory and visual working memory.
Oberauer et al. (2012) proposed a model of working memory, called the serial-order-in-a-box
complex span (SOB CS), which utilizes a similar storage structure to the model described in this
manuscript, albeit simulating a complex span task instead of VWM tasks. In this model,
presented items activate the corresponding item node, and then the item node is bound to a
position node. Bound position nodes have an increased level of activity toward that item relative to a baseline level of activity toward the rest of the item nodes. During recall, the model projects activity
from the cued position node back to the item layer, and thereby recreates the item layer based on
the connection weights to that position node. By matching the retrieved item layer to a list of
candidate item layers, the model selects the item layer most similar to the probe and retrieves an
item. The components of interest in this model are the use of binding links between items and the
instance of that item, which are similar to the relationship between types and tokens described in
chapter 3.
Contrary to the models described above, Wei et al. (2012) have designed a model with localist
representations of stimuli that are stored in the stimuli’s feature space. In their model, memory is
represented in the form of bump attractors along a feature space. Each attractor in this space is a
discrete representation of a feature value, so that 360 attractors could represent each hue in a
color wheel. Activity of an attractor is maintained with a self-excitatory loop, and multiple
attractors can be maintained at once. However, two attractors may inadvertently combine activity
to form a single bump if the system is maintaining multiple items. One of the limitations of this
model is that memory storage and the specific features of the inputted information are contained
within the same layer. Thus, the model has no means of dissociating repetitions as discrete items
or of binding multiple features of the same item.
Chapter 3. Binding Pool model
The Binding Pool is a mechanistic model of VWM that uses a resource pool as a mechanism for
indexing stored representations. The Binding Pool model utilizes a neurally plausible
architecture that links Types and Tokens (Kanwisher, 1987) using a distributed resource pool
called the binding pool (Figure 3). The Type/Token framework originated from studies in the
repetition blindness literature to describe how multiple copies of the same item could be stored in
a memory system. This idea was implemented in a computational form in Bowman and Wyble
(2007) to account for specific effects in the attentional blink literature. By adding the distributed
binding pool, the Type/Token framework can be extended to specifically simulate the
interaction of stored representations in VWM.
Figure 3: In this illustration of the Binding Pool model, a diagonal green line is being stored
(shown at the bottom of the figure). The features of the stimulus (color, location, and orientation)
are separated into distinct feature layers, which we refer to as types. The types are connected to
the binding pool, in which the type links are bound to an object representation, referred to as a
token. In this example, the stimulus is being stored in token 1. However if there were multiple
stimuli, each would be assigned to a different token. The number of tokens shown here should
not be construed as a capacity limit, since the model can encode more than three tokens per trial.
Types as stimulus features
The type layer is a layer of neurons that represent both the input and output of the model. The
input into the model is in the form of node activity in the type layer. It is assumed that there are
many type layers in the brain and each type layer codes for a specific feature (color, orientation,
etc.). Importantly, the specific activity of the nodes within a type layer determines what feature is
being represented.
The activity in the type layer is projected to the binding pool during encoding. If multiple items
are presented during a study phase, then the type layer activity changes to a new pattern of
activity for each feature as the model attempts to serially encode each item. After a stimulus is
encoded in the binding pool, type activity is no longer necessary.
The output of the model is also expressed through the type layers. Retrieval begins by activating
the cued type layer, which is then used to retrieve the token bound to both the cue and probed
type layers. The retrieved token and cue project activity to the binding pool, which is then
projected to the probed type layer. The output of this process is a memory trace of the original
item in the form of a distributed pattern of activity across the type nodes.
Tokens index representations
The token layer indexes stored information. Contrary to the type layer, a token node is not
stimulus specific and can correspond to any stored item. In other words, a token node can be
bound to any pattern of active type nodes. Therefore, it is a vital assumption of the model that
encoding is serial, so that a token node always corresponds to a single item. If multiple items
were bound to a token node, then it would be difficult to differentiate the items during retrieval.
Therefore, tokens allow the model to store multiple pieces of information concurrently,
regardless of how the information is presented and whether or not there are repetitions.
Importantly, the type layers are not directly connected to tokens. If this were the case, then
representations would be independently maintained in memory and there would have to be
predetermined connections between each type and token node. Instead, tokens and types are
bound in a distributed pool of resources called the binding pool.
The binding pool as storage
The binding pool is a neural instantiation of the resource pool and is the location of item storage
in the Binding Pool model. It is a cluster of neurons that sits between the type and token layers.
One of the unique features of the binding pool is that there is not a one-to-one connection
between a binding pool node and either a type or token node. Instead, the binding pool stores
distributed representations of information, which allows the model to store many different
combinations of feature values. However, as a consequence of this distributed representation,
multiple type and token nodes can connect to the same binding pool node, creating interference
during encoding and retrieval.
The connections between the type and token layers to the binding pool are fixed, which means
that there is no synaptic modification during a trial. Information storage occurs when there is
concurrent activation of a type and token node to a single binding pool node. The accumulation
of activity in a binding pool node is similar to accumulation of activity in response (R) units in
the Perceptron (Rosenblatt 1958). R-unit activity is the summation of initial activity times the
connection weights. However, unlike the Perceptron, there is not a threshold on the accumulated
activity, and thereby no limit on the amount of activity in a binding pool node. The activity level
reached during encoding is maintained through a presumed self-excitatory loop (which is not
simulated explicitly), so that the binding pool no longer requires activation from either a type or
token after encoding.
Connections in the Binding Pool model
Connections between the type and token layers to the binding pool are fixed for a given trial
(although we assume that connectivity would actually be constant for a whole set of trials from a
single subject). For each trial, the type and token layers randomly connect to 45% of the binding
pool without replacement. There are no restrictions on the amount of links to an individual
binding pool node. For continuous feature dimensions, the type connectivity is structured to
provide a similarity gradient by incorporating structured overlap between neighboring type
nodes. For example, type nodes 1 and 2 have more overlapping connections than type nodes 1
and 3. To do this, the proportion of the overlapping connections between any two type nodes is modified to include an extra proportion of shared nodes that is equal to 0.3^distance, with distance as the separation between the two type nodes. Therefore, type nodes 1 and 2 will share an extra 30% of their connections and type nodes 1 and 3 will share an extra 9% of their connections, and so on. In
the following sections, the operations of the model will be explicitly detailed, so that the reader
could recreate the mechanisms of the model solely with these descriptions. This section will
describe the encoding of a single color patch into VWM.
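The following sketch shows one way such connectivity could be generated. The construction is my own assumption: connections are built as a chain in which each type node inherits roughly 30% of its neighbor's links, so shared connectivity falls off with distance approximately as 0.3^distance. It ignores the circularity of the feature space and is not necessarily the exact procedure used by the model.

```python
import numpy as np

rng = np.random.default_rng(0)

N_TYPE = 10      # type nodes per feature layer
N_BP = 800       # binding pool nodes
SPARSITY = 0.45  # each layer node connects to 45% of the pool
OVERLAP = 0.30   # extra shared connections between adjacent type nodes

def type_connections():
    """Binary connection matrix (type node x binding pool node).

    Each type node connects to 45% of the pool; adjacent type nodes are forced
    to share an extra ~30% of their connections, so overlap decays with distance
    (roughly 0.3**distance), giving the similarity gradient described in the text.
    """
    n_conn = int(SPARSITY * N_BP)
    conn = np.zeros((N_TYPE, N_BP), dtype=bool)
    conn[0, rng.choice(N_BP, n_conn, replace=False)] = True
    for i in range(1, N_TYPE):
        prev = np.flatnonzero(conn[i - 1])
        shared = rng.choice(prev, int(OVERLAP * n_conn), replace=False)
        rest_pool = np.setdiff1d(np.arange(N_BP), shared)
        rest = rng.choice(rest_pool, n_conn - len(shared), replace=False)
        conn[i, np.concatenate([shared, rest])] = True
    return conn

L = type_connections()
# Shared connectivity between type node 1 and its neighbors falls off with distance.
for d in (1, 2, 3):
    print(d, np.mean(L[0] & L[d]))
```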
Presenting a stimulus
As a first step, the color feature of the stimulus is represented as an integer drawn from a uniform
distribution [1 360], and the location property of the stimulus is treated as categorical and
therefore assumes a value in an interval scale from [1 max(stimuli)]. For continuous variables,
the input needs to be converted into a type node representation such that each type node
corresponds to a range of 36 degrees, since there are 360 degrees and 10 type nodes. The two
type nodes with values closest chosen stimulus value are selected, one with a value below, and
one with a value above the stimulus value as shown below,
(
(
(
)
)
(
(
(
[3]
))
))
[4]
Where floor and ceil correspond to rounding down and rounding up respectively. Stimulus refers to a
feature value in the range (1 to 360). If the type node selection is 0, then that type node is set to 10
instead. If the stimulus value falls exactly on the value of a given type node (i.e. 36, 72, etc), then only
one type node is active with a value of 1.0. In the case of feature dimensions with discrete values instead
of continuous values, the model assigns one value per type node with activation values of 1.0.
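A small sketch of this conversion is given below. The linear split of activation between the two bracketing nodes is an assumption on my part (the text specifies which nodes are selected and the special cases, but not the exact weighting); node indices are zero-based in the code.

```python
import numpy as np

N_TYPE = 10
STEP = 360 / N_TYPE  # each type node covers 36 degrees

def stimulus_to_types(stimulus):
    """Convert a feature value (1-360) into a 10-node type layer activation.

    The two nodes bracketing the stimulus (floor and ceil of stimulus/36) are
    activated; here their activations are split by linear proximity, one simple
    assumption consistent with the graded activation shown in Figure 4.
    A selection of node 0 wraps to node 10, and exact multiples of 36 activate
    a single node at 1.0.
    """
    act = np.zeros(N_TYPE)
    below = int(np.floor(stimulus / STEP))
    above = int(np.ceil(stimulus / STEP))
    if below == above:                       # stimulus falls exactly on a node
        act[(below - 1) % N_TYPE] = 1.0      # node "0" is treated as node 10
        return act
    frac = (stimulus / STEP) - below         # distance from the lower node
    act[(below - 1) % N_TYPE] = 1.0 - frac
    act[(above - 1) % N_TYPE] = frac
    return act

print(stimulus_to_types(45))   # split between nodes 1 and 2
print(stimulus_to_types(72))   # exactly on node 2
```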
Encoding a stimulus
Information is encoded by projecting activity from the type and token layers to the binding pool
(Figure 4). These converging projections activate the subset of binding pool nodes that receive
input from each layer. Those binding pool nodes that have been activated store the link between
an item’s features and the specific instance.
B_{\beta} = \sum_{i=1}^{n} \sum_{j=1}^{n} T^{1}_{i}\, L_{i\beta}\, T^{2}_{j}\, M_{j\beta}\, N_{\tau\beta}\, A_{\tau}                [5]

where B_{\beta} is the activity of binding pool node \beta (the binding pool indexed by \beta), A_{\tau} is the currently active token's activation level, T^{1} is type layer 1, and T^{2} is type layer 2. L is the connection matrix between type layer 1 and the binding pool, and M represents the same matrix for the other set of type nodes. N is the connection matrix between the token layer and the binding pool with indices \tau (the currently active token) and \beta. n represents the number of type nodes in each type layer. After encoding, the activity of all of the neurons in the binding pool is normalized so that the total sum of binding pool activity is 1.0.
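Read literally, Equation [5] amounts to a gated accumulation followed by normalization. The sketch below implements that reading with random connection matrices (omitting the neighbor-overlap gradient described earlier); the layer sizes follow Table 1, while the function and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

N_TYPE, N_TOKENS, N_BP, SPARSITY = 10, 8, 800, 0.45

# Fixed random binary connection matrices (the overlap gradient is omitted here).
L = rng.random((N_TYPE, N_BP)) < SPARSITY    # type layer 1 -> binding pool
M = rng.random((N_TYPE, N_BP)) < SPARSITY    # type layer 2 -> binding pool
N = rng.random((N_TOKENS, N_BP)) < SPARSITY  # token layer  -> binding pool

def encode(bp, type1, type2, token_idx, token_act=1.0):
    """Add one item to the binding pool (a version of the reconstructed Equation 5).

    Binding pool nodes that receive convergent input from both type layers
    and the active token accumulate activity; the pool is then renormalized
    so total activity sums to 1.0, as described in the text.
    """
    drive = (type1 @ L) * (type2 @ M) * N[token_idx] * token_act
    bp = bp + drive
    return bp / bp.sum()

bp = np.zeros(N_BP)
color = np.zeros(N_TYPE); color[3] = 1.0        # a color type pattern (illustrative)
location = np.zeros(N_TYPE); location[0] = 1.0  # a categorical location type
bp = encode(bp, color, location, token_idx=0)
print(bp.sum(), (bp > 0).sum())
```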
Figure 4: In this illustration of encoding, the model is storing an item with a color and a location
feature. The color and location are first translated into type node representations. The white
arrows indicate the magnitude of activation during this conversion process (location, in this case,
is categorical, so that the type node is either on or off). Then, these types provide input into the
binding pool along with input from the active token. The binding pool nodes that receive
convergent input from both types and the token are activated (shown in dark blue). The black
lines indicate active connections to the binding pool. The gray lines indicate connections that
exist, but are not active.
Retrieving a token
Retrieving from the binding pool occurs by inputting a cue into the model (Figure 5a). The goal
of the cue is to reactivate the token that the cue is signaling. A cue is inputted into the model in
much the same way as a feature. In this example, a location cue is activated along the location
type node space. This dimension is projected into the binding pool, activating a subset that then
projects to the token layer. A gating system is presumed to be used to direct the flow of activity in the binding pool in the correct direction (Wyble, Bowman & Nieuwenstein, 2009). The activity at the level of the token layer is calculated with this equation:

Token_{k} = \sum_{\beta} \sum_{i=1}^{n} C_{i}\, L_{i\beta}\, B_{\beta}\, N_{k\beta}                [6]

where C is the pattern of activity produced by the retrieval cue in the cued type layer and k indexes the tokens.
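A sketch of this retrieval step, under the same simplified setup as the encoding sketch above, is shown below. It uses the difference rule described in the text with the 0.016 value from Table 1 (the table phrases the criterion as a ratio, so the exact rule here is an assumption), and the connection matrices are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
N_TYPE, N_TOKENS, N_BP, SPARSITY = 10, 8, 800, 0.45
L = rng.random((N_TYPE, N_BP)) < SPARSITY    # location types -> binding pool
M = rng.random((N_TYPE, N_BP)) < SPARSITY    # color types    -> binding pool
N = rng.random((N_TOKENS, N_BP)) < SPARSITY  # tokens         -> binding pool

# Store one item (color node 3 at location node 0) in token 0, then normalize.
bp = (np.eye(N_TYPE)[0] @ L) * (np.eye(N_TYPE)[3] @ M) * N[0]
bp = bp / bp.sum()

def retrieve_token(bp, cue, threshold=0.016):
    """Reactivate tokens from a location cue (a version of the reconstructed Equation 6).

    The cue gates the stored pool activity, which is summed within each token's
    connections. If the winning token does not clear the runner-up by the
    individuation threshold, retrieval fails and the model will guess.
    """
    token_act = N @ (bp * (cue @ L))
    best, second = np.sort(token_act)[-1], np.sort(token_act)[-2]
    if best - second < threshold:
        return None            # retrieval failure: the model guesses
    return int(np.argmax(token_act))

cue = np.eye(N_TYPE)[0]         # cue the location of the stored item
print(retrieve_token(bp, cue))  # expected: token 0
```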
Retrieving a type from a token and cue
If the difference between the highest and second highest token is above a specified threshold,
then the model recalls the highest token (Figure 5b). Next, the model recreates the probed feature
of the cued object. The active token then projects activity back to the binding pool, activating a
subset that projects to the probed type layer. The activity at the type layer is calculated with this equation:

T^{probe}_{j} = \sum_{\beta} \sum_{i=1}^{n} C_{i}\, L_{i\beta}\, N_{\tau\beta}\, B_{\beta}\, M_{j\beta}                [7]

where \tau is the retrieved token and j indexes the nodes of the probed type layer.
The activity at the probed type layer is the reconstructed trace of the original input in the form of
a population vector. If the model fails to differentiate the two highest tokens, then the model
simply guesses by selecting a random feature value from the feature space.
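The reconstruction of the probed feature can be sketched in the same style; again the connection matrices are random stand-ins and Equation [7] is taken in its reconstructed form, so this is an illustration rather than the model's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
N_TYPE, N_TOKENS, N_BP, SPARSITY = 10, 8, 800, 0.45
L = rng.random((N_TYPE, N_BP)) < SPARSITY    # location types -> binding pool
M = rng.random((N_TYPE, N_BP)) < SPARSITY    # color types    -> binding pool
N = rng.random((N_TOKENS, N_BP)) < SPARSITY  # tokens         -> binding pool

# One stored item: color node 3 at location node 0, bound to token 0.
bp = (np.eye(N_TYPE)[0] @ L) * (np.eye(N_TYPE)[3] @ M) * N[0]
bp = bp / bp.sum()

def retrieve_type(bp, cue, token_idx):
    """Reconstruct the probed (color) type layer (a version of the reconstructed Equation 7).

    Pool activity is gated by both the location cue and the retrieved token,
    and what remains is projected onto the color type nodes, yielding a
    population vector over the feature space.
    """
    gated = bp * (cue @ L) * N[token_idx]
    return M @ gated

cue = np.eye(N_TYPE)[0]
trace = retrieve_type(bp, cue, token_idx=0)
print(np.argmax(trace))  # expected: node 3, the encoded color
```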
Figure 5: A) In order to retrieve the probed feature from the cued feature, the model first
retrieves the associated token representation. A retrieval cue activates a type node, which
projects to the binding pool. The convergent input from these binding pool nodes reactivates
nodes in the token layer and, on average, the token that had been originally bound to this location
will receive the most activity. The retrieved token is then used to retrieve the probed feature. B) The retrieved token, in conjunction with the location cue (Figure 4), is then used to retrieve the
color feature bound to that token. The white arrows indicate the magnitude of retrieved activity
of the type layer in the form of a population vector. The mean of the population vector
corresponds to the retrieved feature (see Figure 6).
Retrieved type layer activity
Assuming a token was successfully retrieved, the type layer activity originally associated with
that token is retrieved. The retrieved population vector represents the reconstructed memory
trace of the originally inputted information associated with that token. The model can interpret
the population vector in two ways. If the probed type layer is of a continuous dimension (e.g.
Prinzmetal et al., 1998), then each type node is converted into a vector extending from the center of an imaginary circle, with a length equal to its activation value. These vectors are converted to Cartesian coordinates and then summed. The summed vector is converted back into polar
coordinates. This conversion results in a vector with an angular direction that corresponds to the
retrieved location in the continuum of possible responses and a length value that corresponds to
the retrieved confidence (Figure 6). If the probed type layer is categorical, the model uses a
similar mechanism used in token retrieval by comparing the highest and second highest
reconstructed nodes. If the difference exceeds a threshold, the model declares the highest node to
be the retrieved feature. Otherwise, the model guesses by randomly recalling a feature. Note that
none of the data that we explicitly simulate are categorical.
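One way to compute the reported value and the confidence from a retrieved type-layer trace is sketched below. The assignment of preferred color values to the nodes (36, 72, ..., 360 degrees) follows the earlier description; the rest is the standard circular vector average, and the function name is my own.

```python
import numpy as np

def decode_population_vector(type_activity):
    """Turn a retrieved type-layer trace into a response and a confidence.

    Each type node is treated as a unit direction on the color circle scaled by
    its activity; the vector sum's angle is the reported feature value and its
    length is read as retrieval confidence (see Figure 6).
    """
    n = len(type_activity)
    angles = 2 * np.pi * (np.arange(1, n + 1) / n)   # preferred value of each node
    x = np.sum(type_activity * np.cos(angles))
    y = np.sum(type_activity * np.sin(angles))
    reported_deg = np.rad2deg(np.arctan2(y, x)) % 360
    confidence = np.hypot(x, y)
    return reported_deg, confidence

# A sharply peaked trace yields high confidence; a flat trace yields near-zero confidence.
peaked = np.array([0, 0, .1, 1.0, .1, 0, 0, 0, 0, 0])
flat = np.full(10, 0.12)
print(decode_population_vector(peaked))
print(decode_population_vector(flat))
```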
Figure 6: Illustration of the properties of the retrieved mean vector. A.) The length of each arrow
around the color wheel corresponds to the activation level of a type node. The vectors are added
together to produce a composite vector that has a length and an angle. This figure illustrates the
deviation (\hat{\theta} - \theta) of the retrieved color (\hat{\theta}) from the color of the original stimulus (\theta). This angle is interpreted as the retrieved feature value, and the length of the vector (r) is interpreted as retrieval
confidence. B.) A scatterplot of the relationship between the length and deviation for trials of set
size 4. Note that there is a cluster near zero deviation that corresponds to a correct retrieval of a
token and the corresponding feature. There is also a more broadly distributed band of responses
at a lower confidence level, corresponding to trials in which a token was retrieved,
but the corresponding color value had not been encoded. And finally, at the very bottom of the
figure, the line of dots indicates cases in which token retrieval failed entirely, and the model
guessed randomly. In this plot, deviation values, which normally range from 0 to 180, have been
normalized to the range [0 1].
Storing and retrieving multiple items
When multiple items are presented in a display, it is assumed that the visual system assigns
tokens to each discrete item. Furthermore, the model assumes that stimuli are serially sampled
from the display in a random order, and encoded. The number of items sampled per trial varies to
reflect fluctuations in the attentional state of participants (Sims, Jacobs, & Knill, 2012). The
number of samples per trial is drawn from a uniform distribution [2 7], which represents a range
of possible attentional states. For example, if 4 items were presented, the odds of encoding all 4
items would be 50% because on a given trial the number of encoded items will be 2,3,4,5,6 or 7,
half of which are equal to or greater than 4. When multiple items are encoded, each projects
activity to the same binding pool. Because each binding pool node is connected to multiple
tokens, the retrieval of a single item will be affected by the activity of other items stored in the
binding pool. Therefore, interference results when multiple items are stored, and this interference
increases systematically with memory load (Figure 7).
Figure 7: Histograms of retrieved color value in degrees, centered at 0 degrees relative to the
color that was encoded, taken from the retrieved population mean. This, and each further
simulation (unless otherwise noted), was collected over 10,000 trials. Note that each retrieved
value is included in the distribution (e.g. set size 6 has 60,000 data points). From the top to the
bottom, the histograms indicate set sizes of 1, 2, 4, and 6.
Models are best used as tools for simulating behavior by fitting existing data and predicting
behavior in novel tasks (Chapter 4). The Binding Pool model is able to simulate both the recall of
a feature in a continuous report task and detection of change in a change detection task.
Simulations of both continuous report and change detection paradigms use the same encoding
and retrieval stages and the properties of the mean population vector. The model can simulate a
variety of manipulations on these tasks, such as the relationship between reporting ‘Change’ and
the magnitude of change, retrieval confidence in a continuous report task, and interference
caused by other items in memory.
Simulating continuous report
Simulations of a continuous report task use the location property of the mean vector in the form
of a degree value (Figure 8). The location property corresponds to the selected color value and
therefore the retrieved color of the model. These location values can be pooled over many trials
into a response distribution that shows the same degradation of precision as behavioral response
distributions. The retrieved location value can be further analyzed by using a mixture model.
Figure 8: Continuous report simulation. The model simulations were fitted to data from Bays et
al. (2009), in the condition with a stimulus presentation of 100ms. The RMSE value shown in the
middle panel is the average RMSE value for swapping and guessing errors. Across 20 simulated
subjects, the largest standard error value for standard deviation was .33 and .001 for swaps and
guesses.
Simulating change detection
Simulating change detection uses both properties of the mean vector in a two stage process
(Figure 9). These stages occur during the presentation of the test phase in a change detection
task. In the first stage, the model retrieves the color in the same way the model retrieves a color
in the continuous report task. If there are multiple colors in the test display as in a whole display
change detection task, then retrieval of each color is attempted whether or not that color was
encoded. In the second stage, the model determines whether or not the corresponding item is the
same as the item retrieved from memory. To do this, the model uses both properties of the
retrieved color in a dual threshold process. If both the retrieved length and deviation of the
retrieved location from the item in the probe display (whether changed or not changed) exceed their respective thresholds, then the model reports a change. Otherwise, the model does not report a change.
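The decision rule can be written directly from the Table 1 threshold formulas. In this sketch the deviation is assumed to be normalized to [0, 1], as in Figure 6; that assumption, and the function name, are mine rather than something stated explicitly for the change detection stage.

```python
def detect_change(length, deviation, set_size):
    """Dual-threshold change decision for one probe location.

    length    : length of the retrieved vector (confidence)
    deviation : difference between retrieved and probe value, normalized to [0, 1]
    Both thresholds scale with set size using the Table 1 parameters; a change is
    reported only when confidence and deviation both exceed their thresholds.
    """
    phi_s = 0.005 * set_size + 0.09   # length (confidence) threshold
    phi_d = 0.038 * set_size + 0.16   # deviation threshold
    return (length > phi_s) and (deviation > phi_d)

# A confident retrieval far from the probe value is called a change;
# a near-match retrieval is not.
print(detect_change(length=0.6, deviation=0.5, set_size=4))   # True  -> 'Change'
print(detect_change(length=0.6, deviation=0.05, set_size=4))  # False -> 'No change'
```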
Figure 9: Scatterplots show the relationship between the length of a retrieved vector (i.e.
confidence) and the deviation between the retrieved color and the actual color at each location in
the probe display. A) The left scatterplot for each pair indicates the response for an unchanged
item in the probe display, while the right plot indicates the response for a changed item. The
deviations are uniformly distributed for the changed item because the changed stimulus was
selected from a uniform distribution. The thresholds for deviation and length are indicated, and
green dots indicate correct responses (CR = correct rejection, FA = False Alarm). This example
is for set size 4. B) Scatterplots for set sizes 2, 4, 6, and 8. Note that as set size increases, the
confidence decreases, and the deviation increases. Also, the thresholds change systematically as
a function of set size according to the parameters.
Another dimension to the simulation of change detection is how the difference between the
original item and the changed item influences how likely the model is to report a change. This occurs naturally in the Binding Pool model because of the loss of precision as more items are
presented (Figure 10).
Figure 10: Change detection simulations. A) Proportion of hits versus false alarms of the
binding pool model compared with data from Keshvari et al. (2013). The RMSE value is the
average RMSE of the hit and false alarm rate. In this figure, the largest standard error for
simulations is .0071. This, and all further standard errors, are derived from the averaged output
of 15 simulation batches, each of which ran 750 trials. B) Proportion of reporting a ‘change’
given the difference in degrees between an item in the memory array and the probe display. The
largest standard error value for any simulated data point is .0268 (error bars omitted for clarity).
Simulating the ‘magnet’ effect
An ensemble in VWM is similar to an ensemble cast in film. When multiple actors and actresses
are equally represented in a production, then the production is said to have an ensemble cast. In
the case of VWM, if multiple stimuli are presented, then these stimuli might be encoded into a
single gist representation called an ensemble that represents a combination of the relevant
presented information (Brady & Alvarez, 2011). For example, if multiple red circles of varying
sizes were presented, then there may be an ensemble representation of a red circle at the mean
size. The simplest demonstration of an ensemble is the shifting, or pulling, of one representation
towards another representation called the ‘magnet’ effect (Huang & Sekuler, 2010).
In the Binding Pool model, an ensemble emerges naturally out of the distributed nature of the
binding pool, so that the model does not need a separate ensemble representation of the presented
stimuli. The Binding Pool model can systematically simulate this effect by presenting items fixed
at various distances along the response continuum and measuring how the median of the
response distribution shifts as a function of the separation between the items (Figure 11). This
effect occurs because of the overlap in connectivity between the type layers and binding pool.
Figure 11: A simulation of the interference effect originally demonstrated by Huang and Sekuler
(2010). Two items are stored in the binding pool, with the first item being fixed at 180 degrees
on a continuous scale and the second item presented at various distances (0 to 90 in increments
of 18 degrees) relative to it on the feature dimension. The y-axis is the median error of the
retrieved target item. Note the shift of the median of the first item's retrieved location towards the second item as a function of its distance.
Simulating confidence binning
Confidence measurements are an emergent property of reconstructing the memory trace. The
mean population vector has both an angular direction and length property. The length property
represents the variability in activity among the type layer after retrieval. If multiple type nodes
were reconstructed with a high activity, then the retrieved length value from the averaging
process would be small as it is pulled in multiple directions. Conversely, if the reconstructed type
layer had a single highly active node, then the length property of the averaged population vector
would be large. Therefore, we interpret the length property as retrieval confidence in the model
(Figure 12).
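Binning simulated responses by confidence, as in Figure 12, only requires splitting the retrieved vector lengths at their tertiles; in the short sketch below the beta-distributed lengths are synthetic stand-ins for model output.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative retrieved vector lengths from many simulated trials.
lengths = rng.beta(2, 5, size=9_000)

# Bin trials into low / medium / high confidence by tertiles of vector length,
# mirroring the analysis in Figure 12.
edges = np.quantile(lengths, [1/3, 2/3])
bins = np.digitize(lengths, edges)          # 0 = low, 1 = medium, 2 = high
print([int(np.sum(bins == b)) for b in range(3)])
```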
Figure 12: A simulation of a continuous report task with responses divided by simulated
confidence. Confidence levels are binned by the length of the retrieved vector into tertiles. As
found by Rademaker et al. (2012), high-confidence responses are rarely swaps or guesses. In this
figure, ‘con.’ refers to confidence.
Parameters were fit using a grid-based search algorithm. In this sampling process, an initial set of parameters is chosen that produces a moderate fit to both the continuous report and change detection behavioral data (the model was fit to data from Bays et al., 2009 and Keshvari et al., 2013). Next, a parameter is selected and changed to a new value drawn from a vector spanning a predefined range. The model then uses this new value and outputs the root mean square error (RMSE) of the fit between the model's simulation and the behavioral data. If the new fit is better than the old fit, then the model accepts the new value. Otherwise, the model continues to use the old value. The grid-based search stops once the RMSE value falls below a threshold or reaches a floor at which no new parameter value has decreased the RMSE after 100 iterations.
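The procedure described above is, in effect, a greedy coordinate search over predefined value grids. The sketch below captures that logic with a toy objective standing in for a full model simulation; the parameter names, grids, and stopping values are placeholders, not the ones actually used to fit the model.

```python
import numpy as np

rng = np.random.default_rng(4)

def fit_parameters(rmse_fn, params, grids, rmse_target=0.02, patience=100):
    """Greedy coordinate search over predefined parameter grids.

    rmse_fn : function mapping a parameter dict to an RMSE against the data
              (here a stand-in for running the full model simulation)
    params  : starting parameter values that already give a moderate fit
    grids   : dict of candidate values for each parameter
    Stops when the RMSE falls below rmse_target or when no candidate has
    improved the fit for `patience` consecutive attempts.
    """
    best = rmse_fn(params)
    stale = 0
    while best > rmse_target and stale < patience:
        name = rng.choice(list(grids))                 # pick a parameter at random
        candidate = dict(params)
        candidate[name] = rng.choice(grids[name])      # try a new grid value
        score = rmse_fn(candidate)
        if score < best:                               # keep improvements only
            params, best, stale = candidate, score, 0
        else:
            stale += 1
    return params, best

# Toy stand-in objective with a known optimum, just to exercise the search.
toy_rmse = lambda p: abs(p["bp_size"] - 800) / 1000 + abs(p["sparsity"] - 0.45)
grids = {"bp_size": np.arange(100, 1601, 100),
         "sparsity": np.arange(0.05, 1.0, 0.05)}
print(fit_parameters(toy_rmse, {"bp_size": 400, "sparsity": 0.25}, grids))
```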
Parameters for the simulation of the continuous report and change detection tasks are outlined in
Table 1. These parameters were not adjusted to simulate the predictions in the next chapter.
Binding Pool Size (value: 800). The number of available nodes for storing links; as the size increases, so does the representational fidelity of stored information.

Type Layer Size (value: 10). The number of type nodes in each feature space.

Connection Sparsity (value: 0.45). The proportion of active connections between each layer and the binding pool.

Type Connection Overlap (value: 0.30). The proportion of added overlap between neighboring type nodes.

Encoding Capacity (value: [2 7]). The number of stimuli encoded per trial is drawn from a uniform distribution over this range.

Token Individuation Threshold (value: 0.016). If the ratio of the maximum retrieved token activity to the average retrieved token activity does not exceed this threshold, then the model initiates a 'guess' by choosing an unbound token.

Length Threshold for Change Detection (value: ϕs = 0.005*Setsize + 0.09). Two parameters (baseline and set size slope) define the threshold for detecting a change based on the length of the retrieved vector (i.e., confidence).

Deviation Threshold (value: ϕd = 0.038*Setsize + 0.16). Two parameters define the threshold for detecting a change based on the deviation of the retrieved value from the probe value.

Table 1: List of parameters used by the Binding Pool model.
Chapter 4. Predictions
Models are best utilized as tools for explaining a behavior and predicting new behaviors, and the
ability to do both is a partial validation of the model as an accurate representation of the
mechanisms of the simulated system. In the case of the Binding Pool model, there are several
predictions that arise naturally out of the architecture of the model. The model predicts that
varying the similarity of the presented features improves precision and confidence, that the magnitude of the 'magnet' effect of two items on a third item is modulated by the relation between those two items, and lastly, that high-confidence retrieval errors could emerge given the right circumstances. The first two predictions rely on the ability of the model to store repetitions as two representations.
Inter-item similarity
The model makes clear predictions about how inter-item similarity affects the ability to retrieve
information and the confidence about that retrieval process. The role of similarity has been
explored in a variety of contexts, but it has seldom been used as a manipulation in recall based
VWM tasks. Previous research using change detection paradigms has demonstrated that
accuracy increases if the presented stimuli in both the study and probe display are drawn from a
similar set of colors (Lin & Luck, 2009; Sims, Knill, & Jacobs). The implication of these results
is that the quality of the memory representation increases as the similarity of the presented
stimuli increase. However, it has not been demonstrated how the precision of a representation is
affected when presented with two other items of varying similarity and whether or not presenting
more similar items will produce better or worse performance than a single item. The Binding
Pool model can predict data for these cases because of its ability to represent items of arbitrary
similarity.
In the Binding Pool model, similarity is defined as proximity in type space and is formalized in
two ways. Neighboring type nodes have overlapping projections to the binding pool and are
therefore likely to interact during retrieval. Also, the mean population vector is affected by the
proximity of type nodes. Proximal type nodes will pull the mean vector toward one another,
whereas distal type nodes will cause the mean vector to fall between the type nodes.
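To make the population-vector read-out concrete, the Matlab sketch below computes the angle and length of a retrieved vector from a set of type-node activations. The number of type nodes matches Table 1, but the activation values, the even spacing of preferred values around the feature space, and the absence of any normalization are illustrative simplifications rather than the model's exact read-out.

    % Sketch: angle and length of a retrieved population vector.
    nTypes     = 10;                              % matches the Type Layer Size in Table 1
    prefDeg    = (0:nTypes-1) * (360/nTypes);     % assumed preferred value of each type node
    activation = [0.05 0.10 0.60 0.80 0.30 0.05 0 0 0 0];   % hypothetical retrieved activity

    % Sum of unit vectors weighted by activation
    x = sum(activation .* cosd(prefDeg));
    y = sum(activation .* sind(prefDeg));

    reportedValue = mod(atan2d(y, x), 360);       % angle = retrieved feature value (deg)
    vectorLength  = hypot(x, y);                  % length = simulated confidence

When most of the activity falls on neighboring (proximal) type nodes, the resulting angle lands between their preferred values and the length stays large; spreading the same activity across distal nodes shortens the vector.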
To simulate variations in similarity, I varied the distance between four stimuli so that the
presented stimuli were spaced at regular intervals ranging from 0 degrees apart (i.e., four
repetitions) to 90 degrees apart (i.e., items are equally distal). Changing the distribution of
selected items violates the uniformity assumption of the mixture model. Therefore, the following
prediction uses the standard deviation of the entire response distribution instead of the mixture
model to parcel out correct trials from swaps and guesses.
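Because precision is summarized directly from the error distribution here, a brief Matlab sketch of that measure is given below. The response errors are placeholder data rather than model output, and the choice between a plain and a circular standard deviation is left open; for a circular feature space the circular version is the more natural of the two.

    % Sketch: precision as the spread of the full response-error distribution,
    % with no mixture-model fit. The errors below are placeholder data (deg).
    errorsDeg = 20*randn(1000,1);
    sdError   = std(errorsDeg);                  % plain standard deviation

    % Circular standard deviation for a circular feature space
    errorsRad = errorsDeg * pi/180;
    R         = abs(mean(exp(1i*errorsRad)));    % mean resultant length
    circSDdeg = sqrt(-2*log(R)) * 180/pi;        % circular SD in degrees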
The model predicts that the precision and retrieval confidence should increase as the other items
become more similar (Figure 13). This prediction is an outcome of the increased proportion of
overlapping connections between proximal type nodes. As type nodes become more proximal, it
is more likely that the model will accurately retrieve the correct type node. Interestingly, the
precision and retrieval confidence of 4 repetitions are still less than the precision and retrieval
confidence of a single item. This is because, when multiple items are presented, variability in
encoding may cause the target not to be encoded at all.
If behavioral experiments show that similarity does not increase precision and if repetitions are
as precise as a single item, then one implication might be that more complex type
representations need to be incorporated in the model. For example, there may be a single
ensemble representation of the 4 repetitions with the quality of a single representation of 1 item.
Figure 13: A) A simulation of response precision as a function of the similarity of stored stimuli.
The set size is held constant at four, and the stimuli are distributed in feature space at fixed
intervals from 0 to 90 degrees. As a baseline comparison, the green line represents response
precision when only a single item is encoded. Note that the precision of a repeated item is worse
than the precision of one encoded item, and this is due to the probability of encoding failure in
the repeated item case. B) Histograms of length values (i.e. simulated confidence) pooled over
multiple trials for various sets of item similarity with set size 4. Note that the model predicts that
confidence will be much greater for a single item than for 4 items, regardless of whether they are
similar or repeated.
Variations in the ‘magnet’ effect caused by ensembles
If a second item can distort the representation of a target (Huang & Sekuler, 2010), then how will
multiple non-targets affect the precision of a target? In the Binding Pool model, multiple items
interact in the form of overlapping connections in the binding pool, causing a loss of precision
and bias towards the other items. The Binding Pool model can simulate how multiple non-targets
influence a target by systematically varying the positions of the non-targets relative to the target. It
is important that the mean of the non-targets does not change for each variation to ensure that
any bias is the result of the relation between the non-targets and not their summed proximity to
the target.
To generate this prediction, the model fixes the value of the target, C, along a feature dimension
(e.g., luminance) and varies the relative positions of the non-targets, A and B, along the same
dimension. In this prediction, the mean value of A and B is held constant at a single degree
value. The result of interest is how C is pulled towards the mean of A and B as a function of the
distance between A and B. The model predicts that as A and B become more proximal to one
another, the magnet effect of A and B on C increases (Figure 14). Importantly, this effect is
strongest when A and B are repetitions, suggesting that repetitions are stored as two separate
representations. If this effect is not found, then it again suggests that the model needs to
incorporate more complex forms of representations, such as an ensemble for repetitions.
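The constraint that the mean of A and B stays fixed while only their separation changes can be written out directly; the feature values and separations in the Matlab sketch below are illustrative assumptions, and the bias on C would have to come from running the model itself.

    % Sketch: stimulus layout for the 'magnet' prediction. A and B are placed
    % symmetrically about a fixed mean so that only their separation varies,
    % never their average proximity to the target C. All values are illustrative.
    meanAB      = 0;              % fixed mean of the two non-targets (deg)
    targetC     = 60;             % fixed value of the target (deg)
    separations = [0 10 20 40];   % separation of A and B; 0 corresponds to a repetition

    for s = separations
        A = meanAB - s/2;
        B = meanAB + s/2;
        % The quantity of interest is how far the report of C is biased toward
        % meanAB as a function of s; obtaining that bias requires running the
        % Binding Pool model, which is not reproduced here.
        fprintf('separation %2d: A = %5.1f, B = %5.1f, C = %d\n', s, A, B, targetC);
    end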
Figure 14: Illustration of the simulated paradigm for the 'magnet' effect prediction. A) A, B, and
C represent stimulus features for three encoded objects on a continuous dimension such as spatial
frequency. The proximity of A and B in feature space is varied up to and including the case where
the two stimuli are identical. The effect of their interference on stimulus C is measured. The predicted
magnet effect on C is depicted in (B). Numbers 1 through 4 indicate which data points in B
correspond to scenarios depicted in A.
High-confidence swaps are possible
In the Binding Pool model, swaps can originate from two sources: retrieving a token bound to a
different cue and interference in reconstructing the type layer. In the latter case, the location cue
reactivates the correct token, but the retrieval of the color bound to that token is strongly pulled
towards another feature value. This is a simulation of a low or medium confidence retrieval
error, as found by Rademaker et al. (2012). The model also predicts that occasionally an
incorrect token may be activated by the location cue, prompting the retrieval of a color bound to
a different location and producing a confident but erroneous report. However, this prediction is
contrary to evidence from Rademaker et al. (2012), who found no high-confidence retrieval
errors. The model predicts that these high-confidence retrieval errors can be elicited if stimuli are
presented at closer locations. As two stimuli increase in proximity, so does the overlap of their
projections to the binding pool, and thereby the odds of an incorrect token being retrieved. If this
prediction is not borne out, then high-confidence
retrieval errors may arise out of higher forms of cognition in other tasks, and not as a function of
the proximity of stimuli.
Impact of the predictions
Two theoretical implications arise out of these predictions: the encoding and storage of
repetitions, and the existence of distinct forms of retrieval errors.
In the Binding Pool model, repetitions can be encoded separately by attaching a different token
to each copy of the repetition. This form of storage enables the model to retrieve either
repetition, as demonstrated in the repetition blindness literature (Kanwisher, 1987). Therefore,
repetitions are stored and encoded like other presented items, unlike other models, which may
encode repetitions as a single representation in the form of an ensemble (Brady & Tenenbaum,
2013) or as increased activity in a bump attractor along a feature space (Wei, Wang, & Wang,
2012). If the inter-item similarity and 'magnet' effect predictions fail to be demonstrated, then the Binding Pool model will need to be
adapted in order to incorporate more complex forms of representations, such as hierarchical type
layers.
Retrieval errors in the Binding Pool model can occur in two distinct ways. The first type occurs
during reconstruction of the probed type. Non-targets in memory may have
overlapping connections with the target, and in some cases, cause the target representation to be
pulled strongly towards a non-target representation. The second type of retrieval error occurs
when the incorrect token is retrieved after the presentation of the cue. The difference between
these retrieval errors is in the magnitude of the length property of the mean population vector. If
the token is correctly retrieved but the representation is pulled towards a non-target, then the
length value will be small. On the other hand, if the token is incorrectly retrieved, then the length
value will be as large as that of a correctly retrieved item, resulting in a high-confidence retrieval
error. However, if the percentage of high-confidence retrieval errors cannot be modulated by
increasing the probability of an incorrect token being retrieved, then a more complex mechanism
may be needed to simulate high-confidence retrieval errors.
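A minimal Matlab sketch of how these two error types could be separated in simulated data is given below; the cut-off value and the example trial are illustrative assumptions rather than parameters of the model.

    % Sketch: classifying a simulated retrieval error by the length of the
    % retrieved population vector. The cut-off and example values are illustrative.
    lengthCutoff = 0.5;   % hypothetical boundary between low and high confidence
    trialLength  = 0.9;   % length of the retrieved vector on an example error trial

    if trialLength >= lengthCutoff
        errorType = 'high-confidence error (incorrect token retrieved)';
    else
        errorType = 'low-confidence error (representation pulled toward a non-target)';
    end
    disp(errorType)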
Conclusion
The theoretical landscape of the VWM field was substantially changed when Wilken and Ma
(2004) used a continuous report task to measure the quality of a memory representation. The
results from this and subsequent studies elucidated the limitations of discrete-item, or slot-based,
models of working memory and favored models with a limited pool of resources. However,
resource-based models are underspecified about how stored information is indexed in memory
and how specific objects are made content-addressable so that specific features of an object can
be retrieved. The
Binding Pool model is a neural implementation of a resource theory that utilizes mechanisms
derived from previous computational work to solve the problem of indexing a resource pool.
The Binding Pool model stores information as links between an item’s features (types) and the
instance of that item (token) in a distributed binding pool. Multiple items can be maintained
within this network, but as the number of items increases, so does the interference in the binding
pool. This interference is an outcome of overlapping connections in the binding pool and results
in graded recall precision when simulating continuous report paradigms. Recognition tasks,
such as a change detection task, are simulated with a dual threshold that the model uses to
determine if a retrieved item is different from the item in the probe display. The Binding Pool
model can simulate and predict a variety of VWM behaviors. Specifically, the model can
simulate how the magnitude of change in a change detection task influences the probability of
reporting a change. The model is able to account for how retrieval confidence influences
precision and the proportions of errors. Also, the model can accurately capture how
representations can be influenced by other items in memory. Along with these behaviors, the
model predicts that increasing similarity will improve precision, that a repetition pulls a third item
more strongly than two proximal colors do, and that high-confidence retrieval errors can occur for proximal
item locations. The Binding Pool model is a computational formalization of a resource model
that can account for a variety of visual working memory behaviors.
The impact of the Binding Pool model on the field of VWM is to combine principles from both
slot and resource perspectives into a formalized computational model that also provides a
mechanism for indexing stored representations. The model can simulate a variety of effects and is
not limited to the effects described in this thesis. Future work on the Binding Pool model will
focus on temporal encoding of stimuli and the ability to store more complex representations by
utilizing either hierarchical type encoding or token co-activation.
Chapter 5. Model GUI
The model was designed in Matlab and can be downloaded from
http://www.bradwyble.com/research/models/bindingpool. Along with the necessary scripts for
executing the model, I have included an interactive graphical user interface (GUI). This GUI is
designed to reproduce the simulations shown in the figures of this thesis and to allow users to
design their own simulations. Please see the ReadMe.txt file located in the directory for directions.
Bibliography
Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is set both by
visual information load and by number of objects. Psychological Science, 15(2), 106-111.
Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control
processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation
(Vol. 2, pp. 89-195). New York: Academic Press.
Awh, E., Barton, B., & Vogel, E. K. (2007). Visual working memory represents a fixed number
of items regardless of complexity. Psychological Science, 18(7), 622-628.
Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H. Bower (Ed.), Recent
advances in learning and motivation (Vol. 8, pp. 47-90). New York: Academic Press.
Bays, P. M., Catalao, R. F. G., & Husain, M. (2009). The precision of visual working memory is
set by allocation of a shared resource. Journal of Vision, 9(10), 1-11.
Bays, P. M., & Husain, M. (2008). Dynamic shifts of limited working memory resources in
human vision. Science, 321(8), 851-854.
Bowman, H., & Wyble, B. (2007). The simultaneous type, serial token model of temporal
attention and working memory. Psychological Review, 114(1), 38-70.
Brady, T. F., & Alvarez, G. A. (2011). Hierarchical encoding in visual working memory:
Ensemble statistics bias memory for individual items. Psychological Science, 22(3), 384-392.
Brady, T. F., & Tenenbaum, J. B. (2013). A probabilistic model of visual working memory:
Incorporating higher order regularities into working memory capacity estimates.
Psychological Review, 120(1), 85-109.
Cohen, M. R., & Maunsell, J. H. R. (2010). A neuronal population measure of attention predicts
behavioral performance on individual trials. Journal of Neuroscience, 30, 15241-15253.
Conway, A. R. A., Kane, M. J., & Engle, R. W. (2003). Working memory capacity and its relation
to general intelligence. Trends in Cognitive Sciences, 7, 547-552.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental
storage capacity. Behavioral and Brain Sciences, 24, 87-185.
Daneman, M., & Merikle, P. M. (1996). Working memory and language comprehension: A
meta-analysis. Psychonomic Bulletin & Review, 3, 422-433.
Fougnie, D., Suchow, J. W., & Alvarez, G. A. (2012). Variability in the quality of working
memory. Nature Communications, 3, 1229.
Gao, Z., Li, J., Yin, J., & Shen, M. (2010). Dissociated mechanisms of extracting perceptual
information into visual working memory. PLoS ONE, 5(12), 1-15.
Huang, J., & Sekuler, R. (2010). Distortions in recall from visual memory: Two classes of
attractors at work. Journal of Vision, 10, 1-27.
Kanwisher, N. G. (1987). Repetition blindness: Type recognition without token individuation.
Cognition, 27, 117-143.
Keshvari, S., van den Berg, R., & Ma, W. J. (2013). No evidence for an item limit in change
detection. PLoS Computational Biology, 9(2), e1002927.
Lin, P., & Luck, S. J. (2009). The influence of similarity on visual working memory
representations. Visual Cognition, 17(3), 1-15.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and
conjunctions. Nature, 390(6657), 279-281.
Murdock, B. B. (1993). TODAM2: A model for the storage and retrieval of item, associative,
and serial-order information. Psychological Review, 100(2), 183-203.
Oberauer, K., & Eichenberger, S. (2013). Visual working memory declines when more features
must be remembered for each object. Memory & Cognition, May 2013.
Oberauer, K., Lewandowsky, S., Farrell, S., Jarrold, C., & Greaves, M. (2012). Modeling
working memory: An interference model of complex span. Psychonomic Bulletin & Review,
19(5), 779-819.
Postman, L. (1972). Transfer, interference, and forgetting. In J. W. Kling & L. A. Riggs (Eds.),
Woodworth & Schlosberg's experimental psychology (3rd ed., Vol. 2, pp. 1019-1132). New
York: Holt, Rinehart & Winston.
Prinzmetal, W., Amiri, H., Allen, K., & Edwards, T. (1998). Phenomenology of attention: 1.
Color, location, orientation, and spatial frequency. Journal of Experimental Psychology:
Human Perception and Performance, 24(1), 261-282.
Rademaker, R. L., Tredway, C. H., & Tong, F. (2012). Introspective judgments predict the
precision and likelihood of successful maintenance of visual working memory. Journal of
Vision, 12(13), 1-13.
Raaijmakers, J. G. W., & Shiffrin, R. M. (1982). Search of associative memory. Psychological
Review, 88(2), 93-134.
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and
organization in the brain. Psychological Review, 65(6), 386-408.
Sims, C. R., Jacobs, R. A., & Knill, D. C. (2012). An ideal observer analysis of visual working
memory. Psychological Review, 119(4), 807-830.
Standing, L. (1973). Learning 10,000 pictures. The Quarterly Journal of Experimental
Psychology, 19(2), 73-74.
Tulving, E. (1962). Subjective organization in free recall of "unrelated" words. Psychological
Review, 69, 344-354.
van den Berg, R., Shin, H., Chou, W., George, R., & Ma, W. J. (2012). Variability in encoding
precision accounts for visual short-term memory limitations. Proceedings of the National
Academy of Sciences of the United States of America, 109(22), 8780-8785.
Wei, W., Wang, X. J., & Wang, D. (2012). From distributed resources to limited slots in
multiple-item working memory: A spiking network model with normalization. Journal of
Neuroscience, 32, 11228-11240.
Wilken, P., & Ma, W. J. (2004). A detection theory account of change detection. Journal of
Vision, 4, 1120-1135.
Wyble, B., Bowman, H., & Nieuwenstein, M. (2009). The attentional blink provides episodic
distinctiveness: Sparing at a cost. Journal of Experimental Psychology: Human Perception
and Performance, 35(3), 787-807.
Zhang, W., & Luck, S. J. (2008). Discrete fixed-resolution representations in visual working
memory. Nature, 452(7192), 233-235.