ARTICLE Received 22 Feb 2012 | Accepted 31 Oct 2012 | Published 27 Nov 2012 DOI: 10.1038/ncomms2237 Variability in the quality of visual working memory Daryl Fougnie, Jordan W. Suchow & George A. Alvarez Working memory is a mental storage system that keeps task-relevant information accessible for a brief span of time, and it is strikingly limited. Its limits differ substantially across people but are assumed to be fixed for a given person. Here we show that there is substantial variability in the quality of working memory representations within an individual. This variability can be explained neither by fluctuations in attention or arousal over time, nor by uneven distribution of a limited mental commodity. Variability of this sort is inconsistent with the assumptions of the standard cognitive models of working memory capacity, including both slot- and resource-based models, and so we propose a new framework for understanding the limitations of working memory: a stochastic process of degradation that plays out independently across memories. Department of Psychology, Harvard University, William James Hall, 33 Kirkland Street, Cambridge, Massachusetts 02138, USA. Correspondence and requests for materials should be addressed to D.F. (email: [email protected]). NATURE COMMUNICATIONS | 3:1229 | DOI: 10.1038/ncomms2237 | www.nature.com/naturecommunications & 2012 Macmillan Publishers Limited. All rights reserved. 1 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms2237 Here, using variants of the behavioral task outlined, in Figure 1 we show that there is variability in the quality of working memory within an individual. This variability can be explained neither by fluctuations in attention or arousal over time, nor by uneven distribution of a limited mental commodity. We propose that a complete understanding of the capacity and architecture of working memory must consider both the allocation of a mental commodity and the stability of the information that is stored. Results Revealing variability in the quality of working memory. Participants were asked to perform a working memory task using displays with 1, 3 or 5 colourful dots (Fig. 1). The standard fixedprecision model1 and our variable-precision model was fit to each participant’s distribution of errors (see Methods). Figure 3 overlays the maximum a posteriori fixed- and variable-precision models on a histogram of errors made by one of the participants, for each of three set sizes: 1, 3 and 5. The variable-precision model provided a better fit to the data for each participant at each set size. When the two models were compared using the Akaike Information Criterion (AIC), a measure of goodness of fit, we found lower AIC values for the variable-precision model for each Probability a –180 –120 –60 0 Error (deg) 60 120 180 b Probability W orking memory is critical to navigation, communication, problem solving and other activities but there are stark limits to how much we can remember1–9. A goal of research on working memory is to describe these limitations and determine their source. Cognitive models typically explain the capacity of working memory by postulating a mental commodity that is divided among the items or events to be remembered1,3,5,9–12. A core assumption of these models is that the quality (‘fidelity’, ‘uncertainty’) of memories is determined solely by the amount of the commodity that is allocated to each item, being otherwise fixed1,5,10–13. Items receiving more of the commodity are better remembered. A common approach to measuring the capacity of visual working memory is through the use of simple behavioral tasks. In one such task, participants are asked to remember the colours of a set of colourful dots, and then, after some delay, to report the colour of a dot selected at random (Fig. 1)1,9. Error is recorded as the difference between the correct colour and the reported colour. The distribution of these errors is used to test models of memory. Standard models1,5,10–13 assume that for each dot in the display, the participant is in one of two states, either remembering or forgetting. Importantly, these two states lead to different errors when the participant is asked to report the dot’s colour: errors for a forgotten item are uniformly distributed on the colour wheel, whereas errors for a remembered item cluster around the correct colour, with a spread that depends on how well it was remembered. The standard models assume that the form of the error distribution for remembered items is a circular normal (von Mises) distribution, with a spread that can be fully characterized by a single number: its precision, here defined as the s.d. (ref. 1). It is possible to relax the assumption that the quality of memory is fixed, instead supposing that it varies within an individual. To determine whether such variability is present, we can inspect the shape of the error distribution. Critically, variability in the precision of memory within an individual leads to an error distribution that is more peaked than a circular normal distribution because it involves mixing error distributions that differ in precision but have the same mean (Fig. 2). With sufficient data, it is possible to measure this peakedness to estimate the extent of high-order variability in the quality of memory within an individual. 0 6 12 18 24 30 Precision (deg) 36 42 48 c Probability Fixed precision Variable precision –180 Response screen 900 ms retention 600 ms encoding Figure 1 | Timeline of a working memory task. First, a display of colourful dots is briefly presented (lower left). This is followed by an interval during which the participant is asked to hold the colours in mind. Finally, a response screen appears. On the response screen, the position of a randomly sampled item is cued and the participant is asked to select the colour of the item that was presented at that location (upper right). 2 –120 –60 0 Error (deg) 60 120 180 Figure 2 | Fixed and variable-precision models produce different distributions of error. (a) Each curve shows the distribution of precision (the s.d. of error on the memory task) for a different fixed-precision model. (b) A variable-precision model can be thought of as a higher-order distribution of fixed-precision models. One such higher-order distribution is shown here; it is a truncated normal with a mean of 241 and a s.d. of 71, bounded at 01 and 1001. (c) Variable-precision models have different distributions of error. The green line is the distribution of errors for a model whose precision is fixed at 241. (The green distribution in panels a and c are the same.) The red line is the expected distribution of errors if precision were drawn from the full distribution in panel b. Mixing together distributions with different precision produces a peaked shape that is a hallmark of the variable-precision model. NATURE COMMUNICATIONS | 3:1229 | DOI: 10.1038/ncomms2237 | www.nature.com/naturecommunications & 2012 Macmillan Publishers Limited. All rights reserved. ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms2237 participant at each set size (Supplementary Table S1); the average magnitude of these AIC differences (7, 11.3 and 12) is a decisive victory for the variable-precision model, leaving the fixedprecision model with ‘essentially no support’ from the data14. Furthermore, the amount of variation in memory precision increased with set size for each participant (Fig. 4). The average precision across subjects was 11.61, 17.41 and 23.71 for set sizes 1, 3 and 5, respectively. The average s.d. of precision, a measure of variability in memory quality, was 3.01, 5.41 and 8.91. These results were obtained with a variable-precision model in which the variability in precision was assumed to follow a truncated normal distribution. Evidence of variable quality was also found when substituting a gamma distribution over the inverse of the variance, similar to ref. 15 (average inverse variance 0.0091, a Frequency Fixed precision Variable precision –180 1 item –120 –60 0 Error (deg) 60 120 180 Frequency b –180 3 items –120 –60 0 Error (deg) 60 120 180 Frequency c –180 5 items –120 –60 0 Error (deg) 60 120 180 Figure 3 | Distinguishing between models with fixed and variable precision. Each panel shows a histogram of errors on the working memory task and the best-fit fixed precision (green line) and variable-precision (red line) model. The working memory task used 1, 3 or 5 items (panels a, b and c, respectively). These data are from a representative participant. b c Probability Probability 1 item 3 items 5 items Determining the source of the variability. There are many possible sources of variability, including both true variability in memory precision as well as variability from other sources, such as perceptual noise, eye movements and differences in the memorability of certain colours or locations. A series of control experiments were performed that demonstrate that these other sources made only a negligible contribution to the observed variability in memory quality. When the memory demands of the task were removed by keeping the stimulus present while the participant responded, variability was eliminated and the data were better fit by a fixed-precision model (Supplementary Table S2). Even when the average error on the perception task was increased to match that of the memory task (Supplementary Methods), variability was still smaller (2.21 versus 8.91, t(4) ¼ 3.86, Po0.05, independent samples t-test). A further experiment showed that there were negligible differences in the memorability of different colours (Supplementary Methods). We measured the variation in precision for six colour bins, each spanning 301 of the colour wheel (0–301, 60–901, y, 300–3301). Even though the range of colours within each bin was now only a small subset of the colour wheel, we still found substantial variability within a bin (Supplementary Fig. S1). In fact, the variability within a bin was comparable to that in the full data set (Supplementary Fig. S1c, one-sample t-tests, all t(5)’s o0.5, all P40.65 for both participants). Furthermore, the variation in estimates of precision across bins was minimal and accounts for at most a small (3.9%) portion of the total variance. Similarly, we tested whether performance differed depending on the location of the tested item, a possible contributor to the observed variability in memory. For both the three- and five-item displays, we found little variance in the estimated precision across locations (4.15 deg2 for three-item displays, 3.24 deg2 for five-item displays). (This is an upper bound on the contribution to variance because it includes measurement error.) Furthermore, the observed variance explains only 12.7% and 3.8% of the total variance for the three-item and five-item conditions, respectively. Representational quality could differ for items that are fixated compared with items that are not. The encoding duration used in Experiment 1 was long enough to allow for a saccade, therefore raising the possibility that variability in precision is due entirely to imbalances in encoding introduced by eye movements. We found that memories are variably precise even when eye movements are not possible. Two participants were asked to remember the colour of three items presented for 100 ms. Although this encoding duration is too short for an eye movement, we nonetheless found Probability a 0.0051, 0.0031 deg ! 2; average s.d. of inverse variance, 2.3 " 10 ! 5, 1.1 " 10 ! 5, 9.0 " 10 ! 6 deg ! 2). S1 0 20 40 60 Precision (deg) 80 S2 0 20 40 60 80 S3 0 Precision (deg) 20 40 60 80 Precision (deg) Figure 4 | Variability in the quality of working memory depends on the load. The estimated distribution of precision according to the best-fitting variableprecision model for set sizes 1 (red), 3 (green) and 5 (blue), for each of three subjects (panels a–c). The bars along the horizontal axis show the precision values of the best-fit fixed-precision models. NATURE COMMUNICATIONS | 3:1229 | DOI: 10.1038/ncomms2237 | www.nature.com/naturecommunications & 2012 Macmillan Publishers Limited. All rights reserved. 3 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms2237 4 Probability Random probe –180 –120 –60 0 60 120 180 Error (deg) b Probability Observed Choose the best –180 –120 –60 0 Error (deg) 60 c 120 180 Simulated Choose the best Probability There is variability between items within a trial. In the following experiments, we show that this variability can be explained neither by state-based fluctuations in attention or arousal across trials16, nor by the uneven allocation of a mental commodity within a trial. This pattern of results is not easily captured by models of working memory in which the quality of a memory is determined solely by the allocation of a finite commodity. In a new experiment, on half of the trials participants were given the opportunity to report the colour of the item they remembered best. On the other half of the trials, they reported the colour of an item that was selected for them at random. These two conditions were intermixed in random order. Note that only variability within a trial can contribute to the participant’s decision in this task. Therefore, if this variability is accessible to the participant, precision will be better for the best-remembered item than for one selected at random. This enables us to measure the extent to which the quality of memory varies within versus across trials. When probed on a random item, the mean precision was 19.9±1.21 (s.e.m.) and its s.d. was 6.2±0.601 (Fig. 5a). However, when given the opportunity to choose which item to report, the mean precision was better (15.4±0.751, paired samples t-test, t(9) ¼ 5.3, Po0.001) and less variable (4.4±0.381, t(9) ¼ 3.4, P ¼ 0.01) (Fig. 5b). This outcome provides an existence proof of within-trial variability and implies that it is cognitively present and useful. But how much of the variation is contained within a trial, and how much of it is across trials? To answer this question, we ran simulations under conditions of purely within- and purely acrosstrial variability (Fig. 5c; see Methods). We did not find a difference between the observed distribution and the simulated distribution for purely within-trial variation, either in mean (15.41 observed versus 15.5±0.751 simulated; paired samples t-test, t(9) ¼ 0.05, P40.85) or s.d. (4.41 observed versus 5.01±0.411 simulated; t(9) ¼ 1.21, P40.25) (Fig. 5d). Nor did we find a difference in the mean or s.d. of precision between the trials in which a random item was probed and trials in another control experiment, run with ten new participants, that did not include a ‘choose the best’ condition (19.91 versus 22.1±1.181, independent samples t-tests, t(18) ¼ 0.91, P ¼ 0.38; 6.21 versus 6.0±0.741, t(18) ¼ 0.74, P ¼ 0.48). This suggests that giving the participant the opportunity to choose the best-remembered item on some trials did not alter how items were encoded on the other trials. In showing that nearly all of the variability in memory quality is accessible to the participant on a single trial, these results rule out a sizeable contribution of across-trial variability in the present study and suggest near-perfect metamemory for the relative quality of working memory representations. These findings also speak to an alternative account of the results of the first a –180 –120 –60 0 Error (deg) d 60 120 180 Random probe Observed choose the best Simulated choose the best Probability variability in the precision of working memory. Both participants’ data were better fit by the variable- than fixed-precision model (average difference in AICc values of 4.0 in favour of the variableprecision model). The previous manipulation eliminated the possibility of eye movements during stimulus presentation, but it does not address the possibility that gaze was not centered on fixation. We performed an additional follow-up experiment in which stimulus presentation was conditional on 1,000 ms of continuous, uninterrupted fixation (Supplementary Methods). The data for two out of the three participants were better fit by the variablethan fixed-precision model, with an average difference in AICc of 6.7 in favour of the variable-precision model. 0 20 40 60 80 100 Precision (deg) Figure 5 | Choosing the item that was remembered best. (a) When participants were asked to report an item selected at random, they made errors. (b) When allowed to report the item they remembered best, they performed better, seen as a narrower distribution of error. The ability to pick out the best-remembered item is evidence that the items differed in how well they were remembered. (c) Using the distribution in panel a, we simulated the distribution of errors that we would expect to find in panel b if participants had perfect metamemory, always picking out the best-remembered item. (d) The observed and simulated distributions of memory quality match, suggesting that participants have nearly perfect metamemory. experiment, where we found that the distribution of response error was more peaked than is predicted by standard fixedprecision models1,5,10–13. Is it possible that the distribution’s peaked shape arose not because of variability in precision, but because of a fixed-precision error distribution that is naturally more peaked than a circular normal distribution? The results of the current experiment make this alternative account unlikely because nearly all of the peakedness in the distribution of report error is accounted for by variability across items within a trial, suggesting that the peakedness of the error distribution reflects variability across items and leaving little reason to posit an additional source. Variability in memory quality is independent across items. Many theories hold that the quality of a memory is determined by the allocation of a finite commodity, where items receiving more NATURE COMMUNICATIONS | 3:1229 | DOI: 10.1038/ncomms2237 | www.nature.com/naturecommunications & 2012 Macmillan Publishers Limited. All rights reserved. ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms2237 Second report, sorted by first Third report, sorted by first Second report, sorted by third Third report, sorted by second Above median error Below median error Frequency Frequency First report, sorted by second −180 −120 −60 0 60 Error (deg) First report, sorted by third 120 180 Figure 6 | Evidence that memory degradation is independent across items. Distributions of report error for the first, second and third responses sorted by whether the response to another item in the trial was below (black) or above (gray, flipped) the median error. Memory quality does not differ between the two, which is evidence of memory degradation that is independent across items. of the commodity are better remembered 1,3,5,9,12. These theories predict that some of the within-trial variation in precision can be accounted for by uneven allocation: when more of the commodity has been allocated to one item, less is available for the others, thereby producing tradeoffs between items within a display. To determine whether the variability we observed was due to uneven commodity allocation, we performed an experiment in which participants were asked to report the colour of every item in the display. We then tested for tradeoffs by comparing the errors made for one item when the response to another item was good versus bad (that is, above versus below the median absolute error for that response across trials) (Fig. 6). We found no difference in precision between above-median and below-median splits for any combination of items, regardless of whether the estimates of precision were derived from the fixed-precision model (Fig. 7a) (paired samples t-tests, all t(15)’so1.6 and all P40.1) or the variable-precision model (Fig. 7b) (all t(15)’so1.2 and all P40.25). This evidence of independence is based on a null result. However, using Monte Carlo simulations we found that our tradeoff-detection method is powerful enough to reliably detect tradeoffs if they were to exist (Supplementary Notes and Methods). We considered two sources of tradeoffs: (1) noiseless uneven allocation, where the participant knowingly attends or gives preferential processing to a chosen item, and (2) noisy even allocation, where the participant tries to distribute her resources evenly, but fails, unwittingly giving preference to some items over others. We found that any unevenness strong enough to produce the observed variability in precision would have been readily detected through our procedure for detecting tradeoffs (Supplementary Notes and Methods). Finding independence of precision for items within a display17,18 rules out that the observed variability in precision reflects unevenness in the allocation of discrete slots1 or a graded resource5. Discussion Visual working memory is not perfect. Its imperfections have been used to fashion cognitive models of visual working memory in which the quality of a memory is determined solely by the amount of a mental commodity that it receives. In contrast to the predictions of these models, the present results show that even under conditions of even allocation, the quality of working memory varies across trials and items. This finding is consistent with a stochastic process that degrades memory for each individual item and plays out independently across them. The outcome of this process leaves each memory in its own state, with its own precision, which when combined across items, produces the characteristic peaked shape of report-error distributions. Thus, visual working memory is stochastic both at the level of a memory’s content and at the level of its quality. A physiological mechanism that might account for variability in the quality of working memory representations is cortical noise that impedes the sustained activation of the neural populations that code for the remembered information5,12,19–23. The precision of memory representations may correspond to the amount of signal drift over time and the extent to which noisy signals are pooled within a neural population1,5,9,20,24–26. The recruitment of larger neural populations will improve the precision of working memory representations through decreased signal drift or increased redundancy in coding. Critically, the probability that neurons within a population will self-sustain after stimulus offset is known to be affected by internal noise20–22. Under these circumstances, the precision of memory representations would be determined by the outcome of stochastic physiological processes operating independently across representations. Existing models explain the source of limitations in working memory solely by reference to a finite capacity for storing information. Here, we have shown that a complete account of visual working memory must also consider the stability of stored information. To put forward one model capable of producing variability of the sort described here, consider a finite mental commodity that must be divided among the items to be remembered. As in existing models, this information limit can be formalized as a set of N independent samples that are allocated to the items being remembered1,9–12, with slot models setting N to B3, and with continuous resource models considering the limit as N tends to infinity. Suppose that the samples are unstable and that each has some probability of surviving until the time of the test. This leads to variability in memory quality of a specific form: a binomial distribution of samples per item. This random process, known as the pure death process in studies of population NATURE COMMUNICATIONS | 3:1229 | DOI: 10.1038/ncomms2237 | www.nature.com/naturecommunications & 2012 Macmillan Publishers Limited. All rights reserved. 5 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms2237 a Fixed precision model Probability First report, sorted by second Third report, sorted by first Second report, sorted by first Above median error Below median error s.d. 21.3 ± 3.5° s.d. 19.7 ± 3.2° −180 −120 −60 0 60 Error (deg) 120 s.d. 23.2 ± 2.8° s.d. 26.9 ± 3.5° s.d. 22.5 ± 3.2° s.d. 23.6 ± 3.3° 180 First report, sorted by third Third report, sorted by second Second report, sorted by third s.d. 20.2 ± 4.2° s.d. 22.0 ± 2.9° b s.d. 23.1 ± 3.4° s.d. 23.4 ± 2.9° s.d. 23.7 ± 3.7° s.d. 24.0 ± 4.3° Variable precision model Probability First report, sorted by second Third report, sorted by first Second report, sorted by first Above median error Below median error s.d. 22.9 ± 5.0° s.d. 23.5 ± 6.1° s.d. 21.4 ± 4.8° s.d. 19.5 ± 6.3° −180 −120 −60 0 60 Error (deg) 120 s.d. 23.0 ± 6.0° s.d. 26.2 ± 6.9° 180 First report, sorted by third Second report, sorted by third s.d. 20.6 ± 6.7° s.d. 23.0 ± 5.7° Third report, sorted by second s.d. 22.5 ± 2.5° s.d. 23.5 ± 6.5° s.d. 24.3 ± 5.0° s.d. 26.0 ± 5.2° Figure 7 | Model fits for distributions of report error. (a) We separately fit a fixed-precision model to the data from each participant, response (first, second or third), sorting and split (above versus below the median error). The best-fit s.d. did not differ between the splits, which means that memory quality was comparable. (b) We did the same using the variable-precision model and found the same result: evidence of memory degradation that is independent across items. genetics, is one of many possible random processes that might degrade memory. Our model includes a guess state, a catch-all for trials in which the participant guesses randomly, either because of lapses in memory or because of other hiccups, such as blinks and slips of the hand. A recent paper15 has suggested that allowing for variability in the quality of working memory obviates the need to include a guess state. Though we agree that there is risk of overestimating the rate of guessing if variability is ignored, we found that even taking variability into account, a guess state still accounts for a sizeable proportion of trials (Supplementary Notes and Methods). Moreover, the goal of the present work is not to argue that participants never guess, but to show that variability in the quality of working memory reflects a random process of degradation that plays out independently across memories. Models of working memory that are constrained by the known properties of the brain uniformly propose that the maintenance of working memory representations involves stochastic processes that operate at all levels of processing21,27,28. Yet most cognitive models explain working memory solely by reference to the division of a mental commodity: a resource divided among stored items1,3–5,9,10,12. In these models, the quality of a memory representation is deterministic and based solely on the amount of the commodity allocated to it. Demonstrating independent, stochastic variation in memory quality within an individual requires a new framework that includes a role for stochastic processes, helping to bridge the gap between biologically plausible neural models on the one hand and cognitive models on the other. 6 Methods Participants. Participants were between the ages of 18 and 28 years, had normal or corrected-to-normal vision and received either $10 per hour or course credit. Participants whose testing ran over multiple days were given a bonus of $10 per session after the first. The studies were performed in accordance with Harvard University regulations and approved by the Committee on the Use of Human Subjects in Research under the Institutional Review Board for the Faculty of Arts and Sciences. Stimuli. The stimulus was 1, 3 or 5 colourful dots arranged in a ring around a central fixation mark. The radius of each dot was 0.41 in visual angle and the radius of the ring was 3.81. Each dot was randomly assigned one of 180 equally spaced equiluminant colours drawn from a circle (radius 591) in the CIE L*a*b* colour space, centered at L ¼ 54, a ¼ 18 and b ¼ ! 8. The grouping strength of the colours was controlled by transforming the colour values to vectors in polar space, selecting displays with the constraint that the magnitude of the mean vector of each display was equal to that expected of a randomly sampled display. Presentation. Stimuli were rendered by a computer running MATLAB with the Psychophysics Toolbox29,30. The display’s resolution was 1,920 " 1,200 at 60 Hz, with a pixel density of 38 pixels per cm. The viewing distance was 60 cm. The background was gray, with a luminance of 45.4 cd m ! 2. Procedure. On each trial, the stimulus was presented for 600 ms and then removed. After a 900-ms delay, a filled white circle appeared at the location of a randomly selected item and hollow circles appeared at the location of the other items. Participants were asked to report the probed item’s colour by selecting it from a response screen with the full colour wheel, 6.51 in radius, centered around fixation. A black indicator line was placed at the outer edge of the response wheel at the position closest to the cursor. Once the participants moved the mouse, the filled circle’s colour was continuously updated to match the currently selected colour. Participants registered their choice by clicking a mouse. Responses were not speeded. Feedback was provided after each response by displaying onscreen the error in degrees. NATURE COMMUNICATIONS | 3:1229 | DOI: 10.1038/ncomms2237 | www.nature.com/naturecommunications & 2012 Macmillan Publishers Limited. All rights reserved. ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms2237 Experiment 1 methods. Three participants performed three sessions, each lasting 1.5–2 h. Each session used 1, 3 or 5 items, and their order (and therefore, set size) was counterbalanced across participants. There were 800 trials per session for set sizes 1 and 3 and 720 trials per session for set size 5. Experiment 2 methods. Ten participants performed two 700-trial sessions with a set size of 3. On half of the trials, participants reported the item they remembered best. On the other half, participants reported the colour of a randomly selected item. Probe types were intermixed in random order. On ‘choose the best’ trials, instructions appeared at the top of the screen telling participants to report a colour and then to select the location of the best-remembered item by clicking its location. On these trials, filled white circles appeared at the locations of the probed and non-probed items. Experiment 3 methods. A total of 16 participants performed one 500-trial session with a set size of 3. Instead of reporting only one item, participants were asked to report all three. Items were probed in random order. Feedback was given. Non-probed items, even those already reported, were indicated by hollow circles. General framework for modeling the error distribution. All of the models considered here can be seen as special cases of an infinite scale mixture, a general framework that describes error distributions with a fixed mean and a precision (scale) that is sampled from some higher-order distribution, known as the ‘mixing distribution’. Fixed-precision model. In the fixed-precision model, the participant is in one of two states for each item on the display. With probability 1–g, she remembers the item and has some fixed amount of information about it. Limited memory leads to errors in recall, which are assumed to be distributed according to a von Mises distribution (the circular analogue of a normal distribution) centered at zero (or, perhaps, with some bias m) and with spread determined by a concentration parameter k that reflects the memory’s precision. The mixing distribution is thus the Dirac delta function at k, a distribution concentrated at that one point. With probability g, she remembers nothing about the item and guesses randomly, producing errors distributed according to a circular uniform. Together, the probability density function of this model is given by 1 þ ð1 ! gÞfðx ; m; kÞ; 360 where f is the von Mises density, defined by f ðx ; g; m; kÞ ¼ ðgÞ fðx ; m; kÞ ¼ ek cosðx ! mÞ ; 360I0 ðkÞ where I0 is the modified Bessel function of order zero. Variable-precision model. The variable-precision model is the same as the fixedprecision model, except that precision is distributed according to some higherorder distribution, for example, a truncated normal or gamma distribution. Unless guided by a theory for the source of variability in precision, the choice of mixing distribution is arbitrary, but certain combinations of error distribution and mixing distribution are convenient. For example, when error is normally distributed about the correct value, and when precision (that is, the inverse of variance) is gamma distributed, the resulting experimental data takes the form of a generalized Student’s t-distribution. (This result generalizes to the case of distributions wrapped on the circle, which is useful for representing colours and orientations.) For guess rate g, bias m and s.d. s, the probability density function of this model is given by 1 þ ð1 ! gÞcðx ; m; s; vÞ; 360 where c is the wrapped generalized Student’s t-distribution31 and is given by " ! ðv þ 1Þ/2 1 ! c X ðx þ 360l ! mÞ2 1þ ; cðx ; m; s; vÞ ¼ 2 s v s l¼ ! 1 f ðx ; m; s; g; vÞ ¼ ðgÞ with c¼G ! " v þ 1 #$ $v %pffiffiffiffiffi% G pv : 2 2 Incorporating higher-order variability into new models of working memory is therefore a simple matter of replacing the usual choice of error distribution, a von Mises distribution, with a wrapped generalized Student’s t-distribution. Some combinations of error and mixing distributions give rise to named distributions with known analytic expressions32,33, but others, like the truncated normal mixing distribution, must be simulated. The mean precision and its s.d. were free parameters, bounded between 01 and 1001. (The upper bound is arbitrary, but an s.d. above 1001 is typically indistinguishable from guesses anyways). This should not be taken as a strong claim about the form of the distribution of memory precision, which will depend crucially on the process that degrades memories. However, no matter which distribution is chosen, the presence of variability produces a signature effect on the data—peakedness in the error distribution—seen most clearly in Fig. 2, and which is readily detected using the present methods. Model fitting. Markov Chain Monte Carlo was used to find the maximum a posteriori parameter set of a model given the errors on the memory task. Each model was fit separately for each subject and experimental condition. The data were not binned. We used the Metropolis–Hastings algorithm, which takes a random walk over the parameter space, sampling locations in proportion to how well they describe the data and match prior beliefs. A non-informative Jeffreys prior was placed over each parameter. The proposal distribution used to recommend jumps was a multivariate Gaussian centered at the current parameter set, whose s.d. was tuned during a burn-in period that ended when all of the chains, which started at different locations in the parameter space, converged. We collected 15,000 samples from these converged chains and report the sample with the maximum posterior probability. Model comparison. To compare the relative goodness of fit of the fixed- and variable-precision models, we computed each model’s AIC, a measure that includes a penalty term for the number of free parameters34. For a model with k parameters and maximum log-likelihood L, the AIC is given by AIC ¼ ! 2L þ 2k: ð1Þ Because this formulation of the AIC assumes infinite data, it is necessary to correct the value when estimating it from experimental data. The corrected AIC, AICc, is given by AICc ¼ AIC þ 2kðk þ 1Þ ; n!k!1 ð2Þ where n is the sample size—that is, the number of trials performed by a participant on the working memory task. Simulating across- and within-trial variation in Experiment 2. We performed Monte Carlo simulations to estimate the distribution of precision when ‘choosing the best’ under conditions of purely across-trial variation or purely within-trial variation. For each subject, 1,000 trials (of three-item displays) were simulated according to that participant’s maximum a posteriori parameter values for the random probe condition. When assuming across-trial variation, the precision value for each trial was selected at random from one of the three randomly drawn precision values. Error was simulated by sampling from a von Mises distribution with the drawn precision value. For each participant, the resulting error distributions were fit with the variable-precision model, yielding parameter estimates that matched the random probe data (mean 20.21 and s.d. 6.41; comparable to Fig. 5, red line). When assuming within-trial variation, we selected the best precision value from the three that were randomly sampled. These values were used to simulate error responses that were then fit with the variable-precision model (Fig. 5, blue line, mean 15.41 and s.d. 5.01). References 1. Zhang, W. & Luck, S. Discrete fixed-resolution representations in visual working memory. Nature 453, 233–235 (2008). 2. Vogel, E. K., Woodman, G. F. & Luck, S. J. Storage of features, conjunctions and objects in visual working memory. J. Exp. Psychol. Hum. Percept. Perform. 27, 92–114 (2001). 3. Alvarez, G. A. & Cavanagh, P. The capacity of visual short-term memory is set both by visual information load and by number of objects. Psychol. Sci. 15, 106–111 (2004). 4. Luck, S. J. & Vogel, E. K. The capacity of visual working memory for features and conjunctions. Nature 390, 279–281 (1997). 5. Bays, P. M. & Husain, M. Dynamic shifts of limited working memory resources in human vision. Science 321, 851–854 (2008). 6. Rensink, R. A. The dynamic representation of scenes. Vis. Cogn. 7, 17–42 (2000). 7. Baddeley, A. D. & Logie, R. In Models of Working Memory: Mechanisms of Active Maintenance and Executive Control. (eds Miyake, A. & Shah, P.) 28–61 (Cambridge University Press, 1999). 8. Baddeley, A. D. Working Memory. (Oxford University Press, 1986). 9. Wilken, P. A detection theory account of change detection. J. Vis. 4, 1120–1135 (2004). 10. Anderson, D. E., Vogel, E. K. & Awh, E. Precision in visual working memory reaches a stable plateau when individual item limits are exceeded. J. Neurosci. 31, 1128–1138 (2011). 11. Zhang, W. & Luck, S. J. The number and quality of representations in working memory. Psychol. Sci. 22, 1434–41 (2011). 12. Bays, P. M., Catalao, R. F. G. & Husain, M. The precision of visual working memory is set by allocation of a shared resource. J. Vis. 9, 1–11 (2009). 13. Fougnie, D., Asplund, C. L. & Marois, R. What are the units of storage in visual working memory? J. Vis. 10, 1–11 (2010). NATURE COMMUNICATIONS | 3:1229 | DOI: 10.1038/ncomms2237 | www.nature.com/naturecommunications & 2012 Macmillan Publishers Limited. All rights reserved. 7 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms2237 14. Burnham, K. P. & Anderson, D. R. Multimodel inference: understanding AIC and BIC in model selection. Sociol. Methods Res. 33, 261–304 (2004). 15. Van den Berg, R., Shin, H., Chou, W.C., George, R & Ma, W.J. Variability in encoding precision accounts for visual short-term memory limitations. Proc. Natl Acad. Sci. 109, 8780–8785 (2012). 16. Kahneman, D. Attention and Effort. (Prentice-Hall, 1973). 17. Fougnie, D. & Alvarez, G. A. Object features fail independently in visual working memory: Evidence for a probabilistic feature-store model. J. Vis. 11, 1–12 (2011). 18. Huang, L. Visual working memory is better characterized as a distributed resource rather than discrete slots. J. Vis. 10, 1–8 (2010). 19. Deco, G., Edmund, T. R. & Romo, R. Stochastic dynamics as a principle of brain function. Prog. Neurobiol. 88, 1–16 (2009). 20. Camperi, M. & Wang, X. J. A model of visuospatial short-term memory in prefrontal cortex: recurrent network and cellular bistability. J. Comput. Neurosci. 5, 383–405 (1998). 21. Wang, X. J. Synaptic reverberation underlying mnemonic persistent activity. Trends Neurosci. 24, 455–463 (2001). 22. Constantinidis, C. & Wang, X. J. A neural circuit basis for spatial working memory. Neuroscientist 10, 553–565 (2004). 23. Fuster, J. & Alexander, G. Neuron activity related to short-term memory. Science 173, 652–654 (1971). 24. Luce, D. R. Thurstone’s discriminal processes fifty years later. Psychometrika 42, 461–489 (1977). 25. Bonnel, A. M. & Miller, J. Attentional effects on concurrent psychophysical discriminations: Investigations of a sample-size model. Attention, Perception, & Psychophysics 55, 162–179 (1994). 26. Thurstone, L. L. A law of comparative judgment. Psychol. Rev. 34, 273–286 (1927). 27. Durstewitz, D., Seamans, J. K. & Sejnowski, T. J. Neurocomputational models of working memory. Nature Neurosci. 3, 1184–1191 (2000). 28. Durstewitz, D., Kelc, M. & Güntürkün, O. A neurocomputational theory of the dopaminergic modulation of working memory functions. J. Neurosci. 19, 2807–2822 (1999). 8 29. Brainard, D. H. The Psychophysics Toolbox. Spat. Vis. 10, 433–436 (1997). 30. Pelli, D. G. The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat. Vis. 10, 437–442 (1997). 31. Pewsey, A., Lewis, T. & Jones, M. C. The wrapped t family of circular distributions. Aust. NZ J. Stat. 49, 79–91 (2007). 32. Andrews, D. F. & Mallows, C. L Scale mixtures of normal distributions. J. Royal Stat. Soc. Series B (Methodological) 36, 99–102 (1974). 33. Gneiting, T. Normal scale mixtures and dual propability densities. J. Stat. Comput. Simulat. 59, 375–384 (1997). 34. Akaike, H. A new look at the statistical model identification. IEEE Trans. Automatic Control 19, 716–723 (1974). Acknowledgements We thank Sarah Cormeia and Christian Forhby for help with data collection. We are grateful to Geoff Woodman, Chris Asplund, Michael Cohen, Olivia Cheung, Justin Jungé and Tim Brady for helpful discussions. This work was supported by NIH Grant 1F32EY020706 to D.F. and NIH Grant R03-086743 to G.A.A. Author contributions D.F., J.W.S and G.A.A designed the experiments. D.F. performed the experiments. D.F. and J.W.S. analysed the results. D.F., J.W.S and G.A.A wrote the paper. Additional information Supplementary Information accompanies this paper at http://www.nature.com/ naturecommunications Competing financial interests: The authors declare no competing financial interests. Reprints and permission information is available online at http://npg.nature.com/ reprintsandpermissions/ How to cite this article: Fougnie D. et al. Variability in the quality of visual working memory. Nat. Commun. 3:1229 doi: 10.1038/ncomms2237 (2012). NATURE COMMUNICATIONS | 3:1229 | DOI: 10.1038/ncomms2237 | www.nature.com/naturecommunications & 2012 Macmillan Publishers Limited. All rights reserved. Supplementary Figure S1. Memorability differences between colors does not drive variation in precision. (a) Best-fit precision distributions for six color bins, each of 30°. Each distribution is labeled with the color at the bin’s center, and the black curve is all of the data together. The legend shows the range of colors included in each bin. The curves were fit simultaneously to all subjects’ data. (b) How much variability is to be expected by chance? We simulated six data sets by sampling precision values from the black curve. Even though each simulation is based on the same distribution, the estimated model parameters display a range of values. This range is comparable to that seen in the observed binned data in panel (a). (c) The variability in the full data set (solid black line) is the same as that for the average bin (dashed black line). Supplementary Figure Figure S2. Simulating uneven allocation policies. (a) A ternary plot that shows the noiseless uneven allocation policies used for simulations. At each point, the allocation to x1, x2, and x3 sums to 1. The corners correspond to the three policies for which one item is given all of the commodity. The center corresponds to a policy of even allocation across the items. Each filled circle corresponds to a fixed, uneven allocation policy that could produce at least as much variability in precision as we observed in our data. (b & c). c) Noisy allocation distributions are formalized as a Dirichlet distribution over the space of allocation policies, centered on even allocation. The distribution in panel (b) has lower noise levels (concentration parameters of 9, 9, and 9) than the one in panel (c) (concentration parameters of 4, 4, and 4). Supplementary Figure S3. S3. Variability from noisy even allocation produces detectable tradeoffs. tradeoffs. Noisy even allocation was formalized as a symmetric Dirichlet distribution on the simplex, centered at even allocation, with concentration parameters (α,α,α) that determine the noise level. Monte Carlo simulations were performed for 15 levels of noise, evenly-spaced in log units from low noise where α = (10,10,10) to high noise where α = (1,1,1). Simulated data was fit with the variable-precision model in order to estimate the standard deviation of precision (blue line). We also performed a trade-off analysis to determine the percentage of simulations at each noise level (red line) in which at least one of the six possible sortings showed a difference in estimated precision. The dashed line is drawn at the variability estimated in the data. Critically, at the level of allocation noise necessary to produce this amount of variability our tradeoff detection procedure is nearly perfect. Supplementary Table S1. AICc values for Experiment 1. Subject Fixed (n=1) Variable (1) Fixed (3) Variable (3) Fixed (5) Variable (5) 1 6371 6367 7876 7870 7550 7545 2 6463 6457 7785 7780 7326 7312 3 6171 6160 7207 7184 7390 7373 Lower AICc values are better fits. Supplementary Table S2. AICc values for perceptual control study. study. No noise Noise Subject Fixed Variable Fixed Variable 1 4528 4528 7141 7140 2 4734 4744 7116 7116 3 4423 4435 7866 7882 Lower AICc values are better fits. Supplementary Notes The power of tradeoff detection In Experiment 3, we asked participants to report multiple items per display and found independence of precision across items. From this we concluded that variability in precision is not due to tradeoffs in the allocation of a commodity to remembered items. In the Supplementary Methods, we describe our method for showing that the tradeoff-detection method is powerful enough to reliably detect tradeoffs if they were to exist. We considered two sources of tradeoffs: (1) noiseless uneven allocation, where the participant knowingly attends or gives preferential processing to a chosen item, and (2) noisy even allocation, where the participant tries to distribute her commodity evenly, but fails, unwittingly giving preference to some of the items. Each allocation (i.e., allotment of resources) can be described as a vector of proportions of the commodity dedicated to each item (e.g., [1/3, 1/3, 1/3] for even allocation across three items, or [0.1, 0.1, 0.1, 0.2, 0.5] for uneven allocation across five). The set of all such possible allocations defines a simplex (Supplementary Figure 2A). Guessing The present data was not as well described by models that did not include a uniform error component. Indeed, the AICc values indicated that models with a uniform component provided a better fit to the data (all AICc values were decisively, >100, better). Furthermore, estimates of this guess state do not differ to a large extent between the variable- and fixed-precision models. Estimates of the rate of guessing in Experiment 1 were 1.1%, 14.1%, and 21.6% for the variable-precision model for set sizes 1, 3, and 5 respectively). This was comparable to the guess rates for the fixed-precision model (1.4%, 15.9%, and 22.0%). In Experiment 2 there was a large drop in guess rate between the ‘random probe’ and ‘choose the best’ condition (variable-precision: 18.7% to 7.5%; fixed-precision: 20% to 8.2%). The data do suggest that fixed-precision models may slightly over-estimate guess rates, presumably because these models misclassify some highly imprecise responses as guess responses. Supplementary Methods Variability Variability in perception We eliminated the memory demands of the task by keeping the stimulus present while the participant responded. The stimulus was a single colorful circle, 0.75° in radius, placed in the center of the display. The participant was asked to select the color of the dot from a color wheel containing all of the possible colors. Three participants completed 750 trials of this perceptual task. Naturally, participants were better at the perceptual task than at the memory task, and so in a variant of the perceptual task, we added noise to the stimulus so that the error matched that of the working memory task. This was accomplished by setting eighty percent of the stimulus’ pixels to either black or white. The pixels were selected randomly for each participant but held constant across trials to minimize variability caused by differences in the noise. Three participants completed 750 trials of this perceptual task with noise. Variability in memory for specific colors To measure any contribution of differences in memorability to the variability in precision observed in the first experiment, we replicated the study and compared the variability measured in the full data set to that measured in subsets of trials binned by color. Variability found within one of these subsets cannot be attributed to differences in the memorability of colors because all of the trials within a bin were similar in color. Two participants performed 3,120 trials of a color working memory task with three items. Variability in memory is not due to eye movements We performed an experiment on three participants using an EyeLink eye tracker, in which the stimulus was displayed conditional on 1000 ms of continuous, uninterrupted fixation. The fixation region was defined by a square (1.35° from center to edge) centered on the fixation mark. The stimulus display consisted of the presentation of three colorful circles for 100 ms (see Methods for information on the visual angle and screen position of the stimuli). Noiseless uneven allocation To detect whether our analysis is sensitive to trade-offs due to noiseless uneven allocation (perhaps as a byproduct of a quantized storage capacity that can not be evenly divided among items, e.g., 4 slots among 3 objects), we performed Monte Carlo simulations for each point on the simplex capable of producing sufficient variability to explain the present data. The possible allocation policies of the commodity to n items are defined by the set of points (x1, x2, …, x n), where x i is the proportion of the commodity dedicated to the ith item and where the x i’s sum to 1. In the case of n = 3 items, these policies can be visualized in a ternary plot whose vertices correspond to the three policies where one item is given all of the commodity, and whose center corresponds to even allocation across the items (Supplementary Fig. 2a). To convert from the proportion of allocated commodity to the precision of report error, we used a sample-size model in which the variance of expected report error is inversely proportional to an item’s allocation 1,38,39. The proportionality constant was determined by setting the precision for an item under even allocation to 21.5°, the median precision across participants in Experiment 3. The sample size model makes it possible to estimate the expected precision under different allocation policies. Note that since we have defined precision as the square root of the report error variance, changing the allocation to one item will lead to a change in precision equivalent to dividing 21.5° by the square root of the change in allocation. For example, doubling the allocation to an item (66.7%) will lead to an expected precision of 21.5°/√2 =15.2°. Rather than make assumptions about the particular allocation policy chosen by the participant, instead we simulated data for each of the possible allocation strategies, in increments of 2%, that are capable of producing the variability observed in our experiments. (For example, there’s no need to consider the case of even allocation because it is incapable of producing any variability at all.) For each of the 963 allocation policies that met these requirements, we simulated response error data for 16 participants, where each participant makes three responses on each of 450 trials. We then asked whether sorting by the first response led to an observable difference on the precision of the second response, and vice versa. If either result was statistically significant at α = 0.05, we considered it as evidence of non-independence between the two responses. We repeated this procedure 1,000 times. For the worst-case allocation policy (i.e., the one that was hardest to detect), we found evidence of non-independence 96.6% of the time. Averaged over all tested allocation policies, we found evidence of non-independence 99.9% of the time. Thus, comparing responses within a trial is a powerful test for detecting tradeoffs in allocation between items. Noisy even allocation We performed Monte Carlo simulations to determine whether our analysis method is sensitive to trade-offs induced by noisy even allocation. We formalize noisy even allocation through a symmetric Dirichlet distribution on the simplex, centered at perfect evenness, and whose concentration parameter corresponds to the amount of noise. We simulated 15 levels of noise ranging from little noise (concentration parameter = 10, 10, 10) to very high levels of noise (concentration parameter = 1, 1, 1) (Supplementary Figure 2b). (The simulated noise levels were evenly spaced intervals in log space between these extremes). For each noise level, 50 separate simulations were performed. For each simulation, 500 trials (consisting of 3 items) were independently generated for 16 participants. Precision values for each item were generated by applying a sample size model [1,38,39] (as in the noiseless uneven allocation simulations) to allocation vectors drawn from a Dirichlet distribution. Measurement error for each simulation was generated by drawing values from a von Mises distribution according to the stimulated precision of each item. We then asked whether sorting responses by whether another response was an accurate or inaccurate judgment led to an observable difference in precision. We determined the percentage of simulations at each noise level that showed evidence of a trade-off for at least one of the six possible sortings. At low noise there is no measurable variability and consequently trade-offs are rarely detected. With increased noise, variability increases and trade-offs are easier to detect. Importantly, the likelihood of detecting tradeoffs rises above 95% well before the mean estimate of variability reaches 8.5° (the estimate of the variability in precision in the observed data) (Supplementary Figure 3). These simulations demonstrate that our method of sorting error distributions by performance on another item in the same trial is a highly sensitive method test for detecting trade-offs between items regardless of whether these trade-offs emerge from a noisy allocation policy or a fixed, but uneven allocation process. Supplementary References 38. 39. Bonnel, A.M., and Miller, J. (1994). Attentional effects on concurrent psychophysical discriminations: Investigations of a sample-size model. Attention, Perception, & Psychophysics 55, 162-179. Palmer, J. (1990). Attentional limits on the perception and memory of visual information. Journal of Experimental Psychology: Human Perception and Performance 16, 332-350.
© Copyright 2026 Paperzz