SUPPLEMENTARY INFORMATION

A rational theory of the limitations of working memory and attention
Ronald van den Berg & Wei Ji Ma

Contents

MODEL DETAILS
   Relation between J and κ
   Variable precision
   Expected behavioral loss function by task
   The behavioral loss function drops out when the behavioral error is binary
   Conditions under which optimal precision declines with set size
REFERENCES
SUPPLEMENTARY FIGURES

MODEL DETAILS

Relation between J and κ

We measure encoding precision as Fisher information, denoted J. As derived in earlier work [1], the mapping between J and the concentration parameter κ of a Von Mises encoding noise distribution is

   J = κ I_1(κ) / I_0(κ),

where I_0 and I_1 are the modified Bessel functions of the first kind of order 0 and 1, respectively. Larger values of J map to larger values of κ, corresponding to narrower noise distributions.

Variable precision

In all our models, we incorporated variability in precision [2,3] by drawing the precision for each encoded item independently from a Gamma distribution with mean J̄ and scale parameter τ.
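As a minimal numerical sketch of this machinery (our illustration, not the authors' code; it assumes NumPy and SciPy are available), the J-to-κ mapping can be inverted numerically and item precisions drawn from the Gamma distribution just described:

```python
import numpy as np
from scipy.special import i0e, i1e   # exponentially scaled Bessel functions (avoid overflow)
from scipy.optimize import brentq

def J_from_kappa(kappa):
    """Fisher information of Von Mises encoding noise: J = kappa * I1(kappa) / I0(kappa)."""
    return kappa * i1e(kappa) / i0e(kappa)  # the exponential scale factors cancel in the ratio

def kappa_from_J(J):
    """Numerically invert J(kappa); J is increasing in kappa, so the root is unique."""
    return brentq(lambda k: J_from_kappa(k) - J, 1e-9, 1e6) if J > 0 else 0.0

# Variable precision: draw one precision per encoded item from a Gamma
# distribution with mean J_bar and scale tau (so shape = J_bar / tau).
rng = np.random.default_rng(0)
J_bar, tau, N = 4.0, 1.0, 6
J = rng.gamma(J_bar / tau, tau, size=N)
kappa = [kappa_from_J(j) for j in J]
```

The exponentially scaled Bessel functions keep the ratio well defined even for very large κ, where the unscaled I_0 and I_1 overflow.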
We denote the distribution of a single precision value by p(J | J̄, τ), and the joint distribution of the precision values of all N items in a display by

   p(J | J̄, τ) = ∏_{i=1}^{N} p(J_i | J̄, τ).

Expected behavioral loss function by task

As a consequence of variability in precision, computation of the expected behavioral loss requires integration over both the behavioral error, ε, and the vector with precision values, J:

   L̄_behavioral(J̄, N) = Σ_ε ∫ L_behavioral(ε) p(ε | J, N) p(J | J̄, τ) dJ,        if ε is discrete,
   L̄_behavioral(J̄, N) = ∫ ∫ L_behavioral(ε) p(ε | J, N) p(J | J̄, τ) dJ dε,       if ε is continuous,

where the integrals over J run from 0 to ∞. The distribution of precision, p(J | J̄, τ), is the same in all models, but L_behavioral(ε) and p(ε | J, N) are task-specific. We next specify these two components separately for each task.

Delayed estimation. In delayed estimation, the behavioral error only depends on the memory representation of the target item. We assume that this representation is corrupted by Von Mises noise,

   p(ε | J, N) = p(ε | J_T) = 1 / (2π I_0(F(J_T))) · e^{F(J_T) cos ε},

where J_T is the precision of the target item and F(·) maps Fisher information to a concentration parameter κ; we implement this mapping by numerically inverting the mapping specified above. Furthermore, the behavioral loss function is assumed to be a power-law function of the absolute estimation error, L_behavioral(ε) = |ε|^β, where β > 0 is a free parameter.

Change detection. We assume that subjects report "change present" whenever the posterior ratio for a change exceeds 1,

   p(change present | x, y) / p(change absent | x, y) > 1,

where x and y denote the vectors of noisy measurements of the stimuli in the first and second displays, respectively.
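Returning to delayed estimation: the double integral above has no closed form in general, but it is straightforward to approximate by Monte Carlo integration. The sketch below (our illustration under the stated assumptions, not the authors' code) jointly samples the target precision and the estimation error and averages the power-law loss:

```python
import numpy as np
from scipy.special import i0e, i1e
from scipy.optimize import brentq

rng = np.random.default_rng(0)

def kappa_from_J(J):
    # Invert J = kappa * I1(kappa) / I0(kappa), i.e. the mapping F in the text.
    return brentq(lambda k: k * i1e(k) / i0e(k) - J, 1e-9, 1e6) if J > 0 else 0.0

def expected_loss_delayed_estimation(J_bar, tau, beta, n_samples=5000):
    """Monte Carlo estimate of the expected behavioral loss E[|eps|^beta]:
    sample the target precision J from a Gamma distribution with mean J_bar
    and scale tau, map it to kappa, sample the Von Mises error, average."""
    J = rng.gamma(J_bar / tau, tau, size=n_samples)
    kappa = np.array([kappa_from_J(j) for j in J])
    eps = rng.vonmises(0.0, kappa)   # one estimation-error sample per precision sample
    return float(np.mean(np.abs(eps) ** beta))
```

As expected, a higher mean precision J̄ yields a lower expected behavioral loss.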
Under the Von Mises assumption, this rule evaluates to [4]

   p_change / (1 − p_change) · (1/N) Σ_{i=1}^{N} I_0(κ_{x,i}) I_0(κ_{y,i}) / I_0(√(κ_{x,i}² + κ_{y,i}² + 2 κ_{x,i} κ_{y,i} cos(y_i − x_i))) > 1,

where p_change is a free parameter representing the subject's prior belief that a change will occur, and κ_{x,i} and κ_{y,i} denote the concentration parameters of the Von Mises distributions associated with the observations of the stimuli at the i-th location in the first and second displays, respectively.

The behavioral error, ε, takes only two values in this task: correct and incorrect. We assume that observers map each of these values to a loss value,

   L_behavioral(ε) = L_incorrect   if ε is "incorrect",
   L_behavioral(ε) = L_correct    if ε is "correct".

For example, an observer might assign a loss of 0 to any correct decision and a loss of 1 to any incorrect decision. The expected behavioral loss is a weighted sum of L_incorrect and L_correct,

   L̄_behavioral(J̄, N) = p_correct(J̄, N) L_correct + (1 − p_correct(J̄, N)) L_incorrect,

where p_correct(J̄, N) is the probability of a correct decision. This probability is not analytic, but can easily be approximated using Monte Carlo simulations.

Change localization. Expected behavioral loss is computed in the same way as in the change-detection task, except that a different decision rule must be used to compute p_correct(J̄, N). As shown in earlier work [3], the Bayes-optimal rule for the change-localization task is to report the location that maximizes

   I_0(κ_{x,i}) I_0(κ_{y,i}) / I_0(√(κ_{x,i}² + κ_{y,i}² + 2 κ_{x,i} κ_{y,i} cos(y_i − x_i))),

where all terms are defined in the same way as in the model for the change-detection task.

Visual search. The expected behavioral loss in the model for visual search is also computed in the same way as in the model for change detection, again with the only difference being the decision rule used to compute p_correct(J̄, N).
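The Monte Carlo approximation of p_correct mentioned above can be sketched as follows for change detection. This is a simplified illustration, not the authors' code: it assumes one change in half the trials, uniformly distributed circular stimuli, and the same concentration κ at every location (the full model would instead draw a per-item precision from the Gamma distribution):

```python
import numpy as np
from scipy.special import i0

rng = np.random.default_rng(1)

def p_correct_change_detection(kappa, N, p_change=0.5, n_trials=4000):
    """Monte Carlo estimate of p_correct under the Bayesian change-detection rule."""
    n_correct = 0
    for _ in range(n_trials):
        change = rng.random() < 0.5                # true trial type
        s1 = rng.uniform(-np.pi, np.pi, N)         # first display
        s2 = s1.copy()
        if change:
            s2[rng.integers(N)] = rng.uniform(-np.pi, np.pi)
        x = s1 + rng.vonmises(0.0, kappa, N)       # noisy measurements, display 1
        y = s2 + rng.vonmises(0.0, kappa, N)       # noisy measurements, display 2
        # Decision variable: prior ratio times the mean over locations of
        # I0(kx) I0(ky) / I0(sqrt(kx^2 + ky^2 + 2 kx ky cos(y - x))).
        arg = np.sqrt(2.0 * kappa**2 * (1.0 + np.cos(y - x)))
        d = p_change / (1.0 - p_change) * np.mean(i0(kappa) ** 2 / i0(arg))
        if (d > 1.0) == change:
            n_correct += 1
    return n_correct / n_trials
```

With moderate concentration (e.g. κ = 10) this rule performs well above chance, and p_correct grows with κ, which is what drives the expected behavioral loss downward as precision increases.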
The Bayes-optimal rule for this task is to report "target present" when

   p_present / (1 − p_present) · (1/N) Σ_{i=1}^{N} e^{κ_i cos(x_i − s_T)} I_0(κ_D) / I_0(√(κ_i² + κ_D² + 2 κ_i κ_D cos(x_i − s_T))) > 1,

where p_present is the subject's prior belief that the target will be present, κ_D the concentration parameter of the distribution from which the distractors are drawn, κ_i the concentration parameter of the noise distribution associated with the stimulus at location i, x_i the noisy observation of the stimulus at location i, and s_T the value of the target (see [5] for a derivation).

The behavioral loss function drops out when the behavioral error is binary

When the behavioral error ε takes only two values, the behavioral loss can also take only two values. The integral in the expected behavioral loss (Eq. (2) in the main text) then simplifies to a sum of two terms,

   L̄_behavioral(J̄, N) = p_correct(J̄, N) L_correct + (1 − p_correct(J̄, N)) L_incorrect
                      = p_correct(J̄, N) (L_correct − L_incorrect) + L_incorrect.

The optimal (loss-minimizing) value of J̄ is then

   J̄_optimal(N) = argmin_J̄ [ p_correct(J̄, N) (L_correct − L_incorrect) + L_incorrect + λ L̄_neural(J̄, N) ]
                = argmin_J̄ [ p_correct(J̄, N) ΔL + λ L̄_neural(J̄, N) ],

where ΔL ≡ L_correct − L_incorrect. Since ΔL and λ have interchangeable effects on J̄_optimal, we fix ΔL to 1 and fit only λ as a free parameter.

Conditions under which optimal precision declines with set size

In this section, we show that when the expected behavioral loss is independent of set size (as in delayed estimation, but also single-probe change detection), the rational model predicts optimal precision to decline with set size whenever the following four conditions are satisfied:

1) Expected behavioral loss is a strictly decreasing function of encoding precision, i.e., an increase in precision results in an increase in performance.
2) Expected behavioral loss is subject to a law of diminishing returns [6]: the behavioral benefit obtained from a unit increase in precision decreases with precision. This law will
This law will 103 hold when condition 1 holds and the loss function is bounded from below, which is 104 generally the case as errors cannot be negative. 105 3) Expected neural loss is an increasing function of encoding precision. 106 4) Expected neural loss is subject to a law of increasing loss: the amount of loss associated 107 with a unit increase in precision increases with precision. This condition translates to 108 stating that the loss per spike must either be constant or increase with spike rate, which 109 has been found to be generally the case 7. 110 These conditions translate to the following constraints on the first and second derivatives of the 111 expected loss functions, 1. L 'behavioral J 0 2. L "behavioral J 0 112 3. L 'neural J 0 4. L "neural J 0. 113 114 The loss-minimizing value of precision is found by setting the derivative of the expected total 115 loss function to 0, 116 0 L 'total J L 'behavioral J NL 'neural J , 117 118 119 120 which is equivalent to L 'behavioral J L 'neural J N. (S1) 121 122 The left-hand side is strictly positive for any J , because of constraints 1 and 3 above. In 123 addition, it is a strictly decreasing function of J , because 124 5 125 d L 'behavioral J L ''behavioral J L 'neural J L 'behavioral J L ''neural J 2 dJ L 'neural J L' J neural 126 is necessarily greater than 0 due to the four constraints specified above. As illustrated in 127 Supplementary Figure S1, Eq. (S1) can be interpreted as the intersection point between the 128 function specified by the left-hand side (solid curve) and a flat line at a value N (dashed lines). 129 The value of J at which this intersection occurs (i.e., J optimal ) necessarily decreases with N. 130 When expected behavioral loss does depend on set size (such as in whole-array change 131 detection or change localization), the proof above does not apply and we were not able to extend 132 the proof to this domain. 8 9 10 133 134 REFERENCES 135 1. 
1. Keshvari, S., van den Berg, R. & Ma, W. J. Probabilistic computation in human perception under variability in encoding precision. PLoS One 7, (2012).
2. Fougnie, D., Suchow, J. W. & Alvarez, G. A. Variability in the quality of visual working memory. Nat. Commun. 3, 1229 (2012).
3. van den Berg, R., Shin, H., Chou, W.-C., George, R. & Ma, W. J. Variability in encoding precision accounts for visual short-term memory limitations. Proc. Natl. Acad. Sci. U.S.A. 109, 8780–8785 (2012).
4. Keshvari, S., van den Berg, R. & Ma, W. J. No evidence for an item limit in change detection. PLoS Comput. Biol. 9, (2013).
5. Mazyar, H., van den Berg, R., Seilheimer, R. L. & Ma, W. J. Independence is elusive: Set size effects on encoding precision in visual search. J. Vis. 13, 1–14 (2013).
6. Mankiw, N. G. Principles of Economics. (2004).
7. Sterling, P. & Laughlin, S. Principles of Neural Design. (MIT Press, 2015).
8. Anderson, D. E. & Awh, E. The plateau in mnemonic resolution across large set sizes indicates discrete resource limits in visual working memory. Atten. Percept. Psychophys. 74, 891–910 (2012).
9. Anderson, D. E., Vogel, E. K. & Awh, E. Precision in visual working memory reaches a stable plateau when individual item limits are exceeded. J. Neurosci. 31, 1128–1138 (2011).
10. Rademaker, R. L., Tredway, C. H. & Tong, F. Introspective judgments predict the precision and likelihood of successful maintenance of visual working memory. J. Vis. 12, 21 (2012).

SUPPLEMENTARY FIGURES

[Figure: four columns of panels (Anderson & Awh 2012, 180 deg; Anderson & Awh 2012, 360 deg; Anderson et al. 2011, Exp 1; Rademaker et al. 2012), showing circular variance (top row) and circular kurtosis (bottom row) as a function of set size, with data and model fits.]

Supplementary Figure S1. Fits to the three delayed-estimation benchmark data sets that were excluded from the main analyses.
Circular variance (top) and circular kurtosis (bottom) of the estimation-error distributions as a function of set size, split by experiment. Error bars and shaded areas represent 1 s.e.m. across subjects. The first three datasets were excluded from the main analyses on the grounds that they were published in papers that were later retracted (Anderson & Awh, 2012 [8]; Anderson et al., 2011 [9]). The Rademaker et al. (2012) dataset [10] was excluded from the main analyses because it contains only two set sizes, which makes it less suitable for a fine-grained study of the relationship between encoding precision and set size.

[Figure: −L̄′_behavioral(J̄)/L̄′_neural(J̄) as a function of mean encoding precision J̄ (red curve), with flat dashed lines at λN for N = 2, 4, 6, 8; the intersection points give J̄_optimal ≈ 6.1, 3.2, 2.1, and 1.7, respectively.]

Supplementary Figure S2. Graphical illustration of Eq. (S1). The value of J̄ at which the equality described by Eq. (S1) holds is the intersection point between the function specified by the left-hand side (red curve) and a flat line at a value λN. Since the left-hand side is strictly positive and also a strictly decreasing function of J̄, the value of J̄ at which this intersection occurs (i.e., J̄_optimal) necessarily decreases with N.
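The argument illustrated in Figure S2 can be checked numerically with toy loss functions that satisfy constraints 1–4. The particular functional forms below (L̄_behavioral(J̄) = 1/(1 + J̄), L̄_neural(J̄) = J̄) are illustrative assumptions of ours, not the paper's fitted losses:

```python
from scipy.optimize import brentq

# Toy losses satisfying the four constraints:
#   L_behavioral(J) = 1 / (1 + J):  L' < 0, L'' > 0  (diminishing returns)
#   L_neural(J)     = J:            L' > 0, L'' = 0  (non-decreasing marginal cost)
def lhs(J):
    """Left-hand side of Eq. (S1): -L'_behavioral(J) / L'_neural(J)."""
    return 1.0 / (1.0 + J) ** 2

def J_optimal(N, lam=0.01):
    # Solve lhs(J) = lam * N; lhs is strictly decreasing, so the root is unique.
    return brentq(lambda J: lhs(J) - lam * N, 0.0, 1e6)

J_opts = [J_optimal(N) for N in (2, 4, 6, 8)]   # optimal precision per set size
```

For any λ with λN < 1, the resulting J̄_optimal values decrease monotonically with N, exactly as the proof requires.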