Supplementary Material for Learning the Exception to the Rule: Model-Based fMRI Reveals Specialized Representations for Surprising Category Members Tyler Davis*, Bradley C. Love, Alison R. Preston *To whom correspondence should be addressed. Email: [email protected] This file includes: Tables S1 to S5 Supplementary Methods 1 Supplemental Tables Table S1. Regions correlated with the recognition strength measure from SUSTAIN during the stimulus presentation period. Region Hemisphere Z Anterior Cingulate Cortex Caudate Cerebellum R R L Fusiform Gyrus L R 3.24 4.42 4.05 4.40 3.59 3.50 3.44 3.88 3.25 3.52 4.26 3.42 3.88 3.90 3.62 4.25 4.53 3.71 3.33 5.40 4.15 3.31 4.52 4.03 3.33 3.88 4.50 3.81 4.28 3.78 4.44 3.75 4.87 4.19 3.73 4.27 4.56 4.01 4.17 4.06 4.71 3.95 3.37 Frontal Pole Hippocampus Inferior Frontal Gyrus—Pars Triangularis Inferior Temporal Gyrus Insula Intracalcerine Cortex Midbrain Middle Frontal Gyrus L R L R L R L L L R Bilateral L R Lateral Occipital Cortex—Inferior Division Lateral Occipital Cortex—Superior Division R L R Nucleus Accumbens Occipital Pole Orbital Frontal Cortex Pallidum Paracingulate Gyrus Precuneus Posterior Cingulate Gyrus Supramarginal Gyrus Superior Frontal Gyrus Superior Parietal Lobule Thalamus L R R R L R R R L R R L R MNI coordinates (x, y, z) 12, 28, 26 10, 10, -2 -8, -22, -30 -22, -58, -36 0, -62, -34 -6, -76, -24 -22, -86, -6 40, -58, -10 38, -44, -18 -44, 40, 32 50, 28, 24 -24, -32, -6 20, -38, 2 -52, 24, 18 48, 28, 6 -40, -58, -6 -32, 16, -10 -16, -72, 6 14, -68, 16 0, -24, -24 -36, 4, 42 -38, 20, 28 48, 14, 32 46, 6, 52 30, 2, 42 44, -86, 12 -26, -60, 40 -26, -76, 52 30, -66, 32 18, -62, 52 -8, 12, -4 16, -88, 2 34, 28, -6 12, -6, -6 -4, 10, 52 4, 26, 44 14, -70, 36 8, -18, 30 -42, -44, 44 12, 12, 58 32, -50, 42 -12, -10, -4 20, -24, 8 2 Table S2. Regions correlated with the error correction measure from SUSTAIN during the feedback period. Region Angular Gyrus Anterior Cingulate Cortex Caudate Cerebellum Fusiform Gyrus Frontal Pole Hippocampus Inferior Frontal Gyrus—Pars Opercularis Inferior Frontal Gyrus—Pars Triangularis Insula Juxtapositional Lobule Midbrain Middle Frontal Gyrus Middle Temporal Gyrus Lateral Occipital Cortex—Inferior Division Lateral Occipital Cortex—Superior Division Hemisphere Z L R R L R L L R L R L R L R R L L R R L R L L R 4.55 4.05 4.89 3.32 3.95 3.94 3.73 4.36 4.12 3.62 3.42 4.76 4.24 4.47 3.29 4.83 4.17 5.00 5.24 4.08 3.36 4.15 3.36 5.4 4.69 3.76 4.45 3.95 3.40 4.76 4.89 3.89 4.63 4.55 3.99 4.47 5.07 4.24 3.70 4.01 3.99 3.94 3.60 L Orbital Frontal Cortex Paracingulate Gyrus R R Precuneus Precentral Gyrus Posterior Cingulate Gyrus R L R L R R L L R L Supramarginal Gyrus Superior Frontal Gyrus Superior Temporal Gyrus Superior Parietal Lobule Thalamus MNI coordinates (x, y, z) -62, -56, 14 2, 6, 26 10, 6, 2 -14, 8, 6 6, -56, -38 -18, -76, -28 -30, -76, -28 34, -48, -16 -32, -50, -12 32, 48, 4 -32, 54, 0 24, -40, 0 -44, 22, 20 48, 26, 10 30, 12, -18 -32, 18, -8 -14, 10, 46 4, -28, -6 36, 20, 32 -40, 12, 58 72, -30, -4 -50, -28, -6 -50, -69, 10 24, -60, 50 36, -74, 38 26, 78, 50 -34, -66, 60 -22, -78, 54 -28, -86, 54 38, 22, -6 10, 38, 22 14, 22, 38 14, -64, 34 -36, 6, 26 8, -26, 26 -8, -30, 26 42, -46, 40 12, 16, 54 -50, 2, -16 -32, -54, 38 10, -14, 8 -10, -8, 0 -10, -26, 6 3 Table S3. Regions correlated with the error correction measure of ALCOVE during the feedback period. Region Hemisphere Z Angular Gyrus Anterior Cingulate Cortex L R Caudate Cerebellum R R 3.69 4.39 3.39 4.20 4.48 3.49 3.36 4.70 4.03 4.43 4.03 4.96 3.60 4.67 3.63 3.71 4.46 4.93 4.70 4.19 3.74 3.79 3.45 4.68 5.36 4.57 5.57 4.20 4.67 3.91 4.72 3.92 5.32 4.74 4.43 3.99 3.47 3.23 4.91 5.11 4.68 4.85 4.55 3.45 4.81 4.68 3.92 L Fusiform Gyrus Frontal Pole Hippocampus Inferior Frontal Gyrus—Pars Opercularis R L R L R R L Inferior Occipital Gyrus Inferior Temporal Gyrus R R Juxtapositional Lobule Midbrain Middle Temporal Gyrus L R L R L R Lateral Occipital Cortex—Inferior Division Lateral Occipital Cortex—Superior Division L L R Middle Frontal Gyrus L Orbital Frontal Cortex Pallidum Paracingulate Gyrus Planum Polare Precuneus R L L R L R R L MNI coordinates (x, y, z) -40, -54, 24 4, 6, 26 10, 44, 6 10, 8, 2 0, -58, -28 30, -66, -28 10, -76, -22 -2, -46, -14 -16, -36, -26 -14, -74, -28 -32, -74, -30 34, -50, -18 40, -12, -36 -30, -52, -12 32, 48, 4 -24, 52, 14 24, -20, 0 46, 24, 10 -44, 22, 20 -46, 24, -4 44, -80, 12 48, -54, -10 56, -20, -32 -14, 8, 48 4, -28, -4 -12, -16, -10 34, 18, 34 -42, 12, 58 48, -18, -12 62, -30, -6 -50, -26, -6 -46, -78, 12 22, -58, 50 30, -76, 40 -20, -72, 40 -32, -62, 56 -28, -86, 34 -32, -84, 18 40, 20, -8 -24, 22, -10 -12, 2, 2 10, 40, 22 8, 26, 34 -8, 44, 12 48, 0, -16 14, -66, 34 -18, -60, -36 4 Table S3. Continued Region Hemisphere Z Precentral Gyrus R L Posterior Cingulate Gyrus R 3.49 4.85 4.67 4.25 4.66 4.90 4.51 3.37 4.75 3.17 4.01 3.80 3.69 3.55 4.76 Supramarginal Gyrus Superior Frontal Gyrus Superior Temporal Gyrus Superior Parietal Lobule Temporal Pole Thalamus L R L R L L L R R MNI coordinates (x, y, z) 36, -8, 44 -48, -4, -20 -42, 0, 32 2, -24, -22 10, -26, 28 -8, -28, 26 42, -44, 40 -44, -34, 38 12, 16, 56 -24, -4, 54 -4, 46, 32 -66, -40, 4 -32, -54, 38 38, 4, -40 8, -14, 6 5 Table S4. Regions demonstrating greater activation for correctly categorized exception items relative to correct rule-following items during stimulus presentation. Region Anterior Cingulate Cortex Caudate Cerebellum Hemisphere Z L R L R 3.57 5.98 5.93 4.74 4.08 3.97 4.81 4.21 3.87 3.76 3.73 3.53 4.18 4.37 4.37 4.25 4.72 4.31 3.92 3.14 3.73 3.40 4.67 4.64 4.09 4.91 4.48 MNI coordinates (x, y, z) -8, 40, 16 8, 8, -2 -8, 12, -2 36, -56, -32 14, -50, -16 26, -66, -22 -26, -64, -30 -6, -76, -24 -4, -60, -34 -42, -60, -38 0, -66, -8 -24, -46, -30 40, -50, -10 -20, -84, -8 -30, -70, -10 -42, 20, 6 48, 38, 24 42, 58, 12 26, 50, -16 40, 52, -6 -18, 50, -20 -14, 66, -14 26, -26, -8 18, -36, 4 48, 28, 8 -58, 26, 20 -40, -58, -6 5.78 4.12 4.65 5.70 5.18 4.81 5.29 5.10 4.65 4.37 3.34 3.65 3.92 5.10 5.72 3.77 4.82 4.67 3.71 3.70 5.59 4.75 34, 24, -4 14, -78, 14 -10, -78, 12 2, -20, -12 44, 16, 30 36, 6, 40 -42, 4, 50 -36, 8, 32 -40, 30, 20 -34, 2, 64 62, -34, -10 -54, -34, -14 36, -84, 4 32, -66, 32 -28, -62, 40 -18, -64, 54 30, -56, 2 18, -90, 2 6, -94, 14 -12, -96, 0 -32, 24, -6 22, -8, -4 L Fusiform Gyrus R L Frontal Operculum Cortex Frontal Pole L R L Hippocampus R Inferior Frontal Gyrus—Pars Triangularis R R L Inferior Temporal Gyrus Insula Intracalcerine Cortex Midbrain Middle Frontal Gyrus R R L R R L Middle Temporal Gyrus Lateral Occipital Cortex—Inferior Division Lateral Occipital Cortex—Superior Division R L R R L Lingual Gyrus Occipital Pole R R Orbital Frontal Cortex Pallidum L L R 6 Table S4. Continued Region Hemisphere Z Paracingulate Gyrus R Precuneus R L R L R R L L L 5.88 4.60 4.41 5.29 5.86 3.57 3.66 4.23 5.54 5.38 3.98 4.05 3.24 Precentral Gyrus Posterior Cingulate Gyrus Superior Parietal Lobule Temporal Pole Thalamus MNI coordinates (x, y, z) 6, 30, 40 6, 10, 54 10, 40, 20 16, -70, 44 -6, -74, 42 32, -18, 52 -58, 6, 38 4, -30, 32 34, -50, 42 -36, -46, 40 -54, 18, -12 -10, -20, 12 -20, -32, -2 7 Table S5. Regions correlated with the recognition strength measure from SUSTAIN, controlling for the effect of reaction time. Region Caudate Cerebellum Frontal Pole Fusiform Gyrus Hippocampus Inferior Frontal Gyrus—Pars Triangularis Inferior Temporal Gyrus Insula Intracalcerine Cortex Lateral Occipital Cortex—Inferior Division Lateral Occipital Cortex—Superior Division Lingual Gyrus Midbrain Middle Frontal Gyrus Nucleus Accumbens Occipital Pole Orbital Frontal Cortex Pallidum Paracingulate Gyrus Posterior Cingulate Gyrus Precuneus Superior Frontal Gyrus Superior Parietal Lobule Supramarginal Gyrus Thalamus Hemisphere Z R L L L L L R R R R L R R L R L L L L R 4.42 4.05 4.40 3.61 3.50 3.52 4.26 3.88 3.44 3.25 3.44 3.88 3.77 3.42 3.62 3.90 4.25 3.95 4.53 3.24 MNI coordinates (x, y, z) 10 , 10, -2 -8, -22, -30 -22, -58, -36 -32, -62, -32 -6, -76, -24 -44, 40, 32 50, 38, 24 40, -58, -10 32, -52, -16 38, -44, -18 -22, -86, -6 20, -38, 2 26, -28, -8 -24, -32, -6 48, 28, 6 -52 , 24, 18 -40, -58, -6 -22, 46, -28 -32, 16, -10 18, -78, 6 L L R R L R L R R Bilateral L L R R R R L R R R R 3.71 3.48 3.33 3.88 4.50 4.28 3.81 3.78 3.51 5.40 4.15 3.31 4.52 4.12 4.03 3.33 4.44 3.75 4.87 4.19 4.27 -16, -72, 6 -10, -78, 14 14, -68, 16 44, -86, 12 -26, -60, 40 30, -66, 32 -26, -76, 52 18, -62, 52 30, -58, 0 0, -24, -14 -36, 4, 42 -38, 20, 28 48, 14, 32 36, 6, 36 46, 6, 52 30, 2, 42 -8, 12, -4 16, -88, 2 34, 28, -6 12, -6, -6 4, 26, 44 L L R R R R L L R 3.73 3.58 3.99 4.56 4.06 4.71 4.17 3.95 3.37 -4, 10, 52 -8, 18, 48 8, -18, 30 14, -70, 36 12, 12, 58 32, -50, 42 -42, -44, 44 -12, -10, 4 20, -24, 8 8 Table S6. Abstract category structure. Each row represents a unique stimulus (i.e., beetle). The four values assigned to a stimulus denote the four stimulus dimensions (e.g., antenna, legs, etc.) assigned to a beetle. Each numeric value (1 or 2) represents a specific feature instantiation (e.g., red or green eyes). The first dimension represents the rule-relevant dimension. Most Hole A beetles have a 1 on the first dimension (e.g., red eyes) whereas most Hole B beetles have a 2 (e.g., green eyes). The first stimulus in each of the columns is therefore an exception. Hole A Beetles Hole B Beetles 2 2 2 2* 1 2 2 2* 1112 2112 1121 2121 1211 2211 *Exception item Recognition Test Foils 1111 1122 1212 1221 2222 2211 2121 2112 Supplemental Methods: Model Formalism and Procedures SUSTAIN Input Stimulus Representation. An input stimulus is represented to SUSTAIN as a pattern of activation on input units that code the different stimulus features and possible values that these features can take. For each stimulus feature, i (e.g., a beetle’s eye color), with k possible values (two in the present experiment; e.g., red or green eyes), there are k input units. Input units are set to one if the unit represents the feature value or zero otherwise. The entire stimulus is represented by I posik , with i indicating the stimulus feature and k indicating the value for feature i. “Pos” indicates that the stimulus is represented as a point in a multidimensional space. The distance µij between the ith stimulus feature and cluster j’s position along the ith feature is ! vi ik µij = 1/2" | I posik # H pos | j (S1) k=1 ! ik where vi is the number of possible values that the ith stimulus feature can take and H pos is j cluster j’s position on the ith feature for value k. Distance µij is always between 0 and 1, inclusive. ! 9 Response Selection. After encoding, each cluster is activated based on the similarity of the cluster to the input stimulus. Cluster activation is given by: na % (" ) e # $ "i µ ij i H act j = (S2) i=1 na % (" ) # i i=1 ! where H act is cluster j’s activation, na is the number of stimulus features, and "i is the receptive j field tuning (which controls the model’s attention to a given feature) for feature i, and r is the attentional parameter (constrained to be non-negative). "i is set to 1 at the start of learning. ! ! Clusters enter into a competition to respond to an input stimulus through mutual inhibition. The out final output of each cluster j is H j , which is ! given by: H out j = " (H act j ) nc # (H act " j ) H act j ! (S3) i=1 ! where nc is the current number of clusters and β is a lateral inhibition parameter (constrained to be non-negative) that controls the level of cluster competition. out Only the cluster with the largest output value is allowed to pass its output, H j , across its connections to the output layer. The cluster that wins the competition, Hm, passes its output to the k output units of the unknown feature dimension z ! (S4) Czkout = w m,zk H mout out ! where Czk is the output of the unit representing the kth feature value of the zth feature, and wm,zk is the weight from the winning cluster, Hm, to the output unit Czk. In the simulations present here out where only the category label is queried, z always stands for the category label, and Czk is ! calculated for each of the k (k=2) values that the category label can take (Hole A or B). The probability of making a response k for a queried dimension, z, on a given trial is: ! out P(k) = e(dC zk vz "e ) (dC zkout (S5) ) j=1 ! 10 Cluster recruitment. In the present simulation, SUSTAIN was initialized with two clusters, one for each category, that represent the modal stimulus in the category, excluding exceptions. This initialization reflects the cuing procedure used in the study, which alerted subjects to the rulerelevant dimension. Other clusters are recruited in SUSTAIN in response to surprising stimulus. Similar to recent uses of SUSTAIN (Love & Gureckis, 2007), we included a hippocampal function parameter, τh (constrained to be between 0 and 1) that probabilistically determines whether an error will lead to new cluster recruitment. If SUSTAIN makes a prediction error, and τh exceeds q, where q is a randomly generated value (between 0 and 1), a new cluster is recruited. If the value of τh is 1, all errors lead to new cluster recruitment, and the model is equivalent to SUSTAIN’s original formulation (i.e., Love, Medin, & Gureckis, 2007). Learning. The learning rules determine how the clusters are updated. Only the winning clusters are updated. If a new cluster is recruited on a trial, it will be the winner. Otherwise, the cluster that is most similar to the current stimulus will be the winner. The winning cluster Hm, will have its position adjusted by: (S6) "H mposik = #(I posik $ H mposik ) ! where η is the learning rate parameter. The result of the updating is that the winning cluster moves toward the current stimulus. Over the course of learning, each cluster will tend toward the center of its members. Receptive field tunings are updated according to (S7) "#i = $e% #i µ im (1% #iµim ) where m indexes the winning cluster. ! The weights from the winning cluster to the output units are adjusted by the one layer delta learning rule (Rumelhart, Hinton & Williams, 1986). "w m,zk = #(t zk $ Czkout )H mout ! (S8) Simulations. For the present simulation, items were presented to SUSTAIN using the same order and randomization methods as for human subjects. To reflect the rule cuing procedure, the model was initialized with a cluster for each category that represented the average of the rule-following items within the category, and the attention weight on the rule-relevant dimension was initialized and fixed at 5.1 (cf. Love & Gureckis, 2007). The free parameters, γ, β, η, d, and τh, were fit to the average learning curve using standard optimization techniques. Obtained values were: γ = 1.259, β = 0.702, η = 0.415, d = 25.264, τh = 0.174. The recognition strength measure R was calculated by summing the output H out j for all clusters: ! 11 nc R = " H out j (S9) j=1 ! The error correction measure was given by the summing absolute value of the difference between the winning cluster’s output on unit k, for the queried feature dimension, z, and the item’s actual position on this dimension. vz E = " | Czkout # I poszk | (S10) k=1 where vz, again, equals the number of output units for feature dimension z. ! Trial-by-trial values for recognition strength and error correction measures were averaged over 1000 simulations. ALCOVE Input Stimulus Representation. A stimulus is represented to ALCOVE as a pattern of activation a along input nodes that code the stimulus’ values v for each feature dimension i. An entire stimulus is represented by the column vector ain. Input nodes are gated by dimensional attention strength αi that reflects the relevance of each feature dimension for the task. Input stimuli activate each hidden node according to the psychological similarity of the input stimulus to the hidden node, where hidden nodes are each of the previously observed exemplars. The activation of the jth hidden node is: "c( a hid j =e ! $ # i |hji"a iin | j r q/r ) (S11) where c a constant (c > 0) that determines the specificity of the hidden node, and where r and q are constants that determine the psychological distance metric and similarity gradient, respectively. Both r and q are set to 1 in the present applications, which corresponds to cityblock distance metric with an exponential similarity gradient. Response Selection. Hidden nodes are connected to k output nodes that reflect the available response categories (i.e., hole A or hole B) by a weight wkj that gives the strength of the association between hidden node j and output node k. Output node activation is given by akout = " w kj a hid j (S12) hid j The probability of classifying a given stimulus into category k on a trial is given by ! 12 out e"a k Pr(K) = out # e"ak (S13) out k where is φ a mapping constant. ! Learning. The dimensional attention strengths αi and association weights wkj are learned via gradient descent on sum squared error. The error generated by the model on a given trial is given by E = 1/2# (t k " akout ) 2 (S14a) out k where tk are teacher values that code the feedback given to ALCOVE such that ! ! ! tk = { max(+1,a kout ) if the stimulus is in category K min("1,a kout ) if the stimulus is not in category K } (S14b) The dimensional attention strengths and association weights are updated on each trial according to the learning rules "w kjout = #w (t k $ akout )a hid j (S15) ' * ) , in "# = $ %# &)& (t k $ akout )w kj ,a hid j c | h ji $ ai | hid out ,+ j ) (k (S16) where λw and λα are learning rate constants that are constrained to be greater than 0. ! Simulations. ALCOVE was trained using the same procedure as the SUSTAIN simulations, with obtained values for the free parameters: λα = 0.01, λw = 0.40, c = 0.66, φ = 3.125. Recognition strength is computed in ALCOVE as familiarity or similarity of an item to all exemplars stored in memory (i.e., the sum of the hidden node activations). "a hid j (S17) hid j Error is computed for each trial as in S14a. ! 13
© Copyright 2026 Paperzz