Perception, 2007, volume 36, pages 49–74
DOI:10.1068/p5582

Visual binding of English and Chinese word parts is limited to low temporal frequencies

Alex O Holcombe
School of Psychology, Tower Building, Park Place, Cardiff University, Cardiff CF10 3AT, Wales, UK; and Department of Psychology, University of California at San Diego, La Jolla, CA 92093, USA; e-mail: [email protected]

Jeff Judson
Department of Psychology, University of California at San Diego, La Jolla, CA 92093, USA

Received 19 August 2005, in revised form 26 January 2006; published online 5 January 2007

Abstract. Some perceptual mechanisms manifest high temporal precision, allowing reports of visual information even when that information is restricted to windows smaller than 50 ms. Other visual judgments are limited to much coarser time scales. What about visual information extracted at late processing stages, for which we nonetheless have perceptual expertise, such as words? Here, the temporal limits on binding together visual word parts were investigated. In one trial, either the word `ball' was alternated with `deck', or `dell' was alternated with `back', with all stimuli presented at fixation. These stimuli restrict the time scale of the word identities because the two sets of alternating words form the same image at high alternation frequencies. Observers made a forced choice between the two alternatives. Resulting 75% thresholds are restricted to 5 Hz or less for words and nonword letter strings. A similar result was obtained in an analogous experiment with Chinese participants viewing alternating Chinese characters. These results support the theory that explicit perceptual access to visual information extracted at late stages is limited to coarse time scales.
1 Introduction
Physiological recordings show that initial coding in the visual system provides reliable information at fine temporal scales, with neurons in primary visual cortex following flicker frequencies well over 60 Hz (Gur and Snodderly 1997). Frequencies much higher than this are completely lost to perception, but still an impressive degree of precision is preserved in some cases. For example, when viewing a periodic pattern of bars (a sinusoidal grating) moving within a window at 20 Hz, humans can perceive its direction of motion (Burr and Ross 1982) and depth (Morgan and Castet 1995). This occurs despite the fact that for 20 Hz gratings, temporal integration over 50 ms or more yields a uniform field or bars at every position, totally obliterating the motion and depth information. The fact that we nonetheless perceive motion and depth at these rates implies that the mechanisms underlying our perception of motion and depth operate at a time scale better than 50 ms.

This high temporal precision is not restricted to perception of individual features: the pairing of colour with orientation can also be perceived at stimulus alternation rates over 20 Hz (Holcombe and Cavanagh 2001). In that experiment, participants viewed a display rapidly alternating between a red leftward-tilted pattern and a green rightward-tilted pattern. In other trials, the feature pairing was reversed, with green leftward tilt alternating with red rightward tilt. If the representation of the individual features at the point of binding had a precision worse than 50 ms, or if the binding process integrated over a period longer than 50 ms, then participants would not have been able to distinguish the feature pairings of these stimuli. The fast threshold found imposes strong constraints on the neural computations underlying the binding process, as do more recent studies of the binding of distributed shape elements (Clifford et al 2004).
The results of these latter studies reveal that scattered dot pairs are also bound together with high, 20 Hz, temporal precision when they form certain regular global patterns such as spirals and concentric circles (Clifford et al 2004). Hence, the visual system is capable of combining spatially disparate features into global form with a precision better than 50 ms. Mechanisms for symmetry perception may have similarly impressive temporal resolution (Tyler et al 1995).

Interestingly, aside from the cases when disparate elements form certain global patterns, the pairing of disparate elements generally appears to be limited to low alternation rates. Recall that colour and orientation can be bound at a temporal precision better than 20 Hz when the features are spatially superposed. When these same features are spatially separated, their pairing can only be perceived at about 3 Hz or less (Holcombe and Cavanagh 2001; demonstration available from http://viperlib.com or author AOH's website), suggesting temporal precision worse than 100 ms.

The contrast between some visual binding judgments that reflect a precision better than 20 Hz and others limited to several hertz or worse sharpens some important issues in temporal processing. In particular, at high temporal frequencies, what prevents perceptual binding for some features but not others? The perceptual experience of high-speed visual binding may hold a clue. Consider the experience of two red and green perpendicular gratings alternating at slow rates. One experiences first one pattern, then the other, and so on. However, when the two gratings are alternated at a rate faster than about 6–8 Hz, they no longer seem to be experienced as individuals.
Instead, although one is aware that the stimuli are flickering, the gratings seem to be experienced simultaneously, as if they were both continuously available to cognition (Holcombe 2001; Holcombe and Cavanagh 2001; demonstrations may be viewed at http://viperlib.org and AOH's website). This `temporal transparency' phenomenon (Holcombe 2001) also occurs with rapid alternation of Glass dot patterns (Clifford et al 2004). Temporal transparency is accompanied by a loss of functional access to the two stimuli as individuals (Holcombe 2001; Clifford et al 2004). These phenomena suggest that, by the time visual signals reach awareness, they have been combined over long intervals, preventing explicit judgments of fine time scales. However, even when visual awareness does not seem to follow rapid alternations (Holcombe 2001), not all high-frequency information is lost. Certain aspects of the visual world, such as the identity of certain global forms and the pairing of local colour and orientation, are extracted before awareness by mechanisms with high temporal resolution. Those aspects that are not extracted early on, such as the pairing of spatially disparate colour and orientation, apparently cannot be perceived at fast alternation rates. The notion of a substantial loss in temporal resolution prior to cognitive stages suggests that visual representations from a range of times are combined, as if squeezed together through a bottleneck, to arrive as one to cognition and consciousness. To fully appreciate this theory, it is important to understand why low temporal resolution at cognition and consciousness nevertheless does not preclude fine-time-scale access to all aspects of the visual world. 
In the case of global form, the global-form mechanisms may extract the presence of a left spiral and a right spiral at fine time scales, effectively demodulating this aspect of the stimuli by sending constant labels on to higher levels, signaling the presence of a left spiral and a right spiral. But in perceptual experience, these signals have been combined, yielding the awareness of the presence of a left spiral and a right spiral, but losing the representation of which occurred when. But if the global-form mechanism did not have high temporal resolution, the observer could not perceive the spirals at all, for the dots from the successive patterns would be combined before the global-form mechanism operated (Clifford et al 2004).

With visual information apparently integrated over a long interval for explicit perception, an obvious question is where in the visual system this integration occurs. As well as being a fundamental aspect of the architecture of the visual system, the answer would help to identify the neural correlates of visual awareness. The evidence so far points to a site somewhere between the extrastriate areas responsible for global-form extraction and the unknown areas that putatively correspond to the visual awareness of successive events. This is following the assumptions of the popular quest for the neural correlates of consciousness (eg Crick and Koch 1995, 2003). But even if it is a mistake to speak of brain areas corresponding to visual awareness, understanding the bottleneck will nonetheless put informative constraints on future, perhaps more sophisticated, theories of visual awareness.

In this paper, we seek to further localise the aspects of perception for which substantial loss of resolution occurs by measuring the visual time scale at which humans can access the identity of letter strings, English words, and Chinese characters. Linguistic material was chosen for three reasons.
First, visual words and letter strings are encoded very late in the visual processing hierarchy, apparently in the temporal lobe (McCandliss et al 2003). If late visual mechanisms are generally limited to coarse time scales, word perception should certainly show this limitation. A second reason for choosing words is that, if any high-level visual judgment were to occur at fine time scales, the recognition of words should be a top candidate. Words are processed automatically by adult human readers (Stroop 1935), which is no doubt related to the decades of daily experience most adults have with reading. This has further led to some unitisation in the processing of words (Tao et al 1997). A final advantage of words is that the way they are processed in normal reading has some resemblance to the way they are presented in the laboratory experiment. Indeed, most theories of reading suggest that only one or two words are processed per fixation in the rapid series of fixations made during normal reading (Brysbaert and Vitu 1998; Inhoff et al 2000; Rayner et al 2004).

Potter and her colleagues have reported numerous experiments with rapid presentation of words. They find that even at presentation rates of 12 words per second, participants can recall most of the words when together they form a sentence (Potter 1993), which might suggest high temporal resolution. However, these experiments were not designed to isolate the duration of processing of individual words. For example, in these rapid serial-presentation experiments, temporal integration of a word with the preceding and following words does not completely obliterate the cues to the word's identity.

From the present point of view, the masked priming literature (Kinoshita and Lupker 2003) suffers from a similar limitation as does the rapid serial-presentation work. The finding of semantic priming from very briefly presented words does suggest that word binding and processing are extremely rapid.
However, such results do not provide good constraints on the temporal precision of the binding of letters into a word. In such experiments, typically a word is presented for a few dozen milliseconds before a patterned post-mask is presented. In this paradigm, considerable imprecision in the relative processing time of individual letters and binding of the letters together might still yield semantic activation. For example, even if letter units are activated at substantially different times, there might still be no ambiguity which letters are to be bound together, since no other letters are presented. Hence, the binding mechanism could potentially integrate over a much longer period than the actual presentation time of the word.

Temporal limits on word binding have not been previously investigated, but in a number of studies binding errors have been revealed by brief, simultaneous presentation of multiple words or letter strings. Mozer (1983) presented pairs of four-letter words for a few hundred milliseconds and followed them with a post-mask. Letter migration errors sometimes resulted. For example, when `line' and `lace' were presented, occasionally participants reported seeing `lice' or `lane'. These errors occur more frequently than would be expected by guessing (McClelland and Mozer 1986; Treisman and Souther 1986). Harris and her colleagues showed that rapid presentation of letter strings can lead to repetition blindness for letter strings, words, and parts of words (Harris and Morris 2001). The patterns of letter migrations and repetition blindness that occur in these conditions are used for deciding among the various proposed models of how letter strings are represented and how reading is accomplished (Davis and Bowers 2004).

Our main interest, the relationship of binding limits for linguistic material to those for other visual materials, does not seem to have been directly investigated by anyone.
However, the reverse-hierarchy theory of perception by Hochstein and Ahissar (2002) may appear relevant. It suggests that, when we view a scene, our initial conscious percept reflects high-level visual processing, such as the identity of faces and words, whereas perception of visual details, such as individual orientations, is slower and requires feedback from higher levels of the cortex. This reverse-hierarchy theory might be taken to imply that visual characteristics extracted at later cortical stages, such as the identity of a word, are processed more rapidly and have higher temporal resolution than those extracted at earlier stages. However, the results for which Hochstein and Ahissar's theory was concocted do not directly address this issue. The theory is largely based on patterns of response times in visual-search experiments, which indicate that search for high-level properties, such as object category, can be faster than search for more basic features. The present interest is in the processing of an individual item rather than the apparently simultaneous processing of the many items of a search array. Furthermore, in visual search, the temporal precision of processing is unlikely to be the most important factor determining reaction time (RT). Search RTs are critically dependent on factors such as the heterogeneity of items in the display (Duncan and Humphreys 1989), and whether the items are processed in parallel (Egeth et al 1972). Hence, the results of visual-search experiments may reveal the characteristics of initial perception when viewing a cluttered scene without indicating the temporal resolution of the processing of an individual attended item.

Thorpe and his colleagues (Thorpe et al 1996; Fabre-Thorpe et al 2001; VanRullen and Thorpe 2001; Rousselet et al 2002) have measured the speed of high-level visual discrimination using evoked potential latencies.
In one experiment, evoked potentials recorded at the scalp discriminated scenes containing an animal from those that do not. These potentials had latencies as short as 150 ms, suggesting fine temporal resolution for scene and animal processing. However, follow-up work found that the latency of the discriminating signal is quite variable, ranging from 150 ms to 300 ms (Johnson and Olshausen 2003). Thus, processing of the various features involved in these visual discriminations may yet be temporally imprecise, despite the impressive overall performance.

To address the temporal precision of high-level visual processing more directly, we use a task in which the participants must bind together linguistic features presented at the same time: either the two halves of a four-letter string or, in the Chinese experiment, two halves of a Chinese character. If word or letter-string binding mechanisms have temporal properties similar to those for motion, global form, spatially superposed colour and orientation, flicker, or stereopsis (Morgan and Castet 1995), thresholds should be better than 15 Hz. On the other hand, if the task relies on slower, perhaps more central mechanisms, with temporal characteristics like those of binding spatially separated colour and orientation features, we can expect that accurate performance will occur only for slow rates, less than 8 Hz. As it turns out, threshold rates for binding the linguistic materials do fall into this latter slow category.

Different stimulus conditions were used for this investigation: Chinese characters, words, pseudowords, and mostly unpronounceable nonwords. Although most studies that use varying linguistic conditions such as these are designed to isolate the effect of various linguistic factors on small differences in performance (eg Murray and Forster 2004), this study is an exception.
Here, the interest was not in small differences in performance, but rather only in any dramatic differences in temporal threshold: essentially, whether each of the stimulus categories yielded slow (<8 Hz) or fast (>15 Hz) thresholds. Revealing smaller differences due to particular linguistic factors would require a much larger set of stimuli, carefully balanced in a way that would be very difficult given the constraints imposed by the paradigm (described below). As it turns out, a slow threshold is found in each condition. Given this, the use of the varying conditions provides confidence that the slow threshold is not an accident of the particular items chosen, as would be a worry if only one condition were used.

Figure 1 schematises an example trial in which the word `ball', presented at fixation, is alternated with the word `deck', also presented at fixation. In the experiment, this stimulus train is paired with another in which `dell' is alternated with `back'. The utility of using these particular words is seen when alternating the two sets of words at very high rates. At rates exceeding the temporal resolution of the visual system, observers will perceive the sum, which is identical for the two pairs (figure 1).

Figure 1. [Panels: the alternating stimuli and their sum, for alternatives A and B of the two-alternative forced choice.] A schematic representation of the possible stimuli for a trial in the English words experiment and the Chinese character experiment. The top half of the figure corresponds to a trial in which the participant was informed that either `ball' and `deck' would be presented in rapid alternation at the fixation point, or `dell' and `back' would alternate. Because the sums of these word pairs are the same, participants could only discriminate between the two alternatives when the presentation rate was slow enough for word perception mechanisms to individuate the two temporal intervals.
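The identical-sum property can be checked at the level of letters. The following is a minimal sketch, not the authors' stimulus code: it assumes that the "sum" of two alternating strings can be approximated by pooling the letters shown at each string position across the two frames, and the function names are hypothetical.

```python
from collections import Counter

def temporal_sum(pair):
    """Pool the letters shown at each string position across both frames.

    Assumption: letter-level pooling stands in for the summed image that
    results when the two frames are integrated over time.
    """
    first, second = pair
    return [Counter(a + b) for a, b in zip(first, second)]

def same_temporal_sum(pair_a, pair_b):
    """True if two alternating pairs are indistinguishable once summed."""
    return temporal_sum(pair_a) == temporal_sum(pair_b)

# The example pairs from the text share a sum, so only the pairing in time
# distinguishes them.
print(same_temporal_sum(("ball", "deck"), ("dell", "back")))  # True
# A control pair that differs at the second letter position does not.
print(same_temporal_sum(("ball", "deck"), ("bell", "dock")))  # False
```

A check of this kind only confirms that letter identities match position by position; the experiment itself relies on the rendered images summing to the same pattern.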
Thanks to the identical sums of the two stimuli, this presentation method systematically limits the binding information to a particular temporal interval. Varying the alternation rate then varies this temporal interval, allowing an estimate of the resolution of the system. The method exploits the fact that all mechanisms, including those underlying word recognition, have a non-zero interval over which they integrate or, loosely, a temporal resolution. In other words, when two inputs to a mechanism are alternated at a sufficiently high rate, a mechanism will passively integrate them as if they were presented at the same time.

In the experimental trial corresponding to the stimuli shown in figure 1, observers are presented with one of the alternating pairs and asked to determine which was presented. If the presentation rate is such that both words are presented within the interval that the word recognition system averages over, then performance will be at chance. By varying the temporal frequency (presentation rate) of the stimuli across trials, the temporal resolution can be estimated. To compare the results to those with methods used previously in the literature, temporal thresholds were also measured with a single-presentation masked exposure.

The word `binding' is used to refer to many different things in psychophysics and neuroscience. In this paper, by binding we mean the ability to report the temporal pairing of two aspects of a visual stimulus. The results of our binding experiments are consistent with the idea that by the time visual signals reach awareness, they have been combined over an interval of the order of 100 ms. This notion is related to the `psychological moment' hypothesis, first advanced over one hundred years ago (von Baer 1864). Historically, a variety of evidence has been marshaled in support of this somewhat vague idea that psychological processes operate on about a 100 ms time scale (Stroud 1956).
A limitation of the previous evidence is that the temporal resolution of different visual judgments was not compared, and as seen here, comparing these may be quite informative for understanding visual mechanisms. The present work does not address the psychological-moment theory claim that processing occurs in discrete episodes (Geissler et al 1999; VanRullen and Koch 2003; but see Kline et al 2004, 2006). Instead, the issue here is the temporal precision of binding word parts, with the question set aside for now whether these limits reflect continuous averaging or discrete quantised processing.

Simplified movies of the stimuli give a sense of what it was like to be in the experiment and can be viewed at AOH's web page, currently http://www.psych.usyd.edu.au/staff/alexh.

2 Methods
2.1 Participants
For the experiments with English words, sixteen participants were recruited from local laboratories and undergraduate psychology classes at the University of California at San Diego. All spoke English as their first language and were paid by the hour. Six Chinese participants were recruited from campus Chinese associations; all were adults under the age of 35 years who had been raised in mainland China, and all reported fluency with the type of Chinese characters used in the study.

2.2 English language, repetitive-presentation experiments
In these experiments, participants were shown the stimuli at a variety of alternation rates and made a forced choice between two alternatives. At the beginning of the trial, two pairs of four-letter strings were presented. After studying the two pairs of strings for as long as they wished, the observers fixated on the central dot and hit a key to initiate the alternating presentation of one of the pairs of strings that they had just studied. During the alternating stimulus, observers were to neither blink nor move their eyes, and their right eye was monitored as described in section 2.5.
Immediately after the alternation ended, the two pairs of letter strings were again arrayed on the screen and observers hit a key to indicate which had been presented. Feedback indicated whether a correct response had been made.

The lighting in the room was dim, and during the trial the background of the screen was a uniform 84 cd m⁻². The observers viewed the screen from a chin-rest placed 68 cm away. The letter strings appeared in a black (1 cd m⁻²) rectangular region 85 pixels wide (2.41 deg) by 27 pixels high (0.76 deg). The letters were drawn in lower-case Courier font with the letter `l' 20 pixels (0.57 deg) high. Each letter was drawn 22 pixels (0.62 deg) to the right of the previous letter. The CRT monitor was set to a refresh rate of 85 Hz, and the stimulus began with the presentation of the `XXXX' pre-mask at 84 cd m⁻² for two frames (23 ms). The alternating letter strings followed. Each trial had a particular slowest temporal frequency, the target rate.

The goal of the experiment was to determine how narrow an interval between successive words would still allow observers to accurately determine the pairing of the letters, with a 75%-correct criterion. To achieve this, however, the stimuli could not be simply alternated at different frequencies. Care must be taken to prevent observers from picking off a single stimulus from the beginning or the end of the stimulus train. Consider that the first stimulus in a train is not pre-masked in the same way as the subsequent items and the last stimulus is not post-masked in the same way as the previous items. This means that the observers' integration window would at certain times include only a single word, obscuring any effect of alternation rate. Indeed, previous results purporting to reveal high-frequency grouping (Usher and Donnelly 1998) have since been explained by lack of adequate pre-masking (Beaudot 2002; Dakin and Bex 2002).
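The reported pixel and degree values can be cross-checked against the 68 cm viewing distance. The sketch below is an illustration only: the pixel pitch of the monitor is not given in the text, so it is inferred here from the reported 85 pixels = 2.41 deg, and the constant and function names are hypothetical.

```python
import math

VIEWING_DISTANCE_CM = 68.0   # from the text
REGION_WIDTH_PX = 85         # from the text
REGION_WIDTH_DEG = 2.41      # from the text

# Assumption: pixel pitch is back-computed from the reported region width,
# because the physical pixel size is not stated.
_half_width_cm = VIEWING_DISTANCE_CM * math.tan(math.radians(REGION_WIDTH_DEG / 2))
PIXEL_PITCH_CM = 2 * _half_width_cm / REGION_WIDTH_PX

def px_to_deg(n_px):
    """Visual angle (deg) subtended by n_px pixels centred on fixation."""
    size_cm = n_px * PIXEL_PITCH_CM
    return math.degrees(2 * math.atan(size_cm / (2 * VIEWING_DISTANCE_CM)))

# Compare with the reported 0.76, 0.57 and 0.62 deg.
for label, n_px in [("region height", 27), ("letter height", 20), ("letter spacing", 22)]:
    print(f"{label}: {n_px} px = {px_to_deg(n_px):.2f} deg")
```

Under this inferred pitch the converted values agree with the reported ones to within about 0.01 deg.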
The transient, high-resolution mechanism engaged when pre-masking is inadequate (Beaudot 2002) may even extend to more than just the first item (Holcombe et al 2001). This transient mechanism can obscure the effect of temporal frequency on performance, by allowing even temporal differences finer than the integration interval of the visual system to affect performance (Beaudot 2002). Use of this transient mechanism can be excluded by beginning the stimulus with a period in which the stimuli are much too rapid and at too low a contrast to perceive the feature pairing, and then gradually increasing the contrast and stimulus exposure duration to the target rate (Holcombe and Cavanagh 2001; Holcombe et al 2001; Dakin and Bex 2002; Clifford et al 2003, 2004).

Hence, there were competing constraints on the manner of the stimulus presentation: the desire for a smooth increase in stimulus duration at the beginning and decrease at the end, the desire for approximately equal total stimulus train durations across target rates, and the goal of at least two successive stimuli exposed at the target rate, all while keeping total trial length short enough that participants could fixate well throughout. In practice, pilot data from the authors and a few other observers indicated that total stimulus train duration was less important than having a smooth ramp, so the procedure was designed accordingly.

In these repetitive-presentation conditions the stimuli were always presented for two frames (23 ms), but the black (<0.3 cd m⁻²) interstimulus interval (ISI) varied to achieve various temporal frequencies. It is appropriate to think in terms of the stimulus onset asynchrony (SOA), which is the two-frame stimulus duration plus the ISI. This corresponds to the amount of time the stimulus was available given temporal integration. The results of pilot experiments yielded confidence that the relative proportion of exposure duration versus ISI in the SOA was not critical.
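The relation between ISI, SOA, and alternation frequency follows from the 85 Hz refresh rate. The sketch below is a hedged illustration rather than the experiment code: it takes one cycle to be two successive SOAs (two string exposures plus their ISIs), and the function names are hypothetical.

```python
REFRESH_HZ = 85.0     # CRT refresh rate, from the text
STIMULUS_FRAMES = 2   # each letter string is shown for two frames

def frames_to_ms(n_frames):
    """Duration of n_frames video frames in milliseconds."""
    return 1000.0 * n_frames / REFRESH_HZ

def soa_ms(isi_frames):
    """Stimulus onset asynchrony: stimulus duration plus the ISI."""
    return frames_to_ms(STIMULUS_FRAMES + isi_frames)

def temporal_frequency_hz(isi_frames):
    """Alternation frequency, with one cycle equal to two SOAs."""
    return REFRESH_HZ / (2 * (STIMULUS_FRAMES + isi_frames))

# Illustrative ISIs only; the ISIs actually tested were set per participant.
for isi in (0, 6, 19, 30):
    print(f"ISI {isi:2d} frames: SOA {soa_ms(isi):6.1f} ms, "
          f"{temporal_frequency_hz(isi):5.2f} Hz")
```

On this reading, a 6-frame ISI gives an 8-frame SOA and an alternation rate a little above 5 Hz.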
Because the presentation of two successive stimuli forms a cycle, temporal frequency corresponds to the duration of two letter string exposures plus their ISIs. As described in detail in the next paragraph, within a given trial, the alternation of the letter strings began at a very high rate and low luminance and gradually increased in luminance while slowing to the critical rate for that trial. After a few presentations at the critical rate, the presentation rate gradually increased and the luminance diminished until the trial ended with a 1-frame ISI terminated by an `XXXX' post-mask for 2 frames. This ensured that observers could not make the discrimination based on the beginning or end of the stimulus train, when the stimulus would not be fully pre- and post-masked. In other words, without this procedure the presentations at the beginning and end would be available at lower than the intended temporal frequency. A detailed description of the initial ramp-up and terminating ramp-down of ISI and luminance for each target temporal frequency is given below.

The initial ramping-up of the exposure of the stimulus was carried out according to the following rules. While the duration of the stimulus was constant at 2 frames, the first three presentations had no ISI and the fourth was presented after a 1-frame ISI. The first two of these presentations were within a few multiples of contrast threshold (<1 cd m⁻²), with the third presented around 3 cd m⁻² and the fourth around 10 cd m⁻². The fifth was presented at about twice the luminance of the fourth, and all subsequent stimuli were presented at about 25 cd m⁻², until the last two, which were presented at 9 and about 1 cd m⁻², respectively. The duration of the ISIs after the fourth presentation increased until the target ISI was reached, with successive ISIs after the first three of 0, 0, and 1 being 2, 4, 9, and 19 frames.
For those target ISIs that were greater than 19 frames, such as 30 frames, the target rate began immediately after the 19-frame presentation. Target ISIs of less than 19 frames began immediately after the next-lowest ISI duration in the ramp. For example, if the target ISI was 6 frames, the durations of the preceding ISIs were 0, 0, 1, 2, and 4 frames. The gradual ramp-down that began after the exposures at the target ISI included decrements of 10 frames until the ISI was less than 10 frames, followed by decrements of 3 frames until 1 frame was reached. Each stimulus train had at least one 1-frame ISI presented, with additional ones if needed to yield at least three exposures of the stimulus during the ramp-down. For example, when the target ISI was 30 frames, the following ISIs were 20, 10, 7, 4, 1 frames. For a target ISI of 13 frames, the ISIs of the ramp-down were 3, 1, 1, 1 frames. The effect of the 1-frame and 0-frame ISI exposures at the end and beginning was to provide a low-contrast mask which appeared as the sum of the two letter strings, hence camouflaging the letter pairing.

The number of exposures at the target ISI was set to bring the total duration of the stimulus train close to 1.6 s. These constraints yielded a range of 1.6 to 1.9 s, mostly because the acceleration and deceleration at beginning and end took longer for slower target rates. This variation is not very different from that in the duration of the target rates (376 ms). Participants quickly grew accustomed to waiting for the stimulus to slow and subsequently attempting to perform the task.

The letter strings were centred on the fixation point, which alternated between bright white and red, changing each time a letter string was presented. This flicker provided salient repeated events at the fovea, which helped the observer to maintain fixation. At the end of each trial, a tone and short message informed the observer whether the correct choice had been made.
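The ramp rules above can be written out as a short schedule generator. This is a sketch under stated assumptions, not the original experiment code: the function names are hypothetical, the 10-frame decrements are applied only while the current ISI exceeds 10 frames, and the ramp-down is padded with 1-frame ISIs to a minimum of four entries. The last two choices are one reading of the rules, picked because it reproduces both worked examples in the text.

```python
RAMP_UP_STEPS = [0, 0, 1, 2, 4, 9, 19]  # ISIs in frames, from the text

def ramp_up_isis(target_isi):
    """ISIs preceding the first exposure at the target ISI.

    Assumption: every ramp step shorter than the target is used.
    """
    return [isi for isi in RAMP_UP_STEPS if isi < target_isi]

def ramp_down_isis(target_isi, min_isis=4):
    """ISIs following the last exposure at the target ISI.

    Assumption: min_isis=4 stands for three ramp-down exposures plus the
    final 1-frame ISI before the post-mask.
    """
    isis = []
    current = target_isi
    while current > 10:            # coarse decrements of 10 frames
        current -= 10
        isis.append(current)
    while current > 1:             # fine decrements of 3 frames, floor of 1
        current = max(1, current - 3)
        isis.append(current)
    if not isis or isis[-1] != 1:  # always finish on a 1-frame ISI
        isis.append(1)
    while len(isis) < min_isis:    # pad with further 1-frame ISIs
        isis.append(1)
    return isis

print(ramp_up_isis(6))     # [0, 0, 1, 2, 4]
print(ramp_down_isis(30))  # [20, 10, 7, 4, 1]
print(ramp_down_isis(13))  # [3, 1, 1, 1]
```

Target ISIs not covered by the worked examples may have been handled differently in the actual experiment, so outputs for those should be read as illustrative.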
There were three different trial types, tested in separate blocks: words, nonwords, and pseudowords. The ten possible word sets for the trials of the words condition are listed in table 1, as are the pairs of letter strings for the other conditions. In nonword trials, none of the four letter strings formed words, and observers had to choose which of the two pairs of nonwords had alternated. The nonword strings were formed by transposing the first two and last two letters of the words in the words condition. In the pseudoword condition, all the letter strings formed nonwords that were pronounceable, in contrast to the nonword condition, in which most were not pronounceable. Sessions lasted less than 1 h. After several practice trials, the span of interstimulus intervals was set for that participant, guided by the participant's performance in the practice trials. Each subject then participated in 40 trials per condition per interstimulus interval. The three conditions (words, nonwords, pseudowords) were tested in different sessions, whereas the interstimulus intervals were randomly intermixed. Some participants had also taken part in earlier preliminary experiments. When it was seen that participant GE's data departed at the fastest SOA of the word condition from the pattern of other participants, he was run in an additional session of the word condition.

Table 1. The stimuli for each of the three conditions of the letter string experiment for English readers. In each trial, participants discriminated between two possibilities. The two pairs of letter strings in each row (A and B) represent the two possible alternating stimuli specified to the participant at the beginning of a trial; within each pair, the first string was shown in frame 1 and the second in frame 2. The stimuli in the nonwords condition were created by spatially transposing the bigrams in the words condition, so that the same letters were used in the words and nonwords conditions. In all cases, temporally averaging across successively presented letter strings makes correct discrimination impossible.

Set   Words A     Words B     Nonwords A   Nonwords B   Pseudowords A   Pseudowords B
1     ball deck   back dell   llba ckde    llde ckba    rane moll       rall mone
2     pump hell   pull hemp   mppu llhe    mphe llpu    runk pote       rute ponk
3     dent pull   dell punt   ntde llpu    ntpu llde    lort hent       lont hert
4     tank mope   tape monk   nkta pemo    nkmo peta    pime jurt       pirt jume
5     ball hunk   bank hull   llba nkhu    llhu nkba    dert jomp       demp jort
6     pink jump   pimp junk   nkpi mpju    mppi nkju    hape miks       haks mipe
7     tame lick   tack lime   meta ckli    ckta meli    zate koms       zams kote
8     pike jock   pick joke   kepi ckjo    ckpi kejo    nive lank       nink lave
9     call port   cart poll   llca rtpo    rtca llpo    paln bisk       pask biln
10    milk fore   mire folk   lkmi refo    remi lkfo    ferm hont       fent horm

2.3 Chinese characters, repetitive presentation

Ten pairs of Chinese characters, shown in figure 2, were used. The characters were adopted in their final form by the People's Republic of China over forty years ago and are the form used today in the overwhelming majority of written and printed material in mainland China. Certain parts of these characters are called 'radicals'. For the present experiment, the radicals of a particular character pair were exchanged in order to create a corresponding pair of characters. As in the English case, this yields two pairs of lexemes that have the same sum (figure 1), so that temporally integrating over successive characters gives the same result for either pair. A character pair was presented in similar fashion to the presentation of pairs of English letter strings, as described in the previous section. The only difference was the pre- and post-mask, which consisted of the sum of the two successive characters presented on that trial, rather than the 'XXXX' mask used in the English-language experiments.
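The same-sum constraint that this design relies on, for characters and for the letter strings of table 1 alike, can be checked mechanically at the level of letters. The sketch below is an illustration only: it assumes that letters occupy fixed positions, so that two pairs of strings superimpose to the same image whenever they contain the same letters at each position. The helper names are hypothetical, and only the word sets of table 1 are included.

```python
from collections import Counter

# Word sets from table 1: (pair A, pair B), each pair being (frame 1, frame 2).
WORD_SETS = [
    (("ball", "deck"), ("back", "dell")),
    (("pump", "hell"), ("pull", "hemp")),
    (("dent", "pull"), ("dell", "punt")),
    (("tank", "mope"), ("tape", "monk")),
    (("ball", "hunk"), ("bank", "hull")),
    (("pink", "jump"), ("pimp", "junk")),
    (("tame", "lick"), ("tack", "lime")),
    (("pike", "jock"), ("pick", "joke")),
    (("call", "port"), ("cart", "poll")),
    (("milk", "fore"), ("mire", "folk")),
]


def positionwise_sum(pair):
    """Multiset of letters at each position across the two strings of a pair."""
    first, second = pair
    return [Counter((a, b)) for a, b in zip(first, second)]


def same_sum(pair_a, pair_b):
    """True if the two pairs superimpose to the same letters at every position."""
    return positionwise_sum(pair_a) == positionwise_sum(pair_b)


def transpose_bigrams(word):
    """Swap the first two and last two letters, e.g. 'ball' -> 'llba'."""
    return word[2:] + word[:2]


if __name__ == "__main__":
    print(all(same_sum(a, b) for a, b in WORD_SETS))
    print([transpose_bigrams(w) for w in ("ball", "deck")])
```

Because the nonwords are bigram transpositions of the words, the same check passes for the nonword sets as well.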
After several practice trials, the span of interstimulus intervals was set, guided by participants' performance in the practice trials. Each participant, all of whom were raised in mainland China, completed 40 trials at each of 5 alternation rates, randomly intermixed, for a total of 200 trials. Sessions lasted less than 1 h.

Figure 2. The pairs of Chinese characters used in the experiment with Chinese readers. As in the experiment with English readers, 10 pairs of stimuli were used. The sum of each pair of successive characters is equivalent to the sum of the complementary pair (A versus B). The meaning of most of these characters is ambiguous without additional characters as context. One possible meaning is listed for each. %C stands for percent correct. [Character images not reproduced; the translations (A then B, as extracted), percent correct, and mean log % frequency listed in the figure are given below.]

Set   Frame 1 translations (A, B)    Frame 2 translations (A, B)                      %C   Mean log % frequency
1     why? how? live                 pour river                                       83   −1.04
2     mother excellent               sand code                                        76   −0.83
3     town trouble                   meet sincere                                     71   −1.59
4     slave prostitute               actress only                                     82   −1.10
5     ice rain cave                  copper ring                                      78   −2.13
6     burn cook                      drink forgive                                    72   −1.50
7     bridge rubber                  photo alien                                      72   −1.80
8     OK chirp                       flesh fat                                        66   −1.22
9     media beautiful                burn coal                                        65   −1.35
10    Mei (name of river) flood      Lao (name of mountain) Mei (name of mountain)    73   −1.88

2.4 English language, single presentation

Unlike in the repetitive-presentation experiments, only one letter string was shown, and it was presented just once. It was preceded by an 'XXXX' mask shown for 2 frames (23 ms) and a 1-frame ISI (12 ms) following the pre-mask. The letter string was always shown for 2 frames and followed by an ISI of variable duration, varied across trials to determine threshold performance. The 'XXXX' was presented subsequently for 3 frames (33 ms) to post-mask the string. As in the other experiments, at the end of each trial a tone and short message informed the observer whether the correct choice had been made. The experiment consisted of two conditions: words and nonwords.
As in the repetitive-presentation experiment, the stimuli were those in table 1. The pre-trial screen and response screen were identical to those of the repetitive-presentation experiments. Four letter strings were shown in the preview screen, and the same four letter strings were presented again in the response screen. These sets of four were the same as those used in the repetitive-presentation condition. Because in this condition only one of the letter strings, randomly chosen, was presented as a stimulus, only one of the four alternatives was correct, meaning that chance performance was 25% instead of the 50% chance level of the repetitive-presentation experiments. Each participant ran in 40 trials at each of 5 ISIs and each of the two conditions, for a total of 400 trials. Individual sessions lasted less than 1 h. Some of the participants (TM, TN, DB) had also participated in the repetitive-presentation experiment.

2.5 Eyetracking

Saccades and blinks might allow participants to reduce the intended masking by the stimuli in the train. To detect in which trials participants blinked or moved their eyes, the right eye was monitored during the experiment. In the experiments with English readers, the eyetracking setup typically allowed rejection of trials in which observers made movements larger than 0.5 deg from fixation. For the Chinese observers, apparently due to the differing average configuration of their eyelids, performance of the eyetracker was much more variable, and typically not nearly as precise, reliably allowing exclusion of movements only when they reached a few degrees in magnitude or greater. Nevertheless, the eyetracking had a similar psychological effect in the Chinese experiment as in the English experiment. That is, the participants were constantly aware that their eye movements were being monitored and, in the experience of the experimenters, they maintained fixation much better than naive observers otherwise do.
Each participant was monitored for saccades and blinks with a Skalar Iris (http://www.skalar.nl) infrared eyetracker (Reulen et al 1988). The output of the eyetracker, representing the horizontal position of the right eye, was recorded throughout stimulus presentation. With the observer's chin on a rest, the eyetracker was calibrated first by manual adjustment to yield high gain and an approximately symmetric signal when fixating two targets at opposite ends of the screen. Observers then participated in an automated calibration session consisting of repeated saccades among five dots. Two were horizontally spaced 1.2 deg on either side of fixation, and two more were set 11.6 deg from fixation. These calibration data were used to create a criterion for deciding when a saccade was made. In the case of the experiments with letter strings, the criterion was exceeded when a saccade of approximately 0.5 deg or greater occurred. To create this criterion, the position signals from the calibration trials were filtered with a first-order Chebyshev type II low-pass filter. The maximum piecewise velocity during the time of the saccade was determined, and the mean maximum velocity from a sample of 1.2 deg calibration saccades was calculated. In the case of the experiment which included English words, the 1.2 deg calibration saccade with maximum velocity closest to the mean was examined more closely. The Chebyshev-filtered signal was convolved with the kernel [1 – 0.2 – 1], and the criterion for rejecting a trial was set to be significantly smaller than the magnitude of the convolved 1.2 deg record. In particular, the maximum magnitude of the convolved signal corresponding to noise around the time of the saccade was estimated, and the rejection threshold was set a bit higher than this noise-evoked signal.
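As a rough illustration of this rejection rule, the sketch below places the criterion a fixed fraction of the way from the noise-evoked magnitude to the saccade-evoked magnitude, using the 20% figure given in the text that follows. The function names, the example magnitudes, and the use of the maximum absolute value of the convolved trace as its "magnitude" are assumptions made for illustration; the filtering and convolution steps themselves are not reproduced.

```python
def rejection_criterion(noise_magnitude, saccade_magnitude, fraction=0.2):
    """Criterion = noise level plus a fraction of the saccade-minus-noise gap.

    `fraction=0.2` follows the 20% stated in the text; the magnitudes are
    assumed to be maxima of the already filtered and convolved position signal.
    """
    return noise_magnitude + fraction * (saccade_magnitude - noise_magnitude)


def reject_trial(convolved_trace, criterion):
    """Flag a trial when the convolved trace exceeds the criterion anywhere."""
    return max(abs(v) for v in convolved_trace) > criterion


if __name__ == "__main__":
    # Hypothetical magnitudes in arbitrary units, not values from the paper.
    criterion = rejection_criterion(noise_magnitude=1.0, saccade_magnitude=6.0)
    print(criterion)
    print(reject_trial([0.1, -2.5, 0.3], criterion))
```

Placing the criterion near the noise floor rather than near the saccade-evoked magnitude makes the rule conservative: movements well below 1.2 deg are also flagged, at the cost of occasionally rejecting a noisy but stable trial.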
More precisely, the criterion was calculated by adding to the maximum noise-evoked convolution magnitude 20% of the difference between the saccade-evoked convolution magnitude and the noise-evoked convolution magnitude. Apparently because of the smaller average eye opening of the Chinese participants, a reliable estimate of the magnitude of the convolved signal evoked by the small calibration saccades could not be made; indeed, often the noise was as large as this signal. The criterion for rejecting a trial was instead set at just above the noise seen in the perisaccadic interval around the 11.6 deg saccade time.

3 Results

3.1 Letter strings, repetitive presentation

Figure 3 shows performance as a function of alternation rate for the English words, nonwords, and pseudowords conditions, after trials with significant eye movement were eliminated. Accuracy was highest and usually close to perfect for the slowest alternation rates, when exposure duration was longest. At higher alternation rates, performance gradually declined. Pilot experiments had consistently shown that performance was near chance at frequencies above 8 Hz, and this was borne out in the data. An apparent exception was shown by participant GE for words at 8.5 Hz, but when he returned to the lab 5 months later (data shown as inverted triangles), the anomaly was not replicated, despite the benefit of practice evident at the slower rates. Hence, the one exception to poor performance at higher temporal frequencies appears to have been due to chance. A useful way to summarise performance in this task is with the 75% threshold of each subject in each condition. As such thresholds are commonly used in psychophysics, they have the further advantage of allowing rough comparisons with earlier psychophysical studies. In particular, a point near 75% is often used because it typically lies on a portion of the curve where the slope of accuracy against frequency is steep, making it sensitive to differences across conditions.
We extracted the 75% thresholds from the data by fitting percent correct (%C) to cycle duration, using a class of curves that capture the main features of psychometric functions. The curve class fitted, given in the equation below, was the top half of a cumulative Gaussian rescaled to go from 50% to 98.5% (98.5% used instead of 100% to allow for occasional lapses), with freely varying sigma (σ) and a freely varying exponent (r) on cycle duration in seconds (x). For convenience, the expression is given in terms of the MATLAB error function used (erf):

$$\%C = 0.5\left[1 + 0.985\,\operatorname{erf}\!\left(\frac{1}{\sqrt{2}}\left(\frac{x}{\sigma}\right)^{r}\right)\right].$$

The best-fitting curve was chosen by maximum likelihood via the Nelder–Mead simplex search method (MATLAB code available from the first author). The 75% thresholds in this 2-alternative forced-choice design were sometimes a bit higher than the 3–4 Hz found previously for judging the pairing of spatially separated shapes and colours (Holcombe and Cavanagh 2001), but they were far below the 20 Hz rates achievable for binding local elements into the global forms of Glass patterns (Clifford et al 2004). Recall that the purpose of these experiments was not to contrast the different classes of stimuli. Instead, different classes of stimuli were used simply to determine whether the slow threshold result was a general one. As linguistic variables were not controlled across the different classes of stimuli, a number of factors could explain any differences, which should be kept in mind in the following report of the results in the various conditions. For the different conditions of this experiment, as shown in table 2, performance was highest with the words (75% threshold 4.3 Hz), somewhat lower for the pseudowords condition (3.6 Hz), and still lower for the nonwords (1.9 Hz). Paired-samples t-tests indicate that the difference between the words and pseudowords is not significant (t(6) = 0.82, p = 0.44), whereas all other differences are: for the words versus nonwords t(6) = 3.4, p = 0.01, and for the pseudowords versus nonwords t(6) = 3.1, p = 0.02.

Figure 3. Each plot is percent correct (%C) as a function of alternation rate (AR) for a particular participant (inset initials). Alternation rate is expressed in hertz at bottom, with corresponding stimulus onset asynchrony shown at top. Chance performance is 50%. Trials in which significant eye movement was detected were excluded. Standard error bars are shown where they exceed the symbol size. Thin line is fit psychometric function. Dashed line shows 75% threshold. In the words condition subject GE shows an anomalous data point for the fastest rate, which did not recur in a second session (inverted triangles). [Plots not reproduced: one row per participant (GE, JY, DB, SH, TL, TM, TN) and one column per condition (words, pseudowords, nonwords); axes %C against AR/Hz, with SOA/ms at top.]

Table 2. 75% thresholds (cycles s^-1 / ms SOA), repetitive presentation.

Condition            Mean              SE/Hz   N
words                4.3 Hz / 116 ms   1.6     7
pseudowords          3.6 Hz / 139 ms   2.0     7
nonwords             1.9 Hz / 263 ms   0.7     7
Chinese characters   2.9 Hz / 172 ms   1.1     6

These results clearly indicate that temporal binding thresholds for the nonwords were lower than for the pseudowords and the words. The main point of this paper is that all the conditions yield slow thresholds, all of less than 5 Hz. Still, the difference between the words and nonwords is notable, especially since several strings in the nonwords condition were pronounceable (this was a consequence of satisfying the constraint that the same bigrams be used as in the words condition).
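The threshold extraction and the paired Hz and ms values of table 2 can be illustrated with a short sketch. It is not the authors' MATLAB code: the functional form is one reading of the description (upper half of a cumulative Gaussian, error function scaled by 0.985, exponent applied to cycle duration divided by sigma), the parameter values are hypothetical, and a bisection read-out replaces any closed-form inverse. The Nelder–Mead minimisation itself is omitted; only the likelihood it would minimise is shown.

```python
import math


def p_correct(x, sigma, r, scale=0.985):
    """Assumed psychometric form: 0.5 * (1 + 0.985 * erf((x / sigma)**r / sqrt(2))).

    x is cycle duration in seconds; sigma and r are the two free parameters.
    """
    return 0.5 * (1.0 + scale * math.erf((x / sigma) ** r / math.sqrt(2.0)))


def neg_log_likelihood(sigma, r, data):
    """Binomial negative log-likelihood; data holds (cycle_s, n_correct, n_trials)."""
    total = 0.0
    for x, n_correct, n_trials in data:
        p = p_correct(x, sigma, r)
        total -= n_correct * math.log(p) + (n_trials - n_correct) * math.log(1.0 - p)
    return total


def threshold_cycle_duration(sigma, r, criterion=0.75, lo=1e-6, hi=100.0):
    """Bisection for the cycle duration at which p_correct equals the criterion."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if p_correct(mid, sigma, r) < criterion:
            lo = mid  # still below criterion: threshold lies at a longer cycle
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rate_hz_from_soa_ms(soa_ms):
    """One cycle contains two stimuli, so the cycle lasts twice the SOA."""
    return 1000.0 / (2.0 * soa_ms)


if __name__ == "__main__":
    x75 = threshold_cycle_duration(sigma=0.3, r=1.5)  # hypothetical parameters
    print("75% threshold (Hz):", 1.0 / x75)
    print([round(rate_hz_from_soa_ms(s), 1) for s in (116, 139, 263, 172)])
```

The last line recovers the rates listed in table 2 from the corresponding SOAs, which confirms that the table's two units are related by rate = 1000 / (2 × SOA in ms).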
The effect of blocking can be examined by comparing the unpronounceable strings in the nonwords condition to the four strings within that condition which were pronounceable ('pemo', 'meta', 'kepi', 'refo'). Excluding those pronounceable strings does not change the thresholds by much. Thresholds were lowered by 0.1055 (not statistically significant, t(6) = 0.46, p = 0.67). That the pronounceable-pseudowords condition nonetheless yielded much higher thresholds suggests some benefit of blocking pronounceable nonwords together, perhaps because in the less-pronounceable-nonwords condition participants were in the habit of using a less effective coding strategy for the letter strings. In the analysis just described, thresholds from different-sized data sets were compared (nonwords condition with and without pronounceable nonwords). Some data-fitting algorithms can exhibit a bias dependent on the size of the data set. To ensure that the exclusion of stimuli did not bias the threshold estimation procedure with our algorithm, thresholds were re-estimated after excluding random four-stimulus subsets of the nonword condition. Considering five hundred random subsets drawn with replacement, the average threshold difference was negligible (0.0048 Hz) and certainly not statistically significant (t(499) = 0.57, p = 0.57). For further examination of the effect of pronounceability within the nonwords condition, one might compare thresholds for the four pronounceable strings with the thresholds for the rest. Unfortunately this was not possible, because the small numbers of trials with pronounceable strings made the thresholds unstable. However, some insight can be had into the variation due to particular items by examining percent correct by item, collapsed across duration, which is shown in table 3.
The overall percent correct for the trials of the nonwords condition containing strings which happen to be pronounceable is at the low end of the accuracy for the items in the pseudowords condition, which contained exclusively pronounceable items. Although the effect of blocking should be further explored, the difference between the nonwords condition and the other conditions appears reliable. The difference likely results from one of the various mental processes involved in determining which letters were presented simultaneously. The difference across conditions cannot be due to difficulties in identifying the individual letters at higher rates, as the letters used in corresponding trials of the words and nonwords conditions are identical (table 1). The possible reasons for the difference are described in section 5 below.

Table 3. Percent correct (%C) for each stimulus pairing, collapsed across alternation rates.

Words A     Words B     %C    Nonwords A   Nonwords B   %C      Pseudowords A   Pseudowords B   %C
ball deck   back dell   73    llba ckde    llde ckba    59      rane moll       rall mone       78
pump hell   pull hemp   77    mppu llhe    mphe llpu    70      runk pote       rute ponk       77
dent pull   dell punt   75    ntde llpu    ntpu llde    54      lort hent       lont hert       77
tank mope   tape monk   75    nkta pemo    nkmo peta    74 (a)  pime jurt       pirt jume       65
ball hunk   bank hull   80    llba nkhu    llhu nkba    56      dert jomp       demp jort       65
pink jump   pimp junk   76    nkpi mpju    mppi nkju    58      hape miks       haks mipe       71
tame lick   tack lime   73    meta ckli    ckta meli    64 (a)  zate koms       zams kote       69
pike jock   pick joke   81    kepi ckjo    ckpi kejo    62 (a)  nive lank       nink lave       69
call port   cart poll   78    llca rtpo    rtca llpo    58      paln bisk       pask biln       70
milk fore   mire folk   80    lkmi refo    remi lkfo    67 (a)  ferm hont       fent horm       71

(a) Note: These instances of the 'nonwords' condition contained pronounceable strings.

The present experiment was not designed to investigate the role of word frequency on binding threshold.
Still, its role in variation of performance within our words condition can be tentatively examined by regressing performance on word frequency. Frequencies were taken from the CELEX database (Baayen et al 1995; Davis 2005) and the mean of the log frequencies for each set of four words was calculated. 'Mope' was not present in the frequency database, so it was assigned a nominal frequency of 0.6, lower than that of all the other words, which ranged from 0.67 to 1233. When overall percent correct for each four-word set is regressed on mean log frequency, the resulting r = 0.16 is small and not significant (p = 0.68). Unfortunately, the number of trials per four-word set per rate was too small (4) to allow separate estimation of temporal thresholds for each four-word set. The regression was performed separately for the five presentation rates but was far from significant in all cases, with the r negative for two. In the context of the range of temporal thresholds reported in earlier literature, thresholds in all conditions of this experiment were decidedly slow. But rather than being a general result for linguistic stimuli, it is possible that these results depend on the particular words and nonwords chosen, the language used, or even the writing system. To determine the generality of the slow thresholds found for these linguistic stimuli, we decided to go as far afield as possible, and investigate a case of different words from a different language with a different writing system.

3.2 Chinese characters (repeated presentation)

Figure 4 shows percent correct as a function of alternation rate for the Chinese characters, with one plot for each of the six Chinese participants. As in the experiment with letter strings, accuracy decreased rapidly with increasing temporal frequency. The mean 75% threshold was 2.9 Hz, in the slow end of the range found with the letter strings.
A quantitative statistical comparison with the results of the letter strings experiment is not advisable, given the use of different participants and materials. Still, there is no question that thresholds are a great deal lower than the 15–20 Hz frequencies found for binding local elements into regular global forms (Clifford et al 2004).

Figure 4. Each plot shows one subject's percent correct (%C) as a function of alternation rate (AR) of the Chinese characters. The thin line shows the psychometric fit, and the inferred 75% threshold is also shown. Standard error bars are shown where they exceed symbol size. All subjects were expert readers of simplified Chinese. Chance performance is 50%. [Plots not reproduced: one panel per participant (JY, CL, DS, LH, JL, NL); axes %C against AR/Hz, with SOA/ms at top.]

Although in reading, as with English words, Chinese character corpus frequency can have a large effect on performance (Seidenberg 1985; Hue 1992), in the present task no significant effect was found. Simplified Chinese character frequencies were taken from the combined corpus of Da (2004). The mean log percent frequency was calculated for each four-character set (figure 2). Regressing overall percent correct on log frequency yielded a small Pearson product-moment correlation (r = 0.049, p = 0.89). Trials per word at each rate were too few to yield good estimates of the 75% threshold. However, we were able to examine whether there might be a strong effect of word frequency confined to a particular difficulty level by combining percent correct across observers for the fastest rate for each observer, the second fastest rate for each, etc. Pearson correlations were low and never significant. From the slowest to the fastest rate, the r values were 0.34, 0.19, 0.18, −0.23, and −0.17, with corresponding p values of 0.34, 0.59, 0.62, 0.52, and 0.64.
3.3 Letter strings, single presentation

The results with letter strings described above reflect a repeated-presentation method designed to test the temporal resolution of binding the parts together. This is in contrast with earlier studies of the perception of masked letter strings, in which a stimulus was presented only once in a trial. It was expected that, with this more traditional methodology, the items could be discriminated on the basis of much shorter exposure intervals as, without repeated presentation, binding during the exposure time interval was not necessary. The data in figure 5 show that, indeed, letter strings can be perceived on the basis of much briefer intervals when only one is shown, even when it is masked. The reasons for this are described in section 5. Although chance performance was 25% rather than 50% as in the other experiments, performance was so good that SOAs even for 75% performance for many participants were briefer than could be presented with our CRT, and were almost always much briefer than thresholds with repeated presentation. Since thresholds could not be estimated from these data, paired t-tests on percent correct were performed to compare performance across conditions. Successive screen refreshes were separated by 11.8 ms, and the briefest stimulus exposure in the experiment consisted of two refreshes drawing the letter string and a single blank frame before the post-mask, for a total exposure of 35.3 ms. At this shortest SOA, participants provided the correct answer in 87% of trials in the words condition against 71% of trials in the nonwords condition. A paired t-test showed this difference to be significant (t(10) = 2.9, p = 0.02). The difference was also significant at the next two longer durations, for which the advantage of the words condition was 10% and 13% (t(10) = 2.25, p = 0.048; and t(10) = 4.1, p = 0.002), but not at the longest two durations, for which the advantage was 3% and 2%.
The absence of a statistically significant difference in the longest two durations is welcome, as it suggests that the performance difference was caused by the manipulation of exposure duration rather than always being present. The differences at the faster rates are comparable to, or somewhat larger than, those found in earlier literature (Manelis 1974). Accuracy in this experiment was not sensitive to mean log word frequency of each four-word set. Regressing overall percent correct on mean log word frequency per four-word set yields an inverse correlation (Pearson r = −0.33, p = 0.35). Performing the analysis separately by rate yields a slightly negative r for each rate, with none statistically significant.

Figure 5. With this single-presentation experiment, the same stimuli and exposure durations were used as in the repeated-presentation experiments. Overall performance and 75% thresholds are much better. Each plot shows one subject's percent correct (%C) as a function of SOA of the two successively presented words. Equivalent alternation rate (AR) is shown at bottom. Standard error bars are shown where they exceed symbol size. [Plots not reproduced: one panel per participant (AS, AZ, JW, LJ, CS, DB, RC, TM, TN, JO, EL), with separate symbols for words and nonwords; axes %C against AR/Hz, with SOA/ms at top.]

4 Eye movements and 75% thresholds

4.1 Letter strings, repetitive presentation

In this experiment one concern was that eye movements might, on some trials, allow participants to circumvent the pre- and post-masking. Each successive stimulus in the repetitive-presentation paradigm is meant to mask the others by falling on the same patch of retina. This way, when presentation rate exceeds the temporal resolution of the neural processes that bind the letters into a string, successive unbound strings will be combined and alternating strings with the same sum will not be discriminated.
However, if a saccade is made at particular times during the stimulus presentation, then successive strings will not land on the same patch of retina. Potentially, the brain could then combine visual information over long periods and still recover the string's identity. Indeed, in the case of an eye movement, the string would be combined with the empty background rather than a successively presented masking string. Hence, saccades might yield inflated temporal thresholds; moreover, a different pattern of eye movement in different conditions might change the pattern of thresholds across conditions. In the authors' experience, using saccades willfully to subvert the masking was difficult but occasionally effective, especially after extensive practice. Strategic eye blinking could also subvert the stimulus masking. Fortunately, eyeblinks were easily detected by the eyetracker and all but quite small saccades were also reliably detected. Analysis of the data indicates that eye movements and blinks did not have a substantial effect on thresholds. Thresholds estimated from the full data set were compared with those estimated from the data after discarding trials in which saccades were detected (see section 2.5 for eyetracking details). The mean changes in threshold caused by rejecting trials with eye movements in each condition were, with associated standard error across the seven subjects, −0.6 ± 0.9 Hz in the words condition, 0.2 ± 0.5 Hz in the pseudowords condition, and −0.1 ± 0.2 Hz in the nonwords condition. None of these changes in estimated threshold was significant according to t-tests. Also nonsignificant were the differences between conditions in these threshold changes (one-way ANOVA: F(2, 18) = 2.73, p = 0.092).
Nonetheless, the decrease in threshold caused by rejecting trials with eye movement in the words condition (t(6) = 1.74, p = 0.13), although statistically nonsignificant, is consistent with the possibility that, in a small number of trials, eye movements defeated the pre- or post-masking. The magnitude of threshold changes due to rejecting eye movements is nearly as large as the difference in threshold between the words and pseudowords conditions. For this reason, any future studies of the difference between these conditions (nonsignificant in this study) should monitor eye movements. The evidence also indicates that the number of eye movements and blinks detected differs across the conditions. The differences across the conditions in number of trials rejected were not very large, but they were statistically significant. Of 40 trials per condition per subject per interstimulus interval, the mean number and associated standard error of trials rejected were: 10.8 ± 1.14 for the words condition, 4.7 ± 0.88 for the pseudowords condition, and 8.6 ± 0.92 for the nonwords condition. In an ANOVA with condition, interstimulus interval, and their interaction as factors and number of trials rejected as the dependent variable, condition was significant (F(2) = 4.43, p = 0.015), whereas interstimulus interval was not (F(4) = 0.11, p = 0.98). It is not clear why participants apparently made a comparable number of eye movements in the words and nonwords conditions, but made fewer in the pseudowords condition. However, this pattern of differences does not correlate with the differences in temporal threshold, which provides further confidence that the measured differences in thresholds were not related to differences in eye movements in the different conditions. Some participants exhibited significantly more eye movements than others, with the average number of trials rejected per 40-trial block ranging from 2.3 for one participant to 18.3 for another.
This was expected, as it is commonly observed in psychophysical experiments that some participants fixate better than others. If these eye movements allowed some subjects to inflate their thresholds, then one would expect a positive relationship between number of eye movements and 75% thresholds estimated from the data when all trials, including eye-movement trials, are included. A small and statistically nonsignificant but positive correlation was found (r = 0.2, p = 0.39). That this already-small correlation diminished to nearly zero (r = 0.04, p = 0.86) when trials with detected eye movements were discarded further increases confidence that thresholds after screening by the eyetracker were not greatly inflated by eye movements.

4.2 Chinese characters (repetitive presentation)

The relationship of the data in the Chinese characters experiment to the detected eye movements did not differ markedly from that found in the letter-strings experiment. Chance performance was 50%, and the 75% thresholds were again estimated just as in the English repetitive-presentation experiment. In the Chinese characters experiment, of the 40 trials that each subject ran in each condition–ISI combination, an average of 11.9 were rejected owing to eye movements or blinks. Rejecting these trials caused a mean decrease of threshold of 0.4 Hz, which is statistically nonsignificant (t(5) = 1.65, p = 0.16). As in the experiment with letter strings, there were differences among subjects in the number of eye movements made, with the range spanning 6 to 17. Again, as in the experiment with letter strings, statistically nonsignificant correlations were found between the number of eye movements made and the threshold, both before and after trials with eye movements were discarded (r = 0.45, p = 0.37; r = 0.35, p = 0.49, respectively).

5 Discussion

Of all visual skills, reading is one of the most overlearned. Indeed, most literate adults have decades of near-daily experience with the task of reading.
Thanks to this practice, word reading is quite automatic (Stroop 1935). If automaticity and general experience determined temporal limits in perception, then we might expect thresholds for words to be among the fastest of all visual thresholds. Instead, in the current study we found that temporal binding thresholds for English words are comparable to those for nonword letter strings and arbitrary conjunctions of spatially disparate colour and shape elements. We found a similar result for binding the parts of Chinese characters, even though they may be processed more holistically than are English words (eg Tzeng et al 1979; Chen 1984). The enormous advantage of words over nonword stimuli in frequency of exposure, memorability, and other factors has not allowed temporal binding thresholds to approach those of the fastest in the visual system. Another result of the present work further distinguishes the slow-threshold linguistic material from visual material that is bound at high rates: the phenomenology is clearly different. In the case of alternating Glass patterns and alternating gratings, at rates faster than several hertz observers report perceiving both patterns simultaneously, yet the features presented at the same time remain grouped in perception (Holcombe 2001; Clifford et al 2004). In the case of the letter strings, as the rate of alternation increases beyond 5 Hz, the successive letters eventually appear as if transparently overlaid, but observers reported that there is no strong grouping perceived between particular letters. In other words, binding is inaccurate not owing to perceptual misbindings, but rather because no particular binding is really perceived. Although this is very like the phenomenology previously found for slow alternation thresholds, it is entirely different from the binding problems reported with single exposure.
In those cases, participants often mistakenly had high confidence that their binding errors were correct (Shallice and McGill 1978; Mozer 1983). Further investigation will be needed to discover the reasons for this difference.
Frequency of experience with words has a large effect on response time in conventional word-reading tasks. In a study of visual-duration thresholds for identifying words, Howes and Solomon (1951) found large Pearson correlations with log frequency, of between r = 0.5 and 0.8. Unlike conventional identification tasks, with our forced-choice methodology participants were informed of the options before each trial. The intention was to minimise any time needed for lexical search and other linguistic processes, in order to maximise the possibility for a fast threshold. The absence of a significant frequency effect suggests that we succeeded. Previous studies of Chinese characters have sometimes found an even larger frequency effect than is found in English (Seidenberg 1985; Hue 1992). In the present investigation with Chinese characters this frequency effect was absent.
Our primary interest is what causes words and Chinese characters to be inaccessible at high temporal frequencies, in contrast to simple motion direction (Burr and Ross 1982), flicker, edges and texture boundaries (Forte et al 1999; Kandil and Fahle 2003; Ramachandran and Rogers-Ramachandran 1991), depth from binocular disparity (Morgan and Castet 1995), and pairings of superposed local colour and orientation. Perhaps the most striking discrepancy from the present result with pairing the spatially separated forms of a word was the result when many form elements were arranged into simple shapes such as spirals or radial patterns. These are integrated with high efficiency in both space and time (Wilson and Wilkinson 1998; Clifford et al 2004), whereas letters and Chinese character parts apparently are not.
These results fit well into the present theoretical framework of lower temporal resolution for central processes. First, consider the case of a series of bars drifting at high temporal frequencies, resulting in a retinal patch and its corresponding cortical cells receiving a rapidly fluctuating input. The motion direction of such stimuli is extracted by areas relatively early in the processing hierarchy (MT or earlier) whose neurons have high temporal resolution (Borghuis et al 2003). These mechanisms transform the rapidly varying stimulus information into a constant signal. That is, the rapidly varying (high temporal frequency) pattern of light and dark as the bars pass by in a particular direction is signaled by a relatively constant (low temporal frequency) firing by direction-selective cells. Hence, any subsequent loss of high temporal frequencies, such as may occur at more central stages, will preserve the now low-frequency motion signal. In the case of words and Chinese characters, the poor temporal resolution may result from the lexical recognition mechanisms being situated after a stage which averages over several dozen milliseconds or more.
Unfortunately, extant models of word processing have yet to address these temporal binding issues (McClelland and Rumelhart 1981; Brysbaert and Vitu 1998; Coltheart et al 2001; Engbert et al 2002; Pollatsek et al 2003; Reilly and Radach 2003; Davis, submitted). Indeed, usually no attempt is made to situate the efficiency of word processing in the context of other visual judgments. Instead, word models in the literature are designed primarily to explain differences in response time due to psycholinguistic factors such as lexicality, word frequency, length effects, and priming effects. A partial exception is the modeling work of Graboi and Lisman (2003), who grapple directly with some neurophysiological constraints on processing time.
Their model does not address our repetitive-presentation method, but appears to predict that, with a 2-alternative task like ours, recognition should occur in only a few processing cycles, or about 50 ms or less. This is, of course, quite different from the thresholds found here, which were uniformly greater than 100 ms. If word-perception models were changed to accommodate temporal binding thresholds, this might help the models account for normal and rapid serial-presentation reading as well.
Progress in understanding visual thresholds for words, as opposed to response times, has been halting, and computational models that explain word perception do not seem to address the recent psychophysical threshold findings. Analysis to date, however, indicates that contrast and lateral-masking thresholds for words can be explained by independent detection of letters, without holistic word processing (Pelli et al 2003; Martelli et al 2005). This is quite different from contrast thresholds for certain static forms and motions (Morrone et al 1995; Wilson and Wilkinson 1998), in which detectors integrate efficiently over the constituent elements. As these same stimuli and visual judgments that efficiently pool over space also have precise temporal thresholds, we see a parallel here. Perhaps the presence of a specialised mechanism in the visual system leads to both benefits for the visual judgments it serves.
The fast thresholds found to date are robust, in that variants of the forced-choice task and of the particular stimulus parameters chosen do not change the >15 Hz result (Burr and Ross 1982; Holcombe and Cavanagh 2001; Clifford et al 2003, 2004), provided that the total duration of the stimulus train is sufficient (Bodelon et al 2004). The slow 3 Hz thresholds are also robust, in that different arrangements of the spatially separated features had little effect on thresholds (Holcombe and Cavanagh 2001).
Given the robustness of previously found fast and slow thresholds to methodological changes, it is unlikely that changing the particulars of the presentation method used will allow thresholds for pairing spatially separated letters to improve by much. The present results suggest, however, that high-level factors like pronounceability might shift the slow thresholds. This is in agreement with the present theory. To understand why, first consider the robustness of the fast thresholds. In cases where early visual mechanisms bind the relevant visual information, if the observer is to make the appropriate response, the visual representation must subsequently be recognised by central processes and consolidated into memory. These central processes will certainly take time, but thanks to the temporal transparency phenomenon the relevant visual representation is continuously available to central processes (Holcombe 2001). Hence, with a long train of stimuli any reasonable duration requirement of central processing is eventually met, as long as the temporal frequency does not exceed the resolution of the early visual-binding mechanism.
The situation is quite different when temporal transparency does not occur, such as with the present linguistic material. Then, central processes no longer have the uninterrupted, extended duration of the stimulus train available to recognise and remember. Instead, without temporal transparency a particular stimulus is available centrally only in a series of short episodes interrupted by short episodes of the other stimulus. The duration of every episode corresponds to the interstimulus interval. With visual recognition and memory encoding and consolidation both likely to require a substantial amount of sustained processing time (Jolicoeur and Dell'Acqua 1998; Lawson and Jolicoeur 2003), factors that affect the duration of these processes would also affect the temporal threshold in our tasks.
In other words, when peripheral processes do not transform the intermittent stimulus into a constant representation, cognitive stages can be a time-limiting step. In the case of the unpronounceable strings in our experiment, they likely had longer central encoding times and this may have yielded the slower thresholds.
Interestingly, the advantage here for words over unpronounceable nonwords, although small when compared to the full range of resolution limits exhibited in visual performance, is sizeable by the standards of previous studies. In these previous studies using brief presentation of linguistic material, the letter string was presented just once in a trial, and usually post-masked. Duration identification thresholds were shorter for words than for nonwords (eg Baron and Thurston 1973; Manelis 1974). Typically, however, participants were not informed of the alternatives before the stimulus was presented, allowing memory factors and bias to play a large role. Most published results from forced-choice designs did not find an advantage for words (Bjork and Estes 1973; Thompson and Massaro 1973). In the forced-choice study which did find a word advantage (Smith and Haviland 1972), the advantage was quite small, much smaller than that found with the repetitive-presentation paradigm of this paper. A 4%–7% difference in accuracy was found in that study, whereas our study yielded an accuracy difference of 16% at the shortest duration in the single-presentation case and even larger in the repeated-presentation experiment (as much as 22%).
It is always difficult to be sure that a difference between words and nonwords has not been caused by the greater ease with which words are remembered.
In the present study we attempted to minimise the role of memory by presenting the four possible stimuli both before and immediately after each trial, but this does not guarantee that the observers invested the time necessary to completely internalise the options. If they did not, the poorer thresholds in the nonwords condition could be explained by a difference in the degree to which the alternatives were held in working memory during the stimulus viewing period. Even if the alternatives are learned well, differences in memory consolidation and retention processes could still result in poorer thresholds for nonwords. Determining the processing stage(s) at which words have an advantage over nonwords will require significant further work.
One reason for the historically large difference found here between words and unpronounceable strings likely is the novel repeated-presentation methodology. This method also provides some other advantages over traditional methods, as discussed in the following section.
5.1 Masking, single presentation, and multiple presentation
In single-presentation conditions, the SOA between word and mask can be less than 40 ms and still the word can be reliably discriminated (Manelis 1974). This result was replicated in the present work. But the result with the novel, repetitive temporal-binding paradigm was quite different: 75% thresholds of 116 ms, even though the same items were used in both conditions. The present experiments, then, show that differences between the single-presentation and repetitive-presentation literatures are due to method rather than material. A major factor distinguishing single from repetitive presentation is the difference in the time scale over which the target's visual information is available. This difference may go a long way towards explaining the much longer thresholds found in repetitive presentation.
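To relate the per-item durations quoted in this section to the alternation frequencies quoted elsewhere in the paper, a minimal sketch of the conversion follows. It rests on an assumption that is not stated in this section: one cycle of alternation consists of two items of equal duration with no blank interval, so that frequency in Hz equals 1000 / (2 × item duration in ms). The function names are hypothetical and are used only for illustration.

```python
def frequency_from_duration(item_duration_ms: float) -> float:
    """Alternation frequency (full cycles per second) for two items that
    alternate, each shown for item_duration_ms.

    Assumption: one cycle = two equal-duration items, no blank interval.
    """
    if item_duration_ms <= 0:
        raise ValueError("item duration must be positive")
    return 1000.0 / (2.0 * item_duration_ms)


def duration_from_frequency(frequency_hz: float) -> float:
    """Inverse conversion: per-item duration in ms for a given alternation
    frequency in Hz, under the same two-items-per-cycle assumption."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / (2.0 * frequency_hz)


if __name__ == "__main__":
    # Durations mentioned in the text: 116 ms (repetitive-presentation
    # threshold), 100 ms, and 40 ms (single-presentation SOA).
    for ms in (116.0, 100.0, 40.0):
        print(f"{ms:6.1f} ms per item -> {frequency_from_duration(ms):5.2f} Hz")
```

Under this assumption a 116 ms item duration corresponds to roughly 4.3 Hz, which is in line with the abstract's statement that thresholds were 5 Hz or less, whereas a 40 ms duration would correspond to 12.5 Hz.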
Consider that the repetitive-presentation condition was designed to be limited by the temporal precision with which the word parts are represented. In the repetitive-presentation condition, if the system loses enough precision in its representation of which letters were presented when, performance should be at chance. In the single-presentation case, temporal imprecision is of little consequence. As long as each of the letters is perceived, it matters not whether the system represents their time of occurrence precisely. Furthermore, in single presentation even when the letter strings are presented so briefly that the visual system integrates the letter string together with the conventional `XXXX' masks, information is potentially still available to determine which letter string was presented. That is, summing the mask and the letter string does not obliterate cues to the identity of the letter string. This is in contrast to the repetitive-presentation case, where summing successive stimuli obliterates all clues to which of the two alternatives was presented.
In the efforts to determine the stages in processing that lexical factors modulate, researchers should attempt to minimise differences in the attention allocated to words versus nonwords. The repetitive-presentation technique is likely to be less affected by the dynamics of attention than is single presentation. Certain words may attract attention more than do nonwords (Mack and Rock 1998), which may contribute to the advantage of words in single presentation, where attention must be allocated at the right instant. If attention is accidentally engaged at the wrong time, subsequent stimuli can easily be missed (Duncan et al 1994). This is potentially a problem in simple masked displays, whereas the consequences of occasionally engaging attention on the wrong stimulus are less of an issue when targets are the only things presented, and are presented repeatedly.
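The earlier point in this section about temporal summation can be illustrated with a toy sketch. It makes a strong simplifying assumption: integrating successive frames is modelled as nothing more than the unordered collection of letters overlaid at each letter position, ignoring pixel-level summation, contrast, and masking dynamics. The word pairs are the paper's stimuli (`ball'/`deck' versus `dell'/`back'); the function name and the choice of `ball' and `dell' for the masked comparison are hypothetical.

```python
def summed_image(*strings: str) -> tuple:
    """Toy model of temporal integration: for each letter position, return
    the sorted letters overlaid there, discarding presentation order."""
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("all strings must have the same length")
    # zip(*strings) groups the letters that share a position across frames.
    return tuple(tuple(sorted(column)) for column in zip(*strings))


if __name__ == "__main__":
    # Repetitive presentation: the two alternatives sum to the same overlay,
    # so an integrating system has no cue left to tell them apart.
    print(summed_image("ball", "deck") == summed_image("dell", "back"))

    # Single presentation with a mask: the overlays still differ, so cues
    # to the identity of the letter string survive integration.
    print(summed_image("ball", "XXXX") == summed_image("dell", "XXXX"))
```

In this toy model the first comparison is equal and the second is not, mirroring the argument that summation removes all cues in the repetitive-presentation case but not in the masked single-presentation case.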
As we have seen above, differences in the information available between the single- and repetitive-presentation methods at a given SOA may explain the difference in temporal thresholds. This should indeed be considered the most likely explanation of the difference. Still, there remains the possibility of an actual difference in the way the stimuli are treated in the two conditions. It is conceivable that the method of repetitive presentation itself may change the nature of the processing that occurs. Specifically, the temporal integration window conceivably might expand to reflect the temporally extended input of the repetitive-presentation condition. This could lead to the slow alternation thresholds found herein despite temporally precise processing when words are presented in non-repetitive fashion. The accurate performance at high rates of non-repetitive presentation found by Potter and others (Rubin and Turano 1992; Potter 1993) as well as Sperling et al (1971) inspires this suggestion that single presentation may indeed result in shorter integration intervals. Yet this possibility remains doubtful, because all the reasons described above for better performance with non-repetitive presentation apply to this literature as well.
5.2 Future directions
The present results argue for a model in which the temporal precision of binding is fine for efficient, specialised visual mechanisms, but coarse for those that rely on higher-level mechanisms, such as judgments of linguistic material. This is consistent with the theory that high-level mechanisms subserving explicit judgments have access only to coarse visual time scales. In the tasks used in the present paper, only explicit judgments were solicited from the study participants. It is possible that high-level mechanisms, such as those which extract lexical information, process visual information with high temporal resolution, but that this information does not become available to explicit judgments.
The existence of semantic priming from words even with very brief exposures (eg Greenwald et al 1996) is suggestive here, although the concerns described above for single presentation do apply. Still, Potter and her colleagues (Potter 1993; O'Connor and Potter 2002) have provided evidence that individual items are processed to a semantic level when embedded in a rapid stream of items, and that these items can be recalled only if they are conceptually related to other items in the stream. To test the temporal precision of this sort of implicit word processing, a neuroimaging study or behavioural study of priming should be carried out, using an appropriate methodology such as the repetitive-presentation method introduced here.
Acknowledgments. We thank Liqiang Huang for assistance with the Chinese aspects of the study, including his work to identify Chinese character pairs that could be used in our paradigm. Bill Holcombe and Janice Lai also contributed. Tom Sanocki, John Jacobson, and Catherine Harris commented extensively on the manuscript and Mark Elliott provided stimulating feedback. Discussions with Colin Davis were very beneficial.
References
Baayen R H, Piepenbrock R, Rijn H van, 1995 The CELEX Lexical Database. Release 2 [CD-ROM] (Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania)
Baer K V von, 1864 "Welche Auffassung der lebendigen Natur ist die richtige? Und wie ist diese Auffassung auf die Entomologie anzuwenden?" [Which view of living nature is correct? And how is this view to be applied to entomology?], in Reden gehalten in wissenschaftlichen Versammlungen und kleinere Aufsätze vermischten Inhalts Ed. H Schmitzdorff (St Petersburg: Verlag der kaiserlichen Hofbuchhandlung) pp 237–283
Baron J, Thurston I, 1973 "An analysis of the word superiority effect" Cognitive Psychology 4 207–228
Beaudot W H, 2002 "Role of onset asynchrony in contour integration" Vision Research 42 1–9
Bjork E, Estes W K, 1973 "Letter identification in relation to linguistic context and masking conditions" Memory & Cognition 1 217–223
Bodelon C, Fallah M, Reynolds J H, 2004 "Temporal resolution of orientation/color conjunctions", paper presented at the Society for Neuroscience Annual Meeting, San Diego, California
Borghuis B G, Perge J A, Vajda I, Wezel R J van, Grind W A van de, Lankheet M J, 2003 "The motion reverse correlation (MRC) method: a linear systems approach in the motion domain" Journal of Neuroscience Methods 123 153–166
Burr D C, Ross J, 1982 "Contrast sensitivity at high velocities" Vision Research 22 479–484
Brysbaert M, Vitu F, 1998 "Word skipping: Implications for theories of eye movement control in reading", in Eye Guidance in Reading and Scene Perception Ed. G Underwood (New York: Elsevier) pp 125–147
Chen H C, 1984 "Detecting radical component of Chinese characters in visual reading" Chinese Journal of Psychology 26 29–34
Clifford C W G, Arnold D H, Pearson J, 2003 "A paradox of temporal perception revealed by a stimulus oscillating in colour and orientation" Vision Research 43 2245–2253
Clifford C W, Holcombe A O, Pearson J, 2004 "Rapid global form binding with loss of associated colors" Journal of Vision 4 1090–1101 (http://journalofvision.org/4/12/8/, DOI:10.1167/4.12.8)
Coltheart M, Rastle K, Perry C, Langdon R, Ziegler J, 2001 "DRC: a dual route cascaded model of visual word recognition and reading aloud" Psychological Review 108 204–256
Crick F, Koch C, 1995 "Are we aware of neural activity in primary visual cortex?" Nature 375 121–123
Crick F, Koch C, 2003 "A framework for consciousness" Nature Neuroscience 6 119–126
Da J, 2004 "A corpus-based study of character and bigram frequencies in Chinese e-texts and its implications for Chinese language instruction", in Proceedings of the Fourth International Conference on New Technologies in Teaching and Learning Chinese Eds Z Pu, T Xie, J Xu (Beijing: Tsinghua University Press) pp 501–511
Dakin S C, Bex P J, 2002 "Role of synchrony in contour binding: some transient doubts sustained" Journal of the Optical Society of America A 19 678–686
Davis C J, 2005 "N-Watch: A program for deriving neighborhood size and other psycholinguistic statistics" Behavior Research Methods 37 65–70
Davis C J, submitted "The SOLAR (Self-Organizing Lexical Acquisition and Recognition) model of visual word identification, Part 1: Orthographic input coding and lexical matching"
Davis C J, Bowers J S, 2004 "What do letter migration errors reveal about letter position coding in visual word recognition?" Journal of Experimental Psychology: Human Perception and Performance 30 923–941
Duncan J, Humphreys G W, 1989 "Visual search and stimulus similarity" Psychological Review 96 433–458
Duncan J, Ward R, Shapiro K, 1994 "Direct measurement of attentional dwell time in human vision" Nature 369 313–316
Egeth H, Jonides J, Wall S, 1972 "Parallel processing of multielement displays" Cognitive Psychology 3 674–698
Engbert R, Longtin A, Kliegl R, 2002 "A dynamical model of saccade generation in reading based on spatially distributed lexical processing" Vision Research 42 621–636
Fabre-Thorpe M, Delorme A, Marlot C, Thorpe S, 2001 "A limit to the speed of processing ultrarapid visual categorization of novel natural scenes" Journal of Cognitive Neuroscience 13 171–180
Forte J, Hogben J H, Ross J, 1999 "Spatial limitations of temporal segmentation" Vision Research 39 4052–4061
Geissler H G, Schebera F U, Kompass R, 1999 "Ultra-precise quantal timing: Evidence from simultaneity thresholds in long-range apparent movement" Perception & Psychophysics 61 707–726
Graboi D, Lisman J, 2003 "Recognition by top–down and bottom–up processing in cortex: the control of selective attention" Journal of Neurophysiology 90 798–810
Greenwald A G, Draine S C, Abrams R L, 1996 "Three cognitive markers of unconscious semantic activation" Science 273 1699–1702
Gur M, Snodderly M, 1997 "A dissociation between brain activity and perception: Chromatically opponent cortical neurons signal chromatic flicker that is not perceived" Vision Research 37 377–382
Harris C L, Morris A L, 2001 "Illusory words created by repetition blindness: a technique for probing sublexical representations" Psychonomic Bulletin & Review 8 118–126
Hochstein S, Ahissar M, 2002 "View from the top: hierarchies and reverse hierarchies in the visual system" Neuron 36 791–804
Holcombe A O, 2001 "A purely temporal transparency mechanism in the visual system" Perception 30 1311–1320
Holcombe A O, Cavanagh P, 2001 "Early binding of feature pairs for visual perception" Nature Neuroscience 4 127–128
Holcombe A O, Kanwisher N, Treisman A, 2001 "The midstream order deficit" Perception & Psychophysics 63 322–329
Howes D H, Solomon R L, 1951 "Visual duration threshold as a function of word probability" Journal of Experimental Psychology 41 401–410
Hue C W, 1992 "Recognition processes in character naming", in Language Processing in Chinese Eds H C Chen, O J L Tzeng (Amsterdam: Elsevier) pp 93–107
Inhoff A W, Starr M, Shindler K L, 2000 "Is the processing of words during eye fixations in reading strictly serial?" Perception & Psychophysics 62 1474–1484
Johnson J S, Olshausen B A, 2003 "Timecourse of neural signatures of object recognition" Journal of Vision 3 499–512 (http://journalofvision.org/3/7/4, DOI:10.1167/3.7.4)
Jolicoeur P, Dell'Acqua R, 1998 "The demonstration of short-term consolidation" Cognitive Psychology 36 138–202
Kandil F I, Fahle M, 2003 "Mechanisms of time-based figure–ground segregation" European Journal of Neuroscience 18 2874–2882
Kinoshita S, Lupker S J (Eds), 2003 Masked Priming: The State of the Art (New York: Psychology Press)
Kline K, Holcombe A O, Eagleman D M, 2004 "Illusory motion reversal is caused by rivalry, not by perceptual snapshots of the visual field" Vision Research 44 2653–2658
Kline K, Holcombe A O, Eagleman D M, 2006 "Illusory motion reversal does not imply discrete processing: Reply to Rojas et al." Vision Research 46 1158–1159
Lawson R, Jolicoeur P, 2003 "Recognition thresholds for plane-rotated pictures of familiar objects" Acta Psychologica 112 17–41
McCandliss B D, Cohen L, Dehaene S, 2003 "The visual word form area: expertise for reading in the fusiform gyrus" Trends in Cognitive Sciences 7 293–299
McClelland J L, Mozer M C, 1986 "Perceptual interactions in two-word displays: familiarity and similarity effects" Journal of Experimental Psychology: Human Perception and Performance 12 18–35
McClelland J L, Rumelhart D E, 1981 "An interactive activation model of context effects in letter perception: Part 1. An account of basic findings" Psychological Review 88 375–407
Mack A, Rock I, 1998 Inattentional Blindness (Cambridge, MA: MIT Press)
Manelis L, 1974 "The effect of meaningfulness in tachistoscopic word perception" Perception & Psychophysics 16 182–192
Martelli M, Majaj N J, Pelli D G, 2005 "Are faces processed like words? A diagnostic test for recognition by parts" Journal of Vision 5 58–70 (http://journalofvision.org/5/1/6/, DOI:10.1167/5.1.6)
Morgan M J, Castet E, 1995 "Stereoscopic depth perception at high velocities" Nature 378 380–383
Morrone M C, Burr D C, Vaina L M, 1995 "Two stages of visual processing for radial and circular motion" Nature 376 507–509
Mozer M C, 1983 "Letter migration in word perception" Journal of Experimental Psychology: Human Perception and Performance 9 531–546
Murray W S, Forster K I, 2004 "Serial mechanisms in lexical access: the rank hypothesis" Psychological Review 111 721–756
O'Connor K J, Potter M C, 2002 "Constrained formation of object representations" Psychological Science 13 106–111
Pelli D G, Farell B, Moore D C, 2003 "The remarkable inefficiency of word recognition" Nature 423 752–756
Pollatsek A, Reichle E D, Rayner K, 2003 "Modeling eye movements in reading: Extensions of the E-Z Reader model", in The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research Eds J Hyona, R Radach, H Deubel (Oxford: Elsevier) pp 361–390
Potter M C, 1993 "Very short-term conceptual memory" Memory & Cognition 21 156–161
Ramachandran V S, Rogers-Ramachandran D C, 1991 "Phantom contours: A new class of visual patterns that selectively activates the magnocellular pathway in man" Bulletin of the Psychonomic Society 29 391–394
Rayner K, Ashby J, Pollatsek A, Reichle E D, 2004 "The effects of frequency and predictability on eye fixations in reading: implications for the E-Z Reader model" Journal of Experimental Psychology: Human Perception and Performance 30 720–732
Reilly R, Radach R, 2003 "Foundations of an interactive activation model of eye movement control in reading", in The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research Eds J Hyona, R Radach, H Deubel (Oxford: Elsevier) pp 429–455
Reulen J P H, Marcus J T, Koops D, Vries F R de, Tiesinga G, Boshuizen K, Bos J E, 1988 "Precise recording of eye movement: the IRIS technique. Part 1" Medical & Biological Engineering & Computing 26 20–26
Rousselet G A, Fabre-Thorpe M, Thorpe S J, 2002 "Parallel processing in high-level categorization of natural images" Nature Neuroscience 5 629–630
Rubin G S, Turano K, 1992 "Reading without saccadic eye movements" Vision Research 32 895–902
Seidenberg M S, 1985 "The time course of phonological code activation in two writing systems" Cognition 19 1–30
Shallice T, McGill J, 1978 "The origins of mixed errors", in Attention and Performance VII Ed. J Requin (Hillsdale, NJ: Lawrence Erlbaum Associates) pp 193–208
Smith E E, Haviland S E, 1972 "Why words are perceived more accurately than non words: inference versus unitization" Journal of Experimental Psychology 92 59–64
Sperling G, Budiansky J, Spivak J, Johnson M C, 1971 "Extremely rapid visual search: The maximum rate of scanning letters for the presence of a numeral" Science 174 307–311
Stroop J R, 1935 "Studies of interference in verbal reactions" Journal of Experimental Psychology 18 643–662
Stroud J M, 1956 "The fine structure of psychological time", in Information Theory in Psychology Ed. H Quastler (Glencoe, IL: Free Press) pp 140–207
Tao L, Healy A F, Bourne L E, 1997 "Unitization in second-language learning: Evidence from letter detection" American Journal of Psychology 110 385–395
Thompson M C, Massaro D W, 1973 "Visual information and redundancy in reading" Journal of Experimental Psychology 98 49–54
Thorpe S, Fize D, Marlot C, 1996 "Speed of processing in the human visual system" Nature 381 520–522
Treisman A, Souther J, 1986 "Illusory words: the role of attention and of top–down constraints in conjoining letters to form words" Journal of Experimental Psychology: Human Perception and Performance 12 3–17
Tyler C W, Hardage L, Miller R T, 1995 "Multiple mechanisms for the detection of mirror symmetry" Spatial Vision 9 79–100
Tzeng O J, Hung D L, Cotton B, 1979 "Visual lateralisation effect in reading Chinese characters" Nature 282 499–501
Usher M, Donnelly N, 1998 "Visual synchrony affects binding and segmentation in perception" Nature 394 179–182
VanRullen R, Koch C, 2003 "Is perception discrete or continuous?" Trends in Cognitive Sciences 7 207–213
VanRullen R, Thorpe S J, 2001 "The time course of visual processing: From early perception to decision-making" Journal of Cognitive Neuroscience 13 454–461
Wilson H R, Wilkinson F, 1998 "Detection of global structure in Glass patterns: implications for form vision" Vision Research 38 2933–2947
© 2007 a Pion publication
ISSN 0301-0066 (print) ISSN 1468-4233 (electronic) www.perceptionweb.com
Conditions of use. This article may be downloaded from the Perception website for personal research by members of subscribing organisations. Authors are entitled to distribute their own article (in printed form or by e-mail) to up to 50 people. This PDF may not be placed on any website (or other online distribution system) without permission of the publisher.