Putting neurons in culture: The cerebral foundations of reading and mathematics
III. The human Turing machine

Stanislas Dehaene
Collège de France, and INSERM-CEA Cognitive Neuroimaging Unit, NeuroSpin Center, Saclay, France

[Illustration: Raoul Hausmann, L'esprit de notre temps (Tête mécanique). © Paris, Musée National d’Art Moderne]

Summary of preceding talks: The brain mechanisms of reading and elementary arithmetic
• Human cultural inventions are based on the recycling (or reconversion) of elementary neuronal mechanisms inherited from our evolution, whose function is sufficiently close to the new one.
• Why are we the only primates capable of cultural invention?
[Figure labels: quantity representation, high-level control, verbal code, visual form]

A classical solution: new « modules » unique to the human brain
• Michael Tomasello (The cultural origins of human cognition, 2000): « Human beings are biologically adapted for culture in ways that other primates are not. The difference can be clearly seen when the social learning skills of humans and their nearest primate relatives are systematically compared. The human adaptation for culture begins to make itself manifest in human ontogeny at around 1 year of age as human infants come to understand other persons as intentional agents like the self and so engage in joint attentional interactions with them. This understanding then enables young children to employ some uniquely powerful forms of cultural learning to acquire the accumulated wisdom of their cultures »
• “Theory of mind” and “language” abilities certainly play an important role in our species’ pedagogical abilities, and therefore in the transmission of culture
• However, they do not begin to explain our remarkably flexible ability for cultural invention, which cuts across almost all cognitive domains. Another design feature is needed.
The theory of a global workspace
• In addition to the processors that we inherited from our primate evolution, the human brain may possess a well-developed, non-modular global workspace system, relying primarily on neurons with long-distance axons that are particularly dense in prefrontal and parietal cortices
• Thanks to this system,
- processors that do not typically communicate with one another can exchange information
- information can be accumulated across time and across different processors
- we can discretize incoming information arising from analog statistical inputs
- we can perform chains of operations and branching
• The resulting operation may (superficially) resemble that of a rudimentary Turing machine

The Turing machine: a theoretical model of mathematical operations
Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc., 42, 230-265.
[Figure: a machine in m-configuration qn reading the scanned square, which bears the symbol S(r), on a tape of squares]
• We may compare a man in the process of computing a real number to a machine which is only capable of a finite number of conditions q1, q2, … qR which will be called « m-configurations ». The machine is supplied with a « tape » (the analogue of paper) running through it, and divided into sections (called « squares ») each capable of bearing a « symbol ». At any moment, there is just one square, say the r-th, bearing the symbol S(r), which is « in the machine ». We may call this square the « scanned square ». The symbol on the scanned square may be called the « scanned symbol ». The « scanned symbol » is the only one of which the machine is, so to speak, « directly aware ». (...) The possible behaviour of the machine at any moment is determined by the m-configuration qn and the scanned symbol S(r). [This behaviour is limited to writing or deleting a symbol, changing the m-configuration, or moving the tape.]
It is my contention that these operations include all those which are used in the computation of a number.

The essential features of the Turing machine
Turing makes a number of postulates concerning the human brain.
• Mental objects are discrete and symbolic
• At a given moment, only a single mental object is in awareness
• There is a limited set of elementary operations (which operate without awareness)
• Other mental operations are achieved through the conscious execution of a series of elementary operations (a serial algorithm)
[Figure: m-configuration qn, scanned square bearing S(r), infinite tape]

The Church-Turing thesis:
• Any function that can be computed by a human being can be computed by a Turing machine
During his career, Turing himself kept his distance from this thesis:
• On the one hand, he attempted to design the first “artificial intelligence” programs (e.g. the first Chess program) and suggested that the behavior of a computer might be indistinguishable from that of a human being (“Turing test”).
• On the other, he did not exclude that the human brain may possess “intuitions” (as opposed to mere computing “ingenuity”) and envisaged an “oracle-machine” that would be more powerful than a Turing machine

The fate of the computer metaphor in cognitive science and neuroscience
• The concepts of the Turing machine and of information processing played a key role at the inception of cognitive science
• Since the sixties, cognitive psychology has tried to define the algorithms used by the human brain to read, calculate, search in memory, etc.
• Some researchers and philosophers even envisaged that the brain-computer metaphor was the “final metaphor” that “need never be supplanted”, given that “the physical nature [of the brain] places no constraints on the pattern of thought” (Johnson-Laird, Mental models, 1983)
• However, the computer metaphor turned out to be unsatisfactory:
– The most elementary operations of the human brain, such as face recognition or speaker-invariant speech recognition, were the most difficult to capture by a computer algorithm
– Conversely, the most difficult operations for a human brain, such as computing 357x456, were the easiest for the computer.

The human brain: A massively parallel machine
~10^11 neurons, ~10^15 synapses
For basic perceptual and motor operations, computing with networks and attractors provides a strong alternative to the computer metaphor
- Mental objects are coded as graded activation levels, not discrete symbols
- Computation is massively parallel
[Figures: model of written word recognition (McClelland and Rumelhart, 1981); model of face recognition (Shimon Ullman)]

Even mathematical operations – the very domain that inspired Turing – do not seem to operate according to classical computer algorithms
The Distance Effect in number comparison (first discovered by Moyer and Landauer, 1967)
[Figure: response time and error rate as a function of target number (30 to 100) in a « larger or smaller than 65? » task, with example targets 99, 31, 84 and 52]
Dehaene, S., Dupoux, E., & Mehler, J. (1990). Journal of Experimental Psychology: Human Perception and Performance, 16, 626-641.

Do Turing-like operations bear no relation to the operations of the human brain?
This conclusion seems paradoxical, given the wide acceptance of the Church-Turing thesis in mathematics. However...
• When we perform complex calculations, our response time is well predicted by the sum of the durations of each elementary operation, with appropriate branching points
• In some tasks that require a conscious effort, the human brain operates as a very slow serial machine.
• In spite of its parallel architecture, it presents a « central stage » during which mental operations can only proceed sequentially.

On the impossibility of executing two tasks at once
The « psychological refractory period » (Welford, 1952; Pashler, 1984)
[Figure: timelines of Task 1 (target T1, response 1) and Task 2 (target T2, response 2); response times as a function of the time interval between stimuli]
Pashler (1984):
- only « central operations » are serial
- perceptual and motor stages run in parallel
[Figure: stages P1, C1, M1 of Task 1 and P2, C2, M2 of Task 2 along the time axis, with a slack time before C2; response times 1 and 2 as a function of the time interval between stimuli. The P stage is slowed by presenting numerals in Arabic or in verbal notation.]
Sigman and Dehaene, PLOS Biology, 2005

Event-related potentials dissociate parallel and serial stages during dual-task processing
Subjects were engaged in a dual task:
- number comparison of a visual Arabic numeral with 45, respond with right hand
- followed by pitch judgment on an auditory tone, respond with left hand
[Figure: response times RT1 and RT2 as a function of T1-T2 delay; separation of ERPs at long T1-T2 delays, with N1 and P3 components for visual and auditory events]
Sigman and Dehaene, in preparation

Locating the sites of processing bottlenecks: parieto-prefrontal networks
Dux, Ivanoff, Asplund & Marois, Neuron, 2007
[Figure panels: PRP, VSTM/MOT, Att. Blink]
Marois & Ivanoff, TICS, 2005: review of imaging studies of bottleneck tasks

The central stage is associated with conscious processing
The « attentional blink » phenomenon
When both T1 and T2 are briefly presented and followed by a mask, participants who perform a task on T1 may fail to report or even perceive the presence of T2.
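The central-bottleneck account invoked in these slides (Pashler, 1984) can be illustrated with a small numerical sketch. It assumes three stages per task, perceptual (P), central (C) and motor (M), with only the C stages forced to run one after the other. The function name and the stage durations are hypothetical values chosen for illustration, not measurements from the studies cited.

```python
def prp_response_times(p1, c1, m1, p2, c2, m2, soa):
    """Response times (ms) under a single central bottleneck.

    Each RT is measured from the onset of its own stimulus.
    soa is the delay between stimulus 1 and stimulus 2.
    Returns (rt1, rt2, slack).
    """
    rt1 = p1 + c1 + m1
    # The central stage of task 1 ends at p1 + c1 (time from T1 onset).
    c1_end = p1 + c1
    # Task 2 perception ends at soa + p2; its central stage must also
    # wait until the central stage of task 1 has finished.
    c2_start = max(soa + p2, c1_end)
    slack = c2_start - (soa + p2)  # waiting time before C2 can start
    rt2 = c2_start + c2 + m2 - soa
    return rt1, rt2, slack


# Hypothetical stage durations in ms (illustration only).
stages = dict(p1=100, c1=300, m1=150, p2=100, c2=300, m2=150)

if __name__ == "__main__":
    for soa in (0, 100, 300, 900):
        rt1, rt2, slack = prp_response_times(soa=soa, **stages)
        print(f"interval={soa:4d}  RT1={rt1}  RT2={rt2}  slack={slack}")
```

With these values RT1 stays constant, while RT2 falls by one millisecond for every millisecond of added interval until the slack time reaches zero, and then levels off. Lengthening p2 at a short interval is absorbed by the slack time, whereas lengthening c2 always adds to RT2.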
[Figure: stages P1, C1, M1 of Task 1 (target T1) and P2, C2, M2 of Task 2 (masked target T2), with an X marking the interruption of T2 processing; percentage of perceived stimuli as a function of the time interval between stimuli]
J. Raymond, K. Shapiro, J. Duncan, C. Sergent

Conscious access and non-conscious processing during the attentional blink
[Figure: stimulus sequence with variable T1-T2 lag (Target 1, Target 2; example strings HRQF, CINQ, CVGR, XOOX); bimodal distribution of T2 visibility, shown as percent of trials at each subjective visibility rating, split into « seen » and « not seen », across T1-T2 lags]
Sergent, Baillet & Dehaene, Nature Neuroscience, 2005

Time course of scalp-recorded potentials during the attentional blink
[Figure panels: UNSEEN T2 (minus T2-absent trials); SEEN T2 (minus T2-absent trials); DIFFERENCE]
Sergent et al., Nature Neuroscience, 2005

Timing the divergence between seen and not-seen trials in the attentional blink (Sergent et al., Nature Neuroscience 2005)
- Unchanged initial processing: P1 (96 ms), N1 (180 ms)
- Abrupt divergence around 270 ms: N2 (276 ms), N3 (300 ms)
- Late non-conscious processing: N4 (348 ms)
- All-or-none ignition: P3a (436 ms), P3b (576 ms)
[Figure: scalp maps for seen and not-seen trials at each latency]

The cerebral mechanisms of this central limitation: a collision of the N2 and P3 waves
[Figure: processing of Task 1 (difference task/no task), with N2, P3a and P3b components plotted against time from T1 onset; processing of Task 2 (difference seen/not seen), with N2, P3a and P3b components plotted against time from T2 onset]
Dehaene, Baillet & Sergent, Nature Neuroscience 2005

Sources of the difference between seen and unseen trials
[Figure: source activity for seen, not-seen and absent trials in middle temporal (t = 300 ms), inferior frontal, and dorsolateral prefrontal (t = 436 ms) regions]
Activation in event-related potentials: Sergent, Baillet & Dehaene, Nature Neuroscience, 2005
fMRI activation to a seen or unseen stimulus during the attentional blink: Marois et al., Neuron 2004
Phase synchrony in MEG: Gross et al., PNAS 2004

An architecture mixing parallel and serial processing: Baars’ (1989) theory of a conscious global workspace
The global neuronal workspace model (Dehaene & Changeux)
[Figure: hierarchy of modular processors; automatically activated processors; high-level processors with strong long-distance interconnectivity; processors mobilized into the conscious workspace]
Dehaene, Kerszberg & Changeux, PNAS, 1998; Dehaene & Changeux, PNAS, 2003; PLOS, 2005; inspired by Mesulam, Brain, 1998

Prefrontal cortex and temporo-parietal association areas form long-distance networks
- Von Economo (1929): greater layer II/III thickness
- Guy Elston (2000): greater arborizations and spine density
- Pat Goldman-Rakic (1980s): long-distance connectivity of dorsolateral prefrontal cortex
[Figure labels: V1, TE, PFC]

Detailed simulations of the global neuronal workspace using a semi-realistic network of spiking neurons (Dehaene et al., PNAS 2003, PLOS Biology, 2005)
[Figure: network of areas A1, A2, B1, B2, C and D with feedforward and feedback connections, receiving targets T1 and T2; thalamocortical column with supragranular, layer IV and infragranular cortical layers and thalamus, connected to/from lower and higher areas; spiking neurons with AMPA, NMDA, GABA, neuromodulator, NaP, leak, spike-rate adaptation (KS) and optional cellular oscillator currents; « ignition » of the global workspace by target T1; failure of ignition by target T2]
Dehaene, Sergent, & Changeux, PNAS, 2003

Is the brain an analogical or a discrete machine?
A problem raised by John Von Neumann
• Turing assumed that his machine processed discrete symbols
• According to Von Neumann, there is a good reason for computing with discrete symbols, and it also applies to the brain: « All experience with computing machines shows that if a computing machine has to handle as complicated arithmetical tasks as the nervous system obviously must, facilities for rather high levels of precision must be provided. The reason is that calculations are likely to be long, and in the course of long calculations, not only do errors add up but also those committed early in the calculation are amplified by the latter parts of it » (…) « Whatever the system is, it cannot fail to differ considerably from what we consciously and explicitly consider as mathematics » (The computer and the brain, 1958)

Why and how does the brain discretize incoming analog inputs? The answer given by... Alan Turing
The decision algorithm by stochastic accumulation designed by Turing at Bletchley Park
\[ \text{Weight of input } I \text{ in favor of } A = \log \left[ \frac{\text{probability of } I \text{ if } A \text{ is true}}{\text{probability of } I \text{ if } A \text{ is false}} \right] \]
\[ \text{Total weight in favor of } A = \text{initial bias} + \text{weight}(I_1) + \text{weight}(I_2) + \text{weight}(I_3) + \dots \]
[Figure: evolution of the weight in favor of A over time, between a decision boundary for A (emission of response A) and a decision boundary for non-A]

From numerosity detectors to numerical decisions: Elements of a mathematical theory (S. Dehaene, Attention & Performance, 2006, in press)
1. Coding by Log-Gaussian numerosity detectors (1, 2, 4, 8, 16…) on an internal logarithmic scale: log(n)
2. Application of a criterion c to the stimulus of numerosity n, and formation of two pools of units (pool favoring R1, pool favoring R2)
3. Computation of the log-likelihood ratio (LLR) for R2 over R1 by differencing the pool favoring R2 and the pool favoring R1
4. Accumulation of LLR, forming a random-walk process
Response in simple arithmetic tasks:
- Larger or smaller than x?
- Equal to x?
[Figure: random walks on trials 1, 2 and 3 over time, from the starting point of accumulation to the decision threshold for R1 or for R2; mean response time]

A fronto-parietal network might implement stochastic accumulation
• Neurons in prefrontal and parietal cortex exhibit a slow stochastic increase in firing rate during decision making
• Stochastic accumulation can be modeled by networks of self-connected and competing neurons
[Figure: accumulation at a variable speed up to a fixed threshold, followed by the decision (Kim & Shadlen, 1999); simulated neuronal activity (Wong & Wang, 2006)]

Hypothesis: the stochastic accumulation system described above is identical to the central system postulated in PRP models
The accumulation of evidence required by Turing’s algorithm would be implemented by the recurrent connectivity of a distributed parieto-frontal system.
[Figure: Task 1 (stimulus 1, stages P, C, M, response 1, RT1 distribution) and Task 2 (stimulus 2 presented after a delay Δ, stages P, W, C, M, response 2, RT2 distribution) on a 0-2000 ms time axis. P: perceptual stage; C: central integration; W: wait period; M: motor stage]
Sigman and Dehaene (PLoS 2005, PLoS 2006)

Is a stochastic random walk constitutive of the « central stage »?
This model makes very specific predictions about the source of variability in response time:
- Factors that affect the P or M stages should add a fixed delay: number notation (digits or words), motor complexity (one or two taps)
- Factors that affect C should increase variance: numerical distance
[Figure: perceptual stage (P), central integration (C) and motor stage (M) between target and response; RT distributions under manipulation of the P, C and M factors]
Sigman & Dehaene, PLOS Biology, 2004

Prefrontal and parietal cortices may contain a general mechanism for creating discrete categorical representations
• Categorical representation of visual stimuli in the primate prefrontal cortex (Freedman, Riesenhuber, Poggio & Miller, Science, 2001).
• Parietal representations can also be categorical (Freedman & Assad, Nature 2006)
“Whereas the rest of cortex can be characterized as a fundamentally analog system operating on graded, distributed information, the prefrontal cortex has a more discrete, digital character.” (O’Reilly, Science 2006)

Exploring the cerebral mechanisms of the non-linear threshold in conscious access (Del Cul and Dehaene, submitted)
[Figure: masking paradigm in which a prime digit shown for 16 ms is followed, after a delay of 0-150 ms, by a mask made of the letters E and M; subjective visibility and objective performance as a function of delay]
Logic: use this sigmoidal profile as a « signature » of conscious access. Which ERP components show this profile?
[Figure: amplitude (μV) and latency of ERP components as a function of delay, for a left target: P1a (100 ms), P1b (140 ms), N1 (167 ms), N2 (227 ms), P3 (370 ms)]
- Only the P3 shows a non-linearity similar to behavioral report
- Activation profiles become increasingly non-linear with time

A late non-linearity underlying conscious access during masking (Del Cul and Dehaene, submitted)
[Figure: target digit (16 ms) followed after a variable delay by a mask (250 ms); activity in fusiform, parietal and frontal sources for delays of 16, 33, 50, 66, 83 and 100 ms]
- First phase: local and linear
- Second phase: global and non-linear (amplification)

A hypothetical scheme for the « human Turing machine »
• The workspace can perform complex, consciously controlled operations by chaining several elementary steps
• Each step consists in the top-down recruitment, by a fronto-parietal network, of a set of specialized processors, and the slow accumulation of their inputs into categorical bins, which makes it possible to reach a conscious decision with a fixed, predefined degree of accuracy.
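The accumulation step of this scheme can be sketched as a sequential test in the spirit of Turing's weights of evidence: log-likelihood ratios are summed until the total crosses a bound, and the bound sets the accuracy. The sketch below assumes Gaussian evidence samples with unit variance and mean +mu or -mu. These distributional choices, the function names and the parameter values are illustrative assumptions, not part of the lecture.

```python
import math
import random


def accumulate_decision(rng, a_is_true, mu=0.2, target_accuracy=0.95):
    """Sum log-likelihood ratios until a decision bound is crossed.

    Assumption: each sample is Gaussian with unit variance and mean
    +mu if A is true, -mu otherwise.
    Returns (decided_a, number_of_samples).
    """
    # Symmetric bounds at +/- log(p / (1 - p)). Ignoring overshoot past
    # the bound, this gives a proportion of correct decisions close to p.
    bound = math.log(target_accuracy / (1.0 - target_accuracy))
    total, steps = 0.0, 0  # no initial bias
    while abs(total) < bound:
        x = rng.gauss(mu if a_is_true else -mu, 1.0)
        # log[N(x; +mu, 1) / N(x; -mu, 1)] simplifies to 2 * mu * x
        total += 2.0 * mu * x
        steps += 1
    return total > 0, steps


def simulate(n_trials=2000, seed=0, **kwargs):
    """Return (proportion correct, mean number of samples)."""
    rng = random.Random(seed)
    correct, total_steps = 0, 0
    for _ in range(n_trials):
        a_is_true = rng.random() < 0.5
        decided_a, steps = accumulate_decision(rng, a_is_true, **kwargs)
        correct += decided_a == a_is_true
        total_steps += steps
    return correct / n_trials, total_steps / n_trials


if __name__ == "__main__":
    for mu in (0.4, 0.2, 0.1):
        accuracy, mean_steps = simulate(mu=mu)
        print(f"mu={mu}: accuracy={accuracy:.3f}, mean samples={mean_steps:.1f}")
```

In this toy setting, lowering mu (the analogue of a smaller numerical distance) should leave accuracy near the target set by the bound while lengthening the decision time, which is the kind of behavior the slides attribute to the central stage.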
Sigman & Dehaene, PLOS Biology, 2005

Consciousness is needed for chaining of two operations (Sackur and Dehaene, submitted)
• Presentation of a masked digit (2, 4, 6, or 8) just below threshold
• Four tasks: naming, arithmetic (example: n+2), comparison (smaller or larger), and a composite task chaining arithmetic and comparison
[Figure: subliminal performance on non-conscious trials relative to chance level, for naming, arithmetic, comparison and the composite task (incongruent trials)]

Conclusions
• Turing proposed a minimal model of how mathematical operations unfold in the mathematician’s brain
• We now know that the Turing machine is not a good description of the overall operation of our most basic processors
• However, it might be a good description of the (highly restricted) level of serial and conscious operations, which occur within a « global neuronal workspace »
• The global neuronal workspace may have evolved to
- achieve discrete decisions by implementing Turing’s stochastic accumulation algorithm on a global brain scale
- broadcast the resulting decision to other processors, thus allowing for serial processing chains and a « human Turing machine »
- thereby give us access to new computational abilities (the « ecological niche » of Turing-like recursive functions)
• By allowing the top-down recruitment of specific processors, the global workspace may play an important role in our cultural ability to « play with our modules » and to invent novel uses for evolutionarily ancient mechanisms
• Very little is known about the human Turing machine:
- How does the brain represent and manipulate discrete symbols?
- What is the repertoire of elementary non-conscious operations?
- How do we « pipe » the result of one operation into another?
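As a concrete reference point for the « human Turing machine », Turing's 1936 description quoted earlier (m-configurations, a tape of squares, one scanned symbol at a time) can be sketched in a few lines. The transition table below, which adds one to a binary number, and all names in the code are illustrative assumptions rather than anything taken from the lecture.

```python
def run_turing_machine(table, tape, state, head, halt_state="halt", max_steps=1000):
    """Run a one-tape machine and return the non-blank tape content.

    table maps (m-configuration, scanned symbol) to
    (symbol to write, head move in {-1, 0, +1}, next m-configuration).
    tape is a dict from square index to symbol; a blank square is " ".
    """
    tape = dict(tape)  # work on a copy
    steps = 0
    while state != halt_state:
        if steps >= max_steps:
            raise RuntimeError("step limit reached before halting")
        scanned = tape.get(head, " ")  # the only symbol the machine "sees"
        write, move, state = table[(state, scanned)]
        tape[head] = write
        head += move
        steps += 1
    if not tape:
        return ""
    squares = [tape.get(i, " ") for i in range(min(tape), max(tape) + 1)]
    return "".join(squares).strip()


# Hypothetical table: binary increment, starting on the rightmost digit.
INCREMENT = {
    ("carry", "1"): ("0", -1, "carry"),  # 1 + carry = 0, carry moves left
    ("carry", "0"): ("1", 0, "halt"),    # 0 + carry = 1, done
    ("carry", " "): ("1", 0, "halt"),    # ran off the left edge: new digit
}


def increment_binary(bits):
    """Add one to a binary string using the machine above."""
    tape = {i: b for i, b in enumerate(bits)}
    return run_turing_machine(INCREMENT, tape, state="carry", head=len(bits) - 1)


if __name__ == "__main__":
    print(increment_binary("1011"))
```

Each pass through the loop is one serial step determined only by the current m-configuration and the scanned symbol, which is the property the conclusions above relate to the serial, conscious level of operation.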