Modelling of Self Control through Precommitment Behaviour in a Functionally Decomposed Connectionist Architecture

CASE FOR SUPPORT, PART 2: Description of Proposed Research and its Context

The aim of the proposed research is to investigate how evolution has resulted in self control such that people must use precommitment behaviour to control their future actions. Why is it that some people have organised, successful lives while others ruin their lives or, in the extreme, kill themselves through overeating, lack of exercise, smoking or drug abuse, behaviours that are generally deemed irrational? The proposed research aims to investigate these questions by gaining a greater understanding of self control through simulation of the brain as a functionally decomposed parallel system. Unlike a serial system, a functionally decomposed system can have internal conflicts, as different processes exhibit different behaviours. Self control through precommitment is an example of this internal conflict. In practice, self control manifests itself in how people limit themselves by choosing a later reward over an immediate reward. Precommitment is a mechanism for doing this: we make a choice now that makes it impossible, or at least costly, to change our minds later.

Although there has been much research in psychology and economics in the area of self control, to the best of our knowledge this is the first proposal to simulate self control through precommitment behaviour in a computational model that is both biologically and psychologically relevant, in a competitive interaction. The research will aim to answer the question: is this behaviour learned as part of socialisation and survival, or are we born with this capacity? This research will facilitate the advancement of academic knowledge in cognitive and computational neuroscience as well as related areas, most notably psychology and economics.
It will therefore contribute to bridging the gap between the modelling community and the experimentalists.

1. Background

The project will test the theories proposed in psychology and economics on self control behaviour. Self control can be broadly defined as choosing a large delayed reward over a small immediate reward (Rachlin, 1995). Figure 1a illustrates a choice between a smaller-sooner (SS) reward, available at time t2, and a larger-later (LL) reward, available at t3. The thin lines subtended from points SS and LL are temporal discount functions indicating the effect of increasing the delay of the reward, or delay of gratification (Mischel et al., 1989). The gradient of the lines represents the fact that the closer you get to a reward the faster its current value increases (Rachlin, personal communication, 2003). The crossing of the discount functions indicates a reversal of preferences, as in hyperbolic discounting (Green et al., 1994). At time t1, for instance, the value of the larger-later (LL) reward exceeds that of the smaller-sooner (SS) reward. However, at t2, when SS would be immediately available, the value of SS exceeds that of LL.

[Figure 1a. Illustration of choice of larger-later over smaller-sooner reward (adapted from Rachlin, 1995). Axes: value (v) against time (t), with t1, t2 and t3 marked.]
[Figure 1b. Illustration of precommitment behaviour (adapted from Rachlin, 1995). Axis: time (t), with Choice Y at t1, Choice X at t2 and the LL reward at t3.]

To give an example where self control behaviour is exercised, with reference to Figure 1a, let LL represent obtaining good grades and SS going to the pub. Let t1 indicate the start of an academic year. At this time, for most students, the value of getting good grades exceeds that of going to the pub. When invited to the pub at t2, however, the value of SS is higher than that of their long-term goal of getting good grades (LL). A student who exercises self control will choose to study (LL) over going to the pub (SS).
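The preference reversal in Figure 1a can be illustrated numerically. The following is a minimal sketch, not part of the proposed model: it assumes the common hyperbolic form v = A / (1 + kD), where A is the reward amount, D the remaining delay and k a discount parameter, and it uses purely hypothetical reward sizes and times. An exponential discount function is included for contrast, since its curves do not cross for these values.

```python
import math

def hyperbolic_value(amount, delay, k=1.0):
    """Present value under the assumed hyperbolic form A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def exponential_value(amount, delay, rate=0.1):
    """Present value under exponential discounting, shown for contrast."""
    return amount * math.exp(-rate * delay)

# Hypothetical rewards: SS is delivered at t2 = 10, LL at t3 = 15.
SS_AMOUNT, SS_TIME = 4.0, 10.0
LL_AMOUNT, LL_TIME = 10.0, 15.0

def preferred(now, value_fn):
    """Return which reward has the higher present value at time `now`."""
    v_ss = value_fn(SS_AMOUNT, SS_TIME - now)
    v_ll = value_fn(LL_AMOUNT, LL_TIME - now)
    return "LL" if v_ll > v_ss else "SS"

if __name__ == "__main__":
    # t1 = 0 is well before either reward; t2 = 10 is when SS becomes available.
    for now in (0.0, 5.0, 9.0, 10.0):
        print(now, preferred(now, hyperbolic_value), preferred(now, exponential_value))
```

With these assumed values the hyperbolic chooser prefers LL at t1 and SS at t2, while the exponential chooser prefers LL throughout, which is why crossing curves are associated with hyperbolic discounting.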
Research to date suggests that we recognise that we have self control problems and try to solve them by precommitment behaviour (Rachlin, 2000; Ariely, 2002). Precommitment is defined as making a choice with the specific aim of denying oneself future choices (Rachlin, 1995). Examples are: putting an alarm clock away from your bed, to force you to get up to turn it off; saving part of your monthly pay cheque into an investment fund, to prevent you from spending it; keeping chocolate biscuits out of your house, to prevent late-night binges; and disconnecting the Internet connection to your computer when you have a deadline.

Precommitment is illustrated in Figure 1b, which is based on experiments by Rachlin and Green (1972) on commitment behaviour in pigeons. At Choice X, at time t2, when choosing between a small immediate reward, SS, and a large delayed reward, LL, the pigeons preferred SS. However, at an earlier time t1, shown as Choice Y, they preferred an alternative (the lower arm) that restricted the choice at t2 to LL only, i.e. at t1 they committed to LL.

Precommitment behaviour can be viewed as an example of the internal conflicts that arise in our brain (Nesse, 2001). This could be representative of the competition between the "higher" centre of the brain (i.e. rational thought) and the low-level centre (i.e. instinctive behaviour) (Jacobs, 1999). If we were truly rational our preferences would not change over time: if it is in your interest to get up when the alarm clock goes off, you should not want to go back to sleep when it wakes you (Samuelson and Swinkels, 2002). Is precommitment behaviour learned as part of socialisation or are we born with this capacity? Why do people engage in precommitment behaviour, when there is always a cost? For example, alcoholics exercise precommitment by taking the drug Antabuse, which causes severe pain after drinking.
This research aims to provide explanations for both how and why this apparently inconsistent behaviour occurs.

2. Programme and Methodology

2.1 Overall Aims and Objectives

Self control behaviour is a universal trait, which touches our lives at all levels. Although there has been much research in psychology and some biological evidence on the neuroanatomy of self control (Matsui et al., 2002), to the best of our knowledge there is no research to date that integrates these findings into an analysis of how the apparently inconsistent behaviour of self control through precommitment came about. This project is the first attempt to simulate self control through precommitment behaviour in a biologically and psychologically relevant computational model with learning.

The overall aim of the proposed research is to find an explanation for self control through precommitment. We will do this by simulating this behaviour in a computational model of the neural cognitive system. The model must encompass a cognitive architecture that provides a general explanation of self control. It also needs to cater for anomalies. For example, it needs to deal with discounting, that is, the reduction in value of a reward due to delay, and with the reversal of preferences shown in Figure 1a. The model should incorporate a continuous learning process by interacting with the environment through its sensors, i.e. it will have minimal pre-programmed knowledge. It will also be an evolutionary system. The aim is to develop a model that is simple, but also sufficiently detailed to capture essential characteristics of the real neural system, so as to be useful for understanding how the human brain functions. The resulting model will be subject to simulation of evolution with learning in order to find an explanation for self control through precommitment. Possible explanations that will be explored, which are neither mutually exclusive nor exhaustive, are:

1. It results from one of the following: (a) an animal evolving optimal low-level (i.e. instinctive) behaviours in response to certain cues in an ancestral environment; (b) the animal being moved to a novel environment where these low-level behaviours are inappropriate to its higher goals; (c) the animal learning cognitively in the "higher" part of the brain that the low-level behaviours are inappropriate; (d) the animal trying to devise a way in the higher centre of the brain to bypass the low-level behaviours (Sozou, 2003). As such, standard psychological self-control problems can be understood as part of a spectrum of phenomena involving overcoming behaviours which cognition can directly control only partially or not at all. Nesse (2001) alluded to this behaviour, in that the frontal lobes, the most recent part of the human brain in evolutionary terms, are essential to the inhibition of short-term goals in order to fulfil long-term objectives.

2. It results from a best evolutionary compromise in response to environmental complexity and variability. It is not feasible for evolution to program the brain with a direct hard-wired response to every situation it could meet. Instead, there is goal-directed behaviour and a capacity for learning. However, these goals cannot perfectly correspond to fitness. Hence, natural selection has allowed low-level behaviours to effectively take control when cues are strong enough to reliably reflect fitness consequences. This gives rise to the multiple personalities theory; for example, the person who wakes up in the morning is different from the person who went to sleep the previous night. This theory was suggested by Trivers (2000) in the context of the evolution of self-deception.

3. It is a side effect of the necessarily functionally decomposed structure of the nervous system and brain (Jacobs, 1999). Natural selection has not removed this apparently inconsistent behaviour because the imposed cost is low or the constraint to be overcome is too small.

4. It is a direct consequence of a conflict of interest between altruistic behaviour associated with the higher centre of the brain and behaviour such as emotion and aggression associated with the limbic system (suggested by Trivers (2000) to be a genetic conflict).

5. It is a side effect of the evolution of commitment mechanisms for game-theoretical situations where commitment is useful, e.g. anger and self-deception (Nesse, 2001).

The individual measurable objectives of the proposed research are as follows:

1. To bring together research in psychology, neuroscience, economics and computer science, and to validate this research by implementation of a neural network system that is both biologically and psychologically relevant.

2. To investigate functional decomposition in the brain with particular reference to internal conflict related to self control through precommitment behaviour, with the aim of building a neural model of these decomposed functions. The model will provide a cognitive architecture of the human brain with learning. It should not exclude the other scales of brain modelling, i.e. networks of neurons and single neurons. The model should be able to provide a general explanation of self control behaviour through precommitment, but must also deal with anomalies.

3. To test explanations 1 to 5 above by simulation of a functionally decomposed neural system undergoing evolutionary adaptation with robust learning. In order to do this, the conditions and parameter values required for each theory will be established empirically.

2.2 Detailed Methodology

A brief description of the overall methodology to be used

The research will explore the theory that the human brain is a modular system, i.e. it is composed of different modules each specialising in different tasks, with each module skilled in all areas but dominant in one (Jacobs, 1999).
A computational model of the neural cognitive system of self control through precommitment behaviour will be developed, inspired by the structure and function of the human brain. From the viewpoint of modern cognitive neuroscience, self control as an internal process can be represented as in Figure 2 (Rachlin, 2000).

[Figure 2. Self control as an internal process within a Reinforcement Learning (RL) framework (based upon Rachlin, 2000). Labels: Environment; Agent; Higher Brain (cognition); Lower Brain (motivation); State; Stimulus; Action; arrow 1, information; arrow 2, behaviour; arrow 3, stimulus.]

Arrow 1 represents information coming into the cognitive system located in the higher centre of the brain, which represents the frontal lobes associated with rational behaviour such as planning and control. This information combines with memory located elsewhere in the brain (possibly the hippocampus) to form ideas about the world. These ideas combine with messages coming from the lower brain, representing the limbic system, which is associated with emotion and action selection (Trivers, 2000; O’Reilly and Munakata, 2000; Rachlin, 2000). This forms purpose, which travels back down, countermanding or augmenting stimuli entering the lower brain (arrow 3). This finally results in behaviour (arrow 2). In this simple model, the modules can be represented as a network architecture of two interacting networks of neurons, to reflect the higher and lower centres of the brain. This follows on from the idea proposed in Section 1 above that precommitment behaviour is an example of an internal conflict between the lower and the higher centres of the brain. Learning from interaction with our environment is a fundamental idea underlying most theories of learning, and for this reason a neural network with reinforcement learning is to be used.
Different weight update rules and different numbers of neurons will be tried for each module, with the aim of determining what dependencies there are and what internal conflicts exist in a functionally decomposed neural system. The resulting model will go some way towards investigating the possible explanations listed in Section 2.1 for this apparently irrational behaviour: if individuals were fully rational, precommitment would be unnecessary, as any later temptation that would jeopardise their true preference would be rejected (Nesse, 2001).

In order to simplify the model there may be a requirement to differentiate between personal precommitment and precommitment as viewed within game theory (Nesse, 2001), i.e. in games one can safely assume that a player is playing to win, as opposed to personal precommitment, where one has moral and other dilemmas. The project will focus initially on precommitment in the game-theoretical sense, with a view to using this as a basis for future modelling of personal precommitment.

The resulting Artificial Neural System will be verified by examining how people play games, in particular games which have a real-world application. For example, pollution can be regarded as a game of Prisoner’s Dilemma (Hamburger, 1979); war can be regarded as another dilemma game called Chicken (Binmore, 1992), where two players compete for a piece of territory. If one chickens out, he loses; if both chicken out, the situation remains the same; and if neither chickens out, the consequences are unpleasant for both players. The effect of reward and punishment on the two neural networks’ behaviour will be observed in various game-theoretical situations. The neural system will play against other neural systems and also against human players. The results of these tournaments will be compared to theoretically optimal strategies. A strategy or policy is defined as the decision-making function which specifies what action to take in any situation.
In psychology this would be a set of stimulus-response associations. An optimal strategy is defined in terms of Nash Equilibrium, where each player’s strategy choice is a best reply to the strategy choices of the other players (Rubinstein, 1982).

In the final simulation of this functionally decomposed neural network system, which is able to undergo evolutionary adaptation with learning, the parameters defining the network will be subject to simulated genetic evolution using genetic algorithms (Holland, 1992). The resulting networks will be subject to learning, thereby exploring evolution with robust learning within a modular neural system.

Methodology for neural modelling of a computational functionally decomposed cognitive system

The model must explain in computational terms how the brain generates the apparently inconsistent behaviour of self control through precommitment, based on the known neurophysiology of the brain. It will do this by building a network architecture of two networks exhibiting different behaviours to represent the higher versus lower cognitive functions as described in Figure 2. Current research indicates that the higher cognitive functions are not based on the action of individual neurons in a limited area but on the outcome of integrated action of the brain as a whole (O’Reilly and Munakata, 2000). For this reason, a holistic approach to the brain as a functionally decomposed system will be adopted. Whilst the model will be a cognitive architecture of higher cognitive functions, it should not exclude research into brain modelling at the neuron level and the morphology of individual neurons or networks of neurons responsible for specific functions (Fodor, 1983; Jacobs, 1999). The model will explore the development of the brain as a functionally decomposed system with a bootstrapping strategy, i.e. building blocks are constructed from simpler tasks first as a framework for more difficult tasks later (Cohen et al., 2002).
Particular attention will be given to neural competition between modules, without excluding inhibitory competition at the neuron level (Jacobs, 1999). The model will also take into consideration the complexity of the environment as well as behaviour. The variables that define the network will be parameterised to enable control of the model. At a minimum these will include the form of learning, the learning rate and the number of neurons in each module. We will validate the model by comparing its behaviour with that of more detailed models (Cohen et al., 1990; Taylor, 2002).

Methodology for building the neural model and testing with game-theoretical scenarios

A feasibility study has already been conducted in which we constructed a computer simulation of a neural network playing a game with real-world consequences using natural forms of learning, e.g. Reinforcement Learning (Banfield and Christodoulou, 2003; 2004). The Reinforcement Learning framework as described by Barto and Sutton (2002) was implemented successfully in games where the players' payoffs are neither totally positively correlated nor totally negatively correlated (general-sum games). The results showed that Reinforcement Learning worked successfully when the network competed against an artificial opponent whose responses were generated randomly with uniform probability (Banfield and Christodoulou, 2003) and when two networks learned simultaneously in a shared environment (Banfield and Christodoulou, 2004).

The current study will build on the results of Banfield and Christodoulou (2003; 2004) to implement two artificial neural networks representing the higher and lower centres of the brain, which will compete against each other in general-sum games that model a real-world situation. In the first set of experiments a simple general-sum game will be used. Rubinstein’s Bargaining game (1982) is such a game and therefore is appropriate.
Rubinstein’s Bargaining game involves two players and a resource, or pot, such as money. The two players seek to agree how to divide the pot. The pot decreases at each turn of the game by a fixed amount, hence it pays both players to reach an agreement sooner rather than later. At the beginning of a turn, one player makes an offer to the other player, which is the fraction of the pot that player is willing to give to the other player. The other player can either accept or reject. If he rejects the offer, he then makes a counter-offer and the game continues for another turn. The game terminates when either nothing remains in the pot or one of the players accepts an offer. Each player seeks to gain as much of the pot as he can. The concept of time is only relevant for the duration of one game, which is found to last at most five turns (Banfield and Christodoulou, 2003). Thus there is no need to retain details of previous games and no need to employ temporal neural networks such as Time Delay or Recurrent neural networks. The Neural Network will be implemented as a simple feed-forward multi-layer perceptron network using a variety of weight update rules.

To verify the results from Rubinstein’s Bargaining game we will test whether a similar neural architecture gives equivalent results with a different game. For this purpose, it is intended to repeat the experiments with a simulation of the Prisoner’s Dilemma game. Research on self control suggests that there is a relationship between cooperation and self control (Brown and Rachlin, 1999). Human cooperation has been modelled as a game of Prisoner’s Dilemma (Axelrod and Hamilton, 1981) and therefore this game is appropriate for this stage of the research. Brown and Rachlin (1999) played a variation of the Prisoner’s Dilemma game whereby choosing a higher current reward conflicted with behaviour that maximised the overall reward.
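The turn structure of the bargaining game described above can be sketched in a few lines. This is an illustrative sketch only, not the simulation used in the feasibility study: the `Player` class, its fixed offer and acceptance thresholds, and the pot and shrink values are all hypothetical, and the learning networks are replaced by fixed rules so that the game mechanics can be seen in isolation.

```python
from dataclasses import dataclass

@dataclass
class Player:
    offer_fraction: float    # share of the current pot offered to the other player
    accept_threshold: float  # smallest share of the current pot this player accepts

def play_bargaining(first, second, pot=100, shrink=20):
    """Play one game; return (turns_played, (payoff_first, payoff_second))."""
    players = (first, second)
    turn = 0
    while pot > 0:
        proposer_idx = turn % 2          # players alternate as proposer
        responder_idx = 1 - proposer_idx
        offer = players[proposer_idx].offer_fraction
        if offer >= players[responder_idx].accept_threshold:
            payoffs = [0.0, 0.0]
            payoffs[responder_idx] = offer * pot
            payoffs[proposer_idx] = pot - offer * pot
            return turn + 1, tuple(payoffs)
        pot -= shrink                    # a rejected offer costs both players
        turn += 1
    return turn, (0.0, 0.0)              # nothing left to divide

if __name__ == "__main__":
    print(play_bargaining(Player(0.5, 0.5), Player(0.5, 0.4)))
    print(play_bargaining(Player(0.25, 0.5), Player(0.5, 0.5)))
    print(play_bargaining(Player(0.1, 0.5), Player(0.1, 0.5)))
```

The assumed pot of 100 shrinking by 20 per turn was chosen so that a game cannot run beyond five turns, echoing the bound reported in the text; in a learning setting the fixed thresholds would be replaced by the outputs of the networks.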
The results of our simulation will be compared with the empirical results of Brown and Rachlin (1999), which showed a close analogy between self control and social cooperation. We will implement precommitment by adding or changing a strategy to include some signalling information, which will tell the network when precommitment is activated. In this first case we still have one network playing against an artificial opponent whose responses are generated randomly from predefined strategies, which are the optimal strategies for that game. For the second stage of the verification, we will assume that the network behaves as expected and then we will play the network against “itself”, i.e. another network with the same strategies or with different strategies reflecting the internal conflict of different parts of the brain wanting different behaviour, i.e. higher versus lower parts of the brain (refer to Figure 2).

Methodology for evolution of the neural cognitive system

The final step is to create alternative strategies through evolutionary methods, with a view to enhancing the network and exploring the possible explanations given in Section 2.1. Two approaches are being considered for building the evolutionary system. The first is to introduce evolutionary ideas in a game theory context, as in Axelrod’s evolution of co-operation (Axelrod and Hamilton, 1981), where different strategies were represented as a string of chromosomes for the Iterated Prisoner’s Dilemma game. These were played against the same opponent for 200 games. Each strategy was ranked by the total payoff accumulated (not by the number of opponents defeated). The most successful strategies were subject to genetic evolution using genetic algorithm (GA) techniques (Holland, 1992). The aim was to see which strategy emerged as the Evolutionarily Stable Strategy (ESS).
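The GA procedure described above (encode strategies as chromosomes, play 200 games against the same opponent, rank by total payoff, breed the most successful) can be sketched as follows. This is a simplified illustration under stated assumptions, not Axelrod's original encoding: each chromosome is a hypothetical five-gene memory-one strategy, the fixed opponent is assumed to be Tit-for-Tat, and the payoff values 5, 3, 1 and 0 are the conventional ones rather than values taken from this proposal.

```python
import random

# Assumed payoffs: (my_move, opponent_move) -> my payoff; 0 = cooperate, 1 = defect.
PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}

def play_vs_tit_for_tat(genes, rounds=200):
    """Total payoff of a memory-one strategy against Tit-for-Tat.

    genes[0] is the opening move; genes[1 + 2*mine + theirs] is the move
    played after a round in which I played `mine` and the opponent `theirs`.
    """
    total = 0
    my_prev = opp_prev = 0
    for r in range(rounds):
        if r == 0:
            my_move, opp_move = genes[0], 0      # Tit-for-Tat opens by cooperating
        else:
            my_move = genes[1 + 2 * my_prev + opp_prev]
            opp_move = my_prev                   # Tit-for-Tat copies my last move
        total += PAYOFF[(my_move, opp_move)]
        my_prev, opp_prev = my_move, opp_move
    return total

def evolve(pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Elitist GA: keep the top half, refill with mutated crossover children."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(5)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        ranked = sorted(population, key=play_vs_tit_for_tat, reverse=True)
        best_history.append(play_vs_tit_for_tat(ranked[0]))
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randint(1, 4)              # one-point crossover
            child = mum[:cut] + dad[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    best = max(population, key=play_vs_tit_for_tat)
    return best, best_history

if __name__ == "__main__":
    best, history = evolve()
    print(best, play_vs_tit_for_tat(best), history[0], history[-1])
```

Because the top half of each generation is carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next. Extending the chromosome with network topology and learning parameters would change only the encoding and the fitness function.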
This approach will be extended so that the string of chromosomes also encodes the network topology and learning parameters, as described by Bullinaria (2004). The second approach will use cascade-correlation neural networks in models of development, as described by Shultz and Mareschal (1996). Shultz and Mareschal (1996) start with a simple network containing no hidden units, and then grow the network to some acceptable level using the cascade-correlation algorithm of Fahlman and Lebiere (1990).

The simulation of evolution on the two-network Artificial Neural Network (ANN) model from our research will be tested using games. The simulation will focus on the functional decomposition of the brain (see Section 2.1) and attempt to explain such behaviour as a by-product of some internal mechanism. The simulation will examine the role of reinforcement and reinforcement history in explaining variances in an individual’s behaviour. Finally, the simulation will explore the theory that such behaviour may be adaptive. In summary, the model will go some way to answering the question of whether self control behaviour through precommitment is (i) a biological by-product, (ii) an internal conflict or (iii) an adaptation to enhance the survival of the species.

The behaviours that evolve will be evaluated in different environments and in different game-theoretical situations. The extent to which inconsistent behaviours occur when the neural network is tested in conditions similar to those in which it evolved will go some way to supporting the theory that self control through precommitment is a biological by-product; the extent to which inconsistent behaviours occur when the neural network is tested in conditions different from those in which it evolved will support the theory that self control through precommitment is a manifestation of an internal conflict (higher v. lower brain centres); and the extent to which commitment behaviour evolves when the network is subjected to evolutionary change in game-theoretical interaction will support or dismiss the theory that self control through precommitment is adaptive.

2.3 Programme of Work and Management

Based on the Overall Aims and Objectives described in Section 2.1, a Diagrammatic Work Plan of the research indicating the length and interactions of all tasks is attached. The task list is as follows:

a. Identification of possible techniques (ANN with RL, GAs) and testing of these techniques through game play;
b. Building of a simplified Neural Model from a top-down perspective, looking at which regions of the brain do what, i.e. structure-function;
c. Verification/Validation of the model in terms of structure;
d. Testing of the model in game-theoretical situations;
e. Analysis of results/refining of model;
f. Building of an evolutionary system using relevant computational techniques;
g. Verification of the system through simulation of self control through precommitment (the model must encompass a cognitive architecture that provides a general explanation of self control);
h. Simulation of the behaviour for anomalies (e.g. the reduction in value of a reward due to delay and the reversal of preferences as shown in Figure 1a).

Deliverables will take the form of refereed articles published in relevant learned society journals and presented at prestigious international conferences. The research will benefit from close collaboration with the Psychology Department of Birkbeck College, through Dr. Denis Mareschal, and with Dr. Peter Sozou of CoMPLEX (Centre for Mathematics and Physics in the Life Sciences and Experimental Biology), University College London, with whom we will hold monthly meetings to update them on research progress and to receive their feedback (letters of support from both are included). We will also seek new relevant collaborations.
The principal investigator (who will devote eight hours per week to this project) will be responsible for the efficient, productive, technical and administrative management of the project. He will have regular meetings with the researcher to discuss and resolve technical issues; to define the direction of the project and identify solutions if required; to control and review technical work carried out in the tasks; to suggest technical modifications according to the aims of the project; to verify the correct implementation of the project; and to monitor the project progress and evaluate the deliverables. All of this will be facilitated by progress reports, which the researcher will be asked to prepare at regular intervals. The researcher will also attend short courses in relevant research areas, such as computational and cognitive neuroscience.

3. Relevance to beneficiaries

A detailed theory for the evolution of self control promises to provide a foundation for a science of self control which will eventually be able to predict both the circumstances expected to induce greater self control and the forms of self control induced. The theory will enable problems of self control to be identified and rectified, and thus it will have a significant impact in the area of healthcare, with the direct beneficiaries being the general public. This will facilitate the advancement of academic knowledge in related disciplines. The main field of study is computational neuroscience; however, the research touches on a wide spectrum of subjects, most notably cognitive science and game theory. The extent to which reinforcement learning can be realised in games that model real-life situations will interest game theorists and economists. The results from the later stages of the project will contribute to bridging the gap between the modelling community and experimentalists.

4. Justification of Resources

Funding is requested for a research assistant for a 36-month period.
This will be Gaye Banfield, who will carry out the tasks detailed in Section 2.3 and the Diagrammatic Work Plan (attached) under the supervision of Dr Christodoulou. A laptop is requested along with the software MATLAB plus toolboxes for Neural Networks. The developed system will also be tested on departmental server resources, for which 10% of a technical support person is requested. Clerical support of 5% of a person is also requested for assisting with the administration of the project. Due to the cross-disciplinary nature of the research it is important to present results to both the psychology and computer science communities, and we request support for one national conference per year (Neural Computation and Psychology Workshop; two people to attend) and two international ones (Cognitive Science conference; one person to attend). Support is also requested for Gaye Banfield to attend an advanced course in Computational Neuroscience to learn the state of the art in the subject.

References

Ariely D., 2002, Procrastination, Deadlines, and Performance: Self-Control by Precommitment, MIT Press, MA.
Axelrod R. and Hamilton W. D., 1981, The Evolution of Cooperation, Science 211, 1390-1396.
Banfield G. and Christodoulou C., 2003, On Reinforcement Learning in two player “real-world” games, In Proc. ICCS ASCS Int. Conf. on Cognitive Science, 22.
Banfield G. and Christodoulou C., 2004, Multiagent Reinforcement Learning in General Sum Games, Submitted to the IEEE Transactions on Evolutionary Computation.
Barto A. G. and Sutton R. S., 2002, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA.
Binmore K., 1992, Fun and Games: A Text on Game Theory, D. C. Heath and Co., Lexington, MA.
Brown J. and Rachlin H., 1999, Self-control and Social Cooperation, Behavioural Processes 47, 65-72.
Bullinaria J. A., 2004, On the Evolution of Irrational Behaviour, In H. Bowman and C. Labiouse (Eds.), Proceedings of the 8th Neural Computation and Psychology Workshop, Connectionist Models of Cognition and Perception II, vol. 15 of Progress in Neural Processing, Singapore, April 2004, World Scientific.
Cohen J. D., Dunbar K. and McClelland J., 1990, On the Control of Automatic Processes: A Parallel Distributed Processing Account of the Stroop Effect, Psychological Review 97, 332-361.
Cohen L. B., Chaput H. H. and Cashon C. H., 2002, A Constructivist Model of Infant Cognition, Cognitive Development 100, 1-21.
Fahlman S. and Lebiere C., 1990, The Cascade-Correlation Learning Architecture, Technical Report CMU-CS-90-100, Comp. Sci. Dept., Carnegie Mellon University, Pittsburgh, PA.
Fodor J. A., 1983, The Modularity of Mind, MIT Press, Cambridge, MA.
Green L., Fry A. F. and Myerson J., 1994, Discounting of Delayed Rewards: A Life Span Comparison, Psychological Science 5, 33-36.
Hamburger H., 1979, Games as Models of Social Phenomena, W. H. Freeman and Co., San Francisco.
Holland J. H., 1992, Genetic Algorithms, Scientific American 267, 66-72.
Jacobs R. A., 1999, Computational Studies of the Development of Functionally Specialized Neural Modules, Trends in Cognitive Sciences 3, 31-38.
Matsui M., Yoneyama E., Sumiyoshi T., Noguchi K., Nohara S., Suzuki M., Kawasaki Y., Seto H. and Kurachi M., 2002, Lack of Self-Control as Assessed by a Personality Inventory is Related to Reduced Volume of Supplementary Motor Area, Psychiatry Research Neuroimaging 116, 53-61.
Mischel W., Shoda Y. and Rodriguez M., 1989, Delay of Gratification in Children, Science 244, 933-938.
Nesse R. M., 2001, Natural Selection and the Capacity for Subjective Commitment, In R. M. Nesse (Ed.), Evolution and the Capacity for Commitment, 1-44, New York, Russell Sage.
O’Reilly R. and Munakata Y., 2000, Computational Explorations in Cognitive Neuroscience, MIT Press, MA.
Rachlin H., 1995, Self-Control: Beyond Commitment, Behavioral and Brain Sciences 18, 109-159.
Rachlin H., 2000, The Science of Self-Control, Harvard University Press, Cambridge, MA.
Rachlin H. and Green L., 1972, Commitment, Choice and Self-control, Journal of the Experimental Analysis of Behavior 17, 15-22.
Rubinstein A., 1982, Perfect Equilibrium in a Bargaining Model, Econometrica 50/1, 99-109.
Samuelson L. and Swinkels J. M., 2002, Information and the Evolution of the Utility Function, Journal of Economic Literature, Working Papers 6, Wisconsin Madison - Social Systems.
Shultz T. R. and Mareschal D., 1996, Generative Connectionist Networks and Constructivist Cognitive Development, Cognitive Development 11, 571-603.
Sozou P. D., 2003, The Evolutionary Context of Self-Control Problems, Presented at the Workshop on the Evolutionary Biology of Learning, Fribourg, Switzerland, 21-22 February 2003.
Taylor J. G., 2002, Paying Attention to Consciousness, Trends in Cognitive Sciences 6, 206-210.
Trivers R., 2000, The Elements of a Scientific Theory of Self-Deception, Annals of the New York Academy of Sciences 907, 114-131.

Diagrammatic Work Plan

[Chart: the tasks below plotted against months 4, 8, 12, 16, 20, 24, 28, 32 and 36, ending with the Final Report.]
Identification of possible techniques (ANN with RL, GAs) and testing of these techniques through game play
Building of a simplified Neural Model from a top-down perspective, looking at which regions of the brain do what, i.e. structure-function
Verification/Validation of the model in terms of structure
Testing of the model in game-theoretical situations
Analysis of results/refining of model
Building of an evolutionary system using relevant computational techniques
Verification of the system through simulation of self control through precommitment
Simulation of the behaviour for anomalies