Emergence of Conventions through social learning
Séminaire équipe multiagent LIP6
Stéphane Airiau (LAMSADE)
joint work with Sandip Sen and Daniel Villatoro

Convention

“A convention is an equilibrium that everyone expects in interactions that have more than one equilibrium”
Young (1996). The Economics of Convention. Journal of Economic Perspectives

Examples:
- picking the side of the road
- picking a symbol / word for an object
- picking the notation for retweeting
  Kooti, Yang, Cha, Gummadi, Mason. The Emergence of Conventions in Online Social Networks. AAAI Conference on Weblogs and Social Media

A norm is more than a convention: there is a deontic aspect (obligation/sanctions).

Convention and Norms in MAS

Some examples of research directions:
- designing tools to build communities of agents that use norms (e.g. reasoning about norms, implementation of sanctions)
- modeling or understanding the establishment of human norms or conventions (e.g. language, conventions)

Some (overlapping) communities interested in norms:
- DEON: International Conference on Deontic Logic and Normative Systems (12 editions)
- NorMAS: Normative Multiagent Systems, i.e. systems where individual and collective behavior is affected by norms

Related work on emergence

- agents can observe other agents' interactions
  H.P. Young, Econometrica 1993 (Markov chain)
  Epstein, Computational Economics 2001 (majority rule)
  Axelrod, American Political Science Review 1986 (evolutionary)
  Hao et al., AAMAS 2013, AAAI 2014
- agents communicate their model
  Verhagen, Social Science Computer Review 2001
- use of social networks
  Delgado, Artificial Intelligence 2002
  Yu et al., AAMAS 2013, Hao et al.
  AAMAS 2013, AAAI 2014
- human agents
  Centola and Baronchelli, PNAS 2015
  Kooti et al., AAAI Conference on Weblogs and Social Media 2012
- private information
  Shoham and Tennenholtz, Artificial Intelligence 1997 (local learning algorithm, HCR rule)

Our work

When multiple conventions are possible, how does a society of artificial agents collectively adopt a norm?
- voting would be a possibility, but one needs to organise the election
- the convention may emerge from interactions between agents

Contribution:
- previous work assumes the observation of interactions between other agents
  → will a convention emerge even if all interactions are private?
- when agents are learning in repeated interactions, they interact with the same agent(s)
  → does learning converge when the interacting agents keep on changing?

Social Learning Framework

We consider only conventions between pairs of agents (e.g. side of the road, shaking hands).
- N is the set of n agents
- A_r is the set of actions of the row role
- A_c is the set of actions of the column role
- G_i is the payoff matrix of agent i ∈ N: we do not assume interpersonal comparison of utility, but we assume the same ordering over all joint actions
- in most cases, A_r = A_c and G_i = G_j for all i, j
- interconnection topology: a topology may restrict the interactions between agents

Social Learning Framework

Interaction protocol, one iteration:
- initialisation: all agents are available
- until there is no pair of available agents, do:
    - randomly pick a pair (i, j) of available agents
    - randomly select a role, row or column, for i and j
    - row selects an action in A_r, column selects an action in A_c
    - row and column receive the corresponding payoff
    - row and column update their learning algorithms
    - update the set of available agents

Stability

Definition (convention): a convention is a pure Nash equilibrium of the game.
- in practice, conventions are pure strategies
- if there is a unique pure Nash equilibrium, we are not really studying a convention

Examples

Intersection (each cell: row payoff, column payoff)

            Go         Yield
  Go      -1, -1       3, 2
  Yield    2, 3        1, 1

Picking side (each cell: row payoff, column payoff)

            L          R
  L        4, 2      -1, -10
  R       -1, -10     4, 2

Learning algorithms

- Fictitious Play
- Q-learning with ε-greedy exploration
  Watkins and Dayan (1992), Machine Learning
- Win or Learn Fast policy hill climbing (WoLF-PHC)
  Bowling and Veloso (2002), Artificial Intelligence

First algorithm: Fictitious Play

The learner believes its opponent is playing a fixed mixed strategy given by the empirical distribution of the opponent's previous actions.
→ the learner plays a best response to this mixed strategy.
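
The belief-and-best-response idea just described can be illustrated with a minimal Python sketch. It is not the code used in the talk: the payoff matrix is an assumed symmetric coordination game (4 when actions match, -1 otherwise), the prior of one pseudo-count per action and the random tie-breaking are assumptions, and the names `FictitiousPlayer`, `best_response`, `observe` and `play` are hypothetical.

```python
import random

# Assumed payoff for the learner: PAYOFF[my_action][opponent_action].
# Symmetric coordination game chosen for illustration only.
PAYOFF = [[4, -1],
          [-1, 4]]


class FictitiousPlayer:
    """Counts the opponent's past actions and best-responds to their frequencies."""

    def __init__(self, n_actions, rng):
        self.counts = [1] * n_actions  # assumed prior: one pseudo-count per action
        self.rng = rng

    def best_response(self):
        total = sum(self.counts)
        beliefs = [c / total for c in self.counts]  # empirical mixed strategy
        expected = [sum(PAYOFF[a][b] * beliefs[b] for b in range(len(beliefs)))
                    for a in range(len(PAYOFF))]
        best = max(expected)
        # Break ties uniformly at random (an assumption of this sketch).
        return self.rng.choice([a for a, v in enumerate(expected) if v == best])

    def observe(self, opponent_action):
        self.counts[opponent_action] += 1


def play(rounds=50, seed=0):
    """Let two fictitious players face each other; return the joint actions played."""
    rng = random.Random(seed)
    p1, p2 = FictitiousPlayer(2, rng), FictitiousPlayer(2, rng)
    history = []
    for _ in range(rounds):
        a1, a2 = p1.best_response(), p2.best_response()
        p1.observe(a2)
        p2.observe(a1)
        history.append((a1, a2))
    return history


if __name__ == "__main__":
    print(play()[-5:])  # last five joint actions
```

In this particular game, once the two players happen to pick the same action, each one's count for that action becomes strictly larger, so both keep playing it from then on.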
Algorithm (fictitious play):

  initialize the frequencies p of the actions played by the opponent
  repeat
      play a best response to p
      observe the action played by the opponent and update the frequencies p

Theorem: if the empirical distribution of each player's strategies converges in fictitious play, then it converges to a Nash equilibrium.
- the empirical distribution converges to a NE, but the players may never actually play a NE and may not receive the NE expected payoff (e.g. anti-coordination game)
- convergence is not always guaranteed (e.g. Shapley's non-zero-sum variant of Rock-Paper-Scissors)

Q-learning with ε-greedy

We apply the Q-learning algorithm with only one state!
The update rule of Q-learning is

$Q(a) \leftarrow Q(a) + \alpha \, (r - Q(a))$

For exploration, we use the ε-greedy method:

$a_t = \begin{cases} \arg\max_{a \in A(s)} Q(a) & \text{with probability } 1 - \epsilon \\ \text{sample from } U(A(s)) & \text{with probability } \epsilon \end{cases}$

where U(S) is the uniform probability distribution over a set of alternatives S. ε may decrease during learning.

Detecting convergence

Even if all agents use the same strategy, exploration causes some deviations to be observed.
For simplicity, we use a threshold: if a strategy profile is played by 95% of the population, we declare convergence.
Note: some authors used a different threshold (90%).

Simulation results

Intersection game

- (G, Y_L) emerges 506 times and (Y_R, G) 494 times.
- (G, Y_L) emerges 534 times and (Y_R, G) 466 times.
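
To tie together the interaction protocol, the stateless Q-learning update with ε-greedy exploration, and the 95% threshold, here is a minimal Python sketch. It is an illustration under stated assumptions, not the simulator behind the results above: the payoffs (4 for matching actions, -1 otherwise), the values alpha = 0.1 and epsilon = 0.05, the random initial Q-values, and the choice to test convergence on greedy actions are all assumptions, and every name (`QAgent`, `random_pairs`, `one_iteration`, `dominant_convention`, `simulate`) is hypothetical.

```python
import random


def payoff(row_action, col_action):
    # Assumed coordination game: 4 each when actions match, -1 each otherwise.
    return (4, 4) if row_action == col_action else (-1, -1)


class QAgent:
    """Stateless Q-learner with one Q-table per role (row and column)."""

    def __init__(self, rng, n_actions=2, alpha=0.1, epsilon=0.05):
        # Assumed random initial Q-values, so that no action is favoured at the start.
        self.q = {role: [rng.random() for _ in range(n_actions)]
                  for role in ("row", "col")}
        self.alpha, self.epsilon = alpha, epsilon

    def greedy(self, role):
        values = self.q[role]
        return max(range(len(values)), key=values.__getitem__)

    def act(self, role, rng):
        # epsilon-greedy: uniform sample with probability epsilon, argmax otherwise.
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.q[role]))
        return self.greedy(role)

    def update(self, role, action, reward):
        # Q(a) <- Q(a) + alpha * (r - Q(a))
        q = self.q[role]
        q[action] += self.alpha * (reward - q[action])


def random_pairs(n, rng):
    """Pair up available agents at random; one agent sits out if n is odd."""
    order = list(range(n))
    rng.shuffle(order)
    return [(order[k], order[k + 1]) for k in range(0, n - 1, 2)]


def one_iteration(agents, rng):
    for i, j in random_pairs(len(agents), rng):
        row, col = (i, j) if rng.random() < 0.5 else (j, i)  # random roles
        a_r = agents[row].act("row", rng)
        a_c = agents[col].act("col", rng)
        r_r, r_c = payoff(a_r, a_c)
        agents[row].update("row", a_r, r_r)
        agents[col].update("col", a_c, r_c)


def dominant_convention(agents, threshold=0.95):
    """Return action a if at least `threshold` of agents are greedy on a in both roles."""
    n_actions = len(agents[0].q["row"])
    for a in range(n_actions):
        share = sum(ag.greedy("row") == a and ag.greedy("col") == a
                    for ag in agents) / len(agents)
        if share >= threshold:
            return a
    return None


def simulate(n_agents=20, max_iterations=5000, seed=0):
    """Run iterations until a convention passes the threshold or the budget runs out."""
    rng = random.Random(seed)
    agents = [QAgent(rng) for _ in range(n_agents)]
    for t in range(1, max_iterations + 1):
        one_iteration(agents, rng)
        convention = dominant_convention(agents)
        if convention is not None:
            return convention, t
    return None, max_iterations
```

Calling `simulate()` with different seeds and counting how often each action is returned gives a rough analogue of the emergence counts reported above. Checking greedy actions rather than the actions actually played is a simplification that filters out exploration noise.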

Dynamics

Dynamics of the probability to play R for each agent:
- each agent is represented by two lines: its policy as a row player and as a column player
- the lighter the cell, the more likely the agent plays L; the darker, the more likely it plays R

The convention of choosing action L emerges through social learning.

Influence of population size

[Figure: average payoff of a learner vs. number of iterations, for populations of 2, 10, 20, 50, 100, 150, 200, 300, 400 and 500 agents using WoLF.]
Dynamics of the average payoff of learners using WoLF with different population sizes (average over 100 runs).

Varying the size of the game (number of actions)

[Figure: average payoffs vs. iterations for 200 WoLF learners in 2x2, 3x3 and 4x4 coordination games.]
Dynamics of the payoff of learners using WoLF with different game sizes (average over 100 runs).

Using different learning algorithms

[Figure: average payoffs vs. iterations (100 to 100000) for FP, QL, WoLF, FP+QL, FP+WoLF, QL+WoLF and QL+WoLF+FP.]
Dynamics of the payoff of learners using different learning algorithms (population of 200 agents, average over 100 runs).

Influence of fixed agents

[Figure: Effect of fixed agents. Fraction of runs that converged to (0,0) and to (1,1) vs. number of additional agents playing fixed strategy 1 (0 to 5).]
Number of times each convention emerges (average over 100 runs).

A small imbalance in the number of agents using a pure strategy is enough to influence an entire population.

In a network

[Figure: Convergence Times for Different Neighborhood Sizes and Different Learning Algorithms.]

Emergence of different sub-conventions

[Figure: 2 groups evolving a norm with different degrees of isolation. Fraction of runs in which the system converges to (0,0) and to (1,1) vs. probability of interaction between agents of different groups (0.5 down to 0.01).]
Two groups of 100 agents each evolve conventions with different interaction frequencies (average over 1,000 runs).

When the probability of interaction is low, the groups can evolve different conventions.

Emergence of different sub-conventions

Subconventions may emerge in:
- scale-free networks
- k-n connected star networks

Conclusion

- private interactions are enough for a convention to emerge
- agents may use generic learning mechanisms
- subconventions may emerge and remain stable

Stéphane Airiau, Sandip Sen, Daniel Villatoro.
Emergence of Conventions through Social Learning. JAAMAS, 2014.