Comput Math Organ Theory (2015) 21:115–149 DOI 10.1007/s10588-014-9179-0 Building social networks out of cognitive blocks: factors of interest in agent-based socio-cognitive simulations Changkun Zhao • Ryan Kaulakis • Jonathan H. Morgan • Jeremiah W. Hiam Frank E. Ritter • Joesph Sanford • Geoffrey P. Morgan • Published online: 27 December 2014 ! Springer Science+Business Media New York 2014 Abstract This paper examines how cognitive and environmental factors influence the formation of dyadic ties. We use agent models instantiated in ACT-R that interact in a social simulation, to illustrate the effect of memory constraints on networks. We also show that environmental factors are important including population size, running time, and map configuration. To examine these relationships, we ran simulations of networks using a factorial design. Our analyses suggest three interesting conclusions: first, the tie formation of these networks approximates a logistic growth model; second, that agent memory quality (i.e., perfect or humanlike) strongly alters the network’s density and structure; third, that the three C. Zhao (&) ! R. Kaulakis ! J. W. Hiam ! F. E. Ritter The College of Information Sciences and Technology, The Pennsylvania State University, State College, USA e-mail: [email protected] R. Kaulakis e-mail: [email protected] J. W. Hiam e-mail: [email protected] F. E. Ritter e-mail: [email protected] J. H. Morgan Department of Sociology, Duke University, Durham, USA e-mail: [email protected] J. Sanford Department of Computer Science, Tufts University, Medford, USA e-mail: [email protected] G. P. Morgan School of Computer Science, Carnegie Mellon University, Pittsburgh, USA e-mail: [email protected] 123 116 C. Zhao et al. environmental factors all influence both network density and some aspects of network structure. These findings suggest that meaningful variance of social network analysis measures occur in a narrow band of memory strength (the cognitive band); the threshold for defining tie criteria is important; and future simulations examining generative social networks should control and carefully report these environmental and cognitive factors. Keywords Cognitive simulation ! ACT-R ! Dunbar’s number ! Social cognition 1 Introduction Anderson argues that ‘large-scale learning in groups arises from of small-scale learning’ (2002, p. 87). This paper explores that thesis by examining how cognitive and environmental factors influence the formation and resulting structure of social networks. We do this by varying these factors and seeing how they affect the formation of dyads in networks. This work is motivated by a desire to better understand how socio-cognitive processes influence the development of persistent patterns of relations. By socio-cognitive processes, we refer to both those cognitive resources and mechanisms necessary to create and sustain social ties, as well as those group-level factors known to moderate human decision-making (Morgan and Carley 2011; Morgan et al. 2010; Silverman 2004; Yen et al. 2001). We focus on memory retention and navigation patterns in this paper because these two processes seem foundational to understanding the emergence of social networks in a variety of contexts. To explore these relationships, we introduce one simulation environment, two agent-based models, two experiments, and three analyses that examine the influence of memory, population size, run time, map configuration, and agent navigation. The simulation environment outputs both interaction networks, a matrix showing the total number of agent co-occurrences in the simulation, and ego-nets, weighted declarative representations of each agents’ friends network. For any one run, there is one interaction network and as many ego-nets as there are agents in the experiment. We merge the individual ego-nets to examine how each factor influences the generation process and structure of the resulting network. We start by reviewing previous work in social and cognitive modeling related to social network growth and structure. 1.1 Related previous work using simulations Researchers have developed agent-based models to explore a variety of questions. We briefly examine two major, related, agent-based, modeling approaches: cognitive modeling and social modeling. These approaches are not necessarily mutually exclusive. However, combined socio-cognitive models are relatively rare because they are generally expensive to create and run. 123 Factors of interest in agent-based socio-cognitive simulations 117 Social simulation is a large category that includes both computational and graphical models. Computational social simulation models such as those in the Complex Systems literature frequently, but not always, use boundedly rational agents (Conte et al. 2014; Holland 1992; Miller and Page 2007). Boundedly rational agents are limited both cognitively and socially (Simon 1991). These models can feature entities that range from simple rule-defined automata to goal-driven agents. What unifies these approaches is the use of agents to understand the emergence of system-level properties, generally by showing how past states condition future states. In these models, system-level behaviors arise out of the agents’ discrete interactions, but cannot be explained entirely with reference to them. Further, at different levels of observation, different kinds of emergent behavior can be seen. It is often these kinds of traits that simulations are uniquely able to capture. Early work in social simulation includes Schelling’s (1971) segregation model, Axelrod and Hamilton’s (1981) work on cooperation, and Epstein and Axtell’s (1996) Sugarscape simulations. More recently, increases in computational power have led to a renewed interest in multi-level modeling (Epstein 2006) specifically simulating the influence of dynamics occurring at one level of analysis (e.g., the small group level) on others (e.g., individuals or organizations). For instance, Barrett et al. (2006) model the influence of paths in crisis situations using large simulations featuring millions of agents, allowing them to identify emergent effects at the community level. Cognitive modeling, emerging at roughly the same time as work in social simulation, has attempted to model computationally the effects of perception and memory by simulating cognitive functions. To our knowledge, Carley and Newell (Carley 1991, 1992; Newell 1990; Newell et al. 1989; Carley et al. 1990) were the first to study organizations using models based on a cognitive architecture. For example, they created a warehouse simulation (Carley et al. 1990) with multiple (n = 1–5) boundedly rational agents to examine the influences of group size on joint task performance. More recently, Reitter and Lebiere (2010, 2011) created an agent-based simulation (ACT-UP) that includes the basic mechanisms of ACT-R theory. ACT-UP focuses on examining the cognitive factors in real-world scenarios by running large-scale simulations with thousands of agents. ACT-UP replicates some but not all of ACT-R’s declarative and procedural memory systems. More recently, Gonzalez and Lebiere (2005), Lebiere et al. (2009), Reitter and Lebiere (2010), and Juvina et al. (2011) have used cognitive architectures to model human decision making in collaborative tasks. While our work builds upon these efforts, our interest is how cognitive and environmental factors, modify network formation and structure. There is a rich literature on how proximity influences social ties. We note a few examples. Kraut et al. (2002) found that actor proximity fundamentally influences the evolution of network topologies by determining the interaction frequencies of actors across the network, while Allen (1984) demonstrated that the probability of two people communicating in an environment could be represented with a decreasing hyperbolic function of their distance. Beyond a certain distance, the probability that two people will communicate decreases rapidly, making tie formation unlikely (Carley 1991, 1992). This work suggests it is useful to examine 123 118 C. Zhao et al. how cognitive and environmental factors interact to condition interaction opportunities and consequently the growth and structure of social networks. Creating agents based on cognitive architectures to study social phenomena has been a useful approach and provides us the tools to examine explicitly the interaction of memory and environment. Cognitive models attempt to represent the processes associated with cognition, including but not limited to perception, decision making, and learning. Most multi-agent social simulations rely on generalizations of these processes (e.g., Axelrod and Hamilton’s (1981) tit-for-tat strategy) to generate useful and frequently counterintuitive results. Socio-cognitive models attempt to explore how cognition informs social interaction. In our case, we are interested in how imperfect memory influences the structure of human interactions. 1.2 The effect of agents’ memory and space on nodal carrying capacity We look at the processing of social information by exploring the concept of nodal carrying capacity, the number of alters an ego can retain in memory. To that end, we examine how environmental factors contribute to the consolidation and retention of social ties. This concept is similar to Dunbar’s (1998) social brain hypothesis, where he argues that limitations associated with the neocortex limit the number of bidirectional ties any one person can retain in memory to approximately 150 people. Dunbar argues that maintaining stable relationships requires repeated memory activations to identify not only one-on-one relationships but also third party relationships (i.e., the knowledge that my friend is also friends with other actors who I monitor). Further, he claims that the cognitive load associated with maintaining this set of relationships in memory rises exponentially as group size increases (Dunbar 1998, p. 63). Based on retrospective empirical studies, Dunbar (1998, pp. 68–75) argues that this ratio between cognitive load and group size underlies the small-world effect observed by Milgram (1967) and others. McCarty et al. (2001) propose that humans retain in memory a far larger number of ties (n = 291). In part, this discrepancy is rooted in a difference in definitions. McCarty et al. (2001) definition of a social tie requires mutual identification as opposed to Dunbar’s stricter definition of mutual identification as well as placement in the network. In this work, the social ties we identify are closer to McCarty et al.’s definition. Nevertheless, we believe that our simulation is still relevant to Dunbar’s social brain hypothesis because both concepts concern the effect of imperfect memory on the number of social ties an individual can maintain in memory. Consequently, in our experiments, we expect that larger populations acting over longer time periods in fully connected environments will result in the most connected declarative network structures. We also expect that less connected environment layouts will result in interaction networks that consist of smaller and more locally clustered ego-nets, resulting in more fragmented networks. We also expect that map configurations characterized by nexus points (locations which act as hubs by virtue of being centrally located) will exhibit behaviors similar to the water- 123 Factors of interest in agent-based socio-cognitive simulations 119 cooler effect (DiFonzo 2008). We, however, are less certain where we might see thresholds in network formation, where for instance population growth no longer has an effect or run time is no longer relevant. 1.3 Preview of the paper Our model is unusual in that we model social processes using a cognitive architecture (ACT-R) that is primarily associated with cognitive science. Our model follows in this tradition, but focuses on the effect of memory as opposed to decision making in an attempt to explore aspects of Robin Dunbar’s (1998) social brain hypothesis. Our emphasis is to show memory’s effect on the formation and structure of interaction networks, not as it informs choices but as it structures interaction opportunities. We do this in three analyses using relatively simple agents that feature ACT-R’s Declarative Memory (DM) activation and decay function. Analysis 1 shows the formation of networks by comparing agents using a fixedpath as opposed to a random-path navigation pattern; we find that memory in conjunction with environmental factors predictably influences the networks’ formation rate. This rate follows the population growth sigmoid function, with networks composed of rationally bounded agents better approximating the function. Analysis 1 further suggests that memory is likely to influence the ultimate structure of social networks. We confirm this hypothesis in analysis 2 by showing the effect of memory on five network measures, demonstrating that the useful range of variance for these measures is contingent on memory. Finally, in analysis 3, we analyze the effects of memory and environment on the topologies of 165,240 merged ego-nets. We found that memory and environmental factors were all significant and that the influence of spatial configuration on network topologies is conditioned by the joint effects of memory and population size in some interesting ways. Based on these results, we are able to make some suggestions for reporting work in this area. 2 Experiment environment and agents Simulations using multiple cognitive agents need not be as onerous to design and build as in the past. It has become easier to build cognitive models (Ritter et al. 2006). Also, with the increase in power of desktop computers, it is now possible to run interesting network simulations locally. Preliminary explorations may be necessary to find what factors we need to pay attention to, to record and report, and to explore when simulating network formation and structure. This paper is a starting point for these tasks. All of our experiments were conducted on a 2 GHz eight-core server with 8 GB of RAM. The server runs Linux 2.6.31 under Ubuntu 11.04, with SBCL 1.0.52 Lisp, and ACT-R 6 (Anderson et al. 2004). Appendix 1 has further details. 123 120 C. Zhao et al. 2.1 The VIPER environment Because we are interested in the constraints associated with embodiment, and need an environment to support multi-agent simulation with cognitive agents, we developed the VIPER environment. It is lightweight in that it is text-based, but is extendable and records agent behaviors. VIPER allows multiple agents to connect to it via Telnet, and for these agents to run in real or (faster) simulated time. The network’s speed and frequency of communication are determined by its component agents, with no queue of events being enforced. VIPER is designed so variations in performance originate from the agents participating in the environment, versus being a function of the environment. The VIPER server is based on NakedMUD, an open-source MUD environment. Within the environment provided by the server, agents or human subjects are situated on maps of interconnected rooms. The agents can see and communicate with others within each room. Agents can walk between the rooms, and can interact with objects in the rooms. 2.2 ACT-R agents For our agents in the simulation, we built two types of ACT-R models. One conducts random walks in the environment. The other follows a fixed path. Both models contain two types of declarative memories: a ‘‘goal’’ type containing the agent’s current location, remaining steps, total friends counts and a ‘‘friend’’ type used to store friend’s names. The model consists of four basic components with nine productions. First, the agent’s ‘‘walking’’ component selects an available direction randomly or from the fixed path, and sends a moving message to VIPER. Second, the ‘‘waiting’’ component utilizes ACT-R’s temporal buffer to wait 16 real-time seconds before allowing the agent to enter a new room, to simulate transition time. Third, the agent’s ‘‘checking’’ component, consisting of three productions, checks if the current room is empty. Fourth and finally, the ‘‘memorizing’’ component, also three productions, first checks DM to see if the agent has previously encountered the agent it has just seen. If not, the model creates a new friend memory chunk, using the imaginal buffer to store the new agent name. Each memory chunk i is assigned an activation value Bi to represent current retention. This activation will decay over time, and the number of active memory chunks is influenced by several factors, including initial memory activation, retrieval activation threshold, memory decay rate, time of retrievals, and practice time. In Eq. 1, n represents the number of presentations associated with chunk i; d represents the decay parameter; tj represents the time since the jth presentation. ACT-R calculates the time since last presentation (tj) in milliseconds, and this equation is usually used to predict memory retention for short durations (i.e., less than 1 h). Pavlik and Anderson (2005) have extended this theory to a long-term memory by changing the time parameters, allowing for the modeling of more complex social phenomena. 123 Factors of interest in agent-based socio-cognitive simulations Bi ¼ ln n X j¼1 tj#d ! 121 ðChunk activation based on times of usesÞ ð1Þ When our agents retrieve/recall a memory chunk (i) there are three parameters involved in the recall process, the retrieval activation threshold, current activation value, and a noise parameter (s). The retrieval activation threshold is a minimum activation to retrieve a chunk. Equation 2 shows how the probability of recall also depends on the current activation value (Ai) and noise parameter (s). As the activation becomes higher or the noise parameter becomes lower, the probability of recall is higher. recall probabilityi ¼ 1 s#Ai : 1þe s ðThe probability of recalling a declarative memoryÞ ð2Þ When a chunk is successfully retrieved, there is a retrieval latency time that is determined by the activation value. Equation 3 shows the relations between activation value and retrieval latency time (F is a latency parameter): if the activation value is high, it will take less time to retrieve a memory chunk. Because activation value can be used to infer retrieval time (as shown in Eq. 3), we can express our network semantics in terms of recall time for an alter (another agent), which then allows future comparison to human data. Time ¼ Fe#A ðThe time to recall a declarative memory based on its activationÞ ð3Þ In our simulation, we implement each agent-to-alter relation as a DM (Declarative Memory) chunk associated with an activation value. The result of our simulation directly outputs a list of activation values of these relations. We use each alter’s activation value, which allows us to infer how long it would take to recall the alter, to identify network ties. Thus, as the activation threshold for determining if a network tie exists increases, the semantics for identifying another alter as an acquaintance become more and more stringent. 3 Data and methods The VIPER environment allows many ACT-R agents to interact with each other. These interactions produce representations of these other agents (alters, from the point of the view of the ego), which can then be interpreted as networks. By manipulating parameters of the simulation, we can observe changes in the growth and final structure of the resulting networks. This process is illustrated in Fig. 1. To better understand how memory decay and environmental factors influence social networks, we conduct three analyses that: (a) test the influence of memory decay on network tie formation, (b) analyze the effects of memory decay on 123 122 Fig. 1 C. Zhao et al. High-level overview of our modeling process network structure (operationalized using five network measures), and (c) test the joint effects of memory decay and environmental factors on network structure. 3.1 Independent variables We manipulate the following factors (activation threshold, navigation patterns, population size, run time, and environmental configuration) to examine their effects on the resulting networks. Table 1 shows our simulation parameters and the independent variables tested in each of our analyses. 3.1.1 Activation threshold To test the effects of memory decay, we manipulate activation threshold. In our first analysis, we treat threshold as a dichotomous variable contrasting agents implemented with a threshold of 0 to those with a threshold of -5. In analyses 2 and 3, activation threshold is a continuous variable ranging from -5 to 5. Activation thresholds correspond to the amount of time necessary to recall a tie. We expect our dependent variables (the rate of network formation and their structure) are influenced by activation thresholds, and in the case of our networks are only meaningful within a limited range essentially approximating bounded rationality. 3.1.2 Navigation patterns In our first analysis, we examine the rate of tie formation for agents using either a random or fixed-path navigation pattern. In the second and third analyses, we test the strength of the effects of memory decay and environment on network structure by testing a boundary case (agents moving randomly). 3.1.3 Population size In our first analysis, we examine networks consisting of 40 agents. In analyses 2 and 3, we examine networks consisting of 20, 40, and 60 agents. We use a Box– 123 Factors of interest in agent-based socio-cognitive simulations 123 Table 1 Experimental parameters and independent variables for analyses 1–3 Independent variables Values Analysis 1 (4,320 networks) Threshold -5 or 0 Navigation patterns Random or fixed-path Population 40 Run time 10–500 s Environment Full-grid, central, hall (100, 75, 60 % of connectivity compared to full 2D grid) Analyses 2 and 3 (810 distinct combinations, 165,240 merged ego networks) Threshold -5 to 5 Navigation patterns Random Population 20, 40, 60 Run time 125, 250, 500 s Environment Full-grid, central, hall (100, 75, 60 %) Behnken design (Kuehl 2000) to avoid extreme treatment combinations. Consequently, we transform our original population values into an ordinal scale (-1, 0, 1).1 We expect as population increases network density will decrease as agents encounter only a fraction of the agents in the world, and of those seen remember even fewer. 3.1.4 Length of simulation (run time) We run our experiments for 125, 250, and 500 s in real time. Again for analyses 2 and 3, we coded these values into an ordinal variable (-1, 0, 1). We expect increasing the run time will increase the density of the simulated networks, as the agents will have more opportunities to form ties. 3.1.5 Environment configuration We test three map configurations (Fig. 2). All configurations have the same number of rooms (25). We operationalize the concept of environment configuration in two ways. First, we treat each configuration as a categorical variable, with the hallway map being the reference category. Second, we isolate the effect that room connectivity has by using a measure we refer to as grid ratio. Grid ratio is simply the number of ties between rooms present over the number of ties possible given a flat two-dimensional world, essentially a density measure for the three maps.2 1 When using the original variables for population and run time, we find the same directionality, no change in the standard errors or significance, and no effect on the other variables or the explained variance. The transformed scale does increase the magnitude of the observed effects. 2 We include both measures because they are not equivalent. We did find instances where our categorical variable (Room Map) was significant when Grid Ratio by itself was not and vice versa. 123 124 C. Zhao et al. Fig. 2 a 5 by 5 grid map, b central area map, c and hallway grid map The first configuration (2a) is a full 5 9 5 map with a grid ratio 1.0. This configuration simulates an open space without spatial constraint. The second configuration (2b) has a central area with grid ratio 0.75. This configuration simulates an open space with some spatial constraint (e.g., a warehouse). We believe this central meeting point will lead to network densities and clustering that are less pronounced than those associated with the 5 9 5 map but more than those associated with the hallway map. The third configuration (2c), our reference setting, is a two-hallway configuration with grid ratio 0.6. This is a realistic configuration that simulates spatial layout of a building floor with a hallway. This configuration should lead to lower connectivity due to the larger distances between agents. 3.2 Constructing the data: showing growth and finding structure Because we are interested in the effects of memory and environmental factors on both network formation and their ultimate structure, we ran an initial analysis testing our hypothesis that memory and environment influence network formation. Drawing the growth curves of the activation networks requires taking numerous samples at different cut points. Consequently, our first analysis uses a smaller data set. The results of this analysis informed the design of analyses 2 and 3, where we examine the strength of these effects on 165,240 merged ego-nets. 3.2.1 Analysis 1 Analysis 1 compares the growth curves of 4,320 activation networks. In this experiment, we use a sample population size of 40 and a running time of 500 s. To draw the growth curve, we collected network data at 18 time points, ranging from 10 to 500 s. We chose 500 s as the longest running time because we did not observe any significant changes in the network after 500 s. Based on our computational resources, each experiment setting was run 20 times. As we focus on the growth pattern of social networks in this experiment, the time parameters are not directly applicable. However, Pavlik and Anderson’s (2005) results suggest that we can 123 Factors of interest in agent-based socio-cognitive simulations 125 generalize the results of our short-term simulations to larger time scales because ACT-R allows us to predict long term memory effects by applying different time parameters. 3.2.2 Analyses 2 and 3 Analysis 2 examines changes in our five network measures as a function of activation threshold. Analysis 3 consists of a set of Ordinary Least Squares (OLS) regressions.3 We are confident that OLS is sufficient for our purposes because our experimental design makes heteroskedasticity issues associated with auto regression or autocorrelation unlikely, ensures that we can consider our independent variables fixed over repeated samples, and eliminates the possibility of perfect collinearity. Figure 3 shows an example ego network. We capture the effect of environmental factors and memory constraints on network structure by combining and analyzing egonets generated by the ACT-R agents in VIPER. Each agent represents other agents it has met as declarative memory (DM) chunks. The activation of each DM chunk can be used to derive the probability of retrieving a chunk and the amount of time required to recall a tie. The semantics of each dyadic tie is important in interpreting a network. An activation value of -3 indicates that the actor will need approximately 200 ms to recall the chunk (if they can retrieve it at all), whereas an activation of 3 indicates that the actor will need less than 5 ms to recall the chunk (but longer to report it because of additional processes including planning an utterance and speaking). These semantics seem to be useful in considering social networks; if an agent cannot recall another agent at all, it is unlikely they have a social tie. These ego networks thus can be considered analogous to network survey data, with a survey instrument used to gather each individual’s view on their social network. We merge these ego files to create merged ego networks. We have two distinct strategies for merging these ego files; we call these methods symmetricnecessary and directed-sufficient. To merge our ego-nets, we considered two rules for determining whether a tie exists: symmetric-necessary (where threshold, t, must be met by both Aij and Aji) and directed-sufficient (where threshold, t, must be met by either Aij or Aji). In either case, a binary two-way tie is formed if the necessary conditions are met. We chose to form binary two-way ties to ensure that the network measures being used could be compared directly. The directed-sufficient network merge method is less conservative; typically creating more ties, differences between the networks generated by each method may also offer interesting insights on the ramifications of environmental structure. The semantics implied by the symmetric-necessary merge method closely resembles Dunbar’s (1998) and McCarty et al’s (2001) mutual identification criteria for a social tie. For simplicity in the remainder of the paper, we will refer to the symmetricnecessary semantic as ‘symmetric’, and the directed-sufficient semantic as 3 In our analyses, we ran Cooks D and DFFITS tests. We tested our model on the entire data and subsections of the data. The models effects were stable and we saw no evidence of problems associated with emergence. 123 126 C. Zhao et al. Fig. 3 An example ego network, ties are colored by activation value ‘directed’. Note that all networks produced by either method are both binary and undirected. Given that our alter DM chunks were inherently weighted by activation value, it was tempting to simply use the activation values to inform a weighted network measure analysis. We chose not to do that for three reasons. Firstly, we consider that changes to the threshold value change the meaning of the network, which makes such a network analysis problematic. Secondly, ACT-R’s activation values range from negative to positive values, spanning zero (0). Zero is usually reserved in network analysis for the absence of a tie. Transforming the data is possible, for example, by converting the activation value into the time to recall or probability of recall but adds an additional processing step. Finally, network measures are sensitive to the edge weighting and may produce non-intuitive results—many network analysis tools automatically binarize networks before calculating measures. Consequently, we used the social tie threshold to filter ego networks and infer binarized undirected ties. For each simulation run, we created both a symmetric-necessary and directionalsufficient network using social tie thresholds between 5 and -5, with intervals of 0.1. Thus, we created two networks per run for each tie threshold value of -5.0, -4.9, -4.8, -4.7, and so on to 5.0. This process created a corpus of (810 runs 9 2 merge methods 9 102 threshold values) 165,240 merged ego-networks. We included isolates, egos with no connections, in our networks. Figure 4 offers an example of how a single merged ego-network can reveal the strong ties in a network as the criteria for inclusion in that network becomes stricter. By stricter, we mean that the activation threshold necessary to be considered a tie 123 Factors of interest in agent-based socio-cognitive simulations 127 Fig. 4 The same ego network at various memory thresholds, a -3.5, b 0.0, and c 1.0 increases. While the three networks shown in Fig. 4 arise from the same declarative representation, these networks reflect activation thresholds of -3.5, 0.0, and 1.0, corresponding to networks a, b, and c respectively. In essence, network c identifies the agent’s core relationships, those of which the agent has the strongest memory. 3.3 Formation rates and network measures: our dependent variables and descriptive statistics In our first experiment, we analyze changes in network tie formation rates to test the effects of memory decay, navigation pattern, and map configuration. In our subsequent analyses, we examine changes in five network-level measures (density, average distance, clustering coefficient, average eigenvector centrality, and average betweenness) to test the effects of cognitive limitations and environmental factors. We define these measures now. Network tie formation rate is the slope of our networks’ growth curves. The curves show the number of ties in memory as a function of time, threshold, and navigation pattern. Density is the number of ties present in the network compared to the maximum number of possible ties, excluding self-loops (Kephart 1950). Network density is a fundamental measure of a network, with implications for the interpretation of many other measures. Networks tend to be less dense as the population grows, but should be denser, other factors held equal, as the simulation run-time increases (at least until activation equilibrium is achieved for networks with memory decay). Average distance is the number of jumps required, on average, for each agent to reach every other agent (Moreno and Jennings 1938; Wasserman and Faust 1994). If every agent is connected to every other agent (a density of 1) then average distance will also be 1. Thus, dense networks tend to have average distance near 1. In ORA (Carley et al. 2012) isolated nodes report a distance of 0 to every other agent, which can complicate the interpretation of this measure. Thus, networks with isolated nodes will have their average brought down towards 0. Clustering coefficient is a transformation, over all agents, of each ego’s local network density, of each ego’s alters, how many of those alters are themselves connected (Freeman 1978–1979). A high clustering coefficient in a social network suggests a decentralized information structure and one where actors tend to have an 123 128 C. Zhao et al. Table 2 Descriptive statistics for symmetric and directed networks (Analyses 2 and 3) Dependent variable Mean SD Median Interquartile range Range Symmetric networks (n = 82,620) Density 0.34 0.38 0.07 0–0.85 0–1 Average distance 0.68 0.73 1 0–0.02 0–6.68 Clustering coefficient 0.43 0.43 0.34 0–0.93 0–1 Eigenvector centrality 0.07 0.16 0.02 0–0.06 0–1 Average betweenness 0.01 0.04 0 0–0.001 0–0.76 Directed networks (n = 82,620) Density 0.42 0.41 0.5 0–0.86 0–1 Average distance 0.65 0.59 1 0–0.01 0–5.54 Clustering coefficient 0.5 0.46 0.83 0–0.94 0–1 Eigenvector centrality 0.04 0.13 0.017 0–0.017 0–1 Average betweenness 0.01 0.06 0 0–0.0004 0.89 accurate understanding of the state of their local work-group. This measure is highly correlated with density. Average network eigenvector centrality is a measure of the second-order connections an agent has (Bonacich 1972). Because eigenvector is normalized against the sum of each agent’s second-order connections, density and eigenvector interact. In very dense networks, average eigenvector is very low, typically near 0. In very sparse networks with isolated nodes, the eigenvector will again be near 0. In medium density networks, the eigenvector may be large. Thus, eigenvector values tend to make an inverse U-shaped curve as network connectivity increases. Average betweenness is a measure of a node’s prominence on the paths between other agents (Everett and Borgatti 2005; Freeman 1978–1979). We expect map topology will strongly affect this measure. We expect that betweenness will be higher in environments with the hallway configuration. Betweenness is low on highly dense networks, and also tends towards 0 in highly sparse networks with isolated nodes.4 Betweenness, thus, also tends to have an inverse U-Shaped curve in relation to threshold. Table 2 shows the descriptive statistics of our five network measures for all the merged ego-nets, symmetric and directed. We see that the median of density for our symmetric networks are generally smaller than those in the directed networks, as this is the more conservative method. Medians are a more useful referent as means are vulnerable to outliers. 4 Results This section presents the results of our three analyses. Our first analysis examines the effect of memory decay, map configurations, and navigation patterns on the 4 In ORA, the treatment of isolated nodes for path-based measures is customizable. We use a simple default where nodes report a ‘0’ to nodes they cannot reach. 123 Factors of interest in agent-based socio-cognitive simulations 129 formation of networks. We next examine five network measures as a function of activation threshold. Finally, we examine OLS regression results to better understand the joint effects of memory decay and environmental factors. 4.1 Threshold’s effect on network formation Analysis 1 shows the effects of the model’s three parameters on the rate of tie formation. Based on these figures, we will discuss how memory thresholds, map configurations, and navigation patterns influence the formation rates of simulated networks. Figure 5 shows the growth curve of a network consisting of agents using the fixed-path navigation pattern within the Hallway map. The x-axis represents simulation time in real seconds, the y-axis the number of ties in memory. The solid line with open triangles represents the network formation rate of a network where no memory threshold was applied—if an agent met an agent, they formed a permanent tie (the directed-sufficient method). We find that the upper curve increases rapidly and then flattens when it reaches 1,336 ties (the maximum is 40 9 39, or 1,560, if the agents’ paths completely overlap, which they do not). The line with filled triangles in Fig. 5 represents the network formation rate of a network where a memory activation threshold of 0.0 was applied. According to the ACT-R theory, the activation threshold represents a memory limitation, meaning that memory chunks with an activation value lower than the threshold cannot be retrieved. The top curve’s more gradual progression illustrates the influence of memory on the formation rate: multiple exposures are required to remember another agent, while the difference in total number of ties (800 versus 1,336) illustrates memory’s effect on the network’s density. In addition, this network never achieves a fully connected state, in the sense that the agent’s declarative representation at no point includes the total set of possible interactions. In other words, these agents must continue to maintain their relationships because they continue to forget. Nevertheless, this network does eventually achieve equilibrium, with agents forgetting old alters as quickly as they meet new ones, around 150 s with a network tie count of 800 ties. Fig. 5 The effect of memory threshold on network formation over time for the fixed path navigation pattern in the hallway map (n = 40) 123 130 C. Zhao et al. Comparing the two curves in the Fig. 5, we notice another difference, the time at which the rate of growth begins to increase. For the thresholded network, this time happens later than for the un-thresholded network. This is because tie formation for thresholded agents requires multiple exposures. Initially, agents are busy simply encountering other agents and building their friends list. As they, however, begin to meet more ‘‘old friends’’, the activation values of friendships start to increase enough to be retrieved. We set the travel interval between rooms at 16 s to make the effect of memory decay more prominent. Nevertheless, we recognize this interval is unrealistic because people can take minutes or hours to find one another. As this work only focuses on the growth pattern of social networks, however, we argue that the measurement of time is a secondary factor in this case because over 80 % of the decay happens in the first 16 s according to ACT-R’s decay function, with little additional decay occurring at greater time scales. In exploratory tests, we found this to be true, resulting in our choice of a total running time of 500 s and a travel interval of 16 s. Figure 6 shows the growth curve of a network of agents moving through the Hallway map using the random navigation pattern. Comparing Figs. 5 and 6, the non-threshold curves have the same growth pattern, but the threshold curves appear to be different. Memory appears to have different effects based on the setting and navigation pattern in which the agents operate. Figure 7 compares the growth curves of two networks where a memory retrieval threshold of 0.0 was applied; these networks differ with respect to the navigation pattern used by the agents. The fixed path-pattern (dashed line) forms ties more quickly than the random-path pattern. We suspect that the fixed-path pattern achieves equilibrium sooner because it is more localized, and thus provides more chances for agents to meet their ‘‘old friends’’. On the other hand, both networks achieve equilibrium at about 800 ties, suggesting that the navigation patterns in this simulation do not constrain the total number of relations an agent can maintain in memory. Figure 8 compares the network formation rates of networks occurring in each of the three map configurations (full grid, central, and hallway); all these networks consist of agents with a memory activation threshold of 0.0. We find that the map configurations have a similar influence on the networks’ growth curves as the navigation patterns. Again, the map configurations influence the rate of formation but not the network’s density at equilibrium. In Fig. 8 the Hallway map (grid ratio = 60 %) is associated with the longest delay in network formation and the lowest rate of increase; the full grid map (grid ratio = 100 %) has the shortest delay and the fastest rate of tie formation. These results show that delay in the network’s growth rate is negatively correlated with map configuration (operationalized as grid ratio), while the network’s growth rate during its growth spurt is positively correlated. Besides the threshold effect, we also find that the networks’ formation rate follows a logistic equation. As Verhulst (1845) showed in his article, Mathematical Researches into the Law of Population Growth, population growth follows a sigmoid function. His equation models population growth as a three-step process, (1) firstly, the population has a delay step when a new species is adjusting to the 123 Factors of interest in agent-based socio-cognitive simulations 131 Fig. 6 The effect of memory threshold on network formation over time for the random walk pattern in the hallway map (n = 40) Fig. 7 The effect of navigation pattern on network formation over time in the hallway map with threshold (n = 40) for agents with a memory threshold environment; (2) secondly, the species populates exponentially when they have adapted to the environment; and (3) thirdly, the population size reaches an upper boundary due to limits in living space and food. Figure 9 shows Verhulst’s logistic growth curve using Pearl and Reed’s (1920) more familiar notation. Comparing Figs. 8 and 9, we find that the network tie formation rate matches nicely Verhulst’s logistic equation with its three phases. Based on their similarity, we could mathematically model the formation rate by a differential logistic equation. Equation 4 provides an approximate model of our result, in which N represents the total number of ties, t represents time, r represents a growth parameter, K represents the upper network capacity. This equation suggests that the tie count and its rate of growth are related to the current network’s density and is constrained by available cognitive capacity. ! " dN K#N ¼ rN ð4Þ dt K ðThe differential equation represents network growth patternÞ 123 132 C. Zhao et al. Fig. 8 The effect of map configuration on network formation over time (n = 40) for the random walk agent Fig. 9 Population growth as a function of species population and living time 4.2 Network structure as a function of threshold Analysis 2 shows the effect that tie threshold has on the aggregate characteristics of the symmetric networks and directed networks from the final state of the simulation. Analysis 1, which shows how the networks grew over time, is suggestive in several ways. One, it shows that thresholded networks exhibit fewer ties and thus lower functional densities. Two, it shows that map configurations and locality are important, at least on the rate of tie formation. On the other hand from these analyses, we can make a few inferences about the effect of memory decay on network structure. Analysis 1 shows the maximum connectivity associated with two 123 Factors of interest in agent-based socio-cognitive simulations 133 threshold conditions; and it shows the time required to reach that maximum as a function of two factors (room configuration and navigation patterns). From analysis 1, however, we cannot ascertain the effect memory decay has on network structure, other than to speculate about its effect on density. Figure 10 presents ten plots that display the changes in five network measures as a function of tie threshold (the definition of what constitutes a social tie based on the length of time to recall an alter). Each plot has three lines: each is the mean for each of the three population values. The dot–dashed green line is population 20, the dashed red line is population 40, and the solid blue line is population 60. Highly translucent scatter points, similarly colored, are included to give an indication of spread for the measure as a function of threshold. The plots show that, as the memory threshold increases from -5 to 5, that all measures change significantly, especially when threshold values range between -1 and 1.5. We see this as the cognitive band (Newell 1990, p. 7). It corresponds to time retrieval for the chunks as ranging from 28 to 2 ms for retrieval of the alterchunk, based on our setting for the noise parameter (s) of -3.5. Measures at thresholds outside this range effectively represent hypothetical boundaries of perfect memory when the retrieval threshold is below -1 and non-existent memory above 1.5. Thus, Fig. 10 suggests that the applicability of these measures (the range of meaningful variance) to social networks relies, in part, on the bounded rationality of their nodes. Comparing the directed and symmetric networks, we see that symmetric networks have less variance as threshold changes. This difference is an artifact of the stricter reciprocity criteria associated with the symmetric networks. We also find that the threshold, within our found cognitive bounds, has a positive influence on average distance, average eigenvector centrality, and betweenness, while having a negative influence on density and clustering coefficient. This result implies that when increasing activation threshold, the overall density of the network will shrink but the network will split into several smaller and tighter groups with more second-order connections, as can be seen previously in Fig. 4. 4.3 The effects of threshold and environment on network structure Analysis 3 shows regression results for each measure, with y being a function of threshold, running time, population, room map, grid ratio, and threshold2 (squared threshold). Equation 4, the equation for clustering coefficient, is an example. Based on the results of analysis 2, we focus our analysis on predicting each measure when threshold is between -1 and 1.5.5 We included threshold2 because we found threshold has the largest regression parameters and the relation between threshold and all measures is apparently non-linear (see Fig. 10). 5 In ORA, the treatment of isolated nodes for path-based measures is customizable. We use a simple default where nodes report a ‘0’ to nodes they cannot reach. 123 134 C. Zhao et al. Fig. 10 The distribution of each network measure as a function of threshold for both symmetric and directed networks. Translucent scatter points, similarly colored, give an indication of spread for each measure Clustering Coefficient ¼ 0:67 þ #:40ðThresholdÞ þ :23ðRun TimeÞ þ :12ðPopulationÞ þ 0:3ðRoom MapÞ þ :12ðGrid RatioÞ þ #:09ðThreshold2 Þ þ e ð4Þ ðEquation for clustering coefficient for directed networkÞ We then analyze regression results by population size, identifying instances where environmental configuration is more or less important. Analysis 2 established that threshold has an effect on network structure; analysis 1 showed that environmental factors, at a minimum, influence the rate of tie formation. Brantingham and 123 Factors of interest in agent-based socio-cognitive simulations 135 Fig. 11 a Hallway map’s heatmap; b Agents-by-location network Brantingham (1993) found that environment configurations create loci of interaction or activity spaces. These locations are where the majority of interactions occur. Brantingham and Brantingham use this concept to study crime densities, but this idea can be expanded to other activities, such as co-occurrence or socialization. From this research, it seems reasonable to conclude that environment configurations have implications for the structure of agents’ friend networks as well as their formation rates. Analysis 1 lends support to this argument. Figure 11a shows a heat map of relative room activity in analysis 1 for the hallway configuration, while Fig. 11b shows the connectivity between all agents and the rooms in which they interacted. The concentration and degree of these spaces suggests that agents who travel between activity spaces tend to travel along the same path, creating more interaction opportunities. Consequently, we theorize that the effect of room maps and grid ratio are influenced by population size. 4.3.1 Comparing symmetric and directed networks We next compare the symmetric and directed networks. Table 3 shows the parameter coefficients of each of our measures for the symmetric networks. We see that our model best explains the variance associated with density, average distance, and the networks’ clustering coefficient, while explaining less well changes in eigenvector centrality and betweenness. As these latter measures generally exhibit an inverse U-shaped curve as networks move from highly dense to sparse—this is not surprising. We see for each dependent measure that the effect of threshold and threshold2 are significant, although the directionality with respect to each other is not always consistent. These effects generally exhibit the largest magnitude. We, however, must be cautious in comparing relative magnitudes due to the fact that room map and grid ratio are collinear. As analysis 1 predicted, we find that as threshold increases density decreases, essentially as agents forget alters they are less able to 123 136 Table 3 C. Zhao et al. Symmetric network measure comparison (n = 82,620 networks) Independent variables Density Average distance Clustering coefficient Eigenvector centrality Betweenness Threshold -0.30*** (0.00) -0.52*** (0.01) -0.41*** (0.00) 0.05*** (0.00) -0.03*** (0.00) Run time 0.06*** (0.00) 0.50*** (0.01) 0.11*** (0.00) 0.07*** (0.00) 0.02*** (0.00) Population -0.05*** (0.00) 0.06*** (0.00) -0.04*** (0.00) n.s. -0.02*** (0.00) Room map 0.03*** (0.01) n.s. 0.05*** (0.01) n.s. n.s. Grid ratio 0.16*** (0.03) n.s. 0.24*** (0.04) n.s. n.s. Threshold2 0.17*** (0.00) -0.67*** (0.01) 0.13*** (0.00) -0.24*** (0.00) -0.02*** (0.00) Intercept 0.02 1.39*** 0.1 0.43*** 0.09** Adjusted R2 0.76 0.54 0.79 0.22 0.24 -55,460.828 -105,449.55 BIC -12,634.49 -12,634.49 -76,865.20 Standard errors are in parentheses. n.s. indicates not significant values * p \ 0.05, ** p \ 0.01, *** p \ 0.001 capitalize upon all the possible ties in the set. We also see that average distance, clustering coefficient, and betweenness decrease while network eigenvector centrality increases, which lends support to our interpretation in analysis 2 that as threshold increases more subgroups with more second-order connections form. We see for every unit increase in run time a corresponding increase in density, average distance, clustering coefficient, eigenvector centrality, and betweenness, suggesting as indicated by analysis 1 that by 500 s the networks in memory have achieved a stable pattern. We also see that an increase in population, as expected, leads to a decrease in density and clustering coefficient, as these two measures are highly related. The decrease in betweenness combined with the increase in average distance suggests that as the population increases the network as a whole tends to fragment. Interestingly for symmetric networks, population’s effect on eigenvector centrality is insignificant, this may be a result of reciprocity criteria combined with the random movement pattern. Essentially, no group leaders form, when we require agents to acknowledge the tie, because the agents are moving randomly through the environment. Finally, we find the effects of room map and grid ratio to only be significant for network density and clustering coefficient. We find that as grid ratio increases or as room map changes (from Hallway to Central and Grid configurations), density and clustering coefficient also increase. Although these measures are collinear, they are not perfectly so; and we found that separating them improved the overall fit of the model. As our heat map analysis suggested (Fig. 11), the influence of room map and grid ratio seems, in part, a function of population. As these two measures are the most directly affected by shifts in population, they are also the most likely to be sensitive to shifts in room map and grid ratio. Table 4 shows the parameter coefficients of each of our measures for the directed networks. The results are similar but not identical. We find that threshold and threshold2 are significant and in the same direction as in the symmetric networks; however as suggested in analysis 2, the directed networks appear more linear, as the magnitude of threshold2 is far less. Run time also remains significant but moves in 123 Factors of interest in agent-based socio-cognitive simulations Table 4 137 Directed network measure comparison (n = 82,620 networks) Independent variable Density Average distance Clustering coefficient Eigenvector centrality Betweenness Threshold -0.35*** (0.00) -0.15*** (0.01) -0.40*** (0.00) 0.09*** (0.00) 0.01*** (0.00) Run time 0.21*** (0.00) 0.17*** (0.00) 0.23*** (0.00) -0.04*** (0.00) -0.01*** (0.00) Population -0.13*** (0.00) 0.12*** (0.00) -0.07*** (0.00) 0.05*** (0.00) n.s. Room map* n.s. n.s. 0.03* (0.01) n.s. n.s. Grid ratio n.s. n.s. 0.12* (0.06) n.s. n.s. Threshold2 -0.05*** (0.01) -0.41*** (0.01) -0.09*** (0.00) -0.08*** (0.00) -0.03*** (0.00) Intercept 0.72*** 1.07*** 0.67*** 0.05 0.04 Adjusted R2 0.77 0.29 0.77 0.09 0.04 BIC -65,998.0 -24,340.0 -62,330.6 -56,519.1 -93,633.7 Standard errors are in parentheses. n.s. indicates not significant values * The hallway map is the reference category for this and the subsequent analysis * p \ 0.05, ** p \ 0.01, *** p \ 0.001, *** p \ 0.001 the opposite direction in the cases of eigenvector centrality and betweenness. We suspect this difference is due to the time requirements of our symmetric-network merge method, where more time is required to meet the reciprocity requirement. Population’s significance and directionality are the same for the networks’ density, average distance, and clustering coefficient; however, we see a significant positive effect for eigenvector centrality and no effect for betweenness while we see the opposite trends in the symmetric networks (no effect on eigenvector centrality and significant negative effect on betweenness). We suspect that the opposite trends in directed and symmetric networks with regards to eigenvector is the result from the difference in tie criteria. Directed networks that require only a single nomination to establish a tie feature more unique agent-to-alter relations. Because eignvector centrality approaches zero as networks shift from being dense to very dense, we suspect the difference in population’s main effect in these two setting can be explained as a difference in the effect of an increase in population on moderately dense network compared to a very dense network. With regards to betweenness, the directed networks tie requirements allow for the retention of more ties, meaning more paths between the agents. The increase in paths has a suppressive effect on betweenness as seen in the symmetric networks; but in the directed networks, high density is almost assured, limiting the amount of meaningful variation in the betweenness measure. With regards to room map and grid ratio, we see the effects of these two factors are in the same direction for the networks’ average clustering coefficient but are less significant. For directed networks, they have no effect on density. Again, this difference is likely a result of the more relaxed tie requirements—the lack of a reciprocity criteria means that agents’ require fewer interactions to establish a tie. 4.3.2 The effect of environment in relation to population size Our heat map and co-occurrence network analysis suggests that population size was likely to influence the degree to which environmental configuration (operationalized 123 138 C. Zhao et al. Table 5 Environmental effects as a function of population Measure Population 20 Population 40 Population 60 Symmetric networks Density Room map (2) Room map (1) Room map (2) Grid ratio (?) Grid ratio (1) Grid ratio (2) Average distance Clustering coefficient Eigenvector centrality Room map (2) Room map (1) Grid ratio (?) Grid ratio (1) Room map (2) Room map (1) Grid ratio (2) Grid ratio (1) Betweenness Directed networks Density Room map (1) Room map (2) Grid ratio (1) Grid ratio (2) Average distance Room map (2) Grid ratio (2) Clustering coefficient Room map (1) Grid ratio (1) Eigenvector centrality Betweenness Room map (1) Room map (1) Room map (2) Grid ratio (1) Grid ratio (1) Grid ratio (2) Signs indicate the directionality of the relationship, Blank cells indicate non-significant effects as room map and grid ratio) influences our measures. Table 5, a summary of six regression analyses found in Appendix 2, supports this conjecture. Table 5 shows that environmental configuration differs in significance with respect to population. Interestingly, for both network types, environmental configuration has a greater impact for population sizes of 20 and 60. We see, however, that environmental configuration seems to play a greater role in symmetric networks, where the reciprocity requires on average more interactions to form a tie. For the directed networks, it is interesting to see that environmental configuration affects networks consisting of 60 agents for four of the five measures. We suspect that for these networks the comparative advantage of grid and central configurations are at their greatest, where agents are likely to have many interaction opportunities but also many missed opportunities. 5 Discussion and conclusions Anderson’s Decomposition Thesis says that ‘learning on the large scale arises from learning on the small scale (Anderson 2002, p. 87).’ In this paper, the large-scale learning of interest is the resulting social network of these agents and the recognition of previous interaction partners is the small scale learning process. We examine this thesis through a multi-agent social simulation that provides a flexible platform to examine several factors on social networks. Taking advantage of ACT- 123 Factors of interest in agent-based socio-cognitive simulations 139 R’s memory mechanism, we created an egocentric view of our agents’ interaction networks in memory by reconstructing the declarative representations used by our agents to recall past associations. We then merged these ego-nets to create a system wide representation of our agents’ recalled interactions. Based on our review, we distinguished between environmental and cognitive factors. We focused on two cognitive factors: memory activation threshold and navigation pattern. Our environmental factors included population size, run time, and environment configuration (Room Map and Grid Ratio). To examine these factors, we conducted three analyses. Our experiments’ results show each of these factors impact the growth and structure of the merged ego-networks. The first analysis focuses on the network formation rate; that analysis suggests that activation threshold, navigation pattern, and room map influence the resulting network formation rate. To explore the findings of the first experiment further, we performed two additional confirmatory experiments. We explored the effect of threshold, population, and room map on the resulting merged networks, examining these networks based on five standard social network analysis (SNA) measures. The second analysis (Fig. 10) shows the effect that tie threshold has on our selected SNA measures of both symmetric and directed networks. This second analysis suggested that cognitive bounds contribute to the meaningful variance of these measures and offers some indication of where these bounds may be. The third analysis is a set of OLS regressions, evaluating the utility of each of our independent variables (threshold, population, room map, grid ratio) on each selected SNA measure, as well as providing useful information on the directionality of these effects. One interesting insight of Analysis 3, shown in Table 5, is that the relationship between environmental configuration and these various SNA measures is dependent on population, which suggests the importance of cognitive limits in that they prevent agents from capitalizing on the full set of alters available for interaction. These results have implications for network formation, reporting network models, and suggest future work based on the limitations of our current work. 5.1 The implications for modeling network formation We can view the progression of the thresholded curves depicted in Figs. 5, 6, 7, 8 as corresponding to three stages in the formation of a network, though at abbreviated time scales. In the simulation running 500 s, the network’s total tie count grows in three periods: the counts remain low and unstable in the first period; it increase rapidly in the second period; and it stops growing and remains stable in the last period. The network tie count is stable at roughly 800 ties (an average of 20 ties per agent). Comparing the formation curves, we see that grid ratio and navigation pattern both impact the formation rate. Specifically, grid ratio has a positive influence on the formation rate when agents employ a random-walk pattern. We conclude that the grid ratio of an environment is an important factor for embodied simulation of network formation and should be reported. Interestingly, the growth pattern of these declarative representations seems to correspond to the growth of populations, as both are characterized by these three stages. Additionally, the growth of these representations has an intuitive appeal, as 123 140 C. Zhao et al. we can interpret these stages in terms of direct experience (e.g., the tendency of people to take time to develop relationships with other people as they become more familiar with their surroundings). This analysis suggests that it may be fruitful to model the mediating influence of memory on network density using population growth models when simulating disruptions to social networks (i.e., using growth models to provide better estimates of the time required for a social network of various sizes to acquire new ties). However, although ACT-R is nominally capable of representing alter retrieval on appropriate time-scales, work remains to identify an appropriate DM chunk structure for alters. Although Analysis 1 is suggestive, in that it shows a relationship between cognition and network density, density is only one measure. Density constrains structure, in that there is little useful variation associated with other network measures as networks approach maximal and minimal density. In the former case, no node or path is uniquely privileged and thus there exists no structural advantage. In the latter, the network is so sparse that relations, and thus the representation itself, ceases to be meaningful. Consequently, further analyses were necessary to understand how the interaction between cognition and environment influences the structure of social networks. Analyses 2 and 3 suggest that meaningful variation associated with five common SNA measures has a cognitive foundation. For analysts conducting empirical studies of firms or testing the resiliency of defense networks, this finding may seem obvious and offer little traction. For the purposes of many studies, the impact of cognition can be accounted for using a few generalizations (e.g., specifying the number of interactions necessary to constitute a tie). Or, for institutional scholars, this analysis may appear trivial in that it strips away the structuring influence of institutions—there is no account of hierarchy, preferential attachment, or of social closure. These analyses, however, offer a powerful corrective. First, as analysts seek to better understand multimodal and multiplex networks, we cannot assume that every node or relation shares the same basic constraints and thus that measures have a comprehensive meaning for the entire network. As Fig. 10 shows, networks composed of nodes with perfect memory have different characteristics than those operating in the cognitive band; and as Tables 3, 4, and 5 show the effects of the environment on these network measures may be nuanced and multi-faceted. We saw these effects even in a simulation where agents move randomly and social interaction was limited merely to remembering previous interaction partners, there is no reason to expect this would not hold true in simulations with richer agent representations and interaction opportunities. In short, you cannot remove the social context from the interpretation of a social network measure. So, we conclude that network simulation of cognitive agents need to include the effect of memory or risk not making accurate predictions of human network structure. Another implication of this work is that agents that co-locate do not always know each other. This supports Cranshaw et al.’s work (2012) on what they refer to as ‘‘livehoods’’. We conclude from this analysis that other analyses that that assume co-location to infer ties are using a problematic assumption. Agents may collocate based on other factors than knowing each other, even repeated collocations. For 123 Factors of interest in agent-based socio-cognitive simulations 141 example, not all students in a class know each other. And if collocated and they have met, they will not always remember that. 5.2 The implication for reporting network studies Finally, these analyses suggest that our selection of our tie criteria is important and influences our results. Unlike common social tie criteria, such as interaction count, the criteria we used here, based on activation threshold, has important homeostatic properties implicit in a cognitive architecture, which suggests that time is important in the interpretation of a human’s social network. Criteria such as interaction count result in network formations that are persistent and aggregative over time, resulting frequently in what are called scale-free structures. Although it is infeasible to derive directly a cognitive activation value for humans interacting with other humans, using an algorithm for determining ties that is sensitive to the timing of the interaction events seems like it would produce networks more similar in structure to human networks in our minds, which we believe is sympathetic to the approach discussed in Anderson and Schooler (1991). This limitation of traditional tie criteria may be partially ameliorated through accounting of time in the sampling process (e.g., latent growth modeling) or by using a specific time window (e.g., Cranshaw et al. 2012). These findings suggest that future simulations examining generative social networks must control and carefully report particularly criteria, memory quality, and running time. All have an effect on networks and network formation. 5.3 Implications for cognitive science The time-scales for memory retrieval of these chunks suggest that the cognitive band in our data appears to be important for social agents. As our cognitive agents become more complex and have social ties, the information about human social ties can provide an additional constraint on our cognitive theories. This work extends Morgan et al.’s (2010) work examining the influence of environmental factors on cognition and social interaction. Cognitive theories generally limit the influence of environment to inputs in the decision cycle. State changes in these models are the result of previous actions. Our work suggests that environmental factors influence interactions at a more basic level through relational networks in the mind. As cognitive theories move from modeling actions occurring at the cognitive and rational bands to the social band, we will need to formalize our understanding of the persistent effects of environment on memory. Simply put, modeling long term collaborations or social learning will require more than refreshing our agents’ initial friends list. Although ACT-R is capable of representing alter retrieval on appropriate time-scales, work remains to identify an appropriate DM chunk structure for alters. Table 5 suggests that even for these simple agents we can see the effects of limited attention as the population increases, indicated by the shift in the effect of grid ratio on density as the population moves from 40 to 60 agents. In our model, we identified core friend groups using memory thresholds after the fact; however, 123 142 C. Zhao et al. modeling long term collaborations will require more explicit models for recalling primary groups or significant others because agents will have to span primary groups (recall close friends who are geo-spatially separated from them) as the environment increases in complexity. Humans often recall friends by recalling institutions or settings. This suggests we need a more explicit theory of reification and schema building that accounts for the emergence of social objects in memory. Georgeon et al. (2010) and Georgeon and Marshall (2013) provide insights into this problem in their work examining sense making; however, further work is necessary to extend these theories to model the emergence of social objects and schemas. 5.4 Limitations and future work Future research will build upon some of the more interesting issues raised by this work. First, we would look to see if normalized thresholds have consistent effects on network topology independent of the environment. Second, we should include more agents and more runs (Reitter and Lebiere 2011), as we examined only a small set of factors and environmental configurations. Second, our claim to ‘‘cognitive agents’’ is somewhat weak because it rests nearly exclusively on the memory decay function. Our agents do not use many aspects of the ACT-R architecture, for example, perceptions do not inform decisions, perceiving a ‘‘friend’’ does not inform whether the agent decides to go to another room (i.e., our agents never choose to hang-out), and the agents do not have extensive knowledge. Extending the agents to interact more with the world and to have their behavior influenced by more sophisticated knowledge structures are obvious places for future work. For instance, such models would help to explore the influence of small cognitive effects (such as memory slips or motor control errors) on information diffusion or collaboration. Finally, we would extend our analysis to see if Dunbar’s Number arises from agents managing social ties while completing essential collaborative tasks (e.g., foraging for food). Testing this hypothesis will require increasing the population size and incorporating a task model, among other things. Acknowledgments This work was supported by a grant from DTRA (HDTRA1-09-1-0054). We thank Jaeyhon Paik for help on the agent design and Jeremy Lothian for assistance with VIPER. Appendix 1: Simulation implementation details To connect ACT-R to VIPER, we implemented the Telnet Agent Wrapper for ACTR (TAWA) in Common Lisp. It supports logging in, waiting for synchronization, logging, halting, and writing results to CSV files. It also exports a number of functions that provide ways to examine the environment, speak, listen, move, and otherwise control a virtual body in VIPER. When an ACT-R model is wrapped by TAWA, executions of model code are delayed until a privileged administrator agent signals for synchronization. Error 123 Factors of interest in agent-based socio-cognitive simulations 143 conditions are also caught by TAWA and standard UNIX error codes are returned instead of dropping into the more standard debugger. For example, a successful run returns 0 to the parent process, while any error (e.g., network errors like the server being unreachable) returns a non-zero value. Returning error codes like this allows automated error checking in large-scale experiments. Because memory decay and networks are strongly temporal, we paid special attention to time. To synchronize the agents, the administrator agent (which does not take part in the experiment) waits for all of the TAWA wrapped agents to finish loading and logging in. It then signals to TAWA to begin the simulation. Because TAWA delays the evaluation of the model code until synchronization, no agent experiences time before the synchronization signal. Further, all ACT-R models are set to run in real-time and for the same amount of ‘‘real-time’’, so they all halt after the same perceived period. Thus, the total time experienced is the same for all agents. Early benchmarks showed that ACT-R processes took up about 80 MB per process. We would only have been able to run about 100 processes on a single 8 GB machine before swapping would occur. To reduce the per-process footprint, a number of optimizations were implemented. Basic space reductions were achieved by using the DECLARE Lisp construct, as well as by pre-compiling the components, removing the debugger, and saving the whole system (sans the ACT-R agent model) as a system image. This reduced our per-process memory footprint somewhat, but they were not the biggest contributions towards memory usage reduction. In SBCL Lisp 1.0.52, the ‘‘–merge-core-pages’’ flag was recently added. This flag enables Kernel Same Page Merging under recent versions of Linux (Arcangeli et al. 2009). This optimization flags shared areas of memory as being able to be merged unless modified. Because a significant percentage of our agents were replicated, we found that we could reduce the per-process memory footprint as low as 8 MB per process (with one shared copy of the merged pages excepted). Thus, the only activities that increase the size of this footprint are changes within individual agent models. Sharing and merging copies increases the number of agents capable, whether on single processors or HPC. Together, these optimizations permit orders of magnitude more agents to be run in a single experiment than many previous efforts, enabling larger-scale analysis than has been previously done. Such large-scale work is planned for future research. Appendix 2: Analysis by population Symmetric ties Population: 20 agents See Table 6. 123 144 Table 6 C. Zhao et al. Symmetric network measure comparison (20 agents) Independent variable Density Average distance Clustering coefficient Eigenvector centrality Betweenness Threshold -0.38*** (0.00) -0.65*** (0.01) -0.47*** (0.00) 0.07*** (0.01) -0.06*** (0.00) Running length 0.06*** (0.00) 0.30*** (0.01) 0.07*** (0.00) 0.09*** (0.00) 0.01*** (0.00) Room map -0.004** (0.00) – -0.01*** (0.00) -0.07** (0.02) – Grid ratio 0.33*** (0.05) – 0.39*** (0.07) -0.32** (0.12) – Threshold2 0.21*** (0.00) -0.48*** (0.02) 0.15*** (0.00) -0.27*** (0.01) -0.02*** (0.00) Intercept -0.22** 1.36** -0.10 0.76*** 0.06 Adjusted R2 0.83 0.55 0.85 0.28 0.26 BIC -30,022.99 -6,114.63 -27,048.64 -18,914.03 -32,853.96 Standard errors are in parentheses * p \ 0.05, ** p \ 0.01, *** p \ 0.001 Population: 40 agents See Table 7. Table 7 Symmetric network measure comparison (40 agents) Independent variable Density Average distance Clustering coefficient Eigenvector centrality Betweenness Threshold -0.32*** (0.00) -0.40*** (0.02) -0.40*** (0.00) 0.05*** (0.01) -0.02*** (0.00) Running length 0.07*** (0.00) 0.64*** (0.01) 0.16*** (0.00) 0.08*** (0.00) 0.02*** (0.00) Room map 0.05*** (0.01) – 0.01*** (0.02) – – Grid ratio 0.26*** (0.05) – 0.36*** (0.08) – – Threshold2 0.19*** (0.00) -0.70*** (0.02) 0.10*** (0.00) -0.18*** (0.01) -0.03*** (0.00) Intercept -0.18** 1.53** -0.11 0.41** 0.04 Adjusted R2 0.78 0.54 0.76 0.18 0.20 BIC -29,849.2 -3,865.24 -24,276.8 -18,869.83 -34,137.72 Standard errors are in parentheses * p \ 0.05, ** p \ 0.01, *** p \ 0.001 123 Factors of interest in agent-based socio-cognitive simulations 145 Population: 60 agents See Table 8. Table 8 Symmetric network measure comparison (60 agents) Independent variable Density Average distance Clustering coefficient Eigenvector centrality Betweenness Threshold -0.19*** (0.00) -0.51*** (0.02) -0.36*** (0.00) 0.03*** (0.01) -0.02*** (0.00) Running length 0.04*** (0.00) 0.56*** (0.01) 0.11*** (0.00) 0.04*** (0.00) 0.02*** (0.00) Room map -0.02** (0.01) – – 0.05! (0.05) – Grid ratio -0.09** (0.03) – – 0.23! (0.08) – Threshold2 0.12*** (0.00) -0.82*** (0.02) 0.13*** (0.00) -0.26*** (0.01) -0.02*** (0.00) Intercept 0.18*** 1.74** 0.26** 0..10 0.08** Adjusted R2 0.77 0.57 0.81 0.24 0.32 BIC -36,536.88 -3,686.64 -28,199.56 -18,172.47 -42,935.08 Standard errors are in parentheses * p \ 0.05, ** p \ 0.01, *** p \ 0.001 Directed ties Population: 20 agents See Table 9. Table 9 Direct network measure comparison (20 agents) Independent variable Density Average distance Clustering coefficient Eigenvector centrality Betweenness Threshold -0.41*** (0.00) -0.10*** (0.01) -0.39*** (0.05) 0.08*** (0.00) 0.04*** (0.00) Running length 0.19*** (0.00) 0.28*** (0.01) 0.21*** (0.00) 0.02*** (0.00) -0.01*** (0.00) Room map 0.05** (0.02) – – – 0.03* (0.01) Grid ratio 0.19* (0.1) – – – 0.12* (0.06) Threshold2 -0.11*** (0.01) -0.28*** (0.01) -0.15*** (0.01) -0.02** (0.00) -0.05*** (0.00) Intercept 0.47*** 0.96** 0.71*** 0.07 0.09 Adjusted R2 0.79 0.29 0.76 0.07 0.05 BIC -21,807.24 -9,449.68 -20,262.82 -22,790.75 -28,845.32 Standard errors are in parentheses * p \ 0.05, ** p \ 0.01, *** p \ 0.001 123 146 C. Zhao et al. Population: 40 agents See Table 10. Table 10 Symmetric network measure comparison (40 agents) Independent variable Density Average distance Clustering coefficient Eigenvector centrality Betweenness Threshold -0.37*** (0.00) -0.23*** (0.01) -0.39*** (0.00) 0.06*** (0.01) – Running length 0.27*** (0.00) 0.20*** (0.01) 0.28*** (0.00) -0.06*** (0.00) -0.02*** (0.00) Room map – – – – 0.02* (0.01) Grid ratio – – – – 0.1* (0.05) Threshold2 -0.04*** (0.01) -0.29*** (0.02) -0.06*** (0.01) -0.08*** (0.01) -0.03*** (0.00) Intercept 0.39** 0.70* 0.47** -0.05 -0.08 Adjusted R2 0.76 0.28 0.76 0.06 0.05 BIC -21,326.68 -8,182.56 -20,146.43 -17,997.19 -30,998.11 Standard errors are in parentheses * p \ 0.05, ** p \ 0.01, *** p \ 0.001 Population: 60 agents See Table 11. Table 11 Directed network measure comparison (60 agents) Independent variable Density Average distance Clustering coefficient Eigenvector centrality Betweenness Threshold -0.28*** (0.00) -0.10*** (0.01) -0.42*** (0.00) 0.14*** (0.01) -0.01*** (0.00) Running length 0.18*** (0.00) 0.02* (0.01) 0.21*** (0.00) -0.09*** (0.00) – Room map -0.02! (0.01) -0.13* (0.06) 0.04* (0.02) – -0.04*** (0.01) Grid ratio -0.15* (0.07) -0.62* (0.28) 0.18* (0.09) – -0.20*** (0.03) Threshold2 – -0.66*** (0.02) -0.04*** (0.00) -0.15*** (0.01) -0.02*** (0.00) Intercept 0.51*** 2.28*** 0.39** 0.18 0.3*** Adjusted R2 0.79 0.37 0.81 0.15 0.11 BIC -26,961.6 -8,110.41 -23,085.32 -17,580.27 -36,378.39 Standard errors are in parentheses * p \ 0.05, ** p \ 0.01 123 Factors of interest in agent-based socio-cognitive simulations 147 References Allen TJ (1984) Managing the flow of technology: technology transfer and the dissemination of technological information within the R&D organization. MIT Press, Cambridge, MA Anderson JR (2002) Spanning seven orders of magnitude: a challenge for cognitive modeling. Cogn Sci 26(1):85–112 Anderson JR, Schooler LJ (1991) Reflections of the environment in memory. Psychol Sci 2(6):396–408 Anderson JR, Bothell D, Byrne MD, Douglass S, Lebiere C, Qin Y (2004) An intergrated theory of the mind. Psychol Rev 111(4):1036–1060 Arcangeli A, Eidus I, Wright C (2009) Increasing memory density by using KSM. In: Proceedings of the Linux Symposium. Montreal, pp 19–28 Axelrod R, Hamilton WD (1981) The evolution of cooperation. Science 211(4489):1390–1396 Barrett C, Eubank S, Marathe M (2006) Modeling and simulation of large biological, information and socio-technical systems: an interaction based approach. In: Goldin D, Smoka SA, Wegner P (eds) Interactive computation: the new paradigm. Springer, Berlin Heidelberg, pp 353–394 Bonacich P (1972) Factoring and weighting approaches to status scores and clique identification. J Math Sociol 2(1):113–120 Brantingham PL, Brantingham PJ (1993) Nodes, paths and edges: considerations on the complexity of crime and the physical environment. J Environ Psychol 13(1):3–18 Carley KM (1991) A theory of group stability. Am Sociol Rev 56:331–354 Carley KM (1992) Organizational learning and personnel turnover. Organ Sci 3(1):20–46 Carley KM, Kjaer-Hansen J, Prietula M, Newell A (1990) Plural-Soar: a prolegomenon to artificial agents and organizational behavior. In: Masuch M, Warglien M (eds) Artificial intelligence in organization and management theory. Elsevier, Amsterdam, pp 87–118 Carley KM, Pfeffer J, Reminga J, Storrick J, Columbus D (2012) ORA User’s Guide 2012 Conte R, Andrighetto G, Campennı̀ M (eds) (2014) Minding norms: mechanisms and dynamics of social order in agent societies. Oxford University Press, New York Cranshaw J, Schwartz R, Hong JI, Sadeh N (2012) The Livehoods Project: Utilizing social media to understand the dynamics of a city. Paper presented at the 6th international AAAI conference on Weblogs and Social Media. Retrieved 10/24/2013 DiFonzo N (2008) The watercooler effect: a psychologist explores the extraordinary power of rumors. Avery, New York Dunbar RIM (1998) Grooming, gossip, and the evolution of language. Harvard University Press, Cambridge Epstein JM (2006) Generative social science: studies in agent-based computational modeling. Princeton University Press, Princeton Epstein JM, Axtell RL (1996) Growing artificial societies: social science from the bottom up. The MIT Press, Cambridge Everett M, Borgatti SP (2005) Ego network betweenness. Soc Netw 27(1):31–38 Freeman LC (1978–1979) Centrality in social networks conceptual clarification. Soc Netw 1(3):215–239 Georgeon O, Marshall J (2013) Demonstrating sensemaking emergence in artificial agents: a method and an example. Int J Mach Conscious 5(02):131–144 Georgeon O, Morgan JH, Ritter FE (2010) An algorithm for self-motivated hierarchical sequence learning. In: International Conference on Cognitive Modeling, Philadelphia, pp 73–78 Gonzales C, Lebiere C (2005) Instance-based cognitive models of decision making. In: Zizzo D, Courakis A (eds) Transfer of knowledge in economic decision-making. Palgrave Macmillan, New York, pp 148–165 Holland JH (1992) Complex adaptive systems. Daedalus 121(1):17–30 Juvina I, Lebiere C, Martin JM, Gonzalez C (2011) Intergroup prisoner’s dilemma with intragroup power dynamics. Games Econ Behav 2(1):21–51 Kephart WM (1950) A quantitative analysis of intragroup relationships. Am J Sociol 55(6):544–549 Kraut R, Kiesler S, Boneva B, Cummings J, Helgeson V, Crawford A (2002) Internet paradox revisited. J Soc Issues 58:49–74 Kuehl RO (2000) Design of experiments: statistical principles of research design and analysis. Duxbury/ Thomson Learning, Belmont 123 148 C. Zhao et al. Lebiere C, Gonzalez C, Dutt V, Warwick W (2009) Predicting cognitive performance in open-ended dynamic tasks: a modeling comparison challenge. In: the 9th international conference on cognitive modeling, Manchester McCarty C, Killworth PD, Bernard HR, Johnsen EC, Shelley GA (2001) Comparing two methods for estimating network size. Hum Organ 60(1):28–39 Milgram S (1967) The small world problem. Psychol Today 1(1):61–67 Miller JH, Page SE (2007) Complex adaptive systems: an introduction to computational models of social life. Princeton University Press, Princeton Moreno JL, Jennings HH (1938) Statistics of social configurations. Sociometry 1(3/4):342–374 Morgan GP, Carley KM (2011) Exploring the impact of a stochastic hiring function in dynamic organizations. Paper presented at the BRIMS, Retrieved Morgan JH, Morgan GP, Ritter FE (2010) A preliminary model of participation for small groups. Comput Math Organ Theory 16(3):246–270 Newell A (1990) Unified theories of cognition. Harvard University Press, Cambridge Newell A, Rosenbloom P, Laird J (1989) Symbolic architectures for cognition. In: Posner M (ed) Foundations of cognitive science. MIT Press, Cambridge Pavlik PI Jr, Anderson JR (2005) Practice and forgetting effects on vocabulary memory: an activationbased model of the spacing effect. Cogn Sci 29:559–586 Pearl R, Reed LJ (1920) On the rate of growth of the population of the United States since 1790 and its mathematical representation. Proc Natl Acad Sci 6:275–288 Reitter D, Lebiere C (2010) A cognitive model of spatial path planning. Comput Math Organ Theory 16(3):220–245 Reitter D, Lebiere C (2011) Towards cognitive models of communication and group intelligence. In: Proceedings of the 33rd annual meeting of the cognitive science society, Boston, pp 734–739 Ritter FE, Haynes SR, Cohen MA, Howes A, John B, Best B (2006) High-level behavior representation languages revisited. In: Proceedings of the 7th international conference on cognitive modeling, Trieste, pp 404–407 Schelling TC (1971) Dynamic models of segregation. J Math Sociol 1(2):143–184 Silverman BG (2004) Human performance simulation. In: Ness JW, Ritzer DR, Tepe V (eds) The science and simulation of human performance. Elsevier, Amsterdam, pp 469–498 Simon HA (1991) Bounded rationality and organizational learning. Organ Sci 2(1):125–134 Verhulst P-F (1845) Mathematical researches into the law of population growth. Nouveaux Mémoires de l’Académie Royale des Sciences et Belles-Lettres de Bruxelles 18:1–42 Wasserman S, Faust K (1994) Social network analysis: methods and applications, vol 8. Cambridge University Press, New York Yen J, Yin J, Ioerger TR, Miller MS, Xu D, Volz RA (2001) CAST: collaborative agents for simulating teamwork. In: Proceedings of the 17th international joint conference on artificial intelligence (IJCAI-01), Los Altos, pp 1135–1142 Changkun Zhao is a PhD candidate at the College of Information Sciences and Technology’s Applied Cognitive Science Lab. He is interested in spatial memory modeling, information processing theory. Ryan Kaulakis is a PhD candidate at the Applied Cognitive Science Laboratory in the College of IST. His research interests are in model fitting in cognitive architectures, and soft-computing. Jonathan H. Morgan is and a Ph.D student at Duke University and a former researcher at the College of IST’s Applied Cognitive Science Lab with expertise in modeling social interactions and social moderators, including the moderating effect that relative spatial positions have on group performance. Jeremiah W. Hiam is a research assistant at the College of IST’s Applied Cognitive Science Lab. Game design, AI in games, and psychology of gaming are his fields of interest. Frank E. Ritter is on the faculty of the College of IST, an interdisciplinary academic unit at Penn State to study how people process information using technology. He edits the Oxford Series on Cognitive 123 Factors of interest in agent-based socio-cognitive simulations 149 Models and Architectures and is an editorial board member of Cognitive Systems Research, Human Factors, and IEEE SMC Part A: Systems and Humans. Joesph Sanford is a Ph.D student at the Department of Computer Science, Tufts University. Before that, he was a research assistant at Applied Cognitive Science Lab of Penn State University. Geoffrey P. Morgan is a PhD student in Carnegie Mellon’s Computation, Organization, and Society program. Morgan has experience in developing autonomous robotic systems, self-learning systems, cognitive models, and user-assistive systems. He is interested in organizational impacts on individual decision-making and on optimizing organizational performance. 123
© Copyright 2026 Paperzz