J Evol Econ DOI 10.1007/s00191-013-0334-4 REGULAR ARTICLE An NK-like model for complexity Marco Valente © Springer-Verlag Berlin Heidelberg 2013 Abstract The level and nature of complexity is widely regarded as an important determinant of a number of economic, technological and organizational phenomena. A popular modeling tool for the representation of complexity in economics and organizational sciences is the NK model that represents the complexity stemming from the interactions among the elements of a system. This paper proposes an enhanced model for complexity that, though maintaining the core design (and properties) of the NK model, provides a more intuitive and richer representation of complexity, extending its applications and deepening the understanding of its effects on economic systems. The proposed pseudo-NK (pNK) model is defined on real-valued variables, as opposed to the binary variables required by NK, so as to allow for richer and more intuitive definitions of distance and search strategies. It also admits as a source of complexity not only the number of interactions, as in NK, but also their intensity, opening a novel way to express and measure the level of complexity. Finally, instead of relying on statistical properties of a large dataset of random values, pNK is defined as a deterministic function, far simpler to implement, to interpret and to calibrate for specific requirements. The paper replicates known results and presents original ones; in both cases, the proposed model proves a powerful tool for the investigation of the role of complexity, particularly in agent-based models. This paper was presented to the International Schumpeter Society Conference, 2010, Aalborg (DK). I wish to thank Tommaso Ciarli for valuable comments and suggestions during the development of the model. Luigi Marengo, Lee Altenberg and an anonymous referee provided encouraging comments and corrections to earlier versions. Any remaining error or imprecision is, obviously, my sole responsibility. M. Valente () Università dell’Aquila, L’Aquila, Italy e-mail: [email protected] M. Valente LEM, Pisa, Italy M. Valente Keywords NK model · Complexity · Simulation models JEL Classifications C63 · D83 · O32 · O33 1 Introduction Borrowing ideas developed for a given purpose and applying them to a completely different domain is a widely used method to advance knowledge and solving problems, up to the point that, one may suggest, copying and mixing different ideas is the main driver of human creativity and of scientific advancements. The use of metaphor and concepts developed in different domains, however, runs the risk of misadaptation. Though the core of the original idea can be useful in the new domain, it is frequently necessary to make some adaptations, removing and adding elements as necessary for its novel application. Witness, for example, the pioneering attempts to fly by means of heavier than air flying machines. For centuries inventors attacked the problem following the strategy of imitating the solution observed in most birds, i.e. moving wings. As long as the imitation was too close (using wings for both supporting and propelling), the degree of success was very limited. Only when the core idea was radically revised, separating the function of propelling from the function of supporting, did the imitation actually succeed. Concerning the study of complexity, biologists have developed the NK model (Kauffman and Levin 1987; Kauffman 1993) representing, in very stylized and elegant way, the effects of fitness improving mutations in a context in which interdependency makes rough landscapes. NK has been particularly successful because it provides a simple tool for generating an abstract representation of a problem (the probability of survival of a species) that, contrary to other modelization techniques, could be easily tuned to make the search for a solution harder or simpler. The representation of the problem relies on the core of complexity, that is, interdependency, reproduced by means of a large number of uniformly distributed random numbers. The result is that NK fitness landscapes show statistical properties depending on the interaction levels only, which are very robust to the choice of the set of random numbers used.1 From a biological perspective, NK is very useful because it is well known that genes’ effects on phenotypes are strongly influenced by their interactions. NK poses the interaction explicitly at the center of the analysis, and carefully avoids making any further assumptions, being able to derive interesting results concerning the expected characteristics of a species, its pattern of evolution, and even the reasons for the spontaneous emergence of modularity.2 However, biologists are not the only scientists interested in the modeling of complexity by means of interactions. Economists and scholars in organization sciences have had a long-time interest in the study of the effects of interactions (Simon 1969). Therefore, many researchers from these fields have adopted NK as their instrument 1 See, 2 See, e.g., Weinberger (1991), Durret and Limic (2003), Skellett et al. (2005), Kaul and Jacobson (2006). e.g., Wagner and Altenberg (1996), Altenberg (1995). An NK-like model for complexity of choice to represent and to study properties of organizational structures, technological innovation, etc. Originally devised as a metaphor for how nature deals with complex problems, NK-inspired models have been used to study artificial systems, organizations, technological developments, industrial dynamics, and much more. In such applications, NK typically represents a complex problem, where the performance depends on the interactions among several components. Simulated agents (say a firm or a generic problem solver) are engaged in exploring the problem space on the base of local and myopic information. The researchers are interested in assessing the results as a function of the complexity of the task and the limited capacities of the agents. Simulation modelers in economics and management have been eager adopters of variations of NK models producing a sizable number works.3 This work is based on the conviction that, however successful, the results from complexity studies in economics and organizational sciences have been limited by a mis-adaptation (or, better, insufficient adaptation) of NK in its passage from biology to social sciences. As an example of the limitation imposed by NK, consider the problems concerning the search strategy. The original NK model relies on random mutations of one single dimension replicating the equivalent of random mutation in natural systems. Human individuals or organizations, conversely, are likely to include at least a bit of intentional, purpose-driven behavior, however limited by informational constraints. For this reason, many scholars proposed modifications of the baseline NK model to include different search strategies. However, in order to design appropriate intentional strategies for search it would be useful, to the modeler, to know where the optimum is located, and possibly also to have the opportunity to control some properties of the problem space, such as the specific location and nature of local optima. However, NK has been designed as a metaphor for natural systems, which are passively observed and not controlled, and therefore these possibilities, useless from a biological perspective, are precluded. The increasing popularity of NK made ever more evident its limitations, and several contributions propose variations altering some aspects of its mechanisms (Li et al. 2006). Along these lines, the goal of the present paper is to propose a radically alternative implementation of the core features of NK in such a way as to make the model more flexible and adaptive for the novel applications to which it is increasingly put at work. The major changes of our proposal consist in the replacement of the binary variables required in NK with more general (and familiar) real-valued variables, and the shifting from a stochastic definition of the landscape to a deterministic functional one. Besides being able to inherit the same properties of the original NK model, our proposal allows us also to distinguish the structure of interactions (which component interacts with other components) from the intensity of interactions (how strong is a given interaction). In the following, we present the new model by hinting at the improvements it can bring to the study of complexity, particularly when purpose-driven agents are involved. During the presentation, we alternate examples of applications of the new 3 See, for example, Levinthal (1997), Frenken et al. (1999), Kauffman et al. (2000), Rivkin (2000), Rivkin and Siggelkow (2002), Lenox et al. (2006), Frenken (2006). M. Valente model to replicate known results generated with NK with original exercises permitted by the new model. The next section discusses the core elements and properties of NK, highlighting its shortcomings, particularly for application in fields different from biology. After that, we describe the new model, called pseudo-NK model (pNK), showing how it is able to replicate all the relevant properties of the original NK and removing, or relaxing, its major shortcomings. The third section presents a series of results produced by the simplest pNK version, based on two dimensions only, and exploiting only the intensity of interdependency, the novel feature introduced by pNK. The last section discusses the properties of a generalised pNK defined over N dimensions. A final section draws the conclusions. 2 Pseudo-NK model for complexity An NK model can be considered as composed by two distinct components: a problem specification represented by a space of potential solutions to an implicit problem, and a search algorithm scanning this space hunting for better and better solutions. The problem space is represented by all the binary strings of length N, each associated with a fitness value, that is, the pay-off for that solution. The search algorithm consists in a routine generating a pattern, that is, a sequence of solutions starting from a randomly chosen initial solution, or point in the N-dimensional binary space. The search routine defines the way in which we move from any one point to the next, that is, how to generate a new string from an existing one. For example, the typical routine, originally proposed, consists in choosing randomly one of the N elements of the current string and flipping (“mutate”) its value, from 0 to 1 or vice-versa, so that the new string will differ from the old one for only one bit. If the fitness of the new string is higher than that of the previous one, the new string is adopted, making a “step” in the pattern of exploration of the space. Otherwise, if the fitness of the mutated fitness is lower than that of the present one, the mutation is rejected and the search will continue from the same string. The repeated application of the search algorithm generates a path across the space of solutions. The path will be made by fitness-increasing steps, because of its construction, and terminates when it reaches a string from which all accessible strings have lower fitness, and hence there are no further possible steps. These strings are referred to as “peaks”, trapping any search based on hill-climbing strategies. Two aspects make NK particularly attractive. First, it is possible to determine how complex should be the space of solutions, or fitness landscape. The effect of complexity is represented by the probability to encountering the highest fitness point during a search pattern. Hence “simple” means that it is very likely to find this point, while “difficult” means that many searches will stop at lower fitness peaks. In NK, it is possible to tune the degree of complexity (i.e. the number of sub-optimal local peaks) by setting the number of interactions among the variables of the space, indicated by K. Landscapes with no or few interactions represent simple problems, i.e. contain few local peaks. In the simplest possible case (K = 0), there exists only one peak, so that all searches starting from any initial point will lead to this global optimum. Increasing K generates “harder” problems, represented by landscapes An NK-like model for complexity with a larger number of local peaks and decreasing chances of reaching the global optimum. The second aspect of NK is the representation of the search algorithm. The original proposal, coming from a biological metaphor, assumes the search strategy to be local and myopic. Local because the search implies the impossibility to observe the space beyond the narrow neighborhood of the current string; myopic because it prevents collecting past information or predict future events, focusing on the immediate goal of just improving the current condition. In other applications, NK models are endowed with search strategies on longer distances, for example admitting the possibility for long jumps (Levinthal 1997), search within modules of a given dimension (Frenken et al. 1999) and other methods (Auerswald et al. 2000; Kauffman et al. 2000). These two elements, complexity due to interactions and algorithmic search strategy, provide a stylized representation of a complex task faced by a would-be solver with limited capacity and information. Hence, the model allows us to study the effects of varying degrees of complexity upon actors with different problem solving approaches. In its original context, NK has been used as a pure representation of complexity, studying, for example, the expected number, location and average fitness of local peaks as a function of K, playing down the role of the search strategy, since this represents merely the blind action of nature. However, in other fields, such as management, organization theory, economics of technical change, etc., the popularity of NK stems from the large number of extensions to its basic implementation, particularly concerning the opportunities of search performed by purpose-driven intelligent actors. Hence, in these fields, NK has become a sort of sand-box where it is possible to test different problem solving approaches facing tasks of varying complexity. For example, Levinthal (1997) introduces the possibility for agents (representing firms) to perform “long jumps” besides local search and imitation of successful competitors. Rivkin and Siggelkow (2002) propose a model where decisions are distributed between a “CEO” concerned with overall fitness, and managers interested in “parochial” interests whose success depends on a portion of the landscape under their control. Lenox et al. (2006) use a version of NK to represent a technological space explored by competing firms. In general, NK provides a working implementation for the intuition that simple problems can be successfully solved, while increasingly complex ones generate sub-optimal solutions with degrading expected payoff’s. Using NK, researchers are able to investigate likely consequences expected under controlled conditions for the environmental complexity of the problem and solving strategy adopted. The very simplicity of the model invites users of NK to devise variations of the baseline version to implement customized search strategies for specific types of problem solvers. In this work, we claim that NK, however effective in offering a representation of complexity and its consequences, contains a number of unnecessary biases and limitations when used to represent complexity in an economic context. A first problem arises from the necessity to use of binary variables, representing presence/absence of a given feature. This aspect generates confusion because the only possible measure of the distance between two points is the Hamming distance, that M. Valente is, the number of variables with different values between two points. This concept of distance is very different from the familiar Euclidean distance applicable to realvalued spaces. In a binary space, a “step” in a given direction can be only one unit long, though it can be directed in N different dimensions. Besides making impossible the graphical representation of the landscape, the use of binary variables generates confusion and misinterpretations. For example, one is entitled to speak of “valleys” and “peaks”, but it would be wrong to assume that they have the three dimensional shape evoked by those names. A second problem arises from the use of randomness to generate the landscape, the properties of which are actually represented only as statistics over a reasonable large number of replications. In practice, each landscape will have a different (random) distribution of the relevant points, such as the global maximum or the local peaks, and the distribution cannot be pre-determined nor observed without testing each and every point of the landscape. This means that the modeler is as ignorant of the fitness landscape as the virtual agents exploring it, making their assessment very difficult in case of large landscapes. A last issue hindering the use of NK models consists in the relative difficulty in its implementation and use. Though the programming of a NK is relatively straightforward, its structure relies on a combinatorial number of random values, the sheer number of which is simply impossible to manage even for modest number of dimensions. The reason is that a fitness landscape requires the storage of a database made of N*2K+1 random values, which can easily exceed the capacity of any computer even for low levels of K, in effect making it impossible to build all the points of a landscape. It is possible to find technical solutions to generate portions of large landscapes so as to collect sample statistics of their properties,4 but the necessity to rely on samples further diminishes the capacity of the modeler to investigate and to exploit the landscape to assess the results. 2.1 Improving on NK The limitations listed above make NK difficult to use and its results complicated to interpret when it is applied to investigate complexity in artificial systems. We propose here an alternative model replicating (and extending) the features of NK that are relevant to economic and organizational scientists, without the limitations of the original NK highlighted above. The new model, which we call pseudo-NK (pNK), enjoys the following properties: • • Functional representation of fitness: the fitness function is defined as a deterministic function of a multidimensional vector. Therefore, it can be implemented as a routine, without requiring large data sets of random numbers, simplifying the coding of very fast implementation for even large landscapes. Multidimensional real-valued landscape: the landscape of pNK is composed of a subset of the n-dimensional real-valued space: x = {x1 , x2 , ..., xN } ∈ N , 4 Altenberg (1997) is credited as the first to propose a solution to this problem. An NK-like model for complexity • • with fitness represented by a real-valued function f ( x ). Both domain and codomain can be freely determined by the modeller. User determined maximum and landscape’s overall shape: f ( x ) has a maxx ∗ ) ≥ f ( x ), x ∈ ) determined by the user. imum5 in a point x∗ (such that f ( The shape of the landscape is a well-behaved function, so that the modeller can evaluate analytically the properties of the landscape. User determined interdependency: for any given couple of dimensions i and j, the user can set a varying degree of interdependence ai,j , ranging from full independence to maximum interdependence. Intermediate and varied levels of interdependency allow us, for example, to define landscapes where a dimension depends strongly on some dimensions and weakly on others.6 Besides these improvements, the definition of interdependency used in pNK remains the same as that used in the NK model: dimension xi is dependent on dimension xj if the maximization of the fitness function requires a different value for xi for different values of xj . In a different terminology, we can say that there exists dependency of xi on xj if, for at least some value of the other N-2 variables, the derivative of f ( x ) in respect to xi changes signs for different values of xj . It is worth noting that, in NK (and pNK, too), interdependence does not simply imply that modifications xj affects how xi impact on the fitness function f ( x ). In fact, it may be possible that such influence is strictly monotonic, that is, for different xj we observe that xi has different impacts on the overall fitness value, but they always have the same sign. Though in this case we do have interdependency, in a sense, this is not affecting the possibility of identifying the optimal value for xi independently from xj . Only when ∂f ( x) ∂xi changes sign for different values of xj , is the fitness-optimizing value of xi a function of xj . In this case, a hill-climbing strategy based on varying xi only is liable to be stuck in a local peak determined by the specific value of xj . In the following, we describe a possible implementation of a fitness function for pNK providing the properties listed above. 2.2 pNK fitness function pNK, as NK, consists of a fitness function defined on a set of N variables and a search algorithm. The fitness function proposed here7 for pNK borrows heavily from the NK implementation, with three notable differences. First, it considers realvalued variables (instead of binary); second, it is a deterministic function, rather than stochastic; third, it allows for different degrees of interdependence, instead of presence/absence. 5 Actually, it is possible to extend the model to admit multiple global maxima, though we will ignore this possibility. 6 A different version of the standard NK models, the so-called generalized NK, also allows user determined interdependency (Altenberg 1995, 1997). 7 Here we adopt a specific functional form providing the properties required to pNK; however, there is a whole class of functions that may be used, depending on specific requirements. M. Valente The overall fitness value of a point of the landscape domain (i.e. a point of N ) is, as in NK, the average of N fitness contributions for each of the variables: N x) ( (1) N where φi (...) is the fitness contribution function for dimension i. While in NK φi is a random value, in pNK this is a deterministic function defined as: f ( x) = i=1 φi f Max (2) x )|) (1 + |xi − μi ( where f Max is a user-determined parameter indicating the maximum of the function (typically 1). Equation 2 essentially states that φi is a decreasing function of the distance between the variable’s value and another function μi ( x ), defined as: x) = φi ( μi ( x ) = ci + N ai,j xj (3) j =1 The value μi defines a sort of “target” for dimension i that, when hit by xi , generates the maximum level of contribution of the variable to the overall fitness function. However, the value of xi maximizing φi may not be the most desirable, concerning the overall fitness value. In fact, xi influences also all contributions φj for the variables where aj,i = 0. Therefore, it is well possible that moving xi to maximize φi actually decreases the overall fitness value because of the deterioration generated in other fitness contributions φj ’s where aj,i = 0. The fitness function here8 proposed allows for ample flexibility. In particular, it is possible to determine features of the landscape that are not available in NK models: • • • Set maximum fitness. Simply setting f Max determines the maximum value of the fitness function. Set the global optimum. For any dimension i, it is possible to compute ci such ∗ . that the maximum fitness is obtained at a desired point x∗ = x1∗ , x2∗ , ..., xN To this point is the global maximum, we need to set ci = xi∗ − ensure that ∗ j =i ai,j xj . Set interdependencies. Varying the values of ai,j it is possible to make more or less relevant the interdependency between two dimensions. The first two properties are useful to exploit pNK in a context where the modeller is interested in determining a specific maximum fitness value at a specific point of the landscape. However, the last property is far more interesting, since it allows pNK to model complexity in a more detailed way in respect of NK. In fact, NK implements interdependency as a presence/absence property: if dimension j affects 8 The proposed model can easily be modified changing the functional form for Eqs. 2 and 3. The same properties of the model discussed here are maintained provided that μi ( x ) is a positive function of ai,j xj and that φi ( x ) is an inverse function of |xi − μi ( x )|. Using different functional forms for these elements will change the density of local peaks and the overall shape of the landscape, though maintaining the same properties discussed here. An NK-like model for complexity i then the contribution of i changes when j changes, and the two contributions are totally unrelated. Conversely, pNK allows to tune the level of interdependency: the effect of j on i can be null, weak or maximum, depending on a single parameter. Furthermore, since pNK is deterministic and expressed in a functional form, it is possible to represent and even visualize (when N = 2) the effect of sliding levels of interdependency. The explicit distinction between complexity stemming from interactions’ structures and from interactions’ intensities is theoretically relevant, as noted by Simon (1969) in his analysis of nearly-decomposable systems, suggesting that, though every element of a system is influenced by any other, some of the interactions are stronger than others. This distinction makes it possible to manage apparently intractable problems, trading off optimality against the reduced costs of dealing with simplified systems. In NK, since interactions have only one level, this distinction is blurred, and the intensity of interactions can only be crudely regulated by increasing or decreasing the number of connections, that is, K. However, in many cases, researchers are interested in representing complexity explicitly, and exclusively, as stemming from interaction intensities. The next section shows how many of the applications of NK to the organization literature can be reproduced by using a space made of N = 2 dimensions only, and using instead the level of the only existing interaction. The intuitiveness of the bi-dimensional representation, due to the possibility of visualize the landscape, allows a far greater insight than the original exercises implemented in NK. Before reporting on these exercises, the next paragraph defines the algorithm of search strategies in a pNK context, as required in dealing with landscapes made of real-valued variables. 2.3 Search strategy on pNK The baseline research strategy in the NK model lies in the so called one-bit mutation. This consists in generating a new point starting from an existing one by modifying a single bit of the binary string representing the point. The new point is accepted as a step if the fitness associated with the new point is higher than that of the existing one; otherwise the step is rejected. On a real-valued fitness landscape, the one-bit mutation strategy can be expressed by the following algorithm applied to a generic point in order to generate a subsequent point of a pattern. 1. 2. 3. 4. 5. 6. Choose randomly one dimension among the N available. Choose randomly one direction: + or −. Change by the value of the chosen dimension in the chosen direction. If the change leads to a point with higher fitness, move to the new point. If the change leads to a point with lower fitness, remain at the same point. Continue from 1. A path of a landscape is therefore made up of a sequence of steps generated by repeating applications of the above algorithm. The routine is repeated until no M. Valente change of on any of the variables of the landscape is able to produce a fitness increment. The points on which this happens constitute the (local or global) peaks of the landscape. In the economic and organizational literature, there are many alternative proposals for search algorithms designed to represent specific behaviors. Each can easily be adapted to work on a real-valued landscape, actually providing far more flexibility. This is because in pNK it is possible to define three distinct features of a step: a) its “angle” (the dimension(s) chosen); b) its direction (increase or decrease in the current value); c) its length, set by . As a consequence, it is possible, for example, to distinguish two different types of “long jumps” (Levinthal 1997): one made of small modifications of many dimensions, and another made of a large change in a single-dimension. For our purposes, we will consider as a constant parameter represented by a small value. This value is of only minor importance since its only role is to discretize the continuous real-valued space. The discretization permits us to ignore the small differences of fitness between two very close points. This choice is due to the necessity, in this first version, to maintain the myopic nature of search strategies in complex landscapes as in the original version of NK. Limiting to be small and constant, in fact, implies forcing the search strategy to consider only a small patch of the landscape surrounding the agent’s position. However, it is easy to devise extensions of pNK in which agents searching a landscape may attempt both steps in various dimensions but also modify the length of steps , for example, extending when trapped in a local optimum. 3 Complexity from intensity of interdependency The interdependency among components of a system generates complexity by means of its structure (the network of interactions) and their intensities. In this section, we study the properties of pNK in relation to the intensity of interdependency only, neglecting the role of the structure. For this purpose, we consider two dimensions only (N = 2), so that the structure of interdependency, trivially made up of the only possible connection between the two existing dimensions, cannot play any role. We will show that, even using this minimal landscape, pNK is able to replicate (and, actually, enrich) a large number of results originally produced with NK. Limiting to a space with small dimensionality, we will be able to produce the same results in a far simpler and more intuitive manner. We will consider the implementation of pNK as a landscape containing a single global peak located at a pre-determined point. We will consider only symmetric landscapes, where the two interdependency relations between the variables are identical in absolute levels and opposite in values. This is obtained by considering a1,2 = −a2,1 and 0 ≤ |ai,j | ≤ 1. Firstly, we show the topology of the space represented by pNK, exploiting the possibility to visualize the 3D graph made of two dimensions for the plane and the third for the fitness corresponding to each point. We will then move to replicate, to extend and to comment on various results obtained by considering myopic explorations of this space. An NK-like model for complexity Example of fitness landscape 1 0.9 0.8 0.7 0.6 0.5 102 101.5 101 100.5 100 X2 99.5 0.4 0.3 98 98.5 99 99.5 X1 100 100.5 99 101 101.5 98.5 10298 Fig. 1 Fitness landscape for N = 2 with symmetric and relatively strong interdependency. Global optimum set at (100,100) with maximum fitness set to 1, and |ai,j | = 0.7. A hill-climbing strategy based on one-dimensional mutation corresponds to moves parallel to one of the axes. If, as in the figure, the ridges of the landscape are diagonal to the axes, they represent areas of local optima for one-dimensional search strategies. In fact, any vertical or diagonal step would force a “step down” from the ridge resulting in a fall of fitness. Only if the ridges are parallel to the axes (ai,j | = 0) can a one-dimensional search strategy walk on the ridges, and it will always reach the global maximum 3.1 Visualisation of 2-D pNK fitness landscapes Figure 1 shows a fitness landscape built on two dimensions. The graph reports a landscape with the global optimum set at x ∗ = (100, 100), maximum fitness to f Max = 1, and a high, but not maximum, level of symmetric interdependency |ai,j | = 0.7. The landscape formed by the chosen fitness function9 is composed of a single peaked surface with monotonically decreasing fitness for points increasingly distant from the optimum, though points with the same distance from the optimum may have different fitness values. The surface is composed by four “ridges”, separated by “valleys”, leading to the global optimum. The two ridges are orthogonal because of the assumption of symmetry of interdependency, i.e. ai,j = −aj,i . The central aspect of a bi-dimensional pNK is, obviously, its degree of interdependency. In graphical terms, the interdependency is represented by the angles of the ridges in respect of the axes, which are controlled by the absolute values of the parameters |ai,j |. To make a more complete analysis of the effects of interdependency the graphs reported, in Fig. 2 describe five different landscapes generated by pNK for |ai,j | = 0, 0.25, 0.5, 0.75, 1. For |ai,j | = 0, the two ridges leading to the global maximum 9 Different functional forms can be generated by altering the expression of the fitness contributions φ . i only requirement is that the φi is inverted related to the difference |μi − xi |. The M. Valente Fitness |a|=0.25 Fitness |a|=0 102 101.5 1 102 0.9 101.5 0.8 0.8 101 100.5 100 101 0.7 100.5 0.6 0.5 100 99.5 0.3 99 99.5 100 100.5 101 101.5 98 102 98 98.5 99 102 101.5 99.5 100 100.5 101 101.5 98 102 102 1 0.9 101.5 101 100.5 100 101 0.7 0.6 100.5 0.5 100 99 99.5 100 100.5 101 101.5 99.5 0.3 99 99 98.5 98.5 98 102 0.7 0.6 0.5 0.4 0.4 99.5 1 0.9 0.8 0.8 98.5 0.3 Fitness |a|=0.75 Fitness |a|=0.50 98 0.5 98.5 98.5 98.5 0.6 99 99 98 0.7 0.4 0.4 99.5 1 0.9 98 98.5 99 99.5 100 100.5 101 101.5 0.3 98 102 Fitness |a|=1.00 102 101.5 1 0.9 0.8 101 100.5 100 0.7 0.6 0.5 0.4 99.5 0.3 99 98.5 98 98.5 99 99.5 100 100.5 101 101.5 98 102 Fig. 2 Fitness landscapes for N = 2 and |ai,j | = 0.0; 0.25; 0.50; 0.75; 1.00 are parallel to the axes. For |ai,j | = 1, the ridges run on the diagonals, at 45 degrees in respect of the axes. For intermediate values, they have increasing angles.10 The next paragraph explores the areas of local peaks trapping one-dimensional search strategies for each of the above landscapes. 3.2 Local peaks and interdependency In contrast to NK, pNK is not only able to replicate the statistical relation between complexity and extension of local peaks, but it is also possible to locate them and to explain the mechanism trapping the search strategy and hence causing the peak. The angle of the ridges in respect of the axes is crucial to determine the properties of pNK. In fact, any point other than the ridges (the “valleys”) cannot have local peaks, since there is always a whole area of neighbors closer to the optimum and with 10 All the experiments reported in the paper are produced with simulation programs implemented with Laboratory for Simulation Development-LSD (Valente 2008). The LSD platform is available for download at www.labsimdev.org. The code for the models and the specific exercises, together for the instructions on their use, is available upon request. An NK-like model for complexity higher fitness. Hence, either a pattern remains always in a valley, and hence ends up necessarily in the global optimum, or it reaches one of the ridges, which are therefore the only areas that needs to be studied. The one-dimensional search strategy means that the only admissible steps are those made maintaining constant one variable, that is, moving “horizontally” or “vertically”, parallel to one of the axes. If the ridges of the fitness surface run parallel to the axes, than such a strategy will surely bring us to the global optimum. This is because the ridges “dominate” the valleys, and therefore are likely to be reached by the pattern.11 Once a point on a ridge is reached, the strategy is able to walk on the top of it, eventually reaching the global optimum. Conversely, for landscapes with diagonal ridges (i.e. forming large angles with the axes), search patterns are likely to get trapped in local peaks. In effect, most of the points on the ridges constitute, for these landscapes, the set of local optima. This is because, when the ridges are diagonal, moving parallel to the axes (in the direction pointing towards the global optimum) generates two opposite variations of fitness. First, since the step will bring away from the ridge, the fitness of the new point will tend to be lower than the starting point on top of the ridge. Second, since one direction of the step brings us closer to the optimal point (because of the angle of the ridges), the fitness will tend to increase. The net effect is uncertain in general, and depends on which segment of the ridge the pattern has reached, besides their angle in respect to the axes. Given the functional form chosen, the slope of the ridges is gentler the farther away from the global optimum, and therefore, in these areas, the positive effect on fitness by getting closer to the maximum will be weaker. Conversely, segments of the ridges near the global optimum have steeper fitness and shallower valleys, and therefore it is more likely that the fitness loss caused by stepping off the ridge is smaller than the gain in getting closer to the maximum. As a result, patterns reaching the ridges far away from the optimum are likely to get trapped, while those managing to stay in valleys until close to the maximum will avoid the trap of local peaks. To visualize the properties of the intensity of interdependency, we consider the (average) final fitness reached by search strategies. For each landscape, we ran about 30,000 independent searches, starting from randomly chosen initial points. Each search consists in steps performed by adding or subtracting to a randomly chosen variable, and moving to the new point in case the change generates higher fitness.12 For each search, we associate the coordinates of the starting point to the fitness reached when the search terminates, averaging the values of different searches from the same starting point. Points from which searches lead to higher fitness final points are shown with brighter colors, while those producing patterns trapped in low fitness values are indicated with darker colors. 11 Valleys, trivially, lead always to the global optimum, so we need not consider these areas. with different values of showed the irrelevance of the value chosen for this parameter, but for the level of detail of the graphs. The value used for the simulations is 0.05, so that the portion of the graphs shown (from 98 to 102 on both dimensions) is effectively considered as a square lattice made of 80 units on each side and composed of 6,400 points. Therefore, the 30,000 runs generate, on average, about five random searches started from each point of the landscape. 12 Tests M. Valente The graphs reported in Fig. 3 show these values for different complexity levels, i.e. values of |ai,j |. The graph for |ai,j | = 0 is not reported, since it consists of a uniform plane of value 1, meaning that all searches starting from every point of the landscape consistently manage to reach the global maximum. The remaining cases show that, while |ai,j |’s increase, the areas of starting points bringing to the global optimum shrinks. Under the most challenging case (|ai,j | = 1), we obtain a Fuji-like figure: only the searches starting close to the global optimum are able to reach the top spot. In this setting, any initial point far from the global peak generates a pattern leading to final fitness values orderly distributed according to the distance from the optimum. The reason is that, on this maximum-complexity landscape, movements parallel to the axes lead to a point on the ridges with a probability increasing with the distance from the global optimum. The fitness values of the ridges also have decreasing fitness farther from the global peak, and this is why the final fitness of a search started far from the optimum is generally lower. For intermediate values of complexity, we observe “propeller”-like figures. For these cases, the farther from the global peak the search starts, the lower the probability of reaching the global optimum. However, the probability is not distributed symmetrically on both sides of the ridges. In fact, given that the angle of the ridges is positive but lower than 45 degrees, points on different sides from the ridges have different probabilities of being stuck in local optima. A pattern leading on the “lucky” side of a ridge are more likely to get closer to the global optimum, while points on the “wrong” side are more likely to get stuck earlier on. This result confirms the capacity of pNK to replicate the core property of NK: increasing the complexity of the landscape generates lower and lower expected fitness resulting from the same myopic search strategy. pNK obtains this result in a far Random search |a|=0.50 Random search |a|=0.25 102 101.5 101 100.5 100 99.5 98 98.5 99 99.5 100 100.5 101 101.5 102 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 101.5 101 100.5 100 99.5 99 99 98.5 98.5 98 102 98 98.5 99 99.5 100 100.5 101 101.5 102 101.5 101 100.5 100 99.5 102 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 101.5 101 100.5 100 99.5 98.5 98.5 99 99.5 100 100.5 101 101.5 98 102 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 99 99 98.5 98 102 Random search |a|=1.00 Random search |a|=0.75 98 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 98 98.5 99 99.5 100 100.5 101 101.5 98 102 Fig. 3 Final fitness produced by random one-dimensional search strategy for landscapes with N = 2 and different values of |ai,j |. The graphs report the average final fitness in correspondence with each starting point. The average is computed from an expected number of about five searches from each point An NK-like model for complexity simpler setting than that required in NK and provides a more intuitive mechanism. We have used a landscape with the minimal number of dimensions to express interdependency and generated complexity by means of the intensity of the interaction. Furthermore, pNK defines precisely the locations leading to global or local maxima, providing deterministic and logical reasons for the eventual result. Once having established these basic properties, we can use pNK to investigate more elaborated ones. 3.3 Emergence of order The most attractive feature of NK is its representation of lock-ins of myopic explorations on complex landscapes. Smooth, simple landscapes allow all or most of the explorations to reach the highest fitness point, while rough, complex landscapes are likely to trap the exploration in local peaks. Levinthal (1997) interprets this feature in terms of expected diversity of organizations engaged in the search for efficient strategies. He notes that, after starting with a population of firms scattered randomly on different points of the landscape (i.e. with different initial conditions), the local (“adaptive”) search rapidly brings a steep reduction of the firms’ population variety. This is due to the convergence of firms in high-fitness areas, that is, adopting successful strategies that can easily be reached by “adaptation”. Conversely, complex landscapes present a large number of local peaks with relatively small basins of attraction. Consequently, firms starting from different areas will end up in different local peaks and we will observe a larger variety of strategies. In the original paper, this result was supported by an experiment showing the number of different points reached, on average, by 100 firms over 100 independent landscapes setting N = 10 and repeating the experiments for K = 0, 1, 5 (Levinthal 1997, fig.1, p.940). The graph shows that, through time, the number of points reached by the population of firms in the three cases considered drops sensibly, with the most complex landscape maintaining the highest diversity (i.e. larger number of different points). As a test to assess the capacity of the new model, we used pNK to replicate the same exercise. As expected, we obtain the same results, though with a far richer detail and greater ease of interpretation. Moreover, the more intuitive structure of pNK managed to produce greater insight extending the original result. Figure 4 reports on the results produced by pNK using N = 2 dimensions only, and expresses the complexity of the landscape by means of the degree of interdependency |ai,j |. The data are generated defining 100 different landscapes, each with a different value of interdependency between the two variables |ai,j | ranging from null to maximum. On each landscape, we performed 30,000 searches starting from randomly chosen points and registering the coordinates of the initial and final point of the search, and the fitness level of the latter. The graph reports the variance of one dimension’s final location, as a proxy for the variety of the results produced in each search, and the average fitness of these final points. The data show that, for landscapes with low complexity, all searches manage to reach the global optimum (fitness 1.0), generating the highest level of homogeneity (variance of locations is null). For increasing levels M. Valente Fig. 4 Variance of locations and average fitness by search strategies on N = 2 pNK fitness landscapes. Data produced by 30,000 searches for each of 100 landscapes with |ai,j | ranging from 0.0 to 1.0 of interdependency, we find that the final locations are increasingly dispersed, as indicated by the higher and higher levels of variance, confirming the original result. We can also observe another result deriving from increasing complexity: decreasing levels of local peak fitness. That is, sub-optimal points trapping search patterns in rougher landscapes show lower fitness than in the case of smoother ones. This is due to the obvious consideration that higher complexity stops searches earlier on “bad” local peaks, while gentler landscapes are more likely to allow search patterns either at the top, or closer to it. Although intuitively obvious, such results cannot be observed in NK because of its statistical properties. It is well known that NK landscapes generate higher levels of fitness of global peaks for higher K’s (Kauffman 1989). Therefore, one will observe that a low ranking local peak in a landscape with high complexity shows a higher fitness value than the global peak of a landscape with low complexity. This property is due to mere statistical reasons motivated by the way NK generates fitness values.13 Though innocuous when comparing results from searches on the landscapes with the same complexity, this aspect of NK creates serious problems when comparing results from landscapes defined with different complexity levels. For example, presenting a model where innovative firms’ 13 In NK, the fitness is computed out of a sample of random numbers the dimension of which is proportional to 2K . Hence, rougher landscapes use a larger sample than smoother ones, increasing the probability of finding a higher value. An NK-like model for complexity qualities are obtained by fitness levels produced with NK, Lenox et al. (2006) write “One could make a compelling argument [for] this trait of NK [...]. We decided, however, to eliminate this feature of the NK specification by expressing individual firm marginal cost or quality as a percentage of the global optimal” (pag. 763). As in the cited work, it is frequently the case that researchers are interested in comparing the relative fitness produced by search strategies in different landscapes. While NK requires an elaborated transformation, necessarily arbitrary, to compare fitness values from landscapes with different complexities, pNK maintains the same overall distribution of fitness values, allowing an objective, direct comparison. The next experiment exploits this feature, comparing two different search strategies across landscapes with different intensities of interdependency. 3.4 Greedy vs. random strategies In this experiment, we investigate the performance of two different strategies for dealing with complexity. We will show that each of the two strategies is preferable (i.e. provides higher fitness) within a certain level of complexity, providing an insight that may prove useful to actors dealing with complex tasks. We continue to use the simplest pNK problem space, made of two dimensions only, and consider different scenarios with increasing levels of complexity as represented by the intensity of the interdependency. For each scenario, we compare the average result produced by two research strategies, both myopic (only surrounding points can be explored) and allowing changes on one single dimension at each step. The first is the standard strategy, testing random points in the immediate neighborhood of the currently held point. The second strategy is slightly more elaborate; instead of testing a point chosen randomly (within the permitted range), it tests all possible one-dimensional changes (four, for the 2 dimensional space), and opts for the one providing the highest possible fitness increase (if any). Obviously, the two strategies provide the same step in case only one change is able to provide a fitness increment. However, when both dimensions could deliver a fitness increment, the second, “greedy”, strategy will consistently choose the step providing the largest fitness increment, while the random strategy is expected to do so only half the time. The question we explore is whether the supposedly “smarter” strategy (the one able to choose always the best step) is actually superior to the random one. Figure 5 answers this question, showing the fitness provided, on average, by the two strategies for different levels of intensity of interdependency |ai,j |, ranging from null to maximum complexity. As expected, for low values of |ai,j |’s (simple landscapes), both strategies manage systematically to reach the global maximum, as indicated by the value of 1 for both strategies provided for low values of the interdependency parameter.14 14 We do not consider the speed of research of the two research strategies, that is, the number of steps required on average to reach the global optimum. In fact, we may expect that the greedy strategy is faster than the random one, requiring a smaller number of steps to reach the eventual peak. In the following section, we will discuss the issue of speed of research showing how pNK is able, contrary to NK, to provide an intuitively appropriate representation of the length of research patterns. M. Valente 1 Random Greedy 0.95 Av. fitness 0.9 0.85 0.8 0.75 0.7 0 0.2 0.4 0.6 0.8 1 |a| Fig. 5 Average final fitness over 30,000 searches starting from randomly chosen points with random and greedy strategies on 11 landscapes, with ai,j ’s ranging from 0.0 to 1.0 For the opposite cases, with high levels of complexity (|ai,j | approaching 1), the greedy strategy manages to obtain higher fitness than the random one. This may be easily explained since, in these contexts, the best that can be obtained is a relatively high local peak, given that there are very few chances of reaching the global peak by means of any hill-climbing strategy based on the adaptation of a single dimension only. Consequently, the patterns will generally stop after a few steps, and therefore the best that can be obtained is reaching the highest among the local peaks around the starting point. Clearly, in these environments, the greedy strategy is the most appropriate. The dominance of the greedy strategy, however, is limited to the cases with high complexity. Conversely, for moderate levels of complexity, when |ai,j | takes values in the range from about 0.2 to 0.4, the humbler, random strategy dominates the greedy, supposedly smarter one. The explanation for such an apparent counterintuitive result can be found comparing the patterns generated by the two strategies. Landscapes with moderate complexity have areas of local peaks, but smaller than those with high complexity. The results show that rushing to find the best local peak in the immediate neighborhood of the starting point is not the best strategy when the local peaks are few and distant. Rather, avoiding them altogether may be the best way to carry on, ascending slowly (i.e. with smaller fitness increments per step), but ensuring a longer path (i.e. with more steps). An NK-like model for complexity This phenomenon offers interesting insights. Consider an organization where the activities of its members are only moderately interdependent, so that a given change by one member will be positive or negative for the organization, depending on the current activities by one or more members, but these links are not very strong. Imagine that each member loyally proposes to the top management, say the CEO, the course of actions within his department that ensures the highest improvement to the firm’s performance. If the CEO is forced to implement only one of the received proposals, an apparently rational behavior would be to choose on the basis of the highest expected performance gain. Our result says that there are cases in which the organization would be better off by choosing an alternative proposal providing a dominated expected payoff. These cases are those in which higher performance is obtained at the (hidden) cost of restricting subsequent exploration. Hence, the correct criterion to choose should be more elaborate than mere expected performance measured in the near future, but should consider also the effect of today’s choice on tomorrow’s options. This may be considered as an example of the phenomenon called “premature convergence” by researchers in problem-solving algorithms (Michalewicz 1996). These are the cases in which a problem solver focuses on good solutions discovered in earlier steps, preventing the exploration of a larger area potentially containing even better solutions. In organization studies, this phenomenon has already be noted: “adaptive processes, by refining exploitation more rapidly than exploration, are likely to become effective in the short run but self-destructive in the long run” (March 1991). This exercise shows clearly the cost of short-term “greed”, providing a formal context for observing the effects of exploitation versus exploration. The premature convergence is due to the constraints posed by interdependency starting to bite early on during a research strategy, preventing the reaching of high fitness portions of the search space. In cases of moderate interdependency, it would be better to not pursue the search for optimality, but to leave some room for (locally) sub-optimal decisions that, increasing the variety of exploration, enlarge the areas open to future explorations. In short, there are cases in which the good is preferable to the best. Concluding this section, we have shown that it is possible to replicate many results originally produced with NK by using the simplest pNK implementation, based on two dimensions only. We have seen that many (all?) results concerning on the effects of complexity can be reproduced by neglecting the role of the structure of complexity and focusing only on the role of the intensity of interdependency. Furthermore, these same results, besides being produced more efficiently, are more coherent and provide additional insights than was the case with NK. The following section continues the tests of pNK, extending the analysis to include discussions on the complexity stemming from different interaction structures defined over a high number of dimensions. 4 Complexity from interdependencies’ structure In this section, we analyze the role of the structure of interdependencies in generating complexity. The first paragraph defines a basic representation of complexity structure M. Valente and discusses the (partly) interchangeable role of complexity structure and intensity. The following paragraph presents some results stemming from the exploitation of complexity structure using pNK, highlighting the greater insights available in respect of the original analysis based on the NK representation. 4.1 Complexity: structure and intensity of interactions Consider an N dimensional problem space. As in a standard NK model, we know that varying the number of interactions (increasing K) we can vary the roughness of the resulting landscape, increasing the probability that hill-climbing myopic search will end in a local peak. The standard NK model assumes no structure of the interdependency among variables, limiting their number and assigning randomly the interdependency links.15 It is possible, however, to construct an NK model in such a way as to impose a desired interdependency structure on the problem space. To create a very basic structure, we assign each variable to a group so that each group contains the same number of variables. We then set interdependency connections among all the variables in the same group, and no connection between variables from different groups. Hence, the larger the groups, the larger the number of interactions and, consequently, the landscape will be rougher. Notice that, in this setting, it is more convenient to indicate with K the number of variables within groups, rather than the number of interactions affecting a variables as in the original NK model. Consequently, in this case, K varies from 1 (groups composed by only one variable, therefore having no interactions) to N (all variables placed within a single group, and all variables having connections with every other variable). Conversely, the original K, referring to the number of interactions, varies between 0 and N-1. We can test this setting in a pNK model by evaluating the performance of a onedimensional search strategy for various levels of K. However, contrary to the original NK model, we can vary also the strength of interaction setting |ai,j | to values in [0, 1]. Figure 6 reports the results generated with these settings. The simulations are performed on different landscapes made up of N = 24 variables and eight different values of K: 1,2,3,4,6,8,12,24. For each K, we generate 11 landscapes using as many values of |ai,j | ranging from 0 to 1. On the resulting 88 landscapes, we ran 300 independent explorations based on the random strategy described above. The data show the average value for each combination, using the fitness of the local peak if the search got trapped into one, or the fitness value of the point reached after 30,000 steps otherwise. As before, we find that these results replicate and extend prior results generated with NK. First, they show that increasing K decreases the average fitness we can expect to reach, confirming that the choice of a “structured” K, instead of randomly chosen connections, does not affect the role of K in producing a tougher environment. Moreover, as noted in the previous section, pNK allows us to compare directly the 15 Such a random structure actually leads quickly to the production of fully uncorrelated structures, even for relatively low K values (Frenken et al. 1999). An NK-like model for complexity Average Fitness 1 |a|=0.0 |a|=0.1 |a|=0.2 |a|=0.3 |a|=0.4 |a|=0.5 |a|=0.6 |a|=0.7 |a|=0.8 |a|=0.9 |a|=1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0 5 10 15 20 25 K Fig. 6 Average fitness produced at the end of 300 searches for each of the eight landscapes with K = {1, 2, 3, 4, 6, 8, 12, 24} and for various value of ai,j . Interdependent dimensions have |ai,j | from 0.0 to 1.0 while independent ones have ai,j = 0 results from different landscapes, without the bias produced by different distributions of fitness values for local peaks across landscapes with different complexity. The novel results stem from the effect of the intensity of interaction. The graph shows clearly that weak interactions (|ai,j | < 1) allow the exploration to reach on average higher local peaks than the case of maximum interaction, and the weaker the intensity, the higher the expected fitness of the final points of an exploration. This latter result provides support for the intuition that, for a given strategy of search, the number of interactions and their intensity are two distinct and complementary mechanisms to generate complexity. Once we define a given measure of complexity, such as, for example, the expected fitness reached by an adaptive, onedimensional mutation search strategy, it is possible to obtain different landscapes exhibiting the same level of complexity. They will differ in having a higher number of interactions and lower intensity, or, vice versa, a lower number of interactions and higher intensity. Generating similar results for one measure of complexity, however, does not imply necessarily that these landscapes will share identical properties. In fact, they will likely differ in respect of other measures of complexity, or in respect of other applications, such as, for example, the population dynamic properties of agents exploring the landscapes. We will not further explore this aspect, leaving to future work the analysis of the properties of landscapes built with different combinations M. Valente of number and intensity of interactions. Rather, we concentrate on the effects of the complexity structure in respect of research strategies. 4.2 Modular strategies and decomposability In the previous exercise, we adopted the standard research strategy based on a one-dimensional step. However, contrary to natural systems, artificial ones, such as individuals or firms, can be expected to recognize the limitations of explorations based on one-dimensional steps in the presence of complexity. Therefore, problem solvers facing highly complex tasks are likely to engage in more sophisticated exploration strategies to avoid the traps of local optima. An interesting issue is, therefore, what are the costs and advantages of different exploration strategies in respect of different problems? To analyze this issue, we need to remind ourselves that the very definition of a local peak depends on both the properties of the landscape and the search strategy adopted to explore it. In general, the shape of a landscape (e.g. being “smooth” or “rough”) depends not only on its own structure but also on the the rule defining the admissible steps over the landscape. For example, a point may be a local peak if we allow the exploration to consider only points within a given radius from the current one, but it may not be a local peak if we extend the radius of feasible steps. Consequently, we may intuitively conjecture that for each type of complexity structure there will be a matching exploration strategy able to “smooth away” any local peak, and therefore systematically lead to the global optimum. As we will see, however, it is possible to show that, while this is true, at least within a given sub-set of structures, other considerations may suggest that sub-optimal search strategies leading to local peaks are sometimes more desirable than “optimal” search strategies that, though not admitting any local peak, yet show other problems advising against their adoption. Since its inception (Simon 1969) the concept of complexity has been strictly connected with discussions concerning specific structures of interactions, particularly those exhibiting (or lacking) modularity. Modularity is the property of systems made of several components; in modular systems, groups of components have strong reciprocal interactions with members of the same group, while showing little or no interaction with components outside their groups. Non-modular (or integral) systems, conversely, have no visible structure, with interactions spanning different components. Many disciplines use modularity as a central concept to explain a variety of phenomena in natural (Callebout and Rasskin-Gutman 2005) and artificial systems (Brusoni et al. 2007). For decades, organizational scholars have represented the problem of organizational design from a problem solving perspective (Miles and Snow 1978), comparing the structure of an organization to the structure of its tasks, such as production processes or development of new products. According to this perspective, the performance of an organization can be evaluated on the basis the matching between its internal structure and the external requirements. The concept of modularity can thus be applied to both the problem task and to the problem solver. To avoid confusion we propose to use different terms, frequently used as synonyms, to express, on the one hand, the interdependency structure of the problem space and, on the An NK-like model for complexity other hand, the property of the search strategy adopted by the problem solver. So, we will call decomposability the exogenous property of the problem space, such as the definitions and the technical interactions among components of a product. The term modularity will be used to indicate the property of the searching strategy adopted. As an experiment, we match the “block-based” complexity structure defined by grouping the variables within subsets by defining search strategies also based on the grouping of variables. We call C the size of the block of variables as considered by the problem solver strategy. Such generalized search strategies can be represented by the following algorithm:16 1. Choose randomly one block among the N/C blocks in the problem solver strategy. 2. Choose randomly an integer in the range [1;C] (call this number c). 3. Choose randomly c dimensions within the chosen block. 4. Change by the value of the chosen dimensions in a random direction. 5. If the change leads to a point with higher fitness, move to the new point. 6. If the change leads to a point with lower fitness, remain at the same point. 7. Continue from 1. In practice, a step of the research strategy consists in choosing randomly a new point differing from the old one by between 1 and C dimensions. Hence, the singledimension mutation can be considered as a block-based search strategy when C = 1. For larger C values, the strategy is able to explore a wider area proportional to C. Figures 7 and 8 report on an experiment presented in Frenken et al. (1999), but use pNK instead of the standard NK. Each of the eight graphs considers landscapes built with different K values, where the interdependent dimensions are chosen in “blocks”, so that each dimension within a block is linked to exactly K-1 other dimensions in the same block, and have no connection to the dimensions in other blocks.17 Each graph concerns the results from landscapes with different K values, and reports the average fitness of the populations of agents adopting different modular research strategies of dimension C. The results generated with the pNK model coincide perfectly with those generated with the standard NK. It can be shown that only search strategies having C being a multiple of K can reach the global optimum, and this is what we actually observe for small K values. However, for more and more complex landscapes (higher K values), other search strategies seem to dominate. The reason lies in the fact that search strategies with large modules are very slow, since they essentially test randomly all possible combinations of the dimensions in each group. In this case, “optimal” search strategies take far too long to show their superiority, and therefore “quick & dirty” strategies, though doomed to eventually get stuck in a local peak, actually dominate 16 For simplicity, we consider only C values that are integer divisors of N, though the results would not change otherwise. It would only mean that the last class would contain a smaller number of variables than the other classes. 17 For obvious reasons of comparability, we assume that each interdependency in pNK is maximum, i.e. |ai,j | = 1 for each i and j dimensions within the same block. M. Valente 1 1 0.861562 0.896547 0.723124 0.793093 0.584687 0.689639 0.446249 1 Cl. 1 Cl. 6 7500 Cl. 2 Cl. 8 15000 Cl. 3 Cl. 12 22500 30000 Cl. 4 0.586186 1 Cl. 1 Cl. 6 7500 Cl. 2 Cl. 8 15000 Cl. 3 Cl. 12 K=1 0.999844 0.873058 0.876073 0.746115 0.752302 0.619173 0.628531 7500 Cl. 2 Cl. 8 15000 Cl. 3 Cl. 12 30000 22500 30000 K=2 1 0.49223 1 Cl. 1 Cl. 6 22500 Cl. 4 22500 Cl. 4 K=3 30000 0.50476 1 Cl. 1 Cl. 6 7500 Cl. 2 Cl. 8 15000 Cl. 3 Cl. 12 Cl. 4 K=4 Fig. 7 Average fitness across time for classes of agents mutating blocks made of C = {1, 2, 3, 4, 6, 8, 12} dimensions. Simulations reported for landscapes with K = {1, 2, 3, 4} slow ones for a long period before ending up trapped into a local peak. Hence, if the relevant period for competition is sufficiently short, it is well possible that “dominated” strategies (in the long term) are actually superior to “optimal” ones for any reasonable length of time. 4.3 pNK and agent-based models In our review of the limitations of NK we mentioned the difficulty in using NK as a model for complexity to plug into a wider model of, say, organizations, markets, etc. This is because the NK properties are statistical, made evident only by collecting a sufficiently large number of repetitions. However, each single run of a landscape exploration is ill adapted to represent an actual exploration process. The reason is that the pattern actually generated by a simulated agent on a NK landscape is composed of very few fitness increasing steps in between a long series of failed attempts. This feature prevents the direct use of NK to represent agents performing activities such as, e.g., R&D, where the modeller is likely to expect innovative firms to generate a stream of innovations, though at random rates of arrival. On the contrary, a pattern in NK consists in one or a few jumps in between long periods of failed attempted mutations. Such limitations makes hard, for example, to assume that the rate of arrival of innovations, or their relative innovativeness, is proportional to the amount of R&D investment. Such limitations do not affect pNK. The proposed model produces search patterns closer to the intuitive sequence of fitness increasing steps, each composed, as in NK, by twists and turns (looking for the right direction), but also as actual steps in a An NK-like model for complexity 0.993804 0.908933 0.842206 0.787098 0.690607 0.665264 0.539009 0.543429 0.38741 1 Cl.1 Cl. 6 7500 15000 Cl. 2 Cl. 8 Cl. 3 Cl. 12 22500 30000 Cl. 4 0.421594 1 Cl.1 Cl.6 7500 Cl. 2 Cl. 8 K=6 0.843884 0.725412 0.677626 0.583792 0.511368 0.442172 0.34511 7500 15000 Cl. 2 Cl. 8 Cl. 3 Cl. 12 22500 30000 22500 30000 Cl. 4 K=8 0.867032 0.300552 1 Cl. 1 Cl. 6 15000 Cl. 3 Cl. 12 22500 30000 Cl. 4 0.178852 1 Cl. 1 Cl. 6 7500 Cl. 2 Cl. 8 15000 Cl. 3 Cl. 12 K=12 Cl. 4 K=24 Fig. 8 Average fitness across time for classes of agents mutating blocks made of C = {1, 2, 3, 4, 6, 8, 12} dimensions. Simulations reported for landscapes with K = {6, 8, 12, 24} familiar Euclidean space. To highlight the difference between NK and pNK in this respect, Fig. 9 shows a comparison between 1,000 searches (one-bit mutations) on an NK landscape composed of N = 24 dimensions and defined as moderately complex (K = 3). The graphs represent the scatter plots between the fitness value of the local peak eventually reached by the search patterns and the number of fitness increasing mutations produced by the patterns, that is, the successful “steps”. The left graph (NK) shows a random cloud of points, indicating that there is no relation between the 0.68 0.66 Fitness 0.67 0.64 0.66 0.62 0.65 0.6 0.64 0.58 0.63 0.56 0.62 0.54 0.61 0 2 4 6 8 10 12 Num. of successfull mutation NK 14 16 18 0.52 160 Fitness 180 200 220 240 260 Num. of successfull mutation 280 300 p NK Fig. 9 Values of fitness and the number of successful mutations for NK and pNK. In both cases, the graphs report the values for N = 24, K = 3 and agents adopting a one-bit mutation strategy. The data are obtained by 1,000 independent runs on the same landscapes. Values were collected at the end of exploration, when a local peak is reached M. Valente number of steps and the final fitness reached by the search strategy. Moreover, the average number of steps is about nine, meaning that the patterns generated by NK are pretty short. Conversely, the right graph shows that patterns generated with pNK have a more plausible positive correlation between the number of positive mutations and the final fitness of a search path, representing the obvious fact that to reach higher fitness areas you need a longer walk. Besides the correlation, moreover, patterns in pNK are overall much longer (between about 160 and 300 steps), providing a far better metaphor for results from innovating activities. As an example of the possibilities offered by pNK, Ciarli et al. (2007, 2008) use a model derived from pNK to express the technological race of innovative firms. In these models, firms undergo several economic activities (e.g. sell final and/or intermediate products, collect revenues, set prices, etc.) as well as research activities (searching for better technologies on a complex technological landscape). In these works, the relative timing between the output of research (i.e. better technologies) and their economic impact (i.e. higher competitiveness) is crucial. An NK model, implying a slow and erratic pattern of “discoveries”, would not be able to represent the intended functions. Conversely, pNK provides the flexibility to represent sensible search patterns made of gradual improvements resulting from the appropriate research strategy. Moreover, the functional form of the complexity model and the possibility to control every aspect of the landscape gives the opportunity to extend the representation of the complexity space. In these works, for example, the whole technological landscape is represented as shifting in time under the pressure of exogenous movements of the technological frontier. Therefore, the very same point of the landscape (i.e. technology) provides different fitness values (relative competitiveness) through time, as the whole landscape drifts away. Consequently, firms must continuously invest in research even just to maintain the current relative technological position of their product, as represented by the fitness. Such a feature is easily represented in pNK, allowing the explorations of how the overall industry performance differs under, say, fast or slow changes of the technological frontier. 5 Conclusions and further research This work proposes a new model to represent complexity by implementing the core elements of NK in ways that are more suited for applications in economics rather than the original biological metaphor. The proposed model, pNK, generates a complex landscape by means of the (controlled) interaction among the dimensions of the landscape as in NK. However, pNK considers real-valued dimensions (instead of the binary ones of NK), allows for grades of interaction (instead of presence/absence), and is implemented by means of a deterministic function, greatly simplifying the implementation of the model and the interpretation of the results. These differences make it possible to enhance significantly the study of complexity, in general, and its applications to specific domains, such as economics and organization theory. The paper describes several experiments using pNK to represent complex landscapes, showing how the core properties of NK are maintained in the new model, adding the benefit of dealing with the more intuitive context of a real-valued space. In An NK-like model for complexity particular, we show that intensity of interdependency is equivalent to the number of interdependencies, at least for many results presented in the literature. This permits us to replicate the same properties using a familiar bi-dimensional plane and provide intuitive graphical representations. Moreover, we present a number of original results made possible by pNK. Extensive experiments support the claim that pNK is far better suited to implement complexity within models entailing the representation of simulated agents facing complexity. Several lines of research may be devised starting from the present work. First, there are many properties of the proposed model that still need to be explored. For example, it may be interesting to study the properties of non symmetric landscapes, produced when |ai,j | = |aj,i |. More in general, there are many functional shapes that may be used as a fitness function; since this determines the relative distribution of local peaks, it may be useful to generate a library of functions for different uses. A related line of research is to study the possibility to generate landscapes on the basis of empirical data. Many theoretical and applied researchers would be interested in the possibility of interpolating a data set of available evidence to be considered as a sample of scattered points, filling the gaps with econometric techniques and eventually producing an estimated fitness landscape of a specific complex problem. This may contribute to a promising literature on the empirical applications of complexity theory in economics (Frenken and Nuvolari 2004; Alkemade et al. 2009), and would obviously find many uses for both theoretical as well as applied issues. Finally, pNK is particularly suited to be used in conjunction with agent-based simulation models. That is, using pNK as “complexity module” within models representing systems dealing with the exploration of complex spaces, such as the economics of technical change or organizational theory. Its features make it very simple to devise search strategies modeled on the behaviors of actual agents, either individuals or organizations, facing complex tasks. The comparison of varied complexity and behaviors adopted to deal with it has provided in the past many insights using NK, and more can be expected by using the more powerful and flexible pNK model. References Alkemade F, Frenken K, Hekkert MP, Schwoon M (2009) A complex systems methodology to transition management. J Evol Econ 19(4):527–543 Altenberg L (1995) Genome growth and the evolution of the genotype-phenotype map. In: Banzhaf W, Eeckman FH (eds) Evolution and biocomputation: computational models of evolution, vol 899. Springer, Berlin, pp 205–259 Altenberg L (1997) NK fitness landscapes. In: Baeck T, Fogel D, Michalewicz Z (eds) Handbook of evolutionary computation. Taylor & Francis, New York Auerswald P, Kauffman SA, Loboc J, Shell K (2000) The production recipes approach to modeling technological innovation: an application to learning by doing. J Econ Dyn Control 24(3):389–450. ISSN 0165-1889 Brusoni S, Marengo L, Prencipe A, Valente M (2007) The value and costs of modularity: a problemsolving perspective. Eur Manag Rev 4:121–132 Callebout W, Rasskin-Gutman D (eds) (2005) Modularity: understanding the development and evolution of natural complex systems, Vienna series in theoretical biology. MIT Press, Cambridge M. Valente Ciarli T, Leoncini R, Montresor S, Valente M (2007) Innovation and competition in complex environments. Innov Manag Policy Pract 9(3–4):292–310 Ciarli T, Leoncini R, Montresor S, Valente M (2008) Technological change and the vertical organization of industries. J Evol Econ 18(3):367–387 Durret R, Limic V (2003) Rigorous results for the NK model. Ann Probab 31(4):1713–1753 Frenken K (2006) A fitness landscape approach to technological complexity, modularity, and vertical disintegration. Struct Chang Econ Dyn 17(3):288–305 Frenken K, Nuvolari A (2004) The early development of the steam engine: an evolutionary interpretation using complexity theory. Ind Corp Chang 13(2):419–450 Frenken K, Marengo L, Valente M (1999) Interdependencies, nearly-decomposability and adaptation. In: Brenner T (ed) Computational techniques for modelling learning in economics. Springer, Berlin, pp 145–165 Kauffman SA (1989) Adaptation on rugged fitness landscapes. In: Stein D (ed) Lectures in the sciences of complexity. The santa ire institute series. Addison-Wesley, New York, pp 527–618 Kauffman SA (1993) The origins of order: self-organization and selection in evolution. Oxford University Press, Oxford Kauffman SA, Levin S (1987) Towards a general theory of adaptive walks on rugged landscapes. J Theor Biol 128:11–45 Kauffman SA, Lobo J, Macready WG (2000) Optimal search on a technology landscape. J Econ Behav Organ 43:141–166 Kaul H, Jacobson SH (2006) Global optima results for the Kauffman NK model. Math Program 106(2):319–338. ISSN 0025-5610 Lenox MJ, Rockart SF, Lewin AY (2006) Interdependency, competition, and the distribution of firm and industry profits. Manag Sci 52(5):757–772 Levinthal DA (1997) Adaptation on rugged landscapes. Manag Sci 43(7):934–950 Li R, Emmerich MTM, Eggermont J, Bovenkamp EGP, Bäck T, Dijkstra J, Reiber JHC (2006) Mixedinteger NK landscapes. In: Runarsson TP, Beyer H, Burke E, Merelo-Guervós J, Whitley LD, Yao X (eds) Parallel problem solving from nature—PPSN IX. Lecture Notes in Computer Science. Springer, Berlin, pp 42-51 March JG (1991) Exploration and exploitation in organizational learning. Organ Sci 2(1):71–87 Michalewicz Z (1996) Genetic algorithms + data structures = evolution programs, 3rd edn. Springer, Berlin Miles RE, Snow CC (1978) Organizational strategy, structure, and process. MacGraw-Hill, New York Rivkin JW (2000) Imitation of complex strategies. Manag Sci 46(6):824–844 Rivkin JW, Siggelkow N (2002) Organizational sticking points on NK landscapes. Complexity 7:31–43 Simon HA (1969) The sciences of the artificial. MIT Press, Cambridge Skellett B, Cairns B, Geard N, Tonkes B, Wiles J (2005) Maximally rugged NK landscapes contain the highest peaks. In: GECCO ’05: proceedings of the 2005 conference on genetic and evolutionary computation. ACM Press, New York, pp 579–584. ISBN 1-59593-010-8 Valente M (2008) Laboratory for simulation development—LSD, working paper 2008/63. Sant’Anna School for Advanced Studies, LEM, Pisa Wagner GP, Altenberg L (1996) Complex adaptations and the evolution of evolvability. Evolution 50(3):967–976 Weinberger E (1991) Kauffman, S.A. 1993. The origins of order: self-organization and local properties of Kauffman’s N-k model: a tunably rugged energy landscape. Phys Rev A 44(10):6399–6413
© Copyright 2025 Paperzz