
J Evol Econ
DOI 10.1007/s00191-013-0334-4
REGULAR ARTICLE
An NK-like model for complexity
Marco Valente
© Springer-Verlag Berlin Heidelberg 2013
Abstract The level and nature of complexity is widely regarded as an important
determinant of a number of economic, technological and organizational phenomena.
A popular modeling tool for the representation of complexity in economics and organizational sciences is the NK model that represents the complexity stemming from
the interactions among the elements of a system. This paper proposes an enhanced
model for complexity that, though maintaining the core design (and properties) of
the NK model, provides a more intuitive and richer representation of complexity,
extending its applications and deepening the understanding of its effects on economic
systems. The proposed pseudo-NK (pNK) model is defined on real-valued variables,
as opposed to the binary variables required by NK, so as to allow for richer and more
intuitive definitions of distance and search strategies. It also admits as a source of
complexity not only the number of interactions, as in NK, but also their intensity,
opening a novel way to express and measure the level of complexity. Finally, instead
of relying on statistical properties of a large dataset of random values, pNK is defined
as a deterministic function, far simpler to implement, to interpret and to calibrate for
specific requirements. The paper replicates known results and presents original ones;
in both cases, the proposed model proves a powerful tool for the investigation of the
role of complexity, particularly in agent-based models.
This paper was presented to the International Schumpeter Society Conference, 2010,
Aalborg (DK). I wish to thank Tommaso Ciarli for valuable comments and suggestions
during the development of the model. Luigi Marengo, Lee Altenberg and an anonymous
referee provided encouraging comments and corrections to earlier versions. Any
remaining error or imprecision is, obviously, my sole responsibility.
M. Valente ()
Università dell’Aquila, L’Aquila, Italy
e-mail: [email protected]
M. Valente
LEM, Pisa, Italy
Keywords NK model · Complexity · Simulation models
JEL Classifications C63 · D83 · O32 · O33
1 Introduction
Borrowing ideas developed for a given purpose and applying them to a completely
different domain is a widely used method to advance knowledge and solve problems, up to the point that, one may suggest, copying and mixing different ideas
is the main driver of human creativity and of scientific advancements. The use of
metaphors and concepts developed in different domains, however, runs the risk of mis-adaptation. Though the core of the original idea can be useful in the new domain,
it is frequently necessary to make some adaptations, removing and adding elements
as necessary for its novel application. Witness, for example, the pioneering attempts
to fly by means of heavier-than-air flying machines. For centuries inventors attacked
the problem following the strategy of imitating the solution observed in most birds,
i.e. moving wings. As long as the imitation was too close (using wings for both supporting and propelling), the degree of success was very limited. Only when the core
idea was radically revised, separating the function of propelling from the function of
supporting, did the imitation actually succeed.
Concerning the study of complexity, biologists have developed the NK model
(Kauffman and Levin 1987; Kauffman 1993) representing, in a very stylized and
elegant way, the effects of fitness-improving mutations in a context in which interdependency makes rough landscapes. NK has been particularly successful because
it provides a simple tool for generating an abstract representation of a problem (the
probability of survival of a species) that, contrary to other modeling techniques,
could be easily tuned to make the search for a solution harder or simpler. The representation of the problem relies on the core of complexity, that is, interdependency,
reproduced by means of a large number of uniformly distributed random numbers.
The result is that NK fitness landscapes show statistical properties depending on the
interaction levels only, which are very robust to the choice of the set of random numbers used.1 From a biological perspective, NK is very useful because it is well known
that genes’ effects on phenotypes are strongly influenced by their interactions. NK
poses the interaction explicitly at the center of the analysis, and carefully avoids making any further assumptions, being able to derive interesting results concerning the
expected characteristics of a species, its pattern of evolution, and even the reasons for
the spontaneous emergence of modularity.2
However, biologists are not the only scientists interested in the modeling of complexity by means of interactions. Economists and scholars in organization sciences
have had a long-time interest in the study of the effects of interactions (Simon 1969).
Therefore, many researchers from these fields have adopted NK as their instrument
1 See, e.g., Weinberger (1991), Durret and Limic (2003), Skellett et al. (2005), Kaul and Jacobson (2006).
2 See, e.g., Wagner and Altenberg (1996), Altenberg (1995).
of choice to represent and to study properties of organizational structures, technological innovation, etc. Originally devised as a metaphor for how nature deals with
complex problems, NK-inspired models have been used to study artificial systems,
organizations, technological developments, industrial dynamics, and much more. In
such applications, NK typically represents a complex problem, where the performance depends on the interactions among several components. Simulated agents (say
a firm or a generic problem solver) are engaged in exploring the problem space on
the basis of local and myopic information. The researchers are interested in assessing the results as a function of the complexity of the task and the limited capacities
of the agents. Simulation modelers in economics and management have been eager
adopters of variations of NK models, producing a sizable number of works.3
This work is based on the conviction that, however successful, the results from
complexity studies in economics and organizational sciences have been limited by
a mis-adaptation (or, better, insufficient adaptation) of NK in its passage from biology to social sciences. As an example of the limitation imposed by NK, consider the
problems concerning the search strategy. The original NK model relies on random
mutations of one single dimension replicating the equivalent of random mutation
in natural systems. Human individuals or organizations, conversely, are likely to
include at least a bit of intentional, purpose-driven behavior, however limited by
informational constraints. For this reason, many scholars proposed modifications of
the baseline NK model to include different search strategies. However, in order to
design appropriate intentional strategies for search it would be useful, to the modeler,
to know where the optimum is located, and possibly also to have the opportunity to
control some properties of the problem space, such as the specific location and nature
of local optima. However, NK has been designed as a metaphor for natural systems,
which are passively observed and not controlled, and therefore these possibilities,
useless from a biological perspective, are precluded.
The increasing popularity of NK has made its limitations ever more evident, and
several contributions propose variations altering some aspects of its mechanisms
(Li et al. 2006). Along these lines, the goal of the present paper is to propose a radically alternative implementation of the core features of NK in such a way as to make
the model more flexible and adaptable for the novel applications to which it is increasingly put to work. The major changes of our proposal consist in the replacement of
the binary variables required in NK with more general (and familiar) real-valued variables, and the shifting from a stochastic definition of the landscape to a deterministic
functional one. Besides being able to inherit the same properties of the original NK
model, our proposal allows us also to distinguish the structure of interactions (which
components interact with which others) from the intensity of interactions (how
strong a given interaction is).
In the following, we present the new model by hinting at the improvements it
can bring to the study of complexity, particularly when purpose-driven agents are
involved. During the presentation, we alternate examples of applications of the new
3 See, for example, Levinthal (1997), Frenken et al. (1999), Kauffman et al. (2000), Rivkin (2000), Rivkin
and Siggelkow (2002), Lenox et al. (2006), Frenken (2006).
model to replicate known results generated with NK with original exercises permitted
by the new model. The next section discusses the core elements and properties of
NK, highlighting its shortcomings, particularly for application in fields different from
biology. After that, we describe the new model, called pseudo-NK model (pNK),
showing how it is able to replicate all the relevant properties of the original NK while
removing, or relaxing, its major shortcomings. The third section presents a series
of results produced by the simplest pNK version, based on two dimensions only,
and exploiting only the intensity of interdependency, the novel feature introduced by
pNK. The following section discusses the properties of a generalised pNK defined over N
dimensions. A final section draws the conclusions.
2 Pseudo-NK model for complexity
An NK model can be considered as composed of two distinct components: a problem
specification represented by a space of potential solutions to an implicit problem, and
a search algorithm scanning this space hunting for better and better solutions. The
problem space is represented by all the binary strings of length N, each associated
with a fitness value, that is, the pay-off for that solution. The search algorithm consists in a routine generating a pattern, that is, a sequence of solutions starting from
a randomly chosen initial solution, or point in the N-dimensional binary space. The
search routine defines the way in which we move from any one point to the next,
that is, how to generate a new string from an existing one. For example, the typical
routine, originally proposed, consists in choosing randomly one of the N elements
of the current string and flipping (“mutating”) its value, from 0 to 1 or vice versa, so
that the new string will differ from the old one in only one bit. If the fitness of the
new string is higher than that of the previous one, the new string is adopted, making a “step” in the pattern of exploration of the space. Otherwise, if the fitness of
the mutated string is lower than that of the present one, the mutation is rejected and
the search will continue from the same string. The repeated application of the search
algorithm generates a path across the space of solutions. The path will be made by
fitness-increasing steps, because of its construction, and terminates when it reaches
a string from which all accessible strings have lower fitness, and hence there are no
further possible steps. These strings are referred to as “peaks”, trapping any search
based on hill-climbing strategies.
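The one-bit mutation search described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the helper names (`make_nk_fitness`, `one_bit_hill_climb`) are hypothetical, each bit is assumed to interact with the K bits that follow it circularly, and contributions are drawn lazily rather than stored in a full table. Instead of rejecting failed mutations one by one, the routine picks uniformly among the improving one-bit neighbours, which yields the same distribution of accepted steps.

```python
import random


def make_nk_fitness(n, k, seed=0):
    """Return a fitness function for a toy NK landscape (hypothetical helper).

    Assumption: bit i interacts with the k bits that follow it (circularly).
    Its contribution is a uniform random number drawn once, lazily, for every
    configuration of those k + 1 bits.
    """
    rng = random.Random(seed)
    table = {}  # (i, configuration of the k + 1 relevant bits) -> contribution

    def fitness(bits):
        total = 0.0
        for i in range(n):
            key = (i, tuple(bits[(i + d) % n] for d in range(k + 1)))
            if key not in table:
                table[key] = rng.random()
            total += table[key]
        return total / n

    return fitness


def one_bit_hill_climb(fitness, start, rng):
    """Follow fitness-improving one-bit mutations until a peak is reached."""
    current = list(start)
    while True:
        base = fitness(current)
        # Collect every one-bit neighbour with strictly higher fitness.
        better = []
        for i in range(len(current)):
            neighbour = current.copy()
            neighbour[i] = 1 - neighbour[i]
            if fitness(neighbour) > base:
                better.append(neighbour)
        if not better:
            return current  # no improving mutation left: a local or global peak
        current = rng.choice(better)
```

With `k = 0` every bit can be optimized on its own, so searches started from different strings end at the same peak; with larger `k` different starting strings may end at different peaks.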
Two aspects make NK particularly attractive. First, it is possible to determine how
complex the space of solutions, or fitness landscape, should be. The effect of complexity is represented by the probability of encountering the highest fitness point
during a search pattern. Hence “simple” means that it is very likely to find this point,
while “difficult” means that many searches will stop at lower fitness peaks. In NK,
it is possible to tune the degree of complexity (i.e. the number of sub-optimal local
peaks) by setting the number of interactions among the variables of the space, indicated by K. Landscapes with no or few interactions represent simple problems, i.e.
contain few local peaks. In the simplest possible case (K = 0), there exists only
one peak, so that all searches starting from any initial point will lead to this global
optimum. Increasing K generates “harder” problems, represented by landscapes
with a larger number of local peaks and decreasing chances of reaching the global
optimum.
The second aspect of NK is the representation of the search algorithm. The original proposal, coming from a biological metaphor, assumes the search strategy to be
local and myopic. Local because the search implies the impossibility to observe the
space beyond the narrow neighborhood of the current string; myopic because it prevents collecting past information or predicting future events, focusing on the immediate
goal of just improving the current condition. In other applications, NK models are
endowed with search strategies on longer distances, for example admitting the possibility for long jumps (Levinthal 1997), search within modules of a given dimension
(Frenken et al. 1999) and other methods (Auerswald et al. 2000; Kauffman et al.
2000).
These two elements, complexity due to interactions and algorithmic search strategy, provide a stylized representation of a complex task faced by a would-be solver
with limited capacity and information. Hence, the model allows us to study the
effects of varying degrees of complexity upon actors with different problem solving
approaches.
In its original context, NK has been used as a pure representation of complexity, studying, for example, the expected number, location and average fitness of local
peaks as a function of K, playing down the role of the search strategy, since this
represents merely the blind action of nature. However, in other fields, such as management, organization theory, economics of technical change, etc., the popularity of
NK stems from the large number of extensions to its basic implementation, particularly concerning the opportunities of search performed by purpose-driven intelligent
actors. Hence, in these fields, NK has become a sort of sand-box where it is possible to test different problem solving approaches facing tasks of varying complexity.
For example, Levinthal (1997) introduces the possibility for agents (representing
firms) to perform “long jumps” besides local search and imitation of successful
competitors. Rivkin and Siggelkow (2002) propose a model where decisions are distributed between a “CEO” concerned with overall fitness, and managers interested
in “parochial” interests whose success depends on a portion of the landscape under
their control. Lenox et al. (2006) use a version of NK to represent a technological
space explored by competing firms.
In general, NK provides a working implementation for the intuition that simple problems can be successfully solved, while increasingly complex ones generate
sub-optimal solutions with degrading expected payoffs. Using NK, researchers are
able to investigate likely consequences expected under controlled conditions for the
environmental complexity of the problem and solving strategy adopted. The very
simplicity of the model invites users of NK to devise variations of the baseline version
to implement customized search strategies for specific types of problem solvers.
In this work, we claim that NK, however effective in offering a representation
of complexity and its consequences, contains a number of unnecessary biases and
limitations when used to represent complexity in an economic context.
A first problem arises from the necessity to use binary variables, representing
presence/absence of a given feature. This aspect generates confusion because the only
possible measure of the distance between two points is the Hamming distance, that
is, the number of variables with different values between two points. This concept
of distance is very different from the familiar Euclidean distance applicable to real-valued spaces. In a binary space, a “step” in a given direction can be only one unit
long, though it can be directed in N different dimensions. Besides making impossible
the graphical representation of the landscape, the use of binary variables generates
confusion and misinterpretations. For example, one is entitled to speak of “valleys”
and “peaks”, but it would be wrong to assume that they have the three-dimensional
shape evoked by those names.
A second problem arises from the use of randomness to generate the landscape, the
properties of which are actually represented only as statistics over a reasonably large
number of replications. In practice, each landscape will have a different (random)
distribution of the relevant points, such as the global maximum or the local peaks,
and the distribution cannot be pre-determined nor observed without testing each and
every point of the landscape. This means that the modeler is as ignorant of the fitness
landscape as the virtual agents exploring it, making their assessment very difficult in
case of large landscapes.
A last issue hindering the use of NK models consists in the relative difficulty in
its implementation and use. Though the programming of an NK model is relatively straightforward, its structure relies on a combinatorial number of random values, the sheer
number of which is simply impossible to manage even for a modest number of dimensions. The reason is that a fitness landscape requires the storage of a database made
of $N \cdot 2^{K+1}$ random values, which can easily exceed the capacity of any computer
even for low levels of K, in effect making it impossible to build all the points of
a landscape. It is possible to find technical solutions to generate portions of large
landscapes so as to collect sample statistics of their properties,4 but the necessity to
rely on samples further diminishes the capacity of the modeler to investigate and to
exploit the landscape to assess the results.
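To give a sense of how quickly this storage requirement grows, the short Python sketch below evaluates the count of $N \cdot 2^{K+1}$ values quoted above for a few combinations of N and K. The function name and the chosen parameter values are illustrative assumptions, not figures taken from the paper.

```python
def nk_table_size(n, k):
    """Number of random values quoted in the text: N * 2**(K + 1)."""
    return n * 2 ** (k + 1)


# Illustrative (N, K) pairs; the table size doubles with every unit of K.
for n, k in [(10, 2), (20, 9), (100, 49)]:
    print(f"N={n:3d} K={k:2d} -> {nk_table_size(n, k):,} values")
```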
2.1 Improving on NK
The limitations listed above make NK difficult to use and its results complicated
to interpret when it is applied to investigate complexity in artificial systems. We
propose here an alternative model replicating (and extending) the features of NK that
are relevant to economic and organizational scientists, without the limitations of the
original NK highlighted above. The new model, which we call pseudo-NK (pNK),
enjoys the following properties:
• Functional representation of fitness: the fitness function is defined as a deterministic function of a multidimensional vector. Therefore, it can be implemented
as a routine, without requiring large data sets of random numbers, simplifying
the coding of very fast implementations even for large landscapes.
• Multidimensional real-valued landscape: the landscape of pNK is composed
of a subset of the N-dimensional real-valued space: $\vec{x} = \{x_1, x_2, ..., x_N\} \in \mathbb{R}^N$,
with fitness represented by a real-valued function $f(\vec{x})$. Both domain and codomain can be freely determined by the modeller.
• User determined maximum and landscape’s overall shape: $f(\vec{x})$ has a maximum5 in a point $\vec{x}^*$ (such that $f(\vec{x}^*) \ge f(\vec{x}), \forall \vec{x} \in \mathbb{R}^N$) determined by the user.
The shape of the landscape is a well-behaved function, so that the modeller can
evaluate analytically the properties of the landscape.
• User determined interdependency: for any given couple of dimensions i and
j, the user can set a varying degree of interdependence $a_{i,j}$, ranging from full
independence to maximum interdependence. Intermediate and varied levels of
interdependency allow us, for example, to define landscapes where a dimension
depends strongly on some dimensions and weakly on others.6
4 Altenberg (1997) is credited as the first to propose a solution to this problem.
Besides these improvements, the definition of interdependency used in pNK
remains the same as that used in the NK model: dimension $x_i$ is dependent on dimension $x_j$ if the maximization of the fitness function requires a different value for $x_i$ for
different values of $x_j$. In a different terminology, we can say that there exists dependency of $x_i$ on $x_j$ if, for at least some value of the other $N-2$ variables, the derivative
of $f(\vec{x})$ with respect to $x_i$ changes sign for different values of $x_j$. It is worth noting
that, in NK (and pNK, too), interdependence does not simply imply that modifications of $x_j$ affect how $x_i$ impacts on the fitness function $f(\vec{x})$. In fact, it may be possible
that such influence is strictly monotonic, that is, for different $x_j$ we observe that $x_i$
has different impacts on the overall fitness value, but they always have the same sign.
Though in this case we do have interdependency, in a sense, this is not affecting the
possibility of identifying the optimal value for $x_i$ independently from $x_j$. Only when
$\partial f(\vec{x}) / \partial x_i$ changes sign for different values of $x_j$ is the fitness-optimizing value of $x_i$ a
function of $x_j$. In this case, a hill-climbing strategy based on varying $x_i$ only is liable
to be stuck in a local peak determined by the specific value of $x_j$.
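The sign-change criterion can be illustrated numerically. The Python sketch below is an assumption-laden toy, not part of the model: it considers only two variables, approximates the derivative by a central finite difference, and uses two made-up functions (`monotonic` and `interdependent`) chosen to contrast the two cases discussed in the text.

```python
def slope_sign(f, xi, xj, h=1e-6):
    """Sign (+1, 0 or -1) of the finite-difference slope of f along xi."""
    d = f(xi + h, xj) - f(xi - h, xj)
    return (d > 0) - (d < 0)


def depends_on(f, xi_values, xj_values):
    """True if, at some xi, the slope along xi takes both signs as xj varies."""
    for xi in xi_values:
        signs = {slope_sign(f, xi, xj) for xj in xj_values}
        if 1 in signs and -1 in signs:
            return True
    return False


def monotonic(xi, xj):
    # xj rescales the effect of xi but never reverses it:
    # no dependency in the sense used in the text.
    return (1.0 + xj ** 2) * xi


def interdependent(xi, xj):
    # The best xi equals xj, so the slope along xi flips sign as xj moves.
    return -abs(xi - xj)


grid = [0.0, 1.0, 2.0, 3.0]
print(depends_on(monotonic, grid, grid))       # False
print(depends_on(interdependent, grid, grid))  # True
```

In the first function the influence of xj on the effect of xi is strictly monotonic, so the optimal direction for xi never changes; in the second, a climber varying xi alone would stop at whatever value xj currently dictates.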
In the following, we describe a possible implementation of a fitness function for
pNK providing the properties listed above.
2.2 pNK fitness function
pNK, like NK, consists of a fitness function defined on a set of N variables and
a search algorithm. The fitness function proposed here7 for pNK borrows heavily
from the NK implementation, with three notable differences. First, it considers real-valued variables (instead of binary); second, it is a deterministic function, rather
than stochastic; third, it allows for different degrees of interdependence, instead of
presence/absence.
5 Actually, it is possible to extend the model to admit multiple global maxima, though we will ignore this
possibility.
6 A different version of the standard NK models, the so-called generalized NK, also allows user determined
interdependency (Altenberg 1995, 1997).
7 Here we adopt a specific functional form providing the properties required to pNK; however, there is a
whole class of functions that may be used, depending on specific requirements.
The overall fitness value of a point of the landscape domain (i.e. a point of $\mathbb{R}^N$) is,
as in NK, the average of N fitness contributions for each of the variables:
$$f(\vec{x}) = \frac{\sum_{i=1}^{N} \phi_i(\vec{x})}{N} \qquad (1)$$
where $\phi_i(...)$ is the fitness contribution function for dimension i. While in NK $\phi_i$ is a
random value, in pNK this is a deterministic function defined as:
$$\phi_i(\vec{x}) = \frac{f^{Max}}{1 + |x_i - \mu_i(\vec{x})|} \qquad (2)$$
where $f^{Max}$ is a user-determined parameter indicating the maximum of the function
(typically 1). Equation 2 essentially states that $\phi_i$ is a decreasing function of the
distance between the variable’s value and another function $\mu_i(\vec{x})$, defined as:
$$\mu_i(\vec{x}) = c_i + \sum_{j=1}^{N} a_{i,j} x_j \qquad (3)$$
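Equations 1 to 3 translate almost directly into code. The following Python sketch is one possible reading, not the author's implementation: the function names are hypothetical, the interdependency coefficients are passed as a nested list `a` with `a[i][j]` standing for $a_{i,j}$, and the diagonal entries `a[i][i]` are assumed to be zero.

```python
def mu(i, x, a, c):
    """Target for dimension i (Eq. 3): c_i + sum_j a[i][j] * x[j].

    Assumption: a[i][i] == 0, so a variable does not shift its own target.
    """
    return c[i] + sum(a[i][j] * x[j] for j in range(len(x)))


def phi(i, x, a, c, f_max=1.0):
    """Fitness contribution of dimension i (Eq. 2)."""
    return f_max / (1.0 + abs(x[i] - mu(i, x, a, c)))


def fitness(x, a, c, f_max=1.0):
    """Overall fitness (Eq. 1): the average of the N contributions."""
    n = len(x)
    return sum(phi(i, x, a, c, f_max) for i in range(n)) / n


# Two independent dimensions, each with its target fixed at 100.
a0 = [[0.0, 0.0], [0.0, 0.0]]
c0 = [100.0, 100.0]
print(fitness([100.0, 100.0], a0, c0))  # 1.0
print(fitness([101.0, 100.0], a0, c0))  # 0.75
```

In the second call the first dimension is one unit away from its target, so its contribution falls to 1/2 while the other stays at 1, giving an average of 0.75.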
The value μi defines a sort of “target” for dimension i that, when hit by xi , generates the maximum level of contribution of the variable to the overall fitness function.
However, the value of $x_i$ maximizing $\phi_i$ may not be the most desirable, concerning
the overall fitness value. In fact, $x_i$ influences also all contributions $\phi_j$ for the variables where $a_{j,i} \neq 0$. Therefore, it is quite possible that moving $x_i$ to maximize $\phi_i$
actually decreases the overall fitness value because of the deterioration generated in
other fitness contributions $\phi_j$ where $a_{j,i} \neq 0$.
The fitness function proposed here8 allows for ample flexibility. In particular, it is
possible to determine features of the landscape that are not available in NK models:
• Set maximum fitness. Simply setting $f^{Max}$ determines the maximum value of
the fitness function.
• Set the global optimum. For any dimension i, it is possible to compute $c_i$ such
that the maximum fitness is obtained at a desired point $\vec{x}^* = (x_1^*, x_2^*, ..., x_N^*)$.
To ensure that this point is the global maximum, we need to set $c_i = x_i^* - \sum_{j \neq i} a_{i,j} x_j^*$.
• Set interdependencies. Varying the values of $a_{i,j}$ it is possible to make more or
less relevant the interdependency between two dimensions.
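As a worked illustration of the second property, the Python sketch below computes the constants $c_i$ from a desired optimum and checks that the fitness function peaks there. The names `offsets_for_optimum` and `fitness` are hypothetical, the diagonal coefficients are assumed to be zero, and the numerical values (an optimum at (100, 100) with coefficients of 0.7 and -0.7) are chosen only for illustration.

```python
def offsets_for_optimum(x_star, a):
    """Constants c_i placing the optimum at x_star: c_i = x*_i - sum_{j != i} a[i][j] * x*_j."""
    n = len(x_star)
    return [
        x_star[i] - sum(a[i][j] * x_star[j] for j in range(n) if j != i)
        for i in range(n)
    ]


def fitness(x, a, c, f_max=1.0):
    """pNK fitness as in Eqs. 1-3, assuming a[i][i] == 0."""
    n = len(x)
    total = 0.0
    for i in range(n):
        target = c[i] + sum(a[i][j] * x[j] for j in range(n))
        total += f_max / (1.0 + abs(x[i] - target))
    return total / n


a = [[0.0, 0.7], [-0.7, 0.0]]
x_star = [100.0, 100.0]
c = offsets_for_optimum(x_star, a)

print(c)                                 # approximately [30.0, 170.0]
print(fitness(x_star, a, c))             # approximately 1.0, i.e. f_max
print(fitness([99.0, 100.0], a, c) < 1)  # True: nearby points score lower
```

At the chosen point every variable hits its target, so each contribution equals the maximum and so does their average.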
The first two properties are useful to exploit pNK in a context where the modeller is interested in determining a specific maximum fitness value at a specific point
of the landscape. However, the last property is far more interesting, since it allows
pNK to model complexity in a more detailed way than NK. In fact, NK
implements interdependency as a presence/absence property: if dimension j affects
8 The proposed model can easily be modified changing the functional form for Eqs. 2 and 3. The same
properties of the model discussed here are maintained provided that $\mu_i(\vec{x})$ is a positive function of $a_{i,j} x_j$
and that $\phi_i(\vec{x})$ is an inverse function of $|x_i - \mu_i(\vec{x})|$. Using different functional forms for these elements
will change the density of local peaks and the overall shape of the landscape, though maintaining the same
properties discussed here.
i then the contribution of i changes when j changes, and the two contributions are
totally unrelated. Conversely, pNK allows the level of interdependency to be tuned: the
effect of j on i can be null, weak or maximum, depending on a single parameter.
Furthermore, since pNK is deterministic and expressed in a functional form, it is possible to represent and even visualize (when N = 2) the effect of sliding levels of
interdependency.
The explicit distinction between complexity stemming from interactions’ structures and from interactions’ intensities is theoretically relevant, as noted by
Simon (1969) in his analysis of nearly-decomposable systems, suggesting that,
though every element of a system is influenced by any other, some of the interactions are stronger than others. This distinction makes it possible to manage apparently
intractable problems, trading off optimality against the reduced costs of dealing with
simplified systems.
In NK, since interactions have only one level, this distinction is blurred, and the
intensity of interactions can only be crudely regulated by increasing or decreasing the number of connections, that is, K. However, in many cases, researchers are
interested in representing complexity explicitly, and exclusively, as stemming from
interaction intensities. The next section shows how many of the applications of NK
to the organization literature can be reproduced by using a space made of N = 2
dimensions only, and using instead the level of the only existing interaction. The
intuitiveness of the bi-dimensional representation, due to the possibility of visualizing
the landscape, allows a far greater insight than the original exercises implemented
in NK.
Before reporting on these exercises, the next paragraph defines the algorithm of
search strategies in a pNK context, as required in dealing with landscapes made of
real-valued variables.
2.3 Search strategy on pNK
The baseline search strategy in the NK model lies in the so-called one-bit mutation.
This consists in generating a new point starting from an existing one by modifying
a single bit of the binary string representing the point. The new point is accepted as
a step if the fitness associated with the new point is higher than that of the existing
one; otherwise the step is rejected.
On a real-valued fitness landscape, the one-bit mutation strategy can be expressed
by the following algorithm applied to a generic point in order to generate a
subsequent point of a pattern.
1. Choose randomly one dimension among the N available.
2. Choose randomly one direction: + or −.
3. Change by Δ the value of the chosen dimension in the chosen direction.
4. If the change leads to a point with higher fitness, move to the new point.
5. If the change leads to a point with lower fitness, remain at the same point.
6. Continue from 1.
A path of a landscape is therefore made up of a sequence of steps generated
by repeated applications of the above algorithm. The routine is repeated until no
change of Δ on any of the variables of the landscape is able to produce a fitness
increment. The points on which this happens constitute the (local or global) peaks of
the landscape.
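The search routine and its stopping rule can be sketched as follows in Python. The sketch is illustrative only: the names `is_peak` and `local_search` are hypothetical, the step length is passed as `delta`, any fitness function over a list of reals can be supplied, and a move to a point of exactly equal fitness is treated as a rejection, a case the description above leaves open.

```python
import random


def is_peak(fitness, x, delta):
    """True if no single change of +/- delta on one variable raises fitness."""
    base = fitness(x)
    for i in range(len(x)):
        for sign in (1, -1):
            y = list(x)
            y[i] += sign * delta
            if fitness(y) > base:
                return False
    return True


def local_search(fitness, start, delta, rng, max_trials=1_000_000):
    """Return the path of accepted points produced by the six-step routine."""
    x = list(start)
    path = [tuple(x)]
    for _ in range(max_trials):
        if is_peak(fitness, x, delta):
            break                        # local or global peak: stop
        i = rng.randrange(len(x))        # 1. random dimension
        sign = rng.choice((1, -1))       # 2. random direction
        y = list(x)
        y[i] += sign * delta             # 3. change by delta
        if fitness(y) > fitness(x):      # 4. accept an improvement
            x = y
            path.append(tuple(x))
        # 5. otherwise remain at the same point; 6. continue from 1
    return path


# Toy single-peaked function with its maximum at (3, -2).
def toy(x):
    return -(abs(x[0] - 3.0) + abs(x[1] + 2.0))


path = local_search(toy, [0.0, 0.0], 0.5, random.Random(0))
print(path[-1])  # (3.0, -2.0)
```

The cap `max_trials` is a safeguard added for the sketch; the text itself only specifies termination at a peak.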
In the economic and organizational literature, there are many alternative proposals
for search algorithms designed to represent specific behaviors. Each can easily be
adapted to work on a real-valued landscape, actually providing far more flexibility.
This is because in pNK it is possible to define three distinct features of a step: a)
its “angle” (the dimension(s) chosen); b) its direction (increase or decrease in the
current value); c) its length, set by Δ. As a consequence, it is possible, for example,
to distinguish two different types of “long jumps” (Levinthal 1997): one made of
small modifications of many dimensions, and another made of a large change in a
single-dimension.
For our purposes, we will consider Δ as a constant parameter represented by a
small value. This value is of only minor importance since its only role is to discretize
the continuous real-valued space. The discretization permits us to ignore the small
differences of fitness between two very close points. This choice is due to the necessity, in this first version, to maintain the myopic nature of search strategies in complex
landscapes as in the original version of NK. Limiting Δ to be small and constant, in
fact, implies forcing the search strategy to consider only a small patch of the landscape surrounding the agent’s position. However, it is easy to devise extensions of
pNK in which agents searching a landscape may not only attempt steps in various dimensions but also modify the length of steps Δ, for example, extending Δ when trapped
in a local optimum.
3 Complexity from intensity of interdependency
The interdependency among components of a system generates complexity by means
of its structure (the network of interactions) and their intensities. In this section,
we study the properties of pNK in relation to the intensity of interdependency only,
neglecting the role of the structure. For this purpose, we consider two dimensions
only (N = 2), so that the structure of interdependency, trivially made up of the only
possible connection between the two existing dimensions, cannot play any role. We
will show that, even using this minimal landscape, pNK is able to replicate (and,
actually, enrich) a large number of results originally produced with NK. By limiting the analysis to
a space with small dimensionality, we will be able to produce the same results in a
far simpler and more intuitive manner.
We will consider the implementation of pNK as a landscape containing a single
global peak located at a pre-determined point. We will consider only symmetric landscapes, where the two interdependency relations between the variables are identical in
absolute value and opposite in sign. This is obtained by considering $a_{1,2} = -a_{2,1}$
and $0 \le |a_{i,j}| \le 1$. Firstly, we show the topology of the space represented by pNK,
exploiting the possibility to visualize the 3D graph made of two dimensions for the
plane and the third for the fitness corresponding to each point. We will then move
to replicate, to extend and to comment on various results obtained by considering
myopic explorations of this space.
[Fig. 1 plot: “Example of fitness landscape”; axes X1 and X2 from 98 to 102, fitness on the vertical axis from 0.3 to 1]
Fig. 1 Fitness landscape for N = 2 with symmetric and relatively strong interdependency. Global optimum set at (100,100) with maximum fitness set to 1, and |ai,j | = 0.7. A hill-climbing strategy based on
one-dimensional mutation corresponds to moves parallel to one of the axes. If, as in the figure, the ridges
of the landscape are diagonal to the axes, they represent areas of local optima for one-dimensional search
strategies. In fact, any vertical or horizontal step would force a “step down” from the ridge resulting in a
fall of fitness. Only if the ridges are parallel to the axes (ai,j | = 0) can a one-dimensional search strategy
walk on the ridges, and it will always reach the global maximum
3.1 Visualisation of 2-D pNK fitness landscapes
Figure 1 shows a fitness landscape built on two dimensions. The graph reports a
landscape with the global optimum set at x* = (100, 100), maximum fitness
fMax = 1, and a high, but not maximum, level of symmetric interdependency
|ai,j| = 0.7.
The landscape formed by the chosen fitness function9 is composed of a single
peaked surface with monotonically decreasing fitness for points increasingly distant
from the optimum, though points at the same distance from the optimum may
have different fitness values. The surface is composed of four “ridges”, separated by
“valleys”, leading to the global optimum. The ridges lie along two orthogonal lines because of
the assumption of symmetry of interdependency, i.e. ai,j = −aj,i. The central aspect
of a bi-dimensional pNK is, obviously, its degree of interdependency. In graphical
terms, the interdependency is represented by the angles of the ridges with respect to the
axes, which are controlled by the absolute values of the parameters |ai,j|.
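As a rough illustration of this geometry, the Python sketch below evaluates a two-dimensional landscape of this kind. The functional form is an assumption made only for the example (contributions φi = 1/(1 + |xi − μi|), with each target μi shifted by the other variable); it satisfies the requirement stated in footnote 9 but is not necessarily the definition used in the paper. The names pnk_fitness and point_at are hypothetical.

```python
import math

def pnk_fitness(x1, x2, a=0.7, optimum=(100.0, 100.0), f_max=1.0):
    """Two-dimensional pNK-style fitness (assumed functional form).

    Symmetric interdependency: a_{1,2} = a and a_{2,1} = -a.
    """
    d1 = x1 - optimum[0]
    d2 = x2 - optimum[1]
    # Each contribution is highest when the variable matches a target that
    # depends on the other variable (assumption, consistent with footnote 9).
    phi1 = 1.0 / (1.0 + abs(d1 - a * d2))
    phi2 = 1.0 / (1.0 + abs(d2 + a * d1))
    return f_max * (phi1 + phi2) / 2.0

def point_at(distance, direction, optimum=(100.0, 100.0)):
    """Point at the given distance from the optimum along a direction vector."""
    norm = math.hypot(direction[0], direction[1])
    return (optimum[0] + distance * direction[0] / norm,
            optimum[1] + distance * direction[1] / norm)

a = 0.7
ridge_dir = (a, 1.0)               # points where d1 = a * d2 (phi1 is maximal)
valley_dir = (a + 1.0, 1.0 - a)    # bisector of the two ridge directions
on_ridge = pnk_fitness(*point_at(1.0, ridge_dir), a=a)
in_valley = pnk_fitness(*point_at(1.0, valley_dir), a=a)
print("optimum:", pnk_fitness(100.0, 100.0, a=a))
print("ridge, distance 1:", round(on_ridge, 3))
print("valley, distance 1:", round(in_valley, 3))
```

Under these assumptions, two points at the same distance from the optimum receive different fitness values depending on whether they sit on a ridge or in a valley, and the ridge directions (a, 1) and (1, −a) are orthogonal for any a.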
To analyze the effects of interdependency more completely, the graphs
reported in Fig. 2 describe five different landscapes generated by pNK for |ai,j| =
0, 0.25, 0.5, 0.75, 1. For |ai,j| = 0, the two ridges leading to the global maximum
9 Different functional forms can be generated by altering the expression of the fitness contributions φi. The
only requirement is that φi is inversely related to the difference |μi − xi|.
[Figure 2: five surface plots titled “Fitness |a|=0”, “Fitness |a|=0.25”, “Fitness |a|=0.50”, “Fitness |a|=0.75” and “Fitness |a|=1.00”; both horizontal axes run from 98 to 102 and the fitness axis from 0.3 to 1.]
Fig. 2 Fitness landscapes for N = 2 and |ai,j | = 0.0; 0.25; 0.50; 0.75; 1.00
are parallel to the axes. For |ai,j| = 1, the ridges run on the diagonals, at 45 degrees
with respect to the axes. For intermediate values, the angle increases with |ai,j|.10
The next subsection explores the areas of local peaks trapping one-dimensional
search strategies for each of the above landscapes.
3.2 Local peaks and interdependency
In contrast to NK, pNK not only replicates the statistical relation between
complexity and the extent of local peaks, but also makes it possible to locate them and to
explain the mechanism that traps the search strategy and hence causes the peak.
The angle of the ridges with respect to the axes is crucial in determining the properties
of pNK. In fact, no point off the ridges (i.e. in the “valleys”) can be a local
peak, since there is always a whole area of neighbors closer to the optimum and with
10 All the experiments reported in the paper are produced with simulation programs implemented with the
Laboratory for Simulation Development (LSD) (Valente 2008). The LSD platform is available for download
at www.labsimdev.org. The code for the models and the specific exercises, together with the instructions on
their use, is available upon request.
higher fitness. Hence, a pattern either always remains in a valley, and so necessarily ends up
in the global optimum, or it reaches one of the ridges, which are therefore
the only areas that need to be studied.
The one-dimensional search strategy means that the only admissible steps are
those made holding one variable constant, that is, moving “horizontally” or “vertically”, parallel to one of the axes. If the ridges of the fitness surface run parallel
to the axes, then such a strategy will surely bring us to the global optimum. This is
because the ridges “dominate” the valleys, and therefore are likely to be reached by
the pattern.11 Once a point on a ridge is reached, the strategy is able to walk along the
top of it, eventually reaching the global optimum.
Conversely, for landscapes with diagonal ridges (i.e. forming large angles with the
axes), search patterns are likely to get trapped in local peaks. In effect, most of the
points on the ridges constitute, for these landscapes, the set of local optima. This is
because, when the ridges are diagonal, moving parallel to the axes (in the direction
pointing towards the global optimum) generates two opposite variations of fitness.
First, since the step leads away from the ridge, the fitness of the new point will
tend to be lower than that of the starting point on top of the ridge. Second, since one direction
of the step brings us closer to the optimal point (because of the angle of the ridges),
the fitness will tend to increase. The net effect is uncertain in general, and depends on
which segment of the ridge the pattern has reached, as well as on the angle of the ridges with respect to
the axes. Given the functional form chosen, the slope of the ridges is gentler the farther away from the global optimum, and therefore, in these areas, the positive effect
on fitness of getting closer to the maximum will be weaker. Conversely, segments of
the ridges near the global optimum have steeper fitness and shallower valleys, and
therefore it is more likely that the fitness loss caused by stepping off the ridge is
smaller than the gain from getting closer to the maximum. As a result, patterns reaching
the ridges far away from the optimum are likely to get trapped, while those managing
to stay in valleys until close to the maximum will avoid the trap of local peaks.
To visualize the properties of the intensity of interdependency, we consider the
(average) final fitness reached by search strategies. For each landscape, we ran about
30,000 independent searches, starting from randomly chosen initial points. Each
search consists of steps performed by adding or subtracting a fixed step size to or from a randomly chosen
variable, and moving to the new point if the change generates higher fitness.12
For each search, we associate the coordinates of the starting point with the fitness
reached when the search terminates, averaging the values of different searches from
the same starting point. Points from which searches lead to final points with higher fitness
are shown with brighter colors, while those producing patterns trapped at low fitness
values are indicated with darker colors.
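The procedure just described can be sketched as follows, as an illustration rather than a reproduction of the original LSD code. The fitness function is an assumed stand-in (φi = 1/(1 + |xi − μi|)); the step of 0.05 and the 98 to 102 window are taken from the paper's footnote; stopping when none of the four one-dimensional moves improves fitness is our reading of “when the search terminates”. All function names and the max_steps safety cap are hypothetical.

```python
import random

STEP = 0.05   # step size reported in the paper
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))   # one-dimensional moves only

def fitness(i, j, a):
    """Assumed pNK-style fitness; (i, j) are lattice offsets from the optimum."""
    d1, d2 = i * STEP, j * STEP
    phi1 = 1.0 / (1.0 + abs(d1 - a * d2))
    phi2 = 1.0 / (1.0 + abs(d2 + a * d1))
    return (phi1 + phi2) / 2.0

def random_search(start, a, rng, max_steps=100_000):
    """Random one-dimensional hill climbing until no single move improves."""
    i, j = start
    for _ in range(max_steps):          # safety cap (assumption)
        current = fitness(i, j, a)
        improving = [(i + di, j + dj) for di, dj in MOVES
                     if fitness(i + di, j + dj, a) > current]
        if not improving:               # local or global peak reached
            break
        # Drawing among improving moves gives the same end points as drawing
        # any move at random and rejecting the non-improving ones.
        i, j = rng.choice(improving)
    return (i, j), fitness(i, j, a)

def average_final_fitness(a, n_searches, half_width=40, seed=0):
    """Average final fitness keyed by starting lattice point."""
    rng = random.Random(seed)
    totals, counts = {}, {}
    for _ in range(n_searches):
        start = (rng.randint(-half_width, half_width),
                 rng.randint(-half_width, half_width))
        _, final = random_search(start, a, rng)
        totals[start] = totals.get(start, 0.0) + final
        counts[start] = counts.get(start, 0) + 1
    return {p: totals[p] / counts[p] for p in totals}

if __name__ == "__main__":
    for a in (0.0, 0.5, 1.0):
        result = average_final_fitness(a, n_searches=500)
        mean = sum(result.values()) / len(result)
        print(f"|a|={a:.2f}  mean final fitness over starts: {mean:.3f}")
```

Working on integer lattice offsets avoids floating-point drift in the coordinates. With half_width=40 the sketch samples an 81 by 81 grid of offsets, slightly different from the 80-unit lattice mentioned in the paper.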
11 Valleys, trivially, always lead to the global optimum, so we need not consider these areas.
12 Tests with different values of the step size showed the irrelevance of the value chosen for this parameter, except for
the level of detail of the graphs. The value used for the simulations is 0.05, so that the portion of the graphs
shown (from 98 to 102 on both dimensions) is effectively considered as a square lattice made of 80 units
on each side and composed of 6,400 points. Therefore, the 30,000 runs generate, on average, about five
random searches started from each point of the landscape.
The graphs reported in Fig. 3 show these values for different complexity levels, i.e.
values of |ai,j|. The graph for |ai,j| = 0 is not reported, since it consists of a uniform
plane of value 1, meaning that all searches starting from every point of the landscape
consistently manage to reach the global maximum. The remaining cases show that,
as |ai,j| increases, the area of starting points leading to the global optimum
shrinks. In the most challenging case (|ai,j| = 1), we obtain a Fuji-like figure:
only the searches starting close to the global optimum are able to reach the top spot.
In this setting, any initial point far from the global peak generates a pattern leading to
final fitness values that decrease regularly with the distance from the optimum.
The reason is that, on this maximum-complexity landscape, movements parallel to
the axes lead to a point on the ridges with a probability increasing with the distance
from the global optimum. The fitness along the ridges also decreases
farther from the global peak, and this is why the final fitness of a search started far
from the optimum is generally lower.
For intermediate values of complexity, we observe “propeller”-like figures. For
these cases, the farther from the global peak the search starts, the lower the probability of reaching the global optimum. However, the probability is not distributed
symmetrically on both sides of the ridges. In fact, given that the angle of the ridges is
positive but lower than 45 degrees, points on different sides of the ridges have different probabilities of being stuck in local optima. Patterns starting on the “lucky”
side of a ridge are more likely to get closer to the global optimum, while those starting on
the “wrong” side are more likely to get stuck earlier on.
This result confirms the capacity of pNK to replicate the core property of NK:
increasing the complexity of the landscape generates lower and lower expected fitness resulting from the same myopic search strategy. pNK obtains this result in a far
[Figure 3: four maps titled “Random search |a|=0.25”, “Random search |a|=0.50”, “Random search |a|=0.75” and “Random search |a|=1.00”; both axes run from 98 to 102 and the color scale from 0.6 to 1.]
Fig. 3 Final fitness produced by random one-dimensional search strategy for landscapes with N = 2 and
different values of |ai,j |. The graphs report the average final fitness in correspondence with each starting
point. The average is computed from an expected number of about five searches from each point
simpler setting than that required in NK and provides a more intuitive mechanism.
We have used a landscape with the minimal number of dimensions needed to express interdependency and generated complexity by means of the intensity of the interaction.
Furthermore, pNK defines precisely the locations leading to global or local maxima, providing deterministic and logical reasons for the eventual result. Having
established these basic properties, we can use pNK to investigate more elaborate
ones.
3.3 Emergence of order
The most attractive feature of NK is its representation of lock-ins of myopic explorations on complex landscapes. Smooth, simple landscapes allow all or most of
the explorations to reach the highest fitness point, while rough, complex landscapes are likely to trap the exploration in local peaks. Levinthal (1997) interprets
this feature in terms of expected diversity of organizations engaged in the search
for efficient strategies. He notes that, after starting with a population of firms
scattered randomly on different points of the landscape (i.e. with different initial
conditions), the local (“adaptive”) search rapidly brings a steep reduction of the
firms’ population variety. This is due to the convergence of firms in high-fitness
areas, that is, adopting successful strategies that can easily be reached by “adaptation”. Conversely, complex landscapes present a large number of local peaks with
relatively small basins of attraction. Consequently, firms starting from different
areas will end up in different local peaks and we will observe a larger variety of
strategies.
In the original paper, this result was supported by an experiment showing the
number of different points reached, on average, by 100 firms over 100 independent
landscapes setting N = 10 and repeating the experiments for K = 0, 1, 5 (Levinthal
1997, fig. 1, p. 940). The graph shows that, through time, the number of points reached
by the population of firms in the three cases considered drops considerably, with the most
complex landscape maintaining the highest diversity (i.e. the largest number of different
points).
As a test to assess the capacity of the new model, we used pNK to replicate the
same exercise. As expected, we obtain the same results, though with far richer detail
and greater ease of interpretation. Moreover, the more intuitive structure of pNK
produced greater insight, extending the original result. Figure 4 reports
the results produced by pNK using only N = 2 dimensions and expressing the
complexity of the landscape by means of the degree of interdependency |ai,j|. The
data are generated by defining 100 different landscapes, each with a different value of
interdependency between the two variables |ai,j|, ranging from null to maximum. On
each landscape, we performed 30,000 searches starting from randomly chosen points,
recording the coordinates of the initial and final point of each search, and the
fitness level of the latter. The graph reports the variance of one dimension’s final
location, as a proxy for the variety of the results produced by the searches, and the
average fitness of these final points. The data show that, for landscapes with low
complexity, all searches manage to reach the global optimum (fitness 1.0), generating
the highest level of homogeneity (variance of locations is null). For increasing levels
Fig. 4 Variance of locations and average fitness by search strategies on N = 2 pNK fitness landscapes.
Data produced by 30,000 searches for each of 100 landscapes with |ai,j | ranging from 0.0 to 1.0
of interdependency, we find that the final locations are increasingly dispersed, as
indicated by the higher and higher levels of variance, confirming the original result.
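A minimal sketch of how the two statistics plotted in Fig. 4 could be computed is given below. It is not the author's code: the fitness function is an assumed stand-in (φi = 1/(1 + |xi − μi|)), the search stops when no one-dimensional move improves fitness, and all function names are hypothetical. Starting points are passed in explicitly so that the same set can be reused across landscapes.

```python
import random
import statistics

STEP = 0.05
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def fitness(i, j, a):
    """Assumed pNK-style fitness; (i, j) are lattice offsets from the optimum."""
    d1, d2 = i * STEP, j * STEP
    return (1.0 / (1.0 + abs(d1 - a * d2)) + 1.0 / (1.0 + abs(d2 + a * d1))) / 2.0

def random_search(start, a, rng, max_steps=100_000):
    """Random one-dimensional hill climbing; returns final point and fitness."""
    i, j = start
    for _ in range(max_steps):          # safety cap (assumption)
        current = fitness(i, j, a)
        improving = [(i + di, j + dj) for di, dj in MOVES
                     if fitness(i + di, j + dj, a) > current]
        if not improving:
            break
        i, j = rng.choice(improving)
    return (i, j), fitness(i, j, a)

def location_variance_and_fitness(a, starts, seed=0):
    """Variance of the final x1 location and mean final fitness."""
    rng = random.Random(seed)
    finals = [random_search(s, a, rng) for s in starts]
    x1 = [100.0 + point[0] * STEP for point, _ in finals]
    mean_fitness = sum(f for _, f in finals) / len(finals)
    return statistics.pvariance(x1), mean_fitness

def random_starts(n, half_width=40, seed=0):
    """Random lattice offsets covering the 98 to 102 window."""
    rng = random.Random(seed)
    return [(rng.randint(-half_width, half_width),
             rng.randint(-half_width, half_width)) for _ in range(n)]

if __name__ == "__main__":
    starts = random_starts(300)
    for a in (0.0, 0.25, 0.5, 0.75, 1.0):
        var, fit = location_variance_and_fitness(a, starts)
        print(f"|a|={a:.2f}  variance of final x1={var:.4f}  mean fitness={fit:.3f}")
```

The population variance is used here as the dispersion measure; the paper does not state which variance estimator was adopted, so this choice is an assumption.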
We can also observe another result deriving from increasing complexity: decreasing levels of local peak fitness. That is, sub-optimal points trapping search patterns
in rougher landscapes show lower fitness than in the case of smoother ones. This
is due to the fact that higher complexity stops searches earlier,
on “bad” local peaks, while gentler landscapes are more likely to let search patterns end either at the top or closer to it. Although intuitively obvious, such results
cannot be observed in NK because of its statistical properties. It is well known
that NK landscapes generate higher levels of fitness of global peaks for higher K’s
(Kauffman 1989). Therefore, one will observe that a low ranking local peak in a landscape with high complexity shows a higher fitness value than the global peak of a
landscape with low complexity. This property is due to mere statistical reasons motivated by the way NK generates fitness values.13 Though innocuous when comparing
results from searches on the landscapes with the same complexity, this aspect of NK
creates serious problems when comparing results from landscapes defined with different complexity levels. For example, presenting a model where innovative firms’
13 In NK, the fitness is computed out of a sample of random numbers whose size is proportional to 2^K. Hence, rougher landscapes use a larger sample than smoother ones, increasing the probability
of finding a higher value.
qualities are obtained by fitness levels produced with NK, Lenox et al. (2006) write
“One could make a compelling argument [for] this trait of NK [...]. We decided, however, to eliminate this feature of the NK specification by expressing individual firm
marginal cost or quality as a percentage of the global optimal” (p. 763).
As in the cited work, researchers are frequently interested in comparing the relative fitness produced by search strategies in different landscapes. While
NK requires an elaborate, and necessarily arbitrary, transformation to compare fitness
values from landscapes with different complexities, pNK maintains the same overall distribution of fitness values, allowing an objective, direct comparison. The next
experiment exploits this feature, comparing two different search strategies across
landscapes with different intensities of interdependency.
3.4 Greedy vs. random strategies
In this experiment, we investigate the performance of two different strategies for
dealing with complexity. We will show that each of the two strategies is preferable
(i.e. provides higher fitness) within a certain level of complexity, providing an insight
that may prove useful to actors dealing with complex tasks.
We continue to use the simplest pNK problem space, made of two dimensions
only, and consider different scenarios with increasing levels of complexity as represented by the intensity of the interdependency. For each scenario, we compare the
average result produced by two search strategies, both myopic (only surrounding
points can be explored) and allowing changes on one single dimension at each step.
The first is the standard strategy, testing random points in the immediate neighborhood of the currently held point. The second strategy is slightly more elaborate;
instead of testing a point chosen randomly (within the permitted range), it tests all
possible one-dimensional changes (four, for the two-dimensional space), and opts for
the one providing the highest possible fitness increase (if any). Obviously, the two
strategies provide the same step in case only one change is able to provide a fitness
increment. However, when both dimensions could deliver a fitness increment, the
second, “greedy”, strategy will consistently choose the step providing the largest fitness increment, while the random strategy is expected to do so only half the time. The
question we explore is whether the supposedly “smarter” strategy (the one able to
choose always the best step) is actually superior to the random one. Figure 5 answers
this question, showing the fitness provided, on average, by the two strategies for different levels of intensity of interdependency |ai,j |, ranging from null to maximum
complexity.
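The two strategies can be contrasted with a short sketch such as the one below. It relies on an assumed stand-in fitness function (φi = 1/(1 + |xi − μi|)) rather than the paper's own definition, the function names are hypothetical, and ties in the greedy choice are broken by list order, which is an arbitrary choice. Whether the pattern reported in Fig. 5 appears under these assumptions is not guaranteed.

```python
import random

STEP = 0.05
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def fitness(i, j, a):
    """Assumed pNK-style fitness; (i, j) are lattice offsets from the optimum."""
    d1, d2 = i * STEP, j * STEP
    return (1.0 / (1.0 + abs(d1 - a * d2)) + 1.0 / (1.0 + abs(d2 + a * d1))) / 2.0

def search(start, a, rng, greedy, max_steps=100_000):
    """One-dimensional hill climbing with a greedy or a random step choice."""
    i, j = start
    for _ in range(max_steps):          # safety cap (assumption)
        current = fitness(i, j, a)
        improving = [(i + di, j + dj) for di, dj in MOVES
                     if fitness(i + di, j + dj, a) > current]
        if not improving:               # no admissible step raises fitness
            break
        if greedy:
            # Greedy: test all four changes and take the largest increase.
            i, j = max(improving, key=lambda p: fitness(p[0], p[1], a))
        else:
            # Random: any improving change is equally likely to be taken.
            i, j = rng.choice(improving)
    return (i, j), fitness(i, j, a)

def average_fitness(a, starts, greedy, seed=0):
    """Mean final fitness of one strategy over a list of starting points."""
    rng = random.Random(seed)
    return sum(search(s, a, rng, greedy)[1] for s in starts) / len(starts)

if __name__ == "__main__":
    rng = random.Random(42)
    starts = [(rng.randint(-40, 40), rng.randint(-40, 40)) for _ in range(500)]
    for k in range(11):
        a = k / 10
        r = average_fitness(a, starts, greedy=False)
        g = average_fitness(a, starts, greedy=True)
        print(f"|a|={a:.1f}  random={r:.3f}  greedy={g:.3f}")
```

Using the same list of starting points for both strategies keeps the comparison paired, so that differences in the averages reflect the step rule rather than the sampling of initial conditions.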
As expected, for low values of |ai,j |’s (simple landscapes), both strategies manage
systematically to reach the global maximum, as indicated by the value of 1 for both
strategies provided for low values of the interdependency parameter.14
14 We do not consider the speed of search of the two strategies, that is, the number of steps
required on average to reach the global optimum. In fact, we may expect that the greedy strategy is faster
than the random one, requiring a smaller number of steps to reach the eventual peak. In the following
section, we will discuss the issue of search speed, showing how pNK is able, contrary to NK, to
provide an intuitively appropriate representation of the length of search patterns.
[Figure 5: line plot of average fitness (“Av. fitness”, from 0.7 to 1) against |a| (from 0 to 1) for the “Random” and “Greedy” strategies.]
Fig. 5 Average final fitness over 30,000 searches starting from randomly chosen points with random and
greedy strategies on 11 landscapes, with |ai,j| ranging from 0.0 to 1.0
For the opposite cases, with high levels of complexity (|ai,j | approaching 1), the
greedy strategy manages to obtain higher fitness than the random one. This may be
easily explained since, in these contexts, the best that can be obtained is a relatively
high local peak, given that there are very few chances of reaching the global peak by
means of any hill-climbing strategy based on the adaptation of a single dimension
only. Consequently, the patterns will generally stop after a few steps, and therefore
the best that can be obtained is reaching the highest among the local peaks around
the starting point. Clearly, in these environments, the greedy strategy is the most
appropriate.
The dominance of the greedy strategy, however, is limited to the cases with high
complexity. Conversely, for moderate levels of complexity, when |ai,j | takes values in
the range from about 0.2 to 0.4, the humbler, random strategy dominates the greedy,
supposedly smarter one. The explanation for such an apparently counterintuitive result
can be found by comparing the patterns generated by the two strategies. Landscapes
with moderate complexity have areas of local peaks, but smaller than those with high
complexity. The results show that rushing to find the best local peak in the immediate
neighborhood of the starting point is not the best strategy when the local peaks are
few and distant. Rather, avoiding them altogether may be the best way to carry on,
ascending slowly (i.e. with smaller fitness increments per step), but ensuring a longer
path (i.e. with more steps).
This phenomenon offers interesting insights. Consider an organization where the
activities of its members are only moderately interdependent, so that a given change
by one member will be positive or negative for the organization, depending on the
current activities by one or more members, but these links are not very strong. Imagine that each member loyally proposes to the top management, say the CEO, the
course of actions within his department that ensures the highest improvement to the
firm’s performance. If the CEO is forced to implement only one of the received proposals, an apparently rational behavior would be to choose on the basis of the highest
expected performance gain. Our result says that there are cases in which the organization would be better off by choosing an alternative proposal providing a dominated
expected payoff. These cases are those in which higher performance is obtained at
the (hidden) cost of restricting subsequent exploration. Hence, the correct selection
criterion should go beyond mere expected performance measured in
the near future, and should also consider the effect of today’s choice on tomorrow’s
options.
This may be considered as an example of the phenomenon called “premature convergence” by researchers in problem-solving algorithms (Michalewicz 1996). These
are the cases in which a problem solver focuses on good solutions discovered in
earlier steps, preventing the exploration of a larger area potentially containing even
better solutions. In organization studies, this phenomenon has already been noted:
“adaptive processes, by refining exploitation more rapidly than exploration, are likely
to become effective in the short run but self-destructive in the long run” (March
1991). This exercise shows clearly the cost of short-term “greed”, providing a formal
context for observing the effects of exploitation versus exploration. The premature
convergence is due to the constraints posed by interdependency starting to bite early
on during a search, preventing the search from reaching high-fitness portions of
the search space. In cases of moderate interdependency, it would be better to not
pursue the search for optimality, but to leave some room for (locally) sub-optimal
decisions that, increasing the variety of exploration, enlarge the areas open to future
explorations. In short, there are cases in which the good is preferable to the best.
Concluding this section, we have shown that it is possible to replicate many results
originally produced with NK by using the simplest pNK implementation, based on
two dimensions only. We have seen that many (all?) results concerning the effects
of complexity can be reproduced by neglecting the role of the structure of complexity
and focusing only on the role of the intensity of interdependency. Furthermore, these
same results, besides being produced more efficiently, are more coherent and provide more insights than was the case with NK. The following section continues
the tests of pNK, extending the analysis to the complexity stemming from different interaction structures defined over a high number of
dimensions.
4 Complexity from interdependencies’ structure
In this section, we analyze the role of the structure of interdependencies in generating
complexity. The first paragraph defines a basic representation of complexity structure
and discusses the (partly) interchangeable role of complexity structure and intensity.
The following paragraph presents some results stemming from the exploitation of
complexity structure using pNK, highlighting the greater insights available compared
with the original analysis based on the NK representation.
4.1 Complexity: structure and intensity of interactions
Consider an N-dimensional problem space. As in a standard NK model, we know
that by varying the number of interactions (increasing K) we can vary the roughness of
the resulting landscape, increasing the probability that a hill-climbing myopic search
will end in a local peak. The standard NK model assumes no structure of the interdependency among variables, limiting the number of interdependency links and assigning them
randomly.15 It is possible, however, to construct an NK model in such
a way as to impose a desired interdependency structure on the problem space. To
create a very basic structure, we assign each variable to a group so that each group
contains the same number of variables. We then set interdependency connections
among all the variables in the same group, and no connection between variables from
different groups. Hence, the larger the groups, the larger the number of interactions
and, consequently, the rougher the landscape. Notice that, in this setting, it is
more convenient to indicate with K the number of variables within groups, rather
than the number of interactions affecting a variable as in the original NK model.
Consequently, in this case, K varies from 1 (groups composed of only one variable,
therefore having no interactions) to N (all variables placed within a single group, and
all variables having connections with every other variable). Conversely, the original
K, referring to the number of interactions, varies between 0 and N − 1.
We can test this setting in a pNK model by evaluating the performance of a one-dimensional search strategy for various levels of K. However, contrary to the original
NK model, we can also vary the strength of interaction, setting |ai,j| to values in
[0, 1]. Figure 6 reports the results generated with these settings.
The simulations are performed on different landscapes made up of N = 24 variables and eight different values of K: 1,2,3,4,6,8,12,24. For each K, we generate
11 landscapes using as many values of |ai,j | ranging from 0 to 1. On the resulting
88 landscapes, we ran 300 independent explorations based on the random strategy
described above. The data show the average value for each combination, using the
fitness of the local peak if the search got trapped into one, or the fitness value of
the point reached after 30,000 steps otherwise. As before, we find that these results
replicate and extend prior results generated with NK.
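The block structure and the search on it can be sketched as follows. Several elements are assumptions made only for illustration: the sign convention ai,j = −aj,i inside each group (generalizing the two-dimensional case), the stand-in fitness function (φi = 1/(1 + |xi − μi|)), the step of 0.05, and all function names. The cap of 30,000 steps echoes the limit mentioned above. The demonstration uses N = 12 rather than N = 24 only to keep the pure-Python run time short.

```python
import random

STEP = 0.05

def block_matrix(n, k, a):
    """Interaction matrix with n // k groups of k fully connected variables."""
    if n % k != 0:
        raise ValueError("k must divide n")
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and i // k == j // k:
                # Sign convention a_ij = -a_ji is an assumption that
                # generalizes the two-dimensional case.
                matrix[i][j] = a if i < j else -a
    return matrix

def fitness(d, matrix):
    """Assumed pNK-style fitness; d holds lattice offsets from the optimum."""
    n = len(d)
    total = 0.0
    for i in range(n):
        mu = sum(matrix[i][j] * d[j] * STEP for j in range(n))
        total += 1.0 / (1.0 + abs(d[i] * STEP - mu))
    return total / n

def random_search(start, matrix, rng, max_steps=30_000):
    """Random one-dimensional hill climbing on an n-dimensional landscape."""
    d = list(start)
    for _ in range(max_steps):
        current = fitness(d, matrix)
        improving = []
        for i in range(len(d)):
            for delta in (1, -1):
                d[i] += delta                 # try the move...
                if fitness(d, matrix) > current:
                    improving.append((i, delta))
                d[i] -= delta                 # ...and undo it
        if not improving:                     # trapped on a peak
            break
        i, delta = rng.choice(improving)
        d[i] += delta
    return d, fitness(d, matrix)

if __name__ == "__main__":
    n, a = 12, 0.5
    rng = random.Random(0)
    for k in (1, 3, 12):
        matrix = block_matrix(n, k, a)
        finals = [random_search([rng.randint(-5, 5) for _ in range(n)],
                                matrix, rng)[1] for _ in range(2)]
        print(f"K={k:2d}  mean final fitness={sum(finals) / len(finals):.3f}")
```

With K = 1 the matrix has no non-zero entries, every variable can be adjusted independently, and each search ends at the optimum. For larger K the outcome depends on the assumed functional form and sign convention, so the numbers printed by this sketch should not be read as a replication of Fig. 6.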
First, they show that increasing K decreases the average fitness we can expect
to reach, confirming that the choice of a “structured” K, instead of randomly chosen connections, does not affect the role of K in producing a tougher environment.
Moreover, as noted in the previous section, pNK allows us to compare directly the
15 Such a random structure actually leads quickly to the production of fully uncorrelated structures, even
for relatively low K values (Frenken et al. 1999).
[Figure 6: line plot of “Average Fitness” (from 0.3 to 1) against K (axis from 0 to 25), with one line for each value of |a| from 0.0 to 1.0 in steps of 0.1.]
Fig. 6 Average fitness produced at the end of 300 searches for each of the eight values K =
{1, 2, 3, 4, 6, 8, 12, 24} and for various values of |ai,j|. Interdependent dimensions have |ai,j| from 0.0 to 1.0,
while independent ones have ai,j = 0
results from different landscapes, without the bias produced by different distributions
of fitness values for local peaks across landscapes with different complexity.
The novel results stem from the effect of the intensity of interaction. The graph
shows clearly that weak interactions (|ai,j | < 1) allow the exploration to reach on
average higher local peaks than the case of maximum interaction, and the weaker the
intensity, the higher the expected fitness of the final points of an exploration.
This latter result provides support for the intuition that, for a given strategy of
search, the number of interactions and their intensity are two distinct and complementary mechanisms to generate complexity. Once we define a given measure of
complexity, such as, for example, the expected fitness reached by an adaptive, one-dimensional mutation search strategy, it is possible to obtain different landscapes
exhibiting the same level of complexity. They will differ in having a higher number
of interactions and lower intensity, or, vice versa, a lower number of interactions and
higher intensity. Generating similar results for one measure of complexity, however,
does not imply necessarily that these landscapes will share identical properties. In
fact, they will likely differ in respect of other measures of complexity, or in respect of
other applications, such as, for example, the population dynamic properties of agents
exploring the landscapes. We will not further explore this aspect, leaving to future
work the analysis of the properties of landscapes built with different combinations
of number and intensity of interactions. Rather, we concentrate on the effects of the
complexity structure with respect to search strategies.
4.2 Modular strategies and decomposability
In the previous exercise, we adopted the standard search strategy based on a
one-dimensional step. However, contrary to natural systems, artificial ones, such as
individuals or firms, can be expected to recognize the limitations of explorations
based on one-dimensional steps in the presence of complexity. Therefore, problem
solvers facing highly complex tasks are likely to engage in more sophisticated exploration strategies to avoid the traps of local optima. An interesting issue is, therefore,
what are the costs and advantages of different exploration strategies in respect of
different problems?
To analyze this issue, we need to remind ourselves that the very definition of a
local peak depends on both the properties of the landscape and the search strategy
adopted to explore it. In general, the shape of a landscape (e.g. being “smooth” or
“rough”) depends not only on its own structure but also on the rule defining the
admissible steps over the landscape. For example, a point may be a local peak if we
allow the exploration to consider only points within a given radius from the current
one, but it may not be a local peak if we extend the radius of feasible steps. Consequently, we may intuitively conjecture that for each type of complexity structure
there will be a matching exploration strategy able to “smooth away” any local peak,
and therefore systematically lead to the global optimum. As we will see, however,
while this is true, at least within a given sub-set of structures,
other considerations may suggest that sub-optimal search strategies leading to local
peaks are sometimes more desirable than “optimal” search strategies that, though not
admitting any local peak, have other drawbacks that advise against their adoption.
Since its inception (Simon 1969), the concept of complexity has been closely
connected with discussions concerning specific structures of interactions, particularly those exhibiting (or lacking) modularity. Modularity is the property of systems
made of several components; in modular systems, groups of components have strong
reciprocal interactions with members of the same group, while showing little or
no interaction with components outside their groups. Non-modular (or integral)
systems, conversely, have no visible structure, with interactions spanning different
components.
Many disciplines use modularity as a central concept to explain a variety of
phenomena in natural (Callebaut and Rasskin-Gutman 2005) and artificial systems
(Brusoni et al. 2007). For decades, organizational scholars have represented the problem of organizational design from a problem solving perspective (Miles and Snow
1978), comparing the structure of an organization to the structure of its tasks, such
as production processes or development of new products. According to this perspective, the performance of an organization can be evaluated on the basis of the match
between its internal structure and the external requirements. The concept of modularity can thus be applied to both the problem task and to the problem solver. To avoid
confusion we propose to use different terms, frequently used as synonyms, to express,
on the one hand, the interdependency structure of the problem space and, on the
other hand, the property of the search strategy adopted by the problem solver. So, we
will call decomposability the exogenous property of the problem space, such as the
definitions and the technical interactions among components of a product. The term
modularity will be used to indicate the property of the searching strategy adopted.
As an experiment, we match the “block-based” complexity structure, defined by
grouping the variables within subsets, with search strategies that are also based on the
grouping of variables. We call C the size of the block of variables considered by
the problem solver's strategy. Such generalized search strategies can be represented by
the following algorithm:16
1. Choose randomly one block among the N/C blocks in the problem solver
strategy.
2. Choose randomly an integer in the range [1;C] (call this number c).
3. Choose randomly c dimensions within the chosen block.
4. Change the value of each chosen dimension by a step in a random direction.
5. If the change leads to a point with higher fitness, move to the new point.
6. If the change leads to a point with lower fitness, remain at the same point.
7. Continue from 1.
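The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the step size `step`, the function names and the toy fitness function are hypothetical (the toy function is a smooth single-peaked stand-in, not pNK), and a candidate with equal fitness is treated like a worse one, a case the algorithm leaves unspecified.

```python
import random

def block_search_step(point, fitness, block_size, step, rng):
    """One iteration of the block-based search; returns (point, moved)."""
    n_blocks = len(point) // block_size          # assumes C divides N (footnote 16)
    block = rng.randrange(n_blocks)              # 1. choose one block at random
    c = rng.randint(1, block_size)               # 2. choose c in [1, C]
    first = block * block_size
    dims = rng.sample(range(first, first + block_size), c)  # 3. c dimensions in the block
    candidate = list(point)
    for d in dims:                               # 4. move each one up or down by `step`
        candidate[d] += rng.choice((-step, step))
    if fitness(candidate) > fitness(point):      # 5. accept only strict improvements
        return candidate, True
    return point, False                          # 6. otherwise stay put

def toy_fitness(point):
    """Hypothetical stand-in landscape with a single peak at the origin."""
    return 1.0 / (1.0 + sum(x * x for x in point))

rng = random.Random(0)
point = [rng.uniform(-1.0, 1.0) for _ in range(12)]
f_start = toy_fitness(point)
for _ in range(2000):                            # 7. repeat from step 1
    point, _moved = block_search_step(point, toy_fitness, block_size=3, step=0.01, rng=rng)
print(f_start, toy_fitness(point))
```

Setting `block_size=1` reduces the sketch to the single-dimension mutation strategy.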
In practice, a step of the search strategy consists in choosing randomly a new
point differing from the old one by between 1 and C dimensions. Hence, the single-dimension mutation can be considered as a block-based search strategy with C = 1.
For larger C values, the strategy is able to explore a wider area, proportional to C.
Figures 7 and 8 report on an experiment presented in Frenken et al. (1999), but use
pNK instead of the standard NK. Each of the eight graphs considers landscapes built
with a different K value, where the interdependent dimensions are chosen in “blocks”,
so that each dimension within a block is linked to exactly K-1 other dimensions in the
same block, and has no connection to the dimensions in other blocks.17 Each graph
reports the average
fitness of the populations of agents adopting different modular research strategies of
dimension C.
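As an illustration of the block structure just described, the sketch below builds an interaction matrix in which every dimension is linked to the K-1 other dimensions of its own block and to nothing else. The function name is hypothetical, N = 24 is an assumption (the value quoted for Fig. 9), and the sign of the coefficients is chosen arbitrarily as positive, since only |a_{i,j}| = 1 is specified.

```python
def block_interactions(n, k):
    """Return an n-by-n matrix with a[i][j] = 1 when i and j share a block of size k."""
    if n % k != 0:
        raise ValueError("k must be an integer divisor of n")
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Same block and distinct dimensions: maximum interdependency.
            if i != j and i // k == j // k:
                a[i][j] = 1.0
    return a

N = 24  # assumption, see the lead-in
for K in (1, 2, 3, 4, 6, 8, 12, 24):
    matrix = block_interactions(N, K)
    links_per_dimension = {sum(1 for v in row if v != 0.0) for row in matrix}
    print(K, N // K, links_per_dimension)  # K, number of blocks, links per dimension
```

With K = 1 the matrix is empty (no interdependency), while K = N yields a single fully connected block.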
The results generated with the pNK model coincide perfectly with those generated
with the standard NK. It can be shown that only search strategies in which C is a
multiple of K can reach the global optimum, and this is what we actually observe
for small K values. However, for more and more complex landscapes (higher K values), other search strategies seem to dominate. The reason lies in the fact that search
strategies with large modules are very slow, since they essentially test randomly all
possible combinations of the dimensions in each group. In this case, “optimal” search
strategies take far too long to show their superiority, and therefore “quick & dirty”
strategies, though doomed to eventually get stuck in a local peak, actually dominate
slow ones for a long period before ending up trapped in a local peak. Hence, if the
relevant period for competition is sufficiently short, it is quite possible that “dominated” strategies (in the long term) are actually superior to “optimal” ones for any
reasonable length of time.

16 For simplicity, we consider only C values that are integer divisors of N, though the results would not
change otherwise. It would only mean that the last class would contain a smaller number of variables than
the other classes.
17 For obvious reasons of comparability, we assume that each interdependency in pNK is maximum, i.e.
|a_{i,j}| = 1 for each pair of dimensions i and j within the same block.

[Figure 7: four panels, K = 1, K = 2, K = 3 and K = 4. Each panel plots average fitness against time (steps 1 to 30,000), with one series per class: Cl. 1, Cl. 2, Cl. 3, Cl. 4, Cl. 6, Cl. 8 and Cl. 12.]
Fig. 7 Average fitness across time for classes of agents mutating blocks made of C = {1, 2, 3, 4, 6, 8, 12}
dimensions. Simulations reported for landscapes with K = {1, 2, 3, 4}
4.3 pNK and agent-based models
In our review of the limitations of NK we mentioned the difficulty in using NK as
a model for complexity to plug into a wider model of, say, organizations, markets,
etc. This is because the NK properties are statistical, made evident only by collecting
a sufficiently large number of repetitions. However, each single run of a landscape
exploration is ill adapted to represent an actual exploration process. The reason is that
the pattern actually generated by a simulated agent on an NK landscape is composed
of very few fitness increasing steps in between a long series of failed attempts. This
feature prevents the direct use of NK to represent agents performing activities such
as, e.g., R&D, where the modeller is likely to expect innovative firms to generate a
stream of innovations, though at random rates of arrival. On the contrary, a pattern
in NK consists in one or a few jumps in between long periods of failed attempted
mutations. Such limitations make it hard, for example, to assume that the rate of arrival
of innovations, or their relative innovativeness, is proportional to the amount of R&D
investment.
[Figure 8: four panels, K = 6, K = 8, K = 12 and K = 24. Each panel plots average fitness against time (steps 1 to 30,000), with one series per class: Cl. 1, Cl. 2, Cl. 3, Cl. 4, Cl. 6, Cl. 8 and Cl. 12.]
Fig. 8 Average fitness across time for classes of agents mutating blocks made of C = {1, 2, 3, 4, 6, 8, 12}
dimensions. Simulations reported for landscapes with K = {6, 8, 12, 24}

[Figure 9: two scatter plots, NK (left) and pNK (right). Horizontal axis: number of successful mutations (about 0 to 18 for NK, about 160 to 300 for pNK). Vertical axis: fitness.]
Fig. 9 Values of fitness and the number of successful mutations for NK and pNK. In both cases, the graphs
report the values for N = 24, K = 3 and agents adopting a one-bit mutation strategy. The data are obtained
by 1,000 independent runs on the same landscapes. Values were collected at the end of exploration, when
a local peak is reached

Such limitations do not affect pNK. The proposed model produces search patterns
closer to the intuitive sequence of fitness increasing steps, each composed, as in NK,
by twists and turns (looking for the right direction), but also as actual steps in a
familiar Euclidean space. To highlight the difference between NK and pNK in this
respect, Fig. 9 compares 1,000 searches (one-bit mutations) on an NK landscape and
on a pNK landscape, both composed of N = 24 dimensions and defined as moderately complex
(K = 3). The graphs represent the scatter plots between the fitness value of the local
peak eventually reached by the search patterns and the number of fitness increasing
mutations produced by the patterns, that is, the successful “steps”. The left graph
(NK) shows a random cloud of points, indicating that there is no relation between the
number of steps and the final fitness reached by the search strategy. Moreover, the
average number of steps is about nine, meaning that the patterns generated by NK are
pretty short. Conversely, the right graph shows that patterns generated with pNK have
a more plausible positive correlation between the number of positive mutations and
the final fitness of a search path, representing the obvious fact that to reach higher
fitness areas you need a longer walk. Besides the correlation, moreover, patterns in
pNK are overall much longer (between about 160 and 300 steps), providing a far
better metaphor for results from innovating activities.
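The NK side of this comparison can be reproduced in outline with the sketch below. It is a textbook NK landscape with random lookup tables, written under assumptions: the function names are hypothetical, the linked dimensions are drawn at random, K = 3 is read as each dimension depending on itself and two others, and only 200 walks are run (instead of 1,000) to keep the example fast. Failed attempts are not simulated one by one; each accepted step is drawn uniformly among the improving one-bit flips, which are exactly the mutations a random trial-and-error search would accept.

```python
import itertools
import random

def make_nk_landscape(n, k_others, rng):
    """Return a fitness function for a random NK landscape on n bits."""
    # Each dimension depends on itself and on k_others randomly chosen dimensions.
    links = [rng.sample([j for j in range(n) if j != i], k_others) for i in range(n)]
    # One random contribution per dimension and per configuration of its inputs.
    tables = [{key: rng.random() for key in itertools.product((0, 1), repeat=k_others + 1)}
              for _ in range(n)]

    def fitness(bits):
        return sum(tables[i][(bits[i],) + tuple(bits[j] for j in links[i])]
                   for i in range(n)) / n
    return fitness

def adaptive_walk(fitness, n, rng):
    """One-bit-mutation walk from a random point; returns (successful steps, final fitness)."""
    bits = [rng.randint(0, 1) for _ in range(n)]
    current = fitness(bits)
    successes = 0
    while True:
        improving = []
        for i in range(n):
            bits[i] ^= 1                 # try flipping bit i
            f = fitness(bits)
            bits[i] ^= 1                 # undo the flip
            if f > current:
                improving.append((i, f))
        if not improving:                # local peak reached
            return successes, current
        i, current = rng.choice(improving)
        bits[i] ^= 1
        successes += 1

rng = random.Random(42)
landscape = make_nk_landscape(n=24, k_others=2, rng=rng)
runs = 200
results = [adaptive_walk(landscape, 24, rng) for _ in range(runs)]
mean_steps = sum(steps for steps, _ in results) / runs
print(mean_steps, min(f for _, f in results), max(f for _, f in results))
```

Plotting the pairs in `results` gives the NK scatter plot; the pNK counterpart would require the pNK fitness function, which is not reproduced in this sketch.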
As an example of the possibilities offered by pNK, Ciarli et al. (2007, 2008) use
a model derived from pNK to express the technological race of innovative firms. In
these models, firms undertake several economic activities (e.g. sell final and/or intermediate products, collect revenues, set prices, etc.) as well as research activities
(searching for better technologies on a complex technological landscape). In these
works, the relative timing between the output of research (i.e. better technologies)
and their economic impact (i.e. higher competitiveness) is crucial. An NK model,
implying a slow and erratic pattern of “discoveries”, would not be able to represent
the intended functions. Conversely, pNK provides the flexibility to represent sensible search patterns made of gradual improvements resulting from the appropriate
research strategy. Moreover, the functional form of the complexity model and the
possibility to control every aspect of the landscape give the opportunity to extend the
representation of the complexity space. In these works, for example, the whole technological landscape is represented as shifting in time under the pressure of exogenous
movements of the technological frontier. Therefore, the very same point of the landscape (i.e. technology) provides different fitness values (relative competitiveness)
through time, as the whole landscape drifts away. Consequently, firms must continuously invest in research even just to maintain the current relative technological
position of their product, as represented by the fitness. Such a feature is easily represented in pNK, allowing the exploration of how the overall industry performance
differs under, say, fast or slow changes of the technological frontier.
5 Conclusions and further research
This work proposes a new model to represent complexity by implementing the core
elements of NK in ways that are better suited to applications in economics
than the original biological metaphor. The proposed model, pNK, generates a complex landscape by means of the (controlled) interaction among the dimensions of the
landscape as in NK. However, pNK considers real-valued dimensions (instead of the
binary ones of NK), allows for grades of interaction (instead of presence/absence),
and is implemented by means of a deterministic function, greatly simplifying the
implementation of the model and the interpretation of the results. These differences
make it possible to enhance significantly the study of complexity, in general, and its
applications to specific domains, such as economics and organization theory.
The paper describes several experiments using pNK to represent complex landscapes, showing how the core properties of NK are maintained in the new model,
adding the benefit of dealing with the more intuitive context of a real-valued space. In
particular, we show that intensity of interdependency is equivalent to the number of
interdependencies, at least for many results presented in the literature. This permits
us to replicate the same properties using a familiar bi-dimensional plane and provide intuitive graphical representations. Moreover, we present a number of original
results made possible by pNK. Extensive experiments support the claim that pNK is
far better suited to implement complexity within models entailing the representation
of simulated agents facing complexity.
Several lines of research may be devised starting from the present work. First,
there are many properties of the proposed model that still need to be explored. For
example, it may be interesting to study the properties of non-symmetric landscapes,
produced when |a_{i,j}| ≠ |a_{j,i}|. More generally, there are many functional shapes
that may be used as a fitness function; since this determines the relative distribution of local peaks, it may be useful to generate a library of functions for different
uses.
A related line of research is to study the possibility of generating landscapes on the
basis of empirical data. Many theoretical and applied researchers would be interested
in the possibility of interpolating a data set of available evidence to be considered as
a sample of scattered points, filling the gaps with econometric techniques and eventually producing an estimated fitness landscape of a specific complex problem. This
may contribute to a promising literature on the empirical applications of complexity
theory in economics (Frenken and Nuvolari 2004; Alkemade et al. 2009), and would
obviously find many uses for both theoretical and applied issues.
Finally, pNK is particularly suited to be used in conjunction with agent-based
simulation models, that is, as a “complexity module” within models
representing systems dealing with the exploration of complex spaces, such as the economics of technical change or organizational theory. Its features make it very simple
to devise search strategies modeled on the behaviors of actual agents, either individuals or organizations, facing complex tasks. The comparison of varied complexity and
behaviors adopted to deal with it has provided in the past many insights using NK,
and more can be expected by using the more powerful and flexible pNK model.
References
Alkemade F, Frenken K, Hekkert MP, Schwoon M (2009) A complex systems methodology to transition
management. J Evol Econ 19(4):527–543
Altenberg L (1995) Genome growth and the evolution of the genotype-phenotype map. In: Banzhaf
W, Eeckman FH (eds) Evolution and biocomputation: computational models of evolution, vol 899.
Springer, Berlin, pp 205–259
Altenberg L (1997) NK fitness landscapes. In: Baeck T, Fogel D, Michalewicz Z (eds) Handbook of
evolutionary computation. Taylor & Francis, New York
Auerswald P, Kauffman SA, Lobo J, Shell K (2000) The production recipes approach to modeling technological innovation: an application to learning by doing. J Econ Dyn Control 24(3):389–450. ISSN
0165-1889
Brusoni S, Marengo L, Prencipe A, Valente M (2007) The value and costs of modularity: a problem-solving perspective. Eur Manag Rev 4:121–132
Callebaut W, Rasskin-Gutman D (eds) (2005) Modularity: understanding the development and evolution
of natural complex systems, Vienna series in theoretical biology. MIT Press, Cambridge
Ciarli T, Leoncini R, Montresor S, Valente M (2007) Innovation and competition in complex environments. Innov Manag Policy Pract 9(3–4):292–310
Ciarli T, Leoncini R, Montresor S, Valente M (2008) Technological change and the vertical organization
of industries. J Evol Econ 18(3):367–387
Durrett R, Limic V (2003) Rigorous results for the NK model. Ann Probab 31(4):1713–1753
Frenken K (2006) A fitness landscape approach to technological complexity, modularity, and vertical
disintegration. Struct Chang Econ Dyn 17(3):288–305
Frenken K, Nuvolari A (2004) The early development of the steam engine: an evolutionary interpretation
using complexity theory. Ind Corp Chang 13(2):419–450
Frenken K, Marengo L, Valente M (1999) Interdependencies, nearly-decomposability and adaptation. In:
Brenner T (ed) Computational techniques for modelling learning in economics. Springer, Berlin,
pp 145–165
Kauffman SA (1989) Adaptation on rugged fitness landscapes. In: Stein D (ed) Lectures in the sciences
of complexity. The Santa Fe Institute series. Addison-Wesley, New York, pp 527–618
Kauffman SA (1993) The origins of order: self-organization and selection in evolution. Oxford University
Press, Oxford
Kauffman SA, Levin S (1987) Towards a general theory of adaptive walks on rugged landscapes. J Theor
Biol 128:11–45
Kauffman SA, Lobo J, Macready WG (2000) Optimal search on a technology landscape. J Econ Behav
Organ 43:141–166
Kaul H, Jacobson SH (2006) Global optima results for the Kauffman NK model. Math Program
106(2):319–338. ISSN 0025-5610
Lenox MJ, Rockart SF, Lewin AY (2006) Interdependency, competition, and the distribution of firm and
industry profits. Manag Sci 52(5):757–772
Levinthal DA (1997) Adaptation on rugged landscapes. Manag Sci 43(7):934–950
Li R, Emmerich MTM, Eggermont J, Bovenkamp EGP, Bäck T, Dijkstra J, Reiber JHC (2006) Mixed-integer NK landscapes. In: Runarsson TP, Beyer H, Burke E, Merelo-Guervós J, Whitley LD, Yao X
(eds) Parallel problem solving from nature—PPSN IX. Lecture Notes in Computer Science. Springer,
Berlin, pp 42–51
March JG (1991) Exploration and exploitation in organizational learning. Organ Sci 2(1):71–87
Michalewicz Z (1996) Genetic algorithms + data structures = evolution programs, 3rd edn. Springer, Berlin
Miles RE, Snow CC (1978) Organizational strategy, structure, and process. McGraw-Hill, New York
Rivkin JW (2000) Imitation of complex strategies. Manag Sci 46(6):824–844
Rivkin JW, Siggelkow N (2002) Organizational sticking points on NK landscapes. Complexity 7:31–43
Simon HA (1969) The sciences of the artificial. MIT Press, Cambridge
Skellett B, Cairns B, Geard N, Tonkes B, Wiles J (2005) Maximally rugged NK landscapes contain the
highest peaks. In: GECCO ’05: proceedings of the 2005 conference on genetic and evolutionary
computation. ACM Press, New York, pp 579–584. ISBN 1-59593-010-8
Valente M (2008) Laboratory for simulation development—LSD, working paper 2008/63. Sant’Anna
School for Advanced Studies, LEM, Pisa
Wagner GP, Altenberg L (1996) Complex adaptations and the evolution of evolvability. Evolution
50(3):967–976
Weinberger E (1991) Local properties of Kauffman’s N-k model: a tunably rugged energy landscape.
Phys Rev A 44(10):6399–6413