Thermodynamic Investigations of Random Graph Models

Samuel Peana, Joaquin A. Maury, and Prashanth S. Venkataram
MIT Department of Economics: 14.15
(Dated: May 15, 2014)
Analogies between the Erdős-Renyi random graph G(n, M ) and the microcanonical ensemble from
statistical mechanics are analyzed. This leads to a natural derivation of the Poisson random graph
G(n, p) from the Erdős-Renyi random graph G(n, M ) through thermodynamic considerations, just as
the canonical ensemble can be derived from the microcanonical ensemble. The approximations made
to achieve the canonical ensemble are justified qualitatively through computational simulations.
Finally, the possibility of allowing the number of nodes in a random graph to vary is discussed in
analogy to the grand canonical ensemble.
Random graphs, especially of the Erdős-Renyi
G(n, M ) and Poisson G(n, p) varieties, are convenient for
modeling the statistics of networks where the exact edge
formations are not fully deterministic or may vary in an
ergodic manner. Frequently, the derivation of exponential random graphs such as the Poisson G(n, p) graph
from the Erdős-Renyi G(n, M ) graph is done through
constrained maximization of the graph entropy. An alternate way to perform this derivation is to instead consider
analogies to statistical thermodynamics, where the enforcement of the equivalent of thermal equilibrium yields
certain explicit assumptions regarding the properties of
a graph and its surroundings to lead to an exponential
random graph. These assumptions can be further examined using the power of computational simulations.
Finally, once the links between random graphs and statistical thermodynamics are established, they can be used
to motivate a random graph model with a variable number of nodes in analogy to the grand canonical ensemble
of statistical thermodynamics.
A model system can be described as follows to motivate the rest of the paper. Consider a company with
all employees located on the planet Mars, so that they
are isolated from the rest of humanity and may only interact with each other. For certain projects, the company may require a fixed number of partnerships to occur. The statistics of these partnerships yield the Erdős-Renyi random graph G(n, M), as there are n employees
and M partnerships. More likely, though, the company
is located on Earth, and the employees have relationships with people outside the company as well; moreover,
the interactions between employees cannot be so tightly
controlled by the company. If the number of partnerships among all of humanity is taken to be fixed but the
number of partnerships within the company may vary,
and if the number of employees is fixed at any point,
then this will be shown to reproduce the Poisson random
graph G(n, p). Finally, if the number of employees fluctuates due to hirings and firings, then the network in the
company would be analogous to a grand canonical ensemble from statistical thermodynamics, because people
from outside the company can enter and people inside
the company can leave. This last idea has not been investigated much in existing literature on random graph
theory. Furthermore, in contrast to random attachment
models where the number of nodes can only increase, the
grand canonical ensemble allows the number of nodes to
fluctuate in both directions around a mean. This then
allows for the description of a static equilibrium system
because averaging over fluctuations in time for a single
system should remain equivalent to averaging over a static
ensemble of systems, as is the case for any equilibrium
statistical thermodynamic model.
I. ERDŐS-RENYI RANDOM GRAPH

I.1. Graph Theory
The Erdős-Renyi random graph G(n, M) models a random graph as having a fixed number of nodes n and a fixed number of edges M distributed uniformly among all \binom{n}{2} possible edge locations. Because the edges are distributed randomly through the graph, the probability that a given edge location is actually occupied by an edge is

p = \frac{M}{\binom{n}{2}} \qquad (1)
for this graph. Moreover, because the number of edges is
fixed, the average degree is simply
⟨k⟩ = \frac{1}{n} \sum_i k_i = \frac{1}{n} \sum_{i,j} A_{ij} = \frac{2M}{n} \qquad (2)
given an adjacency matrix A that is any realization of
this random graph. However, other properties are not
as simple to calculate, motivating the use of the Poisson
random graph G(n, p) instead. An adventure into the
statistical mechanics of the microcanonical ensemble for
a two-level system can shed some light onto this.
I.2. Microcanonical Ensemble
The Erdős-Renyi random graph G(n, M ) is in fact
identical to the two-level system commonly seen in statistical physics. Consider a system of N particles fixed in
place where each particle may have energy of either 0 or 1.
If the total energy E is fixed, the most likely probability
distribution of this system is the one where every realization of the system (with the same N particles where E
of them are in the state of energy 1) is equally probable.
This means the probability of choosing any such system
realization G is simply
p(G) = \frac{1}{\Omega} \qquad (3)
where Ω is a normalization constant representing the
number of such realizations possible. In fact, this number
is simply the combinatorial factor for placing E indistinguishable excitations into N distinguishable slots: Ω = \binom{N}{E}. This probability distribution
can also be shown to maximize the entropy
S = -\sum_G p(G) \ln(p(G)) \qquad (4)

to yield

S = \ln(\Omega) \qquad (5)
which is exactly the Boltzmann entropy formula; the entropy is thus a logarithmic measure of the number of
available realizations of the system given the conditions,
which at this moment are simply fixed (N, E).
Statistical mechanical analysis is most useful in the limit where N ≫ 1, which is the thermodynamic limit. Given that Ω = \binom{N}{E} for this two-level system, the thermodynamic limit justifies the use of the Stirling approximation

\ln(a!) \approx a \ln(a) - a, \quad a \gg 1 \qquad (6)

which yields

S \approx N \ln(N) - E \ln(E) - (N - E) \ln(N - E) \qquad (7)
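As a quick numerical check of equation 7, the Stirling-approximated entropy can be compared against the exact S = ln \binom{N}{E}; a minimal Python sketch (the values of N and E below are arbitrary):

```python
import math

# Exact entropy S = ln(binom(N, E)) of the two-level system
def entropy_exact(N, E):
    return math.log(math.comb(N, E))

# Stirling-approximated entropy from equation 7
def entropy_stirling(N, E):
    return N * math.log(N) - E * math.log(E) - (N - E) * math.log(N - E)

N, E = 10**5, 3 * 10**4
exact, approx = entropy_exact(N, E), entropy_stirling(N, E)
print(exact, approx)  # agreement to a relative error of order 1e-4
```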
as the entropy. In general, for a microcanonical system
where the number of particles and the total energy are fixed
at each point of consideration, the way the entropy may
change if the energy changes is characterized by the inverse temperature
β = \frac{\partial S}{\partial E} \qquad (8)

which gives

β \approx -\ln(E) + \ln(N - E) = \ln\left(\frac{N}{E} - 1\right) \qquad (9)

for this system. This can be rearranged to give

E = \frac{N}{1 + e^{β}} \qquad (10)
which is the energy in terms of the inverse temperature.
Because the energy E is uniformly distributed across all N particles, the probability that a particular particle has energy 1 is simply E/N, from which the relation

p = \frac{E}{N} = \frac{1}{1 + e^{β}} \qquad (11)
emerges. Finally, if the probability of a realization of the
system having general property X is desired, then that
is
p(X) = \frac{\Omega'(X)}{\Omega} \qquad (12)

where Ω is the total number of realizations of fixed (N, E) (which in this case is \binom{N}{E}), and Ω′(X) is the number out of those graphs that also have property X. For example, if the number of graphs where a given particle has energy 1 is desired, the number of total graphs is again Ω = \binom{N}{E}, and in fact the number of graphs with the given particle having energy 1 is simply \binom{N-1}{E-1}. Dividing the latter by the former and simplifying the factorials yields exactly the equation p = E/N as above. However, for most desired
properties, this method of counting realizations satisfying
the given properties is a much more computationally difficult problem, which makes the microcanonical ensemble
computationally less desirable.
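The counting argument just given can be verified exactly with integer arithmetic; a small sketch (the values of N and E are arbitrary):

```python
from math import comb
from fractions import Fraction

# Probability that a given particle has energy 1: the fraction of
# microcanonical realizations containing that particle in the energy-1 state,
# C(N-1, E-1) / C(N, E), which should simplify exactly to E/N.
N, E = 30, 12
p = Fraction(comb(N - 1, E - 1), comb(N, E))
assert p == Fraction(E, N)  # simplifies exactly to E/N
print(p)
```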
What does all of this have to do with Erdős-Renyi
random graphs G(n, M )? The particles here are potential edge locations between nodes (i.e. the particles are
node pairings), and whether a particle has energy 0 or 1
is equivalent to whether an edge is absent (0 edges) or
present (1 edge), respectively, between a pair of nodes.
The number of edges being uniformly distributed is identical to the energy being uniformly distributed. Hence, as the number of potential edge locations is \binom{n}{2}, if the replacements N → \binom{n}{2} and E → M are made, the resulting analysis exactly reproduces the well-known results of the Erdős-Renyi random graph G(n, M), while also providing a new origin for the parameter β seen in exponential random graph models as being analogous to temperature; in fact, for brevity the notation N ≡ \binom{n}{2} will be used for random
graph models, because this is indicative of the number of
potential edge locations, akin to the number of particles
in a physical system. Moreover, the difficulties with the
microcanonical ensemble in statistical mechanics are exactly the same difficulties with the Erdős-Renyi random
graph model G(n, M ): counting the number of graph realizations satisfying fixed (n, M ) and a given property is
computationally difficult, and it would be easier to be
able to count all graph realizations in some way.
There is one difference to be noted between the microcanonical ensemble and the Erdős-Renyi random graph
model G(n, M ) with regard to temperature. The microcanonical ensemble comes from statistical mechanics,
which aims to reproduce classical thermodynamics in the
N ≫ 1 limit. This also means it must reproduce physical
laws, including the idea that particles tend to settle in the
lowest energy state possible when not driven by external
factors. As per equation 11, if β > 0 then p < 1/2, which means that it is more likely for a particle to be in energy 0 than 1, which is consistent with the prior statement. If β < 0, however, then p > 1/2, meaning that more particles on average are in the higher-energy state than in
the lower-energy state. This is not an equilibrium system
because if such a system is brought in contact with another system that has β > 0, energy will spontaneously
flow from the former system to the latter because energy
flowing away from the system with β < 0 increases its
entropy. For random graphs, no such considerations are
needed; β can be any real value, which allows p to take
on any value between 0 and 1.
II. CANONICAL ENSEMBLE
It is possible to derive the Poisson random graph
G(n, p) as a special case of the exponential random
graph model where only the average number of edges
is constrained; the exponential random graph model itself is typically derived from the Erdős-Renyi random
graph G(n, M) as a maximization of the graph entropy S = -\sum_G p(G) \ln(p(G)) constrained by the averages of
the desired graph properties. Instead, the Poisson random graph G(n, p) will be derived thermodynamically
from the Erdős-Renyi random graph G(n, M ).
Let us consider an Erdős-Renyi random graph G_U ∈ {G(n_U, M_U)} (meaning the whole graph G_U is in the set of Erdős-Renyi random graphs with n_U nodes and M_U edges present) where N_U = \binom{n_U}{2} ≫ 1. Let us further consider a subset of the nodes in this whole graph whose number n gives a potential edge number N that is much smaller than the remainder of the graph: N_U - N ≫ N.
To be clear, the potential edge number N is only found
from edges that can be made within the n nodes chosen,
so any edges that can be made fully externally to the chosen subset or from the subset to the surroundings are part of N_U - N. Because the surroundings are so much bigger
than the chosen subset, although the number of edges in
the surroundings can vary contingent on the number of
edges in the whole graph being fixed, the fraction of edges
landing in the surroundings is so much larger than the
fraction of edges landing in the chosen subset that the
probability of edge formation p in the surroundings can
be regarded as essentially fixed. The probability that the
chosen subset has M edges present out of the N possible,
which is the probability that the subgraph is in the set of
Erdős-Renyi graphs with n nodes and M edges present,
is
p(G) = \frac{\Omega_{surr}(M_U - M)}{\Omega_U(M_U)} \quad \text{for } G \in \{G(n, M)\} \qquad (13)
where the numerator is the number of realizations of the
surroundings having the remaining number MU − M of
edges and the denominator is the number of realizations
of the whole graph having MU total edges; the dependences of Ω on NU and of Ωsurr on NU − N have been
suppressed, as those are less relevant for this derivation.
The above expression can be rewritten as
\ln(p(G)) = S_{surr}(M_U - M) - S_U(M_U) \qquad (14)
\quad \text{for } G \in \{G(n, M)\} \qquad (15)
using the relation S = ln(Ω) for the surroundings and
whole graph separately. Now using the fact that M ≪ M_U, the entropy of the surroundings can be expanded
as a power series around the total number of edges MU ,
because the variance of the number of edges in the surroundings is very small compared to that. This gives
\ln(p(G)) \approx S_{surr}(M_U) \qquad (16)
\quad - M \frac{\partial}{\partial M_{surr}} S_{surr}(M_{surr}) \Big|_{M_{surr} = M_U} \qquad (17)
\quad + O(M^2) - S_U(M_U) \quad \text{for } G \in \{G(n, M)\} \qquad (18)
where Msurr = MU − M is the number of edges in the
surroundings, and the partial derivative of the entropy
of the surroundings is taken with respect to the number
of edges and is then evaluated at MU . Note that this
expansion is done to first order in M , accounting for the
approximation below O(M 2 ). Using equation 11, the
probability of edge formation in the surroundings being
essentially constant is the same as β being essentially
constant across the system and surroundings. In other
words, thermal equilibrium for a random graph means
that two subgraphs that are connected have equal probabilities of edge formation. This means that the partial
derivative of the entropy of the surroundings is simply β,
giving
\ln(p(G)) \approx -βM + S_{surr}(M_U) - S_U(M_U) + O(M^2) \qquad (19)
\quad \text{for } G \in \{G(n, M)\} \qquad (20)

which can be further rewritten to yield

p(G) = e^{S_{surr}(M_U) - S_U(M_U)} e^{-βM} \quad \text{for } G \in \{G(n, M)\} \qquad (21)
where the first exponential is independent of the number
of edges M present within the chosen subset of nodes
and can therefore be treated as a normalization constant.
Indeed, because
\sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} p(G) = 1 \qquad (22)
must be true for p(M ) to be a probability, meaning that
the sum over all edge values M from 0 to N of the sum
over all Erdős-Renyi graphs G(n, M ) of the probability
must be unity, it can be shown that the partition function
Z = \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} e^{-βM} \qquad (23)

which gives a weighted count of the number of graph realizations available to this chosen subset of nodes is exactly equivalent to the reciprocal of the first exponential in equation 21. Moreover, because there are \binom{N}{M} Erdős-Renyi graph realizations G(n, M), then in fact
Z = \sum_{M=0}^{N} \binom{N}{M} e^{-βM} = (1 + e^{-β})^N \qquad (24)

is the exact partition function for this class of graphs, owing to the edges forming independently of each other. Hence,

p(G) = \frac{e^{-βM(G)}}{Z} \qquad (25)

is the probability of picking a graph realization on a set of nodes weighted by the number of edges present in the graph. Average quantities are then computed as usual from

⟨X⟩ = \sum_G X(G) p(G) \qquad (26)

for a given quantity X that is dependent on the graph realization G. If the quantity X only depends on M rather than on the graph realization G, it is more convenient to write

p(M) = \binom{N}{M} \frac{e^{-βM}}{Z} \qquad (27)

as the probability of picking any graph with M edges by accounting for the \binom{N}{M} ways of picking a graph out of only those with a fixed number of edges M, so that

⟨X⟩ = \sum_{M=0}^{N} X(M) p(M) \qquad (28)

may be easier to calculate in many instances.

What is the probability of finding an edge between two nodes in such a graph? This is the average adjacency between two nodes, and using the fact that M = \sum_{i<j} A_{ij}, then

p = ⟨A_{lm}⟩ = \frac{1}{Z} \sum_{\{A_{ij}\}} A_{lm} \exp\Big(-β \sum_{i<j} A_{ij}\Big) \qquad (29)
\quad = \frac{\sum_{A_{lm} \in \{0,1\}} A_{lm} \exp(-βA_{lm})}{\sum_{A_{lm} \in \{0,1\}} \exp(-βA_{lm})} \qquad (30)
\quad = \frac{1}{1 + e^{β}} \qquad (31)

which exactly reproduces the result from equation 11 as well as the occupation number for fermions (in more lay terms, electrons) from elementary quantum statistical mechanics. Moreover, this says that edges do in fact form independently of each other with equal Bernoulli probability p. This makes it more convenient to rewrite the expression as

β = \ln\left(\frac{1}{p} - 1\right) \qquad (32)

because for random graphs, the edge formation probability p is more convenient to use than the inverse temperature β. In any case, β and p are monotonically decreasing functions of one another, so β can be viewed as a rewriting of the edge formation probability p in a more "thermodynamic" form.

What is the average number of edges in such a graph? Ordinarily this would be calculated as

⟨M⟩ = \frac{1}{Z} \sum_G M(G) e^{-βM(G)} \qquad (33)

but it is easy to manipulate this to yield

⟨M⟩ = -\frac{\partial}{\partial β} \ln(Z) \qquad (34)
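The closed forms in equations 24 and 36 can be checked by brute force over every graph realization for a small node set; a sketch in Python (the values of n and β are arbitrary):

```python
import math
from itertools import product

# Enumerate all 2^N graphs on n nodes: summing e^{-beta M(G)} over them
# should equal (1 + e^{-beta})^N with N = C(n, 2), and the ensemble
# average <M> should equal N / (1 + e^beta).
n, beta = 4, 1.38
N = math.comb(n, 2)  # number of potential edge locations

Z = 0.0
M_weighted = 0.0
for edges in product([0, 1], repeat=N):  # each tuple is one realization
    M = sum(edges)                       # edges present in this realization
    w = math.exp(-beta * M)
    Z += w
    M_weighted += M * w

assert abs(Z - (1 + math.exp(-beta)) ** N) < 1e-9          # equation 24
assert abs(M_weighted / Z - N / (1 + math.exp(beta))) < 1e-9  # equation 36
print(Z, M_weighted / Z)
```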
in analogy to the energy in the canonical ensemble in
statistical mechanics. This can be intuitively interpreted
as saying that the average number of edges in this graph
should be related to how the number of accessible graph
realizations changes with the probability of edge formation. In fact, it can be more generally shown that
⟨M^r⟩_c = (-1)^r \frac{\partial^r}{\partial β^r} \ln(Z) \qquad (35)
where r is any nonnegative integer and the subscript c
denotes the cumulant of the distribution. For this random graph model, the formulas above give the average
number of edges as being
⟨M⟩ = \frac{N}{1 + e^{β}} \qquad (36)
which exactly matches the expression from the microcanonical ensemble analysis.
As a further example, let us consider an analogy to the heat capacity C = ∂⟨E⟩/∂T from statistical mechanics, which describes how much the energy E of a system changes when only its temperature T is changed slightly. Making the replacement E → M and using the inverse temperature β = 1/T, then C = -β² ∂⟨M⟩/∂β. In this case, that means C = Nβ²e^{β}/(e^{β} + 1)². However, it is more useful to consider how the number of edges changes on average with the probability rather than with the complicated function β. The issue with this is that ⟨M⟩ = Np, so the change in the average number of edges with probability is simply N; this is the case because all the edges are present independently of each other with probability p. Hence, C = -β² ∂⟨M⟩/∂β is only useful through its relation to the variance (second cumulant) of edges ⟨M²⟩_c.
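The cumulant relation of equation 35 at r = 2 can likewise be checked numerically: the second β-derivative of ln Z should reproduce the Bernoulli edge variance Np(1 - p). A minimal sketch (the values of N, β, and the step h are arbitrary):

```python
import math

# ln Z = N ln(1 + e^{-beta}) from equation 24
N, beta, h = 45, 0.7, 1e-4

def lnZ(b):
    return N * math.log(1 + math.exp(-b))

# central second difference approximates d^2 ln Z / d beta^2
var_numeric = (lnZ(beta + h) - 2 * lnZ(beta) + lnZ(beta - h)) / h**2

# Bernoulli edge variance N p (1 - p) with p = 1/(1 + e^beta)
p = 1 / (1 + math.exp(beta))
assert abs(var_numeric - N * p * (1 - p)) < 1e-5
print(var_numeric)
```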
As a final example, let us consider an analogy to the chemical potential µ = -(1/β) ∂ln(Z)/∂N: as N is related to the number of nodes n in a one-to-one manner (because n ≥ 1 always), then µ is related to (but not exactly) how the average number of edges changes when the probability of edge formation changes. Elementary manipulation of equation 23 yields

µ = -\frac{1}{β} \ln(1 + e^{-β}) \qquad (37)
as the chemical potential. The issue with this expression is that µ is independent of N because ln(Z) depends
linearly on N ; this is also a consequence of saying that
edges form independently, which will pose issues further
on.
III. COMPUTATIONAL JUSTIFICATIONS
Computational simulations of a large random graph
can be used to qualitatively justify some of the approximations made, such as that of expanding the entropy
of the surroundings only to linear order in M . An
Erdős-Renyi graph G(n_U, M_U) was simulated for fixed n_U = 1000 and M_U = 10^5, so that N_U = \binom{n_U}{2} = 499500 ≫ 1; this was done by picking node pairs uniformly at random and adding an edge if one was not already present. Then, the number of edges M in a system of n nodes was counted by considering the rows and columns 1 to n in the adjacency matrix and dividing the total by 2, as \sum_i k_i = 2M. For each n considered, the random graph G(n_U, M_U) was generated 10^4 times to produce acceptable statistics in the distribution of M values for each
n.
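The procedure above can be sketched in Python; the sizes below are scaled down from the paper's (n_U = 1000, M_U = 10^5, 10^4 trials) purely for speed, and edges are stored as a set of sorted node pairs rather than an adjacency matrix:

```python
import math
import random

random.seed(0)
n_U, M_U, n, trials = 100, 1000, 40, 300  # reduced sizes, for speed only
N_U, N = math.comb(n_U, 2), math.comb(n, 2)

samples = []
for _ in range(trials):
    edges = set()
    while len(edges) < M_U:  # pick node pairs until M_U distinct edges exist
        i, j = random.sample(range(n_U), 2)
        edges.add((min(i, j), max(i, j)))
    # count edges whose endpoints both lie in the chosen n-node subset
    samples.append(sum(1 for i, j in edges if j < n))

M_avg = sum(samples) / trials
beta_sys = math.log(N / M_avg - 1)       # beta inferred from the subset
beta_universe = math.log(N_U / M_U - 1)  # beta of the whole graph
print(beta_sys, beta_universe)
```

The two β values should agree closely, mirroring the thermal-equilibrium result reported in table I.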
For the 104 simulations of each system size n given
fixed universe parameters G(nU , MU ), a further assumption needs to be made. From equation 27, the approximation
\binom{N}{M} \approx \frac{2^N}{\sqrt{\pi N / 2}} \, e^{-\frac{2}{N}\left(M - \frac{N}{2}\right)^2} \qquad (38)

can be shown to hold as long as N = \binom{n}{2} ≫ 1, though the
overall scale in front of the exponent is less relevant; this
is indeed true to a very high order even for n = 100. If the
system were isolated (G(n, M )) rather than in thermal
equilibrium with the surroundings (G(n, p)), the probability of choosing any (variable) number of edges M would be proportional to this Gaussian factor, which is peaked at N/2 with the coefficient 2/N (half the inverse variance 4/N of the binomial distribution) controlling the quadratic term that is exponentiated. Multiplying this Gaussian approximation by the factor e^{-βM} simply shifts the location of the peak in M but preserves the Gaussian shape. This means that the data for the edge distributions for each system size should be able to be fitted to a Gaussian.
The computational results for p(M ) for each n are
shown in figure 1, where each set of data was fitted to the Gaussian function f(x) = a e^{-((x-b)/c)^2}. The results of the fits are given in table I. As can be seen in that table, when using the formula β ≈ ln(N/⟨M⟩ - 1), β is constant for every system size n to within 2 decimal places, and
FIG. 1. Fits of p(M ) data (data are black dots in all plots)
to Gaussian functions (fits are red lines in all plots) for values
of n that are (from top to bottom) 100, 200, 300, or 900
this number is the same as the exact value of β calculated from the true microcanonical graph G(n_U, M_U): β = ln(N_U/M_U - 1) ≈ 1.38. This is a positive result in
reproducing the notion that the universe is at the same
temperature as the system and surroundings. Note however that the coefficients of the quadratic term in the exponential, which are 1/c² for the Gaussian fit and 2/N for the Gaussian approximation to the binomial coefficient, stay within a factor of 1.6 of each other through n = 300. However, at n = 448, \binom{448}{2} = 100128, which is larger than the number of edges (10^5) placed into the whole universe,
TABLE I. Calculated mean b from the Gaussian fit, calculated inverse variance 1/c² from the Gaussian fit, first-order mean N/2 and inverse variance 2/N from the analytical Gaussian approximation of \binom{N}{M}, and β = ln(N/b - 1) from the microcanonical ensemble replacing the exact M with the average ⟨M⟩ = b

 n     b       N/2      2/N          1/c²         β
 100   993     2475     4.0 × 10^-4  6.5 × 10^-4  1.38
 200   3993    9950     1.0 × 10^-4  1.6 × 10^-4  1.38
 300   8996    22425    4.5 × 10^-5  7.6 × 10^-5  1.38
 900   81160   202275   4.9 × 10^-6  4.1 × 10^-5  1.38
so n > 448 may not be considered "small" relative to the universe anymore. At this point, the quadratic coefficients may be expected to diverge further, and the starkest example of this is at n = 900, where there is a full order of magnitude difference between the two quadratic coefficients. This can be seen as the point where the expansion of the entropy of the surroundings must be taken to an order higher than linear in M. Qualitatively, then, it can be seen that the notion of "smallness" does break down for sufficiently large n as a fraction of n_U, though the notion of thermodynamic equilibrium is maintained between the system and surroundings for any system size. More systematic quantitative analysis in fitting the logarithms of the histogram data to higher-order polynomials than just quadratics (which would give Gaussian curves) was hampered by issues of weighting the relative data points to ensure that a good fit in the logarithmic domain remains good in the linear domain; for related reasons, the original analysis comparing the binomial coefficient to its skewed product with a decaying exponential was found to be flawed, and the resulting corrections led to the analysis here.

IV. GRAND CANONICAL ENSEMBLE

From classical thermodynamics, the grand potential is related to the energy E = E(S, N) by the Legendre transformation

G(β, µ) = E - \frac{S}{β} - µN \qquad (39)

where E is the energy, S is the entropy, β is the inverse temperature, and µ is the chemical potential describing how the usable energy of a system changes under different conditions when particles are added or removed. In this case, when the replacement E → M is made, the relation should still hold true. This would give a measure of the properties of a random graph when the number of nodes is also variable, as the number of potential edges N is related to the number of nodes n through the one-to-one relation N = \binom{n}{2} when n ≥ 1. The problem with this approach is that the chemical potential µ is independent of N as per equation 37, so there is no relation allowing inversion to get N in terms of µ that would then allow the Legendre transformation to occur as desired. Thus, a more fundamental statistical approach is required.

From statistical mechanics, varying the number of nodes simply becomes another Lagrange multiplier constraint in the entropy maximization problem. This route will be taken rather than the appeal to thermodynamics from the earlier sections for simplicity. The constrained graph entropy is

S = -\sum_N \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} p(G) \ln(p(G)) \qquad (40)
\quad + α\Big(\sum_N \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} p(G) - 1\Big) \qquad (41)
\quad - β\Big(\sum_N \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} M(G) p(G) - ⟨M⟩\Big) \qquad (42)
\quad + γ\Big(\sum_N \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} N(G) p(G) - ⟨N⟩\Big) \qquad (43)

where α constrains

\sum_N \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} p(G) = 1 \qquad (44)

for probabilities, β constrains

\sum_N \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} M(G) p(G) = ⟨M⟩ \qquad (45)

for the average edges, and γ constrains

\sum_N \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} N(G) p(G) = ⟨N⟩ \qquad (46)

for the average nodes. Maximizing S with respect to p(G) and appealing to thermodynamics to equate γ = βµ yields

p(G) = \frac{e^{βµN(G) - βM(G)}}{Q} \qquad (47)

where the grand canonical partition function

Q = \sum_N \sum_{M=0}^{N} \sum_{G \in \{G(n,M)\}} e^{βµN - βM} \qquad (48)
\quad = \sum_N \left(e^{βµ}(1 + e^{-β})\right)^N \qquad (49)

is effectively a discrete Laplace transform of the canonical partition function from equation 23. Note that the sum over N is not over all integers N, because N = \binom{n}{2}.
Because the sum should really be performed over nodes
rather than potential edges, as fractional nodes cannot be added even if doing so would add integer numbers of potential edges (from the definition of the factorial function through the Γ function), the partition function should really be rewritten as

Q = \sum_{n=0}^{n_U} \left(e^{βµ}(1 + e^{-β})\right)^{\binom{n}{2}} \qquad (50)

for n_U nodes in the total graph so that the number of
nodes in the system n ≤ nU . This has no closed form
expression when nU is finite, but if n can vary, it is most
useful to consider the case where the system can exchange
arbitrary numbers of nodes with its surroundings, which
is the limit of nU → ∞. In terms of the second elliptic
theta function
θ_2(u, q) = 2 q^{1/4} \sum_{n=0}^{\infty} q^{n(n+1)} \cos((2n+1)u) \qquad (51)

and relabeling the summand as

x \equiv e^{βµ}(1 + e^{-β}) \qquad (52)
then
Q = 1 + \frac{1}{2 x^{1/8}} \, θ_2(0, \sqrt{x}) \qquad (53)
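The resummation in equation 53 can be checked numerically against a direct truncation of the sum in x, with θ_2 evaluated from its own series in equation 51; a sketch with an arbitrary x < 1:

```python
import math

# theta2(u, q) from its series, truncated; terms underflow to 0 quickly
def theta2(u, q, terms=200):
    return 2 * q**0.25 * sum(q**(n * (n + 1)) * math.cos((2 * n + 1) * u)
                             for n in range(terms))

x = 0.8
# direct truncation of Q = sum_n x^C(n,2)
Q_direct = sum(x**math.comb(n, 2) for n in range(200))
# closed form of equation 53
Q_theta = 1 + theta2(0, math.sqrt(x)) / (2 * x**0.125)
assert abs(Q_direct - Q_theta) < 1e-9
print(Q_direct)
```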
is the grand canonical partition function for this system.
It can be shown that the average number of potential edges for this system, which gives the average number of nodes through N = \binom{n}{2}, is

⟨N⟩ = \frac{\partial \ln(Q)}{\partial(βµ)} \qquad (54)

so this can be solved to yield ⟨n⟩. However, it is more
convenient to express this average more explicitly as a
weighted sum. This yields
⟨n⟩ = \frac{\sum_{n=0}^{\infty} n \, x^{\binom{n}{2}}}{\sum_{n=0}^{\infty} x^{\binom{n}{2}}} \qquad (55)

where x ≡ e^{βµ}(1 + e^{-β}) as defined previously. This makes sense because in a graph where n is fixed but M may vary, the number of nodes is of course n, so that average in the grand canonical ensemble is a weighted sum of the canonical ensemble average with respect to the summand x^{\binom{n}{2}}.
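The weighted sum in equation 55 is easy to evaluate by truncation, since for 0 < x < 1 the weights x^C(n,2) decay super-exponentially; a sketch that also checks the quoted crossover near x ≈ 0.5553:

```python
import math

# <n> from equation 55, truncating the sums over n; for 0 < x < 1 the
# weights underflow harmlessly to 0.0 well before the cutoff.
def avg_n(x, n_max=300):
    ns = range(n_max + 1)
    weights = [x ** math.comb(n, 2) for n in ns]
    return sum(n * w for n, w in zip(ns, weights)) / sum(weights)

# <n> falls toward 1/2 as x -> 0 and diverges as x -> 1;
# the crossover <n> = 1 sits near x ~ 0.5553, as quoted in the text.
assert avg_n(0.2) < 1 < avg_n(0.9)
assert abs(avg_n(0.5553) - 1) < 0.01
```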
The average number of nodes ⟨n⟩ can be plotted against the summand x. It can be seen in figure 2 that ⟨n⟩ > 1 and is finite only when 0.5553 < x < 1, where the lower bound is approximate and must be computed numerically, while the upper bound is exact. Because this region is plotted against β and βµ, then the bound x = 1 (where ⟨n⟩ → ∞) corresponds to µ increasing from 1 to
+∞ as β increases from −∞ to 0, then increasing from
−∞ to 0 as β increases from 0 to +∞. This implies that
FIG. 2. Top: plot of hni against the summand x; bottom:
region of interest 0.5553 < x < 1 plotted against β (horizontal
axis) and βµ (vertical axis) corresponding to ⟨n⟩ ≥ 1
when β > 0, which is when having more edges present
is disfavored, then µ < 0 so having nodes to allow those
edges to fill in is also disfavored; by contrast, when β < 0,
which is when having more edges present is favored, then
µ > 0 so having nodes to allow those edges to fill in is
also favored. In fact, solving x = 1 for β and µ yields
exactly the expression in equation 37 for µ = µ(β). The
fact that µ matches its expression in equation 37 when
⟨n⟩ → ∞ is precisely indicative of the fact that statistical mechanics only works in the limit of arbitrarily large
system sizes, and because the system size only needs to
grow asymptotically slower than the universe size (unless the system is isolated as in the Erdős-Renyi graph
G(n, M )), there is no further inconsistency. Hence, the
grand canonical approach is consistent with the canonical approach in teasing out the meaning of the chemical
potential.
V. CONCLUDING REMARKS
The microcanonical ensemble was discussed as an analogy to Erdős-Renyi graphs G(n, M ), and a thermodynamic derivation of the canonical ensemble successfully
reproduced the Poisson graph G(n, p). Computational
simulations could qualitatively justify some of the assumptions used in that derivation. Future work would
make those justifications more quantitative by properly
fitting to higher-order exponentiated polynomials and
more systematically considering shifts away from the
Erdős-Renyi or Poisson limits; this would require a fitting algorithm that can consistently manipulate logarithmic data to produce a polynomial fit that still retains a
high quality of fit when exponentiated once more. The
simplest model of the grand canonical ensemble was considered in analogy to statistical thermodynamics. Future
work would focus on extracting more analogous information from this model (such as the average number of
edges, which was not considered in this paper due to
the much higher complexity of the analysis) and then
modifying the model in analogy to other statistical grand
canonical ensemble systems to be more reflective of real-world random networks with node numbers that fluctuate
around a mean. Moreover, the grand canonical ensemble as formulated in this paper considers systems that
can have an infinite number of nodes, and only systems
that are infinitely large can have non-analytic partition
functions allowing for true phase transitions (as in a statistical field theory), which would also be the subject of
future work.