
Transition Matrix
Monte Carlo Methods
for Density of States Prediction
Dissertation approved by the Faculty of Natural Sciences of Technische Universität Chemnitz
for the award of the academic degree
doctor rerum naturalium
(Dr. rer. nat.)
submitted by René Haber, M.Sc.,
born on 11 September 1983 in Karl-Marx-Stadt (now Chemnitz)
submitted on 2 May 2014
Reviewers: Prof. Dr. Karl Heinz Hoffmann
Prof. Dr. Christian Schön
Date of defense: 20 June 2014
Bibliographic Description
Haber, René
Transition Matrix Monte Carlo Methods for Density of States Prediction
Technische Universität Chemnitz, Faculty of Natural Sciences
Dissertation (in English), 2014
106 pages, 25 figures, 1 table and 114 references
Abstract
The first aim of this work is the development of a benchmark on the basis of which algorithms for the calculation of the density of states can be compared. Building on this, an existing transition-matrix-based method for the grand canonical ensemble is extended by a new evaluation procedure. To this end, numerical investigations of various Monte Carlo algorithms for the calculation of the density of states are carried out. The main focus lies on methods based on transition matrices, as well as on the method of Wang and Landau.

In the first part of this work a comprehensive overview of Monte Carlo methods and evaluation procedures for the determination of the density of states, as well as of related methods, is given. In addition, various methods for calculating the density of states from transition matrices are presented and discussed.

In the second part a new benchmark for density of states algorithms is developed. For this purpose a new model system is constructed in which various parameters can be chosen freely and for which the exact density of states as well as the exact transition matrix are known. Subsequently, two further systems for which at least the exact density of states is known are discussed: the Ising model and the Lennard-Jones system.

The third part of this work is concerned with numerical investigations of a selection of the presented methods. On the basis of the developed benchmark, the influence of various parameters on the quality of the calculated density of states is determined quantitatively. It is shown that transition matrices recorded during Wang-Landau simulations yield a considerably better density of states than the Wang-Landau method itself. These findings are then used to develop a new method with which the density of states can be obtained from large, sparse transition matrices by minimizing the deviations from detailed balance. Finally, a Lennard-Jones system in the grand canonical ensemble is investigated. It is shown that the new method yields a density of states and a vapor pressure curve that agree qualitatively with reference data.

Keywords
Statistical mechanics, statistical physics, Markov chain Monte Carlo methods, simulated annealing, density of states, Wang-Landau method, transition matrix, random graph, detailed balance, phase transition, grand canonical ensemble
Contents

1 Introduction

2 Methods for Density of States Calculation
2.1 The State Space and its Dynamics
2.2 Sampling and Analysis Methods
2.2.1 Metropolis Sampling
2.2.2 Histogram Reweighting
2.2.3 Multihistogram Method
2.2.4 Parallel Tempering
2.2.5 Multicanonical Sampling and Multicanonical Recursion
2.2.6 Umbrella Sampling
2.2.7 Broad Histogram Method
2.2.8 Nested Sampling
2.3 Wang-Landau Method
2.3.1 Method
2.3.2 Extension to the Grand Canonical Ensemble
2.3.3 Flatness
2.3.4 Implementation
2.3.5 Analysis of the Original Method
2.3.6 Extensions to the Original Method
2.4 ParQ
2.4.1 Method
2.4.2 Implementation
Data Acquisition
Grand Canonical Ensemble
2.5 Transition Matrix Methods
2.5.1 Transition Matrices at Finite Temperature
2.5.2 Infinite Temperature Transition Matrices
2.5.3 Transition Matrices in the Grand Canonical Ensemble
2.5.4 Solving for the Density of States
Broad Histogram Equation
Minimization of Detailed Balance Deviations
Minimization Weights
Eigenvector Methods
Implementing Minimization Procedures
2.6 Configurational Bias Sampling
2.7 Continuous Fractional Component Monte Carlo

3 A New Benchmark for Density of States Methods
3.1 Development of a Fully Adjustable Benchmark System
3.2 Ising Model
3.3 Lennard-Jones Model
3.4 Schedules for Simulated Annealing

4 Benchmarking Density of States Methods
4.1 Fully Adjustable Benchmark System
4.1.1 Sampling Method Comparison
4.1.2 Metropolis sampling
4.1.3 Error of Q Matrix Entries
4.2 2D Ising Ferromagnet
4.2.1 Algorithm Comparison
4.2.2 Power Iteration
4.2.3 Least Squares Minimization
4.2.4 Simulated Annealing Schedules
4.2.5 Wang-Landau Method
4.3 Two-Particle Lennard-Jones System
4.3.1 Optimal Binning
4.3.2 Sampling Method Comparison
4.4 Discussion of Benchmark Results

5 Grand Canonical Ensemble
5.1 A New Method for the Computation of the Joint Density of States from Sparse Transition Matrices
5.2 Joint Density of States of a Lennard-Jones System in the Grand Canonical Ensemble
5.3 Calculating Phase Coexistence Properties

6 Summary and Conclusion

Bibliography
1 Introduction
Obtaining predictions of thermodynamic properties of complex materials is of interest in many fields of research and engineering. For example, one may need to conduct measurements with an expensive material. Having a prior approximation of the outcome of the measurement may make it possible to reduce the number of experiments and hence the costs. Another aspect to consider is the scaling of laboratory equipment or of chemical reactors prior to performing an experiment with explosive or unstable chemicals. Thus, extensive planning, often including material simulations, is performed beforehand. Therefore, increasing the quality of the predictions of such simulations or reducing the simulation time is of great interest.
While the latter aspect is supported by the ever increasing processing power, both
aspects, quality and performance, benefit from algorithmic improvements.
With computer processors following Moore's law, we find that performance doubles approximately every 18 months. However, this increase is no longer found solely within single processors but rather in the growing number of processors within a computer. Thus, the parallelization of algorithms is necessary to utilize the full potential of such multi-processor computers and, further, of high-performance cluster computers. But not all algorithms are easy to parallelize. Hence, the user
or implementer of a simulation program needs to be aware of the implications on
quality and speed when choosing a specific algorithm.
Calculating properties of a thermodynamic system can be done in at least two
ways. The first, called molecular dynamics, is based on integrating Newton’s laws
over time and keeping track of averages of observables. Successive configurations
are created by calculating the forces acting upon atoms and molecules and updating
their velocities and positions. But the derivatives needed to calculate the forces
can be hard to derive and implement.
A second way to calculate predictions of thermodynamic properties is the use of Monte Carlo methods. Instead of needing the forces acting upon atoms, Monte Carlo methods content themselves with using the potential energy. This makes computations much simpler. The aim is to sample the state space uniformly. Thus, one method is to generate new configurations from “scratch” and weight them appropriately.
Although the states of a system do not necessarily have to be connected, defining
a neighborhood may increase the efficiency. Therefore we can set up a Markov
chain of states by repeatedly creating a new configuration through modification
of the previous configuration. The sequence of states does not necessarily have the meaning of time; this depends on the details of the algorithm implemented. A
feature of such Markov chain Monte Carlo methods is that the move class, i.e. the
way new states are built by modifying a previous state, can be constructed and
designed without any restrictions, as long as it generates states with an appropriate
weight. For example, there are move classes that perform self-avoiding walks to
insert large molecules or move classes that exchange the type of two different atoms.
This makes the Monte Carlo approach much more versatile compared to molecular
dynamics.
The properties of interest include free energy, conditions of phase coexistence and
critical points. While it is possible to assess these properties by a conventional
Metropolis sampling approach [1], more advanced techniques like Wang-Landau
sampling [2, 3] and transition matrix methods [4–6] have been developed in recent
years. Besides helping in overcoming long tunneling times or large free energy
barriers, the main reason for their usage is that they give estimates of the density of states, a quantity allowing the direct calculation of several thermodynamic observables. Transition matrix methods have also been shown to be particularly easy to
parallelize [6, 7].
The aim of this thesis is to find and develop an optimal set of methods for
estimating the density of states of large grand canonical systems. Therefore we first
need to develop a new test-bed which can be used to benchmark density of states
methods. Using this test-bed different candidate algorithms and combinations of
algorithms have to be investigated. Using the results of this investigation we will
introduce a new method for estimating the density of states, based on a transition
matrix method previously presented by the author [8]. The sparse structure of the
transition matrix is an essential key for performance improvements of up to two
orders of magnitude.
To start our investigation, we will first clarify our notion of state spaces and
the dynamics therein. The remainder of chapter 2 is devoted to the introduction
of different Monte Carlo sampling and analysis techniques in order to understand
their historic relationship, their weaknesses and strengths and their applicability in
different thermodynamic ensembles. Additionally two special Monte Carlo move
classes for improved sampling of the grand canonical ensemble are discussed.
In the third chapter a new benchmark for density of states methods is developed,
utilizing three model systems of which the density of states is known. First a new,
random graph based model system is developed, of which we have full control over
all parameters and also have knowledge of the exact density of states as well as
the exact transition matrix. The test-bed is completed by the Ising model and the
two-particle Lennard-Jones system.
Based on this benchmark a set of Wang-Landau and transition matrix based
methods is investigated in chapter 4. The findings of this investigation are subsequently used in chapter 5 to develop a new method for estimating the density of
states from grand canonical transition matrices.
Chapter 6 will summarize the results and give an outlook to future investigations.
2 Methods for Density of States Calculation
Monte Carlo methods are a common tool in statistical mechanics. In this chapter different methods and algorithms for sampling thermodynamic systems and
analyzing the results are presented. We will start by discussing the properties of
thermodynamic systems in the sense of statistical mechanics. Thereafter sampling
algorithms are discussed beginning with the root of most methods presented here,
the Metropolis method. In this chapter, the focus lies on algorithms capable of
calculating the density of states, e.g. reweighting [9, 10], transition matrix [4–6]
and multicanonical [2, 11, 12] based methods.
2.1 The State Space and its Dynamics
A system is defined as a set S, called state space, of microstates s ∈ S. The state
space can either be discrete or continuous. Discrete systems have the property that
their number of microstates NS = |S| is finite, i.e. NS < ∞, with the contrary being
true for continuous systems. Common examples of discrete systems include the
Ising [13] and Potts [14, 15] models, lattice gases [16] as well as lattice polymers [17],
and lattice proteins [18] to name a few. On the side of continuous systems we find
the Heisenberg model [19], models for molecular systems [20], proteins [21] or
clusters of molecules [22].
On the set of microstates we can define a function E(s), representing the energy
of microstate s. For example, the energy of a molecular system is given by
E(s) = K(s) + V (s),    (2.1)
with K(s) being the kinetic energy and V (s) the potential energy. Performing
Markov Chain Monte Carlo simulations, which only use the potential energy, we
can apply the equipartition theorem and set E(s) = V (s).
To create a dynamics within the state space we first need to define the neighborhood N (s) ⊆ S of a microstate s. One step of a random walk through such a
state space consists of randomly selecting a microstate s′ ∈ N (s) with probability
q(s → s′ ) and accepting this choice with probability a(s → s′ ). In terms of Markov
chain Monte Carlo simulations the first part, i.e. defining the neighborhood N (s)
and the probability q(s → s′ ), is called “move class”, and the second part, the
acceptance probability a(s → s′ ), belongs to the sampling algorithm employed.
Generalizing, one may say that the move class is system-dependent, whereas the
sampling algorithm is mostly independent of the system under consideration. For
some advanced techniques like Configurational Bias Monte Carlo (CBMC) [23–26]
this separation is invalid, given that the move class generates biased configurations
where the bias has to be compensated for in the acceptance part.
In classical statistical mechanics the partition function is a central quantity
containing all essential information of a system of interest [27]. In the canonical ensemble it is defined as a function of temperature T or inverse temperature
β = 1/(kB T ):
Ẑ(T) = ∑_{s∈S} e^{−E(s)/(kB T)} = ∑_{s∈S} e^{−βE(s)},    (2.2)
with the Boltzmann constant kB and the hat ˆ· indicating an exact quantity. The
probability to find a system in microstate s is then given as
P̂s = e^{−E(s)/(kB T)} / Ẑ(T) = e^{−βE(s)} / Ẑ(β).    (2.3)
Using the partition function we can directly calculate physical quantities of interest like the Helmholtz free energy
F̂(T) = −kB T ln Ẑ(T) = −(1/β) ln Ẑ(β),    (2.4)
which we will refer to throughout the rest of this work as “free energy”. All other
thermodynamic equilibrium quantities can be calculated by differentiating this
equation. Thereby we are able to calculate, for example, the internal energy
Ê(T) = −T² ∂(F̂/T)/∂T = −∂ ln Ẑ(β)/∂β.    (2.5)
Typically simulations yield only approximations for the internal energy. The free energy then has to be obtained by integration, with the limitation that the integration constant is unknown in most cases. Not knowing the free energy at a reference
temperature leaves us with the ability to calculate only free energy differences.
All these quantities can be formulated in terms of the density of states Ω̂(E). It
is defined as the number of microstates s having E(s) = E. The observables above
then read
Ẑ(T) = ∑_{α=1}^{NE} Ω̂(Eα) e^{−βEα}    (2.6)

and Ê(T) = (1/Ẑ(T)) ∑_{α=1}^{NE} Eα Ω̂(Eα) e^{−βEα},    (2.7)
for a discrete system with NE different energies. Each energy Eα , consisting of
Ω̂(Eα ) microstates, will be called a macrostate. Here Ω̂(E) denotes the exact
density of states. If we only have an estimate Ω(E) available, the estimated values
for the internal energy and its second moment become
⟨E(T)⟩ = (1/Z(T)) ∑_{α=1}^{NE} Eα Ω(Eα) e^{−βEα}    (2.8)

and ⟨E²(T)⟩ = (1/Z(T)) ∑_{α=1}^{NE} Eα² Ω(Eα) e^{−βEα},    (2.9)
respectively. This allows us to calculate the estimated specific heat cV(T) by

cV(T) = (1/(kB T²)) [⟨E²(T)⟩ − ⟨E(T)⟩²],    (2.10)

which is commonly used as an indicator for phase transitions. The temperature at which the specific heat becomes maximal is the temperature at which the phase transition occurs.
2.2 Sampling and Analysis Methods
In this section an overview of different methods and algorithms to calculate properties of thermodynamic systems is given.
2.2.1 Metropolis Sampling
Given a state space S, we can draw a sample s1 . . . sN , chosen from a uniform
distribution. Each microstate st can be assigned an energy Et and a value for an
observable At . In order to calculate an estimate of an observable A we build a
weighted average over all observations st :
⟨A(T)⟩ = (1/Z(T)) ∑_{t=1}^{N} At e^{−βEt},    (2.11)
where Z(T) = ∑_{t=1}^{N} e^{−βEt} is the partition function and β = 1/(kB T) the inverse temperature. This method of averaging has some drawbacks, especially if the Boltzmann factor e^{−βEt} turns out to be very small for a large number of observations.
The major improvement of Metropolis et al. [1] was to show that for Boltzmann
distributed observations At the average becomes
⟨A⟩ = (1/N) ∑_{t=1}^{N} At.    (2.12)
The resulting problem of generating samples that follow such a distribution can be solved by performing a random walk through state space.
To perform such a random walk we can set up a simulation, for instance, in the
canonical ensemble where a system of N particles and volume V is connected to
a heat bath at temperature T . Being in state i we can apply an operation called
“move class” to reach state j. The move class may, for instance, move atoms
or molecules inside a simulation box or flip a single spin. If the probability of
accepting such a move is appropriately chosen, like [1]
a(i → j) = min[1, e^{−β(Ej−Ei)}],    (2.13)
then the states of the resulting Markov chain follow the Boltzmann distribution

pi = Ω̂(Ei) e^{−βEi} / Ẑ(T).    (2.14)
Multiplying the probability q(i → j) to propose a move from i to j by the
probability a(i → j) gives us the transition probability
Γij = q(i → j) · a(i → j)    (2.15)
from which we can assemble the transition matrix Γ.
In order to use a Markov chain of states, the underlying system has to be ergodic,
i.e. every point in state space has to be reachable after a finite number of steps,
and the move class must obey microscopic reversibility, i.e. q(i → j) = q(j → i).
This also implies the detailed balance condition
pi Γij = pj Γji,    (2.16)
where pi and pj are the probabilities of finding a system at temperature T in states
i and j, respectively.
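To make this concrete, the following minimal sketch (in Python; the lattice size, temperature and step count are arbitrary illustrative choices) implements single-spin-flip Metropolis sampling of a small square-lattice Ising ferromagnet using the acceptance probability of eq. (2.13). It is an illustration of the scheme, not the simulation code used in this work.

import numpy as np

def metropolis_ising(L=16, T=2.269, steps=100_000, seed=0):
    """Metropolis sampling of an L x L Ising ferromagnet (J = kB = 1)."""
    rng = np.random.default_rng(seed)
    spins = rng.choice([-1, 1], size=(L, L))
    beta = 1.0 / T
    # Total energy, counting every nearest-neighbor bond once.
    E = -np.sum(spins * (np.roll(spins, 1, axis=0) + np.roll(spins, 1, axis=1)))
    energies = np.empty(steps)
    for t in range(steps):
        # Move class: propose flipping one randomly chosen spin.
        i, j = rng.integers(L, size=2)
        nn_sum = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                  + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * spins[i, j] * nn_sum
        # Acceptance probability of eq. (2.13).
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1
            E += dE
        energies[t] = E
    return energies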
2.2.2 Histogram Reweighting
Histogram reweighting [9, 10] is a technique used to obtain more information out of
the simulation process. During a simulation at temperature T0 a histogram H(E)
is built by counting the visits to energy E. Afterwards, the probability distribution P(E) = H(E)/∑_E H(E) can be reweighted to a different temperature T0 + ∆T.
The distance ∆T can be chosen arbitrarily, keeping in mind, that the quality of
the reweighted histogram depends on the quality of the tails of the distribution
and usually decreases quickly with increasing ∆T .
The probability of observing a system at inverse temperature β = 1/(kB T ) being
in energy E is given by
P̂β(E) = (Ω̂(E)/Z(β)) e^{−βE},    (2.17)
where Ω̂(E) is the exact density of states and Z(β) is the partition function. Then,
the histogram generated during the simulation at an inverse temperature β0 is
H(E) = (NH/Z(β0)) Ω(E) e^{−β0 E},    (2.18)

where Ω(E) is an estimate of the exact density of states Ω̂(E) and NH = ∑_E H(E). Eq. (2.18) can now be inverted to

Ω(E) = (Z(β0)/NH) H(E) e^{β0 E}.    (2.19)
If we replace Ω̂(E) in eq. (2.17) with the estimate Ω(E) from eq. (2.19) and apply
normalization we obtain
Pβ(E) = H(E) e^{−(β−β0)E} / ∑_E H(E) e^{−(β−β0)E},    (2.20)
with β = 1/(kB (T0 + ∆T )). This is the relationship between a histogram measured
at inverse temperature β0 and the estimated probability distribution Pβ for an
arbitrary β.
Instead of β any other order parameter can be used. For instance, in the grand
canonical ensemble, replacing e^{−(β−β0)E} by e^{β0(µ−µ0)N} as well as P (E) and H(E)
by P (N ) and H(N ), we can also reweight the probability of observing a given
particle number P (N ) for different values of the chemical potential µ.
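As an illustration, eq. (2.20) takes only a few lines to implement; the following sketch (function and variable names are our own) works with logarithms to avoid underflow of the exponential factors.

import numpy as np

def reweight(energies, hist, beta0, beta):
    """Reweight a histogram H(E) measured at beta0 to beta, cf. eq. (2.20)."""
    mask = hist > 0
    log_w = np.full(len(hist), -np.inf)
    log_w[mask] = np.log(hist[mask]) - (beta - beta0) * energies[mask]
    log_w -= log_w.max()            # shift for numerical stability
    P = np.exp(log_w)
    return P / P.sum()              # estimated P_beta(E)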
In fig. 2.1 an application of histogram reweighting is shown. For this figure
simulations of a 32 × 32-spins Ising ferromagnet have been performed. A detailed
definition of this system is given in the next chapter in section 3.2. To show the
limits of histogram reweighting a simulation at the critical temperature Tcrit ≈
2.269 K with 500,000 sweeps (1 sweep = 1024 spin flips) has been conducted. The
results are depicted in the left frame of fig. 2.1 together with reweighted data.
One can clearly see that in spite of the large number of spin flips performed the
reweighted data shows increasing fluctuations with increasing reweighting distance.
The right frame shows the heat capacity cV over temperature based on reweighted
data from this simulation (orange line) as well as data from a set of 40 simulations
in the range [1.5, 3.5] K (light blue line). Comparing both, one finds that the
reweighted data is only capable of reproducing the features around the temperature
at which it has been generated.
Figure 2.1: Plots showing histogram reweighting in use. The left figure shows a
normalized energy histogram P (E) of an Ising ferromagnet simulated
at the critical temperature Tcrit ≈ 2.269 K (red line). The dashed
green and blue lines are reweighted from this data to T = 2.369 K and
T = 2.469 K, respectively. The graph on the right shows the specific
heat cV for the same system. The solid orange line is obtained from
a single simulation at the critical temperature using reweighting while
the dashed light blue line is obtained from a set of 40 simulations
distributed over the shown range with ∆T = 0.05 K. The difference is
due to insufficient coverage of state space in the reweighted data.
2.2.3 Multihistogram Method
The combination of histograms of different simulations is called a multihistogram
method [28–30]. It can be used either to reweight the histograms to obtain estimates for observables at different temperatures or to calculate estimates of the
density of states. The basic idea is to perform m Monte Carlo simulations each
having N1 , . . . , Nm measurements at inverse temperatures β1 , . . . , βm , respectively.
The histograms of all simulations are then reweighted to a common reference point
β0 and combined using an error weighted averaging.
For each simulation we obtain a histogram Hi(E) counting the occurrences of energy E. It adheres to ∑_E Hi(E) = Ni and has an approximate statistical error of √Hi(E).

Figure 2.2: Left: Histograms H(N) from simulations of Lennard-Jones particles in the grand canonical ensemble at fixed temperature T = 1 K for a set of different values of the chemical potential µ. The peak at small N corresponds to the gas phase whereas the peak at high particle numbers resembles the liquid phase. The data for this graph has been taken from a previous publication [8]. Right: Probability P(N) of observing a particle number N at µ = −3.4388, calculated by reweighting the data shown in the left frame. The chemical potential has been tuned so that the areas under the two peaks are equal. This corresponds to the point of phase coexistence.

Using this information we can compute the probability distribution at β, following Landau & Binder [27], as
Pβ(E) = [∑_{i=1}^{m} Hi(E)] e^{−βE} / ∑_{i=1}^{m} Ni e^{−βi E − fi},    (2.21)
with the parameters fi being unknown and subject to an iterative procedure:
fi = ln ∑_E Pβi(E).    (2.22)
The easiest, but not necessarily fastest way is to iterate eq. (2.22) until all fi
are converged [31]. To start the iteration, the set of parameters fi is best set to
0. These equations can be simplified by choosing β = 0 [32], i.e. evaluating at
infinite temperature, resulting in an equation for the density of states:
Ω(E) = ∑_{i=1}^{m} Hi(E) / ∑_{i=1}^{m} Ni e^{−βi E − fi},    (2.23)

with fi = ln ∑_E Ω(E) e^{−βi E}.
An example for the application of the above equations can be found in fig. 2.2. In
the left frame histograms Hi (N ) for different chemical potentials µi are shown for a
Lennard-Jones system in the grand canonical ensemble at T = 1. The data for this
graph has been taken from a previous publication [8]. To obtain the probability
P (N ) at phase coexistence, shown in the right frame, eq. (2.23) has been modified
according to
Ω(N) = ∑_{i=1}^{m} Hi(N) / ∑_{i=1}^{m} Ni e^{βi µi N − fi},    (2.24)
to obtain a density of states in particle number. After optimizing the parameters
fi the probability
P(N) = Ω(N) e^{βµN} / ∑_N Ω(N) e^{βµN}    (2.25)
at any µ can be obtained utilizing Ω(N). Phase coexistence is found when the area under the two peaks is equal. Therefore we need to determine an Nmid, being the local minimum of P(N) located between the two peaks. Now an optimization procedure has to be applied, tuning µ such that the condition

∑_{N ≤ Nmid} P(N) = ∑_{N > Nmid} P(N)    (2.26)
is fulfilled.
States within a Markov chain typically exhibit some kind of temporal correlation which decays over time. To incorporate this characteristic relaxation time τi ,
Ferrenberg and Swendsen [29] introduce a weighting factor gi = 1 + 2τi in their
original approach. The probability distribution at β is then
Pβ(E) = [∑_{i=1}^{n} gi^{−1} Hi(E)] e^{−βE} / ∑_{i=1}^{n} Ni gi^{−1} e^{−βi E − fi}.    (2.27)
This weighting may improve the quality of the reweighted distribution, but the procedure has the downside that the relaxation times τi are not necessarily easy to determine.
A description of the application of this method, also called WHAM (weighted
histogram analysis method), to biomolecular systems is presented by Kumar et
al. [33]. They also provide the relation of this method to the umbrella sampling
method (cf. section 2.2.6). A recent variant of the multihistogram method for
density of states calculation has been presented by Fenwick [34].
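For illustration, the iteration of eqs. (2.22) and (2.23) can be sketched as follows (a direct transcription with illustrative names; a production code would work with logarithms throughout to avoid overflow for large systems).

import numpy as np

def wham_dos(energies, hists, betas, n_iter=1000, tol=1e-10):
    """Estimate Omega(E) from m histograms via eqs. (2.22)/(2.23).

    energies : array of bin energies E
    hists    : (m, n_bins) array, hists[i] = H_i(E) measured at betas[i]
    """
    N = hists.sum(axis=1)                       # N_i, samples per simulation
    f = np.zeros(len(betas))                    # parameters f_i, started at 0
    H_tot = hists.sum(axis=0)                   # numerator of eq. (2.23)
    boltz = np.exp(-np.outer(betas, energies))  # e^{-beta_i E}
    for _ in range(n_iter):
        denom = (N[:, None] * boltz * np.exp(-f)[:, None]).sum(axis=0)
        omega = H_tot / denom                   # eq. (2.23)
        f_new = np.log((omega[None, :] * boltz).sum(axis=1))  # eq. (2.22)
        if np.abs(f_new - f).max() < tol:
            break
        f = f_new
    return omega / omega.sum()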
2.2.4 Parallel Tempering
The parallel tempering method [35, 36], also known as replica exchange, is a tempering procedure similar to simulated (or simple) tempering [37, 38]. The idea is
to run several replicas of a system in parallel at different inverse temperatures β.
These should be selected carefully so that the energy distributions for two consecutive temperatures have sufficient overlap. After some Monte Carlo steps a “replica
swap move” between replicas i and j is proposed that may be accepted by
a(i ↔ j) = min[1, e^{(βj−βi)∆E}],    (2.28)
with ∆E = Ej − Ei being the energy difference between energy Ei of replica i and
Ej of replica j. Typically i and j fulfill j = i + 1 [36, 39].
Although the method is called “parallel”, the computation itself does not necessarily have to be run in parallel; the only event involving all replicas at once is the replica exchange move. Nonetheless, parallel simulation is possible but may have a high communication overhead, as exchange moves may be performed as often as every 10 time steps. Furthermore, the steps in between may need different computing times, which inevitably leads to processes waiting for, e.g., an exchange move event.
An enhanced method usable for the grand canonical ensemble has been presented by Yan et al. [39], dubbed hyper-parallel tempering (HPTMC). They create a sequence of replicas with different temperatures and chemical potentials. The
probability for an exchange between two neighboring replicas is then

a(i ↔ j) = min[1, e^{(βj−βi)(Ej−Ei)} · e^{−(βj µj − βi µi)(Nj−Ni)}].    (2.29)
The parallel tempering method inherits the drawbacks of the histogram and
multiple histogram reweighting techniques, e.g. that one needs explicit knowledge
about the system, for example the approximate point of phase transition, to correctly set up the sequence of temperatures, chemical potentials or both. On the positive side, it can overcome large free energy barriers more easily and it is
capable of relaxing the system much faster [39]. Applying the multihistogram
method, we are also able to obtain the density of states.
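A minimal sketch of the swap step of eq. (2.28) between neighboring replicas may look as follows (the per-replica sampling between swap attempts, e.g. Metropolis, is assumed to happen elsewhere; names are illustrative).

import numpy as np

def attempt_swaps(betas, energies, rng):
    """Propose swaps between all neighboring replica pairs, cf. eq. (2.28).

    betas    : inverse temperatures of the replica slots
    energies : current energies; energies[k] belongs to slot k and is
               swapped in place (a full code would swap configurations too)
    """
    for i in range(len(betas) - 1):
        j = i + 1
        x = (betas[j] - betas[i]) * (energies[j] - energies[i])  # exponent of eq. (2.28)
        if x >= 0 or rng.random() < np.exp(x):
            energies[i], energies[j] = energies[j], energies[i]
    return energies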
2.2.5 Multicanonical Sampling and Multicanonical Recursion
Multicanonical sampling is a method independently developed by Berg et al. [11, 12] and Lee [40, 41] (who called it entropic sampling). The equivalence of both descriptions is shown in ref. [42]. Berg et al. presented their algorithm in a generalized context of systems of finite volume V = L^d undergoing a first-order phase transition. Here L denotes a length and d the dimension of the system. They showed that multicanonical sampling can reduce the exponentially fast slowing down to a quadratic one [11]. Later they showed that the exponentially fast increasing tunneling times in the first-order phase transition of a 10-state Potts model [14, 15] can be reduced [12].
Instead of performing multiple simulations at different temperatures and applying multihistogram reweighting, multicanonical sampling achieves combined statistics over a broad energy range in a single run. The main idea of the algorithm is
to weight every state by an approximation of 1/Ω̂(E). This would ideally lead to a
flat energy histogram H(E), i.e. that every energy would be visited equally often.
To achieve this the canonical distribution
Pcan(E) ∝ e^{−βE}    (2.30)
is replaced by the multicanonical distribution
Pmult(E) ∝ e^{−β(E)E−α(E)} = e^{−S(E)},    (2.31)
where S(E) = ln Ω(E) is the entropy, β(E) can be described as an “effective”
inverse temperature and α(E) is a recursively defined parameter. For the algorithm
to work with continuous systems a binning scheme has to be applied that lumps
all energies in a given range [Ei − ∆E/2, Ei + ∆E/2) into one bin Ei. The formulation
containing entropy S(E) is the one chosen by Lee, whereas Berg et al. prefer the
use of an inverse temperature. In the following we will stick to the notion of Berg
et al.
The choice of β(E) is crucial and can be either defined manually [11, 12] or by a
procedure called multicanonical recursion [40, 41, 43–45]. The recursion to refine
β(E) is given by
β0(Ei) = α0(Ei) = 0 ∀i    (2.32)

and βn+1(Ei) = βn(Ei) + ε^{−1} ln[Hn(Ei+1)/Hn(Ei)],    (2.33)
where Hn (E) is the histogram obtained during the nth run and ε = Ei+1 − Ei is
the difference between adjacent energy bins. Eq. (2.32) means that the simulation
starts at infinite temperature. The values for α are updated in every recursion
step according to
α(Ei) = α(Ei+1) + (β(Ei) − β(Ei+1)) Ei+1 with α(EN) = 0,    (2.34)
where Ei and Ei+1 are neighboring bins with E1 ≤ Ei < Ei+1 ≤ EN .
A detailed description on how to treat unreliable histogram entries can be found
in ref. [45]. Instead of using histograms to update the weights Smith et al. [44]
recommend the usage of a transition matrix to achieve faster convergence of the
recursion procedure. To further improve the recursion scheme one may use all the
information obtained up to step n, i.e. H1 (E), . . . , Hn (E). For further reference
the articles of Berg et al. [45–47] are recommended.
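For illustration, one refinement step of eqs. (2.33) and (2.34) can be sketched as follows (assuming equally spaced bins and strictly positive histogram entries; the treatment of unreliable entries described in ref. [45] is omitted).

import numpy as np

def multicanonical_update(E_bins, beta, H):
    """One refinement of beta(E) and alpha(E), cf. eqs. (2.33)/(2.34).

    E_bins : equally spaced bin energies E_1 < ... < E_N
    beta   : current effective inverse temperatures beta_n(E_i)
    H      : histogram H_n(E_i) of the last run (assumed nonzero here)
    """
    eps = E_bins[1] - E_bins[0]
    beta_new = beta.copy()
    beta_new[:-1] = beta[:-1] + np.log(H[1:] / H[:-1]) / eps   # eq. (2.33)
    # Back-substitute alpha from the high-energy end, eq. (2.34).
    alpha = np.zeros_like(beta_new)
    for i in range(len(E_bins) - 2, -1, -1):
        alpha[i] = alpha[i + 1] + (beta_new[i] - beta_new[i + 1]) * E_bins[i + 1]
    return beta_new, alpha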
2.2.6 Umbrella Sampling
Umbrella sampling, introduced by Torrie and Valleau [48, 49], is a method based
on the “overlapping distribution method” from Bennett et al. [28]. Let r denote
the coordinates of an N particle off-lattice system in a volume V . Then, its goal is
to calculate the free energy difference between a “system of interest”, with energy
E(r) at temperature T , and a “reference” system with E0 (r) being at temperature
T0 [50]:
F − F0 = −kB T ln(Z/Z0) = −kB T ln(⟨e^{−β∆E}⟩0),    (2.35)
where F and F0 are the free energy of the systems, ⟨·⟩0 is the ensemble average
taken in the reference system and ∆E(r) = E(r)−E0 (r). To overcome problems in
sampling ∆E Torrie and Valleau chose to sample from a general density function
ρW(r) = W(r) e^{−βE0(r)} / ∫ dr W(r) e^{−βE0(r)},    (2.36)
with W (r) = W (∆E(r)) as a weight function that has to be defined at the beginning of the simulation. Trial moves from state i to state j are therefore accepted
by
a(i → j) = min[1, (Wj/Wi) e^{−β(E0(rj)−E0(ri))}].    (2.37)
One drawback of this method is the choice of the weight function W(∆E), which either requires an understanding of the system at hand or a trial-and-error approach until ρW has the, in this sense, optimal form. A second drawback is the restriction that we can only calculate differences in the free energy between two similar systems.
Absolute free energies can only be calculated if one has accurate knowledge of the
reference system.
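As a minimal sketch, the acceptance rule of eq. (2.37) can be written as follows (the weight function W and all names are illustrative placeholders).

import numpy as np

def umbrella_accept(W, E0_i, E0_j, dE_i, dE_j, beta, rng):
    """Accept or reject a trial move under umbrella weights, cf. eq. (2.37).

    W            : callable weight function W(Delta E)
    E0_i, E0_j   : reference-system energies of current/proposed state
    dE_i, dE_j   : Delta E = E - E0 of current/proposed state
    """
    ratio = (W(dE_j) / W(dE_i)) * np.exp(-beta * (E0_j - E0_i))
    return rng.random() < min(1.0, ratio)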
Virnau et al. [51] presented a method called “successive umbrella sampling”
which aims at calculating the density of states Ω(N ) depending on the particle
number N of the grand canonical ensemble. Their main idea is to create several
overlapping windows from N = 0 to N = Nmax . Then weights are calculated
from the first window and extrapolated onto the next one. This extrapolation is
repeated for all following windows.
In general the umbrella sampling method is usable with Monte Carlo methods, but it has also received broad adoption in the molecular dynamics (MD) community.
One recent development by Wojtas-Niziurski et al. [52] advances the umbrella sampling method by implementing a feedback mechanism that automatically generates
a set of umbrella sampling windows fitting to the system. The authors describe
their algorithm as “self-learning” and use it for the calculation of energy landscapes
in large biomolecular systems.
2.2.7 Broad Histogram Method
The aim of the Broad Histogram Method, invented by de Oliveira et al. [53, 54], is
to use a special random walk dynamics to directly calculate the density of states.
The authors present it using one- and two-dimensional Ising spin systems. The idea
is to classify the possible moves into three categories, i.e. (1) E → E − ∆E, (2)
E → E + ∆E and (3) E → E. Only the first two categories are used for statistics
while the third one is only performed to reduce the correlation between successive
measurements. As the method should perform an unbiased random walk in energy
the acceptance is set up so that moves decreasing the energy are always accepted
and those that increase the energy are accepted with probability Ndown (E)/Nup (E),
where Ndown (E) and Nup (E) are the numbers of possible moves of the classes 1 and
2, respectively. Using
⟨Nup(E)⟩ Ω(E) = ⟨Ndown(E + ∆E)⟩ Ω(E + ∆E),    (2.38)
which can be rewritten as
ln Ω(E + ∆E) = ln Ω(E) + ln[⟨Nup(E)⟩ / ⟨Ndown(E + ∆E)⟩],    (2.39)
makes it possible to directly calculate the density of states Ω(E). Here the averages
⟨·⟩ are over all states s and s′ having energies E and E + ∆E, respectively. In
most cases, we are only interested in differences in ln Ω(E). Hence, one may iterate
eq. (2.39) by setting ln Ω(Emin ) = 0. Then, during the simulation, one only needs
to keep record of the averages ⟨Nup (E)⟩ and ⟨Ndown (E + ∆E)⟩.
The problem with this formulation is that it only covers problems like the 1D Ising model, where |∆E| (besides being 0) only takes one value. For the 2D Ising model, for example, one has the choice either to keep track of ⟨Nup^(4)(E)⟩, ⟨Nup^(2)(E)⟩, ⟨Ndown^(2)(E)⟩ and ⟨Ndown^(4)(E)⟩, corresponding to ∆E ∈ {4, 2, −2, −4}, or to keep track of ⟨Nup(E)⟩^{1/∆E} and ⟨Ndown(E)⟩^{1/∆E}. Taking the first approach one can calculate two Ω(E) by using eq. (2.39). Both should be identical. Regarding the second method, the author states [54] that it gives equal quality compared to the first strategy but may be subject to systematic errors.
Criticism of the above method has been brought up by Wang [55], who shows that the transition rates specified by de Oliveira et al. introduce a systematic error that is especially pronounced in small systems, e.g. up to 20% for an Ising system with lattice size L = 4. As a solution the author proposes a different transition probability which fulfills detailed balance and also gives a flat histogram:

a(E → E′) = min[1, ⟨N(s′, E − E′)⟩_{E′} / ⟨N(s, E′ − E)⟩_E].    (2.40)
Here N (s, E ′ − E) is the number of states s′ having energy E ′ that are reachable
from s by a single spin flip. The averages ⟨·⟩E and ⟨·⟩E ′ are effectively transition
rates between the two energies.
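For the simple case where |∆E| takes a single value, the recursion of eq. (2.39) is straightforward; a sketch with illustrative names, assuming accumulated averages on a uniform energy grid:

import numpy as np

def broad_histogram_dos(n_up, n_down):
    """ln Omega(E) from eq. (2.39), normalized by ln Omega(E_min) = 0.

    n_up   : averages <N_up(E_k)> on a uniform energy grid
    n_down : averages <N_down(E_k)> on the same grid
    """
    ln_omega = np.zeros(len(n_up))
    for k in range(len(n_up) - 1):
        # ln Omega(E + dE) = ln Omega(E) + ln <N_up(E)> / <N_down(E + dE)>
        ln_omega[k + 1] = ln_omega[k] + np.log(n_up[k] / n_down[k + 1])
    return ln_omega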
2.2.8 Nested Sampling
The Nested Sampling (NS) algorithm is a relatively new and rather promising sampling technique. Its roots lie in the Bayesian statistics community, where it was introduced by J. Skilling [56, 57] in 2006. Since then it has been adopted for a wide range of problems including hard spheres [58], Lennard-Jones clusters [59, 60], water molecules [61], the Potts model [62] and protein folding [63]. The original publications by Skilling use the statistical term likelihoods whereas subsequent publications by other authors employ the term energies for obvious reasons. Likelihoods and energies are inversely related.
Nested sampling explores the energy landscape in an unbiased way. It constructs
a series of energy levels spaced equidistantly in ln Ω(E). This is done in a single pass
going from high to low energies. To achieve broad sampling a large number of
walkers N is needed. The error in ln Z, i.e. logarithm of the partition function,
scales as N −1/2 .
The algorithm works as follows:
1. Generate N configurations from a distribution uniform in configuration space.
2. Replace configuration i having the highest energy Emax by a new configuration, which shall be created from one of the remaining configurations with
Ej < Emax . Only configurations with Enew < Emax are accepted. Save Emax
and any observable of interest corresponding to this configuration to disk.
Ideally one would save the whole configuration si .
The second step is repeated until no configuration with a smaller energy is found.
At this point the nested sampling iteration is considered converged.
To obtain a new configuration from one of the remaining ones, Monte Carlo
sampling can be used. Therefore m steps are performed using either uniform
sampling, i.e. accepting every move with an upper energy limit at Emax , or by
performing a N V T (canonical) ensemble simulation (MC or MD) at an inverse
temperature β derived from the energy histogram of the previous iteration [60].
After the procedure has converged, every removed energy Emax^(k), k = 0 . . . n, with n being the number of iterations, is assigned a state space weight wk = α^k − α^{k+1}. Here α = N/(N + 1) is the ratio of state space volumes of two consecutive Emax
values. An estimate of the partition function and of any observable of interest is
then given by
Z(β) = ∑_{k=0}^{n} (α^k − α^{k+1}) e^{−βEmax^(k)}    (2.41)

⟨A(β)⟩ = (1/Z) ∑_{k=0}^{n} (α^k − α^{k+1}) A(sk) e^{−βEmax^(k)}    (2.42)
With every nested sampling iteration the phase space volume shrinks exponentially
by a factor α. Thus it is possible to sample exponentially small regions of the state
space.
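A sketch of this post-processing step, turning the recorded sequence of removed energies into the estimate of eq. (2.41), may look as follows (the sampling loop producing e_max is assumed to exist; names are illustrative).

import numpy as np

def nested_sampling_Z(e_max, n_walkers, beta):
    """Partition function estimate from nested sampling, cf. eq. (2.41).

    e_max     : removed energies E_max^(k), k = 0..n, in order of removal
    n_walkers : number of walkers N used in the run
    """
    k = np.arange(len(e_max))
    alpha = n_walkers / (n_walkers + 1.0)
    weights = alpha**k - alpha**(k + 1)      # state space weights w_k
    return np.sum(weights * np.exp(-beta * np.asarray(e_max)))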
Different parallelization schemes have been proposed. The original author proposed [56] merging the different likelihoods into a single sequence. Burkoff et
al. [63] employed a scheme where p configurations (instead of one) are removed
from the set of configurations and p new configurations are generated in parallel.
To give some numbers, Pártay et al. applied the nested sampling algorithm to
Lennard-Jones clusters. They used a set of N = 300 configurations for clusters of
size 2 − 5 and up to N = 250000 for clusters of size 30 − 40 atoms, corresponding
to 1.6 × 10^6 and 2.6 × 10^12 energy evaluations, respectively [59]. The authors also
present an interesting method for generating energy landscapes from the sampling
data. This method has thereafter been used by Burkoff et al. [63] to sketch the
energy landscape of a polyalanine β-hairpin.
2.3 Wang-Landau Method
In the following the Wang-Landau method shall be introduced briefly. Being a
flat histogram method it has many aspects in common with the multicanonical
method (cf. section 2.2.5). Generalizing, the method performs a random walk in
energy space while directly calculating the density of states Ω(E).
It has been proposed by F. Wang and D. P. Landau [2, 3] in the context of Ising
(2D and 3D) and Potts models, both being discrete models. The extension to
continuous systems has been presented by Yan et al. [64] as well as Shell et al. [65]
in the context of Lennard-Jones systems. They also extended the algorithm to
both the N pT -ensemble (Shell et al.) and the µV T -ensemble (both).
In the following subsection the method will be introduced in detail. Afterwards,
all steps necessary for the extension to the grand canonical ensemble are presented
and appropriate flatness conditions will be discussed. Finally, obstacles and subtleties of the implementation are discussed and some extensions to the method are
presented.
2.3.1 Method
The Wang-Landau method is a flat histogram method for Monte Carlo simulations
that samples a generalized ensemble. In the canonical ensemble it samples a previously defined energy range [Emin , Emax ) uniformly by accepting proposed moves
with the probability
a(Ei → Ej) = min[1, Ω(Ei)/Ω(Ej)],    (2.43)
where Ei and Ej are the energy levels before and after the move. Starting from
an initial guess of the density of states Ω(E) it tries to estimate the exact density
of states Ω̂(E) by modifying Ω(E) in every Monte Carlo step by
Ωt+1(En) = Ωt(En) · f,    (2.44)
where f > 1 is a modification factor and En is the energy after the evaluation of
the acceptance criterion. A histogram H(E) is modified simultaneously according
to
Ht+1(En) = Ht(En) + 1.    (2.45)
If a move generates a state outside the predefined energy range it is rejected and
Ω(E) and H(E) are updated at the previous energy [66].
If no initial guess can be made Ω(E) is set to 1 and H(E) is set to 0 for all E in
the range of interest. In the original publications [2, 3] the modification factor is
set to f0 = e1 at the beginning of the algorithm. It is modified according to
fk+1 = √fk,    (2.46)
when a specific flatness condition is met. Here k counts the number of refinements.
Section 2.3.3 will explain different flatness conditions in detail.
Figure 2.3: Sketch illustrating the insertion (left) and removal (right) moves in the
grand canonical ensemble. The system is coupled to an infinitely large
particle reservoir with chemical potential µ. Insertions are performed
by selecting a random position within the dimensions of the box and
placing a new particle there. Removing a particle is implemented by
selecting a molecule at random and deleting it from the system.
2.3.2 Extension to the Grand Canonical Ensemble
Simulations in the grand canonical ensemble µV T allow the investigation of the
distribution of particle numbers according to a given chemical potential µ. Typically, several simulations with varying µ have to be carried out to find the exact
chemical potential at which the system exhibits phase equilibrium.
Extending the Wang-Landau method to the grand canonical ensemble allows
direct access to several observables depending on the chemical potential at once. To
achieve this Shell et al. [65] proposed the extension of the Wang-Landau algorithm
to continuous systems and conducted simulations in the grand canonical ensemble,
the isothermal-isobaric ensemble and the canonical ensemble. For the µV T and
the N pT ensemble they provided formulae for the acceptance criteria for moves
changing either particle number or volume. They present two sets of formulae, one
using the configurational density of states Ω and one using the excess density of
states Ωex , where for the latter the ideal gas contribution to the density of states
Ωig has been factored out. For simulations they stuck to the excess density of
states, as it gives shorter acceptance formulae. In contrast, Yan et al. [64] provide a
reference for the use of configurational density-of-states simulations. Both methods
differ in the prefactors applied to the acceptance probabilities of the insertion and
deletion (removal) moves. A sketch illustrating the two moves can be found in
fig. 2.3. Using the configurational density of states the acceptance criteria for
particle insertions and removals are
a(N → N + 1) = min[1, V Ω(Ei, N) / ((N + 1) Λ³ Ω(Ej, N + 1))]    (2.47)

and a(N → N − 1) = min[1, N Λ³ Ω(Ei, N) / (V Ω(Ej, N − 1))].    (2.48)
Here Λ denotes the de Broglie thermal wavelength. The excess density of states
acceptance criteria, as given by Shell et al. [65], are
a(N → N + 1) = min[1, Ωex(Ei, N) / Ωex(Ej, N + 1)]    (2.49)

and a(N → N − 1) = min[1, Ωex(Ei, N) / Ωex(Ej, N − 1)].    (2.50)
Configurational and excess density of states are related by
Ω(N, V, E) ∝ (V^N / N!) Ωex(N, V, E),    (2.51)
where the factor V^N/N! is related to the partition function of an ideal gas [67], with N! being attributed to the indistinguishability of the particles. The density of states is updated similarly to the canonical ensemble case:

Ωt+1(E, N) = Ωt(E, N) · f.    (2.52)
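For illustration, the criteria (2.47) and (2.48) can be evaluated in logarithmic form as follows (a sketch with illustrative names; ln_omega is the tabulated current estimate of ln Ω(E, N), indexed by energy bin and particle number).

import numpy as np

def accept_insertion(ln_omega, ei, ej, N, V, Lam, rng):
    """Particle insertion acceptance, cf. eq. (2.47)."""
    ln_ratio = (np.log(V / ((N + 1) * Lam**3))
                + ln_omega[ei, N] - ln_omega[ej, N + 1])
    return ln_ratio >= 0 or rng.random() < np.exp(ln_ratio)

def accept_removal(ln_omega, ei, ej, N, V, Lam, rng):
    """Particle removal acceptance, cf. eq. (2.48)."""
    ln_ratio = (np.log(N * Lam**3 / V)
                + ln_omega[ei, N] - ln_omega[ej, N - 1])
    return ln_ratio >= 0 or rng.random() < np.exp(ln_ratio)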
2.3.3 Flatness
The main idea of the Wang-Landau algorithm is that the histogram H(E), counting the occurrences of energies, should be flat. If we could generate states with a probability ∝ Ω̂^{−1}(E) the histogram would be perfectly flat. But, as we are working with an approximation of Ω̂(E), the histogram H(E) will never be perfectly flat. Thus, Wang and Landau originally proposed the flatness criterion

min_E H(E) / ⟨H(E)⟩ > γ,    (2.53)
with γ ∈ [0, 1]. Typically γ lies between 0.7 and 0.99. This means that for γ = 0.95
every energy has to have at least 95% of the mean number of visits. Depending on
the system investigated the number of time steps needed may become very high
due to this flatness criterion.
One problem with this type of flatness criterion is the treatment of systems where
some energies in the energy range [Emin , Emax ] are never visited. This can happen
for example
a) for the Ising model on a square lattice, where Eground + 4J is never visited¹
b) or for grand canonical systems, where the ground state energy for every
particle number would have to be known.
¹ In the ground state, all spins point either up- or downwards; flipping one spin introduces an energy change of ∆E = 8J.
For systems comparable to a) one can incorporate this knowledge into the previously described criterion, while systems like b) need a different treatment. A criterion that is easy to implement is to check the count of non-zero entries in the histogram. If it stays constant for a given number of steps one can presume that a reasonably large number of states has been visited. Based on this principle some variations can be devised; one is to ensure a minimum number of visits 0 < Hmin ≤ min_E H(E) for every visited state. These and other types [68, 69] of
flatness conditions have been studied by Poulain et al. [70]. Their method applies
some annealing schedule with a repeated increase of f . This is supposed to help
in filling “gaps” encountered in the later course of a simulation.
These “gaps” are a common problem for some systems where a joint density
of states of two macro variables has to be computed. There is the possibility
to discover a state, say {Ei , N }, that has never been visited before after some
refinement steps, e.g. k ≥ 10. The initial value of Ω(Ei, N) is 1. After applying eq. (2.44) the entry becomes Ω(Ei, N) = fk, e.g. fk = e^{(1/2)^10} ≈ 1.00098. The
density of states Ω(Ej , N ) at neighboring states Ej may already be several orders
of magnitude larger. Now evaluating eq. (2.43) results in a very low acceptance
probability which forces the simulation to remain in this state, sometimes for
several thousand or million time steps, until Ω(Ei , N ) is large enough so that an
escape is possible.
2.3.4 Implementation
When implementing the Wang-Landau algorithm one has to be careful about incrementing the density of states. Using the original formulation (cf. eq. (2.44)) would
lead to overflow errors, as the modification factor f is multiplied onto the density of states entries many times. Therefore working with the logarithmic density of
states avoids this problem. The better formulation is then:
ln Ωt+1(E, N) = ln Ωt(E, N) + ln f    (2.54)

ln fk+1 = (1/2) ln fk.    (2.55)

The acceptance probability becomes within this scheme

a(Ei → Ej) = min[1, e^{ln Ω(Ei) − ln Ω(Ej)}].    (2.56)
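Putting the pieces together, a minimal log-space Wang-Landau loop may be sketched as follows (propose_move and the energy binning are model-specific placeholders; the simple flatness test of eq. (2.53) is used and all parameters are illustrative).

import numpy as np

def wang_landau(propose_move, state, E0, E_min, E_max, n_bins,
                ln_f_final=1e-8, gamma=0.8, seed=0):
    """Wang-Landau iteration in log space, cf. eqs. (2.53)-(2.56)."""
    rng = np.random.default_rng(seed)
    ln_omega = np.zeros(n_bins)
    H = np.zeros(n_bins, dtype=np.int64)
    width = (E_max - E_min) / n_bins
    bin_of = lambda E: min(int((E - E_min) / width), n_bins - 1)
    i, ln_f = bin_of(E0), 1.0                 # ln f0 = 1, i.e. f0 = e
    while ln_f > ln_f_final:
        new_state, E_new = propose_move(state)
        in_range = E_min <= E_new < E_max
        j = bin_of(E_new) if in_range else i  # out-of-range moves are rejected
        dl = ln_omega[i] - ln_omega[j]
        if in_range and (dl >= 0 or rng.random() < np.exp(dl)):  # eq. (2.56)
            state, i = new_state, j
        ln_omega[i] += ln_f                   # eq. (2.54)
        H[i] += 1
        if H.min() > gamma * H.mean():        # flatness criterion, eq. (2.53)
            H[:] = 0
            ln_f *= 0.5                       # eq. (2.55)
    return ln_omega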
2.3.5 Analysis of the Original Method
A thorough analysis of the convergence of the Wang-Landau algorithm has first
been presented by Zhou and Bhatt [69]. They found that the fluctuation of the histogram, which is proportional to 1/√(ln f), causes statistical errors. Another
source of error is the correlation between consecutive records to the histogram, which can be reduced by a small f or by interpreting k consecutive steps as a single step. Cluster algorithms also help to decrease the correlation. The authors recommend starting simulations with a large modification factor, e.g. f = e⁴, and reducing it faster, e.g. by dividing ln f by 10.
2.3.6 Extensions to the Original Method
Two extensions to improve the convergence of the Wang-Landau algorithm have
been presented by Zhou et al. [19]. The authors improve the sampling of a joint
density of states of two continuous variables, e.g. Ω(E, M ) for a Heisenberg model,
with M being the magnetization, by modifying the update procedure of eq. (2.44)
and by introducing a global update scheme. Additionally, a bilinear interpolation
is used in the acceptance criterion. Their modified update step is given by
ln Ω(x) = ln Ω(x) + γ k((x − x0)/δ),    (2.57)
where x is a shorthand for (E, M ), x0 is the state at which the random walker
arrived and γ and δ are constants. The kernel function k(x) can either be Gaussian (e^{−|x|²}) or Epanechnikovian ([1 − |x|²]₊) [71]. The global update procedure is given
by
ln Ω(x) = ln Ω(x) + κ exp[−λ/(ln Ω(x) − ω)] Θ(ln Ω(x) − ω),    (2.58)
with Θ being the Heaviside step function. With this update ln Ω is shifted up
by κ where ln Ω(x) > ω. This global update pushes the random walker towards
unexplored regions of the state space. For details on implementing the algorithm
the reader is referred to refs. [19, 72].
Finally, we will discuss an easy-to-implement extension to the Wang-Landau method, called the 1/t-algorithm [73, 74]. The idea is to switch to a time-dependent
f after several refinement steps. But, instead of using the standard WL flatness
criterion, i.e. eq. (2.53), f is already reduced if H(E) > 0 for all E. This is repeated
until
ln f ≤ c/(τ(t))^p,    (2.59)
with c and p being free parameters. The time τ (t) is defined as a function of the
number of time steps t:
τ(t) = t/Nbins,    (2.60)
with Nbins being the number of different energies (or, for continuous systems, the
number of energy bins). From this time step on we set
ln f = F(t) = c/(τ(t))^p    (2.61)
and update it in every step. The histogram H(E) is then not needed anymore.
With this method Ωt(E) approaches the exact value Ω̂(E) asymptotically as t^{−1/2}. The error bounds are discussed in more detail in ref. [75]. The original authors found [74] that the parameters c and p are best set to 1. We will use these
settings if not otherwise stated.
One problem that arises when one wants to implement the 1/t-algorithm for
the grand canonical ensemble is the definition of “time” in eq. (2.60). It contains
the number of different energies Nbins. This means that, when calculating a joint density of states, e.g. Ω(E, N), one would have to know the number of energies visitable for every particle number N. This is by no means trivial, as it implies knowledge of the ground state for every N.
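In the canonical case, where Nbins is known, the schedule of eqs. (2.60) and (2.61) is trivial to implement; a sketch with c = p = 1, to be evaluated once per Monte Carlo step after the switch-over:

def ln_f_one_over_t(t, n_bins, c=1.0, p=1.0):
    """Time-dependent modification factor ln f = F(t), cf. eq. (2.61)."""
    tau = t / n_bins          # eq. (2.60)
    return c / tau**p         # eq. (2.61)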
2.4 ParQ
In this subsection a brief introduction to the Q-method [4], and especially the parQ method [6–8, 76], is given. The Q-method is essentially a transition matrix
method in a coarse-grained, i.e. macroscopic, state space. The method approximates an infinite temperature transition matrix Q while simulating at a finite
temperature. An overview of other uses of transition matrices, either at finite or
infinite temperature, is given in section 2.5.
In the following subsections the algorithm is formulated and details of its implementation are discussed. Finally the extension to the grand canonical ensemble,
already presented in a previous publication by the author [8], is derived.
2.4.1 Method
Following [4, 6, 8] we derive the density of states equation for the parQ method.
The Master equation of a random walker performing a walk in energy space is
given as
p(Ej, t + 1) = ∑_{i=1}^{N} Γij(T) · p(Ei, t),    (2.62)
where the left side of the equation is the probability of the walker being in state
Ej in a phase space of N discrete energies at time t + 1. Γ(T ) is the temperature
dependent transition matrix and is normalized by row:
∑_k Γik(T) = 1 ∀i.    (2.63)
The left eigenvector to the largest eigenvalue 1, i.e. the stationary distribution p∗(E), has to be the Boltzmann distribution:
p∗(Ej) = (1/Z(T)) Ω(Ej) e^{−βEj}.    (2.64)
Thus, for t → ∞ we can apply the stationary distribution to eq. (2.62) and obtain:
Ω(Ej) e^{−βEj} = ∑_{i=1}^{N} Γij(T) Ω(Ei) e^{−βEi}.    (2.65)
Taking the limit T → ∞ gives an eigenvector equation for the density of states:
Ω(Ej) = ∑_i Qij Ω(Ei),    (2.66)
where Q denotes the infinite temperature transition matrix. Having knowledge of
this matrix allows one to calculate the density of states and thus thermodynamic
properties of the system.
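Since eq. (2.66) identifies Ω as a left eigenvector of Q to the eigenvalue 1, it can, for example, be obtained by power iteration; a minimal sketch (the normalization is arbitrary, and the solution methods actually compared in this work are discussed in section 2.5.4):

import numpy as np

def dos_from_Q(Q, n_iter=100_000, tol=1e-12):
    """Left eigenvector of Q to eigenvalue 1, cf. eq. (2.66)."""
    omega = np.ones(Q.shape[0]) / Q.shape[0]
    for _ in range(n_iter):
        new = omega @ Q                # Omega(E_j) = sum_i Omega(E_i) Q_ij
        new /= new.sum()
        if np.abs(new - omega).max() < tol:
            break
        omega = new
    return omega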
2.4.2 Implementation
The basic idea of the Q method is to employ the Metropolis algorithm for simulation and combine it with simulated annealing [77]. The method consists of
two parts: data acquisition and post-processing. During the simulation all proposed Monte Carlo moves are counted in a matrix C which is then turned into
the stochastic transition matrix Q. Thereafter the density of states is obtained by
calculating the eigenvector of the matrix.
Data Acquisition
In a canonical (N V T ) Monte Carlo simulation changes to the system (moves)
are proposed and may be accepted depending on a given temperature. While
the probability to accept such a move depends on the temperature the proposal
probability q(i → j) does not. During the simulation all proposed moves are
counted in a matrix C
C_ij = C_ij + 1,    (2.67)
where i and j are labels for the energy before the move Ei and the energy of the
proposed state Ej , respectively.
For systems with continuous energies or a huge number of discrete states, discretization or lumping is needed, i.e. all energies in the range [E_i − ΔE/2, E_i + ΔE/2)
are lumped into bin i. At the end of the simulation the created counting matrix
is normalized row-wise:
Q_ij = C_ij / Σ_k C_ik ,    (2.68)
which creates the stochastic matrix Q, i.e. the infinite temperature transition matrix.
Simulating at a fixed temperature only samples a narrow part of the state space.
Thus algorithms originally designed for optimization, like simulated annealing [4,
6] or threshold accepting [76], have been used in previous work. Besides these,
other mechanisms to reach a broader sampling over the energy range of interest
[E_min, E_max] are possible.
In the original publication Heilmann et al. [6] applied simulated annealing with
linear and exponential schedule to achieve a broad sampling of the energy space
of an Ising spin glass. Another approach dubbed WL-TM is proposed by Shell et
al. [78], where the Wang-Landau [2, 3] method (cf. section 2.3.1) is used to achieve
broad sampling of Lennard-Jones and Ising systems. Fenwick [79] presented a
third method, using the replica exchange (parallel tempering) method to run the
simulation in parallel at different temperatures, which are selected in a way that
the resulting energy distributions overlap.
As the transition matrix Q needs to have a predefined energy range one needs
to be cautious while handling moves leaving this range. Every move leading to
an Ej outside of our energy range has to be counted to the diagonal element [66]
according to:
C_ii = C_ii + 1.    (2.69)
If a move already starts outside of the energy range, i.e. E_i ∉ [E_min, E_max], then
it is ignored completely.
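In code, the data acquisition reduces to a few lines around the move proposal. The following Python sketch (function and variable names are illustrative assumptions) implements the counting rules of eqs. (2.67) and (2.69) and the normalization of eq. (2.68):

import numpy as np

def record_proposal(C, i, j, n_bins):
    """Count a proposed move from bin i to bin j in the matrix C.
    Moves leaving the predefined energy range are counted to the
    diagonal (eq. (2.69)); moves starting outside it are ignored."""
    if not (0 <= i < n_bins):
        return                      # walker outside [E_min, E_max]: ignore
    if 0 <= j < n_bins:
        C[i, j] += 1                # eq. (2.67)
    else:
        C[i, i] += 1                # eq. (2.69)

def infinite_temperature_matrix(C):
    """Row-normalize the counting matrix, eq. (2.68)."""
    rows = C.sum(axis=1, keepdims=True)
    return np.divide(C, rows, out=np.zeros_like(C, dtype=float),
                     where=rows > 0)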
Grand Canonical Ensemble
In the grand canonical ensemble µVT the assembly of the Q matrix is less intuitive and one has to give some thought to correct sampling, broad state space
exploration across the whole particle number range and the in-memory matrix layout. Contrary to the canonical ensemble, where energy is the only macroscopic
variable, we now have energy E and particle number N. The resulting joint density of states (JDOS) Ω(E, N) is a function of both state variables. Applying the
Figure 2.4: Subsection of a Q matrix of a Lennard-Jones system in the grand canonical ensemble. The matrix indices grow from top left to bottom right,
so that the structure of the plot corresponds to eq. (2.70).
notation of Shell et al. [65] (cf. section 2.3.2) we find that we obtain the excess
density of states, i.e. without the ideal gas contribution.
Considering a system limited to 100 particles with the energy range being split
up into 1000 bins, the resulting transition matrix would have (100 × 1000)² = 10^10
entries. By using the restriction made by grand canonical particle insertion and
removal moves, where only one particle can be added or deleted at a time, we can
reduce the matrix size to a banded block matrix with e.g. 98 × 3 × 1000² + 2 × 2 ×
1000² = 298 × 10^6 entries [8]. Therefore the matrix is structured like:
    ⎛ Q^E_{1,1}   Q^E_{1,2}                                              0 ⎞
    ⎜ Q^E_{2,1}   Q^E_{2,2}   Q^E_{2,3}                                    ⎟
    ⎜             Q^E_{3,2}   Q^E_{3,3}   Q^E_{3,4}                        ⎟
Q = ⎜                   ⋱           ⋱           ⋱                          ⎟  with Q^E ∈ ℝ^{N_E × N_E}.    (2.70)
    ⎜                         Q^E_{n−1,n−2}  Q^E_{n−1,n−1}  Q^E_{n−1,n}    ⎟
    ⎝ 0                                      Q^E_{n,n−1}    Q^E_{n,n}      ⎠
Each pair of energy and number of particles (E, N ) can be seen as a separate
macrostate. Therefore we have transitions from (E, N )i to (E, N )j . Each proposed
transition is counted like in the canonical ensemble. The method is independent
of the acceptance probability of the underlying sampling scheme, as long as the
proposal probability q(i → j) is unchanged.
A plot showing the structure of a real system can be found in fig. 2.4. The data
stems from a grand canonical Lennard-Jones system. Only a subsection of the
matrix from the high density region, i.e. for large N , is shown.
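A hypothetical indexing helper illustrates how such a banded layout can be addressed; the names below are assumptions for illustration, not the layout used verbatim in the simulation code:

def macro_index(e_bin, n, n_bins):
    """Map a macrostate (energy bin, particle number) to a row index of
    the block matrix of eq. (2.70)."""
    return n * n_bins + e_bin

def allowed_transition(n_from, n_to):
    """Grand canonical insertion/removal moves change N by at most one,
    which restricts Q to the tridiagonal block band of eq. (2.70)."""
    return abs(n_from - n_to) <= 1

Because only the three blocks per particle number can be non-zero, it suffices to allocate those blocks instead of the full (N_max · N_bins)² matrix.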
2.5 Transition Matrix Methods
In this section the usage of transition matrices in Monte Carlo simulations is discussed. The first use of transition matrices or transition probabilities in Monte
Carlo data acquisition can be found in the work of Andresen et al. [4]. The Q
method as well as the extension parQ have already been discussed in the previous
section (cf. section 2.4).
In the following, the roots of the transition matrix methods and the Transition
Matrix Monte Carlo (TMMC) method, a sampling algorithm based on an approximated transition matrix, will be presented. More precisely, we can distinguish between
the use of finite-temperature transition matrices and infinite-temperature transition matrices. Afterwards, different usage schemes, data analysis methods and
extensions are discussed.
2.5.1 Transition Matrices at Finite Temperature
Besides the work of Andresen et al. [4] we find the first use of transition probabilities
in Monte Carlo data acquisition in the articles of Smith and Bruce [44, 80]. They
used a transition matrix to improve the weights in a multicanonical recursion
simulation. During one multicanoncial recursion step a histogram matrix C records
transitions according to
C_ij = C_ij + 1   for accepted moves    (2.71)
and C_ii = C_ii + 1   for rejected moves,    (2.72)
where i and j are macrostates, e.g. energies, particle numbers or other order parameters. This updating scheme for the counting matrix corresponds neither to a
finite-temperature nor to an infinite-temperature scheme, but, for completeness, their
method is nevertheless presented here.
The transition matrix Γ is then obtained by
Γ_ij = (C_ij + 1) / Σ_k (C_ik + 1) .    (2.73)
The term +1 in eq. (2.73) is not discussed in either of the two articles. To fill
C the authors propose to repeatedly “release” the system from different starting
points far from equilibrium, e.g. near the ground state, until C is “full”. Finally the
eigenvector is calculated using the broad histogram equation, which is then used
to update the multicanonical weights. Smith and Bruce show that their transition
matrix based scheme of updating the weights of a multicanonical simulation can
be superior to a visited-states approach.
A slightly different approach to using transition probabilities has been taken by
Fitzgerald et al. [81]. The authors use transition probabilities depending on temperature and importance weights to calculate “canonical transition probabilities”,
i.e. Γij (βk ), at one or more inverse temperatures βk that may differ from the actual
β of the simulation. The basic idea is to keep track of a set of counting matrices
C(βk ) of different βk of interest and to update them for every proposed move from
energy Ei to Ej during a simulation at β according to
C_ij(β_k) = C_ij(β_k) + e^{(β−β_k)E_i} r_ij(β_k)    (2.74)
and C_ii(β_k) = C_ii(β_k) + e^{(β−β_k)E_i} [1 − r_ij(β_k)],    (2.75)
where r_ij is given by

r_ij(β) = min[1, e^{−β(E_j−E_i)}].    (2.76)

Γ_ij(β_k) is then calculated as C_ij(β_k) / Σ_l C_il(β_k). This procedure basically corresponds to a reweighting in temperature of the transition matrix instead of a
histogram. In a second article Fitzgerald et al. [82] omit the reweighting procedure in eqs. (2.74) and (2.75) and just record the transitions as

C_ij = C_ij + r_ij(β)    (2.77)
and C_ii = C_ii + 1 − r_ij(β).    (2.78)
They find their method to reduce variance by 6% to 8% compared to the Smith
et al. [44, 80] method (dubbed “empirical transition properties” in Fitzgerald et
al. [82]).
2.5.2 Infinite Temperature Transition Matrices
So far, only transition matrices at finite temperature have been investigated. Wang,
Swendsen and co-workers [5, 83, 84] were the first to work with infinite temperature
transition matrices, besides the work of Andresen et al. [4], and introduced the term
transition matrix Monte Carlo (TMMC). The idea for their algorithm is based on
the corrections of Wang [55] to the broad histogram method (cf. section 2.2.7).
In ref. [83] the algorithm is introduced by investigating an Ising system with
single-spin-flip Glauber dynamics. The matrix elements are computed by
Γ_ij = w(ΔE) ⟨N(σ_i, ΔE)⟩_E ,    (2.79)
where ΔE = E_i − E_j ≠ 0 is the difference in energy, N(σ_i, ΔE) is the number of
cases that the energy is changed by ΔE for a given configuration σ_i and

w(ΔE) = (1/2) [1 − tanh( ΔE / (2 k_B T) )]    (2.80)
corresponds to the Glauber dynamics. The average ⟨N(σ, ΔE)⟩_E is over all configurations σ with E(σ) = E. To obtain ⟨N(σ, ΔE)⟩_E, histograms H_β(E) are built by
running a set of simulations in the canonical ensemble at different temperatures.
It has to be noted that Γ is temperature dependent, but dividing ⟨N(σ, ΔE)⟩_E by
the number of spins N gives the probability for a transition from energy E to
E + ΔE at infinite temperature.
The algorithm presented in ref. [5] differs slightly. The authors propose a three-stage
procedure, where the first stage comprises setting up a random start
configuration. Acceptance rates are either set by prior knowledge or to one and a first
transition matrix Q is created. This stage does not satisfy detailed balance and is
biased towards unvisited states. For the second stage the transition matrix of the
first stage is used with an identity equation (originally called TTT-identities, with
the transition matrix labeled T)
Q_ij Q_jk Q_ki = Q_ji Q_ik Q_kj ,    (2.81)
imposed on it. The acceptance rates are defined by this transition matrix and
satisfy detailed balance. The resulting transition matrix is therefore unbiased.
The third stage is similar to the second one, i.e. the transition matrix from the
previous stage is taken and T T T -identities are imposed, then acceptance rates are
defined by the resulting matrix.
To improve the statistics the authors use the “N-fold way” method [85]. After
each spin-flip every single spin is classified by the ΔE a flip of that spin would contribute. This additional information, correctly weighted, is then accumulated into
the transition matrix.
A more thorough investigation is presented in ref. [84]. Here, the three stage
procedure is given up and the simulation is run by directly updating the averages
⟨N (σ, ∆E)⟩E , i.e. the infinite temperature transition matrix. The authors propose
different sampling schemes and compare them to the multicanonical sampling, i.e.
a(i → j) = min[1, Ω(E_i)/Ω(E_j)],    (2.82)
with exactly known density of states. The recommended acceptance rates are

a(i → j) = min[1, Q_ji/Q_ij]    (2.83)

and a(i → j) = Q_ji / (Q_ji + Q_ij).    (2.84)
In this thesis we will use eq. (2.83) as the acceptance criterion if not stated otherwise.
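As an illustration, a minimal sketch of one TMMC step with the acceptance rule of eq. (2.83), built on running estimates from the counting matrix, could look as follows; the handling of not-yet-observed transitions is an illustrative choice, not prescribed by the original method:

import random

def tmmc_acceptance(C, i, j, eps=1e-10):
    """Acceptance probability of eq. (2.83):
    a(i -> j) = min(1, Q_ji / Q_ij), with Q estimated from C."""
    qij = C[i][j] / max(sum(C[i]), eps)
    qji = C[j][i] / max(sum(C[j]), eps)
    if qij <= 0.0:
        return 1.0        # unseen transition: accept to keep exploring
    return min(1.0, qji / qij)

def tmmc_step(C, i, propose):
    j = propose(i)
    C[i][j] += 1          # count every proposal, as in eq. (2.67)
    if random.random() < tmmc_acceptance(C, i, j):
        return j          # move accepted
    return i              # move rejected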
The density of states can be calculated by determining the left eigenvector of the
transition matrix, but Wang and Swendsen state that the solutions to the eigenvalue problem are numerically unstable. They propose different ways to calculate
the density of states. The first method, already presented by de Oliveira et al., is
the broad histogram equation (cf. eq. (2.39)). As this only uses two diagonals of
the transition matrix the authors recommend a second method, the least squares
minimization of
Σ_{i,j} ( S(E_j) − S(E_i) − ln(Q_ij/Q_ji) )² / σ²_ij ,    (2.85)
where S(E) = ln Ω(E) is the logarithmic density of states, i.e. the entropy, which is
unknown, and σ²_ij is the variance of the Monte Carlo estimate of the term ln(Q(E_i →
E_j)/Q(E_j → E_i)). Constraints may be given by a known ground state degeneracy,
any symmetry in Ω(E) or the total number of states. A third method is to minimize
the term
Σ_{i,j} (Q̂_ij − Q_ij)² / σ²_ij ,    (2.86)
where Q̂ is unknown and subject to optimization and σ_ij is the error of Q_ij. To
perform the minimization of eq. (2.86) additional conditions are required:

0 ≤ Q̂_ij ≤ 1,   Σ_j Q̂_ij = 1    (2.87)

and ∏ Q̂_ij = ∏ Q̂_ji .    (2.88)

These constraints complicate solving eq. (2.86). The authors also report that, although it should give better results, eq. (2.86) gives twice the error of eq. (2.85).
2.5.3 Transition Matrices in the Grand Canonical Ensemble
An extension of the transition matrix method of Fitzgerald et al. [81, 82] to the
grand canonical ensemble has been presented by Errington and co-workers [86, 87].
Instead of tracking transitions in energy, e.g. Ei → Ej , only particle insertions
and deletions, i.e. N → N ± 1, are recorded. The simulation is set up in the
grand canonical ensemble at fixed volume V and temperature T with the chemical
potential µ0 chosen near phase coexistence. Based on eq. (2.78) we obtain an
update scheme for the collection matrix C. For a move leading from microstate sk
to sl , with i = N (sk ) and j = N (sl ) = N (sk ) ± 1, the matrix elements Cij and Cii
are updated according to
C_ij = C_ij + r_kl    (2.89)
and C_ii = C_ii + 1 − r_kl ,    (2.90)
regardless of whether the move is accepted or not. The probability to accept such a move in a
conventional grand canonical ensemble Metropolis Monte Carlo simulation is given
by

r_kl = min[1, π(s_l)/π(s_k)],    (2.91)
with the grand canonical probability function

π(s) = (1/Ξ) · ( V^{N(s)} / (Λ^{3N(s)} N(s)!) ) · e^{−βE(s) + βµ_0 N(s)}    (2.92)
for being in microstate s. Here, Ξ is the grand canonical partition function and
Λ the de Broglie thermal wavelength. Moves, i.e. particle insertions and deletions,
are accepted by
a(k → l) = min[1, ( Π(N(s_k)) π(s_l) ) / ( Π(N(s_l)) π(s_k) )],    (2.93)
where Π(N ) is an estimate of the probability of having N particles in the simulation
box. It is used to bias the simulation such that it has a flat histogram in N . The
transition matrix Γ for a fixed β and µ can be obtained by normalizing C:
Γ_ij = C_ij / Σ_k C_ik .    (2.94)
The macrostate probabilities are then calculated by the broad histogram equation
(cf. eq. (2.39)), i.e. the recursive interpretation of the expression

ln Π(N + 1) = ln Π(N) + ln[ Γ(N → N+1) / Γ(N+1 → N) ].    (2.95)
These probabilities, regularly recalculated, are used in eq. (2.93) to bias the simulation similar to the multicanonical ensemble. Finally, histogram reweighting is
applied to the macrostate probabilities to find the chemical potential µ of phase
coexistence.
This is different in methodology from Wang and Swendsen [5, 83], as the matrix used contains transition probabilities at a given temperature and chemical
potential and the biasing scheme is not purely matrix based but involves repeated
computation of a stationary probability.
Nonetheless, the authors provide further interesting methods and ideas. Concerning a joint density of states, or joint macrostate probability, they are the first
to discuss the idea of recording changes in particle number as well as in energy in
the same matrix. Unfortunately, they desist from using this approach and choose
a combination of a visited-states approach, i.e. histograms, and a grand canonical transition
matrix to obtain Π(N, E). Multihistogram reweighting is then employed to
reweight and combine these Π(N, E) in β and µ.
Additionally, Errington provides hints on implementing an isothermal-isobaric
(N pT ) ensemble. Changes in volume are performed on a logarithmic scale with
ln ∆V chosen such that the transition matrix only consists of three bands. By this
choice only eq. (2.95) is needed to solve for the macrostate probabilities.
In a later article Shen and Errington [88] investigate the fluid-phase behavior of
binary Lennard-Jones mixtures. These mixtures consist of two species, counted
by N1 and N2 , and are simulated in the grand canonical ensemble. The different
types of moves lead to the following transitions:
(N_1, N_2) → (N_1, N_2)            for particle displacements,
(N_1, N_2) → (N_1 ± 1, N_2)        for insertions and removals of species 1,
(N_1, N_2) → (N_1, N_2 ± 1)        for insertions and removals of species 2,
(N_1, N_2) → (N_1 + 1, N_2 − 1)    for identity changes 2 → 1 and
(N_1, N_2) → (N_1 − 1, N_2 + 1)    for identity changes 1 → 2.
Here, the authors are the first to record transitions in a joint transition matrix. To
optimize the sampling of the state space the authors came up with an “isochoric
semigrand ensemble”. Hence, for every Ntot = N1 + N2 of interest a simulation is
performed holding Ntot fixed and performing only particle identity changes. Additionally, “phantom” particle insertions and removals for both species are performed,
i.e. they are recorded but never accepted. The first part of this scheme ensures
that every (reasonable) combination of N1 and N2 is sampled, while the second one
ensures that the different simulations can be combined into one single transition
matrix.
To obtain the joint macrostate probabilities the equation

σ²_tot = Σ_{i,j} √(C_ij C_ji) × [ ln( Γ_ij Π_i / (Γ_ji Π_j) ) ]²    (2.96)
is minimized in Π, where i and j label macrostates, i.e. pairs (N1 , N2 ).
In more recent articles Paluch, Singh and Errington enhanced their method
to incorporate Configurational Bias sampling [89, 90] (cf. section 2.6) as well as
Extended Ensemble sampling [90] and applied it to normal alkanes up to dodecane,
but also to more complex molecules. They also implemented their grand canonical
TMMC method for Configurational Bias sampling in the free MCCCS Towhee
simulation program. For configurational bias sampling the acceptance probabilities
for molecule insertions and removals used to update the collection matrix C are
given by
r(N → N+1) = min[1, ( q(T) V e^{βµ} / (N+1) ) R_{w,new}]    (insertion)    (2.97)

and r(N → N−1) = min[1, N / ( q(T) V e^{βµ} R_{w,old} )],    (removal)    (2.98)
with the Rosenbluth weight R_w and the kinetic contribution to the molecular
partition function q(T) (for atoms q(T) would be 1/Λ³). For accepting the trial
insertion or removal, eq. (2.93) together with eq. (2.92) is still used, with the need to
replace 1/Λ³ by q(T). A detailed explanation of the configurational bias algorithm
and how to obtain the weights R_w is presented in section 2.6.
2.5.4 Solving for the Density of States
Calculating the density of states Ω from transition probabilities may look straightforward at first glance, but choosing the right method for the problem at hand is
not. Several methods have been used throughout the literature. In the following
different versions are discussed.
Broad Histogram Equation
The Broad Histogram equation eq. (2.39) of de Oliveira et al. [53, 54] can be easily
derived from the detailed balance equation:

Ω_i Q_ij = Ω_j Q_ji    (2.99)
⇔ ln Ω_i + ln Q_ij = ln Ω_j + ln Q_ji    (2.100)
⇔ ln Ω_j = ln Ω_i + ln( Q_ij / Q_ji ).    (2.101)
Using this equation with the choice of Ω0 = 1 makes it easy to recursively calculate
Ωj . If the system has more than one |∆E| one needs to choose which pair of
diagonals of the transition matrix to use. In general, using the two minor diagonals
next to the major diagonal is the best choice, as small changes in energy are
more common than larger ones. The drawback of this method is that all other
transitions besides the chosen diagonals are not incorporated into the density of
states calculation.
Minimization of Detailed Balance Deviations
The detailed balance equation gives rise to a second method incorporating a minimization procedure. Having a “perfect” transition matrix, the left and right side
of eq. (2.99) would be equal for every pair of i and j. In reality, i.e. having a
transition matrix obtained by Monte Carlo simulation, both sides may differ by a
Δ_ij:

Δ_ij = ln Ω_i + ln Q_ij − ln Ω_j − ln Q_ji    (2.102)
⇔ Δ_ij = ln Ω_i − ln Ω_j + ln( Q_ij / Q_ji ).    (2.103)

Building the sum of squares over all pairs we obtain

σ²_tot = Σ_{i,j} Δ²_ij / σ²_ij = Σ_{i,j} ( ln Ω_i − ln Ω_j + ln(Q_ij/Q_ji) )² / σ²_ij ,    (2.104)
where we introduced σ²_ij as an estimate of the variance of Δ_ij and use it to weight
the different Δ_ij. This is identical to eq. (2.85) of Wang et al. [84].
The problem at hand can either be interpreted as an overdetermined system
of linear equations or as the minimization of an objective function σ²_tot. The first
interpretation is often called linear least squares where the matrix M and the right
side b of the equation
M · x = b    (2.105)
are given, with M being a rectangular matrix with the number of rows m being
larger than the number of columns n. If the n columns are linearly independent
we can resort to solving
(Mᵀ M) x = Mᵀ b    (2.106)
by an appropriate algorithm. Other options include singular value decomposition
of M and various methods designed for large sparse matrices. We choose the implementation of the GNU Scientific Library [91], which implements a robust iterative
method, and will refer to this method using the term “least squares minimization”.
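For illustration, the following Python sketch sets up the overdetermined system and solves it with numpy instead of the GSL routine used in this work; the gauge fixing ln Ω_0 = 0 is an illustrative choice replacing an explicit normalization constraint:

import numpy as np

def dos_least_squares(Q, sigma2):
    """Solve eq. (2.104) as a weighted linear least squares problem.
    Each pair (i, j) with Q_ij, Q_ji > 0 contributes one row
    ln Omega_j - ln Omega_i = ln(Q_ij / Q_ji), weighted by 1/sigma_ij."""
    n = Q.shape[0]
    rows, rhs = [], []
    for i in range(n):
        for j in range(n):
            if i != j and Q[i, j] > 0 and Q[j, i] > 0:
                w = 1.0 / np.sqrt(sigma2[i, j])
                row = np.zeros(n)
                row[j], row[i] = w, -w
                rows.append(row)
                rhs.append(w * np.log(Q[i, j] / Q[j, i]))
    anchor = np.zeros(n); anchor[0] = 1.0   # gauge: ln Omega_0 = 0
    rows.append(anchor); rhs.append(0.0)
    M, b = np.array(rows), np.array(rhs)
    ln_omega, *_ = np.linalg.lstsq(M, b, rcond=None)
    return ln_omega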
In a more recent article Escobedo et al. [92] introduced an “explicit” method
(referring to the previous method as “implicit”) to solve for the density of states.
The method is somewhere in between the broad histogram equation and the least
squares method of Wang. It is called “explicit” because no minimization problem
has to be solved. Given an order N one has to compute the recursion
S_{i+1} = S_i + ln( H_{i+1} / H_i ) + (σ²/σ²_{i,i+1}) ln( C_{i,i+1} / C_{i+1,i} ) + Σ_{M=2}^{N} [ (σ²/σ²_{M−}) S*_{M−} + (σ²/σ²_{M+}) S*_{M+} ],    (2.107)
where
S*_{M−} = ln( C_{i+1−M,i+1} / C_{i+1,i+1−M} ) + Σ_{k=i+2−M}^{i} ln( C_{k,k−1} / C_{k−1,k} ),    (2.108)

S*_{M+} = ln( C_{i,i+M} / C_{i+M,i} ) + Σ_{k=i+2}^{i+M} ln( C_{k,k−1} / C_{k−1,k} ),    (2.109)

σ²_{M−} = σ²_{i+1−M,i+1} + Σ_{k=i+2−M}^{i} σ²_{k,k−1} ,    (2.110)

σ²_{M+} = σ²_{i,i+M} + Σ_{k=i+2}^{i+M} σ²_{k,k−1} ,    (2.111)

1/σ² = 1/σ²_{i,i+1} + Σ_{M=2}^{N} ( 1/σ²_{M−} + 1/σ²_{M+} )    (2.112)

and σ²_ij = C_ij^{−1} + C_ji^{−1} .    (2.113)
By setting N = 1 we obtain eq. (2.39), i.e. the broad histogram equation. For a
full derivation of the method the reader is referred to reference [92].
Instead of using a least squares minimization, Shen et al. [88] proposed the use
of objective function minimization by the conjugate gradients method to minimize
the expression

σ²_tot = Σ_{i,j} √(C_ij C_ji) [ ln( Q_ij Ω_i / (Q_ji Ω_j) ) ]² ,    (2.114)
which is equal to eqs. (2.85) and (2.104) up to the weighting factor. The gradient
can be determined analytically and is given by

∂/∂Ω_k Σ_{i,j} √(C_ij C_ji) [ ln( Q_ij Ω_i / (Q_ji Ω_j) ) ]² = (2/Ω_k) Σ_i √(C_ik C_ki) ln( Q²_ki Ω²_k / (Q²_ik Ω²_i) ).    (2.115)
Minimization Weights
A variety of different weights have been used in the literature. First, Shell et al. [78]
used

σ²_ij = C_ij^{−1} + C_ji^{−1} + H_i^{−1} + H_j^{−1} ,    (2.116)

which incorporates values from the counting matrix C as well as from the histogram
H. Later, Shen et al. [88] presented the term

σ²_ij = (C_ij C_ji)^{−1/2} ,    (2.117)

which uses the square root of the product of corresponding counting matrix entries.
A third variant for the weight has been used by Escobedo et al. [92]:

σ²_ij = C_ij^{−1} + C_ji^{−1} ,    (2.118)
which is similar to eq. (2.116).
Eigenvector Methods
Methods to calculate the eigenvector are a common means to find the steady state
of a Markov chain. The simplest approach is the power iteration (also known as
von Mises iteration [93])

Ω^(n) = Qᵀ Ω^(n−1) / ‖Qᵀ Ω^(n−1)‖ ,    (2.119)
with ‖·‖ being the Euclidean norm. It is known to be numerically stable and is
guaranteed to converge to the eigenvector belonging to the largest eigenvalue λ₁
(= 1, as Q is a stochastic matrix), but it is also known to converge slowly if the
magnitude |λ₂| of the subdominant eigenvalue is close to |λ₁|.
Its advantage over other methods lies in its simplicity. On the one hand, matrix-vector multiplications, either for dense or sparse matrices, are among the best-optimized operations in high performance computing. They can be performed in a highly
parallel fashion, and can even be implemented on general-purpose graphics processing units
(GPGPUs). On the other hand, especially for large sparse matrices, it is important
that the power iteration requires no changes to the matrix itself or any other kind
of processing that would increase the number of non-zero elements.
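A sparse-matrix sketch of eq. (2.119) in Python (scipy is used here purely for illustration; the convergence test is an illustrative choice):

import numpy as np
import scipy.sparse as sp

def power_iteration(Q, tol=1e-12, max_iter=100000):
    """Power iteration, eq. (2.119): iterate omega <- Q^T omega until the
    eigenvector to the largest eigenvalue 1 of the stochastic matrix Q,
    i.e. the density of states of eq. (2.66), is reached."""
    QT = sp.csr_matrix(Q).T.tocsr()   # keep the transpose sparse
    omega = np.full(QT.shape[0], 1.0 / QT.shape[0])
    for _ in range(max_iter):
        nxt = QT @ omega
        nxt /= np.linalg.norm(nxt)
        if np.linalg.norm(nxt - omega) < tol:
            return nxt
        omega = nxt
    return omega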
One problem that may arise when using this method is that of underflow errors.
Typical implementations rely on double precision numbers. Thus, applying the
power iteration to the transition matrix of a system whose density of states spans
more than ≈ 600 orders of magnitude necessarily leads to an underflow of eigenvector entries. The only way to solve this problem is to use software libraries allowing
for arithmetic of higher precision, e.g. quadruple precision or more. But using
such software increases the computation time by a factor typically much larger
than the increase in precision.
A method to directly calculate the steady state has been presented by Grassmann, Taksar and Heyman (GTH) [94, 95]. The GTH method is a numerically
stable version of Gaussian elimination. It has been investigated in ref. [79],
where it is compared to other eigenvector methods, e.g. the power method, the
ILU(0)-preconditioned power iteration (ILU(0): incomplete LU factorization with
no fill-in of the matrix) and inverse iteration, but also to data obtained
for i ← d to 2 do
    s ← Σ_{j=1}^{i−1} Q_ij
    Q_ii ← −s
    for j ← 1 to i−1 do
        Q_ji ← Q_ji / s
    end
    for k ← 1 to i−1 do
        for j ← 1 to i−1 do
            Q_kj ← Q_kj + Q_ki Q_ij
        end
    end
end

Figure 2.5: The Grassmann, Taksar and Heyman (GTH) [94, 95] version of a numerically stable Gaussian elimination. The algorithm carries out the
factorization I − Q = U L where U and L are the upper and lower part
of Q after the algorithm has been applied. The main diagonal belongs
to L with L_11 = 0 and all main diagonal entries of U are −1.
via the Wang-Landau method. The GTH method carries out the upper-lower decomposition I − Q = U L which is outlined in fig. 2.5. After applying this procedure
the main diagonal of Q belongs to L with L_11 = 0 and the main diagonal elements
of U are −1. The (un-normalized) density of states is then obtained from

Ω_i = Σ_{j=1}^{i−1} Ω_j Q_ji    for i = 2, . . . , d,    (2.120)
with d being the size of the matrix Q ∈ ℝ^{d×d}. To overcome under-/overflows in Ω_i,
eq. (2.120) can be rewritten as [79]

ln Ω_i = ln Ω_{i−1} + ln Σ_{j=1}^{i−1} exp( ln Ω_j − ln Ω_{i−1} + ln Q_ji ).    (2.121)
O’Cinneide et al. [96] give an analysis of the relative error of the resulting
distribution with respect to the relative error of the matrix elements as well as
the size of the problem. They prove by entrywise perturbation theory that the
accuracy of the calculated probabilities is about O(n³)u (specifically at most 9n³u
[97]), where u is the unit round-off in floating-point arithmetic, which is typically
u = 5 × 10⁻¹⁴ for double precision values. Using that proof they also provide
guarantees for a minimum number of accurate digits: e.g. for a 1000-state Markov chain
4 digits and for a 10,000-state system 1 digit are guaranteed to be accurate.
Implementing Minimization Procedures
Implementing the minimization of eq. (2.104) by using either least squares or conjugate gradient methods, one has to make several decisions. If one of the matrix
entries Qij or Qji is zero the logarithm ln(Qij /Qji ) is undefined. We therefore
discard all pairs {i, j} where one of two entries is zero. This symmetrizes the zero
pattern of the matrix. The question remaining is whether we remove these entries
from the counting matrix C and rebuild Q from this symmetrized C or if we just
remove these entries from Q. This decision also stretches out to the weight formulation in eq. (2.116) which uses the histogram H, i.e. the column sum of C, as a
weight.
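The symmetrization itself is a short operation on the counting matrix; in the following sketch the diagonal is kept untouched, which is an illustrative choice:

import numpy as np

def symmetrize_zero_pattern(C):
    """Drop all pairs {i, j} where only one of C_ij, C_ji is non-zero,
    so that ln(Q_ij/Q_ji) is defined for every remaining pair."""
    mask = (C > 0) & (C.T > 0)
    C_sym = np.where(mask, C, 0)
    np.fill_diagonal(C_sym, np.diag(C))   # keep the diagonal counts
    return C_sym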
An example of the resulting transition matrices, i.e. original and symmetrized,
can be found in fig. 2.6. The data for these graphs comes from the simulation of
a Lennard-Jones system with N = 7 particles using Wang-Landau sampling with
parQ (cf. sections 2.3 and 2.4) attached to it. We find that a large set of entries,
especially transitions from low energies to high energies, is removed.
We expect that transitions observed more often than others receive a higher
weight compared to less sampled transitions. But since the weights build the
denominator of eq. (2.104), we find that σ_ij, depicted in fig. 2.7a, is complementary
to fig. 2.6b, with σ_ij being largest at the “border”, i.e. for the least observed
transitions. The density plot shows the weighting term of Shell et al. (eq. (2.116))
obtained from a symmetrized C matrix.
Using the weights and the transition matrices from above, we can calculate the
density of states either using least squares minimization or power iteration. The
results are presented in fig. 2.7b. Finding that the power iteration, when applied
to the original (asymmetric) transition matrix, introduces some kind of artifacts
in the otherwise smooth density of states (labeled “PI”, red solid line), we also
try applying it to the symmetrized Q matrix (“PS”, green dashed line). As a
result, the artifacts are reduced, but the lower energies become overrepresented.
Using the least squares minimization results in a smooth density of states for Q
being constructed either from the asymmetric or from the symmetrized matrix C
(“LS”, blue dashed line). For clarity only the latter is shown in the graph. Using
the Wang-Landau algorithm as the underlying sampling method gives us a second
source for the density of states. For better comparison the Wang-Landau method
data (“WL”, magenta, dash-dotted line) has been offset by a small factor. We find
good agreement between the Wang-Landau data and the results from least squares
minimization.
[Density plots; axes: “Energy from” vs. “Energy to”, logarithmic color scale. (a) Transition matrix before and (b) after symmetrization.]
Figure 2.6: Plots showing structure and entries of a transition matrix. The plots
show the matrix (a) before and (b) after applying a procedure to symmetrize the structure of zero and non-zero entries. The color represents
the values of the entries and is scaled logarithmically. Dark blue areas
have a value of 0, i.e. no transitions were counted. The data stems
from a Lennard-Jones system with N = 7 particles.
For the detailed investigations in chapter 4 we will use three of the presented
methods and variants. First, the power iteration method applied to the asymmetric
Q matrix shall be used, as it is easy to implement and has already proven [6, 8,
79] to give good results. Second, we will use the GTH method for all non-sparse
matrices, as it is very fast and also gives good results [79]. Finally we will make
use of the least squares minimization with weights from eq. (2.116).
2.6 Configurational Bias Sampling
Sampling of large molecule chains like homopolymers can be challenging. Insertion or movement of such molecules can be very tedious, especially in simulations
with high density. Naïvely inserting monomer by monomer into a densely packed
simulation box will lead to nearly no acceptance of the created configuration. By
using the method introduced by Rosenbluth and Rosenbluth [17], an off-lattice algorithm for the biased insertion of chain molecules or parts thereof can be derived.
Siepmann et al. [24] introduced a first version of the configurational bias algorithm by applying a biasing scheme to lattice polymers, shortly followed by Frenkel
et al. [23] who showed the validity and the performance for off-lattice molecules.
[Density plot and graph; axes: “Energy from” vs. “Energy to” and Ω(E) vs. Energy. (a) σ²_ij from eq. (2.116). (b) Comparison of minimization and power iteration.]
Figure 2.7: The density plot (a) shows the weights obtained when applying
eq. (2.116) to the transition data of a seven-particle Lennard-Jones
system. The graph (b) on the right side shows the resulting density of
states when applying power iteration and least squares minimization
to the same transition data. For the power iteration (PI, red solid
line) we find that some artifacts are introduced in the otherwise smooth
density of states. The same is true if we apply the power iteration to
the symmetrized matrix, i.e. to fig. 2.6b (PS, green dashed line). The
least squares minimization is able to find a smooth solution (LS, blue
dashed line), as is the Wang-Landau algorithm (WL, magenta,
dash-dotted line, offset by a small factor). The inset shows the region
around the artifacts in detail.
Introducing such a bias has consequences. In general one can decompose a Monte
Carlo move into a probability q(i → j) for proposing a change from state i to state
j as well as an acceptance probability a(i → j) for accepting the proposed move.
For Metropolis sampling [1] the acceptance probability is given by

a(i → j) = min[1, e^{−β(E_j−E_i)}]    (2.122)

and the transition probability at temperature T is then given by

Γ_ij = q(i → j) · a(i → j).    (2.123)
If we change the proposal probability q(i → j), as is done by the configurational
bias algorithm, one also has to change a(i → j) to obtain correct importance
sampling, i.e. a Boltzmann distribution of configurations. In the following, an
outline of the algorithm for fully flexible polymer chains is presented and a short
overview is given on how a^CB(i → j), i.e. the acceptance probability for a proposed
configurational bias move, is calculated.
We start by assuming that we have already grown l − 1 segments of, for instance,
a polymer. To add another monomer, l, we generate k trial positions b_1 . . . b_k.
The probability to generate a trial position b is then given as

p_l^bond(b) db = C exp(−β u_l^bond(b)) db,    (2.124)

with the volume element db depending on the bond length r, the bond angle θ
and the torsional angle ϕ, defined as

db = r² dr d cos θ dϕ.    (2.125)
For all k trial segments, we compute the external Boltzmann factors exp(−β u_l^ext(b_l))
and select one with probability

p_l^ext(b_n) = exp(−β u_l^ext(b_n)) / w_l^ext(n)    (2.126)

with

w_l^ext(n) = Σ_{j=1}^{k} exp(−β u_l^ext(b_j)).    (2.127)
By “external” we refer to interactions between the newly inserted monomer and
the already inserted parts of the molecule as well as other molecules
already residing in the simulation box. Now segment n becomes the l-th segment
of the trial configuration. This is repeated until the whole molecule is inserted.
Finally, we can calculate the Rosenbluth factor of the chain of length L:

W^ext(n) = ∏_{l=1}^{L} w_l^ext(n).    (2.128)
To calculate the Rosenbluth factor W^ext(o) for the old configuration, the above
procedure is adapted to sample one randomly selected chain. The move is then
accepted with a probability

a^CB(o → n) = min[1, W^ext(n) / W^ext(o)].    (2.129)
For the parQ method we measure the infinite temperature transition matrix
by counting proposed transitions generated with the (temperature-independent)
[Graph: ln Ω(E, N) vs. E/ε for the settings “widom”, “default” and “explicit”.]
Figure 2.8: Comparison of the joint density of states from the parQ algorithm
calculated with (“default” and “explicit”) and without (“widom”) configurational bias insertion. From left to right, the curves belong to
N = 110, 100, . . . , 10. For the single atom Lennard-Jones system investigated, the configurational bias algorithm creates, upon insertion,
multiple trial positions (10 for setting “default” and 5 for “explicit”) of
which one is selected randomly. The introduced bias leads to a shift in
the density of states which grows with the particle number.
probability q(i → j) at finite temperature. As the configurations generated by
configurational bias sampling are proposed with a probability q^CB(i → j) ≠
q(i → j), it is not possible to use it with parQ or TMMC.
This can be easily verified by running the Monte Carlo simulation tool MCCCS Towhee, which I modified to support Wang-Landau sampling and parQ. Results for different settings of the cbmc_settings_style parameter can be found
in fig. 2.8. There, a Lennard-Jones system in the grand canonical ensemble has
been investigated. For each color, the single curves belong to the particle numbers
N = 110, 100, . . . , 10, from left to right. As there are no molecules that could be
grown, it is the insertion of single atoms that is biased by choosing multiple sites
upon each insertion move, of which only one is selected by the above scheme. The
number of trial sites (parameter nch_nb_one) is given as 1, 5 and 10 for the settings “widom”, “explicit” and “default”, respectively. We find that the bias leads
to an over-weighting of larger particle numbers.
2.7 Continuous Fractional Component Monte Carlo
Typically, grand canonical Monte Carlo simulations suffer from low insertion probabilities at high particle or molecule densities. A recent method, called “continuous
fractional component Monte Carlo” (CFCMC), has been presented by Shi et al. [98–100] to
circumvent this problem.
The idea is to gradually insert a molecule driven by an additional parameter.
Insertion of a particle is performed by expanding it using a parameter λ that scales
the energy contribution of the partly inserted molecule with the other molecules
in the box. For instance, the modified LJ potential becomes [101]

V(r) = λ 4ϵ [ 1 / ( (1/2)(1 − λ)² + (r/σ)⁶ )² − 1 / ( (1/2)(1 − λ)² + (r/σ)⁶ ) ].    (2.130)
The modified potential is finite for r → 0 as long as λ ≠ 1. It has the correct
behavior at λ = 0 (particle/molecule has no interaction) and λ = 1 (particle/molecule is completely inserted). Only inter-molecular interactions are scaled,
but for molecules the scaling can either be applied to the full molecule or on a
per-atom basis. To the set of Monte Carlo moves an additional move has to be
added that changes λ by

λ_new = λ_old + δλ,    (2.131)

with δλ being uniformly chosen from [−Δλ, Δλ]. The value of Δλ is chosen such
that λ moves have an acceptance rate of about 50%. Changes in λ are accepted
by
a(i → j) = min[1, e^{−β(E_inter(j)−E_inter(i)) + η(λ(j)) − η(λ(i))}],    (2.132)

if λ stays within (0, 1), where E_inter is the inter-molecular energy and η is a biasing
function. If λ exceeds unity, the move is accepted by a modified grand canonical
insertion probability given by
a(N → N+1) = min[1, ( f βV / (N+1) ) e^{η(λ(j))−η(λ(i))} e^{−β(E_inter(j)−E_inter(i))}],    (2.133)
where f is the fugacity. In case of acceptance, the fractional molecule becomes an
integral part of the system, N is increased by 1 and a new fractional molecule is
added to the box using the remainder λ − 1 as its new value of λ. In case λ
decreases below 0, the move is accepted according to

a(N → N−1) = min[1, ( N / (f βV) ) e^{η(λ(j))−η(λ(i))} e^{−β(E_inter(j)−E_inter(i))}],    (2.134)
and, in case of acceptance, the fractional molecule is removed and a new fractional
molecule is selected among the N remaining molecules.
CFCMC has proven to achieve higher insertion acceptance rates than CBMC [100]
and may eventually replace CBMC as the standard tool for grand canonical ensemble
simulations. One thing it has in common with the CBMC method is that it
cannot be used with parQ, as it changes the probabilities with which insertions and
removals are proposed.
3 A New Benchmark for Density of States
Methods
In this chapter we develop a new test bed for Monte Carlo algorithms which calculate the density of states. It should allow for an objective and quantitative comparison. Hence, the quantity of interest is the deviation of the density of states from
its exact value. To judge the performance of an algorithm we have to look at the
development of this quantity with the number of time steps. Therefore we need
to employ models providing an exact density of states.
Looking at previous studies we find that new methods for density of states
estimation are, at best, compared to one competitor. In most cases a simple
Ising system is employed. The well-studied and well-understood Wang-Landau
method [2, 3, 69] (cf. section 2.3) was presented using the Ising model with no
discussion of how the deviation of the density of states from its exact value develops
over time. This was left to later publications, e.g. by Shell et al. [78] or Zhou
et al. [69]. The latter was also the basis of the Wang-Landau 1/t method of
Belardinelli et al. [73, 74]. There, extensive studies of both the standard and 1/t
variants were performed and the development of the error with the number of
time steps analyzed.
For the TMMC method (cf. section 2.5.2 and eq. (2.83)), an infinite temperature
transition matrix method using the matrix entries as sampling weights, the most
thorough study has been conducted by its inventors Wang and Swendsen [84]. The
authors investigated several acceptance formulations as well as the multicanonical
method and the Wang-Landau method using the Ising model. To calculate the
density of states from the transition matrices only the minimization of detailed
balance deviations has been employed, neglecting existing eigenvector methods.
Also, no comparison of the development of the error in the density of states over
time for the different methods is given. Regarding eigenvector calculation for
transition matrices, we find a detailed study by Fenwick [79], unfortunately lacking
a comparison with the method of minimizing the detailed balance deviations.
Based on the literature, it is fair to say that a thorough comparison between
Wang-Landau and transition matrix based methods and combinations thereof is
lacking. To design such a benchmark we need to employ models of which the exact
density of states is known. Therefore we first develop a new model system, dubbed
fully adjustable benchmark (FAB) system, which can be customized with regard
to the number of macrostates, microstates as well as the number of connections
between microstates. Further, the exact density of states as well as the exact
transition matrix are known. Having knowledge of the exact transition matrix
allows for a more detailed study of transition matrix based methods. The second
model we will use for comparison is the Ising model, which is commonly used in
literature for testing and evaluating algorithms. Its exact density of states can be
calculated analytically. As a third model we will use a two-particle Lennard-Jones
system for which we also can calculate the exact density of states.
After describing the three models, which will be used in the next chapter for
comparing a variety of algorithms, we will present and discuss existing as well
as new simulated annealing schedules. Simulated annealing is the basis of the
original formulation of the parQ algorithm. Until now, only linear and exponential
schedules have been used in this context. This advanced set of schedules will
successively be investigated in chapter 4.
3.1 Development of a Fully Adjustable Benchmark
System
To gain deeper insight into the inner workings of different sampling algorithms we
developed a new model system where we have control over every parameter and
also have exact knowledge about the density of states as well as the transition
matrix. The number of different macrostates N , their energies Em = m, the
density of states Ωm and the connections in-between can be specified. The number
of microstates M is adjusted by the number of macrostates and the density of
states. We start with a parabolic density of states in the form
(
)2
N +1
N 2 + 2N + 1
Ω(E) =
(3.1)
−E +
,
2
4
which is designed such that Ω(E_m) ∈ ℕ for E_m ∈ {1, . . . , N}; for integer energies
eq. (3.1) simplifies to Ω(E) = E (N + 1 − E). Using the greatest
common divisor, denoted by gcd(·), we obtain
Ω_m = Ω(E_m) / gcd( Ω(E_1), . . . , Ω(E_N) ),    (3.2)
with the aim to reduce the total number of microstates while maintaining the
parabolic form. Hence, we can calculate the number of microstates by
M = Σ_{i=1}^{N} Ω_i .    (3.3)
Figure 3.1a shows the discrete density of states of a system having N = 6
macrostates and M = 28 microstates. Additionally, eq. (3.1) is shown, from which
the discrete density of states is derived.
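The construction of eqs. (3.1) to (3.3) can be written compactly; the sketch below uses the simplified form Ω(E) = E(N + 1 − E) noted above:

from math import gcd
from functools import reduce

def fab_density_of_states(N):
    """Discrete FAB density of states, eqs. (3.1)-(3.3)."""
    omega = [E * (N + 1 - E) for E in range(1, N + 1)]   # eq. (3.1)
    g = reduce(gcd, omega)
    omega = [w // g for w in omega]                      # eq. (3.2)
    M = sum(omega)                                       # eq. (3.3)
    return omega, M

# N = 6 reproduces the example of fig. 3.1:
# omega = [3, 5, 6, 6, 5, 3] and M = 28.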
Finally, we need to define a fixed number of connections per microstate K. The
connections shall be made in a random manner. These requirements describe a
graph, known in the literature as K-connected random graph or random graph
with prescribed degree sequence [102]. Using the parameter K we have full control
over the connectivity of the microstates. The choice of a random graph is not
mandatory, other strategies for connecting the microstates are possible.
The random graph connecting the microstates is created with the gengraph tool
of Viger et al. [102]. It has been slightly modified to deliver reproducible graph
configurations by adding a program parameter that specifies the random number
generator seed. The graph structure for the above example with K = 3 connections
per microstate can be found in fig. 3.1b.
An exemplary graph for a system of N = 4 macrostates and M = 10 microstates,
represented as an adjacency matrix G, is given by

    ⎛ 0 1 1 1 0 0 1 0 0 0 ⎞
    ⎜ 1 0 0 1 0 0 0 1 0 1 ⎟
    ⎜ 1 0 0 0 1 0 0 1 0 1 ⎟
    ⎜ 1 1 0 0 0 0 1 0 1 0 ⎟
G = ⎜ 0 0 1 0 0 1 0 0 1 1 ⎟ .    (3.4)
    ⎜ 0 0 0 0 1 0 1 1 1 0 ⎟
    ⎜ 1 0 0 1 0 1 0 1 0 0 ⎟
    ⎜ 0 1 1 0 0 1 1 0 0 0 ⎟
    ⎜ 0 0 0 1 1 1 0 0 0 1 ⎟
    ⎝ 0 1 1 0 1 0 0 0 1 0 ⎠
Having M = 10 microstates in total the first and last two indices correspond to the
energies E1 = 1 and E4 = 4 respectively, whereas the indices three to five belong
to energy E2 = 2 and the indices six to eight to energy E3 = 3. Using a lumping
matrix

    ⎛ 1 0 0 0 ⎞
    ⎜ 1 0 0 0 ⎟
    ⎜ 0 1 0 0 ⎟
    ⎜ 0 1 0 0 ⎟
L = ⎜ 0 1 0 0 ⎟    (3.5)
    ⎜ 0 0 1 0 ⎟
    ⎜ 0 0 1 0 ⎟
    ⎜ 0 0 1 0 ⎟
    ⎜ 0 0 0 1 ⎟
    ⎝ 0 0 0 1 ⎠
[Plots: discrete density of states Ω vs. energy E_m together with eq. (3.1), and the graph of connected microstates.]
Figure 3.1: Density of states from eq. (3.1) and the corresponding discretization
from eq. (3.2) are shown on the left. On the right side the different
microstates with their connections are shown. The colors correspond
to the bins in the diagram.
[Graph: heat capacity c_V vs. T/K for N = 4 and N = 12.]
Figure 3.2: Heat capacity cV of the FAB systems with N = 4 and N = 12
macrostates. The temperatures of the mode are Tmode = 0.5348 K
and Tmode = 0.8493 K respectively.
we can calculate the unnormalized transition matrix

Q_un = Lᵀ G L = ⎛ 2 3 2 1 ⎞
                ⎜ 3 2 3 4 ⎟
                ⎜ 2 3 6 1 ⎟ .    (3.6)
                ⎝ 1 4 1 2 ⎠

Finally, the normalized infinite temperature transition matrix

Q = ⎛ 1/4  3/8  1/4  1/8  ⎞
    ⎜ 1/4  1/6  1/4  1/3  ⎟
    ⎜ 1/6  1/4  1/2  1/12 ⎟ ,    (3.7)
    ⎝ 1/8  1/2  1/8  1/4  ⎠

with the eigenvector corresponding to the largest eigenvalue being

Ω(E) = ( 1, 3/2, 3/2, 1 ),    (3.8)
is obtained. As the eigenvector Ω represents the density of states, we can, for
instance, calculate the heat capacity by

c_V(T) = (1 / (k_B T²)) ( ⟨E²(T)⟩ − ⟨E(T)⟩² )    (3.9)
with

⟨E(T)⟩ = (1/Z(T)) Σ_E E Ω(E) e^{−βE},    (3.10)
⟨E²(T)⟩ = (1/Z(T)) Σ_E E² Ω(E) e^{−βE},    (3.11)
and Z(T) = Σ_E Ω(E) e^{−βE}.    (3.12)
The heat capacities for N = 4 and N = 12 are depicted in fig. 3.2. The respective temperatures of the mode are T_mode = 0.5348 K and T_mode = 0.8493 K.
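The whole chain from the adjacency matrix to the heat capacity can be verified numerically; the following sketch (with illustrative helper names) follows eqs. (3.6) and (3.9) to (3.12):

import numpy as np

def lump(G, groups):
    """Unnormalized macrostate transition matrix Q_un = L^T G L, eq. (3.6).
    groups is a list of index lists, one per macrostate."""
    L = np.zeros((G.shape[0], len(groups)))
    for m, idx in enumerate(groups):
        L[idx, m] = 1.0
    return L.T @ G @ L

def heat_capacity(omega, energies, T, kB=1.0):
    """c_V from eqs. (3.9)-(3.12)."""
    beta = 1.0 / (kB * T)
    w = omega * np.exp(-beta * energies)
    Z = w.sum()                            # eq. (3.12)
    E_mean = (energies * w).sum() / Z      # eq. (3.10)
    E2_mean = (energies**2 * w).sum() / Z  # eq. (3.11)
    return (E2_mean - E_mean**2) / (kB * T**2)   # eq. (3.9)

# FAB example with N = 4: Omega = (1, 3/2, 3/2, 1), eq. (3.8)
omega = np.array([1.0, 1.5, 1.5, 1.0])
energies = np.array([1.0, 2.0, 3.0, 4.0])
print(heat_capacity(omega, energies, T=0.5348))   # close to the mode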
3.2 Ising Model
The Ising model [13] is one of the best studied models in theoretical physics. Besides its properties being well known, it provides highly complex dynamics.
Thus it is commonly used as a tool for comparing and studying Monte Carlo algorithms.
[Graph: Ω(E) vs. E/J on a logarithmic scale.]
Figure 3.3: Density of states for the 16 × 16 two-dimensional Ising ferromagnet.
For |E| = 508 the density of states is 0, leading to jumps at these
energies.
The energy of a system with nearest neighbor interactions is given as

E(σ) = −J Σ_{⟨i,j⟩} σ_i σ_j − H Σ_j σ_j ,    (3.13)
where the first sum is over pairs of neighboring spins, J is the interaction and H the
external field. The interaction is either ferromagnetic (J > 0), anti-ferromagnetic
(J < 0) or noninteracting (J = 0). The individual spins σi can only take the values
1 (up) and −1 (down). Here we study a two dimensional ferromagnet on a square
lattice of size L without an external field (H = 0) and with periodic boundary
conditions applied.
For the two-dimensional ferromagnetic model the exact density of states is known
analytically [103] and can be calculated using Mathematica. The program used
to obtain reference values is provided by Beale et al. [103, 104]. In fig. 3.3 the
normalized density of states for a system of size L = 16 from the ground state
energy EG = −512 to the anti-ground state energy EAG = 512 is shown.
Simulations are initialized by giving the spins on the lattice a random orientation.
Then, to create a dynamics, single spin-flips are used.
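For the single-spin-flip dynamics only the local energy change is needed; a sketch for the periodic square lattice of eq. (3.13) with H = 0:

import numpy as np

def delta_E_flip(spins, i, j, J=1.0):
    """Energy change of eq. (3.13) (H = 0) for flipping spin (i, j)
    on a periodic L x L lattice."""
    L = spins.shape[0]
    nn = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
          + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
    return 2.0 * J * spins[i, j] * nn

# random initial orientation, as used to initialize the simulations
rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=(16, 16))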
3.3 Lennard-Jones Model
In the previous sections two different discrete systems were presented. To enhance
our benchmark we also want to study the properties of the Monte Carlo algorithms
under investigation using continuous systems. Therefore, we need a model that is
simple on the one hand, but also rich in features.
The Lennard-Jones potential, given by

V(r_ij) = 4ϵ [ (σ/r_ij)¹² − (σ/r_ij)⁶ ],    (3.14)
provides such a model system. Starting with a simple two-particle system, we
can rapidly increase the complexity by either increasing the particle count or by
employing a different ensemble like the grand canonical ensemble. It is capable of
reproducing the complete thermodynamic behavior of classical fluids, while being one
of the simplest models available. The Lennard-Jones system exhibits solid-liquid
[105] and vapor-liquid [106] phase transitions as well as a Mackay/anti-Mackay
transition [59, 107] at low temperatures. The latter can be observed for clusters of
different sizes [108–110]. Thereby the cluster overlayer melts, changing its Mackay
packing [111] to a less dense liquid-like packing.
The Lennard-Jones potential V(r) used here and in the simulations below is
cut off at radius rc = 2.5σ, if not otherwise stated. To allow a comparison with
literature data, e.g. Yan et al. [64], the potential is neither shifted by the energy
at the cut-off radius V(rc ) nor tail-corrected.
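As used here, the pair potential is simply set to zero beyond the cut-off; a minimal sketch of eq. (3.14) with this truncation:

def lj_potential(r, eps=1.0, sigma=1.0, rc=2.5):
    """Truncated Lennard-Jones potential, eq. (3.14): neither shifted
    at the cut-off nor tail-corrected."""
    if r >= rc * sigma:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)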
One feature that eases algorithm comparison is that the exact density of states
for the two-particle system is known [65]. It can be calculated analytically [8].
Therefore we place one of the two atoms in the center of our box of volume V = L³
and build the function

Π(E) = ⎧ L³ − (4/3)√2 π σ³ √( (−ϵ + √(ϵ(E+ϵ))) / E )                                        E > 0
       ⎪ (π/6) ( L³ − 8σ³ )                                                                 E = 0
       ⎨ (π/6) ( L³ − 8√2 σ³ √( (−ϵ + √(ϵ(E+ϵ))) / E ) )                                    V(r_c) < E < 0
       ⎩ (4/3)√2 π σ³ ( √( −(ϵ + √(ϵ(E+ϵ))) / E ) − √( (−ϵ + √(ϵ(E+ϵ))) / E ) )             E ≤ V(r_c),    (3.15)
which is the total volume in which the second particle can be placed such that the
energy is less than E. V(rc ) is the interaction energy at the cut-off radius rc . The
derivative of this function with respect to E gives us the density of states:

Ω(E) = ⎧ L³ (1 − π/6) δ(E)                                                                                E = 0
       ⎨ ( √2 π σ³ / (3E²) ) ( E + 2ϵ − 2√(ϵ(E+ϵ)) ) √( (ϵ + √(ϵ(E+ϵ))) / (E+ϵ) )                         (V(r_c) < E < 0) ∨ (E > 0)
       ⎩ ( √2 π σ³ / (3E²) ) [ ( E + 2ϵ − 2√(ϵ(E+ϵ)) ) √( (ϵ + √(ϵ(E+ϵ))) / (E+ϵ) )
                               + ( E + 2ϵ + 2√(ϵ(E+ϵ)) ) √( (ϵ − √(ϵ(E+ϵ))) / (E+ϵ) ) ]                   E ≤ V(r_c).    (3.16)
Here δ(E) is the Dirac δ-function, which results from the differentiation of the jump
in Π(E) at E = 0. Thus we have to be careful when comparing measured values
to eq. (3.16) at and around the point E = 0. For a complete derivation the reader
is referred to a previous publication [8].
3.4 Schedules for Simulated Annealing
The original idea of parQ is the coupling of a simulated annealing schedule to a
conventional Metropolis Monte Carlo simulation. Heilmann et al. [6, 7] proposed
to use an exponential or a linear schedule. One of their conclusions was that other
annealing schedules may exist which might deliver results of higher quality.
Connected with this problem is the question of how to sample transitions such that the transition
matrix is filled “optimally”. An annealing schedule that spends more time
at high temperatures naturally samples more transitions at high energies, and
one spending more time at low temperatures samples low energies better.
Heilmann concludes [7] that an optimal schedule would produce “equally small
relative errors in the whole energy region”. He imagines that such a schedule
would be made self-adapting.
In addition to the exponential and linear schedule studied before [6, 7, 77]
we present a set of temperature schedules of which some can be found in literature [112]. Each annealing schedule will be assigned a short handle for easier
reference:
lin  The linear schedule, defined as

T(t) = T_0 − ( (T_0 − T_min) / t_max ) t,    (3.17)

reduces the temperature from an initial temperature T_0 to T_min linearly
in time. The parameter T_min (0 ≤ T_min < T_0) is introduced so that it is
possible to set a minimum temperature.
sine  If we imagine that a conventional simulated annealing optimization, i.e.
with monotonically decreasing temperature, converges to the ground state
by the end of a run, we might have a problem if the ground state is
highly degenerate. We hence introduce a sine schedule that repeatedly reduces the temperature to 0 in the hope of visiting several deep-lying
minima or degenerate ground states. The schedule is given as

T(t) = (T_0/2) ( sin( A 2π t/t_max ) + 1 ),    (3.18)

where A is the number of oscillations within t_max time steps. The initial
and final temperature is T_0/2 with maxima at T_0.
Another choice would be a sawtooth function

T(t) = T_0 ( ⌊A t/t_max⌋ − A t/t_max + 1 ),    (3.19)

where ⌊·⌋ is the floor function. This also follows the idea of repeatedly
reducing the temperature to eventually reach multiple minima. But
our second requirement is to sample the state space region of the phase
transition multiple times. Thus we will only work with the sine schedule
which, contrary to the sawtooth function, crosses the phase transition in
both directions, i.e. from unordered to ordered (or from “hot” to “cold”)
and backwards.
logsine  Similarly, a log-periodic schedule can be designed by replacing the
fraction in eq. (3.18) by its corresponding logarithmic term:

T(t) = (T_0/2) ( sin( A 2π log t / log t_max ) + 1 ),    (3.20)

where we assume that the time t starts at 1 and is dimensionless. With
increasing time t the actual frequency of the oscillations reduces.
The error, at best, follows a power-law behavior ∝ t^(−1/2). Having an
oscillating schedule linear in time, e.g. sine, we would find most oscillations in the last 90% of simulation time. Using log-periodic oscillations,
we might be able to imitate the power-law behavior and reduce the error
with every magnitude of simulation time.
exp  The exponential schedule, given as

T(t) = T_0 α^t,    (3.21)

together with the linear schedule, is one of the first schedules used [77].
The usual implementation is to repeatedly lower the temperature by the
constant factor α (0 < α < 1). Here we employ a continuously changing T.
power  A schedule following a power law might also be of interest. It is defined
as

T(t) = T_0 t^γ,    (3.22)

with γ = log(T_min/T_0) / log t_max.
log  A logarithmic schedule [112], starting at a temperature T_0, can be
defined as

T(t) = T_0 − (T_0 − T_min) log t / log t_max,    (3.23)

where one should take care that the time t starts at 1.
Of the schedules presented above, the sine, logsine and power schedules have not
been found in the literature.
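For reference, the schedules of eqs. (3.17) to (3.23) can be collected as plain functions of the time step; the parameter defaults in this Python sketch are illustrative, and t is assumed to start at 1 for the logarithmic variants:

import math

def make_schedules(T0, Tmin, tmax, A=10, alpha=0.9999):
    """The annealing schedules of eqs. (3.17)-(3.23) as functions of t."""
    gamma = math.log(Tmin / T0) / math.log(tmax)   # exponent of eq. (3.22)
    return {
        "lin":     lambda t: T0 - (T0 - Tmin) * t / tmax,
        "sine":    lambda t: T0 / 2 * (math.sin(A * 2 * math.pi * t / tmax) + 1),
        "saw":     lambda t: T0 * (math.floor(A * t / tmax) - A * t / tmax + 1),
        "logsine": lambda t: T0 / 2 * (math.sin(A * 2 * math.pi
                                                * math.log(t) / math.log(tmax)) + 1),
        "exp":     lambda t: T0 * alpha ** t,
        "power":   lambda t: T0 * t ** gamma,
        "log":     lambda t: T0 - (T0 - Tmin) * math.log(t) / math.log(tmax),
    }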
4 Benchmarking Density of States Methods
In this chapter strengths and weaknesses of a selection of algorithms from chapter 2
are investigated. The selection consists of algorithms that are compatible with the
parQ method, e.g. the Wang-Landau method and Metropolis method, but also
includes the TMMC method. These algorithms have not yet been subject to a
combined comparison. For our benchmark we employ the models presented in the
previous chapter. Throughout this chapter, a set of abbreviations for the different
methods is used:
BM Sampling is performed with standard Boltzmann/Metropolis method
(see section 2.2.1) at a fixed temperature T . During the simulation a
parQ transition matrix is recorded.
SA Sampling is performed using the simulated annealing method (see section 3.4). During the simulation a parQ transition matrix is recorded.
WL Density of states is generated using Wang-Landau sampling with flatness
parameter γ (see section 2.3).
pqWL Density of states is calculated from a transition matrix obtained during
a WL run.
WL1t Density of states is obtained from Wang-Landau sampling with 1/t
method (see section 2.3.6).
pqWL1t Similar to pqWL, the density of states is calculated from the transition
matrix recorded during a Wang-Landau simulation with 1/t algorithm.
TM Transition Matrix sampling (also called TMMC) by Wang and Swendsen
using the acceptance rule from eq. (2.83).
Assume one has to investigate a black-box system, i.e. a Monte Carlo system
not investigated before, and to find its density of states. The most crucial choice
to make is that of the algorithm to be used. With algorithms varying widely in
their appropriateness for a given problem, in the number of parameters one has
to set and in the quality that is reachable after some amount of computing time,
this is a hard choice to make. With modern parallel computers and general purpose graphics processing units another aspect has to be considered: parallelization.
After selecting an algorithm one is left with the choice of proper values for the
parameters controlling the outcome of the simulation. As some parameters have
a larger influence on the quality than others, one should know the implications
arising from each.
In order to judge the quality of the results produced by the different algorithms
we need to define a measure for the error. Having either a discrete system or a continuous system, appropriately lumped into N_E bins, we can calculate a mean relative error per bin
\xi[\Omega(E)] = \frac{1}{N_E} \sum_E \frac{|\Omega(E) - \hat{\Omega}(E)|}{\hat{\Omega}(E)}, \qquad (4.1)
if we have access to an exact reference value, e.g. an exact density of states Ω̂(E).
Computing ξ for every time step we can judge how an algorithm performs with
increasing runtime. But, as single runs do fluctuate quite a bit, a better approach
is to consider the average of ξ over independent runs:
\langle \xi[\Omega_t(E)] \rangle = \frac{1}{N_\mathrm{runs}} \sum_{k=1}^{N_\mathrm{runs}} \xi\big[\Omega_t^{(k)}(E)\big], \qquad (4.2)
where the subscript t indicates the time step.
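As an illustration, the two error measures translate into a few lines of NumPy; the same pattern applies to the matrix error ζ[Q] of eq. (4.3) below. The function names are our own:

import numpy as np

def mean_relative_error(omega, omega_exact):
    # Mean relative error per bin, eq. (4.1). Both arrays hold the
    # estimated and the exact density of states over the N_E bins.
    return np.mean(np.abs(omega - omega_exact) / omega_exact)

def averaged_error(omega_runs, omega_exact):
    # Average of xi over independent runs, eq. (4.2). omega_runs has
    # shape (N_runs, N_E) and holds each run's estimate at a fixed t.
    errors = [mean_relative_error(o, omega_exact) for o in omega_runs]
    return np.mean(errors), np.std(errors)  # mean and sigma_xi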
In the three sections to follow the systems presented in the previous chapter will
be employed to perform comparative investigations of the methods and algorithms
presented in chapter 2. Finally, the findings of this chapter are discussed together.
4.1 Fully Adjustable Benchmark System
In the following subsections the influence of the different parameters used by the
sampling algorithms shall be investigated. A system of N = 64 macrostates resulting in M = 22,880 microstates has been investigated for six different types of MC
sampling: WL and pqWL with γ = 0.9, WL1t, pqWL1t and TM as well as BM
with different temperatures. For all parameters and sampling schemes 10^8 time steps, averaged over 1000 runs, have been performed. If not otherwise stated the
GTH method was used to calculate the density of states for the parQ and TM
data.
4.1.1 Sampling Method Comparison
In fig. 4.1 the different sampling algorithms are compared for K = 8 connections
per microstate. For each algorithm the mean relative error is averaged over 1000
[Figure 4.1: log-log plot of the mean relative error ⟨ξ[Ω(E)]⟩ over the time step t; curves: pqWL (γ = 0.90), WL (γ = 0.90), pqWL1t, WL1t, TM and BM (T = 10 K).]
Figure 4.1: Comparison of the different sampling schemes for a system of K = 8 connections per microstate. For each algorithm we averaged over 1000 runs. A power-law behavior can be observed for the pqWL, pqWL1t and BM methods, with the lines of all three methods coinciding. The power-law behavior is reached asymptotically for the TM and WL1t algorithm. The WL method saturates after about 10^6 time steps.
independent runs. The error bars indicate the standard deviation σ_ξ of the mean relative error ξ. For the TM method the error bars have been omitted, as they are too large. We find that the lines for pqWL and pqWL1t coincide. They follow a simple power-law behavior right from the beginning of the measurements. The BM method for T = 10 K follows the same behavior, but generally performs worse. The WL1t method asymptotically reaches this power-law behavior after at least 10^6 time steps, while still performing slightly worse than the pqWL and pqWL1t methods. We also find the typical saturation of error for the Wang-Landau method (WL) with a flatness condition of γ = 0.9 after about 10^6 time steps. The TM method completely fails to deliver usable results.
The parQ method, when combined with one of the Wang-Landau algorithms,
performs very well. For this small system it can deliver estimates of the density of
states after fewer time steps (t < 10^4) than any of the other algorithms.
4.1.2 Metropolis Sampling
For the Boltzmann/Metropolis sampling with parQ only one parameter can be
identified, i.e. the temperature T . An additional option is to apply a temperature
[Figure 4.2: log-log plot of the mean relative error ⟨ξ[Ω(E)]⟩ over the time step t for T = 2.00, 2.50, 10.00, 100.00 and 1000.00 K.]
Figure 4.2: Influence of the temperature T on the performance of the Boltzmann/Metropolis method for K = 8 connections per microstate. The error bars indicate the standard deviation σ_ξ of the mean relative error. For temperatures larger than 100 K no improvement in the relative error can be found.
schedule that reduces or modifies T by a prescribed scheme. As the FAB model is
very simple we will stick to using a fixed temperature. Temperature schedules are
discussed in section 4.2.4.
In fig. 4.2 the dependence of the relative error ξ of the density of states on the temperature T is shown for a system of K = 8 connections per microstate. Temperatures of 100 K and above neither improve nor worsen the relative error. This means that sampling at a high, maybe even infinite, temperature gives optimal results for this algorithm. The reason for this is the structure and the small size of the system.
4.1.3 Error of Q Matrix Entries
The main idea of the FAB model is to have explicit knowledge of the micro- and
macrostate transition matrix. Therefore we can calculate the actual error in the
estimated macrostate transition matrix for parQ and TM based methods. Having the exact infinite temperature transition matrix Q̂, the mean relative error is
[Figure 4.3: log-log plot of the mean relative error ⟨ζ[Q]⟩ over the time step t for pqWL (γ = 0.80, 0.90, 0.99), pqWL1t, TM and BM (T = 10 K).]
Figure 4.3: Plot showing the mean relative error of the entries of Q for different algorithms and parameters. The error bars indicate the standard deviation σ_ζ. The lines for the pqWL and pqWL1t methods coincide and follow a power-law behavior. The BM method for T = 10 K performs about half an order of magnitude worse than the above methods. The TM method is found to perform worst.
calculated as
\zeta[Q] = \frac{1}{c} \sum_{i,j} \frac{|Q_{ij} - \hat{Q}_{ij}|}{\hat{Q}_{ij}}, \qquad (4.3)
where the sum runs over all entries of the matrix with \hat{Q}_{ij} > 0 and c is the number of entries fulfilling this condition.
In fig. 4.3 the mean relative error over time is shown for the different algorithms.
For the pqWL method with three different settings for the flatness parameter γ
and the pqWL1t method we find a power-law behavior after about 10^4 time steps. For the BM method the same behavior is found after 10^5 time steps. The error of the TM method remains high throughout the 10^8 time steps.
4.2 2D Ising Ferromagnet
In the following subsections we will discuss several aspects and findings arising
while investigating the Ising system with methods and algorithms introduced in the
[Figure 4.4: log-log plot of the mean relative error ⟨ξ[Ω(E)]⟩ over the time step t for pqWL (γ = 0.80), WL (γ = 0.80), pqWL1t, WL1t and TM.]
Figure 4.4: Comparison of the different algorithms for the two-dimensional Ising ferromagnet. The mean relative error has been averaged over 1000 independent runs. The GTH algorithm has been used to obtain the densities of states from transition data. Both the TM and the WL1t algorithm follow a power-law behavior, with TM having the smaller exponent. Nonetheless it performs best in the range [10^6, 10^7]. The parQ-based algorithms pqWL and pqWL1t both asymptotically reach this power-law behavior, with the pqWL1t version performing better throughout the whole simulation time.
previous chapters. We will start with a general comparison of the sampling methods. Thereafter we will investigate some aspects of the power iteration method
and the least squares minimization method. Afterwards annealing schedules for the
combination of simulated annealing with parQ are presented and studied. Finally
the influence of parameters of the WL and WL1t algorithm on the Wang-Landau
results and the transition matrix data is discussed.
Simulations are run for 10^9 time steps and averaged over 1000 runs. The energy range is limited to [E_G, 0], as the density of states for the Ising system is symmetric and simulating in the energy range (0, E_AG] would be a waste of computing time.
4.2.1 Algorithm Comparison
A comparison of the performance of the different algorithms, in terms of average
error per bin over time, can be found in fig. 4.4. The graph shows results obtained
using the GTH method to calculate the density of states from transition data. We find that both the TM and the WL1t algorithm show a power-law behavior, i.e. f(t) = a · t^b, with slightly differing exponents. Fitting f(t) to the data for the range t ∈ [10^7, 10^9] results in b_TM = −0.476 and b_1t = −0.520. Hence, the 1/t-algorithm converges faster in terms of the exponents, but, looking at the time steps from 10^6 to 10^7, the TM algorithm performs best, while from this time step onward the WL1t algorithm, followed by the pqWL1t algorithm, delivers better results.
The parQ variants pqWL and pqWL1t asymptotically reach the power-law behavior, while the WL algorithm saturates at about 2 · 10^7 time steps. Interestingly, from 10^7 to 10^8 time steps the WL algorithm gives results of higher quality than the parQ algorithm within the same simulation. The BM method with fixed temperature has been omitted from the plot, as it does not produce any usable result. This is due to the size of the state space, which is orders of magnitude larger than that of the FAB model. Different variants of SA are discussed separately in section 4.2.4. In general, we will find that they perform worse.
4.2.2 Power Iteration
Using the power method to calculate densities of states from transition data gives the same results as above, with differences being at most 10^{-5} compared to the GTH method. In some rare cases in the first 6 · 10^5 time steps the GTH method is able to calculate a density of states with a mean relative error several orders of magnitude smaller than that of the power method, but still in the order of 50% – 100%.
The power iteration, as presented in section 2.5.4, repeatedly performs a matrix-vector multiplication, reusing the normalized output as the new input until some
convergence criterion is met. Typically, the distance d(Ω(n) , Ω(n−1) ) between two
consecutive eigenvectors is compared to an appropriately chosen ϵ and the iteration
is stopped when d ≤ ϵ. The distance used here is defined as
d(\Omega^{(n)}, \Omega^{(n-1)}) = \max_k \left| \frac{\Omega^{(n)}_k}{\Omega^{(n-1)}_k} - 1 \right|, \qquad (4.4)
which is the maximum relative distance between corresponding entries of two consecutive eigenvectors. For the simulations presented here we used ϵ = 0. Additionally we restricted the maximum number of iterations to 100,000.
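A minimal sketch of this procedure, assuming Q is the estimated (row-stochastic) transition matrix, so that the density of states is its left eigenvector to eigenvalue 1; the names and the handling of zero entries are our own:

import numpy as np

def power_iteration(Q, eps=0.0, max_iter=100_000):
    n = Q.shape[0]
    omega = np.full(n, 1.0 / n)          # uniform start vector
    for _ in range(max_iter):
        new = omega @ Q                  # one left-multiplication step
        new /= new.sum()                 # normalize
        with np.errstate(divide="ignore", invalid="ignore"):
            d = np.nanmax(np.abs(new / omega - 1.0))  # distance of eq. (4.4)
        omega = new
        if d <= eps:                     # stop once d <= epsilon
            break
    return omega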
In the upper row of fig. 4.5 the mean relative error ξ[Ω(E)] (left axis, solid lines) and the distance d (right axis, dashed lines) are shown over the number of iterations n of the power method for Ising lattices of different sizes. In fig. 4.5 (a) the transition matrices created during WL1t simulations have been investigated, whereas in fig. 4.5 (b) data from TM simulations is used. The data shown is based on the final Q matrix after 10^9 time steps. Additionally, for L = 64 up to 10^10 time
[Figure 4.5 (upper row: ξ[Ω^{(n)}(E)] and distance d(Ω^{(n)}, Ω^{(n−1)}) vs. iteration n for L = 16, 22, 32, 64; lower row: ξ[Ω(E)] vs. E/J for L = 32, points (1)–(4))]
Figure 4.5: The plots in the upper row show the mean relative error of the eigenvector (i.e. density of states) over the iteration number of the power method (left axis, solid lines). Additionally, the distance d between two consecutive iterations of the power method (right axis, dashed lines) for different sizes L is shown. For each size L the power iteration method has been applied to the final transition matrix after 10^9 steps, and for L = 64 also after 10^10 steps. The left plot (a) uses data from a pqWL1t simulation, whereas the right plot (b) uses the transition matrices from a TM simulation. For pqWL1t we additionally have the final error from the Wang-Landau density of states, which is shown with short horizontal lines. We find that the error may have a minimum at an iteration number that does not coincide with the final iteration. Repeating the simulations with different seeds for the random number generator gives different results. The plots (c) and (d) in the lower row show the error over energy for L = 32. The solid lines (1) & (3) correspond to the minima observed in the upper plots, whereas the dashed lines (2) & (4) correspond to the final result of the power iteration. The four points referred to are indicated in the upper row. Between the minimum and reaching the convergence criterion, the structure of ξ[Ω(E)] remains the same, but the error in the low energy region (E/J < −1500) increases while it decreases in the range E/J > −1500.
steps were performed. We observe that the error ξ of the final iteration, i.e. for the converged eigenvector, may be larger than a minimum value found several hundred or thousand iterations earlier. In fig. 4.5 (a) the short horizontal dashed lines indicate the error of the WL1t method after 10^9 time steps.
In the lower row of fig. 4.5 the mean relative error of the points (1) – (4), belonging to the L = 32 data, is shown. The solid lines correspond to the minima, whereas the dashed lines belong to the final power iteration results. We observe, comparing (1) to (2) and (3) to (4), that the structure within the pairs remains roughly equal. We only find a redistribution of error from the range E/J > −1500 to E/J < −1500. This leads to an increase of the mean over all energies. This can be ascribed to the fact that we find the phase transition at about this energy. Thus we think that the parts of the eigenvector belonging to the ordered and unordered phase, respectively, are structurally fixed, and the final iterations just reweight both parts against each other.
This behavior may be an indicator for insufficient observations of the phase
transition. The transition matrix entries connecting the two phases may exhibit a
larger error than the entries which can be ascribed to one of the phases.
4.2.3 Least Squares Minimization
During the simulations performed in section 4.2.1 the transition matrix data has
additionally been evaluated by the least squares method of Wang et al. [84]. For this, eq. (2.85) together with weights from eq. (2.116) has been used.
The results are presented in fig. 4.6. In the bottom graph the mean error per bin
of the density of states is compared to the GTH method data already presented
in fig. 4.4. We find that the least squares method (solid lines) is able to improve further upon the results of the GTH method (dashed lines). The improvement is most pronounced for the TM algorithm. Despite this improvement, the Wang-Landau/parQ combinations pqWL1t and pqWL still perform better in the final 8 · 10^8 and 4 · 10^8 steps, respectively.
In the upper graph of fig. 4.6 the standard deviation σξ of the mean error per
bin ⟨ξ⟩ is shown. We find that there are only marginal differences between the
different algorithms. The most pronounced difference can be found for the TM-GTH combination. Hence, we cannot base the choice or the preference of an algorithm on the standard deviation of the error.
4.2.4 Simulated Annealing Schedules
The mean error over time for these schedules is presented in the lower graph of
fig. 4.7. The upper plot shows the corresponding temperatures over time. The
parameters used are given in the table on the right. Looking at the lower graph
[Figure 4.6 (upper panel: standard deviation σ_ξ vs. t; lower panel, log-log: mean relative error ⟨ξ[Ω(E)]⟩ vs. time step t for pqWL (γ = 0.80), pqWL1t and TM)]
Figure 4.6: Bottom: Comparison of the results obtained using the GTH method (dashed lines) and the least squares method (solid lines). The mean relative error has been averaged over 1000 runs. For the TM method we observe a significant improvement, while for the pqWL and pqWL1t algorithms, i.e. the combination of parQ with Wang-Landau based methods, we only find a marginal improvement of the least squares method over GTH. In the upper graph the standard deviation σ_ξ of the mean error per bin ⟨ξ⟩ is shown. The labeling is the same as in the bottom graph, except for the violet line, which represents the standard deviation of the Wang-Landau density of states of the WL1t method.
we find that all schedules perform competitively well, with the exception of the power schedule. For easier comparison with fig. 4.4, the WL1t results are shown in gray. All simulations, except for the log and power schedules, used an initial temperature of T_0 = 10 K/J. As one would expect, the commonly used linear (lin) and exponential (exp) schedules both finish at very low mean errors, even below Wang-Landau with γ = 0.8. The downside of these two methods is that the error “jumps” from a very high value to its final value. This is partly due to the way the mean relative error, cf. eq. (4.1), is calculated. Both the exact and the estimated density of states are normalized to \sum_E \Omega(E) = 1. Thus, large errors at low
[Figure 4.7 (upper panel: temperature vs. t, log-linear; lower panel, log-log: mean relative error ⟨ξ[Ω(E)]⟩ vs. t for lin, sine, logsine, exp, power (T_0 = 500 K), log (T_0 = 25 K), log (T_0 = 50 K) and WL1t. Parameters: T_0 = 10 K/J, T_min = 0.1 K/J, A = 5, α = 1 − 10^{-8}, γ = −0.39.)]
Figure 4.7: In the lower graph the mean error over time of the SA method for different annealing schedules is depicted in a log-log plot. We find the schedules lin, exp, sine and logsine to perform competitively when comparing these results to fig. 4.4. The lin and exp schedules have the problem that, until all energies have been visited, no useful result can be extracted. Contrary to this, the logsine schedule is able to continuously deliver a usable density of states. In the upper diagram the associated temperatures are shown over time in a log-linear plot.
energies, caused by the ground state not having been sampled well, lead to a high mean error, as the normalization affects all states. Another reason is that the ground state and near-ground-state energies are only likely to be visited if the temperature is near or below the critical temperature T_c of the phase transition. For the log schedule we observe the same behavior, but, to achieve the same quality, we had to increase the initial temperature to 50 K/J. Additionally, data for T_0 = 25 K/J is shown for comparison.
The logsine schedule performs equally well, with the advantage that the error continuously improves over time and does not jump. Here, the ground state is already visited in the early part of the simulation and henceforth repeatedly revisited. The sine schedule shows a similar behavior, but as most of the oscillations occur
in the last 9 × 10^8 time steps, we obtain the first usable result only after 2 × 10^8 steps. Looking at the power schedule we find that, even when increasing the initial temperature to 500 K/J, it is not possible to obtain competitive results for long simulations. Only for a short period after ≈ 2 · 10^6 time steps does the power schedule perform better than the WL1t algorithm.
In order to understand the workings of the different schedules we have to look at
the mean error per bin over time. This is presented in fig. 4.8. We can confirm our
previous observation that the lin, exp and log schedules only deliver useful results
after the ground state has been sampled sufficiently well. The same observation
can be made for the sine schedule. For the logsine schedule we can confirm the
continuous reduction of error. The power schedule fails to sample the lower energies
correctly, as it decreases the temperature too fast.
4.2.5 Wang-Landau Method
For the Wang-Landau algorithm only one parameter influencing the quality of
the result can be identified, i.e. the flatness criterion. Using the standard scheme
proposed by Wang and Landau [2, 3] (see section 2.3.3) we can vary the parameter
γ. This means that the smallest entry of the histogram H(E) has to be at least
γ × ⟨H(E)⟩ for a refinement to take place.
In the left diagram of fig. 4.9 the results of WL simulations for three different
values of γ (solid lines) are presented together with the parQ data (dashed lines)
obtained during these simulations. One finds the typical saturation of error. Increasing γ postpones the saturation and improves the quality of the result at the
cost of reducing the precision in the beginning of the simulation. We also observe
that the transition data recorded during the Wang-Landau simulations depends
on the parameter γ. With increasing γ the power-law behavior of the error is postponed. Hence it is better to use a small γ when one intends to use parQ together
with WL sampling. For easier comparison the WL1t data is also shown in this
diagram.
Using the 1/t-method of Belardinelli et al. [73, 74] changes the flatness scheme
of the simulation. Instead of using the minimum and mean histogram values to
determine flatness, the 1/t-algorithm uses a different approach. In the first part, it
periodically checks if all entries of the histogram have values > 0 (cf. section 2.3.6).
If this criterion is fulfilled, ln f is reduced as described in eq. (2.55). This initial
procedure is repeated until ln f ≤ c/(τ(t))^p, with the time τ(t) being defined according to eq. (2.60). From this time step on, ln f = F(t) is updated in every step (cf. eq. (2.61)). For the simulations presented here, we found that decreasing ln f right after H(E) > 0 for all E is not optimal. Switching too early to the time-step dependent modification factor F(t) gave results worse than standard Wang-Landau sampling. We found that the number of steps t between two consecutive
[Figure 4.8 (six panels, one per schedule — lin, sine, logsine, exp, power (T_0 = 500 K), log (T_0 = 50 K): energy E/J vs. time step t, color-coded error from 10^{-4} to 10^3)]
Figure 4.8: Graphs showing the mean error per bin over time for the different schedules. The axis of abscissae and the color mapping are in logarithmic scale. White areas correspond to bins with errors larger than 10^3 or an IEEE 754 “NaN” resulting from a double precision number overflow. Parameters are given in fig. 4.7. We can observe the continuously improving error for the logsine schedule. For the lin, exp, power and log schedules we find that no improvement in quality can be observed after the ground state has been sampled.
[Figure 4.9 (two log-log panels: (a) WL with γ = 0.80, 0.90, 0.95 and WL1t (c = 1); (b) WL1t with c = 0.1, 1, 10; mean relative error ⟨ξ[Ω(E)]⟩ vs. time step t)]
Figure 4.9: Left: Comparison of the Wang-Landau algorithm with different parameters γ as well as the Wang-Landau 1/t method with c = 1 (solid lines). With increasing γ the WL results become more precise, but it also takes longer to reach that precision. The dashed lines show the corresponding transition matrix densities of states. Here we find a strong influence of the parameter γ on the transition matrix results. Right: Diagram showing the influence of the parameter c of the 1/t-algorithm on the mean error per bin (solid lines). We find a strong influence of c on the quality of the results. Additionally the corresponding error from the parQ data is shown (coinciding dashed lines), on which c has no influence.
refinements needs to be at least 100 · N_bins.
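The essence of this scheme can be sketched as follows; since eqs. (2.55), (2.60) and (2.61) are not reproduced here, the halving step, the rescaled time tau = t / n_bins and the form F(t) = c/tau^p below are assumptions for illustration only:

import numpy as np

def update_lnf(lnf, t, histogram, n_bins, c=1.0, p=1.0, final=False):
    # Stand-in sketch of the 1/t flatness scheme; see the caveats above.
    tau = t / n_bins                     # assumed stand-in for eq. (2.60)
    if final or lnf <= c / tau**p:
        return c / tau**p, True          # final phase: ln f = F(t) every step
    if np.all(histogram > 0):            # initial phase: all bins visited?
        return lnf / 2.0, False          # reduce ln f, stand-in for eq. (2.55)
    return lnf, False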
In fig. 4.9b the influence of the parameter c is shown and the results are compared
to the corresponding parQ data. We can confirm the findings of Belardinelli et
al. that the optimal choice is c = 1. But we also find that the parQ method is independent of the underlying parameters of the WL1t method.
4.3 Two-Particle Lennard-Jones System
In the previous sections two different discrete systems were investigated with a
variety of density of states MC methods. In this section we will apply those methods to the Lennard-Jones system. First, the binning of energies and the
[Figure 4.10 (log-linear plot: Ω(E) vs. E/ε; inset: region around E = 0)]
Figure 4.10: Exact (red line, cf. eq. (3.16)) and binned density of states (black dots, 100 bins) for a two-particle Lennard-Jones system. The cut-off radius was set to r_c = 2.5σ with the linear length of the box set to 2 r_c. The inset shows the region around E = 0 in detail. Additionally, exact lumped data for 101 bins (blue crosses) and simulation data using the WL1t method with 10^9 time steps and 100 (green dashed line) and 101 bins (violet dash-dotted line) is shown. For the 100 bin data error bars are shown, indicating the standard deviation of the density of states. The error bars have been omitted for the 101 bin calculation, as they do not exceed the width of the corresponding line. We find that the data for 101 bins fits the exact density of states well, whereas the result for 100 bins is off by a large factor and shows a large standard deviation at |E| = 0.02. This is due to the way the energy calculation within the simulation is implemented (see text).
lumping of reference data are discussed. Thereafter, Wang-Landau and transition
matrix based methods are evaluated.
4.3.1 Optimal Binning
For the two-particle system we have to be cautious when calculating the reference density of states. Our simulations employ a lumping scheme and count all energies in a range [E_i, E_i + ∆E) into one bin labeled i. A straightforward way would be to compute Ω_exact(E_i + ∆E/2) as the reference value for this bin. But since Ω_exact is sharply peaked at E ∈ {−1, V(r_c), 0}, calculating only the value at the center of the bin
would result in large errors. Thus we need to integrate eq. (3.16) for each bin as
\Omega_\mathrm{lumped}(E_i) = \int_{E_i}^{E_i + \Delta E} \Omega_\mathrm{exact}(E)\, dE = \Pi(E_i + \Delta E) - \Pi(E_i), \qquad (4.5)
to obtain reference values for comparison. Knowing the antiderivative Π, i.e. eq. (3.15), is helpful, as we can thus avoid integrating over the peak at E = 0.
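In code, the lumping of eq. (4.5) is a single difference of the antiderivative per bin; Pi below stands for eq. (3.15), which is not reproduced here:

import numpy as np

def lumped_dos(Pi, E_min, E_max, n_bins):
    # Exact lumped density of states, eq. (4.5). Pi is a vectorized
    # callable implementing the antiderivative of the exact DOS.
    edges = np.linspace(E_min, E_max, n_bins + 1)
    return edges, Pi(edges[1:]) - Pi(edges[:-1])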
In fig. 4.10 the exact lumped density of states (black dots) is shown together with the exact density of states (red line). We find that the lumped values at E ∈ {−0.98, −0.02} slightly deviate from the exact values. This is due to the sharp peaks and the area underneath them. In the inset the region around E = 0 is shown in detail for exact lumped data with 100 (black dots) and 101 bins (blue crosses). Additionally, simulation data from a WL1t simulation with 100 (green dashed line) and 101 bins (violet dash-dotted line) is shown for comparison. For the
100 bin simulation error bars, indicating the standard deviation of the density of
states, are shown. Being smaller than the width of the line, they have been omitted
for the 101 bin data. For the 100 bin data we find a large difference for the bins
left and right of E = 0, whereas we do not see this behavior for 101 bins. This is
due to how the simulation calculates the energy of the system: the implemented
program is very generic and can work with more than two particles. Thus the
system energy is calculated once at the beginning of the simulation. Thereafter,
in every step that is accepted, the energy contribution of the atom that is going
to be moved is subtracted from the total energy and the contribution at the new
position is added. This leads to rounding errors in E which are visible especially
for E = 0. The total energy can thus take values slightly above or below (by
some machine epsilon) zero. If the border between two consecutive bins lies exactly at E = 0, a configuration with E ≈ 0 will be counted either in the bin left or in the bin right of E = 0, depending on the sign of the rounding error. Choosing a different number of bins or a different energy range avoids this problem, as can be seen in the 101 bins example.
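The effect is easy to reproduce outside the simulation: accumulating pairwise energy updates in floating point leaves a total that analytically vanishes but numerically sits a few machine epsilons away from zero, and a bin edge at exactly E = 0 then sorts such configurations by the sign of the residue. A small, self-contained illustration (not the actual simulation code):

import numpy as np

rng = np.random.default_rng(42)
deltas = rng.uniform(-1.0, 1.0, size=1000)

E = 0.0
for d in deltas:                 # running total of accepted-move updates ...
    E += d
for d in deltas:                 # ... followed by updates that cancel them
    E -= d

print(E)                         # tiny, but generally not exactly 0.0

edges = np.linspace(-1.0, 1.0, 101)   # 100 bins: one edge lies exactly on 0
print(np.digitize(E, edges))          # bin index flips with the sign of E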
4.3.2 Sampling Method Comparison
Using the lumped data for 101 bins as reference values, we performed simulations
using the MC algorithms described in the beginning of this chapter. For the
transition matrix based algorithms we perform our analysis, as already presented
in the previous section, with three different methods to obtain the density of states: power iteration, the GTH method and least squares minimization of the detailed balance deviations. We perform only atom translation moves, where a single
atom is moved by δr · ⃗e with ⃗e being a unit vector randomly distributed on the
[Figure 4.11 (three log-log panels — Least Squares, GTH, Power Method: mean relative error ⟨ξ[Ω(E)]⟩ vs. time step t for pqWL (γ = 0.80), WL (γ = 0.80), pqWL1t, WL1t and TM)]
Figure 4.11: Mean relative error in the density of states for the two-particle LJ system with δr = 1σ. For the transition matrix based algorithms the eigenvector method used is noted in the plot. The mean relative error ⟨ξ⟩ has been averaged over 512 runs. The error bars indicate the standard deviation σ_ξ. They have been omitted for the TM method, as the underlying distribution is skewed. Using the least squares minimization, we find that the pqWL and pqWL1t methods perform best, followed by the WL1t method. The WL method shows its typical saturation after 2 × 10^7 time steps, while the TM method needs over an order of magnitude more time steps to deliver a usable result. In the final 10^9 time steps the error for the TM method is only about 3 to 5 times that of the best method. Using either the GTH or the power method we find that the transition matrix methods perform worse than the WL1t method, being off by a factor of ≈ 5.
unit sphere. This adds another parameter influencing the quality of the outcome.
Results for δr = 1σ and δr = 2.5σ can be found in figs. 4.11 and 4.12, respectively.
For each dataset the mean relative error is averaged over 512 runs.
Looking at the left frame of fig. 4.11, where the least squares method has been
used to obtain the density of states for the transition matrix methods, we find that
the pqWL and pqWL1t methods perform best, followed by the WL1t method. The
WL method shows the typical saturation of error while the TM method is about
[Figure 4.12 (three log-log panels — Least Squares, GTH, Power Method: mean relative error ⟨ξ[Ω(E)]⟩ vs. time step t for pqWL (γ = 0.80), WL (γ = 0.80), pqWL1t, WL1t and TM)]
Figure 4.12: Mean relative error in the density of states for the two-particle LJ system with δr = 2.5σ. Comparing these results to the previous figure, we find that increasing δr leads to an overall deterioration of the relative error. For the least squares method, all three transition matrix based algorithms perform best, followed by the WL1t method. For the GTH and power method the WL1t method overtakes after 10^8 time steps and delivers the best result.
half an order of magnitude worse than the best method and needs 10^7 time steps before the mean relative error becomes smaller than 1. The error bars represent the standard deviation σ_ξ of the mean relative error ⟨ξ⟩. We find that all methods have small standard deviations. For the TM method the error bars have been omitted for some data points, as the underlying distribution is skewed and using the standard deviation would result in overly large error bars.
Comparing the larger frame on the left, representing the results obtained via least
squares minimization, to the two smaller frames on the right side, we find that the least squares method is able to increase the quality for the transition matrix
methods. Using the GTH method or the power iteration leads to the pqWL and
pqWL1t methods performing significantly worse than the WL1t method.
Increasing δr to 2.5σ gives different results, which can be found in fig. 4.12. The ranges of the abscissae and ordinates have been kept equal to those of fig. 4.11 for easier comparison. We find that the Wang-Landau methods (WL, WL1t) need an order
of magnitude more time to reach a quality comparable to δr = 1σ. Here the TM method and the parQ methods are able to outperform WL1t if the least squares minimization is applied. With the power iteration and the GTH method the transition matrix based algorithms perform worse than WL1t, differing by ∆ξ = 7.1 · 10^{-4} at the final time step.
4.4 Discussion of Benchmark Results
From the simulations performed, employing three different systems, we found for the standard Wang-Landau method the well-known time-quality trade-off. Setting the parameter γ to a small value gives fast results of medium quality, whereas setting γ near 1 gives good results for long runs. Stopping a WL simulation before it has converged mostly yields unusable results of low quality.
We also found that the Wang-Landau 1/t method is superior to the standard
implementation, at least in the canonical ensemble. Its results continuously improve
with simulation time in a predictable manner, following a power-law behavior.
Both methods show very small standard deviations for the error.
Infinite temperature transition matrices captured during WL and WL1t simulations have been found to enhance the quality of the results, especially for the
standard WL implementation. We observed that, although the error of the WL
density of states already had saturated, the error of the density of states from the
transition matrix continued to decline. As a general guideline we recommend recording a transition matrix during such simulations.
A question that remained unanswered until now is whether it is better to use eigenvector calculation via the GTH or power method, or to use the minimization of detailed balance deviations. The latter, i.e. least squares minimization, in most cases improves the results over the GTH and power iteration methods. The improvement strongly depends on the structure and size of the transition matrix and thus on the structure of the state space of the system investigated. For instance, for the FAB system and the Ising system the transition matrices are very sparse and least squares minimization only gives slight improvements. Looking at the Lennard-Jones system we find significant improvements, especially for the two-particle system with δr = 2.5σ. Under some circumstances, e.g. for δr = 1σ, an increase in the standard deviation of the error was found.
As a rule of thumb we find that both eigenvector calculation and minimization of detailed balance deviations should be performed when calculating final results which are destined for further processing. If the densities of states obtained via these two techniques are equal, we have an indicator that the state space has been sampled well. Otherwise, in the case of a large deviation between the two, we need different criteria to determine which one to use. Such criteria may include the comparison to reference values, if available, or other, system-specific, considerations.
5 Grand Canonical Ensemble
Simulations of a Lennard-Jones system in the grand canonical ensemble are rather
challenging, as we not only have a continuous state space, but also a varying
particle number. A system in the grand canonical ensemble is in thermodynamic
equilibrium with a reservoir, with which it can exchange energy and particles.
It can be pictured as a small box with permeable walls, placed inside a huge
reservoir. As the walls are permeable, particles can move freely in and out of
our box and thus the number of particles inside the box fluctuates. The mean
number of particles we observe is related to the chemical potential µ, which is in
turn related to the pressure or, in the case of multi-component systems, to the partial
pressure. Hence, MC algorithms are confronted with two varying macrovariables,
i.e. particle number and energy.
The incentive for the development of an algorithm comparison is the problem of
estimating the joint density of states (JDOS) Ω(E, N ) in the grand canonical ensemble and eventually in the isobaric-isothermal ensemble (Ω(E, V )). The findings
of the previous chapter suggest that ideally a combination of the Wang-Landau method and the parQ transition matrix method should be used. This is also supported by an initial study of grand canonical systems performed by the author [8]. We also found that estimating the density of states by minimizing the detailed balance deviations generally seems to be superior to eigenvector calculation by e.g. power iteration.
We implemented the combination of the Wang-Landau algorithm and the parQ transition matrix method in the open source Monte Carlo simulation tool MCCCS Towhee. Initial studies showed that the advantages of this combination, presented in the previous chapter, are also found for grand canonical systems. Here, some features are even more important, for instance the ability to combine the matrices of several independent runs.
From our initial studies we also found that the flatness formulation of the Wang-Landau algorithm can be troublesome. The algorithm may get stuck in its iterative procedure, sometimes already in the first iteration at ln f = 1. If this happens, the transition matrix nevertheless continues to improve, and with it the density of states estimate. Only carefully tuning the flatness condition as well as other parameters allows for successful convergence of the Wang-Landau method.
When using the power iteration to calculate the density of states, large deviations in Ω(E, N) sometimes showed up for large N. The source of those deviations
are unreliable entries, e.g. entries with low values in the counting matrix C. For
eigenvector methods, there is no way to incorporate weighting into the calculation
procedure. Thus, a new method had to be constructed that can handle large sparse matrices and can take some kind of weight into account. Such matrices, as described in section 2.4.2, are very sparse and have a size of at least 50,000². Typically they are larger. As a solution, we propose to use the minimization of detailed balance deviations for this kind of transition matrix in an efficient implementation. The application of this method to matrices of that size and structure is new and needs special treatment.
5.1 A New Method for the Computation of the Joint Density of States from Sparse Transition Matrices
For the three models presented in chapter 3 calculating the density of states by
either power iteration, GTH method or minimization of detailed balance deviations
is trivial, as the matrix sizes are rather small.
Working in the grand canonical ensemble or the isobaric-isothermal ensemble, or
in general having other order parameters besides the temperature, increases the size
of the transition matrix enormously. Typically, matrices are of size 50,000 × 50,000 or larger, as well as sparse. This rules out the GTH method from the list of viable methods, as it fills in the matrix, destroying the sparsity.
Although this type of matrix lends itself to the application of the power method,
it still is not the optimal choice. One problem is that the number of iterations needed for the method to converge increases with the number of rows. And the time it takes to perform a single iteration increases as well. Thus, calculating the
eigenvector of a 500 × 500 matrix is a matter of seconds on a single processor,
whereas it can be a matter of hours for large grand canonical systems. A second
problem, which remained unsolved in a previous publication by the author [8], is
that of unreliable entries. Such entries may introduce errors into the estimated
density of states, such as spikes (cf. fig. 2.7) and large deviations of several orders
of magnitude in some regions of the state space.
The only method that has not yet been applied to large sparse transition matrices is the minimization of detailed balance deviations, which is typically implemented as a least squares minimization. Standard implementations in libraries like the GNU Scientific Library [91] use dense matrices and are therefore unusable. Hence a new method for sparse transition matrices is needed.
Here we propose a new method based on eqs. (2.105) and (2.106) and on a fast, Krylov-method based linear equation system solver. We will refer to it as efasTM
(extremely fast method for determining the joint density of states from a sparse
transition matrix). The rectangular matrix M contains as many rows as there are
pairs (Q_ij, Q_ji). The row for the k-th pair is then
M_{ki} = \sigma_{ij}^{-1} \quad \text{and} \quad M_{kj} = -\sigma_{ij}^{-1}, \qquad (5.1)
with \sigma_{ij} determined according to one of eqs. (2.116) to (2.118). The corresponding entry in the vector b is given by
b_k = -\ln \frac{Q_{ij}}{Q_{ji}}. \qquad (5.2)
The resulting matrix typically has a size in the order of 500,000 × 50,000, with only
two entries per row. Performing the multiplication M^T M reduces the size back to the size of the original square transition matrix. The resulting equation system
can then be solved in a matter of seconds or minutes.
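A minimal sketch of this construction using SciPy's sparse machinery; the uniform weight (σ_ij = 1, standing in for eqs. (2.116)–(2.118)) and conjugate gradients as the Krylov solver are illustrative choices, not the actual implementation:

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def efastm_sketch(Q):
    # Estimate ln Omega (up to an additive constant) from a sparse
    # transition matrix Q in CSR format, following eqs. (5.1) and (5.2).
    Qc = sp.coo_matrix(Q)
    rows, cols, data, b = [], [], [], []
    k = 0
    for i, j, qij in zip(Qc.row, Qc.col, Qc.data):
        qji = Q[j, i]
        if i < j and qji > 0:            # one row per pair (Q_ij, Q_ji)
            w = 1.0                      # sigma_ij^-1; stand-in weight
            rows += [k, k]; cols += [i, j]; data += [w, -w]
            b.append(-np.log(qij / qji)) # eq. (5.2)
            k += 1
    M = sp.csr_matrix((data, (rows, cols)), shape=(k, Q.shape[0]))
    # Normal equations (M^T M) x = M^T b, solved with a Krylov method.
    x, _ = cg(M.T @ M, M.T @ np.array(b))
    return x - x.max()

In the actual method, the weights of eqs. (2.116) to (2.118) take the place of the uniform w, down-weighting rows that stem from unreliable entries of the counting matrix C.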
5.2 Joint Density of States of a Lennard-Jones
System in the Grand Canonical Ensemble
Simulations in the grand canonical ensemble consist of moves in energy space (e.g.
atom and molecule displacements, molecule rotations, regrowths and reinsertions)
as well as particle insertions and removals (see fig. 2.3 for a sketch).
In the previous chapter we already showed that the minimization of detailed balance deviations can deliver densities of states of higher quality when compared
to power iteration or GTH method. Now we want to test the efasTM method,
which we developed in the previous section.
In fig. 5.1 we find a comparison of the density of states from a simulation of
Lennard-Jones particles in the µV T ensemble where the particle number N was
allowed to fluctuate between 1 and 110 particles, and the overall energy range
has been set to E ∈ [−700, 10] and split into 500 bins. Additionally per-particle
energy ranges have been set to bounds determined in a previous publication (see
ref. [8]). An implementation of the Wang-Landau algorithm has been used to
achieve broad sampling. It has a slightly modified flatness condition, which checks that the number of visited bins stays constant for 10,000 time steps and that each bin has received a minimum number of visits. The resulting density of states, after 2 × 10^9 time steps and 14 refinements, reaching f = 6.10352 × 10^{-5}, has been included for comparison (violet line). The transition matrix recorded during the simulation has been evaluated using the efasTM method (red line) and the power iteration method at machine precision (green line) as well as at about four times machine precision (blue line). We find that the precision used to perform the
[Figure 5.1 (plot: ln Ω(E, N) vs. E/ε for efasTM, power iteration, high-precision power iteration and Wang-Landau; inset: low-energy region)]
Figure 5.1: Density of states of a Lennard-Jones system in the grand canonical ensemble. The simulation covered the energy range E ∈ [−700, 10) and the particle number range N ∈ [1, 110]. Additionally the energy range per particle number has been restricted (see ref. [8] for details). For each data set a selection of particle numbers N = 10, 20, …, 110 is shown from right to left in increasing order. Wang-Landau sampling has been used to achieve broad sampling. The corresponding density of states is shown for reference. Additionally, transition data has been recorded which has been analyzed with different methods. First, the power iteration at double precision and at about four times the precision of “double precision” has been used. Both variants give equal results and match the Wang-Landau density of states up to N = 90. For N ∈ {100, 110} strong deviations are visible at the low energy bound. Using the efasTM method gives results comparable to Wang-Landau.
Method                         Wall Time   Total CPU Time
efasTM                             35 s            38 s
least squares implementation      484 s          2390 s
power iteration                  4263 s          4845 s

Table 5.1: Timings of three different methods to calculate the density of states from large sparse transition matrices. Wall time and total CPU time differ, as some parts of the algorithms are executed in parallel.
power method has no influence on the outcome. Both variants give results on par
with densities of states obtained from the Wang-Landau method and the efasTM
method for particle numbers N < 100. For larger N we observe a strong deviation
in the density of states at low energies. Looking at the efasTM method, we only find slight differences for N = 110.
Besides differing densities of states we also find large differences in the computing
times needed to obtain those results. Table 5.1 gives an overview of how much
CPU time was needed in total, incorporating parallel execution of parts of the
algorithm, as well as the wall time needed altogether. The computations have been
performed using the highly optimized algorithms of Mathematica. We find that the efasTM method performs an order of magnitude faster than the least squares implementation of Mathematica. Looking at the total CPU time, the speedup is nearly two orders of magnitude. Compared to the power iteration, the efasTM method is more than two orders of magnitude faster.
To check which of the results is closer to the true density of states, we cannot
resort to comparisons with exact results, as such do not exist. We therefore have
to compare quantities computed from the joint density of states to literature data.
To do so, we present appropriate formulae in the next section and develop an
algorithm to determine the liquid and vapor coexistence lines.
5.3 Calculating Phase Coexistence Properties
Being able to predict the outcomes of measurements is of huge interest in material
research in industry and science. Having a precise estimate of the density of states available eases this task. The formulae to come are taken from Yan et al. [64]
and use the configurational density of states (cf. section 2.3.2). For instance, having
obtained Ω as a function of energy E and particle number N , the grand canonical
[Figure 5.2 (left: density plot of p(ρ) over temperature T and density ρ with coexistence points; right: coexistence points for efasTM and power iteration with Ising-form fits and a fit to data from Yan et al.)]
Figure 5.2: The left frame shows a density plot of the function p(ρ) (cf. eq. (5.10)) over temperature T and density ρ = N/V for density of states data obtained via least squares minimization. For every value of T the chemical potential µ has been tuned to reach phase coexistence (see text). The red points lying on top are the mean densities of the vapor and liquid phases. The two top-most points, at T = 1.24 K, are slightly off, as the algorithm for determining phase coexistence failed to find the correct density separating the liquid and vapor phases. In the right diagram, the red points correspond to those shown in the left plot. The blue stars correspond to density of states data obtained via power iteration from the same transition matrix as the red points. We find a strong deviation in the liquid region for T < 1 K, which can be traced back to the deviations in the density of states. The red line is an Ising-form fit to the red points, whereas the black dashed line is a fit to data from Yan et al. [64].
partition function
\Xi(T, \mu) = \sum_N \sum_E \Omega(E, N) \, e^{-\beta E + N \beta \mu} \qquad (5.3)
can be calculated as a function of temperature T and chemical potential µ. From
this, we can obtain quantities like the mean particle number and its second moment:
\langle N \rangle = \Xi^{-1} \sum_N \sum_E N \, \Omega(E, N) \, e^{-\beta E + N \beta \mu}, \qquad (5.4)
\langle N^2 \rangle = \Xi^{-1} \sum_N \sum_E N^2 \, \Omega(E, N) \, e^{-\beta E + N \beta \mu}, \qquad (5.5)
both being functions of T and µ as well. Using eqs. (5.4) and (5.5) we are able to
obtain the compressibility of a fluid:
\kappa(T, \mu) = -\frac{1}{V} \frac{\partial V}{\partial p} = \frac{V}{k_B T} \, \frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle^2}. \qquad (5.6)
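Given ln Ω(E, N) on a grid, eqs. (5.3) to (5.6) are direct sums; the log-sum-exp evaluation below is our own numerical choice to avoid overflow, with β = 1/(k_B T):

import numpy as np
from scipy.special import logsumexp

def gc_moments(lnOmega, E, N, beta, mu, V):
    # lnOmega: shape (len(N), len(E)); E: bin centers; N: particle numbers.
    lw = lnOmega - beta * E[None, :] + beta * mu * N[:, None]
    lnXi = logsumexp(lw)                                  # eq. (5.3)
    n1 = np.exp(logsumexp(lw, b=N[:, None]) - lnXi)       # eq. (5.4)
    n2 = np.exp(logsumexp(lw, b=(N**2)[:, None]) - lnXi)  # eq. (5.5)
    kappa = V * beta * (n2 - n1**2) / n1**2               # eq. (5.6)
    return n1, n2, kappa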
A liquid and a vapor phase are in coexistence if the pressure in both phases is equal, i.e. p_vap = p_liq. The pressure is a function of the logarithm of the grand canonical partition function, i.e.
p(T, \mu) = \frac{k_B T}{V} \log \Xi(T, \mu). \qquad (5.7)
To separate the two phases we have to find an N_mid such that the pressure can be split into
p_\mathrm{vap}(T, \mu) = \frac{k_B T}{V} \log \sum_{N \le N_\mathrm{mid}} \sum_E \Omega(E, N) \, e^{-\beta E + N \beta \mu} \qquad (5.8)
\text{and} \quad p_\mathrm{liq}(T, \mu) = \frac{k_B T}{V} \log \sum_{N > N_\mathrm{mid}} \sum_E \Omega(E, N) \, e^{-\beta E + N \beta \mu}. \qquad (5.9)
To find N_mid we may define the probability
p(N) = \sum_E \Omega(E, N) \, e^{-\beta E + N \beta \mu} \qquad (5.10)
to observe particle number N at fixed T and µ. Now, for any given temperature we have to guess an initial value for µ such that p(N) exhibits two distinct peaks. Then the local minimum in between the two peaks can be determined, i.e. N_mid = argmin_N p(N) under the given constraints. Next, µ has to be tuned to satisfy p_vap = p_liq while recalculating N_mid in every optimization step.
The coexistence densities may then be obtained via
\rho_\mathrm{vap}(T) = \frac{1}{V} \, \frac{\sum_{N \le N_\mathrm{mid}} \sum_E N \, \Omega(E, N) \, e^{-\beta E + N \beta \mu_0}}{\sum_{N \le N_\mathrm{mid}} \sum_E \Omega(E, N) \, e^{-\beta E + N \beta \mu_0}} \qquad (5.11)
\text{and} \quad \rho_\mathrm{liq}(T) = \frac{1}{V} \, \frac{\sum_{N > N_\mathrm{mid}} \sum_E N \, \Omega(E, N) \, e^{-\beta E + N \beta \mu_0}}{\sum_{N > N_\mathrm{mid}} \sum_E \Omega(E, N) \, e^{-\beta E + N \beta \mu_0}}, \qquad (5.12)
where the chemical potential µ_0 depends on T and is obtained from the previous procedure.
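A sketch of the whole procedure, combining eqs. (5.8) to (5.12); the bracketing interval [mu_lo, mu_hi] for the chemical potential and SciPy's brentq root finder are our own choices, and the interior minimum of p(N) is located naively:

import numpy as np
from scipy.optimize import brentq

def coexistence(lnOmega, E, N, beta, V, mu_lo, mu_hi):
    # Returns (mu_0, rho_vap, rho_liq) for one temperature 1/(k_B beta).
    def weights(mu):
        return np.exp(lnOmega - beta * E[None, :] + beta * mu * N[:, None])

    def vapor_mask(w):
        pN = w.sum(axis=1)                      # eq. (5.10)
        n_mid = N[np.argmin(pN[1:-1]) + 1]      # minimum between the peaks
        return N <= n_mid

    def pressure_diff(mu):
        w = weights(mu)
        vap = vapor_mask(w)
        p_vap = np.log(w[vap].sum()) / (beta * V)    # eq. (5.8)
        p_liq = np.log(w[~vap].sum()) / (beta * V)   # eq. (5.9)
        return p_vap - p_liq

    mu0 = brentq(pressure_diff, mu_lo, mu_hi)        # tune mu: p_vap = p_liq
    w = weights(mu0)
    vap = vapor_mask(w)
    rho_vap = N[vap] @ w[vap].sum(axis=1) / (V * w[vap].sum())     # eq. (5.11)
    rho_liq = N[~vap] @ w[~vap].sum(axis=1) / (V * w[~vap].sum())  # eq. (5.12)
    return mu0, rho_vap, rho_liq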
Next, we apply this technique of calculating coexistence curves from density of states data to the data shown in fig. 5.1. Therein, density-of-states results of a Wang-Landau simulation in the grand canonical ensemble are shown, together with densities of states obtained by applying different eigenvector methods to the transition data of this simulation. The results can be found in
fig. 5.2. Here, the left frame shows a contour plot of eq. (5.10) over T and ρ = N/V at the chemical potential µ_0 of phase coexistence. The colors scale linearly from blue (p(ρ) = 0) to red (p(ρ) = 0.04), whereas white indicates larger values of p(ρ). The density of states used is the one obtained by the efasTM method, represented by the red curves in fig. 5.1. The red dots indicate the coexistence densities ρ_vap and ρ_liq determined according to eqs. (5.11) and (5.12). Above the critical temperature T_c = 1.1876(3) [113], the distribution p(N) loses its bimodality and thus N_mid is hard to determine. This leads to the two top-most points and the underlying p(ρ) being slightly off. That an N_mid is found at all is attributed to noise in the underlying Ω(E, N) producing a small local minimum, which the optimization algorithm picks up.
In the right frame of fig. 5.2 the red points correspond to those in the left frame.
The blue points are obtained by applying the phase coexistence calculation to the
power iteration data (blue dotted curves of fig. 5.1). We find that the deviations
already found in the previous figure lead to large deviations in the points representing the liquid phase.
The lines are fits of the Ising form [106, 114]
\rho_\pm = \rho_c + a \left| \frac{T - T_c}{T_c} \right| \pm b \left| \frac{T - T_c}{T_c} \right|^{\beta}, \qquad (5.13)
with the density ρ_c at the critical temperature T_c and parameters a, b and β. The latter is the 3D Ising order-parameter exponent and is set to β = 0.3258 [10]. For the
critical temperature we use the result of Wilding et al. [113], who determined it to be T_c = 1.1876(3) for a system of infinite size by finite-size scaling. The remaining parameters ρ_c, a and b have been obtained via numerical fitting.
The red dashed line in fig. 5.2 is a fit to the efasTM data (red points). For reference, a fit to phase coexistence data from Yan et al. [64] has been added (black line). We find that both coincide in the liquid region and only differ slightly in the vapor region. Hence, we have shown that the parQ + Wang-Landau sampling combination for the grand canonical ensemble can produce reliable results, on par with literature data obtained by pure Wang-Landau sampling. The parQ approach has the additional benefit of much easier parallelization.
6 Summary and Conclusion
Markov chain Monte Carlo methods are an important part of statistical mechanics. Key properties of Markov chains are the transition matrix and its stationary
distribution. The latter, evaluated at infinite temperature, can be identified as
the density of states, which is a central quantity of statistical mechanics. This thesis builds on the fact that the density of states may be obtained directly if we know the infinite temperature transition matrix. Besides that, other techniques to obtain the density of states exist. In this thesis, we developed a model system and subsequently used it as a test bed to investigate and compare such techniques. Furthermore, we developed a new method for estimating the density of states based on an existing transition matrix method for the grand canonical ensemble.
In chapter 2 a broad overview of sampling algorithms and analysis methods,
capable of calculating the density of states, was given. Their strengths and weaknesses, parallelizability as well as parameters and their influence on the outcome
were presented. Furthermore, transition matrix based methods found in the literature as well as the parQ method, developed by Andresen et al., Heilmann et al. and the
author, were discussed thoroughly. Additionally, related methods and two special
Monte Carlo moves, CBMC and CFCMC, were discussed.
Extracting the density of states out of infinite temperature transition matrices is
a key element when using parQ and other transition matrix based methods. Here,
a comprehensive presentation of available techniques was given; this includes either
calculating the eigenvector belonging to eigenvalue 1 or minimizing the deviations
of the detailed balance equation.
In chapter 3 a new test bed for density of states algorithms was developed. To this end we invented a new model system, called the fully adjustable benchmark system, in which all parameters can be adjusted and for which the exact density of states as well as the exact transition matrix is known. The test bed was completed by adding the Ising model and the two-particle Lennard-Jones system. For both systems at least the exact density of states is known.
In the fourth chapter numerical investigations were performed employing the
novel benchmark. For this analysis we selected the Wang-Landau method and its
1/t variant as well as a combination of these two methods with parQ. Additionally,
we investigated the TMMC method, a transition matrix based sampling algorithm.
We found that the widely used Wang-Landau method confronts the researcher with the choice of either obtaining fast results or results of high quality. This
is due to the saturation of error, which is inherent to standard Wang-Landau sampling. We showed that, when capturing an infinite temperature transition matrix during such a simulation, no such choice has to be made. The error in the density of states follows a simple power-law behavior and continues to decrease even when the Wang-Landau algorithm has already saturated. Additionally, parallelization, a notoriously difficult task for Wang-Landau sampling, becomes easier. Using the 1/t variant of Wang-Landau sampling also lifts the time-quality trade-off. Nonetheless, we showed that using it in conjunction with transition matrices allows a further increase in quality.
For the parQ method in its original formulation, simulated annealing is employed to achieve broad sampling of the state space. We therefore investigated different annealing schedules, some of them new, and showed that with the right choice it is possible to achieve competitive results.
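For orientation, two textbook cooling schedules (cf. [112]) are sketched below; these are generic examples, not necessarily the specific schedules compared in chapter 4:

    import numpy as np

    def exponential_schedule(T0, T_end, n_steps):
        """Geometric cooling: T_k = T0 * alpha**k, with alpha chosen
        such that T_end is reached exactly at the last step."""
        alpha = (T_end / T0) ** (1.0 / (n_steps - 1))
        return T0 * alpha ** np.arange(n_steps)

    def linear_schedule(T0, T_end, n_steps):
        """Linear cooling: the temperature drops by a constant
        amount per step."""
        return np.linspace(T0, T_end, n_steps)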
Finally, we used the findings of chapter 4 to develop a new analysis method that determines the joint density of states from large sparse transition matrices. Using this method we improved a transition matrix based method for the grand canonical ensemble previously developed by the author [8]. The new algorithm is two orders of magnitude faster than the conventional power iteration method, while achieving better results, and at least one order of magnitude faster than a highly optimized least-squares minimization algorithm. Using this new method, we investigated a grand canonical Lennard-Jones system and found that the density of states and the phase coexistence curve can be obtained in the same quality as in the literature.
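The general principle behind such a detailed-balance-based analysis can be illustrated by a minimal log-space least-squares formulation on a matrix of transition counts. The sketch below uses SciPy's sparse solver and is an illustration under these assumptions, not the optimized algorithm of chapter 5:

    import numpy as np
    from scipy.sparse import coo_matrix
    from scipy.sparse.linalg import lsqr

    def dos_from_counts(C):
        """Estimate ln g from attempted-transition counts C by
        minimizing deviations from detailed balance. With the
        row-stochastic estimate T_ij = C_ij / sum_k C_ik, detailed
        balance g_i T_ij = g_j T_ji becomes linear in log space:
        ln g_j - ln g_i = ln(T_ij / T_ji)."""
        C = np.asarray(C, dtype=float)
        n = C.shape[0]
        row_sums = C.sum(axis=1, keepdims=True)
        row_sums[row_sums == 0.0] = 1.0   # guard never-visited levels
        T = C / row_sums
        rows, cols, vals, rhs = [], [], [], []
        eq = 0
        for i in range(n):
            for j in range(i + 1, n):
                if C[i, j] > 0 and C[j, i] > 0:   # observed both ways
                    rows += [eq, eq]
                    cols += [i, j]
                    vals += [-1.0, 1.0]
                    rhs.append(np.log(T[i, j] / T[j, i]))
                    eq += 1
        A = coo_matrix((vals, (rows, cols)), shape=(eq, n)).tocsr()
        ln_g = lsqr(A, np.array(rhs))[0]  # sparse least squares
        return ln_g - ln_g.min()          # fix the free additive constant

Each observed pair of energy levels contributes one sparse equation, so the system stays tractable even for the large matrices arising in grand canonical simulations.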
In conclusion, we found that infinite-temperature transition matrix methods can greatly enhance the capture of transition data for a variety of sampling algorithms. Their strengths are their small number of parameters and, to some extent, their independence from the parameters of the underlying sampling algorithm. Additionally, as previously shown [7], they exhibit a near-linear speedup when simulations are run in parallel. This makes such transition matrix methods ideal for sampling systems with more than one order parameter, e.g. grand canonical or isobaric-isothermal systems.
Bibliography
[1] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E.
Teller. “Equation of State Calculations by Fast Computing Machines”. J.
Chem. Phys. 21 (1953), p. 1087.
[2] F. Wang and D. P. Landau. “Determining the density of states for classical
statistical models: A random walk algorithm to produce a flat histogram”.
Phys. Rev. E 64 (2001), p. 056101.
[3] F. Wang and D. P. Landau. “Efficient, Multiple-Range Random Walk Algorithm to Calculate the Density of States”. Phys. Rev. Lett. 86 (2001),
pp. 2050–2053.
[4] B. Andresen, K. H. Hoffmann, K. Mosegaard, J. Nulton, J. M. Pedersen, and
P. Salamon. “On lumped models for thermodynamic properties of simulated
annealing problems”. J. Phys. 49 (1988), pp. 1485–1492.
[5] R. H. Swendsen, B. Diggs, J.-S. Wang, S.-T. Li, C. Genovese, and J. B.
Kadane. “Transition Matrix Monte Carlo”. Internat. J. Modern Phys. C 10
(1999), pp. 1563–1569.
[6] F. Heilmann and K. H. Hoffmann. “ParQ – high-precision calculation of the
density of states”. Europhys. Lett. 70 (2005), pp. 155–161.
[7] F. Heilmann. “The State Space of Complex Systems”. PhD thesis. Chemnitz: Technische Universität Chemnitz, 2005.
[8] R. Haber. “Numerical methods for density of states calculations”. MA thesis. Chemnitz: Technische Universität Chemnitz, 2008.
[9] A. M. Ferrenberg and R. H. Swendsen. “New Monte Carlo technique for
studying phase transitions”. Phys. Rev. Lett. 61 (23 Dec. 1988), pp. 2635–
2638.
[10] A. M. Ferrenberg and D. P. Landau. “Critical behavior of the three-dimensional Ising model: A high-resolution Monte Carlo study”. Phys. Rev. B 44
(10 Sept. 1991), pp. 5081–5091.
[11] B. A. Berg and T. Neuhaus. “Multicanonical algorithms for first order phase
transitions”. Phys. Lett. B 267 (1991), p. 249.
[12] B. A. Berg and T. Neuhaus. “Multicanonical ensemble: A new approach to
simulate first-order phase transitions”. Phys. Rev. Lett. 68 (1 Jan. 1992),
pp. 9–12.
[13] E. Ising. “Beitrag zur Theorie des Ferromagnetismus”. Z. Phys. A 31 (Feb.
1925), pp. 253–258.
[14] R. B. Potts. “Some generalized order-disorder transformations”. Math. Proc.
Cambridge Philos. Soc. 48 (01 Jan. 1952), pp. 106–109.
[15] F. Y. Wu. “The Potts model”. Rev. Mod. Phys. 54 (1 Jan. 1982), pp. 235–
268.
[16] D. S. Gaunt and M. E. Fisher. “Hard-Sphere Lattice Gases. I. Plane-Square
Lattice”. J. Chem. Phys. 43 (1965), pp. 2840–2863.
[17] M. N. Rosenbluth and A. W. Rosenbluth. “Monte Carlo Calculation of the
Average Extension of Molecular Chains”. J. Chem. Phys. 23 (1955), pp. 356–
359.
[18] K. F. Lau and K. A. Dill. “A lattice statistical mechanics model of the
conformational and sequence spaces of proteins”. Macromolecules 22 (1989),
pp. 3986–3997.
[19] C. Zhou, T. C. Schulthess, S. Torbrügge, and D. P. Landau. “Wang-Landau
algorithm for continuous models and joint density of states”. Phys. Rev. Lett.
96 (2006), p. 120201.
[20] K. A. Maerzke and J. I. Siepmann. “Transferable Potentials for Phase
Equilibria-Coarse-Grain Description for Linear Alkanes”. J. Phys. Chem.
B 115 (2011), pp. 3452–3465.
[21] J.-S. Yang and W. Kwak. “Application of Wang-Landau sampling to a
protein model using SMMP”. Comp. Phys. Comm. 181 (Jan. 2010), pp. 99–
104.
[22] A. Vítek and R. Kalus. “Two-dimensional multiple-histogram method applied to isothermal–isobaric Monte Carlo simulations of molecular clusters”. Comp. Phys. Comm. (2014), in press.
[23] D. Frenkel, G. Mooij, and B. Smit. “Novel scheme to study structural and
thermal properties of continuously deformable molecules”. J. Phys.: Condens. Matter 4 (1992), pp. 3053–3076.
[24] J. I. Siepmann and D. Frenkel. “Configurational bias Monte Carlo: a new
sampling scheme for flexible chains”. Mol. Phys. 75 (1992), pp. 59–70.
[25] M. G. Martin and J. I. Siepmann. “Novel Configurational-Bias Monte Carlo
Method for Branched Molecules. Transferable Potentials for Phase Equilibria. 2. United-Atom Description of Branched Alkanes”. J. Phys. Chem. B
103 (1999), p. 4508.
[26] M. G. Martin and A. L. Frischknecht. “Using arbitrary trial distributions
to improve intramolecular sampling in configurational-bias Monte Carlo”.
Mol. Phys. 104 (Aug. 2006), pp. 2439–2456.
[27] D. P. Landau and K. Binder. A Guide to Monte Carlo Simulations in
Statistical Physics. New York, NY, USA: Cambridge University Press, 2005.
[28] C. H. Bennett. “Efficient estimation of free energy differences from Monte
Carlo data”. J. Comput. Phys. 22 (1976), pp. 245–268.
[29] A. M. Ferrenberg and R. H. Swendsen. “Optimized Monte Carlo data analysis”. Phys. Rev. Lett. 63 (12 Sept. 1989), pp. 1195–1198.
[30] C.-K. Hu. “Histogram Monte Carlo renormalization group method for phase
transition models without critical slowing down”. Phys. Rev. Lett. 69 (19
Nov. 1992), pp. 2739–2742.
[31] A. Ghoufi, F. Goujon, V. Lachet, and P. Malfreyt. “Multiple histogram
reweighting method for the surface tension calculation”. J. Chem. Phys.
128 (2008), p. 154718.
[32] W. Janke. “Histograms and All That”. In: Computer Simulations of Surfaces
and Interfaces. Ed. by B. Dünweg, D. P. Landau, and A. I. Milchev. Vol. 114.
NATO Science Series. Springer Netherlands, 2003, pp. 137–157.
[33] S. Kumar, J. M. Rosenberg, D. Bouzida, R. H. Swendsen, and P. A. Kollman.
“The weighted histogram analysis method for free-energy calculations on
biomolecules. I. The method”. J. Comp. Chem. 13 (1992), pp. 1011–1021.
[34] M. K. Fenwick. “A direct multiple histogram reweighting method for optimal computation of the density of states”. J. Chem. Phys. 129 (2008), p. 125106.
[35] M. Tesi, E. Janse van Rensburg, E. Orlandini, and S. Whittington. “Monte
carlo study of the interacting self-avoiding walk model in three dimensions”.
J. Stat. Phys. 82 (1996), pp. 155–181.
[36] U. H. Hansmann. “Parallel tempering algorithm for conformational studies
of biological molecules”. Chem. Phys. Lett. 281 (1997), pp. 140–150.
[37] A. P. Lyubartsev, A. A. Martsinovski, S. V. Shevkunov, and P. N. Vorontsov-Velyaminov. “New approach to Monte Carlo calculation of the free energy: Method of expanded ensembles”. J. Chem. Phys. 96 (1992), pp. 1776–1783.
[38] E. Marinari and G. Parisi. “Simulated Tempering: A New Monte Carlo
Scheme”. Europhys. Lett. 19 (1992), p. 451.
[39] Q. Yan and J. J. de Pablo. “Hyper-parallel tempering Monte Carlo: Application to the Lennard-Jones fluid and the restricted primitive model”. J.
Chem. Phys. 111 (1999), pp. 9509–9516.
[40] J. Lee. “New Monte Carlo Algorithm: Entropic Sampling”. Phys. Rev. Lett.
71 (1993), pp. 211–214.
[41] J. Lee. “New Monte Carlo Algorithm: Entropic Sampling”. Phys. Rev. Lett.
71 (1993), p. 2353.
[42] B. A. Berg, U. H. E. Hansmann, and Y. Okamoto. “Comment on ‘Monte Carlo Simulation of a First-Order Transition for Protein Folding’”. J. Phys. Chem. 99 (1995), pp. 2236–2237.
[43] B. A. Berg and T. Celik. “New approach to spin-glass simulations”. Phys.
Rev. Lett. 69 (15 Oct. 1992), pp. 2292–2295.
[44] G. R. Smith and A. D. Bruce. “A study of the multi-canonical Monte Carlo
method”. J. Phys. A: Math. Gen. 28 (1995), p. 6623.
[45] B. A. Berg. “Multicanonical recursions”. J. Stat. Phys. 82 (1 1996), pp. 323–
342.
[46] B. A. Berg. “Introduction to Multicanonical Monte Carlo Simulations”.
Fields Inst. Commun. 26 (2000), pp. 1–24.
[47] B. A. Berg. “Multicanonical simulations step by step”. Comp. Phys. Comm.
153 (2003), pp. 397–406.
[48] G. M. Torrie and J. P. Valleau. “Monte Carlo free energy estimates using non-Boltzmann sampling: Application to the sub-critical Lennard-Jones
fluid”. Chem. Phys. Lett. 28 (1974), p. 578.
[49] G. M. Torrie and J. P. Valleau. “Nonphysical sampling distributions in
Monte Carlo free-energy estimation: Umbrella sampling”. J. Comput. Phys.
23 (1977), p. 187.
[50] M. P. Allen and D. J. Tildesley. Computer Simulation of Liquids. Oxford:
Clarendon Press, 1987.
[51] P. M. Virnau and M. Müller. “Calculation of free energy through successive
umbrella sampling”. J. Chem. Phys. 120 (2004), p. 10925.
[52] W. Wojtas-Niziurski, Y. Meng, B. Roux, and S. Bernèche. “Self-Learning
Adaptive Umbrella Sampling Method for the Determination of Free Energy
Landscapes in Multiple Dimensions”. J. Chem. Theory Comput. 9 (2013),
pp. 1885–1895.
[53] P. M. C. de Oliveira, T. J. P. Penna, and H. J. Herrmann. “Broad Histogram
Method”. Braz. J. Phys. 26 (Dec. 1996), pp. 677–683.
[54] P. de Oliveira. “Broad histogram relation is exact”. Eur. Phys. J. B 6 (1998),
pp. 111–115.
[55] J.-S. Wang. “Is the broad histogram random walk dynamics correct?” Eur. Phys. J. B 8 (1999), pp. 287–291. doi: 10.1007/s100510050692.
[56] J. Skilling. “Nested sampling for general Bayesian computation”. Bayesian
Anal. 1 (Dec. 2006), pp. 833–859.
[57] J. Skilling. “Nested Sampling’s Convergence”. AIP Conf. Proc. 1193 (2009),
pp. 277–291.
[58] L. B. Pártay, A. P. Bartók, and G. Csányi. “Nested sampling for materials:
the case of hard spheres”. ArXiv e-prints (Aug. 2012).
[59] L. B. Pártay, A. P. Bartók, and G. Csányi. “Efficient Sampling of Atomic
Configurational Spaces”. J. Phys. Chem. B 114 (2010), pp. 10502–10512.
[60] S. O. Nielsen. “Nested sampling in the canonical ensemble: Direct calculation of the partition function from NVT trajectories”. J. Chem. Phys. 139
(2013), p. 124104.
[61] H. Do, J. D. Hirst, and R. J. Wheatley. “Rapid calculation of partition
functions and free energies of fluids”. J. Chem. Phys. 135 (2011), p. 174105.
[62] I. Murray, D. MacKay, Z. Ghahramani, and J. Skilling. “Nested sampling
for Potts models”. In: Advances in Neural Information Processing Systems
18. Ed. by Y. Weiss, B. Schölkopf, and J. Platt. 2005, pp. 947–954.
[63] N. S. Burkoff, C. Várnai, S. A. Wells, and D. L. Wild. “Exploring the Energy
Landscapes of Protein Folding Simulations with Bayesian Computation”.
Biophys. J. 102 (2012), pp. 878–886.
[64] Q. Yan, R. Faller, and J. de Pablo. “Density of States Monte Carlo Method
for Simulation of Fluids”. J. Chem. Phys. 116 (2002), p. 8745.
[65] M. S. Shell, P. G. Debenedetti, and A. Z. Panagiotopoulos. “Generalization
of the Wang-Landau method for off-lattice simulations”. Phys. Rev. E 66
(2002), p. 056703.
[66] B. J. Schulz, K. Binder, M. Müller, and D. P. Landau. “Avoiding boundary
effects in Wang-Landau sampling”. Phys. Rev. E 67 (2003), p. 067102.
[67] W. Greiner, L. Neise, and H. Stöcker. Theoretische Physik Band 9: Thermodynamik und Statistische Mechanik. Verlag Harri Deutsch, 1993.
[68] D. Jayasri, V. S. S. Sastry, and K. P. N. Murthy. “Wang-Landau Monte
Carlo simulation of isotropic-nematic transition in liquid crystals”. Phys.
Rev. E 72 (3 Sept. 2005), p. 036702.
[69] C. Zhou and R. N. Bhatt. “Understanding and improving the Wang-Landau
algorithm”. Phys. Rev. E 72 (2005), p. 025701.
[70] P. Poulain, F. Calvo, R. Antoine, M. Broyer, and P. Dugourd. “Performances of Wang-Landau algorithms for continuous systems.” Phys. Rev. E
73 (2006), p. 056704.
[71] V. A. Epanechnikov. “Non-Parametric Estimation of a Multivariate Probability Density”. Theory Probab. Appl. 14 (1969), pp. 153–158.
[72] C. Zhou, T. C. Schulthess, and D. P. Landau. “Monte Carlo simulations of
NiFe2O4 nanoparticles”. J. Appl. Phys. 99 (2006), 08H906.
[73] R. E. Belardinelli and V. D. Pereyra. “Fast algorithm to calculate density
of states.” Phys. Rev. E 75 (2007), p. 046701.
[74] R. E. Belardinelli and V. D. Pereyra. “Wang-Landau algorithm: A theoretical analysis of the saturation of the error”. J. Chem. Phys. 127 (2007),
p. 184105.
[75] C. Zhou and J. Su. “Optimal modification factor and convergence of the Wang-Landau algorithm”. Phys. Rev. E 78 (2008), p. 046705.
[76] R. Schulz. “The Investigation of The parQ-Method for Continuous Systems”. MA thesis. Technische Universität Chemnitz, 2007.
[77] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. “Optimization by Simulated
Annealing”. Science 220 (1983), pp. 671–680.
[78] M. S. Shell, P. G. Debenedetti, and A. Z. Panagiotopoulos. “An improved
Monte Carlo method for direct calculation of the density of states”. J. Chem.
Phys. 119 (2003), pp. 9406–9411.
[79] M. K. Fenwick. “Accurate estimation of the density of states from Monte Carlo transition probability data”. J. Chem. Phys. 125 (2006), p. 144905.
[80] G. R. Smith and A. D. Bruce. “Multicanonical Monte Carlo study of solid-solid phase coexistence in a model colloid”. Phys. Rev. E 53 (6 June 1996), pp. 6530–6543.
[81] M. Fitzgerald, R. R. Picard, and R. N. Silver. “Canonical transition probabilities for adaptive Metropolis simulation”. Europhys. Lett. 46 (1999),
pp. 282–287.
[82] M. Fitzgerald, R. R. Picard, and R. N. Silver. “Monte Carlo Transition
Dynamics and Variance Reduction”. J. Stat. Phys. 98 (2000), pp. 321–345.
[83] J.-S. Wang, T. K. Tay, and R. H. Swendsen. “Transition Matrix Monte
Carlo Reweighting and Dynamics”. Phys. Rev. Lett. 82 (1999), pp. 476–
479.
[84] J.-S. Wang and R. H. Swendsen. “Transition Matrix Monte Carlo Method”.
J. Stat. Phys. 106 (2002), pp. 245–285.
[85] A. B. Bortz, M. H. Kalos, and J. L. Lebowitz. “A new algorithm for Monte
Carlo simulation of Ising spin systems”. J. Comput. Phys. 17 (1975), pp. 10–
18.
[86] J. R. Errington. “Direct calculation of liquid–vapor phase equilibria from
transition matrix Monte Carlo simulation”. J. Chem. Phys. 118 (2003),
p. 9915.
[87] J. R. Errington. “Evaluating surface tension using grand-canonical transition-matrix Monte Carlo simulation and finite-size scaling.” Phys. Rev. E
67 (2003), p. 012102.
[88] V. K. Shen and J. R. Errington. “Determination of fluid-phase behavior
using transition-matrix Monte Carlo: binary Lennard-Jones mixtures”. J.
Chem. Phys. 122 (2005), p. 064508.
[89] J. K. Singh and J. R. Errington. “Calculation of Phase Coexistence Properties and Surface Tensions of n-Alkanes with Grand-Canonical Transition-Matrix Monte Carlo Simulation and Finite-Size Scaling”. J. Phys. Chem. B 110 (2006), pp. 1369–1376.
[90] A. S. Paluch, V. K. Shen, and J. R. Errington. “Comparing the Use of Gibbs
Ensemble and Grand-Canonical Transition-Matrix Monte Carlo Methods
to Determine Phase Equilibria”. Ind. Eng. Chem. Res. 47 (May 2008),
pp. 4533–4541.
[91] M. Galassi et al. GNU Scientific Library Reference Manual. 3rd ed. Network Theory Ltd, 2009.
[92] F. A. Escobedo and C. R. A. Abreu. “On the use of transition matrix
methods with extended ensembles”. J. Chem. Phys. 124 (2006), p. 104110.
[93] R. van Mises and H. Pollaczek-Geiringer. “Praktische Verfahren der Gleichungsauflösung”. ZAMM - J. Appl. Math. Mech. / Z. Angew. Math. Mech. 9 (1929), pp. 58–77.
[94] W. K. Grassmann, M. I. Taksar, and D. P. Heyman. “Regenerative Analysis
and Steady State Distributions for Markov Chains.” Oper. Res. 33 (1985),
pp. 1107–1116.
[95] D. Heyman and D. O’Leary. “Overcoming Instability In Computing The
Fundamental Matrix For A Markov Chain”. SIAM. J. Matrix Anal. & Appl.
19 (1998), pp. 534–540.
[96] C. A. O’Cinneide. “Entrywise perturbation theory and error analysis for Markov chains”. Numer. Math. 65 (1993), pp. 109–120. doi: 10.1007/BF01385743.
[97] M. Benzi. “A direct projection method for Markov chains”. Linear Algebra
Appl. 386 (2004). Special Issue on the Conference on the Numerical Solution
of Markov Chains 2003, pp. 27–49.
[98] W. Shi and E. J. Maginn. “Continuous Fractional Component Monte Carlo:
An Adaptive Biasing Method for Open System Atomistic Simulations”. J.
Chem. Theory Comput. 3 (2007), pp. 1451–1463.
[99] W. Shi and E. J. Maginn. “Improvement in molecule exchange efficiency in
Gibbs ensemble Monte Carlo: Development and implementation of the continuous fractional component move”. J. Comp. Chem. 29 (2008), pp. 2520–
2530.
[100] A. Torres-Knoop, S. P. Balaji, T. J. H. Vlugt, and D. Dubbeldam. “A Comparison of Advanced Monte Carlo Methods for Open Systems: CFCMC vs CBMC”. J. Chem. Theory Comput. (2014), in press.
[101] D. Dubbeldam, A. Torres-Knoop, and K. S. Walton. “On the inner workings
of Monte Carlo codes”. Mol. Simul. 39 (2013), pp. 1253–1292.
[102] F. Viger and M. Latapy. “Efficient and Simple Generation of Random Simple Connected Graphs with Prescribed Degree Sequence”. In: Computing
and Combinatorics. Ed. by L. Wang. Vol. 3595. Lecture Notes in Computer
Science. Springer Berlin / Heidelberg, 2005, pp. 440–449.
[103] P. D. Beale. “Exact Distribution of Energies in the Two-Dimensional Ising
Model”. Phys. Rev. Lett. 76 (1 Jan. 1996), pp. 78–81.
[104] R. K. Pathria and P. D. Beale. Statistical Mechanics. Elsevier Science, 2011.
[105] E. A. Mastny and J. J. de Pablo. “Melting line of the Lennard-Jones system, infinite size, and full potential”. J. Chem. Phys. 127 (2007), p. 104504.
[106] N. B. Wilding. “Critical-point and coexistence-curve properties of the Lennard-Jones fluid: A finite-size scaling study”. Phys. Rev. E 52 (1995), p. 602.
[107] J. A. Northby. “Structure and binding of Lennard-Jones clusters: 13 ≤ N ≤ 147”. J. Chem. Phys. 87 (1987), pp. 6166–6177.
[108] P. A. Frantsuzov and V. A. Mandelshtam. “Size-temperature phase diagram
for small Lennard-Jones clusters”. Phys. Rev. E 72 (3 Sept. 2005), p. 037102.
[109] D. J. Wales and J. P. K. Doye. “Global Optimization by Basin-Hopping and
the Lowest Energy Structures of Lennard-Jones Clusters Containing up to
110 Atoms”. J. Phys. Chem. A 101 (1997), pp. 5111–5116.
[110] J. P. K. Doye, D. J. Wales, and M. A. Miller. “Thermodynamics and the
global optimization of Lennard-Jones clusters”. J. Chem. Phys. 109 (1998),
pp. 8143–8153.
[111] A. L. Mackay. “A dense non-crystallographic packing of equal spheres”.
Acta Crystallogr. 15 (Sept. 1962), pp. 916–918.
[112] Y. Nourani and B. Andresen. “A comparison of simulated annealing cooling
strategies”. J. Phys. A: Math. Gen. 31 (1998), p. 8373.
[113] N. B. Wilding. “Simulation studies of fluid critical behaviour”. J. Phys.:
Condens. Matter 9 (1997), p. 585.
[114] G. Ganzenmüller and P. J. Camp. “Applications of Wang-Landau sampling
to determine phase equilibria in complex fluids.” J. Chem. Phys. 127 (2007),
p. 154504.
Declaration of Independent Authorship pursuant to
§6 Promotionsordnung
I hereby declare that I have prepared the present thesis independently, that it has not been submitted elsewhere for examination purposes, and that I have used no aids other than those indicated. All knowingly used text excerpts, quotations, or content by other authors have been explicitly marked as such.
Chemnitz, 2 May 2014
René Haber
Curriculum Vitae
Personal Details
Name: René Haber
Date of birth: 11.09.1983
Place of birth: Karl-Marx-Stadt, now Chemnitz
Schooling
1990 – 1994: Gotthold-Ephraim-Lessing Grundschule Chemnitz
1994 – 2002: Alexander-von-Humboldt Gymnasium Chemnitz (mathematics and natural sciences profile)
University Studies and Academic Career
2003 – 2006: Bachelor's programme in Computational Science at Technische Universität Chemnitz; Bachelor's thesis: “Diffusion on Menger Sponges”; degree: Bachelor of Science
2006 – 2008: Master's programme in Computational Science at Technische Universität Chemnitz; Master's thesis: “Numerical Methods for Density of States Calculations”; degree: Master of Science (with distinction)
since 2008: Doctoral studies at Technische Universität Chemnitz at the chair of “Theoretische Physik, insbesondere Computerphysik”
Mar. 2011: Reviewer in the ACQUIN accreditation of the degree programmes “Materialwissenschaften – Advanced Material Science (M.Sc.)” and “Physik mit Informatik (M.Sc.)” at Universität Osnabrück
Aug. 2011: Research stay at the Centre for Nonlinear Studies, Institute of Cybernetics at Tallinn University of Technology
Feb. & Oct. 2011: Research stays at the Max Planck Institut für Festkörperforschung Stuttgart
Jul. 2012: Reviewer in the ACQUIN accreditation of the degree programmes Physik, B.Sc. and M.Sc., at Universität Konstanz
Publications
Haber, R.
Diffusion on Menger Sponges
Bachelor's thesis, TU Chemnitz, July 2006
Haber, R.
Numerical Methods for Density of States Calculations
Master's thesis, TU Chemnitz, July 2008
Haber, R., Prehl, J., Hoffmann, K. H. and Herrmann, H.
Diffusion of oriented particles in porous media
Physics Letters A 377 (2013), pp. 2840–2845
Haber, R., Prehl, J., Hoffmann, K. H. and Herrmann, H.
Random walks of oriented particles on fractals
Journal of Physics A: Mathematical and Theoretical 47 (2014), p. 155001
Conference Contributions
Haber, R. and Hoffmann, K. H.
Generalization of the parQ method to the grand canonical ensemble
Talk at the DPG Spring Meeting, Dresden, 2011
Haber, R. and Hoffmann, K. H.
parQ – A transition matrix method to calculate the density of states
Energy Landscapes, June/July 2010, Chemnitz
JETC 11 – Joint European Thermodynamics Conference, June/July 2011, Chemnitz
Prehl, J., Haber, R., Hoffmann, K. H. and Herrmann, H.
Oriented Particles in Porous Media
Talk at the DPG Spring Meeting, Regensburg, 2013
Haber, R. and Hoffmann, K. H.
parQ - Infinite Temperature Transition Matrix Monte Carlo in the Canonical and Grand Canonical Ensemble
Talk at the DPG Spring Meeting, Dresden, 2014